class: center, middle
 
# Through the Lens of Oracle: Eﬃcient Algorithm for Finding Selective Classiﬁer

### Presented by Yanlin (Alan) Duan, Rong Zhou
 
---
 
# Review of selective classifier

Recall the definition of a selective classifier:

--

A selective classiﬁer `\(C\)` is a tuple `\((h, (\gamma_1, \cdots , \gamma_m)\)`, where `\(h\)` lies in a hypothesis class `\(\mathcal{H}\)`, and `0 \le \gamma_i \le 1\)` for all `\(i = 1, \cdots , m\)`. For any `\(x_{n+j} \in U\)`, `\(C(x_{n+j}) = h(x_{n+j})\)` wp `\(gamma_j\)` and `\(0\)` wp `\(1 − \gamma_j\)`.

---

# Review of selective classifier

Below is the algorithm (optimization problem) for finding selective classifier:

.center[![:scale 60%](https://yanlin-duan.github.io/assets/pics/selective-classifier1.png)]

--

How many constraints do we have?

--

As many as `\(|V|)`!

---

# Motivation of our project

--

Can we solve the optimization problem more efficiently by accessing the hypothesis class not directly through (large amount of) constraints, but through some ERM oracle?

--

Yes!

--

Definition of ERM oracle we use: For a set of hypothesis `\(\mathcal{H}\)`, the `\(\textit{Weighted ERM Oracle}\)` is an algorithm, which for any sequence `\((x_1,y_1,w_1),(x_2,y_2,w_2),...(x_t,y_t,w_t) \in Z \subseteq \mathcal{X} \times Y \times \mathbb{R}_{\ge 0}\)`, returns
`\(
  \text{argmin}_{h \in \mathcal{H}} \text{ } \sum_{(x,y,w) \in Z}\mathbbm{1}\{h(x) \ne y\}\cdot w
\)`

---

# Main Algorithm

.center[![:scale 60%](https://yanlin-duan.github.io/assets/pics/selective-classifier2.png)]

---

# Analysis of algorithm

Question: After how many rounds will the algorithm halt?

--

High level idea:

.center[![:scale 60%](https://yanlin-duan.github.io/assets/pics/selective-classifier3.png)]

---

# Find step size

Transform from primal to dual by introducing Lagrange multipliers:

`\(
  
  \mathcal{L} = \sum_{i = 1}^m \frac{1}{\gamma_i} + \sum_{j = 1}^N \lambda_j \left( \left( \sum_{i = 1}^m \gamma_i \mathbbm{1}_{h_j(x_{n+i}) \neq h_0(x_{n+i})} \right) - \epsilon m \right)

\)`

--

Find derivative and set to zero. Substitute `\(\gamma_i\)` using the following:

`\(

  \gamma_i^2 = \left(\sum_{j = 1}^N \lambda_j \mathbbm{1}_{h_j(x_{n+i}) \neq h_0(x_{n+i})} \right)^{-1}

\)'