5. Logistic regression#
Consider a set of training data \((x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots\), where \(x^{(i)}=(x^{(i)}_1, x^{(i)}_2, \ldots, x^{(i)}_n)\) is an \(n\)-dimensional vector and \(y^{(i)}\) is a real number. We would like to use linear regression to find the relation between \(x\) and \(y\).
In this case, we assume that \(y\) is a linear function of \(x\):
\[
y=\theta_0 + \sum_{j=1}^n\theta_jx_j.
\]
The purpose of linear regression is to use the given training data to find the best \(\Theta=(\theta_0, \theta_1, \theta_2,\ldots,\theta_n)\).
If we set \(\hat{x}=(1, x_1, \ldots,x_n)\), then the above formula can be rewritten as a matrix multiplication.
\[
y=\Theta \hat{x}^T.
\]
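As a concrete illustration, the best \(\Theta\) in the least-squares sense can be computed directly with NumPy. The data below is hypothetical, chosen only to show the shapes involved; the column of ones plays the role of the leading \(1\) in \(\hat{x}\).

```python
import numpy as np

# Hypothetical training data: 3 samples, n = 2 features.
X = np.array([[1.0, 2.0],
              [2.0, 0.0],
              [3.0, 1.0]])
y = np.array([5.0, 3.0, 6.0])

# Prepend a column of ones so Theta = (theta_0, ..., theta_n)
# absorbs the intercept, matching x_hat = (1, x_1, ..., x_n).
X_hat = np.hstack([np.ones((X.shape[0], 1)), X])

# Least-squares fit: the Theta minimizing ||X_hat @ Theta - y||^2.
Theta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)

# Predict with y = Theta . x_hat for a new point x = (1.5, 1.0).
x_new = np.array([1.0, 1.5, 1.0])
print(Theta @ x_new)
```

Here `np.linalg.lstsq` solves the least-squares problem numerically; with more training samples than parameters the fit is generally not exact, and \(\Theta\) minimizes the total squared error instead.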
When we want to deal with a classification problem, we may still use this regression idea, but we have to make some modifications.
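One standard modification, which leads to logistic regression, is to pass the linear output \(\Theta \hat{x}^T\) through the sigmoid (logistic) function so the prediction lies in \((0,1)\) and can be read as a class probability. A minimal sketch, with hypothetical values for \(\Theta\) and \(\hat{x}\):

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps any real number into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters and input, with x_hat = (1, x_1, x_2).
Theta = np.array([-1.0, 2.0, 0.5])
x_hat = np.array([1.0, 1.5, 2.0])

# Instead of using Theta . x_hat directly as the prediction,
# squash it through the sigmoid to get a probability for class 1.
p = sigmoid(Theta @ x_hat)
print(p)
```

Thresholding `p` at 0.5 then turns the probability into a class label.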