Machine Learning - Perceptron
This post is an exposition of the perceptron in machine learning.
1 Introduction
The perceptron is a simple model for binary classification. The idea is to use a hyperplane to divide the data into a positive and a negative class, and to output $+1$ or $-1$ depending on which side of the hyperplane a point falls.
2 Model construction
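Following this idea, the perceptron maps an input $x$ from the feature space $\mathcal{X} \subseteq \mathbb{R}^n$ to the output space $\mathcal{Y} = \{+1, -1\}$ through the function
$$f(x) = \operatorname{sign}(w \cdot x + b)$$
where
$$\operatorname{sign}(z) = \begin{cases} +1, & z \ge 0 \\ -1, & z < 0 \end{cases}$$
Its parameters are the weight vector $w \in \mathbb{R}^n$ and the bias $b \in \mathbb{R}$. The equation $w \cdot x + b = 0$ defines the separating hyperplane $S$, with $w$ as its normal vector and $b$ as its intercept.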
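As a small illustration of the decision function, here is a minimal sketch in plain Python. It assumes the usual form $f(x) = \operatorname{sign}(w \cdot x + b)$ with the convention $\operatorname{sign}(0) = +1$; the function names and the example hyperplane are made up for this illustration.

```python
def sign(z):
    """Return +1 for z >= 0 and -1 otherwise (assumed convention: sign(0) = +1)."""
    return 1 if z >= 0 else -1


def predict(w, b, x):
    """Perceptron decision function f(x) = sign(w . x + b)."""
    activation = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
    return sign(activation)


# Hypothetical parameters: the hyperplane x1 + x2 - 3 = 0 in the plane.
w, b = [1.0, 1.0], -3.0
print(predict(w, b, [3.0, 3.0]))  # prints 1  (positive side)
print(predict(w, b, [1.0, 1.0]))  # prints -1 (negative side)
```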
3 Learning strategies of perceptron
To optimize the parameters of the perceptron, we need a loss function. A natural choice is the number of misclassified points, but this count is discrete and not differentiable with respect to $w$ and $b$, so gradient descent cannot be applied to it. Instead, we consider the total distance from the misclassified points to the hyperplane $S$. The distance from a point $x_0$ to the hyperplane is
$$\frac{1}{\|w\|} \, |w \cdot x_0 + b|$$
where $\|w\|$ is the $L_2$ norm of $w$.
The perceptron can misclassify in two ways: it judges a positive sample as negative, or a negative sample as positive. In the first case, $y_i = +1$ and $w \cdot x_i + b < 0$; in the second, $y_i = -1$ and $w \cdot x_i + b \ge 0$. In both cases $y_i (w \cdot x_i + b) \le 0$, so $|w \cdot x_i + b| = -y_i (w \cdot x_i + b)$.
Therefore, the absolute value in the distance formula can be removed, and the distance from a misclassified point $x_i$ to the hyperplane becomes
$$-\frac{1}{\|w\|} \, y_i (w \cdot x_i + b)$$
Dropping the factor $\frac{1}{\|w\|}$ and summing over the set $M$ of misclassified points gives the loss function
$$L(w, b) = -\sum_{x_i \in M} y_i (w \cdot x_i + b)$$
This loss is non-negative. The fewer points are misclassified, and the closer the misclassified points lie to the hyperplane, the smaller it is; it is zero when no point is misclassified.
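To make the loss concrete, the following sketch computes it for a tiny data set. It assumes the standard perceptron loss, the sum of $-y_i (w \cdot x_i + b)$ over the misclassified points, and treats a point lying exactly on the hyperplane as misclassified. The three training points and the parameter values are hypothetical.

```python
def perceptron_loss(w, b, data):
    """Sum of -y * (w . x + b) over the misclassified points in data."""
    loss = 0.0
    for x, y in data:
        margin = y * (sum(w_j * x_j for w_j, x_j in zip(w, x)) + b)
        if margin <= 0:  # misclassified, or exactly on the hyperplane
            loss += -margin
    return loss


# Hypothetical toy data: two positive points and one negative point.
data = [([3.0, 3.0], 1), ([4.0, 3.0], 1), ([1.0, 1.0], -1)]

# With b = 0 the negative point (1, 1) is misclassified and contributes 2.
print(perceptron_loss([1.0, 1.0], 0.0, data))   # prints 2.0
# With b = -3 every point is on the correct side, so the loss vanishes.
print(perceptron_loss([1.0, 1.0], -3.0, data))  # prints 0.0
```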
4 Optimization
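The optimization problem of the perceptron is to find the parameters $w$ and $b$ that minimize the loss function
$$\min_{w, b} L(w, b) = -\sum_{x_i \in M} y_i (w \cdot x_i + b)$$
where $M$ is the set of misclassified points. Stochastic gradient descent is used for the optimization. First, initial values $w_0$ and $b_0$ are chosen at random. Then, with $M$ held fixed, the gradients of the loss are as follows.
For the normal vector $w$:
$$\nabla_w L(w, b) = -\sum_{x_i \in M} y_i x_i$$
For the intercept $b$:
$$\nabla_b L(w, b) = -\sum_{x_i \in M} y_i$$
Therefore, if a misclassified point $(x_i, y_i)$ is selected at random, the update is
$$w \leftarrow w + \eta \, y_i x_i, \qquad b \leftarrow b + \eta \, y_i$$
where $\eta$ ($0 < \eta \le 1$) is the learning rate.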
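A single stochastic gradient step can be sketched as below. This assumes the standard perceptron update, in which a misclassified point $(x, y)$ moves the weights by $\eta y x$ and the intercept by $\eta y$; the function name `sgd_update` and the example values are hypothetical.

```python
def sgd_update(w, b, x, y, eta=1.0):
    """Apply one perceptron update for a misclassified point (x, y).

    Returns new parameters: w + eta * y * x and b + eta * y.
    """
    w_new = [w_j + eta * y * x_j for w_j, x_j in zip(w, x)]
    b_new = b + eta * y
    return w_new, b_new


# Starting from zero parameters, the positive point (3, 3) has margin 0,
# so it counts as misclassified and triggers an update.
w, b = sgd_update([0.0, 0.0], 0.0, [3.0, 3.0], 1)
print(w, b)  # prints [3.0, 3.0] 1.0
```

Each update moves the hyperplane toward the misclassified point, which reduces that point's contribution to the loss.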
Therefore, an algorithm suitable for implementation is:
- Choose initial values $w_0$ and $b_0$ at random.
- Select a data point $(x_i, y_i)$ from the training set.
- If $y_i (w \cdot x_i + b) \le 0$, use the update rule to update the parameters.
- Return to the second step until the perceptron misclassifies no training data.
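The steps above can be sketched as a short training loop. This is a minimal illustration rather than a reference implementation, and it makes a few assumptions: the parameters start at zero instead of random values so that the run is reproducible, the points are visited in a shuffled order each pass, and a hypothetical `max_epochs` cap stops the loop if the data is not linearly separable. The toy data set is also made up.

```python
import random


def train_perceptron(data, eta=1.0, max_epochs=1000, seed=0):
    """Train a perceptron with stochastic updates on misclassified points."""
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])  # assumption: zero initialisation
    b = 0.0
    for _ in range(max_epochs):
        order = list(range(len(data)))
        rng.shuffle(order)  # visit the training points in random order
        mistakes = 0
        for i in order:
            x, y = data[i]
            margin = y * (sum(w_j * x_j for w_j, x_j in zip(w, x)) + b)
            if margin <= 0:  # misclassified: apply the update rule
                w = [w_j + eta * y * x_j for w_j, x_j in zip(w, x)]
                b += eta * y
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors: stop
            break
    return w, b


# Hypothetical linearly separable toy data.
data = [([3.0, 3.0], 1), ([4.0, 3.0], 1), ([1.0, 1.0], -1)]
w, b = train_perceptron(data)
print(w, b)
```

The learned parameters depend on the order in which points are visited, so different seeds can give different separating hyperplanes for the same data.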
This completes the explanation of the perceptron's construction, learning strategy, and optimization algorithm.