Partial Derivatives
In the real world, and especially in Machine Learning, the functions we deal with rarely depend on just one variable. Our Loss Function depends on hundreds, even millions, of parameters (weights and biases). To navigate this high-dimensional space, we need the concept of a Partial Derivative.
1. Multi-Variable Functions in ML
Consider the simplest linear model. The predicted output $\hat{y}$ is a function of the input feature $x$, the weight $w$, and the bias $b$:

$$\hat{y} = wx + b$$

The Loss Function (e.g., Mean Squared Error) depends on the parameters $w$ and $b$:

$$L(w, b) = (y - \hat{y})^2 = (y - (wx + b))^2$$

The goal is to adjust both $w$ and $b$ simultaneously to minimize $L$. This requires us to know the rate of change of $L$ with respect to each parameter individually.
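To make this concrete, here is a minimal Python sketch for a single data point. It uses the chain-rule results $\frac{\partial L}{\partial w} = -2x\,(y - (wx + b))$ and $\frac{\partial L}{\partial b} = -2\,(y - (wx + b))$ and checks them against numerical finite differences; the specific values of $w$, $b$, $x$, $y$ are arbitrary illustrations.

```python
def loss(w, b, x, y):
    """Squared-error loss for a single data point: L = (y - (w*x + b))**2."""
    return (y - (w * x + b)) ** 2

def analytic_grads(w, b, x, y):
    """Partial derivatives of L with respect to w and b (chain rule)."""
    residual = y - (w * x + b)
    return -2 * x * residual, -2 * residual

def numeric_partial(f, args, i, h=1e-6):
    """Central finite difference of f with respect to its i-th argument."""
    plus = list(args); plus[i] += h
    minus = list(args); minus[i] -= h
    return (f(*plus) - f(*minus)) / (2 * h)

# Arbitrary example point and parameters (for illustration only)
w, b, x, y = 0.5, -1.0, 2.0, 3.0
dL_dw, dL_db = analytic_grads(w, b, x, y)

# The analytic and numeric partials should agree closely
assert abs(dL_dw - numeric_partial(loss, [w, b, x, y], 0)) < 1e-4
assert abs(dL_db - numeric_partial(loss, [w, b, x, y], 1)) < 1e-4
```

In gradient descent, these two partial derivatives are exactly what is computed (averaged over the dataset) to update $w$ and $b$ each step.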
2. What is a Partial Derivative?
A partial derivative of a multi-variable function is simply the derivative of that function with respect to one variable, while treating all other variables as constants.
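One way to see this definition in code: fixing all other variables turns a multi-variable function into an ordinary one-variable function, and the partial derivative is just that function's ordinary derivative. The sketch below illustrates this with a hypothetical example function `g` (not taken from the text), using a numerical one-variable derivative.

```python
from functools import partial

def g(x, y):
    """Hypothetical two-variable example: g(x, y) = x**2 * y."""
    return x ** 2 * y

def derivative(fn, x, h=1e-6):
    """Ordinary (one-variable) central-difference derivative."""
    return (fn(x + h) - fn(x - h)) / (2 * h)

# Freeze y = 3: g_fixed is now a function of x alone.
g_fixed = partial(g, y=3.0)

# Its ordinary derivative at x = 2 equals the partial dg/dx = 2*x*y = 12.
assert abs(derivative(g_fixed, 2.0) - 12.0) < 1e-4
```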
Notation
The partial derivative of a function $f$ with respect to $x$ is denoted by the curly symbol $\partial$:

$$\frac{\partial f}{\partial x}$$
3. Calculating the Partial Derivative
The calculation uses all the standard derivative rules, but with the assumption that everything not being differentiated is a constant.
Example
Let the function be $f(x, y) = x^2 + 3xy + y^3$.

A. Partial Derivative with respect to $x$:

We treat $y$ as a constant.

- $\frac{\partial}{\partial x}(x^2) = 2x$ (Power Rule)
- $\frac{\partial}{\partial x}(3xy) = 3y$ (Treat $3y$ as the constant coefficient of $x$)
- $\frac{\partial}{\partial x}(y^3) = 0$ (Treat $y^3$ as a constant)

Adding the terms gives $\frac{\partial f}{\partial x} = 2x + 3y$.
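As a sanity check, this short Python sketch (assuming the example function $f(x, y) = x^2 + 3xy + y^3$ from above) compares the analytic partial $\frac{\partial f}{\partial x} = 2x + 3y$ with a central finite difference in which $y$ is literally held fixed:

```python
def f(x, y):
    """Example two-variable function: f(x, y) = x**2 + 3*x*y + y**3."""
    return x ** 2 + 3 * x * y + y ** 3

def df_dx(x, y):
    """Analytic partial derivative with respect to x (y held constant)."""
    return 2 * x + 3 * y

def numeric_df_dx(x, y, h=1e-6):
    """Central finite difference in x only: y does not change."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

# Agreement at several (x, y) points confirms the analytic result
for x, y in [(1.0, 2.0), (-0.5, 3.0), (4.0, -1.0)]:
    assert abs(df_dx(x, y) - numeric_df_dx(x, y)) < 1e-4
```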
B. Partial Derivative with respect to $y$:

We treat $x$ as a constant.