The sigmoid function looks like this (made with a bit of MATLAB code):

    x = -10:0.1:10;
    s = 1./(1+exp(-x));
    figure; plot(x,s); title('sigmoid');

Alright, now let’s put on our calculus hats…

## Here’s how you compute the derivative of a sigmoid function

Rewrite the original equation with a negative exponent so we can use the power rule and the chain rule (I always forget the quotient rule).
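Written out, and assuming the standard sigmoid definition that the MATLAB snippets use (`s = 1./(1+exp(-x))`), the rewrite is just a negative exponent:

$$
s(x) = \frac{1}{1+e^{-x}} = \left(1+e^{-x}\right)^{-1}
$$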

Now we take the derivative:
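Here is a sketch of that step, assuming the sigmoid is written as $s(x) = (1+e^{-x})^{-1}$. The outer exponent comes down via the power rule, and the chain rule supplies the derivative of the inner term $1+e^{-x}$, which is $-e^{-x}$:

$$
\frac{d}{dx}s(x) = \frac{d}{dx}\left(1+e^{-x}\right)^{-1} = -\left(1+e^{-x}\right)^{-2}\cdot\frac{d}{dx}\left(1+e^{-x}\right) = -\left(1+e^{-x}\right)^{-2}\cdot\left(-e^{-x}\right)
$$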

Nice! We computed the derivative of a sigmoid! Okay, let’s simplify a bit.
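One way to write the simplification, starting from the chain-rule result $-(1+e^{-x})^{-2}\cdot(-e^{-x})$: the two minus signs cancel, and the negative exponent moves back into the denominator.

$$
\frac{d}{dx}s(x) = \left(1+e^{-x}\right)^{-2} e^{-x} = \frac{e^{-x}}{\left(1+e^{-x}\right)^{2}}
$$

This is the same expression that the MATLAB variable `ds` computes in the plotting code.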

Okay! That looks pretty good to me. Let’s quickly plot it and see if it looks reasonable. Again here’s some MATLAB code to check:

    x = -10:0.1:10;                      % Test values.
    s = 1./(1+exp(-x));                  % Sigmoid.
    ds = (exp(-x))./((1+exp(-x)).^2);    % Derivative of sigmoid.
    figure;
    plot(x,s,'b*'); hold on;
    plot(x,ds,'r+');
    legend('sigmoid', 'derivative-sigmoid','location','best')

Looks like a derivative. Good! But wait… there’s more!

If you’ve been reading some of the neural net literature, you’ve probably come across text that says the derivative of a sigmoid `s(x)` is equal to `s'(x) = s(x)(1-s(x))`.

*[note that $\frac{d}{dx}s(x)$ and s'(x) are the same thing, just different notation.]*

*[also note that Andrew Ng writes f'(z) = f(z)(1 - f(z)), where f(z) is the sigmoid function, which is exactly what we are doing here.]*

So your next question should be: is the derivative we calculated earlier equivalent to `s'(x) = s(x)(1-s(x))`?

So, using Andrew Ng’s notation…

## How does the derivative of a sigmoid f(z) equal f(z)(1-f(z))?

Swapping with our notation, we can ask the equivalent question:

## How does the derivative of a sigmoid s(x) equal s(x)(1-s(x))?

Okay we left off with…
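For reference, this is the derivative in the form that the MATLAB variable `ds` computes:

$$
s'(x) = \frac{e^{-x}}{\left(1+e^{-x}\right)^{2}}
$$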

This part is not intuitive… but let’s add and subtract a 1 to the numerator (this does not change the equation).
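A sketch of that step, assuming we start from $s'(x) = \frac{e^{-x}}{(1+e^{-x})^2}$: adding and subtracting 1 lets the numerator split into a piece that cancels against the denominator and a leftover $-1$.

$$
\frac{e^{-x}}{\left(1+e^{-x}\right)^{2}} = \frac{1+e^{-x}-1}{\left(1+e^{-x}\right)^{2}} = \frac{1+e^{-x}}{\left(1+e^{-x}\right)^{2}} - \frac{1}{\left(1+e^{-x}\right)^{2}} = \frac{1}{1+e^{-x}} - \frac{1}{\left(1+e^{-x}\right)^{2}}
$$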

// factor out a $\frac{1}{1+e^{-x}}$
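Pulling the common factor out of both terms, the step can be written as:

$$
\frac{1}{1+e^{-x}} - \frac{1}{\left(1+e^{-x}\right)^{2}} = \frac{1}{1+e^{-x}}\left(1 - \frac{1}{1+e^{-x}}\right)
$$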

Hmmm… look at that! There are actually two sigmoid functions in there. Recall that the sigmoid function is $s(x) = \frac{1}{1+e^{-x}}$. Let’s replace them with s(x).
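With each $\frac{1}{1+e^{-x}}$ swapped for $s(x)$, the factored expression should read:

$$
s'(x) = \frac{1}{1+e^{-x}}\left(1 - \frac{1}{1+e^{-x}}\right) = s(x)\left(1 - s(x)\right)
$$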

Just like Prof Ng said… 🙂

And for a sanity check, do they both show the same function?

    x = -10:0.1:10;                      % Test values.
    s = 1./(1+exp(-x));                  % Sigmoid.
    ds = (exp(-x))./((1+exp(-x)).^2);    % Derivative of sigmoid.
    ds1 = s.*(1-s);                      % Another simpler way to compute the derivative of a sigmoid.
    figure;
    plot(x,ds,'r+'); hold on;
    plot(x,ds1, 'go');
    legend('(e^{-x})/((1+e^{-x})^2)','(s(x))(1-s(x))','location','best');
    title('derivative of sigmoid')

Yes! They perfectly match!
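Eyeballing overlapping markers is a bit loose, so here is a minimal sketch of a numeric check to go with the plot. It reuses the same test grid and the same two formulas; the variable name `max_abs_diff` is my own choice for illustration and is not part of the derivation.

```matlab
x = -10:0.1:10;                      % Same test values as in the plots.
s = 1./(1+exp(-x));                  % Sigmoid.
ds  = exp(-x)./((1+exp(-x)).^2);     % Derivative from the calculus steps.
ds1 = s.*(1-s);                      % Derivative written as s(x)(1-s(x)).

% Largest pointwise gap between the two forms (name is an assumption).
max_abs_diff = max(abs(ds - ds1));

% Print the gap so it can be compared against floating-point rounding.
fprintf('max |ds - ds1| = %g\n', max_abs_diff);
```

If the two expressions really are the same function, the printed gap should be tiny, on the order of floating-point rounding rather than anything visible on a plot.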

So there you go. Hopefully this satisfies your mathematical curiosity about why the derivative of a sigmoid s(x) is equal to s'(x) = s(x)(1-s(x)).
