The sigmoid function looks like this (made with a bit of MATLAB code):

```matlab
x = -10:0.1:10;        % Test values.
s = 1./(1+exp(-x));    % Sigmoid.
figure; plot(x,s); title('sigmoid');
```

Alright, now let’s put on our calculus hats…

## Here’s how you compute the derivative of a sigmoid function

Rewrite the original equation with a negative exponent so we can use the power rule and the chain rule (I always forget the quotient rule).
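Written out, and assuming the usual definition of the sigmoid (the same expression the MATLAB code above computes as `1./(1+exp(-x))`), the rewrite would be:

$$
s(x) = \frac{1}{1+e^{-x}} = \left(1+e^{-x}\right)^{-1}
$$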

Now we take the derivative:
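As a sketch of that step, apply the power rule to the outer $(\cdot)^{-1}$ and the chain rule to the inner $1+e^{-x}$:

$$
\frac{d}{dx}s(x) = \frac{d}{dx}\left(1+e^{-x}\right)^{-1} = -\left(1+e^{-x}\right)^{-2}\cdot\frac{d}{dx}\left(1+e^{-x}\right) = -\left(1+e^{-x}\right)^{-2}\left(-e^{-x}\right)
$$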

Nice! We computed the derivative of a sigmoid! Okay, let’s simplify a bit.
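The two minus signs cancel, which should leave the same expression that the MATLAB check below computes as `ds`:

$$
\frac{d}{dx}s(x) = \frac{e^{-x}}{\left(1+e^{-x}\right)^{2}}
$$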

Okay! That looks pretty good to me. Let’s quickly plot it and see if it looks reasonable. Again here’s some MATLAB code to check:

```matlab
x = -10:0.1:10;                     % Test values.
s = 1./(1+exp(-x));                 % Sigmoid.
ds = (exp(-x))./((1+exp(-x)).^2);   % Derivative of sigmoid.
figure;
plot(x,s,'b*'); hold on;
plot(x,ds,'r+');
legend('sigmoid', 'derivative-sigmoid','location','best')
```

Looks like a derivative. Good! But wait… there’s more!

If you’ve been reading some of the neural net literature, you’ve probably come across text that says the derivative of a sigmoid `s(x)` is equal to `s'(x) = s(x)(1-s(x))`.

*[Note that $\frac{d}{dx}s(x)$ and s'(x) are the same thing, just different notation.]*

*[Also note that Andrew Ng writes f'(z) = f(z)(1 - f(z)), where f(z) is the sigmoid function, which is exactly what we are doing here.]*

So your next question should be: is the derivative we calculated earlier equivalent to `s'(x) = s(x)(1-s(x))`?

So, using Andrew Ng’s notation…

## How does the derivative of a sigmoid f(z) equal f(z)(1-f(z))?

Swapping with our notation, we can ask the equivalent question:

## How does the derivative of a sigmoid s(x) equal s(x)(1-s(x))?

Okay we left off with…
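That is, the simplified derivative from the first section, which is the expression the MATLAB check computes as `ds`:

$$
\frac{d}{dx}s(x) = \frac{e^{-x}}{\left(1+e^{-x}\right)^{2}}
$$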

This part is not intuitive… but let’s add and subtract a 1 to the numerator (this does not change the equation).
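A sketch of that step: the numerator becomes $1 + e^{-x} - 1$, and the fraction can then be split in two:

$$
\frac{d}{dx}s(x) = \frac{1+e^{-x}-1}{\left(1+e^{-x}\right)^{2}} = \frac{1+e^{-x}}{\left(1+e^{-x}\right)^{2}} - \frac{1}{\left(1+e^{-x}\right)^{2}} = \frac{1}{1+e^{-x}} - \frac{1}{\left(1+e^{-x}\right)^{2}}
$$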

Then factor out a $\frac{1}{1+e^{-x}}$.
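Pulling the common factor $\frac{1}{1+e^{-x}}$ out of both terms would give:

$$
\frac{d}{dx}s(x) = \frac{1}{1+e^{-x}}\left(1 - \frac{1}{1+e^{-x}}\right)
$$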

Hmmm… look at that! There are actually two sigmoid functions there. Recall that the sigmoid function is $s(x) = \frac{1}{1+e^{-x}}$. Let’s replace them with s(x).
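With $s(x) = \frac{1}{1+e^{-x}}$ substituted for both factors, the result should read:

$$
\frac{d}{dx}s(x) = s(x)\left(1 - s(x)\right)
$$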

Just like Prof Ng said… 🙂

And for a sanity check, do they both show the same function?

```matlab
x = -10:0.1:10;                     % Test values.
s = 1./(1+exp(-x));                 % Sigmoid.
ds = (exp(-x))./((1+exp(-x)).^2);   % Derivative of sigmoid.
ds1 = s.*(1-s);                     % Another simpler way to compute the derivative of a sigmoid.
figure;
plot(x,ds,'r+'); hold on;
plot(x,ds1,'go');
legend('(e^{-x})/((1+e^{-x})^2)','(s(x))(1-s(x))','location','best');
title('derivative of sigmoid')
```

Yes! They perfectly match!
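A plot can hide small differences, so here is a rough numerical version of the same check. This is only a sketch: the names `max_diff`, `sig`, `h`, `ds_num`, and `max_fd_diff` are made up for illustration, and the step size `h = 1e-5` is an arbitrary choice. It compares the two formulas point by point, then compares both against a central finite-difference estimate of the slope.

```matlab
x   = -10:0.1:10;                     % Same test values as above.
s   = 1./(1+exp(-x));                 % Sigmoid.
ds  = (exp(-x))./((1+exp(-x)).^2);    % Derivative, first form.
ds1 = s.*(1-s);                       % Derivative, s(x)(1-s(x)) form.

% Largest pointwise gap between the two closed-form expressions.
max_diff = max(abs(ds - ds1));
fprintf('max |ds - ds1|     = %g\n', max_diff);

% Central finite difference as an independent estimate of the slope.
h      = 1e-5;                        % Assumed step size, not tuned.
sig    = @(t) 1./(1+exp(-t));         % Sigmoid as an anonymous function.
ds_num = (sig(x+h) - sig(x-h))/(2*h);
max_fd_diff = max(abs(ds1 - ds_num));
fprintf('max |ds1 - ds_num| = %g\n', max_fd_diff);
```

Both printed values should be tiny. The first is limited only by floating-point rounding, while the second also carries the finite-difference approximation error, so expect it to be noticeably larger but still very close to zero.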

So there you go. Hopefully this satisfies your mathematical curiosity about why the derivative of a sigmoid s(x) is equal to s'(x) = s(x)(1-s(x)).

Excellent walkthrough. For a guy just getting into activation fn’s, this really helps! Thanks so much!

You’re welcome Sid!

EASILY, the best blog post on finding the derivative of a sigmoid function. You didn’t leave any details out. Took me forever to wrap my head around this. The +1 – 1 thing is definitely not intuitive. Thanks for writing this.

happy to hear it helped!

Thanks! really helped with Prof. Hinton’s NNML Coursera lecture I was struggling to understand.

Glad it helped! It wasn’t obvious to me either 🙂

Superb!

Exactly what I was looking for!

🙂

Very detailed. Thank you !!

You’re welcome!

Thanks much. I was breaking my head on this today.

Glad it helped clear things up!

excellent. Thanks!