The reason is the following. We use the notation
$$\theta x^i := \theta_0 + \theta_1 x^i_1 + \dots + \theta_p x^i_p.$$
Then
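$$\log h_\theta(x^i) = \log\frac{1}{1+e^{-\theta x^i}} = -\log\left(1+e^{-\theta x^i}\right),$$
$$\log\left(1-h_\theta(x^i)\right) = \log\left(1-\frac{1}{1+e^{-\theta x^i}}\right) = \log\left(e^{-\theta x^i}\right) - \log\left(1+e^{-\theta x^i}\right) = -\theta x^i - \log\left(1+e^{-\theta x^i}\right),$$
[ this used $1 = \frac{1+e^{-\theta x^i}}{1+e^{-\theta x^i}}$; the 1's in the numerator cancel, and then we used $\log(x/y) = \log(x) - \log(y)$ ]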
Recall that our original cost function has the form
$$J(\theta) = -\frac{1}{m}\sum_{i=1}^m \left[ y^i \log\left(h_\theta(x^i)\right) + (1-y^i)\log\left(1-h_\theta(x^i)\right) \right].$$
Plugging in the two simplified expressions above, we obtain
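$$J(\theta) = -\frac{1}{m}\sum_{i=1}^m \left[ -y^i \log\left(1+e^{-\theta x^i}\right) + (1-y^i)\left(-\theta x^i - \log\left(1+e^{-\theta x^i}\right)\right) \right],$$
which can be simplified to
$$J(\theta) = -\frac{1}{m}\sum_{i=1}^m \left[ y^i \theta x^i - \theta x^i - \log\left(1+e^{-\theta x^i}\right) \right] = -\frac{1}{m}\sum_{i=1}^m \left[ y^i \theta x^i - \log\left(1+e^{\theta x^i}\right) \right], \qquad (*)$$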
where the second equality follows from
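$$-\theta x^i - \log\left(1+e^{-\theta x^i}\right) = -\left[ \log e^{\theta x^i} + \log\left(1+e^{-\theta x^i}\right) \right] = -\log\left(1+e^{\theta x^i}\right).$$
[ we used $\log(x) + \log(y) = \log(xy)$ ]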
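If you want to convince yourself that (∗) really is the same function as the original cost, a quick numerical comparison helps. The following is only a minimal sketch in plain Python: the data `X`, `y`, the parameter vector `theta`, and all function names are made up for illustration, with `theta[0]` playing the role of θ0.

```python
import math

def sigmoid(z):
    # h_theta expressed as a function of z = theta x^i
    return 1.0 / (1.0 + math.exp(-z))

def theta_x(theta, x):
    # theta[0] is the intercept theta_0; x holds the p feature values of one sample
    return theta[0] + sum(t * v for t, v in zip(theta[1:], x))

def cost_original(theta, X, y):
    # Original form: -(1/m) * sum[ y log h + (1 - y) log(1 - h) ]
    m = len(X)
    total = 0.0
    for xi, yi in zip(X, y):
        h = sigmoid(theta_x(theta, xi))
        total += yi * math.log(h) + (1 - yi) * math.log(1 - h)
    return -total / m

def cost_simplified(theta, X, y):
    # Simplified form (*): -(1/m) * sum[ y * (theta x) - log(1 + e^{theta x}) ]
    m = len(X)
    total = 0.0
    for xi, yi in zip(X, y):
        z = theta_x(theta, xi)
        total += yi * z - math.log(1 + math.exp(z))
    return -total / m

# Hypothetical toy data: 4 samples, 2 features each
X = [[0.5, -1.2], [1.5, 0.3], [-0.7, 2.0], [2.2, -0.4]]
y = [0, 1, 1, 0]
theta = [0.1, -0.3, 0.8]

difference = abs(cost_original(theta, X, y) - cost_simplified(theta, X, y))
print(difference)
```

The printed difference should be at the level of floating-point rounding. For very large |θx^i| the original form can break down numerically (`math.log(1 - h)` hits `log(0)`), which is one practical reason to prefer a form like (∗).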
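All you need now is to compute the partial derivatives of (∗) w.r.t. $\theta_j$. As
$$\frac{\partial}{\partial\theta_j}\, y^i \theta x^i = y^i x^i_j,$$
$$\frac{\partial}{\partial\theta_j} \log\left(1+e^{\theta x^i}\right) = \frac{x^i_j e^{\theta x^i}}{1+e^{\theta x^i}} = x^i_j h_\theta(x^i),$$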
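the thesis follows: differentiating (∗) term by term gives
$$\frac{\partial J(\theta)}{\partial\theta_j} = -\frac{1}{m}\sum_{i=1}^m \left[ y^i x^i_j - x^i_j h_\theta(x^i) \right] = \frac{1}{m}\sum_{i=1}^m \left( h_\theta(x^i) - y^i \right) x^i_j.$$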
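The resulting gradient can also be checked numerically. The sketch below compares the gradient obtained from the two partial derivatives above with a central finite-difference approximation of (∗). It is an illustration under stated assumptions, not a reference implementation: the toy data and function names are invented, and for the intercept θ0 it uses the usual convention x^i_0 = 1, which the derivation above does not spell out.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def theta_x(theta, x):
    # theta[0] is the intercept theta_0; x holds the p feature values of one sample
    return theta[0] + sum(t * v for t, v in zip(theta[1:], x))

def cost(theta, X, y):
    # Simplified cost (*): -(1/m) * sum[ y * (theta x) - log(1 + e^{theta x}) ]
    m = len(X)
    total = 0.0
    for xi, yi in zip(X, y):
        z = theta_x(theta, xi)
        total += yi * z - math.log(1 + math.exp(z))
    return -total / m

def grad_analytic(theta, X, y):
    # dJ/dtheta_j = (1/m) * sum_i (h_theta(x^i) - y^i) * x^i_j
    m = len(X)
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        err = sigmoid(theta_x(theta, xi)) - yi
        full_x = [1.0] + list(xi)  # assumption: x^i_0 = 1 for the intercept
        for j, xij in enumerate(full_x):
            grad[j] += err * xij / m
    return grad

def grad_numeric(theta, X, y, eps=1e-6):
    # Central finite differences of the cost, one coordinate at a time
    grad = []
    for j in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[j] += eps
        down[j] -= eps
        grad.append((cost(up, X, y) - cost(down, X, y)) / (2 * eps))
    return grad

# Hypothetical toy data: 4 samples, 2 features each
X = [[0.5, -1.2], [1.5, 0.3], [-0.7, 2.0], [2.2, -0.4]]
y = [0, 1, 1, 0]
theta = [0.1, -0.3, 0.8]

g_analytic = grad_analytic(theta, X, y)
g_numeric = grad_numeric(theta, X, y)
print(max(abs(a - b) for a, b in zip(g_analytic, g_numeric)))
```

The two gradients should agree up to the finite-difference error, which supports the signs and indices in the derivation.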