I think you could adopt one of two approaches.
The first, and probably the most practical, is to replace the histogram entries that are zero with
a very small positive number such as $MachineEpsilon (the gap between 1.0 and the next machine number, roughly 2.2*10^-16).
So after you calculate the histograms, but before you calculate the mutual information, do:
histRed = histRed /. 0 -> $MachineEpsilon
histGreen = histGreen /. 0 -> $MachineEpsilon
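One thing to watch for, sketched below on a made-up list (toyHist is a hypothetical stand-in, not your actual histogram): the pattern 0 only matches the exact integer zero. If your histograms have been normalised to machine reals, the zeros will be 0. and the rule above will skip them, so you may want to match both forms.

```mathematica
(* Hypothetical toy histogram mixing an exact zero and a machine-real zero *)
toyHist = {3, 0, 5, 0., 2};

(* The pattern 0 matches only the exact integer 0, so the 0. entry is left alone *)
toyHist /. 0 -> $MachineEpsilon

(* Matching both exact and machine-real zeros *)
toyHist /. (0 | 0.) -> $MachineEpsilon
```

If your histograms only ever hold integer counts, the simpler rule is enough.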
The other approach would be to split the logarithm in the second part of the mutual information, going from
Log[p(x,y)/(p(x) p(y))]
to
Log[p(x,y)] - Log[p(x) p(y)]
This removes the division by zero, but leaves you with an infinite logarithm (Log[0] evaluates to -Infinity), which is theoretically correct but perhaps not that useful.
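As a rough illustration of the split form, here is a minimal sketch. The helper name logRatio and the probability values are made up for the example and are not taken from your code.

```mathematica
(* Hypothetical helper: the logarithm written as a difference instead of a quotient *)
logRatio[pxy_, px_, py_] := Log[pxy] - Log[px py]

(* Non-zero probabilities: same value as Log[pxy/(px py)] *)
logRatio[0.4, 0.5, 0.5]

(* Exact zero joint probability: no division is performed, but Log[0] is -Infinity *)
logRatio[0, 1/2, 1/2]
```

So the infinity is still there. It just shows up as Log[0] rather than as a division-by-zero message.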
Hope that helps.