Google has a patent on the dropout layer. So if you don't want to violate it, you might as well just use a random noise layer instead. Results are pretty similar on MNIST.

    convolution,1,28,28,200,8,8,0.5,-0.001
    max,200,21,21,7,7
    noise,1800,0.5
    matrix,1800,130,0.5,-0.001
    sigmoid,130
    matrix,130,10,0.5,-0.001
    softmax,10

Tested with WideOpenThoughts.
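
To make the idea concrete, here is a rough sketch of what a noise layer in the position of `noise,1800,0.5` might do. This is not WideOpenThoughts' code. The function names `noise_layer` and `flattened_size` are made up for illustration. Reading `0.5` as the half-width of zero-mean uniform noise added to each activation during training is an assumption, and so is reading the config fields as sizes (a 1x28x28 input, 200 filters of 8x8, 7x7 pooling). Under that reading the sizes in the config line up: 28 - 8 + 1 = 21, 21 / 7 = 3, and 200 * 3 * 3 = 1800.

```python
import random


def noise_layer(activations, amount=0.5, training=True, rng=random):
    """Additive-noise regularizer (hypothetical sketch, not the tool's code).

    Assumption: `amount` is the half-width of zero-mean uniform noise that is
    added to every activation while training. Unlike dropout, no unit is
    zeroed out. At inference time the layer passes values through unchanged.
    """
    if not training:
        return list(activations)
    return [a + rng.uniform(-amount, amount) for a in activations]


def flattened_size(channels, height, width, kernel, pool):
    """Number of values after an unpadded convolution and non-overlapping pooling."""
    conv_h = height - kernel + 1
    conv_w = width - kernel + 1
    return channels * (conv_h // pool) * (conv_w // pool)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    hidden = [0.0] * flattened_size(200, 28, 28, kernel=8, pool=7)
    noisy = noise_layer(hidden, amount=0.5, training=True, rng=rng)
    print(len(noisy))
    print(noise_layer([1.0, 2.0], training=False))
```

Because the noise has zero mean, this sketch needs no rescaling at test time. A multiplicative or Gaussian variant would be just as plausible a reading of the config line.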