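when searching for 8 by 8 convolutions: each such convolution has only 64 free parameters, so the output of any of them can be produced as a linear combination of the outputs of 64 random convolutions that are never trained (provided the 64 random kernels are linearly independent, which random initialization gives almost surely).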
Config:
convolution,1,28,28,64,8,8,0.5,0
convolution,1,64,441,200,64,1,0.5,-0.001
max,200,21,21,7,7
matrix,1800,130,0.5,-0.001
sigmoid,130
dropout,130,0.5
matrix,130,10,0.5,-0.001
softmax,10
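The layer sizes in this config can be sanity-checked with a few lines of arithmetic. The sketch below assumes a particular reading of the fields (input channels, input height and width, number of filters, kernel height and width for the convolution lines; channels, height, width and pool size for the max line), and assumes valid convolutions and non-overlapping pooling. None of this is stated in the config itself, and the helper name valid_conv_size is made up for illustration. Under that reading the numbers do line up with the 1800 inputs of the first matrix layer.

```python
# Assumed field layout (a guess, not confirmed by the config format itself):
#   convolution,<in_channels>,<in_h>,<in_w>,<filters>,<k_h>,<k_w>,<init>,<rate>
#   max,<channels>,<in_h>,<in_w>,<pool_h>,<pool_w>
#   matrix,<inputs>,<outputs>,<init>,<rate>

def valid_conv_size(size, kernel):
    """Output length of a 'valid' convolution along one axis (hypothetical helper)."""
    return size - kernel + 1

# convolution,1,28,28,64,8,8,...: 64 filters of 8x8 over a 28x28 image
side = valid_conv_size(28, 8)
positions = side * side          # spatial positions per filter

# convolution,1,64,441,200,64,1,...: a 64x1 kernel over the 64 x 441 stack,
# i.e. each of the 200 filters mixes the 64 first-layer outputs per position
mix_h = valid_conv_size(64, 64)
mix_w = valid_conv_size(positions, 1)

# max,200,21,21,7,7: non-overlapping 7x7 pooling on 200 maps of 21x21
pooled_side = side // 7
flat = 200 * pooled_side * pooled_side

print(side, positions, (mix_h, mix_w), pooled_side, flat)
```

If this reading is right, the second "convolution" is effectively a per-position linear mix of the 64 random filter outputs, which is what the explanation relies on.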
The first convolution is not trained, and only 64 filters are needed (an 8x8 kernel has 64 parameters). The next convolution then finds the filters it needs by combining the outputs of the first one.
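The linear-combination claim can be checked numerically. The following is a minimal NumPy sketch, not the code used for the results here: the image, the kernels, and the helper conv_valid are made up for illustration, and it assumes the 64 random kernels are linearly independent (it needs NumPy 1.20 or newer for sliding_window_view). It solves for the mixing weights of one arbitrary 8x8 filter and compares the mixed output with the direct one.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_valid(image, kernel):
    """'Valid' 2-D cross-correlation of an image with a kernel (illustrative helper)."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

image = rng.standard_normal((28, 28))             # stand-in for one input image
random_kernels = rng.standard_normal((64, 8, 8))  # fixed filters, never trained
target_kernel = rng.standard_normal((8, 8))       # any 8x8 filter we might want

# Find weights w with sum_i w[i] * random_kernels[i] == target_kernel.
basis = random_kernels.reshape(64, 64)            # one flattened kernel per row
weights = np.linalg.solve(basis.T, target_kernel.ravel())

# Mix the outputs of the random filters with those weights.
random_outputs = np.stack([conv_valid(image, k) for k in random_kernels])
mixed = np.tensordot(weights, random_outputs, axes=1)

# Compare against applying the target filter directly.
direct = conv_valid(image, target_kernel)
max_error = np.abs(mixed - direct).max()
print(random_outputs.shape, mixed.shape, max_error)
```

In the network the weights are not solved for like this; the second layer is simply trained, and the sketch only shows that suitable weights exist for every 8x8 filter.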
Results (above 99%):