Fractional Max-Pooling Benjamin Graham Dept of Statistics, University of Warwick, CV4 7AL, UK b.graham@warwick.ac.uk May 13, 2015 Abstract Convolutional networks almost always incorporate some form of spatial pooling, and very often it is α× α max-pooling with α = 2. Max-pooling act on the hidden layers of the network, reducing their size by an integer multiplicative factor α. The amazing by-product of discarding 75% of your data is that you build into the network a degree of invariance with respect to translations and elastic distortions. However, if you simply alternate convolutional layers with max-pooling layers, performance is limited due to the rapid reduction in spatial size, and the disjoint nature of the pooling regions. We have formulated a fractional version of max- pooling where α is allowed to take non-integer values. Our version of max-pooling is stochastic as there are lots of different ways of constructing suitable pooling regions. We find that our form