See the function log_prb_labels() for the calculation of the logarithm of the bag probability. Although we only demonstrate its application to multiple object detection and localization, this function accepts a probability tensor P of other orders, as long as the size of its first dimension is number_of_classes + 1.
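To make the shape requirement concrete, here is a minimal sketch of building probability tensors of two different orders whose first dimension has size number_of_classes + 1. It uses NumPy only and does not call log_prb_labels(), since its exact signature is not shown here. The helper make_probability_tensor, the spatial sizes, and the use of a softmax over the first dimension are assumptions made for illustration, not part of this repository.

```python
import numpy as np


def make_probability_tensor(logits):
    """Hypothetical helper: softmax over axis 0 (the class dimension)."""
    # Subtract the per-position maximum for numerical stability.
    shifted = logits - logits.max(axis=0, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=0, keepdims=True)


number_of_classes = 10  # e.g. the ten digits; an assumption for this sketch
rng = np.random.default_rng(0)

# Order-3 tensor: (number_of_classes + 1, height, width), e.g. a detection map.
P_map = make_probability_tensor(
    rng.normal(size=(number_of_classes + 1, 4, 6)))

# Order-2 tensor: (number_of_classes + 1, length), e.g. a 1-D sequence.
P_seq = make_probability_tensor(
    rng.normal(size=(number_of_classes + 1, 8)))

for name, P in (("P_map", P_map), ("P_seq", P_seq)):
    # Every position holds a distribution over number_of_classes + 1 entries.
    assert P.shape[0] == number_of_classes + 1
    assert np.allclose(P.sum(axis=0), 1.0)
    print(name, "shape:", P.shape)
```

Both tensors satisfy the stated condition on the first dimension, so under that condition either order should be acceptable as P.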
We used this optimization code to train our models, but you may try your preferred optimization method instead.
We demonstrated its application to image recognition and street view house number transcription using all-convolutional networks. On MNIST, we obtained a test classification error rate slightly higher than 0.3%. On SVHN, we obtained a sequence transcription error rate slightly higher than 5% with a model having about 7.5M coefficients (vs. about 4% for this specialized and strongly supervised model having about 51M coefficients). Our learned model may be able to read the numbers directly in many of the original images, as shown below. The preprocessed SVHN dataset and some pretrained models are here.