I will present a novel learning framework based on stochastic Bregman iterations. It allows to train sparse neural networks with an inverse scale space approach, starting from a very sparse network and gradually adding significant parameters. Apart from a baseline algorithm called LinBreg, I will also speak about an accelerated version using momentum, and AdaBreg, which is a Bregmanized generalization of the Adam algorithm. I will present a statistically profound sparse parameter initialization strategy, stochastic convergence analysis of the loss decay, and additional convergence proofs in the convex regime. The Bregman learning framework can also be applied to Neural Architecture Search and can, for instance, unveil autoencoder architectures for denoising or deblurring tasks.