Training a Model with Limited Memory using Mixed Precision and Gradient Checkpointing - MachineLearningMastery.com

Source: MachineLearningMastery.com

Training a language model is memory-intensive, not only because the model itself is large but also because training data batches often contain long sequences. Training a model with limited memory is challenging. In this article, you will learn techniques that enable model training in memory-constrained environments. In particular, you will learn about: Low-precision floating-point numbers […]
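
To make the idea of low-precision floating-point numbers concrete, here is a minimal sketch using only Python's standard library. It is not taken from the article: the helper names `roundtrip` and `param_bytes` and the parameter count are hypothetical, chosen for illustration. It compares the storage cost and rounding error of IEEE 754 single precision (32-bit) and half precision (16-bit).

```python
import struct

# struct format codes: "f" is IEEE 754 single precision, "e" is half precision
FP32, FP16 = "<f", "<e"


def roundtrip(value: float, fmt: str) -> float:
    """Store a value in the given format and read it back (hypothetical helper)."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def param_bytes(num_params: int, fmt: str) -> int:
    """Bytes needed to hold num_params values in the given format (hypothetical helper)."""
    return num_params * struct.calcsize(fmt)


n = 1_000_000  # assumed parameter count, for illustration only
print("float32 bytes:", param_bytes(n, FP32))
print("float16 bytes:", param_bytes(n, FP16))

# Half precision keeps fewer mantissa bits, so the stored value drifts further from 0.1
print("0.1 as float32:", roundtrip(0.1, FP32))
print("0.1 as float16:", roundtrip(0.1, FP16))
```

Halving the bytes per value halves the memory needed for the same number of values, at the cost of a larger rounding error. That trade-off is the reason lower precision is typically used selectively rather than everywhere.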