Embedding Training With 1% GPU Memory and 100 Times Less Budget, an Open Source Solution for Super-Large Recommendation Model Training on a Single GPU | Synced

Source: Synced | AI Technology & Industry Review

Colossal-AI has successfully used a heterogeneous training strategy to increase the trainable parameter capacity of NLP models by hundreds of times on the same hardware. Experimental results show that only 1–5% of the embedding parameters need to be kept in GPU memory, while excellent end-to-end training speed is still maintained.
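The core idea behind keeping only a few percent of the embedding table on the GPU is a software-managed cache: the full table resides in host (CPU) memory, and only the most frequently accessed ("hot") rows are copied into a small fixed-size device buffer, with least-recently-used rows evicted on overflow. The sketch below illustrates this caching scheme in plain NumPy; all class and method names are illustrative assumptions, not Colossal-AI's actual API.

```python
import numpy as np
from collections import OrderedDict

class CachedEmbedding:
    """Minimal sketch of a software-managed embedding cache.

    The full table lives in host ("CPU") memory; only a small fraction of
    hot rows is held in a fixed-size "device" buffer, with LRU eviction.
    Names and structure are illustrative, not Colossal-AI's real API.
    """

    def __init__(self, num_rows, dim, cache_rows):
        # Full embedding table, resident in host memory.
        self.host_table = np.random.randn(num_rows, dim).astype(np.float32)
        # Small buffer standing in for GPU memory (e.g. 1-5% of the rows).
        self.cache = np.empty((cache_rows, dim), dtype=np.float32)
        self.slot_of = OrderedDict()   # row id -> cache slot, ordered by recency
        self.free_slots = list(range(cache_rows))

    def lookup(self, ids):
        """Gather embedding rows, pulling misses from host into the cache."""
        out = np.empty((len(ids), self.cache.shape[1]), dtype=np.float32)
        for i, rid in enumerate(ids):
            if rid in self.slot_of:            # cache hit: refresh recency
                self.slot_of.move_to_end(rid)
            else:                              # miss: copy row host -> device
                if self.free_slots:
                    slot = self.free_slots.pop()
                else:                          # evict least-recently-used row
                    _, slot = self.slot_of.popitem(last=False)
                self.cache[slot] = self.host_table[rid]
                self.slot_of[rid] = slot
            out[i] = self.cache[self.slot_of[rid]]
        return out
```

With, say, 16 cached rows against a 1,000-row table, the device buffer holds about 1.6% of the parameters, in line with the 1–5% figure quoted above; recommendation workloads make this effective because embedding accesses are highly skewed toward a small set of popular ids.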