Hugging Face Uses Block Pruning to Speed Up Transformer Training While Maintaining Accuracy | Synced



Source: Synced | AI Technology & Industry Review

A research team from Hugging Face introduces a block pruning approach that produces models that are both small and fast: the method learns to eliminate whole components of the original model, dropping a large number of attention heads in the process.
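To give a flavor of the idea, the sketch below shows a toy multi-head attention layer in which each head's output is scaled by a learnable scalar gate; heads whose gate collapses toward zero during training can then be removed entirely, shrinking the weight matrices. The dimensions, the gating scheme, and the threshold are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
import torch
import torch.nn as nn


class GatedMultiheadAttention(nn.Module):
    """Toy multi-head attention with one learnable scalar gate per head.

    After training, heads whose gate magnitude stays below a threshold can
    be pruned as whole blocks, reducing both model size and compute.
    """

    def __init__(self, d_model=64, n_heads=8):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # One gate per head, initialized open; a sparsity penalty during
        # training would push unneeded gates toward zero (assumed setup).
        self.gates = nn.Parameter(torch.ones(n_heads))

    def forward(self, x):
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)

        def split(z):
            # (b, t, d) -> (b, n_heads, t, d_head)
            return z.view(b, t, self.n_heads, self.d_head).transpose(1, 2)

        q, k, v = split(q), split(k), split(v)
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.d_head ** 0.5, dim=-1)
        ctx = attn @ v                                # (b, n_heads, t, d_head)
        ctx = ctx * self.gates.view(1, -1, 1, 1)      # gate each head's output
        return self.out(ctx.transpose(1, 2).reshape(b, t, d))


def surviving_heads(gates, threshold=0.1):
    """Indices of heads whose gate magnitude exceeds the pruning threshold."""
    return [i for i, g in enumerate(gates.tolist()) if abs(g) > threshold]
```

In the Transformers library itself, a comparable structured removal is exposed via `PreTrainedModel.prune_heads`, which takes a mapping from layer index to the head indices to drop.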