Running Fast Transformers on CPUs: Intel Approach Achieves Significant Speed Ups and SOTA Performance | Synced
Source: Synced | AI Technology & Industry Review
In the new paper Fast DistilBERT on CPUs, researchers from Intel Corporation and Intel Labs propose a pipeline and a hardware-aware extreme compression technique for creating and running fast transformer models on CPUs. The approach achieves significant speedups and SOTA performance in production environments.