Applying Linearly Scalable Transformers to Model Longer Protein Sequences

Source: Synced | AI Technology & Industry Review

Researchers have proposed a new Transformer architecture called the "Performer," built on a mechanism they call fast attention via orthogonal random features (FAVOR). FAVOR approximates standard softmax attention with time and memory costs that scale linearly, rather than quadratically, with sequence length, which is what makes it feasible to model much longer protein sequences.
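For intuition, here is a minimal NumPy sketch of the kernelized linear-attention idea behind FAVOR. It is a simplified stand-in, not Google's implementation: it uses plain i.i.d. Gaussian projections with positive random features in the style of the Performer line of work, whereas the paper's construction uses orthogonal projections for lower variance. All function and variable names (random_feature_map, favor_style_attention, num_features) are illustrative assumptions.

```python
import numpy as np

def random_feature_map(x, projection):
    """Positive random features phi(x) whose inner products approximate
    the softmax kernel exp(q . k / sqrt(d)) in expectation. Simplified:
    the paper uses orthogonal projections; these are plain Gaussian."""
    d = x.shape[-1]
    x = x / d ** 0.25                    # fold in the 1/sqrt(d) softmax scaling
    proj = x @ projection.T              # (L, m) projections w_i . x
    # exp(w . x - |x|^2 / 2) gives strictly positive features with
    # E[phi(q) . phi(k)] proportional to exp(q . k)
    return np.exp(proj - np.sum(x ** 2, axis=-1, keepdims=True) / 2)

def favor_style_attention(Q, K, V, num_features=64, seed=0):
    """Linear-time attention sketch: O(L * m * d) instead of O(L^2 * d)."""
    d = Q.shape[-1]
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(num_features, d))   # shared random projection matrix
    Qp = random_feature_map(Q, W)            # (L, m)
    Kp = random_feature_map(K, W)            # (L, m)
    KV = Kp.T @ V                            # (m, d): keys/values summarized once
    normalizer = Qp @ Kp.sum(axis=0)         # (L,) softmax denominator estimate
    return (Qp @ KV) / normalizer[:, None]

# Toy usage on a protein-scale sequence length that quadratic attention
# would find expensive.
L, d = 8192, 64
rng = np.random.default_rng(1)
Q, K, V = (rng.normal(size=(L, d)) for _ in range(3))
out = favor_style_attention(Q, K, V)
print(out.shape)  # (8192, 64)
```

The key design point is that the keys and values are summarized once into a small m x d matrix before any query touches them, so the cost grows linearly with sequence length L instead of requiring the full L x L attention matrix.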