Kaiming He’s MetaAI Team Proposes ViTDet: A Plain Vision Transformer Backbone Competitive With Hierarchical Backbones on Object Detection | Synced
A Meta AI research team explores the plain, non-hierarchical vision transformer (ViT) as a backbone network for object detection, proposing a ViT Detector that achieves performance competitive with...
Source: Synced | AI Technology & Industry Review
A Meta AI research team explores the plain, non-hierarchical vision transformer (ViT) as a backbone network for object detection, proposing a ViT Detector that achieves performance competitive with traditional hierarchical backbones.