Long-form video representation learning (Part 2: Video as sparse transformers) | Towards Data Science

Source: Towards Data Science

We explore novel video representation methods equipped with long-form reasoning capability. This is Part II, focusing on sparse video-text transformers. See Part I on video as graphs. Part III provides a sneak peek into our latest and greatest explorations. The first blog in this series was about learning explicit sparse graph-based video […]