  1. [2509.14476] AToken: A Unified Tokenizer for Vision - arXiv.org

    Sep 17, 2025 · We present AToken, the first unified visual tokenizer that achieves both high-fidelity reconstruction and semantic understanding across images, videos, and 3D assets. Unlike existing …

  2. GitHub - apple/ml-atoken

    Oct 22, 2025 · AToken is a unified vision tokenizer that handles multiple modalities (images, videos, and 3D) for both understanding and reconstruction through a single framework. It provides both …

  3. AToken: A Unified Tokenizer for Vision - Semantic Scholar

    Sep 17, 2025 · The first unified visual tokenizer that achieves both high-fidelity reconstruction and semantic understanding across images, videos, and 3D assets is presented, and a pure transformer …

  4. AToken - A Unified Tokenizer for Vision

    Sep 23, 2025 · ATOKEN's standard patchification is applied, and features are aggregated back into the voxel space. Pure Transformer Architecture: ATOKEN employs a unified transformer architecture for …

  5. AToken: Unified Visual Tokenizer

    Sep 17, 2025 · AToken: A Unified Tokenizer for Vision. Motivation and Problem Statement: The fragmentation of visual tokenization across modalities and tasks has impeded the development of …

  6. AToken: A Unified Tokenizer for Vision - Apple Machine ...

    Jul 11, 2025 · We present AToken, the first unified visual tokenizer that achieves both high-fidelity reconstruction and semantic understanding across images, videos, and 3D assets. Unlike existing …

  7. ATOKEN: A Unified Tokenizer for Vision (September 2025)

    Date: September 2025. Summary: ATOKEN, a unified visual tokenizer, achieves high-fidelity reconstruction and semantic understanding across images, videos, and 3D assets. It encodes …