Improving vision transformers with anti-aliasing

Blending Anti-Aliasing into Vision Transformer
arXiv paper abstract
arXiv PDF paper

Transformer architectures, based on the self-attention mechanism and a convolution-free design, have recently shown superior performance and found booming applications in computer vision.

However, the discontinuous patch-wise tokenization process implicitly introduces jagged artifacts into attention maps.

… analyze the uncharted problem of aliasing in vision transformers and explore how to incorporate anti-aliasing properties.

… propose a plug-and-play Aliasing-Reduction Module (ARM) to alleviate the aforementioned issue.
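The excerpt above does not describe how ARM is built, so the following is only a generic illustration of what "anti-aliasing" can mean for a grid of patch tokens, not the paper's module. It is a minimal sketch, assuming the tokens are laid out as a height x width grid of feature vectors and that a small binomial low-pass filter (weights 1/4, 1/2, 1/4) is applied along each spatial axis with edge replication. The names `smooth_token_grid`, `_blur_axis`, and `KERNEL_1D` are hypothetical.

```python
from typing import List

# Hypothetical layout: tokens[y][x] is the feature vector of the patch at (y, x).
Grid = List[List[List[float]]]

# Assumption: a 3-tap binomial low-pass kernel, a common anti-aliasing filter.
KERNEL_1D = (0.25, 0.5, 0.25)


def _clamp(i: int, n: int) -> int:
    """Clamp index i into [0, n - 1], which replicates the edge tokens."""
    return max(0, min(n - 1, i))


def _blur_axis(tokens: Grid, vertical: bool) -> Grid:
    """Apply KERNEL_1D along one spatial axis of the token grid."""
    height, width = len(tokens), len(tokens[0])
    channels = len(tokens[0][0])
    out: Grid = []
    for y in range(height):
        row = []
        for x in range(width):
            vec = [0.0] * channels
            for offset, weight in zip((-1, 0, 1), KERNEL_1D):
                # Pick the neighbouring token along the chosen axis.
                ny = _clamp(y + offset, height) if vertical else y
                nx = x if vertical else _clamp(x + offset, width)
                for c in range(channels):
                    vec[c] += weight * tokens[ny][nx][c]
            row.append(vec)
        out.append(row)
    return out


def smooth_token_grid(tokens: Grid) -> Grid:
    """Separable 3x3 low-pass filter: horizontal pass, then vertical pass."""
    return _blur_axis(_blur_axis(tokens, vertical=False), vertical=True)
```

For instance, a single-channel checkerboard grid (the highest spatial frequency the grid can represent) is flattened to 0.5 at interior positions, while a constant grid passes through unchanged. This is the low-pass behaviour that suppresses jagged, high-frequency patterns.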

We investigate the effectiveness and generalization of the proposed method across multiple tasks and various vision transformer families.

This lightweight design consistently attains a clear boost over several well-known architectures. … improves the data efficiency and robustness of vision transformers.

