Survey of vision transformers and hybrid CNN-transformer variants



A survey of the Vision Transformers and its CNN-Transformer based Variants
arXiv paper abstract
arXiv PDF paper

Vision transformers have recently become popular as a possible alternative to convolutional neural networks (CNNs) for a variety of computer vision applications.

… the hybridization of convolution and self-attention mechanisms in vision transformers is gaining popularity because it can exploit both local and global image representations.

These CNN-Transformer architectures, also known as hybrid vision transformers, have shown remarkable results for vision applications.
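
To make the "local plus global" idea concrete, here is a minimal NumPy sketch of one possible hybrid block. It is not taken from the survey or from any specific architecture: the function names (`depthwise_conv3x3`, `self_attention`, `hybrid_block`), the 3x3 depthwise convolution, the single attention head, and the residual connections are all illustrative assumptions. The convolution mixes each pixel with its immediate neighbors (local), and the self-attention step then lets every position attend to every other position (global).

```python
import numpy as np

def depthwise_conv3x3(x, kernel):
    """Local step (assumed design): 3x3 depthwise convolution.
    x: (H, W, C) feature map; kernel: (3, 3, C). Zero padding, stride 1."""
    h, w, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            # Shifted view of the padded map, weighted per channel.
            out += padded[i:i + h, j:j + w, :] * kernel[i, j, :]
    return out

def self_attention(tokens, w_q, w_k, w_v):
    """Global step: single-head scaled dot-product self-attention.
    tokens: (N, C); w_q, w_k, w_v: (C, D) projection matrices."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

def hybrid_block(x, kernel, w_q, w_k, w_v):
    """Hypothetical hybrid block: convolution first, then attention,
    each wrapped in a residual connection."""
    h, w, c = x.shape
    local = x + depthwise_conv3x3(x, kernel)
    tokens = local.reshape(h * w, c)  # flatten the grid into a token sequence
    mixed = tokens + self_attention(tokens, w_q, w_k, w_v)
    return mixed.reshape(h, w, c)

# Toy usage with random weights (no training involved).
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 16))
kernel = rng.normal(size=(3, 3, 16)) * 0.1
w_q, w_k, w_v = (rng.normal(size=(16, 16)) * 0.1 for _ in range(3))
y = hybrid_block(x, kernel, w_q, w_k, w_v)
print(y.shape)
```

The output keeps the input's height, width, and channel count, so blocks like this can be stacked. Real hybrid designs differ in where the convolution sits (in the patch stem, inside the attention layer, or in parallel with it), which is the kind of variation a taxonomy is meant to organize.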

… due to the rapidly growing number of these hybrid vision transformers, there is a need for a taxonomy and explanation of these architectures.

This survey presents a taxonomy of the recent vision transformer architectures, and more specifically that of the hybrid vision transformers.

… the key features of each architecture such as the attention mechanisms, positional embeddings, multi-scale processing, and convolution are also discussed …






AI News Clips by Morris Lee: News to help your R&D

A computer vision consultant in artificial intelligence and related high-tech fields for 37+ years. An innovator with 66+ patents, ready to help a firm's R&D.