Survey of vision transformers and hybrid CNN-transformer variants

A survey of the Vision Transformers and its CNN-Transformer based Variants
arXiv paper abstract https://arxiv.org/abs/2305.09880
arXiv PDF paper https://arxiv.org/ftp/arxiv/papers/2305/2305.09880.pdf

Vision transformers have recently become popular as a possible alternative to convolutional neural networks (CNNs) for a variety of computer vision applications.

… the hybridization of convolution and self-attention mechanisms in vision transformers is gaining popularity due to its ability to exploit both local and global image representations.

These CNN-Transformer architectures, also known as hybrid vision transformers, have shown remarkable results for vision applications.

… due to the rapidly growing number of these hybrid vision transformers, there is a need for a taxonomy and explanation of these architectures.

This survey presents a taxonomy of the recent vision transformer architectures, and more specifically that of the hybrid vision transformers.

… the key features of each architecture such as the attention mechanisms, positional embeddings, multi-scale processing, and convolution are also discussed …
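
To make the idea of combining local and global representations concrete, here is a minimal NumPy sketch of one hybrid block: a 3x3 depthwise convolution mixes each pixel with its neighbours, then single-head self-attention mixes every position with every other. This is an illustration only, not an architecture from the survey; the names hybrid_block, conv3x3_depthwise and self_attention, the tensor sizes, and the random weights are all assumptions made for the example.

```python
import numpy as np

def conv3x3_depthwise(x, kernel):
    """Local step: 3x3 depthwise convolution (cross-correlation, zero padding).
    x: (H, W, C) feature map; kernel: (3, 3, C), one 3x3 filter per channel."""
    h, w, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            # Shifted view of the input, weighted per channel.
            out += padded[dy:dy + h, dx:dx + w, :] * kernel[dy, dx, :]
    return out

def self_attention(tokens, wq, wk, wv):
    """Global step: single-head scaled dot-product self-attention.
    tokens: (N, C); wq, wk, wv: (C, C) projection matrices."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v

def hybrid_block(x, kernel, wq, wk, wv):
    """Hypothetical hybrid block: residual convolution, then residual attention."""
    h, w, c = x.shape
    local = x + conv3x3_depthwise(x, kernel)               # neighbourhood mixing
    tokens = local.reshape(h * w, c)                       # one token per pixel
    tokens = tokens + self_attention(tokens, wq, wk, wv)   # all-to-all mixing
    return tokens.reshape(h, w, c)

# Toy input: an 8x8 feature map with 16 channels and small random weights.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 16))
kernel = rng.standard_normal((3, 3, 16)) * 0.1
wq, wk, wv = (rng.standard_normal((16, 16)) * 0.1 for _ in range(3))
y = hybrid_block(x, kernel, wq, wk, wv)
print(y.shape)
```

The sketch leaves out normalization, positional embeddings, multi-head attention and multi-scale processing, which the survey lists among the features that distinguish the real architectures.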

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b

AI News Clips by Morris Lee: News to help your R&D

A computer vision consultant in artificial intelligence and related high-tech fields for 37+ years. An innovator with 66+ patents, ready to help a firm's R&D.