Survey of vision transformers and hybrid CNN-transformer variants
A survey of the Vision Transformers and its CNN-Transformer based Variants
arXiv paper abstract https://arxiv.org/abs/2305.09880
arXiv PDF paper https://arxiv.org/ftp/arxiv/papers/2305/2305.09880.pdf
Vision transformers have recently become popular as a possible alternative to convolutional neural networks (CNNs) for a variety of computer vision applications.
… the hybridization of convolution and self-attention mechanisms in vision transformers is gaining popularity because it can exploit both local and global image representations.
These CNN-Transformer architectures, also known as hybrid vision transformers, have shown remarkable results for vision applications.
… due to the rapidly growing number of these hybrid vision transformers, there is a need for a taxonomy and explanation of these architectures.
This survey presents a taxonomy of recent vision transformer architectures, with particular attention to hybrid vision transformers.
… the key features of each architecture such as the attention mechanisms, positional embeddings, multi-scale processing, and convolution are also discussed …
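To make the idea of combining convolution (local context) with self-attention (global context) concrete, here is a minimal sketch in plain Python. It is not taken from the survey or from any specific architecture in it. The names local_conv, self_attention, and hybrid_block are hypothetical, the attention uses identity query/key/value projections instead of learned weights, and the convolution-then-attention ordering is only one of several possible ways to combine the two.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def local_conv(grid, kernel):
    """Depthwise 3x3 convolution with zero padding.

    grid is an H x W grid of d-dimensional patch embeddings; the same
    3x3 kernel is applied to every channel (an assumption for brevity).
    """
    h, w, d = len(grid), len(grid[0]), len(grid[0][0])
    out = [[[0.0] * d for _ in range(w)] for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < h and 0 <= nj < w:
                        k = kernel[di + 1][dj + 1]
                        for c in range(d):
                            out[i][j][c] += k * grid[ni][nj][c]
    return out

def self_attention(tokens):
    """Single-head scaled dot-product attention.

    Q, K, and V are the tokens themselves (identity projections),
    which is a simplification; real models learn these projections.
    """
    d = len(tokens[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights = softmax(scores)
        out.append([sum(wt * v[c] for wt, v in zip(weights, tokens))
                    for c in range(d)])
    return out

def hybrid_block(grid, kernel):
    """Convolution for local mixing, then attention for global mixing."""
    h, w = len(grid), len(grid[0])
    local = local_conv(grid, kernel)
    tokens = [tok for row in local for tok in row]  # flatten grid to a sequence
    attended = self_attention(tokens)
    # Residual connection around the attention step.
    mixed = [[a + t for a, t in zip(att, tok)]
             for att, tok in zip(attended, tokens)]
    return [mixed[r * w:(r + 1) * w] for r in range(h)]

if __name__ == "__main__":
    mean_kernel = [[1 / 9] * 3 for _ in range(3)]
    patches = [[[1.0, 0.0], [0.0, 1.0]],
               [[0.5, 0.5], [1.0, 1.0]]]
    for row in hybrid_block(patches, mean_kernel):
        print(row)
```

In this sketch the convolution only lets each patch see its immediate neighbors, while the attention step lets every patch weigh every other patch, which is the local-plus-global combination the abstract describes.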
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website
LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b