Vision transformer morphed to CNN works better

Visformer: The Vision-friendly Transformer
arXiv paper abstract https://arxiv.org/abs/2104.12533
arXiv paper PDF https://arxiv.org/pdf/2104.12533.pdf
GitHub https://github.com/danczs/Visformer

… rapid development … Transformer module to vision …
… there are still growing number of evidences showing that these models suffer over-fitting especially when the training data is limited.
… gradually transit a Transformer-based model to a convolution-based model.
… With the same computational complexity, Visformer outperforms both the Transformer-based and convolution-based models in terms of ImageNet classification accuracy, and the advantage becomes more significant when the model complexity is lower or the training set is smaller. …

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b

Photo by Dilyara Garifullina on Unsplash

AI News Clips by Morris Lee: News to help your R&D

A computer vision consultant in artificial intelligence and related high-tech fields for 37+ years. An innovator with 66+ patents, ready to help a firm's R&D.