Vision transformer morphed to CNN works better

Visformer: The Vision-friendly Transformer
arXiv paper PDF

… rapid development … Transformer module to vision …
… there is still a growing body of evidence showing that these models suffer from over-fitting, especially when the training data is limited.
… gradually transition a Transformer-based model to a convolution-based model.
… With the same computational complexity, Visformer outperforms both the Transformer-based and convolution-based models in terms of ImageNet classification accuracy, and the advantage becomes more significant when the model complexity is lower or the training set is smaller. …






AI News Clips by Morris Lee: News to help your R&D

A computer vision consultant with 37+ years in artificial intelligence and related high-tech fields. I am an innovator with 66+ patents, ready to help a firm's R&D.