Survey of computer vision backbones on many tasks
Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks
arXiv paper abstract https://arxiv.org/abs/2310.19909
arXiv PDF paper https://arxiv.org/pdf/2310.19909.pdf
GitHub https://github.com/hsouri/Battle-of-the-Backbones
Neural network based computer vision systems are typically built on a backbone … it is difficult for practitioners to make informed decisions about which backbone to choose.
… benchmarking a diverse suite of pretrained models, including vision-language models, those trained via self-supervised learning, and the Stable Diffusion backbone, across a diverse set of computer vision tasks ranging from classification to object detection to out-of-distribution (OOD) generalization and more.
… sheds light on promising directions for the research community to advance computer vision by illuminating strengths and weaknesses of existing approaches through a comprehensive analysis conducted on more than 1500 training runs.
While vision transformers (ViTs) and self-supervised learning (SSL) are increasingly popular, … find that convolutional neural networks pretrained in a supervised fashion on large training sets still perform best on most tasks among the models …
Moreover, in apples-to-apples comparisons on the same architectures and similarly sized pretraining datasets, … find that SSL backbones are highly competitive, indicating that future works should perform SSL pretraining with advanced architectures and larger pretraining datasets.
… release the raw results of … experiments along with code that allows researchers to put their own backbones through the gauntlet …
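As a rough illustration of how raw per-task results like these might be combined into a single ranking, here is a minimal Python sketch. It assumes one common approach: normalize the scores within each task to z-scores, then average them per backbone. The backbone names, task names, and numbers are hypothetical placeholders, not results from the paper, and the functions are not taken from the paper's code.

```python
from statistics import mean, pstdev

# Hypothetical placeholder scores (higher is better). NOT results from the paper.
scores = {
    "classification": {"backbone_a": 0.80, "backbone_b": 0.70, "backbone_c": 0.60},
    "detection": {"backbone_a": 0.45, "backbone_b": 0.50, "backbone_c": 0.30},
}


def z_scores(task_scores):
    """Normalize one task's scores to zero mean and unit (population) std."""
    values = list(task_scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All backbones tied on this task, so it carries no ranking signal.
        return {name: 0.0 for name in task_scores}
    return {name: (value - mu) / sigma for name, value in task_scores.items()}


def rank_backbones(all_scores):
    """Average per-task z-scores and return (name, score) pairs, best first."""
    per_task = {task: z_scores(s) for task, s in all_scores.items()}
    # Only rank backbones that were evaluated on every task.
    names = sorted(set.intersection(*(set(s) for s in all_scores.values())))
    averaged = {n: mean(per_task[t][n] for t in per_task) for n in names}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


ranking = rank_backbones(scores)
for name, score in ranking:
    print(f"{name}: {score:+.3f}")
```

Normalizing within each task keeps a metric with a wide numeric range from dominating the average, which matters when mixing, say, classification accuracy with detection scores.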
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website
LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b