Train a new object detector without bounding-box annotations using captioned images
Towards Open Vocabulary Object Detection without Human-provided Bounding Boxes
arXiv paper abstract https://arxiv.org/abs/2111.09452
arXiv PDF paper https://arxiv.org/pdf/2111.09452.pdf
… in object detection, most existing methods are limited to a small set of object categories, due to the tremendous human effort needed for instance-level bounding-box annotation.
… recent open vocabulary and zero-shot detection methods attempt to detect object categories not seen during training.
… still rely on manually provided bounding-box annotations on a set of base classes.
… propose … framework that can be trained without manually provided bounding-box annotations.
… by leveraging the localization ability of pre-trained vision-language models and generating pseudo bounding-box labels that can be used directly for training object detectors.
… outperforms the state-of-the-arts (SOTA) that are trained using human annotated bounding-boxes by 3% AP on COCO novel categories even though our training source is not equipped with manual bounding-box labels. …
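The excerpts above do not spell out how a pseudo bounding box is derived from a vision-language model. As a rough illustration only, the sketch below assumes the model has already produced a 2D activation map (a heat map) for one word of the caption, and it draws the tightest box around the strongly activated region. The function name pseudo_box_from_activation, the rel_threshold parameter, and the thresholding step are all hypothetical choices made for this example; they are not taken from the paper, which may select boxes differently.

```python
import numpy as np

def pseudo_box_from_activation(activation, rel_threshold=0.5):
    """Return (x_min, y_min, x_max, y_max) around the strongly activated
    region of a 2D activation map, or None if the map has no positive peak.

    Hypothetical helper for illustration; not the paper's procedure.
    """
    activation = np.asarray(activation, dtype=float)
    peak = activation.max()
    if peak <= 0:
        return None  # nothing in the image responded to the word
    # Keep pixels whose activation is at least rel_threshold of the peak.
    mask = activation >= rel_threshold * peak
    ys, xs = np.nonzero(mask)
    # Tightest axis-aligned box (inclusive pixel coordinates) around the mask.
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Toy activation map standing in for a vision-language model's output
# for a single caption word (assumed input, not real model output).
heat = np.zeros((6, 8))
heat[2:5, 3:6] = 1.0
box = pseudo_box_from_activation(heat)
print(box)
```

Pairing each such box with the caption word that produced it would give a (box, label) training example for a standard detector without any human-drawn annotation, which is the idea the abstract describes.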
Web site with my other posts by category https://morrislee1234.wixsite.com/website
LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b