Detect unknown objects and attributes better by jointly training on both tasks with OvarNet

OvarNet: Towards Open-vocabulary Object Attribute Recognition
arXiv paper abstract https://arxiv.org/abs/2301.09506
arXiv PDF paper https://arxiv.org/pdf/2301.09506.pdf

… consider … detecting objects and inferring their visual attributes in an image, even for those with no manual annotations provided at the training stage, resembling an open-vocabulary scenario.

… make the following contributions: (i) … start with a naive two-stage approach for open-vocabulary object detection and attribute classification, termed CLIP-Attr.

The candidate objects are first proposed with an offline RPN and later classified for semantic category and attributes;
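
To make the second stage concrete, here is a minimal sketch of how one proposed region could be scored against text embeddings. It is not the paper's code. It assumes the region and each category or attribute name have already been embedded as vectors by a CLIP-like model. The function names, the toy vectors, and the 0.5 attribute threshold are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def classify_region(region_emb, category_embs, attribute_embs, attr_threshold=0.5):
    """Pick one category and any number of attributes for a region.

    Assumption: categories are single-label (take the best match),
    attributes are multi-label (keep every match above a threshold).
    """
    category = max(category_embs,
                   key=lambda name: cosine(region_emb, category_embs[name]))
    attributes = sorted(name for name, emb in attribute_embs.items()
                        if cosine(region_emb, emb) >= attr_threshold)
    return category, attributes

# Toy 3-D embeddings standing in for real text and image features.
category_embs = {"dog": [1.0, 0.0, 0.0], "car": [0.0, 1.0, 0.0]}
attribute_embs = {
    "furry": [0.9, 0.0, 0.4],
    "metallic": [0.0, 0.9, 0.4],
    "red": [0.0, 0.5, -0.9],
}
region = [0.8, 0.1, 0.3]

result = classify_region(region, category_embs, attribute_embs)
# result is ('dog', ['furry'])
```

Because the label names enter only as embeddings, a new category or attribute can be added at test time by embedding its name, which is what makes the setup open-vocabulary.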

(ii) … combine all available datasets and train with a federated strategy to finetune the CLIP model, aligning the visual representation with attributes …
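
One way to read "federated" training across datasets is that each dataset contributes loss only for the labels it actually annotates. The sketch below shows that idea with a masked binary cross-entropy. It is an assumption about the general technique, not the paper's exact loss. The function name masked_bce and the use of None for "not annotated" are hypothetical.

```python
import math

def masked_bce(logits, targets):
    """Mean binary cross-entropy over annotated labels only.

    targets holds 1 (present), 0 (absent) or None (not annotated in the
    dataset this sample came from). None entries add no loss.
    """
    total, count = 0.0, 0
    for z, t in zip(logits, targets):
        if t is None:
            continue  # label unknown for this dataset, skip it
        # Numerically stable form of -[t*log(p) + (1-t)*log(1-p)], p = sigmoid(z)
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
        count += 1
    return total / count if count else 0.0

# A sample from a dataset that labels only the first attribute:
loss = masked_bce([0.0, 5.0], [1, None])
```

With this masking, a detection dataset that has no attribute labels does not push attribute scores toward "absent", and an attribute dataset does not penalize unlabeled categories.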

(iii) … train a Faster-RCNN type model end-to-end with knowledge distillation, that performs class-agnostic object proposals and classification on semantic categories and attributes …
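
The quoted abstract does not spell out the distillation objective. As a rough illustration only, the sketch below uses the classic temperature-softened KL divergence, in which a student model is trained to match a teacher's output distribution. The function names and the temperature value are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return sum(pt * math.log(pt / ps)
               for pt, ps in zip(p_teacher, p_student) if pt > 0)

teacher = [2.0, 0.5, -1.0]   # e.g. scores from the fine-tuned CLIP teacher
student = [1.0, 1.0, 0.0]    # e.g. scores from the detector being trained

kd = distillation_loss(student, teacher)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, so minimizing it transfers the teacher's open-vocabulary scores to the faster end-to-end model.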

(iv) … show that recognition of semantic category and attributes … largely outperform existing approaches that treat the two tasks independently, demonstrating strong generalization ability to novel attributes and categories.

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b

Photo by S-Art Photography on Unsplash

A computer vision consultant in artificial intelligence and related high-tech fields for 37+ years. An innovator with 66+ patents, ready to help a firm's R&D.

AI News Clips by Morris Lee: News to help your R&D
