Segment objects in videos using only 2 labeled frames with Two-shot-Video-Object-Segmentation

Two-shot Video Object Segmentation
arXiv paper abstract https://arxiv.org/abs/2303.12078
arXiv PDF paper https://arxiv.org/pdf/2303.12078.pdf

Previous works on video object segmentation (VOS) are trained on densely annotated videos … acquiring annotations in pixel level is expensive and time-consuming.

… demonstrate … two labeled frames per training video … idea is to generate pseudo labels for unlabeled frames during training and to optimize the model on the combination of labeled and pseudo-labeled data … approach … can be applied to a majority of existing frameworks.

… first pre-train a VOS model on sparsely annotated videos in a semi-supervised manner, with the first frame always being a labeled one.

… adopt the pre-trained VOS model to generate pseudo labels for all unlabeled frames, which are subsequently stored in a pseudo-label bank.

… retrain a VOS model on both labeled and pseudo-labeled data without any restrictions on the first frame.
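The three phases above can be sketched as a small training loop. This is only a toy illustration, not the paper's implementation: `ToyThresholdModel`, `Video`, and the `phase1_pretrain`, `phase2_build_bank`, and `phase3_retrain` functions are hypothetical names, frames are flat lists of pixel intensities, and the "model" is a single intensity threshold standing in for a real VOS network. The toy model has no reference frame, so the first-frame restriction of phase 1 is not modeled here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Frame = List[float]  # toy "image": a flat list of pixel intensities
Mask = List[int]     # toy "segmentation": 1 = object, 0 = background


@dataclass
class Video:
    frames: List[Frame]
    labels: Dict[int, Mask]  # frame index -> ground-truth mask (two entries per video)


class ToyThresholdModel:
    """Hypothetical stand-in for a VOS network: one learned intensity threshold."""

    def __init__(self) -> None:
        self.threshold = 0.5

    def fit(self, samples: List[Tuple[Frame, Mask]]) -> None:
        fg = [p for frame, mask in samples for p, m in zip(frame, mask) if m == 1]
        bg = [p for frame, mask in samples for p, m in zip(frame, mask) if m == 0]
        if fg and bg:
            # Midpoint between mean foreground and mean background intensity.
            self.threshold = (sum(fg) / len(fg) + sum(bg) / len(bg)) / 2

    def predict(self, frame: Frame) -> Mask:
        return [1 if p > self.threshold else 0 for p in frame]


def phase1_pretrain(videos: List[Video]) -> ToyThresholdModel:
    """Phase 1 (simplified): train using only the labeled frames."""
    model = ToyThresholdModel()
    model.fit([(v.frames[i], m) for v in videos for i, m in v.labels.items()])
    return model


def phase2_build_bank(
    model: ToyThresholdModel, videos: List[Video]
) -> Dict[Tuple[int, int], Mask]:
    """Phase 2: predict a mask for every unlabeled frame and store it in a bank."""
    bank: Dict[Tuple[int, int], Mask] = {}
    for vid, v in enumerate(videos):
        for i, frame in enumerate(v.frames):
            if i not in v.labels:
                bank[(vid, i)] = model.predict(frame)
    return bank


def phase3_retrain(
    videos: List[Video], bank: Dict[Tuple[int, int], Mask]
) -> ToyThresholdModel:
    """Phase 3: retrain on ground-truth labels plus pseudo labels from the bank."""
    samples: List[Tuple[Frame, Mask]] = []
    for vid, v in enumerate(videos):
        for i, frame in enumerate(v.frames):
            # Prefer the ground-truth mask; fall back to the pseudo label.
            mask = v.labels.get(i, bank.get((vid, i)))
            if mask is not None:
                samples.append((frame, mask))
    model = ToyThresholdModel()
    model.fit(samples)
    return model


# One toy video with four frames, of which only frames 0 and 3 are labeled.
videos = [
    Video(
        frames=[[0.9, 0.1, 0.8], [0.2, 0.7, 0.1], [0.85, 0.15, 0.9], [0.1, 0.95, 0.2]],
        labels={0: [1, 0, 1], 3: [0, 1, 0]},
    )
]
pretrained = phase1_pretrain(videos)
bank = phase2_build_bank(pretrained, videos)
final_model = phase3_retrain(videos, bank)
```

In this sketch the bank is a plain dictionary keyed by (video index, frame index), so phase 3 can mix ground-truth and pseudo-labeled frames freely. A real pipeline would store predicted masks for full-resolution frames and would likely need some way to handle unreliable pseudo labels, which this toy version ignores.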

… approach achieves comparable results in contrast to the counterparts trained on fully labeled set …

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b

Photo by Jakob Owens on Unsplash

AI News Clips by Morris Lee: News to help your R&D

A computer vision consultant in artificial intelligence and related high-tech fields for 37+ years. An innovator with 66+ patents, ready to help a firm's R&D.