Segment objects in videos using only 2 labeled frames with Two-shot-Video-Object-Segmentation
Two-shot Video Object Segmentation
arXiv paper abstract https://arxiv.org/abs/2303.12078
arXiv PDF paper https://arxiv.org/pdf/2303.12078.pdf
Previous works on video object segmentation (VOS) are trained on densely annotated videos … acquiring annotations in pixel level is expensive and time-consuming.
… demonstrate … two labeled frames per training video … idea is to generate pseudo labels for unlabeled frames during training and to optimize the model on the combination of labeled and pseudo-labeled data … approach … can be applied to a majority of existing frameworks.
… first pre-train a VOS model on sparsely annotated videos in a semi-supervised manner, with the first frame always being a labeled one.
… adopt the pre-trained VOS model to generate pseudo labels for all unlabeled frames, which are subsequently stored in a pseudo-label bank.
… retrain a VOS model on both labeled and pseudo-labeled data without any restrictions on the first frame.
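The three phases above can be sketched in a few lines of Python. This is a toy illustration only, not the paper's implementation: `ThresholdSegmenter`, `two_shot_pipeline`, and the flat pixel-list "frames" are hypothetical stand-ins for a real VOS network and real video data, and phase 1 is simplified to train on the labeled frames alone (the paper also uses pseudo labels during that phase).

```python
from statistics import mean


class ThresholdSegmenter:
    """Hypothetical toy stand-in for a VOS model.

    A pixel is predicted as foreground (1) when its intensity exceeds
    a threshold learned from the training masks.
    """

    def __init__(self):
        self.threshold = 0.5

    def fit(self, frames, masks):
        # Collect foreground and background pixel intensities.
        fg = [p for f, m in zip(frames, masks) for p, y in zip(f, m) if y == 1]
        bg = [p for f, m in zip(frames, masks) for p, y in zip(f, m) if y == 0]
        if fg and bg:
            # Place the threshold midway between the two class means.
            self.threshold = (mean(fg) + mean(bg)) / 2
        return self

    def predict(self, frame):
        return [1 if p > self.threshold else 0 for p in frame]


def two_shot_pipeline(videos):
    """videos maps video_id -> (frames, labels); labels[i] is a mask or None."""
    # Phase 1: pre-train on the sparsely labeled frames
    # (simplified here: labeled frames only).
    labeled_frames, labeled_masks = [], []
    for frames, labels in videos.values():
        for frame, mask in zip(frames, labels):
            if mask is not None:
                labeled_frames.append(frame)
                labeled_masks.append(mask)
    pretrained = ThresholdSegmenter().fit(labeled_frames, labeled_masks)

    # Phase 2: predict masks for all unlabeled frames and store them
    # in a pseudo-label bank keyed by (video_id, frame_index).
    bank = {}
    for vid, (frames, labels) in videos.items():
        for i, (frame, mask) in enumerate(zip(frames, labels)):
            if mask is None:
                bank[(vid, i)] = pretrained.predict(frame)

    # Phase 3: retrain a fresh model on labeled plus pseudo-labeled data.
    all_frames = list(labeled_frames)
    all_masks = list(labeled_masks)
    for (vid, i), pseudo_mask in bank.items():
        all_frames.append(videos[vid][0][i])
        all_masks.append(pseudo_mask)
    final = ThresholdSegmenter().fit(all_frames, all_masks)
    return final, bank
```

Calling `two_shot_pipeline` on a dictionary of videos with two labeled frames each returns the retrained model together with the pseudo-label bank, so the stored pseudo labels can be inspected separately from the final model.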
… approach achieves comparable results in contrast to the counterparts trained on fully labeled set …
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website