Improved object segmentation in video by using object descriptors instead of pixel matching

HODOR: High-level Object Descriptors for Object Re-segmentation in Video Learned from Static Images
arXiv paper abstract https://arxiv.org/abs/2112.09131v1
arXiv PDF paper https://arxiv.org/pdf/2112.09131v1.pdf

Existing state-of-the-art methods for Video Object Segmentation (VOS) learn low-level pixel-to-pixel correspondences between frames to propagate object masks across video.

This requires a large amount of densely annotated video data, which is costly to annotate, and largely redundant since frames within a video are highly correlated.
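To make the pixel-to-pixel idea concrete, here is a minimal sketch of label propagation by feature matching. It is a toy illustration, not code from any of the existing methods: the function name `propagate_labels`, the hand-made 2-D feature vectors, and the flattened 4-pixel "frames" are all assumptions. Real systems learn the per-pixel features from densely annotated video.

```python
# Toy sketch: each target pixel copies the label of its best-matching
# reference pixel. Features are hand-made, not learned embeddings.

def dot(u, v):
    """Dot-product similarity between two feature vectors."""
    return sum(a * b for a, b in zip(u, v))

def propagate_labels(ref_feats, ref_labels, tgt_feats):
    """Give every target pixel the label of its most similar reference pixel."""
    out = []
    for f in tgt_feats:
        best = max(range(len(ref_feats)), key=lambda i: dot(f, ref_feats[i]))
        out.append(ref_labels[best])
    return out

# Reference frame: 4 flattened pixels, 0 = background, 1 = object.
ref_feats = [(1.0, 0.0), (0.9, 0.1), (0.1, 0.9), (0.0, 1.0)]
ref_labels = [0, 0, 1, 1]

# Target frame: the object has moved to the first two pixels.
tgt_feats = [(0.1, 0.9), (0.0, 1.0), (1.0, 0.0), (0.9, 0.1)]

tgt_labels = propagate_labels(ref_feats, ref_labels, tgt_feats)
print(tgt_labels)
```

Every target pixel is compared against every reference pixel, so the quality of the result depends entirely on how well the per-pixel features were trained.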

… propose HODOR: a novel method that tackles VOS by effectively leveraging annotated static images for understanding object appearance and scene context.

We encode object instances and scene information from an image frame into robust high-level descriptors which can then be used to re-segment those objects in different frames.
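The encode-then-re-segment idea can be sketched roughly as follows. This is not HODOR's network. As a simple stand-in, the sketch uses masked average pooling to build one descriptor per object (plus one for the background) and then labels each pixel of a new frame by its most similar descriptor. The names `encode_descriptors` and `resegment` and the toy features are assumptions made for illustration.

```python
# Toy sketch: summarize each object as one descriptor, then use the
# descriptors (not the original pixels) to segment a different frame.

def dot(u, v):
    """Dot-product similarity between two feature vectors."""
    return sum(a * b for a, b in zip(u, v))

def encode_descriptors(feats, labels):
    """Masked average pooling: one mean feature vector per label id."""
    sums, counts = {}, {}
    for f, lab in zip(feats, labels):
        acc = sums.setdefault(lab, [0.0] * len(f))
        for k, x in enumerate(f):
            acc[k] += x
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [x / counts[lab] for x in acc] for lab, acc in sums.items()}

def resegment(descriptors, feats):
    """Assign each pixel the label of its most similar descriptor."""
    return [max(descriptors, key=lambda lab: dot(f, descriptors[lab]))
            for f in feats]

# Annotated frame: 4 flattened pixels, 0 = background, 1 = object.
ref_feats = [(1.0, 0.0), (0.9, 0.1), (0.1, 0.9), (0.0, 1.0)]
ref_mask = [0, 0, 1, 1]

# A different frame in which the object has moved.
new_feats = [(0.1, 0.9), (0.0, 1.0), (1.0, 0.0), (0.9, 0.1)]

descriptors = encode_descriptors(ref_feats, ref_mask)
new_mask = resegment(descriptors, new_feats)
print(new_mask)
```

Only a handful of descriptors has to be carried between frames, instead of a full pixel-to-pixel comparison, which is the contrast the title of this post draws.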

… achieves state-of-the-art performance on the DAVIS and YouTube-VOS benchmarks compared to existing methods trained without video annotations.

… HODOR can also learn from video context around single annotated video frames by utilizing cyclic consistency, whereas other methods rely on dense, temporally consistent annotations.
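One way to picture cyclic consistency: propagate the mask from the single annotated frame to an unannotated frame, propagate it back, and compare the result with the original annotation. The sketch below does this with hard labels and the same simplified descriptor pooling as a stand-in. The paper's actual training objective is not reproduced here, and `cycle_error` and its helpers are hypothetical names.

```python
# Toy sketch: cycle A -> B -> A and measure disagreement with A's annotation.

def dot(u, v):
    """Dot-product similarity between two feature vectors."""
    return sum(a * b for a, b in zip(u, v))

def encode_descriptors(feats, labels):
    """Masked average pooling: one mean feature vector per label id."""
    sums, counts = {}, {}
    for f, lab in zip(feats, labels):
        acc = sums.setdefault(lab, [0.0] * len(f))
        for k, x in enumerate(f):
            acc[k] += x
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [x / counts[lab] for x in acc] for lab, acc in sums.items()}

def resegment(descriptors, feats):
    """Assign each pixel the label of its most similar descriptor."""
    return [max(descriptors, key=lambda lab: dot(f, descriptors[lab]))
            for f in feats]

def cycle_error(feats_a, labels_a, feats_b):
    """Fraction of frame-A pixels whose label changes after A -> B -> A."""
    desc_a = encode_descriptors(feats_a, labels_a)
    labels_b = resegment(desc_a, feats_b)        # forward: B has no annotation
    desc_b = encode_descriptors(feats_b, labels_b)
    labels_a_back = resegment(desc_b, feats_a)   # backward: return to frame A
    mismatches = sum(p != t for p, t in zip(labels_a_back, labels_a))
    return mismatches / len(labels_a)

# Frame A is the only annotated frame; frame B is a neighboring frame.
feats_a = [(1.0, 0.0), (0.9, 0.1), (0.1, 0.9), (0.0, 1.0)]
labels_a = [0, 0, 1, 1]
feats_b = [(0.1, 0.9), (0.0, 1.0), (1.0, 0.0), (0.9, 0.1)]

error = cycle_error(feats_a, labels_a, feats_b)
print(error)
```

The supervision signal comes only from frame A's annotation, so frame B never needs a mask of its own. A trainable version would use soft masks and a differentiable loss rather than this hard mismatch count.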

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b

Photo by Siegfried Poepperl on Unsplash

AI News Clips by Morris Lee: News to help your R&D

A computer vision consultant in artificial intelligence and related high-tech fields for 37+ years. An innovator with 66+ patents, ready to help a firm's R&D.