Do many types of video segmentation with one model without retraining with TarViS

AI News Clips by Morris Lee: News to help your R&D

2 min read · Jan 9, 2023

TarViS: A Unified Approach for Target-based Video Segmentation
arXiv paper abstract https://arxiv.org/abs/2301.02657
arXiv PDF paper https://arxiv.org/pdf/2301.02657.pdf

… video segmentation is currently fragmented into different tasks spanning multiple benchmarks … methods are overwhelmingly task-specific and cannot conceptually generalize to other tasks.

… propose TarViS: a novel, unified network architecture that can be applied to any task that requires segmenting a set of arbitrarily defined ‘targets’ in video.

… approach is flexible with respect to how tasks define these targets, since it models the latter as abstract ‘queries’ which are then used to predict pixel-precise target masks.
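To make the "queries to masks" idea concrete, here is a minimal sketch. It assumes each target is a query vector and each pixel has a feature vector of the same length, and that a pixel belongs to a target when their dot product is above a threshold. The function name `predict_masks`, the threshold, and the toy numbers are illustrative assumptions, not code or values from the TarViS paper.

```python
from typing import List

Vector = List[float]
FeatureMap = List[List[Vector]]  # H x W grid of C-dimensional pixel features


def predict_masks(
    queries: List[Vector],
    pixel_features: FeatureMap,
    threshold: float = 0.0,
) -> List[List[List[bool]]]:
    """Return one H x W boolean mask per query.

    Assumption for illustration: a pixel belongs to a target when the
    dot product of the target's query and the pixel's feature vector
    exceeds `threshold`.
    """
    masks = []
    for query in queries:
        mask = [
            [
                sum(q * f for q, f in zip(query, feature)) > threshold
                for feature in row
            ]
            for row in pixel_features
        ]
        masks.append(mask)
    return masks


# Toy 2 x 2 frame with 2-channel features: the top row looks like
# "target A", the bottom row looks like "target B".
frame = [
    [[1.0, -1.0], [1.0, -1.0]],
    [[-1.0, 1.0], [-1.0, 1.0]],
]
queries = [[1.0, 0.0], [0.0, 1.0]]  # one hypothetical query per target
masks = predict_masks(queries, frame)
```

In this toy case the first query selects the top row and the second selects the bottom row. In a real model the queries and pixel features would come from trained networks rather than hand-written lists.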

A single TarViS model can be trained jointly on a collection of datasets spanning different tasks, and can hot-swap between tasks during inference without any task-specific retraining.
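A rough sketch of what hot-swapping could look like: only the way the target queries are built changes per task, while the mask prediction step is shared. Everything here is a hypothetical illustration. The helper names, the fixed `LEARNED_QUERIES` list standing in for trained queries, and the pooling choices for VOS and PET are assumptions, not the paper's implementation, and VPS is left out to keep the sketch short.

```python
from typing import List, Tuple

Vector = List[float]
FeatureMap = List[List[Vector]]  # H x W grid of C-dimensional pixel features
Mask = List[List[bool]]

# Stand-in for queries a trained model would have learned (assumption).
LEARNED_QUERIES: List[Vector] = [[1.0, 0.0], [0.0, 1.0]]


def mask_pooled_query(features: FeatureMap, mask: Mask) -> Vector:
    """VOS-style target: average the features under a given first-frame mask."""
    selected = [
        feature
        for feature_row, mask_row in zip(features, mask)
        for feature, inside in zip(feature_row, mask_row)
        if inside
    ]
    if not selected:
        raise ValueError("mask selects no pixels")
    channels = len(selected[0])
    return [sum(f[c] for f in selected) / len(selected) for c in range(channels)]


def point_query(features: FeatureMap, point: Tuple[int, int]) -> Vector:
    """PET-style target: use the feature at a single (row, col) point."""
    row, col = point
    return list(features[row][col])


def build_queries(task: str, features: FeatureMap, **targets) -> List[Vector]:
    """Only this step depends on the task; the model weights stay the same."""
    if task == "VIS":
        return LEARNED_QUERIES
    if task == "VOS":
        return [mask_pooled_query(features, m) for m in targets["masks"]]
    if task == "PET":
        return [point_query(features, p) for p in targets["points"]]
    raise ValueError(f"unsupported task: {task}")


def segment(features: FeatureMap, queries: List[Vector]) -> List[Mask]:
    """Shared step: a pixel is in a target's mask if the dot product is positive."""
    return [
        [[sum(q * f for q, f in zip(query, feat)) > 0 for feat in row] for row in features]
        for query in queries
    ]


frame = [
    [[1.0, -1.0], [1.0, -1.0]],
    [[-1.0, 1.0], [-1.0, 1.0]],
]
first_frame_mask = [[True, False], [False, False]]

vis_masks = segment(frame, build_queries("VIS", frame))
vos_masks = segment(frame, build_queries("VOS", frame, masks=[first_frame_mask]))
pet_masks = segment(frame, build_queries("PET", frame, points=[(1, 0)]))
```

Switching tasks here is just a different call to `build_queries`; `segment` never changes, which is the spirit of swapping tasks at inference without retraining.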

… apply TarViS to four different tasks, namely Video Instance Segmentation (VIS), Video Panoptic Segmentation (VPS), Video Object Segmentation (VOS) and Point Exemplar-guided Tracking (PET).

… unified, jointly trained model achieves state-of-the-art performance on 5/7 benchmarks spanning these four tasks, and competitive performance on the remaining two.

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b
