Real-time video segmentation using a shared decoder with RAP-SAM
RAP-SAM: Towards Real-Time All-Purpose Segment Anything
arXiv paper abstract https://arxiv.org/abs/2401.10228
arXiv PDF paper https://arxiv.org/pdf/2401.10228.pdf
GitHub https://github.com/xushilin1/RAP-SAM
Project page https://xushilin1.github.io/rap_sam
… vision foundation models (VFMs) achieve remarkable progress … Segment Anything Model (SAM) is one remarkable model that can achieve generalized segmentation.
However, most VFMs cannot run in real time, which makes it difficult to transfer them into products.
… this work explores … segmentation in real-time … contains three different tasks, including interactive segmentation, panoptic segmentation, and video segmentation.
… aim to use one model to achieve the above tasks in real-time … present Real-Time All Purpose SAM (RAP-SAM).
It contains an efficient encoder and an efficient decoupled decoder to perform prompt-driven decoding.
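To make the idea of prompt-driven decoding concrete, here is a minimal toy sketch in plain Python. It is not the RAP-SAM implementation: the function names, the dot-product scoring, and the point-sampling step are all assumptions chosen for illustration. It only shows how one decoding routine can serve both object queries (panoptic or video style) and a query derived from a user click (interactive style).

```python
# Toy sketch only: NOT the RAP-SAM code. All names here are hypothetical.
from typing import List

Vector = List[float]
FeatureMap = List[List[Vector]]  # H x W x C per-pixel features from an encoder


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def prompt_query_from_point(features: FeatureMap, row: int, col: int) -> Vector:
    """Interactive case: build a query from a clicked point by sampling the
    feature at that location (an assumption made for illustration)."""
    return list(features[row][col])


def decode_masks(
    features: FeatureMap, queries: List[Vector], threshold: float = 0.0
) -> List[List[List[int]]]:
    """Shared decoding step: each query scores every pixel with a dot product;
    pixels scoring above `threshold` form that query's binary mask."""
    return [
        [[1 if dot(q, pixel) > threshold else 0 for pixel in row] for row in features]
        for q in queries
    ]


# A 2x2 feature map with 2 channels: the left column resembles "object A",
# the right column resembles "object B".
features = [
    [[1.0, -1.0], [-1.0, 1.0]],
    [[1.0, -1.0], [-1.0, 1.0]],
]

# Object queries and a point-derived prompt query use the same decoder call.
object_queries = [[1.0, -1.0], [-1.0, 1.0]]
point_query = prompt_query_from_point(features, row=0, col=1)

object_masks = decode_masks(features, object_queries)
point_masks = decode_masks(features, [point_query])
```

In this toy setup, the click on the right column yields the same mask as the second object query, which is the sense in which a single decoder can be driven by different kinds of queries.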
… further explore different training strategies and tuning methods to boost co-training performance …
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website
LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b