Segment an image into an open set of categories using a cost map from a hierarchical encoder with SED

SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation
arXiv paper abstract https://arxiv.org/abs/2311.15537
arXiv PDF paper https://arxiv.org/pdf/2311.15537.pdf

Open-vocabulary semantic segmentation strives to distinguish pixels into different semantic groups from an open set of categories.

… propose a simple encoder-decoder, named SED, for open-vocabulary semantic segmentation, which comprises a hierarchical encoder-based cost map generation and a gradual fusion decoder with category early rejection.
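
To make the idea of a cost map concrete, here is a minimal sketch in plain Python. It assumes the cost map is the cosine similarity between each pixel embedding and each category text embedding, which is a common choice in CLIP-based segmentation. The excerpt above does not spell out the formula, so treat that choice as an assumption. The names `cosine` and `cost_map` and the tiny embeddings are hypothetical and only for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cost_map(pixel_features, text_embeddings):
    """Return cost[y][x][k]: similarity of pixel (y, x) to category k.

    Assumption: similarity is cosine similarity (not confirmed by the excerpt).
    """
    return [
        [[cosine(pix, txt) for txt in text_embeddings] for pix in row]
        for row in pixel_features
    ]

# Hypothetical 1x2 feature map with 2-dimensional embeddings and two categories.
pixel_features = [[[1.0, 0.0], [0.0, 2.0]]]
text_embeddings = [[1.0, 0.0], [0.0, 1.0]]

cost = cost_map(pixel_features, text_embeddings)

# Naive labeling: pick the best-scoring category per pixel.
labels = [[max(range(len(c)), key=c.__getitem__) for c in row] for row in cost]
```

In this sketch the labels come straight from the cost map. In SED a decoder refines the cost map before the final prediction.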

… Compared to plain transformer, hierarchical backbone better captures local spatial information and has linear computational complexity with respect to input size.
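
A rough back-of-the-envelope sketch of the complexity claim: global self-attention compares every token with every other token, while a backbone built on local operations touches only a fixed neighborhood per token. The window size of 7 and the token grid sizes below are illustrative assumptions, not values taken from the paper.

```python
def global_attention_pairs(height, width):
    """Every token attends to every token: N * N interactions."""
    n = height * width
    return n * n

def local_window_pairs(height, width, window=7):
    """Each token interacts with a fixed window x window neighborhood.

    Assumption: window=7 is only an illustrative value.
    """
    n = height * width
    return n * window * window

small = (32, 32)   # hypothetical token grid
large = (64, 64)   # same image at twice the resolution per side

# How much the work grows when the number of tokens grows 4x.
global_growth = global_attention_pairs(*large) / global_attention_pairs(*small)
local_growth = local_window_pairs(*large) / local_window_pairs(*small)
```

With four times as many tokens, the global count grows 16 times while the local count grows 4 times, which is the gap between quadratic and linear scaling.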

… gradual fusion decoder employs a top-down structure to combine cost map and the feature maps of different backbone levels for segmentation.
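
The top-down structure can be sketched as a loop that starts from the coarse cost map, upsamples it, and merges it with the next finer backbone feature map. This is only a structural sketch: the real decoder uses learned layers, whereas here fusion is plain channel concatenation after nearest-neighbor upsampling. The function names and the toy tensors are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbor 2x upsampling of an H x W x C nested list."""
    out = []
    for row in fmap:
        wide = [list(px) for px in row for _ in range(2)]  # repeat each pixel
        out.append(wide)
        out.append([list(px) for px in wide])              # repeat each row
    return out

def fuse(coarse, skip):
    """Upsample the coarse map, then concatenate channels with the finer map.

    Assumption: concatenation stands in for the learned fusion layers.
    """
    up = upsample2x(coarse)
    return [
        [a + b for a, b in zip(row_up, row_skip)]
        for row_up, row_skip in zip(up, skip)
    ]

def top_down_decode(cost, skips):
    """Fuse the cost map with backbone features, coarsest level first."""
    x = cost
    for skip in skips:
        x = fuse(x, skip)
    return x

cost = [[[0.9]]]  # 1x1 cost map with one channel
skips = [
    [[[1.0], [2.0]], [[3.0], [4.0]]],                # 2x2 backbone level
    [[[0.0] for _ in range(4)] for _ in range(4)],   # 4x4 backbone level
]
out = top_down_decode(cost, skips)
```

Each step doubles the spatial resolution and adds the finer level's channels, so the output here is a 4x4 map with three channels per pixel.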

To accelerate inference, … introduce a category early rejection scheme in the decoder that rejects many non-existent categories at an early decoder layer, resulting in up to 4.7 times acceleration without accuracy degradation.
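
One way to picture category early rejection: after an early decoder layer, keep only the categories that rank among the top few for at least one pixel, and pass the smaller cost map to the later layers. The top-k union rule used below is an assumption for illustration, since the excerpt above does not state the exact criterion. The category names and scores are made up.

```python
def surviving_categories(cost, k):
    """Union over all pixels of each pixel's k best-scoring categories.

    Assumption: a per-pixel top-k rule decides which categories survive.
    """
    keep = set()
    for row in cost:
        for scores in row:
            ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
            keep.update(ranked[:k])
    return sorted(keep)

def prune(cost, keep):
    """Drop the rejected category channels from the cost map."""
    return [[[scores[c] for c in keep] for scores in row] for row in cost]

category_names = ["sky", "road", "car", "zebra", "sofa"]  # hypothetical vocabulary

# Hypothetical 2x2 cost map with one score per category at each pixel.
cost = [
    [[0.9, 0.1, 0.2, 0.0, 0.1], [0.8, 0.3, 0.1, 0.1, 0.0]],
    [[0.1, 0.7, 0.6, 0.0, 0.1], [0.2, 0.9, 0.3, 0.1, 0.0]],
]

keep = surviving_categories(cost, k=2)
pruned = prune(cost, keep)
kept_names = [category_names[c] for c in keep]
```

Later layers then work on three categories instead of five. With a vocabulary of hundreds of names and only a handful present in the image, the saving is much larger.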

Experiments … demonstrate the efficacy of … SED method …

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b

Photo by engin akyurt on Unsplash

AI News Clips by Morris Lee: News to help your R&D

A computer vision consultant in artificial intelligence and related high-tech fields for 37+ years. An innovator with 66+ patents, ready to help a firm's R&D.