Improve object discovery with self-supervised transformers using TokenCut
Self-Supervised Transformers for Unsupervised Object Discovery using Normalized Cut
arXiv paper abstract https://arxiv.org/abs/2202.11539
arXiv PDF paper https://arxiv.org/pdf/2202.11539.pdf
Transformers trained with self-supervised learning using self-distillation loss (DINO) have been shown to produce attention maps that highlight salient foreground objects.
… In this paper, … demonstrate a graph-based approach that uses the self-supervised transformer features to discover an object from an image.
Visual tokens are … nodes in a weighted graph with edges representing a connectivity score based on the similarity of tokens.
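The graph construction described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name build_token_graph, the cosine-similarity thresholding rule, and the values tau and eps are assumptions chosen for the example, and the toy 2-D vectors stand in for real transformer token features.

```python
import numpy as np

def build_token_graph(features, tau=0.2, eps=1e-5):
    """Return an (N, N) edge-weight matrix for N token feature vectors.

    Assumption for illustration: edges are scored by cosine similarity,
    then thresholded so similar tokens get weight 1 and dissimilar tokens
    get a small weight eps (which keeps the graph fully connected).
    """
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    similarity = unit @ unit.T                  # pairwise cosine similarity
    return np.where(similarity >= tau, 1.0, eps)

# Toy stand-in for token features: two tokens point one way, two another.
toy_features = np.array([
    [1.00, 0.00],
    [0.95, 0.05],
    [0.00, 1.00],
    [0.05, 0.95],
])
W = build_token_graph(toy_features)
print(W)
```

Tokens 0 and 1 end up strongly connected to each other, as do tokens 2 and 3, while edges across the two groups keep only the small weight eps.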
Foreground objects can then be segmented using a normalized graph-cut to group self-similar regions.
… solve the graph-cut … using spectral clustering with generalized eigen-decomposition and show … second smallest eigenvector … a cutting solution since its absolute value indicates … token belongs to a foreground object.
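To make the spectral step concrete, here is a small sketch of a normalized cut on a given weight matrix W. It is a hedged illustration under stated assumptions, not the paper's implementation: the function name ncut_bipartition is hypothetical, the generalized problem (D - W) y = lambda D y is solved by reducing it to an equivalent symmetric standard problem, the split at the mean of the eigenvector is one possible thresholding choice, and picking the side that holds the largest absolute eigenvector entry as foreground is my reading of the sentence above.

```python
import numpy as np

def ncut_bipartition(W):
    """Split graph nodes into two groups with a normalized cut.

    Solves (D - W) y = lambda * D y through the symmetric form
    D^{-1/2} (D - W) D^{-1/2} z = lambda * z, with y = D^{-1/2} z.
    Returns a boolean foreground mask and the second smallest eigenvector y.
    """
    W = np.asarray(W, dtype=float)
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    laplacian = np.eye(len(degree)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(laplacian)   # eigenvalues come back ascending
    y = d_inv_sqrt * eigvecs[:, 1]           # second smallest eigenvector
    mask = y > y.mean()                      # assumed split point: the mean
    seed = np.argmax(np.abs(y))              # token with largest |y|
    if not mask[seed]:                       # assumed rule: seed side is foreground
        mask = ~mask
    return mask, y

# Toy graph: tokens 0-2 are mutually similar, tokens 3-4 are mutually similar.
eps = 1e-5
W_toy = np.full((5, 5), eps)
W_toy[:3, :3] = 1.0
W_toy[3:, 3:] = 1.0
foreground, y = ncut_bipartition(W_toy)
print(foreground)
```

On this toy graph the cut separates tokens 0-2 from tokens 3-4. The eigenvector has its largest absolute values on the smaller group, so under the assumed rule that group is the one labeled foreground.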
… performance of unsupervised object discovery: … improve over … state of the art LOST by a margin of 6.9%, 8.1%, and 8.1% respectively on the VOC07, VOC12, and COCO20K. …
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website
LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b