Improve video retrieval with text by comparing coarse and fine features with X-CLIP


X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval
arXiv paper abstract https://arxiv.org/abs/2207.07285
arXiv PDF paper https://arxiv.org/pdf/2207.07285.pdf
Application of X-CLIP for zero-shot video classification
Twitter video https://twitter.com/fcakyon/status/1569294816428556289
Online demo https://huggingface.co/spaces/fcakyon/zero-shot-video-classification

Video-text retrieval … a crucial … task … However, cross-grained contrast, which is the contrast between coarse-grained representations and fine-grained representations, has rarely been explored

… cross-grained contrast calculates the correlation between coarse-grained features and each fine-grained feature, and is able to filter out the unnecessary fine-grained features guided by the coarse-grained feature during similarity calculation
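
To make the idea concrete, here is a minimal sketch of a cross-grained comparison: one coarse vector (a whole-video embedding) is scored against each fine vector (one embedding per word). The toy embeddings, the function names, and the use of cosine similarity are assumptions for illustration, not the paper's code.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    # Scale a vector to unit length so the dot product becomes cosine similarity.
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cross_grained_scores(coarse, fine):
    # One score per fine-grained feature: how well it agrees with the coarse one.
    c = normalize(coarse)
    return [dot(c, normalize(f)) for f in fine]

# Hypothetical 2-D embeddings: one for the whole video, one per word.
video = [1.0, 0.0]
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

scores = cross_grained_scores(video, words)
print(scores)
```

The first word lines up with the video and scores highest, while the second is orthogonal to it and scores zero. Low scores like that are what let the coarse feature guide which fine features to discount.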

… presents a novel multi-grained contrastive model, namely X-CLIP, for video-text retrieval.

… another challenge lies in the similarity aggregation problem, which aims to aggregate fine-grained and cross-grained similarity matrices to instance-level similarity.

… propose the Attention Over Similarity Matrix (AOSM) module to make the model focus on the contrast between essential frames and words, thus lowering the impact of unnecessary frames and words on retrieval results.
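
The aggregation step can be sketched as a softmax-weighted sum over similarity scores, so that high-scoring frame-word pairs dominate the final value. This is one plausible reading of the AOSM idea under stated assumptions, not the authors' implementation: the temperature `tau`, the function names, and the choice to average the two pooling orders are all illustrative.

```python
import math

def softmax(xs, tau=1.0):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp((x - m) / tau) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(scores, tau=1.0):
    # Weight each score by its own softmax weight, then sum.
    # A small tau approaches max pooling; a large tau approaches the mean.
    weights = softmax(scores, tau)
    return sum(w * s for w, s in zip(weights, scores))

def pool_matrix(sim, tau=1.0):
    # sim[i][j] is the similarity between frame i and word j (assumed layout).
    per_frame = [attention_pool(row, tau) for row in sim]
    frames_side = attention_pool(per_frame, tau)
    per_word = [attention_pool(list(col), tau) for col in zip(*sim)]
    words_side = attention_pool(per_word, tau)
    # Average the two pooling orders into one instance-level score.
    return 0.5 * (frames_side + words_side)

# Hypothetical frame-word similarities: only one pair matches strongly.
sim = [[0.9, 0.1],
       [0.1, 0.1]]
score = pool_matrix(sim, tau=0.1)
print(score)
```

With a small temperature, the single strong frame-word pair drives the result well above the plain mean of 0.3, which is the sense in which unnecessary frames and words have less impact on the retrieval score.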

… outperforms the previous state-of-the-art by +6.3%, +6.6%, +11.1%, +6.7%, +3.8% relative improvements on these benchmarks …

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b

Photo by Alexei Maridashvili on Unsplash


AI News Clips by Morris Lee: News to help your R&D

Written by AI News Clips by Morris Lee: News to help your R&D

A computer vision consultant in artificial intelligence and related high-tech fields for 37+ years. An innovator with 66+ patents, ready to help a firm's R&D.
