Find the moment in a video that matches a description
QVHighlights: Detecting Moments and Highlights in Videos via Natural Language Queries
arXiv paper abstract https://arxiv.org/abs/2107.09609
arXiv PDF paper https://arxiv.org/pdf/2107.09609.pdf
GitHub https://github.com/jayleicn/moment_detr
Detecting customized moments and highlights from videos given natural language (NL) user queries is an important but under-studied topic.
… present a strong baseline for this task, Moment-DETR, a transformer encoder-decoder model that views moment retrieval as a direct set prediction problem, taking extracted video and query representations as inputs and predicting moment coordinates and saliency scores end-to-end.
While our model does not utilize any human prior, we show that it performs competitively when compared to well-engineered architectures.
With weakly supervised pretraining using ASR captions, Moment-DETR substantially outperforms previous methods. …
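To make "predicting moment coordinates and saliency scores" more concrete, here is a minimal Python sketch of how such outputs could be post-processed. It is not taken from the Moment-DETR repository. It assumes each predicted moment is a normalized (center, width) pair with a confidence score, and the function names span_cw_to_seconds, temporal_iou, and best_moment are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: not the Moment-DETR code.
# Assumption: each prediction is (center, width, score), with center and
# width normalized to [0, 1] relative to the video duration.

def span_cw_to_seconds(center, width, duration):
    """Convert a normalized (center, width) span to (start, end) in seconds."""
    start = max(0.0, (center - width / 2.0) * duration)
    end = min(duration, (center + width / 2.0) * duration)
    return start, end


def temporal_iou(a, b):
    """Intersection over union of two (start, end) spans in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def best_moment(predictions, duration):
    """Return ((start, end), score) for the highest-scoring prediction."""
    center, width, score = max(predictions, key=lambda p: p[2])
    return span_cw_to_seconds(center, width, duration), score


if __name__ == "__main__":
    # Hypothetical predictions for an 80-second video.
    preds = [(0.5, 0.25, 0.3), (0.25, 0.5, 0.9)]
    moment, score = best_moment(preds, duration=80.0)
    print("best moment (s):", moment, "score:", score)
    print("IoU vs. a 30-50 s reference:", temporal_iou(moment, (30.0, 50.0)))
```

Temporal IoU is a common way to compare a predicted span with an annotated one. The sketch uses it only to show what a "moment coordinate" means in seconds, not to reproduce the paper's evaluation.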
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website
LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b