Get summary video from a text search

GPT2MVS: Generative Pre-trained Transformer-2 for Multi-modal Video Summarization
arXiv paper abstract https://arxiv.org/abs/2104.12465v1
arXiv PDF paper https://arxiv.org/pdf/2104.12465v1.pdf

… a text-based query is considered as one of the main drivers of video summary generation, as it is user-defined. … The proposed model consists of a contextualized video summary controller, multi-modal attention mechanisms, an interactive attention network, and a video summary generator. Based on the evaluation of the existing multi-modal video summarization benchmark, experimental results show that the proposed model is effective with the increase of +5.88% in accuracy and +4.06% increase of F1-score, compared with the state-of-the-art method.
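To make the idea of a query-driven summary concrete, here is a minimal sketch of picking frames by their similarity to a text query. This is not the GPT2MVS model: it stands in a plain cosine-similarity ranking for the paper's attention networks and summary generator, and the names `cosine`, `summarize`, `frames`, and `query` are hypothetical. It also assumes the frames and the query have already been embedded into the same feature space by some encoder, which the sketch does not provide.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def summarize(frame_features, query_feature, k):
    """Return the indices of the k frames most similar to the query,
    in their original temporal order (hypothetical helper)."""
    scores = [cosine(f, query_feature) for f in frame_features]
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)

# Toy, made-up embeddings: frames 2 and 3 point in the query's direction.
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
query = [0.0, 1.0]
selected = summarize(frames, query, k=2)
print(selected)
```

With these toy vectors the two frames closest to the query are kept, so a different query over the same video would yield a different summary, which is the user-defined behavior the abstract describes.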

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b

Photo by Najib Kalil on Unsplash

AI News Clips by Morris Lee: News to help your R&D

Written by AI News Clips by Morris Lee: News to help your R&D

A computer vision consultant in artificial intelligence and related high-tech fields for 37+ years. An innovator with 66+ patents, ready to help a firm's R&D.
