Survey of video captioning using deep learning

AI News Clips by Morris Lee: News to help your R&D

1 min read · Apr 26, 2023

A Review of Deep Learning for Video Captioning
arXiv paper abstract https://arxiv.org/abs/2304.11431
arXiv PDF paper https://arxiv.org/pdf/2304.11431.pdf

Video captioning (VC) is a fast-moving, cross-disciplinary area of research that bridges work in the fields of computer vision, natural language processing (NLP), linguistics, and human-computer interaction.

In essence, VC involves understanding a video and describing it with language.

Captioning is used in a host of applications, from creating more accessible interfaces (e.g., low-vision navigation) to video question answering (V-QA), video retrieval, and content generation.

This survey covers deep learning-based VC, including, but not limited to, attention-based architectures, graph networks, reinforcement learning, adversarial networks, and dense video captioning (DVC).
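To make "attention-based" a little more concrete, here is a minimal sketch of soft temporal attention over per-frame features, written in plain Python. It is not taken from the survey: the function names (`softmax`, `attend`), the toy 2-dimensional frame features, and the use of scaled dot-product scoring are all assumptions chosen for illustration. Real captioning models would use learned, high-dimensional features.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, frame_features):
    """Scaled dot-product attention of one query over a list of frame vectors.

    Returns (weights, context): one weight per frame, and the weighted
    average of the frame vectors. Names and shapes are hypothetical.
    """
    dim = len(query)
    scores = [
        sum(q * f for q, f in zip(query, frame)) / math.sqrt(dim)
        for frame in frame_features
    ]
    weights = softmax(scores)
    context = [
        sum(w * frame[i] for w, frame in zip(weights, frame_features))
        for i in range(dim)
    ]
    return weights, context

# Toy example: three "frames", each a 2-d feature vector (made-up values).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]  # stands in for a decoder state while emitting one word
weights, context = attend(query, frames)
```

In this toy setup the first and third frames align with the query and receive equal, larger weights than the second. A caption decoder would typically recompute such weights at every word it generates.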

… discuss the datasets and evaluation metrics used in the field, and limitations, applications, challenges, and future directions for VC.
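The post does not name the metrics, so the following is only an illustrative sketch of one common ingredient: clipped n-gram precision, the building block of BLEU, which is widely used to score generated captions against references. The function names, the whitespace tokenization, and the example sentences are assumptions made for this sketch. Full BLEU additionally combines several n-gram orders and applies a brevity penalty, which is omitted here.

```python
from collections import Counter

def ngrams(tokens, n):
    # All contiguous n-token windows, as tuples so they can be counted.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_precision(candidate, reference, n=1):
    """Fraction of candidate n-grams that also occur in the reference.

    Each candidate n-gram is credited at most as many times as it
    appears in the reference (the "clipping" step).
    """
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total

# Made-up caption pair for illustration.
candidate = "a man is playing a guitar"
reference = "a man plays a guitar on stage"
p1 = clipped_precision(candidate, reference, 1)  # unigram precision
p2 = clipped_precision(candidate, reference, 2)  # bigram precision
```

Clipping keeps a degenerate caption such as "a a a" from scoring well just by repeating a word that appears in the reference.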

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b
