Retrieve images using complex text descriptions by Neural Divide-and-Conquer Reasoning with NDCR


A Neural Divide-and-Conquer Reasoning Framework for Image Retrieval from Linguistically Complex Text
arXiv paper abstract https://arxiv.org/abs/2305.02265
arXiv PDF paper https://arxiv.org/pdf/2305.02265.pdf

Pretrained Vision-Language Models (VLMs) have remarkable performance in image retrieval from text. However … performance drops … with linguistically complex texts …

… regard … complex texts as compound proposition texts composed of multiple simple proposition sentences and propose an end-to-end Neural Divide-and-Conquer Reasoning framework, dubbed NDCR.

It contains three main components: 1) Divide: … divides the compound proposition text into simple proposition sentences …

2) Conquer: … a pretrained … interactor achieves the interaction between … sentences and images,

3) Combine: … reasoner combines the … states to obtain the final solution …
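The three steps above can be illustrated with a toy sketch. This is not the NDCR implementation: the paper uses neural modules for each stage, while the sketch below swaps in simple stand-ins so the control flow is easy to follow. Everything here is an assumption made for illustration: the function names `divide`, `conquer`, `combine`, and `retrieve`, the rule-based sentence splitting, the tag-overlap scoring in place of a pretrained VLM interactor, and the use of a minimum as a soft logical AND in place of the learned reasoner.

```python
import re

# Hypothetical stop-word list, used only by this toy scorer.
STOPWORDS = {"a", "an", "the", "is", "are", "on", "in", "of"}


def divide(compound_text):
    """Divide: split a compound description into simple proposition sentences.

    Assumption: a naive split on commas, semicolons, and the word "and".
    NDCR learns this step with a neural proposition generator instead.
    """
    parts = re.split(r",|;|\band\b", compound_text)
    return [p.strip().lower() for p in parts if p.strip()]


def conquer(sentence, image_tags):
    """Conquer: score one simple sentence against one image.

    Assumption: each image is represented by a set of tag words, and the
    score is the fraction of content words in the sentence found in the tags.
    NDCR uses a pretrained VLM-based interactor for this step.
    """
    words = set(re.findall(r"[a-z]+", sentence.lower())) - STOPWORDS
    if not words:
        return 0.0
    return len(words & image_tags) / len(words)


def combine(scores):
    """Combine: merge per-sentence scores into one score for the image.

    Assumption: the minimum acts as a soft logical AND, so an image must
    satisfy every simple proposition to rank highly.
    """
    return min(scores) if scores else 0.0


def retrieve(compound_text, images):
    """Rank candidate images for a compound description."""
    sentences = divide(compound_text)
    ranked = {
        name: combine([conquer(s, tags) for s in sentences])
        for name, tags in images.items()
    }
    best = max(ranked, key=ranked.get)
    return best, ranked


# Made-up example data: three candidate images described by tag sets.
text = "a dog is on the grass, and a red ball is in the air"
images = {
    "img_a": {"dog", "grass"},
    "img_b": {"dog", "grass", "red", "ball", "air"},
    "img_c": {"red", "ball", "air"},
}
best, ranked = retrieve(text, images)
print(best, ranked)
```

In this toy data only `img_b` satisfies both simple propositions, so it is the only candidate that keeps a nonzero score after the combine step. Scoring each simple sentence separately and then combining is the point of the divide-and-conquer design: an image that matches only part of a long description is penalized explicitly rather than averaged into a middling similarity score.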

… Experimental results and analyses indicate NDCR significantly improves performance in the complex image-text reasoning problem …

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b

Photo by Nicolas Houdayer on Unsplash

Written by AI News Clips by Morris Lee: News to help your R&D

A computer vision consultant in artificial intelligence and related high-tech fields for 37+ years. An innovator with 66+ patents, ready to help a firm's R&D.
