Match an image taken at ground-level to an aerial photo with TransGCNN
Transformer-Guided Convolutional Neural Network for Cross-View Geolocalization
arXiv paper abstract https://arxiv.org/abs/2204.09967v1
arXiv PDF paper https://arxiv.org/pdf/2204.09967v1.pdf
Ground-to-aerial geolocalization refers to localizing a ground-level query image by matching it to a reference database of geo-tagged aerial imagery.
This is very challenging due to the huge perspective differences in visual appearances and geometric configurations between these two views.
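To make the retrieval framing concrete, here is a minimal sketch of matching one ground-level descriptor against a small aerial reference database by cosine similarity. The function names, the tile identifiers, and the toy descriptor values are all assumptions made up for illustration. They are not outputs of TransGCNN, and the paper may use a different similarity measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def localize(query_descriptor, reference_db):
    """Return the tag of the aerial descriptor most similar to the query.

    reference_db is a list of (tag, descriptor) pairs. Both the layout and
    the name are hypothetical, chosen only for this illustration.
    """
    best_tag, _ = max(
        reference_db,
        key=lambda entry: cosine_similarity(query_descriptor, entry[1]),
    )
    return best_tag


# Toy descriptors, made up for illustration (not real model outputs).
reference_db = [
    ("tile_A", [0.9, 0.1, 0.0]),
    ("tile_B", [0.1, 0.8, 0.3]),
    ("tile_C", [0.0, 0.2, 0.9]),
]
query = [0.2, 0.7, 0.4]
predicted_tile = localize(query, reference_db)
```

In a real system each tag would carry the geo-coordinates of its aerial image, and the descriptors would come from a trained network rather than being typed in by hand.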
… propose … Transformer-guided convolutional neural network (TransGCNN) … couples CNN-based local features with Transformer-based global representations for enhanced representation learning.
… TransGCNN consists of a CNN backbone extracting a feature map from an input image and a Transformer head modeling global context from the CNN map.
… Transformer head acts as a spatial-aware importance generator to select salient CNN features as the final feature representation.
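One way to picture "importance-guided" feature selection is as a weighted pooling over spatial locations. The sketch below assumes the importance scores are turned into weights with a softmax and used to sum the per-location CNN features. That choice is an assumption for illustration, since the excerpt above does not spell out the exact mechanism. The scores here are supplied by hand, whereas in the paper they would come from the Transformer head.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a flat list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(feature_map, importance_scores):
    """Importance-weighted sum of per-location features.

    feature_map: list of N feature vectors, one per spatial location
        (a flattened H x W grid), each of length C.
    importance_scores: list of N raw scores, one per location
        (hand-written here; hypothetical stand-in for a learned head).
    """
    weights = softmax(importance_scores)
    channels = len(feature_map[0])
    return [
        sum(w * feat[c] for w, feat in zip(weights, feature_map))
        for c in range(channels)
    ]


# Four spatial locations with two channels each (toy values).
feature_map = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [4.0, 0.0]]
scores = [0.0, 0.0, 5.0, 0.0]  # the third location is marked as salient
descriptor = aggregate(feature_map, scores)
```

With uniform scores this reduces to plain average pooling. With one dominant score, the descriptor is pulled toward the feature at that location, which is the sense in which salient features are "selected".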
… model achieves top-1 accuracy … which outperforms the second-best baseline with fewer than 50% of the parameters and an almost 2x higher frame rate.
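For readers unfamiliar with the metric, top-1 accuracy is the fraction of ground-level queries whose highest-ranked aerial image is the correct one. A minimal sketch follows. The function name and the tile identifiers are made up for illustration and are not results from the paper.

```python
def top1_accuracy(predicted_ids, true_ids):
    """Fraction of queries whose top-ranked aerial image is the correct one."""
    if len(predicted_ids) != len(true_ids):
        raise ValueError("predictions and ground truth must have equal length")
    if not true_ids:
        raise ValueError("need at least one query")
    hits = sum(p == t for p, t in zip(predicted_ids, true_ids))
    return hits / len(true_ids)


# Toy example: 3 of the 4 queries retrieve the correct tile first.
predicted = ["tile_B", "tile_A", "tile_C", "tile_C"]
actual = ["tile_B", "tile_A", "tile_A", "tile_C"]
accuracy = top1_accuracy(predicted, actual)
```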
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website
LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b