Scene segmentation 7.3 times faster with 3D transformer using patch attention

PatchFormer: A Versatile 3D Transformer Based on Patch Attention
arXiv paper abstract https://arxiv.org/abs/2111.00207
arXiv PDF paper https://arxiv.org/pdf/2111.00207.pdf

… 3D vision community … shift from CNNs to … pure Transformer architectures have attained top accuracy on the major 3D learning benchmarks.

… 3D Transformers … has quadratic complexity (both in space and time) with respect to input size.

To solve … introduce patch-attention to adaptively learn a much smaller set of bases upon which the attention maps are computed.

… patch-attention not only captures the global shape context but also achieves linear complexity to input size.
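To make the linear-complexity idea concrete, here is a minimal NumPy sketch of patch-style attention under my own assumptions (the function names, shapes, and projection matrices are illustrative, not the paper's actual implementation): the N input points are first summarized into a much smaller set of M learned bases, and each point then attends only to those M bases, so the cost is O(N·M) rather than O(N²).

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def patch_attention(x, W_base, W_q, W_v):
    """Illustrative patch-style attention (not the paper's exact method).

    x      : (N, d) point features
    W_base : (d, M) projection used to form M << N learned bases
    W_q    : (d, d) query projection
    W_v    : (d, d) value projection
    """
    # Step 1: aggregate the N points into M bases.
    # Column-wise softmax gives each basis a weighting over all points,
    # so the bases can capture global shape context.
    s = softmax(x @ W_base, axis=0)              # (N, M)
    bases = s.T @ (x @ W_v)                      # (M, d)

    # Step 2: each point attends to the M bases only,
    # so the attention map is (N, M) -- linear in N.
    q = x @ W_q                                  # (N, d)
    attn = softmax(q @ bases.T / np.sqrt(x.shape[1]), axis=-1)  # (N, M)
    return attn @ bases                          # (N, d)
```

With, say, N = 10,000 points and M = 64 bases, the attention map has 640k entries instead of the 100M entries a full N×N self-attention would need, which is where the speed-up over quadratic 3D Transformers comes from.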

… propose a lightweight Multi-scale Attention (MSA) block to build attentions among features of different scales, providing the model with multi-scale features.
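A rough NumPy sketch of what "attention among features of different scales" could look like, purely as an assumption on my part (the paper's MSA block is not specified here, so shapes and fusion choices below are hypothetical): the same N points carry features from S scales, and attention is computed across the scale axis for each point to fuse them.

```python
import numpy as np

def multi_scale_attention(feats):
    """Illustrative cross-scale attention (hypothetical MSA-style fusion).

    feats : list of S arrays, each (N, d) -- features for the same
            N points at S different scales.
    Returns a fused (N, d) feature array.
    """
    F = np.stack(feats, axis=1)                      # (N, S, d)

    # Per-point attention across scales: each scale's feature attends
    # to every scale's feature for that same point. The map is only
    # (S, S) per point, which keeps the block lightweight.
    scores = np.einsum('nsd,ntd->nst', F, F) / np.sqrt(F.shape[-1])
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = e / e.sum(axis=-1, keepdims=True)            # (N, S, S)

    fused = np.einsum('nst,ntd->nsd', w, F)          # (N, S, d)
    return fused.mean(axis=1)                        # (N, d)
```

Because the attention here is only over the S scales (typically a handful), the added cost is negligible next to the point-wise attention itself, while the output mixes coarse and fine context for every point.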

… network achieves strong accuracy on general 3D recognition tasks with a 7.3x speed-up over previous 3D Transformers.

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b

Photo by Joao Tzanno on Unsplash


A computer vision consultant in artificial intelligence and related high-tech technologies for 37+ years. An innovator with 66+ patents, ready to help a firm's R&D.

AI News Clips by Morris Lee: News to help your R&D
