Scene segmentation 7.3 times faster with 3D transformer using patch attention
PatchFormer: A Versatile 3D Transformer Based on Patch Attention
arXiv paper abstract: https://arxiv.org/abs/2111.00207
arXiv paper PDF: https://arxiv.org/pdf/2111.00207.pdf
… 3D vision community … shift from CNNs to … pure Transformer architectures have attained top accuracy on the major 3D learning benchmarks.
… 3D Transformers … have quadratic complexity (both in space and time) with respect to input size.
To solve … introduce patch-attention to adaptively learn a much smaller set of bases upon which the attention maps are computed.
… patch-attention not only captures the global shape context but also achieves linear complexity to input size.
… propose a lightweight Multi-scale Attention (MSA) block to build attentions among features of different scales, providing the model with multi-scale features.
… network achieves strong accuracy on general 3D recognition tasks with a 7.3x speed-up over previous 3D Transformers.
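For illustration only, here is a minimal PyTorch-style sketch of the patch-attention idea described in the abstract: instead of forming an N x N attention map over all points, the queries attend to a much smaller set of M learned bases, so the cost grows linearly with the number of input points N. The module name, the basis-assignment scheme, and parameters such as num_bases are my own assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class PatchAttention(nn.Module):
    """Hypothetical sketch of patch attention: queries attend to a small
    learned set of M bases instead of all N input points, so the cost is
    O(N * M), i.e. linear in the number of points N."""
    def __init__(self, dim, num_bases=32):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)
        # Small head that softly assigns the N points to M bases (assumed design).
        self.to_bases = nn.Linear(dim, num_bases)
        self.to_kv = nn.Linear(dim, 2 * dim)
        self.scale = dim ** -0.5

    def forward(self, x):                                   # x: (B, N, dim)
        q = self.to_q(x)                                    # (B, N, dim)
        # Normalize assignments over the points, then aggregate point features
        # into M basis vectors that summarize the global shape context.
        assign = self.to_bases(x).softmax(dim=1)            # (B, N, M)
        bases = torch.einsum('bnm,bnd->bmd', assign, x)     # (B, M, dim)
        k, v = self.to_kv(bases).chunk(2, dim=-1)           # (B, M, dim) each
        attn = (q @ k.transpose(-2, -1)) * self.scale       # (B, N, M) attention map
        attn = attn.softmax(dim=-1)
        return attn @ v                                      # (B, N, dim)

# Example: 1024 points with 64-dim features attend to 32 learned bases.
pa = PatchAttention(dim=64, num_bases=32)
out = pa(torch.randn(2, 1024, 64))                           # -> (2, 1024, 64)
```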
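And a rough sketch of what a lightweight Multi-scale Attention (MSA) block could look like: features from a fine and a coarse scale are projected to a common width and attend to their concatenation, mixing context across scales. This is only an assumed mechanism for illustration; the MSA block in the paper may be structured differently.

```python
class MultiScaleAttention(nn.Module):
    """Hypothetical MSA sketch: tokens from two scales are projected to a
    shared width and jointly self-attend, so each scale can borrow context
    from the other."""
    def __init__(self, dim_fine, dim_coarse, dim=128, num_heads=4):
        super().__init__()
        self.proj_fine = nn.Linear(dim_fine, dim)
        self.proj_coarse = nn.Linear(dim_coarse, dim)
        self.attn = nn.MultiheadAttention(dim, num_heads=num_heads, batch_first=True)

    def forward(self, fine, coarse):       # (B, N1, dim_fine), (B, N2, dim_coarse)
        f = self.proj_fine(fine)
        c = self.proj_coarse(coarse)
        tokens = torch.cat([f, c], dim=1)   # (B, N1 + N2, dim)
        out, _ = self.attn(tokens, tokens, tokens)
        return out                          # (B, N1 + N2, dim) multi-scale features
```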
Stay up to date. Subscribe to my posts: https://morrislee1234.wixsite.com/website/contact
Website with my other posts by category: https://morrislee1234.wixsite.com/website
LinkedIn: https://www.linkedin.com/in/morris-lee-47877b7b