TVConv: Efficient Translation Variant Convolution for Layout-aware Visual Processing

doi:10.1109/CVPR52688.2022.01222

Research Output Details

Title	TVConv: Efficient Translation Variant Convolution for Layout-aware Visual Processing
Authors	Chen, Jierun 1; He, Tianlang 1; Zhuo, Weipeng 1; Ma, Li 1; Ha, Sangtae 2; Chan, S. H. Gary 1
Publication date	2022
Conference name	2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
Proceedings title	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN	1063-6919
Volume	2022-June
Pages	12538-12548
Conference dates	2022-06-19 to 2022-06-24
Conference location	New Orleans
Abstract	As convolution has empowered many smart applications, dynamic convolution further equips it with the ability to adapt to diverse inputs. However, the static and dynamic convolutions are either layout-agnostic or computation-heavy, making them inappropriate for layout-specific applications, e.g., face recognition and medical image segmentation. We observe that these applications naturally exhibit the characteristics of large intra-image (spatial) variance and small cross-image variance. This observation motivates our efficient translation variant convolution (TVConv) for layout-aware visual processing. Technically, TVConv is composed of affinity maps and a weight-generating block. While affinity maps depict pixel-paired relationships gracefully, the weight-generating block can be explicitly over-parameterized for better training while maintaining efficient inference. Although conceptually simple, TVConv significantly improves the efficiency of the convolution and can be readily plugged into various network architectures. Extensive experiments on face recognition show that TVConv reduces the computational cost by up to 3.1× and improves the corresponding throughput by 2.3× while maintaining a high accuracy compared to the depthwise convolution. Moreover, for the same computation cost, we boost the mean accuracy by up to 4.21%. We also conduct experiments on the optic disc/cup segmentation task and obtain better generalization performance, which helps mitigate the critical data scarcity issue. Code is available at https://github.com/JierunChen/TVConv.
Keywords	biological and cell microscopy; Deep learning architectures and techniques; Efficient learning and inferences; Face and gestures; Medical; Vision applications and systems
DOI	10.1109/CVPR52688.2022.01222
Language	English
Scopus accession number	2-s2.0-85141567667
Citation statistics	Times cited: 28 [WOS]
Document type	Conference paper
Item identifier	https://repository.uic.edu.cn/handle/39GCC9TT/13687
Collection	Individual knowledge output produced outside this institution
Author affiliations	1. The Hong Kong University of Science and Technology, Hong Kong; 2. University of Colorado at Boulder, United States
Recommended citation (GB/T 7714)	Chen, Jierun, He, Tianlang, Zhuo, Weipeng, et al. TVConv: Efficient Translation Variant Convolution for Layout-aware Visual Processing[C], 2022: 12538-12548.
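
Illustrative sketch (not part of the repository record). The abstract describes TVConv as a set of affinity maps plus a weight-generating block that together yield position-dependent convolution weights. The NumPy sketch below is one minimal reading of that description, not the authors' implementation (which is at the GitHub URL in the abstract). Everything concrete in it is an assumption made for illustration: the function names `generate_weights` and `tvconv_depthwise`, the two-layer per-pixel ReLU block, the depthwise form of the convolution, and all tensor sizes.

```python
import numpy as np


def generate_weights(affinity, w1, w2, channels, k):
    """Hypothetical weight-generating block.

    affinity: (C_A, H, W) maps shared by all images (learned in practice).
    w1: (hidden, C_A) and w2: (channels*k*k, hidden) act per pixel,
    like two 1x1 convolutions with a ReLU in between.
    Returns per-pixel depthwise kernels of shape (channels, k, k, H, W).
    """
    c_a, h, w = affinity.shape
    flat = affinity.reshape(c_a, h * w)      # one column per pixel
    hidden = np.maximum(w1 @ flat, 0.0)      # ReLU
    kernels = w2 @ hidden                    # (channels*k*k, H*W)
    return kernels.reshape(channels, k, k, h, w)


def tvconv_depthwise(x, kernels):
    """Apply a different k x k depthwise kernel at every pixel.

    x: (C, H, W) feature map; kernels: (C, k, k, H, W).
    Uses zero padding so the output keeps shape (C, H, W).
    """
    c, k, _, h, w = kernels.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(x, dtype=float)
    for u in range(k):
        for v in range(k):
            # kernels[:, u, v] is a (C, H, W) map of weights for offset (u, v)
            out += kernels[:, u, v] * xp[:, u:u + h, v:v + w]
    return out


# Toy sizes, chosen only for the demonstration.
rng = np.random.default_rng(0)
C, H, W, K, C_A, HID = 4, 8, 8, 3, 2, 16
affinity = rng.normal(size=(C_A, H, W))
w1 = rng.normal(size=(HID, C_A))
w2 = rng.normal(size=(C * K * K, HID))

kernels = generate_weights(affinity, w1, w2, C, K)  # depends on position only
x = rng.normal(size=(C, H, W))
y = tvconv_depthwise(x, kernels)
print(kernels.shape, y.shape)
```

In this sketch the generated kernels depend only on the affinity maps and not on the input `x`, so they could be computed once after training and reused for every image. That is one plausible way to read the abstract's claim of over-parameterized training with efficient inference. If the affinity maps were spatially constant, every pixel would receive the same kernel and the operation would reduce to an ordinary depthwise convolution.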