Natural Adversarial Examples

Friday Poster Session

Bottleneck Transformers for Visual Recognition

Aravind Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani

We present BoTNet, a conceptually simple yet powerful backbone architecture that incorporates self-attention for multiple computer vision tasks including image classification, object detection and instance segmentation.

334.00

Friday Poster Session

Involution: Inverting the Inherence of Convolution for Visual Recognition

Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, Qifeng Chen

Convolution has been the core ingredient of modern neural networks, triggering the surge of deep learning in vision.

300.75

Thursday Poster Session

Simple Copy-Paste Is a Strong Data Augmentation Method for Instance Segmentation

Golnaz Ghiasi, Yin Cui, Aravind Srinivas, Rui Qian, Tsung-Yi Lin, Ekin D. Cubuk, Quoc V. Le, Barret Zoph

Building instance segmentation models that are data-efficient and can handle rare object categories is an important challenge in computer vision.

289.50

Tuesday Poster Session

NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections

Ricardo Martin-Brualla, Noha Radwan, Mehdi S. M. Sajjadi, Jonathan T. Barron, Alexey Dosovitskiy, Daniel Duckworth

We present a learning-based method for synthesizing novel views of complex scenes using only unstructured collections of in-the-wild photographs.

268.50

Wednesday Poster Session

Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

S. Mahdi H. Miangoleh, Sebastian Dille, Long Mai, Sylvain Paris, Yagiz Aksoy

Neural networks have shown great abilities in estimating depth from a single image.

Wednesday Poster Session

Robust Consistent Video Depth Estimation

Johannes Kopf, Xuejian Rong, Jia-Bin Huang

We present an algorithm for estimating consistent dense depth maps and camera poses from a monocular video.

246.50

Monday Poster Session

NeX: Real-Time View Synthesis With Neural Basis Expansion

Suttisak Wizadwongsa, Pakkapon Phongthawee, Jiraphon Yenphraphai, Supasorn Suwajanakorn

We present NeX, a new approach to novel view synthesis based on enhancements of multiplane image (MPI) that can reproduce next-level view-dependent effects--in real time.

246.25

Wednesday Poster Session

Motion Representations for Articulated Animation

Aliaksandr Siarohin, Oliver J. Woodford, Jian Ren, Menglei Chai, Sergey Tulyakov

We propose novel motion representations for animating articulated objects consisting of distinct parts.

Thursday Poster Session

Omnimatte: Associating Objects and Their Effects in Video

Erika Lu, Forrester Cole, Tali Dekel, Andrew Zisserman, William T. Freeman, Michael Rubinstein

Computer vision has become increasingly better at segmenting objects in images and videos; however, scene effects related to the objects -- shadows, reflections, generated smoke, etc.

Tuesday Poster Session

Closed-Form Factorization of Latent Semantics in GANs

Yujun Shen, Bolei Zhou

A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images.

211.00

Monday Poster Session

Scene Essence

Jiayan Qiu, Yiding Yang, Xinchao Wang, Dacheng Tao

What scene elements, if any, are indispensable for recognizing a scene? We strive to answer this question through the lens of an end-to-end learning scheme.

Wednesday Poster Session

Neural Geometric Level of Detail: Real-Time Rendering With Implicit 3D Shapes

Towaki Takikawa, Joey Litalien, Kangxue Yin, Karsten Kreis, Charles Loop, Derek Nowrouzezahrai, Alec Jacobson, Morgan McGuire, Sanja Fidler

Neural signed distance functions (SDFs) are emerging as an effective representation for 3D shapes.

207.25

Thursday Poster Session

Back to the Feature: Learning Robust Camera Localization From Pixels To Pose

Paul-Edouard Sarlin, Ajaykumar Unagar, Mans Larsson, Hugo Germain, Carl Toft, Viktor Larsson, Marc Pollefeys, Vincent Lepetit, Lars Hammarstrand, Fredrik Kahl, Torsten Sattler

Camera pose estimation in known scenes is a 3D geometry task recently tackled by multiple learning algorithms.

198.25

Tuesday Poster Session

Seung Wook Kim, Jonah Philion, Antonio Torralba, Sanja Fidler

Realistic simulators are critical for training and verifying robotics systems.

Tuesday Poster Session

Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes

Zhengqi Li, Simon Niklaus, Noah Snavely, Oliver Wang

We present a method to perform novel view and time synthesis of dynamic scenes, requiring only a monocular video with known camera poses as input.

151.00

Tuesday Poster Session

GAN Prior Embedded Network for Blind Face Restoration in the Wild

Tao Yang, Peiran Ren, Xuansong Xie, Lei Zhang

Blind face restoration (BFR) from severely degraded face images in the wild is a very challenging problem.

Monday Poster Session

Image Generators With Conditionally-Independent Pixel Synthesis

Ivan Anokhin, Kirill Demochkin, Taras Khakhulin, Gleb Sterkin, Victor Lempitsky, Denis Korzhenkov

Existing image generator networks rely heavily on spatial convolutions and, optionally, self-attention blocks in order to gradually synthesize images in a coarse-to-fine manner.

143.00

Thursday Poster Session

Semantic Segmentation With Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization

Daiqing Li, Junlin Yang, Karsten Kreis, Antonio Torralba, Sanja Fidler

Training deep networks with limited labeled data while achieving a strong generalization ability is key in the quest to reduce human annotation efforts.

137.00

Wednesday Poster Session

Scaling Local Self-Attention for Parameter Efficient Visual Backbones

Ashish Vaswani, Prajit Ramachandran, Aravind Srinivas, Niki Parmar, Blake Hechtman, Jonathon Shlens

Self-attention has the promise of improving computer vision systems due to parameter-independent scaling of receptive fields and content-dependent interactions, in contrast to parameter-dependent scaling and content-independent interactions of convolutions.

136.50

Thursday Poster Session

Encoding in Style: A StyleGAN Encoder for Image-to-Image Translation

Elad Richardson, Yuval Alaluf, Or Patashnik, Yotam Nitzan, Yaniv Azar, Stav Shapiro, Daniel Cohen-Or

We present a generic image-to-image translation framework, pixel2style2pixel (pSp).

136.25

Monday Poster Session

Rethinking Semantic Segmentation From a Sequence-to-Sequence Perspective With Transformers

Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, Yabiao Wang, Yanwei Fu, Jianfeng Feng, Tao Xiang, Philip H.S. Torr, Li Zhang

Most recent semantic segmentation methods adopt a fully-convolutional network (FCN) with an encoder-decoder architecture.

133.75

Tuesday Poster Session

MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments From a Single Moving Camera

Felix Wimbauer, Nan Yang, Lukas von Stumberg, Niclas Zeller, Daniel Cremers

In this paper, we propose MonoRec, a semi-supervised monocular dense reconstruction architecture that predicts depth maps from a single moving camera in dynamic environments.

130.25

Tuesday Poster Session

Information-Theoretic Segmentation by Inpainting Error Maximization

Pedro Savarese, Sunnie S. Y. Kim, Michael Maire, Greg Shakhnarovich, David McAllester

We study image segmentation from an information-theoretic perspective, proposing a novel adversarial method that performs unsupervised segmentation by partitioning images into maximally independent sets.

130.00

Tuesday Poster Session

IBRNet: Learning Multi-View Image-Based Rendering

Qianqian Wang, Zhicheng Wang, Kyle Genova, Pratul P. Srinivasan, Howard Zhou, Jonathan T. Barron, Ricardo Martin-Brualla, Noah Snavely, Thomas Funkhouser

We present a method that synthesizes novel views of complex scenes by interpolating a sparse set of nearby views.

128.00

Tuesday Poster Session

On Robustness and Transferability of Convolutional Neural Networks

Josip Djolonga, Jessica Yung, Michael Tschannen, Rob Romijnders, Lucas Beyer, Alexander Kolesnikov, Joan Puigcerver, Matthias Minderer, Alexander D'Amour, Dan Moldovan, Sylvain Gelly, Neil Houlsby, Xiaohua Zhai, Mario Lucic

Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts.

126.25

Friday Poster Session

LoFTR: Detector-Free Local Feature Matching With Transformers

Jiaming Sun, Zehong Shen, Yuang Wang, Hujun Bao, Xiaowei Zhou

We present a novel method for local image feature matching.

125.50

Wednesday Poster Session

Enriching ImageNet With Human Similarity Judgments and Psychological Embeddings

Brett D. Roads, Bradley C. Love

Advances in supervised learning approaches to object recognition flourished in part because of the availability of high-quality datasets and associated benchmarks.

119.00

Tuesday Poster Session

Shape and Material Capture at Home

Daniel Lichy, Jiaye Wu, Soumyadip Sengupta, David W. Jacobs

In this paper, we present a technique for estimating the geometry and reflectance of objects using only a camera, flashlight, and optionally a tripod.

112.75

Tuesday Poster Session

Re-Labeling ImageNet: From Single to Multi-Labels, From Global to Localized Labels

Sangdoo Yun, Seong Joon Oh, Byeongho Heo, Dongyoon Han, Junsuk Choe, Sanghyuk Chun

ImageNet has been the most popular image classification benchmark, but it is also the one with a significant level of label noise.

111.75

Monday Poster Session

NeuralRecon: Real-Time Coherent 3D Reconstruction From Monocular Video

Jiaming Sun, Yiming Xie, Linghao Chen, Xiaowei Zhou, Hujun Bao

We present a novel framework named NeuralRecon for real-time 3D scene reconstruction from a monocular video.

Friday Poster Session

Deep Animation Video Interpolation in the Wild

Li Siyao, Shiyu Zhao, Weijiang Yu, Wenxiu Sun, Dimitris Metaxas, Chen Change Loy, Ziwei Liu

In the animation industry, cartoon videos are usually produced at low frame rate since hand drawing of such frames is costly and time-consuming.

Tuesday Poster Session

Anton Cherepkov, Andrey Voynov, Artem Babenko

Generative Adversarial Networks (GANs) are currently an indispensable tool for visual editing, being a standard component of image-to-image translation and image restoration pipelines.

93.50

Tuesday Poster Session

Skip-Convolutions for Efficient Video Processing

Amirhossein Habibian, Davide Abati, Taco S. Cohen, Babak Ehteshami Bejnordi

We propose Skip-Convolutions to leverage the large amount of redundancies in video streams and save computations.

Monday Poster Session

The Temporal Opportunist: Self-Supervised Multi-Frame Monocular Depth

Space-Time Neural Irradiance Fields for Free-Viewpoint Video

Wenqi Xian, Jia-Bin Huang, Johannes Kopf, Changil Kim

We present a method that learns a spatiotemporal neural irradiance field for dynamic scenes from a single video.

84.25

Wednesday Poster Session

SMPLicit: Topology-Aware Generative Model for Clothed People

Enric Corona, Albert Pumarola, Guillem Alenya, Gerard Pons-Moll, Francesc Moreno-Noguer

In this paper we introduce SMPLicit, a novel generative model to jointly represent body pose, shape and clothing geometry.

83.50

Thursday Poster Session

Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos

Yasamin Jafarian, Hyun Soo Park

A key challenge of learning the geometry of dressed humans lies in the limited availability of the ground truth data (e.g., 3D scanned models), which results in the performance degradation of 3D human reconstruction when applying to real world imagery.

Thursday Poster Session

Multimodal Motion Prediction With Stacked Transformers

Yicheng Liu, Jinghuai Zhang, Liangji Fang, Qinhong Jiang, Bolei Zhou

Predicting multiple plausible future trajectories of the nearby vehicles is crucial for the safety of autonomous driving.

82.25

Wednesday Poster Session

Neural Deformation Graphs for Globally-Consistent Non-Rigid Reconstruction

Aljaz Bozic, Pablo Palafox, Michael Zollhofer, Justus Thies, Angela Dai, Matthias Niessner

We introduce Neural Deformation Graphs for globally-consistent deformation tracking and 3D reconstruction of non-rigid objects.

78.00

Monday Poster Session

Multi-Modal Fusion Transformer for End-to-End Autonomous Driving

Aditya Prakash, Kashyap Chitta, Andreas Geiger

How should representations from complementary sensors be integrated for autonomous driving? Geometry-based sensor fusion has shown great promise for perception tasks such as object detection and motion forecasting.

77.75

Tuesday Poster Session

Line Segment Detection Using Transformers Without Edges

Yifan Xu, Weijian Xu, David Cheung, Zhuowen Tu

In this paper, we present a joint end-to-end line segment detection algorithm using Transformers that is post-processing and heuristics-guided intermediate processing (edge/junction/region detection) free.

77.25

Tuesday Poster Session

Training Generative Adversarial Networks in One Stage

Chengchao Shen, Youtan Yin, Xinchao Wang, Xubin Li, Jie Song, Mingli Song

Generative Adversarial Networks (GANs) have demonstrated unprecedented success in various image generation tasks.

Tuesday Poster Session

GIRAFFE: Representing Scenes As Compositional Generative Neural Feature Fields

Michael Niemeyer, Andreas Geiger

Deep generative models allow for photorealistic image synthesis at high resolutions.

76.25

Thursday Poster Session

Spatiotemporal Contrastive Video Representation Learning

Rui Qian, Tianjian Meng, Boqing Gong, Ming-Hsuan Yang, Huisheng Wang, Serge Belongie, Yin Cui

We present a self-supervised Contrastive Video Representation Learning (CVRL) method to learn spatiotemporal visual representations from unlabeled videos.

76.25

Tuesday Poster Session

Transformer Interpretability Beyond Attention Visualization

Hila Chefer, Shir Gur, Lior Wolf

Self-attention techniques, and specifically Transformers, are dominating the field of text processing and are becoming increasingly popular in computer vision classification tasks.

75.75

Monday Poster Session

SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation

Brendan Duke, Abdalla Ahmed, Christian Wolf, Parham Aarabi, Graham W. Taylor

In this paper we introduce a Transformer-based approach to video object segmentation (VOS).

75.75

Tuesday Poster Session

Positional Encoding As Spatial Inductive Bias in GANs

Rui Xu, Xintao Wang, Kai Chen, Bolei Zhou, Chen Change Loy

SinGAN shows impressive capability in learning internal patch distribution despite its limited effective receptive field.

74.75

Thursday Poster Session

Probabilistic Embeddings for Cross-Modal Retrieval

Sanghyuk Chun, Seong Joon Oh, Rafael Sampaio de Rezende, Yannis Kalantidis, Diane Larlus

Cross-modal retrieval methods build a common representation space for samples from multiple modalities, typically from the vision and the language domains.

74.25

Wednesday Poster Session

Dual Contradistinctive Generative Autoencoder

Gaurav Parmar, Dacheng Li, Kwonjoon Lee, Zhuowen Tu

We present a new generative autoencoder model with dual contradistinctive losses to improve generative autoencoder that performs simultaneous inference (reconstruction) and synthesis (sampling).

74.25

Monday Poster Session

D-NeRF: Neural Radiance Fields for Dynamic Scenes

Albert Pumarola, Enric Corona, Gerard Pons-Moll, Francesc Moreno-Noguer

Neural rendering techniques combining machine learning with geometric reasoning have arisen as one of the most promising approaches for synthesizing novel views of a scene from a sparse set of images.

74.25

Wednesday Poster Session

On Feature Normalization and Data Augmentation

Boyi Li, Felix Wu, Ser-Nam Lim, Serge Belongie, Kilian Q. Weinberger

The moments (a.k.a., mean and standard deviation) of latent features are often removed as noise when training image recognition models, to increase stability and reduce training time.

73.50

Thursday Poster Session

Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction

Guy Gafni, Justus Thies, Michael Zollhofer, Matthias Niessner

We present dynamic neural radiance fields for modeling the appearance and dynamics of a human face.

73.00

Wednesday Poster Session

You Only Look One-Level Feature

Qiang Chen, Yingming Wang, Tong Yang, Xiangyu Zhang, Jian Cheng, Jian Sun

This paper revisits feature pyramid networks (FPN) for one-stage detectors and points out that the success of FPN is due to its divide-and-conquer solution to the optimization problem in object detection rather than multi-scale feature fusion.

Thursday Poster Session

Metadata Normalization

Mandy Lu, Qingyu Zhao, Jiequan Zhang, Kilian M. Pohl, Li Fei-Fei, Juan Carlos Niebles, Ehsan Adeli

Batch Normalization (BN) and its variants have delivered tremendous success in combating the covariate shift induced by the training step of deep learning methods.

Wednesday Poster Session

End-to-End Video Instance Segmentation With Transformers

Yuqing Wang, Zhaoliang Xu, Xinlong Wang, Chunhua Shen, Baoshan Cheng, Hao Shen, Huaxia Xia

Video instance segmentation (VIS) is the task that requires simultaneously classifying, segmenting and tracking object instances of interest in video.

72.00

Wednesday Poster Session

Repurposing GANs for One-Shot Semantic Part Segmentation

Nontawat Tritrong, Pitchaporn Rewatbowornwong, Supasorn Suwajanakorn

While GANs have shown success in realistic image generation, the idea of using GANs for other tasks unrelated to synthesis is underexplored.

70.50

Tuesday Poster Session

Neural Lumigraph Rendering

Petr Kellnhofer, Lars C. Jebe, Andrew Jones, Ryan Spicer, Kari Pulli, Gordon Wetzstein

Novel view synthesis is a challenging and ill-posed inverse rendering problem.

69.75

Tuesday Poster Session

Exploiting Spatial Dimensions of Latent in GAN for Real-Time Image Editing

Hyunsu Kim, Yunjey Choi, Junho Kim, Sungjoo Yoo, Youngjung Uh

Generative adversarial networks (GANs) synthesize realistic images from random latent vectors.

Monday Poster Session

Passive Inter-Photon Imaging

Atul Ingle, Trevor Seets, Mauro Buttafava, Shantanu Gupta, Alberto Tosi, Mohit Gupta, Andreas Velten

Digital camera pixels measure image intensities by converting incident light energy into an analog electrical current, and then digitizing it into a fixed-width binary representation.

Wednesday Poster Session

Plan2Scene: Converting Floorplans to 3D Scenes

Madhawa Vidanapathirana, Qirui Wu, Yasutaka Furukawa, Angel X. Chang, Manolis Savva

We address the task of converting a floorplan and a set of associated photos of a residence into a textured 3D mesh model, a task which we call Plan2Scene.

Wednesday Poster Session

Task Programming: Learning Data Efficient Behavior Representations

Jennifer J. Sun, Ann Kennedy, Eric Zhan, David J. Anderson, Yisong Yue, Pietro Perona

Specialized domain knowledge is often necessary to accurately annotate training sets for in-depth analysis, but can be burdensome and time-consuming to acquire from domain experts.

67.75

Tuesday Poster Session

Deep Occlusion-Aware Instance Segmentation With Overlapping BiLayers

Lei Ke, Yu-Wing Tai, Chi-Keung Tang

Segmenting highly-overlapping objects is challenging, because typically no distinction is made between real object contours and occlusion boundaries.

66.25

Tuesday Poster Session

Neural Parts: Learning Expressive 3D Shape Abstractions With Invertible Neural Networks

Despoina Paschalidou, Angelos Katharopoulos, Andreas Geiger, Sanja Fidler

Impressive progress in 3D shape extraction led to representations that can capture object geometries with high fidelity.

65.75

Tuesday Poster Session

Rotation Coordinate Descent for Fast Globally Optimal Rotation Averaging

Alvaro Parra, Shin-Fang Chng, Tat-Jun Chin, Anders Eriksson, Ian Reid

Under mild conditions on the noise level of the measurements, rotation averaging satisfies strong duality, which enables global solutions to be obtained via semidefinite programming (SDP) relaxation.

Tuesday Poster Session

MOS: Towards Scaling Out-of-Distribution Detection for Large Semantic Space

Rui Huang, Yixuan Li

Detecting out-of-distribution (OOD) inputs is a central challenge for safely deploying machine learning models in the real world.

Wednesday Poster Session

Anycost GANs for Interactive Image Synthesis and Editing

Ji Lin, Richard Zhang, Frieder Ganz, Song Han, Jun-Yan Zhu

Generative adversarial networks (GANs) have enabled photorealistic image synthesis and editing.

64.00

Thursday Poster Session

Generative Hierarchical Features From Synthesizing Images

Yinghao Xu, Yujun Shen, Jiapeng Zhu, Ceyuan Yang, Bolei Zhou

Generative Adversarial Networks (GANs) have recently advanced image synthesis by learning the underlying distribution of the observed data.

63.25

Tuesday Poster Session

NewtonianVAE: Proportional Control and Goal Identification From Pixels via Physical Latent Spaces

Miguel Jaques, Michael Burke, Timothy M. Hospedales

Learning low-dimensional latent state space dynamics models has proven powerful for enabling vision-based planning and learning for control.

63.00

Tuesday Poster Session

DetectoRS: Detecting Objects With Recursive Feature Pyramid and Switchable Atrous Convolution

Siyuan Qiao, Liang-Chieh Chen, Alan Yuille

Many modern object detectors demonstrate outstanding performances by using the mechanism of looking and thinking twice.

62.00

Wednesday Poster Session

StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation

Zongze Wu, Dani Lischinski, Eli Shechtman

We explore and analyze the latent style space of StyleGAN2, a state-of-the-art architecture for image generation, using models pretrained on several different datasets.

61.75

Thursday Poster Session

MaX-DeepLab: End-to-End Panoptic Segmentation With Mask Transformers

Han Zhang, Jing Yu Koh, Jason Baldridge, Honglak Lee, Yinfei Yang

The output of text-to-image synthesis systems should be coherent, clear, photo-realistic scenes with high semantic fidelity to their conditioned text descriptions.

51.25

Monday Poster Session

HistoGAN: Controlling Colors of GAN-Generated and Real Images via Color Histograms

Mahmoud Afifi, Marcus A. Brubaker, Michael S. Brown

While generative adversarial networks (GANs) can successfully produce high-quality images, they can be challenging to control.

51.00

Wednesday Poster Session

Single Image Depth Prediction With Wavelet Decomposition

Michael Ramamonjisoa, Michael Firman, Jamie Watson, Vincent Lepetit, Daniyar Turmukhambetov

We present a novel method for predicting accurate depths from monocular images with high efficiency.

Wednesday Poster Session

Birds of a Feather: Capturing Avian Shape Models From Images

Yufu Wang, Nikos Kolotouros, Kostas Daniilidis, Marc Badger

Animals are diverse in shape, but building a deformable shape model for a new species is not always possible due to the lack of 3D data.

Thursday Poster Session

LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search

Bin Yan, Houwen Peng, Kan Wu, Dong Wang, Jianlong Fu, Huchuan Lu

Object tracking has achieved significant progress over the past few years.

Thursday Poster Session

Body2Hands: Learning To Infer 3D Hands From Conversational Gesture Body Dynamics

Evonne Ng, Shiry Ginosar, Trevor Darrell, Hanbyul Joo

We propose a novel learned deep prior of body motion for 3D hand shape synthesis and estimation in the domain of conversational gestures.

48.50

Thursday Poster Session

OSTeC: One-Shot Texture Completion

Unsupervised Learning of 3D Object Categories From Videos in the Wild

Philipp Henzler, Jeremy Reizenstein, Patrick Labatut, Roman Shapovalov, Tobias Ritschel, Andrea Vedaldi, David Novotny

Recently, numerous works have attempted to learn 3D reconstructors of textured 3D models of visual categories given a training set of annotated static images of objects.

46.25

Tuesday Poster Session

Reconstructing 3D Human Pose by Watching Humans in the Mirror

Qi Fang, Qing Shuai, Junting Dong, Hujun Bao, Xiaowei Zhou

In this paper, we introduce the new task of reconstructing 3D human pose from a single image in which we can see the person and the person's image through a mirror.

Thursday Poster Session

LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces From Video Using Pose and Lighting Normalization

Avisek Lahiri, Vivek Kwatra, Christian Frueh, John Lewis, Chris Bregler

In this paper, we present a video-based learning framework for animating personalized 3D talking faces from audio.

Monday Poster Session

Permute, Quantize, and Fine-Tune: Efficient Compression of Neural Networks

Julieta Martinez, Jashan Shewakramani, Ting Wei Liu, Ioan Andrei Barsan, Wenyuan Zeng, Raquel Urtasun

Compressing large neural networks is an important step for their deployment in resource-constrained computational platforms.

Friday Poster Session

Blur, Noise, and Compression Robust Generative Adversarial Networks

Takuhiro Kaneko, Tatsuya Harada

Generative adversarial networks (GANs) have gained considerable attention owing to their ability to reproduce images.

45.00

Thursday Poster Session

Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation

Hang Zhou, Yasheng Sun, Wayne Wu, Chen Change Loy, Xiaogang Wang, Ziwei Liu

While accurate lip synchronization has been achieved for arbitrary-subject audio-driven talking face generation, the problem of how to efficiently drive the head pose remains.

44.75

Tuesday Poster Session

NeRD: Neural 3D Reflection Symmetry Detector

Yichao Zhou, Shichen Liu, Yi Ma

Recent advances have shown that symmetry, a structural prior that most objects exhibit, can support a variety of single-view 3D understanding tasks.

Friday Poster Session

Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild With Pose Annotations

Adel Ahmadyan, Liangkai Zhang, Artsiom Ablavatski, Jianing Wei, Matthias Grundmann

3D object detection has recently become popular due to many applications in robotics, augmented reality, autonomy, and image retrieval.

44.00

Wednesday Poster Session

A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning

Christoph Feichtenhofer, Haoqi Fan, Bo Xiong, Ross Girshick, Kaiming He

We present a large-scale study on unsupervised spatiotemporal representation learning from videos.

44.00

Tuesday Poster Session

Localizing Visual Sounds the Hard Way

Honglie Chen, Weidi Xie, Triantafyllos Afouras, Arsha Nagrani, Andrea Vedaldi, Andrew Zisserman

The objective of this work is to localize sound sources that are visible in a video without using manual annotations.

43.75

Friday Poster Session

Learning Optical Flow From Still Images

Filippo Aleotti, Matteo Poggi, Stefano Mattoccia

This paper deals with the scarcity of data for training optical flow networks, highlighting the limitations of existing sources such as labeled synthetic datasets or unlabeled real videos.

Thursday Poster Session

Three Ways To Improve Semantic Segmentation With Self-Supervised Depth Estimation

Lukas Hoyer, Dengxin Dai, Yuhua Chen, Adrian Koring, Suman Saha, Luc Van Gool

Training deep networks for semantic segmentation requires large amounts of labeled training data, which presents a major challenge in practice, as labeling segmentation masks is a highly labor-intensive process.

43.50

Wednesday Poster Session

VisualVoice: Audio-Visual Speech Separation With Cross-Modal Consistency

Fast and Accurate Model Scaling

Piotr Dollar, Mannat Singh, Ross Girshick

In this work we analyze strategies for convolutional neural network scaling; that is, the process of scaling a base convolutional network to endow it with greater computational complexity and consequently representational power.

41.00

Monday Poster Session

DeFMO: Deblurring and Shape Recovery of Fast Moving Objects

Denys Rozumnyi, Martin R. Oswald, Vittorio Ferrari, Jiri Matas, Marc Pollefeys

Objects moving at high speed appear significantly blurred when captured with cameras.

41.00

Tuesday Poster Session

Few-Shot Transformation of Common Actions Into Time and Space

Pengwan Yang, Pascal Mettes, Cees G. M. Snoek

This paper introduces the task of few-shot common action localization in time and space.

Friday Poster Session

Understanding Failures of Deep Networks via Robust Feature Extraction

Sahil Singla, Besmira Nushi, Shital Shah, Ece Kamar, Eric Horvitz

Traditional evaluation metrics for learned models that report aggregate scores over a test set are insufficient for surfacing important and informative patterns of failure over features and instances.

Thursday Poster Session

img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation

Vitor Albiero, Xingyu Chen, Xi Yin, Guan Pang, Tal Hassner

We propose real-time, six degrees of freedom (6DoF), 3D face pose estimation without face detection or landmark localization.

40.00

Wednesday Poster Session

High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network

Jie Liang, Hui Zeng, Lei Zhang

Existing image-to-image translation (I2IT) methods are either constrained to low-resolution images or long inference time due to their heavy computational burden on the convolution of high-resolution feature maps.

Wednesday Poster Session

User-Guided Line Art Flat Filling With Split Filling Mechanism

Lvmin Zhang, Chengze Li, Edgar Simo-Serra, Yi Ji, Tien-Tsin Wong, Chunping Liu

Flat filling is a critical step in digital artistic content creation with the objective of filling line arts with flat colors.

Wednesday Poster Session

Neural Scene Graphs for Dynamic Scenes

Julian Ost, Fahim Mannan, Nils Thuerey, Julian Knodt, Felix Heide

Recent implicit neural rendering methods have demonstrated that it is possible to learn accurate view synthesis for complex scenes by predicting their volumetric density and color supervised solely by a set of RGB images.

39.50

Tuesday Poster Session

LASR: Learning Articulated Shape Reconstruction From a Monocular Video

Gengshan Yang, Deqing Sun, Varun Jampani, Daniel Vlasic, Forrester Cole, Huiwen Chang, Deva Ramanan, William T. Freeman, Ce Liu

Remarkable progress has been made in 3D reconstruction of rigid structures from a video or a collection of images.

Friday Poster Session

Representation Learning via Global Temporal Alignment and Cycle-Consistency

Isma Hadji, Konstantinos G. Derpanis, Allan D. Jepson

We introduce a weakly supervised method for representation learning based on aligning temporal sequences (e.g., videos) of the same process (e.g., human action).

Wednesday Poster Session

A Sliced Wasserstein Loss for Neural Texture Synthesis

Eric Heitz, Kenneth Vanhoey, Thomas Chambon, Laurent Belcour

We address the problem of computing a textural loss based on the statistics extracted from the feature activations of a convolutional neural network optimized for object recognition (e.g.

Wednesday Poster Session

GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution

Kelvin C.K. Chan, Xintao Wang, Xiangyu Xu, Jinwei Gu, Chen Change Loy

We show that pre-trained Generative Adversarial Networks (GANs), e.g., StyleGAN, can be used as a latent bank to improve the restoration quality of large-factor image super-resolution (SR).

38.75

Thursday Poster Session

KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control

Exploring Data-Efficient 3D Scene Understanding With Contrastive Scene Contexts

Ji Hou, Benjamin Graham, Matthias Niessner, Saining Xie

The rapid progress in 3D scene understanding has come with growing demand for data; however, collecting and annotating 3D scenes (e.g. [Expand]

37.50

Friday Poster Session

Seeing Out of the Box: End-to-End Pre-Training for Vision-Language Representation Learning

Zhicheng Huang, Zhaoyang Zeng, Yupan Huang, Bei Liu, Dongmei Fu, Jianlong Fu

We study joint learning of a Convolutional Neural Network (CNN) and a Transformer for vision-language pre-training (VLPT), which aims to learn cross-modal alignments from millions of image-text pairs.

Thursday Poster Session

CASTing Your Model: Learning To Localize Improves Self-Supervised Representations

Ramprasaath R. Selvaraju, Karan Desai, Justin Johnson, Nikhil Naik

Recent advances in self-supervised learning (SSL) have largely closed the gap with supervised ImageNet pretraining.

37.25

Wednesday Poster Session

MobileDets: Searching for Object Detection Architectures for Mobile Accelerators

Yunyang Xiong, Hanxiao Liu, Suyog Gupta, Berkin Akin, Gabriel Bender, Yongzhe Wang, Pieter-Jan Kindermans, Mingxing Tan, Vikas Singh, Bo Chen

Inverted bottleneck layers, which are built upon depthwise convolutions, have been the predominant building blocks in state-of-the-art object detection models on mobile devices.

37.25

Tuesday Poster Session

Rethinking Channel Dimensions for Efficient Model Design

Dongyoon Han, Sangdoo Yun, Byeongho Heo, YoungJoon Yoo

Designing an efficient model within a limited computational cost is challenging.

Monday Poster Session

ManipulaTHOR: A Framework for Visual Object Manipulation

Kiana Ehsani, Winson Han, Alvaro Herrasti, Eli VanderBilt, Luca Weihs, Eric Kolve, Aniruddha Kembhavi, Roozbeh Mottaghi

The domain of Embodied AI has recently witnessed substantial progress, particularly in navigating agents within their environments.

Tuesday Poster Session

Efficient Initial Pose-Graph Generation for Global SfM

Daniel Barath, Dmytro Mishkin, Ivan Eichhardt, Ilia Shipachev, Jiri Matas

We propose ways to speed up the initial pose-graph generation for global Structure-from-Motion algorithms.

36.50

Thursday Poster Session

Efficient Conditional GAN Transfer With Knowledge Propagation Across Classes

Mohamad Shahbazi, Zhiwu Huang, Danda Pani Paudel, Ajad Chhatkuli, Luc Van Gool

Generative adversarial networks (GANs) have shown impressive results in both unconditional and conditional image generation.

Thursday Poster Session

End-to-End Object Detection With Fully Convolutional Network

Jianfeng Wang, Lin Song, Zeming Li, Hongbin Sun, Jian Sun, Nanning Zheng

Mainstream object detectors based on the fully convolutional network have achieved impressive performance.

36.50

Friday Poster Session

Co-Attention for Conditioned Image Matching

Olivia Wiles, Sebastien Ehrhardt, Andrew Zisserman

We propose a new approach to determine correspondences between image pairs in the wild under large changes in illumination, viewpoint, context, and material.

Friday Poster Session

Scan2Cap: Context-Aware Dense Captioning in RGB-D Scans

Zhenyu Chen, Ali Gholami, Matthias Niessner, Angel X. Chang

We introduce the new task of dense captioning in RGB-D scans.

36.00

Tuesday Poster Session

Learning Monocular 3D Reconstruction of Articulated Categories From Motion

Filippos Kokkinos, Iasonas Kokkinos

Monocular 3D reconstruction of articulated object categories is challenging due to the lack of training data and the inherent ill-posedness of the problem.

36.00

Monday Poster Session

Self-Supervised Augmentation Consistency for Adapting Semantic Segmentation

Nikita Araslanov, Stefan Roth

We propose an approach to domain adaptation for semantic segmentation that is both practical and highly accurate.

Thursday Poster Session

Learned Initializations for Optimizing Coordinate-Based Neural Representations

Matthew Tancik, Ben Mildenhall, Terrance Wang, Divi Schmidt, Pratul P. Srinivasan, Jonathan T. Barron, Ren Ng

Coordinate-based neural representations have shown significant promise as an alternative to discrete, array-based representations for complex low dimensional signals.

35.75

Tuesday Poster Session

Audio-Visual Instance Discrimination with Cross-Modal Agreement

Pedro Morgado, Nuno Vasconcelos, Ishan Misra

We present a self-supervised learning approach to learn audio-visual representations from video and audio.

35.50

Thursday Poster Session

We Are More Than Our Joints: Predicting How 3D Bodies Move

Yan Zhang, Michael J. Black, Siyu Tang

A key step towards understanding human behavior is the prediction of 3D human motion.

35.50

Tuesday Poster Session

Rethinking and Improving the Robustness of Image Style Transfer

Pei Wang, Yijun Li, Nuno Vasconcelos

Extensive research in neural style transfer methods has shown that the correlation between features extracted by a pre-trained VGG network has a remarkable ability to capture the visual style of an image.

35.00

Monday Poster Session

GeoSim: Realistic Video Simulation via Geometry-Aware Composition for Self-Driving

Yun Chen, Frieda Rong, Shivam Duggal, Shenlong Wang, Xinchen Yan, Sivabalan Manivasagam, Shangjie Xue, Ersin Yumer, Raquel Urtasun

Scalable sensor simulation is an important yet challenging open problem for safety-critical domains such as self-driving.

Wednesday Poster Session

Robust and Accurate Object Detection via Adversarial Learning

Xiangning Chen, Cihang Xie, Mingxing Tan, Li Zhang, Cho-Jui Hsieh, Boqing Gong

Data augmentation has become a de facto component for training high-performance deep image classifiers, but its potential is under-explored for object detection.

34.50

Friday Poster Session

Self-Supervised Multi-Frame Monocular Scene Flow

Junhwa Hur, Stefan Roth

Estimating 3D scene flow from a sequence of monocular images has been gaining increased attention due to the simple, economical capture setup.

Monday Poster Session

PPR10K: A Large-Scale Portrait Photo Retouching Dataset With Human-Region Mask and Group-Level Consistency

Jie Liang, Hui Zeng, Miaomiao Cui, Xuansong Xie, Lei Zhang

Different from general photo retouching tasks, portrait photo retouching (PPR), which aims to enhance the visual quality of a collection of flat-looking portrait photos, has its special and practical requirements such as human-region priority (HRP) and group-level consistency (GLC).

Monday Poster Session

Causal Attention for Vision-Language Tasks

Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai

We present a novel attention mechanism: Causal Attention (CATT), to remove the ever-elusive confounding effect in existing attention-based vision-language models.

34.50

Wednesday Poster Session

VinVL: Revisiting Visual Representations in Vision-Language Models

Pengchuan Zhang, Xiujun Li, Xiaowei Hu, Jianwei Yang, Lei Zhang, Lijuan Wang, Yejin Choi, Jianfeng Gao

This paper presents a detailed study of improving vision features and develops an improved object detection model for vision language (VL) tasks.

Tuesday Poster Session

Fast End-to-End Learning on Protein Surfaces

Freyr Sverrisson, Jean Feydy, Bruno E. Correia, Michael M. Bronstein

Proteins' biological functions are defined by the geometric and chemical structure of their 3D molecular surfaces.

33.50

Thursday Poster Session

Multimodal Contrastive Training for Visual Representation Learning

Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, Yilin Wang, Michael Maire, Ajinkya Kale, Baldo Faieta

We develop an approach to learning visual representations that embraces multimodal data, driven by a combination of intra- and inter-modal similarity preservation objectives.

Tuesday Poster Session

DeRF: Decomposed Radiance Fields

Daniel Rebain, Wei Jiang, Soroosh Yazdani, Ke Li, Kwang Moo Yi, Andrea Tagliasacchi

With the advent of Neural Radiance Fields (NeRF), neural networks can now render novel views of a 3D scene with quality that fools the human eye.

32.75

Thursday Poster Session

Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes

Jiashun Wang, Huazhe Xu, Jingwei Xu, Sifei Liu, Xiaolong Wang

Synthesizing 3D human motion plays an important role in many graphics applications as well as understanding human activity.

32.75

Wednesday Poster Session

TextOCR: Towards Large-Scale End-to-End Reasoning for Arbitrary-Shaped Scene Text

Amanpreet Singh, Guan Pang, Mandy Toh, Jing Huang, Wojciech Galuba, Tal Hassner

A crucial component for the scene text based reasoning required for TextVQA and TextCaps datasets involves detecting and recognizing text present in the images using an optical character recognition (OCR) system.

32.50

Wednesday Poster Session

Knowledge Evolution in Neural Networks

Ahmed Taha, Abhinav Shrivastava, Larry S. Davis

Deep learning relies on the availability of a large corpus of data (labeled or unlabeled).

Thursday Poster Session

Rotation-Only Bundle Adjustment

Seong Hun Lee, Javier Civera

We propose a novel method for estimating the global rotations of the cameras independently of their positions and the scene structure.

32.00

Monday Poster Session

Learning To Count Everything

Viresh Ranjan, Udbhav Sharma, Thu Nguyen, Minh Hoai

Existing works on visual counting primarily focus on one specific category at a time, such as people, animals, and cells.

Tuesday Poster Session

Toby Perrett, Alessandro Masullo, Tilo Burghardt, Majid Mirmehdi, Dima Damen

We propose a novel approach to few-shot action recognition, finding temporally-corresponding frame tuples between the query and videos in the support set.

Monday Poster Session

VS-Net: Voting With Segmentation for Visual Localization

Zhaoyang Huang, Han Zhou, Yijin Li, Bangbang Yang, Yan Xu, Xiaowei Zhou, Hujun Bao, Guofeng Zhang, Hongsheng Li

Visual localization is of great importance in robotics and computer vision.

Tuesday Poster Session

STaR: Self-Supervised Tracking and Reconstruction of Rigid Objects in Motion With Neural Rendering

Wentao Yuan, Zhaoyang Lv, Tanner Schmidt, Steven Lovegrove

We present STaR, a novel method that performs Self-supervised Tracking and Reconstruction of dynamic scenes with rigid motion from multi-view RGB videos without any manual annotation.

31.00

Thursday Poster Session

Learning Multi-Scale Photo Exposure Correction

Mahmoud Afifi, Konstantinos G. Derpanis, Bjorn Ommer, Michael S. Brown

Capturing photographs with wrong exposures remains a major source of errors in camera-based imaging.

Wednesday Poster Session

Weakly Supervised Learning of Rigid 3D Scene Flow

Zan Gojcic, Or Litany, Andreas Wieser, Leonidas J. Guibas, Tolga Birdal

We propose a data-driven scene flow estimation algorithm exploiting the observation that many 3D scenes can be explained by a collection of agents moving as rigid bodies.

30.25

Tuesday Poster Session

Monte Carlo Scene Search for 3D Scene Understanding

Shreyas Hampali, Sinisa Stekovic, Sayan Deb Sarkar, Chetan S. Kumar, Friedrich Fraundorfer, Vincent Lepetit

We explore how a general AI algorithm can be used for 3D scene understanding to reduce the need for training data.

30.25

Thursday Poster Session

Robust Reference-Based Super-Resolution via C2-Matching

Yuming Jiang, Kelvin C.K. Chan, Xintao Wang, Chen Change Loy, Ziwei Liu

Reference-based Super-Resolution (Ref-SR) has recently emerged as a promising paradigm to enhance a low-resolution (LR) input image by introducing an additional high-resolution (HR) reference image.

Monday Poster Session

SwiftNet: Real-Time Video Object Segmentation

Haochen Wang, Xiaolong Jiang, Haibing Ren, Yao Hu, Song Bai

In this work we present SwiftNet for real-time semi-supervised video object segmentation (one-shot VOS), which reports 77.8% J&F and 70 FPS on the DAVIS 2017 validation dataset, leading all present solutions in overall accuracy and speed performance.

30.00

Monday Poster Session

Surrogate Gradient Field for Latent Space Manipulation

Minjun Li, Yanghua Jin, Huachun Zhu

Generative adversarial networks (GANs) can generate high-quality images from sampled latent codes.

Tuesday Poster Session

Deep Burst Super-Resolution

Goutam Bhat, Martin Danelljan, Luc Van Gool, Radu Timofte

While single-image super-resolution (SISR) has attracted substantial interest in recent years, the proposed approaches are limited to learning image priors in order to add high frequency details.

29.50

Wednesday Poster Session

CDFI: Compression-Driven Network Design for Frame Interpolation

Tianyu Ding, Luming Liang, Zhihui Zhu, Ilya Zharkov

DNN-based frame interpolation--that generates the intermediate frames given two consecutive frames--typically relies on heavy model architectures with a huge number of features, preventing them from being deployed on systems with limited resources, e.g., mobile devices.

Wednesday Poster Session

Intentonomy: A Dataset and Study Towards Human Intent Understanding

Menglin Jia, Zuxuan Wu, Austin Reiter, Claire Cardie, Serge Belongie, Ser-Nam Lim

An image is worth a thousand words, conveying information that goes beyond the physical visual content therein.

29.50

Thursday Poster Session

Hui Wu, Yupeng Gao, Xiaoxiao Guo, Ziad Al-Halah, Steven Rennie, Kristen Grauman, Rogerio Feris

Conversational interfaces for the detail-oriented retail fashion domain are more natural, expressive, and user friendly than classical keyword-based search interfaces.

28.50

Wednesday Poster Session

SCALE: Modeling Clothed Humans with a Surface Codec of Articulated Local Elements

Qianli Ma, Shunsuke Saito, Jinlong Yang, Siyu Tang, Michael J. Black

Learning to model and reconstruct humans in clothing is challenging due to articulation, non-rigid deformation, and varying clothing types and topologies.

28.25

Friday Poster Session

MIST: Multiple Instance Spatial Transformer

Baptiste Angles, Yuhe Jin, Simon Kornblith, Andrea Tagliasacchi, Kwang Moo Yi

We propose a deep network that can be trained to tackle image reconstruction and classification problems that involve detection of multiple object instances, without any supervision regarding their whereabouts.

Monday Poster Session

Learning Compositional Radiance Fields of Dynamic Human Heads

Ziyan Wang, Timur Bagautdinov, Stephen Lombardi, Tomas Simon, Jason Saragih, Jessica Hodgins, Michael Zollhofer

Photorealistic rendering of dynamic humans is an important ability for telepresence systems, virtual shopping, synthetic data generation, and more.

27.75

Tuesday Poster Session

Pixel Codec Avatars

Shugao Ma, Tomas Simon, Jason Saragih, Dawei Wang, Yuecheng Li, Fernando De la Torre, Yaser Sheikh

Telecommunication with photorealistic avatars in virtual or augmented reality is a promising path for achieving authentic face-to-face communication in 3D over remote physical distances.

Monday Poster Session

Visual Semantic Role Labeling for Video Understanding

Arka Sadhu, Tanmay Gupta, Mark Yatskar, Ram Nevatia, Aniruddha Kembhavi

We propose a new framework for understanding and representing related salient events in a video using visual semantic role labeling.

Tuesday Poster Session

Benchmarking Representation Learning for Natural World Image Collections

Grant Van Horn, Elijah Cole, Sara Beery, Kimberly Wilber, Serge Belongie, Oisin Mac Aodha

Recent progress in self-supervised learning has resulted in models that are capable of extracting rich representations from image collections without requiring any explicit label supervision.

27.50

Thursday Poster Session

Adversarial Generation of Continuous Images

Ivan Skorokhodov, Savva Ignatyev, Mohamed Elhoseiny

In most existing learning systems, images are typically viewed as 2D pixel arrays.

27.25

Wednesday Poster Session

The Spatially-Correlative Loss for Various Image Translation Tasks

Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai

We propose a novel spatially-correlative loss that is simple, efficient, and yet effective for preserving scene structure consistency while supporting large appearance changes during unpaired image-to-image (I2I) translation.

Friday Poster Session

Ensembling With Deep Generative Views

Lucy Chai, Jun-Yan Zhu, Eli Shechtman, Phillip Isola, Richard Zhang

Recent generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose, simply by learning from unlabeled image collections.

Thursday Poster Session

VITON-HD: High-Resolution Virtual Try-On via Misalignment-Aware Normalization

Seunghwan Choi, Sunghyun Park, Minsoo Lee, Jaegul Choo

The task of image-based virtual try-on aims to transfer a target clothing item onto the corresponding region of a person, which is commonly tackled by fitting the item to the desired body part and fusing the warped item with the person.

Thursday Poster Session

SPSG: Self-Supervised Photometric Scene Generation From RGB-D Scans

Stephan Alaniz, Diego Marcos, Bernt Schiele, Zeynep Akata

Integrated interpretability without sacrificing the prediction accuracy of decision making algorithms has the potential of greatly improving their value to the user.

Thursday Poster Session

Teachers Do More Than Teach: Compressing Image-to-Image Models

Qing Jin, Jian Ren, Oliver J. Woodford, Jiazhuo Wang, Geng Yuan, Yanzhi Wang, Sergey Tulyakov

Generative Adversarial Networks (GANs) have achieved huge success in generating high-fidelity images; however, they suffer from low efficiency due to tremendous computational cost and bulky memory usage.

26.00

Thursday Poster Session

Image-to-Image Translation via Hierarchical Style Disentanglement

Xinyang Li, Shengchuan Zhang, Jie Hu, Liujuan Cao, Xiaopeng Hong, Xudong Mao, Feiyue Huang, Yongjian Wu, Rongrong Ji

Recently, image-to-image translation has made significant progress in achieving both multi-label (i.e., translation conditioned on different labels) and multi-style (i.e., generation with diverse styles) tasks.

Wednesday Poster Session

3D CNNs With Adaptive Temporal Feature Resolutions

Mohsen Fayyaz, Emad Bahrami, Ali Diba, Mehdi Noroozi, Ehsan Adeli, Luc Van Gool, Jurgen Gall

While state-of-the-art 3D Convolutional Neural Networks (CNNs) achieve very good results on action recognition datasets, they are computationally very expensive and require many GFLOPs.

Tuesday Poster Session

De-Rendering the World's Revolutionary Artefacts

Shangzhe Wu, Ameesh Makadia, Jiajun Wu, Noah Snavely, Richard Tucker, Angjoo Kanazawa

Recent works have shown exciting results in unsupervised image de-rendering--learning to decompose 3D shape, appearance, and lighting from single-image collections without explicit supervision.

Tuesday Poster Session

VarifocalNet: An IoU-Aware Dense Object Detector

Haoyang Zhang, Ying Wang, Feras Dayoub, Niko Sunderhauf

Accurately ranking the vast number of candidate detections is crucial for dense object detectors to achieve high performance.

25.75

Wednesday Poster Session

Multi-Objective Interpolation Training for Robustness To Label Noise

Diego Ortego, Eric Arazo, Paul Albert, Noel E. O'Connor, Kevin McGuinness

Deep neural networks trained with standard cross-entropy loss memorize noisy labels, which degrades their performance.

25.50

Tuesday Poster Session

Center-Based 3D Object Detection and Tracking

Tianwei Yin, Xingyi Zhou, Philipp Krahenbuhl

Three-dimensional objects are commonly represented as 3D boxes in a point-cloud.

25.50

Thursday Poster Session

A 3D GAN for Improved Large-Pose Facial Recognition

Richard T. Marriott, Sami Romdhani, Liming Chen

Facial recognition using deep convolutional neural networks relies on the availability of large datasets of face images.

25.25

Thursday Poster Session

Pixel-Wise Anomaly Detection in Complex Driving Scenes

Giancarlo Di Biase, Hermann Blum, Roland Siegwart, Cesar Cadena

The inability of state-of-the-art semantic segmentation methods to detect anomaly instances hinders them from being deployed in safety-critical and complex applications, such as autonomous driving.

25.00

Friday Poster Session

High-Fidelity and Arbitrary Face Editing

Yue Gao, Fangyun Wei, Jianmin Bao, Shuyang Gu, Dong Chen, Fang Wen, Zhouhui Lian

Cycle consistency is widely used for face editing.

Friday Poster Session

StylePeople: A Generative Model of Fullbody Human Avatars

Artur Grigorev, Karim Iskakov, Anastasia Ianina, Renat Bashirov, Ilya Zakharkin, Alexander Vakhitov, Victor Lempitsky

We propose a new type of full-body human avatar, which combines a parametric mesh-based body model with a neural texture.

25.00

Tuesday Poster Session

How Privacy-Preserving Are Line Clouds? Recovering Scene Details From 3D Lines

Kunal Chelani, Fredrik Kahl, Torsten Sattler

Visual localization is the problem of estimating the camera pose of a given image with respect to a known scene.

Friday Poster Session

Style-Aware Normalized Loss for Improving Arbitrary Style Transfer

Jiaxin Cheng, Ayush Jaiswal, Yue Wu, Pradeep Natarajan, Prem Natarajan

Neural Style Transfer (NST) has quickly evolved from single-style to infinite-style models, also known as Arbitrary Style Transfer (AST).

Monday Poster Session

Multi-Stage Progressive Image Restoration

Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?

Malik Boudiaf, Hoel Kervadec, Ziko Imtiaz Masud, Pablo Piantanida, Ismail Ben Ayed, Jose Dolz

We show that the way inference is performed in few-shot segmentation tasks has a substantial effect on performance--an aspect often overlooked in the literature in favor of the meta-learning paradigm.

23.50

Thursday Poster Session

MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

Jiahui Huang, He Wang, Tolga Birdal, Minhyuk Sung, Federica Arrigoni, Shi-Min Hu, Leonidas J. Guibas

We present MultiBodySync, a novel, end-to-end trainable multi-body motion segmentation and rigid registration framework for multiple input 3D point clouds.

23.50

Wednesday Poster Session

Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

Qingyong Hu, Bo Yang, Sheikh Khalid, Wen Xiao, Niki Trigoni, Andrew Markham

An essential prerequisite for unleashing the potential of supervised deep learning algorithms in the area of 3D scene understanding is the availability of large-scale and richly annotated datasets.

23.50

Tuesday Poster Session

CodedStereo: Learned Phase Masks for Large Depth-of-Field Stereo

Shiyu Tan, Yicheng Wu, Shoou-I Yu, Ashok Veeraraghavan

Conventional stereo suffers from a fundamental trade-off between imaging volume and signal-to-noise ratio (SNR) -- due to the conflicting impact of aperture size on both these variables.

Wednesday Poster Session

CausalVAE: Disentangled Representation Learning via Neural Structural Causal Models

Mengyue Yang, Furui Liu, Zhitang Chen, Xinwei Shen, Jianye Hao, Jun Wang

Learning disentanglement aims at finding a low dimensional representation which consists of multiple explanatory and generative factors of the observational data.

23.25

Wednesday Poster Session

Distilling Audio-Visual Knowledge by Compositional Contrastive Learning

Yanbei Chen, Yongqin Xian, A. Sophia Koepke, Ying Shan, Zeynep Akata

Having access to multi-modal cues (e.g. [Expand]

23.00

Tuesday Poster Session

Populating 3D Scenes by Learning Human-Scene Interaction

Mohamed Hassan, Partha Ghosh, Joachim Tesch, Dimitrios Tzionas, Michael J. Black

Humans live within a 3D space and constantly interact with it to perform tasks.

23.00

Thursday Poster Session

CoCoNets: Continuous Contrastive 3D Scene Representations

Shamit Lal, Mihir Prabhudesai, Ishita Mediratta, Adam W. Harley, Katerina Fragkiadaki

This paper explores self-supervised learning of amodal 3D feature representations from RGB and RGB-D posed images and videos, agnostic to object and scene semantic content, and evaluates the resulting scene representations in the downstream tasks of visual correspondence, object tracking, and object detection.

Thursday Poster Session

Adaptive Consistency Regularization for Semi-Supervised Transfer Learning

Abulikemu Abuduweili, Xingjian Li, Humphrey Shi, Cheng-Zhong Xu, Dejing Dou

While recent studies on semi-supervised learning have shown remarkable progress in leveraging both labeled and unlabeled data, most of them presume a basic setting in which the model is randomly initialized.

22.75

Tuesday Poster Session

PhySG: Inverse Rendering With Spherical Gaussians for Physics-Based Material Editing and Relighting

Kai Zhang, Fujun Luan, Qianqian Wang, Kavita Bala, Noah Snavely

We present an end-to-end inverse rendering pipeline that includes a fully differentiable renderer, and can reconstruct geometry, materials, and illumination from scratch from a set of images.

22.75

Tuesday Poster Session

Thinking Fast and Slow: Efficient Text-to-Visual Retrieval With Transformers

Antoine Miech, Jean-Baptiste Alayrac, Ivan Laptev, Josef Sivic, Andrew Zisserman

Our objective is language-based search of large-scale image and video datasets.

22.50

Wednesday Poster Session

Energy-Based Learning for Scene Graph Generation

Mohammed Suhail, Abhay Mittal, Behjat Siddiquie, Chris Broaddus, Jayan Eledath, Gerard Medioni, Leonid Sigal

Traditional scene graph generation methods are trained using cross-entropy losses that treat objects and relationships as independent entities.

22.50

Thursday Poster Session

Beyond Static Features for Temporally Consistent 3D Human Pose and Shape From a Video

Hongsuk Choi, Gyeongsik Moon, Ju Yong Chang, Kyoung Mu Lee

Despite the recent success of single image-based 3D human pose and shape estimation methods, recovering temporally consistent and smooth 3D human motion from a video is still challenging.

22.25

Monday Poster Session

Generative Classifiers as a Basis for Trustworthy Image Classification

Radek Mackowiak, Lynton Ardizzone, Ullrich Kothe, Carsten Rother

With the maturing of deep learning systems, trustworthiness is becoming increasingly important for model assessment.

22.25

Tuesday Poster Session

Unsupervised Visual Representation Learning by Tracking Patches in Video

Guangting Wang, Yizhou Zhou, Chong Luo, Wenxuan Xie, Wenjun Zeng, Zhiwei Xiong

Inspired by the fact that human eyes continue to develop tracking ability in early and middle childhood, we propose to use tracking as a proxy task for a computer vision system to learn the visual representations.

Monday Poster Session

An Alternative Probabilistic Interpretation of the Huber Loss

Gregory P. Meyer

The Huber loss is a robust loss function used for a wide range of regression tasks.

21.75

Tuesday Poster Session

CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation

Xingran Zhou, Bo Zhang, Ting Zhang, Pan Zhang, Jianmin Bao, Dong Chen, Zhongfei Zhang, Fang Wen

We present the full-resolution correspondence learning for cross-domain images, which aids image translation.

Thursday Poster Session

Extreme Rotation Estimation Using Dense Correlation Volumes

Ruojin Cai, Bharath Hariharan, Noah Snavely, Hadar Averbuch-Elor

We present a technique for estimating the relative 3D rotation of an RGB image pair in an extreme setting, where the images have little or no overlap.

21.50

Thursday Poster Session

Depth From Camera Motion and Object Detection

Brent A. Griffin, Jason J. Corso

This paper addresses the problem of learning to estimate the depth of detected objects given some measurement of camera motion (e.g., from robot kinematics or vehicle odometry).

Monday Poster Session

Coordinate Attention for Efficient Mobile Network Design

Qibin Hou, Daquan Zhou, Jiashi Feng

Recent studies on mobile network design have demonstrated the remarkable effectiveness of channel attention (e.g., the Squeeze-and-Excitation attention) for lifting model performance, but they generally neglect the positional information, which is important for generating spatially selective attention maps.

21.50

Thursday Poster Session

Universal Spectral Adversarial Attacks for Deformable Shapes

Arianna Rampini, Franco Pestarini, Luca Cosmo, Simone Melzi, Emanuele Rodola

Machine learning models are known to be vulnerable to adversarial attacks, namely perturbations of the data that lead to wrong predictions despite being imperceptible.

Tuesday Poster Session

Unsupervised Real-World Image Super Resolution via Domain-Distance Aware Training

Yunxuan Wei, Shuhang Gu, Yawei Li, Radu Timofte, Longcun Jin, Hengjie Song

These days, unsupervised super-resolution (SR) is soaring due to its practical and promising potential in real scenarios.

21.50

Thursday Poster Session

From Points to Multi-Object 3D Reconstruction

Francis Engelmann, Konstantinos Rematas, Bastian Leibe, Vittorio Ferrari

We propose a method to detect and reconstruct multiple 3D objects from a single RGB image.

21.25

Tuesday Poster Session

Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation

Hugo Germain, Vincent Lepetit, Guillaume Bourmaud

Absolute camera pose estimation is usually addressed by sequentially solving two distinct subproblems: First a feature matching problem that seeks to establish putative 2D-3D correspondences, and then a Perspective-n-Point problem that minimizes, w.r.t. [Expand]

21.25

Monday Poster Session

S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling

Ze Yang, Shenlong Wang, Sivabalan Manivasagam, Zeng Huang, Wei-Chiu Ma, Xinchen Yan, Ersin Yumer, Raquel Urtasun

Constructing and animating humans is an important component for building virtual worlds in a wide variety of applications such as virtual reality or robotics testing in simulation.

20.75

Thursday Poster Session

The Lottery Tickets Hypothesis for Supervised and Self-Supervised Pre-Training in Computer Vision Models

Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang

The computer vision world has been re-gaining enthusiasm in various pre-trained models, including both classical ImageNet supervised pre-training and recently emerged self-supervised pre-training such as SimCLR and MoCo.

20.50

Friday Poster Session

Continual Adaptation of Visual Representations via Domain Randomization and Meta-Learning

Riccardo Volpi, Diane Larlus, Gregory Rogez

Most standard learning approaches lead to fragile models which are prone to drift when sequentially trained on samples of a different nature -- the well-known "catastrophic forgetting" issue.

Tuesday Poster Session

No Shadow Left Behind: Removing Objects and Their Shadows Using Approximate Lighting and Geometry

Edward Zhang, Ricardo Martin-Brualla, Janne Kontkanen, Brian L. Curless

Removing objects from images is a challenging technical problem that is important for many applications, including mixed reality. [Expand]

Friday Poster Session

4D Panoptic LiDAR Segmentation

Mehmet Aygun, Aljosa Osep, Mark Weber, Maxim Maximov, Cyrill Stachniss, Jens Behley, Laura Leal-Taixe

Temporal semantic scene understanding is critical for self-driving cars or robots operating in dynamic environments. [Expand]

20.25

Tuesday Poster Session

Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts

Soravit Changpinyo, Piyush Sharma, Nan Ding, Radu Soricut

The availability of large-scale image captioning and visual question answering datasets has contributed significantly to recent successes in vision-and-language pre-training. [Expand]

20.25

Tuesday Poster Session

Semi-Supervised Synthesis of High-Resolution Editable Textures for 3D Humans

Bindita Chaudhuri, Nikolaos Sarafianos, Linda Shapiro, Tony Tung

We introduce a novel approach to generate diverse high fidelity texture maps for 3D human meshes in a semi-supervised setup. [Expand]

Wednesday Poster Session

Roses Are Red, Violets Are Blue... but Should VQA Expect Them To?

Corentin Kervadec, Grigory Antipov, Moez Baccouche, Christian Wolf

Models for Visual Question Answering (VQA) are notorious for their tendency to rely on dataset biases, as the large and unbalanced diversity of questions and concepts involved tends to prevent models from learning to "reason", leading them to perform "educated guesses" instead. [Expand]

19.75

Monday Poster Session

Sketch2Model: View-Aware 3D Modeling From Single Free-Hand Sketches

Song-Hai Zhang, Yuan-Chen Guo, Qing-Wen Gu

We investigate the problem of generating 3D meshes from single free-hand sketches, aiming at fast 3D modeling for novice users. [Expand]

Tuesday Poster Session

ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis

Yinan He, Bei Gan, Siyu Chen, Yichun Zhou, Guojun Yin, Luchuan Song, Lu Sheng, Jing Shao, Ziwei Liu

The rapid progress of photorealistic synthesis techniques has reached a critical point where the boundary between real and manipulated images starts to blur. [Expand]

Tuesday Poster Session

Ranking Neural Checkpoints

Yandong Li, Xuhui Jia, Ruoxin Sang, Yukun Zhu, Bradley Green, Liqiang Wang, Boqing Gong

This paper is concerned with ranking many pre-trained deep neural networks (DNNs), called checkpoints, for the transfer learning to a downstream task. [Expand]

19.25

Monday Poster Session

Xiang Li, Wenhai Wang, Xiaolin Hu, Jun Li, Jinhui Tang, Jian Yang

Localization Quality Estimation (LQE) is crucial and popular in the recent advancement of dense object detectors since it can provide accurate ranking scores that benefit the Non-Maximum Suppression processing and improve detection performance. [Expand]

19.00

Thursday Poster Session

3D Spatial Recognition Without Spatially Labeled 3D

Zhongzheng Ren, Ishan Misra, Alexander G. Schwing, Rohit Girdhar

We introduce WyPR, a Weakly-supervised framework for Point cloud Recognition, requiring only scene-level class tags as supervision. [Expand]

Thursday Poster Session

Unpaired Image-to-Image Translation via Latent Energy Transport

Yang Zhao, Changyou Chen

Image-to-image translation aims to preserve source contents while translating to discriminative target styles between two visual domains. [Expand]

19.00

Friday Poster Session

DeepVideoMVS: Multi-View Stereo on Video With Recurrent Spatio-Temporal Fusion

Arda Duzceker, Silvano Galliani, Christoph Vogel, Pablo Speciale, Mihai Dusmanu, Marc Pollefeys

We propose an online multi-view depth prediction approach on posed video streams, where the scene geometry information computed in the previous time steps is propagated to the current time step in an efficient and geometrically plausible way. [Expand]

18.50

Thursday Poster Session

i3DMM: Deep Implicit 3D Morphable Model of Human Heads

Tarun Yenamandra, Ayush Tewari, Florian Bernard, Hans-Peter Seidel, Mohamed Elgharib, Daniel Cremers, Christian Theobalt

We present the first deep implicit 3D morphable model (i3DMM) of full heads. [Expand]

18.50

Thursday Poster Session

Abhishek Badki, Orazio Gallo, Jan Kautz, Pradeep Sen

Time-to-contact (TTC), the time for an object to collide with the observer's plane, is a powerful tool for path planning: it is potentially more informative than the depth, velocity, and acceleration of objects in the scene, even for humans. [Expand]

Thursday Poster Session

Deep Active Surface Models

Udaranga Wickramasinghe, Pascal Fua, Graham Knott

Active Surface Models have a long history of being useful to model complex 3D surfaces. [Expand]

17.50

Thursday Poster Session

Vx2Text: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs

Xudong Lin, Gedas Bertasius, Jue Wang, Shih-Fu Chang, Devi Parikh, Lorenzo Torresani

We present Vx2Text, a framework for text generation from multimodal inputs consisting of video plus text, speech, or audio. [Expand]

17.25

Tuesday Poster Session

Mask Guided Matting via Progressive Refinement Network

Qihang Yu, Jianming Zhang, He Zhang, Yilin Wang, Zhe Lin, Ning Xu, Yutong Bai, Alan Yuille

We propose Mask Guided (MG) Matting, a robust matting framework that takes a general coarse mask as guidance. [Expand]

Monday Poster Session

Masksembles for Uncertainty Estimation

Nikita Durasov, Timur Bagautdinov, Pierre Baque, Pascal Fua

Deep neural networks have amply demonstrated their prowess but estimating the reliability of their predictions remains challenging. [Expand]

17.00

Thursday Poster Session

Uncalibrated Neural Inverse Rendering for Photometric Stereo of General Surfaces

Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc Van Gool

This paper presents an uncalibrated deep neural network framework for the photometric stereo problem. [Expand]

Tuesday Poster Session

Learning Graph Embeddings for Compositional Zero-Shot Learning

Muhammad Ferjad Naeem, Yongqin Xian, Federico Tombari, Zeynep Akata

In compositional zero-shot learning, the goal is to recognize unseen compositions (e.g. [Expand]

17.00

Monday Poster Session

Multiple Instance Captioning: Learning Representations From Histopathology Textbooks and Articles

Jevgenij Gamper, Nasir Rajpoot

We present ARCH, a computational pathology (CP) multiple instance captioning dataset to facilitate dense supervision of CP tasks. [Expand]

Friday Poster Session

Quantifying Explainers of Graph Neural Networks in Computational Pathology

Friday Poster Session

WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition

Zheng Zhu, Guan Huang, Jiankang Deng, Yun Ye, Junjie Huang, Xinze Chen, Jiagang Zhu, Tian Yang, Jiwen Lu, Dalong Du, Jie Zhou

In this paper, we contribute a new million-scale face benchmark containing noisy 4M identities/260M faces (WebFace260M) and cleaned 2M identities/42M faces (WebFace42M) training data, as well as an elaborately designed time-constrained evaluation protocol. [Expand]

16.00

Wednesday Poster Session

Sequential Graph Convolutional Network for Active Learning

Razvan Caramalau, Binod Bhattarai, Tae-Kyun Kim

We propose a novel pool-based Active Learning framework constructed on a sequential Graph Convolution Network (GCN). [Expand]

15.75

Wednesday Poster Session

Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes

Julian Chibane, Aayush Bansal, Verica Lazova, Gerard Pons-Moll

Recent neural view synthesis methods have achieved impressive quality and realism, surpassing classical pipelines which rely on multi-view reconstruction. [Expand]

Wednesday Poster Session

Zhiqin Chen, Vladimir G. Kim, Matthew Fisher, Noam Aigerman, Hao Zhang, Siddhartha Chaudhuri

We introduce a deep generative network for 3D shape detailization, akin to stylization with the style being geometric details. [Expand]

15.00

Friday Poster Session

Continual Learning via Bit-Level Information Preserving

Yujun Shi, Li Yuan, Yunpeng Chen, Jiashi Feng

Continual learning tackles the setting of learning different tasks sequentially. [Expand]

Friday Poster Session

Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds

Li Yi, Boqing Gong, Thomas Funkhouser

We study an unsupervised domain adaptation problem for the semantic labeling of 3D point clouds, with a particular focus on domain discrepancies induced by different LiDAR sensors. [Expand]

15.00

Thursday Poster Session

IIRC: Incremental Implicitly-Refined Classification

Mohamed Abdelsalam, Mojtaba Faramarzi, Shagun Sodhani, Sarath Chandar

We introduce the 'Incremental Implicitly-Refined Classification (IIRC)' setup, an extension to the class incremental learning setup where the incoming batches of classes have two granularity levels. [Expand]

14.75

Wednesday Poster Session

Bo Sun, Banghuai Li, Shengcai Cai, Ye Yuan, Chi Zhang

Emerging interests have been brought to recognize previously unseen objects given very few training examples, known as few-shot object detection (FSOD). [Expand]

13.50

Wednesday Poster Session

Lifting 2D StyleGAN for 3D-Aware Face Generation

Yichun Shi, Divyansh Aggarwal, Anil K. Jain

We propose a framework, called LiftedGAN, that disentangles and lifts a pre-trained StyleGAN2 for 3D-aware face generation. [Expand]

Tuesday Poster Session

Self-Supervised Learning of Depth Inference for Multi-View Stereo

Jiayu Yang, Jose M. Alvarez, Miaomiao Liu

Recent supervised multi-view depth estimation networks have achieved promising results. [Expand]

Wednesday Poster Session

Unsupervised Human Pose Estimation Through Transforming Shape Templates

Luca Schmidtke, Athanasios Vlontzos, Simon Ellershaw, Anna Lukens, Tomoki Arichi, Bernhard Kainz

Human pose estimation is a major computer vision problem with applications ranging from augmented reality and video capture to surveillance and movement tracking. [Expand]

Monday Poster Session

LiDAR-Based Panoptic Segmentation via Dynamic Shifting Network

Fangzhou Hong, Hui Zhou, Xinge Zhu, Hongsheng Li, Ziwei Liu

With the rapid advances of autonomous driving, it becomes critical to equip its sensing system with more holistic 3D perception. [Expand]

12.75

Thursday Poster Session

Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer

Tianwei Lin, Zhuoqi Ma, Fu Li, Dongliang He, Xin Li, Errui Ding, Nannan Wang, Jie Li, Xinbo Gao

Artistic style transfer aims at migrating the style from an example image to a content image. [Expand]

Mianlun Zheng, Yi Zhou, Duygu Ceylan, Jernej Barbic

Fast and light-weight methods for animating 3D characters are desirable in various applications such as computer games. [Expand]

Tuesday Poster Session

Content-Aware GAN Compression

Yuchen Liu, Zhixin Shu, Yijun Li, Zhe Lin, Federico Perazzi, Sun-Yuan Kung

Generative adversarial networks (GANs), e.g., StyleGAN2, play a vital role in various image generation and synthesis tasks, yet their notoriously high computational cost hinders their efficient deployment on edge devices. [Expand]

Thursday Poster Session

Faster Meta Update Strategy for Noise-Robust Deep Learning

Youjiang Xu, Linchao Zhu, Lu Jiang, Yi Yang

It has been shown that deep neural networks are prone to overfitting on biased training data. [Expand]

12.50

Monday Poster Session

StereoPIFu: Depth Aware Clothed Human Digitization via Stereo Vision

Yang Hong, Juyong Zhang, Boyi Jiang, Yudong Guo, Ligang Liu, Hujun Bao

In this paper, we propose StereoPIFu, which integrates the geometric constraints of stereo vision with implicit function representation of PIFu, to recover the 3D shape of the clothed human from a pair of low-cost rectified images. [Expand]

12.25

Monday Poster Session

CoMoGAN: Continuous Model-Guided Image-to-Image Translation

Fabio Pizzati, Pietro Cerri, Raoul de Charette

CoMoGAN is a continuous GAN relying on the unsupervised reorganization of the target data on a functional manifold. [Expand]

Thursday Poster Session

Self-Supervised Motion Learning From Static Images

Ziyuan Huang, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Rong Jin, Marcelo H. Ang

Motions are reflected in videos as the movement of pixels, and actions are essentially patterns of inconsistent motions between the foreground and the background. [Expand]

12.00

Monday Poster Session

KOALAnet: Blind Super-Resolution Using Kernel-Oriented Adaptive Local Adjustment

UC2: Universal Cross-Lingual Cross-Modal Vision-and-Language Pre-Training

Mingyang Zhou, Luowei Zhou, Shuohang Wang, Yu Cheng, Linjie Li, Zhou Yu, Jingjing Liu

Vision-and-language pre-training has achieved impressive success in learning multimodal representations between vision and language. [Expand]

Tuesday Poster Session

HyperSeg: Patch-Wise Hypernetwork for Real-Time Semantic Segmentation

Yuval Nirkin, Lior Wolf, Tal Hassner

We present a novel, real-time, semantic segmentation network in which the encoder both encodes and generates the parameters (weights) of the decoder. [Expand]

11.75

Tuesday Poster Session

Kwanyoung Kim, Dongwon Park, Kwang In Kim, Se Young Chun

Often, labeling a large amount of data is challenging due to high labeling cost, limiting the application domain of deep learning techniques. [Expand]

11.25

Wednesday Poster Session

Visually Informed Binaural Audio Generation without Binaural Audios

Xudong Xu, Hang Zhou, Ziwei Liu, Bo Dai, Xiaogang Wang, Dahua Lin

Stereophonic audio, especially binaural audio, plays an essential role in immersive viewing environments. [Expand]

10.25

Thursday Poster Session

DCNAS: Densely Connected Neural Architecture Search for Semantic Image Segmentation

Xiong Zhang, Hongmin Xu, Hong Mo, Jianchao Tan, Cheng Yang, Lei Wang, Wenqi Ren

Existing NAS methods for dense image prediction tasks usually compromise on restricted search space or search on proxy task to meet the achievable computational demands. [Expand]

10.25

Thursday Poster Session

BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond

Kelvin C.K. Chan, Xintao Wang, Ke Yu, Chao Dong, Chen Change Loy

Video super-resolution (VSR) approaches tend to have more components than the image counterparts as they need to exploit the additional temporal dimension. [Expand]

10.00

Tuesday Poster Session

Generalizable Pedestrian Detection: The Elephant in the Room

Irtiza Hasan, Shengcai Liao, Jinpeng Li, Saad Ullah Akram, Ling Shao

Pedestrian detection is used in many vision based applications ranging from video surveillance to autonomous driving. [Expand]

10.00

Wednesday Poster Session

Progressive Semantic Segmentation

Chuong Huynh, Anh Tuan Tran, Khoa Luu, Minh Hoai

The objective of this work is to segment high-resolution images without overloading GPU memory usage or losing the fine details in the output segmentation map. [Expand]

Friday Poster Session

AdCo: Adversarial Contrast for Efficient Learning of Unsupervised Representations From Self-Trained Negative Adversaries

Qianjiang Hu, Xiao Wang, Wei Hu, Guo-Jun Qi

Contrastive learning relies on constructing a collection of negative examples that are sufficiently hard to discriminate against positive queries when their representations are self-trained. [Expand]

10.00

Monday Poster Session

Weakly-Supervised Physically Unconstrained Gaze Estimation

Rakshit Kothari, Shalini De Mello, Umar Iqbal, Wonmin Byeon, Seonwook Park, Jan Kautz

A major challenge for physically unconstrained gaze estimation is acquiring training data with 3D gaze annotations for in-the-wild and outdoor scenarios. [Expand]

Wednesday Poster Session

Self-Point-Flow: Self-Supervised Scene Flow Estimation From Point Clouds With Optimal Transport and Random Walk

Ruibo Li, Guosheng Lin, Lihua Xie

Due to the scarcity of annotated scene flow data, self-supervised scene flow learning in point clouds has attracted increasing attention. [Expand]

Friday Poster Session

ProSelfLC: Progressive Self Label Correction for Training Robust Deep Neural Networks

Xinshao Wang, Yang Hua, Elyor Kodirov, David A. Clifton, Neil M. Robertson

To train robust deep neural networks (DNNs), we systematically study several target modification approaches, which include output regularisation, self and non-self label correction (LC). [Expand]

10.00

Monday Poster Session

Correlated Input-Dependent Label Noise in Large-Scale Image Classification

Mark Collier, Basil Mustafa, Efi Kokiopoulou, Rodolphe Jenatton, Jesse Berent

Large scale image classification datasets often contain noisy labels. [Expand]

9.75

Monday Poster Session

Learning to Track Instances without Video Annotations

Yang Fu, Sifei Liu, Umar Iqbal, Shalini De Mello, Humphrey Shi, Jan Kautz

Tracking segmentation masks of multiple instances has been intensively studied, but still faces two fundamental challenges: 1) the requirement of large-scale, frame-wise annotation, and 2) the complexity of two-stage approaches. [Expand]

Wednesday Poster Session

HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching

Vladimir Tankovich, Christian Hane, Yinda Zhang, Adarsh Kowdle, Sean Fanello, Sofien Bouaziz

This paper presents HITNet, a novel neural network architecture for real-time stereo matching. [Expand]

9.75

Thursday Poster Session

PatchmatchNet: Learned Multi-View Patchmatch Stereo

Dailan He, Yaoyan Zheng, Baocheng Sun, Yan Wang, Hongwei Qin

For learned image compression, the autoregressive context model is proved effective in improving the rate-distortion (RD) performance. [Expand]

Thursday Poster Session

Audio-Driven Emotional Video Portraits

Xinya Ji, Hang Zhou, Kaisiyuan Wang, Wayne Wu, Chen Change Loy, Xun Cao, Feng Xu

Despite previous success in generating audio-driven talking heads, most of the previous studies focus on the correlation between speech content and the mouth shape. [Expand]

9.25

Thursday Poster Session

MAZE: Data-Free Model Stealing Attack Using Zeroth-Order Gradient Estimation

Sanjay Kariyappa, Atul Prakash, Moinuddin K Qureshi

High quality Machine Learning (ML) models are often considered valuable intellectual property by companies. [Expand]

9.25

Thursday Poster Session

AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling

Dilin Wang, Meng Li, Chengyue Gong, Vikas Chandra

Neural architecture search (NAS) has shown great promise in designing state-of-the-art (SOTA) models that are both accurate and efficient. [Expand]

9.25

Tuesday Poster Session

Generative PointNet: Deep Energy-Based Learning on Unordered Point Sets for 3D Generation, Reconstruction and Classification

Jianwen Xie, Yifei Xu, Zilong Zheng, Song-Chun Zhu, Ying Nian Wu

We propose a generative model of unordered point sets, such as point clouds, in the forms of an energy-based model, where the energy function is parameterized by an input-permutation-invariant bottom-up neural network. [Expand]

Thursday Poster Session

HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens

Zhaohui Yang, Yunhe Wang, Xinghao Chen, Jianyuan Guo, Wei Zhang, Chao Xu, Chunjing Xu, Dacheng Tao, Chang Xu

Neural Architecture Search (NAS) aims to automatically discover optimal architectures. [Expand]

9.25

Wednesday Poster Session

SimPoE: Simulated Character Control for 3D Human Pose Estimation

Ye Yuan, Shih-En Wei, Tomas Simon, Kris Kitani, Jason Saragih

Accurate estimation of 3D human motion from monocular video requires modeling both kinematics (body motion without physical forces) and dynamics (motion with physical forces). [Expand]

Pengguang Chen, Shu Liu, Hengshuang Zhao, Jiaya Jia

Knowledge distillation transfers knowledge from the teacher network to the student one, with the goal of greatly improving the performance of the student network. [Expand]

Tuesday Poster Session

Deep Polarization Imaging for 3D Shape and SVBRDF Acquisition

Valentin Deschaintre, Yiming Lin, Abhijeet Ghosh

We present a novel method for efficient acquisition of shape and spatially varying reflectance of 3D objects using polarization cues. [Expand]

Friday Poster Session

Searching by Generating: Flexible and Efficient One-Shot NAS With Architecture Generator

Sian-Yao Huang, Wei-Ta Chu

In one-shot NAS, sub-networks need to be searched from the supernet to meet different hardware constraints. [Expand]

Monday Poster Session

Revamping Cross-Modal Recipe Retrieval With Hierarchical Transformers and Self-Supervised Learning

Amaia Salvador, Erhan Gundogdu, Loris Bazzani, Michael Donoser

Cross-modal recipe retrieval has recently gained substantial attention due to the importance of food in people's lives, as well as the availability of vast amounts of digital cooking recipes and food images to train machine learning models. [Expand]

Thursday Poster Session

AutoFlow: Learning a Better Training Set for Optical Flow

Deqing Sun, Daniel Vlasic, Charles Herrmann, Varun Jampani, Michael Krainin, Huiwen Chang, Ramin Zabih, William T. Freeman, Ce Liu

Synthetic datasets play a critical role in pre-training CNN models for optical flow, but they are painstaking to generate and hard to adapt to new applications. [Expand]

Wednesday Poster Session

PISE: Person Image Synthesis and Editing With Decoupled GAN

Jinsong Zhang, Kun Li, Yu-Kun Lai, Jingyu Yang

Person image synthesis, e.g., pose transfer, is a challenging problem due to large variation and occlusion. [Expand]

9.00

Wednesday Poster Session

Tong Wang, Yousong Zhu, Chaoyang Zhao, Wei Zeng, Jinqiao Wang, Ming Tang

To address the problem of long-tail distribution for the large vocabulary object detection task, existing methods usually divide the whole categories into several groups and treat each group with different strategies. [Expand]

Tuesday Poster Session

Improved Image Matting via Real-Time User Clicks and Uncertainty Estimation

Tianyi Wei, Dongdong Chen, Wenbo Zhou, Jing Liao, Hanqing Zhao, Weiming Zhang, Nenghai Yu

Image matting is a fundamental and challenging problem in computer vision and graphics. [Expand]

Thursday Poster Session

Mitigating Face Recognition Bias via Group Adaptive Classifier

Sixue Gong, Xiaoming Liu, Anil K. Jain

Face recognition is known to exhibit bias -- subjects in a certain demographic group can be better recognized than other groups. [Expand]

8.25

Tuesday Poster Session

Interpolation-Based Semi-Supervised Learning for Object Detection

Jisoo Jeong, Vikas Verma, Minsung Hyun, Juho Kannala, Nojun Kwak

Despite the data labeling cost for the object detection tasks being substantially more than that of the classification tasks, semi-supervised learning methods for object detection have not been studied much. [Expand]

8.25

Thursday Poster Session

Learning Invariant Representations and Risks for Semi-Supervised Domain Adaptation

Bo Li, Yezhen Wang, Shanghang Zhang, Dongsheng Li, Kurt Keutzer, Trevor Darrell, Han Zhao

The success of supervised learning crucially hinges on the assumption that training data matches test data, which rarely holds in practice due to potential distribution shift. [Expand]

8.25

Monday Poster Session

DeepSurfels: Learning Online Appearance Fusion

Marko Mihajlovic, Silvan Weder, Marc Pollefeys, Martin R. Oswald

We present DeepSurfels, a novel hybrid scene representation for geometry and appearance information. [Expand]

Thursday Poster Session

Stefan Stojanov, Anh Thai, James M. Rehg

It is widely accepted that reasoning about object shape is important for object recognition. [Expand]

Monday Poster Session

Structured Scene Memory for Vision-Language Navigation

Hanqing Wang, Wenguan Wang, Wei Liang, Caiming Xiong, Jianbing Shen

Recently, numerous algorithms have been developed to tackle the problem of vision-language navigation (VLN), i.e., entailing an agent to navigate 3D environments through following linguistic instructions. [Expand]

8.00

Wednesday Poster Session

Dense Label Encoding for Boundary Discontinuity Free Rotation Detection

Xue Yang, Liping Hou, Yue Zhou, Wentao Wang, Junchi Yan

Rotation detection serves as a fundamental building block in many visual applications involving aerial images, scene text, faces, etc. [Expand]

8.00

Friday Poster Session

Rethinking BiSeNet for Real-Time Semantic Segmentation

Mingyuan Fan, Shenqi Lai, Junshi Huang, Xiaoming Wei, Zhenhua Chai, Junfeng Luo, Xiaolin Wei

BiSeNet has been proved to be a popular two-stream network for real-time segmentation. [Expand]

7.75

Wednesday Poster Session

Kumara Kahatapitiya, Michael S. Ryoo

In this paper, we introduce 'Coarse-Fine Networks', a two-stream architecture which benefits from different abstractions of temporal resolution to learn better video representations for long-term motion. [Expand]

7.50

Wednesday Poster Session

Frequency-Aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection

Jiaming Li, Hongtao Xie, Jiahong Li, Zhongyuan Wang, Yongdong Zhang

Face forgery detection is raising ever-increasing interest in computer vision since facial manipulation technologies cause serious worries. [Expand]

Tuesday Poster Session

SMD-Nets: Stereo Mixture Density Networks

Fabio Tosi, Yiyi Liao, Carolin Schmitt, Andreas Geiger

Although stereo matching accuracy has greatly improved with deep learning in the last few years, recovering sharp boundaries and high-resolution outputs efficiently remains challenging. [Expand]

Wednesday Poster Session

Learning Accurate Dense Correspondences and When To Trust Them

How Robust Are Randomized Smoothing Based Defenses to Data Poisoning?

Akshay Mehra, Bhavya Kailkhura, Pin-Yu Chen, Jihun Hamm

Predictions of certifiably robust classifiers remain constant in a neighborhood of a point, making them resilient to test-time attacks with a guarantee. [Expand]

Thursday Poster Session

Improving Panoptic Segmentation at All Scales

Lorenzo Porzi, Samuel Rota Bulo, Peter Kontschieder

Crop-based training strategies decouple training resolution from GPU memory consumption, allowing the use of large-capacity panoptic segmentation networks on multi-megapixel images. [Expand]

Wednesday Poster Session

AdderSR: Towards Energy Efficient Image Super-Resolution

Dehua Song, Yunhe Wang, Hanting Chen, Chang Xu, Chunjing Xu, Dacheng Tao

This paper studies the single image super-resolution problem using adder neural networks (AdderNets). [Expand]

7.25

Friday Poster Session

Truly Shift-Invariant Convolutional Neural Networks

Anadi Chaman, Ivan Dokmanic

Thanks to the use of convolution and pooling layers, convolutional neural networks were for a long time thought to be shift-invariant. [Expand]

7.00

Tuesday Poster Session

MetricOpt: Learning To Optimize Black-Box Evaluation Metrics

Chen Huang, Shuangfei Zhai, Pengsheng Guo, Josh Susskind

We study the problem of directly optimizing arbitrary non-differentiable task evaluation metrics such as misclassification rate and recall. [Expand]

Monday Poster Session

Bi-GCN: Binary Graph Convolutional Network

Junfu Wang, Yunhong Wang, Zhen Yang, Liang Yang, Yuanfang Guo

Graph Neural Networks (GNNs) have achieved tremendous success in graph representation learning. [Expand]

7.00

Monday Poster Session

Mathew Monfort, SouYoung Jin, Alexander Liu, David Harwath, Rogerio Feris, James Glass, Aude Oliva

When people observe events, they are able to abstract key information and build concise summaries of what is happening. [Expand]

6.25

Thursday Poster Session

TearingNet: Point Cloud Autoencoder To Learn Topology-Friendly Representations

Jiahao Pang, Duanshun Li, Dong Tian

Topology matters. [Expand]

6.25

Wednesday Poster Session

Uncertainty-Guided Model Generalization to Unseen Domains

Fengchun Qiao, Xi Peng

We study a worst-case scenario in generalization: Out-of-domain generalization from a single source. [Expand]

6.25

Tuesday Poster Session

Fingerspelling Detection in American Sign Language

Bowen Shi, Diane Brentari, Greg Shakhnarovich, Karen Livescu

Fingerspelling, in which words are signed letter by letter, is an important component of American Sign Language. [Expand]

Tuesday Poster Session

Semi-Supervised Action Recognition With Temporal Contrastive Learning

Ankit Singh, Omprakash Chakraborty, Ashutosh Varshney, Rameswar Panda, Rogerio Feris, Kate Saenko, Abir Das

Learning to recognize actions from only a handful of labeled videos is a challenging problem due to the scarcity of tediously collected activity labels. [Expand]

6.25

Wednesday Poster Session

Rectification-Based Knowledge Retention for Continual Learning

Pravendra Singh, Pratik Mazumder, Piyush Rai, Vinay P. Namboodiri

Deep learning models suffer from catastrophic forgetting when trained in an incremental learning setting. [Expand]

6.25

Thursday Poster Session

Multiple Instance Active Learning for Object Detection

Tianning Yuan, Fang Wan, Mengying Fu, Jianzhuang Liu, Songcen Xu, Xiangyang Ji, Qixiang Ye

Despite the substantial progress of active learning for image recognition, there still lacks an instance-level active learning method specified for object detection. [Expand]

6.25

Tuesday Poster Session

AQD: Towards Accurate Quantized Object Detection

Peng Chen, Jing Liu, Bohan Zhuang, Mingkui Tan, Chunhua Shen

Network quantization allows inference to be conducted using low-precision arithmetic for improved inference efficiency of deep neural networks on edge devices. [Expand]

6.00

Monday Poster Session

Polarimetric Normal Stereo

Yan Bai, Jile Jiao, Wang Ce, Jun Liu, Yihang Lou, Xuetao Feng, Ling-Yu Duan

Recently, person re-identification (ReID) has vastly benefited from the surging waves of data-driven methods. [Expand]

Monday Poster Session

Convolutional Dynamic Alignment Networks for Interpretable Classifications

Moritz Bohle, Mario Fritz, Bernt Schiele

We introduce a new family of neural network models called Convolutional Dynamic Alignment Networks (CoDA-Nets), which are performant classifiers with a high degree of inherent interpretability. [Expand]

Wednesday Poster Session

Semantic Audio-Visual Navigation

DOTS: Decoupling Operation and Topology in Differentiable Architecture Search

Yu-Chao Gu, Li-Juan Wang, Yun Liu, Yi Yang, Yu-Huan Wu, Shao-Ping Lu, Ming-Ming Cheng

Differentiable Architecture Search (DARTS) has attracted extensive attention due to its efficiency in searching for cell structures. [Expand]

5.25

Thursday Poster Session

Distilling Causal Effect of Data in Class-Incremental Learning

Qing Yu, Atsushi Hashimoto, Yoshitaka Ushiku

Universal domain adaptation (UniDA) has been proposed to transfer knowledge learned from a label-rich source domain to a label-scarce target domain without any constraints on the label sets. [Expand]

Monday Poster Session

Abhinav Kumar, Garrick Brazil, Xiaoming Liu

Modern 3D object detectors have immensely benefited from the end-to-end learning idea. [Expand]

Wednesday Poster Session

BRepNet: A Topological Message Passing System for Solid Models

Joseph G. Lambourne, Karl D.D. Willis, Pradeep Kumar Jayaraman, Aditya Sanghi, Peter Meltzer, Hooman Shayani

Boundary representation (B-rep) models are the standard way 3D shapes are described in Computer-Aided Design (CAD) applications. [Expand]

4.50

Thursday Poster Session

Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion

Shi Qiu, Saeed Anwar, Nick Barnes

Given the prominence of current 3D sensors, a fine-grained analysis on the basic point cloud data is worthy of further investigation. [Expand]

Monday Poster Session

TesseTrack: End-to-End Learnable Multi-Person Articulated 3D Pose Tracking

N Dinesh Reddy, Laurent Guigues, Leonid Pishchulin, Jayan Eledath, Srinivasa G. Narasimhan

We consider the task of 3D pose estimation and tracking of multiple people seen in an arbitrary number of camera feeds. [Expand]

Thursday Poster Session

DISCO: Dynamic and Invariant Sensitive Channel Obfuscation for Deep Neural Networks

Abhishek Singh, Ayush Chopra, Ethan Garza, Emily Zhang, Praneeth Vepakomma, Vivek Sharma, Ramesh Raskar

Recent deep learning models have shown remarkable performance in image classification. [Expand]

Thursday Poster Session

Practical Wide-Angle Portraits Correction With Deep Structured Models

Jing Tan, Shan Zhao, Pengfei Xiong, Jiangyu Liu, Haoqiang Fan, Shuaicheng Liu

Wide-angle portraits often enjoy expanded views. [Expand]

Tuesday Poster Session

Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark

Longyin Wen, Dawei Du, Pengfei Zhu, Qinghua Hu, Qilong Wang, Liefeng Bo, Siwei Lyu

To promote the developments of object detection, tracking and counting algorithms in drone-captured videos, we construct a benchmark with a new drone-captured large-scale dataset, named as DroneCrowd, formed by 112 video clips with 33,600 HD frames in various scenarios. [Expand]

Wednesday Poster Session

Regularizing Neural Networks via Adversarial Model Perturbation

Yaowei Zheng, Richong Zhang, Yongyi Mao

Effective regularization techniques are highly desired in deep learning for alleviating overfitting and improving generalization. [Expand]

4.50

Wednesday Poster Session

Removing Diffraction Image Artifacts in Under-Display Camera via Dynamic Skip Connection Network

Ruicheng Feng, Chongyi Li, Huaijin Chen, Shuai Li, Chen Change Loy, Jinwei Gu

Recent development of Under-Display Camera (UDC) systems provides a true bezel-less and notch-free viewing experience on smartphones (and TV, laptops, tablets), while allowing images to be captured from the selfie camera embedded underneath. [Expand]

Monday Poster Session

MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition

Shuang Li, Kaixiong Gong, Chi Harold Liu, Yulin Wang, Feng Qiao, Xinjing Cheng

Real-world training data usually exhibits long-tailed distribution, where several majority classes have a significantly larger number of samples than the remaining minority classes. [Expand]

Tuesday Poster Session

Simulating Unknown Target Models for Query-Efficient Black-Box Attacks

Chen Ma, Li Chen, Jun-Hai Yong

Many adversarial attacks have been proposed to investigate the security issues of deep neural networks. [Expand]

Thursday Poster Session

Beyond Image to Depth: Improving Depth Prediction Using Echoes

Kranti Kumar Parida, Siddharth Srivastava, Gaurav Sharma

We address the problem of estimating depth with multi-modal audio-visual data. [Expand]

4.25

Wednesday Poster Session

On the Difficulty of Membership Inference Attacks

Shahbaz Rezaei, Xin Liu

Recent studies propose membership inference (MI) attacks on deep models, where the goal is to infer if a sample has been used in the training process. [Expand]

Wednesday Poster Session

Automatic Vertebra Localization and Identification in CT by Spine Rectification and Anatomically-Constrained Optimization

Fakai Wang, Kang Zheng, Le Lu, Jing Xiao, Min Wu, Shun Miao

Accurate vertebra localization and identification are required in many clinical applications of spine disorder diagnosis and surgery planning. [Expand]

Tuesday Poster Session

Kartikeya Bhardwaj, Guihong Li, Radu Marculescu

DenseNets introduce concatenation-type skip connections that achieve state-of-the-art accuracy in several computer vision tasks. [Expand]

Thursday Poster Session

Exponential Moving Average Normalization for Self-Supervised and Semi-Supervised Learning

Zhaowei Cai, Avinash Ravichandran, Subhransu Maji, Charless Fowlkes, Zhuowen Tu, Stefano Soatto

We present a plug-in replacement for batch normalization (BN) called exponential moving average normalization (EMAN), which improves the performance of existing student-teacher based self- and semi-supervised learning techniques. [Expand]

4.00

Monday Poster Session

SLADE: A Self-Training Framework for Distance Metric Learning

Jiali Duan, Yen-Liang Lin, Son Tran, Larry S. Davis, C.-C. Jay Kuo

Most existing distance metric learning approaches use fully labeled data to learn the sample similarities in an embedding space. [Expand]

Wednesday Poster Session

Generalized Few-Shot Object Detection Without Forgetting

Zhibo Fan, Yuchen Ma, Zeming Li, Jian Sun

Learning object detection from few examples recently emerged to deal with data-limited situations. [Expand]

Tuesday Poster Session

Regressive Domain Adaptation for Unsupervised Keypoint Detection

Junguang Jiang, Yifei Ji, Ximei Wang, Yufeng Liu, Jianmin Wang, Mingsheng Long

Domain adaptation (DA) aims at transferring knowledge from a labeled source domain to an unlabeled target domain. [Expand]

Tuesday Poster Session

Taskology: Utilizing Task Relations at Scale

Yao Lu, Soren Pirk, Jan Dlabal, Anthony Brohan, Ankita Pasad, Zhao Chen, Vincent Casser, Anelia Angelova, Ariel Gordon

Many computer vision tasks address the problem of scene understanding and are naturally interrelated e.g. [Expand]

4.00

Wednesday Poster Session

Temporal Context Aggregation Network for Temporal Action Proposal Refinement

Zhiwu Qing, Haisheng Su, Weihao Gan, Dongliang Wang, Wei Wu, Xiang Wang, Yu Qiao, Junjie Yan, Changxin Gao, Nong Sang

Temporal action proposal generation aims to estimate temporal intervals of actions in untrimmed videos, which is a challenging yet important task in the video understanding field. [Expand]

4.00

Monday Poster Session

Affective Processes: Stochastic Modelling of Temporal Context for Emotion and Facial Expression Recognition

Enrique Sanchez, Mani Kumar Tellamekala, Michel Valstar, Georgios Tzimiropoulos

Temporal context is key to the recognition of expressions of emotion. [Expand]

Wednesday Poster Session

Look Before You Speak: Visually Contextualized Utterances

Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid

While most conversational AI systems focus on textual dialogue only, conditioning utterances on visual context (when it's available) can lead to more realistic conversations. [Expand]

4.00

Friday Poster Session

Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation

Yapeng Tian, Di Hu, Chenliang Xu

There are rich synchronized audio and visual events in our daily life. [Expand]

4.00

Monday Poster Session

CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning

Chen Wei, Kihyuk Sohn, Clayton Mellina, Alan Yuille, Fan Yang

Semi-supervised learning on class-imbalanced data, although a realistic problem, has been under studied. [Expand]

4.00

Wednesday Poster Session

Deep Optimized Priors for 3D Shape Modeling and Reconstruction

Mingyue Yang, Yuxin Wen, Weikai Chen, Yongwei Chen, Kui Jia

Many learning-based approaches have difficulty scaling to unseen data, as the generality of its learned prior is limited to the scale and variations of the training samples. [Expand]

4.00

Tuesday Poster Session

Distribution Alignment: A Unified Framework for Long-Tail Visual Recognition

Songyang Zhang, Zeming Li, Shipeng Yan, Xuming He, Jian Sun

Despite the success of the deep neural networks, it remains challenging to effectively build a system for long-tail visual recognition tasks. [Expand]

Monday Poster Session

Few-Shot 3D Point Cloud Semantic Segmentation

Na Zhao, Tat-Seng Chua, Gim Hee Lee

Many existing approaches for 3D point cloud semantic segmentation are fully supervised. [Expand]

4.00

Wednesday Poster Session

A Second-Order Approach to Learning With Instance-Dependent Label Noise

Zhaowei Zhu, Tongliang Liu, Yang Liu

The presence of label noise often misleads the training of deep neural networks. [Expand]

4.00

Wednesday Poster Session

Meta Batch-Instance Normalization for Generalizable Person Re-Identification

Seokeon Choi, Taekyung Kim, Minki Jeong, Hyoungseob Park, Changick Kim

Although supervised person re-identification (Re-ID) methods have shown impressive performance, they suffer from a poor generalization capability on unseen domains. [Expand]

3.75

Tuesday Poster Session

A Peek Into the Reasoning of Neural Networks: Interpreting With Structural Visual Concepts

Yunhao Ge, Yao Xiao, Zhi Xu, Meng Zheng, Srikrishna Karanam, Terrence Chen, Laurent Itti, Ziyan Wu

Despite substantial progress in applying neural networks (NN) to a wide variety of areas, they still largely suffer from a lack of transparency and interpretability. [Expand]

Monday Poster Session

StyleMix: Separating Content and Style for Enhanced Data Augmentation

Minui Hong, Jinwoo Choi, Gunhee Kim

In spite of the great success of deep neural networks for many challenging classification tasks, the learned networks are vulnerable to overfitting and adversarial attacks. [Expand]

Thursday Poster Session

General Multi-Label Image Classification With Transformers

Jack Lanchantin, Tianlu Wang, Vicente Ordonez, Yanjun Qi

Multi-label image classification is the task of predicting a set of labels corresponding to objects, attributes or other entities present in an image. [Expand]

3.75

Friday Poster Session

Model-Contrastive Federated Learning

Qinbin Li, Bingsheng He, Dawn Song

Federated learning enables multiple parties to collaboratively train a machine learning model without communicating their local data. [Expand]

3.75

Wednesday Poster Session

UAV-Human: A Large Benchmark for Human Behavior Understanding With Unmanned Aerial Vehicles

Tianjiao Li, Jun Liu, Wei Zhang, Yun Ni, Wenqian Wang, Zhiheng Li

Human behavior understanding with unmanned aerial vehicles (UAVs) is of great significance for a wide range of applications, which simultaneously brings an urgent demand of large, challenging, and comprehensive benchmarks for the development and evaluation of UAV-based models. [Expand]

Friday Poster Session

Learning Asynchronous and Sparse Human-Object Interaction in Videos

Romero Morais, Vuong Le, Svetha Venkatesh, Truyen Tran

Human activities can be learned from video. [Expand]

Friday Poster Session

Neural Prototype Trees for Interpretable Fine-Grained Image Recognition

Tuesday Poster Session

Backdoor Attacks Against Deep Learning Systems in the Physical World

Emily Wenger, Josephine Passananti, Arjun Nitin Bhagoji, Yuanshun Yao, Haitao Zheng, Ben Y. Zhao

Backdoor attacks embed hidden malicious behaviors into deep learning models, which only activate and cause misclassifications on model inputs containing a specific "trigger." Existing works on backdoor attacks and defenses, however, mostly focus on digital attacks that apply digitally generated patterns as triggers. [Expand]

Tuesday Poster Session

Neural Splines: Fitting 3D Surfaces With Infinitely-Wide Neural Networks

Francis Williams, Matthew Trager, Joan Bruna, Denis Zorin

We present Neural Splines, a technique for 3D surface reconstruction that is based on random feature kernels arising from infinitely-wide shallow ReLU networks. [Expand]

3.50

Wednesday Poster Session

Track To Detect and Segment: An Online Multi-Object Tracker

Jialian Wu, Jiale Cao, Liangchen Song, Yu Wang, Ming Yang, Junsong Yuan

Most online multi-object trackers perform object detection stand-alone in a neural net without any input from tracking. [Expand]

3.50

Thursday Poster Session

Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing

Tianfei Zhou, Wenguan Wang, Si Liu, Yi Yang, Luc Van Gool

To address the challenging task of instance-aware human part parsing, a new bottom-up regime is proposed to learn category-level human semantic segmentation as well as multi-person pose estimation in a joint and end-to-end manner. [Expand]

3.50

Monday Poster Session

One Shot Face Swapping on Megapixels

Yuhao Zhu, Qi Li, Jian Wang, Cheng-Zhong Xu, Zhenan Sun

Face swapping has both positive applications such as entertainment, human-computer interaction, etc., and negative applications such as DeepFake threats to politics, economics, etc. [Expand]

Tuesday Poster Session

PointDSC: Robust Point Cloud Registration Using Deep Spatial Consistency

Xuyang Bai, Zixin Luo, Lei Zhou, Hongkai Chen, Lei Li, Zeyu Hu, Hongbo Fu, Chiew-Lan Tai

Removing outlier correspondences is one of the critical steps for successful feature-based point cloud registration. [Expand]

Friday Poster Session

RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening

Sungha Choi, Sanghun Jung, Huiwon Yun, Joanne T. Kim, Seungryong Kim, Jaegul Choo

Enhancing the generalization capability of deep neural networks to unseen domains is crucial for safety-critical applications in the real world such as autonomous driving. [Expand]

3.25

Thursday Poster Session

Anomaly Detection in Video via Self-Supervised and Multi-Task Learning

Searching for Fast Model Families on Datacenter Accelerators

Sheng Li, Mingxing Tan, Ruoming Pang, Andrew Li, Liqun Cheng, Quoc V. Le, Norman P. Jouppi

Neural Architecture Search (NAS), together with model scaling, has shown remarkable progress in designing high accuracy and fast convolutional architecture families. [Expand]

3.00

Wednesday Poster Session

Uncertainty-Aware Joint Salient Object and Camouflaged Object Detection

Aixuan Li, Jing Zhang, Yunqiu Lv, Bowen Liu, Tong Zhang, Yuchao Dai

Visual salient object detection (SOD) aims at finding the salient object(s) that attract human attention, while camouflaged object detection (COD), on the contrary, intends to discover the camouflaged object(s) that are hidden in the surroundings. [Expand]

3.00

Wednesday Poster Session

Diffusion Probabilistic Models for 3D Point Cloud Generation

Positive-Congruent Training: Towards Regression-Free Model Updates

Sijie Yan, Yuanjun Xiong, Kaustav Kundu, Shuo Yang, Siqi Deng, Meng Wang, Wei Xia, Stefano Soatto

Reducing inconsistencies in the behavior of different versions of an AI system can be as important in practice as reducing its overall error. [Expand]

3.00

Thursday Poster Session

Robust Instance Segmentation Through Reasoning About Multi-Object Occlusion

Xiaoding Yuan, Adam Kortylewski, Yihong Sun, Alan Yuille

Analyzing complex scenes with Deep Neural Networks is a challenging task, particularly when images contain multiple objects that partially occlude each other. [Expand]

Wednesday Poster Session

Deep Stable Learning for Out-of-Distribution Generalization

Xingxuan Zhang, Peng Cui, Renzhe Xu, Linjun Zhou, Yue He, Zheyan Shen

Approaches based on deep neural networks have achieved striking performance when testing data and training data share similar distribution, but can significantly fail otherwise. [Expand]

Tuesday Poster Session

DoDNet: Learning To Segment Multi-Organ and Tumors From Multiple Partially Labeled Datasets

Jianpeng Zhang, Yutong Xie, Yong Xia, Chunhua Shen

Due to the intensive cost of labor and expertise in annotating 3D medical images at a voxel level, most benchmark datasets are equipped with the annotations of only one type of organs and/or tumors, resulting in the so-called partially labeling issue. [Expand]

3.00

Monday Poster Session

Improving Sign Language Translation With Monolingual Data by Sign Back-Translation

Hao Zhou, Wengang Zhou, Weizhen Qi, Junfu Pu, Houqiang Li

Despite existing pioneering works on sign language translation (SLT), there is a non-trivial obstacle, i.e., the limited quantity of parallel sign-text data. [Expand]

Monday Poster Session

Spatially-Varying Outdoor Lighting Estimation From Intrinsics

Yongjie Zhu, Yinda Zhang, Si Li, Boxin Shi

We present SOLID-Net, a neural network for spatially-varying outdoor lighting estimation from a single outdoor image for any 2D pixel location. [Expand]

Thursday Poster Session

ArtFlow: Unbiased Image Style Transfer via Reversible Neural Flows

Jie An, Siyu Huang, Yibing Song, Dejing Dou, Wei Liu, Jiebo Luo

Universal style transfer retains styles from reference images in content images. [Expand]

2.75

Monday Poster Session

Boundary IoU: Improving Object-Centric Image Segmentation Evaluation

Bowen Cheng, Ross Girshick, Piotr Dollar, Alexander C. Berg, Alexander Kirillov

We present Boundary IoU (Intersection-over-Union), a new segmentation evaluation measure focused on boundary quality. [Expand]

2.75

Thursday Poster Session

Equivariant Point Network for 3D Point Cloud Analysis

Haiwei Chen, Shichen Liu, Weikai Chen, Hao Li, Randall Hill

Features that are equivariant to a larger group of symmetries have been shown to be more discriminative and powerful in recent studies. [Expand]

Thursday Poster Session

Compatibility-Aware Heterogeneous Visual Search

Rahul Duggal, Hao Zhou, Shuo Yang, Yuanjun Xiong, Wei Xia, Zhuowen Tu, Stefano Soatto

We tackle the problem of visual search under resource constraints. [Expand]

Wednesday Poster Session

Sewer-ML: A Multi-Label Sewer Defect Classification Dataset and Benchmark

PAConv: Position Adaptive Convolution With Dynamic Kernel Assembling on Point Clouds

Mutian Xu, Runyu Ding, Hengshuang Zhao, Xiaojuan Qi

We introduce Position Adaptive Convolution (PAConv), a generic convolution operation for 3D point cloud processing. [Expand]

2.75

Tuesday Poster Session

Patch-VQ: 'Patching Up' the Video Quality Problem

Zhenqiang Ying, Maniratnam Mandal, Deepti Ghadiyaram, Alan Bovik

No-reference (NR) perceptual video quality assessment (VQA) is a complex, unsolved, and important problem for social and streaming media applications. [Expand]

2.75

Thursday Poster Session

Are Labels Always Necessary for Classifier Accuracy Evaluation?

Weijian Deng, Liang Zheng

To calculate the model accuracy on a computer vision task, e.g., object recognition, we usually require a test set composed of test samples and their ground truth labels. [Expand]

2.50

Thursday Poster Session

XProtoNet: Diagnosis in Chest Radiography With Global and Local Explanations

Eunji Kim, Siwon Kim, Minji Seo, Sungroh Yoon

Automated diagnosis using deep neural networks in chest radiography can help radiologists detect life-threatening diseases. [Expand]

2.50

Friday Poster Session

MongeNet: Efficient Sampler for Geometric Deep Learning

Leo Lebrat, Rodrigo Santa Cruz, Clinton Fookes, Olivier Salvado

Recent advances in geometric deep-learning introduce complex computational challenges for evaluating the distance between meshes. [Expand]

Friday Poster Session

One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation

Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu

Point cloud semantic segmentation often requires large-scale annotated training data, but clearly, point-wise labels are too tedious to prepare. [Expand]

Monday Poster Session

PointGuard: Provably Robust 3D Point Cloud Classification

Hongbin Liu, Jinyuan Jia, Neil Zhenqiang Gong

3D point cloud classification has many safety-critical applications such as autonomous driving and robotic grasping. [Expand]

2.50

Tuesday Poster Session

UPFlow: Upsampling Pyramid for Unsupervised Optical Flow Learning

When Human Pose Estimation Meets Robustness: Adversarial Algorithms and Benchmarks

Jiahang Wang, Sheng Jin, Wentao Liu, Weizhong Liu, Chen Qian, Ping Luo

Human pose estimation is a fundamental yet challenging task in computer vision, which aims at localizing human anatomical keypoints. [Expand]

Thursday Poster Session

Unsupervised Discovery of the Long-Tail in Instance Segmentation Using Hierarchical Self-Supervision

Zhenzhen Weng, Mehmet Giray Ogut, Shai Limonchik, Serena Yeung

Instance segmentation is an active topic in computer vision that is usually solved by using supervised learning approaches over very large datasets composed of object level masks. [Expand]

Monday Poster Session

Capturing Omni-Range Context for Omnidirectional Segmentation

More Photos Are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yongxin Yang, Tao Xiang, Yi-Zhe Song

A fundamental challenge faced by existing Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) models is the data scarcity -- model performances are largely bottlenecked by the lack of sketch-photo pairs. [Expand]

2.25

Tuesday Poster Session

InverseForm: A Loss Function for Structured Boundary-Aware Segmentation

Shubhankar Borse, Ying Wang, Yizhe Zhang, Fatih Porikli

We present a novel boundary-aware loss term for semantic segmentation using an inverse-transformation network, which efficiently learns the degree of parametric transformations between estimated and target boundaries. [Expand]

Tuesday Poster Session

Back-Tracing Representative Points for Voting-Based 3D Object Detection in Point Clouds

Bowen Cheng, Lu Sheng, Shaoshuai Shi, Ming Yang, Dong Xu

3D object detection in point clouds is a challenging vision task that benefits various applications for understanding the 3D visual world. [Expand]

Wednesday Poster Session

Chengyue Gong, Dilin Wang, Qiang Liu

Semi-supervised learning (SSL) is a key approach toward more data-efficient machine learning by jointly leveraging both labeled and unlabeled data. [Expand]

Thursday Poster Session

ReDet: A Rotation-Equivariant Detector for Aerial Object Detection

Jiaming Han, Jian Ding, Nan Xue, Gui-Song Xia

Recently, object detection in aerial images has gained much attention in computer vision. [Expand]

2.25

Monday Poster Session

Reinforced Attention for Few-Shot Learning and Beyond

Jie Hong, Pengfei Fang, Weihao Li, Tong Zhang, Christian Simon, Mehrtash Harandi, Lars Petersson

Few-shot learning aims to correctly recognize query samples from unseen classes given a limited number of support samples, often by relying on global embeddings of images. [Expand]

2.25

Monday Poster Session

A Multiplexed Network for End-to-End, Multilingual OCR

Jing Huang, Guan Pang, Rama Kovvuri, Mandy Toh, Kevin J Liang, Praveen Krishnan, Xi Yin, Tal Hassner

Recent advances in OCR have shown that an end-to-end (E2E) training pipeline that includes both detection and recognition leads to the best results. [Expand]

2.25

Tuesday Poster Session

FlowStep3D: Model Unrolling for Self-Supervised Scene Flow Estimation

Yair Kittenplon, Yonina C. Eldar, Dan Raviv

Estimating the 3D motion of points in a scene, known as scene flow, is a core problem in computer vision. [Expand]

2.25

Tuesday Poster Session

Dual Pixel Exploration: Simultaneous Depth Estimation and Image Restoration

Liyuan Pan, Shah Chowdhury, Richard Hartley, Miaomiao Liu, Hongguang Zhang, Hongdong Li

The dual-pixel (DP) hardware works by splitting each pixel in half and creating an image pair in a single snapshot. [Expand]

2.25

Tuesday Poster Session

S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-Bit Neural Networks via Guided Distribution Calibration

Zhiqiang Shen, Zechun Liu, Jie Qin, Lei Huang, Kwang-Ting Cheng, Marios Savvides

Previous studies predominantly target self-supervised learning on real-valued networks and have achieved many promising results. [Expand]

2.25

Monday Poster Session

NeuralHumanFVV: Real-Time Neural Volumetric Human Performance Rendering Using RGB Cameras

Xin Suo, Yuheng Jiang, Pei Lin, Yingliang Zhang, Minye Wu, Kaiwen Guo, Lan Xu

4D reconstruction and rendering of human activities is critical for immersive VR/AR experience. [Expand]

2.25

Tuesday Poster Session

Modeling Multi-Label Action Dependencies for Temporal Action Localization

Praveen Tirupattur, Kevin Duarte, Yogesh S Rawat, Mubarak Shah

Real world videos contain many complex actions with inherent relationships between action classes. [Expand]

Monday Poster Session

ORDisCo: Effective and Efficient Usage of Incremental Unlabeled Data for Semi-Supervised Continual Learning

Liyuan Wang, Kuo Yang, Chongxuan Li, Lanqing Hong, Zhenguo Li, Jun Zhu

Continual learning usually assumes the incoming data are fully labeled, which might not be applicable in real applications. [Expand]

Tuesday Poster Session

Removing the Background by Adding the Background: Towards Background Robust Self-Supervised Video Representation Learning

Jinpeng Wang, Yuting Gao, Ke Li, Yiqi Lin, Andy J. Ma, Hao Cheng, Pai Peng, Feiyue Huang, Rongrong Ji, Xing Sun

Self-supervised learning has shown great potential in improving the video representation ability of deep neural networks by getting supervision from the data itself. [Expand]

2.25

Thursday Poster Session

T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval

Xiaohan Wang, Linchao Zhu, Yi Yang

Text-video retrieval is a challenging task that aims to search relevant video contents based on natural language descriptions. [Expand]

2.25

Tuesday Poster Session

Few-Shot Classification With Feature Map Reconstruction Networks

Davis Wertheimer, Luming Tang, Bharath Hariharan

In this paper we reformulate few-shot classification as a reconstruction problem in latent space. [Expand]

Wednesday Poster Session

Zhixiang Chen, Tae-Kyun Kim

3D morphable models are widely used for the shape representation of an object class in computer vision and graphics applications. [Expand]

Thursday Poster Session

I3Net: Implicit Instance-Invariant Network for Adapting One-Stage Object Detectors

Chaoqi Chen, Zebiao Zheng, Yue Huang, Xinghao Ding, Yizhou Yu

Recent works on two-stage cross-domain detection have widely explored the local feature patterns to achieve more accurate adaptation results. [Expand]

2.00

Thursday Poster Session

Semantic-Aware Knowledge Distillation for Few-Shot Class-Incremental Learning

Ali Cheraghian, Shafin Rahman, Pengfei Fang, Soumava Kumar Roy, Lars Petersson, Mehrtash Harandi

Few-shot class incremental learning (FSCIL) portrays the problem of learning new concepts gradually, where only a few examples per concept are available to the learner. [Expand]

Xiaoqing Guo, Chen Yang, Baopu Li, Yixuan Yuan

Unsupervised domain adaptation (UDA) aims to transfer the knowledge from the labeled source domain to the unlabeled target domain. [Expand]

Tuesday Poster Session

Lips Don't Lie: A Generalisable and Robust Approach To Face Forgery Detection

Alexandros Haliassos, Konstantinos Vougioukas, Stavros Petridis, Maja Pantic

Although current deep learning-based face forgery detectors achieve impressive performance in constrained scenarios, they are vulnerable to samples created by unseen manipulation methods. [Expand]

2.00

Tuesday Poster Session

Neural Cellular Automata Manifold

Alejandro Hernandez, Armand Vilalta, Francesc Moreno-Noguer

Very recently, the Neural Cellular Automata (NCA) has been proposed to simulate the morphogenesis process with deep networks. [Expand]

2.00

Wednesday Poster Session

Visualizing Adapted Knowledge in Domain Transfer

Yunzhong Hou, Liang Zheng

A source model trained on source data and a target model learned through unsupervised domain adaptation (UDA) usually encode different knowledge. [Expand]

2.00

Thursday Poster Session

Multi-Target Domain Adaptation With Collaborative Consistency Learning

Takashi Isobe, Xu Jia, Shuaijun Chen, Jianzhong He, Yongjie Shi, Jianzhuang Liu, Huchuan Lu, Shengjin Wang

Recently unsupervised domain adaptation for the semantic segmentation task has become more and more popular due to the high-cost of pixel-level annotation on real-world images. [Expand]

2.00

Wednesday Poster Session

Luke Melas-Kyriazi, Arjun K. Manrai

Unsupervised domain adaptation is a promising technique for semantic segmentation and other computer vision tasks for which large-scale data annotation is costly and time-consuming. [Expand]

Thursday Poster Session

Over-the-Air Adversarial Flickering Attacks Against Video Recognition Networks

Roi Pony, Itay Naeh, Shie Mannor

Deep neural networks for video classification, just like image classification networks, may be subjected to adversarial manipulation. [Expand]

2.00

Monday Poster Session

Invisible Perturbations: Physical Adversarial Examples Exploiting the Rolling Shutter Effect

Tuesday Poster Session

Partition-Guided GANs

Mohammadreza Armandpour, Ali Sadeghian, Chunyuan Li, Mingyuan Zhou

Despite the success of Generative Adversarial Networks (GANs), their training suffers from several well-known problems, including mode collapse and difficulties learning a disconnected set of manifolds. [Expand]

Tuesday Poster Session

ReAgent: Point Cloud Registration Using Imitation and Reinforcement Learning

Dominik Bauer, Timothy Patten, Markus Vincze

Point cloud registration is a common step in many 3D computer vision tasks such as object pose estimation, where a 3D model is aligned to an observation. [Expand]

Thursday Poster Session

FBI-Denoiser: Fast Blind Image Denoiser for Poisson-Gaussian Noise

Jaeseok Byun, Sungmin Cha, Taesup Moon

We consider the challenging blind denoising problem for Poisson-Gaussian noise, in which no additional information about clean images or noise level parameters is available. [Expand]

Tuesday Poster Session

Human-Like Controllable Image Captioning With Verb-Specific Semantic Roles

Long Chen, Zhihong Jiang, Jun Xiao, Wei Liu

Controllable Image Captioning (CIC) -- generating image descriptions following designated control signals -- has received unprecedented attention over the last few years. [Expand]

1.75

Friday Poster Session

3D AffordanceNet: A Benchmark for Visual Object Affordance Understanding

Shengheng Deng, Xun Xu, Chaozheng Wu, Ke Chen, Kui Jia

The ability to understand the ways to interact with objects from visual cues, a.k.a. [Expand]

Monday Poster Session

Unbiased Mean Teacher for Cross-Domain Object Detection

Learning Optical Flow From a Few Matches

Shihao Jiang, Yao Lu, Hongdong Li, Richard Hartley

State-of-the-art neural network models for optical flow estimation require a dense correlation volume at high resolutions for representing per-pixel displacement. [Expand]

Friday Poster Session

Multi-Shot Temporal Event Localization: A Benchmark

Xiaolong Liu, Yao Hu, Song Bai, Fei Ding, Xiang Bai, Philip H. S. Torr

Current developments in temporal event or action localization usually target actions captured by a single camera. [Expand]

Thursday Poster Session

Retinex-Inspired Unrolling With Cooperative Prior Architecture Search for Low-Light Image Enhancement

Risheng Liu, Long Ma, Jiaao Zhang, Xin Fan, Zhongxuan Luo

Low-light image enhancement plays very important roles in low-level vision areas. [Expand]

1.75

Wednesday Poster Session

Cross-Domain Adaptive Clustering for Semi-Supervised Domain Adaptation

Jichang Li, Guanbin Li, Yemin Shi, Yizhou Yu

In semi-supervised domain adaptation, a few labeled samples per class in the target domain guide features of the remaining target samples to aggregate around them. [Expand]

Monday Poster Session

Temporal Action Segmentation From Timestamp Supervision

Zhe Li, Yazan Abu Farha, Jurgen Gall

Temporal action segmentation approaches have been very successful recently. [Expand]

Wednesday Poster Session

Variational Relational Point Completion Network

Distilling Object Detectors via Decoupled Features

Jianyuan Guo, Kai Han, Yunhe Wang, Han Wu, Xinghao Chen, Chunjing Xu, Chang Xu

Knowledge distillation is a widely used paradigm for inheriting information from a complicated teacher network to a compact student network and maintaining the strong performance. [Expand]

1.50

Monday Poster Session

Learning by Aligning Videos in Time

Sanjay Haresh, Sateesh Kumar, Huseyin Coskun, Shahram N. Syed, Andrey Konin, Zeeshan Zia, Quoc-Huy Tran

We present a self-supervised approach for learning video representations using temporal video alignment as a pretext task, while exploiting both frame-level and video-level information. [Expand]

1.50

Tuesday Poster Session

DiNTS: Differentiable Neural Network Topology Search for 3D Medical Image Segmentation

Yufan He, Dong Yang, Holger Roth, Can Zhao, Daguang Xu

Recently, neural architecture search (NAS) has been applied to automatically search high-performance networks for medical image segmentation. [Expand]

1.50

Tuesday Poster Session

Deep Dual Consecutive Network for Human Pose Estimation

Zhenguang Liu, Haoming Chen, Runyang Feng, Shuang Wu, Shouling Ji, Bailin Yang, Xun Wang

Multi-frame human pose estimation in complicated situations is challenging. [Expand]

Monday Poster Session

Invertible Denoising Network: A Light Solution for Real Noise Removal

Yang Liu, Zhenyue Qin, Saeed Anwar, Pan Ji, Dongwoo Kim, Sabrina Caldwell, Tom Gedeon

Invertible networks have various benefits for image denoising since they are lightweight, information-lossless, and memory-saving during back-propagation. [Expand]

Thursday Poster Session

The Blessings of Unlabeled Background in Untrimmed Videos

Yuan Liu, Jingyuan Chen, Zhenfang Chen, Bing Deng, Jianqiang Huang, Hanwang Zhang

Weakly-supervised Temporal Action Localization (WTAL) aims to detect the action segments with only video-level action labels in training. [Expand]

1.50

Tuesday Poster Session

SurFree: A Fast Surrogate-Free Black-Box Attack

Thibault Maho, Teddy Furon, Erwan Le Merrer

Machine learning classifiers are critically prone to evasion attacks. [Expand]

1.50

Wednesday Poster Session

Coarse-To-Fine Domain Adaptive Semantic Segmentation With Photometric Alignment and Category-Center Regularization

Haoyu Ma, Xiangru Lin, Zifeng Wu, Yizhou Yu

Unsupervised domain adaptation (UDA) in semantic segmentation is a fundamental yet promising task relieving the need for laborious annotation works. [Expand]

1.50

Tuesday Poster Session

Convolutional Hough Matching Networks

Juhong Min, Minsu Cho

Despite advances in feature representation, leveraging geometric relations is crucial for establishing reliable visual correspondences under large variations of images. [Expand]

1.50

Tuesday Poster Session

Unveiling the Potential of Structure Preserving for Weakly Supervised Object Localization

Xingjia Pan, Yingguo Gao, Zhiwen Lin, Fan Tang, Weiming Dong, Haolei Yuan, Feiyue Huang, Changsheng Xu

Weakly supervised object localization (WSOL) remains an open problem due to the deficiency of finding object extent information using a classification network. [Expand]

Thursday Poster Session

Learning Dynamic Network Using a Reuse Gate Function in Semi-Supervised Video Object Segmentation

Hyojin Park, Jayeon Yoo, Seohyeong Jeong, Ganesh Venkatesh, Nojun Kwak

Current state-of-the-art approaches for Semi-supervised Video Object Segmentation (Semi-VOS) propagate information from previous frames to generate a segmentation mask for the current frame. [Expand]

Wednesday Poster Session

HoHoNet: 360 Indoor Holistic Understanding With Latent Horizontal Features

Cheng Sun, Min Sun, Hwann-Tzong Chen

We present HoHoNet, a versatile and efficient framework for holistic understanding of an indoor 360-degree panorama using a Latent Horizontal Feature (LHFeat). [Expand]

1.50

Monday Poster Session

Layerwise Optimization by Gradient Decomposition for Continual Learning

Shixiang Tang, Dapeng Chen, Jinguo Zhu, Shijie Yu, Wanli Ouyang

Deep neural networks achieve state-of-the-art and sometimes super-human performance across a variety of domains. [Expand]

1.50

Wednesday Poster Session

Consensus Maximisation Using Influences of Monotone Boolean Functions

Ruwan Tennakoon, David Suter, Erchuan Zhang, Tat-Jun Chin, Alireza Bab-Hadiashar

Consensus maximisation (MaxCon), widely used for robust fitting in computer vision, aims to find the largest subset of data that fits the model within some tolerance level. [Expand]

Tuesday Poster Session

Found a Reason for me? Weakly-supervised Grounded Visual Question Answering using Capsules

Aisha Urooj, Hilde Kuehne, Kevin Duarte, Chuang Gan, Niels Lobo, Mubarak Shah

The problem of grounding VQA tasks has seen an increased attention in the research community recently, with most attempts usually focusing on solving this task by using pretrained object detectors. [Expand]

Wednesday Poster Session

Efficient Feature Transformations for Discriminative and Generative Continual Learning

Vinay Kumar Verma, Kevin J Liang, Nikhil Mehta, Piyush Rai, Lawrence Carin

As neural networks are increasingly being applied to real-world applications, mechanisms to address distributional shift and sequential task learning without forgetting are critical. [Expand]

Thursday Poster Session

PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds

Yi Wei, Ziyi Wang, Yongming Rao, Jiwen Lu, Jie Zhou

In this paper, we propose a Point-Voxel Recurrent All-Pairs Field Transforms (PV-RAFT) method to estimate scene flow from point clouds. [Expand]

Tuesday Poster Session

Seeking the Shape of Sound: An Adaptive Framework for Learning Voice-Face Association

Peisong Wen, Qianqian Xu, Yangbangyan Jiang, Zhiyong Yang, Yuan He, Qingming Huang

Nowadays, we have witnessed the early progress on learning the association between voice and face automatically, which brings a new wave of studies to the computer vision community. [Expand]

1.50

Friday Poster Session

Rethinking Class Relations: Absolute-Relative Supervised and Unsupervised Few-Shot Learning

Hongguang Zhang, Piotr Koniusz, Songlei Jian, Hongdong Li, Philip H. S. Torr

The majority of existing few-shot learning methods describe image relations with binary labels. [Expand]

Wednesday Poster Session

Variational Pedestrian Detection

Yuang Zhang, Huanyu He, Jianguo Li, Yuxi Li, John See, Weiyao Lin

Pedestrian detection in a crowd is a challenging task due to a high number of mutually-occluding human instances, which brings ambiguity and optimization difficulties to the current IoU-based ground truth assignment procedure in classical object detection methods. [Expand]

Thursday Poster Session

Camera Pose Matters: Improving Depth Prediction by Mitigating Pose Distribution Bias

Yunhan Zhao, Shu Kong, Charless Fowlkes

Monocular depth predictors are typically trained on large-scale training sets which are naturally biased w.r.t. the distribution of camera poses. [Expand]

Friday Poster Session

Ziqian Bai, Zhaopeng Cui, Xiaoming Liu, Ping Tan

This paper presents a method for riggable 3D face reconstruction from monocular images, which jointly estimates a personalized face rig and per-image parameters including expressions, poses, and illuminations. [Expand]

Tuesday Poster Session

View Generalization for Single Image Textured 3D Models

Anand Bhattad, Aysegul Dundar, Guilin Liu, Andrew Tao, Bryan Catanzaro

Humans can easily infer the underlying 3D geometry and texture of an object only from a single 2D image. [Expand]

Yunfeng Diao, Tianjia Shao, Yong-Liang Yang, Kun Zhou, He Wang

Skeletal motion plays a vital role in human activity recognition as either an independent data source or a complement. [Expand]

1.25

Wednesday Poster Session

Adversarial Laser Beam: Effective Physical-World Attack to DNNs in a Blink

Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-Localization in Large Scenes From Body-Mounted Sensors

Vladimir Guzov, Aymen Mir, Torsten Sattler, Gerard Pons-Moll

We introduce (HPS) Human POSEitioning System, a method to recover the full 3D pose of a human registered with a 3D scan of the surrounding environment using wearable sensors. [Expand]

1.25

Tuesday Poster Session

Heterogeneous Grid Convolution for Adaptive, Efficient, and Controllable Computation

Ryuhei Hamaguchi, Yasutaka Furukawa, Masaki Onishi, Ken Sakurada

This paper proposes a novel heterogeneous grid convolution that builds a graph-based image representation by exploiting heterogeneity in the image content, enabling adaptive, efficient, and controllable computations in a convolutional architecture. [Expand]

Thursday Poster Session

ChallenCap: Monocular 3D Capture of Challenging Human Performances Using Multi-Modal References

Yannan He, Anqi Pang, Xin Chen, Han Liang, Minye Wu, Yuexin Ma, Lan Xu

Capturing challenging human motions is critical for numerous applications, but it suffers from complex motion patterns and severe self-occlusion under the monocular setting. [Expand]

1.25

Thursday Poster Session

Depth Completion With Twin Surface Extrapolation at Occlusion Boundaries

Saif Imran, Xiaoming Liu, Daniel Morris

Depth completion starts from a sparse set of known depth values and estimates the unknown depths for the remaining image pixels. [Expand]

1.25

Monday Poster Session

Memory-Guided Unsupervised Image-to-Image Translation

Somi Jeong, Youngjung Kim, Eungbean Lee, Kwanghoon Sohn

We present a novel unsupervised framework for instance-level image-to-image translation. [Expand]

Tuesday Poster Session

Locate Then Segment: A Strong Pipeline for Referring Image Segmentation

Ya Jing, Tao Kong, Wei Wang, Liang Wang, Lei Li, Tieniu Tan

Referring image segmentation aims to segment the objects referred by a natural language expression. [Expand]

1.25

Wednesday Poster Session

Hierarchical Lovasz Embeddings for Proposal-Free Panoptic Segmentation

Tommi Kerola, Jie Li, Atsushi Kanehira, Yasunori Kudo, Alexis Vallet, Adrien Gaidon

Panoptic segmentation brings together two separate tasks: instance and semantic segmentation. [Expand]

Thursday Poster Session

IronMask: Modular Architecture for Protecting Deep Face Template

Sunpill Kim, Yunseong Jeong, Jinsu Kim, Jungkon Kim, Hyung Tae Lee, Jae Hong Seo

Convolutional neural networks have made remarkable progress in the face recognition field. [Expand]

1.25

Friday Poster Session

Interpretable Social Anchors for Human Trajectory Forecasting in Crowds

Parth Kothari, Brian Sifringer, Alexandre Alahi

Human trajectory forecasting in crowds, at its core, is a sequence prediction problem with specific challenges of capturing inter-sequence dependencies (social interactions) and consequently predicting socially-compliant multimodal distributions. [Expand]

Thursday Poster Session

BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation

Jungbeom Lee, Jihun Yi, Chaehun Shin, Sungroh Yoon

Weakly supervised segmentation methods using bounding box annotations focus on obtaining a pixel-level mask from each box containing an object. [Expand]

1.25

Monday Poster Session

Looking Into Your Speech: Learning Cross-Modal Affinity for Audio-Visual Speech Separation

Jiyoung Lee, Soo-Whan Chung, Sunok Kim, Hong-Goo Kang, Kwanghoon Sohn

In this paper, we address the problem of separating individual speech signals from videos using audio-visual neural processing. [Expand]

1.25

Monday Poster Session

Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain

Honggu Liu, Xiaodan Li, Wenbo Zhou, Yuefeng Chen, Yuan He, Hui Xue, Weiming Zhang, Nenghai Yu

The remarkable success in face forgery techniques has received considerable attention in computer vision due to security concerns. [Expand]

Troubleshooting Blind Image Quality Models in the Wild

Zhihua Wang, Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma

Recently, the group maximum differentiation competition (gMAD) has been used to improve blind image quality assessment (BIQA) models, with the help of full-reference metrics. [Expand]

1.25

Friday Poster Session

ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search

Lumin Xu, Yingda Guan, Sheng Jin, Wentao Liu, Chen Qian, Ping Luo, Wanli Ouyang, Xiaogang Wang

Human pose estimation has achieved significant progress in recent years. [Expand]

Friday Poster Session

CondenseNet V2: Sparse Feature Reactivation for Deep Networks

Le Yang, Haojun Jiang, Ruojin Cai, Yulin Wang, Shiji Song, Gao Huang, Qi Tian

Reusing features in deep networks through dense connectivity is an effective way to achieve high computational efficiency. [Expand]

1.25

Tuesday Poster Session

FP-NAS: Fast Probabilistic Neural Architecture Search

Zhicheng Yan, Xiaoliang Dai, Peizhao Zhang, Yuandong Tian, Bichen Wu, Matt Feiszli

Differential Neural Architecture Search (NAS) requires all layer choices to be held in memory simultaneously; this limits the size of both search space and final architecture. [Expand]

Thursday Poster Session

DER: Dynamically Expandable Representation for Class Incremental Learning

Shipeng Yan, Jiangwei Xie, Xuming He

We address the problem of class incremental learning, which is a core step towards achieving adaptive vision intelligence. [Expand]

1.25

Tuesday Poster Session

Multi-Label Activity Recognition Using Activity-Specific Features and Activity Correlations

Yanyi Zhang, Xinyu Li, Ivan Marsic

Multi-label activity recognition is designed for recognizing multiple activities that are performed simultaneously or sequentially in each video. [Expand]

Thursday Poster Session

Weakly Supervised Video Salient Object Detection

Wangbo Zhao, Jing Zhang, Long Li, Nick Barnes, Nian Liu, Junwei Han

Significant performance improvement has been achieved for fully-supervised video salient object detection with the pixel-wise labeled training datasets, which are time-consuming and expensive to obtain. [Expand]

Friday Poster Session

Simpler Certified Radius Maximization by Propagating Covariances

Xingjian Zhen, Rudrasis Chakraborty, Vikas Singh

One strategy for adversarially training a robust model is to maximize its certified radius -- the neighborhood around a given training sample for which the model's prediction remains unchanged. [Expand]

Wednesday Poster Session

Progressive Temporal Feature Alignment Network for Video Inpainting

Xueyan Zou, Linjie Yang, Ding Liu, Yong Jae Lee

Video inpainting aims to fill spatio-temporal "corrupted" regions with plausible content. [Expand]

Friday Poster Session

What's in the Image? Explorable Decoding of Compressed Images

Yuval Bahat, Tomer Michaeli

The ever-growing amounts of visual contents captured on a daily basis necessitate the use of lossy compression methods in order to save storage space and transmission bandwidth. [Expand]

Tuesday Poster Session

Behavior-Driven Synthesis of Human Dynamics

Andreas Blattmann, Timo Milbich, Michael Dorkenwald, Bjorn Ommer

Generating and representing human behavior are of major importance for various computer vision applications. [Expand]

1.00

Thursday Poster Session

On Focal Loss for Class-Posterior Probability Estimation: A Theoretical Perspective

Nontawat Charoenphakdee, Jayakorn Vongkulbhisal, Nuttapong Chairatanakul, Masashi Sugiyama

The focal loss has demonstrated its effectiveness in many real-world applications such as object detection and image classification, but its theoretical understanding has been limited so far. [Expand]

1.00

Tuesday Poster Session

Wide-Baseline Relative Camera Pose Estimation With Directional Learning

Kefan Chen, Noah Snavely, Ameesh Makadia

Modern deep learning techniques that regress the relative camera pose between two images have difficulty dealing with challenging scenarios, such as large camera motions resulting in occlusions and significant changes in perspective that leave little overlap between images. [Expand]

1.00

Tuesday Poster Session

A Hyperbolic-to-Hyperbolic Graph Convolutional Network

Jindou Dai, Yuwei Wu, Zhi Gao, Yunde Jia

Hyperbolic graph convolutional networks (GCNs) demonstrate powerful representation ability to model graphs with hierarchical structure. [Expand]

1.00

Monday Poster Session

Square Root Bundle Adjustment for Large-Scale Reconstruction

Nikolaus Demmel, Christiane Sommer, Daniel Cremers, Vladyslav Usenko

We propose a new formulation for the bundle adjustment problem which relies on nullspace marginalization of landmark variables by QR decomposition. [Expand]

Thursday Poster Session

StickyPillars: Robust and Efficient Feature Matching on Point Clouds Using Graph Neural Networks

Kai Fischer, Martin Simon, Florian Olsner, Stefan Milz, Horst-Michael Gross, Patrick Mader

Robust point cloud registration in real-time is an important prerequisite for many mapping and localization algorithms. [Expand]

1.00

Monday Poster Session

Unsupervised Pre-Training for Person Re-Identification

Dengpan Fu, Dongdong Chen, Jianmin Bao, Hao Yang, Lu Yuan, Lei Zhang, Houqiang Li, Dong Chen

In this paper, we present a large scale unlabeled person re-identification (Re-ID) dataset "LUPerson" and make the first attempt of performing unsupervised pre-training for improving the generalization ability of the learned person Re-ID feature representation. [Expand]

Thursday Poster Session

Privacy-Preserving Collaborative Learning With Automatic Transformation Search

Wei Gao, Shangwei Guo, Tianwei Zhang, Han Qiu, Yonggang Wen, Yang Liu

Collaborative learning has gained great popularity due to its benefit of data privacy protection: participants can jointly train a Deep Learning model without sharing their training sets. [Expand]

Monday Poster Session

Cluster, Split, Fuse, and Update: Meta-Learning for Open Compound Domain Adaptive Semantic Segmentation

Rui Gong, Yuhua Chen, Danda Pani Paudel, Yawei Li, Ajad Chhatkuli, Wen Li, Dengxin Dai, Luc Van Gool

Open compound domain adaptation (OCDA) is a domain adaptation setting, where target domain is modeled as a compound of multiple unknown homogeneous domains, which brings the advantage of improved generalization to unseen domains. [Expand]

1.00

Wednesday Poster Session

Li Hu, Peng Zhang, Bang Zhang, Pan Pan, Yinghui Xu, Rong Jin

This paper studies the problem of semi-supervised video object segmentation (VOS). [Expand]

1.00

Tuesday Poster Session

EffiScene: Efficient Per-Pixel Rigidity Inference for Unsupervised Joint Learning of Optical Flow, Depth, Camera Pose and Motion Segmentation

Yang Jiao, Trac D. Tran, Guangming Shi

This paper addresses the challenging unsupervised scene flow estimation problem by jointly learning four low-level vision sub-tasks: optical flow F, stereo-depth D, camera pose P and motion segmentation S. [Expand]

Tuesday Poster Session

Embedding Transfer With Label Relaxation for Improved Metric Learning

Sungyeon Kim, Dongwon Kim, Minsu Cho, Suha Kwak

This paper presents a novel method for embedding transfer, a task of transferring knowledge of a learned embedding model to another. [Expand]

Tuesday Poster Session

Improving Accuracy of Binary Neural Networks Using Unbalanced Activation Distribution

Hyungjun Kim, Jihoon Park, Changhun Lee, Jae-Joon Kim

Binarization of neural network models is considered as one of the promising methods to deploy deep neural network models on resource-constrained environments such as mobile devices. [Expand]

1.00

Wednesday Poster Session

Haiyang Mei, Bo Dong, Wen Dong, Pieter Peers, Xin Yang, Qiang Zhang, Xiaopeng Wei

We present a novel mirror segmentation method that leverages depth estimates from ToF-based cameras as an additional cue to disambiguate challenging cases where the contrast or relation in RGB colors between the mirror reflection and the surrounding scene is subtle. [Expand]

1.00

Tuesday Poster Session

GATSBI: Generative Agent-Centric Spatio-Temporal Object Interaction

Cheol-Hui Min, Jinseok Bae, Junho Lee, Young Min Kim

We present GATSBI, a generative model that can transform a sequence of raw observations into a structured latent representation that fully captures the spatio-temporal context of the agent's actions. [Expand]

Tuesday Poster Session

Background Splitting: Finding Rare Classes in a Sea of Background

Ravi Teja Mullapudi, Fait Poms, William R. Mark, Deva Ramanan, Kayvon Fatahalian

We focus on the problem of training deep image classification models for a small number of extremely rare categories. [Expand]

Wednesday Poster Session

LayoutGMN: Neural Graph Matching for Structural Layout Similarity

Akshay Gadi Patil, Manyi Li, Matthew Fisher, Manolis Savva, Hao Zhang

We present a deep neural network to predict structural similarity between 2D layouts by leveraging Graph Matching Networks (GMN). [Expand]

1.00

Wednesday Poster Session

Robust Multimodal Vehicle Detection in Foggy Weather Using Complementary Lidar and Radar Signals

Kun Qian, Shilin Zhu, Xinyu Zhang, Li Erran Li

Vehicle detection with visual sensors like lidar and camera is one of the critical functions enabling autonomous driving. [Expand]

1.00

Monday Poster Session

Every Annotation Counts: Multi-Label Deep Supervision for Medical Image Segmentation

Simon Reiss, Constantin Seibold, Alexander Freytag, Erik Rodner, Rainer Stiefelhagen

Pixel-wise segmentation is one of the most data and annotation hungry tasks in our field. [Expand]

1.00

Wednesday Poster Session

DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation

Xing Shen, Jirui Yang, Chunbo Wei, Bing Deng, Jianqiang Huang, Xian-Sheng Hua, Xiaoliang Cheng, Kewei Liang

Binary grid mask representation is broadly used in instance segmentation. [Expand]

1.00

Wednesday Poster Session

Yunrui Yu, Xitong Gao, Cheng-Zhong Xu

Deep convolutional neural networks are susceptible to adversarial attacks. [Expand]

1.00

Tuesday Poster Session

CorrNet3D: Unsupervised End-to-End Learning of Dense Correspondence for 3D Point Clouds

Yiming Zeng, Yue Qian, Zhiyu Zhu, Junhui Hou, Hui Yuan, Ying He

Motivated by the intuition that one can transform two aligned point clouds to each other more easily and meaningfully than a misaligned pair, we propose CorrNet3D, the first unsupervised and end-to-end deep learning-based framework, to drive the learning of dense correspondence between 3D shapes by means of deformation-like reconstruction to overcome the need for annotated data. [Expand]

Tuesday Poster Session

Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution

Chi Zhang, Baoxiong Jia, Song-Chun Zhu, Yixin Zhu

Spatial-temporal reasoning is a challenging task in Artificial Intelligence (AI) due to its demanding but unique nature: a theoretic requirement on representing and reasoning based on spatial-temporal knowledge in mind, and an applied requirement on a high-level cognitive system capable of navigating and acting in space and time. [Expand]

1.00

Wednesday Poster Session

ACRE: Abstract Causal REasoning Beyond Covariation

Chi Zhang, Baoxiong Jia, Mark Edmonds, Song-Chun Zhu, Yixin Zhu

Causal induction, i.e., identifying unobservable mechanisms that lead to the observable relations among variables, has played a pivotal role in modern scientific discovery, especially in scenarios with only sparse and limited data. [Expand]

1.00

Wednesday Poster Session

Body Meshes as Points

Jianfeng Zhang, Dongdong Yu, Jun Hao Liew, Xuecheng Nie, Jiashi Feng

We consider the challenging multi-person 3D body mesh estimation task in this work. [Expand]

1.00

Monday Poster Session

EDNet: Efficient Disparity Estimation With Cost Volume Combination and Attention-Based Spatial Residual

Songyan Zhang, Zhicheng Wang, Qiang Wang, Jinshuo Zhang, Gang Wei, Xiaowen Chu

Existing state-of-the-art disparity estimation works mostly leverage the 4D concatenation volume and construct a very deep 3D convolution neural network (CNN) for disparity regression, which is inefficient due to the high memory consumption and slow inference speed. [Expand]

UnrealPerson: An Adaptive Pipeline Towards Costless Person Re-Identification

Tianyu Zhang, Lingxi Xie, Longhui Wei, Zijie Zhuang, Yongfei Zhang, Bo Li, Qi Tian

The main difficulty of person re-identification (ReID) lies in collecting annotated data and transferring the model across different domains. [Expand]

1.00

Thursday Poster Session

Sign-Agnostic Implicit Learning of Surface Self-Similarities for Shape Modeling and Reconstruction From Raw Point Clouds

Wenbin Zhao, Jiabao Lei, Yuxin Wen, Jianguo Zhang, Kui Jia

Shape modeling and reconstruction from raw point clouds of objects stand as a fundamental challenge in vision and graphics research. [Expand]

1.00

Wednesday Poster Session

Antonio Alliegro, Diego Valsesia, Giulia Fracastoro, Enrico Magli, Tatiana Tommasi

In this paper, we present a deep learning model that exploits the power of self-supervision to perform 3D point cloud completion, estimating the missing part and a context region around it. [Expand]

Tuesday Poster Session

Dogfight: Detecting Drones From Drones Videos

Muhammad Waseem Ashraf, Waqas Sultani, Mubarak Shah

As airborne vehicles are becoming more autonomous and ubiquitous, it has become vital to develop the capability to detect the objects in their surroundings. [Expand]

Tuesday Poster Session

What if We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels

Jeonghun Baek, Yusuke Matsui, Kiyoharu Aizawa

Scene text recognition (STR) task has a common practice: All state-of-the-art STR models are trained on large synthetic data. [Expand]

Tuesday Poster Session

Multi-View 3D Reconstruction of a Texture-Less Smooth Surface of Unknown Generic Reflectance

Ziang Cheng, Hongdong Li, Yuta Asano, Yinqiang Zheng, Imari Sato

Recovering the 3D geometry of a purely texture-less object with generally unknown surface reflectance (e.g. [Expand]

Friday Poster Session

Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration

Xingyu Chen, Yufeng Liu, Chongyang Ma, Jianlong Chang, Huayan Wang, Tian Chen, Xiaoyan Guo, Pengfei Wan, Wen Zheng

Recent years have witnessed significant progress in 3D hand mesh recovery. [Expand]

Thursday Poster Session

Semi-Supervised Semantic Segmentation With Cross Pseudo Supervision

Xiaokang Chen, Yuhui Yuan, Gang Zeng, Jingdong Wang

In this paper, we study the semi-supervised semantic segmentation problem via exploring both labeled data and extra unlabeled data. [Expand]

Monday Poster Session

PiCIE: Unsupervised Semantic Segmentation Using Invariance and Equivariance in Clustering

Jang Hyun Cho, Utkarsh Mall, Kavita Bala, Bharath Hariharan

We present a new framework for semantic segmentation without annotations via clustering. [Expand]

Friday Poster Session

Cross-Domain Gradient Discrepancy Minimization for Unsupervised Domain Adaptation

Zhekai Du, Jingjing Li, Hongzu Su, Lei Zhu, Ke Lu

Unsupervised Domain Adaptation (UDA) aims to generalize the knowledge learned from a well-labeled source domain to an unlabeled target domain. [Expand]

Tuesday Poster Session

Siamese Natural Language Tracker: Tracking by Natural Language Descriptions With Siamese Trackers

Qi Feng, Vitaly Ablavsky, Qinxun Bai, Stan Sclaroff

We propose a novel Siamese Natural Language Tracker (SNLT), which brings the advancements in visual tracking to the tracking by natural language (NL) specification task. [Expand]

Tuesday Poster Session

OTA: Optimal Transport Assignment for Object Detection

Zheng Ge, Songtao Liu, Zeming Li, Osamu Yoshie, Jian Sun

Recent advances in label assignment in object detection mainly seek to independently define positive/negative training samples for each ground-truth (gt) object. [Expand]

Monday Poster Session

Bidirectional Projection Network for Cross Dimension Scene Understanding

Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong

2D image representations are in regular grids and can be processed efficiently, whereas 3D point clouds are unordered and scattered in 3D space. [Expand]

Thursday Poster Session

Few-Shot Open-Set Recognition by Transformation Consistency

Minki Jeong, Seokeon Choi, Changick Kim

In this paper, we attack a few-shot open-set recognition (FSOSR) problem, which is a combination of few-shot learning (FSL) and open-set recognition (OSR). [Expand]

Thursday Poster Session

Scalability vs. Utility: Do We Have To Sacrifice One for the Other in Data Importance Quantification?

Ruoxi Jia, Fan Wu, Xuehui Sun, Jiacen Xu, David Dao, Bhavya Kailkhura, Ce Zhang, Bo Li, Dawn Song

Quantifying the importance of each training point to a learning task is a fundamental problem in machine learning and the estimated importance scores have been leveraged to guide a range of data workflows such as data summarization and domain adaption. [Expand]

Wednesday Poster Session

Fast Bayesian Uncertainty Estimation and Reduction of Batch Normalized Single Image Super-Resolution Network

Aupendu Kar, Prabir Kumar Biswas

Convolutional neural network (CNN) has achieved unprecedented success in image super-resolution tasks in recent years. [Expand]

Tuesday Poster Session

Deep Implicit Moving Least-Squares Functions for 3D Reconstruction

Shi-Lin Liu, Hao-Xiang Guo, Hao Pan, Peng-Shuai Wang, Xin Tong, Yang Liu

Point set is a flexible and lightweight representation widely used for 3D deep learning. [Expand]

Monday Poster Session

PD-GAN: Probabilistic Diverse GAN for Image Inpainting

Hongyu Liu, Ziyu Wan, Wei Huang, Yibing Song, Xintong Han, Jing Liao

We propose PD-GAN, a probabilistic diverse GAN for image inpainting. [Expand]

Wednesday Poster Session

Relation-aware Instance Refinement for Weakly Supervised Visual Grounding

Yongfei Liu, Bo Wan, Lin Ma, Xuming He

Visual grounding, which aims to build a correspondence between visual objects and their language entities, plays a key role in cross-modal scene understanding. [Expand]

Tuesday Poster Session

Instance Level Affinity-Based Transfer for Unsupervised Domain Adaptation

Astuti Sharma, Tarun Kalluri, Manmohan Chandraker

Domain adaptation deals with training models using large scale labeled data from a specific source domain and then adapting the knowledge to certain target domains that have few or no labels. [Expand]

Tuesday Poster Session

Iterative Shrinking for Referring Expression Grounding Using Deep Reinforcement Learning

Mingjie Sun, Jimin Xiao, Eng Gee Lim

In this paper, we are tackling the proposal-free referring expression grounding task, aiming at localizing the target object according to a query sentence, without relying on off-the-shelf object proposals. [Expand]

Thursday Poster Session

Delving into Data: Effectively Substitute Training for Black-box Attack

Wenxuan Wang, Bangjie Yin, Taiping Yao, Li Zhang, Yanwei Fu, Shouhong Ding, Jilin Li, Feiyue Huang, Xiangyang Xue

Deep models have shown their vulnerability when processing adversarial samples. [Expand]

Tuesday Poster Session

Exploring Sparsity in Image Super-Resolution for Efficient Inference

Longguang Wang, Xiaoyu Dong, Yingqian Wang, Xinyi Ying, Zaiping Lin, Wei An, Yulan Guo

Current CNN-based super-resolution (SR) methods process all locations equally with computational resources being uniformly assigned in space. [Expand]

Tuesday Poster Session

From Rain Generation to Rain Removal

Hong Wang, Zongsheng Yue, Qi Xie, Qian Zhao, Yefeng Zheng, Deyu Meng

For the single image rain removal (SIRR) task, the performance of deep learning (DL)-based methods is mainly affected by the designed deraining models and training datasets. [Expand]

Thursday Poster Session

Cycle4Completion: Unpaired Point Cloud Completion Using Cycle Transformation With Missing Region Coding

Xin Wen, Zhizhong Han, Yan-Pei Cao, Pengfei Wan, Wen Zheng, Yu-Shen Liu

In this paper, we present a novel unpaired point cloud completion network, named Cycle4Completion, to infer the complete geometries from a partial 3D object. [Expand]

Thursday Poster Session

Bilateral Grid Learning for Stereo Matching Networks

Bin Xu, Yuhua Xu, Xiaoli Yang, Wei Jia, Yulan Guo

Real-time performance of stereo matching networks is important for many applications, such as automatic driving, robot navigation and augmented reality (AR). [Expand]

Thursday Poster Session

Diversifying Sample Generation for Accurate Data-Free Quantization

Xiangguo Zhang, Haotong Qin, Yifu Ding, Ruihao Gong, Qinghua Yan, Renshuai Tao, Yuhang Li, Fengwei Yu, Xianglong Liu

Quantization has emerged as one of the most prevalent approaches to compress and accelerate neural networks. [Expand]

Friday Poster Session

Fostering Generalization in Single-View 3D Reconstruction by Learning a Hierarchy of Local and Global Shape Priors

Jan Bechtold, Maxim Tatarchenko, Volker Fischer, Thomas Brox

Single-view 3D object reconstruction has seen much progress, yet methods still struggle generalizing to novel shapes unseen during training. [Expand]

Friday Poster Session

Towards Part-Based Understanding of RGB-D Scans

Alexey Bokhovkin, Vladislav Ishimtsev, Emil Bogomolov, Denis Zorin, Alexey Artemov, Evgeny Burnaev, Angela Dai

Recent advances in 3D semantic scene understanding have shown impressive progress in 3D instance segmentation, enabling object-level reasoning about 3D scenes; however, a finer-grained understanding is required to enable interactions with objects and their functional understanding. [Expand]

Wednesday Poster Session

Fine-Grained Angular Contrastive Learning With Coarse Labels

Guy Bukchin, Eli Schwartz, Kate Saenko, Ori Shahar, Rogerio Feris, Raja Giryes, Leonid Karlinsky

Few-shot learning methods offer pre-training techniques optimized for easier later adaptation of the model to new classes (unseen during training) using one or a few examples. [Expand]

Wednesday Poster Session

Semantic Scene Completion via Integrating Instances and Scene In-the-Loop

Yingjie Cai, Xuesong Chen, Chao Zhang, Kwan-Yee Lin, Xiaogang Wang, Hongsheng Li

Semantic Scene Completion aims at reconstructing a complete 3D scene with precise voxel-wise semantics from a single-view depth or RGBD image. [Expand]

Monday Poster Session

Globally Optimal Relative Pose Estimation With Gravity Prior

Yaqing Ding, Daniel Barath, Jian Yang, Hui Kong, Zuzana Kukelova

Smartphones, tablets and camera systems used, e.g., in cars and UAVs, are typically equipped with IMUs (inertial measurement units) that can measure the gravity vector accurately. [Expand]

Monday Poster Session

Explaining Classifiers Using Adversarial Perturbations on the Perceptual Ball

Andrew Elliott, Stephen Law, Chris Russell

We present a simple regularization of adversarial perturbations based upon the perceptual loss. [Expand]

Wednesday Poster Session

Learning Goals From Failure

Dave Epstein, Carl Vondrick

We introduce a framework that predicts the goals behind observable human action in video. [Expand]

Wednesday Poster Session

Fair Feature Distillation for Visual Recognition

Sangwon Jung, Donggyu Lee, Taeeon Park, Taesup Moon

Fairness is becoming an increasingly crucial issue for computer vision, especially in the human-related decision systems. [Expand]

Thursday Poster Session

How To Exploit the Transferability of Learned Image Compression to Conventional Codecs

Jan P. Klopp, Keng-Chi Liu, Liang-Gee Chen, Shao-Yi Chien

Lossy image compression is often limited by the simplicity of the chosen loss measure. [Expand]

Friday Poster Session

Restore From Restored: Video Restoration With Pseudo Clean Video

Seunghwan Lee, Donghyeon Cho, Jiwon Kim, Tae Hyun Kim

In this study, we propose a self-supervised video denoising method called "restore-from-restored." This method fine-tunes a pre-trained network by using a pseudo clean video during the test phase. [Expand]

Tuesday Poster Session

Railroad Is Not a Train: Saliency As Pseudo-Pixel Supervision for Weakly Supervised Semantic Segmentation

Seungho Lee, Minhyun Lee, Jongwuk Lee, Hyunjung Shim

Existing studies in weakly-supervised semantic segmentation (WSSS) using image-level weak supervision have several limitations: sparse object coverage, inaccurate object boundaries, and co-occurring pixels from non-target objects. [Expand]

Tuesday Poster Session

DeepMetaHandles: Learning Deformation Meta-Handles of 3D Meshes With Biharmonic Coordinates

Minghua Liu, Minhyuk Sung, Radomir Mech, Hao Su

We propose DeepMetaHandles, a 3D conditional generative model based on mesh deformation. [Expand]

Monday Poster Session

Anchor-Constrained Viterbi for Set-Supervised Action Segmentation

Jun Li, Sinisa Todorovic

This paper is about action segmentation under weak supervision in training, where the ground truth provides only a set of actions present, but neither their temporal ordering nor when they occur in a training video. [Expand]

Wednesday Poster Session

Continuous Face Aging via Self-Estimated Residual Age Embedding

Zeqi Li, Ruowei Jiang, Parham Aarabi

Face synthesis, including face aging, in particular, has been one of the major topics that witnessed a substantial improvement in image fidelity by using generative adversarial networks (GANs). [Expand]

Thursday Poster Session

HCRF-Flow: Scene Flow From Point Clouds With Continuous High-Order CRFs and Position-Aware Flow Embedding

Ruibo Li, Guosheng Lin, Tong He, Fayao Liu, Chunhua Shen

Scene flow in 3D point clouds plays an important role in understanding dynamic environments. [Expand]

Monday Poster Session

Context Modeling in 3D Human Pose Estimation: A Unified Perspective

Xiaoxuan Ma, Jiajun Su, Chunyu Wang, Hai Ci, Yizhou Wang

Estimating 3D human pose from a single image suffers from severe ambiguity since multiple 3D joint configurations may have the same 2D projection. [Expand]

Tuesday Poster Session

Lipstick Ain't Enough: Beyond Color Matching for In-the-Wild Makeup Transfer

Thao Nguyen, Anh Tuan Tran, Minh Hoai

Makeup transfer is the task of applying on a source face the makeup style from a reference image. [Expand]

Thursday Poster Session

Lifelong Person Re-Identification via Adaptive Knowledge Accumulation

Nan Pu, Wei Chen, Yu Liu, Erwin M. Bakker, Michael S. Lew

Person ReID methods always learn through a stationary domain that is fixed by the choice of a given dataset. [Expand]

Wednesday Poster Session

PANDA: Adapting Pretrained Features for Anomaly Detection and Segmentation

Tal Reiss, Niv Cohen, Liron Bergman, Yedid Hoshen

Anomaly detection methods require high-quality features. [Expand]

Monday Poster Session

Feature Decomposition and Reconstruction Learning for Effective Facial Expression Recognition

Delian Ruan, Yan Yan, Shenqi Lai, Zhenhua Chai, Chunhua Shen, Hanzi Wang

In this paper, we propose a novel Feature Decomposition and Reconstruction Learning (FDRL) method for effective facial expression recognition. [Expand]

Wednesday Poster Session

Improved Handling of Motion Blur in Online Object Detection

Mohamed Sayed, Gabriel Brostow

We wish to detect specific categories of objects, for online vision systems that will run in the real world. [Expand]

Monday Poster Session

Learning Scene Structure Guidance via Cross-Task Knowledge Transfer for Single Depth Super-Resolution

Baoli Sun, Xinchen Ye, Baopu Li, Haojie Li, Zhihui Wang, Rui Xu

Existing color-guided depth super-resolution (DSR) approaches require paired RGB-D data as training examples where the RGB image is used as structural guidance to recover the degraded depth map due to their geometrical similarity. [Expand]

Wednesday Poster Session

AdvSim: Generating Safety-Critical Scenarios for Self-Driving Vehicles

Jingkang Wang, Ava Pun, James Tu, Sivabalan Manivasagam, Abbas Sadat, Sergio Casas, Mengye Ren, Raquel Urtasun

As self-driving systems become better, simulating scenarios where the autonomy stack may fail becomes more important. [Expand]

Wednesday Poster Session

Image Inpainting With External-Internal Learning and Monochromic Bottleneck

Tengfei Wang, Hao Ouyang, Qifeng Chen

Although recent inpainting approaches have demonstrated significant improvement with deep neural networks, they still suffer from artifacts such as blunt structures and abrupt colors when filling in the missing regions. [Expand]

Tuesday Poster Session

Multiple Object Tracking With Correlation Learning

Qiang Wang, Yun Zheng, Pan Pan, Yinghui Xu

Recent works have shown that convolutional networks have substantially improved the performance of multiple object tracking by simultaneously learning detection and appearance features. [Expand]

Tuesday Poster Session

Invertible Image Signal Processing

Yazhou Xing, Zian Qian, Qifeng Chen

Unprocessed RAW data is a highly valuable image format for image editing and computer vision. [Expand]

Tuesday Poster Session

Open-Book Video Captioning With Retrieve-Copy-Generate Network

Ziqi Zhang, Zhongang Qi, Chunfeng Yuan, Ying Shan, Bing Li, Ying Deng, Weiming Hu

In this paper, we convert traditional video captioning task into a new paradigm, i.e., Open-book Video Captioning, which generates natural language under the prompts of video-content-relevant sentences, not limited to the video itself. [Expand]

Wednesday Poster Session

MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition

Ayan Kumar Bhunia, Shuvozit Ghose, Amandeep Kumar, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song

Handwritten Text Recognition (HTR) remains a challenging problem to date, largely due to the varying writing styles that exist amongst us. [Expand]

Friday Poster Session

Monocular 3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks

Yu Cheng, Bo Wang, Bo Yang, Robby T. Tan

In monocular video 3D multi-person pose estimation, inter-person occlusion and close interactions can cause human detection to be erroneous and human-joints grouping to be unreliable. [Expand]

Wednesday Poster Session

Contrastive Neural Architecture Search With Neural Architecture Comparators

Yaofo Chen, Yong Guo, Qi Chen, Minli Li, Wei Zeng, Yaowei Wang, Mingkui Tan

One of the key steps in Neural Architecture Search (NAS) is to estimate the performance of candidate architectures. [Expand]

Wednesday Poster Session

Efficient Object Embedding for Spliced Image Retrieval

Bor-Chun Chen, Zuxuan Wu, Larry S. Davis, Ser-Nam Lim

Detecting spliced images is one of the emerging challenges in computer vision. [Expand]

Thursday Poster Session

One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking

Minghao Chen, Jianlong Fu, Haibin Ling

Despite remarkable progress achieved, most neural architecture search (NAS) methods focus on searching for one single accurate and robust architecture. [Expand]

Friday Poster Session

Robust Representation Learning With Feedback for Single Image Deraining

Chenghao Chen, Hao Li

A deraining network can be interpreted as a conditional generator that aims at removing rain streaks from image. [Expand]

Wednesday Poster Session

Scale-Aware Automatic Augmentation for Object Detection

Yukang Chen, Yanwei Li, Tao Kong, Lu Qi, Ruihang Chu, Lei Li, Jiaya Jia

We propose Scale-aware AutoAug to learn data augmentation policies for object detection. [Expand]

Wednesday Poster Session

Mask-ToF: Learning Microlens Masks for Flying Pixel Correction in Time-of-Flight Imaging

Ilya Chugunov, Seung-Hwan Baek, Qiang Fu, Wolfgang Heidrich, Felix Heide

We introduce Mask-ToF, a method to reduce flying pixels (FP) in time-of-flight (ToF) depth captures. [Expand]

Wednesday Poster Session

Diverse Branch Block: Building a Convolution as an Inception-Like Unit

Xiaohan Ding, Xiangyu Zhang, Jungong Han, Guiguang Ding

We propose a universal building block of Convolutional Neural Network (ConvNet) to improve the performance without any inference-time costs. [Expand]

Wednesday Poster Session

Deep Graph Matching Under Quadratic Constraint

Quankai Gao, Fudong Wang, Nan Xue, Jin-Gang Yu, Gui-Song Xia

Recently, deep learning based methods have demonstrated promising results on the graph matching problem, by relying on the descriptive capability of deep features extracted on graph nodes. [Expand]

Tuesday Poster Session

SSAN: Separable Self-Attention Network for Video Representation Learning

Xudong Guo, Xun Guo, Yan Lu

Self-attention has been successfully applied to video representation learning due to the effectiveness of modeling long range dependencies. [Expand]

Thursday Poster Session

Capsule Network Is Not More Robust Than Convolutional Network

Jindong Gu, Volker Tresp, Han Hu

The Capsule Network is widely believed to be more robust than Convolutional Networks. [Expand]

Thursday Poster Session

Towards Fast and Accurate Real-World Depth Super-Resolution: Benchmark Dataset and Baseline

Lingzhi He, Hongguang Zhu, Feng Li, Huihui Bai, Runmin Cong, Chunjie Zhang, Chunyu Lin, Meiqin Liu, Yao Zhao

Depth maps obtained by commercial depth sensors are always in low-resolution, making it difficult to be used in various computer vision tasks. [Expand]

Wednesday Poster Session

Transformation Driven Visual Reasoning

Xin Hong, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng

This paper defines a new visual reasoning paradigm by introducing an important factor, i.e. [Expand]

Tuesday Poster Session

Affordance Transfer Learning for Human-Object Interaction Detection

Zhi Hou, Baosheng Yu, Yu Qiao, Xiaojiang Peng, Dacheng Tao

Reasoning the human-object interactions (HOI) is essential for deeper scene understanding, while object affordances (or functionalities) are of great importance for human to discover unseen HOIs with novel objects. [Expand]

Monday Poster Session

DARCNN: Domain Adaptive Region-Based Convolutional Neural Network for Unsupervised Instance Segmentation in Biomedical Images

Joy Hsu, Wah Chiu, Serena Yeung

In the biomedical domain, there is an abundance of dense, complex data where objects of interest may be challenging to detect or constrained by limits of human knowledge. [Expand]

Monday Poster Session

FVC: A New Framework Towards Deep Video Compression in Feature Space

Zhihao Hu, Guo Lu, Dong Xu

Learning based video compression attracts increasing attention in the past few years. [Expand]

Monday Poster Session

SAIL-VOS 3D: A Synthetic Dataset and Baselines for Object Detection and 3D Mesh Reconstruction From Video Data

Yuan-Ting Hu, Jiahong Wang, Raymond A. Yeh, Alexander G. Schwing

Extracting detailed 3D information of objects from video data is an important goal for holistic scene understanding. [Expand]

Monday Poster Session

MeanShift++: Extremely Fast Mode-Seeking With Applications to Segmentation and Object Tracking

Jennifer Jang, Heinrich Jiang

MeanShift is a popular mode-seeking clustering algorithm used in a wide range of applications in machine learning. [Expand]

Tuesday Poster Session

LaPred: Lane-Aware Prediction of Multi-Modal Future Trajectories of Dynamic Agents

ByeoungDo Kim, Seong Hyeon Park, Seokhwan Lee, Elbek Khoshimjonov, Dongsuk Kum, Junsoo Kim, Jeong Soo Kim, Jun Won Choi

In this paper, we address the problem of predicting the future motion of a dynamic agent (called a target agent) given its current and past states as well as the information on its environment. [Expand]

Thursday Poster Session

SIPSA-Net: Shift-Invariant Pan Sharpening With Moving Object Alignment for Satellite Imagery

Jaehyup Lee, Soomin Seo, Munchurl Kim

Pan-sharpening is a process of merging a high-resolution (HR) panchromatic (PAN) image and its corresponding low-resolution (LR) multi-spectral (MS) image to create an HR-MS and pan-sharpened image. [Expand]

Wednesday Poster Session

Flow-Based Kernel Prior With Application to Blind Super-Resolution

Jingyun Liang, Kai Zhang, Shuhang Gu, Luc Van Gool, Radu Timofte

Kernel estimation is generally one of the key problems for blind image super-resolution (SR). [Expand]

Wednesday Poster Session

OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection

Tingting Liang, Yongtao Wang, Zhi Tang, Guosheng Hu, Haibin Ling

Recently, neural architecture search (NAS) has been exploited to design feature pyramid networks (FPNs) and achieved promising results for visual object detection. [Expand]

Wednesday Poster Session

Building Reliable Explanations of Unreliable Neural Networks: Locally Smoothing Perspective of Model Interpretation

Dohun Lim, Hyeonseok Lee, Sungchan Kim

We present a novel method for reliably explaining the predictions of neural networks. [Expand]

Tuesday Poster Session

Region-Aware Adaptive Instance Normalization for Image Harmonization

Jun Ling, Han Xue, Li Song, Rong Xie, Xiao Gu

Image composition plays a common but important role in photo editing. [Expand]

Wednesday Poster Session

Scene-Intuitive Agent for Remote Embodied Visual Grounding

Xiangru Lin, Guanbin Li, Yizhou Yu

Humans learn from life events to form intuitions towards the understanding of visual environments and languages. [Expand]

Tuesday Poster Session

From Shadow Generation To Shadow Removal

Zhihao Liu, Hui Yin, Xinyi Wu, Zhenyao Wu, Yang Mi, Song Wang

Shadow removal is a computer-vision task that aims to restore the image content in shadow regions. [Expand]

Tuesday Poster Session

Fully Convolutional Scene Graph Generation

Hengyue Liu, Ning Yan, Masood Mortazavi, Bir Bhanu

This paper presents a fully convolutional scene graph generation (FCSGG) model that detects objects and relations simultaneously. [Expand]

Thursday Poster Session

No Frame Left Behind: Full Video Action Recognition

Xin Liu, Silvia L. Pintea, Fatemeh Karimi Nejadasl, Olaf Booij, Jan C. van Gemert

Not all video frames are equally informative for recognizing an action. [Expand]

Thursday Poster Session

Towards Unified Surgical Skill Assessment

Daochang Liu, Qiyue Li, Tingting Jiang, Yizhou Wang, Rulin Miao, Fei Shan, Ziyu Li

Surgical skills have a great influence on surgical safety and patients' well-being. [Expand]

Wednesday Poster Session

Causal Hidden Markov Model for Time Series Disease Forecasting

Jing Li, Botong Wu, Xinwei Sun, Yizhou Wang

We propose a causal hidden Markov model to achieve robust prediction of irreversible disease at an early stage, which is safety-critical and vital for medical treatment in early stages. [Expand]

Thursday Poster Session

Exploring Intermediate Representation for Monocular Vehicle Pose Estimation

Shichao Li, Zengqiang Yan, Hongyang Li, Kwang-Ting Cheng

We present a new learning-based framework to recover vehicle pose in SO(3) from a single RGB image. [Expand]

Monday Poster Session

DeepI2P: Image-to-Point Cloud Registration via Deep Classification

Jiaxin Li, Gim Hee Lee

This paper presents DeepI2P: a novel approach for cross-modality registration between an image and a point cloud. [Expand]

Friday Poster Session

LiDAR R-CNN: An Efficient and Universal 3D Object Detector

Zhichao Li, Feng Wang, Naiyan Wang

LiDAR-based 3D detection in point cloud is essential in the perception system of autonomous driving. [Expand]

Wednesday Poster Session

Generalizing Face Forgery Detection With High-Frequency Features

Yuchen Luo, Yong Zhang, Junchi Yan, Wei Liu

Current face forgery detection methods achieve high accuracy under the within-database scenario where training and testing forgeries are synthesized by the same algorithm. [Expand]

Friday Poster Session

Self-Supervised Pillar Motion Learning for Autonomous Driving

Chenxu Luo, Xiaodong Yang, Alan Yuille

Autonomous driving can benefit from motion behavior comprehension when interacting with diverse traffic participants in highly dynamic environments. [Expand]

Tuesday Poster Session

Learning Semantic Person Image Generation by Region-Adaptive Normalization

Zhengyao Lv, Xiaoming Li, Xin Li, Fu Li, Tianwei Lin, Dongliang He, Wangmeng Zuo

Human pose transfer has received great attention due to its wide applications, yet is still a challenging task that is not well solved. [Expand]

Wednesday Poster Session

FCPose: Fully Convolutional Multi-Person Pose Estimation With Dynamic Instance-Aware Convolutions

Weian Mao, Zhi Tian, Xinlong Wang, Chunhua Shen

We propose a fully convolutional multi-person pose estimation framework using dynamic instance-aware convolutions, termed FCPose. [Expand]

Wednesday Poster Session

Polygonal Point Set Tracking

Gunhee Nam, Miran Heo, Seoung Wug Oh, Joon-Young Lee, Seon Joo Kim

In this paper, we propose a novel learning-based polygonal point set tracking method. [Expand]

Tuesday Poster Session

Reducing Domain Gap by Reducing Style Bias

Hyeonseob Nam, HyunJae Lee, Jongchan Park, Wonjun Yoon, Donggeun Yoo

Convolutional Neural Networks (CNNs) often fail to maintain their performance when they confront new test domains, which is known as the problem of domain shift. [Expand]

Wednesday Poster Session

House-GAN++: Generative Adversarial Layout Refinement Network Towards Intelligent Computational Agent for Professional Architects

Nelson Nauata, Sepidehsadat Hosseini, Kai-Hung Chang, Hang Chu, Chin-Yi Cheng, Yasutaka Furukawa

This paper proposes a generative adversarial layout refinement network for automated floorplan generation. [Expand]

We address the problem of unsupervised localization of key-steps and feature learning in instructional videos using both visual and language instructions. [Expand]

Xiaogang Wang, Xun Sun, Xinyu Cao, Kai Xu, Bin Zhou

Existing learning-based approaches to 3D shape segmentation usually formulate it as a semantic labeling problem, assuming that all parts of training shapes are annotated with a given set of labels. [Expand]

Wednesday Poster Session

PWCLO-Net: Deep LiDAR Odometry in 3D Point Clouds Using Hierarchical Embedding Mask Optimization

Guangming Wang, Xinrui Wu, Zhe Liu, Hesheng Wang

A novel 3D point cloud learning model for deep LiDAR odometry, named PWCLO-Net, using hierarchical embedding mask optimization is proposed in this paper. [Expand]

Friday Poster Session

Scene-Aware Generative Network for Human Motion Synthesis

Jingbo Wang, Sijie Yan, Bo Dai, Dahua Lin

We revisit human motion synthesis, a task useful in various real-world applications, in this paper. [Expand]

Thursday Poster Session

TDN: Temporal Difference Networks for Efficient Action Recognition

Limin Wang, Zhan Tong, Bin Ji, Gangshan Wu

Temporal modeling still remains challenging for action recognition in videos. [Expand]

Monday Poster Session

Training Networks in Null Space of Feature Covariance for Continual Learning

Shipeng Wang, Xiaorong Li, Jian Sun, Zongben Xu

In the setting of continual learning, a network is trained on a sequence of tasks, and suffers from catastrophic forgetting. [Expand]

Monday Poster Session

Weakly-Supervised Instance Segmentation via Class-Agnostic Learning With Salient Images

Xinggang Wang, Jiapei Feng, Bin Hu, Qi Ding, Longjin Ran, Xiaoxin Chen, Wenyu Liu

Humans have a strong class-agnostic object segmentation ability and can outline boundaries of unknown objects precisely, which motivates us to propose a box-supervised class-agnostic object segmentation (BoxCaseg) based solution for weakly-supervised instance segmentation. [Expand]

Wednesday Poster Session

Unsupervised Degradation Representation Learning for Blind Super-Resolution

Longguang Wang, Yingqian Wang, Xiaoyu Dong, Qingyu Xu, Jungang Yang, Wei An, Yulan Guo

Most existing CNN-based super-resolution (SR) methods are developed based on an assumption that the degradation is fixed and known (e.g., bicubic downsampling). [Expand]

Wednesday Poster Session

Forecasting Irreversible Disease via Progression Learning

Botong Wu, Sijie Ren, Jing Li, Xinwei Sun, Shi-Ming Li, Yizhou Wang

Forecasting Parapapillary atrophy (PPA), i.e., a symptom related to most irreversible eye diseases, provides an alarm for implementing an intervention to slow down the disease progression at early stage. [Expand]

Wednesday Poster Session

SceneGraphFusion: Incremental 3D Scene Graph Prediction From RGB-D Sequences

Shun-Cheng Wu, Johanna Wald, Keisuke Tateno, Nassir Navab, Federico Tombari

Scene graphs are a compact and explicit representation successfully used in a variety of 2D scene understanding tasks. [Expand]

Wednesday Poster Session

A Dual Iterative Refinement Method for Non-Rigid Shape Matching

Rui Xiang, Rongjie Lai, Hongkai Zhao

In this work, a robust and efficient dual iterative refinement (DIR) method is proposed for dense correspondence between two nearly isometric shapes. [Expand]

Friday Poster Session

Deep Denoising of Flash and No-Flash Pairs for Photography in Low-Light Environments

Zhihao Xia, Michael Gharbi, Federico Perazzi, Kalyan Sunkavalli, Ayan Chakrabarti

We introduce a neural network-based method to denoise pairs of images taken in quick succession in low-light environments, with and without a flash. [Expand]

Monday Poster Session

DG-Font: Deformable Generative Networks for Unsupervised Font Generation

Yangchen Xie, Xinyuan Chen, Li Sun, Yue Lu

Font generation is a challenging problem especially for some writing systems that consist of a large number of characters and has attracted a lot of attention in recent years. [Expand]

Tuesday Poster Session

Graph Stacked Hourglass Networks for 3D Human Pose Estimation

Tianhan Xu, Wataru Takano

In this paper, we propose a novel graph convolutional network architecture, Graph Stacked Hourglass Networks, for 2D-to-3D human pose estimation tasks. [Expand]

Friday Poster Session

Layout-Guided Novel View Synthesis From a Single Indoor Panorama

Jiale Xu, Jia Zheng, Yanyu Xu, Rui Tang, Shenghua Gao

Existing view synthesis methods mainly focus on the perspective images and have shown promising results. [Expand]

Friday Poster Session

Learning Dynamic Alignment via Meta-Filter for Few-Shot Learning

Chengming Xu, Yanwei Fu, Chen Liu, Chengjie Wang, Jilin Li, Feiyue Huang, Li Zhang, Xiangyang Xue

Few-shot learning (FSL), which aims to recognise new classes by adapting the learned knowledge with extremely limited few-shot (support) examples, remains an important open problem in computer vision. [Expand]

Tuesday Poster Session

Linear Semantics in Generative Adversarial Networks

Jianjin Xu, Changxi Zheng

Generative Adversarial Networks (GANs) are able to generate high-quality images, but it remains difficult to explicitly specify the semantics of synthesized images. [Expand]

Wednesday Poster Session

Temporal Modulation Network for Controllable Space-Time Video Super-Resolution

Gang Xu, Jun Xu, Zhen Li, Liang Wang, Xing Sun, Ming-Ming Cheng

Space-time video super-resolution (STVSR) aims to increase the spatial and temporal resolutions of low-resolution and low-frame-rate videos. [Expand]

Tuesday Poster Session

3D-MAN: 3D Multi-Frame Attention Network for Object Detection

Zetong Yang, Yin Zhou, Zhifeng Chen, Jiquan Ngiam

3D object detection is an important module in autonomous driving and robotics. [Expand]

Monday Poster Session

KSM: Fast Multiple Task Adaption via Kernel-Wise Soft Mask Learning

Li Yang, Zhezhi He, Junshan Zhang, Deliang Fan

Deep Neural Networks (DNN) could forget the knowledge about earlier tasks when learning new tasks, and this is known as catastrophic forgetting. [Expand]

Thursday Poster Session

NetAdaptV2: Efficient Neural Architecture Search With Fast Super-Network Training and Architecture Optimization

Tien-Ju Yang, Yi-Lun Liao, Vivienne Sze

Neural architecture search (NAS) typically consists of three main steps: training a super-network, training and evaluating sampled deep neural networks (DNNs), and training the discovered DNN. [Expand]

Monday Poster Session

Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation

Gengcong Yang, Jingyi Zhang, Yong Zhang, Baoyuan Wu, Yujiu Yang

To generate "accurate" scene graphs, almost all existing methods predict pairwise relationships in a deterministic manner. [Expand]

Thursday Poster Session

ID-Unet: Iterative Soft and Hard Deformation for View Synthesis

Mingyu Yin, Li Sun, Qingli Li

View synthesis is usually done by an autoencoder, in which the encoder maps a source view image into a latent content code, and the decoder transforms it into a target view image according to the condition. [Expand]

Wednesday Poster Session

Towards Extremely Compact RNNs for Video Recognition With Fully Decomposed Hierarchical Tucker Structure

Miao Yin, Siyu Liao, Xiao-Yang Liu, Xiaodong Wang, Bo Yuan

Recurrent Neural Networks (RNNs) have been widely used in sequence analysis and modeling. [Expand]

Thursday Poster Session

Landmark Regularization: Ranking Guided Super-Net Training in Neural Architecture Search

Kaicheng Yu, Rene Ranftl, Mathieu Salzmann

Weight sharing has become a de facto standard in neural architecture search because it enables the search to be done on commodity hardware. [Expand]

Thursday Poster Session

Real-Time Selfie Video Stabilization

Jiyang Yu, Ravi Ramamoorthi, Keli Cheng, Michel Sarkis, Ning Bi

We propose a novel real-time selfie video stabilization method. [Expand]

Thursday Poster Session

Distractor-Aware Fast Tracking via Dynamic Convolutions and MOT Philosophy

Zikai Zhang, Bineng Zhong, Shengping Zhang, Zhenjun Tang, Xin Liu, Zhaoxiang Zhang

A practical long-term tracker typically contains three key properties, i.e., an efficient model design, an effective global re-detection strategy and a robust distractor awareness mechanism. [Expand]

Monday Poster Session

Domain-Robust VQA With Diverse Datasets and Methods but No Target Labels

Mingda Zhang, Tristan Maidment, Ahmad Diab, Adriana Kovashka, Rebecca Hwa

The observation that computer vision methods overfit to dataset specifics has inspired diverse attempts to make object recognition models robust to domain shifts. [Expand]

Tuesday Poster Session

Event-Based Synthetic Aperture Imaging With a Hybrid Network

Xiang Zhang, Wei Liao, Lei Yu, Wen Yang, Gui-Song Xia

Synthetic aperture imaging (SAI) is able to achieve the see through effect by blurring out the off-focus foreground occlusions and reconstructing the in-focus occluded targets from multi-view images. [Expand]

Thursday Poster Session

View-Guided Point Cloud Completion

Xuancheng Zhang, Yutong Feng, Siqi Li, Changqing Zou, Hai Wan, Xibin Zhao, Yandong Guo, Yue Gao

This paper presents a view-guided solution for the task of point cloud completion. [Expand]

Friday Poster Session

Zero-Shot Instance Segmentation

Wednesday Poster Session

GMOT-40: A Benchmark for Generic Multiple Object Tracking

Hexin Bai, Wensheng Cheng, Peng Chu, Juehuan Liu, Kai Zhang, Haibin Ling

Multiple Object Tracking (MOT) has witnessed remarkable advances in recent years. [Expand]

Tuesday Poster Session

Learning Scalable ℓ∞-Constrained Near-Lossless Image Compression via Joint Lossy Image and Residual Compression

Yuanchao Bai, Xianming Liu, Wangmeng Zuo, Yaowei Wang, Xiangyang Ji

We propose a novel joint lossy image and residual compression framework for learning l_infinity-constrained near-lossless image compression. [Expand]

Thursday Poster Session

Unsupervised Multi-Source Domain Adaptation for Person Re-Identification

Zechen Bai, Zhigang Wang, Jian Wang, Di Hu, Errui Ding

Unsupervised domain adaptation (UDA) methods for person re-identification (re-ID) aim at transferring re-ID knowledge from labeled source data to unlabeled target data. [Expand]

Thursday Poster Session

Euro-PVI: Pedestrian Vehicle Interactions in Dense Urban Centers

Apratim Bhattacharyya, Daniel Olmeda Reino, Mario Fritz, Bernt Schiele

Accurate prediction of pedestrian and bicyclist paths is integral to the development of reliable autonomous vehicles in dense urban environments. [Expand]

Tuesday Poster Session

Hierarchical Video Prediction Using Relational Layouts for Human-Object Interactions

Navaneeth Bodla, Gaurav Shrivastava, Rama Chellappa, Abhinav Shrivastava

Learning to model and predict how humans interact with objects while performing an action is challenging, and most of the existing video prediction models are ineffective in modeling complicated human-object interactions. [Expand]

Thursday Poster Session

Understanding Object Dynamics for Interactive Image-to-Video Synthesis

Andreas Blattmann, Timo Milbich, Michael Dorkenwald, Bjorn Ommer

What would be the effect of locally poking a static scene? We present an approach that learns naturally-looking global articulations caused by a local manipulation at a pixel level. [Expand]

Tuesday Poster Session

OCONet: Image Extrapolation by Object Completion

Richard Strong Bowen, Huiwen Chang, Charles Herrmann, Piotr Teterwak, Ce Liu, Ramin Zabih

Image extrapolation extends an input image beyond the originally-captured field of view. [Expand]

Monday Poster Session

Hardness Sampling for Self-Training Based Transductive Zero-Shot Learning

Liu Bo, Qiulei Dong, Zhanyi Hu

Transductive zero-shot learning (T-ZSL), which could alleviate the domain shift problem in existing ZSL works, has received much attention recently. [Expand]

Friday Poster Session

GAIA: A Transfer Learning System of Object Detection That Fits Your Needs

Xingyuan Bu, Junran Peng, Junjie Yan, Tieniu Tan, Zhaoxiang Zhang

Transfer learning with pre-training on large-scale datasets has played an increasingly significant role in computer vision and natural language processing recently. [Expand]

Monday Poster Session

Rethinking Graph Neural Architecture Search From Message-Passing

Thursday Poster Session

Learning Discriminative Prototypes With Dynamic Time Warping

Xiaobin Chang, Frederick Tung, Greg Mori

Dynamic Time Warping (DTW) is widely used for temporal data processing. [Expand]

Wednesday Poster Session

Towards Robust Classification Model by Counterfactual and Invariant Data Generation

Chun-Hao Chang, George Alexandru Adam, Anna Goldenberg

Despite the success of machine learning applications in science, industry, and society in general, many approaches are known to be non-robust, often relying on spurious correlations to make predictions. [Expand]

Thursday Poster Session

Learning Deep Classifiers Consistent With Fine-Grained Novelty Detection

Jiacheng Cheng, Nuno Vasconcelos

The problem of novelty detection in fine-grained visual classification (FGVC) is considered. [Expand]

Monday Poster Session

Learning To Filter: Siamese Relation Network for Robust Tracking

Siyuan Cheng, Bineng Zhong, Guorong Li, Xin Liu, Zhenjun Tang, Xianxian Li, Jing Wang

Despite the great success of Siamese-based trackers, their performance under complicated scenarios is still not satisfying, especially when there are distractors. [Expand]

Tuesday Poster Session

Light Field Super-Resolution With Zero-Shot Learning

Zhen Cheng, Zhiwei Xiong, Chang Chen, Dong Liu, Zheng-Jun Zha

Deep learning provides a new avenue for light field super-resolution (SR). [Expand]

Wednesday Poster Session

Adaptive Image Transformer for One-Shot Object Detection

Ding-Jie Chen, He-Yen Hsieh, Tyng-Luh Liu

One-shot object detection tackles a challenging task that aims at identifying within a target image all object instances of the same class, implied by a query image patch. [Expand]

Thursday Poster Session

Class-Aware Robust Adversarial Training for Object Detection

Tuesday Poster Session

MagDR: Mask-Guided Detection and Reconstruction for Defending Deepfakes

Zhikai Chen, Lingxi Xie, Shanmin Pang, Yong He, Bo Zhang

Deepfakes have raised serious concerns about the authenticity of visual content. [Expand]

Wednesday Poster Session

MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation

Hansheng Chen, Yuyao Huang, Wei Tian, Zhong Gao, Lu Xiong

Object localization in 3D space is a challenging aspect in monocular 3D object detection. [Expand]

Wednesday Poster Session

Neural Feature Search for RGB-Infrared Person Re-Identification

Yehansen Chen, Lin Wan, Zhihang Li, Qianyan Jing, Zongyuan Sun

RGB-Infrared person re-identification (RGB-IR ReID) is a challenging cross-modality retrieval problem, which aims at matching the person-of-interest over visible and infrared camera views. [Expand]

Monday Poster Session

Perceptual Indistinguishability-Net (PI-Net): Facial Image Obfuscation With Manipulable Semantics

Jia-Wei Chen, Li-Ju Chen, Chia-Mu Yu, Chun-Shien Lu

With the growing use of camera devices, the industry has many image datasets that provide more opportunities for collaboration between the machine learning community and industry. [Expand]

Tuesday Poster Session

Pareto Self-Supervised Training for Few-Shot Learning

Zhengyu Chen, Jixie Ge, Heshen Zhan, Siteng Huang, Donglin Wang

While few-shot learning (FSL) aims for rapid generalization to new concepts with little supervision, self-supervised learning (SSL) constructs supervisory signals directly computed from unlabeled data. [Expand]

Thursday Poster Session

PSD: Principled Synthetic-to-Real Dehazing Guided by Physical Priors

Zeyuan Chen, Yangchao Wang, Yang Yang, Dong Liu

Deep learning-based methods have achieved remarkable performance for image dehazing. [Expand]

Wednesday Poster Session

S2R-DepthNet: Learning a Generalizable Depth-Specific Structural Representation

Monday Poster Session

Towards Accurate 3D Human Motion Prediction From Incomplete Observations

Qiongjie Cui, Huaijiang Sun

Predicting accurate and realistic future human poses from historically observed sequences is a fundamental task in the intersection of computer vision, graphics, and artificial intelligence. [Expand]

Tuesday Poster Session

Dynamic Head: Unifying Object Detection Heads With Attentions

Xiyang Dai, Yinpeng Chen, Bin Xiao, Dongdong Chen, Mengchen Liu, Lu Yuan, Lei Zhang

The complex nature of combining localization and classification in object detection has led to a flourishing development of methods. [Expand]

Wednesday Poster Session

Learning Affinity-Aware Upsampling for Deep Image Matting

Yutong Dai, Hao Lu, Chunhua Shen

We show that learning affinity in upsampling provides an effective and efficient approach to exploit pairwise interactions in deep networks. [Expand]

Tuesday Poster Session

Zillow Indoor Dataset: Annotated Floor Plans With 360° Panoramas and 3D Room Layouts

Zongyong Deng, Hao Liu, Yaoxing Wang, Chenyang Wang, Zekuan Yu, Xuehong Sun

In this paper, we propose a progressive margin loss (PML) approach for unconstrained facial age classification. [Expand]

Qi Fan, Deng-Ping Fan, Huazhu Fu, Chi-Keung Tang, Ling Shao, Yu-Wing Tai

We present a novel group collaborative learning framework (GCNet) capable of detecting co-salient objects in real time (16ms), by simultaneously mining consensus representations at group level based on the two necessary criteria: 1) intra-group compactness to better formulate the consistency among co-salient objects by capturing their inherent shared attributes using our novel group affinity module; 2) inter-group separability to effectively suppress the influence of noisy objects on the output by introducing our new group collaborating module conditioning the inconsistent consensus. [Expand]

Thursday Poster Session

Learning Triadic Belief Dynamics in Nonverbal Communication From Videos

Lifeng Fan, Shuwen Qiu, Zilong Zheng, Tao Gao, Song-Chun Zhu, Yixin Zhu

Humans possess a unique social cognition capability; nonverbal communication can convey rich social information among agents. [Expand]

Wednesday Poster Session

Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos

Hehe Fan, Yi Yang, Mohan Kankanhalli

Point cloud videos exhibit irregularities and lack of order along the spatial dimension where points emerge inconsistently across different frames. [Expand]

Thursday Poster Session

Cross-Domain Similarity Learning for Face Recognition in Unseen Domains

Thursday Poster Session

A Multi-Task Network for Joint Specular Highlight Detection and Removal

Gang Fu, Qing Zhang, Lei Zhu, Ping Li, Chunxia Xiao

Specular highlight detection and removal are fundamental and challenging tasks. [Expand]

Wednesday Poster Session

Double Low-Rank Representation With Projection Distance Penalty for Clustering

Zhiqiang Fu, Yao Zhao, Dongxia Chang, Xingxing Zhang, Yiming Wang

This paper presents a novel, simple yet robust self-representation method, i.e., Double Low-Rank Representation with Projection Distance penalty (DLRRPD) for clustering. [Expand]

Tuesday Poster Session

Auto-Exposure Fusion for Single-Image Shadow Removal

Lan Fu, Changqing Zhou, Qing Guo, Felix Juefei-Xu, Hongkai Yu, Wei Feng, Yang Liu, Song Wang

Shadow removal is still a challenging task due to its inherent background-dependent and spatial-variant properties, leading to unknown and diverse shadow patterns. [Expand]

Wednesday Poster Session

Partial Feature Selection and Alignment for Multi-Source Domain Adaptation

Yangye Fu, Ming Zhang, Xing Xu, Zuo Cao, Chao Ma, Yanli Ji, Kai Zuo, Huimin Lu

Multi-Source Domain Adaptation (MSDA), which aims to transfer the knowledge learned from multiple source domains to an unlabeled target domain, has drawn increasing attention in the research community. [Expand]

Friday Poster Session

STMTrack: Template-Free Visual Tracking With Space-Time Memory Networks

Zhihong Fu, Qingjie Liu, Zehua Fu, Yunhong Wang

Boosting the performance of offline-trained Siamese trackers is getting harder because the fixed information in the template cropped from the first frame has been almost thoroughly mined, yet these trackers still cope poorly with target appearance changes. [Expand]

Thursday Poster Session

Robust Point Cloud Registration Framework Based on Deep Graph Matching

Kexue Fu, Shaolei Liu, Xiaoyuan Luo, Manning Wang

3D point cloud registration is a fundamental problem in computer vision and robotics. [Expand]

Wednesday Poster Session

Transferable Query Selection for Active Domain Adaptation

Bo Fu, Zhangjie Cao, Jianmin Wang, Mingsheng Long

Unsupervised domain adaptation (UDA) enables transferring knowledge from a related source domain to a fully unlabeled target domain. [Expand]

Wednesday Poster Session

Isometric Multi-Shape Matching

Thursday Poster Session

PLADE-Net: Towards Pixel-Level Accuracy for Self-Supervised Single-View Depth Estimation With Neural Positional Encoding and Distilled Matting Loss

Juan Luis Gonzalez, Munchurl Kim

In this paper, we propose a self-supervised single-view pixel-level accurate depth estimation network, called PLADE-Net. [Expand]

Tuesday Poster Session

Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction

Shanyan Guan, Jingwei Xu, Yunbo Wang, Bingbing Ni, Xiaokang Yang

This paper considers a new problem of adapting a pre-trained model of human mesh reconstruction to out-of-domain streaming videos. [Expand]

Wednesday Poster Session

Inverse Simulation: Reconstructing Dynamic Geometry of Clothed Humans via Optimal Control

Jingfan Guo, Jie Li, Rahul Narain, Hyun Soo Park

This paper studies the problem of inverse cloth simulation: estimating the shape and time-varying poses of the underlying body that generates physically plausible cloth motion matching the point cloud measurements of clothed humans. [Expand]

Thursday Poster Session

Beyond Bounding-Box: Convex-Hull Feature Adaptation for Oriented and Densely Packed Object Detection

Zonghao Guo, Chang Liu, Xiaosong Zhang, Jianbin Jiao, Xiangyang Ji, Qixiang Ye

Detecting oriented and densely packed objects remains challenging because of spatial feature aliasing caused by the intersection of receptive fields between objects. [Expand]

Guangyu Guo, Junwei Han, Fang Wan, Dingwen Zhang

Weakly supervised object localization (WSOL) aims at learning to localize objects of interest by only using the image-level labels as the supervision. [Expand]

Wednesday Poster Session

Contrastive Embedding for Generalized Zero-Shot Learning

Zongyan Han, Zhenyong Fu, Shuo Chen, Jian Yang

Generalized zero-shot learning (GZSL) aims to recognize objects from both seen and unseen classes, when only the labeled examples from seen classes are provided. [Expand]

Monday Poster Session

Learning To Fuse Asymmetric Feature Maps in Siamese Trackers

Wencheng Han, Xingping Dong, Fahad Shahbaz Khan, Ling Shao, Jianbing Shen

Recently, Siamese-based trackers have achieved promising performance in visual tracking. [Expand]

Friday Poster Session

Crossing Cuts Polygonal Puzzles: Models and Solvers

Peleg Harel, Ohad Ben-Shahar

Jigsaw puzzle solving, the problem of constructing a coherent whole from a set of non-overlapping unordered fragments, is fundamental to numerous applications, and yet most of the literature has focused thus far on less realistic puzzles whose pieces are identical squares. [Expand]

Tuesday Poster Session

NormalFusion: Real-Time Acquisition of Surface Normals for High-Resolution RGB-D Scanning

Hyunho Ha, Joo Ho Lee, Andreas Meuleman, Min H. Kim

Multiview shape-from-shading (SfS) has achieved high-detail geometry, but its computation is expensive for solving a multiview registration and an ill-posed inverse rendering problem. [Expand]

Friday Poster Session

Guided Interactive Video Object Segmentation Using Reliability-Based Attention Maps

Yuk Heo, Yeong Jun Koh, Chang-Su Kim

We propose a novel guided interactive segmentation (GIS) algorithm for video objects to improve the segmentation accuracy and reduce the interaction time. [Expand]

Wednesday Poster Session

DyCo3D: Robust Instance Segmentation of 3D Point Clouds Through Dynamic Convolution

Tong He, Chunhua Shen, Anton van den Hengel

Previous top-performing approaches for point cloud instance segmentation involve a bottom-up strategy, which often includes inefficient operations or complex pipelines, such as grouping over-segmented components, introducing additional steps for refining, or designing complicated loss functions. [Expand]

Monday Poster Session

MOST: A Multi-Oriented Scene Text Detector With Localization Refinement

Minghang He, Minghui Liao, Zhibo Yang, Humen Zhong, Jun Tang, Wenqing Cheng, Cong Yao, Yongpan Wang, Xiang Bai

Over the past few years, the field of scene text detection has progressed so rapidly that modern text detectors are able to hunt text in various challenging scenarios. [Expand]

Wednesday Poster Session

Composing Photos Like a Photographer

Chaoyi Hong, Shuaiyuan Du, Ke Xian, Hao Lu, Zhiguo Cao, Weicai Zhong

We show that explicit modeling of composition rules benefits image cropping. [Expand]

Tuesday Poster Session

Disentangling Label Distribution for Long-Tailed Visual Recognition

Chao Huang, Zhangjie Cao, Yunbo Wang, Jianmin Wang, Mingsheng Long

Deep learning techniques for point clouds have achieved strong performance on a range of 3D vision tasks. [Expand]

Wednesday Poster Session

Revisiting Knowledge Distillation: An Inheritance and Exploration Framework

Zhen Huang, Xu Shen, Jun Xing, Tongliang Liu, Xinmei Tian, Houqiang Li, Bing Deng, Jianqiang Huang, Xian-Sheng Hua

Knowledge Distillation (KD) is a popular technique to transfer knowledge from a teacher model or ensemble to a student model. [Expand]

Tuesday Poster Session

S3: Learnable Sparse Signal Superdensity for Guided Depth Estimation

Yu-Kai Huang, Yueh-Cheng Liu, Tsung-Han Wu, Hung-Ting Su, Yu-Cheng Chang, Tsung-Lin Tsou, Yu-An Wang, Winston H. Hsu

Dense depth estimation plays a key role in multiple applications such as robotics, 3D reconstruction, and augmented reality. [Expand]

Friday Poster Session

Video Rescaling Networks With Joint Optimization Strategies for Downscaling and Upscaling

Yan-Cheng Huang, Yi-Hsin Chen, Cheng-You Lu, Hui-Po Wang, Wen-Hsiao Peng, Ching-Chun Huang

This paper addresses the video rescaling task, which arises from the need to adapt the video spatial resolution to suit individual viewing devices. [Expand]

Tuesday Poster Session

Learning the Non-Differentiable Optimization for Blind Super-Resolution

Zheng Hui, Jie Li, Xiumei Wang, Xinbo Gao

Previous convolutional neural network (CNN) based blind super-resolution (SR) methods usually adopt an iterative optimization scheme to approximate the ground truth (GT) step by step. [Expand]

Monday Poster Session

ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised Image Segmentation

Xinyue Huo, Lingxi Xie, Jianzhong He, Zijie Yang, Wengang Zhou, Houqiang Li, Qi Tian

Semi-supervised learning is a useful tool for image segmentation, mainly due to its ability to extract knowledge from unlabeled data to assist learning from labeled data. [Expand]

Monday Poster Session

A2-FPN: Attention Aggregation Based Feature Pyramid Network for Instance Segmentation

Miao Hu, Yali Li, Lu Fang, Shengjin Wang

Learning pyramidal feature representations is crucial for recognizing object instances at different scales. [Expand]

Thursday Poster Session

Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation

Tianrui Hui, Shaofei Huang, Si Liu, Zihan Ding, Guanbin Li, Wenguan Wang, Jizhong Han, Fei Wang

Language-queried video actor segmentation aims to predict the pixel-level mask of the actor which performs the actions described by a natural language query in the target frames. [Expand]

Tuesday Poster Session

Efficient Deformable Shape Correspondence via Multiscale Spectral Manifold Wavelets Preservation

Ling Hu, Qinsong Li, Shengjun Liu, Xinru Liu

The functional map framework has proven to be extremely effective for representing dense correspondences between deformable shapes. [Expand]

Thursday Poster Session

Learning Cross-Modal Retrieval With Noisy Labels

Peng Hu, Xi Peng, Hongyuan Zhu, Liangli Zhen, Jie Lin

Cross-modal retrieval has recently been emerging with the help of deep multimodal learning. [Expand]

Tuesday Poster Session

Dense Relation Distillation With Context-Aware Aggregation for Few-Shot Object Detection

Hanzhe Hu, Shuai Bai, Aoxue Li, Jinshi Cui, Liwei Wang

Conventional deep learning based methods for object detection require a large amount of bounding box annotations for training, and such high-quality annotated data is expensive to obtain. [Expand]

Wednesday Poster Session

Pseudo 3D Auto-Correlation Network for Real Image Denoising

Xiaowan Hu, Ruijun Ma, Zhihong Liu, Yuanhao Cai, Xiaole Zhao, Yulun Zhang, Haoqian Wang

The extraction of auto-correlation in images has shown great potential in deep learning networks, such as the self-attention mechanism in the channel domain and the self-similarity mechanism in the spatial domain. [Expand]

Friday Poster Session

Model-Aware Gesture-to-Gesture Translation

Hezhen Hu, Weilun Wang, Wengang Zhou, Weichao Zhao, Houqiang Li

Hand gesture-to-gesture translation is a significant and interesting problem, which plays a key role in many applications, such as sign language production. [Expand]

Friday Poster Session

Safe Local Motion Planning With Self-Supervised Freespace Forecasting

Peiyun Hu, Aaron Huang, John Dolan, David Held, Deva Ramanan

Safe local motion planning for autonomous driving in dynamic environments requires forecasting how the scene evolves. [Expand]

Thursday Poster Session

Wide-Depth-Range 6D Object Pose Estimation in Space

Thursday Poster Session

Facial Action Unit Detection With Transformers

Geethu Miriam Jacob, Bjorn Stenger

The Facial Action Coding System is a taxonomy for fine-grained facial expression analysis. [Expand]

Wednesday Poster Session

CAMERAS: Enhanced Resolution and Sanity Preserving Class Activation Mapping for Image Saliency

Mohammad A. A. K. Jalwana, Naveed Akhtar, Mohammed Bennamoun, Ajmal Mian

Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input. [Expand]

Friday Poster Session

Learning Compositional Representation for 4D Captures With Neural ODE

Boyan Jiang, Yinda Zhang, Xingkui Wei, Xiangyang Xue, Yanwei Fu

Learning based representation has become the key to the success of many computer vision systems. [Expand]

Tuesday Poster Session

UV-Net: Learning From Boundary Representations

Pradeep Kumar Jayaraman, Aditya Sanghi, Joseph G. Lambourne, Karl D.D. Willis, Thomas Davies, Hooman Shayani, Nigel Morris

We introduce UV-Net, a novel neural network architecture and representation designed to operate directly on Boundary representation (B-rep) data from 3D CAD models. [Expand]

Thursday Poster Session

Mining Better Samples for Contrastive Learning of Temporal Correspondence

Sangryul Jeon, Dongbo Min, Seungryong Kim, Kwanghoon Sohn

We present a novel framework for contrastive learning of pixel-level representation using only unlabeled video. [Expand]

Monday Poster Session

Saliency-Guided Image Translation

Lai Jiang, Mai Xu, Xiaofei Wang, Leonid Sigal

In this paper, we propose a novel task for saliency-guided image translation, with the goal of image-to-image translation conditioned on the user specified saliency map. [Expand]

Friday Poster Session

IoU Attack: Towards Temporally Coherent Black-Box Adversarial Attack for Visual Object Tracking

Shuai Jia, Yibing Song, Chao Ma, Xiaokang Yang

Adversarial attacks arise from the vulnerability of deep neural networks when perceiving input samples injected with imperceptible perturbations. [Expand]

Tuesday Poster Session

Leveraging Line-Point Consistence To Preserve Structures for Wide Parallax Image Stitching

Qi Jia, ZhengJun Li, Xin Fan, Haotian Zhao, Shiyu Teng, Xinchen Ye, Longin Jan Latecki

Generating high-quality stitched images with natural structures is a challenging task in computer vision. [Expand]

Thursday Poster Session

Amalgamating Knowledge From Heterogeneous Graph Neural Networks

Yongcheng Jing, Yiding Yang, Xinchao Wang, Mingli Song, Dacheng Tao

In this paper, we study a novel knowledge transfer task in the domain of graph neural networks (GNNs). [Expand]

Friday Poster Session

Harmonious Semantic Line Detection via Maximal Weight Clique Selection

Wednesday Poster Session

Joint Negative and Positive Learning for Noisy Labels

Thursday Poster Session

PatchMatch-Based Neighborhood Consensus for Semantic Correspondence

Jae Yong Lee, Joseph DeGol, Victor Fragoso, Sudipta N. Sinha

We address estimating dense correspondences between two images depicting different but semantically related scenes. [Expand]

Thursday Poster Session

Network Quantization With Element-Wise Gradient Scaling

Junghyup Lee, Dohyung Kim, Bumsub Ham

Network quantization aims at reducing bit-widths of weights and/or activations, which is particularly important for implementing deep neural networks with limited hardware resources. [Expand]

Tuesday Poster Session

Relevance-CAM: Your Model Already Knows Where To Look

Jeong Ryong Lee, Sewon Kim, Inyong Park, Taejoon Eo, Dosik Hwang

With increasing fields of application for neural networks and the development of neural networks, the ability to explain deep learning models is also becoming increasingly important. [Expand]

Thursday Poster Session

Video Prediction Recalling Long-Term Motion Context via Memory Alignment Learning

Sangmin Lee, Hak Gu Kim, Dae Hwi Choi, Hyung-Il Kim, Yong Man Ro

Our work addresses long-term motion context issues for predicting future frames. [Expand]

Tuesday Poster Session

Picasso: A CUDA-Based Library for Deep Learning Over 3D Meshes

Huan Lei, Naveed Akhtar, Ajmal Mian

We present Picasso, a CUDA-based library comprising novel modules for deep learning over complex real-world 3D meshes. [Expand]

Thursday Poster Session

RangeIoUDet: Range Image Based Real-Time 3D Object Detector Optimized by Intersection Over Union

Zhidong Liang, Zehan Zhang, Ming Zhang, Xian Zhao, Shiliang Pu

Real-time and high-performance 3D object detection is an attractive research direction in autonomous driving. [Expand]

Wednesday Poster Session

4D Hyperspectral Photoacoustic Data Restoration With Reliability Analysis

Weihang Liao, Art Subpa-asa, Yinqiang Zheng, Imari Sato

Hyperspectral photoacoustic (HSPA) spectroscopy is an emerging bi-modal imaging technology that is able to show the wavelength-dependent absorption distribution of the interior of a 3D volume. [Expand]

Tuesday Poster Session

COMPLETER: Incomplete Multi-View Clustering via Contrastive Prediction

Yijie Lin, Yuanbiao Gou, Zitao Liu, Boyun Li, Jiancheng Lv, Xi Peng

In this paper, we study two challenging problems in incomplete multi-view clustering analysis, namely, i) how to learn an informative and consistent representation among different views without the help of labels and ii) how to recover the missing views from data. [Expand]

Wednesday Poster Session

Learning Salient Boundary Feature for Anchor-free Temporal Action Localization

Chuming Lin, Chengming Xu, Donghao Luo, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yanwei Fu

Temporal action localization is an important yet challenging task in video understanding. [Expand]

Tuesday Poster Session

Multi-View Multi-Person 3D Pose Estimation With Plane Sweep Stereo

Wednesday Poster Session

Cluster-Wise Hierarchical Generative Model for Deep Amortized Clustering

Huafeng Liu, Jiaqi Wang, Liping Jing

In this paper, we propose Cluster-wise Hierarchical Generative Model for deep amortized clustering (CHiGac). [Expand]

Thursday Poster Session

Context-Aware Biaffine Localizing Network for Temporal Sentence Grounding

Daizong Liu, Xiaoye Qu, Jianfeng Dong, Pan Zhou, Yu Cheng, Wei Wei, Zichuan Xu, Yulai Xie

This paper addresses the problem of temporal sentence grounding (TSG), which aims to identify the temporal boundary of a specific segment from an untrimmed video by a sentence query. [Expand]

Wednesday Poster Session

Deep Learning in Latent Space for Video Prediction and Compression

Bowen Liu, Yu Chen, Shiyu Liu, Hun-Seok Kim

Learning-based video compression has achieved substantial progress during recent years. [Expand]

Monday Poster Session

Exploring and Distilling Posterior and Prior Knowledge for Radiology Report Generation

Fenglin Liu, Xian Wu, Shen Ge, Wei Fan, Yuexian Zou

Automatically generating radiology reports can improve current clinical practice in diagnostic radiology. [Expand]

Thursday Poster Session

Exploit Visual Dependency Relations for Semantic Segmentation

Mingyuan Liu, Dan Schonfeld, Wei Tang

Dependency relations among visual entities are ubiquitous because both objects and scenes are highly structured. [Expand]

Wednesday Poster Session

Fully Understanding Generic Objects: Modeling, Segmentation, and Reconstruction

Feng Liu, Luan Tran, Xiaoming Liu

Inferring 3D structure of a generic object from a 2D image is a long-standing objective of computer vision. [Expand]

Wednesday Poster Session

Generic Perceptual Loss for Modeling Structured Output Dependencies

Yanghao Li, Tushar Nagarajan, Bo Xiong, Kristen Grauman

We introduce an approach for pre-training egocentric video models using large-scale third-person video datasets. [Expand]

Tuesday Poster Session

FaceInpainter: High Fidelity Face Adaptation to Heterogeneous Domains

Jia Li, Zhaoyang Li, Jie Cao, Xingguang Song, Ran He

In this work, we propose a novel two-stage framework named FaceInpainter to implement controllable Identity-Guided Face Inpainting (IGFI) under heterogeneous domains. [Expand]

Tuesday Poster Session

Few-Shot Object Detection via Classification Refinement and Distractor Retreatment

Yiting Li, Haiyue Zhu, Yu Cheng, Wenxin Wang, Chek Sing Teo, Cheng Xiang, Prahlad Vadakkepat, Tong Heng Lee

We aim to tackle the challenging Few-Shot Object Detection (FSOD) task, where data-scarce categories are presented during model learning. [Expand]

Thursday Poster Session

Hilbert Sinkhorn Divergence for Optimal Transport

Qian Li, Zhichao Wang, Gang Li, Jun Pang, Guandong Xu

Sinkhorn divergence has become a very popular metric to compare probability distributions in optimal transport. [Expand]

Tuesday Poster Session

Learning To Identify Correct 2D-2D Line Correspondences on Sphere

Haoang Li, Kai Chen, Ji Zhao, Jiangliu Wang, Pyojin Kim, Zhe Liu, Yun-Hui Liu

Given a set of putative 2D-2D line correspondences, we aim to identify correct matches. [Expand]

Thursday Poster Session

Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression

Wanhua Li, Xiaoke Huang, Jiwen Lu, Jianjiang Feng, Jie Zhou

Uncertainty is the only certainty there is. [Expand]

Thursday Poster Session

Lighting, Reflectance and Geometry Estimation From 360° Panoramic Stereo

Junxuan Li, Hongdong Li, Yasuyuki Matsushita

We propose a method for estimating high-definition spatially-varying lighting, reflectance, and geometry of a scene from 360° stereo images. [Expand]

Wednesday Poster Session

Generalizing to the Open World: Deep Visual Odometry With Online Adaptation

Shunkai Li, Xin Wu, Yingdian Cao, Hongbin Zha

Although learning-based visual odometry (VO) has shown impressive results in recent years, the pretrained networks may easily collapse in unseen environments. [Expand]

Thursday Poster Session

Meta-Mining Discriminative Samples for Kinship Verification

Wanhua Li, Shiwei Wang, Jiwen Lu, Jianjiang Feng, Jie Zhou

Kinship verification aims to find out whether there is a kin relation for a given pair of facial images. [Expand]

Friday Poster Session

Probabilistic Model Distillation for Semantic Correspondence

Xin Li, Deng-Ping Fan, Fan Yang, Ao Luo, Hong Cheng, Zicheng Liu

Semantic correspondence is a fundamental problem in computer vision, which aims at establishing dense correspondences across images depicting different instances under the same category. [Expand]

Wednesday Poster Session

Progressive Stage-Wise Learning for Unsupervised Feature Representation Enhancement

Zefan Li, Chenxi Liu, Alan Yuille, Bingbing Ni, Wenjun Zhang, Wen Gao

Unsupervised learning methods have recently shown their competitiveness against supervised training. [Expand]

Wednesday Poster Session

Representing Videos As Discriminative Sub-Graphs for Action Recognition

Dong Li, Zhaofan Qiu, Yingwei Pan, Ting Yao, Houqiang Li, Tao Mei

Human actions typically have combinatorial structures or patterns, i.e., subjects, objects, plus spatio-temporal interactions in between. [Expand]

Yawei Li, Wen Li, Martin Danelljan, Kai Zhang, Shuhang Gu, Luc Van Gool, Radu Timofte

In this paper, we tackle the problem of convolutional neural network design. [Expand]

Monday Poster Session

Towards Compact CNNs via Collaborative Compression

Yuchao Li, Shaohui Lin, Jianzhuang Liu, Qixiang Ye, Mengdi Wang, Fei Chao, Fan Yang, Jincheng Ma, Qi Tian, Rongrong Ji

Channel pruning and tensor decomposition have received extensive attention in convolutional neural network compression. [Expand]

Tuesday Poster Session

Toward Accurate and Realistic Outfits Visualization With Attention to Details

Kedan Li, Min Jin Chong, Jeffrey Zhang, Jingen Liu

Virtual try-on methods aim to generate images of fashion models wearing arbitrary combinations of garments. [Expand]

Thursday Poster Session

Transferable Semantic Augmentation for Domain Adaptation

Shuang Li, Mixue Xie, Kaixiong Gong, Chi Harold Liu, Yulin Wang, Wei Li

Domain adaptation has been widely explored by transferring the knowledge from a label-rich source domain to a related but unlabeled target domain. [Expand]

Thursday Poster Session

Transformation Invariant Few-Shot Object Detection

Aoxue Li, Zhenguo Li

Few-shot object detection (FSOD) aims to learn detectors that can be generalized to novel classes with only a few instances. [Expand]

Tuesday Poster Session

Three Birds with One Stone: Multi-Task Temporal Action Detection via Recycling Temporal Annotations

Zhihui Li, Lina Yao

Temporal action detection on unconstrained videos has seen significant research progress in recent years. [Expand]

Tuesday Poster Session

VirFace: Enhancing Face Recognition via Unlabeled Shallow Data

Wenyu Li, Tianchu Guo, Pengyu Li, Binghui Chen, Biao Wang, Wangmeng Zuo, Lei Zhang

Recently, exploiting the effect of unlabeled data for face recognition has attracted increasing attention. [Expand]

Thursday Poster Session

CLCC: Contrastive Learning for Color Constancy

Yi-Chen Lo, Chia-Che Chang, Hsuan-Chao Chiu, Yu-Hao Huang, Chia-Ping Chen, Yu-Lin Chang, Kevin Jou

In this paper, we present CLCC, a novel contrastive learning framework for color constancy. [Expand]

Wednesday Poster Session

Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks

Xiaoxiao Long, Lingjie Liu, Wei Li, Christian Theobalt, Wenping Wang

We present a novel method for multi-view depth estimation from a single video, which is a critical task in various applications, such as perception, reconstruction and robot navigation. [Expand]

Wednesday Poster Session

Radar-Camera Pixel Depth Association for Depth Completion

Yunfei Long, Daniel Morris, Xiaoming Liu, Marcos Castro, Punarjay Chakravarty, Praveen Narayanan

While radar and video data can be readily fused at the detection level, fusing them at the pixel level is potentially more beneficial. [Expand]

Thursday Poster Session

Conditional Bures Metric for Domain Adaptation

You-Wei Luo, Chuan-Xian Ren

As a vital problem in classification-oriented transfer, unsupervised domain adaptation (UDA) has attracted widespread attention in recent years. [Expand]

Thursday Poster Session

Action Unit Memory Network for Weakly Supervised Temporal Action Localization

Wang Luo, Tianzhu Zhang, Wenfei Yang, Jingen Liu, Tao Mei, Feng Wu, Yongdong Zhang

Weakly supervised temporal action localization aims to detect and localize actions in untrimmed videos with only video-level labels during training. [Expand]

Wednesday Poster Session

Normalized Avatar Synthesis Using StyleGAN and Perceptual Refinement

Huiwen Luo, Koki Nagano, Han-Wei Kung, Qingguo Xu, Zejian Wang, Lingyu Wei, Liwen Hu, Hao Li

We introduce a highly robust GAN-based framework for digitizing a normalized 3D avatar of a person from a single unconstrained photo. [Expand]

Thursday Poster Session

Scalable Differential Privacy With Sparse Network Finetuning

Zelun Luo, Daniel J. Wu, Ehsan Adeli, Li Fei-Fei

We propose a novel method for privacy-preserving training of deep neural networks leveraging public, out-domain data. [Expand]

Tuesday Poster Session

Intelligent Carpet: Inferring 3D Human Pose From Tactile Signals

Yiyue Luo, Yunzhu Li, Michael Foshey, Wan Shou, Pratyusha Sharma, Tomas Palacios, Antonio Torralba, Wojciech Matusik

Daily human activities, e.g., locomotion, exercises, and resting, are heavily guided by the tactile interactions between the human and the ground. [Expand]

Wednesday Poster Session

Stay Positive: Non-Negative Image Synthesis for Augmented Reality

Katie Luo, Guandao Yang, Wenqi Xian, Harald Haraldsson, Bharath Hariharan, Serge Belongie

In applications such as optical see-through and projector augmented reality, producing images amounts to solving non-negative image generation, where one can only add light to an existing image. [Expand]

Wednesday Poster Session

Large-Capacity Image Steganography Based on Invertible Neural Networks

Shao-Ping Lu, Rong Wang, Tao Zhong, Paul L. Rosin

Many attempts have been made to hide information in images, where the main challenge is how to increase the payload capacity without the container image being detected as containing a message. [Expand]

Wednesday Poster Session

CGA-Net: Category Guided Aggregation for Point Cloud Semantic Segmentation

Tao Lu, Limin Wang, Gangshan Wu

Previous point cloud semantic segmentation networks use the same process to aggregate features from neighbors of the same category and different categories. [Expand]

Xiaolei Lv, Shengchu Zhao, Xinyang Yu, Binqiang Zhao

Recognition and reconstruction of residential floor plan drawings are important and challenging in design, decoration, and architectural remodeling fields. [Expand]

Friday Poster Session

Towards Evaluating and Training Verifiably Robust Neural Networks

Zhaoyang Lyu, Minghao Guo, Tong Wu, Guodong Xu, Kehuan Zhang, Dahua Lin

Recent works have shown that interval bound propagation (IBP) can be used to train verifiably robust neural networks. [Expand]

Tuesday Poster Session

Efficient Multi-Stage Video Denoising With Recurrent Spatio-Temporal Fusion

Matteo Maggioni, Yibin Huang, Cheng Li, Shuai Xiao, Zhongqian Fu, Fenglong Song

In recent years, denoising methods based on deep learning have achieved unparalleled performance at the cost of large computational complexity. [Expand]

Tuesday Poster Session

MultiLink: Multi-Class Structure Recovery via Agglomerative Clustering and Model Selection

Luca Magri, Filippo Leveni, Giacomo Boracchi

We address the problem of recovering multiple structures of different classes in a dataset contaminated by noise and outliers. [Expand]

Monday Poster Session

Gradient Forward-Propagation for Large-Scale Temporal Video Modelling

Mateusz Malinowski, Dimitrios Vytiniotis, Grzegorz Swirszcz, Viorica Patraucean, Joao Carreira

How can neural networks be trained on large-volume temporal data efficiently? To compute the gradients required to update parameters, backpropagation blocks computations until the forward and backward passes are completed. [Expand]

Wednesday Poster Session

Magic Layouts: Structural Prior for Component Detection in User Interface Designs

Dipu Manandhar, Hailin Jin, John Collomosse

We present Magic Layouts, a method for parsing screenshots or hand-drawn sketches of user interface (UI) layouts. [Expand]

Friday Poster Session

CapsuleRRT: Relationships-Aware Regression Tracking via Capsules

Ding Ma, Xiangqian Wu

Regression tracking has gained more and more attention thanks to its easy-to-implement characteristics, while existing regression trackers rarely consider the relationships between the object parts and the complete object. [Expand]

Wednesday Poster Session

Weakly Supervised Action Selection Learning in Video

Junwei Ma, Satya Krishna Gorti, Maksims Volkovs, Guangwei Yu

Localizing actions in video is a core task in computer vision. [Expand]

Wednesday Poster Session

Image Super-Resolution With Non-Local Sparse Attention

Yiqun Mei, Yuchen Fan, Yuqian Zhou

Both non-local (NL) operation and sparse representation are crucial for Single Image Super-Resolution (SISR). [Expand]

Tuesday Poster Session

Real-Time Sphere Sweeping Stereo From Multiview Fisheye Images

Andreas Meuleman, Hyeonjoong Jang, Daniel S. Jeon, Min H. Kim

A set of cameras with fisheye lenses has been used to capture a wide field of view. [Expand]

Thursday Poster Session

VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild

Jiaxu Miao, Yunchao Wei, Yu Wu, Chen Liang, Guangrui Li, Yi Yang

In this paper, we present a new dataset with the target of advancing the scene parsing task from images to videos. [Expand]

Tuesday Poster Session

PVGNet: A Bottom-Up One-Stage 3D Object Detector With Integrated Multi-Level Features

Norman Muller, Yu-Shiang Wong, Niloy J. Mitra, Angela Dai, Matthias Niessner

Multi-object tracking from RGB-D video sequences is a challenging problem due to the combination of changing viewpoints, motion, and occlusions over time. [Expand]

Tuesday Poster Session

Extreme Low-Light Environment-Driven Image Denoising Over Permanently Shadowed Lunar Regions With a Physical Noise Model

Ben Moseley, Valentin Bickel, Ignacio G. Lopez-Francos, Loveneesh Rana

Recently, learning-based approaches have achieved impressive results in the field of low-light image denoising. [Expand]

Tuesday Poster Session

Interventional Video Grounding With Dual Contrastive Learning

Guoshun Nan, Rui Qiao, Yao Xiao, Jun Liu, Sicong Leng, Hao Zhang, Wei Lu

Video grounding aims to localize a moment from an untrimmed video for a given textual query. [Expand]

Monday Poster Session

All Labels Are Not Created Equal: Enhancing Semi-Supervision via Label Grouping and Co-Training

Islam Nassar, Samitha Herath, Ehsan Abbasnejad, Wray Buntine, Gholamreza Haffari

Pseudo-labeling is a key component in semi-supervised learning (SSL). [Expand]

Wednesday Poster Session

Divide-and-Conquer for Lane-Aware Diverse Trajectory Prediction

Sriram Narayanan, Ramin Moslemi, Francesco Pittaluga, Buyu Liu, Manmohan Chandraker

Trajectory prediction is a safety-critical tool for autonomous vehicles to plan and execute actions. [Expand]

Friday Poster Session

FixBi: Bridging Domain Spaces for Unsupervised Domain Adaptation

Jaemin Na, Heechul Jung, Hyung Jin Chang, Wonjun Hwang

Unsupervised domain adaptation (UDA) methods for learning domain invariant representations have achieved remarkable progress. [Expand]

Monday Poster Session

Pedestrian and Ego-Vehicle Trajectory Prediction From Monocular Camera

Lukas Neumann, Andrea Vedaldi

Predicting future pedestrian trajectory is a crucial component of autonomous driving systems, as recognizing critical situations based only on current pedestrian position may come too late for any meaningful corrective action (e.g. [Expand]

Wednesday Poster Session

Dictionary-Guided Scene Text Recognition

Nguyen Nguyen, Thu Nguyen, Vinh Tran, Minh-Triet Tran, Thanh Duc Ngo, Thien Huu Nguyen, Minh Hoai

Language prior plays an important role in the way humans perceive and recognize text in the wild. [Expand]

Wednesday Poster Session

Discovering Relationships Between Object Categories via Universal Canonical Maps

Natalia Neverova, Artsiom Sanakoyeu, Patrick Labatut, David Novotny, Andrea Vedaldi

We tackle the problem of learning the geometry of multiple categories of deformable objects jointly. [Expand]

Youngmin Oh, Beomjun Kim, Bumsub Ham

We address the problem of weakly-supervised semantic segmentation (WSSS) using bounding box annotations. [Expand]

Tuesday Poster Session

Protecting Intellectual Property of Generative Adversarial Networks From Ambiguity Attacks

Ding Sheng Ong, Chee Seng Chan, Kam Woh Ng, Lixin Fan, Qiang Yang

Ever since Machine Learning as a Service emerged as a viable business that uses deep learning models to generate lucrative revenue, Intellectual Property Right (IPR) has become a major concern because these deep learning models can easily be replicated, shared, and redistributed by unauthorized third parties. [Expand]

Tuesday Poster Session

A Quasiconvex Formulation for Radial Cameras

Carl Olsson, Viktor Larsson, Fredrik Kahl

In this paper we study structure from motion problems for 1D radial cameras. [Expand]

Thursday Poster Session

Bilinear Parameterization for Non-Separable Singular Value Penalties

Marcus Valtonen Ornhag, Jose Pedro Iglesias, Carl Olsson

Low rank inducing penalties have been proven to successfully uncover fundamental structures considered in computer vision and machine learning; however, such methods generally lead to non-convex optimization problems. [Expand]

Tuesday Poster Session

Neural Auto-Exposure for High-Dynamic Range Object Detection

Emmanuel Onzon, Fahim Mannan, Felix Heide

Real-world scenes have a dynamic range of up to 280 dB that today's imaging sensors cannot directly capture. [Expand]

Wednesday Poster Session

SDD-FIQA: Unsupervised Face Image Quality Assessment With Similarity Distribution Distance

Fu-Zhao Ou, Xingyu Chen, Ruixin Zhang, Yuge Huang, Shaoxin Li, Jilin Li, Yong Li, Liujuan Cao, Yuan-Gen Wang

In recent years, Face Image Quality Assessment (FIQA) has become an indispensable part of the face recognition system to guarantee the stability and reliability of recognition performance in an unconstrained scenario. [Expand]

Wednesday Poster Session

Fast Sinkhorn Filters: Using Matrix Scaling for Non-Rigid Shape Correspondence With Functional Maps

Gautam Pai, Jing Ren, Simone Melzi, Peter Wonka, Maks Ovsjanikov

In this paper, we provide a theoretical foundation for pointwise map recovery from functional maps and highlight its relation to a range of shape correspondence methods based on spectral alignment. [Expand]

Monday Poster Session

Synthesize-It-Classifier: Learning a Generative Classifier Through Recurrent Self-Analysis

Arghya Pal, Raphael C.-W. Phan, KokSheik Wong

In this work, we show the generative capability of an image classifier network by synthesizing high-resolution, photo-realistic, and diverse images at scale. [Expand]

Tuesday Poster Session

Generalization on Unseen Domains via Inference-Time Label-Preserving Target Projections

Prashant Pandey, Mrigank Raman, Sumanth Varambally, Prathosh AP

Generalizing machine learning models trained on a set of source domains to unseen target domains with different statistics is a challenging problem. [Expand]

Thursday Poster Session

Trajectory Prediction With Latent Belief Energy-Based Model

Bo Pang, Tianyang Zhao, Xu Xie, Ying Nian Wu

Human trajectory prediction is critical for autonomous platforms like self-driving cars or social robots. [Expand]

Thursday Poster Session

Recorrupted-to-Recorrupted: Unsupervised Deep Learning for Image Denoising

Tongyao Pang, Huan Zheng, Yuhui Quan, Hui Ji

Deep denoiser, the deep network for denoising, has been the focus of the recent development on image denoising. [Expand]

Monday Poster Session

Unsupervised Hyperbolic Representation Learning via Message Passing Auto-Encoders

Jiwoong Park, Junho Cho, Hyung Jin Chang, Jin Young Choi

Most of the existing literature on hyperbolic embedding concentrates on supervised learning, whereas the use of unsupervised hyperbolic embedding is less well explored. [Expand]

Tuesday Poster Session

Learning To Predict Visual Attributes in the Wild

Khoi Pham, Kushal Kafle, Zhe Lin, Zhihong Ding, Scott Cohen, Quan Tran, Abhinav Shrivastava

Visual attributes constitute a large portion of information contained in a scene. [Expand]

Thursday Poster Session

SliceNet: Deep Dense Depth Estimation From a Single Indoor Panorama Using a Slice-Based Representation

Giovanni Pintore, Marco Agus, Eva Almansa, Jens Schneider, Enrico Gobbetti

We introduce a novel deep neural network to estimate a depth map from a single monocular indoor panorama. [Expand]

Thursday Poster Session

Recognizing Actions in Videos From Unseen Viewpoints

AJ Piergiovanni, Michael S. Ryoo

Standard methods for video recognition use large CNNs designed to capture spatio-temporal data. [Expand]

Tuesday Poster Session

CompositeTasking: Understanding Images by Spatial Composition of Tasks

Nikola Popovic, Danda Pani Paudel, Thomas Probst, Guolei Sun, Luc Van Gool

We define the concept of CompositeTasking as the fusion of multiple, spatially distributed tasks, for various aspects of image understanding. [Expand]

Tuesday Poster Session

A Functional Approach to Rotation Equivariant Non-Linearities for Tensor Field Networks.

Adrien Poulenard, Leonidas J. Guibas

Learning pose invariant representation is a fundamental problem in shape analysis. [Expand]

Thursday Poster Session

Labeled From Unlabeled: Exploiting Unlabeled Data for Few-Shot Deep HDR Deghosting

K. Ram Prabhakar, Gowtham Senthil, Susmit Agrawal, R. Venkatesh Babu, Rama Krishna Sai S Gorthi

High Dynamic Range (HDR) deghosting is an indispensable tool in capturing wide dynamic range scenes without ghosting artifacts. [Expand]

Tuesday Poster Session

Deep Multi-Task Learning for Joint Localization, Perception, and Prediction

John Phillips, Julieta Martinez, Ioan Andrei Barsan, Sergio Casas, Abbas Sadat, Raquel Urtasun

Over the last few years, we have witnessed tremendous progress on many subtasks of autonomous driving including perception, motion forecasting, and motion planning. [Expand]

Tuesday Poster Session

BABEL: Bodies, Action and Behavior With English Labels

Abhinanda R. Punnakkal, Arjun Chandrasekaran, Nikos Athanasiou, Alejandra Quiros-Ramirez, Michael J. Black

Understanding the semantics of human movement -- the what, how and why of the movement -- is an important problem that requires datasets of human actions with semantic labels. [Expand]

Monday Poster Session

Boosting Video Representation Learning With Multi-Faceted Integration

Zhaofan Qiu, Ting Yao, Chong-Wah Ngo, Xiao-Ping Zhang, Dong Wu, Tao Mei

Video content is multifaceted, consisting of objects, scenes, interactions or actions. [Expand]

Thursday Poster Session

Effective Snapshot Compressive-Spectral Imaging via Deep Denoising and Total Variation Priors

Haiquan Qiu, Yao Wang, Deyu Meng

Snapshot compressive imaging (SCI) is a new type of compressive imaging system that compresses multiple frames of images into a single snapshot measurement, which enjoys low cost, low bandwidth, and high-speed sensing rate. [Expand]

Wednesday Poster Session

PQA: Perceptual Question Answering

Yonggang Qi, Kai Zhang, Aneeshan Sain, Yi-Zhe Song

Perceptual organization remains one of the very few established theories on the human visual system. [Expand]

Thursday Poster Session

Multi-Scale Aligned Distillation for Low-Resolution Detection

Nicolas Robidoux, Luis E. Garcia Capel, Dong-eun Seo, Avinash Sharma, Federico Ariza, Felix Heide

With a 280 dB dynamic range, the real world is a High Dynamic Range (HDR) world. [Expand]

Tuesday Poster Session

Gaussian Context Transformer

Dongsheng Ruan, Daiyin Wang, Yuan Zheng, Nenggan Zheng, Min Zheng

Recently, a large number of channel attention blocks have been proposed to boost the representational power of deep convolutional neural networks (CNNs). [Expand]

Thursday Poster Session

Learning-Based Image Registration With Meta-Regularization

Ebrahim Al Safadi, Xubo Song

We introduce a meta-regularization framework for learning-based image registration. [Expand]

Wednesday Poster Session

Learning an Explicit Weighting Scheme for Adapting Complex HSI Noise

Xiangyu Rui, Xiangyong Cao, Qi Xie, Zongsheng Yue, Qian Zhao, Deyu Meng

A general approach for handling hyperspectral image (HSI) denoising issue is to impose weights on different HSI pixels to suppress negative influence brought by noisy elements. [Expand]

Tuesday Poster Session

Multi-Perspective LSTM for Joint Visual Representation Learning

Alireza Sepas-Moghaddam, Fernando Pereira, Paulo Lobato Correia, Ali Etemad

We present a novel LSTM cell architecture capable of learning both intra- and inter-perspective relationships available in visual sequences captured from multiple perspectives. [Expand]

Friday Poster Session

Introvert: Human Trajectory Prediction via Conditional 3D Attention

Nasim Shafiee, Taskin Padir, Ehsan Elhamifar

Predicting human trajectories is an important component of autonomous moving platforms, such as social robots and self-driving cars. [Expand]

Friday Poster Session

Nighttime Visibility Enhancement by Increasing the Dynamic Range and Suppression of Light Effects

Aashish Sharma, Robby T. Tan

Most existing nighttime visibility enhancement methods focus on low light. [Expand]

Thursday Poster Session

CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching

Zhelun Shen, Yuchao Dai, Zhibo Rao

Recently, the ever-increasing capacity of large-scale annotated datasets has led to profound progress in stereo matching. [Expand]

Thursday Poster Session

Structure-Aware Face Clustering on a Large-Scale Graph With 10^7 Nodes

Shuai Shen, Wanhua Li, Zheng Zhu, Guan Huang, Dalong Du, Jiwen Lu, Jie Zhou

Face clustering is a promising method for annotating unlabeled face images. [Expand]

Wednesday Poster Session

Toward Joint Thing-and-Stuff Mining for Weakly Supervised Panoptic Segmentation

Yunhang Shen, Liujuan Cao, Zhiwei Chen, Feihong Lian, Baochang Zhang, Chi Su, Yongjian Wu, Feiyue Huang, Rongrong Ji

Panoptic segmentation aims to partition an image into object instances and semantic content for thing and stuff categories, respectively. [Expand]

Friday Poster Session

clDice - A Novel Topology-Preserving Loss Function for Tubular Structure Segmentation

Suprosanna Shit, Johannes C. Paetzold, Anjany Sekuboyina, Ivan Ezhov, Alexander Unger, Andrey Zhylka, Josien P. W. Pluim, Ulrich Bauer, Bjoern H. Menze

Accurate segmentation of tubular, network-like structures, such as vessels, neurons, or roads, is relevant to many fields of research. [Expand]

Friday Poster Session

Hierarchical Layout-Aware Graph Convolutional Network for Unified Aesthetics Assessment

Dongyu She, Yu-Kun Lai, Gaoxiong Yi, Kun Xu

Learning computational models of image aesthetics can have a substantial impact on visual art and graphic design. [Expand]

Wednesday Poster Session

Learning by Planning: Language-Guided Global Image Editing

Monday Poster Session

Hybrid Message Passing With Performance-Driven Structures for Facial Action Unit Detection

Tengfei Song, Zijun Cui, Wenming Zheng, Qiang Ji

Message passing neural network has been an effective method to represent dependencies among nodes by propagating messages. [Expand]

Tuesday Poster Session

Mesh Saliency: An Independent Perceptual Measure or a Derivative of Image Saliency?

Monday Poster Session

Indoor Panorama Planar 3D Reconstruction via Divide and Conquer

Cheng Sun, Chi-Wei Hsiao, Ning-Hsu Wang, Min Sun, Hwann-Tzong Chen

Indoor panorama typically consists of human-made structures parallel or perpendicular to gravity. [Expand]

Thursday Poster Session

Learning View Selection for 3D Scenes

Yifan Sun, Qixing Huang, Dun-Yu Hsiao, Li Guan, Gang Hua

Efficient 3D space sampling to represent an underlying 3D object/scene is essential for 3D vision, robotics, and beyond. [Expand]

Thursday Poster Session

Deep Video Matting via Spatio-Temporal Alignment and Aggregation

Yanan Sun, Guanzhi Wang, Qiao Gu, Chi-Keung Tang, Yu-Wing Tai

Despite the significant progress made by deep learning in natural image matting, there has been so far no representative work on deep learning for video matting due to the inherent technical challenges in reasoning temporal domain and lack of large-scale video matting datasets. [Expand]

Tuesday Poster Session

Lesion-Aware Transformers for Diabetic Retinopathy Grading

Yanan Sun, Chi-Keung Tang, Yu-Wing Tai

Natural image matting separates the foreground from background in fractional occupancy which can be caused by highly transparent objects, complex foreground (e.g., net or tree), and/or objects containing very fine details (e.g., hairs). [Expand]

Wednesday Poster Session

Tuning IR-Cut Filter for Illumination-Aware Spectral Reconstruction From RGB

Bo Sun, Junchi Yan, Xiao Zhou, Yinqiang Zheng

Reconstructing spectral signals from multi-channel observations, in particular trichromatic RGBs, has recently emerged as a promising alternative to traditional scanning-based spectral imagers. [Expand]

Monday Poster Session

Uncertainty Reduction for Model Adaptation in Semantic Segmentation

Friday Poster Session

Mirror3D: Depth Refinement for Mirror Surfaces

Jiaqi Tan, Weijie Lin, Angel X. Chang, Manolis Savva

Despite recent progress in depth sensing and 3D reconstruction, mirror surfaces are a significant source of errors. [Expand]

Phi Vu Tran

Recent years have seen flourishing research on both semi-supervised learning and 3D room layout reconstruction. [Expand]

Thursday Poster Session

ColorRL: Reinforced Coloring for End-to-End Instance Segmentation

Tran Anh Tuan, Nguyen Tuan Khoa, Tran Minh Quan, Won-Ki Jeong

Instance segmentation, the task of identifying and separating each individual object of interest in the image, is one of the actively studied research topics in computer vision. [Expand]

Friday Poster Session

Time Lens: Event-Based Video Frame Interpolation

Stepan Tulyakov, Daniel Gehrig, Stamatios Georgoulis, Julius Erbach, Mathias Gehrig, Yuanyou Li, Davide Scaramuzza

State-of-the-art frame interpolation methods generate intermediate frames by inferring object motions in the image from consecutive key-frames. [Expand]

Friday Poster Session

Uncertainty-Aware Camera Pose Estimation From Points and Lines

Alexander Vakhitov, Luis Ferraz, Antonio Agudo, Francesc Moreno-Noguer

Perspective-n-Point-and-Line (PnPL) algorithms aim at fast, accurate, and robust camera localization with respect to a 3D model from 2D-3D feature correspondences, being a major part of modern robotic and AR/VR systems. [Expand]

Tuesday Poster Session

Can We Characterize Tasks Without Labels or Features?

Bram Wallace, Ziyang Wu, Bharath Hariharan

The problem of expert model selection deals with choosing the appropriate pretrained network ("expert") to transfer to a target task. [Expand]

Two-view structure-from-motion (SfM) is the cornerstone of 3D reconstruction and visual SLAM. [Expand]

Wednesday Poster Session

Domain-Specific Suppression for Adaptive Object Detection

Yu Wang, Rui Zhang, Shuo Zhang, Miao Li, Yangyang Xia, Xishan Zhang, Shaoli Liu

Domain adaptation methods face performance degradation in object detection, as the complexity of tasks require more about the transferability of the model. [Expand]

Wednesday Poster Session

Dual Attention Suppression Attack: Generate Adversarial Camouflage in Physical World

Jiakai Wang, Aishan Liu, Zixin Yin, Shunchang Liu, Shiyu Tang, Xianglong Liu

Deep learning models are vulnerable to adversarial examples. [Expand]

Wednesday Poster Session

EvDistill: Asynchronous Events To End-Task Learning via Bidirectional Reconstruction-Guided Cross-Modal Knowledge Distillation

Lin Wang, Yujeong Chae, Sung-Hoon Yoon, Tae-Kyun Kim, Kuk-Jin Yoon

Event cameras sense per-pixel intensity changes and produce asynchronous event streams with high dynamic range and less motion blur, showing advantages over the conventional cameras. [Expand]

Monday Poster Session

FAIEr: Fidelity and Adequacy Ensured Image Caption Evaluation

Sijin Wang, Ziwei Yao, Ruiping Wang, Zhongqin Wu, Xilin Chen

Image caption evaluation is a crucial task, which involves the semantic perception and matching of image and text. [Expand]

Thursday Poster Session

FESTA: Flow Estimation via Spatial-Temporal Attention for Scene Point Clouds

Monday Poster Session

PointAugmenting: Cross-Modal Augmentation for 3D Object Detection

Chunwei Wang, Chao Ma, Ming Zhu, Xiaokang Yang

Camera and LiDAR are two complementary sensors for 3D object detection in the autonomous driving context. [Expand]

Thursday Poster Session

Pseudo Facial Generation With Extreme Poses for Face Recognition

Guoli Wang, Jiaqi Ma, Qian Zhang, Jiwen Lu, Jie Zhou

Although face recognition has achieved great success in recent years, it is still challenging to recognize facial images with extreme poses. [Expand]

Monday Poster Session

Representative Forgery Mining for Fake Face Detection

Tuesday Poster Session

A Generalized Loss Function for Crowd Counting and Localization

Jia Wan, Ziquan Liu, Antoni B. Chan

Previous work shows that a better density map representation can improve the performance of crowd counting. [Expand]

Monday Poster Session

Self-Attention Based Text Knowledge Mining for Text Detection

Qi Wan, Haoqin Ji, Linlin Shen

Pre-trained models play an important role in deep learning based text detectors. [Expand]

Tuesday Poster Session

MetaAlign: Coordinating Domain Alignment and Classification for Unsupervised Domain Adaptation

Guoqiang Wei, Cuiling Lan, Wenjun Zeng, Zhibo Chen

For unsupervised domain adaptation (UDA), to alleviate the effect of domain shift, many approaches align the source and target domains in the feature space by adversarial learning or by explicitly aligning their statistics. [Expand]

Friday Poster Session

Shallow Feature Matters for Weakly Supervised Object Localization

Jun Wei, Qin Wang, Zhen Li, Sheng Wang, S. Kevin Zhou, Shuguang Cui

Weakly supervised object localization (WSOL) aims to localize objects by only utilizing image-level labels. [Expand]

Tuesday Poster Session

Autoregressive Stylized Motion Synthesis With Generative Flow

Yu-Hui Wen, Zhipeng Yang, Hongbo Fu, Lin Gao, Yanan Sun, Yong-Jin Liu

Motion style transfer is an important problem in many computer graphics and computer vision applications, including human animation, games, and robotics. [Expand]

Thursday Poster Session

Holistic 3D Human and Scene Mesh Estimation From Single View Images

Zhenzhen Weng, Serena Yeung

The 3D world limits the human body pose and the human body pose conveys information about the surrounding objects. [Expand]

Monday Poster Session

Learning Progressive Point Embeddings for 3D Point Cloud Generation

Cheng Wen, Baosheng Yu, Dacheng Tao

Generative models for 3D point clouds are extremely important for scene/object reconstruction applications in autonomous driving and robotics. [Expand]

Wednesday Poster Session

PMP-Net: Point Cloud Completion by Learning Multi-Step Point Moving Paths

Zhengqin Xu, Rui He, Shoulie Xie, Shiqian Wu

Robust principal component analysis (RPCA) and its variants have gained wide applications in computer vision. [Expand]

Tuesday Poster Session

Consistent Instance False Positive Improves Fairness in Face Recognition

Xingkun Xu, Yuge Huang, Pengcheng Shen, Shaoxin Li, Jilin Li, Feiyue Huang, Yong Li, Zhen Cui

Demographic bias is a significant challenge in practical face recognition systems. [Expand]

Monday Poster Session

Discrimination-Aware Mechanism for Fine-Grained Representation Learning

Furong Xu, Meng Wang, Wei Zhang, Yuan Cheng, Wei Chu

Recently, with the emergence of retrieval requirements for certain individual in the same superclass, e.g., birds, persons, cars, fine-grained recognition task has attracted a significant amount of attention from academia and industry. [Expand]

Monday Poster Session

Layer-Wise Searching for 1-Bit Detectors

Sheng Xu, Junhe Zhao, Jinhu Lu, Baochang Zhang, Shumin Han, David Doermann

1-bit detectors show great promise for resource-constrained embedded devices but often suffer from a significant performance gap compared with their real-valued counterparts. [Expand]

Tuesday Poster Session

SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning Over Traffic Events

Li Xu, He Huang, Jun Liu

Traffic event cognition and reasoning in videos is an important task that has a wide range of applications in intelligent transportation, assisted driving, and autonomous vehicles. [Expand]

Wednesday Poster Session

Towards Accurate Text-Based Image Captioning With Content Diversity Exploration

Guanghui Xu, Shuaicheng Niu, Mingkui Tan, Yucheng Luo, Qing Du, Qi Wu

Text-based image captioning (TextCap) which aims to read and reason images with texts is crucial for a machine to understand a detailed and complex scene environment, considering that texts are omnipresent in daily life. [Expand]

Thursday Poster Session

A Circular-Structured Representation for Visual Emotion Distribution Learning

Jingyuan Yang, Jie Li, Leida Li, Xiumei Wang, Xinbo Gao

Visual Emotion Analysis (VEA) has attracted increasing attention recently with the prevalence of sharing images on social networks. [Expand]

Tuesday Poster Session

Bottom-Up Shift and Reasoning for Referring Image Segmentation

Sibei Yang, Meng Xia, Guanbin Li, Hong-Yu Zhou, Yizhou Yu

Referring image segmentation aims to segment the referent that is the corresponding object or stuff referred by a natural language expression in an image.

Beyond achieving high performance across many vision tasks, multimodal models are expected to be robust to single-source faults due to the availability of redundant information between modalities.

Guoxing Yang, Nanyi Fei, Mingyu Ding, Guangzhen Liu, Zhiwu Lu, Tao Xiang

A deep facial attribute editing model strives to meet two requirements: (1) attribute correctness -- the target attribute should correctly appear on the edited face image; (2) irrelevance preservation -- any irrelevant information (e.g., identity) should not be changed after editing.

Tuesday Poster Session

LayoutTransformer: Scene Layout Generation With Conceptual and Spatial Diversity

Cheng-Fu Yang, Wan-Cyuan Fan, Fu-En Yang, Yu-Chiang Frank Wang

When translating text inputs into layouts or images, existing works typically require explicit descriptions of each object in a scene, including their spatial information or the associated relationships.

Tuesday Poster Session

Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking

Yiding Yang, Zhou Ren, Haoxiang Li, Chunluan Zhou, Xinchao Wang, Gang Hua

Multi-person pose estimation and tracking serve as crucial steps for video understanding.

Wednesday Poster Session

Mol2Image: Improved Conditional Flow Models for Molecule to Image Synthesis

Karren Yang, Samuel Goldman, Wengong Jin, Alex X. Lu, Regina Barzilay, Tommi Jaakkola, Caroline Uhler

In this paper, we aim to synthesize cell microscopy images under different molecular interventions, motivated by practical applications to drug development.

Tuesday Poster Session

Partially View-Aligned Representation Learning With Noise-Robust Contrastive Loss

Mouxing Yang, Yunfan Li, Zhenyu Huang, Zitao Liu, Peng Hu, Xi Peng

In real-world applications, it is common that only a portion of data is aligned across views due to spatial, temporal, or spatiotemporal asynchronism, thus leading to the so-called Partially View-aligned Problem (PVP).

Monday Poster Session

Progressively Complementary Network for Fisheye Image Rectification Using Appearance Flow

Wednesday Poster Session

Discrete-Continuous Action Space Policy Gradient-Based Attention for Image-Text Matching

Wednesday Poster Session

Adversarial Invariant Learning

Nanyang Ye, Jingxuan Tang, Huayu Deng, Xiao-Yun Zhou, Qianxiao Li, Zhenguo Li, Guang-Zhong Yang, Zhanxing Zhu

Though machine learning algorithms are able to achieve pattern recognition from the correlation between data and labels, the presence of spurious features in the data decreases the robustness of these learned relationships with respect to varied testing environments.

Thursday Poster Session

Linguistic Structures As Weak Supervision for Visual Scene Graph Generation

Keren Ye, Adriana Kovashka

Prior work in scene graph generation requires categorical supervision at the level of triplets---subjects and objects, and predicates that relate them, either with or without bounding box information.

Wednesday Poster Session

Iso-Points: Optimizing Neural Implicit Surfaces With Hybrid Representations

Tuesday Poster Session

Multi-Modal Relational Graph for Cross-Modal Video Moment Retrieval

Yawen Zeng, Da Cao, Xiaochi Wei, Meng Liu, Zhou Zhao, Zheng Qin

Given an untrimmed video and a query sentence, cross-modal video moment retrieval aims to rank a video moment from pre-segmented video moment candidates that best matches the query sentence.

Monday Poster Session

Out-of-Distribution Detection Using Union of 1-Dimensional Subspaces

Alireza Zaeemzadeh, Niccolo Bisagno, Zeno Sambugaro, Nicola Conci, Nazanin Rahnavard, Mubarak Shah

The goal of out-of-distribution (OOD) detection is to handle the situations where the test samples are drawn from a different distribution than the training data.

Wednesday Poster Session

Hyper-LifelongGAN: Scalable Lifelong Learning for Image Conditioned Generation

Mengyao Zhai, Lei Chen, Greg Mori

Deep neural networks are susceptible to catastrophic forgetting: when encountering a new task, they can only remember the new task and fail to preserve its ability to accomplish previously learned tasks.

Monday Poster Session

ABMDRNet: Adaptive-Weighted Bi-Directional Modality Difference Reduction Network for RGB-T Semantic Segmentation

Qiang Zhang, Shenlu Zhao, Yongjiang Luo, Dingwen Zhang, Nianchang Huang, Jungong Han

Semantic segmentation models gain robustness against poor lighting conditions by virtue of complementary information from visible (RGB) and thermal images.

Monday Poster Session

Accurate Few-Shot Object Detection With Support-Query Mutual Guidance and Hybrid Loss

Lu Zhang, Shuigeng Zhou, Jihong Guan, Ji Zhang

Most object detection methods require huge amounts of annotated data and can detect only the categories that appear in the training set.

Thursday Poster Session

Attention-Guided Image Compression by Deep Reconstruction of Compressive Sensed Saliency Skeleton

Xi Zhang, Xiaolin Wu

We propose a deep learning system for attention-guided dual-layer image compression (AGDL).

Thursday Poster Session

Coarse-To-Fine Person Re-Identification With Auxiliary-Domain Classification and Second-Order Information Bottleneck

Anguo Zhang, Yueming Gao, Yuzhen Niu, Wenxi Liu, Yongcheng Zhou

Person re-identification (Re-ID) is to retrieve a particular person captured by different cameras, which is of great significance for security surveillance and pedestrian behavior analysis.

Monday Poster Session

Confluent Vessel Trees With Accurate Bifurcations

Thursday Poster Session

iVPF: Numerical Invertible Volume Preserving Flow for Efficient Lossless Compression

Shifeng Zhang, Chen Zhang, Ning Kang, Zhenguo Li

It is nontrivial to store rapidly growing big data nowadays, which demands high-performance lossless compression techniques.

Monday Poster Session

Flow-Guided One-Shot Talking Face Generation With a High-Resolution Audio-Visual Dataset

Zhimeng Zhang, Lincheng Li, Yu Ding, Changjie Fan

One-shot talking face generation should synthesize high visual quality facial videos with reasonable animations of expression and head pose, and just utilize arbitrary driving audio and arbitrary single face image as the source.

Tuesday Poster Session

Keypoint-Graph-Driven Learning Framework for Object Pose Estimation

Shaobo Zhang, Wanqing Zhao, Ziyu Guan, Xianlin Peng, Jinye Peng

Many recent 6D pose estimation methods exploited object 3D models to generate synthetic images for training because labels come for free.

Monday Poster Session

Learning by Watching

Tuesday Poster Session

Posterior Promoted GAN With Distribution Discriminator for Unsupervised Image Synthesis

Xianchao Zhang, Ziyang Cheng, Xiaotong Zhang, Han Liu

Sufficient real information in generator is a critical point for the generation ability of GAN.

Tuesday Poster Session

Person Re-Identification Using Heterogeneous Local Graph Attention Networks

Zhong Zhang, Haijia Zhang, Shuang Liu

Recently, some methods have focused on learning local relation among parts of pedestrian images for person re-identification (Re-ID), as it offers powerful representation capabilities.

Thursday Poster Session

Physics-Based Iterative Projection Complex Neural Network for Phase Retrieval in Lensless Microscopy Imaging

Feilong Zhang, Xianming Liu, Cheng Guo, Shiyi Lin, Junjun Jiang, Xiangyang Ji

Phase retrieval from intensity-only measurements plays a central role in many real-world imaging tasks.

Wednesday Poster Session

PSRR-MaxpoolNMS: Pyramid Shifted MaxpoolNMS With Relationship Recovery

Wednesday Poster Session

Sparse Multi-Path Corrections in Fringe Projection Profilometry

Yu Zhang, Daniel Lau, David Wipf

Three-dimensional scanning by means of structured light illumination is an active imaging technique involving projecting and capturing a series of striped patterns and then using the observed warping of stripes to reconstruct the target object's surface through triangulating each pixel in the camera to a unique projector coordinate corresponding to a particular feature in the projected patterns.

Thursday Poster Session

SRDAN: Scale-Aware and Range-Aware Domain Adaptation Network for Cross-Dataset 3D Object Detection

Weichen Zhang, Wen Li, Dong Xu

Geometric characteristic plays an important role in the representation of an object in 3D point clouds.

Tuesday Poster Session

TSGCNet: Discriminative Geometric Feature Learning With Two-Stream Graph Convolutional Network for 3D Dental Model Segmentation

Lingming Zhang, Yue Zhao, Deyu Meng, Zhiming Cui, Chenqiang Gao, Xinbo Gao, Chunfeng Lian, Dinggang Shen

The ability to segment teeth precisely from digitized 3D dental models is an essential task in computer-aided orthodontic surgical planning.

Tuesday Poster Session

Unbalanced Feature Transport for Exemplar-Based Image Translation

Fangneng Zhan, Yingchen Yu, Kaiwen Cui, Gongjie Zhang, Shijian Lu, Jianxiong Pan, Changgong Zhang, Feiying Ma, Xuansong Xie, Chunyan Miao

Despite the great success of GANs in images translation with different conditioned inputs such as semantic segmentation and edge map, generating high-fidelity images with reference styles from exemplars remains a grand challenge in conditional image-to-image translation.

Thursday Poster Session

3D Graph Anatomy Geometry-Integrated Network for Pancreatic Mass Segmentation, Diagnosis, and Quantitative Patient Management

Tianyi Zhao, Kai Cao, Jiawen Yao, Isabella Nogues, Le Lu, Lingyun Huang, Jing Xiao, Zhaozheng Yin, Ling Zhang

The pancreatic disease taxonomy includes ten types of masses (tumors or cysts) [20, 8].

Thursday Poster Session

Deep Lucas-Kanade Homography for Multimodal Image Alignment

Monday Poster Session

High-Speed Image Reconstruction Through Short-Term Plasticity for Spiking Cameras

Yajing Zheng, Lingxiao Zheng, Zhaofei Yu, Boxin Shi, Yonghong Tian, Tiejun Huang

Fovea, located in the centre of the retina, is specialized for high-acuity vision.

Xubin Zhong, Xian Qu, Changxing Ding, Dacheng Tao

Modern human-object interaction (HOI) detection approaches can be divided into one-stage methods and two-stage ones.

Thursday Poster Session

DAP: Detection-Aware Pre-Training With Weak Supervision

Yuanyi Zhong, Jianfeng Wang, Lijuan Wang, Jian Peng, Yu-Xiong Wang, Lei Zhang

This paper presents a detection-aware pre-training (DAP) approach, which leverages only weakly-labeled classification-style datasets (e.g., ImageNet) for pre-training, but is specifically tailored to benefit object detection tasks.

Tuesday Poster Session

Neighborhood Contrastive Learning for Novel Class Discovery

Zhun Zhong, Enrico Fini, Subhankar Roy, Zhiming Luo, Elisa Ricci, Nicu Sebe

In this paper, we address Novel Class Discovery (NCD), the task of unveiling new classes in a set of unlabeled samples given a labeled dataset with known classes.

Man Zhou, Jie Xiao, Yifan Chang, Xueyang Fu, Aiping Liu, Jinshan Pan, Zheng-Jun Zha

While deep convolutional neural networks (CNNs) have achieved great success on image de-raining task, most existing methods can only learn fixed mapping rules between paired rainy/clean images on a single dataset.

Tuesday Poster Session

Graph-Based High-Order Relation Modeling for Long-Term Action Recognition

Jiaming Zhou, Kun-Yu Lin, Haoxin Li, Wei-Shi Zheng

Long-term actions involve many important visual concepts, e.g., objects, motions, and sub-actions, and there are various relations among these concepts, which we call basic relations.

Wednesday Poster Session

Learning Placeholders for Open-Set Recognition

Da-Wei Zhou, Han-Jia Ye, De-Chuan Zhan

Traditional classifiers are deployed under closed-set setting, with both training and test classes belong to the same set.

Tuesday Poster Session

Monocular 3D Object Detection: An Extrinsic Parameter Free Approach

Yunsong Zhou, Yuan He, Hongzi Zhu, Cheng Wang, Hongyang Li, Qinhong Jiang

Monocular 3D object detection is an important task in autonomous driving.

Wednesday Poster Session

Positive Sample Propagation Along the Audio-Visual Event Line

Jinxing Zhou, Liang Zheng, Yiran Zhong, Shijie Hao, Meng Wang

Visual and audio signals often coexist in natural environments, forming audio-visual events (AVEs).

Wednesday Poster Session

Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation

Monday Poster Session