CVPR Buzz - 2021
Built by Matt Deitke
CVPR Buzz displays the most discussed papers at CVPR 2021 using Twitter for indexing discussions and Semantic Scholar for collecting citation data.
To add data or see how it was collected, check out the GitHub repo:
Meta Pseudo Labels
Hieu Pham, Zihang Dai, Qizhe Xie, Quoc V. Le
We present Meta Pseudo Labels, a semi-supervised learning method that achieves a new state-of-the-art top-1 accuracy of 90.2% on ImageNet, which is 1.6% better than the existing state-of-the-art.
1178.75
36
Thursday Poster Session
Animating Pictures With Eulerian Motion Fields
Aleksander Holynski, Brian L. Curless, Steven M. Seitz, Richard Szeliski
In this paper, we demonstrate a fully automatic method for converting a still image into a realistic animated looping video.
1114.50
2
Tuesday Poster Session
Taming Transformers for High-Resolution Image Synthesis
Patrick Esser, Robin Rombach, Bjorn Ommer
Designed to learn long-range interactions on sequential data, transformers continue to show state-of-the-art results on a wide variety of tasks.
863.25
30
Thursday Poster Session
Real-Time High-Resolution Background Matting
Shanchuan Lin, Andrey Ryabtsev, Soumyadip Sengupta, Brian L. Curless, Steven M. Seitz, Ira Kemelmacher-Shlizerman
We introduce a real-time, high-resolution background replacement technique which operates at 30fps in 4K resolution, and 60fps for HD on a modern GPU.
589.00
4
Wednesday Poster Session
RepVGG: Making VGG-Style ConvNets Great Again
Xiaohan Ding, Xiangyu Zhang, Ningning Ma, Jungong Han, Guiguang Ding, Jian Sun
We present a simple but powerful architecture of convolutional neural network, which has a VGG-like inference-time body composed of nothing but a stack of 3x3 convolution and ReLU, while the training-time model has a multi-branch topology.
561.75
9
Thursday Poster Session
Natural Adversarial Examples
Dan Hendrycks, Kevin Zhao, Steven Basart, Jacob Steinhardt, Dawn Song
We introduce two challenging datasets that reliably cause machine learning model performance to substantially degrade.
506.75
122
Thursday Poster Session
VirTex: Learning Visual Representations From Textual Annotations
Karan Desai, Justin Johnson
The de-facto approach to many vision tasks is to start from pretrained visual representations, typically learned via supervised training on ImageNet.
461.50
36
Wednesday Poster Session
One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing
Ting-Chun Wang, Arun Mallya, Ming-Yu Liu
We propose a neural talking-head video synthesis model and demonstrate its application to video conferencing.
419.50
8
Wednesday Poster Session
Learning Continuous Image Representation With Local Implicit Image Function
Yinbo Chen, Sifei Liu, Xiaolong Wang
How to represent an image? While the visual world is presented in a continuous manner, machines store and see the images in a discrete way with 2D arrays of pixels.
373.75
10
Wednesday Poster Session
Im2Vec: Synthesizing Vector Graphics Without Vector Supervision
Pradyumna Reddy, Michael Gharbi, Michal Lukac, Niloy J. Mitra
Vector graphics are widely used to represent fonts, logos, digital artworks, and graphic designs.
358.75
3
Wednesday Poster Session
Exploring Simple Siamese Representation Learning
Xinlei Chen, Kaiming He
Siamese networks have become a common structure in various recent models for unsupervised visual representation learning.
345.75
112
Friday Poster Session
Bottleneck Transformers for Visual Recognition
Aravind Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani
We present BoTNet, a conceptually simple yet powerful backbone architecture that incorporates self-attention for multiple computer vision tasks including image classification, object detection and instance segmentation.
334.00
46
Friday Poster Session
Involution: Inverting the Inherence of Convolution for Visual Recognition
Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, Qifeng Chen
Convolution has been the core ingredient of modern neural networks, triggering the surge of deep learning in vision.
300.75
6
Thursday Poster Session
Simple Copy-Paste Is a Strong Data Augmentation Method for Instance Segmentation
Golnaz Ghiasi, Yin Cui, Aravind Srinivas, Rui Qian, Tsung-Yi Lin, Ekin D. Cubuk, Quoc V. Le, Barret Zoph
Building instance segmentation models that are data-efficient and can handle rare object categories is an important challenge in computer vision.
289.50
22
Tuesday Poster Session
NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections
Ricardo Martin-Brualla, Noha Radwan, Mehdi S. M. Sajjadi, Jonathan T. Barron, Alexey Dosovitskiy, Daniel Duckworth
We present a learning-based method for synthesizing novel views of complex scenes using only unstructured collections of in-the-wild photographs.
268.50
75
Wednesday Poster Session
Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging
S. Mahdi H. Miangoleh, Sebastian Dille, Long Mai, Sylvain Paris, Yagiz Aksoy
Neural networks have shown great abilities in estimating depth from a single image.
Wednesday Poster Session
Robust Consistent Video Depth Estimation
Johannes Kopf, Xuejian Rong, Jia-Bin Huang
We present an algorithm for estimating consistent dense depth maps and camera poses from a monocular video.
246.50
1
Monday Poster Session
NeX: Real-Time View Synthesis With Neural Basis Expansion
Suttisak Wizadwongsa, Pakkapon Phongthawee, Jiraphon Yenphraphai, Supasorn Suwajanakorn
We present NeX, a new approach to novel view synthesis based on enhancements of multiplane image (MPI) that can reproduce next-level view-dependent effects--in real time.
246.25
6
Wednesday Poster Session
Motion Representations for Articulated Animation
Aliaksandr Siarohin, Oliver J. Woodford, Jian Ren, Menglei Chai, Sergey Tulyakov
We propose novel motion representations for animating articulated objects consisting of distinct parts.
Thursday Poster Session
Omnimatte: Associating Objects and Their Effects in Video
Erika Lu, Forrester Cole, Tali Dekel, Andrew Zisserman, William T. Freeman, Michael Rubinstein
Computer vision has become increasingly better at segmenting objects in images and videos; however, scene effects related to the objects -- shadows, reflections, generated smoke, etc. …
Tuesday Poster Session
Closed-Form Factorization of Latent Semantics in GANs
Yujun Shen, Bolei Zhou
A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images.
211.00
39
Monday Poster Session
Scene Essence
Jiayan Qiu, Yiding Yang, Xinchao Wang, Dacheng Tao
What scene elements, if any, are indispensable for recognizing a scene? We strive to answer this question through the lens of an end-to-end learning scheme.
Wednesday Poster Session
Neural Geometric Level of Detail: Real-Time Rendering With Implicit 3D Shapes
Towaki Takikawa, Joey Litalien, Kangxue Yin, Karsten Kreis, Charles Loop, Derek Nowrouzezahrai, Alec Jacobson, Morgan McGuire, Sanja Fidler
Neural signed distance functions (SDFs) are emerging as an effective representation for 3D shapes.
207.25
13
Thursday Poster Session
Back to the Feature: Learning Robust Camera Localization From Pixels To Pose
Paul-Edouard Sarlin, Ajaykumar Unagar, Mans Larsson, Hugo Germain, Carl Toft, Viktor Larsson, Marc Pollefeys, Vincent Lepetit, Lars Hammarstrand, Fredrik Kahl, Torsten Sattler
Camera pose estimation in known scenes is a 3D geometry task recently tackled by multiple learning algorithms.
198.25
1
Tuesday Poster Session
Holistic 3D Scene Understanding From a Single Image With Implicit Representation
Cheng Zhang, Zhaopeng Cui, Yinda Zhang, Bing Zeng, Marc Pollefeys, Shuaicheng Liu
We present a new pipeline for holistic 3D scene understanding from a single image, which could predict object shape, object pose and scene layout.
Wednesday Poster Session
Pre-Trained Image Processing Transformer
Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao
As the computing power of modern hardware is increasing strongly, pre-trained deep learning models (e.g., BERT, GPT-3) learned on large-scale datasets have shown their effectiveness over conventional methods.
186.75
52
Thursday Poster Session
Stylized Neural Painting
Zhengxia Zou, Tianyang Shi, Shuang Qiu, Yi Yuan, Zhenwei Shi
This paper proposes an image-to-painting translation method that generates vivid and realistic painting artworks with controllable styles.
183.75
1
Friday Poster Session
ArtEmis: Affective Language for Visual Art
Panos Achlioptas, Maks Ovsjanikov, Kilichbek Haydarov, Mohamed Elhoseiny, Leonidas J. Guibas
We present a novel large-scale dataset and accompanying machine learning models aimed at providing a detailed understanding of the interplay between visual content, its emotional effect, and explanations for the latter in language.
182.50
4
Thursday Poster Session
DatasetGAN: Efficient Labeled Data Factory With Minimal Human Effort
Yuxuan Zhang, Huan Ling, Jun Gao, Kangxue Yin, Jean-Francois Lafleche, Adela Barriuso, Antonio Torralba, Sanja Fidler
We introduce DatasetGAN: an automatic procedure to generate massive datasets of high-quality semantically segmented images requiring minimal human effort.
174.00
2
Wednesday Poster Session
CutPaste: Self-Supervised Learning for Anomaly Detection and Localization
Chun-Liang Li, Kihyuk Sohn, Jinsung Yoon, Tomas Pfister
We aim at constructing a high performance model for defect detection that detects unknown anomalous patterns of an image without anomalous data.
159.25
2
Wednesday Poster Session
DriveGAN: Towards a Controllable High-Quality Neural Simulation
Seung Wook Kim, Jonah Philion, Antonio Torralba, Sanja Fidler
Realistic simulators are critical for training and verifying robotics systems.
Tuesday Poster Session
Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes
Zhengqi Li, Simon Niklaus, Noah Snavely, Oliver Wang
We present a method to perform novel view and time synthesis of dynamic scenes, requiring only a monocular video with known camera poses as input.
151.00
30
Tuesday Poster Session
GAN Prior Embedded Network for Blind Face Restoration in the Wild
Tao Yang, Peiran Ren, Xuansong Xie, Lei Zhang
Blind face restoration (BFR) from severely degraded face images in the wild is a very challenging problem.
Monday Poster Session
Image Generators With Conditionally-Independent Pixel Synthesis
Ivan Anokhin, Kirill Demochkin, Taras Khakhulin, Gleb Sterkin, Victor Lempitsky, Denis Korzhenkov
Existing image generator networks rely heavily on spatial convolutions and, optionally, self-attention blocks in order to gradually synthesize images in a coarse-to-fine manner.
143.00
8
Thursday Poster Session
Semantic Segmentation With Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization
Daiqing Li, Junlin Yang, Karsten Kreis, Antonio Torralba, Sanja Fidler
Training deep networks with limited labeled data while achieving a strong generalization ability is key in the quest to reduce human annotation efforts.
137.00
4
Wednesday Poster Session
Scaling Local Self-Attention for Parameter Efficient Visual Backbones
Ashish Vaswani, Prajit Ramachandran, Aravind Srinivas, Niki Parmar, Blake Hechtman, Jonathon Shlens
Self-attention has the promise of improving computer vision systems due to parameter-independent scaling of receptive fields and content-dependent interactions, in contrast to parameter-dependent scaling and content-independent interactions of convolutions.
136.50
16
Thursday Poster Session
Encoding in Style: A StyleGAN Encoder for Image-to-Image Translation
Elad Richardson, Yuval Alaluf, Or Patashnik, Yotam Nitzan, Yaniv Azar, Stav Shapiro, Daniel Cohen-Or
We present a generic image-to-image translation framework, pixel2style2pixel (pSp).
136.25
44
Monday Poster Session
Rethinking Semantic Segmentation From a Sequence-to-Sequence Perspective With Transformers
Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, Yabiao Wang, Yanwei Fu, Jianfeng Feng, Tao Xiang, Philip H.S. Torr, Li Zhang
Most recent semantic segmentation methods adopt a fully-convolutional network (FCN) with an encoder-decoder architecture.
133.75
57
Tuesday Poster Session
MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments From a Single Moving Camera
Felix Wimbauer, Nan Yang, Lukas von Stumberg, Niclas Zeller, Daniel Cremers
In this paper, we propose MonoRec, a semi-supervised monocular dense reconstruction architecture that predicts depth maps from a single moving camera in dynamic environments.
130.25
1
Tuesday Poster Session
Information-Theoretic Segmentation by Inpainting Error Maximization
Pedro Savarese, Sunnie S. Y. Kim, Michael Maire, Greg Shakhnarovich, David McAllester
We study image segmentation from an information-theoretic perspective, proposing a novel adversarial method that performs unsupervised segmentation by partitioning images into maximally independent sets.
130.00
1
Tuesday Poster Session
IBRNet: Learning Multi-View Image-Based Rendering
Qianqian Wang, Zhicheng Wang, Kyle Genova, Pratul P. Srinivasan, Howard Zhou, Jonathan T. Barron, Ricardo Martin-Brualla, Noah Snavely, Thomas Funkhouser
We present a method that synthesizes novel views of complex scenes by interpolating a sparse set of nearby views.
128.00
11
Tuesday Poster Session
On Robustness and Transferability of Convolutional Neural Networks
Josip Djolonga, Jessica Yung, Michael Tschannen, Rob Romijnders, Lucas Beyer, Alexander Kolesnikov, Joan Puigcerver, Matthias Minderer, Alexander D'Amour, Dan Moldovan, Sylvain Gelly, Neil Houlsby, Xiaohua Zhai, Mario Lucic
Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts.
126.25
18
Friday Poster Session
LoFTR: Detector-Free Local Feature Matching With Transformers
Jiaming Sun, Zehong Shen, Yuang Wang, Hujun Bao, Xiaowei Zhou
We present a novel method for local image feature matching.
125.50
3
Wednesday Poster Session
Enriching ImageNet With Human Similarity Judgments and Psychological Embeddings
Brett D. Roads, Bradley C. Love
Advances in supervised learning approaches to object recognition flourished in part because of the availability of high-quality datasets and associated benchmarks.
119.00
1
Tuesday Poster Session
Shape and Material Capture at Home
Daniel Lichy, Jiaye Wu, Soumyadip Sengupta, David W. Jacobs
In this paper, we present a technique for estimating the geometry and reflectance of objects using only a camera, flashlight, and optionally a tripod.
112.75
1
Tuesday Poster Session
Re-Labeling ImageNet: From Single to Multi-Labels, From Global to Localized Labels
Sangdoo Yun, Seong Joon Oh, Byeongho Heo, Dongyoon Han, Junsuk Choe, Sanghyuk Chun
ImageNet has been the most popular image classification benchmark, but it is also the one with a significant level of label noise.
111.75
11
Monday Poster Session
NeuralRecon: Real-Time Coherent 3D Reconstruction From Monocular Video
Jiaming Sun, Yiming Xie, Linghao Chen, Xiaowei Zhou, Hujun Bao
We present a novel framework named NeuralRecon for real-time 3D scene reconstruction from a monocular video.
Friday Poster Session
Deep Animation Video Interpolation in the Wild
Li Siyao, Shiyu Zhao, Weijiang Yu, Wenxiu Sun, Dimitris Metaxas, Chen Change Loy, Ziwei Liu
In the animation industry, cartoon videos are usually produced at low frame rate since hand drawing of such frames is costly and time-consuming.
Tuesday Poster Session
pixelNeRF: Neural Radiance Fields From One or Few Images
Alex Yu, Vickie Ye, Matthew Tancik, Angjoo Kanazawa
We propose pixelNeRF, a learning framework that predicts a continuous neural scene representation conditioned on one or few input images.
Tuesday Poster Session
UP-DETR: Unsupervised Pre-Training for Object Detection With Transformers
Zhigang Dai, Bolun Cai, Yugeng Lin, Junying Chen
Object detection with transformers (DETR) reaches competitive performance with Faster R-CNN via a transformer encoder-decoder architecture.
106.75
24
Monday Poster Session
Less Is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
Jie Lei, Linjie Li, Luowei Zhou, Zhe Gan, Tamara L. Berg, Mohit Bansal, Jingjing Liu
The canonical approach to video-and-language learning (e.g., video question answering) dictates a neural model to learn from offline-extracted dense video features from vision models and text features from language models.
102.50
15
Wednesday Poster Session
Playable Video Generation
Willi Menapace, Stephane Lathuiliere, Sergey Tulyakov, Aliaksandr Siarohin, Elisa Ricci
This paper introduces the unsupervised learning problem of playable video generation (PVG).
102.50
2
Wednesday Poster Session
AutoInt: Automatic Integration for Fast Neural Volume Rendering
David B. Lindell, Julien N. P. Martel, Gordon Wetzstein
Numerical integration is a foundational technique in scientific computing and is at the core of many computer vision applications.
100.75
14
Thursday Poster Session
Stable View Synthesis
Gernot Riegler, Vladlen Koltun
We present Stable View Synthesis (SVS).
100.75
12
Thursday Poster Session
Shelf-Supervised Mesh Prediction in the Wild
Yufei Ye, Shubham Tulsiani, Abhinav Gupta
We aim to infer 3D shape and pose of objects from a single image and propose a learning-based approach that can train from unstructured image collections, using only segmentation outputs from off-the-shelf recognition systems as supervisory signal (i.e. …
95.50
1
Wednesday Poster Session
Neural Body: Implicit Neural Representations With Structured Latent Codes for Novel View Synthesis of Dynamic Humans
Sida Peng, Yuanqing Zhang, Yinghao Xu, Qianqian Wang, Qing Shuai, Hujun Bao, Xiaowei Zhou
This paper addresses the challenge of novel view synthesis for a human performer from a very sparse set of camera views.
95.00
9
Wednesday Poster Session
Navigating the GAN Parameter Space for Semantic Image Editing
Anton Cherepkov, Andrey Voynov, Artem Babenko
Generative Adversarial Networks (GANs) are currently an indispensable tool for visual editing, being a standard component of image-to-image translation and image restoration pipelines.
93.50
1
Tuesday Poster Session
Skip-Convolutions for Efficient Video Processing
Amirhossein Habibian, Davide Abati, Taco S. Cohen, Babak Ehteshami Bejnordi
We propose Skip-Convolutions to leverage the large amount of redundancies in video streams and save computations.
Monday Poster Session
The Temporal Opportunist: Self-Supervised Multi-Frame Monocular Depth
Jamie Watson, Oisin Mac Aodha, Victor Prisacariu, Gabriel Brostow, Michael Firman
Self-supervised monocular depth estimation networks are trained to predict scene depth using nearby frames as a supervision signal during training.
88.75
1
Monday Poster Session
Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion
Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang
We present Modular interactive VOS (MiVOS) framework which decouples interaction-to-mask and mask propagation, allowing for higher generalizability and better performance.
87.75
1
Tuesday Poster Session
SiamMOT: Siamese Multi-Object Tracking
Bing Shuai, Andrew Berneshawi, Xinyu Li, Davide Modolo, Joseph Tighe
In this work, we focus on improving online multi-object tracking (MOT).
Thursday Poster Session
Self-Supervised Geometric Perception
Heng Yang, Wei Dong, Luca Carlone, Vladlen Koltun
We present self-supervised geometric perception (SGP), the first general framework to learn a feature descriptor for correspondence matching without any ground-truth geometric model labels (e.g., camera poses, rigid transformations).
86.50
1
Thursday Poster Session
Stochastic Image-to-Video Synthesis Using cINNs
Michael Dorkenwald, Timo Milbich, Andreas Blattmann, Robin Rombach, Konstantinos G. Derpanis, Bjorn Ommer
Video understanding calls for a model to learn the characteristic interplay between static scene content and its dynamics: Given an image, the model must be able to predict a future progression of the portrayed scene and, conversely, a video should be explained in terms of its static image content and all the remaining characteristics not present in the initial frame.
Tuesday Poster Session
Spatially-Adaptive Pixelwise Networks for Fast Image Translation
Tamar Rott Shaham, Michael Gharbi, Richard Zhang, Eli Shechtman, Tomer Michaeli
We introduce a new generator architecture, aimed at fast and efficient high-resolution image-to-image translation.
84.50
1
Thursday Poster Session
Space-Time Neural Irradiance Fields for Free-Viewpoint Video
Wenqi Xian, Jia-Bin Huang, Johannes Kopf, Changil Kim
We present a method that learns a spatiotemporal neural irradiance field for dynamic scenes from a single video.
84.25
27
Wednesday Poster Session
SMPLicit: Topology-Aware Generative Model for Clothed People
Enric Corona, Albert Pumarola, Guillem Alenya, Gerard Pons-Moll, Francesc Moreno-Noguer
In this paper we introduce SMPLicit, a novel generative model to jointly represent body pose, shape and clothing geometry.
83.50
4
Thursday Poster Session
Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos
Yasamin Jafarian, Hyun Soo Park
A key challenge of learning the geometry of dressed humans lies in the limited availability of the ground truth data (e.g., 3D scanned models), which results in the performance degradation of 3D human reconstruction when applying to real world imagery.
Thursday Poster Session
Multimodal Motion Prediction With Stacked Transformers
Yicheng Liu, Jinghuai Zhang, Liangji Fang, Qinhong Jiang, Bolei Zhou
Predicting multiple plausible future trajectories of the nearby vehicles is crucial for the safety of autonomous driving.
82.25
3
Wednesday Poster Session
Neural Deformation Graphs for Globally-Consistent Non-Rigid Reconstruction
Aljaz Bozic, Pablo Palafox, Michael Zollhofer, Justus Thies, Angela Dai, Matthias Niessner
We introduce Neural Deformation Graphs for globally-consistent deformation tracking and 3D reconstruction of non-rigid objects.
78.00
2
Monday Poster Session
Multi-Modal Fusion Transformer for End-to-End Autonomous Driving
Aditya Prakash, Kashyap Chitta, Andreas Geiger
How should representations from complementary sensors be integrated for autonomous driving? Geometry-based sensor fusion has shown great promise for perception tasks such as object detection and motion forecasting.
77.75
1
Tuesday Poster Session
Line Segment Detection Using Transformers Without Edges
Yifan Xu, Weijian Xu, David Cheung, Zhuowen Tu
In this paper, we present a joint end-to-end line segment detection algorithm using Transformers that is post-processing and heuristics-guided intermediate processing (edge/junction/region detection) free.
77.25
4
Tuesday Poster Session
Training Generative Adversarial Networks in One Stage
Chengchao Shen, Youtan Yin, Xinchao Wang, Xubin Li, Jie Song, Mingli Song
Generative Adversarial Networks (GANs) have demonstrated unprecedented success in various image generation tasks.
Tuesday Poster Session
GIRAFFE: Representing Scenes As Compositional Generative Neural Feature Fields
Michael Niemeyer, Andreas Geiger
Deep generative models allow for photorealistic image synthesis at high resolutions.
76.25
20
Thursday Poster Session
Spatiotemporal Contrastive Video Representation Learning
Rui Qian, Tianjian Meng, Boqing Gong, Ming-Hsuan Yang, Huisheng Wang, Serge Belongie, Yin Cui
We present a self-supervised Contrastive Video Representation Learning (CVRL) method to learn spatiotemporal visual representations from unlabeled videos.
76.25
32
Tuesday Poster Session
Transformer Interpretability Beyond Attention Visualization
Hila Chefer, Shir Gur, Lior Wolf
Self-attention techniques, and specifically Transformers, are dominating the field of text processing and are becoming increasingly popular in computer vision classification tasks.
75.75
12
Monday Poster Session
SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation
Brendan Duke, Abdalla Ahmed, Christian Wolf, Parham Aarabi, Graham W. Taylor
In this paper we introduce a Transformer-based approach to video object segmentation (VOS).
75.75
6
Tuesday Poster Session
Positional Encoding As Spatial Inductive Bias in GANs
Rui Xu, Xintao Wang, Kai Chen, Bolei Zhou, Chen Change Loy
SinGAN shows impressive capability in learning internal patch distribution despite its limited effective receptive field.
74.75
6
Thursday Poster Session
Probabilistic Embeddings for Cross-Modal Retrieval
Sanghyuk Chun, Seong Joon Oh, Rafael Sampaio de Rezende, Yannis Kalantidis, Diane Larlus
Cross-modal retrieval methods build a common representation space for samples from multiple modalities, typically from the vision and the language domains.
74.25
2
Wednesday Poster Session
Dual Contradistinctive Generative Autoencoder
Gaurav Parmar, Dacheng Li, Kwonjoon Lee, Zhuowen Tu
We present a new generative autoencoder model with dual contradistinctive losses to improve generative autoencoder that performs simultaneous inference (reconstruction) and synthesis (sampling).
74.25
5
Monday Poster Session
D-NeRF: Neural Radiance Fields for Dynamic Scenes
Albert Pumarola, Enric Corona, Gerard Pons-Moll, Francesc Moreno-Noguer
Neural rendering techniques combining machine learning with geometric reasoning have arisen as one of the most promising approaches for synthesizing novel views of a scene from a sparse set of images.
74.25
34
Wednesday Poster Session
On Feature Normalization and Data Augmentation
Boyi Li, Felix Wu, Ser-Nam Lim, Serge Belongie, Kilian Q. Weinberger
The moments (a.k.a., mean and standard deviation) of latent features are often removed as noise when training image recognition models, to increase stability and reduce training time.
73.50
22
Thursday Poster Session
Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction
Guy Gafni, Justus Thies, Michael Zollhofer, Matthias Niessner
We present dynamic neural radiance fields for modeling the appearance and dynamics of a human face.
73.00
14
Wednesday Poster Session
You Only Look One-Level Feature
Qiang Chen, Yingming Wang, Tong Yang, Xiangyu Zhang, Jian Cheng, Jian Sun
This paper revisits feature pyramids networks (FPN) for one-stage detectors and points out that the success of FPN is due to its divide-and-conquer solution to the optimization problem in object detection rather than multi-scale feature fusion.
Thursday Poster Session
Metadata Normalization
Mandy Lu, Qingyu Zhao, Jiequan Zhang, Kilian M. Pohl, Li Fei-Fei, Juan Carlos Niebles, Ehsan Adeli
Batch Normalization (BN) and its variants have delivered tremendous success in combating the covariate shift induced by the training step of deep learning methods.
Wednesday Poster Session
End-to-End Video Instance Segmentation With Transformers
Yuqing Wang, Zhaoliang Xu, Xinlong Wang, Chunhua Shen, Baoshan Cheng, Hao Shen, Huaxia Xia
Video instance segmentation (VIS) is the task that requires simultaneously classifying, segmenting and tracking object instances of interest in video.
72.00
44
Wednesday Poster Session
Repurposing GANs for One-Shot Semantic Part Segmentation
Nontawat Tritrong, Pitchaporn Rewatbowornwong, Supasorn Suwajanakorn
While GANs have shown success in realistic image generation, the idea of using GANs for other tasks unrelated to synthesis is underexplored.
70.50
3
Tuesday Poster Session
Neural Lumigraph Rendering
Petr Kellnhofer, Lars C. Jebe, Andrew Jones, Ryan Spicer, Kari Pulli, Gordon Wetzstein
Novel view synthesis is a challenging and ill-posed inverse rendering problem.
69.75
2
Tuesday Poster Session
Exploiting Spatial Dimensions of Latent in GAN for Real-Time Image Editing
Hyunsu Kim, Yunjey Choi, Junho Kim, Sungjoo Yoo, Youngjung Uh
Generative adversarial networks (GANs) synthesize realistic images from random latent vectors.
Monday Poster Session
Passive Inter-Photon Imaging
Atul Ingle, Trevor Seets, Mauro Buttafava, Shantanu Gupta, Alberto Tosi, Mohit Gupta, Andreas Velten
Digital camera pixels measure image intensities by converting incident light energy into an analog electrical current, and then digitizing it into a fixed-width binary representation.
Wednesday Poster Session
Plan2Scene: Converting Floorplans to 3D Scenes
Madhawa Vidanapathirana, Qirui Wu, Yasutaka Furukawa, Angel X. Chang, Manolis Savva
We address the task of converting a floorplan and a set of associated photos of a residence into a textured 3D mesh model, a task which we call Plan2Scene.
Wednesday Poster Session
Task Programming: Learning Data Efficient Behavior Representations
Jennifer J. Sun, Ann Kennedy, Eric Zhan, David J. Anderson, Yisong Yue, Pietro Perona
Specialized domain knowledge is often necessary to accurately annotate training sets for in-depth analysis, but can be burdensome and time-consuming to acquire from domain experts.
67.75
2
Tuesday Poster Session
Deep Occlusion-Aware Instance Segmentation With Overlapping BiLayers
Lei Ke, Yu-Wing Tai, Chi-Keung Tang
Segmenting highly-overlapping objects is challenging, because typically no distinction is made between real object contours and occlusion boundaries.
66.25
1
Tuesday Poster Session
Neural Parts: Learning Expressive 3D Shape Abstractions With Invertible Neural Networks
Despoina Paschalidou, Angelos Katharopoulos, Andreas Geiger, Sanja Fidler
Impressive progress in 3D shape extraction led to representations that can capture object geometries with high fidelity.
65.75
1
Tuesday Poster Session
Rotation Coordinate Descent for Fast Globally Optimal Rotation Averaging
Alvaro Parra, Shin-Fang Chng, Tat-Jun Chin, Anders Eriksson, Ian Reid
Under mild conditions on the noise level of the measurements, rotation averaging satisfies strong duality, which enables global solutions to be obtained via semidefinite programming (SDP) relaxation.
Tuesday Poster Session
MOS: Towards Scaling Out-of-Distribution Detection for Large Semantic Space
Rui Huang, Yixuan Li
Detecting out-of-distribution (OOD) inputs is a central challenge for safely deploying machine learning models in the real world.
Wednesday Poster Session
Anycost GANs for Interactive Image Synthesis and Editing
Ji Lin, Richard Zhang, Frieder Ganz, Song Han, Jun-Yan Zhu
Generative adversarial networks (GANs) have enabled photorealistic image synthesis and editing.
64.00
2
Thursday Poster Session
Generative Hierarchical Features From Synthesizing Images
Yinghao Xu, Yujun Shen, Jiapeng Zhu, Ceyuan Yang, Bolei Zhou
Generative Adversarial Networks (GANs) have recently advanced image synthesis by learning the underlying distribution of the observed data.
63.25
7
Tuesday Poster Session
NewtonianVAE: Proportional Control and Goal Identification From Pixels via Physical Latent Spaces
Miguel Jaques, Michael Burke, Timothy M. Hospedales
Learning low-dimensional latent state space dynamics models has proven powerful for enabling vision-based planning and learning for control.
63.00
1
Tuesday Poster Session
DetectoRS: Detecting Objects With Recursive Feature Pyramid and Switchable Atrous Convolution
Siyuan Qiao, Liang-Chieh Chen, Alan Yuille
Many modern object detectors demonstrate outstanding performances by using the mechanism of looking and thinking twice. [Expand]
62.00
50
Wednesday Poster Session
StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation
Zongze Wu, Dani Lischinski, Eli Shechtman
We explore and analyze the latent style space of StyleGAN2, a state-of-the-art architecture for image generation, using models pretrained on several different datasets. [Expand]
61.75
15
Thursday Poster Session
MaX-DeepLab: End-to-End Panoptic Segmentation With Mask Transformers
Huiyu Wang, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
We present MaX-DeepLab, the first end-to-end model for panoptic segmentation. [Expand]
61.25
28
Tuesday Poster Session
Student-Teacher Learning From Clean Inputs to Noisy Inputs
Guanzhe Hong, Zhiyuan Mao, Xiaojun Lin, Stanley H. Chan
Feature-based student-teacher learning, a training method that encourages the student's hidden features to mimic those of the teacher network, is empirically successful in transferring the knowledge from a pre-trained teacher network to the student network. [Expand]
Thursday Poster Session
Scaled-YOLOv4: Scaling Cross Stage Partial Network
Chien-Yao Wang, Alexey Bochkovskiy, Hong-Yuan Mark Liao
We show that the YOLOv4 object detection neural network based on the CSP approach scales both up and down and is applicable to small and large networks while maintaining optimal speed and accuracy. [Expand]
60.00
24
Thursday Poster Session
NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis
Pratul P. Srinivasan, Boyang Deng, Xiuming Zhang, Matthew Tancik, Ben Mildenhall, Jonathan T. Barron
We present a method that takes as input a set of images of a scene illuminated by unconstrained known lighting, and produces as output a 3D representation that can be rendered from novel viewpoints under arbitrary lighting conditions. [Expand]
59.75
18
Wednesday Poster Session
Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets
Yuan-Hong Liao, Amlan Kar, Sanja Fidler
Data is the engine of modern computer vision, which necessitates collecting large-scale datasets. [Expand]
Tuesday Poster Session
Soft-IntroVAE: Analyzing and Improving the Introspective Variational Autoencoder
Tal Daniel, Aviv Tamar
The recently introduced introspective variational autoencoder (IntroVAE) exhibits outstanding image generations, and allows for amortized inference using an image encoder. [Expand]
Tuesday Poster Session
Regularizing Generative Adversarial Networks Under Limited Data
Hung-Yu Tseng, Lu Jiang, Ce Liu, Ming-Hsuan Yang, Weilong Yang
Recent years have witnessed the rapid progress of generative adversarial networks (GANs). [Expand]
Wednesday Poster Session
Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes
Dmytro Kotovenko, Matthias Wright, Arthur Heimbrecht, Bjorn Ommer
There have been many successful implementations of neural style transfer in recent years. [Expand]
54.25
1
Thursday Poster Session
Cuboids Revisited: Learning Robust 3D Shape Fitting to Single RGB Images
Florian Kluger, Hanno Ackermann, Eric Brachmann, Michael Ying Yang, Bodo Rosenhahn
Humans perceive and construct the surrounding world as an arrangement of simple parametric models. [Expand]
Thursday Poster Session
SRWarp: Generalized Image Super-Resolution under Arbitrary Transformation
Sanghyun Son, Kyoung Mu Lee
Deep CNNs have achieved significant successes in image processing and its applications, including single image super-resolution (SR). [Expand]
Wednesday Poster Session
Learning To Recover 3D Scene Shape From a Single Image
Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Long Mai, Simon Chen, Chunhua Shen
Despite significant progress in monocular depth estimation in the wild, recent state-of-the-art methods cannot be used to recover accurate 3D scene shape due to an unknown depth shift induced by shift-invariant reconstruction losses used in mixed-data depth prediction training, and possible unknown camera focal length. [Expand]
51.75
2
Monday Poster Session
On Self-Contact and Human Pose
Lea Muller, Ahmed A. A. Osman, Siyu Tang, Chun-Hao P. Huang, Michael J. Black
People touch their face 23 times an hour, they cross their arms and legs, put their hands on their hips, etc. [Expand]
51.25
2
Wednesday Poster Session
Cross-Modal Contrastive Learning for Text-to-Image Generation
Han Zhang, Jing Yu Koh, Jason Baldridge, Honglak Lee, Yinfei Yang
The output of text-to-image synthesis systems should be coherent, clear, photo-realistic scenes with high semantic fidelity to their conditioned text descriptions. [Expand]
51.25
5
Monday Poster Session
HistoGAN: Controlling Colors of GAN-Generated and Real Images via Color Histograms
Mahmoud Afifi, Marcus A. Brubaker, Michael S. Brown
While generative adversarial networks (GANs) can successfully produce high-quality images, they can be challenging to control. [Expand]
51.00
1
Wednesday Poster Session
Single Image Depth Prediction With Wavelet Decomposition
Michael Ramamonjisoa, Michael Firman, Jamie Watson, Vincent Lepetit, Daniyar Turmukhambetov
We present a novel method for predicting accurate depths from monocular images with high efficiency. [Expand]
Wednesday Poster Session
Birds of a Feather: Capturing Avian Shape Models From Images
Yufu Wang, Nikos Kolotouros, Kostas Daniilidis, Marc Badger
Animals are diverse in shape, but building a deformable shape model for a new species is not always possible due to the lack of 3D data. [Expand]
Thursday Poster Session
LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search
Bin Yan, Houwen Peng, Kan Wu, Dong Wang, Jianlong Fu, Huchuan Lu
Object tracking has achieved significant progress over the past few years. [Expand]
Thursday Poster Session
Body2Hands: Learning To Infer 3D Hands From Conversational Gesture Body Dynamics
Evonne Ng, Shiry Ginosar, Trevor Darrell, Hanbyul Joo
We propose a novel learned deep prior of body motion for 3D hand shape synthesis and estimation in the domain of conversational gestures. [Expand]
48.50
2
Thursday Poster Session
OSTeC: One-Shot Texture Completion
Baris Gecer, Jiankang Deng, Stefanos Zafeiriou
The last few years have witnessed the great success of non-linear generative models in synthesizing high-quality photorealistic face images. [Expand]
48.00
2
Wednesday Poster Session
Towards Open World Object Detection
K J Joseph, Salman Khan, Fahad Shahbaz Khan, Vineeth N Balasubramanian
Humans have a natural instinct to identify unknown object instances in their environments. [Expand]
48.00
2
Tuesday Poster Session
MoViNets: Mobile Video Networks for Efficient Video Recognition
Dan Kondratyuk, Liangzhe Yuan, Yandong Li, Li Zhang, Mingxing Tan, Matthew Brown, Boqing Gong
We present Mobile Video Networks (MoViNets), a family of computation and memory efficient video networks that can operate on streaming video for online inference. [Expand]
47.50
1
Friday Poster Session
Pulsar: Efficient Sphere-Based Neural Rendering
Christoph Lassner, Michael Zollhofer
We propose Pulsar, an efficient sphere-based differentiable rendering module that is orders of magnitude faster than competing techniques, modular, and easy-to-use due to its tight integration with PyTorch. [Expand]
Monday Poster Session
Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking
Ning Wang, Wengang Zhou, Jie Wang, Houqiang Li
In video object tracking, there exist rich temporal contexts among successive frames, which have been largely overlooked in existing trackers. [Expand]
47.50
2
Monday Poster Session
PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation
Kehong Gong, Jianfeng Zhang, Jiashi Feng
Existing 3D human pose estimators suffer poor generalization performance to new datasets, largely due to the limited diversity of 2D-3D pose pairs in the training data. [Expand]
47.25
1
Wednesday Poster Session
Large-Scale Localization Datasets in Crowded Indoor Spaces
Donghwan Lee, Soohyun Ryu, Suyong Yeon, Yonghan Lee, Deokhwa Kim, Cheolho Han, Yohann Cabon, Philippe Weinzaepfel, Nicolas Guerin, Gabriela Csurka, Martin Humenberger
Estimating the precise location of a camera using visual localization enables interesting applications such as augmented reality or robot navigation. [Expand]
Tuesday Poster Session
Function4D: Real-Time Human Volumetric Capture From Very Sparse Consumer RGBD Sensors
Tao Yu, Zerong Zheng, Kaiwen Guo, Pengpeng Liu, Qionghai Dai, Yebin Liu
Human volumetric capture is a long-standing topic in computer vision and computer graphics. [Expand]
46.75
1
Tuesday Poster Session
See Through Gradients: Image Batch Recovery via GradInversion
Hongxu Yin, Arun Mallya, Arash Vahdat, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov
Training deep neural networks requires gradient estimation from data batches to update parameters. [Expand]
46.50
2
Friday Poster Session
Unsupervised Learning of 3D Object Categories From Videos in the Wild
Philipp Henzler, Jeremy Reizenstein, Patrick Labatut, Roman Shapovalov, Tobias Ritschel, Andrea Vedaldi, David Novotny
Recently, numerous works have attempted to learn 3D reconstructors of textured 3D models of visual categories given a training set of annotated static images of objects. [Expand]
46.25
1
Tuesday Poster Session
Reconstructing 3D Human Pose by Watching Humans in the Mirror
Qi Fang, Qing Shuai, Junting Dong, Hujun Bao, Xiaowei Zhou
In this paper, we introduce the new task of reconstructing 3D human pose from a single image in which we can see the person and the person's image through a mirror. [Expand]
Thursday Poster Session
LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces From Video Using Pose and Lighting Normalization
Avisek Lahiri, Vivek Kwatra, Christian Frueh, John Lewis, Chris Bregler
In this paper, we present a video-based learning framework for animating personalized 3D talking faces from audio. [Expand]
Monday Poster Session
Permute, Quantize, and Fine-Tune: Efficient Compression of Neural Networks
Julieta Martinez, Jashan Shewakramani, Ting Wei Liu, Ioan Andrei Barsan, Wenyuan Zeng, Raquel Urtasun
Compressing large neural networks is an important step for their deployment in resource-constrained computational platforms. [Expand]
Friday Poster Session
Blur, Noise, and Compression Robust Generative Adversarial Networks
Takuhiro Kaneko, Tatsuya Harada
Generative adversarial networks (GANs) have gained considerable attention owing to their ability to reproduce images. [Expand]
45.00
1
Thursday Poster Session
Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation
Hang Zhou, Yasheng Sun, Wayne Wu, Chen Change Loy, Xiaogang Wang, Ziwei Liu
While accurate lip synchronization has been achieved for arbitrary-subject audio-driven talking face generation, the problem of how to efficiently drive the head pose remains. [Expand]
44.75
4
Tuesday Poster Session
NeRD: Neural 3D Reflection Symmetry Detector
Yichao Zhou, Shichen Liu, Yi Ma
Recent advances have shown that symmetry, a structural prior that most objects exhibit, can support a variety of single-view 3D understanding tasks. [Expand]
Friday Poster Session
Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild With Pose Annotations
Adel Ahmadyan, Liangkai Zhang, Artsiom Ablavatski, Jianing Wei, Matthias Grundmann
3D object detection has recently become popular due to many applications in robotics, augmented reality, autonomy, and image retrieval. [Expand]
44.00
5
Wednesday Poster Session
A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning
Christoph Feichtenhofer, Haoqi Fan, Bo Xiong, Ross Girshick, Kaiming He
We present a large-scale study on unsupervised spatiotemporal representation learning from videos. [Expand]
44.00
2
Tuesday Poster Session
Localizing Visual Sounds the Hard Way
Honglie Chen, Weidi Xie, Triantafyllos Afouras, Arsha Nagrani, Andrea Vedaldi, Andrew Zisserman
The objective of this work is to localize sound sources that are visible in a video without using manual annotations. [Expand]
43.75
1
Friday Poster Session
Learning Optical Flow From Still Images
Filippo Aleotti, Matteo Poggi, Stefano Mattoccia
This paper deals with the scarcity of data for training optical flow networks, highlighting the limitations of existing sources such as labeled synthetic datasets or unlabeled real videos. [Expand]
Thursday Poster Session
Three Ways To Improve Semantic Segmentation With Self-Supervised Depth Estimation
Lukas Hoyer, Dengxin Dai, Yuhua Chen, Adrian Koring, Suman Saha, Luc Van Gool
Training deep networks for semantic segmentation requires large amounts of labeled training data, which presents a major challenge in practice, as labeling segmentation masks is a highly labor-intensive process. [Expand]
43.50
2
Wednesday Poster Session
VisualVoice: Audio-Visual Speech Separation With Cross-Modal Consistency
Ruohan Gao, Kristen Grauman
We introduce a new approach for audio-visual speech separation. [Expand]
42.75
7
Thursday Poster Session
Counterfactual VQA: A Cause-Effect Look at Language Bias
Yulei Niu, Kaihua Tang, Hanwang Zhang, Zhiwu Lu, Xian-Sheng Hua, Ji-Rong Wen
Recent VQA models may tend to rely on language bias as a shortcut and thus fail to sufficiently learn the multi-modal knowledge from both vision and language. [Expand]
42.50
24
Thursday Poster Session
VDSM: Unsupervised Video Disentanglement With State-Space Modeling and Deep Mixtures of Experts
Matthew J. Vowels, Necati Cihan Camgoz, Richard Bowden
Disentangled representations support a range of downstream tasks including causal reasoning, generative modeling, and fair machine learning. [Expand]
Wednesday Poster Session
BoxInst: High-Performance Instance Segmentation With Box Annotations
Zhi Tian, Chunhua Shen, Xinlong Wang, Hao Chen
We present a high-performance method that can achieve mask-level instance segmentation with only bounding-box annotations for training. [Expand]
41.50
4
Tuesday Poster Session
On Semantic Similarity in Video Retrieval
Michael Wray, Hazel Doughty, Dima Damen
Current video retrieval efforts all found their evaluation on an instance-based assumption, that only a single caption is relevant to a query video and vice versa. [Expand]
Tuesday Poster Session
Pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis
Eric R. Chan, Marco Monteiro, Petr Kellnhofer, Jiajun Wu, Gordon Wetzstein
We have witnessed rapid progress on 3D-aware image synthesis, leveraging recent advances in generative visual models and neural rendering. [Expand]
41.00
14
Tuesday Poster Session
Fast and Accurate Model Scaling
Piotr Dollar, Mannat Singh, Ross Girshick
In this work we analyze strategies for convolutional neural network scaling; that is, the process of scaling a base convolutional network to endow it with greater computational complexity and consequently representational power. [Expand]
41.00
3
Monday Poster Session
DeFMO: Deblurring and Shape Recovery of Fast Moving Objects
Denys Rozumnyi, Martin R. Oswald, Vittorio Ferrari, Jiri Matas, Marc Pollefeys
Objects moving at high speed appear significantly blurred when captured with cameras. [Expand]
41.00
3
Tuesday Poster Session
Few-Shot Transformation of Common Actions Into Time and Space
Pengwan Yang, Pascal Mettes, Cees G. M. Snoek
This paper introduces the task of few-shot common action localization in time and space. [Expand]
Friday Poster Session
Understanding Failures of Deep Networks via Robust Feature Extraction
Sahil Singla, Besmira Nushi, Shital Shah, Ece Kamar, Eric Horvitz
Traditional evaluation metrics for learned models that report aggregate scores over a test set are insufficient for surfacing important and informative patterns of failure over features and instances. [Expand]
Thursday Poster Session
img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation
Vitor Albiero, Xingyu Chen, Xi Yin, Guan Pang, Tal Hassner
We propose real-time, six degrees of freedom (6DoF), 3D face pose estimation without face detection or landmark localization. [Expand]
40.00
2
Wednesday Poster Session
High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network
Jie Liang, Hui Zeng, Lei Zhang
Existing image-to-image translation (I2IT) methods are either constrained to low-resolution images or long inference time due to their heavy computational burden on the convolution of high-resolution feature maps. [Expand]
Wednesday Poster Session
User-Guided Line Art Flat Filling With Split Filling Mechanism
Lvmin Zhang, Chengze Li, Edgar Simo-Serra, Yi Ji, Tien-Tsin Wong, Chunping Liu
Flat filling is a critical step in digital artistic content creation with the objective of filling line arts with flat colors. [Expand]
Wednesday Poster Session
Neural Scene Graphs for Dynamic Scenes
Julian Ost, Fahim Mannan, Nils Thuerey, Julian Knodt, Felix Heide
Recent implicit neural rendering methods have demonstrated that it is possible to learn accurate view synthesis for complex scenes by predicting their volumetric density and color supervised solely by a set of RGB images. [Expand]
39.50
8
Tuesday Poster Session
LASR: Learning Articulated Shape Reconstruction From a Monocular Video
Gengshan Yang, Deqing Sun, Varun Jampani, Daniel Vlasic, Forrester Cole, Huiwen Chang, Deva Ramanan, William T. Freeman, Ce Liu
Remarkable progress has been made in 3D reconstruction of rigid structures from a video or a collection of images. [Expand]
Friday Poster Session
Representation Learning via Global Temporal Alignment and Cycle-Consistency
Isma Hadji, Konstantinos G. Derpanis, Allan D. Jepson
We introduce a weakly supervised method for representation learning based on aligning temporal sequences (e.g., videos) of the same process (e.g., human action). [Expand]
Wednesday Poster Session
A Sliced Wasserstein Loss for Neural Texture Synthesis
Eric Heitz, Kenneth Vanhoey, Thomas Chambon, Laurent Belcour
We address the problem of computing a textural loss based on the statistics extracted from the feature activations of a convolutional neural network optimized for object recognition (e.g. [Expand]
Wednesday Poster Session
GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution
Kelvin C.K. Chan, Xintao Wang, Xiangyu Xu, Jinwei Gu, Chen Change Loy
We show that pre-trained Generative Adversarial Networks (GANs), e.g., StyleGAN, can be used as a latent bank to improve the restoration quality of large-factor image super-resolution (SR). [Expand]
38.75
4
Thursday Poster Session
KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control
Tomas Jakab, Richard Tucker, Ameesh Makadia, Jiajun Wu, Noah Snavely, Angjoo Kanazawa
We introduce KeypointDeformer, a novel unsupervised method for shape control through automatically discovered 3D keypoints. [Expand]
Thursday Poster Session
High-Fidelity Face Tracking for AR/VR via Deep Lighting Adaptation
Lele Chen, Chen Cao, Fernando De la Torre, Jason Saragih, Chenliang Xu, Yaser Sheikh
3D video avatars can empower virtual communications by providing compression, privacy, entertainment, and a sense of presence in AR/VR. [Expand]
Thursday Poster Session
Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition
Stephen Hausler, Sourav Garg, Ming Xu, Michael Milford, Tobias Fischer
Visual Place Recognition is a challenging task for robotics and autonomous systems, which must deal with the twin problems of appearance and viewpoint change in an always changing world. [Expand]
37.50
3
Thursday Poster Session
Exploring Data-Efficient 3D Scene Understanding With Contrastive Scene Contexts
Ji Hou, Benjamin Graham, Matthias Niessner, Saining Xie
The rapid progress in 3D scene understanding has come with growing demand for data; however, collecting and annotating 3D scenes (e.g. [Expand]
37.50
4
Friday Poster Session
Seeing Out of the Box: End-to-End Pre-Training for Vision-Language Representation Learning
Zhicheng Huang, Zhaoyang Zeng, Yupan Huang, Bei Liu, Dongmei Fu, Jianlong Fu
We study joint learning of Convolutional Neural Network (CNN) and Transformer for vision-language pre-training (VLPT), which aims to learn cross-modal alignments from millions of image-text pairs. [Expand]
Thursday Poster Session
CASTing Your Model: Learning To Localize Improves Self-Supervised Representations
Ramprasaath R. Selvaraju, Karan Desai, Justin Johnson, Nikhil Naik
Recent advances in self-supervised learning (SSL) have largely closed the gap with supervised ImageNet pretraining. [Expand]
37.25
2
Wednesday Poster Session
MobileDets: Searching for Object Detection Architectures for Mobile Accelerators
Yunyang Xiong, Hanxiao Liu, Suyog Gupta, Berkin Akin, Gabriel Bender, Yongzhe Wang, Pieter-Jan Kindermans, Mingxing Tan, Vikas Singh, Bo Chen
Inverted bottleneck layers, which are built upon depthwise convolutions, have been the predominant building blocks in state-of-the-art object detection models on mobile devices. [Expand]
37.25
12
Tuesday Poster Session
Rethinking Channel Dimensions for Efficient Model Design
Dongyoon Han, Sangdoo Yun, Byeongho Heo, YoungJoon Yoo
Designing an efficient model within the limited computational cost is challenging. [Expand]
Monday Poster Session
ManipulaTHOR: A Framework for Visual Object Manipulation
Kiana Ehsani, Winson Han, Alvaro Herrasti, Eli VanderBilt, Luca Weihs, Eric Kolve, Aniruddha Kembhavi, Roozbeh Mottaghi
The domain of Embodied AI has recently witnessed substantial progress, particularly in navigating agents within their environments. [Expand]
Tuesday Poster Session
Efficient Initial Pose-Graph Generation for Global SfM
Daniel Barath, Dmytro Mishkin, Ivan Eichhardt, Ilia Shipachev, Jiri Matas
We propose ways to speed up the initial pose-graph generation for global Structure-from-Motion algorithms. [Expand]
36.50
1
Thursday Poster Session
Efficient Conditional GAN Transfer With Knowledge Propagation Across Classes
Mohamad Shahbazi, Zhiwu Huang, Danda Pani Paudel, Ajad Chhatkuli, Luc Van Gool
Generative adversarial networks (GANs) have shown impressive results in both unconditional and conditional image generation. [Expand]
Thursday Poster Session
End-to-End Object Detection With Fully Convolutional Network
Jianfeng Wang, Lin Song, Zeming Li, Hongbin Sun, Jian Sun, Nanning Zheng
Mainstream object detectors based on the fully convolutional network have achieved impressive performance. [Expand]
36.50
9
Friday Poster Session
Co-Attention for Conditioned Image Matching
Olivia Wiles, Sebastien Ehrhardt, Andrew Zisserman
We propose a new approach to determine correspondences between image pairs in the wild under large changes in illumination, viewpoint, context, and material. [Expand]
Friday Poster Session
Scan2Cap: Context-Aware Dense Captioning in RGB-D Scans
Zhenyu Chen, Ali Gholami, Matthias Niessner, Angel X. Chang
We introduce the new task of dense captioning in RGB-D scans. [Expand]
36.00
2
Tuesday Poster Session
Learning Monocular 3D Reconstruction of Articulated Categories From Motion
Filippos Kokkinos, Iasonas Kokkinos
Monocular 3D reconstruction of articulated object categories is challenging due to the lack of training data and the inherent ill-posedness of the problem. [Expand]
36.00
1
Monday Poster Session
Self-Supervised Augmentation Consistency for Adapting Semantic Segmentation
Nikita Araslanov, Stefan Roth
We propose an approach to domain adaptation for semantic segmentation that is both practical and highly accurate. [Expand]
Thursday Poster Session
Learned Initializations for Optimizing Coordinate-Based Neural Representations
Matthew Tancik, Ben Mildenhall, Terrance Wang, Divi Schmidt, Pratul P. Srinivasan, Jonathan T. Barron, Ren Ng
Coordinate-based neural representations have shown significant promise as an alternative to discrete, array-based representations for complex low dimensional signals. [Expand]
35.75
16
Tuesday Poster Session
Audio-Visual Instance Discrimination with Cross-Modal Agreement
Pedro Morgado, Nuno Vasconcelos, Ishan Misra
We present a self-supervised learning approach to learn audio-visual representations from video and audio. [Expand]
35.50
35
Thursday Poster Session
We Are More Than Our Joints: Predicting How 3D Bodies Move
Yan Zhang, Michael J. Black, Siyu Tang
A key step towards understanding human behavior is the prediction of 3D human motion. [Expand]
35.50
4
Tuesday Poster Session
Rethinking and Improving the Robustness of Image Style Transfer
Pei Wang, Yijun Li, Nuno Vasconcelos
Extensive research in neural style transfer methods has shown that the correlation between features extracted by a pre-trained VGG network has remarkable ability to capture the visual style of an image. [Expand]
35.00
1
Monday Poster Session
GeoSim: Realistic Video Simulation via Geometry-Aware Composition for Self-Driving
Yun Chen, Frieda Rong, Shivam Duggal, Shenlong Wang, Xinchen Yan, Sivabalan Manivasagam, Shangjie Xue, Ersin Yumer, Raquel Urtasun
Scalable sensor simulation is an important yet challenging open problem for safety-critical domains such as self-driving. [Expand]
Wednesday Poster Session
Robust and Accurate Object Detection via Adversarial Learning
Xiangning Chen, Cihang Xie, Mingxing Tan, Li Zhang, Cho-Jui Hsieh, Boqing Gong
Data augmentation has become a de facto component for training high-performance deep image classifiers, but its potential is under-explored for object detection. [Expand]
34.50
3
Friday Poster Session
Self-Supervised Multi-Frame Monocular Scene Flow
Junhwa Hur, Stefan Roth
Estimating 3D scene flow from a sequence of monocular images has been gaining increased attention due to the simple, economical capture setup. [Expand]
Monday Poster Session
PPR10K: A Large-Scale Portrait Photo Retouching Dataset With Human-Region Mask and Group-Level Consistency
Jie Liang, Hui Zeng, Miaomiao Cui, Xuansong Xie, Lei Zhang
Different from general photo retouching tasks, portrait photo retouching (PPR), which aims to enhance the visual quality of a collection of flat-looking portrait photos, has its special and practical requirements such as human-region priority (HRP) and group-level consistency (GLC). [Expand]
Monday Poster Session
Causal Attention for Vision-Language Tasks
Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai
We present a novel attention mechanism: Causal Attention (CATT), to remove the ever-elusive confounding effect in existing attention-based vision-language models. [Expand]
34.50
1
Wednesday Poster Session
VinVL: Revisiting Visual Representations in Vision-Language Models
Pengchuan Zhang, Xiujun Li, Xiaowei Hu, Jianwei Yang, Lei Zhang, Lijuan Wang, Yejin Choi, Jianfeng Gao
This paper presents a detailed study of improving vision features and develops an improved object detection model for vision language (VL) tasks. [Expand]
Tuesday Poster Session
Fast End-to-End Learning on Protein Surfaces
Freyr Sverrisson, Jean Feydy, Bruno E. Correia, Michael M. Bronstein
Proteins' biological functions are defined by the geometric and chemical structure of their 3D molecular surfaces. [Expand]
33.50
2
Thursday Poster Session
Multimodal Contrastive Training for Visual Representation Learning
Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, Yilin Wang, Michael Maire, Ajinkya Kale, Baldo Faieta
We develop an approach to learning visual representations that embraces multimodal data, driven by a combination of intra- and inter-modal similarity preservation objectives. [Expand]
Tuesday Poster Session
DeRF: Decomposed Radiance Fields
Daniel Rebain, Wei Jiang, Soroosh Yazdani, Ke Li, Kwang Moo Yi, Andrea Tagliasacchi
With the advent of Neural Radiance Fields (NeRF), neural networks can now render novel views of a 3D scene with quality that fools the human eye. [Expand]
32.75
15
Thursday Poster Session
Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes
Jiashun Wang, Huazhe Xu, Jingwei Xu, Sifei Liu, Xiaolong Wang
Synthesizing 3D human motion plays an important role in many graphics applications as well as understanding human activity. [Expand]
32.75
1
Wednesday Poster Session
TextOCR: Towards Large-Scale End-to-End Reasoning for Arbitrary-Shaped Scene Text
Amanpreet Singh, Guan Pang, Mandy Toh, Jing Huang, Wojciech Galuba, Tal Hassner
A crucial component for the scene text based reasoning required for TextVQA and TextCaps datasets involves detecting and recognizing text present in the images using an optical character recognition (OCR) system. [Expand]
32.50
2
Wednesday Poster Session
Knowledge Evolution in Neural Networks
Ahmed Taha, Abhinav Shrivastava, Larry S. Davis
Deep learning relies on the availability of a large corpus of data (labeled or unlabeled). [Expand]
Thursday Poster Session
Rotation-Only Bundle Adjustment
Seong Hun Lee, Javier Civera
We propose a novel method for estimating the global rotations of the cameras independently of their positions and the scene structure. [Expand]
32.00
1
Monday Poster Session
Learning To Count Everything
Viresh Ranjan, Udbhav Sharma, Thu Nguyen, Minh Hoai
Existing works on visual counting primarily focus on one specific category at a time, such as people, animals, and cells. [Expand]
Tuesday Poster Session
Exploiting Aliasing for Manga Restoration
Minshan Xie, Menghan Xia, Tien-Tsin Wong
As a popular entertainment art form, manga enriches the line drawings details with bitonal screentones. [Expand]
Thursday Poster Session
Quantum Permutation Synchronization
Tolga Birdal, Vladislav Golyanik, Christian Theobalt, Leonidas J. Guibas
We present QuantumSync, the first quantum algorithm for solving a synchronization problem in the context of computer vision. [Expand]
31.75
3
Thursday Poster Session
Sparse R-CNN: End-to-End Object Detection With Learnable Proposals
Peize Sun, Rufeng Zhang, Yi Jiang, Tao Kong, Chenfeng Xu, Wei Zhan, Masayoshi Tomizuka, Lei Li, Zehuan Yuan, Changhu Wang, Ping Luo
We present Sparse R-CNN, a purely sparse method for object detection in images. [Expand]
31.50
18
Thursday Poster Session
Temporal-Relational CrossTransformers for Few-Shot Action Recognition
Toby Perrett, Alessandro Masullo, Tilo Burghardt, Majid Mirmehdi, Dima Damen
We propose a novel approach to few-shot action recognition, finding temporally-corresponding frame tuples between the query and videos in the support set. [Expand]
Monday Poster Session
VS-Net: Voting With Segmentation for Visual Localization
Zhaoyang Huang, Han Zhou, Yijin Li, Bangbang Yang, Yan Xu, Xiaowei Zhou, Hujun Bao, Guofeng Zhang, Hongsheng Li
Visual localization is of great importance in robotics and computer vision. [Expand]
Tuesday Poster Session
STaR: Self-Supervised Tracking and Reconstruction of Rigid Objects in Motion With Neural Rendering
Wentao Yuan, Zhaoyang Lv, Tanner Schmidt, Steven Lovegrove
We present STaR, a novel method that performs Self-supervised Tracking and Reconstruction of dynamic scenes with rigid motion from multi-view RGB videos without any manual annotation. [Expand]
31.00
5
Thursday Poster Session
Learning Multi-Scale Photo Exposure Correction
Mahmoud Afifi, Konstantinos G. Derpanis, Bjorn Ommer, Michael S. Brown
Capturing photographs with wrong exposures remains a major source of errors in camera-based imaging. [Expand]
Wednesday Poster Session
Weakly Supervised Learning of Rigid 3D Scene Flow
Zan Gojcic, Or Litany, Andreas Wieser, Leonidas J. Guibas, Tolga Birdal
We propose a data-driven scene flow estimation algorithm exploiting the observation that many 3D scenes can be explained by a collection of agents moving as rigid bodies. [Expand]
30.25
1
Tuesday Poster Session
Monte Carlo Scene Search for 3D Scene Understanding
Shreyas Hampali, Sinisa Stekovic, Sayan Deb Sarkar, Chetan S. Kumar, Friedrich Fraundorfer, Vincent Lepetit
We explore how a general AI algorithm can be used for 3D scene understanding to reduce the need for training data. [Expand]
30.25
1
Thursday Poster Session
Robust Reference-Based Super-Resolution via C2-Matching
Yuming Jiang, Kelvin C.K. Chan, Xintao Wang, Chen Change Loy, Ziwei Liu
Reference-based Super-Resolution (Ref-SR) has recently emerged as a promising paradigm to enhance a low-resolution (LR) input image by introducing an additional high-resolution (HR) reference image. [Expand]
Monday Poster Session
SwiftNet: Real-Time Video Object Segmentation
Haochen Wang, Xiaolong Jiang, Haibing Ren, Yao Hu, Song Bai
In this work we present SwiftNet for real-time semi-supervised video object segmentation (one-shot VOS), which reports 77.8% J&F and 70 FPS on DAVIS 2017 validation dataset, leading all present solutions in overall accuracy and speed performance. [Expand]
30.00
1
Monday Poster Session
Surrogate Gradient Field for Latent Space Manipulation
Minjun Li, Yanghua Jin, Huachun Zhu
Generative adversarial networks (GANs) can generate high-quality images from sampled latent codes. [Expand]
Tuesday Poster Session
Deep Burst Super-Resolution
Goutam Bhat, Martin Danelljan, Luc Van Gool, Radu Timofte
While single-image super-resolution (SISR) has attracted substantial interest in recent years, the proposed approaches are limited to learning image priors in order to add high frequency details. [Expand]
29.50
2
Wednesday Poster Session
CDFI: Compression-Driven Network Design for Frame Interpolation
Tianyu Ding, Luming Liang, Zhihui Zhu, Ilya Zharkov
DNN-based frame interpolation--that generates the intermediate frames given two consecutive frames--typically relies on heavy model architectures with a huge number of features, preventing them from being deployed on systems with limited resources, e.g., mobile devices. [Expand]
Wednesday Poster Session
Intentonomy: A Dataset and Study Towards Human Intent Understanding
Menglin Jia, Zuxuan Wu, Austin Reiter, Claire Cardie, Serge Belongie, Ser-Nam Lim
An image is worth a thousand words, conveying information that goes beyond the physical visual content therein. [Expand]
29.50
1
Thursday Poster Session
Pixel-Aligned Volumetric Avatars
Amit Raj, Michael Zollhofer, Tomas Simon, Jason Saragih, Shunsuke Saito, James Hays, Stephen Lombardi
Acquisition and rendering of photo-realistic human heads is a highly challenging research problem of particular importance for virtual telepresence. [Expand]
Thursday Poster Session
Greedy Hierarchical Variational Autoencoders for Large-Scale Video Prediction
Bohan Wu, Suraj Nair, Roberto Martin-Martin, Li Fei-Fei, Chelsea Finn
A video prediction model that generalizes to diverse scenes would enable intelligent agents such as robots to perform a variety of tasks via planning with the model. [Expand]
29.25
2
Monday Poster Session
Differentiable Patch Selection for Image Recognition
Jean-Baptiste Cordonnier, Aravindh Mahendran, Alexey Dosovitskiy, Dirk Weissenborn, Jakob Uszkoreit, Thomas Unterthiner
Neural Networks require large amounts of memory and compute to process high resolution images, even when only a small part of the image is actually informative for the task at hand. [Expand]
29.00
1
Monday Poster Session
PLOP: Learning Without Forgetting for Continual Semantic Segmentation
Arthur Douillard, Yifu Chen, Arnaud Dapogny, Matthieu Cord
Deep learning approaches are nowadays ubiquitously used to tackle computer vision tasks such as semantic segmentation, requiring large datasets and substantial computational power. [Expand]
28.75
3
Tuesday Poster Session
Activate or Not: Learning Customized Activation
Ningning Ma, Xiangyu Zhang, Ming Liu, Jian Sun
We present a simple, effective, and general activation function we term ACON which learns to activate the neurons or not. [Expand]
28.75
1
Wednesday Poster Session
Augmentation Strategies for Learning With Noisy Labels
Kento Nishi, Yi Ding, Alex Rich, Tobias Hollerer
Imperfect labels are ubiquitous in real-world datasets. [Expand]
28.75
2
Wednesday Poster Session
Home Action Genome: Cooperative Compositional Action Understanding
Nishant Rai, Haofeng Chen, Jingwei Ji, Rishi Desai, Kazuki Kozuka, Shun Ishizaka, Ehsan Adeli, Juan Carlos Niebles
Existing research on action recognition treats activities as monolithic events occurring in videos. [Expand]
Wednesday Poster Session
NeuTex: Neural Texture Mapping for Volumetric Neural Rendering
Fanbo Xiang, Zexiang Xu, Milos Hasan, Yannick Hold-Geoffroy, Kalyan Sunkavalli, Hao Su
Recent work has demonstrated that volumetric scene representations combined with differentiable volume rendering can enable photo-realistic rendering for challenging scenes that mesh reconstruction fails on. [Expand]
28.75
2
Wednesday Poster Session
Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback
Hui Wu, Yupeng Gao, Xiaoxiao Guo, Ziad Al-Halah, Steven Rennie, Kristen Grauman, Rogerio Feris
Conversational interfaces for the detail-oriented retail fashion domain are more natural, expressive, and user friendly than classical keyword-based search interfaces. [Expand]
28.50
8
Wednesday Poster Session
SCALE: Modeling Clothed Humans with a Surface Codec of Articulated Local Elements
Qianli Ma, Shunsuke Saito, Jinlong Yang, Siyu Tang, Michael J. Black
Learning to model and reconstruct humans in clothing is challenging due to articulation, non-rigid deformation, and varying clothing types and topologies. [Expand]
28.25
4
Friday Poster Session
MIST: Multiple Instance Spatial Transformer
Baptiste Angles, Yuhe Jin, Simon Kornblith, Andrea Tagliasacchi, Kwang Moo Yi
We propose a deep network that can be trained to tackle image reconstruction and classification problems that involve detection of multiple object instances, without any supervision regarding their whereabouts. [Expand]
Monday Poster Session
Learning Compositional Radiance Fields of Dynamic Human Heads
Ziyan Wang, Timur Bagautdinov, Stephen Lombardi, Tomas Simon, Jason Saragih, Jessica Hodgins, Michael Zollhofer
Photorealistic rendering of dynamic humans is an important ability for telepresence systems, virtual shopping, synthetic data generation, and more. [Expand]
27.75
5
Tuesday Poster Session
Pixel Codec Avatars
Shugao Ma, Tomas Simon, Jason Saragih, Dawei Wang, Yuecheng Li, Fernando De la Torre, Yaser Sheikh
Telecommunication with photorealistic avatars in virtual or augmented reality is a promising path for achieving authentic face-to-face communication in 3D over remote physical distances. [Expand]
Monday Poster Session
Visual Semantic Role Labeling for Video Understanding
Arka Sadhu, Tanmay Gupta, Mark Yatskar, Ram Nevatia, Aniruddha Kembhavi
We propose a new framework for understanding and representing related salient events in a video using visual semantic role labeling. [Expand]
Tuesday Poster Session
Benchmarking Representation Learning for Natural World Image Collections
Grant Van Horn, Elijah Cole, Sara Beery, Kimberly Wilber, Serge Belongie, Oisin Mac Aodha
Recent progress in self-supervised learning has resulted in models that are capable of extracting rich representations from image collections without requiring any explicit label supervision. [Expand]
27.50
1
Thursday Poster Session
Adversarial Generation of Continuous Images
Ivan Skorokhodov, Savva Ignatyev, Mohamed Elhoseiny
In most existing learning systems, images are typically viewed as 2D pixel arrays. [Expand]
27.25
8
Wednesday Poster Session
The Spatially-Correlative Loss for Various Image Translation Tasks
Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
We propose a novel spatially-correlative loss that is simple, efficient, and yet effective for preserving scene structure consistency while supporting large appearance changes during unpaired image-to-image (I2I) translation. [Expand]
Friday Poster Session
Ensembling With Deep Generative Views
Lucy Chai, Jun-Yan Zhu, Eli Shechtman, Phillip Isola, Richard Zhang
Recent generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose, simply by learning from unlabeled image collections. [Expand]
Thursday Poster Session
VITON-HD: High-Resolution Virtual Try-On via Misalignment-Aware Normalization
Seunghwan Choi, Sunghyun Park, Minsoo Lee, Jaegul Choo
The task of image-based virtual try-on aims to transfer a target clothing item onto the corresponding region of a person, which is commonly tackled by fitting the item to the desired body part and fusing the warped item with the person. [Expand]
Thursday Poster Session
SPSG: Self-Supervised Photometric Scene Generation From RGB-D Scans
Angela Dai, Yawar Siddiqui, Justus Thies, Julien Valentin, Matthias Niessner
We present SPSG, a novel approach to generate high-quality, colored 3D models of scenes from RGB-D scan observations by learning to infer unobserved scene geometry and color in a self-supervised fashion. [Expand]
26.75
2
Monday Poster Session
How Transferable Are Reasoning Patterns in VQA?
Corentin Kervadec, Theo Jaunet, Grigory Antipov, Moez Baccouche, Romain Vuillemot, Christian Wolf
Since its inception, Visual Question Answering (VQA) has been notorious as a task where models are prone to exploit biases in datasets to find shortcuts instead of performing high-level reasoning. [Expand]
26.75
4
Tuesday Poster Session
SOLD2: Self-Supervised Occlusion-Aware Line Description and Detection
Remi Pautrat, Juan-Ting Lin, Viktor Larsson, Martin R. Oswald, Marc Pollefeys
Compared to feature point detection and description, detecting and matching line segments offer additional challenges. [Expand]
Thursday Poster Session
Variational Transformer Networks for Layout Generation
Diego Martin Arroyo, Janis Postels, Federico Tombari
Generative models able to synthesize layouts of different kinds (e.g. [Expand]
26.50
1
Thursday Poster Session
End-to-End Human Pose and Mesh Reconstruction with Transformers
Kevin Lin, Lijuan Wang, Zicheng Liu
We present a new method, called MEsh TRansfOrmer (METRO), to reconstruct 3D human pose and mesh vertices from a single image. [Expand]
26.50
15
Monday Poster Session
SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks
Shunsuke Saito, Jinlong Yang, Qianli Ma, Michael J. Black
We present SCANimate, an end-to-end trainable framework that takes raw 3D scans of a clothed human and turns them into an animatable avatar. [Expand]
26.50
4
Tuesday Poster Session
Parser-Free Virtual Try-On via Distilling Appearance Flows
Yuying Ge, Yibing Song, Ruimao Zhang, Chongjian Ge, Wei Liu, Ping Luo
Image virtual try-on aims to fit a garment image (target clothes) to a person image. [Expand]
Wednesday Poster Session
Predator: Registration of 3D Point Clouds With Low Overlap
Shengyu Huang, Zan Gojcic, Mikhail Usvyatsov, Andreas Wieser, Konrad Schindler
We introduce PREDATOR, a model for pairwise pointcloud registration with deep attention to the overlap region. [Expand]
26.25
2
Tuesday Poster Session
Few-Shot Image Generation via Cross-Domain Correspondence
Utkarsh Ojha, Yijun Li, Jingwan Lu, Alexei A. Efros, Yong Jae Lee, Eli Shechtman, Richard Zhang
Training generative models, such as GANs, on a target domain containing limited examples (e.g., 10) can easily result in overfitting. [Expand]
Wednesday Poster Session
Learning Decision Trees Recurrently Through Communication
Stephan Alaniz, Diego Marcos, Bernt Schiele, Zeynep Akata
Integrated interpretability without sacrificing the prediction accuracy of decision making algorithms has the potential of greatly improving their value to the user. [Expand]
Thursday Poster Session
Teachers Do More Than Teach: Compressing Image-to-Image Models
Qing Jin, Jian Ren, Oliver J. Woodford, Jiazhuo Wang, Geng Yuan, Yanzhi Wang, Sergey Tulyakov
Generative Adversarial Networks (GANs) have achieved huge success in generating high-fidelity images; however, they suffer from low efficiency due to tremendous computational cost and bulky memory usage. [Expand]
26.00
1
Thursday Poster Session
Image-to-Image Translation via Hierarchical Style Disentanglement
Xinyang Li, Shengchuan Zhang, Jie Hu, Liujuan Cao, Xiaopeng Hong, Xudong Mao, Feiyue Huang, Yongjian Wu, Rongrong Ji
Recently, image-to-image translation has made significant progress in achieving both multi-label (i.e., translation conditioned on different labels) and multi-style (i.e., generation with diverse styles) tasks. [Expand]
Wednesday Poster Session
3D CNNs With Adaptive Temporal Feature Resolutions
Mohsen Fayyaz, Emad Bahrami, Ali Diba, Mehdi Noroozi, Ehsan Adeli, Luc Van Gool, Jurgen Gall
While state-of-the-art 3D Convolutional Neural Networks (CNN) achieve very good results on action recognition datasets, they are computationally very expensive and require many GFLOPs. [Expand]
Tuesday Poster Session
De-Rendering the World's Revolutionary Artefacts
Shangzhe Wu, Ameesh Makadia, Jiajun Wu, Noah Snavely, Richard Tucker, Angjoo Kanazawa
Recent works have shown exciting results in unsupervised image de-rendering--learning to decompose 3D shape, appearance, and lighting from single-image collections without explicit supervision. [Expand]
Tuesday Poster Session
VarifocalNet: An IoU-Aware Dense Object Detector
Haoyang Zhang, Ying Wang, Feras Dayoub, Niko Sunderhauf
Accurately ranking the vast number of candidate detections is crucial for dense object detectors to achieve high performance. [Expand]
25.75
5
Wednesday Poster Session
Multi-Objective Interpolation Training for Robustness To Label Noise
Diego Ortego, Eric Arazo, Paul Albert, Noel E. O'Connor, Kevin McGuinness
Deep neural networks trained with standard cross-entropy loss memorize noisy labels, which degrades their performance. [Expand]
25.50
1
Tuesday Poster Session
Center-Based 3D Object Detection and Tracking
Tianwei Yin, Xingyi Zhou, Philipp Krahenbuhl
Three-dimensional objects are commonly represented as 3D boxes in a point-cloud. [Expand]
25.50
19
Thursday Poster Session
A 3D GAN for Improved Large-Pose Facial Recognition
Richard T. Marriott, Sami Romdhani, Liming Chen
Facial recognition using deep convolutional neural networks relies on the availability of large datasets of face images. [Expand]
25.25
1
Thursday Poster Session
Pixel-Wise Anomaly Detection in Complex Driving Scenes
Giancarlo Di Biase, Hermann Blum, Roland Siegwart, Cesar Cadena
The inability of state-of-the-art semantic segmentation methods to detect anomaly instances hinders them from being deployed in safety-critical and complex applications, such as autonomous driving. [Expand]
25.00
1
Friday Poster Session
High-Fidelity and Arbitrary Face Editing
Yue Gao, Fangyun Wei, Jianmin Bao, Shuyang Gu, Dong Chen, Fang Wen, Zhouhui Lian
Cycle consistency is widely used for face editing. [Expand]
Friday Poster Session
StylePeople: A Generative Model of Fullbody Human Avatars
Artur Grigorev, Karim Iskakov, Anastasia Ianina, Renat Bashirov, Ilya Zakharkin, Alexander Vakhitov, Victor Lempitsky
We propose a new type of full-body human avatars, which combines parametric mesh-based body model with a neural texture. [Expand]
25.00
1
Tuesday Poster Session
How Privacy-Preserving Are Line Clouds? Recovering Scene Details From 3D Lines
Kunal Chelani, Fredrik Kahl, Torsten Sattler
Visual localization is the problem of estimating the camera pose of a given image with respect to a known scene. [Expand]
Friday Poster Session
Style-Aware Normalized Loss for Improving Arbitrary Style Transfer
Jiaxin Cheng, Ayush Jaiswal, Yue Wu, Pradeep Natarajan, Prem Natarajan
Neural Style Transfer (NST) has quickly evolved from single-style to infinite-style models, also known as Arbitrary Style Transfer (AST). [Expand]
Monday Poster Session
Multi-Stage Progressive Image Restoration
Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang, Ling Shao
Image restoration tasks demand a complex balance between spatial details and high-level contextualized information while recovering images. [Expand]
24.75
10
Thursday Poster Session
SelfAugment: Automatic Augmentation Policies for Self-Supervised Learning
Colorado J Reed, Sean Metzger, Aravind Srinivas, Trevor Darrell, Kurt Keutzer
A common practice in unsupervised representation learning is to use labeled data to evaluate the quality of the learned representations. [Expand]
24.50
3
Monday Poster Session
Learning the Superpixel in a Non-Iterative and Lifelong Manner
Lei Zhu, Qi She, Bin Zhang, Yanye Lu, Zhilin Lu, Duo Li, Jie Hu
Superpixel is generated by automatically clustering pixels in an image into hundreds of compact partitions, which is widely used to perceive the object contours for its excellent contour adherence. [Expand]
Monday Poster Session
Rainbow Memory: Continual Learning With a Memory of Diverse Samples
Jihwan Bang, Heesu Kim, YoungJoon Yoo, Jung-Woo Ha, Jonghyun Choi
Continual learning is a realistic learning scenario for AI models. [Expand]
Wednesday Poster Session
Open World Compositional Zero-Shot Learning
Massimiliano Mancini, Muhammad Ferjad Naeem, Yongqin Xian, Zeynep Akata
Compositional Zero-Shot Learning (CZSL) requires recognizing state-object compositions unseen during training. [Expand]
24.25
2
Tuesday Poster Session
Skeleton Merger: An Unsupervised Aligned Keypoint Detector
Ruoxi Shi, Zhengrong Xue, Yang You, Cewu Lu
Detecting aligned 3D keypoints is essential under many scenarios such as object tracking, shape retrieval and robotics. [Expand]
Monday Poster Session
Connecting What To Say With Where To Look by Modeling Human Attention Traces
Zihang Meng, Licheng Yu, Ning Zhang, Tamara L. Berg, Babak Damavandi, Vikas Singh, Amy Bearman
We introduce a unified framework to jointly model images, text, and human attention traces. [Expand]
Thursday Poster Session
Progressive Semantic-Aware Style Transformation for Blind Face Restoration
Chaofeng Chen, Xiaoming Li, Lingbo Yang, Xianhui Lin, Lei Zhang, Kwan-Yee K. Wong
Face restoration is important in face image processing, and has been widely studied in recent years. [Expand]
23.75
2
Thursday Poster Session
Point Cloud Upsampling via Disentangled Refinement
Ruihui Li, Xianzhi Li, Pheng-Ann Heng, Chi-Wing Fu
Point clouds produced by 3D scanning are often sparse, non-uniform, and noisy. [Expand]
Monday Poster Session
Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?
Malik Boudiaf, Hoel Kervadec, Ziko Imtiaz Masud, Pablo Piantanida, Ismail Ben Ayed, Jose Dolz
We show that the way inference is performed in few-shot segmentation tasks has a substantial effect on performances--an aspect often overlooked in the literature in favor of the meta-learning paradigm. [Expand]
23.50
3
Thursday Poster Session
MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization
Jiahui Huang, He Wang, Tolga Birdal, Minhyuk Sung, Federica Arrigoni, Shi-Min Hu, Leonidas J. Guibas
We present MultiBodySync, a novel, end-to-end trainable multi-body motion segmentation and rigid registration framework for multiple input 3D point clouds. [Expand]
23.50
1
Wednesday Poster Session
Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges
Qingyong Hu, Bo Yang, Sheikh Khalid, Wen Xiao, Niki Trigoni, Andrew Markham
An essential prerequisite for unleashing the potential of supervised deep learning algorithms in the area of 3D scene understanding is the availability of large-scale and richly annotated datasets. [Expand]
23.50
7
Tuesday Poster Session
CodedStereo: Learned Phase Masks for Large Depth-of-Field Stereo
Shiyu Tan, Yicheng Wu, Shoou-I Yu, Ashok Veeraraghavan
Conventional stereo suffers from a fundamental trade-off between imaging volume and signal-to-noise ratio (SNR) -- due to the conflicting impact of aperture size on both these variables. [Expand]
Wednesday Poster Session
CausalVAE: Disentangled Representation Learning via Neural Structural Causal Models
Mengyue Yang, Furui Liu, Zhitang Chen, Xinwei Shen, Jianye Hao, Jun Wang
Learning disentanglement aims at finding a low dimensional representation which consists of multiple explanatory and generative factors of the observational data. [Expand]
23.25
8
Wednesday Poster Session
Distilling Audio-Visual Knowledge by Compositional Contrastive Learning
Yanbei Chen, Yongqin Xian, A. Sophia Koepke, Ying Shan, Zeynep Akata
Having access to multi-modal cues (e.g. [Expand]
23.00
1
Tuesday Poster Session
Populating 3D Scenes by Learning Human-Scene Interaction
Mohamed Hassan, Partha Ghosh, Joachim Tesch, Dimitrios Tzionas, Michael J. Black
Humans live within a 3D space and constantly interact with it to perform tasks. [Expand]
23.00
2
Thursday Poster Session
CoCoNets: Continuous Contrastive 3D Scene Representations
Shamit Lal, Mihir Prabhudesai, Ishita Mediratta, Adam W. Harley, Katerina Fragkiadaki
This paper explores self-supervised learning of amodal 3D feature representations from RGB and RGB-D posed images and videos, agnostic to object and scene semantic content, and evaluates the resulting scene representations in the downstream tasks of visual correspondence, object tracking, and object detection. [Expand]
Thursday Poster Session
Adaptive Consistency Regularization for Semi-Supervised Transfer Learning
Abulikemu Abuduweili, Xingjian Li, Humphrey Shi, Cheng-Zhong Xu, Dejing Dou
While recent studies on semi-supervised learning have shown remarkable progress in leveraging both labeled and unlabeled data, most of them presume a basic setting in which the model is randomly initialized. [Expand]
22.75
1
Tuesday Poster Session
PhySG: Inverse Rendering With Spherical Gaussians for Physics-Based Material Editing and Relighting
Kai Zhang, Fujun Luan, Qianqian Wang, Kavita Bala, Noah Snavely
We present an end-to-end inverse rendering pipeline that includes a fully differentiable renderer, and can reconstruct geometry, materials, and illumination from scratch from a set of images. [Expand]
22.75
1
Tuesday Poster Session
Thinking Fast and Slow: Efficient Text-to-Visual Retrieval With Transformers
Antoine Miech, Jean-Baptiste Alayrac, Ivan Laptev, Josef Sivic, Andrew Zisserman
Our objective is language-based search of large-scale image and video datasets. [Expand]
22.50
1
Wednesday Poster Session
Energy-Based Learning for Scene Graph Generation
Mohammed Suhail, Abhay Mittal, Behjat Siddiquie, Chris Broaddus, Jayan Eledath, Gerard Medioni, Leonid Sigal
Traditional scene graph generation methods are trained using cross-entropy losses that treat objects and relationships as independent entities. [Expand]
22.50
1
Thursday Poster Session
Beyond Static Features for Temporally Consistent 3D Human Pose and Shape From a Video
Hongsuk Choi, Gyeongsik Moon, Ju Yong Chang, Kyoung Mu Lee
Despite the recent success of single image-based 3D human pose and shape estimation methods, recovering temporally consistent and smooth 3D human motion from a video is still challenging. [Expand]
22.25
2
Monday Poster Session
Generative Classifiers as a Basis for Trustworthy Image Classification
Radek Mackowiak, Lynton Ardizzone, Ullrich Kothe, Carsten Rother
With the maturing of deep learning systems, trustworthiness is becoming increasingly important for model assessment. [Expand]
22.25
2
Tuesday Poster Session
Unsupervised Visual Representation Learning by Tracking Patches in Video
Guangting Wang, Yizhou Zhou, Chong Luo, Wenxuan Xie, Wenjun Zeng, Zhiwei Xiong
Inspired by the fact that human eyes continue to develop tracking ability in early and middle childhood, we propose to use tracking as a proxy task for a computer vision system to learn the visual representations. [Expand]
Monday Poster Session
An Alternative Probabilistic Interpretation of the Huber Loss
Gregory P. Meyer
The Huber loss is a robust loss function used for a wide range of regression tasks. [Expand]
21.75
14
Tuesday Poster Session
CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation
Xingran Zhou, Bo Zhang, Ting Zhang, Pan Zhang, Jianmin Bao, Dong Chen, Zhongfei Zhang, Fang Wen
We present the full-resolution correspondence learning for cross-domain images, which aids image translation. [Expand]
Thursday Poster Session
Extreme Rotation Estimation Using Dense Correlation Volumes
Ruojin Cai, Bharath Hariharan, Noah Snavely, Hadar Averbuch-Elor
We present a technique for estimating the relative 3D rotation of an RGB image pair in an extreme setting, where the images have little or no overlap. [Expand]
21.50
1
Thursday Poster Session
Depth From Camera Motion and Object Detection
Brent A. Griffin, Jason J. Corso
This paper addresses the problem of learning to estimate the depth of detected objects given some measurement of camera motion (e.g., from robot kinematics or vehicle odometry). [Expand]
Monday Poster Session
Coordinate Attention for Efficient Mobile Network Design
Qibin Hou, Daquan Zhou, Jiashi Feng
Recent studies on mobile network design have demonstrated the remarkable effectiveness of channel attention (e.g., the Squeeze-and-Excitation attention) for lifting model performance, but they generally neglect the positional information, which is important for generating spatially selective attention maps. [Expand]
21.50
4
Thursday Poster Session
Universal Spectral Adversarial Attacks for Deformable Shapes
Arianna Rampini, Franco Pestarini, Luca Cosmo, Simone Melzi, Emanuele Rodola
Machine learning models are known to be vulnerable to adversarial attacks, namely perturbations of the data that lead to wrong predictions despite being imperceptible. [Expand]
Tuesday Poster Session
Unsupervised Real-World Image Super Resolution via Domain-Distance Aware Training
Yunxuan Wei, Shuhang Gu, Yawei Li, Radu Timofte, Longcun Jin, Hengjie Song
These days, unsupervised super-resolution (SR) is soaring due to its practical and promising potential in real scenarios. [Expand]
21.50
7
Thursday Poster Session
From Points to Multi-Object 3D Reconstruction
Francis Engelmann, Konstantinos Rematas, Bastian Leibe, Vittorio Ferrari
We propose a method to detect and reconstruct multiple 3D objects from a single RGB image. [Expand]
21.25
1
Tuesday Poster Session
Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation
Hugo Germain, Vincent Lepetit, Guillaume Bourmaud
Absolute camera pose estimation is usually addressed by sequentially solving two distinct subproblems: First a feature matching problem that seeks to establish putative 2D-3D correspondences, and then a Perspective-n-Point problem that minimizes, w.r.t. [Expand]
21.25
1
Monday Poster Session
Repetitive Activity Counting by Sight and Sound
Yunhua Zhang, Ling Shao, Cees G. M. Snoek
This paper strives for repetitive activity counting in videos. [Expand]
Thursday Poster Session
Contrastive Learning for Compact Single Image Dehazing
Haiyan Wu, Yanyun Qu, Shaohui Lin, Jian Zhou, Ruizhi Qiao, Zhizhong Zhang, Yuan Xie, Lizhuang Ma
Single image dehazing is a challenging ill-posed problem due to the severe information degeneration. [Expand]
21.00
1
Wednesday Poster Session
Learning To Segment Rigid Motions From Two Frames
Gengshan Yang, Deva Ramanan
Appearance-based detectors achieve remarkable performance on common scenes, benefiting from high-capacity models and massive annotated data, but tend to fail for scenarios that lack training data. [Expand]
Monday Poster Session
Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation
Mingi Ji, Seungjae Shin, Seunghyun Hwang, Gibeom Park, Il-Chul Moon
Knowledge distillation is a method of transferring the knowledge from a pretrained complex teacher model to a student model, so a smaller network can replace a large teacher network at the deployment stage. [Expand]
Wednesday Poster Session
VideoMoCo: Contrastive Video Representation Learning With Temporally Adversarial Examples
Tian Pan, Yibing Song, Tianyu Yang, Wenhao Jiang, Wei Liu
MoCo is effective for unsupervised image representation learning. [Expand]
20.75
2
Wednesday Poster Session
Seesaw Loss for Long-Tailed Instance Segmentation
Jiaqi Wang, Wenwei Zhang, Yuhang Zang, Yuhang Cao, Jiangmiao Pang, Tao Gong, Kai Chen, Ziwei Liu, Chen Change Loy, Dahua Lin
Instance segmentation has witnessed remarkable progress on class-balanced benchmarks. [Expand]
20.75
2
Wednesday Poster Session
S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling
Ze Yang, Shenlong Wang, Sivabalan Manivasagam, Zeng Huang, Wei-Chiu Ma, Xinchen Yan, Ersin Yumer, Raquel Urtasun
Constructing and animating humans is an important component for building virtual worlds in a wide variety of applications such as virtual reality or robotics testing in simulation. [Expand]
20.75
2
Thursday Poster Session
The Lottery Tickets Hypothesis for Supervised and Self-Supervised Pre-Training in Computer Vision Models
Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang
The computer vision world has been regaining enthusiasm for various pre-trained models, including both classical ImageNet supervised pre-training and recently emerged self-supervised pre-training such as SimCLR and MoCo. [Expand]
20.50
15
Friday Poster Session
Continual Adaptation of Visual Representations via Domain Randomization and Meta-Learning
Riccardo Volpi, Diane Larlus, Gregory Rogez
Most standard learning approaches lead to fragile models which are prone to drift when sequentially trained on samples of a different nature -- the well-known "catastrophic forgetting" issue. [Expand]
Tuesday Poster Session
No Shadow Left Behind: Removing Objects and Their Shadows Using Approximate Lighting and Geometry
Edward Zhang, Ricardo Martin-Brualla, Janne Kontkanen, Brian L. Curless
Removing objects from images is a challenging technical problem that is important for many applications, including mixed reality. [Expand]
Friday Poster Session
4D Panoptic LiDAR Segmentation
Mehmet Aygun, Aljosa Osep, Mark Weber, Maxim Maximov, Cyrill Stachniss, Jens Behley, Laura Leal-Taixe
Temporal semantic scene understanding is critical for self-driving cars or robots operating in dynamic environments. [Expand]
20.25
2
Tuesday Poster Session
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo, Piyush Sharma, Nan Ding, Radu Soricut
The availability of large-scale image captioning and visual question answering datasets has contributed significantly to recent successes in vision-and-language pre-training. [Expand]
20.25
2
Tuesday Poster Session
Semi-Supervised Synthesis of High-Resolution Editable Textures for 3D Humans
Bindita Chaudhuri, Nikolaos Sarafianos, Linda Shapiro, Tony Tung
We introduce a novel approach to generate diverse high fidelity texture maps for 3D human meshes in a semi-supervised setup. [Expand]
Wednesday Poster Session
Roses Are Red, Violets Are Blue... but Should VQA Expect Them To?
Corentin Kervadec, Grigory Antipov, Moez Baccouche, Christian Wolf
Models for Visual Question Answering (VQA) are notorious for their tendency to rely on dataset biases, as the large and unbalanced diversity of questions and concepts involved tends to prevent models from learning to "reason", leading them to perform "educated guesses" instead. [Expand]
19.75
7
Monday Poster Session
Sketch2Model: View-Aware 3D Modeling From Single Free-Hand Sketches
Song-Hai Zhang, Yuan-Chen Guo, Qing-Wen Gu
We investigate the problem of generating 3D meshes from single free-hand sketches, aiming at fast 3D modeling for novice users. [Expand]
Tuesday Poster Session
ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis
Yinan He, Bei Gan, Siyu Chen, Yichun Zhou, Guojun Yin, Luchuan Song, Lu Sheng, Jing Shao, Ziwei Liu
The rapid progress of photorealistic synthesis techniques has reached a critical point where the boundary between real and manipulated images starts to blur. [Expand]
Tuesday Poster Session
Ranking Neural Checkpoints
Yandong Li, Xuhui Jia, Ruoxin Sang, Yukun Zhu, Bradley Green, Liqiang Wang, Boqing Gong
This paper is concerned with ranking many pre-trained deep neural networks (DNNs), called checkpoints, for the transfer learning to a downstream task. [Expand]
19.25
1
Monday Poster Session
TediGAN: Text-Guided Diverse Face Image Generation and Manipulation
Weihao Xia, Yujiu Yang, Jing-Hao Xue, Baoyuan Wu
In this work, we propose TediGAN, a novel framework for multi-modal image generation and manipulation with textual descriptions. [Expand]
Monday Poster Session
Image Restoration for Under-Display Camera
Yuqian Zhou, David Ren, Neil Emerton, Sehoon Lim, Timothy Large
The new trend of full-screen devices encourages us to position a camera behind a screen. [Expand]
19.25
5
Wednesday Poster Session
Monocular Real-Time Full Body Capture With Inter-Part Correlations
Yuxiao Zhou, Marc Habermann, Ikhsanul Habibie, Ayush Tewari, Christian Theobalt, Feng Xu
We present the first method for real-time full body capture that estimates shape and motion of body and hands together with a dynamic 3D face model from a single color image. [Expand]
19.25
2
Tuesday Poster Session
Patch2Pix: Epipolar-Guided Pixel-Level Correspondences
Qunjie Zhou, Torsten Sattler, Laura Leal-Taixe
The classical matching pipeline used for visual localization typically involves three steps: (i) local feature detection and description, (ii) feature matching, and (iii) outlier rejection. [Expand]
19.25
1
Tuesday Poster Session
Exemplar-Based Open-Set Panoptic Segmentation Network
Jaedong Hwang, Seoung Wug Oh, Joon-Young Lee, Bohyung Han
We extend panoptic segmentation to the open-world and introduce an open-set panoptic segmentation (OPS) task. [Expand]
19.00
1
Monday Poster Session
High-Fidelity Neural Human Motion Transfer From Monocular Video
Moritz Kappel, Vladislav Golyanik, Mohamed Elgharib, Jann-Ole Henningson, Hans-Peter Seidel, Susana Castillo, Christian Theobalt, Marcus Magnor
Video-based human motion transfer creates video animations of humans following a source motion. [Expand]
19.00
1
Monday Poster Session
Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection
Xiang Li, Wenhai Wang, Xiaolin Hu, Jun Li, Jinhui Tang, Jian Yang
Localization Quality Estimation (LQE) is crucial and popular in the recent advancement of dense object detectors since it can provide accurate ranking scores that benefit the Non-Maximum Suppression processing and improve detection performance. [Expand]
19.00
5
Thursday Poster Session
3D Spatial Recognition Without Spatially Labeled 3D
Zhongzheng Ren, Ishan Misra, Alexander G. Schwing, Rohit Girdhar
We introduce WyPR, a Weakly-supervised framework for Point cloud Recognition, requiring only scene-level class tags as supervision. [Expand]
Thursday Poster Session
Unpaired Image-to-Image Translation via Latent Energy Transport
Yang Zhao, Changyou Chen
Image-to-image translation aims to preserve source contents while translating to discriminative target styles between two visual domains. [Expand]
19.00
1
Friday Poster Session
DeepVideoMVS: Multi-View Stereo on Video With Recurrent Spatio-Temporal Fusion
Arda Duzceker, Silvano Galliani, Christoph Vogel, Pablo Speciale, Mihai Dusmanu, Marc Pollefeys
We propose an online multi-view depth prediction approach on posed video streams, where the scene geometry information computed in the previous time steps is propagated to the current time step in an efficient and geometrically plausible way. [Expand]
18.50
1
Thursday Poster Session
i3DMM: Deep Implicit 3D Morphable Model of Human Heads
Tarun Yenamandra, Ayush Tewari, Florian Bernard, Hans-Peter Seidel, Mohamed Elgharib, Daniel Cremers, Christian Theobalt
We present the first deep implicit 3D morphable model (i3DMM) of full heads. [Expand]
18.50
1
Thursday Poster Session
Hierarchical Motion Understanding via Motion Programs
Sumith Kulal, Jiayuan Mao, Alex Aiken, Jiajun Wu
Current approaches to video analysis of human motion focus on raw pixels or keypoints as the basic units of reasoning. [Expand]
Tuesday Poster Session
MagFace: A Universal Representation for Face Recognition and Quality Assessment
Qiang Meng, Shichao Zhao, Zhida Huang, Feng Zhou
The performance of face recognition systems degrades when the variability of the acquired faces increases. [Expand]
18.00
3
Thursday Poster Session
Style-Based Point Generator With Adversarial Rendering for Point Cloud Completion
Chulin Xie, Chuxin Wang, Bo Zhang, Hao Yang, Dong Chen, Fang Wen
In this paper, we propose a novel Style-based Point Generator with Adversarial Rendering (SpareNet) for point cloud completion. [Expand]
18.00
2
Tuesday Poster Session
UnsupervisedR&R: Unsupervised Point Cloud Registration via Differentiable Rendering
Mohamed El Banani, Luya Gao, Justin Johnson
Aligning partial views of a scene into a single whole is essential to understanding one's environment and is a key component of numerous robotics tasks such as SLAM and SfM. [Expand]
17.75
2
Wednesday Poster Session
ANR: Articulated Neural Rendering for Virtual Avatars
Amit Raj, Julian Tanke, James Hays, Minh Vo, Carsten Stoll, Christoph Lassner
Deferred Neural Rendering (DNR) uses a three-step pipeline to translate a mesh representation into an RGB image. [Expand]
17.75
4
Tuesday Poster Session
Dense Contrastive Learning for Self-Supervised Visual Pre-Training
Xinlong Wang, Rufeng Zhang, Chunhua Shen, Tao Kong, Lei Li
To date, most existing self-supervised learning methods are designed and optimized for image classification. [Expand]
17.75
15
Tuesday Poster Session
Binary TTC: A Temporal Geofence for Autonomous Navigation
Abhishek Badki, Orazio Gallo, Jan Kautz, Pradeep Sen
Time-to-contact (TTC), the time for an object to collide with the observer's plane, is a powerful tool for path planning: it is potentially more informative than the depth, velocity, and acceleration of objects in the scene, even for humans. [Expand]
Thursday Poster Session
Deep Active Surface Models
Udaranga Wickramasinghe, Pascal Fua, Graham Knott
Active Surface Models have a long history of being useful to model complex 3D surfaces. [Expand]
17.50
1
Thursday Poster Session
Vx2Text: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs
Xudong Lin, Gedas Bertasius, Jue Wang, Shih-Fu Chang, Devi Parikh, Lorenzo Torresani
We present Vx2Text, a framework for text generation from multimodal inputs consisting of video plus text, speech, or audio. [Expand]
17.25
1
Tuesday Poster Session
Mask Guided Matting via Progressive Refinement Network
Qihang Yu, Jianming Zhang, He Zhang, Yilin Wang, Zhe Lin, Ning Xu, Yutong Bai, Alan Yuille
We propose Mask Guided (MG) Matting, a robust matting framework that takes a general coarse mask as guidance. [Expand]
Monday Poster Session
Masksembles for Uncertainty Estimation
Nikita Durasov, Timur Bagautdinov, Pierre Baque, Pascal Fua
Deep neural networks have amply demonstrated their prowess but estimating the reliability of their predictions remains challenging. [Expand]
17.00
4
Thursday Poster Session
Uncalibrated Neural Inverse Rendering for Photometric Stereo of General Surfaces
Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc Van Gool
This paper presents an uncalibrated deep neural network framework for the photometric stereo problem. [Expand]
Tuesday Poster Session
Learning Graph Embeddings for Compositional Zero-Shot Learning
Muhammad Ferjad Naeem, Yongqin Xian, Federico Tombari, Zeynep Akata
In compositional zero-shot learning, the goal is to recognize unseen compositions (e.g. [Expand]
17.00
2
Monday Poster Session
Multiple Instance Captioning: Learning Representations From Histopathology Textbooks and Articles
Jevgenij Gamper, Nasir Rajpoot
We present ARCH, a computational pathology (CP) multiple instance captioning dataset to facilitate dense supervision of CP tasks. [Expand]
Friday Poster Session
Quantifying Explainers of Graph Neural Networks in Computational Pathology
Guillaume Jaume, Pushpak Pati, Behzad Bozorgtabar, Antonio Foncubierta, Anna Maria Anniciello, Florinda Feroce, Tilman Rau, Jean-Philippe Thiran, Maria Gabrani, Orcun Goksel
Explainability of deep learning methods is imperative to facilitate their clinical adoption in digital pathology. [Expand]
16.75
4
Wednesday Poster Session
NPAS: A Compiler-Aware Framework of Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration
Zhengang Li, Geng Yuan, Wei Niu, Pu Zhao, Yanyu Li, Yuxuan Cai, Xuan Shen, Zheng Zhan, Zhenglun Kong, Qing Jin, Zhiyu Chen, Sijia Liu, Kaiyuan Yang, Bin Ren, Yanzhi Wang, Xue Lin
With the increasing demand to efficiently deploy DNNs on mobile edge devices, it becomes much more important to reduce unnecessary computation and increase the execution speed. [Expand]
Thursday Poster Session
Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE
Jialun Peng, Dong Liu, Songcen Xu, Houqiang Li
Given an incomplete image without additional constraint, image inpainting natively allows for multiple solutions as long as they appear plausible. [Expand]
Wednesday Poster Session
Temporal Query Networks for Fine-Grained Video Understanding
Chuhan Zhang, Ankush Gupta, Andrew Zisserman
Our objective in this work is fine-grained classification of actions in untrimmed videos, where the actions may be temporally extended or may span only a few frames of the video. [Expand]
Tuesday Poster Session
Points As Queries: Weakly Semi-Supervised Object Detection by Points
Liangyu Chen, Tong Yang, Xiangyu Zhang, Wei Zhang, Jian Sun
We propose a novel point-annotated setting for the weakly semi-supervised object detection task, in which the dataset comprises a small set of fully annotated images and a large set of images weakly annotated by points. [Expand]
Wednesday Poster Session
How Well Do Self-Supervised Models Transfer?
Linus Ericsson, Henry Gouk, Timothy M. Hospedales
Self-supervised visual representation learning has seen huge progress recently, but no large scale evaluation has compared the many models now available. [Expand]
16.50
7
Tuesday Poster Session
Learning To Relate Depth and Semantics for Unsupervised Domain Adaptation
Suman Saha, Anton Obukhov, Danda Pani Paudel, Menelaos Kanakis, Yuhua Chen, Stamatios Georgoulis, Luc Van Gool
We present an approach for encoding visual task relationships to improve model performance in an Unsupervised Domain Adaptation (UDA) setting. [Expand]
Wednesday Poster Session
LOHO: Latent Optimization of Hairstyles via Orthogonalization
Rohit Saha, Brendan Duke, Florian Shkurti, Graham W. Taylor, Parham Aarabi
Hairstyle transfer is challenging due to hair structure differences in the source and target hair. [Expand]
16.50
1
Monday Poster Session
Visual Room Rearrangement
Luca Weihs, Matt Deitke, Aniruddha Kembhavi, Roozbeh Mottaghi
There has been significant recent progress in the field of Embodied AI with researchers developing models and algorithms enabling embodied agents to navigate and interact within completely unseen environments. [Expand]
Tuesday Poster Session
AdaBins: Depth Estimation Using Adaptive Bins
Shariq Farooq Bhat, Ibraheem Alhashim, Peter Wonka
We address the problem of estimating a high quality dense depth map from a single RGB input image. [Expand]
16.25
9
Tuesday Poster Session
AutoDO: Robust AutoAugment for Biased Data With Label Noise via Scalable Probabilistic Implicit Differentiation
Denis Gudovskiy, Luca Rigazio, Shun Ishizaka, Kazuki Kozuka, Sotaro Tsukizawa
AutoAugment has sparked an interest in automated augmentation methods for deep learning models. [Expand]
Friday Poster Session
SimPLE: Similar Pseudo Label Exploitation for Semi-Supervised Classification
Zijian Hu, Zhengyu Yang, Xuefeng Hu, Ram Nevatia
A common classification task situation is where one has a large amount of data available for training, but only a small portion is annotated with class labels. [Expand]
16.00
1
Thursday Poster Session
SMURF: Self-Teaching Multi-Frame Unsupervised RAFT With Full-Image Warping
Austin Stone, Daniel Maurer, Alper Ayvaci, Anelia Angelova, Rico Jonschkowski
We present SMURF, a method for unsupervised learning of optical flow that improves state of the art on all benchmarks by 36% to 40% and even outperforms several supervised approaches such as PWC-Net and FlowNet2. [Expand]
Tuesday Poster Session
Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning
Zhenda Xie, Yutong Lin, Zheng Zhang, Yue Cao, Stephen Lin, Han Hu
Contrastive learning methods for unsupervised visual representation learning have reached remarkable levels of transfer performance. [Expand]
16.00
16
Friday Poster Session
WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition
Zheng Zhu, Guan Huang, Jiankang Deng, Yun Ye, Junjie Huang, Xinze Chen, Jiagang Zhu, Tian Yang, Jiwen Lu, Dalong Du, Jie Zhou
In this paper, we contribute a new million-scale face benchmark containing noisy 4M identities/260M faces (WebFace260M) and cleaned 2M identities/42M faces (WebFace42M) training data, as well as an elaborately designed time-constrained evaluation protocol. [Expand]
16.00
3
Wednesday Poster Session
Sequential Graph Convolutional Network for Active Learning
Razvan Caramalau, Binod Bhattarai, Tae-Kyun Kim
We propose a novel pool-based Active Learning framework constructed on a sequential Graph Convolution Network (GCN). [Expand]
15.75
1
Wednesday Poster Session
Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes
Julian Chibane, Aayush Bansal, Verica Lazova, Gerard Pons-Moll
Recent neural view synthesis methods have achieved impressive quality and realism, surpassing classical pipelines which rely on multi-view reconstruction. [Expand]
Wednesday Poster Session
Few-Shot Human Motion Transfer by Personalized Geometry and Texture Modeling
Zhichao Huang, Xintong Han, Jia Xu, Tong Zhang
We present a new method for few-shot human motion transfer that achieves realistic human image generation with only a small number of appearance inputs. [Expand]
Monday Poster Session
HOTR: End-to-End Human-Object Interaction Detection With Transformers
Bumsoo Kim, Junhyun Lee, Jaewoo Kang, Eun-Sol Kim, Hyunwoo J. Kim
Human-Object Interaction (HOI) detection is a task of identifying "a set of interactions" in an image, which involves the i) localization of the subject (i.e., humans) and target (i.e., objects) of interaction, and ii) the classification of the interaction labels. [Expand]
15.75
2
Monday Poster Session
Spatially Consistent Representation Learning
Byungseok Roh, Wuhyun Shin, Ildoo Kim, Sungwoong Kim
Self-supervised learning has been widely used to obtain transferable representations from unlabeled images. [Expand]
15.75
3
Monday Poster Session
HDR Environment Map Estimation for Real-Time Augmented Reality
Gowri Somanath, Daniel Kurz
We present a method to estimate an HDR environment map from a narrow field-of-view LDR camera image in real-time. [Expand]
Wednesday Poster Session
A Realistic Evaluation of Semi-Supervised Learning for Fine-Grained Classification
Jong-Chyi Su, Zezhou Cheng, Subhransu Maji
We evaluate the effectiveness of semi-supervised learning (SSL) on a realistic benchmark where data exhibits considerable class imbalance and contains images from novel classes. [Expand]
15.75
3
Thursday Poster Session
Mesoscopic Photogrammetry With an Unstabilized Phone Camera
Kevin C. Zhou, Colin Cooke, Jaehee Park, Ruobing Qian, Roarke Horstmeyer, Joseph A. Izatt, Sina Farsiu
We present a feature-free photogrammetric technique that enables quantitative 3D mesoscopic (mm-scale height variation) imaging with tens-of-micron accuracy from sequences of images acquired by a smartphone at close range (several cm) under freehand motion without additional hardware. [Expand]
Wednesday Poster Session
Global Transport for Fluid Reconstruction With Learned Self-Supervision
Erik Franz, Barbara Solenthaler, Nils Thuerey
We propose a novel method to reconstruct volumetric flows from sparse views via a global transport formulation. [Expand]
Monday Poster Session
RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction
Yinyu Nie, Ji Hou, Xiaoguang Han, Matthias Niessner
Semantic scene understanding from point clouds is particularly challenging as the points reflect only a sparse set of the underlying 3D geometry. [Expand]
15.50
2
Tuesday Poster Session
Repopulating Street Scenes
Yifan Wang, Andrew Liu, Richard Tucker, Jiajun Wu, Brian L. Curless, Steven M. Seitz, Noah Snavely
We present a framework for automatically reconfiguring images of street scenes by populating, depopulating, or repopulating them with objects such as pedestrians or vehicles. [Expand]
Tuesday Poster Session
Towards High Fidelity Face Relighting With Realistic Shadows
Andrew Hou, Ze Zhang, Michel Sarkis, Ning Bi, Yiying Tong, Xiaoming Liu
Existing face relighting methods often struggle with two problems: maintaining the local facial details of the subject and accurately removing and synthesizing shadows in the relit image, especially hard shadows. [Expand]
Thursday Poster Session
KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA
Kenneth Marino, Xinlei Chen, Devi Parikh, Abhinav Gupta, Marcus Rohrbach
One of the most challenging question types in VQA is when answering the question requires outside knowledge not present in the image. [Expand]
15.25
2
Thursday Poster Session
Generalized Domain Adaptation
Yu Mitsuzumi, Go Irie, Daiki Ikami, Takashi Shibata
Many variants of unsupervised domain adaptation (UDA) problems have been proposed and solved individually. [Expand]
Monday Poster Session
DECOR-GAN: 3D Shape Detailization by Conditional Refinement
Zhiqin Chen, Vladimir G. Kim, Matthew Fisher, Noam Aigerman, Hao Zhang, Siddhartha Chaudhuri
We introduce a deep generative network for 3D shape detailization, akin to stylization with the style being geometric details. [Expand]
15.00
1
Friday Poster Session
Continual Learning via Bit-Level Information Preserving
Yujun Shi, Li Yuan, Yunpeng Chen, Jiashi Feng
Continual learning tackles the setting of learning different tasks sequentially. [Expand]
Friday Poster Session
Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds
Li Yi, Boqing Gong, Thomas Funkhouser
We study an unsupervised domain adaptation problem for the semantic labeling of 3D point clouds, with a particular focus on domain discrepancies induced by different LiDAR sensors. [Expand]
15.00
8
Thursday Poster Session
IIRC: Incremental Implicitly-Refined Classification
Mohamed Abdelsalam, Mojtaba Faramarzi, Shagun Sodhani, Sarath Chandar
We introduce the 'Incremental Implicitly-Refined Classification (IIRC)' setup, an extension to the class incremental learning setup where the incoming batches of classes have two granularity levels. [Expand]
14.75
2
Wednesday Poster Session
Depth Completion Using Plane-Residual Representation
Byeong-Uk Lee, Kyunghyun Lee, In So Kweon
The basic framework of depth completion is to predict a pixel-wise dense depth map using very sparse input data. [Expand]
Thursday Poster Session
Orthogonal Over-Parameterized Training
Weiyang Liu, Rongmei Lin, Zhen Liu, James M. Rehg, Liam Paull, Li Xiong, Le Song, Adrian Weller
The inductive bias of a neural network is largely determined by the architecture and the training algorithm. [Expand]
14.75
7
Wednesday Poster Session
Towards Real-World Blind Face Restoration With Generative Facial Prior
Xintao Wang, Yu Li, Honglun Zhang, Ying Shan
Blind face restoration usually relies on facial priors, such as facial geometry prior or reference prior, to restore realistic and faithful details. [Expand]
14.75
1
Wednesday Poster Session
Instance Localization for Self-Supervised Detection Pretraining
Ceyuan Yang, Zhirong Wu, Bolei Zhou, Stephen Lin
Prior research on self-supervised learning has led to considerable progress on image classification, but often with degraded transfer performance on object detection. [Expand]
14.75
3
Tuesday Poster Session
Multiresolution Knowledge Distillation for Anomaly Detection
Mohammadreza Salehi, Niousha Sadjadi, Soroosh Baselizadeh, Mohammad H. Rohban, Hamid R. Rabiee
Unsupervised representation learning has proved to be a critical component of anomaly detection/localization in images. [Expand]
14.50
3
Thursday Poster Session
SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration
Sheng Ao, Qingyong Hu, Bo Yang, Andrew Markham, Yulan Guo
Extracting robust and general 3D local features is key to downstream tasks such as point cloud registration and reconstruction. [Expand]
14.25
4
Thursday Poster Session
Unsupervised 3D Shape Completion Through GAN Inversion
Junzhe Zhang, Xinyi Chen, Zhongang Cai, Liang Pan, Haiyu Zhao, Shuai Yi, Chai Kiat Yeo, Bo Dai, Chen Change Loy
Most 3D shape completion approaches rely heavily on partial-complete shape pairs and learn in a fully supervised manner. [Expand]
Monday Poster Session
End-to-End Human Object Interaction Detection With HOI Transformer
Cheng Zou, Bohan Wang, Yue Hu, Junqi Liu, Qian Wu, Yu Zhao, Boxun Li, Chenguang Zhang, Chi Zhang, Yichen Wei, Jian Sun
We propose HOI Transformer to tackle human object interaction (HOI) detection in an end-to-end manner. [Expand]
14.00
2
Thursday Poster Session
Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting
Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song
Self-supervised learning has gained prominence due to its efficacy at learning powerful representations from unlabelled data that achieve excellent performance on many challenging downstream tasks. [Expand]
13.75
3
Tuesday Poster Session
DexYCB: A Benchmark for Capturing Hand Grasping of Objects
Yu-Wei Chao, Wei Yang, Yu Xiang, Pavlo Molchanov, Ankur Handa, Jonathan Tremblay, Yashraj S. Narang, Karl Van Wyk, Umar Iqbal, Stan Birchfield, Jan Kautz, Dieter Fox
We introduce DexYCB, a new dataset for capturing hand grasping of objects. [Expand]
Wednesday Poster Session
Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration
Shaofei Wang, Andreas Geiger, Siyu Tang
Registering point clouds of dressed humans to parametric human models is a challenging task in computer vision. [Expand]
13.75
3
Wednesday Poster Session
Wide-Baseline Multi-Camera Calibration Using Person Re-Identification
Yan Xu, Yu-Jhe Li, Xinshuo Weng, Kris Kitani
We address the problem of estimating the 3D pose of a network of cameras for large-environment wide-baseline scenarios, e.g., cameras for construction sites, sports stadiums, and public spaces. [Expand]
Thursday Poster Session
How2Sign: A Large-Scale Multimodal Dataset for Continuous American Sign Language
Amanda Duarte, Shruti Palaskar, Lucas Ventura, Deepti Ghadiyaram, Kenneth DeHaan, Florian Metze, Jordi Torres, Xavier Giro-i-Nieto
One of the factors that have hindered progress in the areas of sign language recognition, translation, and production is the absence of large annotated datasets. [Expand]
13.50
4
Monday Poster Session
AGORA: Avatars in Geography Optimized for Regression Analysis
Priyanka Patel, Chun-Hao P. Huang, Joachim Tesch, David T. Hoffmann, Shashank Tripathi, Michael J. Black
While the accuracy of 3D human pose estimation from images has steadily improved on benchmark datasets, the best methods still fail in many real-world scenarios. [Expand]
13.50
4
Thursday Poster Session
ViP-DeepLab: Learning Visual Perception With Depth-Aware Video Panoptic Segmentation
Siyuan Qiao, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
In this paper, we present ViP-DeepLab, a unified model attempting to tackle the long-standing and challenging inverse projection problem in vision, which we model as restoring the point clouds from perspective image sequences while providing each point with instance-level semantic interpretations. [Expand]
13.50
3
Tuesday Poster Session
FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding
Bo Sun, Banghuai Li, Shengcai Cai, Ye Yuan, Chi Zhang
There is emerging interest in recognizing previously unseen objects given very few training examples, a task known as few-shot object detection (FSOD). [Expand]
13.50
1
Wednesday Poster Session
Lifting 2D StyleGAN for 3D-Aware Face Generation
Yichun Shi, Divyansh Aggarwal, Anil K. Jain
We propose a framework, called LiftedGAN, that disentangles and lifts a pre-trained StyleGAN2 for 3D-aware face generation. [Expand]
Tuesday Poster Session
Self-Supervised Learning of Depth Inference for Multi-View Stereo
Jiayu Yang, Jose M. Alvarez, Miaomiao Liu
Recent supervised multi-view depth estimation networks have achieved promising results. [Expand]
Wednesday Poster Session
Unsupervised Human Pose Estimation Through Transforming Shape Templates
Luca Schmidtke, Athanasios Vlontzos, Simon Ellershaw, Anna Lukens, Tomoki Arichi, Bernhard Kainz
Human pose estimation is a major computer vision problem with applications ranging from augmented reality and video capture to surveillance and movement tracking. [Expand]
Monday Poster Session
LiDAR-Based Panoptic Segmentation via Dynamic Shifting Network
Fangzhou Hong, Hui Zhou, Xinge Zhu, Hongsheng Li, Ziwei Liu
With the rapid advances of autonomous driving, it becomes critical to equip its sensing system with more holistic 3D perception. [Expand]
12.75
1
Thursday Poster Session
Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer
Tianwei Lin, Zhuoqi Ma, Fu Li, Dongliang He, Xin Li, Errui Ding, Nannan Wang, Jie Li, Xinbo Gao
Artistic style transfer aims at migrating the style from an example image to a content image. [Expand]
12.75
1
Tuesday Poster Session
Neural Surface Maps
Luca Morreale, Noam Aigerman, Vladimir G. Kim, Niloy J. Mitra
Maps are arguably one of the most fundamental concepts used to define and operate on manifold surfaces in differentiable geometry. [Expand]
Tuesday Poster Session
Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting With Their Explanations
Wolfgang Stammer, Patrick Schramowski, Kristian Kersting
Most explanation methods in deep learning map importance estimates for a model's prediction back to the original input space. [Expand]
12.75
3
Tuesday Poster Session
A Deep Emulator for Secondary Motion of 3D Characters
Mianlun Zheng, Yi Zhou, Duygu Ceylan, Jernej Barbic
Fast and light-weight methods for animating 3D characters are desirable in various applications such as computer games. [Expand]
Tuesday Poster Session
Content-Aware GAN Compression
Yuchen Liu, Zhixin Shu, Yijun Li, Zhe Lin, Federico Perazzi, Sun-Yuan Kung
Generative adversarial networks (GANs), e.g., StyleGAN2, play a vital role in various image generation and synthesis tasks, yet their notoriously high computational cost hinders their efficient deployment on edge devices. [Expand]
Thursday Poster Session
Faster Meta Update Strategy for Noise-Robust Deep Learning
Youjiang Xu, Linchao Zhu, Lu Jiang, Yi Yang
It has been shown that deep neural networks are prone to overfitting on biased training data. [Expand]
12.50
2
Monday Poster Session
StereoPIFu: Depth Aware Clothed Human Digitization via Stereo Vision
Yang Hong, Juyong Zhang, Boyi Jiang, Yudong Guo, Ligang Liu, Hujun Bao
In this paper, we propose StereoPIFu, which integrates the geometric constraints of stereo vision with implicit function representation of PIFu, to recover the 3D shape of the clothed human from a pair of low-cost rectified images. [Expand]
12.25
1
Monday Poster Session
CoMoGAN: Continuous Model-Guided Image-to-Image Translation
Fabio Pizzati, Pietro Cerri, Raoul de Charette
CoMoGAN is a continuous GAN relying on the unsupervised reorganization of the target data on a functional manifold. [Expand]
Thursday Poster Session
Self-Supervised Motion Learning From Static Images
Ziyuan Huang, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Rong Jin, Marcelo H. Ang
Motions are reflected in videos as the movement of pixels, and actions are essentially patterns of inconsistent motions between the foreground and the background. [Expand]
12.00
3
Monday Poster Session
KOALAnet: Blind Super-Resolution Using Kernel-Oriented Adaptive Local Adjustment
Soo Ye Kim, Hyeonjun Sim, Munchurl Kim
Blind super-resolution (SR) methods aim to generate a high quality high resolution image from a low resolution image containing unknown degradations. [Expand]
Wednesday Poster Session
3DCaricShop: A Dataset and a Baseline Method for Single-View 3D Caricature Face Reconstruction
Yuda Qiu, Xiaojie Xu, Lingteng Qiu, Yan Pan, Yushuang Wu, Weikai Chen, Xiaoguang Han
Caricature is an artistic representation that deliberately exaggerates the distinctive features of a human face to convey humor or sarcasm. [Expand]
Wednesday Poster Session
Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation
Saquib Sarfraz, Naila Murray, Vivek Sharma, Ali Diba, Luc Van Gool, Rainer Stiefelhagen
Action segmentation refers to inferring boundaries of semantically consistent visual concepts in videos and is an important requirement for many video understanding tasks. [Expand]
Wednesday Poster Session
Self-Supervised Visibility Learning for Novel View Synthesis
Yujiao Shi, Hongdong Li, Xin Yu
We address the problem of novel view synthesis (NVS) from a few sparse source view images. [Expand]
Wednesday Poster Session
AdaStereo: A Simple and Efficient Approach for Adaptive Stereo Matching
Xiao Song, Guorun Yang, Xinge Zhu, Hui Zhou, Zhe Wang, Jianping Shi
Recently, records on stereo matching benchmarks have been constantly broken by end-to-end disparity networks. [Expand]
12.00
6
Wednesday Poster Session
UC2: Universal Cross-Lingual Cross-Modal Vision-and-Language Pre-Training
Mingyang Zhou, Luowei Zhou, Shuohang Wang, Yu Cheng, Linjie Li, Zhou Yu, Jingjing Liu
Vision-and-language pre-training has achieved impressive success in learning multimodal representations between vision and language. [Expand]
Tuesday Poster Session
HyperSeg: Patch-Wise Hypernetwork for Real-Time Semantic Segmentation
Yuval Nirkin, Lior Wolf, Tal Hassner
We present a novel, real-time, semantic segmentation network in which the encoder both encodes and generates the parameters (weights) of the decoder. [Expand]
11.75
1
Tuesday Poster Session
Inverting Generative Adversarial Renderer for Face Reconstruction
Jingtan Piao, Keqiang Sun, Quan Wang, Kwan-Yee Lin, Hongsheng Li
Given a monocular face image as input, 3D face geometry reconstruction aims to recover a corresponding 3D face mesh. [Expand]
Friday Poster Session
Categorical Depth Distribution Network for Monocular 3D Object Detection
Cody Reading, Ali Harakeh, Julia Chae, Steven L. Waslander
Monocular 3D object detection is a key problem for autonomous vehicles, as it provides a solution with simple configuration compared to typical multi-sensor systems. [Expand]
11.75
1
Wednesday Poster Session
Monocular Reconstruction of Neural Face Reflectance Fields
Mallikarjun B R, Ayush Tewari, Tae-Hyun Oh, Tim Weyrich, Bernd Bickel, Hans-Peter Seidel, Hanspeter Pfister, Wojciech Matusik, Mohamed Elgharib, Christian Theobalt
The reflectance field of a face describes the reflectance properties responsible for complex lighting effects including diffuse, specular, inter-reflection, and self-shadowing. [Expand]
11.75
2
Tuesday Poster Session
IMAGINE: Image Synthesis by Image-Guided Model Inversion
Pei Wang, Yijun Li, Krishna Kumar Singh, Jingwan Lu, Nuno Vasconcelos
Synthesizing variations of a specific reference image with semantically valid content is an important task in terms of personalized generation as well as for data augmentation. [Expand]
11.75
1
Tuesday Poster Session
Kaleido-BERT: Vision-Language Pre-Training on Fashion Domain
Mingchen Zhuge, Dehong Gao, Deng-Ping Fan, Linbo Jin, Ben Chen, Haoming Zhou, Minghui Qiu, Ling Shao
We present a new vision-language (VL) pre-training model dubbed Kaleido-BERT, which introduces a novel kaleido strategy for fashion cross-modality representations from transformers. [Expand]
11.75
1
Thursday Poster Session
M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-Training
Minheng Ni, Haoyang Huang, Lin Su, Edward Cui, Taroon Bharti, Lijuan Wang, Dongdong Zhang, Nan Duan
We present M3P, a Multitask Multilingual Multimodal Pre-trained model that combines multilingual pre-training and multimodal pre-training into a unified framework via multitask pre-training. [Expand]
11.50
5
Tuesday Poster Session
(AF)2-S3Net: Attentive Feature Fusion With Adaptive Feature Selection for Sparse Semantic Segmentation Network
Ran Cheng, Ryan Razani, Ehsan Taghavi, Enxu Li, Bingbing Liu
Autonomous robotic systems and self-driving cars rely on accurate perception of their surroundings, as the safety of the passengers and pedestrians is the top priority. [Expand]
11.25
3
Thursday Poster Session
Neighbor2Neighbor: Self-Supervised Denoising From Single Noisy Images
Tao Huang, Songjiang Li, Xu Jia, Huchuan Lu, Jianzhuang Liu
In the last few years, image denoising has benefited a lot from the fast development of neural networks. [Expand]
11.25
1
Thursday Poster Session
Task-Aware Variational Adversarial Active Learning
Kwanyoung Kim, Dongwon Park, Kwang In Kim, Se Young Chun
Often, labeling large amounts of data is challenging due to high labeling costs, limiting the application domain of deep learning techniques. [Expand]
11.25
8
Wednesday Poster Session
Roof-GAN: Learning To Generate Roof Geometry and Relations for Residential Houses
Yiming Qian, Hao Zhang, Yasutaka Furukawa
This paper presents Roof-GAN, a novel generative adversarial network that generates structured geometry of residential roof structures as a set of roof primitives and their relationships. [Expand]
Monday Poster Session
ReMix: Towards Image-to-Image Translation With Limited Data
Jie Cao, Luanxuan Hou, Ming-Hsuan Yang, Ran He, Zhenan Sun
Image-to-image (I2I) translation methods based on generative adversarial networks (GANs) typically suffer from overfitting when limited training data is available. [Expand]
Thursday Poster Session
VaB-AL: Incorporating Class Imbalance and Difficulty With Variational Bayes for Active Learning
Jongwon Choi, Kwang Moo Yi, Jihoon Kim, Jinho Choo, Byoungjip Kim, Jinyeop Chang, Youngjune Gwon, Hyung Jin Chang
Active Learning for discriminative models has largely been studied with the focus on individual samples, with less emphasis on how classes are distributed or which classes are hard to deal with. [Expand]
11.00
1
Tuesday Poster Session
DeFLOCNet: Deep Image Editing via Flexible Low-Level Controls
Hongyu Liu, Ziyu Wan, Wei Huang, Yibing Song, Xintong Han, Jing Liao, Bin Jiang, Wei Liu
User-intended visual content fills the hole regions of an input image in the image editing scenario. [Expand]
Wednesday Poster Session
Learnable Motion Coherence for Correspondence Pruning
Yuan Liu, Lingjie Liu, Cheng Lin, Zhen Dong, Wenping Wang
Motion coherence is an important clue for distinguishing true correspondences from false ones. [Expand]
Tuesday Poster Session
Pose Recognition With Cascade Transformers
Ke Li, Shijie Wang, Xiang Zhang, Yifan Xu, Weijian Xu, Zhuowen Tu
In this paper, we present a regression-based pose recognition method using cascade Transformers. [Expand]
Monday Poster Session
LEAP: Learning Articulated Occupancy of People
Marko Mihajlovic, Yan Zhang, Michael J. Black, Siyu Tang
Substantial progress has been made on modeling rigid 3D objects using deep implicit representations. [Expand]
11.00
3
Wednesday Poster Session
Learning Delaunay Surface Elements for Mesh Reconstruction
Marie-Julie Rakotosaona, Paul Guerrero, Noam Aigerman, Niloy J. Mitra, Maks Ovsjanikov
We present a method for reconstructing triangle meshes from point clouds. [Expand]
Monday Poster Session
3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection
He Wang, Yezhen Cong, Or Litany, Yue Gao, Leonidas J. Guibas
3D object detection is an important yet demanding task that heavily relies on difficult-to-obtain 3D annotations. [Expand]
11.00
1
Thursday Poster Session
Adversarial Robustness Under Long-Tailed Distribution
Tong Wu, Ziwei Liu, Qingqiu Huang, Yu Wang, Dahua Lin
Adversarial robustness has attracted extensive studies recently by revealing the vulnerability and intrinsic characteristics of deep networks. [Expand]
Wednesday Poster Session
ReNAS: Relativistic Evaluation of Neural Architecture Search
Yixing Xu, Yunhe Wang, Kai Han, Yehui Tang, Shangling Jui, Chunjing Xu, Chang Xu
An effective and efficient architecture performance evaluation scheme is essential for the success of Neural Architecture Search (NAS). [Expand]
11.00
11
Tuesday Poster Session
Camouflaged Object Segmentation With Distraction Mining
Haiyang Mei, Ge-Peng Ji, Ziqi Wei, Xin Yang, Xiaopeng Wei, Deng-Ping Fan
Camouflaged object segmentation (COS) aims to identify objects that are "perfectly" assimilated into their surroundings, which has a wide range of valuable applications. [Expand]
10.75
3
Wednesday Poster Session
Learning Complete 3D Morphable Face Models From Images and Videos
Mallikarjun B R, Ayush Tewari, Hans-Peter Seidel, Mohamed Elgharib, Christian Theobalt
Most 3D face reconstruction methods rely on 3D morphable models, which disentangle the space of facial deformations into identity and expression geometry, and skin reflectance. [Expand]
10.75
1
Tuesday Poster Session
QPIC: Query-Based Pairwise Human-Object Interaction Detection With Image-Wide Contextual Information
Masato Tamura, Hiroki Ohashi, Tomoaki Yoshinaga
We propose a simple, intuitive yet powerful method for human-object interaction (HOI) detection. [Expand]
10.75
1
Wednesday Poster Session
Lite-HRNet: A Lightweight High-Resolution Network
Changqian Yu, Bin Xiao, Changxin Gao, Lu Yuan, Lei Zhang, Nong Sang, Jingdong Wang
We present an efficient high-resolution network, Lite-HRNet, for human pose estimation. [Expand]
10.75
3
Wednesday Poster Session
Topological Planning With Transformers for Vision-and-Language Navigation
Kevin Chen, Junshen K. Chen, Jo Chuang, Marynel Vazquez, Silvio Savarese
Conventional approaches to vision-and-language navigation (VLN) are trained end-to-end but struggle to perform well in freely traversable environments. [Expand]
10.50
2
Wednesday Poster Session
KeepAugment: A Simple Information-Preserving Data Augmentation Approach
Chengyue Gong, Dilin Wang, Meng Li, Vikas Chandra, Qiang Liu
Data augmentation (DA) is an essential technique for training state-of-the-art deep learning systems. [Expand]
10.50
2
Monday Poster Session
Black-Box Explanation of Object Detectors via Saliency Maps
Vitali Petsiuk, Rajiv Jain, Varun Manjunatha, Vlad I. Morariu, Ashutosh Mehra, Vicente Ordonez, Kate Saenko
We propose D-RISE, a method for generating visual explanations for the predictions of object detectors. [Expand]
10.50
3
Thursday Poster Session
CanonPose: Self-Supervised Monocular 3D Human Pose Estimation in the Wild
Bastian Wandt, Marco Rudolph, Petrissa Zell, Helge Rhodin, Bodo Rosenhahn
Human pose estimation from single images is a challenging problem in computer vision that requires large amounts of labeled training data to be solved accurately. [Expand]
10.50
3
Thursday Poster Session
Alpha-Refine: Boosting Tracking Performance by Precise Bounding Box Estimation
Bin Yan, Xinyu Zhang, Dong Wang, Huchuan Lu, Xiaoyun Yang
Visual object tracking aims to precisely estimate the bounding box for the given target, which is a challenging problem due to factors such as deformation and occlusion. [Expand]
10.50
8
Tuesday Poster Session
Point Cloud Instance Segmentation Using Probabilistic Embeddings
Biao Zhang, Peter Wonka
In this paper, we propose a new framework for point cloud instance segmentation. [Expand]
10.50
6
Wednesday Poster Session
Adversarially Adaptive Normalization for Single Domain Generalization
Xinjie Fan, Qifei Wang, Junjie Ke, Feng Yang, Boqing Gong, Mingyuan Zhou
Single domain generalization aims to learn a model that performs well on many unseen domains with data from only one domain for training. [Expand]
Wednesday Poster Session
ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic
Xiangtao Kong, Hengyuan Zhao, Yu Qiao, Chao Dong
We aim at accelerating super-resolution (SR) networks on large images (2K-8K). [Expand]
Thursday Poster Session
Weakly Supervised Instance Segmentation for Videos With Temporal Mask Consistency
Qing Liu, Vignesh Ramanathan, Dhruv Mahajan, Alan Yuille, Zhenheng Yang
Weakly supervised instance segmentation reduces the cost of annotations required to train models. [Expand]
Thursday Poster Session
Learning the Predictability of the Future
Didac Suris, Ruoshi Liu, Carl Vondrick
We introduce a framework for learning from unlabeled video what is predictable in the future. [Expand]
Thursday Poster Session
GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation
Gu Wang, Fabian Manhardt, Federico Tombari, Xiangyang Ji
6D pose estimation from a single RGB image is a fundamental task in computer vision. [Expand]
Friday Poster Session
Visually Informed Binaural Audio Generation without Binaural Audios
Xudong Xu, Hang Zhou, Ziwei Liu, Bo Dai, Xiaogang Wang, Dahua Lin
Stereophonic audio, especially binaural audio, plays an essential role in immersive viewing environments. [Expand]
10.25
4
Thursday Poster Session
DCNAS: Densely Connected Neural Architecture Search for Semantic Image Segmentation
Xiong Zhang, Hongmin Xu, Hong Mo, Jianchao Tan, Cheng Yang, Lei Wang, Wenqi Ren
Existing NAS methods for dense image prediction tasks usually compromise on a restricted search space or search on a proxy task to meet the achievable computational demands. [Expand]
10.25
7
Thursday Poster Session
BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond
Kelvin C.K. Chan, Xintao Wang, Ke Yu, Chao Dong, Chen Change Loy
Video super-resolution (VSR) approaches tend to have more components than the image counterparts as they need to exploit the additional temporal dimension. [Expand]
10.00
10
Tuesday Poster Session
Generalizable Pedestrian Detection: The Elephant in the Room
Irtiza Hasan, Shengcai Liao, Jinpeng Li, Saad Ullah Akram, Ling Shao
Pedestrian detection is used in many vision-based applications ranging from video surveillance to autonomous driving. [Expand]
10.00
3
Wednesday Poster Session
Progressive Semantic Segmentation
Chuong Huynh, Anh Tuan Tran, Khoa Luu, Minh Hoai
The objective of this work is to segment high-resolution images without overloading GPU memory usage or losing the fine details in the output segmentation map. [Expand]
Friday Poster Session
AdCo: Adversarial Contrast for Efficient Learning of Unsupervised Representations From Self-Trained Negative Adversaries
Qianjiang Hu, Xiao Wang, Wei Hu, Guo-Jun Qi
Contrastive learning relies on constructing a collection of negative examples that are sufficiently hard to discriminate against positive queries when their representations are self-trained. [Expand]
10.00
5
Monday Poster Session
Weakly-Supervised Physically Unconstrained Gaze Estimation
Rakshit Kothari, Shalini De Mello, Umar Iqbal, Wonmin Byeon, Seonwook Park, Jan Kautz
A major challenge for physically unconstrained gaze estimation is acquiring training data with 3D gaze annotations for in-the-wild and outdoor scenarios. [Expand]
Wednesday Poster Session
Self-Point-Flow: Self-Supervised Scene Flow Estimation From Point Clouds With Optimal Transport and Random Walk
Ruibo Li, Guosheng Lin, Lihua Xie
Due to the scarcity of annotated scene flow data, self-supervised scene flow learning in point clouds has attracted increasing attention. [Expand]
Friday Poster Session
ProSelfLC: Progressive Self Label Correction for Training Robust Deep Neural Networks
Xinshao Wang, Yang Hua, Elyor Kodirov, David A. Clifton, Neil M. Robertson
To train robust deep neural networks (DNNs), we systematically study several target modification approaches, which include output regularisation, self and non-self label correction (LC). [Expand]
10.00
5
Monday Poster Session
Correlated Input-Dependent Label Noise in Large-Scale Image Classification
Mark Collier, Basil Mustafa, Efi Kokiopoulou, Rodolphe Jenatton, Jesse Berent
Large scale image classification datasets often contain noisy labels. [Expand]
9.75
1
Monday Poster Session
Learning to Track Instances without Video Annotations
Yang Fu, Sifei Liu, Umar Iqbal, Shalini De Mello, Humphrey Shi, Jan Kautz
Tracking segmentation masks of multiple instances has been intensively studied, but still faces two fundamental challenges: 1) the requirement of large-scale, frame-wise annotation, and 2) the complexity of two-stage approaches. [Expand]
Wednesday Poster Session
HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching
Vladimir Tankovich, Christian Hane, Yinda Zhang, Adarsh Kowdle, Sean Fanello, Sofien Bouaziz
This paper presents HITNet, a novel neural network architecture for real-time stereo matching. [Expand]
9.75
4
Thursday Poster Session
PatchmatchNet: Learned Multi-View Patchmatch Stereo
Fangjinhua Wang, Silvano Galliani, Christoph Vogel, Pablo Speciale, Marc Pollefeys
We present PatchmatchNet, a novel and learnable cascade formulation of Patchmatch for high-resolution multi-view stereo. [Expand]
Thursday Poster Session
Unsupervised Feature Learning by Cross-Level Instance-Group Discrimination
Xudong Wang, Ziwei Liu, Stella X. Yu
Unsupervised feature learning has made great strides with contrastive learning based on instance discrimination and invariant mapping, as benchmarked on curated class-balanced datasets. [Expand]
Thursday Poster Session
Pose-Guided Human Animation From a Single Image in the Wild
Jae Shin Yoon, Lingjie Liu, Vladislav Golyanik, Kripasindhu Sarkar, Hyun Soo Park, Christian Theobalt
We present a new pose transfer method for synthesizing a human animation from a single image of a person controlled by a sequence of body poses. [Expand]
9.75
2
Thursday Poster Session
Adversarial Imaging Pipelines
Buu Phan, Fahim Mannan, Felix Heide
Adversarial attacks play a critical role in understanding deep neural network predictions and improving their robustness. [Expand]
Friday Poster Session
Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking
Fatemeh Saleh, Sadegh Aliakbarian, Hamid Rezatofighi, Mathieu Salzmann, Stephen Gould
Despite the recent advances in multiple object tracking (MOT), achieved by joint detection and tracking, dealing with long occlusions remains a challenge. [Expand]
9.50
1
Thursday Poster Session
Pareidolia Face Reenactment
Linsen Song, Wayne Wu, Chaoyou Fu, Chen Qian, Chen Change Loy, Ran He
We present a new application direction named Pareidolia Face Reenactment, which is defined as animating a static illusory face to move in tandem with a human face in the video. [Expand]
Monday Poster Session
Counterfactual Zero-Shot and Open-Set Visual Recognition
Zhongqi Yue, Tan Wang, Qianru Sun, Xian-Sheng Hua, Hanwang Zhang
We present a novel counterfactual framework for both Zero-Shot Learning (ZSL) and Open-Set Recognition (OSR), whose common challenge is generalizing to the unseen-classes by only training on the seen-classes. [Expand]
9.50
4
Thursday Poster Session
Digital Gimbal: End-to-End Deep Image Stabilization With Learnable Exposure Times
Omer Dahary, Matan Jacoby, Alex M. Bronstein
Mechanical image stabilization using actuated gimbals enables capturing long-exposure shots without suffering from blur due to camera motion. [Expand]
Thursday Poster Session
FBNetV3: Joint Architecture-Recipe Search Using Predictor Pretraining
Xiaoliang Dai, Alvin Wan, Peizhao Zhang, Bichen Wu, Zijian He, Zhen Wei, Kan Chen, Yuandong Tian, Matthew Yu, Peter Vajda, Joseph E. Gonzalez
Neural Architecture Search (NAS) yields state-of-the-art neural networks that outperform their best manually-designed counterparts. [Expand]
Friday Poster Session
Checkerboard Context Model for Efficient Learned Image Compression
Dailan He, Yaoyan Zheng, Baocheng Sun, Yan Wang, Hongwei Qin
For learned image compression, the autoregressive context model has proved effective in improving the rate-distortion (RD) performance. [Expand]
Thursday Poster Session
Audio-Driven Emotional Video Portraits
Xinya Ji, Hang Zhou, Kaisiyuan Wang, Wayne Wu, Chen Change Loy, Xun Cao, Feng Xu
Despite previous success in generating audio-driven talking heads, most of the previous studies focus on the correlation between speech content and the mouth shape. [Expand]
9.25
1
Thursday Poster Session
MAZE: Data-Free Model Stealing Attack Using Zeroth-Order Gradient Estimation
Sanjay Kariyappa, Atul Prakash, Moinuddin K Qureshi
High quality Machine Learning (ML) models are often considered valuable intellectual property by companies. [Expand]
9.25
7
Thursday Poster Session
AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling
Dilin Wang, Meng Li, Chengyue Gong, Vikas Chandra
Neural architecture search (NAS) has shown great promise in designing state-of-the-art (SOTA) models that are both accurate and efficient. [Expand]
9.25
5
Tuesday Poster Session
Generative PointNet: Deep Energy-Based Learning on Unordered Point Sets for 3D Generation, Reconstruction and Classification
Jianwen Xie, Yifei Xu, Zilong Zheng, Song-Chun Zhu, Ying Nian Wu
We propose a generative model of unordered point sets, such as point clouds, in the form of an energy-based model, where the energy function is parameterized by an input-permutation-invariant bottom-up neural network. [Expand]
Thursday Poster Session
HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens
Zhaohui Yang, Yunhe Wang, Xinghao Chen, Jianyuan Guo, Wei Zhang, Chao Xu, Chunjing Xu, Dacheng Tao, Chang Xu
Neural Architecture Search (NAS) aims to automatically discover optimal architectures. [Expand]
9.25
4
Wednesday Poster Session
SimPoE: Simulated Character Control for 3D Human Pose Estimation
Ye Yuan, Shih-En Wei, Tomas Simon, Kris Kitani, Jason Saragih
Accurate estimation of 3D human motion from monocular video requires modeling both kinematics (body motion without physical forces) and dynamics (motion with physical forces). [Expand]
9.25
1
Wednesday Poster Session
Pushing It Out of the Way: Interactive Visual Navigation
Kuo-Hao Zeng, Luca Weihs, Ali Farhadi, Roozbeh Mottaghi
We have observed significant progress in visual navigation for embodied agents. [Expand]
Wednesday Poster Session
Asymmetric Metric Learning for Knowledge Transfer
Mateusz Budnik, Yannis Avrithis
Knowledge transfer from large teacher models to smaller student models has recently been studied for metric learning, focusing on fine-grained classification. [Expand]
9.00
2
Wednesday Poster Session
Distilling Knowledge via Knowledge Review
Pengguang Chen, Shu Liu, Hengshuang Zhao, Jiaya Jia
Knowledge distillation transfers knowledge from the teacher network to the student one, with the goal of greatly improving the performance of the student network. [Expand]
Tuesday Poster Session
Deep Polarization Imaging for 3D Shape and SVBRDF Acquisition
Valentin Deschaintre, Yiming Lin, Abhijeet Ghosh
We present a novel method for efficient acquisition of shape and spatially varying reflectance of 3D objects using polarization cues. [Expand]
Friday Poster Session
Searching by Generating: Flexible and Efficient One-Shot NAS With Architecture Generator
Sian-Yao Huang, Wei-Ta Chu
In one-shot NAS, sub-networks need to be searched from the supernet to meet different hardware constraints. [Expand]
Monday Poster Session
Revamping Cross-Modal Recipe Retrieval With Hierarchical Transformers and Self-Supervised Learning
Amaia Salvador, Erhan Gundogdu, Loris Bazzani, Michael Donoser
Cross-modal recipe retrieval has recently gained substantial attention due to the importance of food in people's lives, as well as the availability of vast amounts of digital cooking recipes and food images to train machine learning models. [Expand]
Thursday Poster Session
AutoFlow: Learning a Better Training Set for Optical Flow
Deqing Sun, Daniel Vlasic, Charles Herrmann, Varun Jampani, Michael Krainin, Huiwen Chang, Ramin Zabih, William T. Freeman, Ce Liu
Synthetic datasets play a critical role in pre-training CNN models for optical flow, but they are painstaking to generate and hard to adapt to new applications. [Expand]
Wednesday Poster Session
PISE: Person Image Synthesis and Editing With Decoupled GAN
Jinsong Zhang, Kun Li, Yu-Kun Lai, Jingyu Yang
Person image synthesis, e.g., pose transfer, is a challenging problem due to large variation and occlusion. [Expand]
9.00
1
Wednesday Poster Session
Improving Calibration for Long-Tailed Recognition
Zhisheng Zhong, Jiequan Cui, Shu Liu, Jiaya Jia
Deep neural networks may perform poorly when training datasets are heavily class-imbalanced. [Expand]
Friday Poster Session
MP3: A Unified Model To Map, Perceive, Predict and Plan
Sergio Casas, Abbas Sadat, Raquel Urtasun
High-definition maps (HD maps) are a key component of most modern self-driving systems due to their valuable semantic and geometric information. [Expand]
Thursday Poster Session
Domain-Independent Dominance of Adaptive Methods
Pedro Savarese, David McAllester, Sudarshan Babu, Michael Maire
From a simplified analysis of adaptive methods, we derive AvaGrad, a new optimizer which outperforms SGD on vision tasks when its adaptability is properly tuned. [Expand]
8.75
5
Friday Poster Session
Rotation Equivariant Siamese Networks for Tracking
Deepak K. Gupta, Devanshu Arya, Efstratios Gavves
Rotation is among the long prevailing, yet still unresolved, hard challenges encountered in visual object tracking. [Expand]
Thursday Poster Session
Improving Unsupervised Image Clustering With Robust Learning
Sungwon Park, Sungwon Han, Sundong Kim, Danu Kim, Sungkyu Park, Seunghoon Hong, Meeyoung Cha
Unsupervised image clustering methods often introduce alternative objectives to indirectly train the model and are subject to faulty predictions and overconfident results. [Expand]
8.50
1
Thursday Poster Session
Curriculum Graph Co-Teaching for Multi-Target Domain Adaptation
Subhankar Roy, Evgeny Krivosheev, Zhun Zhong, Nicu Sebe, Elisa Ricci
In this paper we address multi-target domain adaptation (MTDA), where given one labeled source dataset and multiple unlabeled target datasets that differ in data distributions, the task is to learn a robust predictor for all the target domains. [Expand]
Tuesday Poster Session
Adaptive Class Suppression Loss for Long-Tail Object Detection
Tong Wang, Yousong Zhu, Chaoyang Zhao, Wei Zeng, Jinqiao Wang, Ming Tang
To address the problem of long-tail distribution for the large vocabulary object detection task, existing methods usually divide the whole categories into several groups and treat each group with different strategies. [Expand]
Tuesday Poster Session
Improved Image Matting via Real-Time User Clicks and Uncertainty Estimation
Tianyi Wei, Dongdong Chen, Wenbo Zhou, Jing Liao, Hanqing Zhao, Weiming Zhang, Nenghai Yu
Image matting is a fundamental and challenging problem in computer vision and graphics. [Expand]
Thursday Poster Session
Mitigating Face Recognition Bias via Group Adaptive Classifier
Sixue Gong, Xiaoming Liu, Anil K. Jain
Face recognition is known to exhibit bias -- subjects in a certain demographic group can be better recognized than other groups. [Expand]
8.25
5
Tuesday Poster Session
Interpolation-Based Semi-Supervised Learning for Object Detection
Jisoo Jeong, Vikas Verma, Minsung Hyun, Juho Kannala, Nojun Kwak
Despite the data labeling cost for the object detection tasks being substantially more than that of the classification tasks, semi-supervised learning methods for object detection have not been studied much. [Expand]
8.25
5
Thursday Poster Session
Learning Invariant Representations and Risks for Semi-Supervised Domain Adaptation
Bo Li, Yezhen Wang, Shanghang Zhang, Dongsheng Li, Kurt Keutzer, Trevor Darrell, Han Zhao
The success of supervised learning crucially hinges on the assumption that training data matches test data, which rarely holds in practice due to potential distribution shift. [Expand]
8.25
4
Monday Poster Session
DeepSurfels: Learning Online Appearance Fusion
Marko Mihajlovic, Silvan Weder, Marc Pollefeys, Martin R. Oswald
We present DeepSurfels, a novel hybrid scene representation for geometry and appearance information. [Expand]
Thursday Poster Session
Neural Camera Simulators
Hao Ouyang, Zifan Shi, Chenyang Lei, Ka Lung Law, Qifeng Chen
We present a controllable camera simulator based on deep neural networks to synthesize raw image data under different camera settings, including exposure time, ISO, and aperture. [Expand]
Wednesday Poster Session
The Neural Tangent Link Between CNN Denoisers and Non-Local Filters
Julian Tachella, Junqi Tang, Mike Davies
Convolutional Neural Networks (CNNs) are now a well-established tool for solving computational imaging problems. [Expand]
8.25
3
Wednesday Poster Session
Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food
Quin Thames, Arjun Karpur, Wade Norris, Fangting Xia, Liviu Panait, Tobias Weyand, Jack Sim
Understanding the nutritional content of food from visual data is a challenging computer vision problem, with the potential to have a positive and widespread impact on public health. [Expand]
8.25
2
Wednesday Poster Session
Data-Free Model Extraction
Jean-Baptiste Truong, Pratyush Maini, Robert J. Walls, Nicolas Papernot
Current model extraction attacks assume that the adversary has access to a surrogate dataset with characteristics similar to the proprietary data used to train the victim model. [Expand]
8.25
2
Tuesday Poster Session
The Multi-Temporal Urban Development SpaceNet Dataset
Adam Van Etten, Daniel Hogan, Jesus Martinez Manso, Jacob Shermeyer, Nicholas Weir, Ryan Lewis
Satellite imagery analytics have numerous human development and disaster response applications, particularly when time series methods are involved. [Expand]
8.25
1
Tuesday Poster Session
Cross-View Regularization for Domain Adaptive Panoptic Segmentation
Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu
Panoptic segmentation unifies semantic segmentation and instance segmentation, and has been attracting increasing attention in recent years. [Expand]
8.00
8
Wednesday Poster Session
Fully Convolutional Networks for Panoptic Segmentation
Yanwei Li, Hengshuang Zhao, Xiaojuan Qi, Liwei Wang, Zeming Li, Jian Sun, Jiaya Jia
In this paper, we present a conceptually simple, strong, and efficient framework for panoptic segmentation, called Panoptic FCN. [Expand]
8.00
2
Monday Poster Session
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization
Junting Pan, Siyu Chen, Mike Zheng Shou, Yu Liu, Jing Shao, Hongsheng Li
Localizing persons and recognizing their actions from videos is a challenging task towards high-level video understanding. [Expand]
8.00
4
Monday Poster Session
Using Shape To Categorize: Low-Shot Learning With an Explicit Shape Bias
Stefan Stojanov, Anh Thai, James M. Rehg
It is widely accepted that reasoning about object shape is important for object recognition. [Expand]
Monday Poster Session
Structured Scene Memory for Vision-Language Navigation
Hanqing Wang, Wenguan Wang, Wei Liang, Caiming Xiong, Jianbing Shen
Recently, numerous algorithms have been developed to tackle the problem of vision-language navigation (VLN), i.e., requiring an agent to navigate 3D environments by following linguistic instructions. [Expand]
8.00
1
Wednesday Poster Session
Dense Label Encoding for Boundary Discontinuity Free Rotation Detection
Xue Yang, Liping Hou, Yue Zhou, Wentao Wang, Junchi Yan
Rotation detection serves as a fundamental building block in many visual applications involving aerial images, scene text, faces, etc. [Expand]
8.00
7
Friday Poster Session
Rethinking BiSeNet for Real-Time Semantic Segmentation
Mingyuan Fan, Shenqi Lai, Junshi Huang, Xiaoming Wei, Zhenhua Chai, Junfeng Luo, Xiaolin Wei
BiSeNet has proven to be a popular two-stream network for real-time segmentation. [Expand]
7.75
1
Wednesday Poster Session
Image Inpainting Guided by Coherence Priors of Semantics and Textures
Liang Liao, Jing Xiao, Zheng Wang, Chia-Wen Lin, Shin'ichi Satoh
Existing inpainting methods have achieved promising performance in recovering defected images of specific scenes. [Expand]
Tuesday Poster Session
Fair Attribute Classification Through Latent Space De-Biasing
Vikram V. Ramaswamy, Sunnie S. Y. Kim, Olga Russakovsky
Fairness in visual recognition is becoming a prominent and critical topic of discussion as recognition systems are deployed at scale in the real world. [Expand]
7.75
3
Wednesday Poster Session
DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation
Xinyi Wu, Zhenyao Wu, Hao Guo, Lili Ju, Song Wang
Semantic segmentation of nighttime images plays an equally important role as that of daytime images in autonomous driving, but the former is much more challenging due to poor illumination and arduous human annotation. [Expand]
7.75
1
Friday Poster Session
Rethinking Text Segmentation: A Novel Dataset and a Text-Specific Refinement Approach
Xingqian Xu, Zhifei Zhang, Zhaowen Wang, Brian Price, Zhonghao Wang, Humphrey Shi
Text segmentation is a prerequisite in many real-world text-related tasks, e.g., text style transfer, and scene text removal. [Expand]
7.75
2
Thursday Poster Session
Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation
Pan Zhang, Bo Zhang, Ting Zhang, Dong Chen, Yong Wang, Fang Wen
Self-training is a competitive approach in domain adaptive segmentation, which trains the network with the pseudo labels on the target domain. [Expand]
7.75
5
Thursday Poster Session
When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework
Zhizhong Huang, Junping Zhang, Hongming Shan
To minimize the effects of age variation in face recognition, previous work either extracts identity-related discriminative features by minimizing the correlation between identity- and age-related features, called age-invariant face recognition (AIFR), or removes age variation by transforming the faces of different age groups into the same age group, called face age synthesis (FAS); however, the former lacks visual results for model interpretation while the latter suffers from artifacts compromising downstream recognition. [Expand]
7.50
2
Wednesday Poster Session
Coarse-Fine Networks for Temporal Activity Detection in Videos
Kumara Kahatapitiya, Michael S. Ryoo
In this paper, we introduce 'Coarse-Fine Networks', a two-stream architecture which benefits from different abstractions of temporal resolution to learn better video representations for long-term motion. [Expand]
7.50
1
Wednesday Poster Session
Frequency-Aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection
Jiaming Li, Hongtao Xie, Jiahong Li, Zhongyuan Wang, Yongdong Zhang
Face forgery detection is raising ever-increasing interest in computer vision since facial manipulation technologies cause serious worries. [Expand]
Tuesday Poster Session
SMD-Nets: Stereo Mixture Density Networks
Fabio Tosi, Yiyi Liao, Carolin Schmitt, Andreas Geiger
Although stereo matching accuracy has greatly improved with deep learning in the last few years, recovering sharp boundaries and high-resolution outputs efficiently remains challenging. [Expand]
Wednesday Poster Session
Learning Accurate Dense Correspondences and When To Trust Them
Prune Truong, Martin Danelljan, Luc Van Gool, Radu Timofte
Establishing dense correspondences between a pair of images is an important and general problem. [Expand]
7.50
4
Tuesday Poster Session
Hierarchical and Partially Observable Goal-Driven Policy Learning With Goals Relational Graph
Xin Ye, Yezhou Yang
We present a novel two-layer hierarchical reinforcement learning approach equipped with a Goals Relational Graph (GRG) for tackling the partially observable goal-driven task, such as goal-driven visual navigation. [Expand]
7.50
1
Thursday Poster Session
RefineMask: Towards High-Quality Instance Segmentation With Fine-Grained Features
Gang Zhang, Xin Lu, Jingru Tan, Jianmin Li, Zhaoxiang Zhang, Quanquan Li, Xiaolin Hu
The two-stage methods for instance segmentation, e.g. [Expand]
Tuesday Poster Session
FSDR: Frequency Space Domain Randomization for Domain Generalization
Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu
Domain generalization aims to learn a generalizable model from a 'known' source domain for various 'unknown' target domains. [Expand]
7.25
7
Tuesday Poster Session
Differentiable SLAM-Net: Learning Particle SLAM for Visual Navigation
Peter Karkus, Shaojun Cai, David Hsu
Simultaneous localization and mapping (SLAM) remains challenging for a number of downstream applications, such as visual robot navigation, because of rapid turns, featureless walls, and poor camera quality. [Expand]
Monday Poster Session
Rethinking the Heatmap Regression for Bottom-Up Human Pose Estimation
Zhengxiong Luo, Zhicheng Wang, Yan Huang, Liang Wang, Tieniu Tan, Erjin Zhou
Heatmap regression has become the most prevalent choice for nowadays human pose estimation methods. [Expand]
7.25
1
Thursday Poster Session
How Robust Are Randomized Smoothing Based Defenses to Data Poisoning?
Akshay Mehra, Bhavya Kailkhura, Pin-Yu Chen, Jihun Hamm
Predictions of certifiably robust classifiers remain constant in a neighborhood of a point, making them resilient to test-time attacks with a guarantee. [Expand]
Thursday Poster Session
Improving Panoptic Segmentation at All Scales
Lorenzo Porzi, Samuel Rota Bulo, Peter Kontschieder
Crop-based training strategies decouple training resolution from GPU memory consumption, allowing the use of large-capacity panoptic segmentation networks on multi-megapixel images. [Expand]
Wednesday Poster Session
AdderSR: Towards Energy Efficient Image Super-Resolution
Dehua Song, Yunhe Wang, Hanting Chen, Chang Xu, Chunjing Xu, Dacheng Tao
This paper studies the single image super-resolution problem using adder neural networks (AdderNets). [Expand]
7.25
4
Friday Poster Session
Truly Shift-Invariant Convolutional Neural Networks
Anadi Chaman, Ivan Dokmanic
Thanks to the use of convolution and pooling layers, convolutional neural networks were for a long time thought to be shift-invariant. [Expand]
7.00
3
Tuesday Poster Session
MetricOpt: Learning To Optimize Black-Box Evaluation Metrics
Chen Huang, Shuangfei Zhai, Pengsheng Guo, Josh Susskind
We study the problem of directly optimizing arbitrary non-differentiable task evaluation metrics such as misclassification rate and recall. [Expand]
Monday Poster Session
Bi-GCN: Binary Graph Convolutional Network
Junfu Wang, Yunhong Wang, Zhen Yang, Liang Yang, Yuanfang Guo
Graph Neural Networks (GNNs) have achieved tremendous success in graph representation learning. [Expand]
7.00
1
Monday Poster Session
DeepTag: An Unsupervised Deep Learning Method for Motion Tracking on Cardiac Tagging Magnetic Resonance Images
Meng Ye, Mikael Kanski, Dong Yang, Qi Chang, Zhennan Yan, Qiaoying Huang, Leon Axel, Dimitris Metaxas
Cardiac tagging magnetic resonance imaging (t-MRI) is the gold standard for regional myocardium deformation and cardiac strain estimation. [Expand]
Wednesday Poster Session
Dynamic Region-Aware Convolution
Jin Chen, Xijun Wang, Zichao Guo, Xiangyu Zhang, Jian Sun
We propose a new convolution called Dynamic Region-Aware Convolution (DRConv), which can automatically assign multiple filters to corresponding spatial regions where features have similar representation. [Expand]
6.75
6
Wednesday Poster Session
Shared Cross-Modal Trajectory Prediction for Autonomous Driving
Chiho Choi, Joon Hee Choi, Jiachen Li, Srikanth Malla
Predicting future trajectories of traffic agents in highly interactive environments is an essential and challenging problem for the safe operation of autonomous driving systems. [Expand]
6.75
6
Monday Poster Session
SuperMix: Supervising the Mixing Data Augmentation
Ali Dabouei, Sobhan Soleymani, Fariborz Taherkhani, Nasser M. Nasrabadi
This paper presents a supervised mixing augmentation method termed SuperMix, which exploits the salient regions within input images to construct mixed training samples. [Expand]
6.75
6
Thursday Poster Session
Back to Event Basics: Self-Supervised Learning of Image Reconstruction for Event Cameras via Photometric Constancy
Federico Paredes-Valles, Guido C. H. E. de Croon
Event cameras are novel vision sensors that sample, in an asynchronous fashion, brightness increments with low latency and high temporal resolution. [Expand]
6.75
2
Tuesday Poster Session
PU-GCN: Point Cloud Upsampling Using Graph Convolutional Networks
Guocheng Qian, Abdulellah Abualshour, Guohao Li, Ali Thabet, Bernard Ghanem
The effectiveness of learning-based point cloud upsampling pipelines heavily relies on the upsampling modules and feature extractors used therein. [Expand]
6.75
3
Thursday Poster Session
TrafficSim: Learning To Simulate Realistic Multi-Agent Behaviors
Simon Suo, Sebastian Regalado, Sergio Casas, Raquel Urtasun
Simulation has the potential to massively scale evaluation of self-driving systems, enabling rapid development as well as safe deployment. [Expand]
6.75
3
Wednesday Poster Session
Neural Descent for Visual 3D Human Pose and Shape
Andrei Zanfir, Eduard Gabriel Bazavan, Mihai Zanfir, William T. Freeman, Rahul Sukthankar, Cristian Sminchisescu
We present deep neural network methodology to reconstruct the 3d pose and shape of people, including hand gestures and facial expression, given an input RGB image. [Expand]
6.75
4
Thursday Poster Session
Fusing the Old with the New: Learning Relative Camera Pose with Geometry-Guided Uncertainty
Bingbing Zhuang, Manmohan Chandraker
Learning methods for relative camera pose estimation have been developed largely in isolation from classical geometric approaches. [Expand]
Monday Poster Session
The Lottery Ticket Hypothesis for Object Recognition
Sharath Girish, Shishira R Maiya, Kamal Gupta, Hao Chen, Larry S. Davis, Abhinav Shrivastava
Recognition tasks, such as object recognition and keypoint estimation, have seen widespread adoption in recent years. [Expand]
6.50
6
Monday Poster Session
FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space
Quande Liu, Cheng Chen, Jing Qin, Qi Dou, Pheng-Ann Heng
Federated learning allows distributed medical institutions to collaboratively learn a shared prediction model with privacy protection. [Expand]
6.50
4
Monday Poster Session
On Learning the Geodesic Path for Incremental Learning
Christian Simon, Piotr Koniusz, Mehrtash Harandi
Neural networks notoriously suffer from the problem of catastrophic forgetting, the phenomenon of forgetting the past knowledge when acquiring new knowledge. [Expand]
6.50
2
Monday Poster Session
TransFill: Reference-Guided Image Inpainting by Merging Multiple Color and Spatial Transformations
Yuqian Zhou, Connelly Barnes, Eli Shechtman, Sohrab Amirghodsi
Image inpainting is the task of plausibly restoring missing pixels within a hole region that is to be removed from a target image. [Expand]
Monday Poster Session
Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations
Umberto Michieli, Pietro Zanuttigh
Deep neural networks suffer from the major limitation of catastrophic forgetting old tasks when learning new ones. [Expand]
6.25
3
Monday Poster Session
Spoken Moments: Learning Joint Audio-Visual Representations From Video Descriptions
Mathew Monfort, SouYoung Jin, Alexander Liu, David Harwath, Rogerio Feris, James Glass, Aude Oliva
When people observe events, they are able to abstract key information and build concise summaries of what is happening. [Expand]
6.25
1
Thursday Poster Session
TearingNet: Point Cloud Autoencoder To Learn Topology-Friendly Representations
Jiahao Pang, Duanshun Li, Dong Tian
Topology matters. [Expand]
6.25
2
Wednesday Poster Session
Uncertainty-Guided Model Generalization to Unseen Domains
Fengchun Qiao, Xi Peng
We study a worst-case scenario in generalization: Out-of-domain generalization from a single source. [Expand]
6.25
3
Tuesday Poster Session
Fingerspelling Detection in American Sign Language
Bowen Shi, Diane Brentari, Greg Shakhnarovich, Karen Livescu
Fingerspelling, in which words are signed letter by letter, is an important component of American Sign Language. [Expand]
Tuesday Poster Session
Semi-Supervised Action Recognition With Temporal Contrastive Learning
Ankit Singh, Omprakash Chakraborty, Ashutosh Varshney, Rameswar Panda, Rogerio Feris, Kate Saenko, Abir Das
Learning to recognize actions from only a handful of labeled videos is a challenging problem due to the scarcity of tediously collected activity labels. [Expand]
6.25
1
Wednesday Poster Session
Rectification-Based Knowledge Retention for Continual Learning
Pravendra Singh, Pratik Mazumder, Piyush Rai, Vinay P. Namboodiri
Deep learning models suffer from catastrophic forgetting when trained in an incremental learning setting. [Expand]
6.25
1
Thursday Poster Session
Multiple Instance Active Learning for Object Detection
Tianning Yuan, Fang Wan, Mengying Fu, Jianzhuang Liu, Songcen Xu, Xiangyang Ji, Qixiang Ye
Despite the substantial progress of active learning for image recognition, there still lacks an instance-level active learning method specified for object detection. [Expand]
6.25
1
Tuesday Poster Session
AQD: Towards Accurate Quantized Object Detection
Peng Chen, Jing Liu, Bohan Zhuang, Mingkui Tan, Chunhua Shen
Network quantization allows inference to be conducted using low-precision arithmetic for improved inference efficiency of deep neural networks on edge devices. [Expand]
6.00
2
Monday Poster Session
Polarimetric Normal Stereo
Yoshiki Fukao, Ryo Kawahara, Shohei Nobuhara, Ko Nishino
We introduce a novel method for recovering per-pixel surface normals from a pair of polarization cameras. [Expand]
Monday Poster Session
Point2Skeleton: Learning Skeletal Representations from Point Clouds
Cheng Lin, Changjian Li, Yuan Liu, Nenglun Chen, Yi-King Choi, Wenping Wang
We introduce Point2Skeleton, an unsupervised method to learn skeletal representations from point clouds. [Expand]
6.00
1
Tuesday Poster Session
Source-Free Domain Adaptation for Semantic Segmentation
Yuang Liu, Wei Zhang, Jun Wang
Unsupervised Domain Adaptation (UDA) can tackle the challenge that convolutional neural network (CNN)-based approaches for semantic segmentation heavily rely on the pixel-level annotated data, which is labor-intensive. [Expand]
6.00
1
Monday Poster Session
Delving Into Localization Errors for Monocular 3D Object Detection
Xinzhu Ma, Yinmin Zhang, Dan Xu, Dongzhan Zhou, Shuai Yi, Haojie Li, Wanli Ouyang
Estimating 3D bounding boxes from monocular images is an essential component in autonomous driving, while accurate 3D object detection from this kind of data is very challenging. [Expand]
6.00
1
Tuesday Poster Session
Permuted AdaIN: Reducing the Bias Towards Global Statistics in Image Classification
Oren Nuriel, Sagie Benaim, Lior Wolf
Recent work has shown that convolutional neural network classifiers overly rely on texture at the expense of shape cues. [Expand]
6.00
2
Wednesday Poster Session
Multi-Attentional Deepfake Detection
Hanqing Zhao, Wenbo Zhou, Dongdong Chen, Tianyi Wei, Weiming Zhang, Nenghai Yu
Face forgery by deepfake is widely spread over the internet and has raised severe societal concerns. [Expand]
6.00
2
Monday Poster Session
Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection
Chenchen Zhu, Fangyi Chen, Uzair Ahmed, Zhiqiang Shen, Marios Savvides
Few-shot object detection is an imperative and long-lasting problem due to the inherent long-tail distribution of real-world data. [Expand]
6.00
3
Wednesday Poster Session
Deep Lesion Tracker: Monitoring Lesions in 4D Longitudinal Imaging Studies
Jinzheng Cai, Youbao Tang, Ke Yan, Adam P. Harrison, Jing Xiao, Gigin Lin, Le Lu
Monitoring treatment response in longitudinal studies plays an important role in clinical practice. [Expand]
5.75
3
Thursday Poster Session
Shot Contrastive Self-Supervised Learning for Scene Boundary Detection
Shixing Chen, Xiaohan Nie, David Fan, Dongqing Zhang, Vimal Bhat, Raffay Hamid
Scenes play a crucial role in breaking the storyline of movies and TV episodes into semantically cohesive parts. [Expand]
Wednesday Poster Session
Semantic Palette: Guiding Scene Generation With Class Proportions
Guillaume Le Moing, Tuan-Hung Vu, Himalaya Jain, Patrick Perez, Matthieu Cord
Despite the recent progress of generative adversarial networks (GANs) at synthesizing photo-realistic images, producing complex urban scenes remains a challenging problem. [Expand]
Wednesday Poster Session
Generative Interventions for Causal Learning
Chengzhi Mao, Augustine Cha, Amogh Gupta, Hao Wang, Junfeng Yang, Carl Vondrick
We introduce a framework for learning robust visual representations that generalize to new viewpoints, backgrounds, and scene contexts. [Expand]
5.75
3
Tuesday Poster Session
Visual Navigation With Spatial Attention
Bar Mayo, Tamir Hazan, Ayellet Tal
This work focuses on object goal visual navigation, aiming at finding the location of an object from a given class, where in each step the agent is provided with an egocentric RGB image of the scene. [Expand]
Friday Poster Session
Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation
Xinge Zhu, Hui Zhou, Tai Wang, Fangzhou Hong, Yuexin Ma, Wei Li, Hongsheng Li, Dahua Lin
State-of-the-art methods for large-scale driving-scene LiDAR segmentation often project the point clouds to 2D space and then process them via 2D convolution. [Expand]
5.75
5
Wednesday Poster Session
RGB-D Local Implicit Function for Depth Completion of Transparent Objects
Luyang Zhu, Arsalan Mousavian, Yu Xiang, Hammad Mazhar, Jozef van Eenbergen, Shoubhik Debnath, Dieter Fox
Majority of the perception methods in robotics require depth information provided by RGB-D cameras. [Expand]
Tuesday Poster Session
Person30K: A Dual-Meta Generalization Network for Person Re-Identification
Yan Bai, Jile Jiao, Wang Ce, Jun Liu, Yihang Lou, Xuetao Feng, Ling-Yu Duan
Recently, person re-identification (ReID) has vastly benefited from the surging waves of data-driven methods. [Expand]
Monday Poster Session
Convolutional Dynamic Alignment Networks for Interpretable Classifications
Moritz Bohle, Mario Fritz, Bernt Schiele
We introduce a new family of neural network models called Convolutional Dynamic Alignment Networks (CoDA-Nets), which are performant classifiers with a high degree of inherent interpretability. [Expand]
Wednesday Poster Session
Semantic Audio-Visual Navigation
Changan Chen, Ziad Al-Halah, Kristen Grauman
Recent work on audio-visual navigation assumes a constantly-sounding target and restricts the role of audio to signaling the target's position. [Expand]
5.50
2
Thursday Poster Session
Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments
Siyan Dong, Qingnan Fan, He Wang, Ji Shi, Li Yi, Thomas Funkhouser, Baoquan Chen, Leonidas J. Guibas
Localizing the camera in a known indoor environment is a key building block for scene mapping, robot navigation, AR, etc. [Expand]
5.50
1
Wednesday Poster Session
Optimal Gradient Checkpoint Search for Arbitrary Computation Graphs
Jianwei Feng, Dong Huang
Deep Neural Networks (DNNs) require huge GPU memory when training on modern image/video databases. [Expand]
5.50
1
Thursday Poster Session
Rank-One Prior: Toward Real-Time Scene Recovery
Jun Liu, Wen Liu, Jianing Sun, Tieyong Zeng
Scene recovery is a fundamental imaging task for several practical applications, e.g., video surveillance and autonomous vehicles, etc. [Expand]
Thursday Poster Session
2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition
Hengduo Li, Zuxuan Wu, Abhinav Shrivastava, Larry S. Davis
3D convolutional networks are prevalent for video recognition. [Expand]
5.50
2
Tuesday Poster Session
Keep Your Eyes on the Lane: Real-Time Attention-Guided Lane Detection
Lucas Tabelini, Rodrigo Berriel, Thiago M. Paixao, Claudine Badue, Alberto F. De Souza, Thiago Oliveira-Santos
Modern lane detection methods have achieved remarkable performances in complex real-world scenarios, but many have issues maintaining real-time efficiency, which is important for autonomous vehicles. [Expand]
5.50
1
Monday Poster Session
Semi-Supervised Video Deraining With Dynamical Rain Generator
Zongsheng Yue, Jianwen Xie, Qian Zhao, Deyu Meng
While deep learning (DL)-based video deraining methods have achieved significant success recently, they still have two major drawbacks. [Expand]
Monday Poster Session
PCLs: Geometry-Aware Neural Reconstruction of 3D Pose With Perspective Crop Layers
Frank Yu, Mathieu Salzmann, Pascal Fua, Helge Rhodin
Local processing is an essential feature of CNNs and other neural network architectures -- it is one of the reasons why they work so well on images where relevant information is, to a large extent, local. [Expand]
5.50
1
Wednesday Poster Session
Deep Implicit Templates for 3D Shape Representation
Zerong Zheng, Tao Yu, Qionghai Dai, Yebin Liu
Deep implicit functions (DIFs), as a kind of 3D shape representation, are becoming more and more popular in the 3D vision community due to their compactness and strong representation power. [Expand]
5.50
2
Monday Poster Session
DOTS: Decoupling Operation and Topology in Differentiable Architecture Search
Yu-Chao Gu, Li-Juan Wang, Yun Liu, Yi Yang, Yu-Huan Wu, Shao-Ping Lu, Ming-Ming Cheng
Differentiable Architecture Search (DARTS) has attracted extensive attention due to its efficiency in searching for cell structures. [Expand]
5.25
4
Thursday Poster Session
Distilling Causal Effect of Data in Class-Incremental Learning
Xinting Hu, Kaihua Tang, Chunyan Miao, Xian-Sheng Hua, Hanwang Zhang
We propose a causal framework to explain the catastrophic forgetting in Class-Incremental Learning (CIL) and then derive a novel distillation method that is orthogonal to the existing anti-forgetting techniques, such as data replay and feature/label distillation. [Expand]
5.25
5
Tuesday Poster Session
Bipartite Graph Network With Adaptive Message Passing for Unbiased Scene Graph Generation
Rongjie Li, Songyang Zhang, Bo Wan, Xuming He
Scene graph generation is an important visual understanding task with a broad range of vision applications. [Expand]
Wednesday Poster Session
ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring
Dongxu Li, Chenchen Xu, Kaihao Zhang, Xin Yu, Yiran Zhong, Wenqi Ren, Hanna Suominen, Hongdong Li
Video deblurring models exploit consecutive frames to remove blurs from camera shakes and object motions. [Expand]
5.25
5
Wednesday Poster Session
Simultaneously Localize, Segment and Rank the Camouflaged Objects
Yunqiu Lv, Jing Zhang, Yuchao Dai, Aixuan Li, Bowen Liu, Nick Barnes, Deng-Ping Fan
Camouflage is a key defence mechanism across species that is critical to survival. [Expand]
5.25
5
Thursday Poster Session
Learning Camera Localization via Dense Scene Matching
Shitao Tang, Chengzhou Tang, Rui Huang, Siyu Zhu, Ping Tan
Camera localization aims to estimate 6 DoF camera poses from RGB images. [Expand]
Monday Poster Session
Diverse Semantic Image Synthesis via Probability Distribution Modeling
Zhentao Tan, Menglei Chai, Dongdong Chen, Jing Liao, Qi Chu, Bin Liu, Gang Hua, Nenghai Yu
Semantic image synthesis, translating semantic layouts to photo-realistic images, is a one-to-many mapping problem. [Expand]
Wednesday Poster Session
Divergence Optimization for Noisy Universal Domain Adaptation
Qing Yu, Atsushi Hashimoto, Yoshitaka Ushiku
Universal domain adaptation (UniDA) has been proposed to transfer knowledge learned from a label-rich source domain to a label-scarce target domain without any constraints on the label sets. [Expand]
Monday Poster Session
Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition
Shancheng Fang, Hongtao Xie, Yuxin Wang, Zhendong Mao, Yongdong Zhang
Linguistic knowledge is of great benefit to scene text recognition. [Expand]
Wednesday Poster Session
Representative Batch Normalization With Feature Calibration
Shang-Hua Gao, Qi Han, Duo Li, Ming-Ming Cheng, Pai Peng
Batch Normalization (BatchNorm) has become the default component in modern neural networks to stabilize training. [Expand]
5.00
5
Wednesday Poster Session
FrameExit: Conditional Early Exiting for Efficient Video Recognition
Amir Ghodrati, Babak Ehteshami Bejnordi, Amirhossein Habibian
In this paper, we propose a conditional early exiting framework for efficient video recognition. [Expand]
5.00
1
Friday Poster Session
Achieving Robustness in Classification Using Optimal Transport With Hinge Regularization
Mathieu Serrurier, Franck Mamalet, Alberto Gonzalez-Sanz, Thibaut Boissin, Jean-Michel Loubes, Eustasio del Barrio
Adversarial examples have pointed out Deep Neural Network's vulnerability to small local noise. [Expand]
5.00
3
Monday Poster Session
Understanding the Behaviour of Contrastive Loss
Feng Wang, Huaping Liu
Unsupervised contrastive learning has achieved outstanding success, while the mechanism of contrastive loss has been less studied. [Expand]
5.00
5
Monday Poster Session
TAP: Text-Aware Pre-Training for Text-VQA and Text-Caption
Zhengyuan Yang, Yijuan Lu, Jianfeng Wang, Xi Yin, Dinei Florencio, Lijuan Wang, Cha Zhang, Lei Zhang, Jiebo Luo
In this paper, we propose Text-Aware Pre-training (TAP) for Text-VQA and Text-Caption tasks. [Expand]
5.00
5
Wednesday Poster Session
Non-Salient Region Object Mining for Weakly Supervised Semantic Segmentation
Yazhou Yao, Tao Chen, Guo-Sen Xie, Chuanyi Zhang, Fumin Shen, Qi Wu, Zhenmin Tang, Jian Zhang
Semantic segmentation aims to classify every pixel of an input image. [Expand]
5.00
2
Monday Poster Session
Learning to Generalize Unseen Domains via Memory-based Multi-Source Meta-Learning for Person Re-Identification
Yuyang Zhao, Zhun Zhong, Fengxiang Yang, Zhiming Luo, Yaojin Lin, Shaozi Li, Nicu Sebe
Recent advances in person re-identification (ReID) obtain impressive accuracy in the supervised and unsupervised learning settings. [Expand]
5.00
5
Tuesday Poster Session
Learning the Best Pooling Strategy for Visual Semantic Embedding
Jiacheng Chen, Hexiang Hu, Hao Wu, Yuning Jiang, Changhu Wang
Visual Semantic Embedding (VSE) is a dominant approach for vision-language retrieval, which aims at learning a deep embedding space such that visual data are embedded close to their semantic text labels or descriptions. [Expand]
4.75
3
Friday Poster Session
Context-Aware Layout to Image Generation With Enhanced Object Appearance
Sen He, Wentong Liao, Michael Ying Yang, Yongxin Yang, Yi-Zhe Song, Bodo Rosenhahn, Tao Xiang
A layout to image (L2I) generation model aims to generate a complicated image containing multiple objects (things) against natural background (stuff), conditioned on a given layout. [Expand]
4.75
1
Thursday Poster Session
Neural Response Interpretation Through the Lens of Critical Pathways
Ashkan Khakzar, Soroosh Baselizadeh, Saurabh Khanduja, Christian Rupprecht, Seong Tae Kim, Nassir Navab
Is critical input information encoded in specific sparse pathways within the neural network? In this work, we discuss the problem of identifying these critical pathways and subsequently leverage them for interpreting the network's response to an input. [Expand]
4.75
2
Thursday Poster Session
3D-to-2D Distillation for Indoor Scene Parsing
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu
Indoor scene semantic parsing from RGB images is very challenging due to occlusions, object distortion, and viewpoint variations. [Expand]
4.75
2
Tuesday Poster Session
Combining Semantic Guidance and Deep Reinforcement Learning for Generating Human Level Paintings
Jaskirat Singh, Liang Zheng
Generation of stroke-based non-photorealistic imagery is an important problem in the computer vision community. [Expand]
4.75
1
Friday Poster Session
Joint Learning of 3D Shape Retrieval and Deformation
Mikaela Angelina Uy, Vladimir G. Kim, Minhyuk Sung, Noam Aigerman, Siddhartha Chaudhuri, Leonidas J. Guibas
We propose a novel technique for producing high-quality 3D models that match a given target object image or scan. [Expand]
4.75
2
Thursday Poster Session
Deep Gradient Projection Networks for Pan-sharpening
Shuang Xu, Jiangshe Zhang, Zixiang Zhao, Kai Sun, Junmin Liu, Chunxia Zhang
Pan-sharpening is an important technique for remote sensing imaging systems to obtain high resolution multispectral images. [Expand]
Monday Poster Session
Learning Semantic-Aware Dynamics for Video Prediction
Xinzhu Bei, Yanchao Yang, Stefano Soatto
We propose an architecture and training scheme to predict video frames by explicitly modeling dis-occlusions and capturing the evolution of semantically consistent regions in the video. [Expand]
Monday Poster Session
Mixed-Privacy Forgetting in Deep Networks
Aditya Golatkar, Alessandro Achille, Avinash Ravichandran, Marzia Polito, Stefano Soatto
We show that the influence of a subset of the training samples can be removed -- or "forgotten" -- from the weights of a network trained on large-scale image classification tasks, and we provide strong computable bounds on the amount of remaining information after forgetting. [Expand]
4.50
1
Monday Poster Session
Track, Check, Repeat: An EM Approach to Unsupervised Tracking
Adam W. Harley, Yiming Zuo, Jing Wen, Ayush Mangal, Shubhankar Potdar, Ritwick Chaudhry, Katerina Fragkiadaki
We propose an unsupervised method for detecting and tracking moving objects in 3D, in unlabelled RGB-D videos. [Expand]
Friday Poster Session
Deep Gaussian Scale Mixture Prior for Spectral Compressive Imaging
Tao Huang, Weisheng Dong, Xin Yuan, Jinjian Wu, Guangming Shi
In coded aperture snapshot spectral imaging (CASSI) system, the real-world hyperspectral image (HSI) can be reconstructed from the captured compressive image in a snapshot. [Expand]
4.50
3
Friday Poster Session
SetVAE: Learning Hierarchical Composition for Generative Modeling of Set-Structured Data
Jinwoo Kim, Jaehoon Yoo, Juho Lee, Seunghoon Hong
Generative modeling of set-structured data, such as point clouds, requires reasoning over local and global structures at various scales. [Expand]
Thursday Poster Session
GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection
Abhinav Kumar, Garrick Brazil, Xiaoming Liu
Modern 3D object detectors have immensely benefited from the end-to-end learning idea. [Expand]
Wednesday Poster Session
BRepNet: A Topological Message Passing System for Solid Models
Joseph G. Lambourne, Karl D.D. Willis, Pradeep Kumar Jayaraman, Aditya Sanghi, Peter Meltzer, Hooman Shayani
Boundary representation (B-rep) models are the standard way 3D shapes are described in Computer-Aided Design (CAD) applications. [Expand]
4.50
3
Thursday Poster Session
Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion
Shi Qiu, Saeed Anwar, Nick Barnes
Given the prominence of current 3D sensors, a fine-grained analysis on the basic point cloud data is worthy of further investigation. [Expand]
Monday Poster Session
TesseTrack: End-to-End Learnable Multi-Person Articulated 3D Pose Tracking
N Dinesh Reddy, Laurent Guigues, Leonid Pishchulin, Jayan Eledath, Srinivasa G. Narasimhan
We consider the task of 3D pose estimation and tracking of multiple people seen in an arbitrary number of camera feeds. [Expand]
Thursday Poster Session
DISCO: Dynamic and Invariant Sensitive Channel Obfuscation for Deep Neural Networks
Abhishek Singh, Ayush Chopra, Ethan Garza, Emily Zhang, Praneeth Vepakomma, Vivek Sharma, Ramesh Raskar
Recent deep learning models have shown remarkable performance in image classification. [Expand]
Thursday Poster Session
Practical Wide-Angle Portraits Correction With Deep Structured Models
Jing Tan, Shan Zhao, Pengfei Xiong, Jiangyu Liu, Haoqiang Fan, Shuaicheng Liu
Wide-angle portraits often enjoy expanded views. [Expand]
Tuesday Poster Session
Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark
Longyin Wen, Dawei Du, Pengfei Zhu, Qinghua Hu, Qilong Wang, Liefeng Bo, Siwei Lyu
To promote the developments of object detection, tracking and counting algorithms in drone-captured videos, we construct a benchmark with a new drone-captured large-scale dataset, named as DroneCrowd, formed by 112 video clips with 33,600 HD frames in various scenarios. [Expand]
Wednesday Poster Session
Regularizing Neural Networks via Adversarial Model Perturbation
Yaowei Zheng, Richong Zhang, Yongyi Mao
Effective regularization techniques are highly desired in deep learning for alleviating overfitting and improving generalization. [Expand]
4.50
2
Wednesday Poster Session
Removing Diffraction Image Artifacts in Under-Display Camera via Dynamic Skip Connection Network
Ruicheng Feng, Chongyi Li, Huaijin Chen, Shuai Li, Chen Change Loy, Jinwei Gu
Recent development of Under-Display Camera (UDC) systems provides a true bezel-less and notch-free viewing experience on smartphones (and TV, laptops, tablets), while allowing images to be captured from the selfie camera embedded underneath. [Expand]
Monday Poster Session
MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition
Shuang Li, Kaixiong Gong, Chi Harold Liu, Yulin Wang, Feng Qiao, Xinjing Cheng
Real-world training data usually exhibits long-tailed distribution, where several majority classes have a significantly larger number of samples than the remaining minority classes. [Expand]
Tuesday Poster Session
Simulating Unknown Target Models for Query-Efficient Black-Box Attacks
Chen Ma, Li Chen, Jun-Hai Yong
Many adversarial attacks have been proposed to investigate the security issues of deep neural networks. [Expand]
Thursday Poster Session
Beyond Image to Depth: Improving Depth Prediction Using Echoes
Kranti Kumar Parida, Siddharth Srivastava, Gaurav Sharma
We address the problem of estimating depth with multi modal audio visual data. [Expand]
4.25
1
Wednesday Poster Session
On the Difficulty of Membership Inference Attacks
Shahbaz Rezaei, Xin Liu
Recent studies propose membership inference (MI) attacks on deep models, where the goal is to infer if a sample has been used in the training process. [Expand]
Wednesday Poster Session
Automatic Vertebra Localization and Identification in CT by Spine Rectification and Anatomically-Constrained Optimization
Fakai Wang, Kang Zheng, Le Lu, Jing Xiao, Min Wu, Shun Miao
Accurate vertebra localization and identification are required in many clinical applications of spine disorder diagnosis and surgery planning. [Expand]
Tuesday Poster Session
Single-View 3D Object Reconstruction From Shape Priors in Memory
Shuo Yang, Min Xu, Haozhe Xie, Stuart Perry, Jiahao Xia
Existing methods for single-view 3D object reconstruction directly learn to transform image features into 3D representations. [Expand]
Tuesday Poster Session
Towards Improving the Consistency, Efficiency, and Flexibility of Differentiable Neural Architecture Search
Yibo Yang, Shan You, Hongyang Li, Fei Wang, Chen Qian, Zhouchen Lin
Most differentiable neural architecture search methods construct a super-net for search and derive a target-net as its sub-graph for evaluation. [Expand]
4.25
4
Tuesday Poster Session
Open-Vocabulary Object Detection Using Captions
Alireza Zareian, Kevin Dela Rosa, Derek Hao Hu, Shih-Fu Chang
Despite the remarkable accuracy of deep neural networks in object detection, they are costly to train and scale due to supervision requirements. [Expand]
4.25
1
Thursday Poster Session
CoLA: Weakly-Supervised Temporal Action Localization With Snippet Contrastive Learning
Can Zhang, Meng Cao, Dongming Yang, Jie Chen, Yuexian Zou
Weakly-supervised temporal action localization (WS-TAL) aims to localize actions in untrimmed videos with only video-level labels. [Expand]
4.25
2
Friday Poster Session
Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework
Qiang Zhou, Chaohui Yu, Zhibin Wang, Qi Qian, Hao Li
Supervised learning based object detection frameworks demand plenty of laborious manual annotations, which may not be practical in real applications. [Expand]
4.25
1
Tuesday Poster Session
Face Forgery Detection by 3D Decomposition
Xiangyu Zhu, Hao Wang, Hongyan Fei, Zhen Lei, Stan Z. Li
Detecting digital face manipulation has attracted extensive attention due to the potential harms of fake media to the public. [Expand]
4.25
1
Tuesday Poster Session
How Does Topology Influence Gradient Propagation and Model Performance of Deep Networks With DenseNet-Type Skip Connections?
Kartikeya Bhardwaj, Guihong Li, Radu Marculescu
DenseNets introduce concatenation-type skip connections that achieve state-of-the-art accuracy in several computer vision tasks. [Expand]
Thursday Poster Session
Exponential Moving Average Normalization for Self-Supervised and Semi-Supervised Learning
Zhaowei Cai, Avinash Ravichandran, Subhransu Maji, Charless Fowlkes, Zhuowen Tu, Stefano Soatto
We present a plug-in replacement for batch normalization (BN) called exponential moving average normalization (EMAN), which improves the performance of existing student-teacher based self- and semi-supervised learning techniques. [Expand]
4.00
2
Monday Poster Session
SLADE: A Self-Training Framework for Distance Metric Learning
Jiali Duan, Yen-Liang Lin, Son Tran, Larry S. Davis, C.-C. Jay Kuo
Most existing distance metric learning approaches use fully labeled data to learn the sample similarities in an embedding space. [Expand]
Wednesday Poster Session
Generalized Few-Shot Object Detection Without Forgetting
Zhibo Fan, Yuchen Ma, Zeming Li, Jian Sun
Learning object detection from few examples recently emerged to deal with data-limited situations. [Expand]
Tuesday Poster Session
Regressive Domain Adaptation for Unsupervised Keypoint Detection
Junguang Jiang, Yifei Ji, Ximei Wang, Yufeng Liu, Jianmin Wang, Mingsheng Long
Domain adaptation (DA) aims at transferring knowledge from a labeled source domain to an unlabeled target domain. [Expand]
Tuesday Poster Session
Taskology: Utilizing Task Relations at Scale
Yao Lu, Soren Pirk, Jan Dlabal, Anthony Brohan, Ankita Pasad, Zhao Chen, Vincent Casser, Anelia Angelova, Ariel Gordon
Many computer vision tasks address the problem of scene understanding and are naturally interrelated e.g. [Expand]
4.00
3
Wednesday Poster Session
Temporal Context Aggregation Network for Temporal Action Proposal Refinement
Zhiwu Qing, Haisheng Su, Weihao Gan, Dongliang Wang, Wei Wu, Xiang Wang, Yu Qiao, Junjie Yan, Changxin Gao, Nong Sang
Temporal action proposal generation aims to estimate temporal intervals of actions in untrimmed videos, which is a challenging yet important task in the video understanding field. [Expand]
4.00
4
Monday Poster Session
Affective Processes: Stochastic Modelling of Temporal Context for Emotion and Facial Expression Recognition
Enrique Sanchez, Mani Kumar Tellamekala, Michel Valstar, Georgios Tzimiropoulos
Temporal context is key to the recognition of expressions of emotion. [Expand]
Wednesday Poster Session
Look Before You Speak: Visually Contextualized Utterances
Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid
While most conversational AI systems focus on textual dialogue only, conditioning utterances on visual context (when it's available) can lead to more realistic conversations. [Expand]
4.00
1
Friday Poster Session
Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation
Yapeng Tian, Di Hu, Chenliang Xu
There are rich synchronized audio and visual events in our daily life. [Expand]
4.00
4
Monday Poster Session
CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning
Chen Wei, Kihyuk Sohn, Clayton Mellina, Alan Yuille, Fan Yang
Semi-supervised learning on class-imbalanced data, although a realistic problem, has been understudied. [Expand]
4.00
3
Wednesday Poster Session
Deep Optimized Priors for 3D Shape Modeling and Reconstruction
Mingyue Yang, Yuxin Wen, Weikai Chen, Yongwei Chen, Kui Jia
Many learning-based approaches have difficulty scaling to unseen data, as the generality of their learned priors is limited to the scale and variations of the training samples. [Expand]
4.00
3
Tuesday Poster Session
Distribution Alignment: A Unified Framework for Long-Tail Visual Recognition
Songyang Zhang, Zeming Li, Shipeng Yan, Xuming He, Jian Sun
Despite the success of the deep neural networks, it remains challenging to effectively build a system for long-tail visual recognition tasks. [Expand]
Monday Poster Session
Few-Shot 3D Point Cloud Semantic Segmentation
Na Zhao, Tat-Seng Chua, Gim Hee Lee
Many existing approaches for 3D point cloud semantic segmentation are fully supervised. [Expand]
4.00
2
Wednesday Poster Session
A Second-Order Approach to Learning With Instance-Dependent Label Noise
Zhaowei Zhu, Tongliang Liu, Yang Liu
The presence of label noise often misleads the training of deep neural networks. [Expand]
4.00
4
Wednesday Poster Session
Meta Batch-Instance Normalization for Generalizable Person Re-Identification
Seokeon Choi, Taekyung Kim, Minki Jeong, Hyoungseob Park, Changick Kim
Although supervised person re-identification (Re-ID) methods have shown impressive performance, they suffer from a poor generalization capability on unseen domains. [Expand]
3.75
3
Tuesday Poster Session
A Peek Into the Reasoning of Neural Networks: Interpreting With Structural Visual Concepts
Yunhao Ge, Yao Xiao, Zhi Xu, Meng Zheng, Srikrishna Karanam, Terrence Chen, Laurent Itti, Ziyan Wu
Despite substantial progress in applying neural networks (NN) to a wide variety of areas, they still largely suffer from a lack of transparency and interpretability. [Expand]
Monday Poster Session
StyleMix: Separating Content and Style for Enhanced Data Augmentation
Minui Hong, Jinwoo Choi, Gunhee Kim
In spite of the great success of deep neural networks for many challenging classification tasks, the learned networks are vulnerable to overfitting and adversarial attacks. [Expand]
Thursday Poster Session
General Multi-Label Image Classification With Transformers
Jack Lanchantin, Tianlu Wang, Vicente Ordonez, Yanjun Qi
Multi-label image classification is the task of predicting a set of labels corresponding to objects, attributes or other entities present in an image. [Expand]
3.75
1
Friday Poster Session
Model-Contrastive Federated Learning
Qinbin Li, Bingsheng He, Dawn Song
Federated learning enables multiple parties to collaboratively train a machine learning model without communicating their local data. [Expand]
3.75
1
Wednesday Poster Session
UAV-Human: A Large Benchmark for Human Behavior Understanding With Unmanned Aerial Vehicles
Tianjiao Li, Jun Liu, Wei Zhang, Yun Ni, Wenqian Wang, Zhiheng Li
Human behavior understanding with unmanned aerial vehicles (UAVs) is of great significance for a wide range of applications, which simultaneously brings an urgent demand for large, challenging, and comprehensive benchmarks for the development and evaluation of UAV-based models. [Expand]
Friday Poster Session
Learning Asynchronous and Sparse Human-Object Interaction in Videos
Romero Morais, Vuong Le, Svetha Venkatesh, Truyen Tran
Human activities can be learned from video. [Expand]
Friday Poster Session
Neural Prototype Trees for Interpretable Fine-Grained Image Recognition
Meike Nauta, Ron van Bree, Christin Seifert
Prototype-based methods use interpretable representations to address the black-box nature of deep learning models, in contrast to post-hoc explanation methods that only approximate such models. [Expand]
3.75
1
Thursday Poster Session
PGT: A Progressive Method for Training Models on Long Videos
Bo Pang, Gao Peng, Yizhuo Li, Cewu Lu
Convolutional video models have an order of magnitude larger computational complexity than their counterpart image-level models. [Expand]
Thursday Poster Session
3D Object Detection With Pointformer
Xuran Pan, Zhuofan Xia, Shiji Song, Li Erran Li, Gao Huang
Feature learning for 3D object detection from point clouds is very challenging due to the irregularity of 3D point cloud data. [Expand]
3.75
3
Wednesday Poster Session
DeFlow: Learning Complex Image Degradations From Unpaired Data With Conditional Flows
Valentin Wolf, Andreas Lugmayr, Martin Danelljan, Luc Van Gool, Radu Timofte
The difficulty of obtaining paired data remains a major bottleneck for learning image restoration and enhancement models for real-world applications. [Expand]
3.75
2
Monday Poster Session
A Decomposition Model for Stereo Matching
Chengtang Yao, Yunde Jia, Huijun Di, Pengxiang Li, Yuwei Wu
In this paper, we present a decomposition model for stereo matching to solve the problem of excessive growth in computational cost (time and memory cost) as the resolution increases. [Expand]
Tuesday Poster Session
Transitional Adaptation of Pretrained Models for Visual Storytelling
Youngjae Yu, Jiwan Chung, Heeseung Yun, Jongseok Kim, Gunhee Kim
Previous models for vision-to-language generation tasks usually pretrain a visual encoder and a language generator in the respective domains and jointly finetune them with the target task. [Expand]
Thursday Poster Session
Self-Supervised Simultaneous Multi-Step Prediction of Road Dynamics and Cost Map
Elmira Amirloo, Mohsen Rohani, Ershad Banijamali, Jun Luo, Pascal Poupart
In this paper we propose a system consisting of a modular network and a trajectory planner. [Expand]
Wednesday Poster Session
A Closer Look at Fourier Spectrum Discrepancies for CNN-Generated Images Detection
Keshigeyan Chandrasegaran, Ngoc-Trung Tran, Ngai-Man Cheung
CNN-based generative modelling has evolved to produce synthetic images indistinguishable from real images in the RGB pixel space. [Expand]
Wednesday Poster Session
Transformer Tracking
Xin Chen, Bin Yan, Jiawen Zhu, Dong Wang, Xiaoyun Yang, Huchuan Lu
Correlation plays a critical role in the tracking field, especially in recent popular Siamese-based trackers. [Expand]
3.50
3
Wednesday Poster Session
Recurrent Multi-View Alignment Network for Unsupervised Surface Registration
Wanquan Feng, Juyong Zhang, Hongrui Cai, Haofei Xu, Junhui Hou, Hujun Bao
Learning non-rigid registration in an end-to-end manner is challenging due to the inherent high degrees of freedom and the lack of labeled training data. [Expand]
Wednesday Poster Session
Domain Adaptation With Auxiliary Target Domain-Oriented Classifier
Jian Liang, Dapeng Hu, Jiashi Feng
Domain adaptation (DA) aims to transfer knowledge from a label-rich but heterogeneous domain to a label-scarce domain, which alleviates the labeling efforts and attracts considerable attention. [Expand]
Friday Poster Session
Discovering Hidden Physics Behind Transport Dynamics
Peirong Liu, Lin Tian, Yubo Zhang, Stephen Aylward, Yueh Lee, Marc Niethammer
Transport processes are ubiquitous. [Expand]
Wednesday Poster Session
Multi-Person Implicit Reconstruction From a Single Image
Armin Mustafa, Akin Caliskan, Lourdes Agapito, Adrian Hilton
We present a new end-to-end learning framework to obtain detailed and spatially coherent reconstructions of multiple people from a single image. [Expand]
Thursday Poster Session
Offboard 3D Object Detection From Point Cloud Sequences
Charles R. Qi, Yin Zhou, Mahyar Najibi, Pei Sun, Khoa Vo, Boyang Deng, Dragomir Anguelov
While current 3D object recognition research mostly focuses on the real-time, onboard scenario, there are many offboard use cases of perception that are largely under-explored, such as using machines to automatically generate high-quality 3D labels. [Expand]
3.50
1
Tuesday Poster Session
Backdoor Attacks Against Deep Learning Systems in the Physical World
Emily Wenger, Josephine Passananti, Arjun Nitin Bhagoji, Yuanshun Yao, Haitao Zheng, Ben Y. Zhao
Backdoor attacks embed hidden malicious behaviors into deep learning models, which only activate and cause misclassifications on model inputs containing a specific "trigger." Existing works on backdoor attacks and defenses, however, mostly focus on digital attacks that apply digitally generated patterns as triggers. [Expand]
Tuesday Poster Session
Neural Splines: Fitting 3D Surfaces With Infinitely-Wide Neural Networks
Francis Williams, Matthew Trager, Joan Bruna, Denis Zorin
We present Neural Splines, a technique for 3D surface reconstruction that is based on random feature kernels arising from infinitely-wide shallow ReLU networks. [Expand]
3.50
1
Wednesday Poster Session
Track To Detect and Segment: An Online Multi-Object Tracker
Jialian Wu, Jiale Cao, Liangchen Song, Yu Wang, Ming Yang, Junsong Yuan
Most online multi-object trackers perform object detection stand-alone in a neural net without any input from tracking. [Expand]
3.50
1
Thursday Poster Session
Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing
Tianfei Zhou, Wenguan Wang, Si Liu, Yi Yang, Luc Van Gool
To address the challenging task of instance-aware human part parsing, a new bottom-up regime is proposed to learn category-level human semantic segmentation as well as multi-person pose estimation in a joint and end-to-end manner. [Expand]
3.50
1
Monday Poster Session
One Shot Face Swapping on Megapixels
Yuhao Zhu, Qi Li, Jian Wang, Cheng-Zhong Xu, Zhenan Sun
Face swapping has both positive applications such as entertainment, human-computer interaction, etc., and negative applications such as DeepFake threats to politics, economics, etc. [Expand]
Tuesday Poster Session
PointDSC: Robust Point Cloud Registration Using Deep Spatial Consistency
Xuyang Bai, Zixin Luo, Lei Zhou, Hongkai Chen, Lei Li, Zeyu Hu, Hongbo Fu, Chiew-Lan Tai
Removing outlier correspondences is one of the critical steps for successful feature-based point cloud registration. [Expand]
Friday Poster Session
RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening
Sungha Choi, Sanghun Jung, Huiwon Yun, Joanne T. Kim, Seungryong Kim, Jaegul Choo
Enhancing the generalization capability of deep neural networks to unseen domains is crucial for safety-critical applications in the real world such as autonomous driving. [Expand]
3.25
1
Thursday Poster Session
Anomaly Detection in Video via Self-Supervised and Multi-Task Learning
Mariana-Iuliana Georgescu, Antonio Barbalau, Radu Tudor Ionescu, Fahad Shahbaz Khan, Marius Popescu, Mubarak Shah
Anomaly detection in video is a challenging computer vision problem. [Expand]
3.25
3
Thursday Poster Session
Graph Attention Tracking
Dongyan Guo, Yanyan Shao, Ying Cui, Zhenhua Wang, Liyan Zhang, Chunhua Shen
Siamese network based trackers formulate the visual tracking task as a similarity matching problem. [Expand]
3.25
3
Wednesday Poster Session
Group Whitening: Balancing Learning Efficiency and Representational Capacity
Lei Huang, Yi Zhou, Li Liu, Fan Zhu, Ling Shao
Batch normalization (BN) is an important technique commonly incorporated into deep learning models to perform standardization within mini-batches. [Expand]
Wednesday Poster Session
Monocular Depth Estimation via Listwise Ranking Using the Plackett-Luce Model
Julian Lienen, Eyke Hullermeier, Ralph Ewerth, Nils Nommensen
In many real-world applications, the relative depth of objects in an image is crucial for scene understanding. [Expand]
3.25
1
Thursday Poster Session
Adaptive Aggregation Networks for Class-Incremental Learning
Yaoyao Liu, Bernt Schiele, Qianru Sun
Class-Incremental Learning (CIL) aims to learn a classification model with the number of classes increasing phase-by-phase. [Expand]
Monday Poster Session
DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network
Rui Liu, Yixiao Ge, Ching Lam Choi, Xiaogang Wang, Hongsheng Li
Conditional generative adversarial networks (cGANs) aim to synthesize diverse images given the input conditions and latent codes, but unfortunately, they usually suffer from the issue of mode collapse. [Expand]
3.25
2
Friday Poster Session
Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning
Mamshad Nayeem Rizve, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah
In many real-world problems, collecting a large number of labeled samples is infeasible. [Expand]
3.25
1
Wednesday Poster Session
StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval
Aneeshan Sain, Ayan Kumar Bhunia, Yongxin Yang, Tao Xiang, Yi-Zhe Song
Sketch-based image retrieval (SBIR) is a cross-modal matching problem which is typically solved by learning a joint embedding space where the semantic content shared between photo and sketch modalities is preserved. [Expand]
3.25
3
Wednesday Poster Session
Verifiability and Predictability: Interpreting Utilities of Network Architectures for Point Cloud Processing
Wen Shen, Zhihua Wei, Shikun Huang, Binbin Zhang, Panyue Chen, Ping Zhao, Quanshi Zhang
In this paper, we diagnose deep neural networks for 3D point cloud processing to explore utilities of different network architectures. [Expand]
Wednesday Poster Session
Learning Parallel Dense Correspondence From Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction
Jiapeng Tang, Dan Xu, Kui Jia, Lei Zhang
This paper focuses on the task of 4D shape reconstruction from a sequence of point clouds. [Expand]
3.25
1
Tuesday Poster Session
SOE-Net: A Self-Attention and Orientation Encoding Network for Point Cloud Based Place Recognition
Yan Xia, Yusheng Xu, Shuang Li, Rui Wang, Juan Du, Daniel Cremers, Uwe Stilla
We tackle the problem of place recognition from point cloud data and introduce a self-attention and orientation encoding network (SOE-Net) that fully explores the relationship between points and incorporates long-range context into point-wise local descriptors. [Expand]
3.25
3
Thursday Poster Session
Cross-Iteration Batch Normalization
Zhuliang Yao, Yue Cao, Shuxin Zheng, Gao Huang, Stephen Lin
A well-known issue of Batch Normalization is its significantly reduced effectiveness in the case of small mini-batch sizes. [Expand]
Thursday Poster Session
Prototype Completion With Primitive Knowledge for Few-Shot Learning
Baoquan Zhang, Xutao Li, Yunming Ye, Zhichao Huang, Lisai Zhang
Few-shot learning is a challenging task, which aims to learn a classifier for novel classes with few examples. [Expand]
3.25
1
Tuesday Poster Session
OpenMix: Reviving Known Knowledge for Discovering Novel Visual Categories in an Open World
Zhun Zhong, Linchao Zhu, Zhiming Luo, Shaozi Li, Yi Yang, Nicu Sebe
In this paper, we tackle the problem of discovering new classes in unlabeled visual data given labeled data from disjoint classes. [Expand]
3.25
3
Wednesday Poster Session
Binary Graph Neural Networks
Mehdi Bahri, Gaetan Bahl, Stefanos Zafeiriou
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data. [Expand]
Wednesday Poster Session
Memory-Efficient Network for Large-Scale Video Compressive Sensing
Ziheng Cheng, Bo Chen, Guanliang Liu, Hao Zhang, Ruiying Lu, Zhengjue Wang, Xin Yuan
Video snapshot compressive imaging (SCI) captures a sequence of video frames in a single shot using a 2D detector. [Expand]
3.00
3
Friday Poster Session
NBNet: Noise Basis Learning for Image Denoising With Subspace Projection
Shen Cheng, Yuzhi Wang, Haibin Huang, Donghao Liu, Haoqiang Fan, Shuaicheng Liu
In this paper, we introduce NBNet, a novel framework for image denoising. [Expand]
3.00
1
Tuesday Poster Session
FS-Net: Fast Shape-Based Network for Category-Level 6D Object Pose Estimation With Decoupled Rotation Mechanism
Wei Chen, Xi Jia, Hyung Jin Chang, Jinming Duan, Linlin Shen, Ales Leonardis
In this paper, we focus on category-level 6D pose and size estimation from a monocular RGB-D image. [Expand]
3.00
1
Monday Poster Session
Model-Based 3D Hand Reconstruction via Self-Supervised Learning
Yujin Chen, Zhigang Tu, Di Kang, Linchao Bao, Ying Zhang, Xuefei Zhe, Ruizhi Chen, Junsong Yuan
Reconstructing a 3D hand from a single-view RGB image is challenging due to various hand configurations and depth ambiguity. [Expand]
3.00
1
Wednesday Poster Session
Learning a Proposal Classifier for Multiple Object Tracking
Peng Dai, Renliang Weng, Wongun Choi, Changshui Zhang, Zhangping He, Wei Ding
The recent trend in multiple object tracking (MOT) is heading towards leveraging deep learning to boost the tracking performance. [Expand]
3.00
3
Monday Poster Session
Global2Local: Efficient Structure Search for Video Action Segmentation
Shang-Hua Gao, Qi Han, Zhong-Yu Li, Pai Peng, Liang Wang, Ming-Ming Cheng
Temporal receptive fields of models play an important role in action segmentation. [Expand]
3.00
3
Friday Poster Session
ContactOpt: Optimizing Contact To Improve Grasps
Patrick Grady, Chengcheng Tang, Christopher D. Twigg, Minh Vo, Samarth Brahmbhatt, Charles C. Kemp
Physical contact between hands and objects plays a critical role in human grasps. [Expand]
3.00
3
Monday Poster Session
Disentangled Cycle Consistency for Highly-Realistic Virtual Try-On
Chongjian Ge, Yibing Song, Yuying Ge, Han Yang, Wei Liu, Ping Luo
Image virtual try-on replaces the clothes on a person image with a desired in-shop clothes image. [Expand]
Friday Poster Session
Anti-Aliasing Semantic Reconstruction for Few-Shot Semantic Segmentation
Binghao Liu, Yao Ding, Jianbin Jiao, Xiangyang Ji, Qixiang Ye
Encouraging progress in few-shot semantic segmentation has been made by leveraging features learned upon base classes with sufficient training data to represent novel classes with few-shot examples. [Expand]
3.00
3
Wednesday Poster Session
Spatiotemporal Registration for Event-Based Visual Odometry
Daqi Liu, Alvaro Parra, Tat-Jun Chin
A useful application of event sensing is visual odometry, especially in settings that require high-temporal resolution. [Expand]
Tuesday Poster Session
Dual-Stream Multiple Instance Learning Network for Whole Slide Image Classification With Self-Supervised Contrastive Learning
Bin Li, Yin Li, Kevin W. Eliceiri
We address the challenging problem of whole slide image (WSI) classification. [Expand]
3.00
3
Thursday Poster Session
Searching for Fast Model Families on Datacenter Accelerators
Sheng Li, Mingxing Tan, Ruoming Pang, Andrew Li, Liqun Cheng, Quoc V. Le, Norman P. Jouppi
Neural Architecture Search (NAS), together with model scaling, has shown remarkable progress in designing high accuracy and fast convolutional architecture families. [Expand]
3.00
3
Wednesday Poster Session
Uncertainty-Aware Joint Salient Object and Camouflaged Object Detection
Aixuan Li, Jing Zhang, Yunqiu Lv, Bowen Liu, Tong Zhang, Yuchao Dai
Visual salient object detection (SOD) aims at finding the salient object(s) that attract human attention, while camouflaged object detection (COD), on the contrary, intends to discover the camouflaged object(s) that are hidden in their surroundings. [Expand]
3.00
3
Wednesday Poster Session
Diffusion Probabilistic Models for 3D Point Cloud Generation
Shitong Luo, Wei Hu
We present a probabilistic model for point cloud generation, which is fundamental for various 3D vision tasks such as shape completion, upsampling, synthesis and data augmentation. [Expand]
3.00
3
Tuesday Poster Session
Read and Attend: Temporal Localisation in Sign Language Videos
Gul Varol, Liliane Momeni, Samuel Albanie, Triantafyllos Afouras, Andrew Zisserman
The objective of this work is to annotate sign instances across a broad vocabulary in continuous sign language. [Expand]
3.00
2
Friday Poster Session
ACTION-Net: Multipath Excitation for Action Recognition
Zhengwei Wang, Qi She, Aljosa Smolic
Spatial-temporal, channel-wise, and motion patterns are three complementary and crucial types of information for video action recognition. [Expand]
Thursday Poster Session
MetaSCI: Scalable and Adaptive Reconstruction for Video Compressive Sensing
Zhengjue Wang, Hao Zhang, Ziheng Cheng, Bo Chen, Xin Yuan
To capture high-speed videos using a two-dimensional detector, video snapshot compressive imaging (SCI) is a promising system, where the video frames are coded by different masks and then compressed to a snapshot measurement. [Expand]
3.00
3
Monday Poster Session
Prototype-Supervised Adversarial Network for Targeted Attack of Deep Hashing
Xunguang Wang, Zheng Zhang, Baoyuan Wu, Fumin Shen, Guangming Lu
Due to its powerful capability of representation learning and high-efficiency computation, deep hashing has made significant progress in large-scale image retrieval. [Expand]
Friday Poster Session
Understanding the Robustness of Skeleton-Based Action Recognition Under Adversarial Attack
He Wang, Feixiang He, Zhexi Peng, Tianjia Shao, Yong-Liang Yang, Kun Zhou, David Hogg
Action recognition has been heavily employed in many applications such as autonomous vehicles, surveillance, etc., where its robustness is a primary concern. [Expand]
3.00
2
Thursday Poster Session
Positive-Congruent Training: Towards Regression-Free Model Updates
Sijie Yan, Yuanjun Xiong, Kaustav Kundu, Shuo Yang, Siqi Deng, Meng Wang, Wei Xia, Stefano Soatto
Reducing inconsistencies in the behavior of different versions of an AI system can be as important in practice as reducing its overall error. [Expand]
3.00
1
Thursday Poster Session
Robust Instance Segmentation Through Reasoning About Multi-Object Occlusion
Xiaoding Yuan, Adam Kortylewski, Yihong Sun, Alan Yuille
Analyzing complex scenes with Deep Neural Networks is a challenging task, particularly when images contain multiple objects that partially occlude each other. [Expand]
Wednesday Poster Session
Deep Stable Learning for Out-of-Distribution Generalization
Xingxuan Zhang, Peng Cui, Renzhe Xu, Linjun Zhou, Yue He, Zheyan Shen
Approaches based on deep neural networks have achieved striking performance when testing data and training data share similar distribution, but can significantly fail otherwise. [Expand]
Tuesday Poster Session
DoDNet: Learning To Segment Multi-Organ and Tumors From Multiple Partially Labeled Datasets
Jianpeng Zhang, Yutong Xie, Yong Xia, Chunhua Shen
Due to the intensive cost of labor and expertise in annotating 3D medical images at a voxel level, most benchmark datasets are equipped with the annotations of only one type of organs and/or tumors, resulting in the so-called partially labeling issue. [Expand]
3.00
3
Monday Poster Session
Improving Sign Language Translation With Monolingual Data by Sign Back-Translation
Hao Zhou, Wengang Zhou, Weizhen Qi, Junfu Pu, Houqiang Li
Despite existing pioneering works on sign language translation (SLT), there is a non-trivial obstacle, i.e., the limited quantity of parallel sign-text data. [Expand]
Monday Poster Session
Spatially-Varying Outdoor Lighting Estimation From Intrinsics
Yongjie Zhu, Yinda Zhang, Si Li, Boxin Shi
We present SOLID-Net, a neural network for spatially-varying outdoor lighting estimation from a single outdoor image for any 2D pixel location. [Expand]
Thursday Poster Session
ArtFlow: Unbiased Image Style Transfer via Reversible Neural Flows
Jie An, Siyu Huang, Yibing Song, Dejing Dou, Wei Liu, Jiebo Luo
Universal style transfer retains styles from reference images in content images. [Expand]
2.75
1
Monday Poster Session
Boundary IoU: Improving Object-Centric Image Segmentation Evaluation
Bowen Cheng, Ross Girshick, Piotr Dollar, Alexander C. Berg, Alexander Kirillov
We present Boundary IoU (Intersection-over-Union), a new segmentation evaluation measure focused on boundary quality. [Expand]
2.75
1
Thursday Poster Session
Equivariant Point Network for 3D Point Cloud Analysis
Haiwei Chen, Shichen Liu, Weikai Chen, Hao Li, Randall Hill
Features that are equivariant to a larger group of symmetries have been shown to be more discriminative and powerful in recent studies. [Expand]
Thursday Poster Session
Compatibility-Aware Heterogeneous Visual Search
Rahul Duggal, Hao Zhou, Shuo Yang, Yuanjun Xiong, Wei Xia, Zhuowen Tu, Stefano Soatto
We tackle the problem of visual search under resource constraints. [Expand]
Wednesday Poster Session
Sewer-ML: A Multi-Label Sewer Defect Classification Dataset and Benchmark
Joakim Bruslund Haurum, Thomas B. Moeslund
Perhaps surprisingly, sewerage infrastructure is one of the most costly infrastructures in modern society. [Expand]
2.75
1
Thursday Poster Session
Reciprocal Landmark Detection and Tracking With Extremely Few Annotations
Jianzhe Lin, Ghazal Sahebzamani, Christina Luong, Fatemeh Taheri Dezaki, Mohammad Jafari, Purang Abolmaesumi, Teresa Tsang
Localization of anatomical landmarks to perform two-dimensional measurements in echocardiography is part of routine clinical workflow in cardiac disease diagnosis. [Expand]
Thursday Poster Session
Noise-Resistant Deep Metric Learning With Ranking-Based Instance Selection
Chang Liu, Han Yu, Boyang Li, Zhiqi Shen, Zhanning Gao, Peiran Ren, Xuansong Xie, Lizhen Cui, Chunyan Miao
The existence of noisy labels in real-world data negatively impacts the performance of deep learning models. [Expand]
Tuesday Poster Session
Zero-Shot Adversarial Quantization
Yuang Liu, Wei Zhang, Jun Wang
Model quantization is a promising approach to compress deep neural networks and accelerate inference, making it possible to be deployed on mobile and edge devices. [Expand]
Monday Poster Session
Exploring Adversarial Fake Images on Face Manifold
Dongze Li, Wei Wang, Hongxing Fan, Jing Dong
Images synthesized by powerful generative adversarial network (GAN) based methods have drawn moral and privacy concerns. [Expand]
Tuesday Poster Session
StEP: Style-Based Encoder Pre-Training for Multi-Modal Image Synthesis
Moustafa Meshry, Yixuan Ren, Larry S. Davis, Abhinav Shrivastava
We propose a novel approach for multi-modal image-to-image (I2I) translation. [Expand]
Tuesday Poster Session
HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences
Feitong Tan, Danhang Tang, Mingsong Dou, Kaiwen Guo, Rohit Pandey, Cem Keskin, Ruofei Du, Deqing Sun, Sofien Bouaziz, Sean Fanello, Ping Tan, Yinda Zhang
In this paper, we address the problem of building pixel-wise dense correspondences between human images under arbitrary camera viewpoints and body poses. [Expand]
Monday Poster Session
Depth-Conditioned Dynamic Message Propagation for Monocular 3D Object Detection
Li Wang, Liang Du, Xiaoqing Ye, Yanwei Fu, Guodong Guo, Xiangyang Xue, Jianfeng Feng, Li Zhang
The objective of this paper is to learn context- and depth-aware feature representation to solve the problem of monocular 3D object detection. [Expand]
Monday Poster Session
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions
Junbin Xiao, Xindi Shang, Angela Yao, Tat-Seng Chua
We introduce NExT-QA, a rigorously designed video question answering (VideoQA) benchmark to advance video understanding from describing to explaining the temporal actions. [Expand]
Wednesday Poster Session
PAConv: Position Adaptive Convolution With Dynamic Kernel Assembling on Point Clouds
Mutian Xu, Runyu Ding, Hengshuang Zhao, Xiaojuan Qi
We introduce Position Adaptive Convolution (PAConv), a generic convolution operation for 3D point cloud processing. [Expand]
2.75
1
Tuesday Poster Session
Patch-VQ: 'Patching Up' the Video Quality Problem
Zhenqiang Ying, Maniratnam Mandal, Deepti Ghadiyaram, Alan Bovik
No-reference (NR) perceptual video quality assessment (VQA) is a complex, unsolved, and important problem for social and streaming media applications. [Expand]
2.75
2
Thursday Poster Session
Are Labels Always Necessary for Classifier Accuracy Evaluation?
Weijian Deng, Liang Zheng
To calculate the model accuracy on a computer vision task, e.g., object recognition, we usually require a test set composed of test samples and their ground truth labels. [Expand]
2.50
2
Thursday Poster Session
XProtoNet: Diagnosis in Chest Radiography With Global and Local Explanations
Eunji Kim, Siwon Kim, Minji Seo, Sungroh Yoon
Automated diagnosis using deep neural networks in chest radiography can help radiologists detect life-threatening diseases.
2.50
1
Friday Poster Session
MongeNet: Efficient Sampler for Geometric Deep Learning
Leo Lebrat, Rodrigo Santa Cruz, Clinton Fookes, Olivier Salvado
Recent advances in geometric deep-learning introduce complex computational challenges for evaluating the distance between meshes.
Friday Poster Session
One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu
Point cloud semantic segmentation often requires large-scale annotated training data, but clearly, point-wise labels are too tedious to prepare.
Monday Poster Session
PointGuard: Provably Robust 3D Point Cloud Classification
Hongbin Liu, Jinyuan Jia, Neil Zhenqiang Gong
3D point cloud classification has many safety-critical applications such as autonomous driving and robotic grasping.
2.50
2
Tuesday Poster Session
UPFlow: Upsampling Pyramid for Unsupervised Optical Flow Learning
Kunming Luo, Chuan Wang, Shuaicheng Liu, Haoqiang Fan, Jue Wang, Jian Sun
We present an unsupervised learning approach for optical flow estimation by improving the upsampling and learning of pyramid networks.
2.50
2
Monday Poster Session
Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality
Trisha Mittal, Puneet Mathur, Aniket Bera, Dinesh Manocha
We present Affect2MM, a learning method for time-series emotion prediction for multimedia content.
2.50
1
Tuesday Poster Session
Dive Into Ambiguity: Latent Distribution Mining and Pairwise Uncertainty Estimation for Facial Expression Recognition
Jiahui She, Yibo Hu, Hailin Shi, Jun Wang, Qiu Shen, Tao Mei
Due to the subjective annotation and the inherent inter-class similarity of facial expressions, one of the key challenges in Facial Expression Recognition (FER) is the annotation ambiguity.
Tuesday Poster Session
SceneGen: Learning To Generate Realistic Traffic Scenes
Shuhan Tan, Kelvin Wong, Shenlong Wang, Sivabalan Manivasagam, Mengye Ren, Raquel Urtasun
We consider the problem of generating realistic traffic scenes automatically.
2.50
1
Monday Poster Session
Learning Better Visual Dialog Agents With Pretrained Visual-Linguistic Representation
Tao Tu, Qing Ping, Govindarajan Thattai, Gokhan Tur, Prem Natarajan
GuessWhat?! is a visual dialog guessing game which incorporates a Questioner agent that generates a sequence of questions, while an Oracle agent answers the respective questions about a target object in an image.
Tuesday Poster Session
Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation
Liwei Wang, Jing Huang, Yin Li, Kun Xu, Zhengyuan Yang, Dong Yu
Weakly supervised phrase grounding aims at learning region-phrase correspondences using only image-sentence pairs.
2.50
1
Thursday Poster Session
When Human Pose Estimation Meets Robustness: Adversarial Algorithms and Benchmarks
Jiahang Wang, Sheng Jin, Wentao Liu, Weizhong Liu, Chen Qian, Ping Luo
Human pose estimation is a fundamental yet challenging task in computer vision, which aims at localizing human anatomical keypoints.
Thursday Poster Session
Unsupervised Discovery of the Long-Tail in Instance Segmentation Using Hierarchical Self-Supervision
Zhenzhen Weng, Mehmet Giray Ogut, Shai Limonchik, Serena Yeung
Instance segmentation is an active topic in computer vision that is usually solved by using supervised learning approaches over very large datasets composed of object level masks.
Monday Poster Session
Capturing Omni-Range Context for Omnidirectional Segmentation
Kailun Yang, Jiaming Zhang, Simon Reiss, Xinxin Hu, Rainer Stiefelhagen
Convolutional Networks (ConvNets) excel at semantic segmentation and have become a vital component for perception in autonomous driving.
2.50
1
Monday Poster Session
TPCN: Temporal Point Cloud Networks for Motion Forecasting
Maosheng Ye, Tongyi Cao, Qifeng Chen
We propose the Temporal Point Cloud Networks (TPCN), a novel and flexible framework with joint spatial and temporal learning for trajectory prediction.
2.50
2
Wednesday Poster Session
Learning To Recommend Frame for Interactive Video Object Segmentation in the Wild
Zhaoyuan Yin, Jia Zheng, Weixin Luo, Shenhan Qian, Hanling Zhang, Shenghua Gao
This paper proposes a framework for interactive video object segmentation (VOS) in the wild, where users can choose some frames for annotations iteratively.
Thursday Poster Session
Few-Shot Incremental Learning With Continually Evolved Classifiers
Chi Zhang, Nan Song, Guosheng Lin, Yun Zheng, Pan Pan, Yinghui Xu
Few-shot class-incremental learning (FSCIL) aims to design machine learning algorithms that can continually learn new concepts from a few data points, without forgetting knowledge of old classes.
2.50
2
Thursday Poster Session
Learning Neural Representation of Camera Pose with Matrix Representation of Pose Shift via View Synthesis
Yaxuan Zhu, Ruiqi Gao, Siyuan Huang, Song-Chun Zhu, Ying Nian Wu
How to efficiently represent camera pose is an essential problem in 3D computer vision, especially in tasks like camera pose regression and novel view synthesis.
Wednesday Poster Session
LQF: Linear Quadratic Fine-Tuning
Alessandro Achille, Aditya Golatkar, Avinash Ravichandran, Marzia Polito, Stefano Soatto
Classifiers that are linear in their parameters, and trained by optimizing a convex loss function, have predictable behavior with respect to changes in the training data, initial conditions, and optimization.
2.25
1
Friday Poster Session
More Photos Are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval
Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yongxin Yang, Tao Xiang, Yi-Zhe Song
A fundamental challenge faced by existing Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) models is the data scarcity -- model performances are largely bottlenecked by the lack of sketch-photo pairs.
2.25
2
Tuesday Poster Session
InverseForm: A Loss Function for Structured Boundary-Aware Segmentation
Shubhankar Borse, Ying Wang, Yizhe Zhang, Fatih Porikli
We present a novel boundary-aware loss term for semantic segmentation using an inverse-transformation network, which efficiently learns the degree of parametric transformations between estimated and target boundaries.
Tuesday Poster Session
Back-Tracing Representative Points for Voting-Based 3D Object Detection in Point Clouds
Bowen Cheng, Lu Sheng, Shaoshuai Shi, Ming Yang, Dong Xu
3D object detection in point clouds is a challenging vision task that benefits various applications for understanding the 3D visual world.
Wednesday Poster Session
Deep Analysis of CNN-Based Spatio-Temporal Representations for Action Recognition
Chun-Fu Richard Chen, Rameswar Panda, Kandan Ramakrishnan, Rogerio Feris, John Cohn, Aude Oliva, Quanfu Fan
In recent years, a number of approaches based on 2D or 3D convolutional neural networks (CNN) have emerged for video action recognition, achieving state-of-the-art results on several large-scale benchmark datasets.
Tuesday Poster Session
Reformulating HOI Detection As Adaptive Set Prediction
Mingfei Chen, Yue Liao, Si Liu, Zhiyuan Chen, Fei Wang, Chen Qian
Determining which image regions to concentrate on is critical for Human-Object Interaction (HOI) detection.
2.25
2
Wednesday Poster Session
Wasserstein Contrastive Representation Distillation
Liqun Chen, Dong Wang, Zhe Gan, Jingjing Liu, Ricardo Henao, Lawrence Carin
The primary goal of knowledge distillation (KD) is to encapsulate the information of a model learned from a teacher network into a student network, with the latter being more compact than the former.
2.25
2
Friday Poster Session
Generalizable Person Re-Identification With Relevance-Aware Mixture of Experts
Yongxing Dai, Xiaotong Li, Jun Liu, Zekun Tong, Ling-Yu Duan
Domain generalizable (DG) person re-identification (ReID) is a challenging problem because we cannot access any unseen target domain data during training.
2.25
1
Friday Poster Session
General Instance Distillation for Object Detection
Xing Dai, Zeren Jiang, Zhao Wu, Yiping Bao, Zhicheng Wang, Si Liu, Erjin Zhou
In recent years, knowledge distillation has proved to be an effective solution for model compression.
2.25
2
Wednesday Poster Session
Deformed Implicit Field: Modeling 3D Shapes With Learned Dense Correspondence
Yu Deng, Jiaolong Yang, Xin Tong
We propose a novel Deformed Implicit Field (DIF) representation for modeling 3D shapes of a category and generating dense correspondences among shapes.
2.25
2
Wednesday Poster Session
AlphaMatch: Improving Consistency for Semi-Supervised Learning With Alpha-Divergence
Chengyue Gong, Dilin Wang, Qiang Liu
Semi-supervised learning (SSL) is a key approach toward more data-efficient machine learning by jointly leveraging both labeled and unlabeled data.
Thursday Poster Session
ReDet: A Rotation-Equivariant Detector for Aerial Object Detection
Jiaming Han, Jian Ding, Nan Xue, Gui-Song Xia
Recently, object detection in aerial images has gained much attention in computer vision.
2.25
2
Monday Poster Session
Reinforced Attention for Few-Shot Learning and Beyond
Jie Hong, Pengfei Fang, Weihao Li, Tong Zhang, Christian Simon, Mehrtash Harandi, Lars Petersson
Few-shot learning aims to correctly recognize query samples from unseen classes given a limited number of support samples, often by relying on global embeddings of images.
2.25
2
Monday Poster Session
A Multiplexed Network for End-to-End, Multilingual OCR
Jing Huang, Guan Pang, Rama Kovvuri, Mandy Toh, Kevin J Liang, Praveen Krishnan, Xi Yin, Tal Hassner
Recent advances in OCR have shown that an end-to-end (E2E) training pipeline that includes both detection and recognition leads to the best results.
2.25
2
Tuesday Poster Session
FlowStep3D: Model Unrolling for Self-Supervised Scene Flow Estimation
Yair Kittenplon, Yonina C. Eldar, Dan Raviv
Estimating the 3D motion of points in a scene, known as scene flow, is a core problem in computer vision.
2.25
2
Tuesday Poster Session
Refer-It-in-RGBD: A Bottom-Up Approach for 3D Visual Grounding in RGBD Images
Haolin Liu, Anran Lin, Xiaoguang Han, Lei Yang, Yizhou Yu, Shuguang Cui
Grounding referring expressions in RGBD images has been an emerging field.
Tuesday Poster Session
Semi-Supervised 3D Hand-Object Poses Estimation With Interactions in Time
Shaowei Liu, Hanwen Jiang, Jiarui Xu, Sifei Liu, Xiaolong Wang
Estimating 3D hand and object pose from a single image is an extremely challenging problem: hands and objects are often self-occluded during interactions, and the 3D annotations are scarce as even humans cannot directly label the ground-truths from a single image perfectly.
2.25
2
Thursday Poster Session
Unsupervised Part Segmentation Through Disentangling Appearance and Shape
Shilong Liu, Lei Zhang, Xiao Yang, Hang Su, Jun Zhu
We study the problem of unsupervised discovery and segmentation of object parts, which, as an intermediate local representation, are capable of finding intrinsic object structure and providing more explainable recognition results.
Wednesday Poster Session
PointNetLK Revisited
Xueqian Li, Jhony Kaesemodel Pontes, Simon Lucey
We address the generalization ability of recent learning-based point cloud registration methods.
Thursday Poster Session
QAIR: Practical Query-Efficient Black-Box Attacks for Image Retrieval
Xiaodan Li, Jinfeng Li, Yuefeng Chen, Shaokai Ye, Yuan He, Shuhui Wang, Hang Su, Hui Xue
We study the query-based attack against image retrieval to evaluate its robustness against adversarial examples under the black-box setting, where the adversary only has query access to the top-k ranked unlabeled images from the database.
2.25
2
Tuesday Poster Session
Quasi-Dense Similarity Learning for Multiple Object Tracking
Jiangmiao Pang, Linlu Qiu, Xia Li, Haofeng Chen, Qi Li, Trevor Darrell, Fisher Yu
Similarity learning has been recognized as a crucial step for object tracking.
2.25
1
Monday Poster Session
Dual Pixel Exploration: Simultaneous Depth Estimation and Image Restoration
Liyuan Pan, Shah Chowdhury, Richard Hartley, Miaomiao Liu, Hongguang Zhang, Hongdong Li
The dual-pixel (DP) hardware works by splitting each pixel in half and creating an image pair in a single snapshot.
2.25
1
Tuesday Poster Session
S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-Bit Neural Networks via Guided Distribution Calibration
Zhiqiang Shen, Zechun Liu, Jie Qin, Lei Huang, Kwang-Ting Cheng, Marios Savvides
Previous studies predominantly target self-supervised learning on real-valued networks and have achieved many promising results.
2.25
2
Monday Poster Session
NeuralHumanFVV: Real-Time Neural Volumetric Human Performance Rendering Using RGB Cameras
Xin Suo, Yuheng Jiang, Pei Lin, Yingliang Zhang, Minye Wu, Kaiwen Guo, Lan Xu
4D reconstruction and rendering of human activities is critical for immersive VR/AR experience.
2.25
2
Tuesday Poster Session
Modeling Multi-Label Action Dependencies for Temporal Action Localization
Praveen Tirupattur, Kevin Duarte, Yogesh S Rawat, Mubarak Shah
Real world videos contain many complex actions with inherent relationships between action classes.
Monday Poster Session
ORDisCo: Effective and Efficient Usage of Incremental Unlabeled Data for Semi-Supervised Continual Learning
Liyuan Wang, Kuo Yang, Chongxuan Li, Lanqing Hong, Zhenguo Li, Jun Zhu
Continual learning usually assumes the incoming data are fully labeled, which might not be applicable in real applications.
Tuesday Poster Session
Removing the Background by Adding the Background: Towards Background Robust Self-Supervised Video Representation Learning
Jinpeng Wang, Yuting Gao, Ke Li, Yiqi Lin, Andy J. Ma, Hao Cheng, Pai Peng, Feiyue Huang, Rongrong Ji, Xing Sun
Self-supervised learning has shown great potential in improving the video representation ability of deep neural networks by getting supervision from the data itself.
2.25
2
Thursday Poster Session
T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval
Xiaohan Wang, Linchao Zhu, Yi Yang
Text-video retrieval is a challenging task that aims to search relevant video content based on natural language descriptions.
2.25
2
Tuesday Poster Session
Few-Shot Classification With Feature Map Reconstruction Networks
Davis Wertheimer, Luming Tang, Bharath Hariharan
In this paper we reformulate few-shot classification as a reconstruction problem in latent space.
Wednesday Poster Session
Towards Rolling Shutter Correction and Deblurring in Dynamic Scenes
Zhihang Zhong, Yinqiang Zheng, Imari Sato
Joint rolling shutter correction and deblurring (RSCD) techniques are critical for the prevalent CMOS cameras.
Wednesday Poster Session
Sequence-to-Sequence Contrastive Learning for Text Recognition
Aviad Aberdam, Ron Litman, Shahar Tsiper, Oron Anschel, Ron Slossberg, Shai Mazor, R. Manmatha, Pietro Perona
We propose a framework for sequence-to-sequence contrastive learning (SeqCLR) of visual representations, which we apply to text recognition.
2.00
2
Thursday Poster Session
Unsupervised Multi-Source Domain Adaptation Without Access to Source Data
Sk Miraj Ahmed, Dripta S. Raychaudhuri, Sujoy Paul, Samet Oymak, Amit K. Roy-Chowdhury
Unsupervised Domain Adaptation (UDA) aims to learn a predictor model for an unlabeled dataset by transferring knowledge from labeled source data, which has been trained on similar tasks.
2.00
2
Wednesday Poster Session
Object Classification From Randomized EEG Trials
Hamad Ahmed, Ronnie B. Wilbur, Hari M. Bharadwaj, Jeffrey Mark Siskind
New results suggest strong limits to the feasibility of object classification from human brain activity evoked by image stimuli, as measured through EEG.
2.00
1
Tuesday Poster Session
Polka Lines: Learning Structured Illumination and Reconstruction for Active Stereo
Seung-Hwan Baek, Felix Heide
Active stereo cameras that recover depth from structured light captures have become a cornerstone sensor modality for 3D scene reconstruction and understanding tasks across application domains.
2.00
2
Tuesday Poster Session
Architectural Adversarial Robustness: The Case for Deep Pursuit
George Cazenavette, Calvin Murdock, Simon Lucey
Despite their unmatched performance, deep neural networks remain susceptible to targeted attacks by nearly imperceptible levels of adversarial noise.
2.00
2
Wednesday Poster Session
Learning Feature Aggregation for Deep 3D Morphable Models
Zhixiang Chen, Tae-Kyun Kim
3D morphable models are widely used for the shape representation of an object class in computer vision and graphics applications.
Thursday Poster Session
I3Net: Implicit Instance-Invariant Network for Adapting One-Stage Object Detectors
Chaoqi Chen, Zebiao Zheng, Yue Huang, Xinghao Ding, Yizhou Yu
Recent works on two-stage cross-domain detection have widely explored the local feature patterns to achieve more accurate adaptation results.
2.00
2
Thursday Poster Session
Semantic-Aware Knowledge Distillation for Few-Shot Class-Incremental Learning
Ali Cheraghian, Shafin Rahman, Pengfei Fang, Soumava Kumar Roy, Lars Petersson, Mehrtash Harandi
Few-shot class incremental learning (FSCIL) portrays the problem of learning new concepts gradually, where only a few examples per concept are available to the learner.
2.00
2
Monday Poster Session
LiBRe: A Practical Bayesian Approach to Adversarial Detection
Zhijie Deng, Xiao Yang, Shizhen Xu, Hang Su, Jun Zhu
Despite their appealing flexibility, deep neural networks (DNNs) are vulnerable to adversarial examples.
Monday Poster Session
Multi-Institutional Collaborations for Improving Deep Learning-Based Magnetic Resonance Image Reconstruction Using Federated Learning
Pengfei Guo, Puyang Wang, Jinyuan Zhou, Shanshan Jiang, Vishal M. Patel
Fast and accurate reconstruction of magnetic resonance (MR) images from under-sampled data is important in many clinical applications.
2.00
2
Monday Poster Session
MetaCorrection: Domain-Aware Meta Loss Correction for Unsupervised Domain Adaptation in Semantic Segmentation
Xiaoqing Guo, Chen Yang, Baopu Li, Yixuan Yuan
Unsupervised domain adaptation (UDA) aims to transfer the knowledge from the labeled source domain to the unlabeled target domain.
Tuesday Poster Session
Lips Don't Lie: A Generalisable and Robust Approach To Face Forgery Detection
Alexandros Haliassos, Konstantinos Vougioukas, Stavros Petridis, Maja Pantic
Although current deep learning-based face forgery detectors achieve impressive performance in constrained scenarios, they are vulnerable to samples created by unseen manipulation methods.
2.00
1
Tuesday Poster Session
Neural Cellular Automata Manifold
Alejandro Hernandez, Armand Vilalta, Francesc Moreno-Noguer
Very recently, the Neural Cellular Automata (NCA) has been proposed to simulate the morphogenesis process with deep networks.
2.00
1
Wednesday Poster Session
Visualizing Adapted Knowledge in Domain Transfer
Yunzhong Hou, Liang Zheng
A source model trained on source data and a target model learned through unsupervised domain adaptation (UDA) usually encode different knowledge.
2.00
1
Thursday Poster Session
Multi-Target Domain Adaptation With Collaborative Consistency Learning
Takashi Isobe, Xu Jia, Shuaijun Chen, Jianzhong He, Yongjie Shi, Jianzhuang Liu, Huchuan Lu, Shengjin Wang
Recently, unsupervised domain adaptation for the semantic segmentation task has become more and more popular due to the high cost of pixel-level annotation on real-world images.
2.00
2
Wednesday Poster Session
In the Light of Feature Distributions: Moment Matching for Neural Style Transfer
Nikolai Kalischek, Jan D. Wegner, Konrad Schindler
Style transfer aims to render the content of a given image in the graphical/artistic style of another image.
Wednesday Poster Session
UniT: Unified Knowledge Transfer for Any-Shot Object Detection and Segmentation
Siddhesh Khandelwal, Raghav Goyal, Leonid Sigal
Methods for object detection and segmentation rely on large scale instance-level annotations for training, which are difficult and time-consuming to collect.
2.00
1
Tuesday Poster Session
Robust Reflection Removal With Reflection-Free Flash-Only Cues
Chenyang Lei, Qifeng Chen
We propose a simple yet effective reflection-free cue for robust reflection removal from a pair of flash and ambient (no-flash) images.
2.00
2
Thursday Poster Session
SG-Net: Spatial Granularity Network for One-Stage Video Instance Segmentation
Dongfang Liu, Yiming Cui, Wenbo Tan, Yingjie Chen
Video instance segmentation (VIS) is a new and critical task in computer vision.
2.00
1
Wednesday Poster Session
Watching You: Global-Guided Reciprocal Learning for Video-Based Person Re-Identification
Xuehu Liu, Pingping Zhang, Chenyang Yu, Huchuan Lu, Xiaoyun Yang
Video-based person re-identification (Re-ID) aims to automatically retrieve video sequences of the same person under non-overlapping cameras.
2.00
2
Thursday Poster Session
Beyond Max-Margin: Class Margin Equilibrium for Few-Shot Object Detection
Bohao Li, Boyu Yang, Chang Liu, Feng Liu, Rongrong Ji, Qixiang Ye
Few-shot object detection has made encouraging progress by reconstructing novel class objects using the feature representation learned upon a set of base classes.
2.00
2
Wednesday Poster Session
D2IM-Net: Learning Detail Disentangled Implicit Fields From Single Images
Manyi Li, Hao Zhang
We present the first single-view 3D reconstruction network aimed at recovering geometric details from an input image which encompass both topological shape structures and surface features.
2.00
2
Wednesday Poster Session
Dynamic Slimmable Network
Changlin Li, Guangrun Wang, Bing Wang, Xiaodan Liang, Zhihui Li, Xiaojun Chang
Current dynamic networks and dynamic pruning methods have shown promising capability in reducing theoretical computation complexity.
2.00
2
Wednesday Poster Session
PixMatch: Unsupervised Domain Adaptation via Pixelwise Consistency Training
Luke Melas-Kyriazi, Arjun K. Manrai
Unsupervised domain adaptation is a promising technique for semantic segmentation and other computer vision tasks for which large-scale data annotation is costly and time-consuming.
Thursday Poster Session
Over-the-Air Adversarial Flickering Attacks Against Video Recognition Networks
Roi Pony, Itay Naeh, Shie Mannor
Deep neural networks for video classification, just like image classification networks, may be subjected to adversarial manipulation.
2.00
1
Monday Poster Session
Invisible Perturbations: Physical Adversarial Examples Exploiting the Rolling Shutter Effect
Athena Sayles, Ashish Hooda, Mohit Gupta, Rahul Chatterjee, Earlence Fernandes
Physical adversarial examples for camera-based computer vision have so far been achieved through visible artifacts -- a sticker on a Stop sign, colorful borders around eyeglasses or a 3D printed object with a colorful texture.
2.00
1
Thursday Poster Session
SSN: Soft Shadow Network for Image Compositing
Yichen Sheng, Jianming Zhang, Bedrich Benes
We introduce an interactive Soft Shadow Network (SSN) to generate controllable soft shadows for image compositing.
2.00
1
Tuesday Poster Session
ZeroScatter: Domain Transfer for Long Distance Imaging and Vision Through Scattering Media
Zheng Shi, Ethan Tseng, Mario Bijelic, Werner Ritter, Felix Heide
Adverse weather conditions, including snow, rain, and fog, pose a major challenge for both human and computer vision.
Tuesday Poster Session
Open Domain Generalization with Domain-Augmented Meta-Learning
Yang Shu, Zhangjie Cao, Chenyu Wang, Jianmin Wang, Mingsheng Long
Leveraging datasets available to learn a model with high generalization ability to unseen domains is important for computer vision, especially when the unseen domain's annotated data are unavailable.
2.00
2
Wednesday Poster Session
Equalization Loss v2: A New Gradient Balance Approach for Long-Tailed Object Detection
Jingru Tan, Xin Lu, Gang Zhang, Changqing Yin, Quanquan Li
Recently proposed decoupled training methods have emerged as a dominant paradigm for long-tailed object detection.
Monday Poster Session
RAFT-3D: Scene Flow Using Rigid-Motion Embeddings
Zachary Teed, Jia Deng
We address the problem of scene flow: given a pair of stereo or RGB-D video frames, estimate pixelwise 3D motion.
2.00
2
Wednesday Poster Session
Unsupervised Learning for Robust Fitting: A Reinforcement Learning Approach
Giang Truong, Huu Le, David Suter, Erchuan Zhang, Syed Zulqarnain Gilani
Robust model fitting is a core algorithm in a large number of computer vision applications.
Wednesday Poster Session
Incremental Learning via Rate Reduction
Ziyang Wu, Christina Baek, Chong You, Yi Ma
Current deep learning architectures suffer from catastrophic forgetting, a failure to retain knowledge of previously learned classes when incrementally trained on new classes.
2.00
2
Monday Poster Session
Efficient Regional Memory Network for Video Object Segmentation
Haozhe Xie, Hongxun Yao, Shangchen Zhou, Shengping Zhang, Wenxiu Sun
Recently, several Space-Time Memory based networks have shown that the object cues (e.g. [Expand]
2.00
2
Monday Poster Session
Learnable Companding Quantization for Accurate Low-Bit Neural Networks
Kohei Yamamoto
Quantizing deep neural networks is an effective method for reducing memory consumption and improving inference speed, and is thus useful for implementation in resource-constrained devices.
Tuesday Poster Session
Interactive Self-Training With Mean Teachers for Semi-Supervised Object Detection
Qize Yang, Xihan Wei, Biao Wang, Xian-Sheng Hua, Lei Zhang
The goal of semi-supervised object detection is to learn a detection model using only a few labeled data and large amounts of unlabeled data, thereby reducing the cost of data labeling.
2.00
2
Tuesday Poster Session
Closing the Loop: Joint Rain Generation and Removal via Disentangled Image Translation
Yuntong Ye, Yi Chang, Hanyu Zhou, Luxin Yan
Existing deep learning-based image deraining methods have achieved promising performance for synthetic rainy images, but typically rely on pairs of sharp images and simulated rainy counterparts.
Monday Poster Session
Mutual Graph Learning for Camouflaged Object Detection
Qiang Zhai, Xin Li, Fan Yang, Chenglizhao Chen, Hong Cheng, Deng-Ping Fan
Automatically detecting/segmenting object(s) that blend in with their surroundings is difficult for current models.
2.00
2
Thursday Poster Session
Group-aware Label Transfer for Domain Adaptive Person Re-identification
Kecheng Zheng, Wu Liu, Lingxiao He, Tao Mei, Jiebo Luo, Zheng-Jun Zha
Unsupervised Domain Adaptive (UDA) person re-identification (ReID) aims at adapting the model trained on a labeled source-domain dataset to a target-domain dataset without any further annotations.
2.00
2
Tuesday Poster Session
Partition-Guided GANs
Mohammadreza Armandpour, Ali Sadeghian, Chunyuan Li, Mingyuan Zhou
Despite the success of Generative Adversarial Networks (GANs), their training suffers from several well-known problems, including mode collapse and difficulties learning a disconnected set of manifolds.
Tuesday Poster Session
ReAgent: Point Cloud Registration Using Imitation and Reinforcement Learning
Dominik Bauer, Timothy Patten, Markus Vincze
Point cloud registration is a common step in many 3D computer vision tasks such as object pose estimation, where a 3D model is aligned to an observation.
Thursday Poster Session
FBI-Denoiser: Fast Blind Image Denoiser for Poisson-Gaussian Noise
Jaeseok Byun, Sungmin Cha, Taesup Moon
We consider the challenging blind denoising problem for Poisson-Gaussian noise, in which no additional information about clean images or noise level parameters is available.
Tuesday Poster Session
Human-Like Controllable Image Captioning With Verb-Specific Semantic Roles
Long Chen, Zhihong Jiang, Jun Xiao, Wei Liu
Controllable Image Captioning (CIC) -- generating image descriptions following designated control signals -- has received unprecedented attention over the last few years.
1.75
1
Friday Poster Session
3D AffordanceNet: A Benchmark for Visual Object Affordance Understanding
Shengheng Deng, Xun Xu, Chaozheng Wu, Ke Chen, Kui Jia
The ability to understand the ways to interact with objects from visual cues, a.k.a. [Expand]
Monday Poster Session
Unbiased Mean Teacher for Cross-Domain Object Detection
Jinhong Deng, Wen Li, Yuhua Chen, Lixin Duan
Cross-domain object detection is challenging because object detection models are often vulnerable to data variance, especially to the considerable domain shift between two distinctive domains.
1.75
1
Tuesday Poster Session
Adaptive Methods for Real-World Domain Generalization
Abhimanyu Dubey, Vignesh Ramanathan, Alex Pentland, Dhruv Mahajan
Invariant approaches have been remarkably successful in tackling the problem of domain generalization, where the objective is to perform inference on data distributions different from those used in training.
1.75
1
Thursday Poster Session
Privacy-Preserving Image Features via Adversarial Affine Subspace Embeddings
Mihai Dusmanu, Johannes L. Schonberger, Sudipta N. Sinha, Marc Pollefeys
Many computer vision systems require users to upload image features to the cloud for processing and storage.
Thursday Poster Session
Single-Shot Freestyle Dance Reenactment
Oran Gafni, Oron Ashual, Lior Wolf
The task of motion transfer between a source dancer and a target person is a special case of the pose transfer problem, in which the target person changes their pose in accordance with the motions of the dancer.
1.75
1
Monday Poster Session
Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion
Vitor Guizilini, Rares Ambrus, Wolfram Burgard, Adrien Gaidon
Estimating scene geometry from cost-effective sensors is key for robots.
Wednesday Poster Session
Interpreting Super-Resolution Networks With Local Attribution Maps
Jinjin Gu, Chao Dong
Image super-resolution (SR) techniques have been developing rapidly, benefiting from the invention of deep networks and their successive breakthroughs.
1.75
1
Wednesday Poster Session
Learning Optical Flow From a Few Matches
Shihao Jiang, Yao Lu, Hongdong Li, Richard Hartley
State-of-the-art neural network models for optical flow estimation require a dense correlation volume at high resolutions for representing per-pixel displacement.
Friday Poster Session
Multi-Shot Temporal Event Localization: A Benchmark
Xiaolong Liu, Yao Hu, Song Bai, Fei Ding, Xiang Bai, Philip H. S. Torr
Current developments in temporal event or action localization usually target actions captured by a single camera.
Thursday Poster Session
Retinex-Inspired Unrolling With Cooperative Prior Architecture Search for Low-Light Image Enhancement
Risheng Liu, Long Ma, Jiaao Zhang, Xin Fan, Zhongxuan Luo
Low-light image enhancement plays a very important role in low-level vision.
1.75
1
Wednesday Poster Session
Cross-Domain Adaptive Clustering for Semi-Supervised Domain Adaptation
Jichang Li, Guanbin Li, Yemin Shi, Yizhou Yu
In semi-supervised domain adaptation, a few labeled samples per class in the target domain guide features of the remaining target samples to aggregate around them.
Monday Poster Session
Temporal Action Segmentation From Timestamp Supervision
Zhe Li, Yazan Abu Farha, Jurgen Gall
Temporal action segmentation approaches have been very successful recently. [Expand]
Wednesday Poster Session
Variational Relational Point Completion Network
Liang Pan, Xinyi Chen, Zhongang Cai, Junzhe Zhang, Haiyu Zhao, Shuai Yi, Ziwei Liu
Real-scanned point clouds are often incomplete due to viewpoint, occlusion, and noise. [Expand]
1.75
1
Wednesday Poster Session
The Affective Growth of Computer Vision
Norman Makoto Su, David J. Crandall
The success of deep learning has led to intense growth and interest in computer vision, along with concerns about its potential impact on society. [Expand]
Wednesday Poster Session
Look Closer To Segment Better: Boundary Patch Refinement for Instance Segmentation
Chufeng Tang, Hang Chen, Xiao Li, Jianmin Li, Zhaoxiang Zhang, Xiaolin Hu
Tremendous efforts have been made on instance segmentation but the mask quality is still not satisfactory. [Expand]
Thursday Poster Session
A Fourier-Based Framework for Domain Generalization
Qinwei Xu, Ruipeng Zhang, Ya Zhang, Yanfeng Wang, Qi Tian
Modern deep neural networks suffer from performance degradation when evaluated on testing data under different distributions from training data. [Expand]
Thursday Poster Session
SOON: Scenario Oriented Object Navigation With Graph-Based Exploration
Fengda Zhu, Xiwen Liang, Yi Zhu, Qizhi Yu, Xiaojun Chang, Xiaodan Liang
The ability to navigate like a human towards a language-guided target from anywhere in a 3D embodied environment is one of the 'holy grail' goals of intelligent robots. [Expand]
Thursday Poster Session
Deeply Shape-Guided Cascade for Instance Segmentation
Hao Ding, Siyuan Qiao, Alan Yuille, Wei Shen
The key to a successful cascade architecture for precise instance segmentation is to fully leverage the relationship between bounding box detection and mask segmentation across multiple stages. [Expand]
Wednesday Poster Session
Encoder Fusion Network With Co-Attention Embedding for Referring Image Segmentation
Guang Feng, Zhiwei Hu, Lihe Zhang, Huchuan Lu
Recently, referring image segmentation has aroused widespread interest. [Expand]
Thursday Poster Session
AGQA: A Benchmark for Compositional Spatio-Temporal Reasoning
Madeleine Grunde-McLaughlin, Ranjay Krishna, Maneesh Agrawala
Visual events are a composition of temporal actions involving actors spatially interacting with objects. [Expand]
Wednesday Poster Session
Distilling Object Detectors via Decoupled Features
Jianyuan Guo, Kai Han, Yunhe Wang, Han Wu, Xinghao Chen, Chunjing Xu, Chang Xu
Knowledge distillation is a widely used paradigm for inheriting information from a complicated teacher network to a compact student network and maintaining the strong performance. [Expand]
1.50
1
Monday Poster Session
Learning by Aligning Videos in Time
Sanjay Haresh, Sateesh Kumar, Huseyin Coskun, Shahram N. Syed, Andrey Konin, Zeeshan Zia, Quoc-Huy Tran
We present a self-supervised approach for learning video representations using temporal video alignment as a pretext task, while exploiting both frame-level and video-level information. [Expand]
1.50
1
Tuesday Poster Session
DiNTS: Differentiable Neural Network Topology Search for 3D Medical Image Segmentation
Yufan He, Dong Yang, Holger Roth, Can Zhao, Daguang Xu
Recently, neural architecture search (NAS) has been applied to automatically search high-performance networks for medical image segmentation. [Expand]
1.50
1
Tuesday Poster Session
Deep Dual Consecutive Network for Human Pose Estimation
Zhenguang Liu, Haoming Chen, Runyang Feng, Shuang Wu, Shouling Ji, Bailin Yang, Xun Wang
Multi-frame human pose estimation in complicated situations is challenging. [Expand]
Monday Poster Session
Invertible Denoising Network: A Light Solution for Real Noise Removal
Yang Liu, Zhenyue Qin, Saeed Anwar, Pan Ji, Dongwoo Kim, Sabrina Caldwell, Tom Gedeon
Invertible networks have various benefits for image denoising since they are lightweight, information-lossless, and memory-saving during back-propagation. [Expand]
Thursday Poster Session
The Blessings of Unlabeled Background in Untrimmed Videos
Yuan Liu, Jingyuan Chen, Zhenfang Chen, Bing Deng, Jianqiang Huang, Hanwang Zhang
Weakly-supervised Temporal Action Localization (WTAL) aims to detect the action segments with only video-level action labels in training. [Expand]
1.50
1
Tuesday Poster Session
SurFree: A Fast Surrogate-Free Black-Box Attack
Thibault Maho, Teddy Furon, Erwan Le Merrer
Machine learning classifiers are critically prone to evasion attacks. [Expand]
1.50
1
Wednesday Poster Session
Coarse-To-Fine Domain Adaptive Semantic Segmentation With Photometric Alignment and Category-Center Regularization
Haoyu Ma, Xiangru Lin, Zifeng Wu, Yizhou Yu
Unsupervised domain adaptation (UDA) in semantic segmentation is a fundamental yet promising task relieving the need for laborious annotation works. [Expand]
1.50
1
Tuesday Poster Session
Convolutional Hough Matching Networks
Juhong Min, Minsu Cho
Despite advances in feature representation, leveraging geometric relations is crucial for establishing reliable visual correspondences under large variations of images. [Expand]
1.50
1
Tuesday Poster Session
Unveiling the Potential of Structure Preserving for Weakly Supervised Object Localization
Xingjia Pan, Yingguo Gao, Zhiwen Lin, Fan Tang, Weiming Dong, Haolei Yuan, Feiyue Huang, Changsheng Xu
Weakly supervised object localization (WSOL) remains an open problem due to the deficiency of finding object extent information using a classification network. [Expand]
Thursday Poster Session
Learning Dynamic Network Using a Reuse Gate Function in Semi-Supervised Video Object Segmentation
Hyojin Park, Jayeon Yoo, Seohyeong Jeong, Ganesh Venkatesh, Nojun Kwak
Current state-of-the-art approaches for Semi-supervised Video Object Segmentation (Semi-VOS) propagate information from previous frames to generate a segmentation mask for the current frame. [Expand]
Wednesday Poster Session
HoHoNet: 360 Indoor Holistic Understanding With Latent Horizontal Features
Cheng Sun, Min Sun, Hwann-Tzong Chen
We present HoHoNet, a versatile and efficient framework for holistic understanding of an indoor 360-degree panorama using a Latent Horizontal Feature (LHFeat). [Expand]
1.50
1
Monday Poster Session
Layerwise Optimization by Gradient Decomposition for Continual Learning
Shixiang Tang, Dapeng Chen, Jinguo Zhu, Shijie Yu, Wanli Ouyang
Deep neural networks achieve state-of-the-art and sometimes super-human performance across a variety of domains. [Expand]
1.50
1
Wednesday Poster Session
Consensus Maximisation Using Influences of Monotone Boolean Functions
Ruwan Tennakoon, David Suter, Erchuan Zhang, Tat-Jun Chin, Alireza Bab-Hadiashar
Consensus maximisation (MaxCon), widely used for robust fitting in computer vision, aims to find the largest subset of data that fits the model within some tolerance level. [Expand]
Tuesday Poster Session
Found a Reason for me? Weakly-supervised Grounded Visual Question Answering using Capsules
Aisha Urooj, Hilde Kuehne, Kevin Duarte, Chuang Gan, Niels Lobo, Mubarak Shah
The problem of grounding VQA tasks has seen an increased attention in the research community recently, with most attempts usually focusing on solving this task by using pretrained object detectors. [Expand]
Wednesday Poster Session
Efficient Feature Transformations for Discriminative and Generative Continual Learning
Vinay Kumar Verma, Kevin J Liang, Nikhil Mehta, Piyush Rai, Lawrence Carin
As neural networks are increasingly being applied to real-world applications, mechanisms to address distributional shift and sequential task learning without forgetting are critical. [Expand]
Thursday Poster Session
PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds
Yi Wei, Ziyi Wang, Yongming Rao, Jiwen Lu, Jie Zhou
In this paper, we propose a Point-Voxel Recurrent All-Pairs Field Transforms (PV-RAFT) method to estimate scene flow from point clouds. [Expand]
Tuesday Poster Session
Seeking the Shape of Sound: An Adaptive Framework for Learning Voice-Face Association
Peisong Wen, Qianqian Xu, Yangbangyan Jiang, Zhiyong Yang, Yuan He, Qingming Huang
Nowadays, we have witnessed the early progress on learning the association between voice and face automatically, which brings a new wave of studies to the computer vision community. [Expand]
1.50
1
Friday Poster Session
Rethinking Class Relations: Absolute-Relative Supervised and Unsupervised Few-Shot Learning
Hongguang Zhang, Piotr Koniusz, Songlei Jian, Hongdong Li, Philip H. S. Torr
The majority of existing few-shot learning methods describe image relations with binary labels. [Expand]
Wednesday Poster Session
Variational Pedestrian Detection
Yuang Zhang, Huanyu He, Jianguo Li, Yuxi Li, John See, Weiyao Lin
Pedestrian detection in a crowd is a challenging task due to a high number of mutually-occluding human instances, which brings ambiguity and optimization difficulties to the current IoU-based ground truth assignment procedure in classical object detection methods. [Expand]
Thursday Poster Session
Camera Pose Matters: Improving Depth Prediction by Mitigating Pose Distribution Bias
Yunhan Zhao, Shu Kong, Charless Fowlkes
Monocular depth predictors are typically trained on large-scale training sets which are naturally biased w.r.t the distribution of camera poses. [Expand]
Friday Poster Session
Learning View-Disentangled Human Pose Representation by Contrastive Cross-View Mutual Information Maximization
Long Zhao, Yuxiao Wang, Jiaping Zhao, Liangzhe Yuan, Jennifer J. Sun, Florian Schroff, Hartwig Adam, Xi Peng, Dimitris Metaxas, Ting Liu
We introduce a novel representation learning method to disentangle pose-dependent as well as view-dependent factors from 2D human poses. [Expand]
Thursday Poster Session
Learning Statistical Texture for Semantic Segmentation
Lanyun Zhu, Deyi Ji, Shiping Zhu, Weihao Gan, Wei Wu, Junjie Yan
Existing semantic segmentation works mainly focus on learning the contextual information in high-level semantic features with CNNs. [Expand]
1.50
1
Thursday Poster Session
The Translucent Patch: A Physical and Universal Attack on Object Detectors
Alon Zolfi, Moshe Kravchik, Yuval Elovici, Asaf Shabtai
Physical adversarial attacks against object detectors have seen increasing success in recent years. [Expand]
1.50
1
Thursday Poster Session
Riggable 3D Face Reconstruction via In-Network Optimization
Ziqian Bai, Zhaopeng Cui, Xiaoming Liu, Ping Tan
This paper presents a method for riggable 3D face reconstruction from monocular images, which jointly estimates a personalized face rig and per-image parameters including expressions, poses, and illuminations. [Expand]
Tuesday Poster Session
View Generalization for Single Image Textured 3D Models
Anand Bhattad, Aysegul Dundar, Guilin Liu, Andrew Tao, Bryan Catanzaro
Humans can easily infer the underlying 3D geometry and texture of an object only from a single 2D image. [Expand]
Tuesday Poster Session
Scale-Localized Abstract Reasoning
Yaniv Benny, Niv Pekar, Lior Wolf
We consider the abstract relational reasoning task, which is commonly used as an intelligence test. [Expand]
1.25
1
Thursday Poster Session
Limitations of Post-Hoc Feature Alignment for Robustness
Collin Burns, Jacob Steinhardt
Feature alignment is an approach to improving robustness to distribution shift that matches the distribution of feature activations between the training distribution and test distribution. [Expand]
Monday Poster Session
Semi-Supervised Domain Adaptation Based on Dual-Level Domain Mixing for Semantic Segmentation
Shuaijun Chen, Xu Jia, Jianzhong He, Yongjie Shi, Jianzhuang Liu
Data-driven based approaches, in spite of great success in many tasks, have poor generalization when applied to unseen image domains, and require expensive cost of annotation especially for dense pixel prediction tasks such as semantic segmentation. [Expand]
1.25
1
Wednesday Poster Session
Triple-Cooperative Video Shadow Detection
Zhihao Chen, Liang Wan, Lei Zhu, Jia Shen, Huazhu Fu, Wennan Liu, Jing Qin
Shadow detection in single image has received significant research interests in recent years. [Expand]
Monday Poster Session
Cloud2Curve: Generation and Vectorization of Parametric Sketches
Ayan Das, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song
Analysis of human sketches in deep learning has advanced immensely through the use of waypoint-sequences rather than raster-graphic representations. [Expand]
1.25
1
Tuesday Poster Session
BASAR: Black-Box Attack on Skeletal Action Recognition
Yunfeng Diao, Tianjia Shao, Yong-Liang Yang, Kun Zhou, He Wang
Skeletal motion plays a vital role in human activity recognition as either an independent data source or a complement. [Expand]
1.25
1
Wednesday Poster Session
Adversarial Laser Beam: Effective Physical-World Attack to DNNs in a Blink
Ranjie Duan, Xiaofeng Mao, A. K. Qin, Yuefeng Chen, Shaokai Ye, Yuan He, Yun Yang
Though it is well known that the performance of deep neural networks (DNNs) degrades under certain light conditions, there exists no study on the threats of light beams emitted from some physical source as adversarial attacker on DNNs in a real-world scenario. [Expand]
1.25
1
Friday Poster Session
MIST: Multiple Instance Self-Training Framework for Video Anomaly Detection
Jia-Chang Feng, Fa-Ting Hong, Wei-Shi Zheng
Weakly supervised video anomaly detection (WS-VAD) is to distinguish anomalies from normal events based on discriminative representations. [Expand]
1.25
1
Thursday Poster Session
Incremental Few-Shot Instance Segmentation
Dan Andrei Ganea, Bas Boom, Ronald Poppe
Few-shot instance segmentation methods are promising when labeled training data for novel classes is scarce. [Expand]
Monday Poster Session
WOAD: Weakly Supervised Online Action Detection in Untrimmed Videos
Mingfei Gao, Yingbo Zhou, Ran Xu, Richard Socher, Caiming Xiong
Online action detection in untrimmed videos aims to identify an action as it happens, which makes it very important for real-time applications. [Expand]
1.25
1
Monday Poster Session
Bottom-Up Human Pose Estimation via Disentangled Keypoint Regression
Zigang Geng, Ke Sun, Bin Xiao, Zhaoxiang Zhang, Jingdong Wang
In this paper, we are interested in the bottom-up paradigm of estimating human poses from an image. [Expand]
Thursday Poster Session
Cross Modal Focal Loss for RGBD Face Anti-Spoofing
Anjith George, Sebastien Marcel
Automatic methods for detecting presentation attacks are essential to ensure the reliable use of facial recognition technology. [Expand]
1.25
1
Wednesday Poster Session
Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-Localization in Large Scenes From Body-Mounted Sensors
Vladimir Guzov, Aymen Mir, Torsten Sattler, Gerard Pons-Moll
We introduce (HPS) Human POSEitioning System, a method to recover the full 3D pose of a human registered with a 3D scan of the surrounding environment using wearable sensors. [Expand]
1.25
1
Tuesday Poster Session
Heterogeneous Grid Convolution for Adaptive, Efficient, and Controllable Computation
Ryuhei Hamaguchi, Yasutaka Furukawa, Masaki Onishi, Ken Sakurada
This paper proposes a novel heterogeneous grid convolution that builds a graph-based image representation by exploiting heterogeneity in the image content, enabling adaptive, efficient, and controllable computations in a convolutional architecture. [Expand]
Thursday Poster Session
ChallenCap: Monocular 3D Capture of Challenging Human Performances Using Multi-Modal References
Yannan He, Anqi Pang, Xin Chen, Han Liang, Minye Wu, Yuexin Ma, Lan Xu
Capturing challenging human motions is critical for numerous applications, but it suffers from complex motion patterns and severe self-occlusion under the monocular setting. [Expand]
1.25
1
Thursday Poster Session
Depth Completion With Twin Surface Extrapolation at Occlusion Boundaries
Saif Imran, Xiaoming Liu, Daniel Morris
Depth completion starts from a sparse set of known depth values and estimates the unknown depths for the remaining image pixels. [Expand]
1.25
1
Monday Poster Session
Memory-Guided Unsupervised Image-to-Image Translation
Somi Jeong, Youngjung Kim, Eungbean Lee, Kwanghoon Sohn
We present a novel unsupervised framework for instance-level image-to-image translation. [Expand]
Tuesday Poster Session
Locate Then Segment: A Strong Pipeline for Referring Image Segmentation
Ya Jing, Tao Kong, Wei Wang, Liang Wang, Lei Li, Tieniu Tan
Referring image segmentation aims to segment the objects referred by a natural language expression. [Expand]
1.25
1
Wednesday Poster Session
Hierarchical Lovasz Embeddings for Proposal-Free Panoptic Segmentation
Tommi Kerola, Jie Li, Atsushi Kanehira, Yasunori Kudo, Alexis Vallet, Adrien Gaidon
Panoptic segmentation brings together two separate tasks: instance and semantic segmentation. [Expand]
Thursday Poster Session
IronMask: Modular Architecture for Protecting Deep Face Template
Sunpill Kim, Yunseong Jeong, Jinsu Kim, Jungkon Kim, Hyung Tae Lee, Jae Hong Seo
Convolutional neural networks have made remarkable progress in the face recognition field. [Expand]
1.25
1
Friday Poster Session
Interpretable Social Anchors for Human Trajectory Forecasting in Crowds
Parth Kothari, Brian Sifringer, Alexandre Alahi
Human trajectory forecasting in crowds, at its core, is a sequence prediction problem with specific challenges of capturing inter-sequence dependencies (social interactions) and consequently predicting socially-compliant multimodal distributions. [Expand]
Thursday Poster Session
BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation
Jungbeom Lee, Jihun Yi, Chaehun Shin, Sungroh Yoon
Weakly supervised segmentation methods using bounding box annotations focus on obtaining a pixel-level mask from each box containing an object. [Expand]
1.25
1
Monday Poster Session
Looking Into Your Speech: Learning Cross-Modal Affinity for Audio-Visual Speech Separation
Jiyoung Lee, Soo-Whan Chung, Sunok Kim, Hong-Goo Kang, Kwanghoon Sohn
In this paper, we address the problem of separating individual speech signals from videos using audio-visual neural processing. [Expand]
1.25
1
Monday Poster Session
Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain
Honggu Liu, Xiaodan Li, Wenbo Zhou, Yuefeng Chen, Yuan He, Hui Xue, Weiming Zhang, Nenghai Yu
The remarkable success in face forgery techniques has received considerable attention in computer vision due to security concerns. [Expand]
1.25
1
Monday Poster Session
From Synthetic to Real: Unsupervised Domain Adaptation for Animal Pose Estimation
Chen Li, Gim Hee Lee
Animal pose estimation is an important field that has received increasing attention in the recent years. [Expand]
Monday Poster Session
Progressive Domain Expansion Network for Single Domain Generalization
Lei Li, Ke Gao, Juan Cao, Ziyao Huang, Yepeng Weng, Xiaoyue Mi, Zhengze Yu, Xiaoya Li, Boyang Xia
Single domain generalization is a challenging case of model generalization, where the models are trained on a single domain and tested on other unseen domains. [Expand]
1.25
1
Monday Poster Session
PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation
Xiangtai Li, Hao He, Xia Li, Duo Li, Guangliang Cheng, Jianping Shi, Lubin Weng, Yunhai Tong, Zhouchen Lin
Aerial Image Segmentation is a particular semantic segmentation problem and has several challenging characteristics that general semantic segmentation does not have. [Expand]
1.25
1
Tuesday Poster Session
MUST-GAN: Multi-Level Statistics Transfer for Self-Driven Person Image Generation
Tianxiang Ma, Bo Peng, Wei Wang, Jing Dong
Pose-guided person image generation usually involves using paired source-target images to supervise the training, which significantly increases the data preparation effort and limits the application of the models. [Expand]
Thursday Poster Session
Robust Audio-Visual Instance Discrimination
Pedro Morgado, Ishan Misra, Nuno Vasconcelos
We present a self-supervised learning method to learn audio and video representations. [Expand]
1.25
1
Thursday Poster Session
Focus on Local: Detecting Lane Marker From Bottom Up via Key Point
Zhan Qu, Huan Jin, Yang Zhou, Zhen Yang, Wei Zhang
Mainstream lane marker detection methods are implemented by predicting the overall structure and deriving parametric curves through post-processing. [Expand]
Thursday Poster Session
Probabilistic 3D Human Shape and Pose Estimation From Multiple Unconstrained Images in the Wild
Akash Sengupta, Ignas Budvytis, Roberto Cipolla
This paper addresses the problem of 3D human body shape and pose estimation from RGB images. [Expand]
Friday Poster Session
Manifold Regularized Dynamic Network Pruning
Yehui Tang, Yunhe Wang, Yixing Xu, Yiping Deng, Chao Xu, Dacheng Tao, Chang Xu
Neural network pruning is an essential approach for reducing the computational complexity of deep models so that they can be well deployed on resource-limited devices. [Expand]
1.25
1
Tuesday Poster Session
HLA-Face: Joint High-Low Adaptation for Low Light Face Detection
Wenjing Wang, Wenhan Yang, Jiaying Liu
Face detection in low light scenarios is challenging but vital to many practical applications, e.g., surveillance video, autonomous driving at night. [Expand]
Friday Poster Session
Scene Text Retrieval via Joint Text Detection and Similarity Learning
Hao Wang, Xiang Bai, Mingkun Yang, Shenggao Zhu, Jing Wang, Wenyu Liu
Scene text retrieval aims to localize and search all text instances from an image gallery, which are the same or similar with a given query text. [Expand]
Tuesday Poster Session
Towards More Flexible and Accurate Object Tracking With Natural Language: Algorithms and Benchmark
Xiao Wang, Xiujun Shu, Zhipeng Zhang, Bo Jiang, Yaowei Wang, Yonghong Tian, Feng Wu
Tracking by natural language specification is a new rising research topic that aims at locating the target object in the video sequence based on its language description. [Expand]
1.25
1
Thursday Poster Session
Troubleshooting Blind Image Quality Models in the Wild
Zhihua Wang, Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma
Recently, the group maximum differentiation competition (gMAD) has been used to improve blind image quality assessment (BIQA) models, with the help of full-reference metrics. [Expand]
1.25
1
Friday Poster Session
ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search
Lumin Xu, Yingda Guan, Sheng Jin, Wentao Liu, Chen Qian, Ping Luo, Wanli Ouyang, Xiaogang Wang
Human pose estimation has achieved significant progress in recent years. [Expand]
Friday Poster Session
CondenseNet V2: Sparse Feature Reactivation for Deep Networks
Le Yang, Haojun Jiang, Ruojin Cai, Yulin Wang, Shiji Song, Gao Huang, Qi Tian
Reusing features in deep networks through dense connectivity is an effective way to achieve high computational efficiency. [Expand]
1.25
1
Tuesday Poster Session
FP-NAS: Fast Probabilistic Neural Architecture Search
Zhicheng Yan, Xiaoliang Dai, Peizhao Zhang, Yuandong Tian, Bichen Wu, Matt Feiszli
Differential Neural Architecture Search (NAS) requires all layer choices to be held in memory simultaneously; this limits the size of both search space and final architecture. [Expand]
Thursday Poster Session
DER: Dynamically Expandable Representation for Class Incremental Learning
Shipeng Yan, Jiangwei Xie, Xuming He
We address the problem of class incremental learning, which is a core step towards achieving adaptive vision intelligence. [Expand]
1.25
1
Tuesday Poster Session
Multi-Label Activity Recognition Using Activity-Specific Features and Activity Correlations
Yanyi Zhang, Xinyu Li, Ivan Marsic
Multi-label activity recognition is designed for recognizing multiple activities that are performed simultaneously or sequentially in each video. [Expand]
Thursday Poster Session
Weakly Supervised Video Salient Object Detection
Wangbo Zhao, Jing Zhang, Long Li, Nick Barnes, Nian Liu, Junwei Han
Significant performance improvement has been achieved for fully-supervised video salient object detection with the pixel-wise labeled training datasets, which are time-consuming and expensive to obtain. [Expand]
Friday Poster Session
Simpler Certified Radius Maximization by Propagating Covariances
Xingjian Zhen, Rudrasis Chakraborty, Vikas Singh
One strategy for adversarially training a robust model is to maximize its certified radius -- the neighborhood around a given training sample for which the model's prediction remains unchanged. [Expand]
Wednesday Poster Session
Progressive Temporal Feature Alignment Network for Video Inpainting
Xueyan Zou, Linjie Yang, Ding Liu, Yong Jae Lee
Video inpainting aims to fill spatio-temporal "corrupted" regions with plausible content. [Expand]
Friday Poster Session
What's in the Image? Explorable Decoding of Compressed Images
Yuval Bahat, Tomer Michaeli
The ever-growing amounts of visual contents captured on a daily basis necessitate the use of lossy compression methods in order to save storage space and transmission bandwidth. [Expand]
Tuesday Poster Session
Behavior-Driven Synthesis of Human Dynamics
Andreas Blattmann, Timo Milbich, Michael Dorkenwald, Bjorn Ommer
Generating and representing human behavior are of major importance for various computer vision applications. [Expand]
1.00
1
Thursday Poster Session
On Focal Loss for Class-Posterior Probability Estimation: A Theoretical Perspective
Nontawat Charoenphakdee, Jayakorn Vongkulbhisal, Nuttapong Chairatanakul, Masashi Sugiyama
The focal loss has demonstrated its effectiveness in many real-world applications such as object detection and image classification, but its theoretical understanding has been limited so far. [Expand]
1.00
1
Tuesday Poster Session
Wide-Baseline Relative Camera Pose Estimation With Directional Learning
Kefan Chen, Noah Snavely, Ameesh Makadia
Modern deep learning techniques that regress the relative camera pose between two images have difficulty dealing with challenging scenarios, such as large camera motions resulting in occlusions and significant changes in perspective that leave little overlap between images. [Expand]
1.00
1
Tuesday Poster Session
A Hyperbolic-to-Hyperbolic Graph Convolutional Network
Jindou Dai, Yuwei Wu, Zhi Gao, Yunde Jia
Hyperbolic graph convolutional networks (GCNs) demonstrate powerful representation ability to model graphs with hierarchical structure. [Expand]
1.00
1
Monday Poster Session
Square Root Bundle Adjustment for Large-Scale Reconstruction
Nikolaus Demmel, Christiane Sommer, Daniel Cremers, Vladyslav Usenko
We propose a new formulation for the bundle adjustment problem which relies on nullspace marginalization of landmark variables by QR decomposition. [Expand]
Thursday Poster Session
StickyPillars: Robust and Efficient Feature Matching on Point Clouds Using Graph Neural Networks
Kai Fischer, Martin Simon, Florian Olsner, Stefan Milz, Horst-Michael Gross, Patrick Mader
Robust point cloud registration in real-time is an important prerequisite for many mapping and localization algorithms. [Expand]
1.00
1
Monday Poster Session
Unsupervised Pre-Training for Person Re-Identification
Dengpan Fu, Dongdong Chen, Jianmin Bao, Hao Yang, Lu Yuan, Lei Zhang, Houqiang Li, Dong Chen
In this paper, we present a large scale unlabeled person re-identification (Re-ID) dataset "LUPerson" and make the first attempt of performing unsupervised pre-training for improving the generalization ability of the learned person Re-ID feature representation. [Expand]
Thursday Poster Session
Privacy-Preserving Collaborative Learning With Automatic Transformation Search
Wei Gao, Shangwei Guo, Tianwei Zhang, Han Qiu, Yonggang Wen, Yang Liu
Collaborative learning has gained great popularity due to its benefit of data privacy protection: participants can jointly train a Deep Learning model without sharing their training sets. [Expand]
Monday Poster Session
Cluster, Split, Fuse, and Update: Meta-Learning for Open Compound Domain Adaptive Semantic Segmentation
Rui Gong, Yuhua Chen, Danda Pani Paudel, Yawei Li, Ajad Chhatkuli, Wen Li, Dengxin Dai, Luc Van Gool
Open compound domain adaptation (OCDA) is a domain adaptation setting, where target domain is modeled as a compound of multiple unknown homogeneous domains, which brings the advantage of improved generalization to unseen domains. [Expand]
1.00
1
Wednesday Poster Session
Panoptic Segmentation Forecasting
Colin Graber, Grace Tsai, Michael Firman, Gabriel Brostow, Alexander G. Schwing
Our goal is to forecast the near future given a set of recent observations. [Expand]
Thursday Poster Session
FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation
Yisheng He, Haibin Huang, Haoqiang Fan, Qifeng Chen, Jian Sun
In this work, we present FFB6D, a full flow bidirectional fusion network designed for 6D pose estimation from a single RGBD image. [Expand]
1.00
1
Tuesday Poster Session
Learnable Graph Matching: Incorporating Graph Partitioning With Deep Feature Learning for Multiple Object Tracking
Jiawei He, Zehao Huang, Naiyan Wang, Zhaoxiang Zhang
Data association across frames is at the core of Multiple Object Tracking (MOT) task. [Expand]
1.00
1
Tuesday Poster Session
Multi-Source Domain Adaptation With Collaborative Learning for Semantic Segmentation
Jianzhong He, Xu Jia, Shuaijun Chen, Jianzhuang Liu
Multi-source unsupervised domain adaptation (MSDA) aims at adapting models trained on multiple labeled source domains to an unlabeled target domain. [Expand]
1.00
1
Wednesday Poster Session
DSRNA: Differentiable Search of Robust Neural Architectures
Ramtin Hosseini, Xingyi Yang, Pengtao Xie
In deep learning applications, the architectures of deep neural networks are crucial in achieving high accuracy. [Expand]
1.00
1
Tuesday Poster Session
Detecting Human-Object Interaction via Fabricated Compositional Learning
Zhi Hou, Baosheng Yu, Yu Qiao, Xiaojiang Peng, Dacheng Tao
Human-Object Interaction (HOI) detection, inferring the relationships between human and objects from images/videos, is a fundamental task for high-level scene understanding. [Expand]
1.00
1
Thursday Poster Session
DI-Fusion: Online Implicit 3D Reconstruction With Deep Priors
Jiahui Huang, Shi-Sheng Huang, Haoxuan Song, Shi-Min Hu
Previous online 3D dense reconstruction methods struggle to achieve the balance between memory storage and surface quality, largely due to the usage of stagnant underlying geometry representation, such as TSDF (truncated signed distance functions) or surfels, without any knowledge of the scene priors. [Expand]
1.00
1
Wednesday Poster Session
Self-Supervised Video Representation Learning by Context and Motion Decoupling
Lianghua Huang, Yu Liu, Bin Wang, Pan Pan, Yinghui Xu, Rong Jin
A key challenge in self-supervised video representation learning is how to effectively capture motion information besides context bias. [Expand]
1.00
1
Thursday Poster Session
Learning Position and Target Consistency for Memory-Based Video Object Segmentation
Li Hu, Peng Zhang, Bang Zhang, Pan Pan, Yinghui Xu, Rong Jin
This paper studies the problem of semi-supervised video object segmentation (VOS). [Expand]
1.00
1
Tuesday Poster Session
EffiScene: Efficient Per-Pixel Rigidity Inference for Unsupervised Joint Learning of Optical Flow, Depth, Camera Pose and Motion Segmentation
Yang Jiao, Trac D. Tran, Guangming Shi
This paper addresses the challenging unsupervised scene flow estimation problem by jointly learning four low-level vision sub-tasks: optical flow F, stereo-depth D, camera pose P and motion segmentation S. [Expand]
Tuesday Poster Session
Embedding Transfer With Label Relaxation for Improved Metric Learning
Sungyeon Kim, Dongwon Kim, Minsu Cho, Suha Kwak
This paper presents a novel method for embedding transfer, a task of transferring knowledge of a learned embedding model to another. [Expand]
Tuesday Poster Session
Improving Accuracy of Binary Neural Networks Using Unbalanced Activation Distribution
Hyungjun Kim, Jihoon Park, Changhun Lee, Jae-Joon Kim
Binarization of neural network models is considered as one of the promising methods to deploy deep neural network models on resource-constrained environments such as mobile devices. [Expand]
1.00
1
Wednesday Poster Session
Single-View Robot Pose and Joint Angle Estimation via Render & Compare
Yann Labbe, Justin Carpentier, Mathieu Aubry, Josef Sivic
We introduce RoboPose, a method to estimate the joint angles and the 6D camera-to-robot pose of a known articulated robot from a single RGB image. [Expand]
Monday Poster Session
Semi-Supervised Semantic Segmentation With Directional Context-Aware Consistency
Xin Lai, Zhuotao Tian, Li Jiang, Shu Liu, Hengshuang Zhao, Liwei Wang, Jiaya Jia
Semantic segmentation has made tremendous progress in recent years. [Expand]
1.00
1
Monday Poster Session
Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation
Jungbeom Lee, Eunji Kim, Sungroh Yoon
Weakly supervised semantic segmentation produces a pixel-level localization from class labels; but a classifier trained on such labels is likely to restrict its focus to a small discriminative region of the target object. [Expand]
1.00
1
Tuesday Poster Session
Regularization Strategy for Point Cloud via Rigidly Mixed Sample
Dogyoon Lee, Jaeha Lee, Junhyeop Lee, Hyeongmin Lee, Minhyeok Lee, Sungmin Woo, Sangyoun Lee
Data augmentation is an effective regularization strategy to alleviate overfitting, which is an inherent drawback of deep neural networks. [Expand]
1.00
1
Friday Poster Session
MOOD: Multi-Level Out-of-Distribution Detection
Ziqian Lin, Sreya Dutta Roy, Yixuan Li
Out-of-distribution (OOD) detection is essential to prevent anomalous inputs from causing a model to fail during deployment. [Expand]
1.00
1
Thursday Poster Session
Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting
Lingbo Liu, Jiaqi Chen, Hefeng Wu, Guanbin Li, Chenglong Li, Liang Lin
Crowd counting is a fundamental yet challenging task, which desires rich information to generate pixel-wise crowd density maps. [Expand]
1.00
1
Tuesday Poster Session
Goal-Oriented Gaze Estimation for Zero-Shot Learning
Yang Liu, Lei Zhou, Xiao Bai, Yifei Huang, Lin Gu, Jun Zhou, Tatsuya Harada
Zero-shot learning (ZSL) aims to recognize novel classes by transferring semantic knowledge from seen classes to unseen classes. [Expand]
1.00
1
Tuesday Poster Session
Inception Convolution With Efficient Dilation Search
Jie Liu, Chuming Li, Feng Liang, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang, Dong Xu
As a variant of standard convolution, a dilated convolution can control effective receptive fields and handle large scale variance of objects without introducing additional computational costs. [Expand]
1.00
1
Thursday Poster Session
Action Shuffle Alternating Learning for Unsupervised Action Segmentation
Jun Li, Sinisa Todorovic
This paper addresses unsupervised action segmentation. [Expand]
1.00
1
Thursday Poster Session
HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation
Jiefeng Li, Chao Xu, Zhicun Chen, Siyuan Bian, Lixin Yang, Cewu Lu
Model-based 3D pose and shape estimation methods reconstruct a full 3D mesh for the human body by estimating several parameters. [Expand]
1.00
1
Tuesday Poster Session
OpenRooms: An Open Framework for Photorealistic Indoor Scene Datasets
Zhengqin Li, Ting-Wei Yu, Shen Sang, Sarah Wang, Meng Song, Yuhan Liu, Yu-Ying Yeh, Rui Zhu, Nitesh Gundavarapu, Jia Shi, Sai Bi, Hong-Xing Yu, Zexiang Xu, Kalyan Sunkavalli, Milos Hasan, Ravi Ramamoorthi, Manmohan Chandraker
We propose a novel framework for creating large-scale photorealistic datasets of indoor scenes, with ground truth geometry, material, lighting and semantics. [Expand]
1.00
1
Wednesday Poster Session
POSEFusion: Pose-Guided Selective Fusion for Single-View Human Volumetric Capture
Zhe Li, Tao Yu, Zerong Zheng, Kaiwen Guo, Yebin Liu
We propose POse-guided SElective Fusion (POSEFusion), a single-view human volumetric capture method that leverages tracking-based methods and tracking-free inference to achieve high-fidelity and dynamic 3D reconstruction. [Expand]
1.00
1
Thursday Poster Session
Virtual Fully-Connected Layer: Training a Large-Scale Face Recognition Dataset With Limited Computational Resources
Pengyu Li, Biao Wang, Lei Zhang
Recently, deep face recognition has achieved significant progress because of Convolutional Neural Networks (CNNs) and large-scale datasets. [Expand]
1.00
1
Thursday Poster Session
M3DSSD: Monocular 3D Single Stage Object Detector
Shujie Luo, Hang Dai, Ling Shao, Yong Ding
In this paper, we propose a Monocular 3D Single Stage object Detector (M3DSSD) with feature alignment and asymmetric non-local attention. [Expand]
1.00
1
Tuesday Poster Session
Bridging the Visual Gap: Wide-Range Image Blending
Chia-Ni Lu, Ya-Chu Chang, Wei-Chen Chiu
In this paper we propose a new problem scenario in image processing, wide-range image blending, which aims to smoothly merge two different input photos into a panorama by generating novel image content for the intermediate region between them. [Expand]
1.00
1
Monday Poster Session
IQDet: Instance-Wise Quality Distribution Sampling for Object Detection
Yuchen Ma, Songtao Liu, Zeming Li, Jian Sun
We propose a dense object detector with an instance-wise sampling strategy, named IQDet. [Expand]
1.00
1
Monday Poster Session
Depth-Aware Mirror Segmentation
Haiyang Mei, Bo Dong, Wen Dong, Pieter Peers, Xin Yang, Qiang Zhang, Xiaopeng Wei
We present a novel mirror segmentation method that leverages depth estimates from ToF-based cameras as an additional cue to disambiguate challenging cases where the contrast or relation in RGB colors between the mirror reflection and the surrounding scene is subtle. [Expand]
1.00
1
Tuesday Poster Session
GATSBI: Generative Agent-Centric Spatio-Temporal Object Interaction
Cheol-Hui Min, Jinseok Bae, Junho Lee, Young Min Kim
We present GATSBI, a generative model that can transform a sequence of raw observations into a structured latent representation that fully captures the spatio-temporal context of the agent's actions. [Expand]
Tuesday Poster Session
Background Splitting: Finding Rare Classes in a Sea of Background
Ravi Teja Mullapudi, Fait Poms, William R. Mark, Deva Ramanan, Kayvon Fatahalian
We focus on the problem of training deep image classification models for a small number of extremely rare categories. [Expand]
Wednesday Poster Session
LayoutGMN: Neural Graph Matching for Structural Layout Similarity
Akshay Gadi Patil, Manyi Li, Matthew Fisher, Manolis Savva, Hao Zhang
We present a deep neural network to predict structural similarity between 2D layouts by leveraging Graph Matching Networks (GMN). [Expand]
1.00
1
Wednesday Poster Session
Robust Multimodal Vehicle Detection in Foggy Weather Using Complementary Lidar and Radar Signals
Kun Qian, Shilin Zhu, Xinyu Zhang, Li Erran Li
Vehicle detection with visual sensors like lidar and camera is one of the critical functions enabling autonomous driving. [Expand]
1.00
1
Monday Poster Session
Every Annotation Counts: Multi-Label Deep Supervision for Medical Image Segmentation
Simon Reiss, Constantin Seibold, Alexander Freytag, Erik Rodner, Rainer Stiefelhagen
Pixel-wise segmentation is one of the most data and annotation hungry tasks in our field. [Expand]
1.00
1
Wednesday Poster Session
DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation
Xing Shen, Jirui Yang, Chunbo Wei, Bing Deng, Jianqiang Huang, Xian-Sheng Hua, Xiaoliang Cheng, Kewei Liang
Binary grid mask representation is broadly used in instance segmentation. [Expand]
1.00
1
Wednesday Poster Session
StablePose: Learning 6D Object Poses From Geometrically Stable Patches
Yifei Shi, Junwen Huang, Xin Xu, Yifan Zhang, Kai Xu
We introduce the concept of geometric stability to the problem of 6D object pose estimation and propose to learn pose inference based on geometrically stable patches extracted from observed 3D point clouds. [Expand]
Thursday Poster Session
BCNet: Searching for Network Width With Bilaterally Coupled Network
Xiu Su, Shan You, Fei Wang, Chen Qian, Changshui Zhang, Chang Xu
Searching for a more compact network width recently serves as an effective way of channel pruning for the deployment of convolutional neural networks (CNNs) under hardware constraints. [Expand]
1.00
1
Monday Poster Session
Prioritized Architecture Sampling With Monto-Carlo Tree Search
Xiu Su, Tao Huang, Yanxi Li, Shan You, Fei Wang, Chen Qian, Changshui Zhang, Chang Xu
One-shot neural architecture search (NAS) methods significantly reduce the search cost by considering the whole search space as one network, which only needs to be trained once. [Expand]
1.00
1
Wednesday Poster Session
Densely Connected Multi-Dilated Convolutional Networks for Dense Prediction Tasks
Naoya Takahashi, Yuki Mitsufuji
Tasks that involve high-resolution dense prediction require a modeling of both local and global patterns in a large input field. [Expand]
1.00
1
Monday Poster Session
EnD: Entangling and Disentangling Deep Representations for Bias Correction
Enzo Tartaglione, Carlo Alberto Barbano, Marco Grangetto
Artificial neural networks perform state-of-the-art in an ever-growing number of tasks, and nowadays they are used to solve an incredibly large variety of tasks. [Expand]
1.00
1
Thursday Poster Session
MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection
Vibashan VS, Vikram Gupta, Poojan Oza, Vishwanath A. Sindagi, Vishal M. Patel
Existing approaches for unsupervised domain adaptive object detection perform feature alignment via adversarial training. [Expand]
1.00
1
Tuesday Poster Session
Combinatorial Learning of Graph Edit Distance via Dynamic Embedding
Runzhong Wang, Tianqi Zhang, Tianshu Yu, Junchi Yan, Xiaokang Yang
Graph Edit Distance (GED) is a popular similarity measurement for pairwise graphs and it also refers to the recovery of the edit path from the source graph to the target graph. [Expand]
1.00
1
Tuesday Poster Session
Data-Uncertainty Guided Multi-Phase Learning for Semi-Supervised Object Detection
Zhenyu Wang, Yali Li, Ye Guo, Lu Fang, Shengjin Wang
In this paper, we delve into semi-supervised object detection where unlabeled images are leveraged to break through the upper bound of fully-supervised object detection models. [Expand]
1.00
1
Tuesday Poster Session
Convolutional Neural Network Pruning With Structural Redundancy Reduction
Zi Wang, Chengcheng Li, Xiangyang Wang
Convolutional neural network (CNN) pruning has become one of the most successful network compression approaches in recent years. [Expand]
1.00
1
Thursday Poster Session
Enhancing the Transferability of Adversarial Attacks Through Variance Tuning
Xiaosen Wang, Kun He
Deep neural networks are vulnerable to adversarial examples that mislead the models with imperceptible perturbations. [Expand]
1.00
1
Monday Poster Session
Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs
Hui-Po Wang, Ning Yu, Mario Fritz
While Generative Adversarial Networks (GANs) show increasing performance and the level of realism is becoming indistinguishable from natural images, this also comes with high demands on data and computation. [Expand]
1.00
1
Wednesday Poster Session
Self-Supervised Learning for Semi-Supervised Temporal Action Proposal
Xiang Wang, Shiwei Zhang, Zhiwu Qing, Yuanjie Shao, Changxin Gao, Nong Sang
Self-supervised learning presents a remarkable performance to utilize unlabeled data for various video tasks. [Expand]
1.00
1
Monday Poster Session
NeuralFusion: Online Depth Fusion in Latent Space
Silvan Weder, Johannes L. Schonberger, Marc Pollefeys, Martin R. Oswald
We present a novel online depth map fusion approach that learns depth map aggregation in a latent feature space. [Expand]
1.00
1
Tuesday Poster Session
Exploring Heterogeneous Clues for Weakly-Supervised Audio-Visual Video Parsing
Yu Wu, Yi Yang
We investigate the weakly-supervised audio-visual video parsing task, which aims to parse a video into temporal event segments and predict the audible or visible event categories. [Expand]
1.00
1
Monday Poster Session
MotionRNN: A Flexible Model for Video Prediction With Spacetime-Varying Motions
Haixu Wu, Zhiyu Yao, Jianmin Wang, Mingsheng Long
This paper tackles video prediction from a new dimension of predicting spacetime-varying motions that are incessantly changing across both space and time. [Expand]
1.00
1
Thursday Poster Session
Intra-Inter Camera Similarity for Unsupervised Person Re-Identification
Shiyu Xuan, Shiliang Zhang
Most unsupervised person Re-Identification (Re-ID) works produce pseudo-labels by measuring the feature similarity without considering the distribution discrepancy among cameras, leading to degraded accuracy in label computation across cameras. [Expand]
1.00
1
Thursday Poster Session
Inferring CAD Modeling Sequences Using Zone Graphs
Xianghao Xu, Wenzhe Peng, Chin-Yi Cheng, Karl D.D. Willis, Daniel Ritchie
In computer-aided design (CAD), the ability to "reverse engineer" the modeling steps used to create 3D shapes is a long-sought-after goal. [Expand]
1.00
1
Tuesday Poster Session
DSC-PoseNet: Learning 6DoF Object Pose Estimation via Dual-Scale Consistency
Zongxin Yang, Xin Yu, Yi Yang
Compared to 2D object bounding-box labeling, it is very difficult for humans to annotate 3D object poses, especially when depth images of scenes are unavailable. [Expand]
1.00
1
Tuesday Poster Session
Joint Noise-Tolerant Learning and Meta Camera Shift Adaptation for Unsupervised Person Re-Identification
Fengxiang Yang, Zhun Zhong, Zhiming Luo, Yuanzheng Cai, Yaojin Lin, Shaozi Li, Nicu Sebe
This paper considers the problem of unsupervised person re-identification (re-ID), which aims to learn discriminative models with unlabeled data. [Expand]
1.00
1
Tuesday Poster Session
ST3D: Self-Training for Unsupervised Domain Adaptation on 3D Object Detection
Jihan Yang, Shaoshuai Shi, Zhe Wang, Hongsheng Li, Xiaojuan Qi
We present a new domain adaptive self-training pipeline, named ST3D, for unsupervised domain adaptation on 3D object detection from point clouds. [Expand]
1.00
1
Wednesday Poster Session
Slimmable Compressive Autoencoders for Practical Neural Image Compression
Fei Yang, Luis Herranz, Yongmei Cheng, Mikhail G. Mozerov
Neural image compression leverages deep neural networks to outperform traditional image codecs in rate-distortion performance. [Expand]
1.00
1
Tuesday Poster Session
Prototypical Cross-Domain Self-Supervised Learning for Few-Shot Unsupervised Domain Adaptation
Xiangyu Yue, Zangwei Zheng, Shanghang Zhang, Yang Gao, Trevor Darrell, Kurt Keutzer, Alberto Sangiovanni Vincentelli
Unsupervised Domain Adaptation (UDA) transfers predictive models from a fully-labeled source domain to an unlabeled target domain. [Expand]
1.00
1
Thursday Poster Session
LAFEAT: Piercing Through Adversarial Defenses With Latent Features
Yunrui Yu, Xitong Gao, Cheng-Zhong Xu
Deep convolutional neural networks are susceptible to adversarial attacks. [Expand]
1.00
1
Tuesday Poster Session
CorrNet3D: Unsupervised End-to-End Learning of Dense Correspondence for 3D Point Clouds
Yiming Zeng, Yue Qian, Zhiyu Zhu, Junhui Hou, Hui Yuan, Ying He
Motivated by the intuition that one can transform two aligned point clouds to each other more easily and meaningfully than a misaligned pair, we propose CorrNet3D, the first unsupervised and end-to-end deep learning-based framework, to drive the learning of dense correspondence between 3D shapes by means of deformation-like reconstruction to overcome the need for annotated data. [Expand]
Tuesday Poster Session
Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution
Chi Zhang, Baoxiong Jia, Song-Chun Zhu, Yixin Zhu
Spatial-temporal reasoning is a challenging task in Artificial Intelligence (AI) due to its demanding but unique nature: a theoretic requirement on representing and reasoning based on spatial-temporal knowledge in mind, and an applied requirement on a high-level cognitive system capable of navigating and acting in space and time. [Expand]
1.00
1
Wednesday Poster Session
ACRE: Abstract Causal REasoning Beyond Covariation
Chi Zhang, Baoxiong Jia, Mark Edmonds, Song-Chun Zhu, Yixin Zhu
Causal induction, i.e., identifying unobservable mechanisms that lead to the observable relations among variables, has played a pivotal role in modern scientific discovery, especially in scenarios with only sparse and limited data. [Expand]
1.00
1
Wednesday Poster Session
Body Meshes as Points
Jianfeng Zhang, Dongdong Yu, Jun Hao Liew, Xuecheng Nie, Jiashi Feng
We consider the challenging multi-person 3D body mesh estimation task in this work. [Expand]
1.00
1
Monday Poster Session
EDNet: Efficient Disparity Estimation With Cost Volume Combination and Attention-Based Spatial Residual
Songyan Zhang, Zhicheng Wang, Qiang Wang, Jinshuo Zhang, Gang Wei, Xiaowen Chu
Existing state-of-the-art disparity estimation works mostly leverage the 4D concatenation volume and construct a very deep 3D convolution neural network (CNN) for disparity regression, which is inefficient due to the high memory consumption and slow inference speed. [Expand]
Tuesday Poster Session
Exploiting Edge-Oriented Reasoning for 3D Point-Based Scene Graph Analysis
Chaoyi Zhang, Jianhui Yu, Yang Song, Weidong Cai
Scene understanding is a critical problem in computer vision. [Expand]
1.00
1
Wednesday Poster Session
Neural Architecture Search With Random Labels
Xuanyang Zhang, Pengfei Hou, Xiangyu Zhang, Jian Sun
In this paper, we investigate a new variant of neural architecture search (NAS) paradigm: searching with random labels (RLNAS). [Expand]
1.00
1
Wednesday Poster Session
Stochastic Whitening Batch Normalization
Shengdong Zhang, Ehsan Nezhadarya, Homa Fashandi, Jiayi Liu, Darin Graham, Mohak Shah
Batch Normalization (BN) is a popular technique for training Deep Neural Networks (DNNs). [Expand]
Wednesday Poster Session
UnrealPerson: An Adaptive Pipeline Towards Costless Person Re-Identification
Tianyu Zhang, Lingxi Xie, Longhui Wei, Zijie Zhuang, Yongfei Zhang, Bo Li, Qi Tian
The main difficulty of person re-identification (ReID) lies in collecting annotated data and transferring the model across different domains. [Expand]
1.00
1
Thursday Poster Session
Sign-Agnostic Implicit Learning of Surface Self-Similarities for Shape Modeling and Reconstruction From Raw Point Clouds
Wenbin Zhao, Jiabao Lei, Yuxin Wen, Jianguo Zhang, Kui Jia
Shape modeling and reconstruction from raw point clouds of objects stand as a fundamental challenge in vision and graphics research. [Expand]
1.00
1
Wednesday Poster Session
SE-SSD: Self-Ensembling Single-Stage Object Detector From Point Cloud
Wu Zheng, Weiliang Tang, Li Jiang, Chi-Wing Fu
We present Self-Ensembling Single-Stage object Detector (SE-SSD) for accurate and efficient 3D object detection in outdoor point clouds. [Expand]
Thursday Poster Session
Cross-MPI: Cross-Scale Stereo for Image Super-Resolution Using Multiplane Images
Yuemei Zhou, Gaochang Wu, Ying Fu, Kun Li, Yebin Liu
Various combinations of cameras enrich computational photography, among which reference-based superresolution (RefSR) plays a critical role in multiscale imaging systems. [Expand]
1.00
1
Thursday Poster Session
Panoptic-PolarNet: Proposal-Free LiDAR Point Cloud Panoptic Segmentation
Zixiang Zhou, Yang Zhang, Hassan Foroosh
Panoptic segmentation presents a new challenge in exploiting the merits of both detection and segmentation, with the aim of unifying instance segmentation and semantic segmentation in a single framework. [Expand]
1.00
1
Thursday Poster Session
Fourier Contour Embedding for Arbitrary-Shaped Text Detection
Yiqin Zhu, Jianyong Chen, Lingyu Liang, Zhanghui Kuang, Lianwen Jin, Wayne Zhang
One of the main challenges for arbitrary-shaped text detection is to design a good text instance representation that allows networks to learn diverse text geometry variances. [Expand]
1.00
1
Tuesday Poster Session
Complementary Relation Contrastive Distillation
Jinguo Zhu, Shixiang Tang, Dapeng Chen, Shijie Yu, Yakun Liu, Mingzhe Rong, Aijun Yang, Xiaohua Wang
Knowledge distillation aims to transfer representation ability from a teacher model to a student model. [Expand]
1.00
1
Wednesday Poster Session
Where and What? Examining Interpretable Disentangled Representations
Xinqi Zhu, Chang Xu, Dacheng Tao
Capturing interpretable variations has long been one of the goals in disentanglement learning. [Expand]
1.00
1
Tuesday Poster Session
Denoise and Contrast for Category Agnostic Shape Completion
Antonio Alliegro, Diego Valsesia, Giulia Fracastoro, Enrico Magli, Tatiana Tommasi
In this paper, we present a deep learning model that exploits the power of self-supervision to perform 3D point cloud completion, estimating the missing part and a context region around it. [Expand]
Tuesday Poster Session
Dogfight: Detecting Drones From Drones Videos
Muhammad Waseem Ashraf, Waqas Sultani, Mubarak Shah
As airborne vehicles are becoming more autonomous and ubiquitous, it has become vital to develop the capability to detect the objects in their surroundings. [Expand]
Tuesday Poster Session
What if We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels
Jeonghun Baek, Yusuke Matsui, Kiyoharu Aizawa
Scene text recognition (STR) task has a common practice: All state-of-the-art STR models are trained on large synthetic data. [Expand]
Tuesday Poster Session
Multi-View 3D Reconstruction of a Texture-Less Smooth Surface of Unknown Generic Reflectance
Ziang Cheng, Hongdong Li, Yuta Asano, Yinqiang Zheng, Imari Sato
Recovering the 3D geometry of a purely texture-less object with generally unknown surface reflectance (e.g. [Expand]
Friday Poster Session
Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration
Xingyu Chen, Yufeng Liu, Chongyang Ma, Jianlong Chang, Huayan Wang, Tian Chen, Xiaoyan Guo, Pengfei Wan, Wen Zheng
Recent years have witnessed significant progress in 3D hand mesh recovery. [Expand]
Thursday Poster Session
Semi-Supervised Semantic Segmentation With Cross Pseudo Supervision
Xiaokang Chen, Yuhui Yuan, Gang Zeng, Jingdong Wang
In this paper, we study the semi-supervised semantic segmentation problem via exploring both labeled data and extra unlabeled data. [Expand]
Monday Poster Session
PiCIE: Unsupervised Semantic Segmentation Using Invariance and Equivariance in Clustering
Jang Hyun Cho, Utkarsh Mall, Kavita Bala, Bharath Hariharan
We present a new framework for semantic segmentation without annotations via clustering. [Expand]
Friday Poster Session
Cross-Domain Gradient Discrepancy Minimization for Unsupervised Domain Adaptation
Zhekai Du, Jingjing Li, Hongzu Su, Lei Zhu, Ke Lu
Unsupervised Domain Adaptation (UDA) aims to generalize the knowledge learned from a well-labeled source domain to an unlabeled target domain. [Expand]
Tuesday Poster Session
Siamese Natural Language Tracker: Tracking by Natural Language Descriptions With Siamese Trackers
Qi Feng, Vitaly Ablavsky, Qinxun Bai, Stan Sclaroff
We propose a novel Siamese Natural Language Tracker (SNLT), which brings the advancements in visual tracking to the tracking by natural language (NL) specification task. [Expand]
Tuesday Poster Session
OTA: Optimal Transport Assignment for Object Detection
Zheng Ge, Songtao Liu, Zeming Li, Osamu Yoshie, Jian Sun
Recent advances in label assignment in object detection mainly seek to independently define positive/negative training samples for each ground-truth (gt) object. [Expand]
Monday Poster Session
Bidirectional Projection Network for Cross Dimension Scene Understanding
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong
2D image representations are in regular grids and can be processed efficiently, whereas 3D point clouds are unordered and scattered in 3D space. [Expand]
Thursday Poster Session
Few-Shot Open-Set Recognition by Transformation Consistency
Minki Jeong, Seokeon Choi, Changick Kim
In this paper, we attack a few-shot open-set recognition (FSOSR) problem, which is a combination of few-shot learning (FSL) and open-set recognition (OSR). [Expand]
Thursday Poster Session
Scalability vs. Utility: Do We Have To Sacrifice One for the Other in Data Importance Quantification?
Ruoxi Jia, Fan Wu, Xuehui Sun, Jiacen Xu, David Dao, Bhavya Kailkhura, Ce Zhang, Bo Li, Dawn Song
Quantifying the importance of each training point to a learning task is a fundamental problem in machine learning and the estimated importance scores have been leveraged to guide a range of data workflows such as data summarization and domain adaption. [Expand]
Wednesday Poster Session
Fast Bayesian Uncertainty Estimation and Reduction of Batch Normalized Single Image Super-Resolution Network
Aupendu Kar, Prabir Kumar Biswas
Convolutional neural network (CNN) has achieved unprecedented success in image super-resolution tasks in recent years. [Expand]
Tuesday Poster Session
Deep Implicit Moving Least-Squares Functions for 3D Reconstruction
Shi-Lin Liu, Hao-Xiang Guo, Hao Pan, Peng-Shuai Wang, Xin Tong, Yang Liu
Point set is a flexible and lightweight representation widely used for 3D deep learning. [Expand]
Monday Poster Session
PD-GAN: Probabilistic Diverse GAN for Image Inpainting
Hongyu Liu, Ziyu Wan, Wei Huang, Yibing Song, Xintong Han, Jing Liao
We propose PD-GAN, a probabilistic diverse GAN for image inpainting. [Expand]
Wednesday Poster Session
Relation-aware Instance Refinement for Weakly Supervised Visual Grounding
Yongfei Liu, Bo Wan, Lin Ma, Xuming He
Visual grounding, which aims to build a correspondence between visual objects and their language entities, plays a key role in cross-modal scene understanding. [Expand]
Tuesday Poster Session
Instance Level Affinity-Based Transfer for Unsupervised Domain Adaptation
Astuti Sharma, Tarun Kalluri, Manmohan Chandraker
Domain adaptation deals with training models using large scale labeled data from a specific source domain and then adapting the knowledge to certain target domains that have few or no labels. [Expand]
Tuesday Poster Session
Iterative Shrinking for Referring Expression Grounding Using Deep Reinforcement Learning
Mingjie Sun, Jimin Xiao, Eng Gee Lim
In this paper, we are tackling the proposal-free referring expression grounding task, aiming at localizing the target object according to a query sentence, without relying on off-the-shelf object proposals. [Expand]
Thursday Poster Session
Delving into Data: Effectively Substitute Training for Black-box Attack
Wenxuan Wang, Bangjie Yin, Taiping Yao, Li Zhang, Yanwei Fu, Shouhong Ding, Jilin Li, Feiyue Huang, Xiangyang Xue
Deep models have shown their vulnerability when processing adversarial samples. [Expand]
Tuesday Poster Session
Exploring Sparsity in Image Super-Resolution for Efficient Inference
Longguang Wang, Xiaoyu Dong, Yingqian Wang, Xinyi Ying, Zaiping Lin, Wei An, Yulan Guo
Current CNN-based super-resolution (SR) methods process all locations equally with computational resources being uniformly assigned in space. [Expand]
Tuesday Poster Session
From Rain Generation to Rain Removal
Hong Wang, Zongsheng Yue, Qi Xie, Qian Zhao, Yefeng Zheng, Deyu Meng
For the single image rain removal (SIRR) task, the performance of deep learning (DL)-based methods is mainly affected by the designed deraining models and training datasets. [Expand]
Thursday Poster Session
Cycle4Completion: Unpaired Point Cloud Completion Using Cycle Transformation With Missing Region Coding
Xin Wen, Zhizhong Han, Yan-Pei Cao, Pengfei Wan, Wen Zheng, Yu-Shen Liu
In this paper, we present a novel unpaired point cloud completion network, named Cycle4Completion, to infer the complete geometries from a partial 3D object. [Expand]
Thursday Poster Session
Bilateral Grid Learning for Stereo Matching Networks
Bin Xu, Yuhua Xu, Xiaoli Yang, Wei Jia, Yulan Guo
Real-time performance of stereo matching networks is important for many applications, such as automatic driving, robot navigation and augmented reality (AR). [Expand]
Thursday Poster Session
Diversifying Sample Generation for Accurate Data-Free Quantization
Xiangguo Zhang, Haotong Qin, Yifu Ding, Ruihao Gong, Qinghua Yan, Renshuai Tao, Yuhang Li, Fengwei Yu, Xianglong Liu
Quantization has emerged as one of the most prevalent approaches to compress and accelerate neural networks. [Expand]
Friday Poster Session
Fostering Generalization in Single-View 3D Reconstruction by Learning a Hierarchy of Local and Global Shape Priors
Jan Bechtold, Maxim Tatarchenko, Volker Fischer, Thomas Brox
Single-view 3D object reconstruction has seen much progress, yet methods still struggle generalizing to novel shapes unseen during training. [Expand]
Friday Poster Session
Towards Part-Based Understanding of RGB-D Scans
Alexey Bokhovkin, Vladislav Ishimtsev, Emil Bogomolov, Denis Zorin, Alexey Artemov, Evgeny Burnaev, Angela Dai
Recent advances in 3D semantic scene understanding have shown impressive progress in 3D instance segmentation, enabling object-level reasoning about 3D scenes; however, a finer-grained understanding is required to enable interactions with objects and their functional understanding. [Expand]
Wednesday Poster Session
Fine-Grained Angular Contrastive Learning With Coarse Labels
Guy Bukchin, Eli Schwartz, Kate Saenko, Ori Shahar, Rogerio Feris, Raja Giryes, Leonid Karlinsky
Few-shot learning methods offer pre-training techniques optimized for easier later adaptation of the model to new classes (unseen during training) using one or a few examples. [Expand]
Wednesday Poster Session
Semantic Scene Completion via Integrating Instances and Scene In-the-Loop
Yingjie Cai, Xuesong Chen, Chao Zhang, Kwan-Yee Lin, Xiaogang Wang, Hongsheng Li
Semantic Scene Completion aims at reconstructing a complete 3D scene with precise voxel-wise semantics from a single-view depth or RGBD image. [Expand]
Monday Poster Session
Globally Optimal Relative Pose Estimation With Gravity Prior
Yaqing Ding, Daniel Barath, Jian Yang, Hui Kong, Zuzana Kukelova
Smartphones, tablets and camera systems used, e.g., in cars and UAVs, are typically equipped with IMUs (inertial measurement units) that can measure the gravity vector accurately. [Expand]
Monday Poster Session
Explaining Classifiers Using Adversarial Perturbations on the Perceptual Ball
Andrew Elliott, Stephen Law, Chris Russell
We present a simple regularization of adversarial perturbations based upon the perceptual loss. [Expand]
Wednesday Poster Session
Learning Goals From Failure
Dave Epstein, Carl Vondrick
We introduce a framework that predicts the goals behind observable human action in video. [Expand]
Wednesday Poster Session
Fair Feature Distillation for Visual Recognition
Sangwon Jung, Donggyu Lee, Taeeon Park, Taesup Moon
Fairness is becoming an increasingly crucial issue for computer vision, especially in the human-related decision systems. [Expand]
Thursday Poster Session
How To Exploit the Transferability of Learned Image Compression to Conventional Codecs
Jan P. Klopp, Keng-Chi Liu, Liang-Gee Chen, Shao-Yi Chien
Lossy image compression is often limited by the simplicity of the chosen loss measure. [Expand]
Friday Poster Session
Restore From Restored: Video Restoration With Pseudo Clean Video
Seunghwan Lee, Donghyeon Cho, Jiwon Kim, Tae Hyun Kim
In this study, we propose a self-supervised video denoising method called "restore-from-restored." This method fine-tunes a pre-trained network by using a pseudo clean video during the test phase. [Expand]
Tuesday Poster Session
Railroad Is Not a Train: Saliency As Pseudo-Pixel Supervision for Weakly Supervised Semantic Segmentation
Seungho Lee, Minhyun Lee, Jongwuk Lee, Hyunjung Shim
Existing studies in weakly-supervised semantic segmentation (WSSS) using image-level weak supervision have several limitations: sparse object coverage, inaccurate object boundaries, and co-occurring pixels from non-target objects. [Expand]
Tuesday Poster Session
DeepMetaHandles: Learning Deformation Meta-Handles of 3D Meshes With Biharmonic Coordinates
Minghua Liu, Minhyuk Sung, Radomir Mech, Hao Su
We propose DeepMetaHandles, a 3D conditional generative model based on mesh deformation. [Expand]
Monday Poster Session
Anchor-Constrained Viterbi for Set-Supervised Action Segmentation
Jun Li, Sinisa Todorovic
This paper is about action segmentation under weak supervision in training, where the ground truth provides only a set of actions present, but neither their temporal ordering nor when they occur in a training video. [Expand]
Wednesday Poster Session
Continuous Face Aging via Self-Estimated Residual Age Embedding
Zeqi Li, Ruowei Jiang, Parham Aarabi
Face synthesis, including face aging, in particular, has been one of the major topics that witnessed a substantial improvement in image fidelity by using generative adversarial networks (GANs). [Expand]
Thursday Poster Session
HCRF-Flow: Scene Flow From Point Clouds With Continuous High-Order CRFs and Position-Aware Flow Embedding
Ruibo Li, Guosheng Lin, Tong He, Fayao Liu, Chunhua Shen
Scene flow in 3D point clouds plays an important role in understanding dynamic environments. [Expand]
Monday Poster Session
Context Modeling in 3D Human Pose Estimation: A Unified Perspective
Xiaoxuan Ma, Jiajun Su, Chunyu Wang, Hai Ci, Yizhou Wang
Estimating 3D human pose from a single image suffers from severe ambiguity since multiple 3D joint configurations may have the same 2D projection. [Expand]
Tuesday Poster Session
Lipstick Ain't Enough: Beyond Color Matching for In-the-Wild Makeup Transfer
Thao Nguyen, Anh Tuan Tran, Minh Hoai
Makeup transfer is the task of applying on a source face the makeup style from a reference image. [Expand]
Thursday Poster Session
Lifelong Person Re-Identification via Adaptive Knowledge Accumulation
Nan Pu, Wei Chen, Yu Liu, Erwin M. Bakker, Michael S. Lew
Person ReID methods always learn through a stationary domain that is fixed by the choice of a given dataset. [Expand]
Wednesday Poster Session
PANDA: Adapting Pretrained Features for Anomaly Detection and Segmentation
Tal Reiss, Niv Cohen, Liron Bergman, Yedid Hoshen
Anomaly detection methods require high-quality features. [Expand]
Monday Poster Session
Feature Decomposition and Reconstruction Learning for Effective Facial Expression Recognition
Delian Ruan, Yan Yan, Shenqi Lai, Zhenhua Chai, Chunhua Shen, Hanzi Wang
In this paper, we propose a novel Feature Decomposition and Reconstruction Learning (FDRL) method for effective facial expression recognition. [Expand]
Wednesday Poster Session
Improved Handling of Motion Blur in Online Object Detection
Mohamed Sayed, Gabriel Brostow
We wish to detect specific categories of objects, for online vision systems that will run in the real world. [Expand]
Monday Poster Session
Learning Scene Structure Guidance via Cross-Task Knowledge Transfer for Single Depth Super-Resolution
Baoli Sun, Xinchen Ye, Baopu Li, Haojie Li, Zhihui Wang, Rui Xu
Existing color-guided depth super-resolution (DSR) approaches require paired RGB-D data as training examples where the RGB image is used as structural guidance to recover the degraded depth map due to their geometrical similarity. [Expand]
Wednesday Poster Session
AdvSim: Generating Safety-Critical Scenarios for Self-Driving Vehicles
Jingkang Wang, Ava Pun, James Tu, Sivabalan Manivasagam, Abbas Sadat, Sergio Casas, Mengye Ren, Raquel Urtasun
As self-driving systems become better, simulating scenarios where the autonomy stack may fail becomes more important. [Expand]
Wednesday Poster Session
Image Inpainting With External-Internal Learning and Monochromic Bottleneck
Tengfei Wang, Hao Ouyang, Qifeng Chen
Although recent inpainting approaches have demonstrated significant improvement with deep neural networks, they still suffer from artifacts such as blunt structures and abrupt colors when filling in the missing regions. [Expand]
Tuesday Poster Session
Multiple Object Tracking With Correlation Learning
Qiang Wang, Yun Zheng, Pan Pan, Yinghui Xu
Recent works have shown that convolutional networks have substantially improved the performance of multiple object tracking by simultaneously learning detection and appearance features. [Expand]
Tuesday Poster Session
Invertible Image Signal Processing
Yazhou Xing, Zian Qian, Qifeng Chen
Unprocessed RAW data is a highly valuable image format for image editing and computer vision. [Expand]
Tuesday Poster Session
Open-Book Video Captioning With Retrieve-Copy-Generate Network
Ziqi Zhang, Zhongang Qi, Chunfeng Yuan, Ying Shan, Bing Li, Ying Deng, Weiming Hu
In this paper, we convert traditional video captioning task into a new paradigm, i.e., Open-book Video Captioning, which generates natural language under the prompts of video-content-relevant sentences, not limited to the video itself. [Expand]
Wednesday Poster Session
MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition
Ayan Kumar Bhunia, Shuvozit Ghose, Amandeep Kumar, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song
Handwritten Text Recognition (HTR) remains a challenging problem to date, largely due to the varying writing styles that exist amongst us. [Expand]
Friday Poster Session
Monocular 3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks
Yu Cheng, Bo Wang, Bo Yang, Robby T. Tan
In monocular video 3D multi-person pose estimation, inter-person occlusion and close interactions can cause human detection to be erroneous and human-joints grouping to be unreliable. [Expand]
Wednesday Poster Session
Contrastive Neural Architecture Search With Neural Architecture Comparators
Yaofo Chen, Yong Guo, Qi Chen, Minli Li, Wei Zeng, Yaowei Wang, Mingkui Tan
One of the key steps in Neural Architecture Search (NAS) is to estimate the performance of candidate architectures. [Expand]
Wednesday Poster Session
Efficient Object Embedding for Spliced Image Retrieval
Bor-Chun Chen, Zuxuan Wu, Larry S. Davis, Ser-Nam Lim
Detecting spliced images is one of the emerging challenges in computer vision. [Expand]
Thursday Poster Session
One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking
Minghao Chen, Jianlong Fu, Haibin Ling
Despite remarkable progress achieved, most neural architecture search (NAS) methods focus on searching for one single accurate and robust architecture. [Expand]
Friday Poster Session
Robust Representation Learning With Feedback for Single Image Deraining
Chenghao Chen, Hao Li
A deraining network can be interpreted as a conditional generator that aims at removing rain streaks from image. [Expand]
Wednesday Poster Session
Scale-Aware Automatic Augmentation for Object Detection
Yukang Chen, Yanwei Li, Tao Kong, Lu Qi, Ruihang Chu, Lei Li, Jiaya Jia
We propose Scale-aware AutoAug to learn data augmentation policies for object detection. [Expand]
Wednesday Poster Session
Mask-ToF: Learning Microlens Masks for Flying Pixel Correction in Time-of-Flight Imaging
Ilya Chugunov, Seung-Hwan Baek, Qiang Fu, Wolfgang Heidrich, Felix Heide
We introduce Mask-ToF, a method to reduce flying pixels (FP) in time-of-flight (ToF) depth captures. [Expand]
Wednesday Poster Session
Diverse Branch Block: Building a Convolution as an Inception-Like Unit
Xiaohan Ding, Xiangyu Zhang, Jungong Han, Guiguang Ding
We propose a universal building block of Convolutional Neural Network (ConvNet) to improve the performance without any inference-time costs. [Expand]
Wednesday Poster Session
Deep Graph Matching Under Quadratic Constraint
Quankai Gao, Fudong Wang, Nan Xue, Jin-Gang Yu, Gui-Song Xia
Recently, deep learning based methods have demonstrated promising results on the graph matching problem, by relying on the descriptive capability of deep features extracted on graph nodes. [Expand]
Tuesday Poster Session
SSAN: Separable Self-Attention Network for Video Representation Learning
Xudong Guo, Xun Guo, Yan Lu
Self-attention has been successfully applied to video representation learning due to the effectiveness of modeling long range dependencies. [Expand]
Thursday Poster Session
Capsule Network Is Not More Robust Than Convolutional Network
Jindong Gu, Volker Tresp, Han Hu
The Capsule Network is widely believed to be more robust than Convolutional Networks. [Expand]
Thursday Poster Session
Towards Fast and Accurate Real-World Depth Super-Resolution: Benchmark Dataset and Baseline
Lingzhi He, Hongguang Zhu, Feng Li, Huihui Bai, Runmin Cong, Chunjie Zhang, Chunyu Lin, Meiqin Liu, Yao Zhao
Depth maps obtained by commercial depth sensors are always in low-resolution, making it difficult to be used in various computer vision tasks. [Expand]
Wednesday Poster Session
Transformation Driven Visual Reasoning
Xin Hong, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng
This paper defines a new visual reasoning paradigm by introducing an important factor, i.e. [Expand]
Tuesday Poster Session
Affordance Transfer Learning for Human-Object Interaction Detection
Zhi Hou, Baosheng Yu, Yu Qiao, Xiaojiang Peng, Dacheng Tao
Reasoning the human-object interactions (HOI) is essential for deeper scene understanding, while object affordances (or functionalities) are of great importance for human to discover unseen HOIs with novel objects. [Expand]
Monday Poster Session
DARCNN: Domain Adaptive Region-Based Convolutional Neural Network for Unsupervised Instance Segmentation in Biomedical Images
Joy Hsu, Wah Chiu, Serena Yeung
In the biomedical domain, there is an abundance of dense, complex data where objects of interest may be challenging to detect or constrained by limits of human knowledge. [Expand]
Monday Poster Session
FVC: A New Framework Towards Deep Video Compression in Feature Space
Zhihao Hu, Guo Lu, Dong Xu
Learning based video compression attracts increasing attention in the past few years. [Expand]
Monday Poster Session
SAIL-VOS 3D: A Synthetic Dataset and Baselines for Object Detection and 3D Mesh Reconstruction From Video Data
Yuan-Ting Hu, Jiahong Wang, Raymond A. Yeh, Alexander G. Schwing
Extracting detailed 3D information of objects from video data is an important goal for holistic scene understanding. [Expand]
Monday Poster Session
MeanShift++: Extremely Fast Mode-Seeking With Applications to Segmentation and Object Tracking
Jennifer Jang, Heinrich Jiang
MeanShift is a popular mode-seeking clustering algorithm used in a wide range of applications in machine learning. [Expand]
Tuesday Poster Session
LaPred: Lane-Aware Prediction of Multi-Modal Future Trajectories of Dynamic Agents
ByeoungDo Kim, Seong Hyeon Park, Seokhwan Lee, Elbek Khoshimjonov, Dongsuk Kum, Junsoo Kim, Jeong Soo Kim, Jun Won Choi
In this paper, we address the problem of predicting the future motion of a dynamic agent (called a target agent) given its current and past states as well as the information on its environment. [Expand]
Thursday Poster Session
SIPSA-Net: Shift-Invariant Pan Sharpening With Moving Object Alignment for Satellite Imagery
Jaehyup Lee, Soomin Seo, Munchurl Kim
Pan-sharpening is a process of merging a high-resolution (HR) panchromatic (PAN) image and its corresponding low-resolution (LR) multi-spectral (MS) image to create an HR-MS and pan-sharpened image. [Expand]
Wednesday Poster Session
Flow-Based Kernel Prior With Application to Blind Super-Resolution
Jingyun Liang, Kai Zhang, Shuhang Gu, Luc Van Gool, Radu Timofte
Kernel estimation is generally one of the key problems for blind image super-resolution (SR). [Expand]
Wednesday Poster Session
OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection
Tingting Liang, Yongtao Wang, Zhi Tang, Guosheng Hu, Haibin Ling
Recently, neural architecture search (NAS) has been exploited to design feature pyramid networks (FPNs) and achieved promising results for visual object detection. [Expand]
Wednesday Poster Session
Building Reliable Explanations of Unreliable Neural Networks: Locally Smoothing Perspective of Model Interpretation
Dohun Lim, Hyeonseok Lee, Sungchan Kim
We present a novel method for reliably explaining the predictions of neural networks. [Expand]
Tuesday Poster Session
Region-Aware Adaptive Instance Normalization for Image Harmonization
Jun Ling, Han Xue, Li Song, Rong Xie, Xiao Gu
Image composition plays a common but important role in photo editing. [Expand]
Wednesday Poster Session
Scene-Intuitive Agent for Remote Embodied Visual Grounding
Xiangru Lin, Guanbin Li, Yizhou Yu
Humans learn from life events to form intuitions towards the understanding of visual environments and languages. [Expand]
Tuesday Poster Session
From Shadow Generation To Shadow Removal
Zhihao Liu, Hui Yin, Xinyi Wu, Zhenyao Wu, Yang Mi, Song Wang
Shadow removal is a computer-vision task that aims to restore the image content in shadow regions. [Expand]
Tuesday Poster Session
Fully Convolutional Scene Graph Generation
Hengyue Liu, Ning Yan, Masood Mortazavi, Bir Bhanu
This paper presents a fully convolutional scene graph generation (FCSGG) model that detects objects and relations simultaneously. [Expand]
Thursday Poster Session
No Frame Left Behind: Full Video Action Recognition
Xin Liu, Silvia L. Pintea, Fatemeh Karimi Nejadasl, Olaf Booij, Jan C. van Gemert
Not all video frames are equally informative for recognizing an action. [Expand]
Thursday Poster Session
Towards Unified Surgical Skill Assessment
Daochang Liu, Qiyue Li, Tingting Jiang, Yizhou Wang, Rulin Miao, Fei Shan, Ziyu Li
Surgical skills have a great influence on surgical safety and patients' well-being. [Expand]
Wednesday Poster Session
Causal Hidden Markov Model for Time Series Disease Forecasting
Jing Li, Botong Wu, Xinwei Sun, Yizhou Wang
We propose a causal hidden Markov model to achieve robust prediction of irreversible disease at an early stage, which is safety-critical and vital for medical treatment in early stages. [Expand]
Thursday Poster Session
Exploring Intermediate Representation for Monocular Vehicle Pose Estimation
Shichao Li, Zengqiang Yan, Hongyang Li, Kwang-Ting Cheng
We present a new learning-based framework to recover vehicle pose in SO(3) from a single RGB image. [Expand]
Monday Poster Session
DeepI2P: Image-to-Point Cloud Registration via Deep Classification
Jiaxin Li, Gim Hee Lee
This paper presents DeepI2P: a novel approach for cross-modality registration between an image and a point cloud. [Expand]
Friday Poster Session
LiDAR R-CNN: An Efficient and Universal 3D Object Detector
Zhichao Li, Feng Wang, Naiyan Wang
LiDAR-based 3D detection in point cloud is essential in the perception system of autonomous driving. [Expand]
Wednesday Poster Session
Generalizing Face Forgery Detection With High-Frequency Features
Yuchen Luo, Yong Zhang, Junchi Yan, Wei Liu
Current face forgery detection methods achieve high accuracy under the within-database scenario where training and testing forgeries are synthesized by the same algorithm. [Expand]
Friday Poster Session
Self-Supervised Pillar Motion Learning for Autonomous Driving
Chenxu Luo, Xiaodong Yang, Alan Yuille
Autonomous driving can benefit from motion behavior comprehension when interacting with diverse traffic participants in highly dynamic environments. [Expand]
Tuesday Poster Session
Learning Semantic Person Image Generation by Region-Adaptive Normalization
Zhengyao Lv, Xiaoming Li, Xin Li, Fu Li, Tianwei Lin, Dongliang He, Wangmeng Zuo
Human pose transfer has received great attention due to its wide applications, yet is still a challenging task that is not well solved. [Expand]
Wednesday Poster Session
FCPose: Fully Convolutional Multi-Person Pose Estimation With Dynamic Instance-Aware Convolutions
Weian Mao, Zhi Tian, Xinlong Wang, Chunhua Shen
We propose a fully convolutional multi-person pose estimation framework using dynamic instance-aware convolutions, termed FCPose. [Expand]
Wednesday Poster Session
Polygonal Point Set Tracking
Gunhee Nam, Miran Heo, Seoung Wug Oh, Joon-Young Lee, Seon Joo Kim
In this paper, we propose a novel learning-based polygonal point set tracking method. [Expand]
Tuesday Poster Session
Reducing Domain Gap by Reducing Style Bias
Hyeonseob Nam, HyunJae Lee, Jongchan Park, Wonjun Yoon, Donggeun Yoo
Convolutional Neural Networks (CNNs) often fail to maintain their performance when they confront new test domains, which is known as the problem of domain shift. [Expand]
Wednesday Poster Session
House-GAN++: Generative Adversarial Layout Refinement Network towards Intelligent Computational Agent for Professional Architects
Nelson Nauata, Sepidehsadat Hosseini, Kai-Hung Chang, Hang Chu, Chin-Yi Cheng, Yasutaka Furukawa
This paper proposes a generative adversarial layout refinement network for automated floorplan generation. [Expand]
Thursday Poster Session
Hyperdimensional Computing as a Framework for Systematic Aggregation of Image Descriptors
Peer Neubert, Stefan Schubert
Image and video descriptors are an omnipresent tool in computer vision and its application fields like mobile robotics. [Expand]
Friday Poster Session
Bridge To Answer: Structure-Aware Graph Interaction Network for Video Question Answering
Jungin Park, Jiyoung Lee, Kwanghoon Sohn
This paper presents a novel method, termed Bridge to Answer, to infer correct answers for questions about a given video by leveraging adequate graph interactions of heterogeneous crossmodal graphs. [Expand]
Thursday Poster Session
VoxelContext-Net: An Octree Based Framework for Point Cloud Compression
Zizheng Que, Guo Lu, Dong Xu
In this paper, we propose a two-stage deep learning framework called VoxelContext-Net for both static and dynamic point cloud compression. [Expand]
Tuesday Poster Session
Self-Supervised Collision Handling via Generative 3D Garment Models for Virtual Try-On
Igor Santesteban, Nils Thuerey, Miguel A. Otaduy, Dan Casas
We propose a new generative model for 3D garment deformations that enables us to learn, for first time, a data-driven method for virtual try-on that effectively addresses garment-body collisions. [Expand]
Thursday Poster Session
Single Pair Cross-Modality Super Resolution
Guy Shacht, Dov Danon, Sharon Fogel, Daniel Cohen-Or
Non-visual imaging sensors are widely used in the industry for different purposes. [Expand]
Tuesday Poster Session
Learning To Segment Actions From Visual and Language Instructions via Differentiable Weak Sequence Alignment
Yuhan Shen, Lu Wang, Ehsan Elhamifar
We address the problem of unsupervised localization of key-steps and feature learning in instructional videos using both visual and language instructions. [Expand]
Wednesday Poster Session
SGCN: Sparse Graph Convolution Network for Pedestrian Trajectory Prediction
Liushuai Shi, Le Wang, Chengjiang Long, Sanping Zhou, Mo Zhou, Zhenxing Niu, Gang Hua
Pedestrian trajectory prediction is a key technology in autopilot, which remains to be very challenging due to complex interactions between pedestrians. [Expand]
Wednesday Poster Session
Towards Diverse Paragraph Captioning for Untrimmed Videos
Yuqing Song, Shizhe Chen, Qin Jin
Video paragraph captioning aims to describe multiple events in untrimmed videos with descriptive paragraphs. [Expand]
Wednesday Poster Session
Tracking Pedestrian Heads in Dense Crowd
Ramana Sundararaman, Cedric De Almeida Braga, Eric Marchand, Julien Pettre
Tracking humans in crowded video sequences is an important constituent of visual scene understanding. [Expand]
Tuesday Poster Session
Dynamic Metric Learning: Towards a Scalable Metric Space To Accommodate Multiple Semantic Scales
Yifan Sun, Yuke Zhu, Yuhan Zhang, Pengkun Zheng, Xi Qiu, Chi Zhang, Yichen Wei
This paper introduces a new fundamental characteristics, i.e., the dynamic range, from real-world metric tools to deep visual recognition. [Expand]
Tuesday Poster Session
Improving the Efficiency and Robustness of Deepfakes Detection Through Precise Geometric Features
Zekun Sun, Yujie Han, Zeyu Hua, Na Ruan, Weijia Jia
Deepfakes is a branch of malicious techniques that transplant a target face to the original one in videos, resulting in serious problems such as infringement of copyright, confusion of information, or even public panic. [Expand]
Tuesday Poster Session
Tangent Space Backpropagation for 3D Transformation Groups
Zachary Teed, Jia Deng
We address the problem of performing backpropagation for computation graphs involving 3D transformation groups SO(3), SE(3), and Sim(3). [Expand]
Wednesday Poster Session
Unsupervised Object Detection With LIDAR Clues
Hao Tian, Yuntao Chen, Jifeng Dai, Zhaoxiang Zhang, Xizhou Zhu
Despite the importance of unsupervised object detection, to the best of our knowledge, there is no previous work addressing this problem. [Expand]
Tuesday Poster Session
Coming Down to Earth: Satellite-to-Street View Synthesis for Geo-Localization
Aysim Toker, Qunjie Zhou, Maxim Maximov, Laura Leal-Taixe
The goal of cross-view image based geo-localization is to determine the location of a given street view image by matching it against a collection of geo-tagged satellite images. [Expand]
Tuesday Poster Session
There Is More Than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking With Sound by Distilling Multimodal Knowledge
Francisco Rivera Valverde, Juana Valeria Hurtado, Abhinav Valada
Attributes of sound inherent to objects can provide valuable cues to learn rich representations for object detection and tracking. [Expand]
Thursday Poster Session
CRFace: Confidence Ranker for Model-Agnostic Face Detection Refinement
Noranart Vesdapunt, Baoyuan Wang
Face detection is a fundamental problem for many downstream face applications, and there is a rising demand for faster, more accurate yet support for higher resolution face detectors. [Expand]
Monday Poster Session
Implicit Feature Alignment: Learn To Convert Text Recognizer to Text Spotter
Tianwei Wang, Yuanzhi Zhu, Lianwen Jin, Dezhi Peng, Zhe Li, Mengchao He, Yongpan Wang, Canjie Luo
Text recognition is a popular research subject with many associated challenges. [Expand]
Tuesday Poster Session
Learning Fine-Grained Segmentation of 3D Shapes Without Part Labels
Xiaogang Wang, Xun Sun, Xinyu Cao, Kai Xu, Bin Zhou
Existing learning-based approaches to 3D shape segmentation usually formulate it as a semantic labeling problem, assuming that all parts of training shapes are annotated with a given set of labels. [Expand]
Wednesday Poster Session
PWCLO-Net: Deep LiDAR Odometry in 3D Point Clouds Using Hierarchical Embedding Mask Optimization
Guangming Wang, Xinrui Wu, Zhe Liu, Hesheng Wang
A novel 3D point cloud learning model for deep LiDAR odometry, named PWCLO-Net, using hierarchical embedding mask optimization is proposed in this paper. [Expand]
Friday Poster Session
Scene-Aware Generative Network for Human Motion Synthesis
Jingbo Wang, Sijie Yan, Bo Dai, Dahua Lin
We revisit human motion synthesis, a task useful in various real-world applications, in this paper. [Expand]
Thursday Poster Session
TDN: Temporal Difference Networks for Efficient Action Recognition
Limin Wang, Zhan Tong, Bin Ji, Gangshan Wu
Temporal modeling still remains challenging for action recognition in videos. [Expand]
Monday Poster Session
Training Networks in Null Space of Feature Covariance for Continual Learning
Shipeng Wang, Xiaorong Li, Jian Sun, Zongben Xu
In the setting of continual learning, a network is trained on a sequence of tasks, and suffers from catastrophic forgetting. [Expand]
Monday Poster Session
Weakly-Supervised Instance Segmentation via Class-Agnostic Learning With Salient Images
Xinggang Wang, Jiapei Feng, Bin Hu, Qi Ding, Longjin Ran, Xiaoxin Chen, Wenyu Liu
Humans have a strong class-agnostic object segmentation ability and can outline boundaries of unknown objects precisely, which motivates us to propose a box-supervised class-agnostic object segmentation (BoxCaseg) based solution for weakly-supervised instance segmentation. [Expand]
Wednesday Poster Session
Unsupervised Degradation Representation Learning for Blind Super-Resolution
Longguang Wang, Yingqian Wang, Xiaoyu Dong, Qingyu Xu, Jungang Yang, Wei An, Yulan Guo
Most existing CNN-based super-resolution (SR) methods are developed based on an assumption that the degradation is fixed and known (e.g., bicubic downsampling). [Expand]
Wednesday Poster Session
Forecasting Irreversible Disease via Progression Learning
Botong Wu, Sijie Ren, Jing Li, Xinwei Sun, Shi-Ming Li, Yizhou Wang
Forecasting Parapapillary atrophy (PPA), i.e., a symptom related to most irreversible eye diseases, provides an alarm for implementing an intervention to slow down the disease progression at early stage. [Expand]
Wednesday Poster Session
SceneGraphFusion: Incremental 3D Scene Graph Prediction From RGB-D Sequences
Shun-Cheng Wu, Johanna Wald, Keisuke Tateno, Nassir Navab, Federico Tombari
Scene graphs are a compact and explicit representation successfully used in a variety of 2D scene understanding tasks. [Expand]
Wednesday Poster Session
A Dual Iterative Refinement Method for Non-Rigid Shape Matching
Rui Xiang, Rongjie Lai, Hongkai Zhao
In this work, a robust and efficient dual iterative refinement (DIR) method is proposed for dense correspondence between two nearly isometric shapes. [Expand]
Friday Poster Session
Deep Denoising of Flash and No-Flash Pairs for Photography in Low-Light Environments
Zhihao Xia, Michael Gharbi, Federico Perazzi, Kalyan Sunkavalli, Ayan Chakrabarti
We introduce a neural network-based method to denoise pairs of images taken in quick succession in low-light environments, with and without a flash. [Expand]
Monday Poster Session
DG-Font: Deformable Generative Networks for Unsupervised Font Generation
Yangchen Xie, Xinyuan Chen, Li Sun, Yue Lu
Font generation is a challenging problem especially for some writing systems that consist of a large number of characters and has attracted a lot of attention in recent years. [Expand]
Tuesday Poster Session
Graph Stacked Hourglass Networks for 3D Human Pose Estimation
Tianhan Xu, Wataru Takano
In this paper, we propose a novel graph convolutional network architecture, Graph Stacked Hourglass Networks, for 2D-to-3D human pose estimation tasks. [Expand]
Friday Poster Session
Layout-Guided Novel View Synthesis From a Single Indoor Panorama
Jiale Xu, Jia Zheng, Yanyu Xu, Rui Tang, Shenghua Gao
Existing view synthesis methods mainly focus on the perspective images and have shown promising results. [Expand]
Friday Poster Session
Learning Dynamic Alignment via Meta-Filter for Few-Shot Learning
Chengming Xu, Yanwei Fu, Chen Liu, Chengjie Wang, Jilin Li, Feiyue Huang, Li Zhang, Xiangyang Xue
Few-shot learning (FSL), which aims to recognise new classes by adapting the learned knowledge with extremely limited few-shot (support) examples, remains an important open problem in computer vision. [Expand]
Tuesday Poster Session
Linear Semantics in Generative Adversarial Networks
Jianjin Xu, Changxi Zheng
Generative Adversarial Networks (GANs) are able to generate high-quality images, but it remains difficult to explicitly specify the semantics of synthesized images. [Expand]
Wednesday Poster Session
Temporal Modulation Network for Controllable Space-Time Video Super-Resolution
Gang Xu, Jun Xu, Zhen Li, Liang Wang, Xing Sun, Ming-Ming Cheng
Space-time video super-resolution (STVSR) aims to increase the spatial and temporal resolutions of low-resolution and low-frame-rate videos. [Expand]
Tuesday Poster Session
3D-MAN: 3D Multi-Frame Attention Network for Object Detection
Zetong Yang, Yin Zhou, Zhifeng Chen, Jiquan Ngiam
3D object detection is an important module in autonomous driving and robotics. [Expand]
Monday Poster Session
KSM: Fast Multiple Task Adaption via Kernel-Wise Soft Mask Learning
Li Yang, Zhezhi He, Junshan Zhang, Deliang Fan
Deep Neural Networks (DNN) could forget the knowledge about earlier tasks when learning new tasks, and this is known as catastrophic forgetting. [Expand]
Thursday Poster Session
NetAdaptV2: Efficient Neural Architecture Search With Fast Super-Network Training and Architecture Optimization
Tien-Ju Yang, Yi-Lun Liao, Vivienne Sze
Neural architecture search (NAS) typically consists of three main steps: training a super-network, training and evaluating sampled deep neural networks (DNNs), and training the discovered DNN. [Expand]
Monday Poster Session
Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation
Gengcong Yang, Jingyi Zhang, Yong Zhang, Baoyuan Wu, Yujiu Yang
To generate "accurate" scene graphs, almost all existing methods predict pairwise relationships in a deterministic manner. [Expand]
Thursday Poster Session
ID-Unet: Iterative Soft and Hard Deformation for View Synthesis
Mingyu Yin, Li Sun, Qingli Li
View synthesis is usually done by an autoencoder, in which the encoder maps a source view image into a latent content code, and the decoder transforms it into a target view image according to the condition. [Expand]
Wednesday Poster Session
Towards Extremely Compact RNNs for Video Recognition With Fully Decomposed Hierarchical Tucker Structure
Miao Yin, Siyu Liao, Xiao-Yang Liu, Xiaodong Wang, Bo Yuan
Recurrent Neural Networks (RNNs) have been widely used in sequence analysis and modeling. [Expand]
Thursday Poster Session
Landmark Regularization: Ranking Guided Super-Net Training in Neural Architecture Search
Kaicheng Yu, Rene Ranftl, Mathieu Salzmann
Weight sharing has become a de facto standard in neural architecture search because it enables the search to be done on commodity hardware. [Expand]
Thursday Poster Session
Real-Time Selfie Video Stabilization
Jiyang Yu, Ravi Ramamoorthi, Keli Cheng, Michel Sarkis, Ning Bi
We propose a novel real-time selfie video stabilization method. [Expand]
Thursday Poster Session
Distractor-Aware Fast Tracking via Dynamic Convolutions and MOT Philosophy
Zikai Zhang, Bineng Zhong, Shengping Zhang, Zhenjun Tang, Xin Liu, Zhaoxiang Zhang
A practical long-term tracker typically contains three key properties, i.e., an efficient model design, an effective global re-detection strategy and a robust distractor awareness mechanism. [Expand]
Monday Poster Session
Domain-Robust VQA With Diverse Datasets and Methods but No Target Labels
Mingda Zhang, Tristan Maidment, Ahmad Diab, Adriana Kovashka, Rebecca Hwa
The observation that computer vision methods overfit to dataset specifics has inspired diverse attempts to make object recognition models robust to domain shifts. [Expand]
Tuesday Poster Session
Event-Based Synthetic Aperture Imaging With a Hybrid Network
Xiang Zhang, Wei Liao, Lei Yu, Wen Yang, Gui-Song Xia
Synthetic aperture imaging (SAI) is able to achieve the see through effect by blurring out the off-focus foreground occlusions and reconstructing the in-focus occluded targets from multi-view images. [Expand]
Thursday Poster Session
View-Guided Point Cloud Completion
Xuancheng Zhang, Yutong Feng, Siqi Li, Changqing Zou, Hai Wan, Xibin Zhao, Yandong Guo, Yue Gao
This paper presents a view-guided solution for the task of point cloud completion. [Expand]
Friday Poster Session
Zero-Shot Instance Segmentation
Ye Zheng, Jiahong Wu, Yongqiang Qin, Faen Zhang, Li Cui
Deep learning has significantly improved the precision of instance segmentation with abundant labeled data. [Expand]
Monday Poster Session
VIGOR: Cross-View Image Geo-Localization Beyond One-to-One Retrieval
Sijie Zhu, Taojiannan Yang, Chen Chen
Cross-view image geo-localization aims to determine the locations of street-view query images by matching with GPS-tagged reference images from aerial view. [Expand]
Tuesday Poster Session
Leveraging the Availability of Two Cameras for Illuminant Estimation
Abdelrahman Abdelhamed, Abhijith Punnappurath, Michael S. Brown
Most modern smartphones are now equipped with two rear-facing cameras -- a main camera for standard imaging and an additional camera to provide wide-angle or telephoto zoom capabilities. [Expand]
Tuesday Poster Session
RPSRNet: End-to-End Trainable Rigid Point Set Registration Network Using Barnes-Hut 2D-Tree Representation
Sk Aziz Ali, Kerem Kahraman, Gerd Reis, Didier Stricker
We propose RPSRNet - a novel end-to-end trainable deep neural network for rigid point set registration. [Expand]
Thursday Poster Session
Understanding and Simplifying Perceptual Distances
Dan Amir, Yair Weiss
Perceptual metrics based on features of deep Convolutional Neural Networks (CNNs) have shown remarkable success when used as loss functions in a range of computer vision problems and significantly outperform classical losses such as L1 or L2 in pixel space. [Expand]
Thursday Poster Session
Learning Deep Latent Variable Models by Short-Run MCMC Inference With Optimal Transport Correction
Dongsheng An, Jianwen Xie, Ping Li
Learning latent variable models with deep top-down architectures typically requires inferring the latent variables for each training example based on the posterior distribution of these latent variables. [Expand]
Thursday Poster Session
Adversarial Robustness Across Representation Spaces
Pranjal Awasthi, George Yu, Chun-Sung Ferng, Andrew Tomkins, Da-Cheng Juan
Adversarial robustness corresponds to the susceptibility of deep neural networks to imperceptible perturbations made at test time. [Expand]
Wednesday Poster Session
GMOT-40: A Benchmark for Generic Multiple Object Tracking
Hexin Bai, Wensheng Cheng, Peng Chu, Juehuan Liu, Kai Zhang, Haibin Ling
Multiple Object Tracking (MOT) has witnessed remarkable advances in recent years. [Expand]
Tuesday Poster Session
Learning Scalable ℓ∞-Constrained Near-Lossless Image Compression via Joint Lossy Image and Residual Compression
Yuanchao Bai, Xianming Liu, Wangmeng Zuo, Yaowei Wang, Xiangyang Ji
We propose a novel joint lossy image and residual compression framework for learning l_infinity-constrained near-lossless image compression. [Expand]
Thursday Poster Session
Unsupervised Multi-Source Domain Adaptation for Person Re-Identification
Zechen Bai, Zhigang Wang, Jian Wang, Di Hu, Errui Ding
Unsupervised domain adaptation (UDA) methods for person re-identification (re-ID) aim at transferring re-ID knowledge from labeled source data to unlabeled target data. [Expand]
Thursday Poster Session
Euro-PVI: Pedestrian Vehicle Interactions in Dense Urban Centers
Apratim Bhattacharyya, Daniel Olmeda Reino, Mario Fritz, Bernt Schiele
Accurate prediction of pedestrian and bicyclist paths is integral to the development of reliable autonomous vehicles in dense urban environments. [Expand]
Tuesday Poster Session
Hierarchical Video Prediction Using Relational Layouts for Human-Object Interactions
Navaneeth Bodla, Gaurav Shrivastava, Rama Chellappa, Abhinav Shrivastava
Learning to model and predict how humans interact with objects while performing an action is challenging, and most of the existing video prediction models are ineffective in modeling complicated human-object interactions. [Expand]
Thursday Poster Session
Understanding Object Dynamics for Interactive Image-to-Video Synthesis
Andreas Blattmann, Timo Milbich, Michael Dorkenwald, Bjorn Ommer
What would be the effect of locally poking a static scene? We present an approach that learns naturally-looking global articulations caused by a local manipulation at a pixel level. [Expand]
Tuesday Poster Session
OCONet: Image Extrapolation by Object Completion
Richard Strong Bowen, Huiwen Chang, Charles Herrmann, Piotr Teterwak, Ce Liu, Ramin Zabih
Image extrapolation extends an input image beyond the originally-captured field of view. [Expand]
Monday Poster Session
Hardness Sampling for Self-Training Based Transductive Zero-Shot Learning
Liu Bo, Qiulei Dong, Zhanyi Hu
Transductive zero-shot learning (T-ZSL) which could alleviate the domain shift problem in existing ZSL works, has received much attention recently. [Expand]
Friday Poster Session
GAIA: A Transfer Learning System of Object Detection That Fits Your Needs
Xingyuan Bu, Junran Peng, Junjie Yan, Tieniu Tan, Zhaoxiang Zhang
Transfer learning with pre-training on large-scale datasets has played an increasingly significant role in computer vision and natural language processing recently. [Expand]
Monday Poster Session
Rethinking Graph Neural Architecture Search From Message-Passing
Shaofei Cai, Liang Li, Jincan Deng, Beichen Zhang, Zheng-Jun Zha, Li Su, Qingming Huang
Graph neural networks (GNNs) emerged recently as a standard toolkit for learning from data on graphs. [Expand]
Tuesday Poster Session
Revisiting Superpixels for Active Learning in Semantic Segmentation With Realistic Annotation Costs
Lile Cai, Xun Xu, Jun Hao Liew, Chuan Sheng Foo
State-of-the-art methods for semantic segmentation are based on deep neural networks that are known to be data-hungry. [Expand]
Wednesday Poster Session
Debiased Subjective Assessment of Real-World Image Enhancement
Peibei Cao, Zhangyang Wang, Kede Ma
In real-world image enhancement, it is often challenging (if not impossible) to acquire ground-truth data, preventing the adoption of distance metrics for objective quality assessment. [Expand]
Monday Poster Session
Normal Integration via Inverse Plane Fitting With Minimum Point-to-Plane Distance
Xu Cao, Boxin Shi, Fumio Okura, Yasuyuki Matsushita
This paper presents a surface normal integration method that solves an inverse problem of local plane fitting. [Expand]
Monday Poster Session
To the Point: Efficient 3D Object Detection in the Range Image With Graph Convolution Kernels
Yuning Chai, Pei Sun, Jiquan Ngiam, Weiyue Wang, Benjamin Caine, Vijay Vasudevan, Xiao Zhang, Dragomir Anguelov
3D object detection is vital for many robotics applications. [Expand]
Friday Poster Session
Adaptive Convolutions for Structure-Aware Style Transfer
Prashanth Chandran, Gaspard Zoss, Paulo Gotardo, Markus Gross, Derek Bradley
Style transfer between images is an artistic application of CNNs, where the 'style' of one image is transferred onto another image while preserving the latter's content. [Expand]
Wednesday Poster Session
Deep Perceptual Preprocessing for Video Coding
Aaron Chadha, Yiannis Andreopoulos
We introduce the concept of rate-aware deep perceptual preprocessing (DPP) for video encoding. [Expand]
Thursday Poster Session
Your "Flamingo" is My "Bird": Fine-Grained, or Not
Dongliang Chang, Kaiyue Pang, Yixiao Zheng, Zhanyu Ma, Yi-Zhe Song, Jun Guo
Whether what you see in Figure 1 is a "flamingo" or a "bird", is the question we ask in this paper. [Expand]
Thursday Poster Session
Learning Discriminative Prototypes With Dynamic Time Warping
Xiaobin Chang, Frederick Tung, Greg Mori
Dynamic Time Warping (DTW) is widely used for temporal data processing. [Expand]
Wednesday Poster Session
Towards Robust Classification Model by Counterfactual and Invariant Data Generation
Chun-Hao Chang, George Alexandru Adam, Anna Goldenberg
Despite the success of machine learning applications in science, industry, and society in general, many approaches are known to be non-robust, often relying on spurious correlations to make predictions. [Expand]
Thursday Poster Session
Learning Deep Classifiers Consistent With Fine-Grained Novelty Detection
Jiacheng Cheng, Nuno Vasconcelos
The problem of novelty detection in fine-grained visual classification (FGVC) is considered. [Expand]
Monday Poster Session
Learning To Filter: Siamese Relation Network for Robust Tracking
Siyuan Cheng, Bineng Zhong, Guorong Li, Xin Liu, Zhenjun Tang, Xianxian Li, Jing Wang
Despite the great success of Siamese-based trackers, their performance under complicated scenarios is still not satisfying, especially when there are distractors. [Expand]
Tuesday Poster Session
Light Field Super-Resolution With Zero-Shot Learning
Zhen Cheng, Zhiwei Xiong, Chang Chen, Dong Liu, Zheng-Jun Zha
Deep learning provides a new avenue for light field super-resolution (SR). [Expand]
Wednesday Poster Session
Adaptive Image Transformer for One-Shot Object Detection
Ding-Jie Chen, He-Yen Hsieh, Tyng-Luh Liu
One-shot object detection tackles a challenging task that aims at identifying within a target image all object instances of the same class, implied by a query image patch. [Expand]
Thursday Poster Session
Class-Aware Robust Adversarial Training for Object Detection
Pin-Chun Chen, Bo-Han Kung, Jun-Cheng Chen
Object detection is an important computer vision task with plenty of real-world applications; therefore, how to enhance its robustness against adversarial attacks has emerged as a crucial issue. [Expand]
Wednesday Poster Session
Blind Deblurring for Saturated Images
Liang Chen, Jiawei Zhang, Songnan Lin, Faming Fang, Jimmy S. Ren
Blind deblurring has received considerable attention in recent years. [Expand]
Tuesday Poster Session
Delving Deep Into Many-to-Many Attention for Few-Shot Video Object Segmentation
Haoxin Chen, Hanjie Wu, Nanxuan Zhao, Sucheng Ren, Shengfeng He
This paper tackles the task of Few-Shot Video Object Segmentation (FSVOS), i.e., segmenting objects in the query videos with certain class specified in a few labeled support images. [Expand]
Thursday Poster Session
Deep Texture Recognition via Exploiting Cross-Layer Statistical Self-Similarity
Zhile Chen, Feng Li, Yuhui Quan, Yong Xu, Hui Ji
In recent years, convolutional neural networks (CNNs) have become a prominent tool for texture recognition. [Expand]
Tuesday Poster Session
DualAST: Dual Style-Learning Networks for Artistic Style Transfer
Haibo Chen, Lei Zhao, Zhizhong Wang, Huiming Zhang, Zhiwen Zuo, Ailin Li, Wei Xing, Dongming Lu
Artistic style transfer is an image editing task that aims at repainting everyday photographs with learned artistic styles. [Expand]
Monday Poster Session
ECKPN: Explicit Class Knowledge Propagation Network for Transductive Few-Shot Learning
Chaofan Chen, Xiaoshan Yang, Changsheng Xu, Xuhui Huang, Zhe Ma
Recently, the transductive graph-based methods have achieved great success in the few-shot classification task. [Expand]
Tuesday Poster Session
Hybrid Rotation Averaging: A Fast and Robust Rotation Averaging Approach
Yu Chen, Ji Zhao, Laurent Kneip
We address rotation averaging (RA) and its application to real-world 3D reconstruction. [Expand]
Wednesday Poster Session
Indoor Lighting Estimation Using an Event Camera
Zehao Chen, Qian Zheng, Peisong Niu, Huajin Tang, Gang Pan
Image-based methods for indoor lighting estimation suffer from the problem of intensity-distance ambiguity. [Expand]
Thursday Poster Session
Jigsaw Clustering for Unsupervised Visual Representation Learning
Pengguang Chen, Shu Liu, Jiaya Jia
Unsupervised representation learning with contrastive learning achieves great success recently. [Expand]
Thursday Poster Session
Learning a Non-Blind Deblurring Network for Night Blurry Images
Liang Chen, Jiawei Zhang, Jinshan Pan, Songnan Lin, Faming Fang, Jimmy S. Ren
Deblurring night blurry images is difficult, because the common-used blur model based on the linear convolution operation does not hold in this situation due to the influence of saturated pixels. [Expand]
Wednesday Poster Session
Joint Generative and Contrastive Learning for Unsupervised Person Re-Identification
Hao Chen, Yaohui Wang, Benoit Lagadec, Antitza Dantcheva, Francois Bremond
Recent self-supervised contrastive learning provides an effective approach for unsupervised person re-identification (ReID) by learning invariance from different views (transformed versions) of an input. [Expand]
Monday Poster Session
Learning 3D Shape Feature for Texture-Insensitive Person Re-Identification
Jiaxing Chen, Xinyang Jiang, Fudong Wang, Jun Zhang, Feng Zheng, Xing Sun, Wei-Shi Zheng
It is well acknowledged that person re-identification (person ReID) highly relies on visual texture information like clothing. [Expand]
Wednesday Poster Session
Learning Student Networks in the Wild
Hanting Chen, Tianyu Guo, Chang Xu, Wenshuo Li, Chunjing Xu, Chao Xu, Yunhe Wang
Data-free learning for student networks is a new paradigm for solving users' anxiety caused by the privacy problem of using original training data. [Expand]
Tuesday Poster Session
MagDR: Mask-Guided Detection and Reconstruction for Defending Deepfakes
Zhikai Chen, Lingxi Xie, Shanmin Pang, Yong He, Bo Zhang
Deepfakes raised serious concerns on the authenticity of visual contents. [Expand]
Wednesday Poster Session
MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation
Hansheng Chen, Yuyao Huang, Wei Tian, Zhong Gao, Lu Xiong
Object localization in 3D space is a challenging aspect in monocular 3D object detection. [Expand]
Wednesday Poster Session
Neural Feature Search for RGB-Infrared Person Re-Identification
Yehansen Chen, Lin Wan, Zhihang Li, Qianyan Jing, Zongyuan Sun
RGB-Infrared person re-identification (RGB-IR ReID) is a challenging cross-modality retrieval problem, which aims at matching the person-of-interest over visible and infrared camera views. [Expand]
Monday Poster Session
Perceptual Indistinguishability-Net (PI-Net): Facial Image Obfuscation With Manipulable Semantics
Jia-Wei Chen, Li-Ju Chen, Chia-Mu Yu, Chun-Shien Lu
With the growing use of camera devices, the industry has many image datasets that provide more opportunities for collaboration between the machine learning community and industry. [Expand]
Tuesday Poster Session
Pareto Self-Supervised Training for Few-Shot Learning
Zhengyu Chen, Jixie Ge, Heshen Zhan, Siteng Huang, Donglin Wang
While few-shot learning (FSL) aims for rapid generalization to new concepts with little supervision, self-supervised learning (SSL) constructs supervisory signals directly computed from unlabeled data. [Expand]
Thursday Poster Session
PSD: Principled Synthetic-to-Real Dehazing Guided by Physical Priors
Zeyuan Chen, Yangchao Wang, Yang Yang, Dong Liu
Deep learning-based methods have achieved remarkable performance for image dehazing. [Expand]
Wednesday Poster Session
S2R-DepthNet: Learning a Generalizable Depth-Specific Structural Representation
Xiaotian Chen, Yuwang Wang, Xuejin Chen, Wenjun Zeng
Human can infer the 3D geometry of a scene from a sketch instead of a realistic image, which indicates that the spatial structure plays a fundamental role in understanding the depth of scenes. [Expand]
Tuesday Poster Session
Scene Text Telescope: Text-Focused Scene Image Super-Resolution
Jingye Chen, Bin Li, Xiangyang Xue
Image super-resolution, which is often regarded as a preprocessing procedure of scene text recognition, aims to recover the realistic features from a low-resolution text image. [Expand]
Thursday Poster Session
Predicting Human Scanpaths in Visual Question Answering
Xianyu Chen, Ming Jiang, Qi Zhao
Attention has been an important mechanism for both humans and computer vision systems. [Expand]
Wednesday Poster Session
Towards Bridging Event Captioner and Sentence Localizer for Weakly Supervised Dense Event Captioning
Shaoxiang Chen, Yu-Gang Jiang
Dense Event Captioning (DEC) aims to jointly localize and describe multiple events of interest in untrimmed videos, which is an advancement of the conventional video captioning task (generating a single sentence description for a trimmed video). [Expand]
Wednesday Poster Session
Test-Time Fast Adaptation for Dynamic Scene Deblurring via Meta-Auxiliary Learning
Zhixiang Chi, Yang Wang, Yuanhao Yu, Jin Tang
In this paper, we tackle the problem of dynamic scene deblurring. [Expand]
Wednesday Poster Session
Feature-Level Collaboration: Joint Unsupervised Learning of Optical Flow, Stereo Depth and Camera Motion
Cheng Chi, Qingjie Wang, Tianyu Hao, Peng Guo, Xin Yang
Precise estimation of optical flow, stereo depth and camera motion are important for the real-world 3D scene understanding and visual perception. [Expand]
Monday Poster Session
Multi-Label Learning From Single Positive Labels
Elijah Cole, Oisin Mac Aodha, Titouan Lorieul, Pietro Perona, Dan Morris, Nebojsa Jojic
Predicting all applicable labels for a given image is known as multi-label classification. [Expand]
Monday Poster Session
Asymmetric Gained Deep Image Compression With Continuous Rate Adaptation
Ze Cui, Jing Wang, Shangyin Gao, Tiansheng Guo, Yihui Feng, Bo Bai
With the development of deep learning techniques, the combination of deep learning with image compression has drawn lots of attention. [Expand]
Wednesday Poster Session
Bayesian Nested Neural Networks for Uncertainty Calibration and Adaptive Compression
Yufei Cui, Ziquan Liu, Qiao Li, Antoni B. Chan, Chun Jason Xue
Nested networks or slimmable networks are neural networks whose architectures can be adjusted instantly during testing time, e.g., based on computational constraints. [Expand]
Monday Poster Session
Towards Accurate 3D Human Motion Prediction From Incomplete Observations
Qiongjie Cui, Huaijiang Sun
Predicting accurate and realistic future human poses from historically observed sequences is a fundamental task in the intersection of computer vision, graphics, and artificial intelligence. [Expand]
Tuesday Poster Session
Dynamic Head: Unifying Object Detection Heads With Attentions
Xiyang Dai, Yinpeng Chen, Bin Xiao, Dongdong Chen, Mengchen Liu, Lu Yuan, Lei Zhang
The complex nature of combining localization and classification in object detection has resulted in the flourished development of methods. [Expand]
Wednesday Poster Session
Learning Affinity-Aware Upsampling for Deep Image Matting
Yutong Dai, Hao Lu, Chunhua Shen
We show that learning affinity in upsampling provides an effective and efficient approach to exploit pairwise interactions in deep networks. [Expand]
Tuesday Poster Session
Zillow Indoor Dataset: Annotated Floor Plans With 360° Panoramas and 3D Room Layouts
Steve Cruz, Will Hutchcroft, Yuguang Li, Naji Khosravan, Ivaylo Boyadzhiev, Sing Bing Kang
We present Zillow Indoor Dataset (ZInD): A large indoor dataset with 71,474 panoramas from 1,524 real unfurnished homes. [Expand]
Monday Poster Session
Progressive Contour Regression for Arbitrary-Shape Scene Text Detection
Pengwen Dai, Sanyi Zhang, Hua Zhang, Xiaochun Cao
State-of-the-art scene text detection methods usually model the text instance with local pixels or components from the bottom-up perspective and, therefore, are sensitive to noises and dependent on the complicated heuristic post-processing especially for arbitrary-shape texts. [Expand]
Wednesday Poster Session
Nearest Neighbor Matching for Deep Clustering
Zhiyuan Dang, Cheng Deng, Xu Yang, Kun Wei, Heng Huang
Deep clustering gradually becomes an important branch in unsupervised learning methods. [Expand]
Thursday Poster Session
GANmut: Learning Interpretable Conditional Space for Gamut of Emotions
Stefano d'Apolito, Danda Pani Paudel, Zhiwu Huang, Andres Romero, Luc Van Gool
Humans can communicate emotions through a plethora of facial expressions, each with its own intensity, nuances and ambiguities. [Expand]
Monday Poster Session
Deep Homography for Efficient Stereo Image Compression
Xin Deng, Wenzhe Yang, Ren Yang, Mai Xu, Enpeng Liu, Qianhan Feng, Radu Timofte
In this paper, we propose HESIC, an end-to-end trainable deep network for stereo image compression (SIC). [Expand]
Monday Poster Session
LAU-Net: Latitude Adaptive Upscaling Network for Omnidirectional Image Super-Resolution
Xin Deng, Hao Wang, Mai Xu, Yichen Guo, Yuhang Song, Li Yang
The omnidirectional images (ODIs) are usually at low-resolution, due to the constraints of collection, storage and transmission. [Expand]
Wednesday Poster Session
PML: Progressive Margin Loss for Long-Tailed Age Classification
Zongyong Deng, Hao Liu, Yaoxing Wang, Chenyang Wang, Zekuan Yu, Xuehong Sun
In this paper, we propose a progressive margin loss (PML) approach for unconstrained facial age classification. [Expand]
Wednesday Poster Session
Variational Prototype Learning for Deep Face Recognition
Jiankang Deng, Jia Guo, Jing Yang, Alexandros Lattas, Stefanos Zafeiriou
Deep face recognition has achieved remarkable improvements due to the introduction of margin-based softmax loss, in which the prototype stored in the last linear layer represents the center of each class. [Expand]
Thursday Poster Session
Sketch, Ground, and Refine: Top-Down Dense Video Captioning
Chaorui Deng, Shizhe Chen, Da Chen, Yuan He, Qi Wu
The dense video captioning task aims to detect and describe a sequence of events in a video for detailed and coherent storytelling. [Expand]
Monday Poster Session
Spatially-Invariant Style-Codes Controlled Makeup Transfer
Han Deng, Chu Han, Hongmin Cai, Guoqiang Han, Shengfeng He
Transferring makeup from the misaligned reference image is challenging. [Expand]
Tuesday Poster Session
Part-Aware Panoptic Segmentation
Daan de Geus, Panagiotis Meletis, Chenyang Lu, Xiaoxiao Wen, Gijs Dubbelman
In this work, we introduce the new scene understanding task of Part-aware Panoptic Segmentation (PPS), which aims to understand a scene at multiple levels of abstraction, and unifies the tasks of scene parsing and part parsing. [Expand]
Tuesday Poster Session
HR-NAS: Searching Efficient High-Resolution Neural Architectures With Lightweight Transformers
Mingyu Ding, Xiaochen Lian, Linjie Yang, Peng Wang, Xiaojie Jin, Zhiwu Lu, Ping Luo
High-resolution representations (HR) are essential for dense prediction tasks such as segmentation, detection, and pose estimation. [Expand]
Tuesday Poster Session
Learning Spatially-Variant MAP Models for Non-Blind Image Deblurring
Jiangxin Dong, Stefan Roth, Bernt Schiele
The classical maximum a-posteriori (MAP) framework for non-blind image deblurring requires defining suitable data and regularization terms, whose interplay yields the desired clear image through optimization. [Expand]
Tuesday Poster Session
EventZoom: Learning To Denoise and Super Resolve Neuromorphic Events
Peiqi Duan, Zihao W. Wang, Xinyu Zhou, Yi Ma, Boxin Shi
We address the problem of jointly denoising and super resolving neuromorphic events, a novel visual signal that represents thresholded temporal gradients in a space-time window. [Expand]
Thursday Poster Session
TransNAS-Bench-101: Improving Transferability and Generalizability of Cross-Task Neural Architecture Search
Yawen Duan, Xin Chen, Hang Xu, Zewei Chen, Xiaodan Liang, Tong Zhang, Zhenguo Li
Recent breakthroughs of Neural Architecture Search (NAS) extend the field's research scope towards a broader range of vision tasks and more diversified search spaces. [Expand]
Tuesday Poster Session
NeuroMorph: Unsupervised Shape Interpolation and Correspondence in One Go
Marvin Eisenberger, David Novotny, Gael Kerchenbaum, Patrick Labatut, Natalia Neverova, Daniel Cremers, Andrea Vedaldi
We present NeuroMorph, a new neural network architecture that takes as input two 3D shapes and produces in one go, i.e. [Expand]
Wednesday Poster Session
Self-Supervised Learning on 3D Point Clouds by Learning Discrete Generative Models
Benjamin Eckart, Wentao Yuan, Chao Liu, Jan Kautz
While recent pre-training tasks on 2D images have proven very successful for transfer learning, pre-training for 3D data remains challenging. [Expand]
Wednesday Poster Session
Dual Attention Guided Gaze Target Detection in the Wild
Yi Fang, Jiapeng Tang, Wang Shen, Wei Shen, Xiao Gu, Li Song, Guangtao Zhai
Gaze target detection aims to infer where each person in a scene is looking. [Expand]
Thursday Poster Session
Group Collaborative Learning for Co-Salient Object Detection
Qi Fan, Deng-Ping Fan, Huazhu Fu, Chi-Keung Tang, Ling Shao, Yu-Wing Tai
We present a novel group collaborative learning framework (GCNet) capable of detecting co-salient objects in real time (16ms), by simultaneously mining consensus representations at group level based on the two necessary criteria: 1) intra-group compactness to better formulate the consistency among co-salient objects by capturing their inherent shared attributes using our novel group affinity module; 2) inter-group separability to effectively suppress the influence of noisy objects on the output by introducing our new group collaborating module conditioning the inconsistent consensus. [Expand]
Thursday Poster Session
Learning Triadic Belief Dynamics in Nonverbal Communication From Videos
Lifeng Fan, Shuwen Qiu, Zilong Zheng, Tao Gao, Song-Chun Zhu, Yixin Zhu
Humans possess a unique social cognition capability; nonverbal communication can convey rich social information among agents. [Expand]
Wednesday Poster Session
Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos
Hehe Fan, Yi Yang, Mohan Kankanhalli
Point cloud videos exhibit irregularities and lack of order along the spatial dimension where points emerge inconsistently across different frames. [Expand]
Thursday Poster Session
Cross-Domain Similarity Learning for Face Recognition in Unseen Domains
Masoud Faraki, Xiang Yu, Yi-Hsuan Tsai, Yumin Suh, Manmohan Chandraker
Face recognition models trained under the assumption of identical training and test distributions often suffer from poor generalization when faced with unknown variations, such as a novel ethnicity or unpredictable individual make-ups during test time. [Expand]
Thursday Poster Session
SCF-Net: Learning Spatial Contextual Features for Large-Scale Point Cloud Segmentation
Siqi Fan, Qiulei Dong, Fenghua Zhu, Yisheng Lv, Peijun Ye, Fei-Yue Wang
How to learn effective features from large-scale point clouds for semantic segmentation has attracted increasing attention in recent years. [Expand]
Thursday Poster Session
LiDAR-Aug: A General Rendering-Based Augmentation Framework for 3D Object Detection
Jin Fang, Xinxin Zuo, Dingfu Zhou, Shengze Jin, Sen Wang, Liangjun Zhang
Annotating the LiDAR point cloud is crucial for deep learning-based 3D object detection tasks. [Expand]
Tuesday Poster Session
Semantic-Aware Video Text Detection
Wei Feng, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu
Most existing video text detection methods track texts with appearance features, which are easily influenced by the change of perspective and illumination. [Expand]
Monday Poster Session
AIFit: Automatic 3D Human-Interpretable Feedback Models for Fitness Training
Mihai Fieraru, Mihai Zanfir, Silviu Cristian Pirlea, Vlad Olaru, Cristian Sminchisescu
I went to the gym today, but how well did I do? And where should I improve? Ah, my back hurts slightly... [Expand]
Wednesday Poster Session
Anticipating Human Actions by Correlating Past With the Future With Jaccard Similarity Measures
Basura Fernando, Samitha Herath
We propose a framework for early action recognition and anticipation by correlating past features with the future using three novel similarity measures called Jaccard vector similarity, Jaccard cross-correlation and Jaccard Frobenius inner product over covariances. [Expand]
Thursday Poster Session
A Multi-Task Network for Joint Specular Highlight Detection and Removal
Gang Fu, Qing Zhang, Lei Zhu, Ping Li, Chunxia Xiao
Specular highlight detection and removal are fundamental and challenging tasks. [Expand]
Wednesday Poster Session
Double Low-Rank Representation With Projection Distance Penalty for Clustering
Zhiqiang Fu, Yao Zhao, Dongxia Chang, Xingxing Zhang, Yiming Wang
This paper presents a novel, simple yet robust self-representation method, i.e., Double Low-Rank Representation with Projection Distance penalty (DLRRPD) for clustering. [Expand]
Tuesday Poster Session
Auto-Exposure Fusion for Single-Image Shadow Removal
Lan Fu, Changqing Zhou, Qing Guo, Felix Juefei-Xu, Hongkai Yu, Wei Feng, Yang Liu, Song Wang
Shadow removal is still a challenging task due to its inherent background-dependent and spatial-variant properties, leading to unknown and diverse shadow patterns. [Expand]
Wednesday Poster Session
Partial Feature Selection and Alignment for Multi-Source Domain Adaptation
Yangye Fu, Ming Zhang, Xing Xu, Zuo Cao, Chao Ma, Yanli Ji, Kai Zuo, Huimin Lu
Multi-Source Domain Adaptation (MSDA), which dedicates to transfer the knowledge learned from multiple source domains to an unlabeled target domain, has drawn increasing attention in the research community. [Expand]
Friday Poster Session
STMTrack: Template-Free Visual Tracking With Space-Time Memory Networks
Zhihong Fu, Qingjie Liu, Zehua Fu, Yunhong Wang
Boosting performance of the offline trained siamese trackers is getting harder nowadays since the fixed information of the template cropped from the first frame has been almost thoroughly mined, but they are poorly capable of resisting target appearance changes. [Expand]
Thursday Poster Session
Robust Point Cloud Registration Framework Based on Deep Graph Matching
Kexue Fu, Shaolei Liu, Xiaoyuan Luo, Manning Wang
3D point cloud registration is a fundamental problem in computer vision and robotics. [Expand]
Wednesday Poster Session
Transferable Query Selection for Active Domain Adaptation
Bo Fu, Zhangjie Cao, Jianmin Wang, Mingsheng Long
Unsupervised domain adaptation (UDA) enables transferring knowledge from a related source domain to a fully unlabeled target domain. [Expand]
Wednesday Poster Session
Isometric Multi-Shape Matching
Maolin Gao, Zorah Lahner, Johan Thunberg, Daniel Cremers, Florian Bernard
Finding correspondences between shapes is a fundamental problem in computer vision and graphics, which is relevant for many applications, including 3D reconstruction, object tracking, and style transfer. [Expand]
Thursday Poster Session
Information Bottleneck Disentanglement for Identity Swapping
Gege Gao, Huaibo Huang, Chaoyou Fu, Zhaoyang Li, Ran He
Improving the performance of face forgery detectors often requires more identity-swapped images of higher-quality. [Expand]
Tuesday Poster Session
Network Pruning via Performance Maximization
Shangqian Gao, Feihu Huang, Weidong Cai, Heng Huang
Channel pruning is a class of powerful methods for model compression. [Expand]
Wednesday Poster Session
Room-and-Object Aware Knowledge Reasoning for Remote Embodied Referring Expression
Chen Gao, Jinyu Chen, Si Liu, Luting Wang, Qiong Zhang, Qi Wu
The Remote Embodied Referring Expression (REVERIE) is a recently raised task that requires an agent to navigate to and localise a referred remote object according to a high-level language instruction. [Expand]
Tuesday Poster Session
Privacy Preserving Localization and Mapping From Uncalibrated Cameras
Marcel Geppert, Viktor Larsson, Pablo Speciale, Johannes L. Schonberger, Marc Pollefeys
Recent works on localization and mapping from privacy preserving line features have made significant progress towards addressing the privacy concerns arising from cloud-based solutions in mixed reality and robotics. [Expand]
Monday Poster Session
Video Object Segmentation Using Global and Instance Embedding Learning
Wenbin Ge, Xiankai Lu, Jianbing Shen
In this paper, we propose a feature embedding based video object segmentation (VOS) method which is simple, fast and effective. [Expand]
Friday Poster Session
Learning Graphs for Knowledge Transfer With Limited Labels
Pallabi Ghosh, Nirat Saini, Larry S. Davis, Abhinav Shrivastava
Fixed input graphs are a mainstay in approaches that utilize Graph Convolution Networks (GCNs) for knowledge transfer. [Expand]
Wednesday Poster Session
Polygonal Building Extraction by Frame Field Learning
Nicolas Girard, Dmitriy Smirnov, Justin Solomon, Yuliya Tarabalka
While state of the art image segmentation models typically output segmentations in raster format, applications in geographic information systems often require vector polygons. [Expand]
Tuesday Poster Session
OBoW: Online Bag-of-Visual-Words Generation for Self-Supervised Learning
Spyros Gidaris, Andrei Bursuc, Gilles Puy, Nikos Komodakis, Matthieu Cord, Patrick Perez
Learning image representations without human supervision is an important and active research field. [Expand]
Tuesday Poster Session
MaxUp: Lightweight Adversarial Training With Data Augmentation Improves Neural Network Training
Chengyue Gong, Tongzheng Ren, Mao Ye, Qiang Liu
We propose MaxUp, an embarrassingly simple, highly effective technique for improving the generalization performance of machine learning models, especially deep neural networks. [Expand]
Monday Poster Session
Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning
Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma
Hidden features in neural networks usually fail to learn informative representations for 3D segmentation because supervision is given only on the output prediction; this can be addressed by omni-scale supervision on intermediate layers. [Expand]
Thursday Poster Session
PLADE-Net: Towards Pixel-Level Accuracy for Self-Supervised Single-View Depth Estimation With Neural Positional Encoding and Distilled Matting Loss
Juan Luis Gonzalez, Munchurl Kim
In this paper, we propose a self-supervised single-view pixel-level accurate depth estimation network, called PLADE-Net. [Expand]
Tuesday Poster Session
Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction
Shanyan Guan, Jingwei Xu, Yunbo Wang, Bingbing Ni, Xiaokang Yang
This paper considers a new problem of adapting a pre-trained model of human mesh reconstruction to out-of-domain streaming videos. [Expand]
Wednesday Poster Session
Inverse Simulation: Reconstructing Dynamic Geometry of Clothed Humans via Optimal Control
Jingfan Guo, Jie Li, Rahul Narain, Hyun Soo Park
This paper studies the problem of inverse cloth simulation: estimating the shape and time-varying poses of the underlying body that generates physically plausible cloth motion matching the point cloud measurements on the clothed humans. [Expand]
Thursday Poster Session
Beyond Bounding-Box: Convex-Hull Feature Adaptation for Oriented and Densely Packed Object Detection
Zonghao Guo, Chang Liu, Xiaosong Zhang, Jianbin Jiao, Xiangyang Ji, Qixiang Ye
Detecting oriented and densely packed objects remains challenging because of spatial feature aliasing caused by the intersection of receptive fields between objects. [Expand]
Wednesday Poster Session
Intrinsic Image Harmonization
Zonghui Guo, Haiyong Zheng, Yufeng Jiang, Zhaorui Gu, Bing Zheng
Image compositing inevitably suffers from disharmony, mainly caused by incompatibility between a foreground and a background taken from two different images with distinct surfaces and lighting, corresponding to material-dependent and light-dependent characteristics, namely the reflectance and illumination intrinsic images, respectively. [Expand]
Friday Poster Session
Long-Tailed Multi-Label Visual Recognition by Collaborative Training on Uniform and Re-Balanced Samplings
Hao Guo, Song Wang
Long-tailed data distribution is common in many multi-label visual recognition tasks and the direct use of these data for training usually leads to relatively low performance on tail classes. [Expand]
Thursday Poster Session
Multispectral Photometric Stereo for Spatially-Varying Spectral Reflectances: A Well Posed Problem?
Heng Guo, Fumio Okura, Boxin Shi, Takuya Funatomi, Yasuhiro Mukaigawa, Yasuyuki Matsushita
Multispectral photometric stereo (MPS) aims at recovering the surface normal of a scene from a single-shot multispectral image, which is known as an ill-posed problem. [Expand]
Monday Poster Session
Online Multiple Object Tracking With Cross-Task Synergy
Song Guo, Jingya Wang, Xinchao Wang, Dacheng Tao
Modern online multiple object tracking (MOT) methods usually focus on two directions to improve tracking performance. [Expand]
Wednesday Poster Session
Positive-Unlabeled Data Purification in the Wild for Object Detection
Jianyuan Guo, Kai Han, Han Wu, Chao Zhang, Xinghao Chen, Chunjing Xu, Chang Xu, Yunhe Wang
Deep learning based object detection approaches have achieved great progress by benefiting from large amounts of labeled images. [Expand]
Monday Poster Session
Strengthen Learning Tolerance for Weakly Supervised Object Localization
Guangyu Guo, Junwei Han, Fang Wan, Dingwen Zhang
Weakly supervised object localization (WSOL) aims at learning to localize objects of interest by only using the image-level labels as the supervision. [Expand]
Wednesday Poster Session
Contrastive Embedding for Generalized Zero-Shot Learning
Zongyan Han, Zhenyong Fu, Shuo Chen, Jian Yang
Generalized zero-shot learning (GZSL) aims to recognize objects from both seen and unseen classes, when only the labeled examples from seen classes are provided. [Expand]
Monday Poster Session
Learning To Fuse Asymmetric Feature Maps in Siamese Trackers
Wencheng Han, Xingping Dong, Fahad Shahbaz Khan, Ling Shao, Jianbing Shen
Recently, Siamese-based trackers have achieved promising performance in visual tracking. [Expand]
Friday Poster Session
Crossing Cuts Polygonal Puzzles: Models and Solvers
Peleg Harel, Ohad Ben-Shahar
Jigsaw puzzle solving, the problem of constructing a coherent whole from a set of non-overlapping unordered fragments, is fundamental to numerous applications, and yet most of the literature has focused thus far on less realistic puzzles whose pieces are identical squares. [Expand]
Tuesday Poster Session
NormalFusion: Real-Time Acquisition of Surface Normals for High-Resolution RGB-D Scanning
Hyunho Ha, Joo Ho Lee, Andreas Meuleman, Min H. Kim
Multiview shape-from-shading (SfS) has achieved high-detail geometry, but its computation is expensive for solving a multiview registration and an ill-posed inverse rendering problem. [Expand]
Friday Poster Session
Guided Interactive Video Object Segmentation Using Reliability-Based Attention Maps
Yuk Heo, Yeong Jun Koh, Chang-Su Kim
We propose a novel guided interactive segmentation (GIS) algorithm for video objects to improve the segmentation accuracy and reduce the interaction time. [Expand]
Wednesday Poster Session
DyCo3D: Robust Instance Segmentation of 3D Point Clouds Through Dynamic Convolution
Tong He, Chunhua Shen, Anton van den Hengel
Previous top-performing approaches for point cloud instance segmentation involve a bottom-up strategy, which often includes inefficient operations or complex pipelines, such as grouping over-segmented components, introducing additional steps for refining, or designing complicated loss functions. [Expand]
Monday Poster Session
MOST: A Multi-Oriented Scene Text Detector With Localization Refinement
Minghang He, Minghui Liao, Zhibo Yang, Humen Zhong, Jun Tang, Wenqing Cheng, Cong Yao, Yongpan Wang, Xiang Bai
Over the past few years, the field of scene text detection has progressed so rapidly that modern text detectors are able to hunt text in various challenging scenarios. [Expand]
Wednesday Poster Session
Composing Photos Like a Photographer
Chaoyi Hong, Shuaiyuan Du, Ke Xian, Hao Lu, Zhiguo Cao, Weicai Zhong
We show that explicit modeling of composition rules benefits image cropping. [Expand]
Tuesday Poster Session
Disentangling Label Distribution for Long-Tailed Visual Recognition
Youngkyu Hong, Seungju Han, Kwanghee Choi, Seokjun Seo, Beomsu Kim, Buru Chang
The current evaluation protocol of long-tailed visual recognition trains the classification model on the long-tailed source label distribution and evaluates its performance on the uniform target label distribution. [Expand]
Tuesday Poster Session
Fine-Grained Shape-Appearance Mutual Learning for Cloth-Changing Person Re-Identification
Peixian Hong, Tao Wu, Ancong Wu, Xintong Han, Wei-Shi Zheng
Recently, person re-identification (Re-ID) has achieved great progress. [Expand]
Wednesday Poster Session
Partial Person Re-Identification With Part-Part Correspondence Learning
Tianyu He, Xu Shen, Jianqiang Huang, Zhibo Chen, Xian-Sheng Hua
Driven by the success of deep learning, the last decade has seen rapid advances in person re-identification (re-ID). [Expand]
Wednesday Poster Session
LPSNet: A Lightweight Solution for Fast Panoptic Segmentation
Weixiang Hong, Qingpei Guo, Wei Zhang, Jingdong Chen, Wei Chu
Panoptic segmentation is a challenging task aiming to simultaneously segment objects (things) at instance level and background contents (stuff) at semantic level. [Expand]
Friday Poster Session
Panoramic Image Reflection Removal
Yuchen Hong, Qian Zheng, Lingran Zhao, Xudong Jiang, Alex C. Kot, Boxin Shi
This paper studies the problem of panoramic image reflection removal, aiming at relieving the content ambiguity between reflection and transmission scenes. [Expand]
Wednesday Poster Session
Image Change Captioning by Learning From an Auxiliary Task
Mehrdad Hosseinzadeh, Yang Wang
We tackle the challenging task of image change captioning. [Expand]
Monday Poster Session
VLN BERT: A Recurrent Vision-and-Language BERT for Navigation
Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-Opazo, Stephen Gould
Accuracy of many visiolinguistic tasks has benefited significantly from the application of vision-and-language (V&L) BERT. [Expand]
Monday Poster Session
BiCnet-TKS: Learning Efficient Spatial-Temporal Representation for Video Person Re-Identification
Ruibing Hou, Hong Chang, Bingpeng Ma, Rui Huang, Shiguang Shan
In this paper, we present an efficient spatial-temporal representation for video person re-identification (reID). [Expand]
Monday Poster Session
Informative and Consistent Correspondence Mining for Cross-Domain Weakly Supervised Object Detection
Luwei Hou, Yu Zhang, Kui Fu, Jia Li
Cross-domain weakly supervised object detection aims to adapt object-level knowledge from a fully labeled source domain dataset (i.e. [Expand]
Wednesday Poster Session
Brain Image Synthesis With Unsupervised Multivariate Canonical CSCl4Net
Yawen Huang, Feng Zheng, Danyang Wang, Weilin Huang, Matthew R. Scott, Ling Shao
Recent advances in neuroscience have highlighted the effectiveness of multi-modal medical data for investigating certain pathologies and understanding human cognition. [Expand]
Tuesday Poster Session
DeepLM: Large-Scale Nonlinear Least Squares on Deep Learning Frameworks Using Stochastic Domain Decomposition
Jingwei Huang, Shan Huang, Mingwei Sun
We propose a novel approach for large-scale nonlinear least squares problems based on deep learning frameworks. [Expand]
Wednesday Poster Session
Geo-FARM: Geodesic Factor Regression Model for Misaligned Pre-Shape Responses in Statistical Shape Analysis
Chao Huang, Anuj Srivastava, Rongjie Liu
The problem of using covariates to predict shapes of objects in a regression setting is important in many fields. [Expand]
Thursday Poster Session
Memory Oriented Transfer Learning for Semi-Supervised Image Deraining
Huaibo Huang, Aijing Yu, Ran He
Deep learning based methods have shown dramatic improvements in image rain removal by using large-scale paired data of synthetic datasets. [Expand]
Wednesday Poster Session
Look Before You Leap: Learning Landmark Features for One-Stage Visual Grounding
Binbin Huang, Dongze Lian, Weixin Luo, Shenghua Gao
An LBYL ('Look Before You Leap') Network is proposed for end-to-end trainable one-stage visual grounding. [Expand]
Friday Poster Session
MetaSets: Meta-Learning on Point Sets for Generalizable Representations
Chao Huang, Zhangjie Cao, Yunbo Wang, Jianmin Wang, Mingsheng Long
Deep learning techniques for point clouds have achieved strong performance on a range of 3D vision tasks. [Expand]
Wednesday Poster Session
Revisiting Knowledge Distillation: An Inheritance and Exploration Framework
Zhen Huang, Xu Shen, Jun Xing, Tongliang Liu, Xinmei Tian, Houqiang Li, Bing Deng, Jianqiang Huang, Xian-Sheng Hua
Knowledge Distillation (KD) is a popular technique to transfer knowledge from a teacher model or ensemble to a student model. [Expand]
Tuesday Poster Session
S3: Learnable Sparse Signal Superdensity for Guided Depth Estimation
Yu-Kai Huang, Yueh-Cheng Liu, Tsung-Han Wu, Hung-Ting Su, Yu-Cheng Chang, Tsung-Lin Tsou, Yu-An Wang, Winston H. Hsu
Dense depth estimation plays a key role in multiple applications such as robotics, 3D reconstruction, and augmented reality. [Expand]
Friday Poster Session
Video Rescaling Networks With Joint Optimization Strategies for Downscaling and Upscaling
Yan-Cheng Huang, Yi-Hsin Chen, Cheng-You Lu, Hui-Po Wang, Wen-Hsiao Peng, Ching-Chun Huang
This paper addresses the video rescaling task, which arises from the needs of adapting the video spatial resolution to suit individual viewing devices. [Expand]
Tuesday Poster Session
Learning the Non-Differentiable Optimization for Blind Super-Resolution
Zheng Hui, Jie Li, Xiumei Wang, Xinbo Gao
Previous convolutional neural network (CNN) based blind super-resolution (SR) methods usually adopt an iterative optimization scheme to approximate the ground truth (GT) step by step. [Expand]
Monday Poster Session
ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised Image Segmentation
Xinyue Huo, Lingxi Xie, Jianzhong He, Zijie Yang, Wengang Zhou, Houqiang Li, Qi Tian
Semi-supervised learning is a useful tool for image segmentation, mainly due to its ability in extracting knowledge from unlabeled data to assist learning from labeled data. [Expand]
Monday Poster Session
A2-FPN: Attention Aggregation Based Feature Pyramid Network for Instance Segmentation
Miao Hu, Yali Li, Lu Fang, Shengjin Wang
Learning pyramidal feature representations is crucial for recognizing object instances at different scales. [Expand]
Thursday Poster Session
Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation
Tianrui Hui, Shaofei Huang, Si Liu, Zihan Ding, Guanbin Li, Wenguan Wang, Jizhong Han, Fei Wang
Language-queried video actor segmentation aims to predict the pixel-level mask of the actor which performs the actions described by a natural language query in the target frames. [Expand]
Tuesday Poster Session
Efficient Deformable Shape Correspondence via Multiscale Spectral Manifold Wavelets Preservation
Ling Hu, Qinsong Li, Shengjun Liu, Xinru Liu
The functional map framework has proven to be extremely effective for representing dense correspondences between deformable shapes. [Expand]
Thursday Poster Session
Learning Cross-Modal Retrieval With Noisy Labels
Peng Hu, Xi Peng, Hongyuan Zhu, Liangli Zhen, Jie Lin
Recently, cross-modal retrieval is emerging with the help of deep multimodal learning. [Expand]
Tuesday Poster Session
Dense Relation Distillation With Context-Aware Aggregation for Few-Shot Object Detection
Hanzhe Hu, Shuai Bai, Aoxue Li, Jinshi Cui, Liwei Wang
Conventional deep learning based methods for object detection require a large number of bounding box annotations for training, and such high-quality annotated data is expensive to obtain. [Expand]
Wednesday Poster Session
Pseudo 3D Auto-Correlation Network for Real Image Denoising
Xiaowan Hu, Ruijun Ma, Zhihong Liu, Yuanhao Cai, Xiaole Zhao, Yulun Zhang, Haoqian Wang
The extraction of auto-correlation in images has shown great potential in deep learning networks, such as the self-attention mechanism in the channel domain and the self-similarity mechanism in the spatial domain. [Expand]
Friday Poster Session
Model-Aware Gesture-to-Gesture Translation
Hezhen Hu, Weilun Wang, Wengang Zhou, Weichao Zhao, Houqiang Li
Hand gesture-to-gesture translation is a significant and interesting problem, which plays a key role in many applications, such as sign language production. [Expand]
Friday Poster Session
Safe Local Motion Planning With Self-Supervised Freespace Forecasting
Peiyun Hu, Aaron Huang, John Dolan, David Held, Deva Ramanan
Safe local motion planning for autonomous driving in dynamic environments requires forecasting how the scene evolves. [Expand]
Thursday Poster Session
Wide-Depth-Range 6D Object Pose Estimation in Space
Yinlin Hu, Sebastien Speierer, Wenzel Jakob, Pascal Fua, Mathieu Salzmann
6D pose estimation in space poses unique challenges that are not commonly encountered in the terrestrial setting. [Expand]
Friday Poster Session
Self-Supervised 3D Mesh Reconstruction From Single Images
Tao Hu, Liwei Wang, Xiaogang Xu, Shu Liu, Jiaya Jia
Recent single-view 3D reconstruction methods reconstruct an object's shape and texture from a single image with only 2D image-level annotation. [Expand]
Tuesday Poster Session
Self-Supervised Video GANs: Learning for Appearance Consistency and Motion Coherency
Sangeek Hyun, Jihwan Kim, Jae-Pil Heo
A video can be represented by the composition of appearance and motion. [Expand]
Wednesday Poster Session
Shape From Sky: Polarimetric Normal Recovery Under the Sky
Tomoki Ichikawa, Matthew Purri, Ryo Kawahara, Shohei Nobuhara, Kristin Dana, Ko Nishino
The sky exhibits a unique spatial polarization pattern by scattering the unpolarized sunlight. [Expand]
Thursday Poster Session
Optimal Quantization Using Scaled Codebook
Yerlan Idelbayev, Pavlo Molchanov, Maying Shen, Hongxu Yin, Miguel A. Carreira-Perpinan, Jose M. Alvarez
We study the problem of quantizing N sorted, scalar datapoints with a fixed codebook containing K entries that are allowed to be rescaled. [Expand]
Thursday Poster Session
3D Shape Generation With Grid-Based Implicit Functions
Moritz Ibing, Isaak Lim, Leif Kobbelt
Previous approaches to generate shapes in a 3D setting train a GAN on the latent space of an autoencoder (AE). [Expand]
Thursday Poster Session
Facial Action Unit Detection With Transformers
Geethu Miriam Jacob, Bjorn Stenger
The Facial Action Coding System is a taxonomy for fine-grained facial expression analysis. [Expand]
Wednesday Poster Session
CAMERAS: Enhanced Resolution and Sanity Preserving Class Activation Mapping for Image Saliency
Mohammad A. A. K. Jalwana, Naveed Akhtar, Mohammed Bennamoun, Ajmal Mian
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input. [Expand]
Friday Poster Session
Learning Compositional Representation for 4D Captures With Neural ODE
Boyan Jiang, Yinda Zhang, Xingkui Wei, Xiangyang Xue, Yanwei Fu
Learning based representation has become the key to the success of many computer vision systems. [Expand]
Tuesday Poster Session
UV-Net: Learning From Boundary Representations
Pradeep Kumar Jayaraman, Aditya Sanghi, Joseph G. Lambourne, Karl D.D. Willis, Thomas Davies, Hooman Shayani, Nigel Morris
We introduce UV-Net, a novel neural network architecture and representation designed to operate directly on Boundary representation (B-rep) data from 3D CAD models. [Expand]
Thursday Poster Session
Mining Better Samples for Contrastive Learning of Temporal Correspondence
Sangryul Jeon, Dongbo Min, Seungryong Kim, Kwanghoon Sohn
We present a novel framework for contrastive learning of pixel-level representation using only unlabeled video. [Expand]
Monday Poster Session
Saliency-Guided Image Translation
Lai Jiang, Mai Xu, Xiaofei Wang, Leonid Sigal
In this paper, we propose a novel task for saliency-guided image translation, with the goal of image-to-image translation conditioned on the user specified saliency map. [Expand]
Friday Poster Session
IoU Attack: Towards Temporally Coherent Black-Box Adversarial Attack for Visual Object Tracking
Shuai Jia, Yibing Song, Chao Ma, Xiaokang Yang
Adversarial attack arises due to the vulnerability of deep neural networks to perceive input samples injected with imperceptible perturbations. [Expand]
Tuesday Poster Session
Leveraging Line-Point Consistence To Preserve Structures for Wide Parallax Image Stitching
Qi Jia, ZhengJun Li, Xin Fan, Haotian Zhao, Shiyu Teng, Xinchen Ye, Longin Jan Latecki
Generating high-quality stitched images with natural structures is a challenging task in computer vision. [Expand]
Thursday Poster Session
Amalgamating Knowledge From Heterogeneous Graph Neural Networks
Yongcheng Jing, Yiding Yang, Xinchao Wang, Mingli Song, Dacheng Tao
In this paper, we study a novel knowledge transfer task in the domain of graph neural networks (GNNs). [Expand]
Friday Poster Session
Harmonious Semantic Line Detection via Maximal Weight Clique Selection
Dongkwon Jin, Wonhui Park, Seong-Gyun Jeong, Chang-Su Kim
A novel algorithm to detect an optimal set of semantic lines is proposed in this work. [Expand]
Friday Poster Session
Turning Frequency to Resolution: Video Super-Resolution via Event Cameras
Yongcheng Jing, Yiding Yang, Xinchao Wang, Mingli Song, Dacheng Tao
State-of-the-art video super-resolution (VSR) methods focus on exploiting inter- and intra-frame correlations to estimate high-resolution (HR) video frames from low-resolution (LR) ones. [Expand]
Wednesday Poster Session
Cross-Modal Center Loss for 3D Cross-Modal Retrieval
Longlong Jing, Elahe Vahdani, Jiaxing Tan, Yingli Tian
Cross-modal retrieval aims to learn discriminative and modal-invariant features for data from different modalities. [Expand]
Tuesday Poster Session
Learning Calibrated Medical Image Segmentation via Multi-Rater Agreement Modeling
Wei Ji, Shuang Yu, Junde Wu, Kai Ma, Cheng Bian, Qi Bi, Jingjing Li, Hanruo Liu, Li Cheng, Yefeng Zheng
In medical image analysis, it is typical to collect multiple annotations, each from a different clinical expert or rater, in the expectation that possible diagnostic errors could be mitigated. [Expand]
Thursday Poster Session
Calibrated RGB-D Salient Object Detection
Wei Ji, Jingjing Li, Shuang Yu, Miao Zhang, Yongri Piao, Shunyu Yao, Qi Bi, Kai Ma, Yefeng Zheng, Huchuan Lu, Li Cheng
Complex backgrounds and similar appearances between objects and their surroundings are generally recognized as challenging scenarios in Salient Object Detection (SOD). [Expand]
Wednesday Poster Session
Practical Single-Image Super-Resolution Using Look-Up Table
Younghyun Jo, Seon Joo Kim
A number of super-resolution (SR) algorithms from interpolation to deep neural networks (DNN) have emerged to restore or create missing details of the input low-resolution image. [Expand]
Monday Poster Session
Joint Deep Model-Based MR Image and Coil Sensitivity Reconstruction Network (Joint-ICNet) for Fast MRI
Yohan Jun, Hyungseob Shin, Taejoon Eo, Dosik Hwang
Magnetic resonance imaging (MRI) can provide diagnostic information with high-resolution and high-contrast images. [Expand]
Tuesday Poster Session
Time Adaptive Recurrent Neural Network
Anil Kag, Venkatesh Saligrama
We propose a learning method that dynamically modifies the time-constants of the continuous-time counterpart of a vanilla RNN. [Expand]
Thursday Poster Session
Tackling the Ill-Posedness of Super-Resolution Through Adaptive Target Generation
Younghyun Jo, Seoung Wug Oh, Peter Vajda, Seon Joo Kim
By the one-to-many nature of the super-resolution (SR) problem, a single low-resolution (LR) image can be mapped to many high-resolution (HR) images. [Expand]
Friday Poster Session
Unsupervised Learning of Depth and Depth-of-Field Effect From Natural Images With Aperture Rendering Generative Adversarial Networks
Takuhiro Kaneko
Understanding the 3D world from 2D projected natural images is a fundamental challenge in computer vision and graphics. [Expand]
Friday Poster Session
Relative Order Analysis and Optimization for Unsupervised Deep Metric Learning
Shichao Kan, Yigang Cen, Yang Li, Vladimir Mladenovic, Zhihai He
In unsupervised learning of image features without labels, especially on datasets with fine-grained object classes, it is often very difficult to tell if a given image belongs to one specific object class or another, even for human eyes. [Expand]
Thursday Poster Session
Zero-Shot Single Image Restoration Through Controlled Perturbation of Koschmieder's Model
Aupendu Kar, Sobhan Kanti Dhara, Debashis Sen, Prabir Kumar Biswas
Real-world image degradation due to light scattering can be described based on Koschmieder's model. [Expand]
Friday Poster Session
Differentiable Diffusion for Dense Depth Estimation From Multi-View Images
Numair Khan, Min H. Kim, James Tompkin
We present a method to estimate dense depth by optimizing a sparse set of points such that their diffusion into a depth map minimizes a multi-view reprojection error from RGB supervision. [Expand]
Wednesday Poster Session
Guided Integrated Gradients: An Adaptive Path Method for Removing Noise
Andrei Kapishnikov, Subhashini Venugopalan, Besim Avci, Ben Wedin, Michael Terry, Tolga Bolukbasi
Integrated Gradients (IG) is a commonly used feature attribution method for deep neural networks. [Expand]
Tuesday Poster Session
Neural Side-by-Side: Predicting Human Preferences for No-Reference Super-Resolution Evaluation
Valentin Khrulkov, Artem Babenko
Super-resolution based on deep convolutional networks is currently gaining much attention from both academia and industry. [Expand]
Tuesday Poster Session
Discriminative Appearance Modeling With Multi-Track Pooling for Real-Time Multi-Object Tracking
Chanho Kim, Li Fuxin, Mazen Alotaibi, James M. Rehg
In multi-object tracking, the tracker maintains in its memory the appearance and motion information for each object in the scene. [Expand]
Wednesday Poster Session
Joint Negative and Positive Learning for Noisy Labels
Youngdong Kim, Juseung Yun, Hyounguk Shon, Junmo Kim
Training of Convolutional Neural Networks (CNNs) with data with noisy labels is known to be a challenge. [Expand]
Wednesday Poster Session
High-Quality Stereo Image Restoration From Double Refraction
Hakyeong Kim, Andreas Meuleman, Daniel S. Jeon, Min H. Kim
Single-shot monocular birefractive stereo methods have been used for estimating sparse depth from double refraction over edges. [Expand]
Thursday Poster Session
Not Just Compete, but Collaborate: Local Image-to-Image Translation via Cooperative Mask Prediction
Daejin Kim, Mohammad Azam Khan, Jaegul Choo
Facial attribute editing aims to manipulate the image with the desired attribute while preserving the other details. [Expand]
Tuesday Poster Session
Quality-Agnostic Image Recognition via Invertible Decoder
Insoo Kim, Seungju Han, Ji-won Baek, Seong-Jin Park, Jae-Joon Han, Jinwoo Shin
Despite the remarkable performance of deep models on image recognition tasks, they are known to be susceptible to common corruptions such as blur, noise, and low-resolution. [Expand]
Thursday Poster Session
Prototype-Guided Saliency Feature Learning for Person Search
Hanjae Kim, Sunghun Joung, Ig-Jae Kim, Kwanghoon Sohn
Existing person search methods integrate person detection and re-identification (re-ID) modules into a unified system. [Expand]
Tuesday Poster Session
QPP: Real-Time Quantization Parameter Prediction for Deep Neural Networks
Vladimir Kryzhanovskiy, Gleb Balitskiy, Nikolay Kozyrskiy, Aleksandr Zuruev
Modern deep neural networks (DNNs) cannot be effectively used in mobile and embedded devices due to strict requirements for computational complexity, memory, and power consumption. [Expand]
Wednesday Poster Session
T-vMF Similarity for Regularizing Intra-Class Feature Distribution
Takumi Kobayashi
Deep convolutional neural networks (CNNs) leverage large-scale training datasets to produce remarkable performance on various image classification tasks. [Expand]
Tuesday Poster Session
Controllable Image Restoration for Under-Display Camera in Smartphones
Kinam Kwon, Eunhee Kang, Sangwon Lee, Su-Jin Lee, Hyong-Euk Lee, ByungIn Yoo, Jae-Joon Han
Under-display camera (UDC) technology is essential for full-screen display in smartphones and is achieved by removing the concept of drilling holes on display. [Expand]
Monday Poster Session
IMODAL: Creating Learnable User-Defined Deformation Models
Leander Lacroix, Benjamin Charlier, Alain Trouve, Barbara Gris
A natural way to model the evolution of an object (growth of a leaf for instance) is to estimate a plausible deforming path between two observations. [Expand]
Thursday Poster Session
3D Video Stabilization With Depth Estimation by CNN-Based Optimization
Yao-Chih Lee, Kuan-Wei Tseng, Yu-Ta Chen, Chien-Cheng Chen, Chu-Song Chen, Yi-Ping Hung
Video stabilization is an essential component of visual quality enhancement. [Expand]
Wednesday Poster Session
Restoring Extremely Dark Images in Real Time
Mohit Lamba, Kaushik Mitra
A practical low-light enhancement solution must be computationally fast, memory-efficient, and achieve a visually appealing restoration. [Expand]
Tuesday Poster Session
Blocks-World Cameras
Jongho Lee, Mohit Gupta
For several vision and robotics applications, 3D geometry of man-made environments such as indoor scenes can be represented with a small number of dominant planes. [Expand]
Thursday Poster Session
CoSMo: Content-Style Modulation for Image Retrieval With Text Feedback
Seungmin Lee, Dongwan Kim, Bohyung Han
We tackle the task of image retrieval with text feedback, where a reference image and modifier text are combined to identify the desired target image. [Expand]
Monday Poster Session
Iterative Filter Adaptive Network for Single Image Defocus Deblurring
Junyong Lee, Hyeongseok Son, Jaesung Rim, Sunghyun Cho, Seungyong Lee
We propose a novel end-to-end learning-based approach for single image defocus deblurring. [Expand]
Monday Poster Session
DRANet: Disentangling Representation and Adaptation Networks for Unsupervised Cross-Domain Adaptation
Seunghun Lee, Sunghyun Cho, Sunghoon Im
In this paper, we present DRANet, a network architecture that disentangles image representations and transfers the visual attributes in a latent space for unsupervised cross-domain adaptation. [Expand]
Thursday Poster Session
PatchMatch-Based Neighborhood Consensus for Semantic Correspondence
Jae Yong Lee, Joseph DeGol, Victor Fragoso, Sudipta N. Sinha
We address estimating dense correspondences between two images depicting different but semantically related scenes. [Expand]
Thursday Poster Session
Network Quantization With Element-Wise Gradient Scaling
Junghyup Lee, Dohyung Kim, Bumsub Ham
Network quantization aims at reducing bit-widths of weights and/or activations, particularly important for implementing deep neural networks with limited hardware resources. [Expand]
Tuesday Poster Session
Relevance-CAM: Your Model Already Knows Where To Look
Jeong Ryong Lee, Sewon Kim, Inyong Park, Taejoon Eo, Dosik Hwang
With increasing fields of application for neural networks and the development of neural networks, the ability to explain deep learning models is also becoming increasingly important. [Expand]
Thursday Poster Session
Video Prediction Recalling Long-Term Motion Context via Memory Alignment Learning
Sangmin Lee, Hak Gu Kim, Dae Hwi Choi, Hyung-Il Kim, Yong Man Ro
Our work addresses long-term motion context issues for predicting future frames. [Expand]
Tuesday Poster Session
Picasso: A CUDA-Based Library for Deep Learning Over 3D Meshes
Huan Lei, Naveed Akhtar, Ajmal Mian
We present Picasso, a CUDA-based library comprising novel modules for deep learning over complex real-world 3D meshes. [Expand]
Thursday Poster Session
RangeIoUDet: Range Image Based Real-Time 3D Object Detector Optimized by Intersection Over Union
Zhidong Liang, Zehan Zhang, Ming Zhang, Xian Zhao, Shiliang Pu
Real-time and high-performance 3D object detection is an attractive research direction in autonomous driving. [Expand]
Wednesday Poster Session
4D Hyperspectral Photoacoustic Data Restoration With Reliability Analysis
Weihang Liao, Art Subpa-asa, Yinqiang Zheng, Imari Sato
Hyperspectral photoacoustic (HSPA) spectroscopy is an emerging bi-modal imaging technology that is able to show the wavelength-dependent absorption distribution of the interior of a 3D volume. [Expand]
Tuesday Poster Session
COMPLETER: Incomplete Multi-View Clustering via Contrastive Prediction
Yijie Lin, Yuanbiao Gou, Zitao Liu, Boyun Li, Jiancheng Lv, Xi Peng
In this paper, we study two challenging problems in incomplete multi-view clustering analysis, namely, i) how to learn an informative and consistent representation among different views without the help of labels and ii) how to recover the missing views from data. [Expand]
Wednesday Poster Session
Learning Salient Boundary Feature for Anchor-free Temporal Action Localization
Chuming Lin, Chengming Xu, Donghao Luo, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yanwei Fu
Temporal action localization is an important yet challenging task in video understanding. [Expand]
Tuesday Poster Session
Multi-View Multi-Person 3D Pose Estimation With Plane Sweep Stereo
Jiahao Lin, Gim Hee Lee
Existing approaches for multi-view multi-person 3D pose estimation explicitly establish cross-view correspondences to group 2D pose detections from multiple camera views and solve for the 3D pose estimation for each person. [Expand]
Thursday Poster Session
Rich Context Aggregation With Reflection Prior for Glass Surface Detection
Jiaying Lin, Zebang He, Rynson W.H. Lau
Glass surfaces appear everywhere. [Expand]
Thursday Poster Session
Adaptive Cross-Modal Prototypes for Cross-Domain Visual-Language Retrieval
Yang Liu, Qingchao Chen, Samuel Albanie
In this paper, we study the task of visual-text retrieval in the highly practical setting in which labelled visual data with paired text descriptions are available in one domain (the "source"), but only unlabelled visual data (without text descriptions) are available in the domain of interest (the "target"). [Expand]
Thursday Poster Session
What Can Style Transfer and Paintings Do for Model Robustness?
Hubert Lin, Mitchell van Zuijlen, Sylvia C. Pont, Maarten W.A. Wijntjes, Kavita Bala
A common strategy for improving model robustness is through data augmentations. [Expand]
Wednesday Poster Session
Cluster-Wise Hierarchical Generative Model for Deep Amortized Clustering
Huafeng Liu, Jiaqi Wang, Liping Jing
In this paper, we propose Cluster-wise Hierarchical Generative Model for deep amortized clustering (CHiGac). [Expand]
Thursday Poster Session
Context-Aware Biaffine Localizing Network for Temporal Sentence Grounding
Daizong Liu, Xiaoye Qu, Jianfeng Dong, Pan Zhou, Yu Cheng, Wei Wei, Zichuan Xu, Yulai Xie
This paper addresses the problem of temporal sentence grounding (TSG), which aims to identify the temporal boundary of a specific segment from an untrimmed video by a sentence query. [Expand]
Wednesday Poster Session
Deep Learning in Latent Space for Video Prediction and Compression
Bowen Liu, Yu Chen, Shiyu Liu, Hun-Seok Kim
Learning-based video compression has achieved substantial progress during recent years. [Expand]
Monday Poster Session
Exploring and Distilling Posterior and Prior Knowledge for Radiology Report Generation
Fenglin Liu, Xian Wu, Shen Ge, Wei Fan, Yuexian Zou
Automatically generating radiology reports can improve current clinical practice in diagnostic radiology. [Expand]
Thursday Poster Session
Exploit Visual Dependency Relations for Semantic Segmentation
Mingyuan Liu, Dan Schonfeld, Wei Tang
Dependency relations among visual entities are ubiquitous because both objects and scenes are highly structured. [Expand]
Wednesday Poster Session
Fully Understanding Generic Objects: Modeling, Segmentation, and Reconstruction
Feng Liu, Luan Tran, Xiaoming Liu
Inferring 3D structure of a generic object from a 2D image is a long-standing objective of computer vision. [Expand]
Wednesday Poster Session
Generic Perceptual Loss for Modeling Structured Output Dependencies
Yifan Liu, Hao Chen, Yu Chen, Wei Yin, Chunhua Shen
The perceptual loss has been widely used as an effective loss term in image synthesis tasks including image super-resolution [16], and style transfer [14]. [Expand]
Tuesday Poster Session
iMiGUE: An Identity-Free Video Dataset for Micro-Gesture Understanding and Emotion Analysis
Xin Liu, Henglin Shi, Haoyu Chen, Zitong Yu, Xiaobai Li, Guoying Zhao
We introduce a new dataset for the emotional artificial intelligence research: identity-free video dataset for micro-gesture understanding and emotion analysis (iMiGUE). [Expand]
Wednesday Poster Session
Learning To Warp for Style Transfer
Xiao-Chang Liu, Yong-Liang Yang, Peter Hall
Since its inception in 2015, Style Transfer has focused on texturing a content image using an art exemplar. [Expand]
Tuesday Poster Session
Mask-Embedded Discriminator With Region-Based Semantic Regularization for Semi-Supervised Class-Conditional Image Synthesis
Yi Liu, Xiaoyang Huo, Tianyi Chen, Xiangping Zeng, Si Wu, Zhiwen Yu, Hau-San Wong
Semi-supervised generative learning (SSGL) makes use of unlabeled data to achieve a trade-off between the data collection/annotation effort and generation performance, when adequate labeled data are not available. [Expand]
Tuesday Poster Session
Neighborhood Normalization for Robust Geometric Feature Learning
Xingtong Liu, Benjamin D. Killeen, Ayushi Sinha, Masaru Ishii, Gregory D. Hager, Russell H. Taylor, Mathias Unberath
Extracting geometric features from 3D models is a common first step in applications such as 3D registration, tracking, and scene flow estimation. [Expand]
Thursday Poster Session
RankDetNet: Delving Into Ranking Constraints for Object Detection
Ji Liu, Dong Li, Rongzhang Zheng, Lu Tian, Yi Shan
Modern object detection approaches cast detecting objects as optimizing two subtasks of classification and localization simultaneously. [Expand]
Monday Poster Session
PluckerNet: Learn To Register 3D Line Reconstructions
Liu Liu, Hongdong Li, Haodong Yao, Ruyi Zha
Aligning two partially-overlapped 3D line reconstructions in Euclidean space is challenging, as we need to simultaneously solve line correspondences and relative pose between reconstructions. [Expand]
Monday Poster Session
Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation
Yahui Liu, Enver Sangineto, Yajing Chen, Linchao Bao, Haoxian Zhang, Nicu Sebe, Bruno Lepri, Wei Wang, Marco De Nadai
Image-to-Image (I2I) multi-domain translation models are usually evaluated also using the quality of their semantic interpolation results. [Expand]
Wednesday Poster Session
Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos
Jiawei Liu, Zheng-Jun Zha, Wei Wu, Kecheng Zheng, Qibin Sun
Video-based person re-identification aims to match pedestrians from video sequences across non-overlapping camera views. [Expand]
Tuesday Poster Session
3D Human Action Representation Learning via Cross-View Consistency Pursuit
Linguo Li, Minsi Wang, Bingbing Ni, Hang Wang, Jiancheng Yang, Wenjun Zhang
In this work, we propose a Cross-view Contrastive Learning framework for unsupervised 3D skeleton-based action representation (CrosSCLR), by leveraging multi-view complementary supervision signal. [Expand]
Tuesday Poster Session
Adaptive Prototype Learning and Allocation for Few-Shot Segmentation
Gen Li, Varun Jampani, Laura Sevilla-Lara, Deqing Sun, Jonghyun Kim, Joongkyu Kim
Prototype learning is extensively used for few-shot segmentation. [Expand]
Wednesday Poster Session
Combined Depth Space Based Architecture Search for Person Re-Identification
Hanjun Li, Gaojie Wu, Wei-Shi Zheng
Most works on person re-identification (ReID) take advantage of large backbone networks such as ResNet, which are designed for image classification instead of ReID, for feature extraction. [Expand]
Tuesday Poster Session
Diverse Part Discovery: Occluded Person Re-Identification With Part-Aware Transformer
Yulin Li, Jianfeng He, Tianzhu Zhang, Xiang Liu, Yongdong Zhang, Feng Wu
Occluded person re-identification (Re-ID) is a challenging task as persons are frequently occluded by various obstacles or other persons, especially in the crowd scenario. [Expand]
Tuesday Poster Session
Domain Consensus Clustering for Universal Domain Adaptation
Guangrui Li, Guoliang Kang, Yi Zhu, Yunchao Wei, Yi Yang
In this paper, we investigate the Universal Domain Adaptation (UniDA) problem, which aims to transfer the knowledge from source to target under an unaligned label space. [Expand]
Wednesday Poster Session
Dynamic Class Queue for Large Scale Face Recognition in the Wild
Bi Li, Teng Xi, Gang Zhang, Haocheng Feng, Junyu Han, Jingtuo Liu, Errui Ding, Wenyu Liu
Learning discriminative representation using large-scale face datasets in the wild is crucial for real-world applications, yet it remains challenging. [Expand]
Tuesday Poster Session
Dynamic Domain Adaptation for Efficient Inference
Shuang Li, JinMing Zhang, Wenxuan Ma, Chi Harold Liu, Wei Li
Domain adaptation (DA) enables knowledge transfer from a labeled source domain to an unlabeled target domain by reducing the cross-domain distribution discrepancy. [Expand]
Wednesday Poster Session
Dynamic Transfer for Multi-Source Domain Adaptation
Yunsheng Li, Lu Yuan, Yinpeng Chen, Pei Wang, Nuno Vasconcelos
Recent works of multi-source domain adaptation focus on learning a domain-agnostic model, of which the parameters are static. [Expand]
Wednesday Poster Session
Ego-Exo: Transferring Visual Representations From Third-Person to First-Person Videos
Yanghao Li, Tushar Nagarajan, Bo Xiong, Kristen Grauman
We introduce an approach for pre-training egocentric video models using large-scale third-person video datasets. [Expand]
Tuesday Poster Session
FaceInpainter: High Fidelity Face Adaptation to Heterogeneous Domains
Jia Li, Zhaoyang Li, Jie Cao, Xingguang Song, Ran He
In this work, we propose a novel two-stage framework named FaceInpainter to implement controllable Identity-Guided Face Inpainting (IGFI) under heterogeneous domains. [Expand]
Tuesday Poster Session
Few-Shot Object Detection via Classification Refinement and Distractor Retreatment
Yiting Li, Haiyue Zhu, Yu Cheng, Wenxin Wang, Chek Sing Teo, Cheng Xiang, Prahlad Vadakkepat, Tong Heng Lee
We aim to tackle the challenging Few-Shot Object Detection (FSOD) where data-scarce categories are presented during the model learning. [Expand]
Thursday Poster Session
Hilbert Sinkhorn Divergence for Optimal Transport
Qian Li, Zhichao Wang, Gang Li, Jun Pang, Guandong Xu
Sinkhorn divergence has become a very popular metric to compare probability distributions in optimal transport. [Expand]
Tuesday Poster Session
Learning To Identify Correct 2D-2D Line Correspondences on Sphere
Haoang Li, Kai Chen, Ji Zhao, Jiangliu Wang, Pyojin Kim, Zhe Liu, Yun-Hui Liu
Given a set of putative 2D-2D line correspondences, we aim to identify correct matches. [Expand]
Thursday Poster Session
Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression
Wanhua Li, Xiaoke Huang, Jiwen Lu, Jianjiang Feng, Jie Zhou
Uncertainty is the only certainty there is. [Expand]
Thursday Poster Session
Lighting, Reflectance and Geometry Estimation From 360° Panoramic Stereo
Junxuan Li, Hongdong Li, Yasuyuki Matsushita
We propose a method for estimating high-definition spatially-varying lighting, reflectance, and geometry of a scene from 360° stereo images. [Expand]
Wednesday Poster Session
Generalizing to the Open World: Deep Visual Odometry With Online Adaptation
Shunkai Li, Xin Wu, Yingdian Cao, Hongbin Zha
Although learning-based visual odometry (VO) has shown impressive results in recent years, the pretrained networks may easily collapse in unseen environments. [Expand]
Thursday Poster Session
Meta-Mining Discriminative Samples for Kinship Verification
Wanhua Li, Shiwei Wang, Jiwen Lu, Jianjiang Feng, Jie Zhou
Kinship verification aims to find out whether there is a kin relation for a given pair of facial images. [Expand]
Friday Poster Session
Probabilistic Model Distillation for Semantic Correspondence
Xin Li, Deng-Ping Fan, Fan Yang, Ao Luo, Hong Cheng, Zicheng Liu
Semantic correspondence is a fundamental problem in computer vision, which aims at establishing dense correspondences across images depicting different instances under the same category. [Expand]
Wednesday Poster Session
Progressive Stage-Wise Learning for Unsupervised Feature Representation Enhancement
Zefan Li, Chenxi Liu, Alan Yuille, Bingbing Ni, Wenjun Zhang, Wen Gao
Unsupervised learning methods have recently shown their competitiveness against supervised training. [Expand]
Wednesday Poster Session
Representing Videos As Discriminative Sub-Graphs for Action Recognition
Dong Li, Zhaofan Qiu, Yingwei Pan, Ting Yao, Houqiang Li, Tao Mei
Human actions are typically of combinatorial structures or patterns, i.e., subjects, objects, plus spatio-temporal interactions in between. [Expand]
Tuesday Poster Session
Spatial Assembly Networks for Image Representation Learning
Yang Li, Shichao Kan, Jianhe Yuan, Wenming Cao, Zhihai He
It has been long recognized that deep neural networks are sensitive to changes in spatial configurations or scene structures. [Expand]
Thursday Poster Session
SelfDoc: Self-Supervised Document Representation Learning
Peizhao Li, Jiuxiang Gu, Jason Kuen, Vlad I. Morariu, Handong Zhao, Rajiv Jain, Varun Manjunatha, Hongfu Liu
We propose SelfDoc, a task-agnostic pre-training framework for document image understanding. [Expand]
Tuesday Poster Session
Self-Supervised Video Hashing via Bidirectional Transformers
Shuyan Li, Xiu Li, Jiwen Lu, Jie Zhou
Most existing unsupervised video hashing methods are built on unidirectional models with less reliable training objectives, which underuse the correlations among frames and the similarity structure between videos. [Expand]
Thursday Poster Session
Spatial Feature Calibration and Temporal Fusion for Effective One-Stage Video Instance Segmentation
Minghan Li, Shuai Li, Lida Li, Lei Zhang
Modern one-stage video instance segmentation networks suffer from two limitations. [Expand]
Wednesday Poster Session
Spherical Confidence Learning for Face Recognition
Shen Li, Jianqing Xu, Xiaqing Xu, Pengcheng Shen, Shaoxin Li, Bryan Hooi
An emerging line of research has found that spherical spaces better match the underlying geometry of facial images, as evidenced by the state-of-the-art facial recognition methods which benefit empirically from spherical representations. [Expand]
Friday Poster Session
The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures
Yawei Li, Wen Li, Martin Danelljan, Kai Zhang, Shuhang Gu, Luc Van Gool, Radu Timofte
In this paper, we tackle the problem of convolutional neural network design. [Expand]
Monday Poster Session
Towards Compact CNNs via Collaborative Compression
Yuchao Li, Shaohui Lin, Jianzhuang Liu, Qixiang Ye, Mengdi Wang, Fei Chao, Fan Yang, Jincheng Ma, Qi Tian, Rongrong Ji
Channel pruning and tensor decomposition have received extensive attention in convolutional neural network compression. [Expand]
Tuesday Poster Session
Toward Accurate and Realistic Outfits Visualization With Attention to Details
Kedan Li, Min Jin Chong, Jeffrey Zhang, Jingen Liu
Virtual try-on methods aim to generate images of fashion models wearing arbitrary combinations of garments. [Expand]
Thursday Poster Session
Transferable Semantic Augmentation for Domain Adaptation
Shuang Li, Mixue Xie, Kaixiong Gong, Chi Harold Liu, Yulin Wang, Wei Li
Domain adaptation has been widely explored by transferring the knowledge from a label-rich source domain to a related but unlabeled target domain. [Expand]
Thursday Poster Session
Transformation Invariant Few-Shot Object Detection
Aoxue Li, Zhenguo Li
Few-shot object detection (FSOD) aims to learn detectors that can be generalized to novel classes with only a few instances. [Expand]
Tuesday Poster Session
Three Birds with One Stone: Multi-Task Temporal Action Detection via Recycling Temporal Annotations
Zhihui Li, Lina Yao
Temporal action detection on unconstrained videos has seen significant research progress in recent years. [Expand]
Tuesday Poster Session
VirFace: Enhancing Face Recognition via Unlabeled Shallow Data
Wenyu Li, Tianchu Guo, Pengyu Li, Binghui Chen, Biao Wang, Wangmeng Zuo, Lei Zhang
Recently, exploiting the effect of unlabeled data for face recognition has attracted increasing attention. [Expand]
Thursday Poster Session
CLCC: Contrastive Learning for Color Constancy
Yi-Chen Lo, Chia-Che Chang, Hsuan-Chao Chiu, Yu-Hao Huang, Chia-Ping Chen, Yu-Lin Chang, Kevin Jou
In this paper, we present CLCC, a novel contrastive learning framework for color constancy. [Expand]
Wednesday Poster Session
Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks
Xiaoxiao Long, Lingjie Liu, Wei Li, Christian Theobalt, Wenping Wang
We present a novel method for multi-view depth estimation from a single video, which is a critical task in various applications, such as perception, reconstruction and robot navigation. [Expand]
Wednesday Poster Session
Radar-Camera Pixel Depth Association for Depth Completion
Yunfei Long, Daniel Morris, Xiaoming Liu, Marcos Castro, Punarjay Chakravarty, Praveen Narayanan
While radar and video data can be readily fused at the detection level, fusing them at the pixel level is potentially more beneficial. [Expand]
Thursday Poster Session
Conditional Bures Metric for Domain Adaptation
You-Wei Luo, Chuan-Xian Ren
As a vital problem in classification-oriented transfer, unsupervised domain adaptation (UDA) has attracted widespread attention in recent years. [Expand]
Thursday Poster Session
Action Unit Memory Network for Weakly Supervised Temporal Action Localization
Wang Luo, Tianzhu Zhang, Wenfei Yang, Jingen Liu, Tao Mei, Feng Wu, Yongdong Zhang
Weakly supervised temporal action localization aims to detect and localize actions in untrimmed videos with only video-level labels during training. [Expand]
Wednesday Poster Session
Normalized Avatar Synthesis Using StyleGAN and Perceptual Refinement
Huiwen Luo, Koki Nagano, Han-Wei Kung, Qingguo Xu, Zejian Wang, Lingyu Wei, Liwen Hu, Hao Li
We introduce a highly robust GAN-based framework for digitizing a normalized 3D avatar of a person from a single unconstrained photo. [Expand]
Thursday Poster Session
Scalable Differential Privacy With Sparse Network Finetuning
Zelun Luo, Daniel J. Wu, Ehsan Adeli, Li Fei-Fei
We propose a novel method for privacy-preserving training of deep neural networks leveraging public, out-domain data. [Expand]
Tuesday Poster Session
Intelligent Carpet: Inferring 3D Human Pose From Tactile Signals
Yiyue Luo, Yunzhu Li, Michael Foshey, Wan Shou, Pratyusha Sharma, Tomas Palacios, Antonio Torralba, Wojciech Matusik
Daily human activities, e.g., locomotion, exercises, and resting, are heavily guided by the tactile interactions between the human and the ground. [Expand]
Wednesday Poster Session
Stay Positive: Non-Negative Image Synthesis for Augmented Reality
Katie Luo, Guandao Yang, Wenqi Xian, Harald Haraldsson, Bharath Hariharan, Serge Belongie
In applications such as optical see-through and projector augmented reality, producing images amounts to solving non-negative image generation, where one can only add light to an existing image. [Expand]
Wednesday Poster Session
Large-Capacity Image Steganography Based on Invertible Neural Networks
Shao-Ping Lu, Rong Wang, Tao Zhong, Paul L. Rosin
Many attempts have been made to hide information in images, where the main challenge is how to increase the payload capacity without the container image being detected as containing a message. [Expand]
Wednesday Poster Session
CGA-Net: Category Guided Aggregation for Point Cloud Semantic Segmentation
Tao Lu, Limin Wang, Gangshan Wu
Previous point cloud semantic segmentation networks use the same process to aggregate features from neighbors of the same category and different categories. [Expand]
Thursday Poster Session
Dual-GAN: Joint BVP and Noise Modeling for Remote Physiological Measurement
Hao Lu, Hu Han, S. Kevin Zhou
Remote photoplethysmography (rPPG) based physiological measurement has great application values in health monitoring, emotion analysis, etc. [Expand]
Thursday Poster Session
MASA-SR: Matching Acceleration and Spatial Adaptation for Reference-Based Image Super-Resolution
Liying Lu, Wenbo Li, Xin Tao, Jiangbo Lu, Jiaya Jia
Reference-based image super-resolution (RefSR) has shown promising success in recovering high-frequency details by utilizing an external reference image (Ref). [Expand]
Tuesday Poster Session
Personalized Outfit Recommendation With Learnable Anchors
Zhi Lu, Yang Hu, Yan Chen, Bing Zeng
The multimedia community has recently seen a tremendous surge of interest in the fashion recommendation problem. [Expand]
Thursday Poster Session
Learning Normal Dynamics in Videos With Meta Prototype Network
Hui Lv, Chen Chen, Zhen Cui, Chunyan Xu, Yong Li, Jian Yang
Frame reconstruction (current or future frames) based on Auto-Encoder (AE) is a popular method for video anomaly detection. [Expand]
Thursday Poster Session
Progressive Modality Reinforcement for Human Multimodal Emotion Recognition From Unaligned Multimodal Sequences
Fengmao Lv, Xiang Chen, Yanyong Huang, Lixin Duan, Guosheng Lin
Human multimodal emotion recognition involves time-series data of different modalities, such as natural language, visual motions, and acoustic behaviors. [Expand]
Monday Poster Session
Residential Floor Plan Recognition and Reconstruction
Xiaolei Lv, Shengchu Zhao, Xinyang Yu, Binqiang Zhao
Recognition and reconstruction of residential floor plan drawings are important and challenging in design, decoration, and architectural remodeling fields. [Expand]
Friday Poster Session
Towards Evaluating and Training Verifiably Robust Neural Networks
Zhaoyang Lyu, Minghao Guo, Tong Wu, Guodong Xu, Kehuan Zhang, Dahua Lin
Recent works have shown that interval bound propagation (IBP) can be used to train verifiably robust neural networks. [Expand]
Tuesday Poster Session
Efficient Multi-Stage Video Denoising With Recurrent Spatio-Temporal Fusion
Matteo Maggioni, Yibin Huang, Cheng Li, Shuai Xiao, Zhongqian Fu, Fenglong Song
In recent years, denoising methods based on deep learning have achieved unparalleled performance at the cost of large computational complexity. [Expand]
Tuesday Poster Session
MultiLink: Multi-Class Structure Recovery via Agglomerative Clustering and Model Selection
Luca Magri, Filippo Leveni, Giacomo Boracchi
We address the problem of recovering multiple structures of different classes in a dataset contaminated by noise and outliers. [Expand]
Monday Poster Session
Gradient Forward-Propagation for Large-Scale Temporal Video Modelling
Mateusz Malinowski, Dimitrios Vytiniotis, Grzegorz Swirszcz, Viorica Patraucean, Joao Carreira
How can neural networks be trained on large-volume temporal data efficiently? To compute the gradients required to update parameters, backpropagation blocks computations until the forward and backward passes are completed. [Expand]
Wednesday Poster Session
Magic Layouts: Structural Prior for Component Detection in User Interface Designs
Dipu Manandhar, Hailin Jin, John Collomosse
We present Magic Layouts, a method for parsing screenshots or hand-drawn sketches of user interface (UI) layouts. [Expand]
Friday Poster Session
CapsuleRRT: Relationships-Aware Regression Tracking via Capsules
Ding Ma, Xiangqian Wu
Regression tracking has gained more and more attention thanks to its easy-to-implement characteristics, while existing regression trackers rarely consider the relationships between the object parts and the complete object. [Expand]
Wednesday Poster Session
Weakly Supervised Action Selection Learning in Video
Junwei Ma, Satya Krishna Gorti, Maksims Volkovs, Guangwei Yu
Localizing actions in video is a core task in computer vision. [Expand]
Wednesday Poster Session
Image Super-Resolution With Non-Local Sparse Attention
Yiqun Mei, Yuchen Fan, Yuqian Zhou
Both non-local (NL) operation and sparse representation are crucial for Single Image Super-Resolution (SISR). [Expand]
Tuesday Poster Session
Real-Time Sphere Sweeping Stereo From Multiview Fisheye Images
Andreas Meuleman, Hyeonjoong Jang, Daniel S. Jeon, Min H. Kim
A set of cameras with fisheye lenses has been used to capture a wide field of view. [Expand]
Thursday Poster Session
VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild
Jiaxu Miao, Yunchao Wei, Yu Wu, Chen Liang, Guangrui Li, Yi Yang
In this paper, we present a new dataset with the target of advancing the scene parsing task from images to videos. [Expand]
Tuesday Poster Session
PVGNet: A Bottom-Up One-Stage 3D Object Detector With Integrated Multi-Level Features
Zhenwei Miao, Jikai Chen, Hongyu Pan, Ruiwen Zhang, Kaixuan Liu, Peihan Hao, Jun Zhu, Yang Wang, Xin Zhan
Quantization-based methods are widely used in LiDAR points 3D object detection for their efficiency in extracting context information. [Expand]
Tuesday Poster Session
Physically-Aware Generative Network for 3D Shape Modeling
Mariem Mezghanni, Malika Boulkenafed, Andre Lieutier, Maks Ovsjanikov
Shapes are often designed to satisfy structural properties and serve a particular functionality in the physical world. [Expand]
Wednesday Poster Session
HDMapGen: A Hierarchical Graph Generative Model of High Definition Maps
Lu Mi, Hang Zhao, Charlie Nash, Xiaohan Jin, Jiyang Gao, Chen Sun, Cordelia Schmid, Nir Shavit, Yuning Chai, Dragomir Anguelov
High Definition (HD) maps are maps with precise definitions of road lanes with rich semantics of the traffic rules. [Expand]
Tuesday Poster Session
Wasserstein Barycenter for Multi-Source Domain Adaptation
Eduardo Fernandes Montesuma, Fred Maurice Ngole Mboula
Multi-source domain adaptation is a key technique that allows a model to be trained on data coming from various probability distributions. [Expand]
Friday Poster Session
Seeing Behind Objects for 3D Multi-Object Tracking in RGB-D Sequences
Norman Muller, Yu-Shiang Wong, Niloy J. Mitra, Angela Dai, Matthias Niessner
Multi-object tracking from RGB-D video sequences is a challenging problem due to the combination of changing viewpoints, motion, and occlusions over time. [Expand]
Tuesday Poster Session
Extreme Low-Light Environment-Driven Image Denoising Over Permanently Shadowed Lunar Regions With a Physical Noise Model
Ben Moseley, Valentin Bickel, Ignacio G. Lopez-Francos, Loveneesh Rana
Recently, learning-based approaches have achieved impressive results in the field of low-light image denoising. [Expand]
Tuesday Poster Session
Interventional Video Grounding With Dual Contrastive Learning
Guoshun Nan, Rui Qiao, Yao Xiao, Jun Liu, Sicong Leng, Hao Zhang, Wei Lu
Video grounding aims to localize a moment from an untrimmed video for a given textual query. [Expand]
Monday Poster Session
All Labels Are Not Created Equal: Enhancing Semi-Supervision via Label Grouping and Co-Training
Islam Nassar, Samitha Herath, Ehsan Abbasnejad, Wray Buntine, Gholamreza Haffari
Pseudo-labeling is a key component in semi-supervised learning (SSL). [Expand]
Wednesday Poster Session
Divide-and-Conquer for Lane-Aware Diverse Trajectory Prediction
Sriram Narayanan, Ramin Moslemi, Francesco Pittaluga, Buyu Liu, Manmohan Chandraker
Trajectory prediction is a safety-critical tool for autonomous vehicles to plan and execute actions. [Expand]
Friday Poster Session
FixBi: Bridging Domain Spaces for Unsupervised Domain Adaptation
Jaemin Na, Heechul Jung, Hyung Jin Chang, Wonjun Hwang
Unsupervised domain adaptation (UDA) methods for learning domain invariant representations have achieved remarkable progress. [Expand]
Monday Poster Session
Pedestrian and Ego-Vehicle Trajectory Prediction From Monocular Camera
Lukas Neumann, Andrea Vedaldi
Predicting future pedestrian trajectory is a crucial component of autonomous driving systems, as recognizing critical situations based only on current pedestrian position may come too late for any meaningful corrective action (e.g. [Expand]
Wednesday Poster Session
Dictionary-Guided Scene Text Recognition
Nguyen Nguyen, Thu Nguyen, Vinh Tran, Minh-Triet Tran, Thanh Duc Ngo, Thien Huu Nguyen, Minh Hoai
Language prior plays an important role in the way humans perceive and recognize text in the wild. [Expand]
Wednesday Poster Session
Discovering Relationships Between Object Categories via Universal Canonical Maps
Natalia Neverova, Artsiom Sanakoyeu, Patrick Labatut, David Novotny, Andrea Vedaldi
We tackle the problem of learning the geometry of multiple categories of deformable objects jointly. [Expand]
Monday Poster Session
Clusformer: A Transformer Based Clustering Approach to Unsupervised Large-Scale Face and Visual Landmark Recognition
Xuan-Bac Nguyen, Duc Toan Bui, Chi Nhan Duong, Tien D. Bui, Khoa Luu
The research in automatic unsupervised visual clustering has received considerable attention over the last couple of years. [Expand]
Wednesday Poster Session
FAPIS: A Few-Shot Anchor-Free Part-Based Instance Segmenter
Khoi Nguyen, Sinisa Todorovic
This paper is about few-shot instance segmentation, where training and test image sets do not share the same object classes. [Expand]
Wednesday Poster Session
Controlling the Rain: From Removal to Rendering
Siqi Ni, Xueyun Cao, Tao Yue, Xuemei Hu
Existing rain image editing methods focus on either removing rain from rain images or rendering rain on rain-free images. [Expand]
Tuesday Poster Session
HVPR: Hybrid Voxel-Point Representation for Single-Stage 3D Object Detection
Jongyoun Noh, Sanghoon Lee, Bumsub Ham
We address the problem of 3D object detection, that is, estimating 3D object bounding boxes from point clouds. [Expand]
Thursday Poster Session
Automated Log-Scale Quantization for Low-Cost Deep Neural Networks
Sangyun Oh, Hyeonuk Sim, Sugil Lee, Jongeun Lee
Quantization plays an important role in deep neural network (DNN) hardware. [Expand]
Monday Poster Session
Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation
Youngmin Oh, Beomjun Kim, Bumsub Ham
We address the problem of weakly-supervised semantic segmentation (WSSS) using bounding box annotations. [Expand]
Tuesday Poster Session
Protecting Intellectual Property of Generative Adversarial Networks From Ambiguity Attacks
Ding Sheng Ong, Chee Seng Chan, Kam Woh Ng, Lixin Fan, Qiang Yang
Ever since Machine Learning as a Service emerged as a viable business that utilizes deep learning models to generate lucrative revenue, Intellectual Property Right (IPR) has become a major concern because these deep learning models can easily be replicated, shared, and re-distributed by any unauthorized third parties. [Expand]
Tuesday Poster Session
A Quasiconvex Formulation for Radial Cameras
Carl Olsson, Viktor Larsson, Fredrik Kahl
In this paper we study structure from motion problems for 1D radial cameras. [Expand]
Thursday Poster Session
Bilinear Parameterization for Non-Separable Singular Value Penalties
Marcus Valtonen Ornhag, Jose Pedro Iglesias, Carl Olsson
Low rank inducing penalties have been proven to successfully uncover fundamental structures considered in computer vision and machine learning; however, such methods generally lead to non-convex optimization problems. [Expand]
Tuesday Poster Session
Neural Auto-Exposure for High-Dynamic Range Object Detection
Emmanuel Onzon, Fahim Mannan, Felix Heide
Real-world scenes have a dynamic range of up to 280 dB that today's imaging sensors cannot directly capture. [Expand]
Wednesday Poster Session
SDD-FIQA: Unsupervised Face Image Quality Assessment With Similarity Distribution Distance
Fu-Zhao Ou, Xingyu Chen, Ruixin Zhang, Yuge Huang, Shaoxin Li, Jilin Li, Yong Li, Liujuan Cao, Yuan-Gen Wang
In recent years, Face Image Quality Assessment (FIQA) has become an indispensable part of the face recognition system to guarantee the stability and reliability of recognition performance in an unconstrained scenario. [Expand]
Wednesday Poster Session
Fast Sinkhorn Filters: Using Matrix Scaling for Non-Rigid Shape Correspondence With Functional Maps
Gautam Pai, Jing Ren, Simone Melzi, Peter Wonka, Maks Ovsjanikov
In this paper, we provide a theoretical foundation for pointwise map recovery from functional maps and highlight its relation to a range of shape correspondence methods based on spectral alignment. [Expand]
Monday Poster Session
Synthesize-It-Classifier: Learning a Generative Classifier Through Recurrent Self-Analysis
Arghya Pal, Raphael C.-W. Phan, KokSheik Wong
In this work, we show the generative capability of an image classifier network by synthesizing high-resolution, photo-realistic, and diverse images at scale. [Expand]
Tuesday Poster Session
Generalization on Unseen Domains via Inference-Time Label-Preserving Target Projections
Prashant Pandey, Mrigank Raman, Sumanth Varambally, Prathosh AP
Generalization of machine learning models trained on a set of source domains on unseen target domains with different statistics, is a challenging problem. [Expand]
Thursday Poster Session
Trajectory Prediction With Latent Belief Energy-Based Model
Bo Pang, Tianyang Zhao, Xu Xie, Ying Nian Wu
Human trajectory prediction is critical for autonomous platforms like self-driving cars or social robots. [Expand]
Thursday Poster Session
Recorrupted-to-Recorrupted: Unsupervised Deep Learning for Image Denoising
Tongyao Pang, Huan Zheng, Yuhui Quan, Hui Ji
Deep denoiser, the deep network for denoising, has been the focus of the recent development on image denoising. [Expand]
Monday Poster Session
Unsupervised Hyperbolic Representation Learning via Message Passing Auto-Encoders
Jiwoong Park, Junho Cho, Hyung Jin Chang, Jin Young Choi
Most of the existing literature regarding hyperbolic embedding concentrate upon supervised learning, whereas the use of unsupervised hyperbolic embedding is less well explored. [Expand]
Tuesday Poster Session
Learning To Predict Visual Attributes in the Wild
Khoi Pham, Kushal Kafle, Zhe Lin, Zhihong Ding, Scott Cohen, Quan Tran, Abhinav Shrivastava
Visual attributes constitute a large portion of information contained in a scene. [Expand]
Thursday Poster Session
SliceNet: Deep Dense Depth Estimation From a Single Indoor Panorama Using a Slice-Based Representation
Giovanni Pintore, Marco Agus, Eva Almansa, Jens Schneider, Enrico Gobbetti
We introduce a novel deep neural network to estimate a depth map from a single monocular indoor panorama. [Expand]
Thursday Poster Session
Recognizing Actions in Videos From Unseen Viewpoints
AJ Piergiovanni, Michael S. Ryoo
Standard methods for video recognition use large CNNs designed to capture spatio-temporal data. [Expand]
Tuesday Poster Session
CompositeTasking: Understanding Images by Spatial Composition of Tasks
Nikola Popovic, Danda Pani Paudel, Thomas Probst, Guolei Sun, Luc Van Gool
We define the concept of CompositeTasking as the fusion of multiple, spatially distributed tasks, for various aspects of image understanding. [Expand]
Tuesday Poster Session
A Functional Approach to Rotation Equivariant Non-Linearities for Tensor Field Networks.
Adrien Poulenard, Leonidas J. Guibas
Learning pose invariant representation is a fundamental problem in shape analysis. [Expand]
Thursday Poster Session
Labeled From Unlabeled: Exploiting Unlabeled Data for Few-Shot Deep HDR Deghosting
K. Ram Prabhakar, Gowtham Senthil, Susmit Agrawal, R. Venkatesh Babu, Rama Krishna Sai S Gorthi
High Dynamic Range (HDR) deghosting is an indispensable tool in capturing wide dynamic range scenes without ghosting artifacts. [Expand]
Tuesday Poster Session
Deep Multi-Task Learning for Joint Localization, Perception, and Prediction
John Phillips, Julieta Martinez, Ioan Andrei Barsan, Sergio Casas, Abbas Sadat, Raquel Urtasun
Over the last few years, we have witnessed tremendous progress on many subtasks of autonomous driving including perception, motion forecasting, and motion planning. [Expand]
Tuesday Poster Session
BABEL: Bodies, Action and Behavior With English Labels
Abhinanda R. Punnakkal, Arjun Chandrasekaran, Nikos Athanasiou, Alejandra Quiros-Ramirez, Michael J. Black
Understanding the semantics of human movement -- the what, how and why of the movement -- is an important problem that requires datasets of human actions with semantic labels. [Expand]
Monday Poster Session
Boosting Video Representation Learning With Multi-Faceted Integration
Zhaofan Qiu, Ting Yao, Chong-Wah Ngo, Xiao-Ping Zhang, Dong Wu, Tao Mei
Video content is multifaceted, consisting of objects, scenes, interactions or actions. [Expand]
Thursday Poster Session
Effective Snapshot Compressive-Spectral Imaging via Deep Denoising and Total Variation Priors
Haiquan Qiu, Yao Wang, Deyu Meng
Snapshot compressive imaging (SCI) is a new type of compressive imaging system that compresses multiple frames of images into a single snapshot measurement, which enjoys low cost, low bandwidth, and high-speed sensing rate. [Expand]
Wednesday Poster Session
PQA: Perceptual Question Answering
Yonggang Qi, Kai Zhang, Aneeshan Sain, Yi-Zhe Song
Perceptual organization remains one of the very few established theories on the human visual system. [Expand]
Thursday Poster Session
Multi-Scale Aligned Distillation for Low-Resolution Detection
Lu Qi, Jason Kuen, Jiuxiang Gu, Zhe Lin, Yi Wang, Yukang Chen, Yanwei Li, Jiaya Jia
In instance-level detection tasks (e.g., object detection), reducing input resolution is an easy option to improve runtime efficiency. [Expand]
Thursday Poster Session
Removing Raindrops and Rain Streaks in One Go
Ruijie Quan, Xin Yu, Yuanzhi Liang, Yi Yang
Existing rain-removal algorithms often tackle either rain streak removal or raindrop removal, and thus may fail to handle real-world rainy scenes. [Expand]
Wednesday Poster Session
DyGLIP: A Dynamic Graph Model With Link Prediction for Accurate Multi-Camera Multiple Object Tracking
Kha Gia Quach, Pha Nguyen, Huu Le, Thanh-Dat Truong, Chi Nhan Duong, Minh-Triet Tran, Khoa Luu
Multi-Camera Multiple Object Tracking (MC-MOT) is a significant computer vision problem due to its emerging applicability in several real-world applications. [Expand]
Thursday Poster Session
Exploiting & Refining Depth Distributions With Triangulation Light Curtains
Yaadhav Raaj, Siddharth Ancha, Robert Tamburo, David Held, Srinivasa G. Narasimhan
Active sensing through the use of Adaptive Depth Sensors is a nascent field, with potential in areas such as Advanced driver-assistance systems (ADAS). [Expand]
Wednesday Poster Session
DAT: Training Deep Networks Robust To Label-Noise by Matching the Feature Distributions
Yuntao Qu, Shasha Mo, Jianwei Niu
In real application scenarios, the performance of deep networks may be degraded when the dataset contains noisy labels. [Expand]
Tuesday Poster Session
Flow Guided Transformable Bottleneck Networks for Motion Retargeting
Jian Ren, Menglei Chai, Oliver J. Woodford, Kyle Olszewski, Sergey Tulyakov
Human motion retargeting aims to transfer the motion of one person in a driving video or set of images to another person. [Expand]
Wednesday Poster Session
Adaptive Consistency Prior Based Deep Network for Image Denoising
Chao Ren, Xiaohai He, Chuncheng Wang, Zhibo Zhao
Recent studies have shown that deep networks can achieve promising results for image denoising. [Expand]
Wednesday Poster Session
Reciprocal Transformations for Unsupervised Video Object Segmentation
Sucheng Ren, Wenxi Liu, Yongtuo Liu, Haoxin Chen, Guoqiang Han, Shengfeng He
Unsupervised video object segmentation (UVOS) aims at segmenting the primary objects in videos without any human intervention. [Expand]
Thursday Poster Session
Learning From the Master: Distilling Cross-Modal Advanced Knowledge for Lip Reading
Sucheng Ren, Yong Du, Jianming Lv, Guoqiang Han, Shengfeng He
Lip reading aims to predict the spoken sentences from silent lip videos. [Expand]
Thursday Poster Session
End-to-End High Dynamic Range Camera Pipeline Optimization
Nicolas Robidoux, Luis E. Garcia Capel, Dong-eun Seo, Avinash Sharma, Federico Ariza, Felix Heide
With a 280 dB dynamic range, the real world is a High Dynamic Range (HDR) world. [Expand]
Tuesday Poster Session
Gaussian Context Transformer
Dongsheng Ruan, Daiyin Wang, Yuan Zheng, Nenggan Zheng, Min Zheng
Recently, a large number of channel attention blocks are proposed to boost the representational power of deep convolutional neural networks (CNNs). [Expand]
Thursday Poster Session
Learning-Based Image Registration With Meta-Regularization
Ebrahim Al Safadi, Xubo Song
We introduce a meta-regularization framework for learning-based image registration. [Expand]
Wednesday Poster Session
Learning an Explicit Weighting Scheme for Adapting Complex HSI Noise
Xiangyu Rui, Xiangyong Cao, Qi Xie, Zongsheng Yue, Qian Zhao, Deyu Meng
A general approach for handling hyperspectral image (HSI) denoising issue is to impose weights on different HSI pixels to suppress negative influence brought by noisy elements. [Expand]
Tuesday Poster Session
Multi-Perspective LSTM for Joint Visual Representation Learning
Alireza Sepas-Moghaddam, Fernando Pereira, Paulo Lobato Correia, Ali Etemad
We present a novel LSTM cell architecture capable of learning both intra- and inter-perspective relationships available in visual sequences captured from multiple perspectives. [Expand]
Friday Poster Session
Introvert: Human Trajectory Prediction via Conditional 3D Attention
Nasim Shafiee, Taskin Padir, Ehsan Elhamifar
Predicting human trajectories is an important component of autonomous moving platforms, such as social robots and self-driving cars. [Expand]
Friday Poster Session
Nighttime Visibility Enhancement by Increasing the Dynamic Range and Suppression of Light Effects
Aashish Sharma, Robby T. Tan
Most existing nighttime visibility enhancement methods focus on low light. [Expand]
Thursday Poster Session
CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching
Zhelun Shen, Yuchao Dai, Zhibo Rao
Recently, the ever-increasing capacity of large-scale annotated datasets has led to profound progress in stereo matching. [Expand]
Thursday Poster Session
Structure-Aware Face Clustering on a Large-Scale Graph With 10^7 Nodes
Shuai Shen, Wanhua Li, Zheng Zhu, Guan Huang, Dalong Du, Jiwen Lu, Jie Zhou
Face clustering is a promising method for annotating unlabeled face images. [Expand]
Wednesday Poster Session
Toward Joint Thing-and-Stuff Mining for Weakly Supervised Panoptic Segmentation
Yunhang Shen, Liujuan Cao, Zhiwei Chen, Feihong Lian, Baochang Zhang, Chi Su, Yongjian Wu, Feiyue Huang, Rongrong Ji
Panoptic segmentation aims to partition an image to object instances and semantic content for thing and stuff categories, respectively. [Expand]
Friday Poster Session
clDice - A Novel Topology-Preserving Loss Function for Tubular Structure Segmentation
Suprosanna Shit, Johannes C. Paetzold, Anjany Sekuboyina, Ivan Ezhov, Alexander Unger, Andrey Zhylka, Josien P. W. Pluim, Ulrich Bauer, Bjoern H. Menze
Accurate segmentation of tubular, network-like structures, such as vessels, neurons, or roads, is relevant to many fields of research. [Expand]
Friday Poster Session
Hierarchical Layout-Aware Graph Convolutional Network for Unified Aesthetics Assessment
Dongyu She, Yu-Kun Lai, Gaoxiong Yi, Kun Xu
Learning computational models of image aesthetics can have a substantial impact on visual art and graphic design. [Expand]
Wednesday Poster Session
Learning by Planning: Language-Guided Global Image Editing
Jing Shi, Ning Xu, Yihang Xu, Trung Bui, Franck Dernoncourt, Chenliang Xu
Recently, language-guided global image editing draws increasing attention with growing application potentials. [Expand]
Thursday Poster Session
GLAVNet: Global-Local Audio-Visual Cues for Fine-Grained Material Recognition
Fengmin Shi, Jie Guo, Haonan Zhang, Shan Yang, Xiying Wang, Yanwen Guo
In this paper, we aim to recognize materials with combined use of auditory and visual perception. [Expand]
Thursday Poster Session
Learning Spatial-Semantic Relationship for Facial Attribute Recognition With Limited Labeled Data
Ying Shu, Yan Yan, Si Chen, Jing-Hao Xue, Chunhua Shen, Hanzi Wang
Recent advances in deep learning have demonstrated excellent results for Facial Attribute Recognition (FAR), typically trained with large-scale labeled data. [Expand]
Thursday Poster Session
Communication Efficient SGD via Gradient Sampling With Bayes Prior
Liuyihan Song, Kang Zhao, Pan Pan, Yu Liu, Yingya Zhang, Yinghui Xu, Rong Jin
Gradient compression has been widely adopted in data-parallel distributed training of deep neural networks to reduce communication overhead. [Expand]
Thursday Poster Session
Co-Grounding Networks With Semantic Attention for Referring Expression Comprehension in Videos
Sijie Song, Xudong Lin, Jiaying Liu, Zongming Guo, Shih-Fu Chang
In this paper, we address the problem of referring expression comprehension in videos, which is challenging due to complex expression and scene dynamics. [Expand]
Monday Poster Session
Hybrid Message Passing With Performance-Driven Structures for Facial Action Unit Detection
Tengfei Song, Zijun Cui, Wenming Zheng, Qiang Ji
Message passing neural network has been an effective method to represent dependencies among nodes by propagating messages. [Expand]
Tuesday Poster Session
Mesh Saliency: An Independent Perceptual Measure or a Derivative of Image Saliency?
Ran Song, Wei Zhang, Yitian Zhao, Yonghuai Liu, Paul L. Rosin
While mesh saliency aims to predict regional importance of 3D surfaces in agreement with human visual perception and is well researched in computer vision and graphics, latest work with eye-tracking experiments shows that state-of-the-art mesh saliency methods remain poor at predicting human fixations. [Expand]
Wednesday Poster Session
Tree-Like Decision Distillation
Jie Song, Haofei Zhang, Xinchao Wang, Mengqi Xue, Ying Chen, Li Sun, Dacheng Tao, Mingli Song
Knowledge distillation pursues a diminutive yet well-behaved student network by harnessing the knowledge learned by a cumbersome teacher model. [Expand]
Thursday Poster Session
Spatio-temporal Contrastive Domain Adaptation for Action Recognition
Xiaolin Song, Sicheng Zhao, Jingyu Yang, Huanjing Yue, Pengfei Xu, Runbo Hu, Hua Chai
Unsupervised domain adaptation (UDA) for human action recognition is a practical and challenging problem. [Expand]
Wednesday Poster Session
Dynamic Probabilistic Graph Convolution for Facial Action Unit Intensity Estimation
Tengfei Song, Zijun Cui, Yuru Wang, Wenming Zheng, Qiang Ji
Deep learning methods have been widely applied to automatic facial action unit (AU) intensity estimation and achieved state-of-the-art performance. [Expand]
Tuesday Poster Session
Improving Multiple Pedestrian Tracking by Track Management and Occlusion Handling
Daniel Stadler, Jurgen Beyerer
Multi-pedestrian trackers perform well when targets are clearly visible making the association task quite easy. [Expand]
Wednesday Poster Session
Gated Spatio-Temporal Attention-Guided Video Deblurring
Maitreya Suin, A. N. Rajagopalan
Video deblurring remains a challenging task due to the complexity of spatially and temporally varying blur. [Expand]
Wednesday Poster Session
Deep RGB-D Saliency Detection With Depth-Sensitive Attention and Automatic Multi-Modal Fusion
Peng Sun, Wenhu Zhang, Huanyu Wang, Songyuan Li, Xi Li
RGB-D salient object detection (SOD) is usually formulated as a problem of classification or regression over two modalities, i.e., RGB and depth. [Expand]
Monday Poster Session
Indoor Panorama Planar 3D Reconstruction via Divide and Conquer
Cheng Sun, Chi-Wei Hsiao, Ning-Hsu Wang, Min Sun, Hwann-Tzong Chen
Indoor panorama typically consists of human-made structures parallel or perpendicular to gravity. [Expand]
Thursday Poster Session
Learning View Selection for 3D Scenes
Yifan Sun, Qixing Huang, Dun-Yu Hsiao, Li Guan, Gang Hua
Efficient 3D space sampling to represent an underlying 3D object/scene is essential for 3D vision, robotics, and beyond. [Expand]
Thursday Poster Session
Deep Video Matting via Spatio-Temporal Alignment and Aggregation
Yanan Sun, Guanzhi Wang, Qiao Gu, Chi-Keung Tang, Yu-Wing Tai
Despite the significant progress made by deep learning in natural image matting, there has been so far no representative work on deep learning for video matting due to the inherent technical challenges in reasoning temporal domain and lack of large-scale video matting datasets. [Expand]
Tuesday Poster Session
Lesion-Aware Transformers for Diabetic Retinopathy Grading
Rui Sun, Yihao Li, Tianzhu Zhang, Zhendong Mao, Feng Wu, Yongdong Zhang
Diabetic retinopathy (DR) is the leading cause of permanent blindness in the working-age population. [Expand]
Wednesday Poster Session
RSN: Range Sparse Net for Efficient, Accurate LiDAR 3D Object Detection
Pei Sun, Weiyue Wang, Yuning Chai, Gamaleldin Elsayed, Alex Bewley, Xiao Zhang, Cristian Sminchisescu, Dragomir Anguelov
The detection of 3D objects from LiDAR data is a critical component in most autonomous driving systems. [Expand]
Tuesday Poster Session
Soteria: Provable Defense Against Privacy Leakage in Federated Learning From Representation Perspective
Jingwei Sun, Ang Li, Binghui Wang, Huanrui Yang, Hai Li, Yiran Chen
Federated learning (FL) is a popular distributed learning framework that can reduce privacy risks by not explicitly sharing private data. [Expand]
Wednesday Poster Session
Semantic Image Matting
Yanan Sun, Chi-Keung Tang, Yu-Wing Tai
Natural image matting separates the foreground from background in fractional occupancy which can be caused by highly transparent objects, complex foreground (e.g., net or tree), and/or objects containing very fine details (e.g., hairs). [Expand]
Wednesday Poster Session
Tuning IR-Cut Filter for Illumination-Aware Spectral Reconstruction From RGB
Bo Sun, Junchi Yan, Xiao Zhou, Yinqiang Zheng
To reconstruct spectral signals from multi-channel observations, in particular trichromatic RGBs, has recently emerged as a promising alternative to traditional scanning-based spectral imager. [Expand]
Monday Poster Session
Uncertainty Reduction for Model Adaptation in Semantic Segmentation
Prabhu Teja S, Francois Fleuret
Traditional methods for Unsupervised Domain Adaptation (UDA) targeting semantic segmentation exploit information common to the source and target domains, using both labeled source data and unlabeled target data. [Expand]
Wednesday Poster Session
ArtCoder: An End-to-End Method for Generating Scanning-Robust Stylized QR Codes
Hao Su, Jianwei Niu, Xuefeng Liu, Qingfeng Li, Ji Wan, Mingliang Xu, Tao Ren
Quick Response (QR) code is one of the most worldwide used two-dimensional codes. [Expand]
Monday Poster Session
Self-Supervised Wasserstein Pseudo-Labeling for Semi-Supervised Image Classification
Fariborz Taherkhani, Ali Dabouei, Sobhan Soleymani, Jeremy Dawson, Nasser M. Nasrabadi
The goal is to use Wasserstein metric to provide pseudo labels for the unlabeled images to train a Convolutional Neural Networks (CNN) in a Semi-Supervised Learning (SSL) manner for the classification task. [Expand]
Thursday Poster Session
Event-Based Bispectral Photometry Using Temporally Modulated Illumination
Tsuyoshi Takatani, Yuzuha Ito, Ayaka Ebisu, Yinqiang Zheng, Takahito Aoto
Analysis of bispectral difference plays a critical role in various applications that involve rays propagating in a light absorbing medium. [Expand]
Friday Poster Session
Humble Teachers Teach Better Students for Semi-Supervised Object Detection
Yihe Tang, Weifeng Chen, Yijun Luo, Yuting Zhang
We propose a semi-supervised approach for contemporary object detectors following the teacher-student dual model framework. [Expand]
Tuesday Poster Session
Leveraging Large-Scale Weakly Labeled Data for Semi-Supervised Mass Detection in Mammograms
Yuxing Tang, Zhenjie Cao, Yanbo Zhang, Zhicheng Yang, Zongcheng Ji, Yiwei Wang, Mei Han, Jie Ma, Jing Xiao, Peng Chang
Mammographic mass detection is an integral part of a computer-aided diagnosis system. [Expand]
Tuesday Poster Session
Mutual CRF-GNN for Few-Shot Learning
Shixiang Tang, Dapeng Chen, Lei Bai, Kaijian Liu, Yixiao Ge, Wanli Ouyang
Graph-neural-networks (GNN) is a rising trend for few-shot learning. [Expand]
Monday Poster Session
SKFAC: Training Neural Networks With Faster Kronecker-Factored Approximate Curvature
Zedong Tang, Fenlong Jiang, Maoguo Gong, Hao Li, Yue Wu, Fan Yu, Zidong Wang, Min Wang
The bottleneck of computation burden limits the widespread use of the 2nd order optimization algorithms for training deep neural networks. [Expand]
Thursday Poster Session
OTCE: A Transferability Metric for Cross-Domain Cross-Task Representations
Yang Tan, Yang Li, Shao-Lun Huang
Transfer learning across heterogeneous data distributions (a.k.a. [Expand]
Friday Poster Session
Mirror3D: Depth Refinement for Mirror Surfaces
Jiaqi Tan, Weijie Lin, Angel X. Chang, Manolis Savva
Despite recent progress in depth sensing and 3D reconstruction, mirror surfaces are a significant source of errors. [Expand]
Friday Poster Session
Can Audio-Visual Integration Strengthen Robustness Under Multimodal Attacks?
Yapeng Tian, Chenliang Xu
In this paper, we propose to make a systematic study on machines' multisensory perception under attacks. [Expand]
Tuesday Poster Session
Farewell to Mutual Information: Variational Distillation for Cross-Modal Person Re-Identification
Xudong Tian, Zhizhong Zhang, Shaohui Lin, Yanyun Qu, Yuan Xie, Lizhuang Ma
The Information Bottleneck (IB) provides an information theoretic principle for representation learning, by retaining all information relevant for predicting label while minimizing the redundancy. [Expand]
Monday Poster Session
Probabilistic Selective Encryption of Convolutional Neural Networks for Hierarchical Services
Jinyu Tian, Jiantao Zhou, Jia Duan
Model protection is vital when deploying Convolutional Neural Networks (CNNs) for commercial services, due to the massive costs of training them. [Expand]
Monday Poster Session
Post-Hoc Uncertainty Calibration for Domain Drift Scenarios
Christian Tomani, Sebastian Gruber, Muhammed Ebrar Erdem, Daniel Cremers, Florian Buettner
We address the problem of uncertainty calibration. [Expand]
Wednesday Poster Session
FaceSec: A Fine-Grained Robustness Evaluation Framework for Face Recognition Systems
Liang Tong, Zhengzhang Chen, Jingchao Ni, Wei Cheng, Dongjin Song, Haifeng Chen, Yevgeniy Vorobeychik
We present FACESEC, a framework for fine-grained robustness evaluation of face recognition systems. [Expand]
Thursday Poster Session
Automatic Correction of Internal Units in Generative Neural Networks
Ali Tousi, Haedong Jeong, Jiyeon Han, Hwanil Choi, Jaesik Choi
Generative Adversarial Networks (GANs) have shown satisfactory performance in synthetic image generation by devising complex network structure and adversarial training scheme. [Expand]
Wednesday Poster Session
Explore Image Deblurring via Encoded Blur Kernel Space
Phong Tran, Anh Tuan Tran, Quynh Phung, Minh Hoai
This paper introduces a method to encode the blur operators of an arbitrary dataset of sharp-blur image pairs into a blur kernel space. [Expand]
Thursday Poster Session
Reconsidering Representation Alignment for Multi-View Clustering
Daniel J. Trosten, Sigurd Lokse, Robert Jenssen, Michael Kampffmeyer
Aligning distributions of view representations is a core component of today's state of the art models for deep multi-view clustering. [Expand]
Monday Poster Session
SSLayout360: Semi-Supervised Indoor Layout Estimation From 360° Panorama
Phi Vu Tran
Recent years have seen flourishing research on both semi-supervised learning and 3D room layout reconstruction. [Expand]
Thursday Poster Session
ColorRL: Reinforced Coloring for End-to-End Instance Segmentation
Tran Anh Tuan, Nguyen Tuan Khoa, Tran Minh Quan, Won-Ki Jeong
Instance segmentation, the task of identifying and separating each individual object of interest in the image, is one of the actively studied research topics in computer vision. [Expand]
Friday Poster Session
Time Lens: Event-Based Video Frame Interpolation
Stepan Tulyakov, Daniel Gehrig, Stamatios Georgoulis, Julius Erbach, Mathias Gehrig, Yuanyou Li, Davide Scaramuzza
State-of-the-art frame interpolation methods generate intermediate frames by inferring object motions in the image from consecutive key-frames. [Expand]
Friday Poster Session
Uncertainty-Aware Camera Pose Estimation From Points and Lines
Alexander Vakhitov, Luis Ferraz, Antonio Agudo, Francesc Moreno-Noguer
Perspective-n-Point-and-Line (PnPL) algorithms aim at fast, accurate, and robust camera localization with respect to a 3D model from 2D-3D feature correspondences, being a major part of modern robotic and AR/VR systems. [Expand]
Tuesday Poster Session
Can We Characterize Tasks Without Labels or Features?
Bram Wallace, Ziyang Wu, Bharath Hariharan
The problem of expert model selection deals with choosing the appropriate pretrained network ("expert") to transfer to a target task. [Expand]
Monday Poster Session
A Self-Boosting Framework for Automated Radiographic Report Generation
Zhanyu Wang, Luping Zhou, Lei Wang, Xiu Li
Automated radiographic report generation is a challenging task since it requires to generate paragraphs describing fine-grained visual differences of cases, especially for those between the diseased and the healthy. [Expand]
Monday Poster Session
Contrastive Learning Based Hybrid Networks for Long-Tailed Image Classification
Peng Wang, Kai Han, Xiu-Shen Wei, Lei Zhang, Lei Wang
Learning discriminative image representations plays a vital role in long-tailed image classification because it can ease the classifier learning in imbalanced cases. [Expand]
Monday Poster Session
Deep Two-View Structure-From-Motion Revisited
Jianyuan Wang, Yiran Zhong, Yuchao Dai, Stan Birchfield, Kaihao Zhang, Nikolai Smolyanskiy, Hongdong Li
Two-view structure-from-motion (SfM) is the cornerstone of 3D reconstruction and visual SLAM. [Expand]
Wednesday Poster Session
Domain-Specific Suppression for Adaptive Object Detection
Yu Wang, Rui Zhang, Shuo Zhang, Miao Li, Yangyang Xia, Xishan Zhang, Shaoli Liu
Domain adaptation methods face performance degradation in object detection, as the complexity of tasks require more about the transferability of the model. [Expand]
Wednesday Poster Session
Dual Attention Suppression Attack: Generate Adversarial Camouflage in Physical World
Jiakai Wang, Aishan Liu, Zixin Yin, Shunchang Liu, Shiyu Tang, Xianglong Liu
Deep learning models are vulnerable to adversarial examples. [Expand]
Wednesday Poster Session
EvDistill: Asynchronous Events To End-Task Learning via Bidirectional Reconstruction-Guided Cross-Modal Knowledge Distillation
Lin Wang, Yujeong Chae, Sung-Hoon Yoon, Tae-Kyun Kim, Kuk-Jin Yoon
Event cameras sense per-pixel intensity changes and produce asynchronous event streams with high dynamic range and less motion blur, showing advantages over the conventional cameras. [Expand]
Monday Poster Session
FAIEr: Fidelity and Adequacy Ensured Image Caption Evaluation
Sijin Wang, Ziwei Yao, Ruiping Wang, Zhongqin Wu, Xilin Chen
Image caption evaluation is a crucial task, which involves the semantic perception and matching of image and text. [Expand]
Thursday Poster Session
FESTA: Flow Estimation via Spatial-Temporal Attention for Scene Point Clouds
Haiyan Wang, Jiahao Pang, Muhammad A. Lodhi, Yingli Tian, Dong Tian
Scene flow depicts the dynamics of a 3D scene, which is critical for various applications such as autonomous driving, robot navigation, AR/VR, etc. [Expand]
Thursday Poster Session
From Semantic Categories to Fixations: A Novel Weakly-Supervised Visual-Auditory Saliency Detection Approach
Guotao Wang, Chenglizhao Chen, Deng-Ping Fan, Aimin Hao, Hong Qin
Thanks to the rapid advances in the deep learning techniques and the wide availability of large-scale training sets, the performances of video saliency detection models have been improving steadily and significantly. [Expand]
Thursday Poster Session
Gradient-Based Algorithms for Machine Teaching
Pei Wang, Kabir Nagrecha, Nuno Vasconcelos
The problem of machine teaching is considered. [Expand]
Monday Poster Session
Improving OCR-Based Image Captioning by Incorporating Geometrical Relationship
Jing Wang, Jinhui Tang, Mingkun Yang, Xiang Bai, Jiebo Luo
OCR-based image captioning aims to automatically describe images based on all the visual entities (both visual objects and scene text) in images. [Expand]
Monday Poster Session
Glancing at the Patch: Anomaly Localization With Global and Local Feature Comparison
Shenzhi Wang, Liwei Wu, Lei Cui, Yujun Shen
Anomaly localization, with the purpose to segment the anomalous regions within images, is challenging due to the large variety of anomaly types. [Expand]
Monday Poster Session
LED2-Net: Monocular 360° Layout Estimation via Differentiable Depth Rendering
Fu-En Wang, Yu-Hsuan Yeh, Min Sun, Wei-Chen Chiu, Yi-Hsuan Tsai
Although significant progress has been made in room layout estimation, most methods aim to reduce the loss in the 2D pixel coordinate rather than exploiting the room structure in the 3D space. [Expand]
Thursday Poster Session
Multi-Decoding Deraining Network and Quasi-Sparsity Based Training
Yinglong Wang, Chao Ma, Bing Zeng
Existing deep deraining models are mainly learned via directly minimizing the statistical differences between rainy images and rain-free ground truths. [Expand]
Thursday Poster Session
PAUL: Procrustean Autoencoder for Unsupervised Lifting
Chaoyang Wang, Simon Lucey
Recent success in casting Non-rigid Structure from Motion (NRSfM) as an unsupervised deep learning problem has raised fundamental questions about what novelty in NRSfM prior could the deep learning offer. [Expand]
Monday Poster Session
PointAugmenting: Cross-Modal Augmentation for 3D Object Detection
Chunwei Wang, Chao Ma, Ming Zhu, Xiaokang Yang
Camera and LiDAR are two complementary sensors for 3D object detection in the autonomous driving context. [Expand]
Thursday Poster Session
Pseudo Facial Generation With Extreme Poses for Face Recognition
Guoli Wang, Jiaqi Ma, Qian Zhang, Jiwen Lu, Jie Zhou
Face recognition has achieved a great success in recent years, it is still challenging to recognize those facial images with extreme poses. [Expand]
Monday Poster Session
Representative Forgery Mining for Fake Face Detection
Chengrui Wang, Weihong Deng
Although vanilla Convolutional Neural Network (CNN) based detectors can achieve satisfactory performance on fake face detection, we observe that the detectors tend to seek forgeries in a limited region of the face, which reveals that the detectors lack an understanding of forgery. [Expand]
Thursday Poster Session
Rich Features for Perceptual Quality Assessment of UGC Videos
Yilin Wang, Junjie Ke, Hossein Talebi, Joong Gon Yim, Neil Birkbeck, Balu Adsumilli, Peyman Milanfar, Feng Yang
Video quality assessment for User Generated Content (UGC) is an important topic in both industry and academia. [Expand]
Thursday Poster Session
RSG: A Simple but Effective Module for Learning Imbalanced Datasets
Jianfeng Wang, Thomas Lukasiewicz, Xiaolin Hu, Jianfei Cai, Zhenghua Xu
Imbalanced datasets widely exist in practice and are a great challenge for training deep neural models with a good generalization on infrequent classes. [Expand]
Tuesday Poster Session
Single-Stage Instance Shadow Detection With Bidirectional Relation Learning
Tianyu Wang, Xiaowei Hu, Chi-Wing Fu, Pheng-Ann Heng
Instance shadow detection aims to find shadow instances paired with the objects that cast the shadows. [Expand]
Monday Poster Session
Structured Multi-Level Interaction Network for Video Moment Localization via Language Query
Hao Wang, Zheng-Jun Zha, Liang Li, Dong Liu, Jiebo Luo
We address the problem of localizing a specific moment described by a natural language query. [Expand]
Tuesday Poster Session
Unsupervised Visual Attention and Invariance for Reinforcement Learning
Xudong Wang, Long Lian, Stella X. Yu
Vision-based reinforcement learning (RL) has achieved tremendous success. [Expand]
Tuesday Poster Session
A Generalized Loss Function for Crowd Counting and Localization
Jia Wan, Ziquan Liu, Antoni B. Chan
Previous work shows that a better density map representation can improve the performance of crowd counting. [Expand]
Monday Poster Session
Self-Attention Based Text Knowledge Mining for Text Detection
Qi Wan, Haoqin Ji, Linlin Shen
Pre-trained models play an important role in deep learning based text detectors. [Expand]
Tuesday Poster Session
MetaAlign: Coordinating Domain Alignment and Classification for Unsupervised Domain Adaptation
Guoqiang Wei, Cuiling Lan, Wenjun Zeng, Zhibo Chen
For unsupervised domain adaptation (UDA), to alleviate the effect of domain shift, many approaches align the source and target domains in the feature space by adversarial learning or by explicitly aligning their statistics. [Expand]
Friday Poster Session
Shallow Feature Matters for Weakly Supervised Object Localization
Jun Wei, Qin Wang, Zhen Li, Sheng Wang, S. Kevin Zhou, Shuguang Cui
Weakly supervised object localization (WSOL) aims to localize objects by only utilizing image-level labels. [Expand]
Tuesday Poster Session
Autoregressive Stylized Motion Synthesis With Generative Flow
Yu-Hui Wen, Zhipeng Yang, Hongbo Fu, Lin Gao, Yanan Sun, Yong-Jin Liu
Motion style transfer is an important problem in many computer graphics and computer vision applications, including human animation, games, and robotics. [Expand]
Thursday Poster Session
Holistic 3D Human and Scene Mesh Estimation From Single View Images
Zhenzhen Weng, Serena Yeung
The 3D world limits the human body pose and the human body pose conveys information about the surrounding objects. [Expand]
Monday Poster Session
Learning Progressive Point Embeddings for 3D Point Cloud Generation
Cheng Wen, Baosheng Yu, Dacheng Tao
Generative models for 3D point clouds are extremely important for scene/object reconstruction applications in autonomous driving and robotics. [Expand]
Wednesday Poster Session
PMP-Net: Point Cloud Completion by Learning Multi-Step Point Moving Paths
Xin Wen, Peng Xiang, Zhizhong Han, Yan-Pei Cao, Pengfei Wan, Wen Zheng, Yu-Shen Liu
The task of point cloud completion aims to predict the missing part for an incomplete 3D shape. [Expand]
Wednesday Poster Session
Separating Skills and Concepts for Novel Visual Question Answering
Spencer Whitehead, Hui Wu, Heng Ji, Rogerio Feris, Kate Saenko
Generalization to out-of-distribution data has been a problem for Visual Question Answering (VQA) models. [Expand]
Tuesday Poster Session
Learning To Associate Every Segment for Video Panoptic Segmentation
Sanghyun Woo, Dahun Kim, Joon-Young Lee, In So Kweon
Temporal correspondence -- linking pixels or objects across frames -- is a fundamental supervisory signal for the video models. [Expand]
Monday Poster Session
Boosting Ensemble Accuracy by Revisiting Ensemble Diversity Metrics
Yanzhao Wu, Ling Liu, Zhongwei Xie, Ka-Ho Chow, Wenqi Wei
Neural network ensembles are gaining popularity by harnessing the complementary wisdom of multiple base models. [Expand]
Friday Poster Session
Embedded Discriminative Attention Mechanism for Weakly Supervised Semantic Segmentation
Tong Wu, Junshi Huang, Guangyu Gao, Xiaoming Wei, Xiaolin Wei, Xuan Luo, Chi Harold Liu
Weakly Supervised Semantic Segmentation (WSSS) with image-level annotation uses class activation maps from the classifier as pseudo-labels for semantic segmentation. [Expand]
Friday Poster Session
Discover Cross-Modality Nuances for Visible-Infrared Person Re-Identification
Qiong Wu, Pingyang Dai, Jie Chen, Chia-Wen Lin, Yongjian Wu, Feiyue Huang, Bineng Zhong, Rongrong Ji
Visible-infrared person re-identification (Re-ID) aims to match the pedestrian images of the same identity from different modalities. [Expand]
Tuesday Poster Session
Improving the Transferability of Adversarial Samples With Adversarial Transformations
Weibin Wu, Yuxin Su, Michael R. Lyu, Irwin King
Although deep neural networks (DNNs) have achieved tremendous performance in diverse vision challenges, they are surprisingly susceptible to adversarial examples, which are born of intentionally perturbing benign samples in a human-imperceptible fashion. [Expand]
Wednesday Poster Session
Progressive Unsupervised Learning for Visual Object Tracking
Qiangqiang Wu, Jia Wan, Antoni B. Chan
In this paper, we propose a progressive unsupervised learning (PUL) framework, which entirely removes the need for annotated training videos in visual tracking. [Expand]
Tuesday Poster Session
Towards Long-Form Video Understanding
Chao-Yuan Wu, Philipp Krahenbuhl
Our world offers a never-ending stream of visual stimuli, yet today's vision systems only accurately recognize patterns within a few seconds. [Expand]
Monday Poster Session
Improving Transferability of Adversarial Patches on Face Recognition With Generative Models
Zihao Xiao, Xianfeng Gao, Chilin Fu, Yinpeng Dong, Wei Gao, Xiaolu Zhang, Jun Zhou, Jun Zhu
Face recognition is greatly improved by deep convolutional neural networks (CNNs). [Expand]
Thursday Poster Session
Dynamic Weighted Learning for Unsupervised Domain Adaptation
Ni Xiao, Lei Zhang
Unsupervised domain adaptation (UDA) aims to improve the classification performance on an unlabeled target domain by leveraging information from a fully labeled source domain. [Expand]
Thursday Poster Session
Space-Time Distillation for Video Super-Resolution
Zeyu Xiao, Xueyang Fu, Jie Huang, Zhen Cheng, Zhiwei Xiong
Compact video super-resolution (VSR) networks can be easily deployed on resource-limited devices, e.g., smart-phones and wearable devices, but have considerable performance gaps compared with complicated VSR networks that require a large amount of computing resources. [Expand]
Monday Poster Session
You See What I Want You To See: Exploring Targeted Black-Box Transferability Attack for Hash-Based Image Retrieval Systems
Yanru Xiao, Cong Wang
With the large multimedia content online, deep hashing has become a popular method for efficient image retrieval and storage. [Expand]
Monday Poster Session
Scale-Aware Graph Neural Network for Few-Shot Semantic Segmentation
Guo-Sen Xie, Jie Liu, Huan Xiong, Ling Shao
Few-shot semantic segmentation (FSS) aims to segment unseen class objects given very few densely-annotated support images from the same class. [Expand]
Tuesday Poster Session
End-to-End Learning for Joint Image Demosaicing, Denoising and Super-Resolution
Wenzhu Xing, Karen Egiazarian
Image denoising, demosaicing and super-resolution are key problems of image restoration well studied in the recent decades. [Expand]
Tuesday Poster Session
Seeing in Extra Darkness Using a Deep-Red Flash
Jinhui Xiong, Jian Wang, Wolfgang Heidrich, Shree Nayar
We propose a new flash technique for low-light imaging, using deep-red light as an illuminating source. [Expand]
Wednesday Poster Session
Adaptive Rank Estimate in Robust Principal Component Analysis
Zhengqin Xu, Rui He, Shoulie Xie, Shiqian Wu
Robust principal component analysis (RPCA) and its variants have gained wide applications in computer vision. [Expand]
Tuesday Poster Session
Consistent Instance False Positive Improves Fairness in Face Recognition
Xingkun Xu, Yuge Huang, Pengcheng Shen, Shaoxin Li, Jilin Li, Feiyue Huang, Yong Li, Zhen Cui
Demographic bias is a significant challenge in practical face recognition systems. [Expand]
Monday Poster Session
Discrimination-Aware Mechanism for Fine-Grained Representation Learning
Furong Xu, Meng Wang, Wei Zhang, Yuan Cheng, Wei Chu
Recently, with the emergence of retrieval requirements for certain individuals in the same superclass (e.g., birds, persons, cars), the fine-grained recognition task has attracted a significant amount of attention from academia and industry. [Expand]
Monday Poster Session
Layer-Wise Searching for 1-Bit Detectors
Sheng Xu, Junhe Zhao, Jinhu Lu, Baochang Zhang, Shumin Han, David Doermann
1-bit detectors show great promise for resource-constrained embedded devices but often suffer from a significant performance gap compared with their real-valued counterparts. [Expand]
Tuesday Poster Session
SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning Over Traffic Events
Li Xu, He Huang, Jun Liu
Traffic event cognition and reasoning in videos is an important task that has a wide range of applications in intelligent transportation, assisted driving, and autonomous vehicles. [Expand]
Wednesday Poster Session
Towards Accurate Text-Based Image Captioning With Content Diversity Exploration
Guanghui Xu, Shuaicheng Niu, Mingkui Tan, Yucheng Luo, Qing Du, Qi Wu
Text-based image captioning (TextCap) which aims to read and reason images with texts is crucial for a machine to understand a detailed and complex scene environment, considering that texts are omnipresent in daily life. [Expand]
Thursday Poster Session
A Circular-Structured Representation for Visual Emotion Distribution Learning
Jingyuan Yang, Jie Li, Leida Li, Xiumei Wang, Xinbo Gao
Visual Emotion Analysis (VEA) has attracted increasing attention recently with the prevalence of sharing images on social networks. [Expand]
Tuesday Poster Session
Bottom-Up Shift and Reasoning for Referring Image Segmentation
Sibei Yang, Meng Xia, Guanbin Li, Hong-Yu Zhou, Yizhou Yu
Referring image segmentation aims to segment the referent that is the corresponding object or stuff referred by a natural language expression in an image. [Expand]
Wednesday Poster Session
Beyond Short Clips: End-to-End Video-Level Learning With Collaborative Memories
Xitong Yang, Haoqi Fan, Lorenzo Torresani, Larry S. Davis, Heng Wang
The standard way of training video models entails sampling at each iteration a single clip from a video and optimizing the clip prediction with respect to the video-level label. [Expand]
Wednesday Poster Session
CT-Net: Complementary Transfering Network for Garment Transfer With Arbitrary Geometric Changes
Fan Yang, Guosheng Lin
Garment transfer shows great potential in realistic applications with the goal of transferring outfits across images of different people. [Expand]
Wednesday Poster Session
Defending Multimodal Fusion Models Against Single-Source Adversaries
Karren Yang, Wan-Yi Lin, Manash Barman, Filipe Condessa, Zico Kolter
Beyond achieving high performance across many vision tasks, multimodal models are expected to be robust to single-source faults due to the availability of redundant information between modalities. [Expand]
Tuesday Poster Session
Discovering Interpretable Latent Space Directions of GANs Beyond Binary Attributes
Huiting Yang, Liangyu Chai, Qiang Wen, Shuang Zhao, Zixun Sun, Shengfeng He
Generative adversarial networks (GANs) learn to map noise latent vectors to high-fidelity image outputs. [Expand]
Thursday Poster Session
DyStaB: Unsupervised Object Segmentation via Dynamic-Static Bootstrapping
Yanchao Yang, Brian Lai, Stefano Soatto
We describe an unsupervised method to detect and segment portions of images of live scenes that, at some point in time, are seen moving as a coherent whole, which we refer to as objects. [Expand]
Tuesday Poster Session
End-to-End Rotation Averaging With Multi-Source Propagation
Luwei Yang, Heng Li, Jamal Ahmed Rahim, Zhaopeng Cui, Ping Tan
This paper presents an end-to-end neural network for multiple rotation averaging in SfM. [Expand]
Thursday Poster Session
Enhance Curvature Information by Structured Stochastic Quasi-Newton Methods
Minghan Yang, Dong Xu, Hongyu Chen, Zaiwen Wen, Mengyun Chen
In this paper, we consider stochastic second-order methods for minimizing a finite summation of nonconvex functions. [Expand]
Wednesday Poster Session
Exploiting Semantic Embedding and Visual Feature for Facial Action Unit Detection
Huiyuan Yang, Lijun Yin, Yi Zhou, Jiuxiang Gu
Recent study on detecting facial action units (AU) has utilized auxiliary information (i.e., facial landmarks, relationship among AUs and expressions, web facial images, etc.), in order to improve the AU detection performance. [Expand]
Wednesday Poster Session
L2M-GAN: Learning To Manipulate Latent Space Semantics for Facial Attribute Editing
Guoxing Yang, Nanyi Fei, Mingyu Ding, Guangzhen Liu, Zhiwu Lu, Tao Xiang
A deep facial attribute editing model strives to meet two requirements: (1) attribute correctness -- the target attribute should correctly appear on the edited face image; (2) irrelevance preservation -- any irrelevant information (e.g., identity) should not be changed after editing. [Expand]
Tuesday Poster Session
LayoutTransformer: Scene Layout Generation With Conceptual and Spatial Diversity
Cheng-Fu Yang, Wan-Cyuan Fan, Fu-En Yang, Yu-Chiang Frank Wang
When translating text inputs into layouts or images, existing works typically require explicit descriptions of each object in a scene, including their spatial information or the associated relationships. [Expand]
Tuesday Poster Session
Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking
Yiding Yang, Zhou Ren, Haoxiang Li, Chunluan Zhou, Xinchao Wang, Gang Hua
Multi-person pose estimation and tracking serve as crucial steps for video understanding. [Expand]
Wednesday Poster Session
Mol2Image: Improved Conditional Flow Models for Molecule to Image Synthesis
Karren Yang, Samuel Goldman, Wengong Jin, Alex X. Lu, Regina Barzilay, Tommi Jaakkola, Caroline Uhler
In this paper, we aim to synthesize cell microscopy images under different molecular interventions, motivated by practical applications to drug development. [Expand]
Tuesday Poster Session
Partially View-Aligned Representation Learning With Noise-Robust Contrastive Loss
Mouxing Yang, Yunfan Li, Zhenyu Huang, Zitao Liu, Peng Hu, Xi Peng
In real-world applications, it is common that only a portion of data is aligned across views due to spatial, temporal, or spatiotemporal asynchronism, thus leading to the so-called Partially View-aligned Problem (PVP). [Expand]
Monday Poster Session
Progressively Complementary Network for Fisheye Image Rectification Using Appearance Flow
Shangrong Yang, Chunyu Lin, Kang Liao, Chunjie Zhang, Yao Zhao
Distortion rectification is often required for fisheye images. [Expand]
Tuesday Poster Session
Projecting Your View Attentively: Monocular Road Scene Layout Estimation via Cross-View Transformation
Weixiang Yang, Qi Li, Wenxi Liu, Yuanlong Yu, Yuexin Ma, Shengfeng He, Jia Pan
HD map reconstruction is crucial for autonomous driving. [Expand]
Thursday Poster Session
SelfSAGCN: Self-Supervised Semantic Alignment for Graph Convolution Network
Xu Yang, Cheng Deng, Zhiyuan Dang, Kun Wei, Junchi Yan
Graph convolution networks (GCNs) are a powerful deep learning approach and have been successfully applied to representation learning on graphs in a variety of real-world applications. [Expand]
Friday Poster Session
StruMonoNet: Structure-Aware Monocular 3D Prediction
Zhenpei Yang, Li Erran Li, Qixing Huang
Monocular 3D prediction is one of the fundamental problems in 3D vision. [Expand]
Wednesday Poster Session
Uncertainty Guided Collaborative Training for Weakly Supervised Temporal Action Detection
Wenfei Yang, Tianzhu Zhang, Xiaoyuan Yu, Tian Qi, Yongdong Zhang, Feng Wu
Weakly supervised temporal action detection aims to localize temporal boundaries of actions and identify their categories simultaneously with only video-level category labels during training. [Expand]
Monday Poster Session
Anchor-Free Person Search
Yichao Yan, Jinpeng Li, Jie Qin, Song Bai, Shengcai Liao, Li Liu, Fan Zhu, Ling Shao
Person search aims to simultaneously localize and identify a query person from realistic, uncropped images, which can be regarded as the unified task of pedestrian detection and person re-identification (re-id). [Expand]
Wednesday Poster Session
Discrete-Continuous Action Space Policy Gradient-Based Attention for Image-Text Matching
Shiyang Yan, Li Yu, Yuan Xie
Image-text matching is an important multi-modal task with massive applications. [Expand]
Wednesday Poster Session
Online Learning of a Probabilistic and Adaptive Scene Representation
Zike Yan, Xin Wang, Hongbin Zha
Constructing and maintaining a consistent scene model on-the-fly is the core task for online spatial perception, interpretation, and action. [Expand]
Thursday Poster Session
Self-Aligned Video Deraining With Transmission-Depth Consistency
Wending Yan, Robby T. Tan, Wenhan Yang, Dengxin Dai
In this paper, we address the problems of rain streaks and rain accumulation removal in video, by developing a self-aligned network with transmission-depth consistency. [Expand]
Thursday Poster Session
Primitive Representation Learning for Scene Text Recognition
Ruijie Yan, Liangrui Peng, Shanyu Xiao, Gang Yao
Scene text recognition is a challenging task due to diverse variations of text instances in natural scene images. [Expand]
Monday Poster Session
Unsupervised Hyperbolic Metric Learning
Jiexi Yan, Lei Luo, Cheng Deng, Heng Huang
Learning feature embedding directly from images without any human supervision is a very challenging and essential task in the field of computer vision and machine learning. [Expand]
Thursday Poster Session
Jo-SRC: A Contrastive Approach for Combating Noisy Labels
Yazhou Yao, Zeren Sun, Chuanyi Zhang, Fumin Shen, Qi Wu, Jian Zhang, Zhenmin Tang
Due to the memorization effect in Deep Neural Networks (DNNs), training with noisy labels usually results in inferior model performance. [Expand]
Tuesday Poster Session
Joint-DetNAS: Upgrade Your Detector With NAS, Pruning and Dynamic Distillation
Lewei Yao, Renjie Pi, Hang Xu, Wei Zhang, Zhenguo Li, Tong Zhang
We propose Joint-DetNAS, a unified NAS framework for object detection, which integrates 3 key components: Neural Architecture Search, pruning, and Knowledge Distillation. [Expand]
Wednesday Poster Session
Adversarial Invariant Learning
Nanyang Ye, Jingxuan Tang, Huayu Deng, Xiao-Yun Zhou, Qianxiao Li, Zhenguo Li, Guang-Zhong Yang, Zhanxing Zhu
Though machine learning algorithms are able to achieve pattern recognition from the correlation between data and labels, the presence of spurious features in the data decreases the robustness of these learned relationships with respect to varied testing environments. [Expand]
Thursday Poster Session
Linguistic Structures As Weak Supervision for Visual Scene Graph Generation
Keren Ye, Adriana Kovashka
Prior work in scene graph generation requires categorical supervision at the level of triplets---subjects and objects, and predicates that relate them, either with or without bounding box information. [Expand]
Wednesday Poster Session
Iso-Points: Optimizing Neural Implicit Surfaces With Hybrid Representations
Wang Yifan, Shihao Wu, Cengiz Oztireli, Olga Sorkine-Hornung
Neural implicit functions have emerged as a powerful representation for surfaces in 3D. [Expand]
Monday Poster Session
Towards Efficient Tensor Decomposition-Based DNN Model Compression With Optimization Framework
Miao Yin, Yang Sui, Siyu Liao, Bo Yuan
Advanced tensor decomposition, such as Tensor train (TT) and Tensor ring (TR), has been widely studied for deep neural network (DNN) model compression, especially for recurrent neural networks (RNNs). [Expand]
Wednesday Poster Session
RaScaNet: Learning Tiny Models by Raster-Scanning Images
Jaehyoung Yoo, Dongwook Lee, Changyong Son, Sangil Jung, ByungIn Yoo, Changkyu Choi, Jae-Joon Han, Bohyung Han
Deploying deep convolutional neural networks on ultra-low power systems is challenging due to the extremely limited resources. [Expand]
Thursday Poster Session
Perception Matters: Detecting Perception Failures of VQA Models Using Metamorphic Testing
Yuanyuan Yuan, Shuai Wang, Mingyue Jiang, Tsong Yueh Chen
Visual question answering (VQA) takes an image and a natural-language question as input and returns a natural-language answer. [Expand]
Friday Poster Session
Minimally Invasive Surgery for Sparse Neural Networks in Contrastive Manner
Chong Yu
With the development of deep learning, neural networks tend to be deeper and larger to achieve good performance. [Expand]
Tuesday Poster Session
Adaptive Weighted Discriminator for Training Generative Adversarial Networks
Vasily Zadorozhnyy, Qiang Cheng, Qiang Ye
Generative adversarial network (GAN) has become one of the most important neural network models for classical unsupervised machine learning. [Expand]
Tuesday Poster Session
Multi-Modal Relational Graph for Cross-Modal Video Moment Retrieval
Yawen Zeng, Da Cao, Xiaochi Wei, Meng Liu, Zhou Zhao, Zheng Qin
Given an untrimmed video and a query sentence, cross-modal video moment retrieval aims to rank a video moment from pre-segmented video moment candidates that best matches the query sentence. [Expand]
Monday Poster Session
Out-of-Distribution Detection Using Union of 1-Dimensional Subspaces
Alireza Zaeemzadeh, Niccolo Bisagno, Zeno Sambugaro, Nicola Conci, Nazanin Rahnavard, Mubarak Shah
The goal of out-of-distribution (OOD) detection is to handle the situations where the test samples are drawn from a different distribution than the training data. [Expand]
Wednesday Poster Session
Hyper-LifelongGAN: Scalable Lifelong Learning for Image Conditioned Generation
Mengyao Zhai, Lei Chen, Greg Mori
Deep neural networks are susceptible to catastrophic forgetting: when encountering a new task, they can only remember the new task and fail to preserve their ability to accomplish previously learned tasks. [Expand]
Monday Poster Session
ABMDRNet: Adaptive-Weighted Bi-Directional Modality Difference Reduction Network for RGB-T Semantic Segmentation
Qiang Zhang, Shenlu Zhao, Yongjiang Luo, Dingwen Zhang, Nianchang Huang, Jungong Han
Semantic segmentation models gain robustness against poor lighting conditions by virtue of complementary information from visible (RGB) and thermal images. [Expand]
Monday Poster Session
Accurate Few-Shot Object Detection With Support-Query Mutual Guidance and Hybrid Loss
Lu Zhang, Shuigeng Zhou, Jihong Guan, Ji Zhang
Most object detection methods require huge amounts of annotated data and can detect only the categories that appear in the training set. [Expand]
Thursday Poster Session
Attention-Guided Image Compression by Deep Reconstruction of Compressive Sensed Saliency Skeleton
Xi Zhang, Xiaolin Wu
We propose a deep learning system for attention-guided dual-layer image compression (AGDL). [Expand]
Thursday Poster Session
Coarse-To-Fine Person Re-Identification With Auxiliary-Domain Classification and Second-Order Information Bottleneck
Anguo Zhang, Yueming Gao, Yuzhen Niu, Wenxi Liu, Yongcheng Zhou
Person re-identification (Re-ID) is to retrieve a particular person captured by different cameras, which is of great significance for security surveillance and pedestrian behavior analysis. [Expand]
Monday Poster Session
Confluent Vessel Trees With Accurate Bifurcations
Zhongwen Zhang, Dmitrii Marin, Maria Drangova, Yuri Boykov
We are interested in unsupervised reconstruction of complex near-capillary vasculature with thousands of bifurcations where supervision and learning are infeasible. [Expand]
Wednesday Poster Session
Cross-View Cross-Scene Multi-View Crowd Counting
Qi Zhang, Wei Lin, Antoni B. Chan
Multi-view crowd counting has been previously proposed to utilize multi-cameras to extend the field-of-view of a single camera, capturing more people in the scene, and improve counting performance for occluded people or those in low resolution. [Expand]
Monday Poster Session
Data-Free Knowledge Distillation for Image Super-Resolution
Yiman Zhang, Hanting Chen, Xinghao Chen, Yiping Deng, Chunjing Xu, Yunhe Wang
Convolutional network compression methods require training data for achieving acceptable results, but training data is routinely unavailable due to some privacy and transmission limitations. [Expand]
Wednesday Poster Session
Cross-View Gait Recognition With Deep Universal Linear Embeddings
Shaoxiong Zhang, Yunhong Wang, Annan Li
Gait is considered an attractive biometric identifier for its non-invasive and non-cooperative features compared with other biometric identifiers such as fingerprint and iris. [Expand]
Wednesday Poster Session
DeepACG: Co-Saliency Detection via Semantic-Aware Contrast Gromov-Wasserstein Distance
Kaihua Zhang, Mingliang Dong, Bo Liu, Xiao-Tong Yuan, Qingshan Liu
The objective of co-saliency detection is to segment the co-occurring salient objects in a group of images. [Expand]
Thursday Poster Session
DualGraph: A Graph-Based Method for Reasoning About Label Noise
HaiYang Zhang, XiMing Xing, Liang Liu
Unreliable labels derived from large-scale datasets prevent neural networks from fully exploring the data. [Expand]
Wednesday Poster Session
Explicit Knowledge Incorporation for Visual Reasoning
Yifeng Zhang, Ming Jiang, Qi Zhao
Existing explainable and explicit visual reasoning methods only perform reasoning based on visual evidence but do not take into account knowledge beyond what is in the visual scene. [Expand]
Monday Poster Session
Generating Manga From Illustrations via Mimicking Manga Creation Workflow
Lvmin Zhang, Xinrui Wang, Qingnan Fan, Yi Ji, Chunping Liu
We present a framework to generate manga from digital illustrations. [Expand]
Tuesday Poster Session
Hallucination Improves Few-Shot Object Detection
Weilin Zhang, Yu-Xiong Wang
Learning to detect novel objects with a few instances is challenging. [Expand]
Thursday Poster Session
iVPF: Numerical Invertible Volume Preserving Flow for Efficient Lossless Compression
Shifeng Zhang, Chen Zhang, Ning Kang, Zhenguo Li
It is nontrivial to store rapidly growing big data nowadays, which demands high-performance lossless compression techniques. [Expand]
Monday Poster Session
Flow-Guided One-Shot Talking Face Generation With a High-Resolution Audio-Visual Dataset
Zhimeng Zhang, Lincheng Li, Yu Ding, Changjie Fan
One-shot talking face generation should synthesize high visual quality facial videos with reasonable animations of expression and head pose, and just utilize arbitrary driving audio and arbitrary single face image as the source. [Expand]
Tuesday Poster Session
Keypoint-Graph-Driven Learning Framework for Object Pose Estimation
Shaobo Zhang, Wanqing Zhao, Ziyu Guan, Xianlin Peng, Jinye Peng
Many recent 6D pose estimation methods exploited object 3D models to generate synthetic images for training because labels come for free. [Expand]
Monday Poster Session
Learning by Watching
Jimuyang Zhang, Eshed Ohn-Bar
When in a new situation or geographical location, human drivers have an extraordinary ability to watch others and learn maneuvers that they themselves may have never performed. [Expand]
Thursday Poster Session
Learning a Facial Expression Embedding Disentangled From Identity
Wei Zhang, Xianpeng Ji, Keyu Chen, Yu Ding, Changjie Fan
The facial expression analysis requires a compact and identity-ignored expression representation. [Expand]
Tuesday Poster Session
Learning Temporal Consistency for Low Light Video Enhancement From Single Images
Fan Zhang, Yu Li, Shaodi You, Ying Fu
Single image low light enhancement is an important task and it has many practical applications. [Expand]
Tuesday Poster Session
Learning Tensor Low-Rank Prior for Hyperspectral Image Reconstruction
Shipeng Zhang, Lizhi Wang, Lei Zhang, Hua Huang
Snapshot hyperspectral imaging has been developed to capture the spectral information of dynamic scenes. [Expand]
Thursday Poster Session
Learning To Restore Hazy Video: A New Real-World Dataset and a New Method
Xinyi Zhang, Hang Dong, Jinshan Pan, Chao Zhu, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Fei Wang
Most of the existing deep learning-based dehazing methods are trained and evaluated on the image dehazing datasets, where the dehazed images are generated by only exploiting the information from the corresponding hazy ones. [Expand]
Wednesday Poster Session
Learning To Aggregate and Personalize 3D Face From In-the-Wild Photo Collection
Zhenyu Zhang, Yanhao Ge, Renwang Chen, Ying Tai, Yan Yan, Jian Yang, Chengjie Wang, Jilin Li, Feiyue Huang
Non-prior face modeling aims to reconstruct 3D face only from images without shape assumptions. [Expand]
Thursday Poster Session
Multi-Stage Aggregated Transformer Network for Temporal Language Localization in Videos
Mingxing Zhang, Yang Yang, Xinghan Chen, Yanli Ji, Xing Xu, Jingjing Li, Heng Tao Shen
We address the problem of localizing a specific moment from an untrimmed video by a language sentence query. [Expand]
Thursday Poster Session
Learning a Self-Expressive Network for Subspace Clustering
Shangzhi Zhang, Chong You, Rene Vidal, Chun-Guang Li
State-of-the-art subspace clustering methods are based on the self-expressive model, which represents each data point as a linear combination of other data points. [Expand]
Thursday Poster Session
MR Image Super-Resolution With Squeeze and Excitation Reasoning Attention Network
Yulun Zhang, Kai Li, Kunpeng Li, Yun Fu
High-quality high-resolution (HR) magnetic resonance (MR) images afford more detailed information for reliable diagnosis and quantitative image analyses. [Expand]
Thursday Poster Session
Objects Are Different: Flexible Monocular 3D Object Detection
Yunpeng Zhang, Jiwen Lu, Jie Zhou
The precise localization of 3D objects from a single image without depth information is a highly challenging problem. [Expand]
Tuesday Poster Session
Posterior Promoted GAN With Distribution Discriminator for Unsupervised Image Synthesis
Xianchao Zhang, Ziyang Cheng, Xiaotong Zhang, Han Liu
Sufficient real information in generator is a critical point for the generation ability of GAN.
Tuesday Poster Session
Person Re-Identification Using Heterogeneous Local Graph Attention Networks
Zhong Zhang, Haijia Zhang, Shuang Liu
Recently, some methods have focused on learning local relation among parts of pedestrian images for person re-identification (Re-ID), as it offers powerful representation capabilities.
Thursday Poster Session
Physics-Based Iterative Projection Complex Neural Network for Phase Retrieval in Lensless Microscopy Imaging
Feilong Zhang, Xianming Liu, Cheng Guo, Shiyi Lin, Junjun Jiang, Xiangyang Ji
Phase retrieval from intensity-only measurements plays a central role in many real-world imaging tasks.
Wednesday Poster Session
PSRR-MaxpoolNMS: Pyramid Shifted MaxpoolNMS With Relationship Recovery
Tianyi Zhang, Jie Lin, Peng Hu, Bin Zhao, Mohamed M. Sabry Aly
Non-maximum Suppression (NMS) is an essential post-processing step in modern convolutional neural networks for object detection.
Friday Poster Session
Refining Pseudo Labels With Clustering Consensus Over Generations for Unsupervised Object Re-Identification
Xiao Zhang, Yixiao Ge, Yu Qiao, Hongsheng Li
Unsupervised object re-identification targets at learning discriminative representations for object retrieval without any annotations.
Tuesday Poster Session
Robust Bayesian Neural Networks by Spectral Expectation Bound Regularization
Jiaru Zhang, Yang Hua, Zhengui Xue, Tao Song, Chengyu Zheng, Ruhui Ma, Haibing Guan
Bayesian neural networks have been widely used in many applications because of the distinctive probabilistic representation framework.
Tuesday Poster Session
RSTNet: Captioning With Adaptive Attention on Visual and Non-Visual Words
Xuying Zhang, Xiaoshuai Sun, Yunpeng Luo, Jiayi Ji, Yiyi Zhou, Yongjian Wu, Feiyue Huang, Rongrong Ji
Recent progress on visual question answering has explored the merits of grid features for vision language tasks.
Thursday Poster Session
RPN Prototype Alignment for Domain Adaptive Object Detector
Yixin Zhang, Zilei Wang, Yushi Mao
Recent years have witnessed great progress in object detection.
Thursday Poster Session
Self-Guided and Cross-Guided Learning for Few-Shot Segmentation
Bingfeng Zhang, Jimin Xiao, Terry Qin
Few-shot segmentation has been attracting a lot of attention due to its effectiveness to segment unseen object classes with a few annotated samples.
Wednesday Poster Session
Sparse Multi-Path Corrections in Fringe Projection Profilometry
Yu Zhang, Daniel Lau, David Wipf
Three-dimensional scanning by means of structured light illumination is an active imaging technique involving projecting and capturing a series of striped patterns and then using the observed warping of stripes to reconstruct the target object's surface through triangulating each pixel in the camera to a unique projector coordinate corresponding to a particular feature in the projected patterns.
Thursday Poster Session
SRDAN: Scale-Aware and Range-Aware Domain Adaptation Network for Cross-Dataset 3D Object Detection
Weichen Zhang, Wen Li, Dong Xu
Geometric characteristic plays an important role in the representation of an object in 3D point clouds.
Tuesday Poster Session
TSGCNet: Discriminative Geometric Feature Learning With Two-Stream Graph Convolutional Network for 3D Dental Model Segmentation
Lingming Zhang, Yue Zhao, Deyu Meng, Zhiming Cui, Chenqiang Gao, Xinbo Gao, Chunfeng Lian, Dinggang Shen
The ability to segment teeth precisely from digitized 3D dental models is an essential task in computer-aided orthodontic surgical planning.
Tuesday Poster Session
Unbalanced Feature Transport for Exemplar-Based Image Translation
Fangneng Zhan, Yingchen Yu, Kaiwen Cui, Gongjie Zhang, Shijian Lu, Jianxiong Pan, Changgong Zhang, Feiying Ma, Xuansong Xie, Chunyan Miao
Despite the great success of GANs in image translation with different conditioned inputs such as semantic segmentation and edge map, generating high-fidelity images with reference styles from exemplars remains a grand challenge in conditional image-to-image translation.
Thursday Poster Session
3D Graph Anatomy Geometry-Integrated Network for Pancreatic Mass Segmentation, Diagnosis, and Quantitative Patient Management
Tianyi Zhao, Kai Cao, Jiawen Yao, Isabella Nogues, Le Lu, Lingyun Huang, Jing Xiao, Zhaozheng Yin, Ling Zhang
The pancreatic disease taxonomy includes ten types of masses (tumors or cysts) [20, 8].
Thursday Poster Session
Deep Lucas-Kanade Homography for Multimodal Image Alignment
Yiming Zhao, Xinming Huang, Ziming Zhang
Estimating homography to align image pairs captured by different sensors or image pairs with large appearance changes is an important and general challenge for many computer vision applications.
Friday Poster Session
Cascaded Prediction Network via Segment Tree for Temporal Video Grounding
Yang Zhao, Zhou Zhao, Zhu Zhang, Zhijie Lin
Temporal video grounding aims to localize the target segment which is semantically aligned with the given sentence in an untrimmed video.
Tuesday Poster Session
Distribution-Aware Adaptive Multi-Bit Quantization
Sijie Zhao, Tao Yue, Xuemei Hu
In this paper, we explore the compression of deep neural networks by quantizing the weights and activations into multi-bit binary networks (MBNs).
Wednesday Poster Session
Graph-Based High-Order Relation Discovery for Fine-Grained Recognition
Yifan Zhao, Ke Yan, Feiyue Huang, Jia Li
Fine-grained object recognition aims to learn effective features that can identify the subtle differences between visually similar objects.
Thursday Poster Session
PhD Learning: Learning With Pompeiu-Hausdorff Distances for Video-Based Vehicle Re-Identification
Jianan Zhao, Fengliang Qi, Guangyu Ren, Lin Xu
Vehicle re-identification (re-ID) is of great significance to urban operation, management, security and has gained more attention in recent years.
Monday Poster Session
Prior Based Human Completion
Zibo Zhao, Wen Liu, Yanyu Xu, Xianing Chen, Weixin Luo, Lei Jin, Bohui Zhu, Tong Liu, Binqiang Zhao, Shenghua Gao
We study a very challenging task, human image completion, which tries to recover the human body part with a reasonable human shape from the corrupted region.
Wednesday Poster Session
Spk2ImgNet: Learning To Reconstruct Dynamic Scene From Continuous Spike Stream
Jing Zhao, Ruiqin Xiong, Hangfan Liu, Jian Zhang, Tiejun Huang
The recently invented retina-inspired spike camera has shown great potential for capturing dynamic scenes.
Thursday Poster Session
Self-Generated Defocus Blur Detection via Dual Adversarial Discriminators
Wenda Zhao, Cai Shang, Huchuan Lu
Although existing fully-supervised defocus blur detection (DBD) models significantly improve performance, training such deep models requires abundant pixel-level manual annotation, which is highly time-consuming and error-prone.
Tuesday Poster Session
Deep Compositional Metric Learning
Wenzhao Zheng, Chengkun Wang, Jiwen Lu, Jie Zhou
In this paper, we propose a deep compositional metric learning (DCML) framework for effective and generalizable similarity measurement between images.
Wednesday Poster Session
Deep Convolutional Dictionary Learning for Image Denoising
Hongyi Zheng, Hongwei Yong, Lei Zhang
Inspired by the great success of deep neural networks (DNNs), many unfolding methods have been proposed to integrate traditional image modeling techniques, such as dictionary learning (DicL) and sparse coding, into DNNs for image restoration.
Monday Poster Session
High-Speed Image Reconstruction Through Short-Term Plasticity for Spiking Cameras
Yajing Zheng, Lingxiao Zheng, Zhaofei Yu, Boxin Shi, Yonghong Tian, Tiejun Huang
Fovea, located in the centre of the retina, is specialized for high-acuity vision.
Tuesday Poster Session
Improving Multiple Object Tracking With Single Object Tracking
Linyu Zheng, Ming Tang, Yingying Chen, Guibo Zhu, Jinqiao Wang, Hanqing Lu
Despite considerable similarities between multiple object tracking (MOT) and single object tracking (SOT) tasks, modern MOT methods have not benefited from the development of SOT ones to achieve satisfactory performance.
Monday Poster Session
Patchwise Generative ConvNet: Training Energy-Based Models From a Single Natural Image for Internal Learning
Zilong Zheng, Jianwen Xie, Ping Li
Exploiting internal statistics of a single natural image has long been recognized as a significant research paradigm where the goal is to learn the distribution of patches within the image without relying on external training data.
Tuesday Poster Session
Ultra-High-Definition Image Dehazing via Multi-Guided Bilateral Learning
Zhuoran Zheng, Wenqi Ren, Xiaochun Cao, Xiaobin Hu, Tao Wang, Fenglong Song, Xiuyi Jia
During the last couple of years, convolutional neural networks (CNNs) have achieved significant success in the single image dehazing task.
Friday Poster Session
Unsupervised Disentanglement of Linear-Encoded Facial Semantics
Yutong Zheng, Yu-Kai Huang, Ran Tao, Zhiqiang Shen, Marios Savvides
We propose a method to disentangle linear-encoded facial semantics from StyleGAN without external supervision.
Tuesday Poster Session
Single Image Reflection Removal With Absorption Effect
Qian Zheng, Boxin Shi, Jinnan Chen, Xudong Jiang, Ling-Yu Duan, Alex C. Kot
In this paper, we consider the absorption effect for the problem of single image reflection removal.
Thursday Poster Session
Glance and Gaze: Inferring Action-Aware Points for One-Stage Human-Object Interaction Detection
Xubin Zhong, Xian Qu, Changxing Ding, Dacheng Tao
Modern human-object interaction (HOI) detection approaches can be divided into one-stage methods and two-stage ones.
Thursday Poster Session
DAP: Detection-Aware Pre-Training With Weak Supervision
Yuanyi Zhong, Jianfeng Wang, Lijuan Wang, Jian Peng, Yu-Xiong Wang, Lei Zhang
This paper presents a detection-aware pre-training (DAP) approach, which leverages only weakly-labeled classification-style datasets (e.g., ImageNet) for pre-training, but is specifically tailored to benefit object detection tasks.
Tuesday Poster Session
Neighborhood Contrastive Learning for Novel Class Discovery
Zhun Zhong, Enrico Fini, Subhankar Roy, Zhiming Luo, Elisa Ricci, Nicu Sebe
In this paper, we address Novel Class Discovery (NCD), the task of unveiling new classes in a set of unlabeled samples given a labeled dataset with known classes.
Wednesday Poster Session
Decoupled Dynamic Filter Networks
Jingkai Zhou, Varun Jampani, Zhixiong Pi, Qiong Liu, Ming-Hsuan Yang
Convolution is one of the basic building blocks of CNN architectures.
Tuesday Poster Session
Effective Sparsification of Neural Networks With Global Sparsity Constraint
Xiao Zhou, Weizhong Zhang, Hang Xu, Tong Zhang
Weight pruning is an effective technique to reduce the model size and inference time for deep neural networks in real world deployments.
Tuesday Poster Session
Embracing Uncertainty: Decoupling and De-Bias for Robust Temporal Grounding
Hao Zhou, Chongyang Zhang, Yan Luo, Yanjun Chen, Chuanping Hu
Temporal grounding aims to localize temporal boundaries within untrimmed videos by language queries, but it faces the challenge of two types of inevitable human uncertainties: query uncertainty and label uncertainty.
Wednesday Poster Session
Face Forensics in the Wild
Tianfei Zhou, Wenguan Wang, Zhiyuan Liang, Jianbing Shen
On existing public benchmarks, face forgery detection techniques have achieved great success.
Tuesday Poster Session
Human De-Occlusion: Invisible Perception and Recovery for Humans
Qiang Zhou, Shiyin Wang, Yitong Wang, Zilong Huang, Xinggang Wang
In this paper, we tackle the problem of human de-occlusion which reasons about occluded segmentation masks and invisible appearance content of humans.
Tuesday Poster Session
Image De-Raining via Continual Learning
Man Zhou, Jie Xiao, Yifan Chang, Xueyang Fu, Aiping Liu, Jinshan Pan, Zheng-Jun Zha
While deep convolutional neural networks (CNNs) have achieved great success on image de-raining task, most existing methods can only learn fixed mapping rules between paired rainy/clean images on a single dataset.
Tuesday Poster Session
Graph-Based High-Order Relation Modeling for Long-Term Action Recognition
Jiaming Zhou, Kun-Yu Lin, Haoxin Li, Wei-Shi Zheng
Long-term actions involve many important visual concepts, e.g., objects, motions, and sub-actions, and there are various relations among these concepts, which we call basic relations.
Wednesday Poster Session
Learning Placeholders for Open-Set Recognition
Da-Wei Zhou, Han-Jia Ye, De-Chuan Zhan
Traditional classifiers are deployed under closed-set setting, with both training and test classes belonging to the same set.
Tuesday Poster Session
Monocular 3D Object Detection: An Extrinsic Parameter Free Approach
Yunsong Zhou, Yuan He, Hongzi Zhu, Cheng Wang, Hongyang Li, Qinhong Jiang
Monocular 3D object detection is an important task in autonomous driving.
Wednesday Poster Session
Positive Sample Propagation Along the Audio-Visual Event Line
Jinxing Zhou, Liang Zheng, Yiran Zhong, Shijie Hao, Meng Wang
Visual and audio signals often coexist in natural environments, forming audio-visual events (AVEs).
Wednesday Poster Session
Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation
Tianfei Zhou, Jianwu Li, Xueyi Li, Ling Shao
This paper addresses the task of unsupervised video multi-object segmentation.
Tuesday Poster Session
Prototype Augmentation and Self-Supervision for Incremental Learning
Fei Zhu, Xu-Yao Zhang, Chuang Wang, Fei Yin, Cheng-Lin Liu
Despite the impressive performance in many individual tasks, deep neural networks suffer from catastrophic forgetting when learning new tasks incrementally.
Tuesday Poster Session
Self-Promoted Prototype Refinement for Few-Shot Class-Incremental Learning
Kai Zhu, Yang Cao, Wei Zhai, Jie Cheng, Zheng-Jun Zha
Few-shot class-incremental learning is to recognize the new classes given few samples and not forget the old classes.
Tuesday Poster Session
Learning To Reconstruct High Speed and High Dynamic Range Videos From Events
Yunhao Zou, Yinqiang Zheng, Tsuyoshi Takatani, Ying Fu
Event cameras are novel sensors that capture the dynamics of a scene asynchronously.
Monday Poster Session