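This post is a rough roundup of papers accepted to CVPR 2021; a more detailed summary will follow. Stay tuned.
Official website: http://cvpr2021.thecvf.com
Conference dates: June 19-25, 2021
Acceptance results announced: February 28, 2021
Accepted paper IDs: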
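- Face
- Pose
- 3D
- Segmentation
- Tracking
- Point cloud
- Aerial imagery
- Image enhancement
- Knowledge distillation
- Human-object interaction
- Video prediction
- GAN
- GNN
- Uncategorized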
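🎆🎆🎆 Update: 20 papers added on March 15 (1 face + 3 segmentation + 1 3D + 1 imaging + 1 medical + 1 quantization + 1 action recognition + 1 continual learning + 1 video generation + 1 ReID + 1 VQA + 2 GAN + 1 NAS + 1 6D + 3 uncategorized)
- Face
- Segmentation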
- Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion
⭐code
- Rethinking BiSeNet For Real-time Semantic Segmentation
⭐code
- Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion
😮oral⭐code🏠project📺video
- 3D
- Imaging
- Medical
- Quantization
- Action recognition
- Continual learning
- Image and video generation
- ReID
- Intra-Inter Camera Similarity for Unsupervised Person Re-Identification
⭐code
- VQA
- GAN
- NAS
- 6D
- Uncategorized
- Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution
⭐code
- CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching
⭐code
- Augmentation Strategies for Learning with Noisy Labels
⭐code
- Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
🌻dataset
- Spatially-Adaptive Pixelwise Networks for Fast Image Translation
🏠project
Uses hypernetworks and implicit functions to achieve very fast image-to-image translation (18x faster than the baseline)
- Image Generators with Conditionally-Independent Pixel Synthesis
😮oral⭐code
- AdCo: Adversarial Contrast for Efficient Learning of Unsupervised Representations from Self-Trained Negative Adversaries
⭐code
Explainer: CVPR 2021 accepted paper: AdCo, adversarial contrastive learning
- Scene text detection
- Simulating Unknown Target Models for Query-Efficient Black-box Attacks
⭐code
Black-box adversarial attack
- Delving into Data: Effectively Substitute Training for Black-box Attack
A black-box attack method based on efficiently training substitute models
Explainer: 8
- Learning Asynchronous and Sparse Human-Object Interaction in Videos
- QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information
⭐code
- Reformulating HOI Detection as Adaptive Set Prediction
- Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
⭐code🏠project📺video
- VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs
A multimodal framework for video captioning, video question answering, and video dialogue
- Open-book Video Captioning with Retrieve-Copy-Generate Network
- Learning the Superpixel in a Non-iterative and Lifelong Manner
- IIRC: Incremental Implicitly-Refined Classification
🏠project
- Semantic-aware Knowledge Distillation for Few-Shot Class-Incremental Learning
- Rainbow Memory: Continual Learning with a Memory of Diverse Samples
- Training Networks in Null Space for Continual Learning
😮oral⭐code
- Coarse-Fine Networks for Temporal Activity Detection in Videos
- 3D CNNs with Adaptive Temporal Feature Resolutions
- Understanding the Robustness of Skeleton-based Action Recognition under Adversarial Attack
- BASAR: Black-box Attack on Skeletal Action Recognition
📺video
- TDN: Temporal Difference Networks for Efficient Action Recognition
⭐code
- ACTION-Net: Multipath Excitation for Action Recognition
⭐code
- Temporal action localization
- Modeling Multi-Label Action Dependencies for Temporal Action Localization
😮oral
Proposes an attention-based network architecture that learns action dependencies in videos to tackle multi-label temporal action localization.
- Learning Salient Boundary Feature for Anchor-free Temporal Action Localization
Anchor-free temporal action localization based on salient boundary feature learning
Explainer: 10
- Improving Unsupervised Image Clustering With Robust Learning
⭐code
Improves unsupervised image clustering with robust learning
- PML: Progressive Margin Loss for Long-tailed Age Classification
- Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels
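⭐code
- Fine-grained Angular Contrastive Learning with Coarse Labels
😮oral
Uses self-supervision for fine-grained classification from coarse labels. Coarse labels are easier and cheaper to obtain than fine-grained labels, which usually require domain experts.
- Graph-based High-Order Relation Discovery for Fine-grained Recognition
A fine-grained recognition method based on mining high-order relations among features
Explainer: 20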
- FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation
⭐code
- GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation
⭐code
- FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism
😮oral⭐code
- ID-Unet: Iterative Soft and Hard Deformation for View Synthesis
- NeX: Real-time View Synthesis with Neural Basis Expansion
😮oral🏠project📺video
Real-time view synthesis with neural basis expansion
- Counterfactual Zero-Shot and Open-Set Visual Recognition
⭐code
- Few-shot Open-set Recognition by Transformation Consistency
- PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers
📺video
Improves 3D human pose estimation by removing location-dependent perspective effects.
- CanonPose: Self-supervised Monocular 3D Human Pose Estimation in the Wild
- Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration
⭐code
- Monocular Real-time Full Body Capture with Inter-part Correlations
📺video
- Behavior-Driven Synthesis of Human Dynamics
⭐code🏠project
- Context Modeling in 3D Human Pose Estimation: A Unified Perspective
- Learning Compositional Representation for 4D Captures with Neural ODE
- Densely connected multidilated convolutional networks for dense prediction tasks
The proposed D3Net outperforms SOTA networks on semantic segmentation and music source separation
- Dense Contrastive Learning for Self-Supervised Visual Pre-Training
⭐code
- Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning
⭐code
- A Deep Emulator for Secondary Motion of 3D Characters
- Neural Deformation Graphs for Globally-consistent Non-rigid Reconstruction
😮oral🏠project📺video
- Deep Implicit Templates for 3D Shape Representation
😮oral⭐code🏠project📺video
CVPR 2021 oral: Tsinghua researchers propose Deep Implicit Templates, greatly extending the capabilities of DIF
- SMPLicit: Topology-aware Generative Model for Clothed People
🏠project
- Depth estimation
- PLADE-Net: Towards Pixel-Level Accuracy for Self-Supervised Single-View Depth Estimation with Neural Positional Encoding and Distilled Matting Loss
- Beyond Image to Depth: Improving Depth Prediction using Echoes
⭐code🏠project
- Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos
😮oral⭐code🏠project📺video
- Hierarchical and Partially Observable Goal-driven Policy Learning with Goals Relational Graph
- Unsupervised Learning for Robust Fitting: A Reinforcement Learning Approach
- Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition
⭐code
Winning solution of the ECCV 2020 Facebook Mapillary Visual Place Recognition Challenge
- 3D Graph Anatomy Geometry-Integrated Network for Pancreatic Mass Segmentation, Diagnosis, and Quantitative Patient Management
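Pure multimodal CT imaging can replace the current JHMI comprehensive multimodal diagnostic workflow, which requires tumor chemical testing and DNA sequencing in addition to medical imaging; diagnostic accuracy is comparable, and quantitative diagnostic precision is better
- Deep Lesion Tracker: Monitoring Lesions in 4D Longitudinal Imaging Studies
An important function of intelligent PACS for assisting physicians in reading tumor imaging
- Automatic Vertebra Localization and Identification in CT by Spine Rectification and Anatomically-constrained Optimization
A CT-based fracture/osteoporosis system
- Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning
⭐code
Multi-institutional collaboration that uses federated learning to improve deep-learning-based MR image reconstruction
- DeepTag: An Unsupervised Deep Learning Method for Motion Tracking on Cardiac Tagging Magnetic Resonance Images
😮oral⭐code
DeepTag: an unsupervised deep learning method for motion tracking on cardiac tagging MR images
- Multiple Instance Captioning: Learning Representations from Histopathology Textbooks and Articles
- Medical image segmentation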
- Transformer Interpretability Beyond Attention Visualization
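⭐code
- MIST: Multiple Instance Spatial Transformer Network
Attempts differentiable top-K selection from heatmaps (MIST), now with some results on natural images as well. It enables detection and classification without any localization supervision (and that is not the only thing it can do!)
- Action recognition and detection
- 3D Vision Transformers for Action Recognition
3D vision Transformers for action recognition
- Object detection
- Image processing
- Human-computer interaction
- Image segmentation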
- Meta Batch-Instance Normalization for Generalizable Person Re-Identification
- Watching You: Global-guided Reciprocal Learning for Video-based Person Re-identification
- Joint Noise-Tolerant Learning and Meta Camera Shift Adaptation for Unsupervised Person Re-Identification
⭐code
- Self-supervised 3D Reconstruction and Re-Projection for Texture Insensitive Person Re-identification
Texture-insensitive person re-identification based on self-supervised 3D reconstruction and re-projection
Explainer: 12
- Intra-Inter Camera Similarity for Unsupervised Person Re-Identification
⭐code
- Learning Student Networks in the Wild
- ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network
⭐code
- RepVGG: Making VGG-style ConvNets Great Again
⭐code
- Coordinate Attention for Efficient Mobile Network Design
⭐code
- Pruning
- Model scaling
- Quantization
- Knowledge distillation
- Dogfight: Detecting Drones from Drone Videos
- Aerial image segmentation
- Aerial image detection
- Data-Free Knowledge Distillation For Image Super-Resolution
- AdderSR: Towards Energy Efficient Image Super-Resolution
⭐code
- Cross-MPI: Cross-scale Stereo for Image Super-Resolution using Multiplane Images
🏠project📺video
CVPR 2021: Cross-MPI is an end-to-end network that uses the underlying scene structure as a cue and achieves high-fidelity super-resolution even across a large (8x) resolution gap
- ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic
⭐code
- Robust Reference-based Super-Resolution via C²-Matching
- GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution
😮oral🏠project
- BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond
⭐code🏠project
- Temporal Modulation Network for Controllable Space-Time Video Super-Resolution
Author homepage
A video super-resolution network based on controllable interpolation of spatio-temporal features
Explainer: 18
- Weakly-supervised Grounded Visual Question Answering using Capsules
- Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing
- Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation
⭐code🏠project
- Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs
- Image-to-image Translation via Hierarchical Style Disentanglement
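⭐code
- Efficient Conditional GAN Transfer with Knowledge Propagation across Classes
⭐code
- Anycost GANs for Interactive Image Synthesis and Editing
⭐code🏠project📺video
Anycost GAN adapts to a wide range of hardware and latency requirements and enables interactive image editing
- TediGAN: Text-Guided Diverse Image Generation and Manipulation
⭐code🏠project📺video
- Generative Hierarchical Features from Synthesizing Images
😮oral⭐code🏠project
The authors argue that a pre-trained GAN generator can be treated as a learned multi-scale loss. Training with it yields highly competitive hierarchical and disentangled visual features, which they call Generative Hierarchical Features (GH-Feat). They further show that GH-Feat benefits not only generative tasks but, more importantly, discriminative tasks, including face verification, landmark detection, layout prediction, transfer learning, style mixing, and image editing.
- Teachers Do More Than Teach: Compressing Image-to-Image Models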
- PISE: Person Image Synthesis and Editing with Decoupled GAN
⭐code
- LOHO: Latent Optimization of Hairstyles via Orthogonalization
- Image-to-image Translation via Hierarchical Style Disentanglement
😮oral⭐code
Achieves hierarchical style disentanglement for image-to-image translation
- CoMoGAN: continuous model-guided image-to-image translation
😮oral⭐code
- HumanGAN: A Generative Model of Humans Images
- HistoGAN: Controlling Colors of GAN-Generated and Real Images via Color Histograms
⭐code
- DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network
⭐code
- Few-shot learning
- Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning
- Learning Dynamic Alignment via Meta-filter for Few-shot Learning
Author homepage
Few-shot learning with dynamic alignment via meta-filters
Explainer: 17
- Domain generalization
- FSDR: Frequency Space Domain Randomization for Domain Generalization
Inspired by how JPEG converts spatial images into multiple frequency components (FCs), proposes Frequency Space Domain Randomization (FSDR), which randomizes images in frequency space by keeping domain-invariant FCs (DIFs) and randomizing only domain-variant FCs (DVFs).
- Domain Generalization via Inference-time Label-Preserving Target Projections
😮oral
- Zero-shot learning
- Probabilistic Embeddings for Cross-Modal Retrieval
- QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval
- Image restoration
- Shadow removal
- Deblurring
- Reflection removal
- Dehazing
- Denoising
- Deraining
- Exposure correction
- Face recognition
- A 3D GAN for Improved Large-pose Facial Recognition
- When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework
⭐github
- MagFace: A Universal Representation for Face Recognition and Quality Assessment
😮oral⭐code
Face recognition plus quality assessment; an oral presentation this year. Code is still being cleaned up.
- WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition
🏠project
- ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis
😮oral🏠project📺video
- Spherical Confidence Learning for Face Recognition
😮oral
Face recognition based on confidence learning on the hyperspherical manifold
- Consistent Instance False Positive Improves Fairness in Face Recognition
A method that improves fairness in face recognition via consistent instance false positives
Explainer: 7
- CRFace: Confidence Ranker for Model-Agnostic Face Detection Refinement
- Cross-Domain Similarity Learning for Face Recognition in Unseen Domains
- Deepfake detection
- Face quality assessment
- 3D face reconstruction
- Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection
😮oral
Learns aggregated and personalized 3D face reconstruction from in-the-wild photo collections
- 3DCaricShop: A Dataset and A Baseline Method for Single-view 3D Caricature Face Reconstruction
⭐code🏠project
- AttentiveNAS: Improving Neural Architecture Search via Attentive
- HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens
- ReNAS: Relativistic Evaluation of Neural Architecture Search
- OPANAS: One-Shot Path Aggregation Network Architecture Search for Object
- Towards Improving the Consistency, Efficiency, and Flexibility of Differentiable Neural Architecture Search
Center for Machine Learning Research, Institute for Artificial Intelligence, Peking University
- Contrastive Neural Architecture Search with Neural Architecture Comparators
⭐code
- Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator
⭐code
- Multi-object tracking
- Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking
- Track to Detect and Segment: An Online Multi-Object Tracker
🏠project📺video
- Multiple Object Tracking with Correlation Learning
- Learning a Proposal Classifier for Multiple Object Tracking
⭐code
- Information-Theoretic Segmentation by Inpainting Error Maximization
- Simultaneously Localize, Segment and Rank the Camouflaged Objects
⭐code
- Capturing Omni-Range Context for Omnidirectional Segmentation
⭐code
- Panoptic segmentation
- 4D Panoptic LiDAR Segmentation
- Cross-View Regularization for Domain Adaptive Panoptic Segmentation
😮oral
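A cross-view regularization method for domain adaptive panoptic segmentation
- Part-aware Panoptic Segmentation
- Toward Joint Thing-and-Stuff Mining for Weakly Supervised Panoptic Segmentation
Weakly supervised panoptic segmentation via joint thing-and-stuff mining
Explainer: 15
- Semantic segmentation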
- PLOP: Learning without Forgetting for Continual Semantic Segmentation
⭐code
- Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges
⭐dataset📺video
- Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation
- Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation
- Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing
😮oral⭐code
- Learning Statistical Texture for Semantic Segmentation
- MetaCorrection: Domain-aware Meta Loss Correction for Unsupervised Domain Adaptation in Semantic Segmentation
⭐code
Domain-aware meta loss correction for unsupervised domain adaptation in semantic segmentation
- Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations
- Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion
⭐code
- Rethinking BiSeNet For Real-time Semantic Segmentation
⭐code
- Scene understanding / scene parsing
- Exploring Data Efficient 3D Scene Understanding with Contrastive Scene Contexts
😮oral🏠project📺video
- Exploiting Edge-Oriented Reasoning for 3D Point-based Scene Graph Analysis
🏠project
Edge-oriented reasoning for 3D point-based scene graph analysis (scene understanding)
- Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation
Scene graph generation (scene parsing)
- Monte Carlo Scene Search for 3D Scene Understanding
- Matting
- Real-Time High Resolution Background Matting
😮oral⭐code🏠project📺video
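The latest open-source matting technique: fast, real-time, and high-resolution, running at 4K (30 fps) and HD (60 fps) on a modern GPU
Explainer: 4K at 30 fps on a single GPU: the University of Washington upgrades real-time video matting again, with hair-level detail preserved
- Video action segmentation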
- Global2Local: Efficient Structure Search for Video Action Segmentation
From global to local: efficient network structure search for video action segmentation
Explainer: 19
- Temporal action segmentation
- LiDAR segmentation
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation
😮oral⭐code
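Ranked first on the SemanticKITTI leaderboard (as of the CVPR deadline) and achieved SOTA on nuScenes. It also generalizes well to other LiDAR-based tasks, including LiDAR panoptic segmentation and LiDAR 3D detection; a method built on this work also ranked first on the SemanticKITTI panoptic segmentation leaderboard.
- Video object segmentation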
- Multiple Instance Active Learning for Object Detection
⭐code
- Positive-Unlabeled Data Purification in the Wild for Object Detection
- Depth from Camera Motion and Object Detection
⭐github📺video
- Towards Open World Object Detection
😮oral⭐code
- General Instance Distillation for Object Detection
- Distilling Object Detectors via Decoupled Features
- MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection
- Informative and Consistent Correspondence Mining for Cross-Domain Weakly Supervised Object Detection
😮oral
- You Only Look One-level Feature (YOLOF)
⭐code
An effective object detector that does not need FPN
- Sparse R-CNN: End-to-End Object Detection with Learnable Proposals
⭐code
- Few-shot object detection
- Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection
The first work to study semantic relation reasoning for few-shot detection, showing its potential to improve strong baselines.
- Dense Relation Distillation with Context-aware Aggregation for Few-Shot Object Detection
Center for Machine Learning Research, Institute for Artificial Intelligence, Peking University
- FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding
⭐code
- Multi-object detection
- 3D object detection
- Categorical Depth Distribution Network for Monocular 3D Object Detection
😮oral
- 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection
⭐code🏠project📺video
More: CVPR 2021 | Semi-supervised 3D object detection using IoU prediction
- ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection
⭐code
- Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection
- Rotated object detection
- Object localization
- Dense object detection
- Weakly supervised
- Semi-supervised
- Self-supervised
- Self-supervised Geometric Perception
😮oral⭐code
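The authors describe SGP as the first general framework for feature learning in geometric perception that needs no supervision from ground-truth geometric labels. SGP runs in an EM-like fashion: it alternates between robust estimation of geometric models to generate pseudo-labels and feature learning supervised by those noisy pseudo-labels. Applied to camera pose estimation and point cloud registration, SGP performs on par with or better than supervised oracles on large-scale real datasets.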
- Diffusion Probabilistic Models for 3D Point Cloud Generation
😮oral⭐code
- Style-based Point Generator with Adversarial Rendering for Point Cloud Completion
- MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization
😮oral⭐code
- TPCN: Temporal Point Cloud Networks for Motion Forecasting
Temporal point cloud networks for motion forecasting
- PointGuard: Provably Robust 3D Point Cloud Classification
- How Privacy-Preserving are Line Clouds? Recovering Scene Details from 3D Lines
⭐code
- Point cloud registration
- PREDATOR: Registration of 3D Point Clouds with Low Overlap
😮oral⭐code🏠project
- SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration
⭐code
- Robust Point Cloud Registration Framework Based on Deep Graph Matching
⭐code
- PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency
⭐code
- Point cloud completion
- Sequential Graph Convolutional Network for Active Learning
- Quantifying Explainers of Graph Neural Networks in Computational Pathology
- Binary Graph Neural Networks
- Inverting the Inherence of Convolution for Visual Recognition
- Representative Batch Normalization with Feature Calibration
- UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pretraining
- Reconsidering Representation Alignment for Multi-view Clustering
- Self-supervised Simultaneous Multi-Step Prediction of Road Dynamics and Cost Map
- Instance Localization for Self-supervised Detection Pretraining
⭐code
- Model-Contrastive Federated Learning
Proposes model-contrastive learning to address the non-IID data problem in federated learning
- Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Surfaces
😮oral⭐code🏠project
- Data-Free Model Extraction
⭐code
- Single-Stage Instance Shadow Detection with Bidirectional Relation Learning
😮oral⭐code
- Continual Adaptation of Visual Representations via Domain Randomization and Meta-learning
😮oral
- PatchmatchNet: Learned Multi-View Patchmatch Stereo
😮oral⭐code
- Online Bag-of-Visual-Words Generation for Unsupervised Representation Learning
- Semantic Palette: Guiding Scene Generation with Class Proportions
- Function4D: Real-time Human Volumetric Capture from Very Sparse Consumer RGBD Sensors
😮oral
- POSEFusion: Pose-guided Selective Fusion for Single-view Human Volumetric Capture
😮oral
- Multi-Objective Interpolation Training for Robustness to Label Noise
⭐code
- Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations
⭐code
- Simpler Certified Radius Maximization by Propagating Covariances
😮oral📺video
- Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food
- Discovering Hidden Physics Behind Transport Dynamics
😮oral
- Soft-IntroVAE: Analyzing and Improving the Introspective Variational Autoencoder
😮oral⭐code🏠project
- Deep Gradient Projection Networks for Pan-sharpening
⭐code
- Consensus Maximisation Using Influences of Monotone Boolean Functions
😮oral
- Forecasting Irreversible Disease via Progression Learning
- Causal Hidden Markov Model for Time Series Disease Forecasting
- Towards Unified Surgical Skill Assessment
- RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words
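RSTNet: an image captioning model with adaptive attention that distinguishes visual from non-visual words
Explainer: 14
- Removing the Background by Adding the Background: Towards a Background Robust Self-supervised Video Representation Learning
Removing background influence by adding background: background-robust self-supervised video representation learning
Explainer: 11
- Representative Batch Normalization with Feature Calibration
😮oral
Author homepage
A representative batch normalization method based on feature calibration
Explainer: 4
- Learning Compositional Representation for 4D Captures with Neural ODE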
- Involution: Inverting the Inherence of Convolution for Visual Recognition
⭐code
- Spatially Consistent Representation Learning
- Limitations of Post-Hoc Feature Alignment for Robustness
- AutoDO: Robust AutoAugment for Biased Data with Label Noise via Scalable Probabilistic Implicit Differentiation
⭐code
- Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution
⭐code
- CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching
⭐code
- Augmentation Strategies for Learning with Noisy Labels
⭐code
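- Visual Perception for Navigation in Human Environments
Call for papers for the 2nd Visual Perception for Navigation in Human Environments workshop ⚠️ Deadline: April 15
- UG2+ Challenge
Aims to advance the analysis of "difficult" images by applying image restoration and enhancement algorithms to improve analysis performance. Participants are tasked with developing new algorithms that improve the analysis of images captured under problematic conditions.
👑 USD 10K in prizes
- Object detection in poor-visibility environments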
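- (Semi-)supervised object detection in haze conditions
- (Semi-)supervised face detection in low-light conditions
- Action recognition from dark videos
- Fully supervised action recognition in the dark
- Semi-supervised action recognition in the dark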
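- Continual Learning in Computer Vision (call for papers open)
Aims to bring together researchers and engineers from academia and industry to discuss the latest advances in continual learning.
- Best paper award: 500 USD + 500 USD worth of Huawei cloud credits (HUAWEI)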
- Overall Challenge winner: 1,000 USD + 500 USD worth of Huawei cloud credits (HUAWEI)
- Supervised-Learning track winner: 500 USD (HUAWEI)
- Reinforcement-Learning track winner: 500 USD (ServiceNow)
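- Responsible Computer Vision
⚠️ Deadline: March 25
This workshop will broadly discuss three main aspects of responsible AI in the context of computer vision: fairness; interpretability and transparency; and privacy.
- Holistic Video Understanding
Aims to establish a video benchmark that integrates joint recognition of all semantic concepts, since a single class label per task is often insufficient to describe the holistic content of a video.
- FGVC 8
The Eighth Workshop on Fine-Grained Visual Categorization (FGVC8) will explore, through the lens of fine-grained visual understanding, topics such as fine-grained learning, self-supervised learning, semi-supervised learning, matching, localization, domain adaptation, transfer learning, few-shot learning, machine teaching, multimodal learning (e.g., audio and video), crowdsourcing, and taxonomic prediction. ⚠️ Paper submission deadline: April 2
Topics of interest include:
- Fine-grained categorization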
- Novel datasets and data collection strategies for fine-grained categorization
- Appropriate error metrics for fine-grained categorization
- Low/few shot learning
- Self-supervised learning
- Semi-supervised learning
- Transfer-learning from known to novel subcategories
- Attribute and part based approaches
- Taxonomic predictions
- Addressing long-tailed distributions
- Human-in-the-loop
- Fine-grained categorization with humans in the loop
- Embedding human experts’ knowledge into computational models
- Machine teaching
- Interpretable fine-grained models
- Multi-modal learning
- Using audio and video data
- Using geographical priors
- Learning shape
- Fine-grained applications
- Product recognition
- Animal biometrics and camera traps
- Museum collections
- Agricultural
- Medical
- Fashion
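- Related challenges (some have already started on Kaggle):
- GeoLifeCLEF2021
Predict species presence from observations paired with aerial imagery and environmental features
- Semi-iNat2021
Semi-supervised fine-grained image classification on iNaturalist data
- iNatChallenge2021
Image classification challenge covering 10,000 species of plants and animals
- iMet2021
Fine-grained attribute classification of artworks
- iMat-Fashion2021 (not yet started)
Clothing instance segmentation and fine-grained attribute classification
- Hotel-ID 2021
Identify hotel rooms from images
- HerbariumChallenge2021
Identify specimens in a dataset of 2.5M images covering nearly 66,000 vascular plant species from the Americas, Oceania, and the Pacific
- iWildCam2021
Count the number of animals of each species in image sequences
- PlantPathologyChallenge2021 (not yet started)
Classify images of diseased plants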