paper.txt

Quantum reinforcement learning 
Quantifying and Mitigating the Impact of Label Errors on Model Disparity Metrics
Suppression helps: Lateral Inhibition-inspired Convolutional Neural Network for Image Classification
Factorized Fourier Neural Operators
DFPC: Data flow driven pruning of coupled channels without data.
TVSPrune - Pruning Non-discriminative filters via Total Variation separability of intermediate representations without fine tuning
Adversarial Training descends without descent: Finding actual descent directions based on Danskin's theorem
A Study of Biologically Plausible Neural Network: the Role and Interactions of Brain-Inspired Mechanisms in Continual Learning
Learning Continuous Normalizing Flows For Faster Convergence To Target Distribution via Ascent Regularizations
pFedKT: Personalized Federated Learning via Knowledge Transfer
FARE: Provably Fair Representation Learning
ONLINE RESTLESS BANDITS WITH UNOBSERVED STATES
Dual-Domain Diffusion Based Progressive Style Rendering towards Semantic Structure Preservation
UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers
Learning to aggregate: A parameterized aggregator to debias aggregation for cross-device federated learning
NeuralStagger: accelerating physics constrained neural PDE solver with spatial-temporal decomposition
Towards Robust Online Dialogue Response Generation
Deep Reinforcement Learning based Insight Selection Policy
Data Leakage in Tabular Federated Learning
Long-horizon video prediction using a dynamic latent hierarchy
SwinZS3: Zero-Shot Semantic Segmentation with a Swin Transformer
Softened Symbol Grounding for Neuro-symbolic Systems
Encoding Recurrence into Transformers
Generating Intuitive Fairness Specifications for Natural Language Processing
Learning to Perturb for Contrastive Learning of Unsupervised Sentence Representations
Proper Scoring Rules for Survival Analysis
Social Network Structure Shapes Innovation: Experience-sharing in RL with SAPIENS
Mini-batch $k$-means terminates within $O(d/\epsilon)$ iterations
Convergence is Not Enough: Average-Case Performance of No-Regret Learning Dynamics
Gene finding revisited: improved robustness through structured decoding from learning embeddings
PPAT: Progressive Graph Pairwise Attention Network for Event Causality Identification
Learning Uncertainty for Unknown Domains with Zero-Target-Assumption
Detecting Out-of-Distribution Data with Semi-supervised Graph "Feature" Networks
Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flow
Towards a Complete Theory of Neural Networks with Few Neurons
Machine Learning from Explanations
Functional Risk Minimization
Latent Linear ODEs with Neural Kalman Filtering for Irregular Time Series Forecasting
Transformer-based model for symbolic regression via joint supervised learning
Gradient-Based Transfer Learning
Coreset for Rational Functions
Joint Representations of Text and Knowledge Graphs for Retrieval and Evaluation
Transformer needs NMDA receptor nonlinearity for long-term memory
Simple Spectral Graph Convolution from an Optimization Perspective
QAID: Question Answering Inspired Few-shot Intent Detection
Rethinking the Value of Prompt Learning for Vision-Language Models
Partial Output Norm: Mitigating the Model Output Blow-up Effect of Cross Entropy Loss 
Disentangled Feature Swapping Augmentation for Weakly Supervised Semantic Segmentation
FLOP: Tasks for Fitness Landscapes Of Protein families using sequence- and structure-based representations
Distributed Least Square Ranking with Random Features
Doing Fast Adaptation Fast: Conditionally Independent Deep Ensembles for Distribution Shifts
Solving stochastic weak Minty variational inequalities without increasing batch size
Diversity Boosted Learning for Domain Generalization with a Large Number of Domains
A Hybrid Framework for Generating A Country-scale Synthetic Population
Towards Performance-maximizing Network Pruning via Global Channel Attention
Adaptive Block-wise Learning for Knowledge Distillation
Curriculum-based Co-design of Morphology and Control of Voxel-based Soft Robots
Object-Centric Learning with Slot Mixture Models
WiNeRT: Towards Neural Ray Tracing for Wireless Channel Modelling and Differentiable Simulations
Pocket-specific 3D Molecule Generation by Fragment-based Autoregressive Diffusion Models
Learning with Non-Uniform Label Noise: A Cluster-Dependent Semi-Supervised Approach
Towards scalable and non-IID robust Hierarchical Federated Learning via Label-driven Knowledge Aggregator
Free Bits: Platform-Aware Latency Optimization of Mixed-Precision Neural Networks for Edge Deployment
LS-IQ: Implicit Reward Regularization for Inverse Reinforcement Learning
On the Certification of Classifiers for Outperforming Human Annotators
Loss Adapted Plasticity: Learning From Data With Unreliable Sources
Share Your Representation Only: Guaranteed Improvement of the Privacy-Utility Tradeoff in Federated Learning
Quantized Disentangled Representations for Object-Centric Visual Tasks
Supervised Random Feature Regression via Projection Pursuit
Graph Spline Networks for Efficient Continuous Simulation of Dynamical Systems
Online black-box adaptation to label-shift in the presence of conditional-shift
RuDar: Weather Radar Dataset for Precipitation Nowcasting with Geographical and Seasonal Variability
Learning Representations for Reinforcement Learning with Hierarchical Forward Models
xTrimoABFold: Improving Antibody Structure Prediction without Multiple Sequence Alignments 
Thresholded Lexicographic Ordered Multi-Objective Reinforcement Learning
HOW SAMPLING AFFECTS TRAINING: AN EFFECTIVE SAMPLING THEORY STUDY FOR LONG-TAILED IMAGE CLASSIFICATION
MolBART: Generative Masked Language Models for Molecular Representations
EquiMod: An Equivariance Module to Improve Self-Supervised Learning
Cross-utterance Conditioned Coherent Speech Editing via Biased Training and Entire Inference
Manipulating Multi-agent Navigation Task via Emergent Communications
Task-Aware Information Routing from Common Representation Space in Lifelong Learning
CodeBPE: Investigating Subtokenization Options for Large Language Model Pretraining on Source Code
SWRM: Similarity Window Reweighting and Margins for Long-Tailed Recognition
Transport with Support: Data-Conditional Diffusion Bridges
Supervised Q-Learning can be a Strong Baseline for Continuous Control
Randomized Sharpness-Aware Training for Boosting Computational Efficiency in Deep Learning
Self-Supervised Off-Policy Ranking via Crowd Layer
Probing for Correlations of Causal Facts: Large Language Models and Causality
Geometry Problem Solving based on Counterfactual Evolutionary Reasoning
Few-Shot Domain Adaptation For End-to-End Communication
HyPHEN: A Hybrid Packing Method and Optimizations for Homomorphic Encryption-Based Neural Network 
Causal Inference for Knowledge Graph Completion
Formal Specifications from Natural Language
DELTA: Diverse Client Sampling for Fasting Federated Learning
Incremental Predictive Coding: A Parallel and Fully Automatic Learning Algorithm
Rethinking Metric Based Contrastive Learning Method's Generalization Capability
RISC-V MICROARCHITECTURE EXPLORATION VIA REINFORCEMENT LEARNING
Improve distance metric learning by learning positions of class centers
The guide and the explorer: smart agents for resource-limited iterated batch reinforcement learning
FairGBM: Gradient Boosting with Fairness Constraints
Kinship Representation Learning with Face Componential Relation
Pseudo-Differential Integral Operator for Learning Solution Operators of Partial Differential Equations
How (Un)Fair is Text Summarization?
Simulating Task-Free Continual Learning Streams From Existing Datasets
Online Bias Correction for Task-Free Continual Learning
A Simple Contrastive Learning Objective for Alleviating Neural Text Degeneration
Enriching Online Knowledge Distillation with Specialist Ensemble
Improved Training of Physics-Informed Neural Networks with Model Ensembles
Improved Gradient Descent Optimization Algorithm based on Inverse Model-Parameter Difference
Variational Learning ISTA
Moment Distributionally Robust Probabilistic Supervised Learning
CLEP: Exploiting Edge Partitioning for Graph Contrastive Learning
Meta-Learning the Inductive Biases of Simple Neural Circuits
Enabling Equation Learning with the Bayesian Model Evidence via systematic $R^2$-elimination
Curvature Informed Furthest Point Sampling
Accelerating spiking neural network training using the $d$-block model
RG: OUT-OF-DISTRIBUTION DETECTION WITH REACTIVATE GRADNORM
Don't fear the unlabelled: safe semi-supervised learning via debiasing
Gandalf: Data Augmentation is all you need for Extreme Classification
Learning a Data-Driven Policy Network for Pre-Training Automated Feature Engineering
Attention Flows for General Transformers
Grounded Contrastive Learning for Open-world Semantic Segmentation
Making Substitute Models More Bayesian Can Enhance Transferability of Adversarial Examples
Learning Group Importance using the Differentiable Hypergeometric Distribution
Convergence Rate of Primal-Dual Approach to Constrained Reinforcement Learning with Softmax Policy
Cross-Layer Retrospective Retrieving via Layer Attention
RephraseTTS: Dynamic Length Text based Speech Insertion with Speaker Style Transfer
Decision S4: Efficient Sequence-Based RL via State Spaces Layers
Deep autoregressive density nets vs neural ensembles for model-based offline reinforcement learning
Light and Accurate: Neural Architecture Search via Two Constant Shared Weights Initialisations
Unveiling the sampling density in non-uniform geometric graphs
Smooth image-to-image translations with latent space interpolations
Boosting Causal Discovery via Adaptive Sample Reweighting
Robust Training through Adversarially Selected Data Subsets
Beyond Reward: Offline Preference-guided Policy Optimization
Iterative Circuit Repair Against Formal Specifications
Neural Probabilistic Logic Programming in Discrete-Continuous Domains
Can BERT Refrain from Forgetting on Sequential Tasks? A Probing Study
Behavior Proximal Policy Optimization 
UiTTa: Online Test-Time Adaptation by User Interaction
FedGC: An Accurate and Efficient Federated Learning under Gradient Constraint for Heterogeneous Data
Actionable Neural Representations: Grid Cells from Minimal Constraints
xTrimoDock: Cross-Modal Transformer for Multi-Chain Protein Docking
Compression-aware Training of Neural Networks using Frank-Wolfe
Modeling content creator incentives on algorithm-curated platforms
MBrain: A Multi-channel Self-Supervised Learning Framework for Brain Signals
Group-Disentangling Conditional Shift
When and Why Is Pretraining Object-Centric Representations Good for Reinforcement Learning?
Face reconstruction from facial templates by learning latent space of a generator network
Mole-BERT: Rethinking Pre-training Graph Neural Networks for Molecules
A sparse, fast, and stable representation for multiparameter topological data analysis
What's in a name? The Influence of Personal Names on Spatial Reasoning in BLOOM Large Language Models
Contrastive Representation Learning for Multi-scale Spatial Scenes
Improving Protein Interaction Prediction using Pretrained Structure Embedding
Batch Normalization and Bounded Activation Functions
Versatile Energy-Based Models for High Energy Physics
MEDOE: A Multi-Expert Decoder and Output Ensemble Framework for Long-tailed Semantic Segmentation
Concept-level Debugging of Part-Prototype Networks
Geometrically regularized autoencoders for non-Euclidean data
Model-based Unknown Input Estimation via Partially Observable Markov Decision Processes
TransFool: An Adversarial Attack against Neural Machine Translation Models
Protein Sequence Design in a Latent Space via Model-based Reinforcement Learning
Breaking Large Language Model-based Code Generation
The GANfather: Controllable generation of malicious activity to expose detection weaknesses and improve defence systems.
Proximal Validation Protocol
A Message Passing Perspective on Learning Dynamics of Contrastive Learning
Farsighter: Efficient Multi-step Exploration for Deep Reinforcement Learning
Help Me Explore: Combining Autotelic and Social Learning via Active Goal Queries
AUTOMATIC CURRICULUM FOR UNSUPERVISED REINFORCEMENT LEARNING
Exploiting Personalized Invariance for Better Out-of-distribution Generalization in Federated Learning
Filtered Semi-Markov CRF
Zeroth-Order Optimization with Trajectory-Informed Derivative Estimation
Distance VS. Coordinate: Distance Based Embedding Improves Model Generalization for Routing Problems
Towards biologically plausible Dreaming and Planning
Mixture of Basis for Interpretable Continual Learning with Distribution Shifts
Extracting Meaningful Attention on Source Code: An Empirical Study of Developer and Neural Model Code Exploration
Denoising Differential Privacy in Split Learning
Neuroevolution is a Competitive Alternative to Reinforcement Learning for Skill Discovery
On Representation Learning in the First Layer of Deep CNNs and the Dynamics of Gradient Descent
Learning Layered Implicit Model for 3D Avatar Clothing Representation
Scrunch: Preventing sensitive property inference through privacy-preserving representation learning
Uniform-in-time propagation of chaos for the mean field gradient Langevin dynamics
GM-VAE: Representation Learning with VAE on Gaussian Manifold
Improving Adversarial Robustness by Putting More Regularizations on Less Robust Samples
Generalizable Multi-Relational Graph Representation Learning: A Message Intervention Approach
Causal Explanations of Structural Causal Models
Asynchronous Distributed Bilevel Optimization
Multi-Agent Reinforcement Learning with Shared Resources for Inventory Management
Confidence-Based Feature Imputation for Graphs with Partially Known Features
Explicitly Maintaining Diverse Playing Styles in Self-Play
Toward Learning Geometric Eigen-Lengths Crucial for Robotic Fitting Tasks
Text2Model: Model Induction for Zero-shot Generalization Using Task Descriptions
LiftedCL: Lifting Contrastive Learning for Human-Centric Perception
Individual Privacy Accounting with Gaussian Differential Privacy
Evolving Populations of Diverse RL Agents with MAP-Elites
Deconfounded Noisy Labels Learning
Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data 
Learning Test Time Augmentation with Cascade Loss Prediction
Adaptive Computation with Elastic Input Sequence
Opportunistic Actor-Critic (OPAC) with Clipped Triple Q-learning
Optimizing Data-Flow in Binary Neural Networks
Gray-Box Gaussian Processes for Automated Reinforcement Learning
Protein Sequence and Structure Co-Design with Equivariant Translation
Deep Equilibrium Non-Autoregressive Sequence Learning
PTUnifier: Pseudo Tokens as Paradigm Unifiers in Medical Vision-and-Language Pre-training
SGD Through the Lens of Kolmogorov Complexity
Offline imitation learning by controlling the effective planning horizon
Learning in temporally structured environments
Identifying Phase Transition Thresholds of Permuted Linear Regression via Message Passing
RandProx: Primal-Dual Optimization Algorithms with Randomized Proximal Updates
Improving the Calibration of Fine-tuned Language Models via Denoising Variational Auto-Encoders
A Hierarchical Bayesian Approach to Federated Learning
Neural Representations in Multi-Task Learning guided by Task-Dependent Contexts
MCTransformer: Combining Transformers And Monte-Carlo Tree Search For Offline Reinforcement Learning
One-Step Estimator for Permuted Sparse Recovery
Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?
Guarded Policy Optimization with Imperfect Online Demonstrations
Fast Nonlinear Vector Quantile Regression
Multi Task Learning of Different Class Label Representations for Stronger Models
On the Existence of a Trojaned Twin Model
On Information Maximisation in Multi-View Self-Supervised Learning
Leveraging Large Language Models for Multiple Choice Question Answering
SELCOR: Self-Correction for Weakly Supervised Learning
Efficiently Meta-Learning for Robust Deep Networks without Prior Unbiased Set
Learning with Logical Constraints but without Shortcut Satisfaction
Certified Training: Small Boxes are All You Need
Label Similarity Aware Contrastive Learning
Counterfactual Generation Under Confounding
Regression with Label Differential Privacy
Hierarchical Abstraction for Combinatorial Generalization in Object Rearrangement
SRBGCN: Tangent space-Free Lorentz Transformations for Graph Feature Learning
Transfer NAS with Meta-learned Bayesian Surrogates
Mitigating the Limitations of Multimodal VAEs with Coordination-Based Approach
Incompatibility between Deterministic Policy and Generative Adversarial Imitation Learning
FiD-Light: Efficient and Effective Retrieval-Augmented Text Generation
Contrastive Learning of Molecular Representation with Fragmented Views
Theoretical Study of Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward
Sharp Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
Selective Frequency Network for Image Restoration
Contextualized Generative Retrieval
Mirror Training for Input Convex Neural Network
Scaling Up Probabilistic Circuits by Latent Variable Distillation
Oscillation Neural Ordinary Differential Equations
Improving Differentiable Neural Architecture Search by Encouraging Transferability
MA-BERT: Towards Matrix Arithmetic-only BERT Inference by Eliminating Complex Non-linear Functions
Automatically Answering and Generating Machine Learning Final Exams
CAT: Collaborative Adversarial Training
Efficient Certified Training and Robustness Verification of Neural ODEs
Arbitrary Virtual Try-On Network: Characteristics Representation and Trade-off between Body and Clothing
A Benchmark Dataset for Learning from Label Proportions
UL2: Unifying Language Learning Paradigms
Emergence of Exploration in Policy Gradient Reinforcement Learning via Resetting
CASR: Generating Complex Sequences with Autoregressive Self-Boost Refinement
SciRepEval: A Multi-Format Benchmark for Scientific Document Representations
On the convergence of SGD under the over-parameter setting
MASTER: Multi-task Pre-trained Bottlenecked Masked Autoencoders are Better Dense Retrievers
Offline Reinforcement Learning via Weighted $f$-divergence
Bitrate-Constrained DRO: Beyond Worst Case Robustness To Unknown Group Shifts
Some Practical Concerns and Solutions for Using Pretrained Representation in Industrial Systems
Exphormer: Scaling Graph Transformers with Expander Graphs
Generalization to translation shifts in object detection: a study in architectures and augmentations
Feature selection and low test error in shallow low-rotation ReLU networks
Backpropagation through Combinatorial Algorithms: Identity with Projection Works
Therbligs in Action: Video Understanding through Motion Primitives
On the Adversarial Robustness against Natural Weather Perturbations
Coupled Multiwavelet Operator Learning for Coupled Differential Equations
Don't Bet on Sparsity: Designing Brain-inspired Distance-preserving Encoder
Mid-Vision Feedback for Convolutional Neural Networks
Cross-Window Self-Training via Context Variations from Sparsely-Labeled Time Series
Revisiting and Improving FGSM Adversarial Training
Safe Reinforcement Learning From Pixels Using a Stochastic Latent Representation
TrojText: Test-time Invisible Textual Trojan Insertion
An Experiment Design Paradigm using Joint Feature Selection and Task Optimization
Multi-Objective Online Learning
Improved Training of Physics-Informed Neural Networks Using Energy-Based Priors: a Study on Electrical Impedance Tomography
Efficient Bayesian Optimization with Deep Kernel Learning and Transformer Pre-trained on Multiple Heterogeneous Datasets
Robustness Guarantees for Adversarially Trained Neural Networks
Fast-PINN for Complex Geometry: Solving PDEs with Boundary Connectivity Loss
Noise Transforms Feed-Forward Networks into Sparse Coding Networks
DEFENDING BACKDOOR ATTACKS VIA ROBUSTNESS AGAINST NOISY LABEL
A Kernel Perspective of Skip Connections in Convolutional Networks
SlothBomb: Efficiency Poisoning Attack against Dynamic Neural Networks
Ordered GNN: Ordering Message Passing to Deal with Heterophily and Over-smoothing
Sparse Distributed Memory is a Continual Learner
Optimistic Exploration in Reinforcement Learning Using Symbolic Model Estimates
FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning
Towards Automatic Generation of Advanced Shift Networks
Robust attributions require rethinking robustness metrics
Learned Nearest-Class-Mean for Biased Representations in Long-Tailed Recognition
GradientMix: A Simple yet Effective Regularization for Large Batch Training
UniMax: Fairer and More Effective Language Sampling for Large-Scale Multilingual Pretraining
Discrete State-Action Abstraction via the Successor Representation
Hyper-parameter Tuning for Fair Classification without Sensitive Attribute Access
Towards Learning Implicit Symbolic Representation for Visual Reasoning
GNNInterpreter: A Probabilistic Generative Model-Level Explanation for Graph Neural Networks
Intra-Instance VICReg: Bag of Self-Supervised Image Patch Embedding Explains the Performance
Rethinking Symbolic Regression: Morphology and Adaptability in the Context of Evolutionary Algorithms
Efficient, probabilistic analysis of combinatorial neural codes
On Pre-training Language Model for Antibody
Challenging Common Assumptions about Catastrophic Forgetting
Learning to reason over visual objects
Imitating Graph-Based Planning with Goal-Conditioned Policies
Prefer to Classify: Improving Text Classifier via Pair-wise Preference Learning
Seeing Differently, Acting Similarly: Heterogeneously Observable Imitation Learning
Simple and Deep Graph Attention Networks
A theoretical study of inductive biases in contrastive learning
Combinatorial Pure Exploration of Causal Bandits
How to fine-tune vision models with SGD
Computational Language Acquisition with Theory of Mind
Rényi Supervised Contrastive Learning for Transferable Representation
MiDAS: Multi-integrated Domain Adaptive Supervision for Fake News Detection
Walking the Tightrope: An Investigation of the Convolutional Autoencoder Bottleneck
A Closer Look at Model Adaptation using Feature Distortion and Simplicity Bias
Pareto Invariant Risk Minimization
Understanding and Adopting Rational Behavior by Bellman Score Estimation
Meta-Learning for Bootstrapping Medical Image Segmentation from Imperfect Supervision 
L2B: Learning to Bootstrap for Combating Label Noise
What Makes Convolutional Models Great on Long Sequence Modeling?
Progressive Mixup Augmented Teacher-Student Learning for Unsupervised Domain Adaptation
M$^3$SAT: A Sparsely Activated Transformer for Efficient Multi-Task Learning from Multiple Modalities
Editing models with task arithmetic
Structured World Representations via Block-Slot Attention
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
Atomized Deep Learning Models
Topology Matters in Fair Graph Learning: a Theoretical Pilot Study
Context-Aware Image Completion
Speech denoising by listening to noise
Can Agents Run Relay Race with Strangers? Generalization of RL to Out-of-Distribution Trajectories
DYNAMIC BATCH NORM STATISTICS UPDATE FOR NATURAL ROBUSTNESS
SKTformer: A Skeleton Transformer for Long Sequence Data
CktGNN: Circuit Graph Neural Network for Electronic Design Automation
How Should I Plan? A Performance Comparison of Decision-Time vs. Background Planning
Substructure-Atom Cross Attention for Molecular Representation Learning
Differentially Private Algorithms for Smooth Nonconvex ERM
Untangling Effect and Side Effect: Consistent Causal Inference in Non-Targeted Trials
AMA: Asymptotic Midpoint Augmentation for Margin Balancing and Moderate Broadening
STUNT: Few-shot Tabular Learning with Self-generated Tasks from Unlabeled Tables
MEDIC: Model Backdoor Removal by Importance Driven Cloning
The Role of Pre-training Data in Transfer Learning
Compressed Predictive Information Coding
Importance of Class Selectivity in Early Epochs of Training
Mechanistic Mode Connectivity
CLASSIFICATION OF INCOMPLETE DATA USING AUGMENTED MLP
On the Convergence of Federated Deep AUC Maximization
Towards A Unified Neural Architecture for Visual Recognition and Reasoning
BLOOM Large Language Models and the Chomsky Hierarchy
WebBrain: Learning to Generate Factually Correct Articles for Queries by Grounding on Large Web Corpus
HloEnv: A Graph Rewrite Environment for Deep Learning Compiler Optimization Research
Towards Diverse Perspective Learning with Switch over Multiple Temporal Pooling
Deep Latent State Space Models for Time-Series Generation
Specformer: Spectral Graph Neural Networks Meet Transformers
MetaP: How to Transfer Your Knowledge on Learning Hidden Physics
CommsVAE: Learning the brain's macroscale communication dynamics using coupled sequential VAEs
Beyond the injective assumption in causal representation learning
Answer Me if You Can: Debiasing Video Question Answering via Answering Unanswerable Questions
Language Models Can (kind of) Reason: A Systematic Formal Analysis of Chain-of-Thought
Approximation ability of Transformer networks for functions with various smoothness of Besov spaces: error analysis and token extraction
Clustering Embedding Tables, Without First Learning Them
Architecture Matters in Continual Learning
Machine Learning Force Fields with Data Cost Aware Training
Covariance Matrix Adaptation MAP-Annealing
Learning Rewards and Skills to Follow Commands with a Data Efficient Visual-Audio Representation
Reinforcement Learning-Based Estimation for Partial Differential Equations
Heterogeneous-Agent Mirror Learning
ADELT: Unsupervised Transpilation Between Deep Learning Frameworks
Recursive Time Series Data Augmentation
Auto-Encoding Goodness of Fit
VER: Learning Natural Language Representations for Verbalizing Entities and Relations
Adaptive IMLE for Few-shot Image Synthesis
Understanding the Covariance Structure of Convolutional Filters
Reinforcement Logic Rule Learning for Temporal Point Processes 
On Making Graph Continual Learning Easy, Fool-Proof, and Extensive: a Benchmark Framework and Scenarios
Masked Distillation with Receptive Tokens
Robust Multivariate Time-Series Forecasting: Adversarial Attacks and Defense Mechanisms
TextShield: Beyond Successfully Detecting Adversarial Sentences in NLP
Efficient Deep Reinforcement Learning Requires Regulating Statistical Overfitting
Nuisances via Negativa: Adjusting for Spurious Correlations via Data Augmentation
GNN Domain Adaptation using Optimal Transport
Ask Me Anything: A simple strategy for prompting language models
MixBin: Towards Budgeted Binarization
Limits of Algorithmic Stability for Distributional Generalization
WikiWhy: Answering and Explaining Cause-and-Effect Questions
Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient
Do We Really Need Graph Models for Skeleton-Based Action Recognition? A Topology-Agnostic Approach with Fully-Connected Networks
An Integrated Multi-Label Multi-Modal Framework in Deep Metric Learning
Proto-Value Networks: Scaling Representation Learning with Auxiliary Tasks
Conservative Exploration in Linear MDPs under Episode-wise Constraints
Pseudometric guided online query and update for offline reinforcement learning
Efficient Data Subset Selection to Generalize Training Across Models: Transductive and Inductive Networks
Probe Into Multi-agent Adversarial Reinforcement Learning through Mean-Field Optimal Control
Robust Algorithms on Adaptive Inputs from Bounded Adversaries
Chasing All-Round Graph Representation Robustness: Model, Training, and Optimization
Training Neural Networks with Low-Precision Model Memory
Raisin: Residual Algorithms for Versatile Offline Reinforcement Learning
VQR: Automated Software Vulnerability Repair Through Vulnerability Queries
Corruption-free Single-view Self-supervised Learning on Graphs
Fully Online Meta Learning
Learning Globally Smooth Functions on Manifolds
On Representing Mixed-Integer Linear Programs by Graph Neural Networks
LEARNING DYNAMIC ABSTRACT REPRESENTATIONS FOR SAMPLE-EFFICIENT REINFORCEMENT LEARNING
Fighting Fire with Fire: Contrastive Debiasing without Bias-free Data via Generative Bias-transformation
On Representing Linear Programs by Graph Neural Networks
On the Importance and Applicability of Pre-Training for Federated Learning
Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel
Autoregressive Graph Network for Learning Multi-step Physics
Simple initialization and parametrization of sinusoidal networks via their kernel bandwidth
Who are playing the games?
Quasiconvex Shallow Neural Network
The Best of Both Worlds: Accurate Global and Personalized Models through Federated Learning with Data-Free Hyper-Knowledge Distillation
Minimalistic Unsupervised Learning with the Sparse Manifold Transform
Rewarding Episodic Visitation Discrepancy for Exploration in Reinforcement Learning
Over-Training with Mixup May Hurt Generalization
HiCLIP: Contrastive Language-Image Pretraining with Hierarchy-aware Attention
Quantile Risk Control: A Flexible Framework for Bounding the Probability of High-Loss Predictions
Text-Conditioned Graph Generation Using Discrete Graph Variational Autoencoders
Dynamic Neural Network is All You Need: Understanding the Robustness of Dynamic Mechanisms in Neural Networks
AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers
Learning Shareable Bases for Personalized Federated Image Classification
Curriculum-inspired Training for Selective Neural Networks
Layer-wise Balanced Activation Mechanism
A Probabilistic Framework For Modular Continual Learning
Knowledge-Grounded Reinforcement Learning
Git Re-Basin: Merging Models modulo Permutation Symmetries
The Tilted Variational Autoencoder: Improving Out-of-Distribution Detection
The Role of Coverage in Online Reinforcement Learning
Learning Mixture Models with Simultaneous Data Partitioning and Parameter Estimation
Estimating Treatment Effects using Neurosymbolic Program Synthesis
Stateful Active Facilitator: Coordination and Environmental Heterogeneity in Cooperative Multi-Agent Reinforcement Learning
UNDERSTANDING HTML WITH LARGE LANGUAGE MODELS
KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding
Kuiper: Moderated Asynchronous Federated Learning on Heterogeneous Mobile Devices with Non-IID Data
Learning Achievement Structure for Structured Exploration in Domains with Sparse Reward
Semi-Autoregressive Energy Flows: Towards Determinant-Free Training of Normalizing Flows
PINTO: Faithful Language Reasoning Using Prompted-Generated Rationales
State Decomposition for Model-free Partially observable Markov Decision Process
Game Theoretic Mixed Experts for Combinational Adversarial Machine Learning
Return Augmentation gives Supervised RL Temporal Compositionality
Neural Integral Equations
Excess Risk of Two-Layer ReLU Neural Networks in Teacher-Student Settings and its Superiority to Kernel Methods
Automatic Data Augmentation via Invariance-Constrained Learning
GEASS: Neural causal feature selection for high-dimensional biological data
Unsupervised 3D Scene Representation Learning via Movable Object Inference
FoveaTer: Foveated Transformer for Image Classification
Linearly Mapping from Image to Text Space
Actor-Critic Alignment for Offline-to-Online Reinforcement Learning
Characterizing intrinsic compositionality in transformers with Tree Projections
What Do We Maximize in Self-Supervised Learning And Why Does Generalization Emerge?
SmartFRZ: An Efficient Training Framework using Attention-Based Layer Freezing
Similarity-Based Cooperation
Consistent Data Distribution Sampling for Large-scale Retrieval
NOVEL FEATURE REPRESENTATION STRATEGIES FOR TIME SERIES FORECASTING WITH PREDICTED FUTURE COVARIATES
Augmentation Component Analysis: Modeling Similarity via the Augmentation Overlaps
Reproducible Bandits
Persistence-based Contrastive Learning with Graph Neural Recurrent Networks for Time-series Forecasting
ACE-EM: Boosted ab initio Cryo-EM 3D Reconstruction with Asymmetric Complementary Autoencoder
Diffusion-based point cloud generation with smoothness constraints
NEURAL HAMILTONIAN FLOWS IN GRAPH NEURAL NETWORKS
Convergence Analysis of Split Learning on Non-IID Data
Principal Trade-off Analysis
Neural Bregman Divergences for Distance Learning
Neural Autoregressive Refinement for Self-Supervised Outlier Detection beyond Images
Offline Reinforcement Learning from Heteroskedastic Data Via Support Constraints
Finding Private Bugs: Debugging Implementations of Differentially Private Stochastic Gradient Descent
Robust Generative Flows on Reliable Image Reconstruction without Training Data
A Computationally Efficient Sparsified Online Newton Method
TG-Gen: A Deep Generative Model Framework for Temporal Graphs
Solving Continual Learning via Problem Decomposition
Long Term Fairness via Performative Distributionally Robust Optimization
The In-Sample Softmax for Offline Reinforcement Learning
LUNA: Language as Continuing Anchors for Referring Expression Comprehension
Bias Propagation in Federated Learning
A Study of Causal Confusion in Preference-Based Reward Learning
UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph
Comparing Human and Machine Bias in Face Recognition
Sufficient Subgraph Embedding Memory for Continual Graph Representation Learning
One cannot stand for everyone! Leveraging Multiple User Simulators to train Task-oriented Dialogue Systems
Towards Out-of-Distribution Adversarial Robustness
Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification
Learning Deep Operator Networks: The Benefits of Over-Parameterization
How Useful are Gradients for OOD Detection Really?
Many-Body Approximation for Tensors
Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games
Memorization Capacity of Neural Networks with Conditional Computation
On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness
A Fast, Well-Founded Approximation to the Empirical Neural Tangent Kernel
Boosting Drug-Target Affinity Prediction from Nearest Neighbors
Weighted Clock Logic Point Process
Simple Emergent Action Representations from Multi-Task Policy Training
Iterative Task-adaptive Pretraining for Unsupervised Word Alignment
Open-Set 3D Detection via Image-level Class and Debiased Cross-modal Contrastive Learning
Tight Non-asymptotic Inference via Sub-Gaussian Intrinsic Moment Norm
Interaction-Based Disentanglement of Entities for Object-Centric World Models
CodeT5Mix: A Pretrained Mixture of Encoder-decoder Transformers for Code Understanding and Generation
Neural Image-based Avatars: Generalizable Radiance Fields for Human Avatar Modeling
Federated Neural Bandits
Compositional Task Representations for Large Language Models
What do large networks memorize?
TILDE-Q: a Transformation Invariant Loss Function for Time-Series Forecasting
Pretraining One Language Model for All With the Text-To-Text Framework Using Model-Generated Signals
A Picture of the Space of Typical Learning Tasks
Linear Mode Connectivity of Deep Neural Networks via Permutation Invariance and Renormalization
Multi-View Masked Autoencoders for Visual Control
Boosting Adversarial Training with Masked Adaptive Ensemble
MILE: Memory-Interactive Learning Engine for Solving Mathematical Problems
UNICO: Efficient Unified Hardware-Software Co-Optimization For Deep Neural Networks
Diffusion-GAN: Training GANs with Diffusion
Contextual Subspace Approximation with Neural Householder Transforms
Mind the Pool: Convolutional Neural Networks Can Overfit Input Size
Towards Unsupervised Time Series Representation Learning: A Decomposition Perspective
Reparameterization through Spatial Gradient Scaling
Boomerang: Local sampling on image manifolds using diffusion models
TOWARD RELIABLE NEURAL SPECIFICATIONS
A second order regression model shows edge of stability behavior
Learning Frequency-aware Network for Continual Learning
Unsupervised Learning for Combinatorial Optimization Needs Meta Learning
Latent Topology Induction for Understanding Contextualized Representations
DyG2Vec: Representation Learning for Dynamic Graphs With Self-supervision
Unsupervised Meta-learning via Few-shot Pseudo-supervised Contrastive Learning
PromptBoosting: Black-Box Text Classification with Ten Forward Passes
Decepticons: Corrupted Transformers Breach Privacy in Federated Learning for Language Models
Adaptive Optimization in the $\infty$-Width Limit
Pyramidal Denoising Diffusion Probabilistic Models
Guiding Energy-based Models via Contrastive Latent Variables
Deep Watermarks for Attributing Generative Models
Steerable Equivariant Representation Learning
Differentially Private Diffusion Models
Outlier-Robust Group Inference via Gradient Space Clustering
Broken Neural Scaling Laws
Learning to perceive objects by prediction
Avoiding spurious correlations via logit correction
LEARNING CONTEXT-AWARE ADAPTIVE SOLVERS TO ACCELERATE QUADRATIC PROGRAMMING
Learning Latent Structural Causal Models
Pre-Training for Robots: Leveraging Diverse Multitask Data via Offline Reinforcement Learning
Safe Exploration Incurs Nearly No Additional Sample Complexity for Reward-Free RL
S$^6$-DAMON: Bridging Self-Supervised Speech Models and Real-time Speech Recognition
Teaching Algorithmic Reasoning via In-context Learning
Offline Q-learning on Diverse Multi-Task Data Both Scales And Generalizes
Disentangled Conditional Variational Autoencoder for Unsupervised Anomaly Detection
Diffusion-based Image Translation using disentangled style and content representation
An Analytic Framework for Robust Training of Differentiable Hypothesis
Federated Learning with Heterogeneous Label Noise: A Dual Structure Approach
Correspondences between word learning in children and captioning models
Mixture of Quantized Experts (MoQE): Complementary Effect of Low-bit Quantization and Robustness
Implicit Regularization for Group Sparsity
Why do Models with Conditional Computation Learn Suboptimal Solutions?
Stabilized training of joint energy-based models and its practical applications
HesScale: Scalable Computation of Hessian Diagonals
Adaptive Anchor for Robust Keypoint Localization
Divide-and-Cluster: Spatial Decomposition Based Hierarchical Clustering
Implicit regularization in Heavy-ball momentum accelerated stochastic gradient descent
ORCA: Interpreting Prompted Language Models via Locating Supporting Evidence in the Ocean of Pretraining Data
Getting away with more network pruning: From sparsity to geometry and linear regions
Real-time variational method for learning neural trajectory and its dynamics
Supervised Metric Learning for Retrieval via Contextual Similarity Optimization
Large Language Models are Human-Level Prompt Engineers
Do Not Blindly Imitate the Teacher: Loss Perturbation for Knowledge Distillation
Fast Yet Effective Graph Unlearning through Influence Analysis
Faster Hyperparameter Search for GNNs via Calibrated Dataset Condensation
FedTiny: Pruned Federated Learning Towards Specialized Tiny Models
Spatiotemporal Modeling of Multivariate Signals with Graph Neural Networks and Structured State Space Models
TI-VAE: A temporally independent VAE with applications to latent factor learning in neuroimaging
Pruning Deep Neural Networks from a Sparsity Perspective
High-dimensional Continuum Armed and High-dimensional Contextual Bandit: with Applications to Assortment and Pricing
Learning to represent and predict evolving visual signals via polar straightening
Protecting Bidder Information in Neural Auctions
On Representation Learning Under Class Imbalance
Gradient Descent Converges Linearly for Logistic Regression on Separable Data
Interpretable (meta)factorization of clinical questionnaires to identify general dimensions of psychopathology
Enhancing Meta Learning via Multi-Objective Soft Improvement Functions
Discrete Predictor-Corrector Diffusion Models for Image Synthesis
Instruction-Following Agents with Jointly Pre-Trained Vision-Language Models
Infusing Lattice Symmetry Priors in Neural Networks Using Soft Attention Masks
Counterfactual Vision-Language Data Synthesis with Intra-Sample Contrast Learning
META-LEARNING FOR UNSUPERVISED OUTLIER DETECTION WITH OPTIMAL TRANSPORT
GPTQ: Accurate Quantization for Generative Pre-trained Transformers
Domain-Invariant Auxiliary Learning for Robust Few-Shot Predictions from Noisy Data
Attentive MLP for Non-Autoregressive Generation
ConserWeightive Behavioral Cloning for Reliable Offline Reinforcement Learning
Dynamics Model Based Adversarial Training For Competitive Reinforcement Learning
ADVL: Adaptive Distillation for Vision-Language Tasks
A new characterization of the edge of stability based on a sharpness measure aware of batch gradient distribution
Finding the smallest tree in the forest: Monte Carlo Forest Search for UNSAT solving
$\mathrm{SE}(3)$-Equivariant Attention Networks for Shape Reconstruction in Function Space
PBES: PCA Based Exemplar Sampling Algorithm for Continual Learning
3D-IntPhys: Learning 3D Visual Intuitive Physics for Fluids, Rigid Bodies, and Granular Materials
Continual Post-Training of Language Models
Min-Max Multi-objective Bilevel Optimization with Applications in Robust Machine Learning
IAE: Implicit Autoencoder for Point Cloud Self-supervised Representation Learning
The Plug and Play of Language Models for Text-to-image Generation
Learning Arborescence with An Efficient Inference Algorithm
Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
A Score-Based Model for Learning Neural Wavefunctions
Benchmarking Algorithms for Domain Generalization in Federated Learning
The Vendi Score: A Diversity Evaluation Metric for Machine Learning
How Can GANs Learn Hierarchical Generative Models for Real-World Distributions
Spotlight: Mobile UI Understanding using Vision-Language Models with a Focus
A Control-Centric Benchmark for Video Prediction
Continual Learning Based on Sub-Networks and Task Similarity
A Stable and Scalable Method for Solving Initial Value PDEs with Neural Networks
Shallow Learning In Materio.
How Can Deep Learning Performs Deep (Hierarchical) Learning
Data Subset Selection via Machine Teaching
Do Summarization Models Synthesize?
CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets
Multi-Grid Tensorized Fourier Neural Operator for High Resolution PDEs
$\beta$-Stochastic Sign SGD: A Byzantine Resilient and Differentially Private Gradient Compressor for Federated Learning
Sequential Brick Assembly with Efficient Constraint Satisfaction
Cross-Domain Self-Supervised Deep Learning for Robust Alzheimer's Disease Progression Modeling
Data-Efficient Finetuning Using Cross-Task Nearest Neighbors
Heavy-tailed Noise Does Not Explain the Gap Between SGD and Adam, but Sign Descent Might
BiAdam: Fast Adaptive Bilevel Optimization Methods
Building Normalizing Flows with Stochastic Interpolants
Elicitation Inference Optimization for Multi-Principal-Agent Alignment
Dual Student Networks for Data-Free Model Stealing
Augmentation Curriculum Learning For Generalization in RL
Composite Slice Transformer: An Efficient Transformer with Composition of Multi-Scale Multi-Range Attentions
Graph Fourier MMD for signals on data graphs
Equal Improvability: A New Fairness Notion Considering the Long-term Impact
Does progress on ImageNet transfer to real world datasets?
Competitive Physics Informed Networks
Decomposed Prompting: A Modular Approach for Solving Complex Tasks
Designing and Using Goal-Conditioned Tools
Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics
ProtFIM: Fill-in-Middle Protein Sequence Design via Protein Language Models
Beyond Deep Learning: An Evolutionary Feature Engineering Approach to Tabular Data Classification
Proportional Multicalibration
On The Impact of Machine Learning Randomness on Group Fairness
Using the Training History to Detect and Prevent Overfitting in Deep Learning Models
Multi-scale Sinusoidal Embeddings Enable Learning on High Resolution Mass Spectrometry Data
Self-Ensemble Protection: Training Checkpoints Are Good Data Protectors
Efficient parametric approximations of neural net function space distance
Systematic Generalization and Emergent Structures in Transformers Trained on Structured Tasks
Energy-Inspired Self-Supervised Pretraining for Vision Models
Effectively Modeling Time Series with Simple Discrete State Spaces
Forgetful causal masking makes causal language models better zero-shot learners
When and why Vision-Language Models behave like Bags-of-Words, and what to do about it?
A Time Series is Worth 64 Words: Long-term Forecasting with Transformers
Protecting DNN from Evasion Attacks using Ensemble of High Focal Diversity
Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-Oriented Dialogue Systems
Efficient Stochastic Optimization for Attacking Randomness Involved Inference
Supervision Complexity and its Role in Knowledge Distillation
GLINKX: A Scalable Unified Framework For Homophilous and Heterophilous Graphs
Marich: A Query-efficient & Online Model Extraction Attack using Public Data
CORE-PERIPHERY PRINCIPLE GUIDED REDESIGN OF SELF-ATTENTION IN TRANSFORMERS
Lovasz Theta Contrastive Learning
Transferable Unlearnable Examples
MUG: Interactive Multimodal Grounding on User Interfaces
Tabular Deep Learning when $d \gg n$ by Using an Auxiliary Knowledge Graph
Random Laplacian Features for Learning with Hyperbolic Space
Replay Memory as An Empirical MDP: Combining Conservative Estimation with Experience Replay
Neural Causal Models for Counterfactual Identification and Estimation
Connecting representation and generation via masked vision-language transformer
Is margin all you need? An extensive empirical study of active learning on tabular data
Momentum Stiefel Optimizer, with Applications to Suitably-Orthogonal Attention, and Optimal Transport
Target Conditioned Representation Independence (TCRI); from Domain-Invariant to Domain-General Representations
Multi-Task Option Learning and Discovery for Stochastic Path Planning
MolEBM: Molecule Generation and Design by Latent Space Energy-Based Modeling
Information-Theoretic Diffusion
Bandwidth Enables Generalization in Quantum Kernel Models
Giving Robots a Hand: Broadening Generalization via Hand-Centric Human Video Demonstrations
SpENCNN: Orchestrating Encoding and Sparsity for Fast Homomorphically Encrypted Neural Network Inference
No Pairs Left Behind: Improving Metric Learning with Regularized Triplet Objective
Minimal Value-Equivalent Partial Models for Scalable and Robust Planning in Lifelong Reinforcement Learning
Gradient Preconditioning for Non-Lipschitz smooth Nonconvex Optimization
Predictive Coding with Approximate Laplace Monte Carlo
What Spurious Features Can Pretrained Language Models Combat?
SIMPLE: A Gradient Estimator for k-Subset Sampling
Transformers Implement First-Order Logic with Majority Quantifiers
Robustness Evaluation Using Local Substitute Networks
Learning Iterative Neural Optimizers for Image Steganography
Graph Neural Networks as Multi-View Learning
Cramming: Training a language model on a single GPU in one day
BertNet: Harvesting Knowledge Graphs from Pretrained Language Models
How Hard is Trojan Detection in DNNs? Fooling Detectors With Evasive Trojans
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
Label-Free Synthetic Pretraining of Object Detectors
Confidence-Conditioned Value Functions for Offline Reinforcement Learning
Current Anomaly Detectors are Anomalous: On Semantic Treatment of OOD Inputs
FedX: Federated Learning for Compositional Pairwise Risk Optimization
On the Sensitivity of Reward Inference to Misspecified Human Models
DeepDFA: Dataflow Analysis-Guided Efficient Graph Learning for Vulnerability Detection
Probability flow solution of the Fokker-Planck equation
Binding Language Models in Symbolic Languages
Probabilistic Categorical Adversarial Attack and Adversarial Training
Multi-Sample Contrastive Neural Topic Model as Multi-Task Learning
Time Will Tell: New Outlooks and A Baseline for Temporal Multi-View 3D Object Detection
Less Is More: Training on Low-Fidelity Images Improves Robustness to Adversarial Attacks
Greedy Information Maximization for Online Feature Selection
Towards Fair Classification against Poisoning Attacks
Unveiling Transformers with LEGO: A Synthetic Reasoning Task
How Much Data Are Augmentations Worth? An Investigation into Scaling Laws, Invariance, and Implicit Regularization
Spatial Reasoning Network for Zero-shot Constrained Scene Generation
Robust Graph Dictionary Learning
Matrix factorization under the constraint of connectivity between observed and source data ~ Muscle synergy analysis based on connectivity between muscle and brain activities ~
Fundamental limits on the robustness of image classifiers
Stochastic Constrained DRO with a Complexity Independent of Sample Size
Evolve Smoothly, Fit Consistently: Learning Smooth Latent Dynamics For Advection-Dominated Systems
Dissecting adaptive methods in GANs
Recycling Scraps: Improving Private Learning by Leveraging Intermediate Checkpoints
Understanding Influence Functions and Datamodels via Harmonic Analysis
BC-IRL: Learning Generalizable Reward Functions from Demonstrations
TextGrad: Advancing Robustness Evaluation in NLP by Gradient-Driven Optimization
Robustness for Free: Adversarially Robust Anomaly Detection Through Diffusion Model
Optimal control neural networks for data-driven discovery of gradient flows.
ErrorAug: Making Errors to Find Errors in Semantic Segmentation
Kernel Regression with Infinite-Width Neural Networks on Millions of Examples
Information Plane Analysis for Dropout Neural Networks
Fed-Cor: Federated Correlation Test with Secure Aggregation
Feasible Adversarial Robust Reinforcement Learning for Underspecified Environments
Dynamical systems embedding with a physics-informed convolutional network
Learning Harmonic Molecular Representations on Riemannian Manifold
When is Offline Hyperparameter Selection Feasible for Reinforcement Learning?
Plansformer: Generating Multi-Domain Symbolic Plans using Transformers
Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement
VISION TRANSFORMER FOR MULTIVARIATE TIME-SERIES CLASSIFICATION (VITMTSC)
Multi-Environment Pretraining Enables Transfer to Action Limited Datasets
Preserving In-Context Learning Ability in Large Language Model Fine-tuning
Efficiently Controlling Multiple Risks with Pareto Testing
Graph Mixup with Soft Alignments
CNN Compression and Search Using Set Transformations with Width Modifiers on Network Architectures
Event-former: A Self-supervised Learning Paradigm for Temporal Point Processes
Learning Interpretable Dynamics from Images of a Freely Rotating 3D Rigid Body
NOTELA: A Generalizable Method for Source Free Domain Adaptation
Characteristic Neural Ordinary Differential Equation
Fast Sampling of Diffusion Models with Exponential Integrator
STay-On-the-Ridge (STON'R): Guaranteed Convergence to Local Minimax Equilibrium in Nonconvex-Nonconcave Games
Federated Representation Learning via Maximal Coding Rate Reduction
3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data
gDDIM: Generalized denoising diffusion implicit models
Panning for Gold in Federated Learning: Targeted Text Extraction under Arbitrarily Large-Scale Aggregation
Artificial Neuronal Ensembles with Learned Context Dependent Gating
Linkless Link Prediction via Relational Distillation
Controllable Concept Transfer of Intermediate Representations
A Differentiable Loss Function for Learning Heuristics in A*
Understanding Multi-Task Scaling in Machine Translation
Learning Language Representations with Logical Inductive Bias
AsymQ: Asymmetric Q-loss to mitigate overestimation bias in off-policy reinforcement learning
Movement-to-Action Transformer Networks for Temporal Action Proposal Generation
INSPIRE: A Framework for Integrating Individual User Preferences in Recourse
How Does Self-supervised Learning Work? A Representation Learning Perspective
Empowering Graph Representation Learning with Test-Time Graph Transformation
Provable Robustness against Wasserstein Distribution Shifts via Input Randomization
GROOT: Corrective Reward Optimization for Generative Sequential Labeling
Interpretations of Domain Adaptations via Layer Variational Analysis
Forget Unlearning: Towards True Data-Deletion in Machine Learning
Meta-Learning with Explicit Task Information
Evaluating Unsupervised Denoising Requires Unsupervised Metrics
Denoising Diffusion Samplers
How I Learned to Stop Worrying and Love Retraining
The Value of Out-of-distribution Data
Recursive Neural Programs: Variational Learning of Image Grammars and Part-Whole Hierarchies
SaiT: Sparse Vision Transformers through Adaptive Token Pruning
Cooperation or Competition: Avoiding Player Domination for Multi-target Robustness by Adaptive Budgets
Image Classification by Throwing Quantum Kitchen Sinks at Tensor Networks
Cross-Domain Few-Shot Relation Extraction via Representation Learning and Domain Adaptation
Factors Influencing Generalization in Chaotic Dynamical Systems
Interpretable Geometric Deep Learning via Learnable Randomness Injection
Koopman Operator Learning for Accelerating Quantum Optimization and Machine Learning
GOGGLE: Generative Modelling for Tabular Data by Learning Relational Structure
Query by Self
A Reproducible and Realistic Evaluation of Partial Domain Adaptation Methods
Progressive Prompts: Continual Learning for Language Models without Forgetting
Differentiable Rendering with Reparameterized Volume Sampling
Deep Learning From Crowdsourced Labels: Coupled Cross-Entropy Minimization, Identifiability, and Regularization
Maximum Likelihood Learning of Energy-Based Models for Simulation-Based Inference
Provable Re-Identification Privacy
Just Avoid Robust Inaccuracy: Boosting Robustness Without Sacrificing Accuracy
Projective Proximal Gradient Descent for Nonconvex Nonsmooth Optimization: Fast Convergence Without Kurdyka-Lojasiewicz (KL) Property
First Steps Toward Understanding the Extrapolation of Nonlinear Models to Unseen Domains
A Kernel-Based View of Language Model Fine-Tuning
Variable Compositionality Reliably Emerges in Neural Networks
Systematic Rectification of Language Models via Dead-end Analysis
Model-free Reinforcement Learning that Transfers Using Random Reward Features
Differentiable Channel Selection for Self-Attention
Membership Inference Attacks Against Text-to-image Generation Models
Multiple sequence alignment as a sequence-to-sequence learning problem
Fair Graph Message Passing with Transparency
FedExP: Speeding up Federated Averaging via Extrapolation
Graph Neural Networks Are More Powerful Than we Think
A Mixture-of-Expert Approach to RL-based Dialogue Management
A Retrieve-and-Read Framework for Knowledge Graph Reasoning
f-DM: A Multi-stage Diffusion Model via Progressive Signal Transformation
An Empirical Study of the Neural Contextual Bandit Algorithms
Backpropagation at the Infinitesimal Inference Limit of Energy-Based Models: Unifying Predictive Coding, Equilibrium Propagation, and Contrastive Hebbian Learning
A Theoretical Framework for Inference and Learning in Predictive Coding Networks
Causally-guided Regularization of Graph Attention improves Generalizability
On a Benefit of Masked Language Model Pretraining: Robustness to Simplicity Bias
FLGAME: A Game-theoretic Defense against Backdoor Attacks In Federated Learning
DeepReShape: Redesigning Neural Networks for Private Inference
The Onset of Variance-Limited Behavior for Networks in the Lazy and Rich Regimes
Semi-Supervised Single Domain Generalization with Label-Free Adversarial Data Augmentation
A Simple Approach for Visual Room Rearrangement: 3D Mapping and Semantic Search
Memory Efficient Dynamic Sparse Training
Accelerated Training via Principled Methods for Incrementally Growing Neural Networks
Progressive Mix-Up for Few-Shot Supervised Multi-Source Domain Transfer
Mitigating Propagation Failures in PINNs using Evolutionary Sampling
Revisiting Information-Based Clustering with Pseudo-Posterior Models
Neural Compositional Rule Learning for Knowledge Graph Reasoning
Temporal Change Sensitive Representation for Reinforcement Learning
Provably Efficient Reinforcement Learning for Online Adaptive Influence Maximization
Fairness via Adversarial Attribute Neighbourhood Robust Learning
Efficient approximation of neural population structure and correlations with probabilistic circuits
Exploring perceptual straightness in learned visual representations
Improving Subgraph Representation Learning via Multi-View Augmentation
Efficient Proxy for NAS is Extensible Now
System identification of neural systems: If we got it right, would we know?
TKIL: Tangent Kernel Optimization for Class Balanced Incremental Learning
Is Forgetting Less a Good Inductive Bias for Forward Transfer?
High-Precision Regressors for Particle Physics
Learning Structured Representations by Embedding Class Hierarchy
Promptagator: Few-shot Dense Retrieval From 8 Examples
Balance is Essence: Accelerating Sparse Training via Adaptive Gradient Correction
Brain-like representational straightening of natural movies in robust feedforward neural networks
FunkNN: Neural Interpolation for Functional Generation
A Framework for Comprehensive Evaluations of Graph Neural Network based Community Detection using Node Clustering
TEXTCRAFT: ZERO-SHOT GENERATION OF HIGH FIDELITY AND DIVERSE SHAPES FROM TEXT
CrystalBox: Efficient Model-Agnostic Explanations for Deep RL Controllers
Label Propagation with Weak Supervision
TypeT5: Seq2seq Type Inference using Static Analysis
Approximating any Function via Coreset for Radial Basis Functions: Towards Provable Data Subset Selection For Efficient Neural Networks training
Axiomatic Explainer Locality With Optimal Transport
Fine-Tuning Offline Policies With Optimistic Action Selection
Improving the Strength of Human-Like Models in Chess
Test-Time Training on Video Streams
AGRO: Adversarial discovery of error-prone Groups for Robust Optimization
Learning Multiobjective Program Through Online Learning
Dichotomy of Control: Separating What You Can Control from What You Cannot
Progressive Knowledge Distillation: Constructing Ensembles for Efficient Inference
Efficient Approximations of Complete Interatomic Potentials for Crystal Property Prediction
LogicDP: Creating Labels for Graph Data via Inductive Logic Programming
Simulating Environments for Evaluating Scarce Resource Allocation Policies
Domain Transfer with Large Dynamics Shift in Offline Reinforcement Learning
Learning to reason with relational abstractions
A Simple Approach for State-Action Abstraction using a Learned MDP Homomorphism
RankMe: Assessing the Downstream Performance of Pretrained Self-Supervised Representations by Their Rank
Less is More: Task-aware Layer-wise Distillation for Language Model Compression
Revisiting Curiosity for Exploration in Procedurally Generated Environments
Online Learning for Obstacle Avoidance
Transformer-based World Models Are Happy With 100k Interactions
Can Neural Networks Learn Implicit Logic from Physical Reasoning?
Blockwise self-supervised learning with Barlow Twins
DIGEST: FAST AND COMMUNICATION EFFICIENT DECENTRALIZED LEARNING WITH LOCAL UPDATES
Learning to Improve Code Efficiency
Real Data Distributions Prefer Simplicity and So Do Our Models: Why Machine Learning and Model Selection Are Possible
Backdoor Attacks in the Supply Chain of Masked Image Modeling
ESCHER: Eschewing Importance Sampling in Games by Computing a History Value Function to Estimate Regret
On Achieving Optimal Adversarial Test Error
General Policy Evaluation and Improvement by Learning to Identify Few But Crucial States
Serving Graph Compression for Graph Neural Networks
Optimal Data Sampling for Training Neural Surrogates of Programs
Towards Understanding GD with Hard and Conjugate Pseudo-labels for Test-Time Adaptation
Achieving Communication-Efficient Policy Evaluation for Multi-Agent Reinforcement Learning: Local TD-Steps or Batching?
Learning where and when to reason in neuro-symbolic inference
Aging with GRACE: Lifelong Model Editing with Key-Value Adaptors
A VAE for Transformers with Nonparametric Variational Information Bottleneck
Learning MLPs on Graphs: A Unified View of Effectiveness, Robustness, and Efficiency
On The Specialization of Neural Modules
HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers
Information-Theoretic Underpinnings of Generalization and Translation in Emergent Communication
Optimal Transport-Based Supervised Graph Summarization
Contrastive Vision Transformer for Self-supervised Out-of-distribution Detection
Does the Half Adversarial Robustness Represent the Whole? It Depends... A Theoretical Perspective of Subnetwork Robustness
Using Both Demonstrations and Language Instructions to Efficiently Learn Robotic Tasks
Improving Accuracy and Explainability of Online Handwriting Recognition
On the duality between contrastive and non-contrastive self-supervised learning
Few-Shot Incremental Learning Using HyperTransformers
The Brainy Student: Scalable Unlearning by Selectively Disobeying the Teacher
FIGARO: Controllable Music Generation using Learned and Expert Features
A Neural PDE Solver with Temporal Stencil Modeling
The Right Losses for the Right Gains: Improving the Semantic Consistency of Deep Text-to-Image Generation with Distribution-Sensitive Losses
Selection Collider Bias in Large Language Models
CausalBench: A Large-scale Benchmark for Network Inference from Single-cell Perturbation Data
Language models are multilingual chain-of-thought reasoners
DreamFusion: Text-to-3D using 2D Diffusion
Recitation-Augmented Language Models
Continual Active Learning
KwikBucks: Correlation Clustering with Cheap-Weak and Expensive-Strong Signals
Credible, Sealed-bid, Optimal Repeated Auctions With Differentiable Economics
The Power of Feel-Good Thompson Sampling: A Unified Framework for Linear Bandits
Two-Tailed Averaging: Anytime Adaptive Once-in-a-while Optimal Iterate Averaging for Stochastic Optimization
Reward Design with Language Models
Calibrating the Rigged Lottery: Making All Tickets Reliable
Replay Buffer with Local Forgetting for Adaptive Deep Model-Based Reinforcement Learning
Contrastive Audio-Visual Masked Autoencoder
Pessimistic Model-Based Actor-Critic for Offline Reinforcement Learning: Theory and Algorithms
The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks
Soft Diffusion: Score Matching For General Corruptions
Open-Vocabulary Panoptic Segmentation MaskCLIP
Robust Federated Learning with Majority Adversaries via Projection-based Re-weighting
Double Wins: Boosting Accuracy and Efficiency of Graph Neural Networks by Reliable Knowledge Distillation
A Statistical Framework for Personalized Federated Learning and Estimation: Theory, Algorithms, and Privacy
Invariant Aggregator for Defending against Federated Backdoor Attacks
Improving Adversarial Robustness of Deep Neural Networks via Self-adaptive Margin Defense
Laser: Latent Set Representations for 3D Generative Modeling
Towards Efficient Gradient-Based Meta-Learning in Heterogenous Environments
Knowledge Cascade: Reverse Knowledge Distillation
Optimal Transport for Offline Imitation Learning
FedorAS: Federated Architecture Search under system heterogeneity
Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Towards A Unified View of Sparse Feed-Forward Network in Transformer
Learning multi-scale local conditional probability models of images
Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions
Online Continual Learning with Feedforward Adaptation
Understanding ReLU Network Robustness Through Test Set Certification Performance
Mind the Privacy Budget: How Generative Models Spend their Privacy Budgets
Resource Efficient Self-Supervised Learning for Speech Recognition
Subsampling in Large Graphs Using Ricci Curvature
Membership Leakage in Pre-trained Language Models
DSI++: Updating Transformer Memory with New Documents
Universal Few-shot Learning of Dense Prediction Tasks with Visual Token Matching
The Game of Hidden Rules: A New Challenge for Machine Learning
Motif-based Graph Representation Learning with Application to Chemical Molecules
Graph schemas as abstractions for transfer learning, inference, and planning
Conservative Bayesian Model-Based Value Expansion for Offline Policy Optimization
Beam Tree Recursive Cells
The Ultimate Combo: Boosting Adversarial Example Transferability by Composing Data Augmentations
In-Time Refining Optimization Trajectories Toward Improved Robust Generalization
Scaling up and Stabilizing Differentiable Planning with Implicit Differentiation
Improving Aspect Ratio Distribution Fairness in Detector Pretraining via Cooperating RPN's
Learning parsimonious dynamics for generalization in reinforcement learning
DECODING LAYER SALIENCY IN TRANSFORMERS
UNDERSTANDING THE ROLE OF POSITIONAL ENCODINGS IN SENTENCE REPRESENTATIONS
Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits
Score-based Continuous-time Discrete Diffusion Models
Decision Transformer under Random Frame Dropping
Semi-supervised consistency regularization for accurate cell type fraction and gene expression estimation
Adversarial Imitation Learning with Preferences
How to Do a Vocab Swap? A Study of Embedding Replacement for Pre-trained Transformers
Attribution Scores are Redundant: Explaining Feature Contribution By Trajectories
SuperFed: Weight Shared Federated Learning
Is Model Ensemble Necessary? Model-based RL via a Single Model with Lipschitz Regularized Value Function
Recurrent Back-Projection Generative Adversarial Network for Video Super Resolution
Neural Networks as Paths through the Space of Representations
From Points to Functions: Infinite-dimensional Representations in Diffusion Models
Disentangling with Biological Constraints: A Theory of Functional Cell Types
ESEAD: An Enhanced Simple Ensemble and Distillation Framework for Natural Language Processing
Efficient One-Shot Neural Architecture Search With Progressive Choice Freezing Evolutionary Search
Synthetic Data Generation of Many-to-Many Datasets via Random Graph Generation
Learning rigid dynamics with face interaction graph networks
On the Importance of Contrastive Loss in Multimodal Learning
MAD for Robust Reinforcement Learning in Machine Translation
An Exploration of Conditioning Methods in Graph Neural Networks
Speed Up Iterative Non-Autoregressive Transformers by Distilling Multiple Steps
Global View For GCN: Why Go Deep When You Can Be Shallow?
Cross-Silo Training of Differentially Private Models with Secure Multiparty Computation
HyperTime: Implicit Neural Representations for Time Series Generation
Generative Adversarial Federated Model
Unsupervised Pretraining for Neural Value Approximation
Homotopy Learning of Parametric Solutions to Constrained Optimization Problems
When Rigid Coherency Hurts: Distributional Coherency Regularization for Probabilistic Hierarchical Time Series Forecasting
EENet: Learning to Early Exit for Adaptive Inference
MALIBO: Meta-Learning for Likelihood-free Bayesian Optimization
Finding and only finding local Nash equilibria by both pretending to be a follower
Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Networks
Images as Weight Matrices: Sequential Image Generation Through Synaptic Learning Rules
SurCo: Learning Linear Surrogates for Combinatorial Nonlinear Optimization Problems
DT+GNN: A Fully Explainable Graph Neural Network using Decision Trees
Why (and When) does Local SGD Generalize Better than SGD?
Function-space regularized Rényi divergences
Constant-Factor Approximation Algorithms for Socially Fair $k$-Clustering
Re-calibrated Wasserstein GAN for large-scale imputation with informative missing
Implicit Bias of Large Depth Networks: a Notion of Rank for Nonlinear Functions
Depth Separation with Multilayer Mean-Field Networks
Robust Policy Optimization in Deep Reinforcement Learning
Analogical Networks for Memory-Modulated 3D Parsing
Fake It Until You Make It : Towards Accurate Near-Distribution Novelty Detection
Injecting knowledge into language generation: a case study in auto-charting after-visit care instructions from medical dialogue
DySR: Adaptive Super-Resolution via Algorithm and System Co-design
Domain Invariant Q-Learning for model-free robust continuous control under visual distractions
Continual Learning with Soft-Masking of Parameter-Level Gradient Flow
Asynchronous Message Passing: A new Framework for Learning in Graphs
Integrating Symmetry into Differentiable Planning with Steerable Convolutions
MolJET: Multimodal Joint Embedding Transformer for Conditional de novo Molecular Design and Multi-Property Optimization
The Challenges of Exploration for Offline Reinforcement Learning
SGD with large step sizes learns sparse features
Synergies Between Disentanglement and Sparsity: a Multi-Task Learning Perspective
Discerning Hydroclimatic Behavior with a Deep Convolutional Residual Regressive Neural Network
Causal Reasoning in the Presence of Latent Confounders via Neural ADMG Learning
ESC: A Benchmark For Multi-Domain End-to-End Speech Recognition
Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Approach
Pareto Rank-Preserving Supernetwork for HW-NAS
ProSampler: Improving Contrastive Learning by Better Mini-batch Sampling
$O(T^{-1})$ Convergence of Optimistic-Follow-the-Regularized-Leader in Two-Player Zero-Sum Markov Games 
Bispectral Neural Networks
Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise
Beyond Lipschitz: Sharp Generalization and Excess Risk Bounds for Full-Batch GD
Zero-Shot Retrieval with Search Agents and Hybrid Environments
Hyper-Decision Transformer for Efficient Online Policy Adaptation
Deep Learning of Intrinsically Motivated Options in the Arcade Learning Environment
Solving Continuous Control via Q-learning
Make-A-Video: Text-to-Video Generation without Text-Video Data
EiX-GNN : Concept-level eigencentrality explainer for graph neural networks
Unsupervised Adaptation for Fairness under Covariate Shift
Pushing the limits of self-supervised learning: Can we outperform supervised learning without labels?
Towards Dynamic Sparsification by Iterative Prune-Grow LookAheads
Learning Useful Representations for Shifting Tasks and Distributions 
Personalized Reward Learning with Interaction-Grounded Learning (IGL)
From Adaptive Query Release to Machine Unlearning
ReAct: Synergizing Reasoning and Acting in Language Models
Towards convergence to Nash equilibria in two-team zero-sum games
Ensemble Homomorphic Encrypted Data Classification
Generative Pretraining for Black-Box Optimization
Meta-Learning Black-Box Optimization via Black-Box Optimization
The Use of Open-Source Boards for Data Collection and Machine Learning in Remote Deployments
Rhino: Deep Causal Temporal Relationship Learning with History-dependent Noise
DensePure: Understanding Diffusion Models towards Adversarial Robustness
Learn, Unlearn and Relearn: An Online Learning Paradigm for Deep Neural Networks
Towards Understanding How Machines Can Learn Causal Overhypotheses 
Grounding Graph Network Simulators using Physical Sensor Observations
Skill Decision Transformer
Architectural Backdoors in Neural Networks
In-distribution and Out-of-distribution Generalization for Graph Neural Networks
Where to Diffuse, How to Diffuse and How to get back: Learning in Multivariate Diffusions
Contrastive Corpus Attribution for Explaining Representations
The ethical ambiguity of AI data enrichment: Measuring gaps in research ethics norms and practices
Spatio-temporal point processes with deep non-stationary kernels
Federated Learning from Small Datasets
Zero-shot Human-Object Interaction Recognition by Bridging Generative and Contrastive Image-Language Models
Explainable Machine Learning Predictions for the Long-term Performance of Brain-Computer Interfaces
The Minimal Feature Removal Problem in Neural Networks
Effectively using public data in privacy preserving Machine learning
Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation
Illusory Adversarial Attacks on Sequential Decision-Makers and Countermeasures
Prompt Tuning with Prompt-aligned Gradient for Vision-Language Models 
Relative Behavioral Attributes: Filling the Gap between Symbolic Goal Specification and Reward Learning from Human Preferences
DINO as a von Mises-Fisher mixture model
Continuous Depth Recurrent Neural Differential Equations
Optimal Membership Inference Bounds for Adaptive Composition of Sampled Gaussian Mechanisms
Advantage Constrained Proximal Policy Optimization in Multi-Agent Reinforcement Learning
Scalable Batch-Mode Deep Bayesian Active Learning via Equivalence Class Annealing
Neural multi-event forecasting on spatio-temporal point processes using probabilistically enriched transformers
Associative Memory Augmented Asynchronous Spatiotemporal Representation Learning for Event-based Perception
Detecting Small Query Graphs in A Large Graph via Neural Subgraph Search
Can we achieve robustness from data alone?
Catastrophic overfitting is a bug but it is caused by features
Semi Parametric Inducing Point Networks
Perceptual Grouping in Vision-Language Models
CADet: Fully Self-Supervised Anomaly Detection With Contrastive Learning
SPRINT: Scalable Semantic Policy Pre-training via Language Instruction Relabeling
SMART: Self-supervised Multi-task pretrAining with contRol Transformers
Evaluation of Active Feature Acquisition Methods under Missing Data
DAG Learning via Sparse Relaxations
Explicitly Minimizing the Blur Error of Variational Autoencoders
GraphEditor: An Efficient Graph Representation Learning and Unlearning Approach
3D Equivariant Diffusion for Target-Aware Molecule Generation and Affinity Prediction
PGASL: Predictive and Generative Adversarial Semi-supervised Learning for imbalanced data
Towards a More Rigorous Science of Blindspot Discovery in Image Models
How gradient estimator variance and bias impact learning in neural networks
Automatically Auditing Large Language Models via Discrete Optimization
Do We Really Need Complicated Model Architectures For Temporal Networks?
Synthetic Pre-Training Tasks for Neural Machine Translation
On the System-Level Effectiveness of Physical Object-Hiding Adversarial Attack in Autonomous Driving
BIG-Graph: Brain Imaging Genetics by Graph Neural Network
Optimizing the Performance of Text Classification Models by Improving the Isotropy of the Embeddings using a Joint Loss Function
Data Feedback Loops: Model-driven Amplification of Dataset Biases
A $2$-parameter Persistence Layer for Learning
Is Conditional Generative Modeling all you need for Decision Making?
META-STORM: Generalized Fully-Adaptive Variance Reduced SGD for Unbounded Functions
TEMPERA: Test-Time Prompt Editing via Reinforcement Learning
Combining pretrained speech and text encoders for spoken language processing
A Large Scale Sample Complexity Analysis of Neural Policies in the Low-Data Regime
Evaluating Representations with Readout Model Switching
Provable Defense Against Geometric Transformations
Augmentation with Projection: Towards an Effective and Efficient Data Augmentation Paradigm for Distillation
Pseudoinverse-Guided Diffusion Models for Inverse Problems
Autoregressive Diffusion Model for Graph Generation
Contrastive introspection (ConSpec) to rapidly identify invariant steps for success
Self-supervised video pretraining yields strong image representations
Planning with Language Models through Iterative Energy Minimization
The Union of Manifolds Hypothesis
Last Layer Re-Training is Sufficient for Robustness to Spurious Correlations
UniS-MMC: Learning Unimodality-supervised Multimodal Contrastive Representations
Progressive Data Dropout: An Adaptive Training Strategy for Large-Scale Supervised Learning
Error Sensitivity Modulation based Experience Replay: Mitigating Abrupt Representation Drift in Continual Learning
Panoptically guided Image Inpainting with Image-level and Object-level Semantic Discriminators
Auditing Fairness Online through Interactive Refinement
REM: Routing Entropy Minimization for Capsule Networks
Don't forget the nullspace! Nullspace occupancy as a mechanism for out of distribution failure
Variational Classification
ContraNorm: A Contrastive Learning Perspective on Oversmoothing and Beyond
Accelerated Single-Call Methods for Constrained Min-Max Optimization
Towards Interpretable Deep Reinforcement Learning with Human-Friendly Prototypes
Distributed Extra-gradient with Optimal Complexity and Communication Guarantees
'I pick you choose': Joint human-algorithm decision making in multi-armed bandits
UnDiMix: Hard Negative Sampling Strategies for Contrastive Representation Learning
What Matters In The Structured Pruning of Generative Language Models?
MaxMin-Novelty: Maximizing Novelty via Minimizing the State-Action Values in Deep Reinforcement Learning
Complete Likelihood Objective for Latent Variable Models
The Surprising Effectiveness of Equivariant Models in Domains with Latent Symmetry
Performance Bounds for Model and Policy Transfer in Hidden-parameter MDPs
Parallel $Q$-Learning: Scaling Off-policy Reinforcement Learning
Emergence of shared sensory-motor graphical language from visual input
Compositional Task Generalization with Discovered Successor Feature Modules
DexDeform: Dexterous Deformable Object Manipulation with Human Demonstrations and Differentiable Physics
NAG-GS: semi-implicit, accelerated and robust stochastic optimizer.
Loop Unrolled Shallow Equilibrium Regularizer (LUSER) - A Memory-Efficient Inverse Problem Solver
Robust Universal Adversarial Perturbations
Understanding the Complexity Gains of Contextual Multi-task RL with Curricula
The Lie Derivative for Measuring Learned Equivariance
Effective passive membership inference attacks in federated learning against overparameterized models
Optimizing Bi-Encoder for Named Entity Recognition via Contrastive Learning
Handling Covariate Shifts in Federated Learning with Generalization Guarantees
Agree to Disagree: Diversity through Disagreement for Better Transferability
Expected Probabilistic Hierarchies
Taking a Step Back with KCal: Multi-Class Kernel-Based Calibration for Deep Neural Networks
A distinct unsupervised reference model from the environment helps continual learning
The Crossword Puzzle: Simplifying Deep Neural Network Pruning with Fabulous Coordinates
Learning the Visualness of Text Using Large Vision-Language Models
SemPPL: Predicting Pseudo-Labels for Better Contrastive Representations
Differentially Private Adaptive Optimization with Delayed Preconditioners
Towards a Mathematics Formalisation Assistant using Large Language Models
Learning Robust Representations via Nuisance-extended Information Bottleneck
FedLite: Improving Communication Efficiency in Federated Split Learning
Adversarial Policies Beat Professional-Level Go AIs
Phenaki: Variable Length Video Generation from Open Domain Textual Descriptions
Long Range Language Modeling via Gated State Spaces
On the (Non-)Robustness of Two-Layer Neural Networks in Different Learning Regimes
Task-customized Masked Autoencoder via Mixture of Cluster-conditional Experts
A Deep Dive into Dataset Imbalance and Bias in Face Identification
Causally Constrained Data Synthesis For Private Data Release
Modeling the Data-Generating Process is Necessary for Out-of-Distribution Generalization
Bayes-MIL: A New Probabilistic Perspective on Attention-based Multiple Instance Learning for Whole Slide Images
Exploring Connections Between Memorization And Membership Inference
Action Matching: A Variational Method for Learning Stochastic Dynamics from Samples
Pre-train Graph Neural Networks for Brain Network Analysis
Investigating Multi-task Pretraining and Generalization in Reinforcement Learning
FIT: A Metric for Model Sensitivity
Transfer Learning with Deep Tabular Models
An Empirical Study on the Efficacy of Deep Active Learning Techniques
CrAM: A Compression-Aware Minimizer
Using Language to Extend to Unseen Domains
Can We Find Nash Equilibria at a Linear Rate in Markov Games?
Speeding up Policy Optimization with Vanishing Hypothesis and Variable Mini-Batch Size
Understanding Train-Validation Split in Meta-Learning with Neural Networks
Revisiting Robustness in Graph Machine Learning
Variational Information Pursuit for Interpretable Predictions
Grammar-Induced Geometry for Data-Efficient Molecular Property Prediction
EF21-P and Friends: Improved Theoretical Communication Complexity for Distributed Optimization with Bidirectional Compression
Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints
A simple Training-Free Method for Rejection Option
Self-Programming Artificial Intelligence Using Code-Generating Language Models
Lossless Adaptation of Pretrained Vision Models For Robotic Manipulation
Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models
Logical Message Passing Networks with One-hop Inference on Atomic Formulas
Noise-Robust De-Duplication at Scale
P2PRISM - Peer to peer learning with individual prism for secure aggregation
Multi-scale Attention for Diabetic Retinopathy Detection in Retinal Fundus Images
Blessing from Experts: Super Reinforcement Learning in Confounded Environments
Unscented Autoencoder
Reinforcement Learning for Bandits with Continuous Actions and Large Context Spaces
Explanation Uncertainty with Decision Boundary Awareness
Hierarchical Neural Program Synthesis
SARNET: SARCASM VS TRUE-HATE DETECTION NETWORK
Learning Portable Skills by Identifying Generalizing Features with an Attention-Based Ensemble
Few-shot Backdoor Attacks via Neural Tangent Kernels
Quantitative Universal Approximation Bounds for Deep Belief Networks
Hyperparameter Optimization through Neural Network Partitioning
DiscoBAX - Discovery of optimal intervention sets in genomic experiment design
How to Enable Uncertainty Estimation in Proximal Policy Optimization
Joint-Predictive Representations for Multi-Agent Reinforcement Learning
Symmetries, Flat Minima and the Conserved Quantities of Gradient Flow
DP-SGD-LF: Improving Utility under Differentially Private Learning via Layer Freezing
Explainability as statistical inference
Concept-based Explanations for Out-of-Distribution Detectors
FaDIn: Fast Discretized Inference for Hawkes Processes with General Parametric Kernels
Summarization Programs: Interpretable Abstractive Summarization with Neural Modular Trees
Planning with Large Language Models for Code Generation
Unleash Model Capacity for Universal Dense Retrieval by Task Specialty Optimization
Training Equilibria in Reinforcement Learning
Hebbian Deep Learning Without Feedback
A Simulation-based Framework for Robust Federated Learning to Training-time Attacks
Key Design Choices for Double-transfer in Source-free Unsupervised Domain Adaptation
PALM: Preference-based Adversarial Manipulation against Deep Reinforcement Learning
Architectural optimization over subgroups of equivariant neural networks
Unsupervised Non-Parametric Signal Separation Using Bayesian Neural Networks
SPIDER: Searching Personalized Neural Architecture for Federated Learning
On Gradient Descent Convergence beyond the Edge of Stability
Synaptic Dynamics Realize First-order Adaptive Learning and Weight Symmetry
FedAvg Converges to Zero Training Loss Linearly: The Power of Overparameterized Multi-Layer Neural Networks
Robustifying Language Models via Adversarial Training with Masked Gradient
Robust Graph Representation Learning via Predictive Coding
Accelerating Hamiltonian Monte Carlo via Chebyshev Integration Time
PromptCAL: Contrastive Affinity Learning via Auxiliary Prompts for Generalized Category Discovery
Multi-Hypothesis 3D human pose estimation metrics favor miscalibrated distributions
Learning to Abstain from Uninformative Data
Order Matters: Agent-by-agent Policy Optimization
AQuaMaM: An Autoregressive, Quaternion Manifold Model for Rapidly Estimating Complex SO(3) Distributions
Conformal Prediction is Robust to Label Noise
$\Phi$-DVAE: Learning Physically Interpretable Representations with Nonlinear Filtering
Revisiting Structured Dropout
Reducing the Capacity Gap via Spherical Knowledge Distillation
Flatter, Faster: Scaling Momentum for Optimal Speedup of SGD
Learning implicit hidden Markov models using neural likelihood-free inference
Brain Signal Generation and Data Augmentation with a Single-Step Diffusion Probabilistic Model
Know Your Boundaries: The Advantage of Explicit Behavior Cloning in Offline RL
On the Convergence of AdaGrad on $\mathbb{R}^d$: Beyond Convexity, Non-Asymptotic Rate and Acceleration
Bounded Attacks and Robustness in Image Transform Domains
SP2 : A Second Order Stochastic Polyak Method
Multi-Objective GFlowNets
Making Better Decision by Directly Planning in Continuous Control
Large language models are not zero-shot communicators
Data dependent frequency sensitivity of convolutional neural networks
Is end-to-end learning enough for fitness activity recognition?
Efficient Exploration using Model-Based Quality-Diversity with Gradients
ResFed: Communication Efficient Federated Learning by Transmitting Deep Compressed Residuals
HiT-MDP: Learning the SMDP option framework on MDPs with Hidden Temporal Variables
Improved Group Robustness via Classifier Retraining on Independent Splits
(Certified!!) Adversarial Robustness for Free!
URVoice: An Akl-Toussaint/Graham-Sklansky Approach towards Convex Hull Computation for Sign Language Interpretation
Gaussian-Bernoulli RBMs Without Tears
Image Emotion Recognition using Cognitive Contextual Summarization Framework
PES: Probabilistic Exponential Smoothing for Time Series Forecasting
Efficient Conditionally Invariant Representation Learning
Distinguishing Feature Model for Ranking From Pairwise Comparisons
Heterogeneous Neuronal and Synaptic Dynamics for Spike-Efficient Unsupervised Learning: Theory and Design Principles
MMVAE+: Enhancing the Generative Quality of Multimodal VAEs without Compromises
Forget to Learn (F2L): Rethinking Replay Loss in Unsupervised Continuous Domain Adaptation
A probabilistic framework for task-aligned intra- and inter-area neural manifold estimation
Applying Second Order Optimization to Deep Transformers with Parameter-Efficient Tuning
Density Sketches for Sampling and Estimation
Mask-tuning: Towards Improving Pre-trained Language Models' Generalization
Meta-Learning via Classifier(-free) Guidance
Tiered Pruning for Efficient Differentiable Inference-Aware Neural Architecture Search
MyoDex: Generalizable Representations for Dexterous Physiological Manipulation
Do We Really Need Labels for Backdoor Defense?
Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics
Single SMPC Invocation DPHelmet: Differentially Private Distributed Learning on a Large Scale
A Scalable Training Strategy for Blind Multi-Distribution Noise Removal
$\ell$Gym: Natural Language Visual Reasoning with Reinforcement Learning
Re-Benchmarking Out-of-Distribution Detection in Deep Neural Networks
Towards Antisymmetric Neural Ansatz Separation
Multi-instance Interactive Segmentation with Self-Supervised Transformer
Triplet learning of task representations in latent space for continual learning
Spurious Features in Continual Learning
Time Series Subsequence Anomaly Detection via Graph Neural Networks
Aligning Model and Macaque Inferior Temporal Cortex Representations Improves Model-to-Human Behavioral Alignment and Adversarial Robustness
Improving Generalization of Motor-Imagery Brainwave Decoding via Dynamic Convolutions
On the Expressive Power of Geometric Graph Neural Networks
Fusion over the Grassmann Manifold for Incomplete-Data Clustering
Off Policy Average Reward Actor Critic with Deterministic Policy Search
Why Did This Model Forecast This Future? Information-Theoretic Temporal Saliency for Counterfactual Explanations of Probabilistic Forecasts
CLMIU: Commonsense Learning in Multimodal Image Understanding.
Topological Data Analysis-Deep Learning Framework for Predicting Cancer Phenotypes
In-Situ Text-Only Adaptation of Speech Models with Low-Overhead Speech Imputations
Rethinking Uniformity in Self-Supervised Representation Learning
Proposal-Contrastive Pretraining for Object Detection from Fewer Data
SkillS: Adaptive Skill Sequencing for Efficient Temporally-Extended Exploration
Bridging between Pool- and Stream-Based Active Learning with Temporal Data Coherence
The Robustness Limits of SoTA Vision Models to Natural Variation
Scaling Laws For Deep Learning Based Image Reconstruction
Robust Exploration via Clustering-based Online Density Estimation
Meta Learning to Bridge Vision and Language Models for Multimodal Few-Shot Learning
DLP: Data-Driven Label-Poisoning Backdoor Attack
AlphaFold Distillation for Improved Inverse Protein Folding
Convexifying Transformers: Improving optimization and understanding of transformer networks
Unsupervised Model-based Pre-training for Data-efficient Control from Pixels
A Cognitive-inspired Multi-Module Architecture for Continual Learning
Shuffled Transformers for Blind Training
Non-Gaussian Process Regression
ImageNet-X: Understanding Model Mistakes with Factor of Variation Annotations
Hardware-aware compression with Random Operation Access Specific Tile (ROAST) hashing
SoftZoo: A Soft Robot Co-design Benchmark For Locomotion In Diverse Environments
Smooth Mathematical Functions from Compact Neural Networks
Self-Supervised Learning of Maximum Manifold Capacity Representations
PMI-guided Masking Strategy to Enable Few-shot Learning for Genomic Applications
TOWARDS AN OBJECTIVE EVALUATION OF THE TRUSTWORTHINESS OF CLASSIFIERS
Fine-grain Inference on Out-of-Distribution Data with Hierarchical Classification
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech
The Adversarial Regulation of the Temporal Difference Loss Costs More Than Expected
Beyond Link Prediction: On Pre-Training Knowledge Graph Embeddings
SYNC: Efficient Neural Code Search Through Structurally Guided Hard Negative Curricula
Masked Siamese ConvNets: Towards an Effective Masking Strategy for General-purpose Siamese Networks 
Canary in a Coalmine: Better Membership Inference with Ensembled Adversarial Queries
Maximum Entropy Information Bottleneck for Confidence-aware Stochastic Embedding
Reprogramming Large Pretrained Language Models for Antibody Sequence Infilling
Optimal Scalarizations for Provable Multiobjective Optimization
Using semantic distance for diverse and sample efficient genetic programming
Semi-parametric Prompt-Generation for Model Editing
Fast Bayesian Updates for Deep Learning with a Use Case in Active Learning
Improved Learning-augmented Algorithms for k-means and k-medians Clustering
A Subspace Correction Method for ReLU Neural Networks for Solving PDEs
Neural Implicit Shape Editing using Boundary Sensitivity
Amortised Invariance Learning for Contrastive Self-Supervision
Direct-Effect Risk Minimization
DIFFUSION GENERATIVE MODELS ON SO(3)
Certifiably Robust Transformers with 1-Lipschitz Self-Attention
Revisiting Populations in multi-agent Communication
Univariate vs Multivariate Time Series Forecasting with Transformers
Semantic Transformation-based Data Augmentation for Few-Shot Learning
Sequential Gradient Coding For Straggler Mitigation
TTN: A Domain-Shift Aware Batch Normalization in Test-Time Adaptation
Choreographer: Learning and Adapting Skills in Imagination
Disentanglement of Correlated Factors via Hausdorff Factorized Support
TimeSeAD: Benchmarking Deep Time-Series Anomaly Detection
On the optimization and generalization of overparameterized implicit neural networks
Differentially Private Conditional Text Generation For Synthetic Data Production
Multi-Task Structural Learning using Local Task Similarity induced Neuron Creation and Removal
Generating Sequences by Learning to Self-Correct
Bringing robotics taxonomies to continuous domains via GPLVM on hyperbolic manifolds
COC curve: operating neural networks at high accuracy and low manual effort
Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers
Repository-Level Prompt Generation for Large Language Models of Code
Predicting Out-of-Domain Generalization with Local Manifold Smoothness
FP_AINet: Fusion Prototype with Adaptive Induction Network for Few-Shot Learning
CLUSTERBERT: MULTI-STAGE FINE-TUNING OF TRANSFORMERS FOR DEEP TEXT CLUSTERING
Neural Network Differential Equation Solvers allow unsupervised error estimation and correction
Wide Attention is the Way Forward for Transformers
Variational Prompt Tuning Improves Generalization of Vision-Language Models
DCT-DiffStride: Differentiable Strides with Real-Valued Data
Interneurons accelerate learning dynamics in recurrent neural networks for statistical adaptation
Burstormer: Burst Image Restoration and Enhancement Transformer
Understanding DDPM Latent Codes Through Optimal Transport
Soft Sampling for Efficient Training of Deep Neural Networks on Massive Data
Learning About Progress From Experts
Learning Fair Graph Representations via Automated Data Augmentations
FUN: Filter-based Unlearnable Datasets
A new photoreceptor-inspired CNN layer enables deep learning models of retina to generalize across lighting conditions
3D Neural Embedding Likelihood for Robust Sim-to-Real Transfer in Inverse Graphics
Dynamic Scheduled Sampling with Imitation Loss for Neural Text Generation
Emergence of Maps in the Memories of Blind Navigation Agents
Latent Neural ODEs with Sparse Bayesian Multiple Shooting
$\mathcal{O}$-GNN: incorporating ring priors into molecular modeling
MACTA: A Multi-agent Reinforcement Learning Approach for Cache Timing Attacks and Detection
Training Normalizing Flows from Dependent Data
Spectral Augmentation for Self-Supervised Learning on Graphs
An ensemble view on mixup
Improving Adversarial Robustness by Contrastive Guided Diffusion Process
$\sigma$Reparam: Stable Transformer Training with Spectral Reparametrization
Towards Multi-spatiotemporal-scale Generalized PDE Modeling
PAC Reinforcement Learning for Predictive State Representations
Federated Learning on Adaptively Weighted Nodes by Bilevel Optimization
Removing Structured Noise with Diffusion Models
Stein Variational Goal Generation for adaptive Exploration in Multi-Goal Reinforcement Learning
Fourier PINNs: From Strong Boundary Conditions to Adaptive Fourier Bases
Distributed Graph Neural Network Training with Periodic Stale Representation Synchronization
SAGE: Semantic-Aware Global Explanations for Named Entity Recognition
Decentralized Optimistic Hyperpolicy Mirror Descent: Provably No-Regret Learning in Markov Games
Graph Contrastive Learning with Model Perturbation
Robust Scheduling with GFlowNets
Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models
Autoregressive Conditional Neural Processes
Exploring Methods for Parsing Movie Scripts - Feature Extraction for Further Social Injustice Analysis
MultiQuan RDP: Rate-Distortion-Perception Coding via Offset Quantizers
$k$NN Prompting: Learning Beyond the Context with Nearest Neighbor Inference
Closed-loop Transcription via Convolutional Sparse Coding
Transformers Learn Shortcuts to Automata
Efficient neural representation in the cognitive neuroscience domain: Manifold Capacity in One-vs-rest Recognition Limit
ULF: UNSUPERVISED LABELING FUNCTION CORRECTION USING CROSS-VALIDATION FOR WEAK SUPERVISION
Islands of Confidence: Robust Neural Network Classification with Uncertainty Quantification
REST: REtrieve & Self-Train for generative action recognition
Quantization-aware Policy Distillation (QPD)
Conceptual Behavior and Human-Likeness in Vision-and-Language Models
Highly Parallel Deep Ensemble Learning
On the Forward Invariance of Neural ODEs
Obtaining More Generalizable Fair Classifiers on Imbalanced Datasets
GMML is All you Need
Understanding The Robustness of Self-supervised Learning Through Topic Modeling
Temporal Disentanglement of Representations for Improved Generalisation in Reinforcement Learning
Distilling Pre-trained Knowledge in Chemical Reactions for Molecular Property Prediction
Provably Efficient Neural Offline Reinforcement Learning via Perturbed Rewards
Learning Debiased Representations via Conditional Attribute Interpolation
Active Learning at the ImageNet Scale
Deep Probabilistic Time Series Forecasting over Long Horizons
Revealing Dominant Eigendirections via Spectral Non-Robustness Analysis in the Deep Reinforcement Learning Policy Manifold
MC-SSL: Towards Multi-Concept Self-Supervised Learning
Latent Hierarchical Imitation Learning for Stochastic Environments
Trimsformer: Trimming Transformer via Searching for Low-Rank Structure
Exploring the Limits of Differentially Private Deep Learning with Group-wise Clipping
Continual Zero-shot Learning through Semantically Guided Generative Random Walks
Mesh-free Eulerian Physics-Informed Neural Networks
Self-supervised learning with rotation-invariant kernels
Strong inductive biases provably prevent harmless interpolation
Active Learning based Structural Inference
Batch Normalization Explained
AN OPERATOR NORM BASED PASSIVE FILTER PRUNING METHOD FOR EFFICIENT CNNS
Neuromechanical Autoencoders: Learning to Couple Elastic and Neural Network Nonlinearity
Temporal Dynamics Aware Adversarial Attacks On Discrete-Time Graph Models
Automatic Curriculum Generation for Reinforcement Learning in Zero-Sum Games
Internet-augmented language models through few-shot prompting for open-domain question answering
Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training
Language Modeling Using Tensor Trains
Bridging the Gap to Real-World Object-Centric Learning
Weighted Regularization for Efficient Neural Network Compression
Stay Moral and Explore: Learn to Behave Morally in Text-based Games
Efficient Discovery of Dynamical Laws in Symbolic Form
Brain2GAN; Reconstructing perceived faces from the primate brain via StyleGAN3
Self-Guided Diffusion Models
Optimistic Exploration with Learned Features Provably Solves Markov Decision Processes with Neural Dynamics
Removing Backdoors in Pre-trained Models by Regularized Continual Pre-training
Would decentralization hurt generalization?
Variational Pseudo Labels for Meta Test-time Adaptation
No-Regret Learning in Strongly Monotone Games Converges to a Nash Equilibrium
Generalized Belief Transport
Adversarial Cheap Talk
Multi-stationary point losses for robust model
Learning Stackelberg Equilibria and Applications to Economic Design Games
Learning to Induce Causal Structure 
Attention Based Models for Cell Type Classification on Single-Cell RNA-Seq Data
Personalized federated composite learning with forward-backward envelopes
Tackling Imbalanced Class in Federated Learning via Class Distribution Estimation
Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning
Sublinear Algorithms for Kernel Matrices via Kernel Density Estimation
CASA: Bridging the Gap between Policy Improvement and Policy Evaluation with Conflict Averse Policy Iteration
Achieve Near-Optimal Individual Regret & Low Communications in Multi-Agent Bandits
Online Boundary-Free Continual Learning by Scheduled Data Prior
HypeR: Multitask Hyper-Prompted Training Enables Large-Scale Retrieval Generalization
HiT-DVAE: Human Motion Generation via Hierarchical Transformer Dynamical VAE
Efficient Learning of Rationalizable Equilibria in General-Sum Games
A Higher Precision Algorithm for Computing the $1$-Wasserstein Distance
Energy-Based Test Sample Adaptation for Domain Generalization
Representation Power of Graph Convolutions: Neural Tangent Kernel Analysis
Bidirectional Language Models Are Also Few-shot Learners
Revisiting adapters with adversarial training
Human-AI Coordination via Human-Regularized Search and Learning
Solving Math Word Problems with Process-based and Outcome-based Feedback
EPISODE: Episodic Gradient Clipping with Periodic Resampled Corrections for Federated Learning with Heterogeneous Data
Memory-Efficient Reinforcement Learning with Priority based on Surprise and On-policyness
Uncovering Directions of Instability via Quadratic Approximation of Deep Neural Loss in Reinforcement Learning
Marginal Probability Explanation: A Saliency Map with Closed-loop Validation
A Theory of Dynamic Benchmarks
On the Trade-Off between Actionable Explanations and the Right to be Forgotten
Learning to Cooperate and Communicate Over Imperfect Channels
A GENERAL SCENARIO-AGNOSTIC REINFORCEMENT LEARNING FOR TRAFFIC SIGNAL CONTROL
Uncertainty-aware off policy learning
Renamer: A Transformer Architecture Invariant to Variable Renaming
Learning What and Where - Unsupervised Disentangling Location and Identity Tracking
BALTO: efficient tensor program optimization with diversity-based active learning
RoCourseNet: Distributionally Robust Training of a Prediction Aware Recourse Model
Inducing Meaningful Units from Character Sequences with Dynamic Capacity Slot Attention
Enhanced Spatio-Temporal Image Encoding for Online Human Activity Recognition
In-context Reinforcement Learning with Algorithm Distillation
BiasPAD: A Bias-Progressive Auto-Debiasing Framework
On the Importance of Diversity in Data-free Model Stealing
Computing all Optimal Partial Transports
Towards Federated Learning of Deep Graph Neural Networks
CounterNet: End-to-End Training of Prediction Aware Counterfactual Explanations
SmilesFormer: Language Model for Molecular Design
Continuously Parameterized Mixture Models
AE-FLOW: Autoencoders with Normalizing Flows for Medical Images Anomaly Detection
Learning a Domain-Agnostic Policy through Adversarial Representation Matching for Cross-Domain Policy Transfer
A Self-Attention Ansatz for Ab-initio Quantum Chemistry
Probabilistically Robust Recourse: Navigating the Trade-offs between Costs and Robustness in Algorithmic Recourse
How robust is unsupervised representation learning to distribution shift?
Autoregressive Generative Modeling with Noise Conditional Maximum Likelihood Estimation
Multi-Behavior Dynamic Contrastive Learning for Recommendation
Analyzing diffusion as serial reproduction
Pseudo-label Training and Model Inertia in Neural Machine Translation
Adaptive Smoothing Gradient Learning for Spiking Neural Networks
Going Beyond Approximation: Encoding Constraints for Explainable Multi-hop Inference via Differentiable Combinatorial Solvers
Robust and accelerated single-spike spiking neural network training with applicability to challenging temporal tasks
Using Planning to Improve Semantic Parsing of Instructional Texts
A NEW PARADIGM FOR CROSS-MODALITY PERSON RE-IDENTIFICATION
Causal Mean Field Multi-Agent Reinforcement Learning
Hidden Markov Mixture of Gaussian Process Functional Regression: Utilizing Multi-Scale Structure for Time-Series Forecasting
HyperDeepONet: learning operator with complex target function space using the limited resources via hypernetwork
CLAS: Central Latent Action Spaces for Coordinated Multi-Robot Manipulation
Edge Guided GANs with Contrastive Learning for Semantic Image Synthesis
Towards Reliable Link Prediction with Robust Graph Information Bottleneck
Enforcing Delayed-Impact Fairness Guarantees
Affinity-Aware Graph Networks
Few-shot Lifelong Reinforcement Learning with Generalization Guarantees: An Empirical PAC-Bayes Approach
Towards the Detection of Diffusion Model Deepfakes
Global-Scale Species Mapping From Crowdsourced Data
CANIFE: Crafting Canaries for Empirical Privacy Measurement in Federated Learning
Multivariate Time Series Forecasting By Graph Attention Networks With Theoretical Guarantees
Wasserstein Generalization Bound for Few-Shot Learning
Maximal Correlation-Based Post-Nonlinear Learning for Bivariate Causal Discovery
A View From Somewhere: Human-Centric Face Representations
Identifiability Results for Multimodal Contrastive Learning
Task-Agnostic Unsupervised Robust Representation Learning
Federated Learning as Variational Inference: A Scalable Expectation Propagation Approach
Latent Graph Inference using Product Manifolds
UNICORN: A Unified Backdoor Trigger Inversion Framework
DBA: Efficient Transformer with Dynamic Bilinear Low-Rank Attention
On the Robustness of Dataset Inference
Client-agnostic Learning and Zero-shot Adaptation for Federated Domain Generalization
Towards Robust Model Watermark via Reducing Parametric Vulnerability
DP-InstaHide: Data Augmentations Provably Enhance Guarantees Against Dataset Manipulations
This Looks Like It Rather Than That: ProtoKNN For Similarity-Based Classifiers
SEQuence-rPPG: A Fast BVP Signal Extraction Method From Frame Sequences
Understanding weight-magnitude hyperparameters in training binary networks
Sample-efficient multi-objective molecular optimization with GFlowNets
Learning Robust Kernel Ensembles with Kernel Average Pooling
Affinity-VAE for clustering and classification of objects in multidimensional image data
Model Stealing Attacks Against Vision-Language Models
Causal Attention to Exploit Transient Emergence of Causal Effect
A Simple Nadaraya-Watson Head for Explainable and Calibrated Classification
Imitating Human Behaviour with Diffusion Models
Learning Privacy-Preserving Graph Embeddings Against Sensitive Attributes Inference
InteriorSim: A Photorealistic Simulator for Embodied AI
Prompt-Based Metric Learning for Few-Shot NER
MetaPhysiCa: Causality-aware Robustness to OOD Initial Conditions in Physics-informed Machine Learning
Representation Balancing with Decomposed Patterns for Treatment Effect Estimation
Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning
Guided Safe Shooting: model based reinforcement learning with safety constraints
Contrastive Meta-Learning for Partially Observable Few-Shot Learning
Analyzing Transformers in Embedding Space
Enhancing the Inductive Biases of Graph Neural ODE for Modeling Dynamical Systems
Efficient Planning in a Compact Latent Action Space
Improved Stein Variational Gradient Descent with Importance Weights
Correlative Information Maximization Based Biologically Plausible Neural Networks for Correlated Source Separation
Simplicity bias leads to amplified performance disparities
Annealed Fisher Implicit Sampler
Do You Remember? Overcoming Catastrophic Forgetting for Fake Audio Detection
Towards Conditionally Dependent Masked Language Models
Leveraging Importance Weights in Subset Selection
Interactive Sequential Generative Models
Gradient flow in the gaussian covariate model: exact solution of learning curves and multiple descent structures
Copy is All You Need
Graph Backup: Data Efficient Backup Exploiting Markovian Transitions
Finding Generalization Measures by Contrasting Signal and Noise
Linearised Implicit Variational Inference
Adversarial Driving Policy Learning by Misunderstanding the Traffic Flow
Differentiable and transportable structure learning
Distributed Inference and Fine-tuning of Large Language Models Over The Internet
Association Rules in QUBO Samples and Where to Find Them
Why adversarial training can hurt robust accuracy
ExpressivE: A Spatio-Functional Embedding For Knowledge Graph Completion
Counterfactual Explanation via Search in Gaussian Mixture Distributed Latent Space
FedPD: Defying data heterogeneity through privacy distillation
Harnessing Client Drift with Decoupled Gradient Dissimilarity
SeKron: A Decomposition Method Supporting Many Factorization Structures
Localized Randomized Smoothing for Collective Robustness Certification
Learning Dictionaries over Datasets through Wasserstein Barycenters
Spatial Entropy as an Inductive Bias for Vision Transformers
Learning Interpretable Neural Discrete Representation for Time Series Classification
Representational Dissimilarity Metric Spaces for Stochastic Neural Networks
Hierarchical Prototypes for Unsupervised Dynamics Generalization in Model-Based Reinforcement Learning
MILAN: Masked Image Pretraining on Language Assisted Representation
Irregularity Reflection Neural Network for Time Series Forecasting
Sequential Learning of Neural Networks for Prequential MDL
Reducing Communication Entropy in Multi-Agent Reinforcement Learning
Relaxed Attention for Transformer Models
SynBench: Task-Agnostic Benchmarking of Pretrained Representations using Synthetic Data
Learning topology-preserving data representations
Interpreting Class Conditional GANs with Channel Awareness
Escaping saddle points in zeroth-order optimization: two function evaluations suffice
Vector Quantization and Shifting: Exploiting Latent Properties to Optimize Neural Codecs
Time-Myopic Go-Explore: Learning A State Representation for the Go-Explore Paradigm
Mastering Spatial Graph Prediction of Road Networks
A Simple Framework for Low-Resolution Detection with High-resolution Knowledge
Learning Probabilistic Topological Representations Using Discrete Morse Theory
Zero-Label Prompt Selection
The Curious Case of Benign Memorization
A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning
Deep Class Conditional Gaussians for Continual Learning
AGREE: A Simple Aggregator of Detectors' Decisions
Unbiased Supervised Contrastive Learning
ReaKE: Contrastive Molecular Representation Learning with Chemical Synthetic Knowledge Graph
Graph MLP-Mixer
Multivariate Gaussian Representation of Previous Tasks for Continual Learning
Learning to Register Unbalanced Point Pairs
On Feature Diversity in Energy-based Models
Physics Model-based Autoencoding for Magnetic Resonance Fingerprinting
Compositional Prompt Tuning with Motion Cues for Open-vocabulary Video Relation Detection
Multi-objective optimization via equivariant deep hypervolume approximation
Conditional Execution Of Cascaded Models Improves The Accuracy-Efficiency Trade-Off
Adversarial Text to Continuous Image Generation
Fine-grained Few-shot Recognition by Deep Object Parsing
Can Wikipedia Help Offline Reinforcement Learning?
NGswin: N-Gram Swin Transformer for Efficient Single Image Super-Resolution
Lightweight Equivariant Graph Representation Learning for Protein Engineering
DiffusER: Diffusion via Edit-based Reconstruction
Modeling Temporal Data as Continuous Functions with Process Diffusion
KeyCLD: Learning Constrained Lagrangian Dynamics in Keypoint Coordinates from Images
DynaMS: Dynamic Margin Selection for Efficient Deep Learning
TANGOS: Regularizing Tabular Neural Networks through Gradient Orthogonalization and Specialization
How does Uncertainty-aware Sample-selection Help Decision against Action Noise?
SPC-Net: A New Scalable Point Cloud Compression Framework for Both Machine and Human Vision Tasks
Inversely Eliciting Numerical Reasoning in Language Models via Solving Linear Systems
Model-based Causal Bayesian Optimization
Targeted Attacks on Timeseries Forecasting
QuAFL: Federated Averaging Made Asynchronous and Communication-Efficient
Random Matrix Analysis to Balance between Supervised and Unsupervised Learning under the Low Density Separation Assumption
Learning to Solve Constraint Satisfaction Problems with Recurrent Transformers
MARLlib: Extending RLlib for Multi-agent Reinforcement Learning
Improving the imputation of missing data with Markov Blanket discovery
Boosting the Cycle Counting Power of Graph Neural Networks with I$^2$-GNNs
Optimizing Connectivity through Network Gradients for the Restricted Machine
Energy Consumption-Aware Tabular Benchmarks for Neural Architecture Search
Fundamental Limits in Formal Verification of Message-Passing Neural Networks
Score Matching via Differentiable Physics
QUIC-FL: Quick Unbiased Compression for Federated Learning
FedMEKT: Split Multimodal Embedding Knowledge Transfer in Federated Learning
Short-Term Memory Convolutions
LexMAE: Lexicon-Bottlenecked Pretraining for Large-Scale Retrieval
A GNN-Guided Predict-and-Search Framework for Mixed-Integer Linear Programming
Parameter Averaging for SGD Stabilizes the Implicit Bias towards Flat Regions
On Explaining Neural Network Robustness with Activation Path
Flareon: Stealthy Backdoor Injection via Poisoned Augmentation
Dimensionless instance segmentation by learning graph representations of point clouds
Structure by Architecture: Structured Representations without Regularization
Understanding Neural Coding on Latent Manifolds by Sharing Features and Dividing Ensembles
Learning Fast and Slow for Time Series Forecasting
Perturbation Defocusing for Adversarial Defense
Accuracy Boosters: Epoch-Driven Mixed-Mantissa Block Floating-Point for DNN Training
Compressing multidimensional weather and climate data into neural networks
Guess the Instruction! Making Language Models Stronger Zero-Shot Learners
Probabilistic Imputation for Time-series Classification with Missing Data
Delve into the Layer Choice of BP-based Attribution Explanations
Timing is Everything: Learning to Act Selectively with Costly Actions and Budgetary Constraints
Multi-Head State Space Model for Sequence Modeling
A Weight Variation-Aware Training Method for Hardware Neuromorphic Chips
Semantic Prior for Weakly Supervised Class-Incremental Segmentation
A Mutual Information Duality Algorithm for Multi-Agent Specialization
DECAP: Decoding CLIP Latents for Zero-shot Captioning
Heterogeneous Loss Function with Aggressive Rejection for Contaminated data in anomaly detection
Biological Factor Regulatory Neural Network
Preserving Semantics in Textual Adversarial Attacks
Unbiased Decisions Reduce Regret: Adversarial Optimism for the Bank Loan Problem
That Label's got Style: Handling Label Style Bias for Uncertain Image Segmentation
Prompt Injection: Parameterization of Fixed Inputs
Holistic Adversarially Robust Pruning
PASHA: Efficient HPO and NAS with Progressive Resource Allocation
Thinking fourth dimensionally: Treating Time as a Random Variable in EBMs
Diversity of Generated Unlabeled Data Matters for Few-shot Hypothesis Adaptation
StableDR: Stabilized Doubly Robust Learning for Recommendation on Data Missing Not at Random
Hybrid-Regressive Neural Machine Translation
Variational Causal Dynamics: Discovering Modular World Models from Interventions
Query The Agent: Improving Sample Efficiency Through Epistemic Uncertainty Estimation
Differentiable Logic Programming for Probabilistic Reasoning
Rewiring with Positional Encodings for GNNs
Automatic Dictionary Generation: Could Brothers Grimm Create a Dictionary with BERT?
Feed-Forward Latent Domain Adaptation
Sampling-based inference for large linear models, with application to linearised Laplace
Defending against Adversarial Audio via Diffusion Model
Text-Guided Diffusion Image Style Transfer with Contrastive Loss Fine-tuning
FedProp: Cross-client Label Propagation for Federated Semi-supervised Learning
Test-Time Adaptation for Real-World Denoising Networks via Noise-Aware Image Generation
Gated Inference Network: Inferencing and Learning State-Space Models
Theoretical Characterization of the Generalization Performance of Overfitted Meta-Learning
Cold Posteriors through PAC-Bayes
Learning 3D Point Cloud Embeddings using Optimal Transport
Training language models for deeper understanding improves brain alignment
DeNF: Unsupervised Scene-Decompositional Normalizing Flows
VQ-TR: Vector Quantized Attention for Time Series Forecasting
Local KL Convergence Rate for Stein Variational Gradient Descent with Reweighted Kernel
A Decomposition Based Dual Projection Model for Multivariate Time Series Forecasting and Anomaly Detection
LEXA: Language-agnostic Cross-consistency Training for Question Answering Tasks
FedHPO-Bench: A Benchmark Suite for Federated Hyperparameter Optimization
CCT: Cross-consistency training for Clone Detection and Code Search Tasks
Robust Explanation Constraints for Neural Networks
Cyclophobic Reinforcement Learning
Emergent collective intelligence from massive-agent cooperation and competition
Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling
GraphVF: Controllable Protein-Specific 3D Molecule Generation with Variational Flow
Graph Neural Networks as Gradient Flows: understanding graph convolutions via energy
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Revisit Finetuning strategy for Few-Shot Learning to Strengthen the Equivariance of Embeddings
Memory Learning of Multivariate Asynchronous Time Series
Scalable Multi-Modal Continual Meta-Learning
Optimizing Spca-based Continual Learning: A Theoretical Approach
Value Memory Graph: A Graph-Structured World Model for Offline Reinforcement Learning
CAKE: CAusal and collaborative proxy-tasKs lEarning for Semi-Supervised Domain Adaptation
RulE: Neural-Symbolic Knowledge Graph Reasoning with Rule Embedding
Sampling-free Inference for Ab-Initio Potential Energy Surface Networks
PET-NeuS: Positional Encoding Triplanes for Neural Surfaces
Hidden Schema Networks
A New Hierarchy of Expressivity for Graph Neural Networks
Learning Input-agnostic Manipulation Directions in StyleGAN with Text Guidance
Learning Task Agnostic Temporal Consistency Correction
Has Structural Information Been Fully Exploited in Graph Data?
Prescribed Safety Performance Imitation Learning from A Single Expert Dataset
End-to-end Invariance Learning with Relational Inductive Biases in Multi-Object Robotic Manipulation
DAVA: Disentangling Adversarial Variational Autoencoder
Comparing Auxiliary Tasks for Learning Representations for Reinforcement Learning
TDR-CL: Targeted Doubly Robust Collaborative Learning for Debiased Recommendations
Dynamic-Aware GANs: Time-Series Generation with Handy Self-Supervision
Learning Gradient-based Mixup towards Flatter Minima for Domain Generalization
DeepGRAND: Deep Graph Neural Diffusion
Learning Discrete Representation with Optimal Transport Quantized Autoencoders
How to Keep Cool While Training
Learning System Dynamics from Sensory Input under Optimal Control Principles
Dual Algorithmic Reasoning
Lmser-pix2seq: Learning Stable Sketch Representations For Sketch Healing
UnifySpeech: A Unified Framework for Zero-shot Text-to-Speech and Voice Conversion
Toward Effective Deep Reinforcement Learning for 3D Robotic Manipulation: End-to-End Learning from Multimodal Raw Sensory Data
Domain Generalisation via Domain Adaptation: An Adversarial Fourier Amplitude Approach
Improving Generative Flow Networks with Path Regularization
On the Shortcut Learning in Multilingual Neural Machine Translation
Confidential-PROFITT: Confidential PROof of FaIr Training of Trees
Consolidator: Mergeable Adapter with Group Connections for Vision Transformer
Statistical Theory of Differentially Private Marginal-based Data Synthesis Algorithms
Homotopy-based training of NeuralODEs for accurate dynamics discovery
Transformers with Multiresolution Attention Heads
Anti-Symmetric DGN: a stable architecture for Deep Graph Networks
Contrastive Learning for Unsupervised Domain Adaptation of Time Series
Model-Based Decentralized Policy Optimization 
CLIP model is an Efficient Continual Learner
Online Low Rank Matrix Completion
Modality Complementariness: Towards Understanding Multi-modal Robustness
Effective Offline Reinforcement Learning via Conservative State Value Estimation
ChemAlgebra: Algebraic Reasoning on Chemical Reactions
A Primal-Dual Framework for Transformers and Neural Networks
Explaining RL Decisions with Trajectories
Reinforcement Learning using a Molecular Fragment Based Approach for Reaction Discovery
Keypoint Matching via Random Network Consensus
Visually-augmented pretrained language models for NLP Tasks without Images
Calibration for Decision Making via Empirical Risk Minimization
Improving Adversarial Robustness via Frequency Regularization
I Speak, You Verify: Toward Trustworthy Neural Program Synthesis
FastFill: Efficient Compatible Model Update
Learnable Graph Convolutional Attention Networks
Indoor Localisation for Detecting Medication Use in Parkinson's Disease
Scaffolding a Student to Instill Knowledge
User-Interactive Offline Reinforcement Learning
No-regret Learning in Repeated First-Price Auctions with Budget Constraints
Server Aggregation as Linear Regression: Reformulation for Federated Learning
Private and Efficient Meta-Learning with Low Rank and Sparse decomposition
$\omega$GNNs: Deep Graph Neural Networks Enhanced by Multiple Propagation Operators
Few-bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction
SLTUNET: A Simple Unified Model for Sign Language Translation
Pruning by Active Attention Manipulation
Robustness of Unsupervised Representation Learning without Labels
Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization
ACQL: An Adaptive Conservative Q-Learning Framework for Offline Reinforcement Learning
Fisher-Legendre (FishLeg) optimization of deep neural networks
A law of adversarial risk, interpolation, and label noise
Lossy Image Compression with Conditional Diffusion Models
Invariance Makes a Difference: Disentangling the Role of Invariance and Equivariance in Representations
Improving the generalization ability of the chaotic time-series classification models by residual component extraction
Learning DAGs from Fourier-Sparse Data
ASIF: coupled data turns unimodal models to multimodal without training
Momentum Boosted Episodic Memory for Improving Learning in Long-Tailed RL Environments
ProtoGNN: Prototype-Assisted Message Passing Framework for Non-Homophilous Graphs
MonoFlow: A Unified Generative Modeling Framework for GAN Variants
The Effective coalitions of Shapley value For Integrated Gradients
Cold Rao-Blackwellized Straight-Through Gumbel-Softmax Gradient Estimator
Generative Spoken Language Model based on continuous word-sized audio tokens
Tree-structure segmentation for logistic regression
Neural Image Compression with a Diffusion-based Decoder
Learning ReLU networks to high uniform accuracy is intractable
GAML: geometry-aware meta-learning via a fully adaptive preconditioner
Caption supervision enables robust learners: a controlled study of distributionally robust model training
Active Learning for Object Detection with Evidential Deep Learning and Hierarchical Uncertainty Aggregation
How Sharpness-Aware Minimization Minimizes Sharpness?
Learning to solve the Hidden Clique Problem with Graph Neural Networks
On discrete symmetries of robotics systems: A group-theoretic and data-driven analysis
The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks
Out-of-Domain Intent Detection Considering Multi-turn Dialogue Contexts
Consciousness-Aware Multi-Agent Reinforcement Learning
Better with Less: Data-Active Pre-training of Graph Neural Networks
MAST: Masked Augmentation Subspace Training for Generalizable Self-Supervised Priors
Pseudo-Edge: Semi-Supervised Link Prediction with Graph Neural Networks
Graph-based Deterministic Policy Gradient for Repetitive Combinatorial Optimization Problems
Lower Bounds on the Depth of Integral ReLU Neural Networks via Lattice Polytopes
Contextual Transformer for Offline Reinforcement Learning
Two-Dimensional Weisfeiler-Lehman Graph Neural Networks for Link Prediction
Wasserstein Auto-encoded MDPs: Formal Verification of Efficiently Distilled RL Policies with Many-sided Guarantees
Efficient Controllable Generation with Guarantee
Towards graph-level anomaly detection via deep evolutionary mapping
Global Explainability of GNNs via Logic Combination of Learned Concepts
Pessimistic Policy Iteration for Offline Reinforcement Learning
BO-Muse: A Human expert and AI teaming framework for accelerated experimental design 
Coordination Scheme Probing for Generalizable Multi-Agent Reinforcement Learning
Generalization error bounds for Neural Networks with ReLU activation
Two Birds, One Stone: An Equivalent Transformation for Hyper-relational Knowledge Graph Modeling
Gradient Gating for Deep Multi-Rate Learning on Graphs
Self-Supervised Extreme Compression of Gigapixel Images
Combating noisy labels with stochastic noise-tolerated supervised contrastive learning
MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning
Capturing the Motion of Every Joint: 3D Human Pose and Shape Estimation with Independent Tokens
Almost Linear Constant-Factor Sketching for $\ell_1$ and Logistic Regression
Neural-based classification rule learning for sequential data
Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL
$\epsilon$-Invariant Hierarchical Reinforcement Learning for Building Generalizable Policy
Learning Control by Iterative Inversion
DetectBench: An Object Detection Benchmark for OOD Generalization Algorithms
Generalization Bounds with Arbitrary Complexity Measures
Graphics Capsule: Learning hierarchical 3D representations from 2D images and its application on human faces
Weak Supervision Variational Auto-Encoder
Object Detection with OOD Generalizable Neural Architecture Search
Learning To Invert: Simple Adaptive Attacks for Gradient Inversion in Federated Learning
Leveraging Unlabeled Data to Track Memorization
CCIL: Context-conditioned imitation learning for urban driving
Improving Continual Learning by Accurate Gradient Reconstructions of the Past
Revisiting Dense Retrieval with Unanswerable Counterfactuals
Group-wise Verifiable Distributed Computing for Machine Learning under Adversarial Attacks
Extending graph transformers with quantum computed aggregation
Policy-Based Self-Competition for Planning Problems
Can Fair Federated Learning reduce the need for personalization?
Efficient Out-of-Distribution Detection based on In-Distribution Data Patterns Memorization with Modern Hopfield Energy
Conditional Policy Similarity: An Overlooked Factor in Zero-Shot Coordination
Pareto-Efficient Decision Agents for Offline Multi-Objective Reinforcement Learning
Learning from Asymmetrically-corrupted Data in Regression for Sensor Magnitude
NAGphormer: A Tokenized Graph Transformer for Node Classification in Large Graphs
Bayesian Oracle for bounding information gain in neural encoding models
Near Optimal Private and Robust Linear Regression
From Distance to Dependency: A Paradigm Shift of Full-reference Image Quality Assessment
Siamese-NAS: Using Trained Samples Efficiently to Find Lightweight Neural Architecture by Prior Knowledge
Inverse Learning with Extremely Sparse Feedback for Recommendation
Instance-Specific Augmentation: Capturing Local Invariances
Spectral Subgraph Localization
Dynamical Signatures of Learning in Recurrent Networks
Shifts 2.0: Extending The Dataset of Real Distributional Shifts
$\Lambda$-DARTS: Mitigating Performance Collapse by Harmonizing Operation Selection among Cells
Prototypical Context-aware Dynamics Generalization for High-dimensional Model-based Reinforcement Learning
Efficient Hyperparameter Optimization Through Tensor Completion
Learning Vortex Dynamics for Fluid Inference and Prediction
Self-Supervised SVDE from Videos with Depth Variance to Shifted Positional Information
Discovering Generalizable Multi-agent Coordination Skills from Multi-task Offline Data
MATA*: Combining Learnable Node Matching with A* Algorithm for Approximate Graph Edit Distance Computation
On student-teacher deviations in distillation: does it pay to disobey?
Quantum Vision Transformers
Merging Models Pre-Trained on Different Features with Consensus Graph
Unsupervised Performance Predictor for Architecture Search
Efficient recurrent architectures through activity sparsity and sparse back-propagation through time
Quality-Similar Diversity via Population Based Reinforcement Learning
PREDICTION OF TOURISM FLOW WITH SPARSE DATA INCORPORATING TOURIST GEOLOCATIONS
Uncertainty-oriented Order Learning for Facial Beauty Prediction
Modeling the Uncertainty with Maximum Discrepant Students for Semi-supervised 2D Pose Estimation
UTS: When Monotonic Value Factorisation Meets Non-monotonic and Stochastic Targets
TransLog: A Unified Transformer-based Framework for Log Anomaly Detection
Meta-learning with Auto-generated Tasks for Predicting Human Behaviour in Normal Form Games
Are Graph Attention Networks Attentive Enough? Rethinking Graph Attention by Capturing Homophily and Heterophily
FairGrad: Fairness Aware Gradient Descent
Better Teacher Better Student: Dynamic Prior Knowledge for Knowledge Distillation
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow
Inequality phenomenon in $l_{\infty}$-adversarial training, and its unrealized threats
Tensor-Based Sketching Method for the Low-Rank Approximation of Data Streams
CRISP: Curriculum based Sequential neural decoders for Polar code family
A Mathematical Framework for Characterizing Dependency Structures of Multimodal Learning
Language Models are Realistic Tabular Data Generators
Data augmentation alone can improve adversarial training
Learning Rotation-Equivariant Features for Visual Correspondence
Learning Diffusion Bridges on Constrained Domains
Revisiting Uncertainty Estimation for Node Classification: New Benchmark and Insights
CUTS: Neural Causal Discovery from Unstructured Time-Series Data
Multi-Source Transfer Learning for Deep Model-Based Reinforcement Learning
Balancing MSE against Abrupt Changes for Time-Series Forecasting
PAVI: Plate-Amortized Variational Inference
Near-optimal Coresets for Robust Clustering
CLUTR: Curriculum Learning via Unsupervised Task Representation Learning
Test-time Adaptation for Segmentation via Image Synthesis
On the Importance of In-distribution Class Prior for Out-of-distribution Detection
Quantized Compressed Sensing with Score-Based Generative Models
Unbiased Representation of Electronic Health Records for Patient Outcome Prediction
Valid P-Value for Deep Learning-driven Salient Region
Unsupervised Semantic Segmentation with Self-supervised Object-centric Representations
Skill Graph for Real-world Quadrupedal Robot Reinforcement Learning
Pre-training Protein Structure Encoder via Siamese Diffusion Trajectory Prediction
Indiscriminate Poisoning Attacks on Unsupervised Contrastive Learning
Decompositional Generation Process for Instance-Dependent Partial Label Learning
Multimodal Masked Autoencoders Learn Transferable Representations
Adversarial Causal Augmentation for Graph Covariate Shift
Learning from conflicting data with hidden contexts
Building a Subspace of Policies for Scalable Continual Learning
Test-Time AutoEval with Supporting Self-supervision
Complexity-Based Prompting for Multi-step Reasoning
ECLAD: Extracting Concepts with Local Aggregated Descriptors
Not All Tasks Are Born Equal: Understanding Zero-Shot Generalization
MA2QL: A Minimalist Approach to Fully Decentralized Multi-Agent Reinforcement Learning
Learning Asymmetric Visual Semantic Embedding for Image-Text Retrieval
Representation Interference Suppression via Non-linear Value Factorization for Indecomposable Markov Games
On Threshold Functions in Learning to Generate Feasible Solutions of Mixed Integer Programs
SDAC: Efficient Safe Reinforcement Learning with Low-Biased Distributional Actor-Critic
So-TVAE: Sentiment-oriented Transformer-based Variational Autoencoder Network for Live Video Commenting
SoTeacher: Toward Student-oriented Teacher Network Training for Knowledge Distillation
GuardHFL: Privacy Guardian for Heterogeneous Federated Learning
Class-wise Visual Explanations for Deep Neural Networks
Decentralized Policy Optimization
Identification of the Adversary from a Single Adversarial Example
Similarity of Neural Architectures Based on Input Gradient Transferability
Image Segmentation using Transfer Learning with DeepLabv3 to Facilitate Photogrammetric Limb Scanning
G-Censor: Graph Contrastive Learning with Task-Oriented Counterfactual Views
Unsupervised 3D object learning through neuron activity aware plasticity
Visually-Augmented Language Modeling
A HIERARCHICAL FRAGMENT-BASED MODEL FOR 3D DRUG-LIKE MOLECULE GENERATION
Multi-Layered 3D Garments Animation
Preventing Mode Collapse When Imitating Latent Policies from Observations
Unsupervised Learning of Structured Representations via Closed-Loop Transcription
DETRDistill: A Simple Knowledge Distillation Framework for DETR-Families
Closed Boundary Learning for NLP Classification Tasks with the Universum Class
Solving Constrained Variational Inequalities via a First-order Interior Point-based Method
MeGraph: Graph Representation Learning on Connected Multi-scale Graphs
Learning Reduced Fluid Dynamics
Symmetric Pruning in Quantum Neural Networks
Managing Temporal Resolution in Continuous Value Estimation: A Fundamental Trade-off
On the Robustness of Randomized Ensembles to Adversarial Perturbations
Minimum Variance Unbiased N:M Sparsity for the Neural Gradients
Incremental Learning of Structured Memory via Closed-Loop Transcription
Curved Data Representations in Deep Learning
When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning
Self-supervised debiasing using low rank regularization
Wasserstein Gradient Flows for Optimizing GMM-based Policies
Compositional Image Generation and Manipulation with Latent Diffusion Models
Neural Unbalanced Optimal Transport via Cycle-Consistent Semi-Couplings
Prompt Tuning for Graph Neural Networks
Budgeted Training for Vision Transformer
Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models
Understanding and Mitigating Robust Overfitting through the Lens of Feature Dynamics
DualMatch: Promoting Semi-Supervised Learning with Hierarchical Label and Contrastive Learning
Augmentative Topology Agents For Open-Ended Learning
Partial Differential Equation-Regularized Neural Networks: An Application to Image Classification
Learning to Boost Resilience of Complex Networks via Neural Edge Rewiring
Deep Transformer Q-Networks for Partially Observable Reinforcement Learning
Mind's Eye: Grounded Language Model Reasoning through Simulation
Visual Expertise and the Log-Polar Transform Explain Image Inversion Effects
Cross-Protein Wasserstein Transformer for Protein-Protein Interactions
What Do Self-Supervised Vision Transformers Learn?
Continuous Monte Carlo Graph Search
Confident Sinkhorn Allocation for Pseudo-Labeling
Rank-1 Matrix Completion with Gradient Descent and Small Random Initialization
Adversarial Robustness based on Randomized Smoothing in Quantum Machine Learning 
Multi-Vector Retrieval as Sparse Alignment
Sampled Transformer for Point Sets
Scaling Laws in Mean-Field Games
PartAfford: Part-level Affordance Discovery
On The Relative Error of Random Fourier Features for Preserving Kernel Distance
UTC-IE: A Unified Token-pair Classification Architecture for Information Extraction
Robust Quantity-Aware Aggregation for Federated Learning
Efficient debiasing with contrastive weight pruning
Linear Convergence of Decentralized FedAvg for Non-Convex Objectives: The Interpolation Regime
Rethinking Missing Modality Learning: From a Decoding View
Global Nash Equilibrium in a Class of Nonconvex N-player Games
UNDERSTANDING PURE CLIP GUIDANCE FOR VOXEL GRID NERF MODELS
Neural Semi-Counterfactual Risk Minimization
Task-Agnostic Online Meta-Learning in Non-stationary Environments
Meta-Weighted Language Model Tuning for Augmentation-Enhanced Few-Shot Learning
Online Reinforcement Learning via Posterior Sampling of Policy
NewModel: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing
Weakly Supervised Neuro-Symbolic Image Manipulation via Multi-Hop Complex Instructions
Graph Neural Networks for Aerodynamic Flow Reconstruction from Sparse Sensing
Learning Binary Networks on Long-Tailed Distributions
Deep Attention Pooling Graph Neural Network for Text Classification
Backdoor Mitigation by Correcting Activation Distribution Alteration
Pose Transfer using a Single Spatial Transformation
Local Distance Preserving Auto-encoders using Continuous k-Nearest Neighbours Graphs
Clustering for directed graphs using parametrized random walk diffusion kernels
Poisoning Generative Models to Promote Catastrophic Forgetting
Squeeze Training for Adversarial Robustness
Concealing Sensitive Samples for Enhanced Privacy in Federated Learning
Knowledge Unlearning for Mitigating Privacy Risks in Language Models
Understanding Graph Contrastive Learning From A Statistical Perspective
Revisiting the Activation Function for Federated Image Classification
Rethinking Knowledge Distillation with Raw Features for Semantic Segmentation
Open-domain Visual Entity Linking
Robustify Transformers with Robust Kernel Density Estimation
Pushing the Accuracy-Fairness Tradeoff Frontier with Introspective Self-play
MDPose: Real-Time Multi-Person Pose Estimation via Mixture Density Model
Learning to Predict Parameter for Unseen Data
PADDLES: Phase-Amplitude Spectrum Disentangled Early Stopping for Learning with Noisy Labels
UNREAL: Unlabeled Nodes Retrieval and Labeling for Heavily-imbalanced Node Classification
Textless Phrase Structure Induction from Visually-Grounded Speech
On Nullspace of Vision Transformers and What Does it Tell Us?
Max-Margin Works while Large Margin Fails: Generalization without Uniform Convergence
The batch size can affect inference results
Asymptotic Instance-Optimal Algorithms for Interactive Decision Making
GRAPHSENSOR: A Graph Attention Network for Time-Series Sensor Data
ProsodyBERT: Self-Supervised Prosody Representation for Style-Controllable TTS
FedDebias: Reducing the Local Learning Bias Improves Federated Learning on Heterogeneous Data
CRISP: Curriculum inducing Primitive Informed Subgoal Prediction for Hierarchical Reinforcement Learning
Near-Optimal Deployment Efficiency in Reward-Free Reinforcement Learning with Linear Function Approximation
Mitigating Out-of-Distribution Data Density Overestimation in Energy-Based Models
Provably efficient multi-task Reinforcement Learning in large state spaces
An Equal-Size Hard EM Algorithm for Diverse Dialogue Generation
NeuralEQ: Neural-Network-Based Equalizer for High-Speed Wireline Communication
Which is Better for Learning with Noisy Labels: The Semi-supervised Method or Modeling Label Noise?
The hidden uniform cluster prior in self-supervised learning
Revisiting Over-smoothing in Graph Neural Networks
Optical Flow Regularization of Implicit Neural Representations for Video Frame Interpolation
Mosaic Representation Learning for Self-supervised Visual Pre-training
Inverse Optimal Transport with Application to Contrastive Learning
Learning Multi-Object Positional Relationships via Emergent Communication
FluidLab: A Differentiable Environment for Benchmarking Complex Fluid Manipulation
Route, Interpret, Repeat: Blurring the Line Between Posthoc Explainability and Interpretable Models 
On Regularization for Explaining Graph Neural Networks: An Information Theory Perspective
The Dark Side of Invariance: Revisiting the Role of Augmentations in Contrastive Learning
Language model with Plug-in Knowledge Memory
Hierarchical Gaussian Mixture based Task Generative Model for Robust Meta-Learning
Game-Theoretic Understanding of Misclassification
The Final Ascent: When Bigger Models Generalize Worse on Noisy-Labeled Data
Long-Tailed Partial Label Learning via Dynamic Rebalancing
Task Ambiguity in Humans and Language Models
Learning from student's mistakes: Improving mean teacher for end-to-end semi-supervised video action detection
Equivariant Disentangled Transformation for Domain Generalization under Combination Shift
Best Possible Q-Learning
Analysis of Radio Localiser Networks under Distribution Shift
Winning Both the Accuracy of Floating Point Activation and the Simplicity of Integer Arithmetic
Tensor Decompositions For Temporal Knowledge Graph Completion with Time Perspective
Preference Transformer: Modeling Human Preferences using Transformers for RL
Flow Matching for Generative Modeling
Graph-informed Neural Point Process With Monotonic Nets
Targeted Hyperparameter Optimization with Lexicographic Preferences Over Multiple Objectives
Learning to Decouple Complex System for Sequential Data
Restoration based Generative Models
How hard are computer vision datasets? Calibrating dataset difficulty to viewing time
Self-Supervised Logit Adjustment
Proportional Amplitude Spectrum Training Augmentation for Synthetic-to-Real Domain Generalization
More Centralized Training, Still Decentralized Execution: Multi-Agent Conditional Policy Factorization
StepGCN: Step-oriented Graph Convolutional Networks in Representation Learning
GAPS: Few-Shot Incremental Semantic Segmentation via Guided Copy-Paste Synthesis
Edgeformers: Graph-Empowered Transformers for Representation Learning on Textual-Edge Networks
How Distinguishable Are Vocoder Models? Analyzing Vocoder Fingerprints for Fake Audio
Hierarchical Multi-Resolution Graph Generation Networks
Any-scale Balanced Samplers for Discrete Space
Stochastic Optimization under Strong Convexity and Lipschitz Hessian: Minimax Sample Complexity
BinSGDM: Extreme One-Bit Quantization for Communication Efficient Large-Scale Distributed Training
Gradient-based Algorithms for Pessimistic Bilevel Optimization
Equivariant Shape-Conditioned Generation of 3D Molecules for Ligand-Based Drug Design
Leaves: Learning Views for Time-Series Data in Contrastive Learning
The Eigenlearning Framework: A Conservation Law Perspective on Kernel Ridge Regression and Wide Neural Networks
Imbalanced Semi-supervised Learning with Bias Adaptive Classifier
COMNET: CORTICAL MODULES ARE POWERFUL
A Multi-objective Perspective towards Improving Meta-Generalization
Do We Always Need to Penalize Variance of Losses for Learning with Label Noise?
DeepGuiser: Learning to Disguise Neural Architectures for Impeding Adversarial Transfer Attacks
Network Controllability Perspectives on Graph Representation
FACS: FAST ADAPTIVE CHANNEL SQUEEZING
Pre-trained Language Models can be Fully Zero-Shot Learners
On Compositional Uncertainty Quantification for Seq2seq Graph Parsing
Generative Gradual Domain Adaptation with Optimal Transport
ENHANCING THE PRIVACY OF FEDERATED LEARNING THROUGH DATA SYNTHESIS
Free Lunch for Domain Adversarial Training: Environment Label Smoothing
DYNAMIC ENSEMBLE FOR PROBABILISTIC TIME-SERIES FORECASTING VIA DEEP REINFORCEMENT LEARNING
Scaling Forward Gradient With Local Losses
Recommendation with User Active Disclosing Willingness
PAC-NeRF: Physics Augmented Continuum Neural Radiance Fields for Geometry-Agnostic System Identification
Linearly Constrained Bilevel Optimization: A Smoothed Implicit Gradient Approach
Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning
Understanding Embodied Reference with Touch-Line Transformer
Evaluating Robustness of Cooperative MARL: A Model-based Approach
The Emergence of Prototypicality: Unsupervised Feature Learning in Hyperbolic Space
Rethinking Symbolic Regression Datasets and Benchmarks for Scientific Discovery
The Cost of Privacy in Fair Machine Learning
Coordinated Strategy Identification Multi-Agent Reinforcement Learning
VARIATIONAL ADAPTIVE GRAPH TRANSFORMER FOR MULTIVARIATE TIME SERIES MODELING
One-Vs-All AUC Maximization: an effective solution to the low-resource named entity recognition problem
Efficient Large-scale Transformer Training via Random and Layerwise Token Dropping
Towards Robust Dataset Learning
Demystifying black-box DNN training processes through Concept-Monitor
Generalization Mechanics in Deep Learning
Large Language Models Can Self-improve
Calibration Matters: Tackling Maximization Bias in Large-scale Advertising Recommendation Systems
Excess risk analysis for epistemic uncertainty with application to variational inference
Memorization-Dilation: Modeling Neural Collapse Under Noise
Spacetime Representation Learning
Meta-Learning General-Purpose Learning Algorithms with Transformers
Learning to Extrapolate: A Transductive Approach
Label-free Concept Bottleneck Models
COMBAT: Alternated Training for Near-Perfect Clean-Label Backdoor Attacks
Multi-level Protein Structure Pre-training via Prompt Learning
CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks
GLM-130B: An Open Bilingual Pre-trained Model
Causal Estimation for Text Data with (Apparent) Overlap Violations
Understanding Pruning at Initialization: An Effective Node-Path Balancing Perspective
Intrinsic Computational Complexity of Equivariant Neural Networks
Data Continuity Matters: Improving Sequence Modeling with Lipschitz Regularizer
MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations
Improving the Estimation of Instance-dependent Transition Matrix by using Self-supervised Learning
Holographic-(V)AE: an end-to-end SO(3)-Equivariant (Variational) Autoencoder in Fourier Space
A general differentially private learning framework for decentralized data
Wasserstein Barycenter-based Model Fusion and Linear Mode Connectivity of Neural Networks
Evaluating Robustness of Generative Models with Adversarial Networks
Weakly-Supervised Domain Adaptation in Federated Learning
PD-MORL: Preference-Driven Multi-Objective Reinforcement Learning Algorithm
When Majorities Prevent Learning: Eliminating Bias to Improve Worst-group and Out-of-distribution Generalization
Precautionary Unfairness in Self-Supervised Contrastive Pre-training
Understanding the Role of Nonlinearity in Training Dynamics of Contrastive Learning
Oracle-oriented Robustness: Robust Image Model Evaluation with Pretrained Models as Surrogate Oracle
Certified Robustness on Structural Graph Matching
CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis
Bayesian Optimal Experimental Design for the Survey Bandit Setting
Deep Contrastive Learning Approximates Ensembles of One-Class SVMs with Neural Tangent Kernels
Synchronized Contrastive Pruning for Efficient Self-Supervised Learning
VEHICLE-INFRASTRUCTURE COOPERATIVE 3D DETECTION VIA FEATURE FLOW PREDICTION
M-L2O: Towards Generalizable Learning-to-Optimize by Test-Time Fast Self-Adaptation
ReG-NAS: Graph Neural Network Architecture Search using Regression Proxy Task
Mesh-Independent Operator Learning for PDEs using Set Representations
ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning
Robust Multi-Agent Reinforcement Learning against Adversaries on Observation
Limitations of Piecewise Linearity for Efficient Robustness Certification
Forces are not Enough: Benchmark and Critical Evaluation for Machine Learning Force Fields with Molecular Simulations
Hypothetical Training for Robust Machine Reading Comprehension of Tabular Context
FlexRound: Learnable Rounding by Element-wise Division for Post-Training Quantization
Re-calibrating Feature Attributions for Model Interpretation
Adversarial Diversity in Hanabi
3D UX-Net: A Large Kernel Volumetric ConvNet Modernizing Hierarchical Transformer for Medical Image Segmentation
Multi-Reward Fusion: Learning from Other Policies by Distilling
Push and Pull: Competing Feature-Prototype Interactions Improve Semi-supervised Semantic Segmentation
MaskNeRF: Masked Neural Radiance Fields for Sparse View Synthesis
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 Small
Equivariant Descriptor Fields: SE(3)-Equivariant Energy-Based Models for End-to-End Visual Robotic Manipulation Learning
Anatomical Structure-Aware Image Difference Graph Learning for Difference-Aware Medical Visual Question Answering
Explaining Temporal Graph Models through an Explorer-Navigator Framework
Tackling Diverse Tasks via Cross-Modal Transfer Learning
Leveraged Asymmetric Loss with Disambiguation for Multi-label Recognition with One-Positive Annotations
Self-supervised Learning for Cell Segmentation and Quantification in Digital Pathology Images
Mitigating Demographic Bias of Federated Learning Models via Global Domain Smoothing
Safe Reinforcement Learning with Contrastive Risk Prediction
Analysis of differentially private synthetic data: a general measurement error approach
Imbalanced Lifelong Learning with AUC Maximization
On the Efficacy of Server-Aided Federated Learning against Partial Client Participation
Soft Neighbors are Positive Supporters in Contrastive Visual Representation Learning
LA-BALD: An Information-Theoretic Image Labeling Task Sampler
Text and Patterns: For Effective Chain of Thought It Takes Two to Tango
Offline RL for Natural Language Generation with Implicit Language Q Learning
MoCa: Cognitive Scaffolding for Language Models in Causal and Moral Judgment Tasks
Anchor Sampling for Federated Learning with Partial Client Participation
Lattice Convolutional Networks for Learning Ground States of Quantum Many-Body Systems
CLIPSep: Learning Text-queried Sound Separation with Noisy Unlabeled Videos
On the Soft-Subnetwork for Few-Shot Class Incremental Learning
Fairness-Aware Model-Based Multi-Agent Reinforcement Learning for Traffic Signal Control
Approximating How Single Head Attention Learns
Efficient Attention via Control Variates
Pathfinding Neural Cellular Automata
Learning to Optimize Quasi-Newton Methods
Toxicity in Multilingual Machine Translation at Scale
An Adaptive Policy to Employ Sharpness-Aware Minimization
Semi-Offline Reinforcement Learning for Portfolio Optimization
FedMT: Federated Learning with Mixed-type Labels
A Note on Quantifying the Influence of Energy Regularization for Imbalanced Classification
Penalizing the High-likelihood: A Novel Sampling Method for Open-ended Neural Text Generation via Inverse Probability Weighting
Unlearning with Fisher Masking
Augmented Lagrangian is Enough for Optimal Offline RL with General Function Approximation and Partial Coverage
Bandit Learning with General Function Classes: Heteroscedastic Noise and Variance-dependent Regret Bounds
A Semantic Hierarchical Graph Neural Network for Text Classification
Injecting Image Details into CLIP's Feature Space
Adaptive Sparse Softmax: An Effective and Efficient Softmax Variant for Text Classification
Stochastic Bridges as Effective Regularizers for Parameter-Efficient Tuning
Continuous Goal Sampling: A Simple Technique to Accelerate Automatic Curriculum Learning
What do Vision Transformers Learn? A Visual Exploration
When do Convolutional Neural Networks Stop Learning?
Detecting and Mitigating Indirect Stereotypes in Word Embeddings
OCIM: Object-centric Compositional Imagination for Visual Abstract Reasoning
How Weakly Supervised Information helps Contrastive Learning
A computational framework to unify representation similarity and function in biological and artificial neural networks
Turning a Curse Into a Blessing: Enabling Data-Free Backdoor Unlearning via Stabilized Model Inversion
Fairness and Accuracy under Domain Generalization
DROP: Conservative Model-based Optimization for Offline Reinforcement Learning
Language Models Can Teach Themselves to Program Better
Adaptive Kernel Selection for Convolutional Neural Network
NAPG: Non-Autoregressive Program Generation for Hybrid Tabular-Textual Question Answering
MVP: Multi-task Supervised Pre-training for Natural Language Generation
Learning Unified Representations for Multi-Resolution Face Recognition
Latent Bottlenecked Attentive Neural Processes
Improving Inductive Link Prediction through Learning Generalizable Node Representations
VoLTA: Vision-Language Transformer with Weakly-Supervised Local-Feature Alignment
Online Min-max Optimization: Nonconvexity, Nonstationarity, and Dynamic Regret
Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency
Towards Better Selective Classification
Offline Equilibrium Finding
ASGNN: Graph Neural Networks with Adaptive Structure
Iteratively Learning Novel Strategies with Diversity Measured in State Distances
Learning Kernelized Contextual Bandits in a Distributed and Asynchronous Environment
ATTRIBUTES RECONSTRUCTION IN HETEROGENEOUS NETWORKS VIA GRAPH AUGMENTATION
Graph Signal Sampling for Inductive One-Bit Matrix Completion: a Closed-form Solution
DocPrompting: Generating Code by Retrieving the Docs
Comparing semantic and morphological analogy completion in word embeddings
LipsFormer: Introducing Lipschitz Continuity to Vision Transformers
Automatic Chain of Thought Prompting in Large Language Models
Enforcing zero-Hessian in meta-learning
An efficient encoder-decoder architecture with top-down attention for speech separation
Treatment Effect Estimation with Collider Bias and Confounding Bias
Adaptive Weight Decay: On The Fly Weight Decay Tuning for Improving Robustness
Annealed Training for Combinatorial Optimization on Graphs
Machine Unlearning of Federated Clusters
Semi-Supervised Segmentation-Guided Tumor-Aware Generative Adversarial Network for Multi-Modality Brain Tumor Translation
Brainformers: Trading Simplicity for Efficiency
Control Graph as Unified IO for Morphology-Task Generalization
Learning to Generate Pseudo Anomalies
Effective Self-Supervised Transformers For Sparse Time Series Data
SpaceEvo: Searching Hardware-Friendly Search Space for Efficient Int8 Inference
Scalable feature selection via sparse learnable masks
On the Interplay Between Misspecification and Sub-optimality Gap: From Linear Contextual Bandits to Linear MDPs
HSVC: Transformer-based Hierarchical Distillation for Software Vulnerability Classification
HAS IT REALLY IMPROVED? KNOWLEDGE GRAPH BASED SEPARATION AND FUSION FOR RECOMMENDATION
Counterfactual Contrastive Learning for Robust Text Classification
SAM as an Optimal Relaxation of Bayes
Denoising MCMC for Accelerating Diffusion-Based Generative Models
Learning on Large-scale Text-attributed Graphs via Variational Inference
Efficient Shapley Values Estimation by Amortization for Text Classification
SplitMixer: Fat Trimmed From MLP-like Models
Multimedia Generative Script Learning for Task Planning
On Assimilating Learned Views in Contrastive Learning
Upcycled-FL: Improving Accuracy and Privacy with Less Computation in Federated Learning
Dataset Projection: Finding Target-aligned Subsets of Auxiliary Data
Rethinking Identity in Knowledge Graph Embedding
Eigen Memory Trees
Energy-based Predictive Representation for Reinforcement Learning
Which Invariance Should We Transfer? A Causal Minimax Learning Approach
Exclusive Supermask Subnetwork Training for Continual Learning
Dual personalization for federated recommendation on devices
Unsupervised Manifold Linearizing and Clustering
Moderate Coreset: A Universal Method of Data Selection for Real-world Data-efficient Deep Learning
Effectively Clarify Confusion via Visualized Aggregation and Separation of Deep Representation
Time-Transformer AAE: Connecting Temporal Convolutional Networks and Transformer for Time Series Generation
A comparison of dataset distillation and active learning in text classification
Temporally Consistent Video Transformer for Long-Term Video Prediction
Extreme Q-Learning: MaxEnt RL without Entropy
Autoencoding Hyperbolic Representation for Adversarial Generation
CAREER: Transfer Learning for Economic Prediction of Labor Data
Federated Nearest Neighbor Machine Translation
Latent Variable Representation for Reinforcement Learning
Look in The Mirror: Molecular Graph Contrastive Learning with Line Graph
Precision Collaboration for Federated Learning
RLSBench: A Large-Scale Empirical Study of Domain Adaptation Under Relaxed Label Shift
ROCO: A General Framework for Evaluating Robustness of Combinatorial Optimization Solvers on Graphs
Spatial reasoning as Object Graph Energy Minimization
Words are all you need? Language as an approximation for representational similarity
Graph Contrastive Learning with Reinforced Augmentation
A Novel Fast Exact Subproblem Solver for Stochastic Quasi-Newton Cubic Regularized Optimization
Block-Diagonal Structure Learning for Subspace Clustering
Decentralized Federated Learning via Overlapping Data Augmentation
Offline RL of the Underlying MDP from Heterogeneous Data Sources
An interpretable contrastive logical knowledge learning method for sentiment analysis
FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning
The Impact of Neighborhood Distribution in Graph Convolutional Networks
Training image classifiers using Semi-Weak Label Data
Confidence Estimation Using Unlabeled Data
Towards Class-Balanced Transductive Few-Shot Learning
Spectral Decomposition Representation for Reinforcement Learning
On Accelerated Perceptrons and Beyond
DITTO: Offline Imitation Learning with World Models
BAT-Chain: Bayesian-Aware Transport Chain for Topic Hierarchies Discovery
On the Importance of Calibration in Semi-supervised Learning
SoftMatch: Addressing the Quantity-Quality Tradeoff in Semi-supervised Learning
Unleashing the Potential of Data Sharing in Ensemble Deep Reinforcement Learning
Certifiably Robust Policy Learning against Adversarial Multi-Agent Communication
Node Importance Specific Meta Learning in Graph Neural Networks
Attention-Guided Backdoor Attacks against Transformers
Disentangling the Mechanisms Behind Implicit Regularization in SGD
Seq2Seq Pre-training with Dual-channel Recombination for Translation
Oracles and Followers: Stackelberg Equilibria in Deep Multi-Agent Reinforcement Learning
Structural Code Representation Learning for Auto-Vectorization
Overthinking the Truth: Understanding how Language Models process False Demonstrations
Sequential Attention for Feature Selection
Trusted Aggregation (TAG): Model Filtering Backdoor Defense In Federated Learning
What Deep Representations Should We Learn? -- A Neural Collapse Perspective
Improved Sample Complexity for Reward-free Reinforcement Learning under Low-rank MDPs
Re-Imagen: Retrieval-Augmented Text-to-Image Generator
BiViT: Exploring Binary Vision Transformers
Magnum: Tackling High-Dimensional Structures with Self-Organization
Provably Efficient Lifelong Reinforcement Learning with Linear Representation
Towards Adversarially Robust Deepfake Detection: An Ensemble Approach
Fast Adaptation via Human Diagnosis of Task Distribution Shift
Link Prediction with Non-Contrastive Learning
Distributed Differential Privacy in Multi-Armed Bandits
Thrust: Adaptively Propels Large Language Models with External Knowledge
Progress measures for grokking via mechanistic interpretability
Deep Bayesian Active Learning for Accelerating Stochastic Simulation
On the Mysterious Optimization Geometry of Deep Neural Networks
Implicit regularization via Spectral Neural Networks and non-linear matrix sensing
Goal-Space Planning with Subgoal Models
MET: Masked Encoding for Tabular data
Shortcut Learning Through the Lens of Early Training Dynamics
On $\mathcal{O}(1/K)$ Convergence and Low Sample Complexity for Single-Timescale Policy Evaluation with Nonlinear Function Approximation
Generating Features with Increased Crop-Related Diversity for Few-shot Object Detection
On the Implicit Bias Towards Depth Minimization in Deep Neural Networks
Prometheus: Endowing Low Sample and Communication Complexities to Constrained Decentralized Stochastic Bilevel Learning
SGD and Weight Decay Provably Induce a Low-Rank Bias in Neural Networks
MobileViTv3: Mobile-Friendly Vision Transformer with Simple and Effective Fusion of Local, Global and Input Features
PiFold: Toward effective and efficient protein inverse folding
A Theoretical Understanding of Vision Transformers: Learning, Generalization, and Sample Complexity
AlphaDesign: A graph protein design method and benchmark on AlphaFold DB
Transfer Learning with Context-aware Feature Compensation
Contrastive Learning Can Find An Optimal Basis For Approximately View-Invariant Functions
Learning Unsupervised Forward Models from Object Keypoints
K-SAM: Sharpness-Aware Minimization at the Speed of SGD
Copula Conformal Prediction for Multi-step Time Series Forecasting
Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning
Quantum 3D graph structure learning with applications to molecule computing
Distributional Signals for Node Classification in Graph Neural Networks
Vector Quantized Wasserstein Auto-Encoder
Exact Representation of Sparse Networks with Symmetric Nonnegative Embeddings
Skill-Based Reinforcement Learning with Intrinsic Reward Matching
A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning
TuneUp: A Training Strategy for Improving Generalization of Graph Neural Networks
Collecting The Puzzle Pieces: Disentangled Self-Driven Human Pose Transfer by Permuting Textures
A Scalable and Exact Gaussian Process Sampler via Kernel Packets
Model ChangeLists: Characterizing Changes in ML Prediction APIs
Provably Auditing Ordinary Least Squares in Low Dimensions
Exploring semantic information in disease: Simple Data Augmentation Techniques for Chinese Disease Normalization
Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy
Planning Goals for Exploration
Deep Direct Discriminative Decoders for High-dimensional Time-series Data Analysis
Learning Sparse Group Models Through Boolean Relaxation
LVQ-VAE: End-to-end Hyperprior-based Variational Image Compression with Lattice Vector Quantization
Direct Embedding of Temporal Network Edges via Time-Decayed Line Graphs
Neural DAG Scheduling via One-Shot Priority Sampling
TrajGRU-Attention-ODE: Novel Spatiotemporal Predictive Models
Learning-Based Radiomic Prediction of Type 2 Diabetes Mellitus Using Image-Derived Phenotypes
Efficiently Computing Nash Equilibria in Adversarial Team Markov Games
Meta Temporal Point Processes
EmbedDistill: A geometric knowledge distillation for information retrieval
Graph Neural Network-Inspired Kernels for Gaussian Processes in Semi-Supervised Learning
Deconstructing Distributions: A Pointwise Framework of Learning
Logical view on fairness of a binary classification task
Revisiting Instance-Reweighted Adversarial Training
Diffusion Models for Causal Discovery via Topological Ordering
Scalable and Equivariant Spherical CNNs by Discrete-Continuous (DISCO) Convolutions
Towards Solving Industrial Sequential Decision-making Tasks under Near-predictable Dynamics via Reinforcement Learning: an Implicit Corrective Value Estimation Approach
Graph Convolutional Normalizing Flows for Semi-Supervised Classification and Clustering
Weakly Supervised Explainable Phrasal Reasoning with Neural Fuzzy Logic
Simplified State Space Layers for Sequence Modeling
Learning Listwise Domain-Invariant Representations for Ranking
DCI-ES: An Extended Disentanglement Framework with Connections to Identifiability
Eigenvalue Initialisation and Regularisation for Koopman Autoencoders
Learning from Labeled Images and Unlabeled Videos for Video Segmentation
Score-based Generative 3D Mesh Modeling
Faster federated optimization under second-order similarity
Assessing Neural Network Robustness via Adversarial Pivotal Tuning of Real Images
REV: Information-Theoretic Evaluation of Free-Text Rationales
Examining the Difference Among Transformers and CNNs with Explanation Methods
A Quasistatic Derivation of Optimization Algorithms' Exploration on Minima Manifolds
Mutual Partial Label Learning with Competitive Label Noise
The Graph Learning Attention Mechanism: Learnable Sparsification Without Heuristics
Partial Label Unsupervised Domain Adaptation with Class-Prototype Alignment
Why Self Attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries
simpleKT: A Simple But Tough-to-Beat Baseline for Knowledge Tracing
Exp-$\alpha$: Beyond Proportional Aggregation in Federated Learning
Learning Efficient Hybrid Particle-continuum Representations of Non-equilibrium N-body Systems
Towards Large Scale Transfer Learning for Differentially Private Image Classification
Neural Network Approximation of Lipschitz Functions in High Dimensions with Applications to Inverse Problems
Weighted Ensemble Self-Supervised Learning
DOT: Fast Cell Type Deconvolution by Optimal Transport
Partially Observable RL with B-Stability: Unified Structural Condition and Sharp Sample-Efficient Algorithms
Bias Amplification Improves Worst-Group Accuracy without Group Information
Actionable Recourse Guided by User Preference
Large Learning Rate Matters for Non-Convex Optimization
A Deep Learning Framework for Musical Acoustics Simulations
Domain Generalization via Heckman-type Selection Models 
Moving Forward by Moving Backward: Embedding Action Impact over Action Semantics
Guiding Safe Exploration with Weakest Preconditions
MetaMD: Principled Optimiser Meta-Learning for Deep Learning
A Sample Based Method for Understanding The Decisions of Neural Networks Semantically
Deep Biological Pathway Informed Pathology-Genomic Multimodal Survival Prediction
A CMDP-within-online framework for Meta-Safe Reinforcement Learning
Active Sampling for Node Attribute Completion on Graphs
Effects of Graph Convolutions in Multi-layer Networks
SimPer: Simple Self-Supervised Learning of Periodic Targets
Explaining Patterns in Data with Language Models via Interpretable Autoprompting
Lipschitz regularized gradient flows and latent generative particles
Post-hoc Concept Bottleneck Models
Undersampling is a Minimax Optimal Robustness Intervention in Nonparametric Classification
Emb-GAM: an Interpretable and Efficient Predictor using Pre-trained Language Models
When Source-Free Domain Adaptation Meets Learning with Noisy Labels
Is a Caption Worth a Thousand Images? A Study on Representation Learning
Parameter-Efficient Fine-Tuning Design Spaces
Concept Gradient: Concept-based Interpretation Without Linear Assumption
FedCUAU: Clustered Federated Learning using weight divergence
Constraining Representations Yields Models That Know What They Don't Know
Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
CoMoE: Contrastive Mixture-of-Experts are Efficient Representation Learners
Mixed Federated Learning: Joint Decentralized and Centralized Learning
OTCOP: Learning optimal transport maps via constraint optimizations
An Extensible Multi-modal Multi-task Object Dataset with Materials
Sampling with Mollified Interaction Energy Descent
Does Zero-Shot Reinforcement Learning Exist?
Few-Shot Text Classification with Dual Contrastive Consistency Training
Self-Stabilization: The Implicit Bias of Gradient Descent at the Edge of Stability
Toward Discovering Options that Achieve Faster Planning
Conditional Permutation Invariant Flows
TILP: Differentiable Learning of Temporal Logical Rules on Knowledge Graphs
Hyperbolic Deep Reinforcement Learning
Learning Controllable Adaptive Simulation for Multi-scale Physics
Gated Neural ODEs: Trainability, Expressivity and Interpretability
Value-Based Membership Inference Attack on Actor-Critic Reinforcement Learning
Open-Vocabulary Object Detection upon Frozen Vision and Language Models
Learned Neural Network Representations are Spread Diffusely with Redundancy
Contrastive Graph Representation Learning with Cross-view Reconstruction
Neural DAEs: Constrained neural networks
Revisiting the Assumption of Latent Separability for Backdoor Defenses
Restricted Strong Convexity of Deep Learning Models with Smooth Activations
Koopman Neural Operator Forecaster for Time-series with Temporal Distributional Shifts
Uncertainty-Driven Exploration for Generalization in Reinforcement Learning
SPIDR: SDF-based Neural Point Fields for Illumination and Deformation
On Convergence of Federated Averaging Langevin Dynamics
Posthoc Privacy guarantees for neural network queries
MetaGL: Evaluation-Free Selection of Graph Learning Models via Meta-Learning
FOCUS: Fairness via Agent-Awareness for Federated Learning on Heterogeneous Data
A Simple Unsupervised Data Depth-based Method to Detect Adversarial Images
Co-Evolution As More Than a Scalable Alternative for Multi-Agent Reinforcement Learning
Adaptive Parametric Prototype Learning for Cross-Domain Few-Shot Classification
Minimum Description Length Control
RainProof: An Umbrella to Shield Text Generator from Out-Of-Distribution Data
Variance Double-Down: The Small Batch Size Anomaly in Multistep Deep Reinforcement Learning
PerFedMask: Personalized Federated Learning with Optimized Masking Vectors
Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
Variational Latent Branching Model for Off-Policy Evaluation
Building compact representations for image-language learning
Discretization Invariant Learning on Neural Fields
Dynamic Pretraining of Vision-Language Models
HEAV: Hierarchical Ensembling of Augmented Views for Image Captioning
Tuning Frequency Bias in Neural Network Training with Nonuniform Data
Global Counterfactual Explanations Are Reliable Or Efficient, But Not Both
Learning Multimodal Data Augmentation in Feature Space
Where to Begin? Exploring the Impact of Pre-Training and Initialization in Federated
BigVGAN: A Universal Neural Vocoder with Large-Scale Training
PaLI: A Jointly-Scaled Multilingual Language-Image Model
Achieving Sub-linear Regret in Infinite Horizon Average Reward Constrained MDP with Linear Function Approximation
Causal Imitation Learning via Inverse Reinforcement Learning
Amos: An Adam-style Optimizer with Adaptive Weight Decay towards Model-Oriented Scale
The Surprising Computational Power of Nondeterministic Stack RNNs
Critical Initialization of Wide and Deep Neural Networks through Partial Jacobians: General Theory and Applications
Agnostic Learning of General ReLU Activation Using Gradient Descent
Parametrizing Product Shape Manifolds by Composite Networks
CURE: A Pre-training Framework on Large-scale Patient Data for Treatment Effect Estimation
A Probabilistic Approach to Self-Supervised Learning using Cyclical Stochastic Gradient MCMC 
Tabular Data to Image Generation: Benchmark Data, Approaches, and Evaluation
Learning Hyper Label Model for Programmatic Weak Supervision
SlenderGNN: Accurate, Robust, and Interpretable GNN, and the Reasons for its Success
FedFA: Federated Feature Augmentation
Capsa: A Unified Framework for Quantifying Risk in Deep Neural Networks
Show and Write: Entity-aware Article Generation with Image Information
Noise$^+$2Noise: Co-taught De-noising Autoencoders for Time-Series Data
Adversarial Representation Learning for Canonical Correlation Analysis
BYPASSING THE STABILITY-PLASTICITY TRADEOFF TO REDUCE PREDICTIVE CHURN
Neural Implicit Manifold Learning for Topology-Aware Generative Modelling
LT-SNN: Self-Adaptive Spiking Neural Network for Event-based Classification and Object Detection
Characterizing neural representation of cognitively-inspired deep RL agents during an evidence accumulation task
Epistemological Bias As a Means for the Automated Detection of Injustices in News Media
Neural Constraint Inference: Inferring Energy Constraints in Interacting Systems
Self-supervised Continual Learning based on Batch-mode Novelty Detection
Stable Optimization of Gaussian Likelihoods
Break the Wall Between Homophily and Heterophily for Graph Representation Learning
Representing Latent Dimensions Using Compressed Number Lines
Efficient Sequence Packing without Cross-contamination: Accelerating Large Language Models without Impacting Performance
Cortically motivated recurrence enables task extrapolation
Learning Object-Centric Dynamic Modes from Video and Emerging Properties
Code Means More Than Plain Language: Bringing Syntax Structure Awareness To Algorithmic Problem Solution Generation
Is Adversarial Training Really a Silver Bullet for Mitigating Data Poisoning?
Offline Congestion Games: How Feedback Type Affects Data Coverage Requirement
Learning with Stochastic Orders
A Deep Reinforcement Learning Approach for Finding Non-Exploitable Strategies in Two-Player Atari Games
MEDFAIR: BENCHMARKING FAIRNESS FOR MEDICAL IMAGING
Does Decentralized Learning with Non-IID Unlabeled Data Benefit from Self Supervision?
Polarity is all you need to learn and transfer faster
On the Geometry of Reinforcement Learning in Continuous State and Action Spaces
Deep Invertible Approximation of Topologically Rich Maps between Manifolds
Malign Overfitting: Interpolation and Invariance are Fundamentally at Odds
Exploring and Exploiting Decision Boundary Dynamics for Adversarial Robustness
Countering the Attack-Defense Complexity Gap for Robust Classifiers
Evaluating Counterfactual Explainers
SMART: Sentences as Basic Units for Text Evaluation
A Reinforcement Learning Approach to Estimating Long-term Treatment Effects
Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier
Explaining Image Classification through Knowledge-aware Neuron Interpretation
Tier Balancing: Towards Dynamic Fairness over Underlying Causal Factors
Anamnesic Neural Differential Equations with Orthogonal Polynomial Projections
Neural Design for Genetic Perturbation Experiments
Conceptual SCAN: Learning With and About Rules
An alternative approach to train neural networks using monotone variational inequality
Have Missing Data? Make It Miss More! Imputing Tabular Data with Masked Autoencoding
Invertible normalizing flow neural networks by JKO scheme
Unsupervised learning of features and object boundaries from local prediction
Towards Causal Concepts for Explaining Language Models
TRIDE: A Temporal, Robust, and Informative Data Augmentation Framework for Disease Progression Modeling
Multi-Segmental Informational Coding for Self-Supervised Representation Learning
Rule-based policy regularization for reinforcement learning-based building control
Neural Graphical Models
AUGMENTING ZERO-SHOT DENSE RETRIEVERS WITH PLUG-IN MIXTURE-OF-MEMORIES
Efficient Discrete Multi Marginal Optimal Transport Regularization
AutoTransfer: AutoML with Knowledge Transfer - An Application to Graph Neural Networks
Meta-learning from demonstrations improves compositional generalization
Deep Dependency Networks for Action Classification in Video
Temporal Dependencies in Feature Importance for Time Series Prediction
Peaks2Image: Reconstructing fMRI Statistical Maps from Peaks
Bridging the Gap between Semi-supervised and Supervised Continual Learning via Data Programming
Characterizing the spectrum of the NTK via a power series expansion
Unmasking the Lottery Ticket Hypothesis: What's Encoded in a Winning Ticket's Mask?
A critical look at evaluation of GNNs under heterophily: Are we really making progress?
Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness
A Non-monotonic Self-terminating Language Model
Counterfactual Memorization in Neural Language Models
TT-Rules: Extracting & Optimizing Exact Rules of a CNN-Based Model - Application to Fairness
uGLAD: A deep learning model to recover conditional independence graphs
Quantifying Memorization Across Neural Language Models
Powderworld: A Platform for Understanding Generalization via Rich Task Distributions
Federated Self-supervised Learning for Heterogeneous Clients
ContraSim -- A Similarity Measure Based on Contrastive Learning
Learning to Segment from Noisy Annotations: A Spatial Correction Approach
PointConvFormer: Revenge of the Point-Based Convolution
Measuring Forgetting of Memorized Training Examples
Leveraging Human Features at Test-Time
Graduated Non-Convexity for Robust Self-Trained Language Understanding
On the Activation Function Dependence of the Spectral Bias of Neural Networks
MaskViT: Masked Visual Pre-Training for Video Prediction
Text Summarization with Oracle Expectation
MERMADE: $K$-shot Robust Adaptive Mechanism Design via Model-Based Meta-Learning
Continuous-time identification of dynamic state-space models by deep subspace encoding
Waveformer: Linear-Time Attention with Forward and Backward Wavelet Transform
SemSup-XC: Semantic Supervision for Extreme Classification
SaMoE: Parameter Efficient MoE Language Models via Self-Adaptive Expert Combination
How to Train your HIPPO: State Space Models with Generalized Orthogonal Basis Projections
Interpretable Debiasing of Vectorized Language Representations with Iterative Orthogonalization
Communication-Optimal Distributed Graph Clustering under Duplication Models
Unpacking Large Language Models with Conceptual Consistency
Graph in Graph Neural Network
LSTM-BASED-AUTO-BI-LSTM for Remaining Useful Life (RUL) Prediction: the first round of test results
Recurrent Real-valued Neural Autoregressive Density Estimator for Online Density Estimation and Classification of Streaming Data
Out-of-Distribution Detection and Selective Generation for Conditional Language Models
Layer Grafted Pre-training: Bridging Contrastive Learning And Masked Image Modeling For Better Representations
Structural Adversarial Objectives for Self-Supervised Representation Learning
VIMA: General Robot Manipulation with Multimodal Prompts
Discovering Latent Knowledge in Language Models Without Supervision
ModReduce: A Multi-Knowledge Distillation Framework with Online Learning
Prefix Conditioning Unifies Language and Label Supervision
Defending against Reconstruction attacks using Rényi Differential Privacy
Diffusion Adversarial Representation Learning for Self-supervised Vessel Segmentation
Reconciling Security and Communication Efficiency in Federated Learning
Semantic Image Manipulation with Background-guided Internal Learning
Pretraining the Vision Transformer using self-supervised methods for vision based Deep Reinforcement Learning
Noise Injection Node Regularization for Robust Learning
Q-Ensemble for Offline RL: Don't Scale the Ensemble, Scale the Batch Size
Efficient Edge Inference by Selective Query
Learning Intuitive Policies Using Action Features
Differentially Private $L_2$-Heavy Hitters in the Sliding Window Model
Scaling Convex Neural Networks with Burer-Monteiro Factorization
Human-level Atari 200x faster
Wide Graph Neural Network
Taming the Long Tail of Deep Probabilistic Forecasting
Approximate Conditional Coverage via Neural Model Approximations
AUTOJOIN: EFFICIENT ADVERSARIAL TRAINING FOR ROBUST MANEUVERING VIA DENOISING AUTOENCODER AND JOINT LEARNING
Private Data Stream Analysis for Universal Symmetric Norm Estimation
StarGraph: Knowledge Representation Learning based on Incomplete Two-hop Subgraph
Temporal Domain Generalization with Drift-Aware Dynamic Neural Networks
Leveraging Incompatibility to Defend Against Backdoor Poisoning
Towards Representative Subset Selection for Self-Supervised Speech Recognition
Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments
Integrating Episodic and Global Novelty Bonuses for Efficient Exploration
A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games
Dynamics-aware Skill Generation from Behaviourally Diverse Demonstrations
Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience
DiP-GNN: Discriminative Pre-Training of Graph Neural Networks
Learning to Act through Activation Function Optimization in Random Networks
Safer Reinforcement Learning with Counterexample-guided Offline Training
Pitfalls of Gaussians as a noise distribution in NCE
Scaling Laws for a Multi-Agent Reinforcement Learning Model
Risk Control for Online Learning Models
Generative Adversarial Training for Neural Combinatorial Optimization Models
Federated Learning with Openset Noisy Labels
Perfectly Secure Steganography Using Minimum Entropy Coupling
The power of choices in decision tree learning
Identifiability of Label Noise Transition Matrix 
Learning from Others: Similarity-based Regularization for Mitigating Artifacts
Calibrating Transformers via Sparse Gaussian Processes
Representation Learning via Consistent Assignment of Views over Random Partitions
Model Transferability with Responsive Decision Subjects 
Red PANDA: Disambiguating Anomaly Detection by Removing Nuisance Factors
Abstracting Imperfect Information Away from Two-Player Zero-Sum Games
Is Attention All That NeRF Needs?
STOCHASTIC NO-REGRET LEARNING FOR GENERAL GAMES WITH VARIANCE REDUCTION
The Dark Side of AutoML: Towards Architectural Backdoor Search
Generalization and Estimation Error Bounds for Model-based Neural Networks
Isometric Representations in Neural Networks Improve Robustness
Memory-efficient Trajectory Matching for Scalable Dataset Distillation
TAN without a burn: Scaling laws of DP-SGD
A sampling framework for value-based reinforcement learning
Enhancing Cross-Category Learning in Recommendation Systems with Multi-Layer Embedding Training
StructViT: Learning Correlation Structures for Vision Transformers
The Curse of Low Task Diversity: On the Failure of Transfer Learning to Outperform MAML and their Empirical Equivalence
Bi-Stride Multi-Scale Graph Neural Network for Mesh-Based Physical Simulation
Spatially Resolved Temporal Networks: Online Unsupervised Representation Learning of High Frequency Time Series
ChordMixer: A Scalable Neural Attention Model for Sequences with Different Length
Boosting Adversarial Transferability using Dynamic Cues
Taming Policy Constrained Offline Reinforcement Learning for Non-expert Demonstrations
Static Prediction of Runtime Errors by Learning to Execute Programs with External Resource Descriptions
Attentional Context Alignment for Multimodal Sequential Learning
Matching receptor to odorant with protein language and graph neural networks
REAP: A Large-Scale Realistic Adversarial Patch Benchmark
How does overparametrization affect performance on minority groups?
Federated Training of Dual Encoding Models on Small Non-IID Client Datasets
Offline Policy Comparison with Confidence: Benchmarks and Baselines
PRANC: Pseudo RAndom Networks for Compacting deep models
NTFields: Neural Time Fields for Physics-Informed Robot Motion Planning
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models
Biological connectomes as a representation for the architecture of artificial neural networks
MSQ-BioBERT: Ambiguity Resolution to Enhance BioBERT Medical Question-Answering
Part-Based Models Improve Adversarial Robustness
Asymmetric Certified Robustness via Feature-Convex Neural Networks
PGrad: Learning Principal Gradients For Domain Generalization
Learning Efficient Models From Few Labels By Distillation From Multiple Tasks
Extremely Simple Activation Shaping for Out-of-Distribution Detection
ZiCo: Zero-shot NAS via inverse Coefficient of Variation on Gradients
Statistical Guarantees for Consensus Clustering
Perceive, Ground, Reason, and Act: A Benchmark for General-purpose Visual Representation
Expressive Monotonic Neural Networks
Active Image Indexing
Towards Explaining Distribution Shifts
Perturbation Analysis of Neural Collapse
Learning Simultaneous Navigation and Construction in Grid Worlds 
Learning to CROSS exchange to solve min-max vehicle routing problems
PandA: Unsupervised Learning of Parts and Appearances in the Feature Maps of GANs
Compositional Law Parsing with Latent Random Functions
Pink Noise Is All You Need: Colored Noise Exploration in Deep Reinforcement Learning
LilNetX: Lightweight Networks with EXtreme Model Compression and Structured Sparsification
First-order Context-based Adaptation for Generalizing to New Dynamical Systems
CBP-QSNN: Spiking Neural Networks Quantized Using Constrained Backpropagation
Leveraging the Third Dimension in Contrastive Learning
O-ViT: Orthogonal Vision Transformer
STaSy: Score-based Tabular data Synthesis
REDUCING OVERSMOOTHING IN GRAPH NEURAL NETWORKS BY CHANGING THE ACTIVATION FUNCTION
Disentangled (Un)Controllable Features
Visual Prompt Tuning For Test-time Domain Adaptation
Mitigating Dataset Bias by Using Per-Sample Gradient
CAMA: A New Framework for Safe Multi-Agent Reinforcement Learning Using Constraint Augmentation
CWATR: Generating Richer Captions with Object Attributes
Internal Purity: A Differential Entropy based Internal Validation Index for Clustering Validation
Task Regularized Hybrid Knowledge Distillation For Continual Object Detection
Efficient Model Updates for Approximate Unlearning of Graph-Structured Data
Risk-aware Bayesian RL for Cautious Exploration
Change Detection for bi-temporal images classification based on Siamese Variational AutoEncoder and Transfer Learning
G-CEALS: Gaussian Cluster Embedding in Autoencoder Latent Space for Tabular Data Representation
Learning Top-k Classification with Label Ranking
QUANTIZATION AWARE FACTORIZATION FOR DEEP NEURAL NETWORK COMPRESSION
Populating memory in Continual Learning with Consistency Aware Sampling
Fairness of Federated Learning with Dynamic Participants
A Unified Algebraic Perspective on Lipschitz Neural Networks
AudioGen: Textually Guided Audio Generation
Faster Reinforcement Learning with Value Target Lower Bounding
Hebbian and Gradient-based Plasticity Enables Robust Memory and Rapid Learning in RNNs
Towards Minimax Optimal Reward-free Reinforcement Learning in Linear MDPs
PromptSum: Planning with Mixed Prompts for Parameter-Efficient Controllable Abstractive Summarization
Context and History Aware Other-Shaping
ReD-GCN: Revisit the Depth of Graph Convolutional Network
The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks
Multiple Modes for Continual Learning
A Theory of Equivalence-Preserving Program Embeddings
Multimodal Open-Vocabulary Video Classification via Vision and Language Models
On the Data-Efficiency with Contrastive Image Transformation in Reinforcement Learning
Energy-based Out-of-Distribution Detection for Graph Neural Networks
Theoretical Characterization of Neural Network Generalization with Group Imbalance
SDMuse: Stochastic Differential Music Editing and Generation via Hybrid Representation
Masked Autoencoders Enable Efficient Knowledge Distillers
Formal Interpretability with Merlin-Arthur Classifiers
Quasi-optimal Learning with Continuous Treatments
Generalization Bounds for Federated Learning: Fast Rates, Unparticipating Clients and Unbounded Losses
Contrastive Unsupervised Learning of World Model with Invariant Causal Features
GOING BEYOND 1-WL EXPRESSIVE POWER WITH 1-LAYER GRAPH NEURAL NETWORKS
System Identification as a Reinforcement Learning Problem
When to Trust Aggregated Gradients: Addressing Negative Client Sampling in Federated Learning
More ConvNets in the 2020s: Scaling up Kernels Beyond 51x51 using Sparsity
Language-Aware Soft Prompting for Vision & Language Foundation Models
On Structural Expressive Power of Graph Transformers
Projected Latent Distillation for Data-Agnostic Consolidation in Multi-Agent Continual Learning
Few-shot Cross-domain Image Generation via Inference-time Latent-code Learning
RLx2: Training a Sparse Deep Reinforcement Learning Model from Scratch
Black-Box Adversarial Attack Guided by Model Behavior for Programming Pre-trained Language Models
Learning Critically in Federated Learning with Noisy and Heterogeneous Clients
Rethinking Positive Sampling for Contrastive Learning with Kernel
Stationary Deep Reinforcement Learning with Quantum K-spin Hamiltonian Equation
Performance Disparities Between Accents in Automatic Speech Recognition
Do Perceptually Aligned Gradients Imply Robustness?
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Multitask Reinforcement Learning by Optimizing Neural Pathways
Input Perturbation Reduces Exposure Bias in Diffusion Models
Pruning Parameterization with Bi-level Optimization for Efficient Semantic Segmentation on the Edge
How deep convolutional neural networks lose spatial information with training
Which Layer is Learning Faster? A Systematic Exploration of Layer-wise Convergence Rate for Deep Neural Networks
Joint Embedding Self-Supervised Learning in the Kernel Regime
Linear convergence for natural policy gradient with log-linear policy parametrization
A non-asymptotic analysis of oversmoothing in Graph Neural Networks
Class-Incremental Learning with Repetition
Scaleformer: Iterative Multi-scale Refining Transformers for Time Series Forecasting
Theoretical Characterization of How Neural Network Pruning Affects its Generalization
Backdoors Stuck At The Frontdoor: Multi-Agent Backdoor Attacks That Backfire
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs
Interpolating Compressed Parameter Subspaces
Liquid Structural State-Space Models
Equivariant Hypergraph Diffusion Neural Operators
Ollivier-Ricci Curvature for Hypergraphs: A Unified Framework
gGN: learning to represent nodes in directed graphs as low-rank Gaussian distributions
Domain-Unified Prompt Representations for Source-Free Domain Generalization
Biases in Evaluation of Molecular Optimization Methods and Bias Reduction Strategies
Sharper Analysis of Sparsely Activated Wide Neural Networks with Trainable Biases
Hard-Meta-Dataset++: Towards Understanding Few-Shot Performance on Difficult Tasks
REVISITING PRUNING AT INITIALIZATION THROUGH THE LENS OF RAMANUJAN GRAPH
Self-supervised Speech Enhancement using Multi-Modal Data
Impulse Control Arbitration for A Dual System of Exploitation and Exploration
Sparse MoE with Random Routing as the New Dropout: Training Bigger and Self-Scalable Models
Deep Patch Visual Odometry
Compositional Semantic Parsing with Large Language Models
TiAda: A Time-scale Adaptive Algorithm For Nonconvex Minimax Optimization
Generalization Properties of Retrieval-based Models
Semi-Variance Reduction for Fair Federated Learning
Multi-Modality Alone is Not Enough: Generating Scene Graphs using Cross-Relation-Modality Tokens
FaiREE: fair classification with finite-sample and distribution-free guarantee
Bidirectional global to local attention for deep metric learning.
Deep Evidential Reinforcement Learning for Dynamic Recommendations
Exponential Generalization Bounds with Near-Optimal Rates for $L_q$-Stable Algorithms
Disentangling Learning Representations with Density Estimation
Coarse-to-fine Knowledge Graph Domain Adaptation based on Distantly-supervised Iterative Training
Teacher Guided Training: An Efficient Framework for Knowledge Transfer
Neural Agents Struggle to Take Turns in Bidirectional Emergent Communication
Class Interference of Deep Networks
Observational Robustness and Invariances in Reinforcement Learning via Lexicographic Objectives
SeedGNN: Graph Neural Network for Supervised Seeded Graph Matching
Siamese DETR
Provable Sharpness-Aware Minimization with Adaptive Learning Rate
Prompting GPT-3 To Be Reliable
Contrastive Graph Few-Shot Learning
Domain Generalization in Regression
Adversarial Training of Self-supervised Monocular Depth Estimation against Physical-World Attacks
Sparsity-Constrained Optimal Transport
A Risk-Averse Equilibrium for Multi-Agent Systems
SuperWeight Ensembles: Automated Compositional Parameter Sharing Across Diverse Architectures
Human alignment of neural network representations
Imitation Learning for Mean Field Games with Correlated Equilibria
EFFECTIVE FREQUENCY-BASED BACKDOOR ATTACKS WITH LOW POISONING RATIOS
Turning the Curse of Heterogeneity in Federated Learning into a Blessing for Out-of-Distribution Detection
Clustering and Ordering Variable-Sized Sets: The Catalog Problem
RangeAugment: Efficient Online Augmentation with Range Learning
How Predictors Affect Search Strategies in Neural Architecture Search?
Unbiased Stochastic Proximal Solver for Graph Neural Networks with Equilibrium States
Energy Transformer
Privacy-Preserving Vision Transformer on Permutation-Encrypted Images
Lightweight CNNs Under A Unifying Tensor View
DiGress: Discrete Denoising diffusion for graph generation
Sample Relationships through the Lens of Learning Dynamics with Label Information
Geometric Networks Induced by Energy Constrained Diffusion
Neural Lagrangian Schrödinger Bridge: Diffusion Modeling for Population Dynamics
Jump-Start Reinforcement Learning
GT-CausIn: a novel causal-based insight for traffic prediction
KerDEQ: Optimization induced Deep Equilibrium models via Gaussian Kernel
AD-NEGF: An End-to-End Differentiable Quantum Transport Simulator for Sensitivity Analysis and Inverse Problems
Incomplete to complete multiphysics forecasting - a hybrid approach for learning unknown phenomena
Bi-Level Dynamic Parameter Sharing among Individuals and Teams for Promoting Collaborations in Multi-Agent Reinforcement Learning
TCNL: Transparent and Controllable Network Learning Via Embedding Human-Guided Concepts
How to prepare your task head for finetuning
Gradient-based optimization is not necessary for generalization in neural networks
Uplift Modelling based on Graph Neural Network Combined with Causal Knowledge
Sequence to sequence text generation with diffusion models
Rethinking Deep Spiking Neural Networks: A Multi-Layer Perceptron Approach
Collaborative Symmetricity Exploitation for Offline Learning of Hardware Design Solver
From ChebNet to ChebGibbsNet
Policy Expansion for Bridging Offline-to-Online Reinforcement Learning
On The Implicit Bias of Weight Decay in Shallow Univariate ReLU Networks
Mitigating Memorization of Noisy Labels via Regularization between Representations
Graph Neural Networks are Inherently Good Generalizers: Insights by Bridging GNNs and Multi-Layer Perceptrons
Learning Cut Selection for Mixed-Integer Linear Programming via Hierarchical Sequence Model
BSTT: A Bayesian Spatial-Temporal Transformer for Sleep Staging
Self-Guided Noise-Free Data Generation for Efficient Zero-Shot Learning
Beyond re-balancing: distributionally robust augmentation against class-conditional distribution shift in long-tailed recognition
Improving Deep Policy Gradients with Value Function Search
MEDICAL IMAGE UNDERSTANDING WITH PRETRAINED VISION LANGUAGE MODELS: A COMPREHENSIVE STUDY
SPI-GAN: Denoising Diffusion GANs with Straight-Path Interpolations
Generated Distributions Are All You Need for Membership Inference Attacks Against Generative Models
Temporal Coherent Test Time Optimization for Robust Video Classification
Offline Communication Learning with Multi-source Datasets
A Learning Based Hypothesis Test for Harmful Covariate Shift
Less is More: Rethinking Few-Shot Learning and Recurrent Neural Nets
SynMotor: A Benchmark Suite for Object Attribute Regression and Multi-task Learning
Training via Confidence Ranking
Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation
Towards Understanding Robust Memorization in Adversarial Training
Self-supervised Geometric Correspondence for Category-level 6D Object Pose Estimation in the Wild
Incorporating Explicit Uncertainty Estimates into Deep Offline Reinforcement Learning
Non-parametric Outlier Synthesis
SC2EGSet: StarCraft II Esport Replay and Game-state Dataset
Latent-space disentanglement with untrained generator networks allows to isolate different motion types in video data
FV-MgNet: Fully Connected V-cycle MgNet for Interpretable Time Series Forecasting
Prosody-TTS: Self-Supervised Prosody Pretraining with Latent Diffusion For Text-to-Speech
Robust Self-Supervised Learning with Lie Groups
Self-Paced Learning Enhanced Physics-informed Neural Networks for Solving Partial Differential Equations
Moving Beyond Handcrafted Architectures in Self-Supervised Learning
Approximation and non-parametric estimation of functions over high-dimensional spheres via deep ReLU networks
Embedding Fourier for Ultra-High-Definition Low-Light Image Enhancement
Population-Based Reinforcement Learning for Combinatorial Optimization Problems
Adversarial Attack Detection Through Network Transport Dynamics
On the Relationship Between Adversarial Robustness and Decision Region in Deep Neural Networks
Confounder Identification-free Causal Visual Feature Learning
Enhanced Temporal Knowledge Embeddings with Contextualized Language Representations
Learning Adversarial Linear Mixture Markov Decision Processes with Bandit Feedback and Unknown Transition
Weakly Supervised Knowledge Transfer with Probabilistic Logical Reasoning for Object Detection
A Call to Reflect on Evaluation Practices for Failure Detection in Image Classification
Signs in the Lottery: Structural Similarities Between Winning Tickets
Computational Doob h-transforms for Online Filtering of Discretely Observed Diffusions
Adversarial Attack Detection Under Realistic Constraints
Reconciling feature sharing and multiple predictions with MIMO Vision Transformers
A Neural Mean Embedding Approach for Back-door and Front-door Adjustment
Chopping Formers is what you need in Vision
Towards Estimating Transferability using Hard Subsets
Data Pricing Mechanism Based on Property Rights Compensation Distribution
Incremental Unified Parameter Additional Tuning with Basic Memory Replaying
Knowledge-Driven Active Learning
FastDiff 2: Dually Incorporating GANs into Diffusion Models for High-Quality Speech Synthesis
When Neural ODEs meet Neural Operators
TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation
D4FT: A Deep Learning Approach to Kohn-Sham Density Functional Theory
FFCV: Accelerating Training by Removing Data Bottlenecks
Warping the Space: Weight Space Rotation for Class-Incremental Few-Shot Learning
Learning to Reason and Act in Cascading Processes
Noether Embeddings: Fast Temporal Association Mining
Searching optimal adjustment features for treatment effect estimation
Over-parameterized Model Optimization with Polyak-Łojasiewicz Condition
Differentially private Bias-Term Only Fine-tuning of Foundation Models
Jointly Learning Visual and Auditory Speech Representations from Raw Data
Diminishing Return of Value Expansion Methods in Model-Based Reinforcement Learning
Differentially Private Optimization on Large Model at Small Cost
Universal Graph Neural Networks without Message Passing
CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Alignment
Pre-training via Denoising for Molecular Property Prediction
Equivariant Energy-Guided SDE for Inverse Molecular Design
Contrastive Value Learning: Implicit Models for Simple Offline RL
CLIPPING: Distilling CLIP-based Models for Video-Language Understanding
To be private and robust: Differentially Private Optimizers Can Learn Adversarially Robust Models
Vectorial Graph Convolutional Networks
Traversing Between Modes in Function Space for Fast Ensembling
Poisson Process for Bayesian Optimization
DPMAC: Differentially Private Communication for Cooperative Multi-Agent Reinforcement Learning
$Q$-learning with regularization converges with non-linear non-stationary features
Polite Teacher: Semi-Supervised Instance Segmentation with Mutual Learning and Pseudo-Label Thresholding
Reducing Forgetting In Federated Learning with Truncated Cross-Entropy
On the Convergence and Calibration of Deep Learning with Differential Privacy
On the Feasibility of Cross-Task Transfer with Model-Based Reinforcement Learning
Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search
A Simple Yet Powerful Deep Active Learning With Snapshots Ensembles
Normalizing Flows for Interventional Density Estimation
Backdoor or Feature? A New Perspective on Data Poisoning
TAPPFL: TASK-AGNOSTIC PRIVACY-PRESERVING REPRESENTATION LEARNING FOR FEDERATED LEARNING AGAINST ATTRIBUTE INFERENCE ATTACKS
A Curriculum Perspective to Robust Loss Functions
Decoupled Training for Long-Tailed Classification With Stochastic Representations
IT-NAS: Integrating Lite-Transformer into NAS for Architecture Selection
Fine-Grained Source Code Vulnerability Detection via Graph Neural Networks
Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC
Randomized Adversarial Style Perturbations for Domain Generalization
Martingale Posterior Neural Processes
GuoFeng: A Discourse-aware Evaluation Benchmark for Language Understanding, Translation and Generation
Multi-View Independent Component Analysis with Shared and Individual Sources
Centralized Training with Hybrid Execution in Multi-Agent Reinforcement Learning
Towards Open Temporal Graph Neural Networks
FedMAE: Federated Self-Supervised Learning with One-Block Masked Auto-Encoder
Learning Discriminative Representations for Chromosome Classification with Small Datasets
APLA: Class-imbalanced Semi-supervised Learning with Adaptive Pseudo-labeling and Loss Adjustment
Label-Efficient Online Continual Object Detection in Streaming Video
ViewCo: Discovering Text-Supervised Segmentation Masks via Multi-View Semantic Consistency
Simplicity bias in $1$-hidden layer neural networks
FedEED: Efficient Federated Distillation with Ensemble of Aggregated Models
Critical Batch Size Minimizes Stochastic First-Order Oracle Complexity of Deep Learning Optimizer using Hyperparameters Close to One
Jointist: Simultaneous Improvement of Multi-instrument Transcription and Music Source Separation via Joint Training
Where prior learning can and can't work in unsupervised inverse problems
When are smooth-ReLUs ReLU-like?
Hypernetwork approach to Bayesian MAML
SpectraNet: multivariate forecasting and imputation under distribution shifts and missing data
An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems
Uncertainty and Traffic Light Aware Pedestrian Crossing Intention Prediction
Benchmarking Constraint Inference in Inverse Reinforcement Learning
Forward and Backward Lifelong Learning with Time-dependent Tasks
Memory Gym: Partially Observable Challenges to Memory-Based Agents
Token-level Fitting Issues of Seq2seq Models
Worst-case Few-shot Evaluation: Are Neural Networks Robust Few-shot Learners?
Learning Sampling Policy to Achieve Fewer Queries for Zeroth-Order Optimization
Discovering Policies with DOMiNO
Practical Real Video Denoising with Realistic Degradation Model
SpeedyZero: Mastering Atari with Limited Data and Time
HT-Net: Hierarchical Transformer based Operator Learning Model for Multiscale PDEs
Multi-Agent Multi-Game Entity Transformer
On the Convergence of Gradient Flow on Multi-layer Linear Models
Variational Counterfactual Prediction under Runtime Domain Corruption
Neural Architecture Design and Robustness: A Dataset
Does Deep Learning Learn to Abstract? A Systematic Probing Framework
Learning to mine approximate network motifs
Automatic Clipping: Differentially Private Deep Learning Made Easier and Stronger
Trust Your $\nabla$: Gradient-based Intervention Targeting for Causal Discovery
Improving Out-of-distribution Generalization with Indirection Representations
Accelerating Guided Diffusion Sampling with Splitting Numerical Methods
RealSinger: Ultra-Realistic Singing Voice Generation via Stochastic Differential Equations
Homeomorphism Alignment in Two Spaces for Unsupervised Domain Adaptation
Demystifying Approximate RL with $\epsilon$-greedy Exploration: A Differential Inclusion View
Batch Multivalid Conformal Prediction
Leveraging Online Semantic Point Fusion for 3D-Aware Object Goal Navigation
Transferring Pretrained Diffusion Probabilistic Models
ELBO-ing Stein Mixtures
Source-Target Coordinated Training with Multi-head Hybrid-Attention for Domain Adaptive Semantic Segmentation
On the Role of Self-supervision in Deep Multi-view Clustering
Schedule-Robust Online Continual Learning
On the Usefulness of Embeddings, Clusters and Strings for Text Generation Evaluation
A Simple, Yet Effective Approach to Finding Biases in Code Generation
Attention Enables Zero Approximation Error
Revisiting Activation Function Design for Improving Adversarial Robustness at Scale
Contrastive Hierarchical Clustering
What Does Vision Supervision Bring to Language Models? A Case Study of CLIP
Accurate Bayesian Meta-Learning by Accurate Task Posterior Inference
Learning to Decompose Visual Features with Latent Textual Prompts
ML-ViG: Multi-Label Image Recognition with Vision Graph Convolutional Network
Skill Machines: Temporal Logic Composition in Reinforcement Learning
Surrogate Gradient Design for LIF networks
Context-enriched molecule representations improve few-shot drug discovery
The Multiple Subnetwork Hypothesis: Enabling Multidomain Learning by Isolating Task-Specific Subnetworks in Feedforward Neural Networks
Warped Convolutional Networks: Bridge Homography to $\mathfrak{sl}(3)$ algebra by Group Convolution
Delving into the Openness of CLIP
Test-Time Adaptation via Self-Training with Nearest Neighbor Information
Learning to Counter: Stochastic Feature-based Learning for Diverse Counterfactual Explanations
Accurate Neural Training with 4-bit Matrix Multiplications at Standard Formats
SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
Relative representations enable zero-shot latent space communication
oViT: An Accurate Second-Order Pruning Framework for Vision Transformers
Learning Basic Interpretable Factors from Temporal Signals via Physics Symmetry
Addressing High-dimensional Continuous Action Space via Decomposed Discrete Policy-Critic
Unsupervised Manifold Alignment with Joint Multidimensional Scaling
Can Single-Pass Contrastive Learning Work for Both Homophilic and Heterophilic Graph?
TOAST: Topological Algorithm for Singularity Tracking
Robust Manifold Estimation Approach for Evaluating Fidelity and Diversity
Disentangling Writer and Character Styles for Handwriting Generation
Exploiting Certified Defences to Attack Randomised Smoothing
Simple and Scalable Nearest Neighbor Machine Translation
On the effectiveness of out-of-distribution data in self-supervised long-tail learning.
Dynamic Update-to-Data Ratio: Minimizing World Model Overfitting
Deep Leakage from Model in Federated Learning
A Universal 3D Molecular Representation Learning Framework
CAPE: Channel-Attention-Based PDE Parameter Embeddings for SciML
Topic and Hyperbolic Transformer to Handle Multi-modal Dependencies
Variance Covariance Regularization Enforces Pairwise Independence in Self-Supervised Representations
DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems
Restricted Generative Projection for One-Class Classification and Anomaly detection
Name Your Colour For the Task: Artificially Discover Colour Naming via Colour Quantisation Transformer
The Generalized Eigenvalue Problem as a Nash Equilibrium
learning hierarchical multi-agent cooperation with long short-term intention
FEAT: A general framework for Feature-aware Multivariate Time-series Representation Learning
Learning with Auxiliary Activation for Memory-Efficient Training
Equivariant 3D-Conditional Diffusion Models for Molecular Linker Design
Existence of a bad local minimum of neural networks with general smooth activation functions
Language Modelling with Pixels
Sinkhorn Discrepancy for Counterfactual Generalization
Massively Scaling Heteroscedastic Classifiers
Vera Verto: Multimodal Hijacking Attack
On Incremental Learning with Long Short Term Strategy
Joint Attention-Driven Domain Fusion and Noise-Tolerant Learning for Multi-Source Domain Adaptation
Efficient Point Cloud Geometry Compression Through Neighborhood Point Transformer
EA-HAS-Bench: Energy-aware Hyperparameter and Architecture Search Benchmark
Breaking the Curse of Dimensionality for Parametric Elliptic PDEs
UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
Dynamical Equations With Bottom-up Self-Organizing Properties Learn Accurate Dynamical Hierarchies Without Any Loss Function
Multi-Label Knowledge Distillation
How and Why We Detect Distribution Shift: Critical Analysis of Methods and Benchmarks
ADVERSARY-AWARE PARTIAL LABEL LEARNING WITH LABEL DISTILLATION
Structural Privacy in Graphs
KnowDA: All-in-One Knowledge Mixture Model for Data Augmentation in Low-Resource NLP
Distill Vision Transformers to CNNs via Low-Rank Representation Approximation
Learning Graph Neural Network Topologies
Finding the global semantic representation in GAN through Fréchet Mean
Identical Initialization: A Universal Approach to Fast and Stable Training of Neural Networks
Addressing Parameter Choice Issues in Unsupervised Domain Adaptation by Aggregation
MARS: Meta-learning as Score Matching in the Function Space
Faster Gradient-Free Methods for Escaping Saddle Points
$\textrm{D}^3\textrm{Former}$: Debiased Dual Distilled Transformer for Incremental Learning
Symmetrical SyncMap for Imbalanced General Chunking Problems
Solving Partial Label Learning Problem with Multi-Agent Reinforcement Learning
Uncovering the Effectiveness of Calibration on Open Intent Classification
PMixUp: Simultaneous Utilization of Part-of-Speech Replacement and Feature Space Interpolation for Text Data Augmentation
SDT: Specific Domain Training in Domain Generalization
Lossy Compression with Gaussian Diffusion
Score-Based Graph Generative Modeling with Self-Guided Latent Diffusion
Gradient-Informed Quality Diversity for the Illumination of Discrete Spaces
Deep Generative Wasserstein Gradient Flows
Linear Scalarization for Byzantine-Robust Learning on non-IID data
Where to Go Next for Recommender Systems? ID- vs. Modality-based recommender models revisited
Pixel-Level Task Helps Pruned Network Transfer to Downstream Tasks
Is Class Incremental Learning Truly Learning Representations Continually?
Optimising 2D Pose Representation: Improving Accuracy, Stability and Generalisability in Unsupervised 2D-3D Human Pose Estimation
Model Obfuscation for Securing Deployed Neural Networks
Optimising Event-Driven Spiking Neural Network with Regularisation and Cutoff
ESP: Exponential Smoothing on Perturbations for Increasing Robustness to Data Corruptions
MATS: Memory Attention for Time-Series forecasting
MultiViz: Towards Visualizing and Understanding Multimodal Models
How Informative is the Approximation Error from Tensor Decomposition for Neural Network Compression?
DISCO-DANCE: Learning to Discover Skills with Guidance
Exploring Generalization of Non-Contrastive self-supervised Learning
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN
Blurring Diffusion Models
BrGANs: Stabilizing GANs' Training Process with Brownian Motion Control
Detecting Backdoor Attacks via Layer-wise Feature Analysis
Hyperbolic Self-paced Learning for Self-supervised Skeleton-based Action Representations
Unfair geometries: exactly solvable data model with fairness implications
DropAut: Automatic Dropout Approaches to learn and adapt Drop Rates
Understanding Adversarial Transferability in Federated Learning
RankCSE: Unsupervised Sentence Representations Learning via Learning to Rank
Efficient Offline Policy Optimization with a Learned Model
New Insights for the Stability-Plasticity Dilemma in Online Continual Learning
MixPro: Data Augmentation with MaskMix and Progressive Attention Labeling for Vision Transformer
Multiple Invertible and Equivariant Transformation for Disentanglement in VAEs
StyleMorph: Disentangling Shape, Pose and Appearance through 3D Morphable Image and Geometry Generation
Accelerated Riemannian Optimization: Handling Constraints to Bound Geometric Penalties
Searching Lottery Tickets in Graph Neural Networks: A Dual Perspective
Video Scene Graph Generation from Single-Frame Weak Supervision
Planning With Uncertainty: Deep Exploration in Model-Based Reinforcement Learning
Unsupervised visualization of image datasets using contrastive learning
On the Expressive Equivalence Between Graph Convolution and Attention Models
Contrastive Consistent Representation Distillation
PowerQuant: Automorphism Search for Non-Uniform Quantization
CLEEGN: A Convolutional Neural Network for Plug-and-Play Automatic EEG Reconstruction
Neural Layered Min-sum Decoders for Algebraic Codes
Deep Gaussian Process State-Space Model for Motion Generation via Stochastic Expectation Propagation
On Uni-modal Feature Learning in Multi-modal Learning
Unified neural representation model for physical and conceptual spaces
Symbolic Physics Learner: Discovering governing equations via Monte Carlo tree search
The Dynamic of Consensus in Deep Networks and the Identification of Noisy Labels
Efficient block contrastive learning via parameter-free meta-node approximation
Attribute Alignment and Enhancement for Generalized Zero-Shot Learning
BAYES RISK CTC: CONTROLLABLE CTC ALIGNMENT IN SEQUENCE-TO-SEQUENCE TASKS
A Convergent Single-Loop Algorithm for Gromov-Wasserstein in Graph Data
The Importance of Suppressing Complete Reconstruction in Autoencoders for Unsupervised Outlier Detection
FrAug: Frequency Domain Augmentation for Time Series Forecasting
A Hierarchical Hyper-rectangle Mass Model for Fine-grained Entity Typing
Bayesian semi-supervised learning with a principled likelihood from a generative model of data curation
Continual Learning via Adaptive Neuron Selection
Revisiting Fast Adversarial Training
Ti-MAE: Self-Supervised Masked Time Series Autoencoders
E3Bind: An End-to-End Equivariant Network for Protein-Ligand Docking
Deep High-Frequency Extrapolation for Neuronal Spike Restoration
Improving Model Consistency of Decentralized Federated Learning via Sharpness Aware Minimization and Multiple Gossip Approaches
VA-DepthNet: A Variational Approach to Single Image Depth Prediction
Prompt-to-Prompt Image Editing with Cross-Attention Control
ExtraMix: Extrapolatable Data Augmentation for Regression using Generative Models
Exact Group Fairness Regularization via Classwise Robust Optimization
Lightweight Uncertainty for Offline Reinforcement Learning via Bayesian Posterior
DiffEdit: Diffusion-based semantic image editing with mask guidance
Are More Layers Beneficial to Graph Transformers?
Learning Combinatorial Node Labeling Algorithms
Simplicial Hopfield networks
Volumetric Disentanglement for 3D Scene Manipulation
Versatile Neural Processes for Learning Implicit Neural Representations
Supplementing Domain Knowledge to BERT with Semi-structured Information of Documents
Window Projection Features are All You Need for Time Series Anomaly Detection
DEEP ACCURATE SOLVER FOR THE GEODESIC PROBLEM
PBFormer: Capturing Complex Scene Text Shape with Polynomial Band Transformer
Classically Approximating Variational Quantum Machine Learning with Random Fourier Features
Distributional Meta-Gradient Reinforcement Learning
Towards A Unified Policy Abstraction Theory and Representation Learning Approach in Markov Decision Processes
Inapplicable Actions Learning for Knowledge Transfer in Reinforcement Learning
CENTROID-BASED JOINT REPRESENTATION FOR HUMAN POSE ESTIMATION AND INSTANCE SEGMENTATION
Addressing Variable Dependency in GNN-based SAT Solving
Pairwise Confidence Difference on Unlabeled Data is Sufficient for Binary Classification
Emergent Communication with Attention
Discovering Bugs in Vision Models using Off-the-shelf Image Generation and Captioning
MetaFS: An Effective Wrapper Feature Selection via Meta Learning
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Signal to Sequence Attention-Based Multiple Instance Network for Segmentation Free Inference of RNA Modifications
Interval-based Offline Policy Evaluation without Sufficient Exploration or Realizability
A Differential Geometric View and Explainability of GNN on Evolving Graphs
$\rm A^2Q$: Aggregation-Aware Quantization for Graph Neural Networks
Text-Driven Generative Domain Adaptation with Spectral Consistency Regularization
Multi-Prompt Alignment for Multi-source Unsupervised Domain Adaptation
Adversarial Examples Guided Pseudo-label Refinement for Decentralized Domain Adaptation
Clean-image Backdoor: Attacking Multi-label Models with Poisoned Labels Only
Dense Correlation Fields for Motion Modeling in Action Recognition
Variance Reduction is an Antidote to Byzantines: Better Rates, Weaker Assumptions and Communication Compression as a Cherry on the Top
What's Behind the Mask: Estimating Uncertainty in Image-to-Image Problems
A Time-Consistency Curriculum for Learning from Instance-Dependent Noisy Labels
Black-box Knowledge Distillation
Open Set Recognition by Mitigating Prompt Bias
Efficient Personalized Federated Learning via Sparse Model-Adaptation
Molecule Generation for Target Receptor Binding via Continuous Normalizing Flows
Momentum Tracking: Momentum Acceleration for Decentralized Deep Learning on Heterogeneous Data
Deep Graph-Level Orthogonal Hypersphere Compression for Anomaly Detection
GPR-Net: Multi-view Layout Estimation via a Geometry-aware Panorama Registration Network
Gradient Deconfliction via Orthogonal Projections onto Subspaces For Multi-task Learning
Relative Contribution Mechanism: A Unified Paradigm for Disassembling Convolutional Neural Networks
Pareto Optimization for Active Learning under Out-of-Distribution Data Scenarios
Self-Consistent Learning: Cooperation between Generators and Discriminators
Learning Dynamical Characteristics with Neural Operators for Data Assimilation
Lost Domain Generalization Is a Natural Consequence of Lack of Training Domains
Graph Neural Networks for Link Prediction with Subgraph Sketching
Leveraging Hard Negative Priors for Automatic Medical Report Generation
Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning
Style Balancing and Test-Time Style Shifting for Domain Generalization
Least Disagree Metric-based Active Learning
Personalized Federated Hypernetworks for Privacy Preservation in Multi-Task Reinforcement Learning
NSCL: Noise-Resistant Soft Contrastive Learning for Universal Domain Adaptation
Global-Local Bayesian Transformer for Semantic Correspondence
Semantic Category Discovery with Vision-language Representations
Deep Causal Generative Modeling for Tabular Data Imputation and Intervention
CBLab: Scalable Traffic Simulation with Enriched Data Supporting
Personalized Decentralized Bilevel Optimization over Stochastic and Directed Networks
ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading
Learning Object Affordance with Contact and Grasp Generation
Deep Graph-Level Clustering Using Pseudo-Label-Guided Mutual Information Maximization Network
Deep Generative Model based Rate-Distortion for Image Downscaling Assessment
Selective Classifier Ensemble
Better Generative Replay for Continual Federated Learning
Unified Probabilistic Modeling of Image Aesthetic Rating Distributions towards Measuring Subjectivity
Enhancing the Transferability of Adversarial Examples via a Few Queries and Fuzzy Domain Eliminating
Analyzing adversarial robustness of vision transformers against spatial and spectral attacks
Label-distribution-agnostic Ensemble Learning on Federated Long-tailed Data
MULTI-VIEW DEEP EVIDENTIAL FUSION NEURAL NETWORK FOR ASSESSMENT OF SCREENING MAMMOGRAMS
Data-Free Continual Graph Learning
Generative Modelling with Inverse Heat Dissipation
Self-supervision through Random Segments with Autoregressive Coding (RandSAC)
Rarity Score: A New Metric to Evaluate the Uncommonness of Synthesized Images
Benchmarking Approximate k-Nearest Neighbour Search for Big High Dimensional Dynamic Data
Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories
E-Forcing: Improving Autoregressive Models by Treating it as an Energy-Based One
Joint Generator-Ranker Learning for Natural Language Generation
The Progressive Alignment-aware Multimodal Fusion with Easy2hard Strategy for Multimodal Neural Machine Translation
Masked Vector Quantization
On the Importance of the Policy Structure in Offline Reinforcement Learning
Bandit Learning in Many-to-one Matching Markets with Uniqueness Conditions
Can you Trust your Disentanglement?
TRANSFORMER-PATCHER: ONE MISTAKE WORTH ONE NEURON
Corrupted Image Modeling for Self-Supervised Visual Pre-Training
Semi-Implicit Variational Inference via Score Matching
Sharper Bounds for Uniformly Stable Algorithms with Stationary $\varphi$-mixing Process
Few-Shot Anomaly Detection on Industrial Images through Contrastive Fine-Tuning
Rate-Distortion Optimized Post-Training Quantization for Learned Image Compression
On the Edge of Benign Overfitting: Label Noise and Overparameterization Level
Predictive Inference with Feature Conformal Prediction
Measuring Image Complexity as a Discrete Hierarchy using MDL Clustering
Recon: Reducing Conflicting Gradients From the Root For Multi-Task Learning
OCD: Learning to Overfit with Conditional Diffusion Models
Measure the Predictive Heterogeneity
On the robustness of self-supervised models for generative spoken language modeling
Non-equispaced Fourier Neural Solvers for PDEs
Time to augment visual self-supervised learning
Adversarial IV Regression for Demystifying Causal Features on Adversarial Examples
Probable Dataset Searching Method with Uncertain Dataset Information in Adjusting Architecture Hyper Parameter
Impact of the Last Fully Connected Layer on Out-of-distribution Detection
Towards Lightweight, Model-Agnostic and Diversity-Aware Active Anomaly Detection
Multi-Level Contrastive Learning for Dense Prediction Task
Switching One-Versus-the-Rest Loss to Increase Logit Margins for Adversarial Robustness
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
Scaled Neural Multiplicative Model for Tractable Optimization
Quasi-Taylor Samplers for Diffusion Generative Models based on Ideal Derivatives
Group-oriented Cooperation in Multi-Agent Reinforcement Learning
Exploring Temporally Dynamic Data Augmentation for Video Recognition
CacheGNN: Enhancing Graph Neural Networks with Global Information Caching
Towards Information-Theoretic Pattern Mining in Time Series
Agent Prioritization with Interpretable Relation for Trajectory Prediction
$z$-SignFedAvg: A Unified Stochastic Sign-based Compression for Federated Learning
Transfer Learning with Pre-trained Conditional Generative Models
DECN: Evolution Inspired Deep Convolution Network for Black-box Optimization
Q-Pensieve: Boosting Sample Efficiency of Multi-Objective RL Through Memory Sharing of Q-Snapshots
On the Power-Law Hessian Spectra in Deep Learning
Optformer: Beyond Transformer for Black-box Optimization
Deformable Graph Transformer
Exact manifold Gaussian Variational Bayes
SuperMarioDomains: Generalizing to Domains with Evolving Graphics
Variance-Aware Sparse Linear Bandits
Multi-Treatment Effect Estimation with Proxy: Contrastive Learning and Rank Weighting
CircNet: Meshing 3D Point Clouds with Circumcenter Detection
In-sample Actor Critic for Offline Reinforcement Learning
Leveraging Future Relationship Reasoning for Vehicle Trajectory Prediction
DeepTime: Deep Time-index Meta-learning for Non-stationary Time-series Forecasting
Non-Parametric State-Space Models: Identifiability, Estimation and Forecasting
ETSformer: Exponential Smoothing Transformers for Time-series Forecasting
LMSeg: Language-guided Multi-dataset Segmentation
Horizon-Free Reinforcement Learning for Latent Markov Decision Processes
Learning Invariant Features for Online Continual Learning
RoPAWS: Robust Semi-supervised Representation Learning from Uncurated Data
Treeformer: Dense Gradient Trees for Efficient Attention Computation
Visual Reinforcement Learning with Self-Supervised 3D Representations
ODAM: Gradient-based Instance-Specific Visual Explanations for Object Detection
Understanding Curriculum Learning in Policy Optimization for Online Combinatorial Optimization
Toward Adversarial Training on Contextualized Language Representation
Efficient Method for Bi-level Optimization with Non-smooth Lower-Level Problem
Estimating Riemannian Metric with Noise-Contaminated Intrinsic Distance
In Search of Smooth Minima for Purifying Backdoor in Deep Neural Networks
Joint Gaussian Mixture Model for Versatile Deep Visual Model Explanation
Gromov-Wasserstein Autoencoders
Localized Graph Contrastive Learning
OrthoReg: Improving Graph-regularized MLPs via Orthogonality Regularization
Group-Equivariant Transformers Without Positional Encoding
CUSTOMIZING PRE-TRAINED DIFFUSION MODELS FOR YOUR OWN DATA
Optimal Activation Functions for the Random Features Regression Model
Deep Learning-based Source Code Complexity Prediction
Learning to Learn with Generative Models of Neural Network Checkpoints
Improving Explanation Reliability through Group Attribution
SpeechLM: Enhanced Speech Pre-Training with Unpaired Textual Data
Uncertainty Guided Depth Fusion for Spike Camera
Personalized Semantics Excitation for Federated Image Classification
Intrinsic Motivation via Surprise Memory
Dr-Fairness: Dynamic Data Ratio Adjustment for Fair Training on Real and Generated Data
Unsupervised Object-Centric Learning with Bi-level Optimized Query Slot Attention
Set-Level Self-Supervised Learning from Noisily-Labeled Data
Learning an Invertible Output Mapping Can Mitigate Simplicity Bias in Neural Networks
Theoretical generalization bounds for improving the efficiency of deep online training
EUCLID: Towards Efficient Unsupervised Reinforcement Learning with Multi-choice Dynamics Model
A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning
FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning
A Representation Bottleneck of Bayesian Neural Networks
Maximizing Spatio-Temporal Entropy of Deep 3D CNNs for Efficient Video Recognition
Cycle to Clique (Cy2C) Graph Neural Network: A Sight to See beyond Neighborhood Aggregation
Latent State Marginalization as a Low-cost Approach to Improving Exploration
TensorVAE: A Direct Generative Model for Molecular Conformation Generation driven by Novel Feature Engineering
Smoothed-SGDmax: A Stability-Inspired Algorithm to Improve Adversarial Generalization
Generalizing and Decoupling Neural Collapse via Hyperspherical Uniformity Gap
Bias Mimicking: A Simple Sampling Approach for Bias Mitigation
MaskFusion: Feature Augmentation for Click-Through Rate Prediction via Input-adaptive Mask Fusion
Finite-time Analysis of Single-timescale Actor-Critic on Linear Quadratic Regulator
From Coarse to Fine-grained Concept based Discrimination for Phrase Detection
Scalable 3D Object-centric Learning
Towards Boosting the Open-Domain Chatbot with Human Feedback
Learning to Generate All Feasible Actions
Empirical Study of Pre-training a Backbone for 3D Human Pose and Shape Estimation
Sparsity by Redundancy: Solving $L_1$ with a Simple Reparametrization
Test-Time Adaptation for Visual Document Understanding
Learned Index with Dynamic $\epsilon$
Breaking the Curse of Dimensionality in Multiagent State Space: A Unified Agent Permutation Framework
LAU: A novel two-parameter learnable Logmoid Activation Unit
3D Molecular Generation by Virtual Dynamics
N-Student Learning: An Approach to Model Uncertainty and Combat Overfitting
Wav2Tok: Deep Sequence Tokenizer for Audio Retrieval
Image to Sphere: Learning Equivariant Features for Efficient Pose Prediction
Better handling unlabeled entity problem using PU-learning and negative sampling
PV3D: A 3D Generative Model for Portrait Video Generation
k-Median Clustering via Metric Embedding: Towards Better Initialization with Differential Privacy
Analysis of Error Feedback in Compressed Federated Non-Convex Optimization
Characterizing the Influence of Graph Elements
Adversarially Robust Neural Lyapunov Control
MICN: Multi-scale Local and Global Context Modeling for Long-term Series Forecasting
EMP: Effective Multidimensional Persistence for Graph Representation Learning
Class Prototype-based Cleaner for Label Noise Learning
Improving Vision Attention with Random Walk Graph Kernel
SWIFT: Rapid Decentralized Federated Learning via Wait-Free Model Communication
Hierarchical Sliced Wasserstein Distance
Test-time Adaptation for Better Adversarial Robustness
AutoShot: A Short Video Dataset and State-of-the-Art Shot Boundary Detection
Prototypical Calibration for Few-shot Learning of Language Models
NERDS: A General Framework to Train Camera Denoisers from Single Noisy Images
Communication-Efficient and Drift-Robust Federated Learning via Elastic Net
Hierarchical Protein Representations via Complete 3D Graph Networks
Adversarial Attacks on Adversarial Bandits
Multiscale Multimodal Transformer for Multimodal Action Recognition
Partition Matters in Learning and Learning-to-Learn Implicit Neural Representations
Grounding High Dimensional Representation Similarity by Comparing Decodability and Network Performance
Likelihood adjusted semidefinite programs for clustering heterogeneous data
RGI: robust GAN-inversion for mask-free image inpainting and unsupervised pixel-wise anomaly detection
Coverage-centric Coreset Selection for High Pruning Rates
AIA: learn to design greedy algorithm for NP-complete problems using neural networks
Self-Adaptive Perturbation Radii for Adversarial Training
Hybrid and Collaborative Passage Reranking
GCINT: Dynamic Quantization Algorithm for Training Graph Convolution Neural Networks Using Only Integers
ILA-DA: Improving Transferability of Intermediate Level Attack with Data Augmentation
Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning
3EF: Class-Incremental Learning via Efficient Energy-Based Expansion and Fusion
Out-of-distribution Representation Learning for Time Series Classification
A Closer Look at the Calibration of Differentially Private Learners
AVT: Audio-Video Transformer for Multimodal Action Recognition
Exploring Transformer Backbones for Heterogeneous Treatment Effect Estimation
Few-Shot Learning with Representative Global Prototype
Important Channel Tuning
Feature-Driven Talking Face Generation with StyleGAN2
Schema Inference for Interpretable Image Classification
SimForest: An Efficient Plug-in to Boost Few-Shot Learning Performance
Supernet Training for Federated Image Classification Under System Heterogeneity
Domain-Specific Risk Minimization for Out-of-Distribution Generalization
CircuitNet: A Generic Neural Network to Realize Universal Circuit Motif Modeling
Your Contrastive Learning Is Secretly Doing Stochastic Neighbor Embedding
Covariance-Robust Minimax Probability Machines for Algorithmic Recourse
Harnessing Mixed Offline Reinforcement Learning Datasets via Trajectory Weighting
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Ensuring DNN Solution Feasibility for Optimization Problems with Linear Constraints
SpeedAug: A Simple Co-Augmentation Method for Unsupervised Audio-Visual Pre-training
EM-Network: Learning Better Latent Variable for Sequence-to-Sequence Models
AutoFHE: Automated Adaption of CNNs for Efficient Evaluation over FHE
REPRESENTATIVE PROTOTYPE WITH CONTRASTIVE LEARNING FOR SEMI-SUPERVISED FEW-SHOT CLASSIFICATION
Data-efficient Supervised Learning is Powerful for Neural Combinatorial Optimization
Temporally-Weighted Spike Encoding for Event-based Object Detection and Classification
Spiking Convolutional Neural Networks for Text Classification
RegCLR: A Self-Supervised Framework for Tabular Representation Learning in the Wild
Personalized Federated Learning with Feature Alignment and Classifier Collaboration
Distributionally Robust Recourse Action
Randomized Smoothing with Masked Inference for Adversarially Robust NLP Systems
Rethinking the Structure of Stochastic Gradients: Empirical and Statistical Evidence
Representing Multi-view Time-series Graph Structures for Multivariate Long-term Time-series Forecasting
Improving Language Model Pretraining with Text Structure Information
Can Language Models Make Fun? A Case Study in Chinese Comical Crosstalk
Chasing Better Deep Image Priors Between Over- and Under-parameterization
Generalizable Person Re-identification Without Demographics
Simple Yet Effective Graph Contrastive Learning for Recommendation
Clustering-Assisted Foreground and Background Separation for Weakly-supervised Temporal Action Localization
MemoNav: Working Memory Model for Visual Navigation
Write and Paint: Generative Vision-Language Models are Unified Modal Learners
Progressive Voronoi Diagram Subdivision Enables Accurate Data-free Class-Incremental Learning
Data Valuation Without Training of a Model
HotProtein: A Novel Framework for Protein Thermostability Prediction and Editing
Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information
RPM: Generalizable Behaviors for Multi-Agent Reinforcement Learning
Behavior Prior Representation learning for Offline Reinforcement Learning
How Does Adaptive Optimization Impact Local Neural Network Geometry?
Substructured Graph Convolution for Non-overlapping Graph Decomposition
Concentric Ring Loss for Face Forgery Detection
MaskConver: A Universal Panoptic and Semantic Segmentation Model with Pure Convolutions
On the Neural Tangent Kernel of Equilibrium Models
From Play to Policy: Conditional Behavior Generation from Uncurated Robot Data
Causal Knowledge Transfer from Task Affinity
Beyond Counting Linear Regions of Neural Networks, Simple Linear Regions Dominate!
Recovering Top-Two Answers and Confusion Probability in Multi-Choice Crowdsourcing
SCALE-UP: An Efficient Black-box Input-level Backdoor Detection via Analyzing Scaled Prediction Consistency
On the Perils of Cascading Robust Classifiers
GENERATIVE OF ORIGIN MODEL DISTRIBUTION MASKED WITH EMOTIONS AND TOPICS DISTRIBUTION IN HYBRID METHOD
Visual Classification via Description from Large Language Models
Unsupervised Visual Anomaly Detection with Score-Based Generative Model
A Data-Based Perspective on Transfer Learning
Contrastive Novelty Learning: Anticipating Outliers with Large Language Models
MIMT: Masked Image Modeling Transformer for Video Compression
Speculative Decoding: Lossless Speedup of Autoregressive Translation
CONVOLUTION AND POOLING OPERATION MODULE WITH ADAPTIVE STRIDE PROCESSING EFFECT
Transformer Module Networks for Systematic Generalization in Visual Question Answering
Diving into Unified Data-Model Sparsity for Class-Imbalanced Graph Representation Learning
Cluster and Landmark Attributes Infused Graph Neural Networks for Link prediction
Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions
Adaptive Robust Evidential Optimization For Open Set Detection from Imbalanced Data
The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation
Representational Task Bias in Zero-shot Recognition at Scale
AxBERT: An Explainable Chinese Spelling Correction Method Driven by Associative Knowledge Network
Hungry Hungry Hippos: Towards Language Modeling with State Space Models
FINE: Future-Aware Inference for Streaming Speech Translation
PATCH-MIX TRANSFORMER FOR UNSUPERVISED DOMAIN ADAPTATION: A GAME PERSPECTIVE
Dual Diffusion Implicit Bridges for Image-to-Image Translation
HYPERPRUNING: EFFICIENT PRUNING THROUGH LYAPUNOV METRIC HYPERSEARCH
Relational Curriculum Learning for Graph Neural Networks
The World is Changing: Improving Fair Training under Correlation Shifts
ACMP: Allen-Cahn Message Passing with Attractive and Repulsive Forces for Graph Neural Networks
Average Sensitivity of Decision Tree Learning
Minimum Curvature Manifold Learning
GeONet: a neural operator for learning the Wasserstein geodesic
Causal Proxy Models For Concept-Based Model Explanations
Offline Reinforcement Learning with Differential Privacy
Relational Attention: Generalizing Transformers for Graph-Structured Tasks
Accelerating Adaptive Federated Optimization with Local Gossip Communications
On the Complexity of Bayesian Generalization
Distilling Model Failures as Directions in Latent Space
Stable Target Field for Reduced Variance Score Estimation
Graph Contrastive Learning Under Heterophily: Utilizing Graph Filters to Generate Graph Views
Continuous pseudo-labeling from the start
Hybrid Federated Learning for Feature & Sample Heterogeneity: Algorithms and Implementation
SimA: Simple Softmax-free Attention For Vision Transformers
Bridging the Gap Between Cascade and End-to-End Cross-modal Translation Models: A Zero-Shot Approach
Policy Architectures for Compositional Generalization in Control
GNNDelete: A General Unlearning Strategy for Graph Neural Networks
Lower Bounds for Differentially Private ERM: Unconstrained and Non-Euclidean
Compound Tokens: Channel Fusion for Vision-Language Representation Learning
The Convergence Rate of SGD's Final Iterate: Analysis on Dimension Dependence
Multi-Rate VAE: Train Once, Get the Full Rate-Distortion Curve
Are vision transformers more robust than CNNs for Backdoor attacks?
Adaptive Gradient Methods with Local Guarantees
Combinatorial-Probabilistic Trade-Off: P-Values of Community Properties Test in the Stochastic Block Models
RelationCLIP: Training-free Fine-grained Visual and Language Concept Matching
Min-Max Zero-Shot Multi-Label Classification
Meta-Learning in Games
GLASU: A Communication-Efficient Algorithm for Federated Learning with Vertically Distributed Graph Data
Dynamic Embeddings of Temporal High-Order Interactions via Neural Diffusion-Reaction Processes
Fair Federated Learning via Bounded Group Loss
DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking
An Upper Bound for the Distribution Overlap Index and Its Applications
Learning by Distilling Context
Target-Free Ligand Scoring via One-Shot Learning
Inverse Kernel Decomposition
Structured Pruning of CNNs at Initialization
Constructive TT-representation of the tensors given as index interaction functions with applications
Thinking Two Moves Ahead: Anticipating Other Users Improves Backdoor Attacks in Federated Learning
Continuized Acceleration for Quasar Convex Functions in Non-Convex Optimization
Towards Global Optimality in Cooperative MARL with Sequential Transformation
Sparse tree-based Initialization for Neural Networks
Learning Soft Constraints From Constrained Expert Demonstrations
VoGE: A Differentiable Volume Renderer using Gaussian Ellipsoids for Analysis-by-Synthesis
Unravel Structured Heterogeneity of Tasks in Meta-Reinforcement Learning via Exploratory Clustering
An Investigation of Domain Generalization with Rademacher Complexity
Towards Efficient Posterior Sampling in Deep Neural Networks via Symmetry Removal
Local Stochastic Bilevel Optimization with Momentum-Based Variance Reduction
FedDA: Faster Framework of Local Adaptive Gradient Methods via Restarted Dual Averaging
On Emergence of Activation Sparsity in Trained Transformers
Explainable Recommender with Geometric Information Bottleneck
Near-optimal Policy Identification in Active Reinforcement Learning
FixEval: Execution-based Evaluation of Program Fixes for Competitive Programming Problems
Algorithmic Determination of the Combinatorial Structure of the Linear Regions of ReLU Neural Networks
Are Neurons Actually Collapsed? On the Fine-Grained Structure in Neural Representations
FoSR: First-order spectral rewiring for addressing oversquashing in GNNs
Early Stopping for Deep Image Prior
ON COMPLEX-DOMAIN CNN REPRESENTATIONS FOR CLASSIFYING REAL/COMPLEX-VALUED DATA
FAME: Fast Adaptive Moment Estimation based on Triple Exponential Moving Average
Progressive Transformation Learning For Leveraging Virtual Images in Training
In-Context Policy Iteration
Learning to Grow Pretrained Models for Efficient Transformer Training
Generative Modeling Helps Weak Supervision (and Vice Versa)
What does a platypus look like? Generating customized prompts for zero-shot image classification
Hyperbolic Contrastive Learning for Visual Representations beyond Objects
Provable Memorization Capacity of Transformers
Knowledge-Driven New Drug Recommendation
Beyond Traditional Transfer Learning: Co-finetuning for Action Localisation
Output Distribution over the Entire Input Space: A Novel Perspective to Understand Neural Networks
Learning Control Policies for Region Stabilization in Stochastic Systems
InCoder: A Generative Model for Code Infilling and Synthesis
Bridge the Inference Gaps of Neural Processes via Expectation Maximization
Contrastive Prompt Tuning Improves Generalization in Vision-Language Models
Decentralized Robust V-learning for Solving Markov Games with Model Uncertainty
Generated Graph Detection
Neural Embeddings for Text
Find Your Friends: Personalized Federated Learning with the Right Collaborators
Masked Vision and Language Modeling for Multi-modal Representation Learning
Quantum Fourier Networks for solving Parametric PDEs
Agent-based Graph Neural Networks
Generating Adversarial Examples with Task Oriented Multi-Objective Optimization
On the Performance of Temporal Difference Learning With Neural Networks
Certified Defences Against Adversarial Patch Attacks on Semantic Segmentation
Markup-to-Image Diffusion Models with Scheduled Sampling
ADVERSARIALLY BALANCED REPRESENTATION FOR CONTINUOUS TREATMENT EFFECT ESTIMATION
Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning
How Much Space Has Been Explored? Measuring the Chemical Space Covered by Databases and Machine-Generated Molecules
Semantic Video Synthesis from Video Scene Graphs
D-CIPHER: Discovery of Closed-form Partial Differential Equations
Towards Identification of Microaggressions in Real-life and Scripted Conversations, using Context-Aware Machine Learning Techniques
UNIFIED-IO: A Unified Model for Vision, Language, and Multi-modal Tasks
Benchmarking Offline Reinforcement Learning on Real-Robot Hardware
CUDA: Curriculum of Data Augmentation for Long-tailed Recognition
Understanding new tasks through the lens of training data via exponential tilting
Neighborhood Gradient Clustering: An Efficient Decentralized Learning Method for Non-IID Data Distributions
Equilibrium-finding via exploitability descent with learned best-response functions
A Unified Framework for Comparing Learning Algorithms
Neural Network Approximations of PDEs Beyond Linearity: Representational Perspective
Calibrating Sequence likelihood Improves Conditional Language Generation
Masked inverse folding with sequence transfer for protein representation learning
Convolutions are competitive with transformers for protein sequence pretraining
Learning differentiable solvers for systems with hard constraints
FedDAR: Federated Domain-Aware Representation Learning
KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal
Learning to Estimate Shapley Values with Vision Transformers
No Double Descent in PCA: Training and Pre-Training in High Dimensions
Predicting Drug Repurposing Candidates and Their Mechanisms from A Biomedical Knowledge Graph
ProGen2: Exploring the Boundaries of Protein Language Models
Interval Bound Interpolation for Few-shot Learning with Few Tasks
A framework for benchmarking Class-out-of-distribution detection and its application to ImageNet
Data Poisoning Attacks Against Multimodal Encoders
SlotFormer: Unsupervised Visual Dynamics Simulation with Object-Centric Models
CEPD: Co-Exploring Pruning and Decomposition for Compact DNN Models
Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective
Tessellated Neural Networks: A Robust Defence against Adversarial Attacks
Retrieval-based Controllable Molecule Generation
ELRT: Towards Efficient Low-Rank Training for Compact Neural Networks
InfoOT: Information Maximizing Optimal Transport
To be robust and to be fair: aligning fairness with robustness
Posterior Sampling Model-based Policy Optimization under Approximate Inference
Causal discovery from conditionally stationary time series
Fair Clustering via Equalized Confidence
Learning for Edge-Weighted Online Bipartite Matching with Robustness Guarantees
Tangential Wasserstein Projections
Data Drift Correction via Time-varying Importance Weight Estimator
Analytical Composition of Differential Privacy via the Edgeworth Accountant
Policy-Induced Self-Supervision Improves Representation Finetuning in Visual RL
Deep Generative Symbolic Regression
What Can we Learn From The Selective Prediction And Uncertainty Estimation Performance Of 523 Imagenet Classifiers?
Solving and Learning non-Markovian Stochastic Control problems in continuous-time with Neural RDEs
Spatio-temporal Self-Attention for Egocentric 3D Pose Estimation
MAE are Secretly Efficient Learners
RNAS-CL: Robust Neural Architecture Search by Cross-Layer Knowledge Distillation
Multi-Agent Policy Transfer via Task Relationship Modeling
When does Bias Transfer in Transfer Learning?
Predictor-corrector algorithms for stochastic optimization under gradual distribution shift
AIM: Adapting Image Models for Efficient Video Understanding
Impossibly Good Experts and How to Follow Them
On Convergence of Average-Reward Off-Policy Control Algorithms in Weakly-Communicating MDPs
Distributionally Robust Post-hoc Classifiers under Prior Shifts
Transformer Meets Boundary Value Inverse Problems
Transferability Between Regression Tasks
Diagnosing and exploiting the computational demands of video games for deep reinforcement learning
NeuralPCG: Learning Preconditioner for Solving Partial Differential Equations with Graph Neural Network
Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation
Cross-Quality Few-Shot Transfer for Alloy Yield Strength Prediction: A New Material Science Benchmark and An Integrated Optimization Framework
Parameter-varying neural ordinary differential equations with partition-of-unity networks
Robust Reinforcement Learning with Distributional Risk-averse formulation
Unicom: Universal and Compact Representation Learning for Image Retrieval
The Reward Hypothesis is False
Convergence of Generative Deep Linear Networks Trained with Bures-Wasserstein Loss
Diffusion Probabilistic Fields
Improving Information Retention in Large Scale Online Continual Learning
Shape Analysis by Shadow Synthesis
Landscape Learning for Neural Network Inversion
Stochastic Multi-Person 3D Motion Forecasting
ON INJECTING NOISE DURING INFERENCE
LEARNING THE SPECTROGRAM TEMPORAL RESOLUTION FOR AUDIO CLASSIFICATION
Beyond calibration: estimating the grouping loss of modern neural networks
Hybrid RL: Using both offline and online data can make RL efficient
Spotting Expressivity Bottlenecks and Fixing Them Optimally
Scalable and Privacy-enhanced Graph Generative Model for Graph Neural Networks
Model ensemble instead of prompt fusion: a sample-specific knowledge transfer method for few-shot prompt tuning
Entropy-Regularized Model-Based Offline Reinforcement Learning
Reward-free Policy Learning through Active Human Involvement
Automaton Distillation: A Neuro-Symbolic Transfer Learning Approach for Deep RL
Sign and Basis Invariant Networks for Spectral Graph Representation Learning
Certification of Attribution Robustness for Euclidean Distance and Cosine Similarity Measure
Diffusing Graph Attention
Sequential Latent Variable Models for Few-Shot High-Dimensional Time-Series Forecasting
Code Translation with Compiler Representations
GAIN: On the Generalization of Instructional Action Understanding
Deep Reinforcement learning on Adaptive Pairwise Critic and Asymptotic Actor
Model-based Value Exploration in Actor-critic Deep Reinforcement Learning
Omnigrok: Grokking Beyond Algorithmic Data
ManyDG: Many-domain Generalization for Healthcare Applications
Adversarial Detector for Decision Tree Ensembles Using Representation Learning
Learning with Instance-Dependent Label Noise: Balancing Accuracy and Fairness
Flow Annealed Importance Sampling Bootstrap
Learning with MISELBO: The Mixture Cookbook
DecAF: Joint Decoding of Answers and Logical Forms for Question Answering over Knowledge Bases
NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis
Robust Attention for Contextual Biased Visual Recognition
A unified optimization framework of ANN-SNN Conversion: towards optimal mapping from activation values to firing rates
Multi-Objective Reinforcement Learning: Convexity, Stationarity and Pareto Optimality
Point-based Molecular Representation Learning from Conformers
Continual Unsupervised Disentangling of Self-Organizing Representations
Inducing Gaussian Process Networks
Causal Inference via Nonlinear Variable Decorrelation in Healthcare
Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes
Fooling SHAP with Stealthily Biased Sampling
Towards Realtime Distributed Virtual Flow Meter via Compressed Continual Learning
Asynchronous Gradient Play in Zero-Sum Multi-agent Games
Novel View Synthesis with Diffusion Models
DM-NeRF: 3D Scene Geometry Decomposition and Manipulation from 2D Images
Robust Neural ODEs via Contractivity-promoting Regularization
Analyzing the Effects of Classifier Lipschitzness on Explainers
Complex-Target-Guided Open-Domain Conversation based on offline reinforcement learning
Trading Information between Latents in Hierarchical Variational Autoencoders
"Why did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts
VC Theoretical Explanation of Double Descent
Points2NeRF: Generating Neural Radiance Fields from 3D point cloud
Imitation Improvement Learning for Large-scale Capacitated Vehicle Routing Problems
Enhance Local Consistency for Free: A Multi-Step Inertial Momentum Approach
SYNG4ME: Model Evaluation using Synthetic Test Data
Take One Gram of Neural Features, Get Enhanced Group Robustness
LMC: Fast Training of GNNs via Subgraph Sampling with Provable Convergence
DEEPER-GXX: DEEPENING ARBITRARY GNNS
Music-to-Text Synaesthesia: Generating Descriptive Text from Music Recordings
ISAAC Newton: Input-based Approximate Curvature for Newton's Method
Learning Human-Compatible Representations for Case-Based Decision Support
Long-Tailed Learning Requires Feature Learning
Understanding Hindsight Goal Relabeling Requires Rethinking Divergence Minimization
DoE2Vec: Representation Learning for Exploratory Landscape Analysis
How to Exploit Hyperspherical Embeddings for Out-of-Distribution Detection?
Inferring Causal Relations between Temporal Events
AnyDA: Anytime Domain Adaptation
Improving Deep Regression with Ordinal Entropy
Revisiting Pretraining Objectives for Tabular Deep Learning
OoD-Control: Out-of-Distribution Generalization for Adaptive UAV Flight Control
AdaptFSP: Adaptive Fictitious Self Play
A Robust Stacking Framework for Training Deep Graph Models with Multifaceted Node Features
VLG: General Video Recognition with Web Textual Knowledge
Unified Discrete Diffusion for Simultaneous Vision-Language Generation
Take 5: Interpretable Image Classification with a Handful of Features
Uncertainty-based Multi-Task Data Sharing for Offline Reinforcement Learning
On the Fast Convergence of Unstable Reinforcement Learning Problems
Iterative Patch Selection for High-Resolution Image Recognition
HyperMAML: Few-Shot Adaptation of Deep Models with Hypernetworks
Conditional Antibody Design as 3D Equivariant Graph Translation
Robust Constrained Reinforcement Learning
Differentiable Meta-Logical Programming
FaceMAE: Privacy-Preserving Face Recognition via Masked Autoencoders
Fuzzy Alignments in Directed Acyclic Graph for Non-Autoregressive Machine Translation
Efficient Federated Domain Translation
EIT: Enhanced Interactive Transformer for Sequence Generation
Single-Stage Open-world Instance Segmentation with Cross-task Consistency Regularization
What can be learnt with wide convolutional neural networks?
3D Segmenter: 3D Transformer based Semantic Segmentation via 2D Panoramic Distillation
Towards Skilled Population Curriculum for MARL
Logit Clipping for Robust Learning against Label Noise
Clifford Neural Layers for PDE Modeling
GOOD: Exploring geometric cues for detecting objects in an open world
Enhancing Robustness of Deep Networks Based on a Two-phase Model of Their Training with Noisy Labels
Bringing Saccades and Fixations into Self-supervised Video Representation Learning
Improve learning combining crowdsourced labels by weighting Areas Under the Margin
Distraction is All You Need For Fairness
Learning Diverse and Effective Policies with Non-Markovian Rewards
Emergent world representations: Exploring a sequence model trained on a synthetic task
Programmatically Grounded, Compositionally Generalizable Robotic Manipulation
M$^3$Video: Masked Motion Modeling for Self-Supervised Video Representation Learning
ObPose: Leveraging Pose for Object-Centric Scene Inference and Generation in 3D
FedCL: Critical Learning Periods-aware Adaptive Client Selection in Federated Learning
TabCaps: A Capsule Neural Network for Tabular Data Classification with BoW Routing
Learning Instance-Solution Operator For Optimal Control
CorruptEncoder: Data Poisoning Based Backdoor Attacks to Contrastive Learning
Learning Transferable Spatiotemporal Representations from Natural Script Knowledge
Heterogeneous Continual Learning
Decentralized Online Bandit Optimization on Directed Graphs with Regret Bounds
BAMBI: Vertical Federated Bilevel Optimization with Privacy-Preserving and Computation Efficiency
Revitalize Region Feature for Democratizing Video-language Pre-training of Retrieval
Local Attention Layers for Vision Transformers
MESSAGENET: MESSAGE CLASSIFICATION USING NATURAL LANGUAGE PROCESSING AND META-DATA
Koopman neural operator for learning non-linear partial differential equations
Regularizing hard examples improves robustness
Universal approximation and model compression for radial neural networks 
Momentum Diminishes the Effect of Spectral Bias in Physics-Informed Neural Networks
MULTILEVEL XAI: VISUAL AND LINGUISTIC BONDED EXPLANATIONS
Efficient Evaluation of Adversarial Robustness for Deep Hashing based Retrieval
An Exact Poly-Time Membership-Queries Algorithm for Extracting a Three-Layer ReLU Network
Neural Discrete Reinforcement Learning
CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
miCSE: Mutual Information Contrastive Learning for Low-shot Sentence Embeddings
Formal Conceptual Views in Neural Networks
Towards Understanding and Mitigating Dimensional Collapse in Heterogeneous Federated Learning
A New Paradigm for Federated Structure Non-IID Subgraph Learning
An Intrinsic Dimension Perspective of Transformers for Sequential Modeling
SketchKnitter: Vectorized Sketch Generation with Diffusion Models
Evidential Uncertainty and Diversity Guided Active Learning for Scene Graph Generation
ErGOT: entropy-regularized graph optimal transport
Test-time recalibration of conformal predictors under distribution shift based on unlabeled examples
TabDDPM: Modelling Tabular Data with Diffusion Models
BED: Boundary-Enhanced Decoder for Chinese Word Segmentation
Gradient Inversion via Over-parameterized Convolutional Network in Federated Learning
Memory-Augmented Variational Adaptation for Online Few-Shot Segmentation
Tailoring Language Generation Models under Total Variation Distance
SeqSHAP: Subsequence Level Shapley Value Explanations for Sequential Predictions
Newton Losses: Efficiently Including Second-Order Information into Gradient Descent
BPFL: Towards Efficient Byzantine-Robust and Provably Privacy-Preserving Federated Learning
Understanding Masked Image Modeling via Learning Occlusion Invariant Feature
Anisotropic Message Passing: Graph Neural Networks with Directional and Long-Range Interactions
Learn Low-dimensional Shortest-path Representation of Large-scale and Complex Graphs
SYNC: SAFETY-AWARE NEURAL CONTROL FOR STABILIZING STOCHASTIC DELAY-DIFFERENTIAL EQUATIONS
Byzantine-robust Decentralized Learning via ClippedGossip
From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models
A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning
Reinforcement learning for instance segmentation with high-level priors
Differentiable Mathematical Programming for Object-Centric Representation Learning
Transformers are Sample-Efficient World Models
Considering Layerwise Importance in the Lottery Ticket Hypothesis
Generalized Sum Pooling for Metric Learning
SAAL: Sharpness-Aware Active Learning
Scalable Subset Sampling with Neural Conditional Poisson Networks
High probability error bounds of SGD in unbounded domain
Improved Convergence of Differential Private SGD with Gradient Clipping
Learning Inductive Object-Centric Slot Initialization via Clustering
Group-level Brain Decoding with Deep Learning
QUANTILE-LSTM: A ROBUST LSTM FOR ANOMALY DETECTION
Mutual Information-guided Knowledge Transfer for Open-World Semi-Supervised Learning
RegQ: Convergent Q-Learning with Linear Function Approximation using Regularization
Neural Field Discovery Disentangles Equivariance in Interacting Dynamical Systems
DIMENSION-REDUCED ADAPTIVE GRADIENT METHOD
Learning to Estimate Single-View Volumetric Flow Motions without 3D Supervision
Towards the Out-of-Distribution Generalization of Contrastive Self-Supervised Learning
Online Policy Optimization for Robust MDP
Toeplitz Neural Network for Sequence Modeling
An Adaptive Entropy-Regularization Framework for Multi-Agent Reinforcement Learning
Relative Positional Encoding Family via Unitary Transformation
Revisiting Feature Acquisition Bias for Few-Shot Fine-Grained Image Classification
ColoristaNet for Photorealistic Video Style Transfer
Auto-Encoding Adversarial Imitation Learning
$\Delta$-PINNs: physics-informed neural networks on complex geometries
On the Nonconvex Convergence of SGD
BiTAT: Neural Network Binarization with Task-Dependent Aggregated Transformation
Dynamic Loss for Learning with Label Noise
Memory of Unimaginable Outcomes in Experience Replay
Temperature Schedules for self-supervised contrastive methods on long-tail data
Deep Learning on Implicit Neural Representations of Shapes
Continual Vision-Language Representation Learning with Off-Diagonal Information
Learning Counterfactually Invariant Predictors
Deep Reinforcement Learning for Cryptocurrency Trading: Practical Approach to Address Backtest Overfitting
ImaginaryNet: Learning Object Detectors without Real Images and Annotations
Don't Throw Your Old Policies Away: Knowledge-based Policy Recycling Protects Against Adversarial Attacks
Contextual bandits with concave rewards, and an application to fair ranking
Contrastive Adversarial Loss for Point Cloud Reconstruction
Low-complexity Deep Video Compression with A Distributed Coding Architecture
When Few-shot Meets Cross-domain Object Detection: Learning Instance-level Class Prototypes for Knowledge Transfer
Gradient Boosting Performs Gaussian Process Inference
Constrained Reinforcement Learning for Safety-Critical Tasks via Scenario-Based Programming
TGP: Explainable Temporal Graph Neural Networks for Personalized Recommendation
When is Adversarial Robustness Transferable?
COFS: COntrollable Furniture layout Synthesis
Distribution Shift Detection for Deep Neural Networks
Learning Zero-Shot Cooperation with Humans, Assuming Humans Are Biased
SUG: Single-dataset Unified Generalization for 3D Point Cloud Classification
Efficient Policy Space Response Oracles
An Optimal Transport Perspective on Unpaired Image Super-Resolution
A Functional Perspective on Multi-Layer Out-of-Distribution Detection
The Continuous CNN: from Task-Specific to Unified CNN Architecture
Ahead-of-Time P-Tuning
MAXENT LOSS: CONSTRAINED MAXIMUM ENTROPY FOR CALIBRATING DEEP NEURAL NETWORKS
Unsupervised Threshold Learning with "$L$"-trend Prior For Visual Anomaly Detection
Planckian Jitter: countering the color-crippling effects of color jitter on self-supervised training
Efficient and Stealthy Backdoor Attack Triggers are Close at Hand
SimST: A GNN-Free Spatio-Temporal Learning Framework for Traffic Forecasting
Property Inference Attacks Against t-SNE Plots
Physically Plausible and Conservative Solutions to Navier-Stokes Equations Using Physics-Informed CNNs
GAMR: A Guided Attention Model for (visual) Reasoning
On the Connection between Fisher's Criterion and Shannon's Capacity: Theoretical Concepts and Implementation
Pixel-Aligned Non-parametric Hand Mesh Reconstruction
Voint Cloud: Multi-View Point Cloud Representation for 3D Understanding 
Is the Deep Model Representation Sparse and Symbolic with Causal Patterns?
Learning QUBO Forms in Quantum Annealing
Understanding Gradient Regularization in Deep Learning: Efficient Finite-Difference Computation and Implicit Bias
Approximate Nearest Neighbor Search through Modern Error-Correcting Codes
Social and environmental impact of recent developments in machine learning on biology and chemistry research
TransformMix: Learning Transformation and Mixing Strategies for Sample-mixing Data Augmentation
When to Make and Break Commitments?
Generalization bounds and algorithms for estimating the effect of multiple treatments and dosage
DENSE RGB SLAM WITH NEURAL IMPLICIT MAPS
Monocular Scene Reconstruction with 3D SDF Transformers
HEAT: Hardware-Efficient Automatic Tensor Decomposition for Transformer Compression
Learning Heterogeneous Interaction Strengths by Trajectory Prediction with Graph Neural Network
From $t$-SNE to UMAP with contrastive learning
On the optimal precision of GANs
Disentangled Knowledge Transfer: A New Perspective for Personalized Federated Learning
D4AM: A General Denoising Framework for Downstream Acoustic Models
Fully Continuous Gated Recurrent Units For processing Time Series
Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning 
On Intriguing Layer-Wise Properties of Robust Overfitting in Adversarial Training
Does Federated Learning Really Need Backpropagation?
Teaching Others is Teaching Yourself Regularization For Controllable Language Models
Prompt Generation Networks for Efficient Adaptation of Frozen Vision Transformers
Saliency-guided Vision Transformer for Few-shot Keypoint Detection
Towards Learning Imperceptible Adversarial Distribution for Black-Box Attacks
Active Learning with Partial Labels
Specialization of Sub-paths for Adaptive Depth Networks
Towards Effective and Interpretable Human-Agent Collaboration in MOBA Games: A Communication Perspective
Fine-Grained Image Retrieval with Neighbor-Attention Label Correction
How Normalization and Weight Decay Can Affect SGD? Insights from a Simple Normalized Model
Closing the Performance Gap between Cumbersome and Lightweight Contrastive Models
DCAPS: Dual Cross-Attention Coupled with Stabilizer for Few-Shot Common Action Localization
Generalize Learned Heuristics to Solve Large-scale Vehicle Routing Problems in Real-time
MUTUAL EXCLUSIVE MODULATOR FOR LONG-TAILED RECOGNITION
RetinexUTV: ROBUST RETINEX MODEL WITH UNFOLDING TOTAL VARIATION
Adapting Pre-trained Language Models for Quantum Natural Language Processing
Towards the Generalization of Contrastive Self-Supervised Learning
Towards Controllable Policy through Goal-Masked Transformers
Fed-CBS: Heterogeneity-Aware Client Sampling Mechanism for Federated Learning via Class-Imbalance Reduction
Comparative Analysis between Vision Transformers and CNNs from the view of Neuroscience
Uncertainty-Aware Meta-Learning for Multimodal Task Distributions
Neural Operator Variational Inference based on Regularized Stein Discrepancy for Deep Gaussian Processes
On the complexity of nonsmooth automatic differentiation
CO3: Cooperative Unsupervised 3D Representation Learning for Autonomous Driving
Bag of Tricks for Unsupervised Text-to-Speech
FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy
Holistically Explainable Vision Transformers
Delving into Discrete Normalizing Flows on SO(3) Manifold for Probabilistic Rotation Modeling
Neural Volumetric Mesh Generator
PathFusion: Path-consistent Lidar-Camera Deep Feature Fusion
DADAO: Decoupled Accelerated Decentralized Asynchronous Optimization
Enabling Probabilistic Inference on Large-Scale Spiking Neural Networks
Less is More: Identifying the Cherry on the Cake for Dynamic Networks
Advancing Radiograph Representation Learning with Masked Record Modeling
Instance-wise Batch Label Restoration via Gradients in Federated Learning
Self Check-in: Tight Privacy Amplification for Practical Distributed Learning
Re-parameterizing Your Optimizers rather than Architectures
Protein Representation Learning via Knowledge Enhanced Primary Structure Reasoning
Provable Unsupervised Data Sharing for Offline Reinforcement Learning
Federated Learning for Inference at Anytime and Anywhere
Modeling Sequential Sentence Relation to Improve Cross-lingual Dense Retrieval
Boosting Discriminative Visual Representation Learning with Scenario-Agnostic Mixup
A Robustly and Effectively Optimized Pretraining Approach for Masked Autoencoder
Diffusion Posterior Sampling for General Noisy Inverse Problems
Low-Rank Graph Neural Networks Inspired by the Weak-balance Theory in Social Networks
Do We Need Neural Collapse? Learning Diverse Features for Fine-grained and Long-tail Classification
Node-Level Membership Inference Attacks Against Graph Neural Networks
HRBP: Hardware-friendly Regrouping towards Block-wise Pruning for Sparse Training
MAGA: Modeling a Group Action
Learning in Compressed Domain via Knowledge Transfer
DepthFL: Depthwise Federated Learning for Heterogeneous Clients
Masked Image Modeling with Denoising Contrast
Holding Monotonic Improvement and Generality for Multi-Agent Proximal Policy Optimization
Monkeypox with Cross Infection Hypothesis via Epidemiological Mode
LPMARL: Linear Programming based Implicit Task Assignment for Hierarchical Multi-agent Reinforcement Learning
Transmission Dynamics of Hepatitis B: Analysis and Control
Mass-Editing Memory in a Transformer
Enhancement and Numerical Assessment of Novel SARS-CoV-2 Virus Transmission Model
GoBigger: A Scalable Platform for Cooperative-Competitive Multi-Agent Interactive Simulation
Masked Unsupervised Self-training for Label-free Image Classification 
Recursion of Thought: Divide and Conquer Reasoning with Language Models
GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis
Environment Partitioning For Invariant Learning By Decorrelation
Learning the Positions in CountSketch
Towards the gradient adjustment by loss status for Neural Network Optimization
On the Necessity of Disentangled Representations for Downstream Tasks
Grouped self-attention mechanism for a memory-efficient Transformer
Linear Video Transformer with Feature Fixation
Neural Frailty Machine: Beyond proportional hazard assumption in neural survival regressions
A Closer Look at Dual Batch Normalization and Two-domain Hypothesis In Adversarial Training With Hybrid Samples
Generative Recorrupted-to-Recorrupted: An Unsupervised Image Denoising Network for Arbitrary Noise Distribution
Provably Learning Diverse Features in Multi-View Data with Midpoint Mixup
Understanding Catastrophic Overfitting in Fast Adversarial Training From a Non-robust Feature Perspective
AutoDisc: Automatic Distillation Schedule for Large Language Model Compression
Lifting the Curse of Capacity Gap in Distilling Large Language Models
BEiT v2: Masked Image Modeling with Vector-Quantized Visual Tokenizers
Geo-NN: An End-to-End Framework for Geodesic Mean Estimation on the Manifold of Symmetric Positive Definite Matrices
HIVE: HIerarchical Volume Encoding for Neural Implicit Surface Reconstruction
Progressive Image Synthesis from Semantics to Details with Denoising Diffusion GAN
Communication-Efficient Federated Learning with Accelerated Client Gradient
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
Ranking-Enhanced Unsupervised Sentence Representation Learning
Simultaneously Learning Stochastic and Adversarial Markov Decision Process with Linear Function Approximation
Statistical Efficiency of Score Matching: The View from Isoperimetry
Quadratic models for understanding neural network dynamics
Improving Adversarial Transferability with Worst-case Aware Attacks
TCFimt: Temporal Counterfactual Forecasting from Individual Multiple Treatment Perspective
Curved Representation Space of Vision Transformers
Revisiting Graph Adversarial Attack and Defense From a Data Distribution Perspective
Gated Domain Units for Multi-source Domain Generalization
CooPredict: Cooperative Differential Games For Time Series Prediction
Learning large-scale Kernel Networks
Self-Architectural Knowledge Distillation for Spiking Neural Networks
Provable Sim-to-real Transfer in Continuous Domain with Partial Observations
Local Coefficient Optimization in Federated Learning
Outcome-directed Reinforcement Learning by Uncertainty & Temporal Distance-Aware Curriculum Goal Generation
E$^2$: Entropy Discrimination and Energy Optimization for Source-free Universal Domain Adaptation
Protective Label Enhancement for Label Privacy
Synergistic Neuromorphic Federated Learning with ANN-SNN Conversion For Privacy Protection
On Fairness Measurement for Generative Models
Federated Semi-supervised Learning with Dual Regulator
Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks
Globally Optimal Training of Neural Networks with Threshold Activation Functions
Robust Learning with Decoupled Meta Label Purifier
Molecule Generation For Target Protein Binding with Structural Motifs
Bag of Tricks for FGSM Adversarial Training 
Exploring interactions between modalities for deepfake detection
Towards Robustness Certification Against Universal Perturbations
Deep Generative Modeling on Limited Data with Regularization by Nontransferable Pre-trained Models
MAT: Mixed-Strategy Game of Adversarial Training in Fine-tuning
Defense against Backdoor Attacks via Identifying and Purifying Bad Neurons
Basic Binary Convolution Unit for Binarized Image Restoration Network
DSP: Dynamic Semantic Prototype for Generative Zero-Shot Learning
Analyzing the Latent Space of GAN through Local Dimension Estimation
MAFormer: A Transformer Network with Multi-scale Attention Fusion for Visual Recognition
A Causal Approach to Detecting Multivariate Time-series Anomalies and Root Causes
Quark: A Gradient-Free Quantum Learning Framework for Classification Tasks
Cross-modal Graph Contrastive Learning with Cellular Images
A Closer Look at Self-supervised Lightweight Vision Transformers
MANDERA: Malicious Node Detection in Federated Learning via Ranking
MQSP: Micro-Query Sequence Parallelism for Linearly Scaling Long Sequence Transformer
HagSeg: Hardness-adaptive Guidance for Semi-supervised Semantic Segmentation
DSPNet: Towards Slimmable Pretrained Networks based on Discriminative Self-supervised Learning
Generative Multi-Flow Networks: Centralized, Independent and Conservation
A Laplace-inspired Distribution on SO(3) for Probabilistic Rotation Estimation
Why pseudo-label based algorithm is effective? --from the perspective of pseudo-labeled data
Distributionally Robust Model-Based Offline Reinforcement Learning with Near-Optimal Sample Complexity
ContraGen: Effective Contrastive Learning For Causal Language Model
Fast 6D Object Pose Refinement via Implicit Surface Representation Driven Optimization
Benchmarking Encoder-Decoder Architectures for Biplanar X-ray to 3D Bone Shape Reconstruction
Time Series Anomaly Detection via Hypothesis Testing for Dynamical Systems
Exploring the Generalizability of CNNs via Activated Representational Substitution
Measuring and Narrowing the Compositionality Gap in Language Models
FedFA: Federated Learning with Feature Alignment for Heterogeneous Data
HiViT: A Simpler and More Efficient Design of Hierarchical Vision Transformer
Style Spectroscope: Improve Interpretability and Controllability through Fourier Analysis
Multimodal Federated Learning via Contrastive Representation Ensemble
Eva: Practical Second-order Optimization with Kronecker-vectorized Approximation
Identifying Weight-Variant Latent Causal Models
Beyond Single Path Integrated Gradients for Reliable Input Attribution via Randomized Path Sampling
Sweet Gradient Matters: Designing Consistent and Efficient Estimator for Zero-Shot Neural Architecture Search
Bridging attack and prompting: An Enhanced Visual Prompting at the pixel level
Neural Collaborative Filtering Bandits via Meta Learning
MABA-Net: Masked Additive Binary Activation Network
Cascaded Teaching Transformers with Data Reweighting for Long Sequence Time-series Forecasting
Decoupled and Patch-based Contrastive Learning for Long-tailed Visual Recognition
Can CNNs Be More Robust Than Transformers?
motifNet: Functional motif interactions discovered in mRNA sequences with implicit neural representation learning
Decoupled Mixup for Data-efficient Learning
FAIRER: Fairness as Decision Rationale Alignment
Rethinking Data Augmentation for Improving Transferable Targeted Attacks
A Deep Dive into the Stability-Plasticity Dilemma in Class-Incremental Learning
KITE: A Kernel-based Improved Transferability Estimation Method
Risk-Aware Reinforcement Learning with Coherent Risk Measures and Non-linear Function Approximation
A Minimalist Dataset for Systematic Generalization of Perception, Syntax, and Semantics
Bi-level Physics-Informed Neural Networks for PDE Constrained Optimization using Broyden's Hypergradients
Learning Continuous Grasping Function with a Dexterous Hand from Human Demonstrations
Hazard Gradient Penalty for Survival Analysis
Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness
Rethink Depth Separation with Intra-layer Links
Reach the Remote Neighbors: Dual-Encoding Transformer for Graphs
The Geometry of Self-supervised Learning Models and its Impact on Transfer Learning
Only For You: Deep Neural Anti-Forwarding Watermark Preserves Image Privacy
When Do Models Generalize? A Perspective From Data-Algorithm Compatibility
On the Saturation Effect of Kernel Ridge Regression
Adversarial perturbation based latent reconstruction for domain-agnostic self-supervised learning
Unsupervised Model Selection for Time Series Anomaly Detection
Constrained Hierarchical Deep Reinforcement Learning with Differentiable Formal Specifications
Topic Aware Transformer: Domain Shift for Unconditional Text Generation Model
PromptCast: A New Prompt-based Learning Paradigm for Time Series Forecasting
Protein Representation Learning by Geometric Structure Pretraining
Conditional Invariances for Conformer Invariant Protein Representations
Learning PDE Solution Operator for Continuous Modeling of Time-Series
Quantum-Inspired Tensorized Embedding with Application to Node Representation Learning
Identifying Latent Causal Content for Multi-Source Domain Adaptation
Robust Self-Supervised Image Denoising with Cyclic Shift and Noise-Intensity-Aware Uncertainty
Trainable Weight Averaging: Efficient Training by Optimizing Historical Solutions
Revealing Single Frame Bias for Video-and-Language Learning
Deep Declarative Dynamic Time Warping for End-to-End Learning of Alignment Paths
DEEAPR: Controllable Depth Enhancement via Adaptive Parametric Feature Rotation
Deep Active Anomaly Detection With Diverse Queries
MODULAR FEDERATED CONTRASTIVE LEARNING WITH PEER NORMALIZATION
Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning
NetBooster: Empowering Tiny Deep Learning By Standing on the Shoulders of Deep Giants
Understanding Edge-of-Stability Training Dynamics with a Minimalist Example
Learning Proximal Operators to Discover Multiple Optima
Guiding continuous operator learning through Physics-based boundary constraints
AdaWAC: Adaptively Weighted Augmentation Consistency Regularization for Volumetric Medical Image Segmentation
Limitations of the NTK for Understanding Generalization in Deep Learning
Federated Learning of Large Models at the Edge via Principal Sub-Model Training
Low-Entropy Features Hurt Out-of-Distribution Performance
Implicit Offline Reinforcement Learning via Supervised Learning
A Unimodal, Uncertainty-Aware Deep Learning Approach for Ordinal Regression
Augmentation Backdoors
Neural Radiance Field Codebooks
Scalable Estimation of Nonparametric Markov Networks with Mixed-Type Data
Determinant regularization for Deep Metric Learning
Data-Efficient and Interpretable Tabular Anomaly Detection
Extracting Expert's Goals by What-if Interpretable Modeling
FiT: Parameter Efficient Few-shot Transfer Learning for Personalized and Federated Image Classification
A Critical Analysis of Out-of-Distribution Detection for Document Understanding
Learnable Visual Words for Interpreting Image Recognition Models
Compact Bilinear Pooling via General Bilinear Projection
AANG: Automating Auxiliary Learning
Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation
Diffusion Probabilistic Modeling of Protein Backbones in 3D for the motif-scaffolding problem
NeRF-SOS: Any-View Self-supervised Object Segmentation on Complex Scenes
RbX: Region-based explanations of prediction models
Rethinking Graph Lottery Tickets: Graph Sparsity Matters
The Impact of Approximation Errors on Warm-Start Reinforcement Learning: A Finite-time Analysis
NeRN: Learning Neural Representations for Neural Networks
Private Federated Learning Without a Trusted Server: Optimal Algorithms for Convex Losses
3D-Aware Video Generation
Joint rotational invariance and adversarial training of a dual-stream Transformer yields state of the art Brain-Score for Area V4
AutoSparse: Towards Automated Sparse Training
Improving Molecular Pretraining with Complementary Featurizations
Learning to Communicate using Contrastive Learning 
Cheap Talk Discovery and Utilization in Multi-Agent Reinforcement Learning
Motif-induced Graph Normalization
Stochastic Gradient Methods with Preconditioned Updates
Reversible Column Networks
Flexible Relation Preserving for Adversarial Training
Formal Mathematics Statement Curriculum Learning
A Unified Causal View of Domain Invariant Representation Learning
PIPS: Path Integral Stochastic Optimal Control for Path Sampling in Molecular Dynamics
ID and OOD Performance Are Sometimes Inversely Correlated on Real-world Datasets
Continual Learning with Group-wise Neuron Normalization
Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks
Visual Transformation Telling
PRUDEX-Compass: Towards Systematic Evaluation of Reinforcement Learning in Financial Markets
Joint Spatiotemporal Attention for Mortality Prediction of Patients with Long COVID
Predicting Antimicrobial MICs for Nontyphoidal Salmonella Using Multitask Representations Learning 
Bootstrap Motion Forecasting With Self-Consistent Constraints
Sparse Hyperbolic Representation Learning
Fair Multi-exit Framework for Facial Attribute Classification
Learning to Split for Automatic Bias Detection
Union Subgraph Neural Networks
Modeling Multimodal Aleatoric Uncertainty in Segmentation with Mixture of Stochastic Experts
Frame Adaptive Network
On the Robustness of Safe Reinforcement Learning under Observational Perturbations
Rethinking Saliency in Data-free Class Incremental Learning
Rethinking the Training Shot Number in Robust Model-Agnostic Meta-Learning
Behind the Scenes of Gradient Descent: A Trajectory Analysis via Basis Function Decomposition
What Is Missing in IRM Training and Evaluation? Challenges and Solutions
Neural Decoding of Visual Imagery via Hierarchical Variational Autoencoders
Cooperative Adversarial Learning via Closed-Loop Transcription
Multi-task Self-supervised Graph Neural Networks Enable Stronger Task Generalization
Analyzing Tree Architectures in Ensembles via Neural Tangent Kernel
Learn Appropriate Precise Distributions for Binary Neural Networks
Correcting Data Distribution Mismatch in Offline Meta-Reinforcement Learning with Few-Shot Online Adaptation
Sharper Rates and Flexible Framework for Nonconvex SGD with Client and Data Sampling
Universal embodied intelligence: learning from crowd, recognizing the world, and reinforced with experience
Exploring The Role of Mean Teachers in Self-supervised Masked Auto-Encoders
Multifactor Sequential Disentanglement via Structured Koopman Autoencoders
Sub-Task Decomposition Enables Learning in Sequence to Sequence Tasks
T2D: Spatiotemporal Feature Learning Based on Triple 2D Decomposition
Online Placebos for Class-incremental Learning
Evaluating Long-Term Memory in 3D Mazes
Packed Ensembles for efficient uncertainty estimation
Proactive Multi-Camera Collaboration for 3D Human Pose Estimation
OpenFE: Automated Feature Generation beyond Expert-level Performance
FDNet: Focal Decomposed Network for Efficient, Robust and Practical time series forecasting
Physics-empowered Molecular Representation Learning
On the Difficulties of Video Summarization: Structure and Subjectivity
CCMLN: Combinatorial Correction for Multi-Label Classification with Noisy Labels
Revisiting Domain Randomization Via Relaxed State-Adversarial Policy Optimization
Consistent Targets Provide Better Supervision in Semi-supervised Object Detection
Become a Proficient Player with Limited Data through Watching Pure Videos
Evaluation of Attribution Explanations without Ground Truth
Human MotionFormer: Transferring Human Motions with Vision Transformers
Entity Divider with Language Grounding in Multi-Agent Reinforcement Learning
Multi-Agent Sequential Decision-Making via Communication
LAMDA: Latent mapping for domain adaption of image generators
Hierarchies of Reward Machines
EfficientTTS 2: Variational End-to-End Text-to-Speech Synthesis and Voice Conversion
LatentAugment: Dynamically Optimized Latent Probabilities of Data Augmentation
Cali-NCE: Boosting Cross-modal Video Representation Learning with Calibrated Alignment
Novel Class Discovery under Unreliable Sampling
NEW TRAINING FRAMEWORK FOR SPEECH ENHANCEMENT USING REAL NOISY SPEECH
PA-LoFTR: Local Feature Matching with 3D Position-Aware Transformer
Policy Contrastive Imitation Learning
Backstepping Temporal Difference Learning
Reconciling Adversarial Robustness with Accuracy via Randomized Weights
Hidden Markov Transformer for Simultaneous Machine Translation
Rank Preserving Framework for Asymmetric Image Retrieval 
MINI: Mining Implicit Novel Instances for Few-Shot Object Detection
D3C2-Net: Dual-Domain Deep Convolutional Coding Network for Compressive Sensing
Single-level Adversarial Data Synthesis based on Neural Tangent Kernels
Learning to Count Everything: Transformer-based Trackers are Strong Baselines for Class Agnostic Counting
Unified Algorithms for RL with Decision-Estimation Coefficients: No-Regret, PAC, and Reward-Free Learning
Strength-Adaptive Adversarial Training
Teach me how to Interpolate a Myriad of Embeddings
Exploring Parameter-Efficient Fine-tuning for Improving Communication Efficiency in Federated Learning
Mega: Moving Average Equipped Gated Attention
Going Deeper with Spiking Neurons: Towards Binary Outputs of Deep Logic Spiking Neural Network
IEDR: A Context-aware Intrinsic and Extrinsic Disentangled Recommender System
Correcting Three Existing Beliefs on Mutual Information in Contrastive Learning
Deep Deformation Based on Feature-Constraint for 3D Human Mesh Correspondence
Explaining Representation Bottlenecks of Convolutional Decoder Networks
Batch Normalization Is Blind to the First and Second Derivatives of the Loss w.r.t. Features
Dual Ensembled Multiagent Q-Learning with Hypernet Regularizer
Exploring Chemical Space with Score-based Out-of-distribution Generation
Divide and conquer policy for efficient GAN training
Node Number Awareness Representation for Graph Similarity Learning
Evaluating Fairness Without Sensitive Attributes: A Framework Using Only Auxiliary Models
Dataset Condensation with Latent Space Knowledge Factorization and Sharing
Optimal Neural Network Approximation of Wasserstein Gradient Direction via Convex Optimization
Parallel Deep Neural Networks Have Zero Duality Gap
Causal RL Agents for Out-of-distribution Generalization
Multi-domain image generation and translation with identifiability guarantees
Interventional Rationalization
Information-Theoretic Analysis of Unsupervised Domain Adaptation
Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes
DELVING INTO THE HIERARCHICAL STRUCTURE FOR EFFICIENT LARGE-SCALE BI-LEVEL LEARNING
Can GNNs Learn Heuristic Information for Link Prediction?
Understanding Zero-shot Adversarial Robustness for Large-Scale Models
HOYER REGULARIZER IS ALL YOU NEED FOR EXTREMELY SPARSE SPIKING NEURAL NETWORKS
Controllable Evaluation and Generation of Physical Adversarial Patch on Face Recognition
Why Adversarial Training of ReLU Networks Is Difficult?
Rademacher Complexity Over $\mathcal{H} \Delta \mathcal{H}$ Class for Adversarially Robust Domain Adaptation
Continual evaluation for lifelong learning: Identifying the stability gap
On the Universal Approximation Property of Deep Fully Convolutional Neural Networks
Can We Faithfully Represent Absence States to Compute Shapley Values on a DNN?
FedGSNR: Accelerating Federated Learning on Non-IID Data via Maximum Gradient Signal to Noise Ratio
Dataless Knowledge Fusion by Merging Weights of Language Models
Domain-Indexing Variational Bayes for Domain Adaptation
Improving the Transferability of Adversarial Attacks through Experienced Precise Nesterov Momentum
TaylorNet: A Taylor-Driven Generic Neural Architecture
Semi-supervised learning of partial differential operators and dynamical flows
View Synthesis with Sculpted Neural Points
Universal Vision-Language Dense Retrieval: Learning A Unified Representation Space for Multi-Modal Retrieval
Continual Pre-trainer is an Incremental Model Generalizer
DFlow: Learning to Synthesize Better Optical Flow Datasets via a Differentiable Pipeline
FS-DETR: Few-Shot DEtection TRansformer with prompting and without re-training
One-Pixel Shortcut: On the Learning Preference of Deep Neural Networks
An Improved Baseline for Masked Contrastive Learning
Make Memory Buffer Stronger in Continual Learning: A Continuous Neural Transformation Approach
Sparse Random Networks for Communication-Efficient Federated Learning
WaveMix-Lite: A Resource-efficient Neural Network for Image Analysis
On the Impact of Adversarially Robust Models on Algorithmic Recourse
Learning to acquire novel cognitive tasks with evolution, plasticity and meta-meta-learning
Breaking Beyond COCO Object Detection
BinaryVQA: A Versatile Dataset to Push the Limits of VQA Models
NormSoftmax: Normalize the Input of Softmax to Accelerate and Stabilize Training
Diverse, Difficult, and Odd Instances (D2O): A New Test Set for Object Classification
Differentially Private Dataset Condensation
Variation-based Cause Effect Identification
A General Framework For Proving The Equivariant Strong Lottery Ticket Hypothesis
TimelyFL: Heterogeneity-aware Asynchronous Federated Learning with Adaptive Partial Training
Robust Fair Clustering: A Novel Fairness Attack and Defense Framework
Learning to Jointly Share and Prune Weights for Grounding Based Vision and Language Models
Coupling Semi-supervised Learning with Reinforcement Learning for Better Decision Making -- An application to Cryo-EM Data Collection
Spatial Attention Kinetic Networks with E(n)-Equivariance
Training Recipe for N:M Structured Sparsity with Decaying Pruning Mask
Light-weight probing of unsupervised representations for Reinforcement Learning
Understanding Rare Spurious Correlations in Neural Networks
Graph Domain Adaptation via Theory-Grounded Spectral Regularization
Effective dimension of machine learning models
CLARE: Conservative Model-Based Reward Learning for Offline Inverse Reinforcement Learning
Data-Free One-Shot Federated Learning Under Very High Statistical Heterogeneity
Personalized Subgraph Federated Learning
Domain-Adjusted Regression or: ERM May Already Learn Features Sufficient for Out-of-Distribution Generalization
Initial Value Problem Enhanced Sampling for Closed-Loop Optimal Control Design with Deep Neural Networks
Human Pose Estimation in the Dark
Tackling the Retrieval Trilemma with Cross-Modal Indexing
C3PO: Learning to Achieve Arbitrary Goals via Massively Entropic Pretraining
ProtoVAE: Using Prototypical Networks for Unsupervised Disentanglement
Neural Diffusion Processes
Global Context Vision Transformers
Adversarial Learned Fair Representations using Dampening and Stacking
Watch What You Pretrain For: Targeted, Transferable Adversarial Examples on Self-Supervised Speech Recognition models
Imposing conservation properties in deep dynamics modeling via contrastive learning
Language Models Can See: Plugging Visual Controls in Text Generation
GReTo: Remedying dynamic graph topology-task discordance via target homophily
Towards predicting dynamic stability of power grids with Graph Neural Networks
Pareto-Optimal Diagnostic Policy Learning in Clinical Applications via Semi-Model-Based Deep Reinforcement Learning
Closing the gap: Exact maximum likelihood training of generative autoencoders using invertible layers
ACAT: Adversarial Counterfactual Attention for Classification and Detection in Medical Imaging
Dynamics-inspired Neuromorphic Representation Learning
Abstract Visual Reasoning by Self-supervised Contrastive Learning
POPGym: Benchmarking Partially Observable Reinforcement Learning
ETAD: A Sampling-Based Approach for Efficient Temporal Action Detection
HierBatching: Locality-Aware Out-of-Core Training of Graph Neural Networks
Everybody Needs Good Neighbours: An Unsupervised Locality-based Method for Bias Mitigation
Continual Learning In Low-coherence Subspace: A Strategy To Mitigate Learning Capacity Degradation
Particle-based Variational Inference with Preconditioned Functional Gradient Flow
An Efficient Mean-field Approach to High-Order Markov Logic
A theory of representation learning in neural networks gives a deep generalisation of kernel methods
A spatiotemporal graph neural network with multi granularity for air quality prediction
Highway Reinforcement Learning
Learning Locality and Isotropy in Dialogue Modeling
Dynamic Historical Adaptation for Continual Image-Text Modeling
AutoGT: Automated Graph Transformer Architecture Search
Rememory-Based SimSiam for Unsupervised Continual Learning
OPERA: Omni-Supervised Representation Learning with Hierarchical Supervisions
i-MAE: Are Latent Representations in Masked Autoencoders Linearly Separable?
Logit Margin Matters: Improving Transferable Targeted Adversarial Attack by Logit Calibration
GSCA: Global Spatial Correlation Attention
Cross Modal Domain Generalization for Query-based Video Segmentation
Accumulative Poisoning Defense with Memorization Discrepancy
Combating Exacerbated Heterogeneity for Robust Decentralized Models
Shot Retrieval and Assembly with Text Script for Video Montage Generation
Pruning with Output Error Minimization for Producing Efficient Neural Networks
Orientation-Aware Graph Neural Networks for Protein Structure Representation Learning
Adaptive Update Direction Rectification for Unsupervised Continual Learning
Language Model Pre-training with Linguistically Motivated Curriculum Learning
Towards Generalized Combinatorial Solvers via Reward Adjustment Policy Optimization
Offline Reinforcement Learning with Closed-Form Policy Improvement Operators
Sensitivity-aware Visual Parameter-efficient Tuning
Towards Robust Object Detection Invariant to Real-World Domain Shifts
Light Sampling Field and BRDF Representation for Physically-based Neural Rendering
Margin-based Neural Network Watermarking
Your Denoising Implicit Model is a Sub-optimal Ensemble of Denoising Predictions
DREAM: Domain-free Reverse Engineering Attributes of Black-box Model
Structural Generalization of Visual Imitation Learning with Position-Invariant Regularization
Dealing with missing data using attention and latent space regularization
Revisiting Global Pooling through the Lens of Optimal Transport
On the Importance of Pretrained Knowledge Distillation for 3D Object Detection
Bidirectional Propagation for Cross-Modal 3D Object Detection
Policy Pre-training for Autonomous Driving via Self-supervised Geometric Modeling
Towards Expressive Graph Representations for Graph Neural Networks
EurNet: Efficient Multi-Range Relational Modeling of Spatial Multi-Relational Data
TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis
Learning without Prejudices: Continual Unbiased Learning via Benign and Malignant Forgetting
FINDE: Neural Differential Equations for Finding and Preserving Invariant Quantities
Controllable Adaptive Learning
Approximate Vanishing Ideal Computations at Scale
How you start matters for generalization
Understanding Incremental Learning of Gradient Descent: A Fine-grained analysis of Matrix Sensing
Selective Annotation Makes Language Models Better Few-Shot Learners
Switch-NeRF: Learning Scene Decomposition with Mixture of Experts for Large-scale Neural Radiance Fields
Efficient, Stable, and Analytic Differentiation of the Sinkhorn Loss
A Holistic View of Noise Transition Matrix in Deep Learning and Beyond
Active Learning in Bayesian Neural Networks with Balanced Entropy Learning Principle
Near-Optimal Adversarial Reinforcement Learning with Switching Costs
Bias Mitigation Framework for Intersectional Subgroups in Neural Networks
NORM: Knowledge Distillation via N-to-One Representation Matching
Downstream Datasets Make Surprisingly Good Pretraining Corpora
Revisiting Embeddings for Graph Neural Networks
NOAH: A New Head Structure To Improve Deep Neural Networks For Image Classification
Empirical analysis of representation learning and exploration in neural kernel bandits
S^2-Transformer for Mask-Aware Hyperspectral Image Reconstruction
Exploiting Spatial Separability for Deep Learning Multichannel Speech Enhancement with an Align-and-Filter Network
A deep top-down approach to hierarchically coherent probabilistic forecasting
CroMA: Cross-Modality Adaptation for Monocular BEV Perception
CausalAgents: A Robustness Benchmark for Motion Forecasting Using Causal Relationships
Adaptive Client Sampling in Federated Learning via Online Learning with Bandit Feedback
Dynamical Isometry for Residual Networks
PVT++: A Simple End-to-End Latency-Aware Visual Tracking Framework
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
Variational Imbalanced Regression
EMO: Episodic Memory Optimization for Few-Shot Meta-Learning
Critic Sequential Monte Carlo
Radial Spike and Slab Bayesian Neural Networks for Sparse Data in Ransomware Attacks
Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning?
Explainability of deep reinforcement learning algorithms in robotic domains by using Layer-wise Relevance Propagation
High Dimensional Bayesian Optimization with Reinforced Transformer Deep Kernels
Learning to Take a Break: Sustainable Optimization of Long-Term User Engagement
Laziness, Barren Plateau, and Noises in Machine Learning
HyperQuery: A Framework for Higher Order Link Prediction
Generative Model Based Noise Robust Training for Unsupervised Domain Adaptation
Deep Learning meets Nonparametric Regression: Are Weight-Decayed DNNs Locally Adaptive?
Sparse Token Transformer with Attention Back Tracking
A Deep Conjugate Direction Method for Iteratively Solving Linear Systems
MixMask: Revisiting Masked Siamese Self-supervised Learning in Asymmetric Distance
Smart Multi-tenant Federated Learning
Robust Active Distillation
Controllable Image Generation via Collage Representations
Robust Multi-Agent Reinforcement Learning with State Uncertainties
Tiny Adapters for Vision Transformers
Accelerating Inverse Reinforcement Learning with Expert Bootstrapping
Kernel Neural Optimal Transport
Neural Optimal Transport
SeaFormer: Squeeze-enhanced Axial Transformer for Mobile Semantic Segmentation
Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural Networks
Harnessing spectral representations for subgraph alignment
MotifExplainer: a Motif-based Graph Neural Network Explainer
Receding Neuron Importances for Structured Pruning
Learning Sparse and Low-Rank Priors for Image Recovery via Iterative Reweighted Least Squares Minimization
Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance
Proximal Curriculum for Reinforcement Learning Agents
Spherical Sliced-Wasserstein
Interpreting & Improving Pretrained Language Models: A Probabilistic Conceptual Approach
Neural Optimal Transport with General Cost Functionals
Does Dataset Lottery Ticket Hypothesis Exist?
Random Weight Factorization improves the training of Continuous Neural Representations
Triangle Inequality for Inverse Optimal Control
InPL: Pseudo-labeling the Inliers First for Imbalanced Semi-supervised Learning
Mixed-Precision Inference Quantization: Problem Resetting, Mapping math concept and Branch&bound methods
Latent Offline Distributional Actor-Critic
Mixed-Precision Inference Quantization: Radically Towards Faster inference speed, Lower Storage requirement, and Lower Loss
Leveraging Double Descent for Scientific Data Analysis: Face-Based Social Behavior as a Case Study
Fusion of Deep Transfer Learning with Mixed convolution network
CONTINUAL MODEL EVOLVEMENT WITH INNER-PRODUCT RESTRICTION
Maximizing Communication Efficiency for Large-scale Training via 0/1 Adam
PREF: Phasorial Embedding Fields for Compact Neural Representations
Truthful Self-Play
Iterative $\alpha$-(de)Blending: Learning a Deterministic Mapping Between Arbitrary Densities
Strategic Classification on Graphs
Causal Information Bottleneck Boosts Adversarial Robustness of Deep Neural Network
Continual Transformers: Redundancy-Free Attention for Online Inference
FedPSE: Personalized Sparsification with Element-wise Aggregation for Federated Learning
Towards Online Real-Time Memory-based Video Inpainting Transformers
Dimensionality-Varying Diffusion Process
Edge-Varying Fourier Graph Network for Multivariate Time Series Forecasting
Learning Symbolic Models for Graph-structured Physical Mechanism
Leveraging variational autoencoders for multiple data imputation
Unleashing Mask: Explore the Intrinsic Out-of-distribution Detection Capability
Dirichlet-based Uncertainty Calibration for Active Domain Adaptation
Accurate Image Restoration with Attention Retractable Transformer
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
Imitate Your Own Refinement: Knowledge Distillation Sheds Light on Efficient Image-to-Image Translation
Efficient Trojan Injection: 90% Attack Success Rate Using 0.04% Poisoned Samples
Priors, Hierarchy, and Information Asymmetry for Skill Transfer in Reinforcement Learning
Self-Supervised Set Representation Learning for Unsupervised Meta-Learning
Neural Episodic Control with State Abstraction
Minibatch Stochastic Three Points Method for Unconstrained Smooth Minimization
Multigraph Topology Design for Cross-Silo Federated Learning
Universal Speech Enhancement with Score-based Diffusion
Partial Advantage Estimator for Proximal Policy Optimization
Critical Sampling for Robust Evolution Behavior Learning of Unknown Dynamical Systems
Causal Representation Learning for Instantaneous and Temporal Effects
Visual Imitation Learning with Patch Rewards
Planning Immediate Landmarks of Targets for Model-Free Skill Transfer across Agents
Gated Class-Attention with Cascaded Feature Drift Compensation for Exemplar-free Continual Learning of Vision Transformers
AdaDQH Optimizer: Evolving from Stochastic to Adaptive by Auto Switch of Precondition Matrix
CodeT: Code Generation with Generated Tests
Learning Specialized Activation Functions for Physics-informed Neural Networks
Dateformer: Transformer Extends Look-back Horizon to Predict Longer-term Time Series
CAMVR: Context-Adaptive Multi-View Representation Learning for Dense Retrieval
BIL: Bandit Inference Learning for Online Representational Similarity Test
Adam Accumulation to Reduce Memory Footprints of both Activations and Gradients for Large-scale DNN Training
Learning to Generate Columns with Application to Vertex Coloring
On Storage Neural Network Augmented Approximate Nearest Neighbor Search
TPC-NAS: Sub-Five-Minute Neural Architecture Search for Image Classification, Object-Detection, and Super-Resolution
The Role of ImageNet Classes in Fréchet Inception Distance
Imitation Learning via Differentiable Physics
Diffusion Models Already Have A Semantic Latent Space
Improving group robustness under noisy labels using predictive uncertainty
Mutual Information Regularized Offline Reinforcement Learning
Towards Real-Time Neural Image Compression With Mask Decay
Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model
Rethinking Learning Dynamics in RL using Adversarial Networks
Predicting Cellular Responses with Variational Causal Inference and Refined Relational Information
Exploit Unlabeled Data on the Server! Federated Learning via Uncertainty-aware Ensemble Distillation and Self-Supervision
Sample Importance in SGD Training
ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor
Dataset Pruning: Reducing Training Data by Examining Generalization Influence
Visual Timing For Sound Source Depth Estimation in the Wild
Masked Visual-Textual Prediction for Document Image Representation Pretraining
Physics-Regularized Stereo Matching for Depth Estimation
Learning Robust Goal Space with Hypothetical Analogy-Making
Link Prediction without Graph Neural Networks
AdaStride: Using Adaptive Strides in Sequential Data for Effective Downsampling
SAE: Estimation for Transition Matrix in Annotation Algorithms
Subclass-balancing Contrastive Learning for Long-tailed Recognition
Effective Cross-instance Positive Relations for Generalized Category Discovery
Learning Symbolic Rules for Reasoning in Quasi-Natural Language
Parallel Federated Learning over Heterogeneous Devices
Deep Duplex Learning for Weak Supervision
Plateau in Monotonic Linear Interpolation --- A "Biased" View of Loss Landscape for Deep Networks
Expected Gradients of Maxout Networks and Consequences to Parameter Initialization
The KFIoU Loss for Rotated Object Detection
Crossformer: Transformer Utilizing Cross-Dimension Dependency for Multivariate Time Series Forecasting
Focusing on what to decode and what to train: Efficient Training with HOI Split Decoders and Split Target Guided DeNoising
Go-Explore with a guide: Speeding up search in sparse reward settings with goal-directed intrinsic rewards
Mugs: A Multi-Granular Self-Supervised Learning Framework
Is Stochastic Gradient Descent Near Optimal?
BrainBERT: Self-supervised representation learning for Intracranial Electrodes
Logic-aware Pre-training of Language Models
Individual Fairness of Data Provider Regarding Privacy Risk and Gain
Critical Learning Periods Augmented Model Poisoning Attacks to Byzantine-Robust Federated Learning
Semi-connected Joint Entity Recognition and Relation Extraction of Contextual Entities in Family History Records
Semi-supervised Node Classification with Imbalanced Receptive Field
On the Universality of Langevin Diffusion for Private Euclidean (Convex) Optimization
Fast Test-Time Adaptation Using Hints
Multi-Dataset Multi-Task Framework for Learning Molecules and Protein-target Interactions Properties
General Neural Gauge Fields
Learning Disentanglement in Autoencoders through Euler Encoding
Nonlinear Reconstruction for Operator Learning of PDEs with Discontinuities
Grafting Vision Transformers
Generate rather than Retrieve: Large Language Models are Strong Context Generators
Online Continual Learning for Progressive Distribution Shift (OCL-PDS): A Practitioner's Perspective
Discovering Informative and Robust Positives for Video Domain Adaptation
Understanding Why Generalized Reweighting Does Not Improve Over ERM
Betty: An Automatic Differentiation Library for Multilevel Optimization
PatchBlender: A Motion Prior for Video Transformers
Linear Connectivity Reveals Generalization Strategies
CEREAL: Few-Sample Clustering Evaluation
MonoDETR: Depth-guided Transformer for Monocular 3D Object Detection
Gradient-Guided Importance Sampling for Learning Binary Energy-Based Models
Your Neighbors Are Communicating: Towards Powerful and Scalable Graph Neural Networks
PATCorrect: Non-autoregressive Phoneme-augmented Transformer for ASR Error Correction
Composing Ensembles of Pre-trained Models via Iterative Consensus
Automated Data Augmentations for Graph Classification
Learning Label Encodings for Deep Regression
Riemannian Metric Learning via Optimal Transport
CLR-GAM: Contrastive Point Cloud Learning with Guided Augmentation and Feature Mapping
Computational-Unidentifiability in Representation for Fair Downstream Tasks
Reliability of CKA as a Similarity Measure in Deep Learning
Gradient Properties of Hard Thresholding Operator
Fair Attribute Completion on Graph with Missing Attributes
Comfort Zone: A Vicinal Distribution for Regression Problems
Implicit Neural Spatial Representations for Time-dependent PDEs
Deep Ranking Ensembles for Hyperparameter Optimization
Multi-skill Mobile Manipulation for Object Rearrangement
Robustness to corruption in pre-trained Bayesian neural networks
Accelerating Federated Learning Convergence via Opportunistic Mobile Relaying
What Knowledge gets Distilled in Knowledge Distillation?
Single-shot General Hyper-parameter Optimization for Federated Learning
Spatially constrained Adversarial Attack Detection and Localization in the Representation Space of Optical Flow Networks
Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning
Meta-learning Adaptive Deep Kernel Gaussian Processes for Molecular Property Prediction
ERL-Re$^2$: Efficient Evolutionary Reinforcement Learning with Shared State Representation and Individual Policy Representation
$\mathrm{R}^2$-VOS: Robust Referring Video Object Segmentation via Relational Cycle Consistency
Least-to-Most Prompting Enables Complex Reasoning in Large Language Models
STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition
Deep Ensembles for Graphs with Higher-order Dependencies
Simplicial Embeddings in Self-Supervised Learning and Downstream Classification
Lossless Filter Pruning via Adaptive Clustering for Convolutional Neural Networks
ViT-Adapter: Exploring Plain Vision Transformer for Accurate Dense Predictions
Towards Understanding Why Mask Reconstruction Pretraining Helps in Downstream Tasks
Similarity and Generalization: from Noise to Corruption
On Trace of PGD-Like Adversarial Attacks
MEGAN: Multi Explanation Graph Attention Network
Learning Control Lyapunov Functions For High-dimensional Unknown Systems using Guided Iterative State Space Exploration
Practical Approaches for Fair Learning with Multitype and Multivariate Sensitive Attributes
Self-Supervised Category-Level Articulated Object Pose Estimation with Part-Level SE(3) Equivariance
Universal Mini-Batch Consistency for Set Encoding Functions
Learn the Time to Learn: Replay Scheduling in Continual Learning
Divide to Adapt: Mitigating Confirmation Bias for Domain Adaptation of Black-Box Predictors
Thalamus: a brain-inspired algorithm for biologically-plausible continual learning and disentangled representations
A Generalized EigenGame With Extensions to Deep Multiview Representation Learning
Deep Variational Implicit Processes
Denoising Masked Autoencoders are Certifiable Robust Vision Learners
Estimating individual treatment effects under unobserved confounding using binary instruments
Approximate Bayesian Inference with Stein Functional Variational Gradient Descent
Soundness and Completeness: An Algorithmic Perspective on Evaluation of Feature Attribution
SCoMoE: Efficient Mixtures of Experts with Structured Communication
Locally Invariant Explanations: Towards Stable and Unidirectional Explanations through Local Invariant Learning
Uncertainty-Aware Self-Supervised Learning with Independent Sub-networks
Prompt Learning with Optimal Transport for Vision-Language Models
An Additive Instance-Wise Approach to Multi-class Model Interpretation
Knowledge-Consistent Dialogue Generation with Language Models and Knowledge Graphs
Offline Model-Based Reinforcement Learning with Causal Structure
Towards Semi-Supervised Learning with Non-Random Missing Labels
Improving Generalization with Domain Convex Game
Elastic Mean-Teacher Distillation Mitigates the Continual Learning Stability Gap
Neural Prompt Search
It Takes Two: Masked Appearance-Motion Modeling for Self-Supervised Video Transformer Pre-Training
DASHA: Distributed Nonconvex Optimization with Communication Compression and Optimal Oracle Complexity
LDMIC: Learning-based Distributed Multi-view Image Coding
Improving Differentially-Private Deep Learning with Gradients Index Pruning
Additive Poisson Process: Learning Intensity of Higher-Order Interaction in Poisson Processes
Sound Randomized Smoothing in Floating-Point Arithmetic
Shuffle Gaussian Mechanism for Differential Privacy
Assessing Model Out-of-distribution Generalization with Softmax Prediction Probability Baselines and A Correlation Method
In-the-wild Pretrained Models Are Good Feature Extractors for Video Quality Assessment
Collaborative Pure Exploration in Kernel Bandit
Provably Efficient Risk-Sensitive Reinforcement Learning: Iterated CVaR and Worst Path
FedREP: A Byzantine-Robust, Communication-Efficient and Privacy-Preserving Framework for Federated Learning
Few-Shot Transferable Robust Representation Learning via Bilevel Attacks
Targeted Adversarial Self-Supervised Learning
Accurate and Efficient Soma Reconstruction in a Full Adult Fly Brain
NIERT: Accurate Numerical Interpolation through Unifying Scattered Data Representations using Transformer Encoder
Triplet Similarity Learning on Concordance Constraint
Temporal Label Smoothing for Early Prediction of Adverse Events
Test-Time Robust Personalization for Federated Learning
On-Device Domain Generalization
What's Wrong with the Robustness of Object Detectors?
LAVA: Data Valuation without Pre-Specified Learning Algorithms
FONDUE: an Algorithm to Find the Optimal Dimensionality of the Latent Representations of Variational Autoencoders
Distortion-Aware Network Pruning and Feature Reuse for Real-time Video Segmentation
An Encryption Framework for Pre-Trained Neural Networks
How do Variational Autoencoders Learn? Insights from Representational Similarity
Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets
Manifold Characteristics That Predict Downstream Task Performance
Context Autoencoder for Self-Supervised Representation Learning
Learning to Linearize Deep Neural Networks for Secure and Efficient Private Inference
Mitigating Forgetting in Online Continual Learning via Contrasting Semantically Distinct Augmentations
Wasserstein Fair Autoencoders
Results for Perfect Classification for Graph Attention on the Contextual Stochastic Block Model
Denoising Diffusion Error Correction Codes
Low-Rank Winograd Transformation for 3D Convolutional Neural Networks
Progressive Purification for Instance-Dependent Partial Label Learning
Meta Knowledge Condensation for Federated Learning
Improved Fully Quantized Training via Rectifying Batch Normalization
Video-based 3D Object Detection with Learnable Object-Centric Global Optimization
Edge Wasserstein Distance Loss for Oriented Object Detection
StyleGenes: Discrete and Efficient Latent Distributions for GANs
Scratching Visual Transformer's Back with Uniform Attention
Exploring Active 3D Object Detection from a Generalization Perspective
A Unified Pretraining Framework for Human Motion Analysis
Masked Frequency Modeling for Self-Supervised Visual Pre-Training
Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning
Lottery Aware Sparsity Hunting: Enabling Federated Learning on Resource-Limited Edge
Self-Organizing Pathway Expansion for Non-Exemplar Incremental Learning
Corruption Depth: Analysis of DNN depth for Misclassification
MixMIM: Mixed and Masked Image Modeling for Efficient Visual Representation Learning
Neuro-Symbolic Procedural Planning with Commonsense Prompting
ZERO: A Large-scale Chinese Cross-modal Benchmark with a New Vision-Language Framework
Learning Object-Language Alignments for Open-Vocabulary Object Detection
Phase transition for detecting a small community in a large network
On the Word Boundaries of Emergent Languages Based on Harris's Articulation Scheme
Mine yOur owN Anatomy: Revisiting Medical Image Segmentation with Extremely Limited Labels
Zipper: Decoupling the tradeoff Between Robustness and Accuracy
TempCLR: Temporal Alignment Representation with Contrastive Learning
Generative Augmented Flow Networks
Inferring Fluid Dynamics via Inverse Rendering
How Does Value Distribution in Distributional Reinforcement Learning Help Optimization?
Bort: Towards Explainable Neural Networks with Bounded Orthogonal Constraint
Coordinate and Generalize: A Unified Framework for Audio-Visual Zero-Shot Learning
Interpreting Distributional Reinforcement Learning: A Regularization Perspective
The Power of Regularization in Solving Extensive-Form Games
Neural Topic Modeling with Embedding Clustering Regularization
Distributional Reinforcement Learning via Sinkhorn Iterations
Contextual Symbolic Policy For Meta-Reinforcement Learning
Do We Really Achieve Fairness with Explicit Sensitive Attributes?
MLPInit: Embarrassingly Simple GNN Training Acceleration with MLP Initialization
SinGRAV: Learning a Generative Radiance Volume from a Single Natural Scene
Progressive Compressed Auto-Encoder for Self-supervised Representation Learning
ConBaT: Control Barrier Transformer for Safety-Critical Policy Learning
Robust Transfer Learning Based on Minimax Principle
Interpreting Neural Networks Through the Lens of Heat Flow
Efficient Surrogate Gradients for Training Spiking Neural Networks
DCE: Offline Reinforcement Learning With Double Conservative Estimates
The Trade-off between Universality and Label Efficiency of Representations from Contrastive Learning
S-NeRF: Neural Radiance Fields for Street Views
Generalized structure-aware missing view completion network for incomplete multi-view clustering
EXACT: Compositional Augmentation for Image-level Weakly-Supervised Instance Segmentation
Learning Visual Representation with Synthetic Images and Topologically-defined Labels
Cycle-consistent Masked AutoEncoder for Unsupervised Domain Generalization
CFlowNets: Continuous control with Generative Flow Networks
Global Hardest Example Mining with Prototype-based Triplet Loss
Differentiable Gaussianization Layers for Inverse Problems Regularized by Deep Generative Models
Extreme Masking for Learning Instance and Distributed Visual Representations
Quality Matters: Embracing Quality Clues for Robust 3D Multi-Object Tracking
MGMA: Mesh Graph Masked Autoencoders for Self-supervised Learning on 3D Shape
DBQ-SSD: Dynamic Ball Query for Efficient 3D Object Detection
Exploring Low-Rank Property in Multiple Instance Learning for Whole Slide Image Classification
Evaluating and Inducing Personality in Pre-trained Language Models
MLM with Global Co-occurrence
Node Classification Beyond Homophily: Towards a General Solution
Leveraging Hierarchical Structure for Multi-Domain Active Learning with Theoretical Guarantees
Causal Balancing for Domain Generalization
Elastic Aggregation for Federated Optimization
Decouple Graph Neural Networks: Train Multiple Simple GNNs Simultaneously Instead of One
Reinforced Sample Reweighting Policy for Semi-supervised Learning
Neural Radiance Fields with Geometric Consistency for Few-Shot Novel View Synthesis
Towards Addressing Label Skews in One-shot Federated Learning
Breaking Correlation Shift via Conditional Invariant Regularizer
CROM: Continuous Reduced-Order Modeling of PDEs Using Implicit Neural Representations
Relaxed Combinatorial Optimization Networks with Self-Supervision: Theoretical and Empirical Notes on the Cardinality-Constrained Case
Block and Subword-Scaling Floating-Point (BSFP): An Efficient Non-Uniform Quantization For Low Precision Inference
Exploring The Capacity Mismatch Problem in Knowledge Distillation from the View of Soft Labels
Rethinking the Effect of Data Augmentation in Adversarial Contrastive Learning
FeatER: An Efficient Network for Human Reconstruction Feature map-based TransformER
Pareto Automatic Multi-Task Graph Representation Learning
Semi-supervised Community Detection via Structural Similarity Metrics
DDM$^2$: Self-Supervised Diffusion MRI Denoising with Generative Diffusion Models
Multivariate Time-series Imputation with Disentangled Temporal Representations
Knowledge-driven Scene Priors for Semantic Audio-Visual Embodied Navigation
Promoting Semantic Connectivity: Dual Nearest Neighbors Contrastive Learning for Unsupervised Domain Generalization
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
MixPath: A Unified Approach for One-shot Neural Architecture Search 
CAST: Concurrent Recognition and Segmentation with Adaptive Segment Tokens
Improving the Latent Space of Image Style Transfer
Multi-lingual Evaluation of Code Generation Models
GRACE-C: Generalized Rate Agnostic Causal Estimation via Constraints
How Powerful is Implicit Denoising in Graph Neural Networks
Unified Detoxifying and Debiasing in Language Generation via Inference-time Adaptive Optimization
Distribution Aware Metrics for Conditional Natural Language Generation
GOAT: A Global Transformer on Large-scale Graphs
An Empirical Study on Anomaly detection Using Density Based and Representative Based Clustering algorithms
Recommender Transformers with Behavior Pathways
Outlier Robust Adversarial Training
Towards Discovering Neural Architectures from Scratch
Equiformer: Equivariant Graph Attention Transformer for 3D Atomistic Graphs
Multiple Instance Learning via Iterative Self-Paced Supervised Contrastive Learning
Automating Nearest Neighbor Search Configuration with Constrained Optimization
A prototype-oriented clustering for domain shift with source privacy
On the Effectiveness of Adapting Pre-trained Transformer Models via Adversarial Noise
Sparse Tokens for Dense Prediction - The Medical Image Segmentation Case
Truncated Diffusion Probabilistic Models and Diffusion-based Adversarial Auto-Encoders
NTK-SAP: Improving neural network pruning by aligning training dynamics
One Ring to Bring Them All: Model Adaptation under Domain and Category Shift
Towards Equivariant Graph Contrastive Learning via Cross-Graph Augmentation
Configuring Mixed-Integer Linear Programming Solvers with Deep Metric Learning
Effective Self-supervised Pre-training on Low-compute networks without Distillation
Graph Neural Bandits
CoRTX: Contrastive Framework for Real-time Explanation
MPCFORMER: FAST, PERFORMANT AND PRIVATE TRANSFORMER INFERENCE WITH MPC
Discovering Distinctive ``Semantics'' in Super-Resolution Networks
Networks are Slacking Off: Understanding Generalization Problem in Image Deraining
Disparate Impact in Differential Privacy from Gradient Misalignment
IDP: Iterative Differentiable Pruning based on Attention for Deep Neural Networks
Language-Guided Artistic Style Transfer Using the Latent Space of DALL-E
FADE: Enabling Large-Scale Federated Adversarial Training on Resource-Constrained Edge Devices
Temporal Relevance Analysis for Video Action Models
HNeRV: A Hybrid Neural Representation for Videos
DeepPipe: Deep, Modular and Extendable Representations of Machine Learning Pipelines
OTOv2: Automatic, Generic, User-Friendly
TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second
On the Importance of Architectures and Hyperparameters for Fairness in Face Recognition
Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data
Human Motion Diffusion Model
Federated Learning in Non-IID Settings Aided by Differentially Private Synthetic Data
Structure-based Drug Design with Equivariant Diffusion Models
Deep reinforced active learning for multi-class image classification
HeatDETR: Hardware-Efficient DETR with Device-Adaptive Thinning
ChemSpacE: Interpretable and Interactive Chemical Space Exploration
A UNIFIED VIEW OF FINDING AND TRANSFORMING WINNING LOTTERY TICKETS
The Effects of Nonlinearity on Approximation Capacity of Recurrent Neural Networks
Friends to Help: Saving Federated Learning from Client Dropout
Filter-Recovery Network for Multi-Speaker Audio-Visual Speech Separation
Probing into the Fine-grained Manifestation in Multi-modal Image Synthesis
Can discrete information extraction prompts generalize across language models?
Deep Power Laws for Hyperparameter Optimization
A view of mini-batch SGD via generating functions: conditions of convergence, phase transitions, benefit from negative momenta.
Curiosity-Driven Unsupervised Data Collection for Offline Reinforcement Learning
Understanding and Bridging the Modality Gap for Speech Translation
Big Learning: A Universal Machine Learning Paradigm?
Spike Calibration: Bridging the Gap between ANNs and SNNs in ANN-SNN Conversion 
MIA: A Framework for Certified Robustness of Time-Series Classification and Forecasting Against Temporally-Localized Perturbations
Sparse Q-Learning: Offline Reinforcement Learning with Implicit Value Regularization
Eliminating Catastrophic Overfitting Via Abnormal Adversarial Examples Regularization
TIB: Detecting Unknown Objects via Two-Stream Information Bottleneck
Revisiting Residual Networks for Adversarial Robustness
Win: Weight-Decay-Integrated Nesterov Acceleration for Adaptive Gradient Algorithms
A Quasi-Bayesian Nonparametric Density Estimator via Autoregressive Predictive Updates
Towards Understanding Convergence and Generalization of AdamW
GeoVeX: Geospatial Vectors with Hexagonal Convolutional Autoencoders
Prompt-Matched Semantic Segmentation
Split and Merge Proxy: pre-training protein-protein contact prediction by mining rich information from monomer data
ESD: Expected Squared Difference as a Tuning-Free Trainable Calibration Measure
Interactive Portrait Harmonization
Self-Distillation for Further Pre-training of Transformers
Iterative Relaxing Gradient Projection for Continual Learning
Adversarial Counterfactual Environment Model Learning
Admeta: A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers with Bidirectional Looking
Learning from Interval-valued Data
Feature Synchronization in Backdoor Attacks
Efficient Hyperdimensional Computing
Contextual Convolutional Networks
Factor Learning Portfolio Optimization Informed by Continuous-Time Finance Models
An Incremental Learning Approach for Sustainable Regional Isolation and Integration
GraphPNAS: Learning Distribution of Good Neural Architectures via Deep Graph Generative Models
Private GANs, Revisited
Hidden Poison: Machine unlearning enables camouflaged poisoning attacks
Improve the Adaptation Process by Reasoning From Failed and Successful Cases
FEW-SHOT NODE PROMPT TUNING
Statistical Inference for Fisher Market Equilibrium
Auxiliary task discovery through generate and test
MMTSA: Multi-Modal Temporal Segment Attention Network for Efficient Human Activity Recognition
Scenario-based Question Answering with Interacting Contextual Properties
Easy Differentially Private Linear Regression
PointDP: Diffusion-driven Purification against 3D Adversarial Point Clouds
Deep Physics-based Deformable Models for Efficient Shape Abstractions
Benchmarking and Improving Robustness of 3D Point Cloud Recognition against Common Corruptions
Visual Recognition with Deep Nearest Centroids
Closing the Gap Between SVRG and TD-SVRG with Gradient Splitting
Rethinking Backdoor Data Poisoning Attacks in the Context of Semi-Supervised Learning
Categorial Grammar Induction as a Compositionality Measure for Emergent Languages in Signaling Games
LPT: Long-tailed Prompt Tuning for Image Classification
Interpretable Out-of-Distribution Detection using Pattern Identification
TopoZero: Digging into Topology Alignment on Zero-Shot Learning
Digging into Backbone Design on Face Detection
Towards Stable Test-time Adaptation in Dynamic Wild World
Exploring Over-smoothing in Graph Attention Networks from the Markov Chain Perspective
Sorted eigenvalue comparison $d_{\mathsf{Eig}}$: A simple alternative to $d_{\mathsf{FID}}$
Towards Smooth Video Composition
Deep Dynamic AutoEncoder for Vision BERT Pretraining
Continuous PDE Dynamics Forecasting with Implicit Neural Representations
Adversarial Collaborative Learning on Non-IID Features
DiffMimic: Efficient Motion Mimicking with Differentiable Physics
Towards Inferential Reproducibility of Machine Learning Research
Knowledge Distillation based Degradation Estimation for Blind Super-Resolution
Very Large Scale Multi-Agent Reinforcement Learning with Graph Attention Mean Field
Graph Contrastive Learning for Skeleton-based Action Recognition
Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation
Expected Perturbation Scores for Adversarial Detection
Look Back When Surprised: Stabilizing Reverse Experience Replay for Neural Approximation
BQ-NCO: Bisimulation Quotienting for Generalizable Neural Combinatorial Optimization
CoGANs: Collaborative Generative Adversarial Networks
Multiscale Neural Operator: Learning Fast and Grid-independent PDE Solvers
NASiam: Efficient Representation Learning using Neural Architecture Search for Siamese Networks
Out-of-distribution Detection with Diffusion-based Neighborhood
A Massively Parallel Benchmark for Safe Dexterous Manipulation
Never Revisit: Continuous Exploration in Multi-Agent Reinforcement Learning
Do Not Train It: A Linear Neural Architecture Search of Graph Neural Networks
Rethinking the Explanation of Graph Neural Network via Non-parametric Subgraph Matching
Spikformer: When Spiking Neural Network Meets Transformer 
Representation Mutual Learning for End-to-End Weakly-Supervised Semantic Segmentation
DeSCo: Towards Scalable Deep Subgraph Counting
On a Built-in Conflict between Deep Learning and Systematic Generalization
SepRep-Net: Multi-source Free Domain Adaptation via Model Separation and Reparameterization
Consistent and Truthful Interpretation with Fourier Analysis
D2Match: Leveraging Deep Learning and Degeneracy for Subgraph Matching
Multimodal Analogical Reasoning over Knowledge Graphs
QFuture: Learning Future Expectations in Multi-Agent Reinforcement Learning
MMCAP: LEARNING TO BROAD-SIGHT NEURAL NETWORKS BY CLASS ATTENTION POOLING
GAIN: Enhancing Byzantine Robustness in Federated Learning with Gradient Decomposition
Temporary feature collapse phenomenon in early learning of MLPs
MECTA: Memory-Economic Continual Test-Time Model Adaptation
MocoSFL: enabling cross-client collaborative self-supervised learning
Block-level Stiffness Analysis of Residual Networks
Q-Match: Self-Supervised Learning For Tabular Data by Matching Distributions Induced by a Queue
Supervised Contrastive Regression
SELF-SUPERVISED PRETRAINING FOR DIFFERENTIALLY PRIVATE LEARNING
Explainable Artificial Intelligence: Reaping the Fruits of Decision Trees
Meta-Evolve: Continuous Robot Evolution for One-to-many Policy Transfer
Interpretability with full complexity by constraining feature information
Revisiting Group Robustness: Class-specific Scaling is All You Need
Provable Benefits of Representational Transfer in Reinforcement Learning
Set Discrimination Contrastive Learning
What shapes the loss landscape of self supervised learning?
Hard Regularization to Prevent Collapse in Online Deep Clustering without Data Augmentation
Learning Lightweight Object Detectors via Progressive Knowledge Distillation
Topologically faithful image segmentation via induced matching of persistence barcodes
Prompt-driven efficient Open-set Semi-supervised Learning
Generalizability of Adversarial Robustness Under Distribution Shifts
Uncertainty-Driven Active Vision for Implicit Scene Reconstruction
No Reason for No Supervision: Improved Generalization in Supervised Models
Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies
Active Learning with Controllable Augmentation Induced Acquisition
Learning Axis-Aligned Decision Trees with Gradient Descent
DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models
A Class-Aware Representation Refinement Framework for Graph Classification
EVA3D: Compositional 3D Human Generation from 2D Image Collections
Spurious Local Minima Provably Exist for Deep Convolutional Neural Networks
Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game
Semi-Supervised Semantic Segmentation via Boosting Uncertainty on Unlabeled Data
Clustering Structure Identification With Ordering Graph
Benchmarking Deformable Object Manipulation with Differentiable Physics
Voxurf: Voxel-based Efficient and Accurate Neural Surface Reconstruction
Graph Contrastive Learning with Personalized Augmentation
Tree Structure LSTM for Chinese Named Entity Recognition
Unfixed Bias Iterator: A New Iterative Format
Conditional Positional Encodings for Vision Transformers
Variational Reparametrized Policy Learning with Differentiable Physics
A Fairness Analysis on Differentially Private Aggregation of Teacher Ensembles
GENERALIZED MATRIX LOCAL LOW RANK REPRESENTATION BY RANDOM PROJECTION AND SUBMATRIX PROPAGATION
Stable, Efficient, and Flexible Monotone Operator Implicit Graph Neural Networks
ManiSkill2: A Unified Benchmark for Generalizable Manipulation Skills
LSAP: Rethinking Inversion Fidelity, Perception and Editability in GAN Latent Space
Neural Sorting Networks with Error-Free Differentiable Swap Functions
Twofer: Tackling Continual Domain Shift with Simultaneous Domain Generalization and Adaptation
ModelAngelo: Automated Model Building for Cryo-EM Maps
Stealing and Defending Transformer-based Encoders
VectorMapNet: End-to-end Vectorized HD Map Learning
An information-theoretic approach to unsupervised keypoint representation learning
Distilling Cognitive Backdoor within an Image
Formulating and Proving the Trend of DNNs Learning Simple Concepts
Curriculum Reinforcement Learning via Morphology-Environment Co-Evolution
Domain Generalization with Small Data
3D generation on ImageNet
Revocable Deep Reinforcement Learning with Affinity Regularization for Outlier-Robust Graph Matching
Hierarchical Prompting Improves Visual Recognition On Accuracy, Data Efficiency and Explainability
Convergence of the mini-batch SIHT algorithm
Decomposing Texture and Semantics for Out-of-distribution Detection
Selective Classification Via Neural Network Training Dynamics
Rethinking the Expressive Power of GNNs via Graph Biconnectivity
One Transformer Can Understand Both 2D & 3D Molecular Data
Hyperbolic Binary Neural Network
Generating Diverse Cooperative Agents by Learning Incompatible Policies
Mind the Gap: Offline Policy Optimization for Imperfect Rewards
Time Series are Images: Vision Transformer for Irregularly Sampled Time Series
Gamma Sampling: Fine-grained Controlling Language Models without Training
Token-Label Alignment for Vision Transformers
3D-Scene-Entities: Using Phrase-to-3D-Object Correspondences for Richer Visio-Linguistic Models in 3D Scenes
Label Distribution Learning via Implicit Distribution Representation
MultiWave: Multiresolution Deep Architectures through Wavelet Decomposition for Multivariate Timeseries Forecasting and Prediction
Learning to Compose Soft Prompts for Compositional Zero-Shot Learning
SQA3D: Situated Question Answering in 3D Scenes
The Benefits of Model-Based Generalization in Reinforcement Learning
Revisiting Higher-Order Gradient Methods for Multi-Agent Reinforcement Learning
SWORD: Demystify the Secrets of Open-world Instance Recognition
Efficient Covariance Estimation for Sparsified Functional Data
Sparse Mixture-of-Experts are Domain Generalizable Learners
Structure-Sensitive Graph Dictionary Embedding for Graph Classification
FlexPose: Pose Distribution Adaptation with Few-shot Guidance
PEER: A Collaborative Language Model
Guide Detectors in Pixel Space with Global Positioning and Abductive Matching
A simple but effective and efficient global modeling paradigm for image restoration
Contrastive Continuity on Augmentation Stability Rehearsal for Continual Self-Supervised Learning
Empowering Networks With Scale and Rotation Equivariance Using A Similarity Convolution
Uncertainty Calibration via Knowledge Flow under Long-tailed Distribution
$1\times1$ Convolution is All You Need for Image Super-Resolution
ISS: Image as Stepping Stone for Text-Guided 3D Shape Generation
Robust and Controllable Object-Centric Learning through Energy-based Models
Learning Antidote Data to Individual Unfairness
Does Continual Learning Equally Forget All Parameters?
Voting from Nearest Tasks: Meta-Vote Pruning of Pretrained Models for Downstream Tasks
AdPE: Adversarial Positional Embeddings for Pretraining Vision Transformers via MAE+
STREET: A MULTI-TASK STRUCTURED REASONING AND EXPLANATION BENCHMARK
Topology-aware robust optimization
Exploring Neural Network Representational Similarity using Filter Subspaces
A Close Look at Token Mixer: From Attention to Convolution
EAGLE: Large-scale Learning of Turbulent Fluid Dynamics with Mesh Transformers
Momentum in Momentum for Adaptive Optimization
Limitless Stability for Graph Convolutional Networks 
MiSAL: Active Learning for Every Budget
SOM-CPC: Unsupervised Contrastive Learning with Self-Organizing Maps for Structured Representations of High-Rate Time Series
DIVISION: Memory Efficient Training via Dual Activation Precision
Lossless Dataset Compression Via Dataset Quantization
CLIP-PAE: Projection-Augmentation Embedding to Extract Relevant Features for a Disentangled, Interpretable and Controllable Text-Guided Image Manipulation
NICO++: Towards Better Benchmarking for Domain Generalization
Gradient Norm Regularizer Seeks Flat Minima and Improves Generalization
Token Merging: Your ViT But Faster
TiDAL: Learning Training Dynamics for Active Learning
CompletionFormer: Depth Completion with Convolutions and Vision Transformers
Provable Adaptivity in Adam
MS3: A Multimodal Supervised Pretrained Model for Semantic Segmentation
An Analysis of Information Bottlenecks
De Novo Molecular Generation via Connection-aware Motif Mining
Multiplane NeRF-Supervised Disentanglement of Depth and Camera Pose from Videos
Shared Knowledge Lifelong Learning
GANet: Graph-Aware Network for Point Cloud Completion with Displacement-Aware Point Augmentor
Multiple output samples for each input in a single-output Gaussian process
Demystifying the Optimization and Generalization of Deep PAC-Bayesian Learning
WeightRelay: Efficient Heterogenous Federated Learning on Time Series
Revisiting the Entropy Semiring for Neural Speech Recognition
Rethinking skip connection model as a learnable Markov chain
Activation Function: Absolute Function, One Function Behaves more Individualized
ImageNet-E: Benchmarking Neural Network Robustness via Attribute Editing
Measuring axiomatic identifiability of counterfactual image models
Alternating Differentiation for Optimization Layers
Cross-Domain Autonomous Driving Perception using Contrastive Appearance Adaptation
Out-of-distribution Detection with Implicit Outlier Transformation
Parameter Averaging for Feature Ranking
Gradient Estimation for Unseen Domain Risk Minimization with Pre-Trained Models
Nearing or Surpassing: Overall Evaluation of Human-Machine Dynamic Vision Ability
Re-balancing Adversarial Training Over Unbalanced Datasets
Extracting Robust Models with Uncertain Examples
Neural Groundplans: Persistent Neural Scene Representations from a Single Image
Unified Vision and Language Prompt Learning
Semi-supervised Counting via Pixel-by-pixel Density Distribution Modelling
Calibrating Multimodal Learning
Understanding Self-Supervised Pretraining with Part-Aware Representation Learning
E-CRF: Embedded Conditional Random Field for Boundary-caused Class Weights Confusion in Semantic Segmentation
Sample Complexity of Nonparametric Off-Policy Evaluation on Low-Dimensional Manifolds using Deep Networks
Stochastic Differentially Private and Fair Learning
CLIP-FLOW: CONTRASTIVE LEARNING WITH ITERATIVE PSEUDO LABELING FOR OPTICAL FLOW
Smooth-Reduce: Leveraging Patches for Improved Certified Robustness
CAN: A simple, efficient and scalable contrastive masked autoencoder framework for learning visual representations
On The Inadequacy of Optimizing Alignment and Uniformity in Contrastive Learning of Sentence Representations
Self-supervised Video Representation Learning with Motion-Aware Masked Autoencoders
Bidirectional Learning for Offline Model-based Biological Sequence Design
Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class-Incremental Learning
Self-conditioned Embedding Diffusion for Text Generation
Decoupling Concept Bottleneck Model
OhMG: Zero-shot Open-vocabulary Human Motion Generation
AQUILA: Communication Efficient Federated Learning with Adaptive Quantization of Lazily-Aggregated Gradients
Token Turing Machines
Generalizing Multimodal Variational Methods to Sets
Towards a Unified View on Visual Parameter-Efficient Transfer Learning
Everyone's Preference Changes Differently: Weighted Multi-Interest Retrieval Model
Variational Autoencoders with Decremental Information Bottleneck for Disentanglement
Volumetric Optimal Transportation by Fast Fourier Transform
GFlowNets and variational inference
Neural Networks and the Chomsky Hierarchy
DeepSAT: An EDA-Driven Learning Framework for SAT
Neural ePDOs: Spatially Adaptive Equivariant Partial Differential Operator Based Networks
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
Cutting Long Gradient Flows: Decoupling End-to-End Backpropagation Based on Supervised Contrastive Learning
Hierarchical Relational Learning for Few-Shot Knowledge Graph Completion
Learn to Know Unknowns: A Bionic Memory Network for Unsupervised Anomaly Detection
Function-Consistent Feature Distillation
Multi-User Reinforcement Learning with Low Rank Rewards
Domain Specific Denoising Diffusion Probabilistic Models for Brain Dynamics
The Devil is in the Wrongly-classified Samples: Towards Unified Open-set Recognition
Approximated Anomalous Diffusion: Gaussian Mixture Score-based Generative Models
MCAL: Minimum Cost Human-Machine Active Labeling
BoxTeacher: Exploring High-Quality Pseudo Labels for Weakly Supervised Instance Segmentation
A Simple and Provable Method to Adapt Pre-trained Model across Domains with Few Samples
CD-Depth: Unsupervised Domain Adaptation for Depth Estimation via Cross Domain Integration
SegNeRF: 3D Part Segmentation with Neural Radiance Fields
EyeDAS: Securing Perception of Autonomous Cars Against the Stereoblindness Syndrome
Learnable Topological Features For Phylogenetic Inference via Graph Neural Networks
HRDFuse: Monocular 360$^\circ$ Depth Estimation by Collaboratively Learning Holistic-with-Regional Depth Distributions
SpQAT: A Sparse Quantization-Aware Training Method
Double dynamic sparse training for GANs
Bayesian Robust Graph Contrastive Learning
Hardware-restriction-aware training (HRAT) for memristor neural networks
FreeSeg: Free Mask from Interpretable Contrastive Language-Image Pretraining for Semantic Segmentation
DifFace: Blind Face Restoration with Diffused Error Contraction
ViTKD: Practical Guidelines for ViT Feature Knowledge Distillation
Fairness-aware Contrastive Learning with Partially Annotated Sensitive Attributes
Training Instability and Disharmony Between ReLU and Batch Normalization
Rotamer Density Estimators are Unsupervised Learners of the Effect of Mutations on Protein-Protein Interaction
Faster Neural Architecture "Search" for Deep Image Prior
Dilated convolution with learnable spacings
PatchDCT: Patch Refinement for High Quality Instance Segmentation
Global Prototype Encoding for Incremental Video Highlights Detection
WaGI: Wavelet-based GAN Inversion for Preserving High-Frequency Image Details
Neural-Symbolic Recursive Machine for Systematic Generalization
ChiroDiff: Modelling chirographic data with Diffusion Models
Object Localization helps Action Recognition Models Adapt to New Environments
Active Topological Mapping by Metric-Free Exploration via Task and Motion Imitation
SoundCount: Sound Counting from Raw Audio with Dyadic Decomposition Neural Network
SoundNeRirF: Receiver-to-Receiver Sound Neural Room Impulse Response Field
Towards Sustainable Self-supervised Learning
Real-Time Image Demoiréing on Mobile Devices
Domain Generalization via Independent Regularization from Early-branching Networks
AutoSKDBERT: Learn to Stochastically Distill BERT
QCRS: Improve Randomized Smoothing using Quasi-Concave Optimization
Training A Multi-stage Deep Classifier with Feedback Signals
Is Self-Supervised Contrastive Learning More Robust Than Supervised Learning?
An Empirical Study of Metrics to Measure Representational Harms in Pre-Trained Language Models
Unsupervised Learning of Causal Relationships from Unstructured Data
The Biased Artist: Exploiting Cultural Biases via Homoglyphs in Text-Guided Image Generation Models
Parameterized projected Bellman operator
Module-wise Training of Residual Networks via the Minimizing Movement Scheme
Cross-Level Distillation and Feature Denoising for Cross-Domain Few-Shot Classification
kaBEDONN: posthoc eXplainable Artificial Intelligence with Data Ordered Neural Network
DELTA: DEBIASED FULLY TEST-TIME ADAPTATION
Bit-Pruning: A Sparse Multiplication-Less Dot-Product
Abstract-to-Executable Trajectory Translation for One-Shot Task Generalization
Unveiling The Mask of Position-Information Pattern Through the Mist of Image Features
KNN-Diffusion: Image Generation via Large-Scale Retrieval
Steering Prototypes with Prompt Tuning for Rehearsal-free Continual Learning
Normalized Activation Function: Toward Better Convergence
IS SYNTHETIC DATA FROM GENERATIVE MODELS READY FOR IMAGE RECOGNITION?
Learnable Behavior Control: Breaking Atari Human World Records via Sample-Efficient Behavior Selection
Decompose to Generalize: Species-Generalized Animal Pose Estimation
Correcting the Sub-optimal Bit Allocation
IDEAL: Query-Efficient Data-Free Learning from Black-Box Models
MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction
(LA)YER-NEIGH(BOR) SAMPLING: DEFUSING NEIGHBORHOOD EXPLOSION
Probing into Overfitting for Video Recognition
Image as Set of Points
Examining the Value of Neural Filter Pruning -- Retrospect and Prospect
Sparse Misinformation Detector
Hybrid Neuro-Symbolic Reasoning based on Multimodal Fusion
Distilling Text-Image Foundation Models
Trainability Preserving Neural Pruning
Rotation Invariant Quantization for Model Compression
Robustness Exploration of Semantic Information in Adversarial Training
Learning Implicit Scale Conditioned Memory Compensation for Talking Head Generation
On the Dynamics under the Averaged Sample Margin Loss and Beyond
ClusTR: Exploring Efficient Self-attention via Clustering for Vision Transformers
DrML: Diagnosing and Rectifying Vision Models using Language
Semantic Grouping Network for Audio Source Separation
Neural Shape Compiler: A Unified Framework for Transforming between Text, Point Cloud, and Program
Improving Corruption Robustness with Adversarial Feature Alignment Transformers
Sharpness-aware Quantization for Deep Neural Networks
Robust Generalization against Corruptions via Worst-Case Sharpness Minimization
Harnessing Out-Of-Distribution Examples via Augmenting Content and Style
On Stability and Generalization of Bilevel Optimization Problems
Learning GFlowNets from partial episodes for improved convergence and stability
DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training
Self-attentive Rationalization for Graph Contrastive Learning
A Unified Framework of Soft Threshold Pruning
Efficient Automatic Machine Learning via Design Graphs
TaskPrompter: Spatial-Channel Multi-Task Prompting for Dense Scene Understanding
Optimizing Server-side Aggregation For Robust Federated Learning via Subspace Training
Measuring Asymmetric Gradient Discrepancy in Parallel Continual Learning
Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent
CI-VAE: a Class-Informed Deep Variational Autoencoder for Enhanced Class-Specific Data Interpolation
Attention De-sparsification Matters: Inducing Diversity in Digital Pathology Representation Learning
Learning Domain-Agnostic Representation for Disease Diagnosis
DOTIN: Dropping Out Task-Irrelevant Nodes for GNNs
Boosting Out-of-Distribution Detection with Multiple Pre-trained Models 
Minimax Optimal Kernel Operator Learning via Multilevel Training
STViT: Semantic Tokens for Efficient Global and Local Vision Transformers
Learning a 3D-Aware Encoder for Style-based Generative Radiance Field
MixQuant: A Quantization Bit-width Search that Can Optimize the Performance of your Quantization Method
Logical Entity Representation in Knowledge-Graphs for Differentiable Rule Learning
S-SOLVER: Numerically Stable Adaptive Step Size Solver for Neural ODEs
TT-NF: Tensor Train Neural Fields
Partial transportability for domain generalization
CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-training
Feint in Multi-Player Games
Succinct Compression: Lossless Compression for Fast and Memory-Efficient Deep Neural Network Inference
BEVDistill: Cross-Modal BEV Distillation for Multi-View 3D Object Detection
Expanding Datasets With Guided Imagination
ThinkSum: Probabilistic reasoning over sets using large language models
Universal Unlearnable Examples: Cluster-wise Perturbations without Label-consistency
Confidence and Dispersity Speak: Characterising Prediction Matrix for Unsupervised Accuracy Estimation
On the Calibration Set Difficulty and Out-of-distribution Calibration
Design of the topology for contrastive visual-textual alignment
Slimmable Networks for Contrastive Self-supervised Learning
Interpretable Single/Multi-label Text Classification with Unsupervised Constituent-label alignments
Suppressing the Heterogeneity: A Strong Feature Extractor for Few-shot Segmentation
Defactorization Transformer: Modeling Long Range Dependency with Local Window Cost
MaPLe: Multi-modal Prompt Learning
Communication Efficient Fair Federated Recommender System
Grassmannian Class Representation in Deep Learning
Refining Visual Representation for Generalized Zero-Shot Recognition through Implicit-Semantics-Guided Metric Learning
Reward Learning with Trees: Methods and Evaluation
Achieve the Minimum Width of Neural Networks for Universal Approximation
H2RBox: Horizontal Box Annotation is All You Need for Oriented Object Detection
Sparse and Hierarchical Masked Modeling for Convolutional Representation Learning
Functional Relation Field: A Model-Agnostic Framework for Multivariate Time Series Forecasting
Motion-inductive Self-supervised Object Discovery in Videos
Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment
Transcendental Idealism of Planner: Evaluating Perception from Planning Perspective for Autonomous Driving
Pushing the Limits of Fewshot Anomaly Detection in Industry Vision: Graphcore
Evaluating Weakly Supervised Object Localization Methods Right? A Study on Heatmap-based XAI and Neural Backed Decision Tree
HyperFeel: An Efficient Federated Learning Framework Using Hyperdimensional Computing
TEAS: Exploiting Spiking Activity for Temporal-wise Adaptive Spiking Neural Networks
Quasi-Conservative Score-based Generative Models
Multi-Modal Few-Shot Temporal Action Detection
Representation Learning for Low-rank General-sum Markov Games
Multi-Domain Long-Tailed Learning by Augmenting Disentangled Representations
Surgical Fine-Tuning Improves Adaptation to Distribution Shifts
Diversify and Disambiguate: Out-of-Distribution Robustness via Disagreement
MaSS: Multi-attribute Selective Suppression
Mimic before Reconstruct: Enhance Masked Autoencoders with Feature Mimicking
Neural Attention Memory
Meta Optimal Transport
On amortizing convex conjugates for optimal transport
Exploring Visual Interpretability for Contrastive Language-Image Pretraining
Example-based Planning via Dual Gradient Fields
DualAfford: Learning Collaborative Visual Affordance for Dual-gripper Manipulation
GraphCG: Unsupervised Discovery of Steerable Factors in Graphs
Molecular Geometry Pretraining with SE(3)-Invariant Denoising Distance Matching
SIMPLE: Specialized Model-Sample Matching for Domain Generalization
The Augmented Image Prior: Distilling 1000 Classes by Extrapolating from a Single Image
Trust-consistent Visual Semantic Embedding for Image-Text Matching
Rethinking Knowledge Distillation via Cross-Entropy
Protein structure generation via folding diffusion
Backpropagation Path Search On Adversarial Transferability
Delving into Semantic Scale Imbalance
Masked Surfel Prediction for Self-Supervised Point Cloud Learning
Do Spiking Neural Networks Learn Similar Representation with Artificial Neural Networks? A Pilot Study on SNN Representation
DAG Matters! GFlowNets Enhanced Explainer for Graph Neural Networks
Generalized Category Discovery via Adaptive GMMs without Knowing the Class Number
A Multi-Scale Structure-Preserving Heterologous Image Transformation Algorithm Based on Conditional Adversarial Network Learning
Metro: Memory-Enhanced Transformer for Retrosynthetic Planning via Reaction Tree
In the ZONE: Measuring difficulty and progression in curriculum generation
Object Tracking by Hierarchical Part-Whole Attention
Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking
scFormer: a universal representation learning approach for single-cell data using transformers
Understanding the Training Dynamics in Federated Deep Learning via Aggregation Weight Optimization
BiBench: Benchmarking and Analyzing Network Binarization
Contextual Image Masking Modeling via Synergized Contrasting without View Augmentation for Faster and Better Visual Pretraining
Patch-Level Contrasting without Patch Correspondence for Accurate and Dense Contrastive Representation Learning
Winograd Structured Pruning for Fast Winograd Convolution
Continuous-Discrete Convolution for (3+1)D Geometry-Sequence Modeling in Proteins
ELODI: Ensemble Logit Difference Inhibition for Positive-Congruent Training
Model-agnostic Measure of Generalization Difficulty
Efficient Multi-Task Reinforcement Learning via Selective Behavior Sharing
Efficient Exploration via Fragmentation and Recall
Hedge Your Actions: Flexible Reinforcement Learning for Complex Action Spaces
Learning Geometric Representations of Interactive Objects
What learning algorithm is in-context learning? Investigations with linear models
Towards a Unified Theoretical Understanding of Non-contrastive Learning via Rank Differential Mechanism
End-to-End Speech Synthesis Based on Deep Conditional Schrödinger Bridges
SGDA with shuffling: faster convergence for nonconvex-PŁ minimax optimization
NÜWA-LIP: Language-guided Image Inpainting with Defect-free VQGAN
Schrödinger's FP: Training Neural Networks with Dynamic Floating-Point Containers
How Erdös and Rényi Win the Lottery
On the Lower Bound of Minimizing Polyak-Łojasiewicz functions