Quantum reinforcement learning
Quantifying and Mitigating the Impact of Label Errors on Model Disparity Metrics
Suppression helps: Lateral Inhibition-inspired Convolutional Neural Network for Image Classification
Factorized Fourier Neural Operators
DFPC: Data flow driven pruning of coupled channels without data.
TVSPrune - Pruning Non-discriminative filters via Total Variation separability of intermediate representations without fine tuning
Adversarial Training descends without descent: Finding actual descent directions based on Danskin's theorem
A Study of Biologically Plausible Neural Network: the Role and Interactions of Brain-Inspired Mechanisms in Continual Learning
Learning Continuous Normalizing Flows For Faster Convergence To Target Distribution via Ascent Regularizations
pFedKT: Personalized Federated Learning via Knowledge Transfer
FARE: Provably Fair Representation Learning
ONLINE RESTLESS BANDITS WITH UNOBSERVED STATES
Dual-Domain Diffusion Based Progressive Style Rendering towards Semantic Structure Preservation
UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers
Learning to aggregate: A parameterized aggregator to debias aggregation for cross-device federated learning
NeuralStagger: accelerating physics constrained neural PDE solver with spatial-temporal decomposition
Towards Robust Online Dialogue Response Generation
Deep Reinforcement Learning based Insight Selection Policy
Data Leakage in Tabular Federated Learning
Long-horizon video prediction using a dynamic latent hierarchy
SwinZS3: Zero-Shot Semantic Segmentation with a Swin Transformer
Softened Symbol Grounding for Neuro-symbolic Systems
Encoding Recurrence into Transformers
Generating Intuitive Fairness Specifications for Natural Language Processing
Learning to Perturb for Contrastive Learning of Unsupervised Sentence Representations
Proper Scoring Rules for Survival Analysis
Social Network Structure Shapes Innovation: Experience-sharing in RL with SAPIENS
Mini-batch $k$-means terminates within $O(d/\epsilon)$ iterations
Convergence is Not Enough: Average-Case Performance of No-Regret Learning Dynamics
Gene finding revisited: improved robustness through structured decoding from learning embeddings
PPAT: Progressive Graph Pairwise Attention Network for Event Causality Identification
Learning Uncertainty for Unknown Domains with Zero-Target-Assumption
Detecting Out-of-Distribution Data with Semi-supervised Graph "Feature" Networks
Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flow
Towards a Complete Theory of Neural Networks with Few Neurons
Machine Learning from Explanations
Functional Risk Minimization
Latent Linear ODEs with Neural Kalman Filtering for Irregular Time Series Forecasting
Transformer-based model for symbolic regression via joint supervised learning
Gradient-Based Transfer Learning
Coreset for Rational Functions
Joint Representations of Text and Knowledge Graphs for Retrieval and Evaluation
Transformer needs NMDA receptor nonlinearity for long-term memory
Simple Spectral Graph Convolution from an Optimization Perspective
QAID: Question Answering Inspired Few-shot Intent Detection
Rethinking the Value of Prompt Learning for Vision-Language Models
Partial Output Norm: Mitigating the Model Output Blow-up Effect of Cross Entropy Loss
Disentangled Feature Swapping Augmentation for Weakly Supervised Semantic Segmentation
FLOP: Tasks for Fitness Landscapes Of Protein families using sequence- and structure-based representations
Distributed Least Square Ranking with Random Features
Doing Fast Adaptation Fast: Conditionally Independent Deep Ensembles for Distribution Shifts
Solving stochastic weak Minty variational inequalities without increasing batch size
Diversity Boosted Learning for Domain Generalization with a Large Number of Domains
A Hybrid Framework for Generating A Country-scale Synthetic Population
Towards Performance-maximizing Network Pruning via Global Channel Attention
Adaptive Block-wise Learning for Knowledge Distillation
Curriculum-based Co-design of Morphology and Control of Voxel-based Soft Robots
Object-Centric Learning with Slot Mixture Models
WiNeRT: Towards Neural Ray Tracing for Wireless Channel Modelling and Differentiable Simulations
Pocket-specific 3D Molecule Generation by Fragment-based Autoregressive Diffusion Models
Learning with Non-Uniform Label Noise: A Cluster-Dependent Semi-Supervised Approach
Towards scalable and non-IID robust Hierarchical Federated Learning via Label-driven Knowledge Aggregator
Free Bits: Platform-Aware Latency Optimization of Mixed-Precision Neural Networks for Edge Deployment
LS-IQ: Implicit Reward Regularization for Inverse Reinforcement Learning
On the Certification of Classifiers for Outperforming Human Annotators
Loss Adapted Plasticity: Learning From Data With Unreliable Sources
Share Your Representation Only: Guaranteed Improvement of the Privacy-Utility Tradeoff in Federated Learning
Quantized Disentangled Representations for Object-Centric Visual Tasks
Supervised Random Feature Regression via Projection Pursuit
Graph Spline Networks for Efficient Continuous Simulation of Dynamical Systems
Online black-box adaptation to label-shift in the presence of conditional-shift
RuDar: Weather Radar Dataset for Precipitation Nowcasting with Geographical and Seasonal Variability
Learning Representations for Reinforcement Learning with Hierarchical Forward Models
xTrimoABFold: Improving Antibody Structure Prediction without Multiple Sequence Alignments
Thresholded Lexicographic Ordered Multi-Objective Reinforcement Learning
HOW SAMPLING AFFECTS TRAINING: AN EFFECTIVE SAMPLING THEORY STUDY FOR LONG-TAILED IMAGE CLASSIFICATION
MolBART: Generative Masked Language Models for Molecular Representations
EquiMod: An Equivariance Module to Improve Self-Supervised Learning
Cross-utterance Conditioned Coherent Speech Editing via Biased Training and Entire Inference
Manipulating Multi-agent Navigation Task via Emergent Communications
Task-Aware Information Routing from Common Representation Space in Lifelong Learning
CodeBPE: Investigating Subtokenization Options for Large Language Model Pretraining on Source Code
SWRM: Similarity Window Reweighting and Margins for Long-Tailed Recognition
Transport with Support: Data-Conditional Diffusion Bridges
Supervised Q-Learning can be a Strong Baseline for Continuous Control
Randomized Sharpness-Aware Training for Boosting Computational Efficiency in Deep Learning
Self-Supervised Off-Policy Ranking via Crowd Layer
Probing for Correlations of Causal Facts: Large Language Models and Causality
Geometry Problem Solving based on Counterfactual Evolutionary Reasoning
Few-Shot Domain Adaptation For End-to-End Communication
HyPHEN: A Hybrid Packing Method and Optimizations for Homomorphic Encryption-Based Neural Network
Causal Inference for Knowledge Graph Completion
Formal Specifications from Natural Language
DELTA: Diverse Client Sampling for Fasting Federated Learning
Incremental Predictive Coding: A Parallel and Fully Automatic Learning Algorithm
Rethinking Metric Based Contrastive Learning Method's Generalization Capability
RISC-V MICROARCHITECTURE EXPLORATION VIA REINFORCEMENT LEARNING
Improve distance metric learning by learning positions of class centers
The guide and the explorer: smart agents for resource-limited iterated batch reinforcement learning
FairGBM: Gradient Boosting with Fairness Constraints
Kinship Representation Learning with Face Componential Relation
Pseudo-Differential Integral Operator for Learning Solution Operators of Partial Differential Equations
How (Un)Fair is Text Summarization?
Simulating Task-Free Continual Learning Streams From Existing Datasets
Online Bias Correction for Task-Free Continual Learning
A Simple Contrastive Learning Objective for Alleviating Neural Text Degeneration
Enriching Online Knowledge Distillation with Specialist Ensemble
Improved Training of Physics-Informed Neural Networks with Model Ensembles
Improved Gradient Descent Optimization Algorithm based on Inverse Model-Parameter Difference
Variational Learning ISTA
Moment Distributionally Robust Probabilistic Supervised Learning
CLEP: Exploiting Edge Partitioning for Graph Contrastive Learning
Meta-Learning the Inductive Biases of Simple Neural Circuits
Enabling Equation Learning with the Bayesian Model Evidence via systematic $R^2$-elimination
Curvature Informed Furthest Point Sampling
Accelerating spiking neural network training using the $d$-block model
RG: OUT-OF-DISTRIBUTION DETECTION WITH REACTIVATE GRADNORM
Don't fear the unlabelled: safe semi-supervised learning via debiasing
Gandalf : Data Augmentation is all you need for Extreme Classification
Learning a Data-Driven Policy Network for Pre-Training Automated Feature Engineering
Attention Flows for General Transformers
Grounded Contrastive Learning for Open-world Semantic Segmentation
Making Substitute Models More Bayesian Can Enhance Transferability of Adversarial Examples
Learning Group Importance using the Differentiable Hypergeometric Distribution
Convergence Rate of Primal-Dual Approach to Constrained Reinforcement Learning with Softmax Policy
Cross-Layer Retrospective Retrieving via Layer Attention
RephraseTTS: Dynamic Length Text based Speech Insertion with Speaker Style Transfer
Decision S4: Efficient Sequence-Based RL via State Spaces Layers
Deep autoregressive density nets vs neural ensembles for model-based offline reinforcement learning
Light and Accurate: Neural Architecture Search via Two Constant Shared Weights Initialisations
Unveiling the sampling density in non-uniform geometric graphs
Smooth image-to-image translations with latent space interpolations
Boosting Causal Discovery via Adaptive Sample Reweighting
Robust Training through Adversarially Selected Data Subsets
Beyond Reward: Offline Preference-guided Policy Optimization
Iterative Circuit Repair Against Formal Specifications
Neural Probabilistic Logic Programming in Discrete-Continuous Domains
Can BERT Refrain from Forgetting on Sequential Tasks? A Probing Study
Behavior Proximal Policy Optimization
UiTTa: Online Test-Time Adaptation by User Interaction
FedGC: An Accurate and Efficient Federated Learning under Gradient Constraint for Heterogeneous Data
Actionable Neural Representations: Grid Cells from Minimal Constraints
xTrimoDock: Cross-Modal Transformer for Multi-Chain Protein Docking
Compression-aware Training of Neural Networks using Frank-Wolfe
Modeling content creator incentives on algorithm-curated platforms
MBrain: A Multi-channel Self-Supervised Learning Framework for Brain Signals
Group-Disentangling Conditional Shift
When and Why Is Pretraining Object-Centric Representations Good for Reinforcement Learning?
Face reconstruction from facial templates by learning latent space of a generator network
Mole-BERT: Rethinking Pre-training Graph Neural Networks for Molecules
A sparse, fast, and stable representation for multiparameter topological data analysis
What's in a name? The Influence of Personal Names on Spatial Reasoning in BLOOM Large Language Models
Contrastive Representation Learning for Multi-scale Spatial Scenes
Improving Protein Interaction Prediction using Pretrained Structure Embedding
Batch Normalization and Bounded Activation Functions
Versatile Energy-Based Models for High Energy Physics
MEDOE: A Multi-Expert Decoder and Output Ensemble Framework for Long-tailed Semantic Segmentation
Concept-level Debugging of Part-Prototype Networks
Geometrically regularized autoencoders for non-Euclidean data
Model-based Unknown Input Estimation via Partially Observable Markov Decision Processes
TransFool: An Adversarial Attack against Neural Machine Translation Models
Protein Sequence Design in a Latent Space via Model-based Reinforcement Learning
Breaking Large Language Model-based Code Generation
The GANfather: Controllable generation of malicious activity to expose detection weaknesses and improve defence systems.
Proximal Validation Protocol
A Message Passing Perspective on Learning Dynamics of Contrastive Learning
Farsighter: Efficient Multi-step Exploration for Deep Reinforcement Learning
Help Me Explore: Combining Autotelic and Social Learning via Active Goal Queries
AUTOMATIC CURRICULUM FOR UNSUPERVISED REINFORCEMENT LEARNING
Exploiting Personalized Invariance for Better Out-of-distribution Generalization in Federated Learning
Filtered Semi-Markov CRF
Zeroth-Order Optimization with Trajectory-Informed Derivative Estimation
Distance VS. Coordinate: Distance Based Embedding Improves Model Generalization for Routing Problems
Towards biologically plausible Dreaming and Planning
Mixture of Basis for Interpretable Continual Learning with Distribution Shifts
Extracting Meaningful Attention on Source Code: An Empirical Study of Developer and Neural Model Code Exploration
Denoising Differential Privacy in Split Learning
Neuroevolution is a Competitive Alternative to Reinforcement Learning for Skill Discovery
On Representation Learning in the First Layer of Deep CNNs and the Dynamics of Gradient Descent
Learning Layered Implicit Model for 3D Avatar Clothing Representation
Scrunch: Preventing sensitive property inference through privacy-preserving representation learning
Uniform-in-time propagation of chaos for the mean field gradient Langevin dynamics
GM-VAE: Representation Learning with VAE on Gaussian Manifold
Improving Adversarial Robustness by Putting More Regularizations on Less Robust Samples
Generalizable Multi-Relational Graph Representation Learning: A Message Intervention Approach
Causal Explanations of Structural Causal Models
Asynchronous Distributed Bilevel Optimization
Multi-Agent Reinforcement Learning with Shared Resources for Inventory Management
Confidence-Based Feature Imputation for Graphs with Partially Known Features
Explicitly Maintaining Diverse Playing Styles in Self-Play
Toward Learning Geometric Eigen-Lengths Crucial for Robotic Fitting Tasks
Text2Model: Model Induction for Zero-shot Generalization Using Task Descriptions
LiftedCL: Lifting Contrastive Learning for Human-Centric Perception
Individual Privacy Accounting with Gaussian Differential Privacy
Evolving Populations of Diverse RL Agents with MAP-Elites
Deconfounded Noisy Labels Learning
Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data
Learning Test Time Augmentation with Cascade Loss Prediction
Adaptive Computation with Elastic Input Sequence
Opportunistic Actor-Critic (OPAC) with Clipped Triple Q-learning
Optimizing Data-Flow in Binary Neural Networks
Gray-Box Gaussian Processes for Automated Reinforcement Learning
Protein Sequence and Structure Co-Design with Equivariant Translation
Deep Equilibrium Non-Autoregressive Sequence Learning
PTUnifier: Pseudo Tokens as Paradigm Unifiers in Medical Vision-and-Language Pre-training
SGD Through the Lens of Kolmogorov Complexity
Offline imitation learning by controlling the effective planning horizon
Learning in temporally structured environments
Identifying Phase Transition Thresholds of Permuted Linear Regression via Message Passing
RandProx: Primal-Dual Optimization Algorithms with Randomized Proximal Updates
Improving the Calibration of Fine-tuned Language Models via Denoising Variational Auto-Encoders
A Hierarchical Bayesian Approach to Federated Learning
Neural Representations in Multi-Task Learning guided by Task-Dependent Contexts
MCTransformer: Combining Transformers And Monte-Carlo Tree Search For Offline Reinforcement Learning
One-Step Estimator for Permuted Sparse Recovery
Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?
Guarded Policy Optimization with Imperfect Online Demonstrations
Fast Nonlinear Vector Quantile Regression
Multi Task Learning of Different Class Label Representations for Stronger Models
On the Existence of a Trojaned Twin Model
On Information Maximisation in Multi-View Self-Supervised Learning
Leveraging Large Language Models for Multiple Choice Question Answering
SELCOR: Self-Correction for Weakly Supervised Learning
Efficiently Meta-Learning for Robust Deep Networks without Prior Unbiased Set
Learning with Logical Constraints but without Shortcut Satisfaction
Certified Training: Small Boxes are All You Need
Label Similarity Aware Contrastive Learning
Counterfactual Generation Under Confounding
Regression with Label Differential Privacy
Hierarchical Abstraction for Combinatorial Generalization in Object Rearrangement
SRBGCN: Tangent space-Free Lorentz Transformations for Graph Feature Learning
Transfer NAS with Meta-learned Bayesian Surrogates
Mitigating the Limitations of Multimodal VAEs with Coordination-Based Approach
Incompatibility between Deterministic Policy and Generative Adversarial Imitation Learning
FiD-Light: Efficient and Effective Retrieval-Augmented Text Generation
Contrastive Learning of Molecular Representation with Fragmented Views
Theoretical Study of Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward
Sharp Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
Selective Frequency Network for Image Restoration
Contextualized Generative Retrieval
Mirror Training for Input Convex Neural Network
Scaling Up Probabilistic Circuits by Latent Variable Distillation
Oscillation Neural Ordinary Differential Equations
Improving Differentiable Neural Architecture Search by Encouraging Transferability
MA-BERT: Towards Matrix Arithmetic-only BERT Inference by Eliminating Complex Non-linear Functions
Automatically Answering and Generating Machine Learning Final Exams
CAT: Collaborative Adversarial Training
Efficient Certified Training and Robustness Verification of Neural ODEs
Arbitrary Virtual Try-On Network: Characteristics Representation and Trade-off between Body and Clothing
A Benchmark Dataset for Learning from Label Proportions
UL2: Unifying Language Learning Paradigms
Emergence of Exploration in Policy Gradient Reinforcement Learning via Resetting
CASR: Generating Complex Sequences with Autoregressive Self-Boost Refinement
SciRepEval: A Multi-Format Benchmark for Scientific Document Representations
On the convergence of SGD under the over-parameter setting
MASTER: Multi-task Pre-trained Bottlenecked Masked Autoencoders are Better Dense Retrievers
Offline Reinforcement Learning via Weighted $f$-divergence
Bitrate-Constrained DRO: Beyond Worst Case Robustness To Unknown Group Shifts
Some Practical Concerns and Solutions for Using Pretrained Representation in Industrial Systems
Exphormer: Scaling Graph Transformers with Expander Graphs
Generalization to translation shifts in object detection: a study in architectures and augmentations
Feature selection and low test error in shallow low-rotation ReLU networks
Backpropagation through Combinatorial Algorithms: Identity with Projection Works
Therbligs in Action: Video Understanding through Motion Primitives
On the Adversarial Robustness against Natural Weather Perturbations
Coupled Multiwavelet Operator Learning for Coupled Differential Equations
Don't Bet on Sparsity: Designing Brain-inspired Distance-preserving Encoder
Mid-Vision Feedback for Convolutional Neural Networks
Cross-Window Self-Training via Context Variations from Sparsely-Labeled Time Series
Revisiting and Improving FGSM Adversarial Training
Safe Reinforcement Learning From Pixels Using a Stochastic Latent Representation
TrojText: Test-time Invisible Textual Trojan Insertion
An Experiment Design Paradigm using Joint Feature Selection and Task Optimization
Multi-Objective Online Learning
Improved Training of Physics-Informed Neural Networks Using Energy-Based Priors: a Study on Electrical Impedance Tomography
Efficient Bayesian Optimization with Deep Kernel Learning and Transformer Pre-trained on Multiple Heterogeneous Datasets
Robustness Guarantees for Adversarially Trained Neural Networks
Fast-PINN for Complex Geometry: Solving PDEs with Boundary Connectivity Loss
Noise Transforms Feed-Forward Networks into Sparse Coding Networks
DEFENDING BACKDOOR ATTACKS VIA ROBUSTNESS AGAINST NOISY LABEL
A Kernel Perspective of Skip Connections in Convolutional Networks
SlothBomb: Efficiency Poisoning Attack against Dynamic Neural Networks
Ordered GNN: Ordering Message Passing to Deal with Heterophily and Over-smoothing
Sparse Distributed Memory is a Continual Learner
Optimistic Exploration in Reinforcement Learning Using Symbolic Model Estimates
FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning
Towards Automatic Generation of Advanced Shift Networks
Robust attributions require rethinking robustness metrics
Learned Nearest-Class-Mean for Biased Representations in Long-Tailed Recognition
GradientMix: A Simple yet Effective Regularization for Large Batch Training
UniMax: Fairer and More Effective Language Sampling for Large-Scale Multilingual Pretraining
Discrete State-Action Abstraction via the Successor Representation
Hyper-parameter Tuning for Fair Classification without Sensitive Attribute Access
Towards Learning Implicit Symbolic Representation for Visual Reasoning
GNNInterpreter: A Probabilistic Generative Model-Level Explanation for Graph Neural Networks
Intra-Instance VICReg: Bag of Self-Supervised Image Patch Embedding Explains the Performance
Rethinking Symbolic Regression: Morphology and Adaptability in the Context of Evolutionary Algorithms
Efficient, probabilistic analysis of combinatorial neural codes
On Pre-training Language Model for Antibody
Challenging Common Assumptions about Catastrophic Forgetting
Learning to reason over visual objects
Imitating Graph-Based Planning with Goal-Conditioned Policies
Prefer to Classify: Improving Text Classifier via Pair-wise Preference Learning
Seeing Differently, Acting Similarly: Heterogeneously Observable Imitation Learning
Simple and Deep Graph Attention Networks
A theoretical study of inductive biases in contrastive learning
Combinatorial Pure Exploration of Causal Bandits
How to fine-tune vision models with SGD
Computational Language Acquisition with Theory of Mind
Rényi Supervised Contrastive Learning for Transferable Representation
MiDAS: Multi-integrated Domain Adaptive Supervision for Fake News Detection
Walking the Tightrope: An Investigation of the Convolutional Autoencoder Bottleneck
A Closer Look at Model Adaptation using Feature Distortion and Simplicity Bias
Pareto Invariant Risk Minimization
Understanding and Adopting Rational Behavior by Bellman Score Estimation
Meta-Learning for Bootstrapping Medical Image Segmentation from Imperfect Supervision
L2B: Learning to Bootstrap for Combating Label Noise
What Makes Convolutional Models Great on Long Sequence Modeling?
Progressive Mixup Augmented Teacher-Student Learning for Unsupervised Domain Adaptation
M$^3$SAT: A Sparsely Activated Transformer for Efficient Multi-Task Learning from Multiple Modalities
Editing models with task arithmetic
Structured World Representations via Block-Slot Attention
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
Atomized Deep Learning Models
Topology Matters in Fair Graph Learning: a Theoretical Pilot Study
Context-Aware Image Completion
Speech denoising by listening to noise
Can Agents Run Relay Race with Strangers? Generalization of RL to Out-of-Distribution Trajectories
DYNAMIC BATCH NORM STATISTICS UPDATE FOR NATURAL ROBUSTNESS
SKTformer: A Skeleton Transformer for Long Sequence Data
CktGNN: Circuit Graph Neural Network for Electronic Design Automation
How Should I Plan? A Performance Comparison of Decision-Time vs. Background Planning
Substructure-Atom Cross Attention for Molecular Representation Learning
Differentially Private Algorithms for Smooth Nonconvex ERM
Untangling Effect and Side Effect: Consistent Causal Inference in Non-Targeted Trials
AMA: Asymptotic Midpoint Augmentation for Margin Balancing and Moderate Broadening
STUNT: Few-shot Tabular Learning with Self-generated Tasks from Unlabeled Tables
MEDIC: Model Backdoor Removal by Importance Driven Cloning
The Role of Pre-training Data in Transfer Learning
Compressed Predictive Information Coding
Importance of Class Selectivity in Early Epochs of Training
Mechanistic Mode Connectivity
CLASSIFICATION OF INCOMPLETE DATA USING AUGMENTED MLP
On the Convergence of Federated Deep AUC Maximization
Towards A Unified Neural Architecture for Visual Recognition and Reasoning
BLOOM Large Language Models and the Chomsky Hierarchy
WebBrain: Learning to Generate Factually Correct Articles for Queries by Grounding on Large Web Corpus
HloEnv: A Graph Rewrite Environment for Deep Learning Compiler Optimization Research
Towards Diverse Perspective Learning with Switch over Multiple Temporal Pooling
Deep Latent State Space Models for Time-Series Generation
Specformer: Spectral Graph Neural Networks Meet Transformers
MetaP: How to Transfer Your Knowledge on Learning Hidden Physics
CommsVAE: Learning the brain's macroscale communication dynamics using coupled sequential VAEs
Beyond the injective assumption in causal representation learning
Answer Me if You Can: Debiasing Video Question Answering via Answering Unanswerable Questions
Language Models Can (kind of) Reason: A Systematic Formal Analysis of Chain-of-Thought
Approximation ability of Transformer networks for functions with various smoothness of Besov spaces: error analysis and token extraction
Clustering Embedding Tables, Without First Learning Them
Architecture Matters in Continual Learning
Machine Learning Force Fields with Data Cost Aware Training
Covariance Matrix Adaptation MAP-Annealing
Learning Rewards and Skills to Follow Commands with a Data Efficient Visual-Audio Representation
Reinforcement Learning-Based Estimation for Partial Differential Equations
Heterogeneous-Agent Mirror Learning
ADELT: Unsupervised Transpilation Between Deep Learning Frameworks
Recursive Time Series Data Augmentation
Auto-Encoding Goodness of Fit
VER: Learning Natural Language Representations for Verbalizing Entities and Relations
Adaptive IMLE for Few-shot Image Synthesis
Understanding the Covariance Structure of Convolutional Filters
Reinforcement Logic Rule Learning for Temporal Point Processes
On Making Graph Continual Learning Easy, Fool-Proof, and Extensive: a Benchmark Framework and Scenarios
Masked Distillation with Receptive Tokens
Robust Multivariate Time-Series Forecasting: Adversarial Attacks and Defense Mechanisms
TextShield: Beyond Successfully Detecting Adversarial Sentences in NLP
Efficient Deep Reinforcement Learning Requires Regulating Statistical Overfitting
Nuisances via Negativa: Adjusting for Spurious Correlations via Data Augmentation
GNN Domain Adaptation using Optimal Transport
Ask Me Anything: A simple strategy for prompting language models
MixBin: Towards Budgeted Binarization
Limits of Algorithmic Stability for Distributional Generalization
WikiWhy: Answering and Explaining Cause-and-Effect Questions
Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient
Do We Really Need Graph Models for Skeleton-Based Action Recognition? A Topology-Agnostic Approach with Fully-Connected Networks
An Integrated Multi-Label Multi-Modal Framework in Deep Metric Learning
Proto-Value Networks: Scaling Representation Learning with Auxiliary Tasks
Conservative Exploration in Linear MDPs under Episode-wise Constraints
Pseudometric guided online query and update for offline reinforcement learning
Efficient Data Subset Selection to Generalize Training Across Models: Transductive and Inductive Networks
Probe Into Multi-agent Adversarial Reinforcement Learning through Mean-Field Optimal Control
Robust Algorithms on Adaptive Inputs from Bounded Adversaries
Chasing All-Round Graph Representation Robustness: Model, Training, and Optimization
Training Neural Networks with Low-Precision Model Memory
Raisin: Residual Algorithms for Versatile Offline Reinforcement Learning
VQR: Automated Software Vulnerability Repair Through Vulnerability Queries
Corruption-free Single-view Self-supervised Learning on Graphs
Fully Online Meta Learning
Learning Globally Smooth Functions on Manifolds
On Representing Mixed-Integer Linear Programs by Graph Neural Networks
LEARNING DYNAMIC ABSTRACT REPRESENTATIONS FOR SAMPLE-EFFICIENT REINFORCEMENT LEARNING
Fighting Fire with Fire: Contrastive Debiasing without Bias-free Data via Generative Bias-transformation
On Representing Linear Programs by Graph Neural Networks
On the Importance and Applicability of Pre-Training for Federated Learning
Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel
Autoregressive Graph Network for Learning Multi-step Physics
Simple initialization and parametrization of sinusoidal networks via their kernel bandwidth
Who are playing the games?
Quasiconvex Shallow Neural Network
The Best of Both Worlds: Accurate Global and Personalized Models through Federated Learning with Data-Free Hyper-Knowledge Distillation
Minimalistic Unsupervised Learning with the Sparse Manifold Transform
Rewarding Episodic Visitation Discrepancy for Exploration in Reinforcement Learning
Over-Training with Mixup May Hurt Generalization
HiCLIP: Contrastive Language-Image Pretraining with Hierarchy-aware Attention
Quantile Risk Control: A Flexible Framework for Bounding the Probability of High-Loss Predictions
Text-Conditioned Graph Generation Using Discrete Graph Variational Autoencoders
Dynamic Neural Network is All You Need: Understanding the Robustness of Dynamic Mechanisms in Neural Networks
AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers
Learning Shareable Bases for Personalized Federated Image Classification
Curriculum-inspired Training for Selective Neural Networks
Layer-wise Balanced Activation Mechanism
A Probabilistic Framework For Modular Continual Learning
Knowledge-Grounded Reinforcement Learning
Git Re-Basin: Merging Models modulo Permutation Symmetries
The Tilted Variational Autoencoder: Improving Out-of-Distribution Detection
The Role of Coverage in Online Reinforcement Learning
Learning Mixture Models with Simultaneous Data Partitioning and Parameter Estimation
Estimating Treatment Effects using Neurosymbolic Program Synthesis
Stateful Active Facilitator: Coordination and Environmental Heterogeneity in Cooperative Multi-Agent Reinforcement Learning
UNDERSTANDING HTML WITH LARGE LANGUAGE MODELS
KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding
Kuiper: Moderated Asynchronous Federated Learning on Heterogeneous Mobile Devices with Non-IID Data
Learning Achievement Structure for Structured Exploration in Domains with Sparse Reward
Semi-Autoregressive Energy Flows: Towards Determinant-Free Training of Normalizing Flows
PINTO: Faithful Language Reasoning Using Prompt-Generated Rationales
State Decomposition for Model-free Partially observable Markov Decision Process
Game Theoretic Mixed Experts for Combinational Adversarial Machine Learning
Return Augmentation gives Supervised RL Temporal Compositionality
Neural Integral Equations
Excess Risk of Two-Layer ReLU Neural Networks in Teacher-Student Settings and its Superiority to Kernel Methods
Automatic Data Augmentation via Invariance-Constrained Learning
GEASS: Neural causal feature selection for high-dimensional biological data
Unsupervised 3D Scene Representation Learning via Movable Object Inference
FoveaTer: Foveated Transformer for Image Classification
Linearly Mapping from Image to Text Space
Actor-Critic Alignment for Offline-to-Online Reinforcement Learning
Characterizing intrinsic compositionality in transformers with Tree Projections
What Do We Maximize in Self-Supervised Learning And Why Does Generalization Emerge?
SmartFRZ: An Efficient Training Framework using Attention-Based Layer Freezing
Similarity-Based Cooperation
Consistent Data Distribution Sampling for Large-scale Retrieval
NOVEL FEATURE REPRESENTATION STRATEGIES FOR TIME SERIES FORECASTING WITH PREDICTED FUTURE COVARIATES
Augmentation Component Analysis: Modeling Similarity via the Augmentation Overlaps
Reproducible Bandits
Persistence-based Contrastive Learning with Graph Neural Recurrent Networks for Time-series Forecasting
ACE-EM: Boosted ab initio Cryo-EM 3D Reconstruction with Asymmetric Complementary Autoencoder
Diffusion-based point cloud generation with smoothness constraints
NEURAL HAMILTONIAN FLOWS IN GRAPH NEURAL NETWORKS
Convergence Analysis of Split Learning on Non-IID Data
Principal Trade-off Analysis
Neural Bregman Divergences for Distance Learning
Neural Autoregressive Refinement for Self-Supervised Outlier Detection beyond Images
Offline Reinforcement Learning from Heteroskedastic Data Via Support Constraints
Finding Private Bugs: Debugging Implementations of Differentially Private Stochastic Gradient Descent
Robust Generative Flows on Reliable Image Reconstruction without Training Data
A Computationally Efficient Sparsified Online Newton Method
TG-Gen: A Deep Generative Model Framework for Temporal Graphs
Solving Continual Learning via Problem Decomposition
Long Term Fairness via Performative Distributionally Robust Optimization
The In-Sample Softmax for Offline Reinforcement Learning
LUNA: Language as Continuing Anchors for Referring Expression Comprehension
Bias Propagation in Federated Learning
A Study of Causal Confusion in Preference-Based Reward Learning
UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph
Comparing Human and Machine Bias in Face Recognition
Sufficient Subgraph Embedding Memory for Continual Graph Representation Learning
One cannot stand for everyone! Leveraging Multiple User Simulators to train Task-oriented Dialogue Systems
Towards Out-of-Distribution Adversarial Robustness
Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification
Learning Deep Operator Networks: The Benefits of Over-Parameterization
How Useful are Gradients for OOD Detection Really?
Many-Body Approximation for Tensors
Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games
Memorization Capacity of Neural Networks with Conditional Computation
On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness
A Fast, Well-Founded Approximation to the Empirical Neural Tangent Kernel
Boosting Drug-Target Affinity Prediction from Nearest Neighbors
Weighted Clock Logic Point Process
Simple Emergent Action Representations from Multi-Task Policy Training
Iterative Task-adaptive Pretraining for Unsupervised Word Alignment
Open-Set 3D Detection via Image-level Class and Debiased Cross-modal Contrastive Learning
Tight Non-asymptotic Inference via Sub-Gaussian Intrinsic Moment Norm
Interaction-Based Disentanglement of Entities for Object-Centric World Models
CodeT5Mix: A Pretrained Mixture of Encoder-decoder Transformers for Code Understanding and Generation
Neural Image-based Avatars: Generalizable Radiance Fields for Human Avatar Modeling
Federated Neural Bandits
Compositional Task Representations for Large Language Models
What do large networks memorize?
TILDE-Q: a Transformation Invariant Loss Function for Time-Series Forecasting
Pretraining One Language Model for All With the Text-To-Text Framework Using Model-Generated Signals
A Picture of the Space of Typical Learning Tasks
Linear Mode Connectivity of Deep Neural Networks via Permutation Invariance and Renormalization
Multi-View Masked Autoencoders for Visual Control
Boosting Adversarial Training with Masked Adaptive Ensemble
MILE: Memory-Interactive Learning Engine for Solving Mathematical Problems
UNICO: Efficient Unified Hardware-Software Co-Optimization For Deep Neural Networks
Diffusion-GAN: Training GANs with Diffusion
Contextual Subspace Approximation with Neural Householder Transforms
Mind the Pool: Convolutional Neural Networks Can Overfit Input Size
Towards Unsupervised Time Series Representation Learning: A Decomposition Perspective
Reparameterization through Spatial Gradient Scaling
Boomerang: Local sampling on image manifolds using diffusion models
TOWARD RELIABLE NEURAL SPECIFICATIONS
A second order regression model shows edge of stability behavior
Learning Frequency-aware Network for Continual Learning
Unsupervised Learning for Combinatorial Optimization Needs Meta Learning
Latent Topology Induction for Understanding Contextualized Representations
DyG2Vec: Representation Learning for Dynamic Graphs With Self-supervision
Unsupervised Meta-learning via Few-shot Pseudo-supervised Contrastive Learning
PromptBoosting: Black-Box Text Classification with Ten Forward Passes
Decepticons: Corrupted Transformers Breach Privacy in Federated Learning for Language Models
Adaptive Optimization in the $\infty$-Width Limit
Pyramidal Denoising Diffusion Probabilistic Models
Guiding Energy-based Models via Contrastive Latent Variables
Deep Watermarks for Attributing Generative Models
Steerable Equivariant Representation Learning
Differentially Private Diffusion Models
Outlier-Robust Group Inference via Gradient Space Clustering
Broken Neural Scaling Laws
Learning to perceive objects by prediction
Avoiding spurious correlations via logit correction
LEARNING CONTEXT-AWARE ADAPTIVE SOLVERS TO ACCELERATE QUADRATIC PROGRAMMING
Learning Latent Structural Causal Models
Pre-Training for Robots: Leveraging Diverse Multitask Data via Offline Reinforcement Learning
Safe Exploration Incurs Nearly No Additional Sample Complexity for Reward-Free RL
S$^6$-DAMON: Bridging Self-Supervised Speech Models and Real-time Speech Recognition
Teaching Algorithmic Reasoning via In-context Learning
Offline Q-learning on Diverse Multi-Task Data Both Scales And Generalizes
Disentangled Conditional Variational Autoencoder for Unsupervised Anomaly Detection
Diffusion-based Image Translation using disentangled style and content representation
An Analytic Framework for Robust Training of Differentiable Hypothesis
Federated Learning with Heterogeneous Label Noise: A Dual Structure Approach
Correspondences between word learning in children and captioning models
Mixture of Quantized Experts (MoQE): Complementary Effect of Low-bit Quantization and Robustness
Implicit Regularization for Group Sparsity
Why do Models with Conditional Computation Learn Suboptimal Solutions?
Stabilized training of joint energy-based models and its practical applications
HesScale: Scalable Computation of Hessian Diagonals
Adaptive Anchor for Robust Keypoint Localization
Divide-and-Cluster: Spatial Decomposition Based Hierarchical Clustering
Implicit regularization in Heavy-ball momentum accelerated stochastic gradient descent
ORCA: Interpreting Prompted Language Models via Locating Supporting Evidence in the Ocean of Pretraining Data
Getting away with more network pruning: From sparsity to geometry and linear regions
Real-time variational method for learning neural trajectory and its dynamics
Supervised Metric Learning for Retrieval via Contextual Similarity Optimization
Large Language Models are Human-Level Prompt Engineers
Do Not Blindly Imitate the Teacher: Loss Perturbation for Knowledge Distillation
Fast Yet Effective Graph Unlearning through Influence Analysis
Faster Hyperparameter Search for GNNs via Calibrated Dataset Condensation
FedTiny: Pruned Federated Learning Towards Specialized Tiny Models
Spatiotemporal Modeling of Multivariate Signals with Graph Neural Networks and Structured State Space Models
TI-VAE: A temporally independent VAE with applications to latent factor learning in neuroimaging
Pruning Deep Neural Networks from a Sparsity Perspective
High-dimensional Continuum Armed and High-dimensional Contextual Bandit: with Applications to Assortment and Pricing
Learning to represent and predict evolving visual signals via polar straightening
Protecting Bidder Information in Neural Auctions
On Representation Learning Under Class Imbalance
Gradient Descent Converges Linearly for Logistic Regression on Separable Data
Interpretable (meta)factorization of clinical questionnaires to identify general dimensions of psychopathology
Enhancing Meta Learning via Multi-Objective Soft Improvement Functions
Discrete Predictor-Corrector Diffusion Models for Image Synthesis
Instruction-Following Agents with Jointly Pre-Trained Vision-Language Models
Infusing Lattice Symmetry Priors in Neural Networks Using Soft Attention Masks
Counterfactual Vision-Language Data Synthesis with Intra-Sample Contrast Learning
META-LEARNING FOR UNSUPERVISED OUTLIER DETECTION WITH OPTIMAL TRANSPORT
GPTQ: Accurate Quantization for Generative Pre-trained Transformers
Domain-Invariant Auxiliary Learning for Robust Few-Shot Predictions from Noisy Data
Attentive MLP for Non-Autoregressive Generation
ConserWeightive Behavioral Cloning for Reliable Offline Reinforcement Learning
Dynamics Model Based Adversarial Training For Competitive Reinforcement Learning
ADVL: Adaptive Distillation for Vision-Language Tasks
A new characterization of the edge of stability based on a sharpness measure aware of batch gradient distribution
Finding the smallest tree in the forest: Monte Carlo Forest Search for UNSAT solving
$\mathrm{SE}(3)$-Equivariant Attention Networks for Shape Reconstruction in Function Space
PBES: PCA Based Exemplar Sampling Algorithm for Continual Learning
3D-IntPhys: Learning 3D Visual Intuitive Physics for Fluids, Rigid Bodies, and Granular Materials
Continual Post-Training of Language Models
Min-Max Multi-objective Bilevel Optimization with Applications in Robust Machine Learning
IAE: Implicit Autoencoder for Point Cloud Self-supervised Representation Learning
The Plug and Play of Language Models for Text-to-image Generation
Learning Arborescence with An Efficient Inference Algorithm
Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
A Score-Based Model for Learning Neural Wavefunctions
Benchmarking Algorithms for Domain Generalization in Federated Learning
The Vendi Score: A Diversity Evaluation Metric for Machine Learning
How Can GANs Learn Hierarchical Generative Models for Real-World Distributions
Spotlight: Mobile UI Understanding using Vision-Language Models with a Focus
A Control-Centric Benchmark for Video Prediction
Continual Learning Based on Sub-Networks and Task Similarity
A Stable and Scalable Method for Solving Initial Value PDEs with Neural Networks
Shallow Learning In Materio.
How Can Deep Learning Perform Deep (Hierarchical) Learning
Data Subset Selection via Machine Teaching
Do Summarization Models Synthesize?
CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets
Multi-Grid Tensorized Fourier Neural Operator for High Resolution PDEs
$\beta$-Stochastic Sign SGD: A Byzantine Resilient and Differentially Private Gradient Compressor for Federated Learning
Sequential Brick Assembly with Efficient Constraint Satisfaction
Cross-Domain Self-Supervised Deep Learning for Robust Alzheimer's Disease Progression Modeling
Data-Efficient Finetuning Using Cross-Task Nearest Neighbors
Heavy-tailed Noise Does Not Explain the Gap Between SGD and Adam, but Sign Descent Might
BiAdam: Fast Adaptive Bilevel Optimization Methods
Building Normalizing Flows with Stochastic Interpolants
Elicitation Inference Optimization for Multi-Principal-Agent Alignment
Dual Student Networks for Data-Free Model Stealing
Augmentation Curriculum Learning For Generalization in RL
Composite Slice Transformer: An Efficient Transformer with Composition of Multi-Scale Multi-Range Attentions
Graph Fourier MMD for signals on data graphs
Equal Improvability: A New Fairness Notion Considering the Long-term Impact
Does progress on ImageNet transfer to real world datasets?
Competitive Physics Informed Networks
Decomposed Prompting: A Modular Approach for Solving Complex Tasks
Designing and Using Goal-Conditioned Tools
Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics
ProtFIM: Fill-in-Middle Protein Sequence Design via Protein Language Models
Beyond Deep Learning: An Evolutionary Feature Engineering Approach to Tabular Data Classification
Proportional Multicalibration
On The Impact of Machine Learning Randomness on Group Fairness
Using the Training History to Detect and Prevent Overfitting in Deep Learning Models
Multi-scale Sinusoidal Embeddings Enable Learning on High Resolution Mass Spectrometry Data
Self-Ensemble Protection: Training Checkpoints Are Good Data Protectors
Efficient parametric approximations of neural net function space distance
Systematic Generalization and Emergent Structures in Transformers Trained on Structured Tasks
Energy-Inspired Self-Supervised Pretraining for Vision Models
Effectively Modeling Time Series with Simple Discrete State Spaces
Forgetful causal masking makes causal language models better zero-shot learners
When and why Vision-Language Models behave like Bags-of-Words, and what to do about it?
A Time Series is Worth 64 Words: Long-term Forecasting with Transformers
Protecting DNN from Evasion Attacks using Ensemble of High Focal Diversity
Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-Oriented Dialogue Systems
Efficient Stochastic Optimization for Attacking Randomness Involved Inference
Supervision Complexity and its Role in Knowledge Distillation
GLINKX: A Scalable Unified Framework For Homophilous and Heterophilous Graphs
Marich: A Query-efficient & Online Model Extraction Attack using Public Data
CORE-PERIPHERY PRINCIPLE GUIDED REDESIGN OF SELF-ATTENTION IN TRANSFORMERS
Lovasz Theta Contrastive Learning
Transferable Unlearnable Examples
MUG: Interactive Multimodal Grounding on User Interfaces
Tabular Deep Learning when $d \gg n$ by Using an Auxiliary Knowledge Graph
Random Laplacian Features for Learning with Hyperbolic Space
Replay Memory as An Empirical MDP: Combining Conservative Estimation with Experience Replay
Neural Causal Models for Counterfactual Identification and Estimation
Connecting representation and generation via masked vision-language transformer
Is margin all you need? An extensive empirical study of active learning on tabular data
Momentum Stiefel Optimizer, with Applications to Suitably-Orthogonal Attention, and Optimal Transport
Target Conditioned Representation Independence (TCRI); from Domain-Invariant to Domain-General Representations
Multi-Task Option Learning and Discovery for Stochastic Path Planning
MolEBM: Molecule Generation and Design by Latent Space Energy-Based Modeling
Information-Theoretic Diffusion
Bandwidth Enables Generalization in Quantum Kernel Models
Giving Robots a Hand: Broadening Generalization via Hand-Centric Human Video Demonstrations
SpENCNN: Orchestrating Encoding and Sparsity for Fast Homomorphically Encrypted Neural Network Inference
No Pairs Left Behind: Improving Metric Learning with Regularized Triplet Objective
Minimal Value-Equivalent Partial Models for Scalable and Robust Planning in Lifelong Reinforcement Learning
Gradient Preconditioning for Non-Lipschitz smooth Nonconvex Optimization
Predictive Coding with Approximate Laplace Monte Carlo
What Spurious Features Can Pretrained Language Models Combat?
SIMPLE: A Gradient Estimator for k-Subset Sampling
Transformers Implement First-Order Logic with Majority Quantifiers
Robustness Evaluation Using Local Substitute Networks
Learning Iterative Neural Optimizers for Image Steganography
Graph Neural Networks as Multi-View Learning
Cramming: Training a language model on a single GPU in one day
BertNet: Harvesting Knowledge Graphs from Pretrained Language Models
How Hard is Trojan Detection in DNNs? Fooling Detectors With Evasive Trojans
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
Label-Free Synthetic Pretraining of Object Detectors
Confidence-Conditioned Value Functions for Offline Reinforcement Learning
Current Anomaly Detectors are Anomalous: On Semantic Treatment of OOD Inputs
FedX: Federated Learning for Compositional Pairwise Risk Optimization
On the Sensitivity of Reward Inference to Misspecified Human Models
DeepDFA: Dataflow Analysis-Guided Efficient Graph Learning for Vulnerability Detection
Probability flow solution of the Fokker-Planck equation
Binding Language Models in Symbolic Languages
Probabilistic Categorical Adversarial Attack and Adversarial Training
Multi-Sample Contrastive Neural Topic Model as Multi-Task Learning
Time Will Tell: New Outlooks and A Baseline for Temporal Multi-View 3D Object Detection
Less Is More: Training on Low-Fidelity Images Improves Robustness to Adversarial Attacks
Greedy Information Maximization for Online Feature Selection
Towards Fair Classification against Poisoning Attacks
Unveiling Transformers with LEGO: A Synthetic Reasoning Task
How Much Data Are Augmentations Worth? An Investigation into Scaling Laws, Invariance, and Implicit Regularization
Spatial Reasoning Network for Zero-shot Constrained Scene Generation
Robust Graph Dictionary Learning
Matrix factorization under the constraint of connectivity between observed and source data ~ Muscle synergy analysis based on connectivity between muscle and brain activities ~
Fundamental limits on the robustness of image classifiers
Stochastic Constrained DRO with a Complexity Independent of Sample Size
Evolve Smoothly, Fit Consistently: Learning Smooth Latent Dynamics For Advection-Dominated Systems
Dissecting adaptive methods in GANs
Recycling Scraps: Improving Private Learning by Leveraging Intermediate Checkpoints
Understanding Influence Functions and Datamodels via Harmonic Analysis
BC-IRL: Learning Generalizable Reward Functions from Demonstrations
TextGrad: Advancing Robustness Evaluation in NLP by Gradient-Driven Optimization
Robustness for Free: Adversarially Robust Anomaly Detection Through Diffusion Model
Optimal control neural networks for data-driven discovery of gradient flows.
ErrorAug: Making Errors to Find Errors in Semantic Segmentation
Kernel Regression with Infinite-Width Neural Networks on Millions of Examples
Information Plane Analysis for Dropout Neural Networks
Fed-Cor: Federated Correlation Test with Secure Aggregation
Feasible Adversarial Robust Reinforcement Learning for Underspecified Environments
Dynamical systems embedding with a physics-informed convolutional network
Learning Harmonic Molecular Representations on Riemannian Manifold
When is Offline Hyperparameter Selection Feasible for Reinforcement Learning?
Plansformer: Generating Multi-Domain Symbolic Plans using Transformers
Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement
VISION TRANSFORMER FOR MULTIVARIATE TIME-SERIES CLASSIFICATION (VITMTSC)
Multi-Environment Pretraining Enables Transfer to Action Limited Datasets
Preserving In-Context Learning Ability in Large Language Model Fine-tuning
Efficiently Controlling Multiple Risks with Pareto Testing
Graph Mixup with Soft Alignments
CNN Compression and Search Using Set Transformations with Width Modifiers on Network Architectures
Event-former: A Self-supervised Learning Paradigm for Temporal Point Processes
Learning Interpretable Dynamics from Images of a Freely Rotating 3D Rigid Body
NOTELA: A Generalizable Method for Source Free Domain Adaptation
Characteristic Neural Ordinary Differential Equation
Fast Sampling of Diffusion Models with Exponential Integrator
STay-On-the-Ridge (STON'R): Guaranteed Convergence to Local Minimax Equilibrium in Nonconvex-Nonconcave Games
Federated Representation Learning via Maximal Coding Rate Reduction
3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data
gDDIM: Generalized denoising diffusion implicit models
Panning for Gold in Federated Learning: Targeted Text Extraction under Arbitrarily Large-Scale Aggregation
Artificial Neuronal Ensembles with Learned Context Dependent Gating
Linkless Link Prediction via Relational Distillation
Controllable Concept Transfer of Intermediate Representations
A Differentiable Loss Function for Learning Heuristics in A*
Understanding Multi-Task Scaling in Machine Translation
Learning Language Representations with Logical Inductive Bias
AsymQ: Asymmetric Q-loss to mitigate overestimation bias in off-policy reinforcement learning
Movement-to-Action Transformer Networks for Temporal Action Proposal Generation
INSPIRE: A Framework for Integrating Individual User Preferences in Recourse
How Does Self-supervised Learning Work? A Representation Learning Perspective
Empowering Graph Representation Learning with Test-Time Graph Transformation
Provable Robustness against Wasserstein Distribution Shifts via Input Randomization
GROOT: Corrective Reward Optimization for Generative Sequential Labeling
Interpretations of Domain Adaptations via Layer Variational Analysis
Forget Unlearning: Towards True Data-Deletion in Machine Learning
Meta-Learning with Explicit Task Information
Evaluating Unsupervised Denoising Requires Unsupervised Metrics
Denoising Diffusion Samplers
How I Learned to Stop Worrying and Love Retraining
The Value of Out-of-distribution Data
Recursive Neural Programs: Variational Learning of Image Grammars and Part-Whole Hierarchies
SaiT: Sparse Vision Transformers through Adaptive Token Pruning
Cooperation or Competition: Avoiding Player Domination for Multi-target Robustness by Adaptive Budgets
Image Classification by Throwing Quantum Kitchen Sinks at Tensor Networks
Cross-Domain Few-Shot Relation Extraction via Representation Learning and Domain Adaptation
Factors Influencing Generalization in Chaotic Dynamical Systems
Interpretable Geometric Deep Learning via Learnable Randomness Injection
Koopman Operator Learning for Accelerating Quantum Optimization and Machine Learning
GOGGLE: Generative Modelling for Tabular Data by Learning Relational Structure
Query by Self
A Reproducible and Realistic Evaluation of Partial Domain Adaptation Methods
Progressive Prompts: Continual Learning for Language Models without Forgetting
Differentiable Rendering with Reparameterized Volume Sampling
Deep Learning From Crowdsourced Labels: Coupled Cross-Entropy Minimization, Identifiability, and Regularization
Maximum Likelihood Learning of Energy-Based Models for Simulation-Based Inference
Provable Re-Identification Privacy
Just Avoid Robust Inaccuracy: Boosting Robustness Without Sacrificing Accuracy
Projective Proximal Gradient Descent for Nonconvex Nonsmooth Optimization: Fast Convergence Without Kurdyka-Lojasiewicz (KL) Property
First Steps Toward Understanding the Extrapolation of Nonlinear Models to Unseen Domains
A Kernel-Based View of Language Model Fine-Tuning
Variable Compositionality Reliably Emerges in Neural Networks
Systematic Rectification of Language Models via Dead-end Analysis
Model-free Reinforcement Learning that Transfers Using Random Reward Features
Differentiable Channel Selection for Self-Attention
Membership Inference Attacks Against Text-to-image Generation Models
Multiple sequence alignment as a sequence-to-sequence learning problem
Fair Graph Message Passing with Transparency
FedExP: Speeding up Federated Averaging via Extrapolation
Graph Neural Networks Are More Powerful Than we Think
A Mixture-of-Expert Approach to RL-based Dialogue Management
A Retrieve-and-Read Framework for Knowledge Graph Reasoning
f-DM: A Multi-stage Diffusion Model via Progressive Signal Transformation
An Empirical Study of the Neural Contextual Bandit Algorithms
Backpropagation at the Infinitesimal Inference Limit of Energy-Based Models: Unifying Predictive Coding, Equilibrium Propagation, and Contrastive Hebbian Learning
A Theoretical Framework for Inference and Learning in Predictive Coding Networks
Causally-guided Regularization of Graph Attention improves Generalizability
On a Benefit of Masked Language Model Pretraining: Robustness to Simplicity Bias
FLGAME: A Game-theoretic Defense against Backdoor Attacks In Federated Learning
DeepReShape: Redesigning Neural Networks for Private Inference
The Onset of Variance-Limited Behavior for Networks in the Lazy and Rich Regimes
Semi-Supervised Single Domain Generalization with Label-Free Adversarial Data Augmentation
A Simple Approach for Visual Room Rearrangement: 3D Mapping and Semantic Search
Memory Efficient Dynamic Sparse Training
Accelerated Training via Principled Methods for Incrementally Growing Neural Networks
Progressive Mix-Up for Few-Shot Supervised Multi-Source Domain Transfer
Mitigating Propagation Failures in PINNs using Evolutionary Sampling
Revisiting Information-Based Clustering with Pseudo-Posterior Models
Neural Compositional Rule Learning for Knowledge Graph Reasoning
Temporal Change Sensitive Representation for Reinforcement Learning
Provably Efficient Reinforcement Learning for Online Adaptive Influence Maximization
Fairness via Adversarial Attribute Neighbourhood Robust Learning
Efficient approximation of neural population structure and correlations with probabilistic circuits
Exploring perceptual straightness in learned visual representations
Improving Subgraph Representation Learning via Multi-View Augmentation
Efficient Proxy for NAS is Extensible Now
System identification of neural systems: If we got it right, would we know?
TKIL: Tangent Kernel Optimization for Class Balanced Incremental Learning
Is Forgetting Less a Good Inductive Bias for Forward Transfer?
High-Precision Regressors for Particle Physics
Learning Structured Representations by Embedding Class Hierarchy
Promptagator: Few-shot Dense Retrieval From 8 Examples
Balance is Essence: Accelerating Sparse Training via Adaptive Gradient Correction
Brain-like representational straightening of natural movies in robust feedforward neural networks
FunkNN: Neural Interpolation for Functional Generation
A Framework for Comprehensive Evaluations of Graph Neural Network based Community Detection using Node Clustering
TEXTCRAFT: ZERO-SHOT GENERATION OF HIGH FIDELITY AND DIVERSE SHAPES FROM TEXT
CrystalBox: Efficient Model-Agnostic Explanations for Deep RL Controllers
Label Propagation with Weak Supervision
TypeT5: Seq2seq Type Inference using Static Analysis
Approximating any Function via Coreset for Radial Basis Functions: Towards Provable Data Subset Selection For Efficient Neural Networks training
Axiomatic Explainer Locality With Optimal Transport
Fine-Tuning Offline Policies With Optimistic Action Selection
Improving the Strength of Human-Like Models in Chess
Test-Time Training on Video Streams
AGRO: Adversarial discovery of error-prone Groups for Robust Optimization
Learning Multiobjective Program Through Online Learning
Dichotomy of Control: Separating What You Can Control from What You Cannot
Progressive Knowledge Distillation: Constructing Ensembles for Efficient Inference
Efficient Approximations of Complete Interatomic Potentials for Crystal Property Prediction
LogicDP: Creating Labels for Graph Data via Inductive Logic Programming
Simulating Environments for Evaluating Scarce Resource Allocation Policies
Domain Transfer with Large Dynamics Shift in Offline Reinforcement Learning
Learning to reason with relational abstractions
A Simple Approach for State-Action Abstraction using a Learned MDP Homomorphism
RankMe: Assessing the Downstream Performance of Pretrained Self-Supervised Representations by Their Rank
Less is More: Task-aware Layer-wise Distillation for Language Model Compression
Revisiting Curiosity for Exploration in Procedurally Generated Environments
Online Learning for Obstacle Avoidance
Transformer-based World Models Are Happy With 100k Interactions
Can Neural Networks Learn Implicit Logic from Physical Reasoning?
Blockwise self-supervised learning with Barlow Twins
DIGEST: FAST AND COMMUNICATION EFFICIENT DECENTRALIZED LEARNING WITH LOCAL UPDATES
Learning to Improve Code Efficiency
Real Data Distributions Prefer Simplicity and So Do Our Models: Why Machine Learning and Model Selection Are Possible
Backdoor Attacks in the Supply Chain of Masked Image Modeling
ESCHER: Eschewing Importance Sampling in Games by Computing a History Value Function to Estimate Regret
On Achieving Optimal Adversarial Test Error
General Policy Evaluation and Improvement by Learning to Identify Few But Crucial States
Serving Graph Compression for Graph Neural Networks
Optimal Data Sampling for Training Neural Surrogates of Programs
Towards Understanding GD with Hard and Conjugate Pseudo-labels for Test-Time Adaptation
Achieving Communication-Efficient Policy Evaluation for Multi-Agent Reinforcement Learning: Local TD-Steps or Batching?
Learning where and when to reason in neuro-symbolic inference
Aging with GRACE: Lifelong Model Editing with Key-Value Adaptors
A VAE for Transformers with Nonparametric Variational Information Bottleneck
Learning MLPs on Graphs: A Unified View of Effectiveness, Robustness, and Efficiency
On The Specialization of Neural Modules
HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers
Information-Theoretic Underpinnings of Generalization and Translation in Emergent Communication
Optimal Transport-Based Supervised Graph Summarization
Contrastive Vision Transformer for Self-supervised Out-of-distribution Detection
Does the Half Adversarial Robustness Represent the Whole? It Depends... A Theoretical Perspective of Subnetwork Robustness
Using Both Demonstrations and Language Instructions to Efficiently Learn Robotic Tasks
Improving Accuracy and Explainability of Online Handwriting Recognition
On the duality between contrastive and non-contrastive self-supervised learning
Few-Shot Incremental Learning Using HyperTransformers
The Brainy Student: Scalable Unlearning by Selectively Disobeying the Teacher
FIGARO: Controllable Music Generation using Learned and Expert Features
A Neural PDE Solver with Temporal Stencil Modeling
The Right Losses for the Right Gains: Improving the Semantic Consistency of Deep Text-to-Image Generation with Distribution-Sensitive Losses
Selection Collider Bias in Large Language Models
CausalBench: A Large-scale Benchmark for Network Inference from Single-cell Perturbation Data
Language models are multilingual chain-of-thought reasoners
DreamFusion: Text-to-3D using 2D Diffusion
Recitation-Augmented Language Models
Continual Active Learning
KwikBucks: Correlation Clustering with Cheap-Weak and Expensive-Strong Signals
Credible, Sealed-bid, Optimal Repeated Auctions With Differentiable Economics
The Power of Feel-Good Thompson Sampling: A Unified Framework for Linear Bandits
Two-Tailed Averaging: Anytime Adaptive Once-in-a-while Optimal Iterate Averaging for Stochastic Optimization
Reward Design with Language Models
Calibrating the Rigged Lottery: Making All Tickets Reliable
Replay Buffer with Local Forgetting for Adaptive Deep Model-Based Reinforcement Learning
Contrastive Audio-Visual Masked Autoencoder
Pessimistic Model-Based Actor-Critic for Offline Reinforcement Learning: Theory and Algorithms
The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks
Soft Diffusion: Score Matching For General Corruptions
Open-Vocabulary Panoptic Segmentation MaskCLIP
Robust Federated Learning with Majority Adversaries via Projection-based Re-weighting
Double Wins: Boosting Accuracy and Efficiency of Graph Neural Networks by Reliable Knowledge Distillation
A Statistical Framework for Personalized Federated Learning and Estimation: Theory, Algorithms, and Privacy
Invariant Aggregator for Defending against Federated Backdoor Attacks
Improving Adversarial Robustness of Deep Neural Networks via Self-adaptive Margin Defense
Laser: Latent Set Representations for 3D Generative Modeling
Towards Efficient Gradient-Based Meta-Learning in Heterogenous Environments
Knowledge Cascade: Reverse Knowledge Distillation
Optimal Transport for Offline Imitation Learning
FedorAS: Federated Architecture Search under system heterogeneity
Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Towards A Unified View of Sparse Feed-Forward Network in Transformer
Learning multi-scale local conditional probability models of images
Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions
Online Continual Learning with Feedforward Adaptation
Understanding ReLU Network Robustness Through Test Set Certification Performance
Mind the Privacy Budget: How Generative Models Spend their Privacy Budgets
Resource Efficient Self-Supervised Learning for Speech Recognition
Subsampling in Large Graphs Using Ricci Curvature
Membership Leakage in Pre-trained Language Models
DSI++: Updating Transformer Memory with New Documents
Universal Few-shot Learning of Dense Prediction Tasks with Visual Token Matching
The Game of Hidden Rules: A New Challenge for Machine Learning
Motif-based Graph Representation Learning with Application to Chemical Molecules
Graph schemas as abstractions for transfer learning, inference, and planning
Conservative Bayesian Model-Based Value Expansion for Offline Policy Optimization
Beam Tree Recursive Cells
The Ultimate Combo: Boosting Adversarial Example Transferability by Composing Data Augmentations
In-Time Refining Optimization Trajectories Toward Improved Robust Generalization
Scaling up and Stabilizing Differentiable Planning with Implicit Differentiation
Improving Aspect Ratio Distribution Fairness in Detector Pretraining via Cooperating RPN's
Learning parsimonious dynamics for generalization in reinforcement learning
DECODING LAYER SALIENCY IN TRANSFORMERS
UNDERSTANDING THE ROLE OF POSITIONAL ENCODINGS IN SENTENCE REPRESENTATIONS
Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits
Score-based Continuous-time Discrete Diffusion Models
Decision Transformer under Random Frame Dropping
Semi-supervised consistency regularization for accurate cell type fraction and gene expression estimation
Adversarial Imitation Learning with Preferences
How to Do a Vocab Swap? A Study of Embedding Replacement for Pre-trained Transformers
Attribution Scores are Redundant: Explaining Feature Contribution By Trajectories
SuperFed: Weight Shared Federated Learning
Is Model Ensemble Necessary? Model-based RL via a Single Model with Lipschitz Regularized Value Function
Recurrent Back-Projection Generative Adversarial Network for Video Super Resolution
Neural Networks as Paths through the Space of Representations
From Points to Functions: Infinite-dimensional Representations in Diffusion Models
Disentangling with Biological Constraints: A Theory of Functional Cell Types
ESEAD: An Enhanced Simple Ensemble and Distillation Framework for Natural Language Processing
Efficient One-Shot Neural Architecture Search With Progressive Choice Freezing Evolutionary Search
Synthetic Data Generation of Many-to-Many Datasets via Random Graph Generation
Learning rigid dynamics with face interaction graph networks
On the Importance of Contrastive Loss in Multimodal Learning
MAD for Robust Reinforcement Learning in Machine Translation
An Exploration of Conditioning Methods in Graph Neural Networks
Speed Up Iterative Non-Autoregressive Transformers by Distilling Multiple Steps
Global View For GCN: Why Go Deep When You Can Be Shallow?
Cross-Silo Training of Differentially Private Models with Secure Multiparty Computation
HyperTime: Implicit Neural Representations for Time Series Generation
Generative Adversarial Federated Model
Unsupervised Pretraining for Neural Value Approximation
Homotopy Learning of Parametric Solutions to Constrained Optimization Problems
When Rigid Coherency Hurts: Distributional Coherency Regularization for Probabilistic Hierarchical Time Series Forecasting
EENet: Learning to Early Exit for Adaptive Inference
MALIBO: Meta-Learning for Likelihood-free Bayesian Optimization
Finding and only finding local Nash equilibria by both pretending to be a follower
Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Networks
Images as Weight Matrices: Sequential Image Generation Through Synaptic Learning Rules
SurCo: Learning Linear Surrogates for Combinatorial Nonlinear Optimization Problems
DT+GNN: A Fully Explainable Graph Neural Network using Decision Trees
Why (and When) does Local SGD Generalize Better than SGD?
Function-space regularized Rényi divergences
Constant-Factor Approximation Algorithms for Socially Fair $k$-Clustering
Re-calibrated Wasserstein GAN for large-scale imputation with informative missing
Implicit Bias of Large Depth Networks: a Notion of Rank for Nonlinear Functions
Depth Separation with Multilayer Mean-Field Networks
Robust Policy Optimization in Deep Reinforcement Learning
Analogical Networks for Memory-Modulated 3D Parsing
Fake It Until You Make It : Towards Accurate Near-Distribution Novelty Detection
Injecting knowledge into language generation: a case study in auto-charting after-visit care instructions from medical dialogue
DySR: Adaptive Super-Resolution via Algorithm and System Co-design
Domain Invariant Q-Learning for model-free robust continuous control under visual distractions
Continual Learning with Soft-Masking of Parameter-Level Gradient Flow
Asynchronous Message Passing: A new Framework for Learning in Graphs
Integrating Symmetry into Differentiable Planning with Steerable Convolutions
MolJET: Multimodal Joint Embedding Transformer for Conditional de novo Molecular Design and Multi-Property Optimization
The Challenges of Exploration for Offline Reinforcement Learning
SGD with large step sizes learns sparse features
Synergies Between Disentanglement and Sparsity: a Multi-Task Learning Perspective
Discerning Hydroclimatic Behavior with a Deep Convolutional Residual Regressive Neural Network
Causal Reasoning in the Presence of Latent Confounders via Neural ADMG Learning
ESC: A Benchmark For Multi-Domain End-to-End Speech Recognition
Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Approach
Pareto Rank-Preserving Supernetwork for HW-NAS
ProSampler: Improving Contrastive Learning by Better Mini-batch Sampling
$O(T^{-1})$ Convergence of Optimistic-Follow-the-Regularized-Leader in Two-Player Zero-Sum Markov Games
Bispectral Neural Networks
Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise
Beyond Lipschitz: Sharp Generalization and Excess Risk Bounds for Full-Batch GD
Zero-Shot Retrieval with Search Agents and Hybrid Environments
Hyper-Decision Transformer for Efficient Online Policy Adaptation
Deep Learning of Intrinsically Motivated Options in the Arcade Learning Environment
Solving Continuous Control via Q-learning
Make-A-Video: Text-to-Video Generation without Text-Video Data
EiX-GNN : Concept-level eigencentrality explainer for graph neural networks
Unsupervised Adaptation for Fairness under Covariate Shift
Pushing the limits of self-supervised learning: Can we outperform supervised learning without labels?
Towards Dynamic Sparsification by Iterative Prune-Grow LookAheads
Learning Useful Representations for Shifting Tasks and Distributions
Personalized Reward Learning with Interaction-Grounded Learning (IGL)
From Adaptive Query Release to Machine Unlearning
ReAct: Synergizing Reasoning and Acting in Language Models
Towards convergence to Nash equilibria in two-team zero-sum games