| Title | Authors | Venue | Year |
| --- | --- | --- | --- |
| Train longer, generalize better: closing the generalization gap in large batch training of neural networks | Elad Hoffer, Itay Hubara, Daniel Soudry | NeurIPS (Oral) | 2017 |
| Fix your classifier: the marginal value of training the last weight layer | Elad Hoffer, Itay Hubara, Daniel Soudry | ICLR | 2018 |
| The Implicit Bias of Gradient Descent on Separable Data | Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Nathan Srebro | ICLR | 2018 |
| Exponentially vanishing sub-optimal local minima in multilayer neural networks | Daniel Soudry, Elad Hoffer | ICLR Workshop | 2018 |
| Scalable Methods for 8-bit Training of Neural Networks | Ron Banner, Itay Hubara, Elad Hoffer, Daniel Soudry | NeurIPS | 2018 |
| Norm matters: efficient and accurate normalization schemes in deep networks | Elad Hoffer, Ron Banner, Itay Golan, Daniel Soudry | NeurIPS (Spotlight) | 2018 |
| Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning | Tom Zahavy, Matan Haroush, Nadav Merlis, Daniel J. Mankowitz, Shie Mannor | NeurIPS | 2018 |
| Task Agnostic Continual Learning Using Online Variational Bayes | Chen Zeno, Itay Golan, Elad Hoffer, Daniel Soudry | NeurIPS Workshop | 2018 |
| Infer2Train: leveraging inference for better training of deep networks | Elad Hoffer, Berry Weinstein, Itay Hubara, Sergei Gofman, Daniel Soudry | NeurIPS Workshop | 2018 |
| Increasing batch size through instance repetition improves generalization | Elad Hoffer, Tal Ben-Nun, Itay Hubara, Niv Giladi, Torsten Hoefler, Daniel Soudry | ICML Workshop | 2019 |
| How Learning Rate and Delay Affect Minima Selection in Asynchronous Training of Neural Networks: Toward Closing the Generalization Gap | Niv Giladi, Mor Shpigel Nacson, Elad Hoffer, Daniel Soudry | ICML Workshop (Oral) | 2019 |
| Mix & Match: training convnets with mixed image sizes for improved accuracy, speed and scale resiliency | Elad Hoffer, Berry Weinstein, Itay Hubara, Tal Ben-Nun, Torsten Hoefler, Daniel Soudry | SEDL NeurIPS Workshop | 2019 |
| Post training 4-bit quantization of convolutional networks for rapid-deployment | Ron Banner, Yury Nahshan, Daniel Soudry | NeurIPS | 2019 |
| Augment your batch: Improving generalization through instance repetition | Elad Hoffer, Tal Ben-Nun, Itay Hubara, Niv Giladi, Torsten Hoefler, Daniel Soudry | CVPR | 2020 |
| The Knowledge Within: Methods for Data-Free Model Compression | Matan Haroush, Itay Hubara, Elad Hoffer, Daniel Soudry | CVPR | 2020 |
| At Stability's Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks? | Niv Giladi, Mor Shpigel Nacson, Elad Hoffer, Daniel Soudry | ICLR (Spotlight) | 2020 |
| Robust Quantization: One Model to Rule Them All | Moran Shkolnik, Brian Chmiel, Ron Banner, Gil Shomron, Yury Nahshan, Alex Bronstein, Uri Weiser | NeurIPS | 2020 |
| Feature Map Transform Coding for Energy-Efficient CNN Inference | Brian Chmiel, Chaim Baskin, Evgenii Zheltonozhskii, Ron Banner, Yevgeny Yermolin, Alex Karbachevsky, Alex M. Bronstein, Avi Mendelson | IJCNN | 2020 |
| Thanks for nothing: Predicting zero-valued activations with lightweight convolutional neural networks | Gil Shomron, Ron Banner, Moran Shkolnik, Uri Weiser | ECCV | 2020 |
| Loss aware post-training quantization | Yury Nahshan, Brian Chmiel, Chaim Baskin, Evgenii Zheltonozhskii, Ron Banner, Alex M. Bronstein, Avi Mendelson | Machine Learning (journal) | 2021 |
| Neural gradients are lognormally distributed: understanding sparse and quantized training | Brian Chmiel, Liad Ben-Uri, Moran Shkolnik, Elad Hoffer, Ron Banner, Daniel Soudry | ICLR | 2021 |
| GAN "Steerability" without optimization | Nurit Spingarn, Ron Banner, Tomer Michaeli | ICLR (Spotlight) | 2021 |
| Logarithmic unbiased quantization: Practical 4-bit training in deep learning | Brian Chmiel, Ron Banner, Elad Hoffer, Hilla Ben Yaacov, Daniel Soudry | Preprint | 2021 |
| Accurate post training quantization with small calibration sets | Itay Hubara, Yury Nahshan, Yair Hanani, Ron Banner, Daniel Soudry | ICML | 2021 |
| Accelerated sparse neural training: A provable and efficient method to find N:M transposable masks | Itay Hubara, Brian Chmiel, Moshe Island, Ron Banner, Joseph Naor, Daniel Soudry | NeurIPS | 2021 |
| CAT: Compression-Aware Training for bandwidth reduction | Chaim Baskin, Brian Chmiel, Evgenii Zheltonozhskii, Ron Banner, Alex M. Bronstein, Avi Mendelson | JMLR | 2021 |
| Energy awareness in low precision neural networks | Nurit Spingarn Eliezer, Ron Banner, Elad Hoffer, Hilla Ben-Yaakov, Tomer Michaeli | Preprint | 2022 |
| Power Awareness in Low Precision Neural Networks | Nurit Spingarn Eliezer, Ron Banner, Elad Hoffer, Hilla Ben-Yaakov, Tomer Michaeli | ECCV Workshop | 2022 |
| On Recoverability of Graph Neural Network Representations | Maxim Fishman, Chaim Baskin, Evgenii Zheltonozhskii, Ron Banner, Avi Mendelson | ICLR GTRL Workshop | 2022 |
| Minimum variance unbiased N:M sparsity for the neural gradients | Brian Chmiel, Itay Hubara, Ron Banner, Daniel Soudry | ICLR (Spotlight) | 2023 |
| Accurate neural training with 4-bit matrix multiplications at standard formats | Brian Chmiel, Ron Banner, Elad Hoffer, Hilla Ben Yaacov, Daniel Soudry | ICLR | 2023 |
| DropCompute: simple and more robust distributed synchronous training via compute variance reduction | Niv Giladi, Shahar Gottlieb, Asaf Karnieli, Ron Banner, Elad Hoffer, Kfir Y. Levy, Daniel Soudry | NeurIPS | 2023 |