Ravid Shwartz-Ziv

Assistant Professor and Faculty Fellow

New York University

Biography

I’m an Assistant Professor and Faculty Fellow at the NYU Center for Data Science in New York, where I work with Andrew Gordon Wilson and Yann LeCun to understand and develop better models using Bayesian deep networks, information theory, and self-supervised learning.

I completed my Ph.D. under the supervision of Prof. Naftali Tishby at the Hebrew University of Jerusalem, where I also worked with Prof. Haim Sompolinsky. In my Ph.D., I focused on the connection between deep neural networks and information theory, aiming to develop a deeper understanding of deep networks based on information theory and to apply it to large-scale problems. I received the Google Ph.D. Fellowship.

Additionally, I am a researcher in Intel’s Artificial Intelligence Research Group, where I am involved in developing deep learning, computer vision, and sensory data solutions for healthcare, manufacturing, and marketing, for both internal and external use.

In 2019-2020, I worked at Google Brain in Mountain View, CA, exploring DNN generalization using information-theoretic tools.

In the past, I have also been involved in several Wikipedia projects, and I volunteer at The Public Knowledge Workshop in my spare time.

I also enjoy playing basketball.

Interests

  • Artificial Intelligence
  • Computational Neuroscience
  • Information Theory
  • Bayesian Deep Networks

Education

  • PhD in Computer Science and Neuroscience, 2021

    The Hebrew University of Jerusalem

  • MSc in Computer Science and Neuroscience, 2016

    The Hebrew University of Jerusalem

  • BSc in Computer Science and Bioinformatics, 2014

    The Hebrew University of Jerusalem

Experience


Faculty Fellow

NYU, Center for Data Science

Sep 2021 – Present, New York, NY, USA
Studying information theory, Bayesian deep networks, and representations of self-supervised learning models.

Research Student

Google AI, Host: Dr. Alex Alemi

Jun 2019 – May 2020, Mountain View, CA, USA
Exploration of generalization in infinitely-wide neural networks using information-theoretic quantities.

Graduate Researcher

Advisor: Professor Naftali Tishby - The Hebrew University of Jerusalem

Jan 2016 – Aug 2021, Israel

Empirical and theoretical study of DNNs based on information-theoretic principles.

  • Development of a deeper understanding of DNNs based on information theory.
  • Design of large-scale algorithms implementing the information bottleneck framework.

Graduate Researcher

Advisor: Professor Haim Sompolinsky - The Hebrew University of Jerusalem

Jan 2016 – Aug 2021, Israel
Development of biologically plausible models of perceptual and transfer learning in DNNs.

Senior AI and Data Science Researcher

Intel

Feb 2013 – Present, Israel

Developing novel deep learning, computer vision, and sensory data solutions for healthcare, manufacturing, sales, and marketing, for both internal and external use.

Selected projects:

  • Automated testing of graphics units with video anomaly detection.
  • Optimization of the validation process using transfer learning on images.
  • Healthcare solutions using physical and virtual sensors.
  • Gait recognition for smartphones using sensor fusion.

Research Assistant

Advisor: Professor Leo Joskowicz - The Hebrew University of Jerusalem

Feb 2012 – Feb 2014, Israel
Development of DNN-based image segmentation algorithms for extracting medical indicators to detect abnormalities in embryos.

Recent Publications


What Do We Maximize in Self-Supervised Learning?

We examine self-supervised learning methods to provide an information-theoretic understanding of their construction. As a first step, we show how information-theoretic quantities can be obtained for a deterministic network. This enables us to demonstrate how SSL methods can be (re)discovered from first principles, along with their assumptions about the data distribution. Furthermore, we empirically demonstrate the validity of our assumptions, confirming our novel understanding.

Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Priors

We show that we can learn highly informative posteriors from the source task, through supervised or self-supervised approaches, which then serve as the basis for priors that modify the whole loss surface on the downstream task. This approach enables significant performance gains and more data-efficient learning on a variety of downstream classification and segmentation tasks.
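As a rough illustration of the idea (not the exact procedure from the paper), a diagonal Gaussian posterior learned on the source task can be treated as a prior whose negative log-density is added to the downstream loss; the model, scales, and weighting below are hypothetical:

    import torch
    import torch.nn as nn

    # Sketch: a diagonal Gaussian posterior (mean and variance per parameter) learned on the
    # source task is re-used as a prior that reshapes the downstream loss surface.
    def gaussian_prior_penalty(model, prior_mean, prior_var):
        # Negative log of a diagonal Gaussian prior over the parameters (up to a constant).
        penalty = 0.0
        for name, p in model.named_parameters():
            penalty = penalty + ((p - prior_mean[name]) ** 2 / (2.0 * prior_var[name])).sum()
        return penalty

    model = nn.Linear(512, 10)  # stand-in for a pre-trained backbone plus downstream head
    prior_mean = {n: p.detach().clone() for n, p in model.named_parameters()}       # from source task
    prior_var = {n: 0.1 * torch.ones_like(p) for n, p in model.named_parameters()}  # from source task

    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

    x, y = torch.randn(32, 512), torch.randint(0, 10, (32,))  # dummy downstream batch
    loss = criterion(model(x), y) + 1e-3 * gaussian_prior_penalty(model, prior_mean, prior_var)
    loss.backward()
    optimizer.step()

Unlike standard weight decay, which pulls parameters toward zero, this prior pulls the downstream solution toward regions the source posterior considers likely.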

Tabular Data: Deep Learning is Not All You Need

We explored whether deep models should be a recommended option for tabular data by rigorously comparing the new deep models to XGBoost on various datasets. Our study shows that XGBoost outperforms these deep models across the datasets, including the datasets used in the papers that proposed the deep models. We also show that an ensemble of deep models and XGBoost performs better on these datasets than XGBoost alone.
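As a toy sketch of such an ensemble (illustrative only, not the exact setup from the study), one can simply average the predicted class probabilities of XGBoost and a small neural network; the synthetic data and hyperparameters here are arbitrary:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from xgboost import XGBClassifier

    # Combine XGBoost with a small neural network by averaging predicted probabilities.
    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    xgb = XGBClassifier(n_estimators=200, max_depth=4).fit(X_tr, y_tr)
    mlp = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500).fit(X_tr, y_tr)

    ensemble_proba = 0.5 * xgb.predict_proba(X_te) + 0.5 * mlp.predict_proba(X_te)
    print("XGBoost accuracy: ", xgb.score(X_te, y_te))
    print("MLP accuracy:     ", mlp.score(X_te, y_te))
    print("Ensemble accuracy:", (ensemble_proba.argmax(axis=1) == y_te).mean())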

The Dual Information Bottleneck

We introduce a new framework that resolves some of the known drawbacks of the Information Bottleneck. We provide a theoretical analysis of the framework, characterize the structure of its solutions, and present a novel variational formulation for DNNs.

Information in Infinite Ensembles of Infinitely-Wide Neural Networks

We study the generalization properties of infinite ensembles of infinitely-wide neural networks. We report analytical and empirical investigations in the search for signals that correlate with generalization.

Talks

Compression in deep learning - an information theory perspective

While DNNs have achieved many breakthroughs, our understanding of their internal structure, optimization process, and generalization is poor, and we often treat them as black boxes. We attempt to resolve these issues by suggesting that DNNs learn to optimize the Information Bottleneck (IB) principle: the tradeoff between information compression and prediction quality. In the first part of the talk, I present this approach, showing an analytical and numerical study of DNNs in the information plane. This analysis reveals how the training process compresses the input to an optimal, efficient representation. I discuss recent works inspired by this analysis and show how we can apply them to real-world problems.

In the second part of the talk, I discuss information in infinitely-wide neural networks using recent results on the Neural Tangent Kernel (NTK). The NTK allows us to derive many tractable information-theoretic quantities. By utilizing these derivations, we can perform an empirical search for the information-theoretic quantities that affect generalization in DNNs. I also present the Dual Information Bottleneck (dualIB) framework, which finds an optimal representation that resolves some of the drawbacks of the original IB. A theoretical analysis of the dualIB shows the structure of its solution and its ability to preserve the original distribution’s statistics. Within this framework, we focus on the variational form of the dualIB, allowing its application to DNNs.
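For reference, the compression-prediction tradeoff mentioned above is commonly written as the IB objective over a stochastic representation T of an input X with target Y (the standard formulation, not anything specific to this talk):

    % Information Bottleneck objective: compress X into T while preserving information about Y;
    % beta trades off compression I(X;T) against prediction I(T;Y).
    \min_{p(t \mid x)} \; I(X;T) - \beta \, I(T;Y)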

Information in Infinite Ensembles of Infinitely-Wide Neural Networks

Finding signals that correlate with generalization in infinitely-wide neural networks using information-theoretic quantities.

Representation Compression in Deep Neural Network

An information-theoretic viewpoint on the optimization process and generalization ability of deep networks through the information plane, and on how compression can help.

On the Information Theory of Deep Neural Networks

Understanding Deep Neural Networks with the information bottleneck principle.

Open the Black Box of Deep Neural Networks

Where is the information in deep neural networks? Trying to find it by looking at the information plane.

Contact