EE Seminar: Problems in Information Theory and High Dimensional Statistics | Shahar Stein Ioushua

4 August 2024, 15:00 
Room 011, Kitot Building, Faculty of Engineering 

Electrical Engineering Systems Seminar

 

Speaker: Shahar Stein Ioushua

Ph.D. student under the supervision of Prof. Ofer Shayevitz

 

Sunday, 4th August 2024, at 15:00

Room 011, Kitot Building, Faculty of Engineering

 

Problems in Information Theory and High Dimensional Statistics

 

Abstract

We study problems in information theory and statistics where high dimension plays a key role. Often in such problems, the near-asymptotic setting enables the use of mathematical tools that are not applicable in low dimensions.

First, we study the problem of counting the number of graphs with a given subgraph distribution.

Let G be a large (simple, unlabeled) dense graph on n vertices. Suppose that we only know, or can estimate, the empirical distribution of the number of subgraphs F that each vertex in G participates in, for some fixed small graph F. How many other graphs would look essentially the same to us, i.e., would have a similar local structure? We derive upper and lower bounds on the number of graphs whose empirical distribution lies close (in the Kolmogorov-Smirnov distance) to that of G. Our bounds are given as solutions to a maximum entropy problem on random graphs of a fixed size k that does not depend on n, under global density constraints.
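To make the per-vertex subgraph statistic concrete, here is a small illustrative sketch (not the speaker's method) for the case where F is a triangle: it computes each vertex's triangle count, forms the empirical distribution, and compares two graphs via the Kolmogorov-Smirnov distance between the empirical CDFs. All function names are illustrative.

```python
from itertools import combinations

def triangle_counts(n, edges):
    """Number of triangles each vertex of an n-vertex simple graph lies in."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # A triangle at v is a pair of neighbors of v that are themselves adjacent.
    return [sum(1 for u, w in combinations(adj[v], 2) if w in adj[u])
            for v in range(n)]

def empirical_cdf(counts):
    """Empirical CDF of the per-vertex counts, as a step function on its support."""
    n = len(counts)
    return {x: sum(c <= x for c in counts) / n for x in sorted(set(counts))}

def ks_distance(counts_g, counts_h):
    """Sup-distance between the two empirical CDFs (Kolmogorov-Smirnov)."""
    cdf_g, cdf_h = empirical_cdf(counts_g), empirical_cdf(counts_h)
    def ev(cdf, x):  # evaluate a step CDF at point x
        return max([cdf[t] for t in cdf if t <= x], default=0.0)
    grid = sorted(set(counts_g) | set(counts_h))
    return max(abs(ev(cdf_g, x) - ev(cdf_h, x)) for x in grid)

# Two 4-vertex graphs: the complete graph K4 and the 4-cycle C4.
k4 = list(combinations(range(4), 2))
c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(triangle_counts(4, k4))  # every K4 vertex lies in C(3,2) = 3 triangles
print(ks_distance(triangle_counts(4, k4), triangle_counts(4, c4)))
```

Here the two local-count distributions are maximally far apart (KS distance 1), whereas two graphs that "look the same" in the sense of the abstract would have distance close to 0.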

Next, we study the benefits of batch learning in the overparameterized linear regression setting. Learning algorithms that divide the data into batches are prevalent in many machine-learning applications, typically offering useful trade-offs between computational efficiency and performance. Here, we examine the benefits of batch partitioning through the lens of a minimum-norm overparameterized linear regression model with isotropic Gaussian features. We suggest a natural small-batch version of the minimum-norm estimator, and derive an upper bound on its quadratic risk, showing it is inversely proportional to the noise level as well as to the overparameterization ratio, for the optimal choice of batch size. In contrast to the minimum-norm estimator, our estimator admits a stable risk behavior that is monotonically increasing in the overparameterization ratio, eliminating both the blowup at the interpolation point and the double-descent phenomenon. Interestingly, we observe that this implicit regularization offered by the batch partition is partially explained by feature overlap between the batches. Our bound is derived via a novel combination of techniques, in particular normal approximation in the Wasserstein metric of noisy projections over random subspaces.
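For readers unfamiliar with the setup, the following hedged sketch contrasts the full minimum-norm interpolator with one natural small-batch variant: split the n samples into batches, take the minimum-norm least-squares solution on each batch, and average the per-batch solutions. The abstract does not spell out the estimator's construction, so the averaging rule and all names here are illustrative assumptions, not the talk's method.

```python
import numpy as np

def min_norm(X, y):
    """Minimum-l2-norm interpolator: beta = X^+ y (pseudoinverse solution)."""
    return np.linalg.pinv(X) @ y

def batched_min_norm(X, y, batch_size):
    """Average of per-batch minimum-norm solutions (an assumed batch variant)."""
    n = X.shape[0]
    betas = [min_norm(X[i:i + batch_size], y[i:i + batch_size])
             for i in range(0, n, batch_size)]
    return np.mean(betas, axis=0)

rng = np.random.default_rng(0)
d, n, sigma = 200, 50, 0.5                 # overparameterized: d >> n
beta_star = rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))                # isotropic Gaussian features
y = X @ beta_star + sigma * rng.normal(size=n)

risk_full = np.sum((min_norm(X, y) - beta_star) ** 2)
risk_batch = np.sum((batched_min_norm(X, y, 10) - beta_star) ** 2)
print(risk_full, risk_batch)
```

Varying the overparameterization ratio d/n in such a simulation is one way to visualize the interpolation-point blowup of the full minimum-norm estimator that the batch construction is claimed to avoid.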

 

Seminar attendance credit will be granted based on registration of your full name and ID number on the attendance sheet circulated in the hall during the seminar.

 

 

 

Tel Aviv University makes every effort to respect copyright. If you hold the copyright to content found here, and/or believe that the use made of this content infringes your rights, please contact us promptly at the address here >>