EE Seminar: Efficient representations for dense reasoning with long videos

December 31, 2025, 13:00
Hall 011, Electrical Engineering-Kitot Building

Registration for the seminar takes place at the start of the seminar by scanning the Moodle barcode (please log in to Moodle beforehand, NOT through the app).

 

(The talk will be given in English)

 

Speaker:     Prof. Greg Shakhnarovich

                       Toyota Technological Institute at Chicago

 

Hall 011, Electrical Engineering-Kitot Building

Wednesday, December 31st, 2025

13:00 - 14:00

 

Efficient representations for dense reasoning with long videos

 

Abstract

In some video understanding scenarios, it is important to capture details that exist at fine temporal resolution over a long context (hundreds, thousands, or even tens of thousands of frames). This poses a computational challenge for many existing video encoders. I will discuss our recent efforts to develop video representation models that address this challenge in two ways, each with a different kind of video task in mind. In our work on sign language understanding, we extract information from each video frame in a highly selective way and train the long-context encoder on a large video corpus without any labels. The resulting video model, SHuBERT, is a "foundation model" for American Sign Language, achieving state-of-the-art performance on multiple sign language understanding tasks. In another ongoing effort, we focus on the task of nonlinear movie editing and develop an autoregressive model that relies on a highly compressed representation of video frames. This model, trained on an unlabeled corpus of movies, yields state-of-the-art results on complex movie editing tasks and on editing-related video understanding benchmarks.

Short Bio

Greg Shakhnarovich received a BSc degree in Mathematics and Computer Science from Hebrew University, Jerusalem, in 1994, an MSc degree in Computer Science from the Technion, Haifa, in 2001, and a PhD degree in Electrical Engineering and Computer Science from MIT in 2005. His dissertation focused on novel methods for learning concepts of similarity defined for a particular task and represented by a set of examples. The main applications of this work have been in computer vision, where it helped build systems for efficient analysis of human bodies in images and videos. He remains interested in statistical methods for learning similarity and in the closely related topic of example-based inference.

From 2005 to 2007, prior to joining TTIC, Greg was a Postdoctoral Research Associate in the Department of Computer Science and the Brain Sciences Program at Brown University. There he worked on computational methods for brain-machine interfaces, with applications in neuro-motor prostheses.

Shakhnarovich is interested in computational vision and machine learning. His current research is focused on automatic understanding of visual scenes, including recovery of three-dimensional structure and detection and categorization of objects. He is also generally interested in similarity-based, supervised and semi-supervised statistical learning methods.

 

This seminar is considered a hearing seminar for MSc/PhD students.

 
