EE Seminar: Efficient LLM Systems: From Algorithm Design to Deployment

07 בינואר 2026, 15:00 
אולם 011, בניין כיתות חשמל 
EE Seminar: Efficient LLM Systems: From Algorithm Design to Deployment

הרישום לסמינר יבוצע בתחילת הסמינר באמצעות סריקת הברקוד למודל (יש להיכנס לפני כן למודל,  לא באמצעות האפליקציה)

Registration to the seminar is done at the beginning of the seminar by scanning the barcode for the Moodle (Please enter ahead to the Moodle, NOT by application)

 

(The talk will be given in English)

 

Speaker:     Dr. Rana Shahout

                        Harvard University

 

011 hall, Electrical Engineering-Kitot Building‏

Wednesday, January 7th, 2026

15:00 - 16:00

 

Efficient LLM Systems: From Algorithm Design to Deployment

 

Abstract

Large Language Models (LLMs) have transformed what machines can do and how systems are designed to serve them. These models are both computationally and memory demanding, revealing the limits of traditional optimization methods that once sufficed for conventional systems. A central challenge in building LLM systems is improving system metrics while ensuring response quality.

This talk presents approaches for reducing latency in LLM systems to support interactive applications, from scheduling algorithm design to deployment. It introduces scheduling frameworks that use lightweight predictions of request behavior to make informed decisions about prioritization and memory management across two core settings: standalone LLM inference and API-augmented LLMs that interact with external tools. Across both settings, prediction-guided scheduling delivers substantial latency reductions while remaining practical for deployment.

Short Bio

Rana Shahout is a Postdoctoral Fellow at Harvard University, working with Michael Mitzenmacher and Minlan Yu. She received her Ph.D. in Computer Science from the Technion and previously worked as a Senior Software Engineer at Mellanox (now NVIDIA). Her research combines machine learning, systems, and algorithmic theory to design efficient and scalable AI systems. Rana is a recipient of the Eric and Wendy Schmidt Postdoctoral Award, the Zuckerman Postdoctoral Fellowship, the Weizmann Institute Women’s Postdoctoral Career Development Award, the VATAT Postdoctoral Fellowship, and first place in the ACC Feder Family Award for Best Student Work in Communications.

 

  -סמינר זה ייחשב כסמינר שמיעה לתלמידי תואר שני ושלישי-

This Seminar Is Considered A Hearing Seminar For Msc/Phd Students-

 

 

 

 

 

אוניברסיטת תל אביב עושה כל מאמץ לכבד זכויות יוצרים. אם בבעלותך זכויות יוצרים בתכנים שנמצאים פה ו/או השימוש שנעשה בתכנים אלה לדעתך מפר זכויות
שנעשה בתכנים אלה לדעתך מפר זכויות נא לפנות בהקדם לכתובת שכאן >>