CNCF-hosted Co-located Events Europe 2025 takes place on 1 April. The event is held in person at Excel London in London, England. The Sched app allows you to build your schedule, but it is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon Europe 2025 and hold an All-Access pass in order to participate in the sessions.
Large Language Models (LLMs) are revolutionizing applications, but serving them efficiently in production remains a challenge. Existing API endpoints, LoadBalancers, and Gateways focus on HTTP/gRPC traffic, which is already a well-defined space. LLM traffic is fundamentally different: the cost of a request is driven by properties such as the size of the prompt and the size and efficiency of the model serving it.
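To make that contrast concrete, here is a minimal sketch (illustrative only, not taken from the session) of the request-level signals an LLM-aware gateway might inspect in an OpenAI-style chat completion request. The field names and helper below are hypothetical:

```python
# Hypothetical sketch: signals an LLM-aware gateway could extract from an
# OpenAI-style chat completion request body. Plain HTTP/gRPC load balancing
# never looks at these, yet they largely determine the cost of the request.
from dataclasses import dataclass


@dataclass
class LLMRequestSignals:
    model: str            # which model, and therefore which pool of servers
    prompt_chars: int     # rough proxy for prompt/token length
    max_tokens: int       # upper bound on generated output length
    streaming: bool       # streamed responses hold a connection open longer


def extract_signals(body: dict) -> LLMRequestSignals:
    """Pull routing-relevant fields out of a chat completion request body."""
    prompt = " ".join(m.get("content", "") for m in body.get("messages", []))
    return LLMRequestSignals(
        model=body.get("model", "unknown"),
        prompt_chars=len(prompt),
        max_tokens=int(body.get("max_tokens", 256)),
        streaming=bool(body.get("stream", False)),
    )


if __name__ == "__main__":
    example = {
        "model": "example-7b-instruct",  # hypothetical model name
        "messages": [{"role": "user", "content": "Summarize this document."}],
        "max_tokens": 512,
        "stream": True,
    }
    print(extract_signals(example))
```

A generic load balancer would treat the request above like any other POST; an LLM-aware gateway can use these signals to decide which replica, if any, should serve it.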
Why are LLM Instance Gateways important? They solve the problem of efficiently managing and serving multiple LLM use cases with varying demands on shared infrastructure.
What will you learn? First, the core challenges of LLM inference serving: the complexities of deploying and managing LLMs in production, including resource allocation, traffic management, and performance optimization.
We will then dive into how LLM Instance Gateways work: how they route requests, manage resources, and ensure fairness among different LLM use cases.
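As a rough illustration of that routing idea, here is a minimal sketch of least-loaded, priority-aware endpoint selection. The endpoint metrics (queue depth, KV-cache utilization), thresholds, and policy are assumptions chosen for illustration, not the actual gateway implementation covered in the talk:

```python
# Hypothetical sketch of an LLM-aware routing decision: pick the replica with
# the most headroom, using per-endpoint metrics (queue depth, KV-cache use)
# that a gateway could scrape from model servers. Field names and the scoring
# policy are illustrative assumptions, not a real project's API.
from dataclasses import dataclass


@dataclass
class Endpoint:
    address: str
    queue_depth: int             # requests currently waiting on this replica
    kv_cache_utilization: float  # 0.0-1.0, fraction of KV cache in use


def pick_endpoint(endpoints: list[Endpoint], critical: bool) -> Endpoint | None:
    """Least-loaded selection with a simple fairness rule: non-critical
    requests are shed first when replicas approach saturation."""
    candidates = [e for e in endpoints if e.kv_cache_utilization < 0.95]
    if not candidates:
        return None  # everything saturated: reject or queue at the gateway
    if not critical:
        # Keep headroom for critical use cases sharing the same pool.
        candidates = [e for e in candidates if e.kv_cache_utilization < 0.8]
        if not candidates:
            return None
    # Score by combined load; lower is better.
    return min(candidates, key=lambda e: (e.queue_depth, e.kv_cache_utilization))


if __name__ == "__main__":
    pool = [
        Endpoint("10.0.0.1:8000", queue_depth=4, kv_cache_utilization=0.72),
        Endpoint("10.0.0.2:8000", queue_depth=1, kv_cache_utilization=0.85),
        Endpoint("10.0.0.3:8000", queue_depth=2, kv_cache_utilization=0.40),
    ]
    print(pick_endpoint(pool, critical=False))  # -> 10.0.0.3 (most headroom)
    print(pick_endpoint(pool, critical=True))   # critical traffic may use busier replicas
```

The point of the sketch is only that routing decisions are made from model-server load signals rather than connection counts, which is what lets several LLM use cases share the same accelerator pool fairly.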
Abdel Sghiouar is a Senior Cloud Developer Advocate @Google Cloud, a co-host of the Kubernetes Podcast by Google, and a CNCF Ambassador. His focus areas are GKE/Kubernetes, Service Mesh, and Serverless.