The CNCF-hosted Co-located Events Europe 2025 take place on 1 April. This event is happening in person at ExCeL London in London, England.

The Sched app allows you to build your schedule, but it is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon Europe 2025 and hold an All-Access pass in order to participate in the sessions.

To view the full event schedule for a specific CNCF-hosted Co-located event, use the right-hand navigation bar to sort and filter, or to bookmark your favorites and sync them to your phone or calendar.

The schedule is subject to change.
Venue: Level 1 | Hall Entrance N10 | Room G
Tuesday, April 1
 

09:00 BST

09:10 BST

The State of GenAI & ML in the Cloud Native Ecosystem - Alejandro Saucedo, Zalando SE
Tuesday April 1, 2025 09:10 - 09:35 BST
The growth of GenAI & ML has brought emerging challenges. In this talk we dive into the state of GenAI & ML in production across the cloud native ecosystem, providing an overview of the trends, challenges, opportunities, and tooling the ecosystem is standardizing towards.

As part of this session, we will provide a snapshot of the current state of the ecosystem, as uncovered by recent surveys, which highlight the gaps in tooling and skills [1]. We will then cover the best practices and tooling arising from production use cases of LLMOps/MLOps at scale to tackle domain-specific challenges such as agentic workflows, AI guardrails, and efficiency requirements, among others. These include the OSS frameworks [2] that support the end-to-end LLMOps/MLOps lifecycle across pipelining, optimization, productionisation, monitoring/observability and ML safety.

[1] https://ethical.institute/state-of-ml-2024
[2] https://bit.ly/ml-list
Speakers
Alejandro Saucedo
Director of Engineering, Science & Product, Zalando SE
Alejandro is Director of Engineering, Science & Product at Zalando SE, where he is responsible for central systems that power Supply and Demand across the group, including Zalando's central data and State-of-the-Art ML systems. He is also Chief Scientist at the Institute for Ethical...
Tuesday April 1, 2025 09:10 - 09:35 BST
Level 1 | Hall Entrance N10 | Room G

09:45 BST

⚡ Lightning Talk: Scalable and Observable RAG for Question-Answering in a Box - Selvi Kadirvel, Elotl
Tuesday April 1, 2025 09:45 - 09:55 BST
Enterprises have critical operational and product data distributed across a variety of sources, including public documentation, internal wikis, and various ticketing systems. With the rising capabilities of Large Language Models, enterprises and SMBs are looking to LLMs to derive valuable insights from these diverse data sources via natural-language querying.

In this talk, we illustrate how we can build a self-hosted RAG system to answer questions across these data sources powered by LLMs hosted on a scalable and observable Kubernetes cluster. Auto-scaling and observability are incorporated into our Question-Answering-in-a-Box stack both at the application and infrastructure level to ensure that GenAI app developers can start with pilot deployments and then systematically move through an Iterate-and-Improve cycle to eventually deploy their stack in production in a cost-effective fashion.
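The retrieve-then-generate loop the abstract describes can be sketched minimally. Keyword-overlap scoring stands in for a real embedding store, no actual LLM is called, and the function names are invented for the illustration:

```python
# Minimal RAG sketch: score documents by keyword overlap with the query,
# then assemble a grounded prompt for an LLM. A production system would
# use embeddings and a vector store instead of this toy scorer.

def retrieve(query, documents, top_k=2):
    """Rank documents by how many query words they share."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query, context_docs):
    """Concatenate retrieved context ahead of the user question."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

if __name__ == "__main__":
    docs = [
        "The payments service runs on cluster-a in region eu-west.",
        "Ticket 123: login outage traced to expired TLS certificate.",
        "The wiki explains how to rotate TLS certificates.",
    ]
    hits = retrieve("how do I rotate a TLS certificate", docs)
    print(build_prompt("how do I rotate a TLS certificate", hits))
```

The auto-scaling and observability layers the talk covers would wrap around this core loop at the cluster level, not in the application code.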
Speakers
Selvi Kadirvel
Engineering Lead, Elotl
Selvi is the Engineering Lead at Elotl where she works on building multi-cluster Kubernetes solutions. As an engineer in the infrastructure space for 15 years, she has worked on Kubernetes & container platforms at Cisco and ContainerX. In a prior avatar, she implemented machine-learning...
Tuesday April 1, 2025 09:45 - 09:55 BST
Level 1 | Hall Entrance N10 | Room G

10:00 BST

⚡ Lightning Talk: Introducing LLM Instance Gateways for Efficient Inference Serving - Abdel Sghiouar, Google Cloud
Tuesday April 1, 2025 10:00 - 10:10 BST
Large Language Models (LLMs) are revolutionizing applications, but efficiently serving them in production is a challenge. Existing API endpoints, load balancers, and gateways focus on HTTP/gRPC traffic, which is already a well-defined space. LLM traffic is completely different: a request is characterized by the size of the prompt and by the size and efficiency of the model, among other factors.

Why are LLM Instance Gateways important? They solve the problem of efficiently managing and serving multiple LLM use cases with varying demands on shared infrastructure.

What will you learn? The core challenges of LLM inference serving: Understand the complexities of deploying and managing LLMs in production, including resource allocation, traffic management, and performance optimization.

We will dive into how LLM Instance Gateways work, how they route requests, manage resources, and ensure fairness among different LLM use cases.
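As a rough illustration of the routing problem such gateways solve — not any real gateway's API; the replica pool and queue-depth heuristic below are invented for the sketch:

```python
# Toy LLM gateway: route each request to the replica serving the requested
# model with the shortest queue, approximating least-loaded routing.
# Real gateways also weigh prompt length, KV-cache state, and priorities.

class Replica:
    def __init__(self, name, model):
        self.name = name
        self.model = model
        self.queue = 0  # outstanding requests on this replica

def route(request_model, replicas):
    """Pick the least-loaded replica that serves the requested model."""
    candidates = [r for r in replicas if r.model == request_model]
    if not candidates:
        raise LookupError(f"no replica serves {request_model!r}")
    chosen = min(candidates, key=lambda r: r.queue)
    chosen.queue += 1
    return chosen

if __name__ == "__main__":
    pool = [Replica("a", "llama-7b"), Replica("b", "llama-7b"), Replica("c", "mixtral")]
    for _ in range(3):
        print(route("llama-7b", pool).name)  # load spreads across a and b
```

Fairness among use cases, as the talk describes, would add per-tenant accounting on top of this per-request choice.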
Speakers
Abdel Sghiouar
Cloud Developer Advocate, Google Cloud
Abdel Sghiouar is a senior Cloud Developer Advocate @Google Cloud, a co-host of the Kubernetes Podcast by Google, and a CNCF Ambassador. His focus areas are GKE/Kubernetes, Service Mesh and Serverless.
Tuesday April 1, 2025 10:00 - 10:10 BST
Level 1 | Hall Entrance N10 | Room G

10:40 BST

Serving the Future: KServe’s Next Chapter Hosting LLMs & GenAI Models (with Fun Drawings!) - Alexa Griffith & Tessa Pham, Bloomberg
Tuesday April 1, 2025 10:40 - 11:05 BST
In the rapidly evolving generative AI landscape, KServe has emerged as a pivotal platform for deploying and managing LLMs at scale. KServe simplifies deploying ML models on Kubernetes, but there’s so much more to the story than predictor pods and YAML files. With its newly expanded capabilities, KServe is ready to host the next generation of AI workloads, including LLMs and other generative AI applications.
As both maintainers of KServe and daily practitioners running it in Bloomberg’s clusters, we bring firsthand insights into how users utilize KServe to deploy advanced LLM features in production across hybrid environments. This session will delve into KServe's latest features tailored for generative AI. We will offer insights into its enhanced serving runtimes, scalability improvements, and integration strategies. Attendees will gain practical knowledge about deploying and scaling generative models using KServe, informed by real-world experiences and the lessons we’ve learned.
Speakers
Alexa Nicole Griffith
Senior Software Engineer, Bloomberg LP
Alexa Griffith is a Senior Software Engineer in Bloomberg's Cloud Native Compute Services organization. She works on building an inference platform for ML workflows and the open source project KServe. She enjoys solving engineering challenges at scale and writing code in Go. She...

Tessa Pham
Senior Software Engineer, Bloomberg
Tessa Pham is a Senior Software Engineer in Bloomberg's Cloud Native Compute Services organization. She works on building an inference platform for Bloomberg's Data Science Platform, used by engineers and data scientists for training, deploying and serving ML models. Tessa is a...
Tuesday April 1, 2025 10:40 - 11:05 BST
Level 1 | Hall Entrance N10 | Room G

11:15 BST

Manage Cloud Native LLM Workloads Across Edge and Cloud Seamlessly Using KubeEdge and WasmEdge - Vivian Hu, Second State & Fei Xu, Huawei Cloud
Tuesday April 1, 2025 11:15 - 11:40 BST
LLMs are moving beyond data centers to edge devices. While this migration promises reduced latency and enhanced privacy, it brings challenges: maintaining accuracy within limited resources, and deploying across heterogeneous devices.

The integration of KubeEdge and WasmEdge addresses these challenges. WasmEdge is a lightweight, portable runtime (less than 50 MB) with no external dependencies. KubeEdge's Sedna orchestrates the edge-cloud collaboration: it monitors inference accuracy and automatically routes requests to cloud-based models when edge processing doesn't meet accuracy thresholds.

This session will demo small LLMs providing quick, local inference at the edge. When higher accuracy is needed, Sedna seamlessly transitions to larger models in the cloud. The inference workload is built in Rust and compiled to Wasm, enabling deployment across edge and cloud without any code changes.

The solution has been implemented in production across multiple settings, from aerospace to bank branches.
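The accuracy-threshold fallback described above can be sketched as plain control flow. The stand-in models and confidence values below are invented, and Sedna's real joint-inference API is considerably richer:

```python
# Edge-first inference with cloud fallback, mimicking the accuracy-threshold
# routing described above. The models here are stand-in callables returning
# (answer, confidence); a real deployment would call actual LLM endpoints.

def edge_model(x):
    # Small model: fast, but only confident on short inputs (toy heuristic).
    confidence = 0.9 if len(x) < 20 else 0.4
    return f"edge:{x}", confidence

def cloud_model(x):
    return f"cloud:{x}", 0.99

def infer(x, threshold=0.8):
    """Answer at the edge when confident, otherwise escalate to the cloud."""
    answer, confidence = edge_model(x)
    if confidence >= threshold:
        return answer
    answer, _ = cloud_model(x)
    return answer

if __name__ == "__main__":
    print(infer("short query"))                     # handled at the edge
    print(infer("a much longer and harder query"))  # escalates to the cloud
```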
Speakers
Fei Xu
Senior Software Engineer, Huawei Cloud
Fei Xu is a Senior Software Engineer at Huawei Cloud and a KubeEdge TSC member, focusing on cloud native, Kubernetes, edge computing, and edge AI. He currently maintains the KubeEdge project, a CNCF graduated project, and has rich experience...

Vivian Hu
Product Manager, Second State
Vivian Hu is a Product Manager at Second State and a columnist at InfoQ. She is a founding member of the WasmEdge project. She organizes Rust and WebAssembly community events in Asia.
Tuesday April 1, 2025 11:15 - 11:40 BST
Level 1 | Hall Entrance N10 | Room G

11:50 BST

Panel: Engaging the Kubeflow Community: Building an Enterprise-Ready AI/ML Platform - Yuan Tang, Red Hat; Andrey Velichkevich, Apple; Andreea Munteanu, Canonical; Johnu George, Nutanix
Tuesday April 1, 2025 11:50 - 12:25 BST
When building a new solution, organizations often ask themselves whether to develop everything from scratch or integrate existing tools into an end-to-end solution. Kubeflow's journey started at exactly this crossroads. A CNCF incubating project, Kubeflow integrates a series of leading open source tools such as Knative, Istio, and KServe, among other AI/ML tools, for both predictive and GenAI/LLM applications.

In this panel we will discuss the trade-offs between building a product based on existing tools vs. a DIY approach. We will delve into the key considerations of adding new enhancements and components, based on the developments in the industry and user adoption. The panel will highlight the challenges of being an official distribution of such a product and customer use cases and the influence they had over the project’s roadmap. We will talk through the trials and tribulations that paid off in a win-win outcome for the Kubeflow community and our users.
Speakers
Yuan Tang
Principal Software Engineer, Red Hat
Yuan is a principal software engineer at Red Hat, working on OpenShift AI. He has led AI infrastructure and platform teams at various companies. He holds leadership positions in open source projects, including Argo, Kubeflow, and Kubernetes. He's a maintainer and author of many popular...

Andrey Velichkevich
Senior Software Engineer, Apple
Andrey Velichkevich is a Senior Software Engineer at Apple and a key contributor to the Kubeflow open source project. He is a member of the Kubeflow Steering Committee and a co-chair of the Kubeflow AutoML and Training WG. Additionally, Andrey is an active member of the CNCF WG AI. He...

Andreea Munteanu
AI Product Manager, Canonical
I help organizations drive scalable transformation projects with open source AI. I lead AI at Canonical, the publisher of Ubuntu. With a background in data science across industries like retail and telecommunications, I help enterprises make data-driven decisions with AI. I am passionate...

Johnu George
Technical Director, Nutanix
Johnu George is a Technical Director at Nutanix with a background in distributed systems and large-scale hybrid data pipelines. He is active in open source and has steered several industry collaborations on projects like Kubeflow, Apache Mnemonic and Knative. His research interests...
Tuesday April 1, 2025 11:50 - 12:25 BST
Level 1 | Hall Entrance N10 | Room G

13:30 BST

Accelerate Your AI/ML Workloads With Topology-Aware Scheduling in Kueue - Michał Woźniak, Google & Yuki Iwai, CyberAgent, Inc.
Tuesday April 1, 2025 13:30 - 13:55 BST
Optimizing execution time of AI training and inference is crucial in the era of LLMs. The workloads often exchange huge amounts of data between pods, making the network throughput a bottleneck.

Data centers are organized hierarchically in multiple layers, such as racks or blocks; however, leveraging this fact in vanilla Kubernetes is challenging because the scheduler needs to be aware of both the workloads and the cluster topology. Kueue, as a Job-level scheduler, is already workload-aware. To tackle the second challenge, we propose a convention for labeling nodes by cloud providers or cluster administrators. Leveraging this information, Kueue optimizes Pod placement within a cluster, ordering Pods by indices to enhance the performance of AI frameworks using NCCL.

In this session, we introduce the key concepts and machinery behind Topology-Aware Scheduling (TAS) in Kueue. We also compare TAS with alternatives and present results on how using it improves execution time of AI workloads.
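As a rough sketch of the idea behind topology-aware placement — group nodes by a topology label and keep a whole job inside one network domain when it fits — the following toy placer uses invented node and rack names; Kueue's actual TAS has its own label conventions and quota machinery:

```python
# Topology-aware placement sketch: given nodes labeled with their rack,
# place an N-pod job entirely inside one rack when possible, so collective
# (e.g. NCCL) traffic stays within a single network domain.
from collections import defaultdict

def place_job(num_pods, nodes):
    """nodes: list of (node_name, rack_label, free_slots).
    Returns (rack, [node per pod]) or (None, []) if no single rack fits."""
    racks = defaultdict(list)
    for name, rack, free in nodes:
        racks[rack].append((name, free))
    # Try racks in order of most free capacity first.
    for rack, members in sorted(racks.items(), key=lambda kv: -sum(f for _, f in kv[1])):
        if sum(f for _, f in members) >= num_pods:
            placement, remaining = [], num_pods
            for name, free in members:
                take = min(free, remaining)
                placement += [name] * take
                remaining -= take
                if remaining == 0:
                    return rack, placement
    return None, []  # job does not fit in any single rack

if __name__ == "__main__":
    nodes = [("n1", "rack-a", 2), ("n2", "rack-b", 4)]
    print(place_job(3, nodes))  # → ('rack-b', ['n2', 'n2', 'n2'])
```

A fuller implementation would fall back to the next topology level (block, data center) rather than failing when no single rack fits.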
Speakers
Michał Woźniak
Software Engineer, Google
Michał is a software engineer with a background in computer science, a PhD in computational biology, and 5+ years of professional experience. In his current role he is focusing on enhancing the support for batch workloads in the Kubernetes ecosystem. Outside of work he enjoys playing...

Yuki Iwai
Software Engineer, CyberAgent, Inc.
Yuki is a Software Engineer at CyberAgent, Inc. He works on the internal platform for machine-learning applications and high-performance computing. He is currently a Technical Lead for Kubeflow WG AutoML / Training. He is also a Kubernetes WG Batch active member, Job API reviewer...
Tuesday April 1, 2025 13:30 - 13:55 BST
Level 1 | Hall Entrance N10 | Room G

14:05 BST

Cluster Management for Large Scale AI and GPUs: Challenges and Opportunities - Claudia Misale & David Grove, IBM
Tuesday April 1, 2025 14:05 - 14:30 BST
There are new challenges in managing large GPU clusters dedicated to cloud native AI workloads. The workload mix is diverse, and GPUs must be effectively utilized and dynamically shared across multiple teams. Furthermore, GPUs are subject to a variety of performance degradations and faults that can severely impact multi-GPU jobs, requiring continuous monitoring and enhanced diagnostics. Cloud native tools such as Kubeflow, Kueue, and others are the building blocks for the large scale GPU clusters used by teams across IBM Research for training, tuning, and inference jobs. In this talk, IBM Research will share and demonstrate lessons learned in configuring large scale GPU clusters and in developing Kubernetes-native automation to run health checks on GPUs and report their health. Finally, they will show the use of diagnostics to enable both the dynamic adjustment of quotas to account for faulty GPUs and the automatic steering of new and existing workloads away from nodes with faulty GPUs.
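The remediation loop the abstract describes — health-check GPUs, steer work away from faulty nodes, and shrink quotas accordingly — can be sketched as follows; the data shapes and numbers are illustrative, not IBM's implementation:

```python
# Sketch of a GPU-health reconcile loop: nodes whose health check failed
# are excluded from scheduling, and the team quota is reduced by the GPUs
# lost to faults, so new workloads steer away from bad hardware.

def reconcile(nodes, total_quota):
    """nodes: dict of node_name -> (gpu_count, healthy).
    Returns (schedulable node names, quota adjusted for faulty GPUs)."""
    schedulable = [n for n, (_, healthy) in nodes.items() if healthy]
    lost = sum(g for g, healthy in nodes.values() if not healthy)
    return schedulable, max(0, total_quota - lost)

if __name__ == "__main__":
    nodes = {"gpu-1": (8, True), "gpu-2": (8, False), "gpu-3": (8, True)}
    print(reconcile(nodes, 24))  # → (['gpu-1', 'gpu-3'], 16)
```

In a Kubernetes setting the "exclude" step would be a taint or cordon and the quota adjustment a change to a Kueue ClusterQueue, driven by a controller rather than a function call.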
Speakers
Claudia Misale
Staff Research Scientist, IBM Research
Claudia Misale is a Staff Research Scientist in the Hybrid Cloud Infrastructure Software group at IBM T.J. Watson Research Center (NY). Her research is focused on Kubernetes and targets monitoring, observability and scheduling for HPC and AI training workloads. She is mainly interested...

David Grove
Distinguished Research Scientist, IBM Research
David Grove is a Distinguished Research Scientist at IBM T.J. Watson, NY, USA. He has been a software systems researcher at IBM since 1998, specializing in programming language implementation and scalable runtime systems. His current research focuses on cloud-related technologies...
Tuesday April 1, 2025 14:05 - 14:30 BST
Level 1 | Hall Entrance N10 | Room G

14:40 BST

Objection! AI Security Mistakes on Trial With Kubeflow and Confidential Computing - Annie Talvasto, Waovo & Karl Ots, EPAM Systems
Tuesday April 1, 2025 14:40 - 15:05 BST
Enter the courtroom of cloud-native justice, where the most pressing AI security mistakes are put on trial. From exposed sensitive data to flawed model training and insecure pipelines, the prosecution will lay bare the vulnerabilities threatening AI deployments. But don’t worry—Kubeflow, confidential computing, and other powerful open source projects will take the stand to defend your AI infrastructure. Learn how these technologies work together to enforce robust security guardrails, protect sensitive data, ensure compliance, and mitigate the risks that come with AI operations. This session blends technical depth with courtroom drama to help you identify, understand, and address common AI security mistakes, so you can build secure, scalable AI pipelines with confidence. Join us for a verdict that ensures the protection of your AI workloads!
Speakers
Annie Talvasto
CNCF Ambassador & CTO, Waovo
Annie Talvasto is an award-winning international technology speaker and leader. She has spoken at over 60 tech conferences worldwide, including KubeCon + CloudNativeCon. She has been recognized with the CNCF Ambassador and Azure & AI Platform MVP awards. She has co-organized the Kubernetes...

Karl Ots
Head of Cloud Security, EPAM Systems
Karl Ots is a cloud security leader and author with over 15 years of experience in the technology industry. He has advocated for open source technologies for over 15 years, including as an instructor in his LinkedIn Learning courses. He is also a prolific author...
Tuesday April 1, 2025 14:40 - 15:05 BST
Level 1 | Hall Entrance N10 | Room G

15:20 BST

From Toil To Triumph: Harnessing Agentic AI To Streamline Infrastructure as Code - Jodee Varney, Outshift by Cisco
Tuesday April 1, 2025 15:20 - 15:45 BST
This talk will explore the transformative potential of GenAI agentic frameworks, using Infrastructure as Code (IaC) as a key example relevant to the CNCF community. While IaC offers benefits in modularity and control, it also presents challenges like maintaining code consistency, managing multiple environments, and troubleshooting IAM policies, creating toil for development teams.

We'll demonstrate how open-source agentic GenAI frameworks can be applied to OpenTofu repositories to streamline pull requests - enhancing consistency and reducing toil. We'll focus on how to construct GenAI agentic teams for each function to achieve useful quality high-context results, emphasizing cost management by allocating resources based on function complexity and impact. By sharing insights, we aim to highlight its broader applicability and seek collaborators for a CNCF project aimed at developing new agentic tools that aid in managing cloud-native environments.
Speakers
Jodee Varney
Principal Product Manager, Outshift by Cisco
Jodee Varney is a veteran product manager focused on developing tools to enhance DevOps processes. As a passionate advocate for open collaboration, she looks forward to every opportunity to work with other members of the CNCF community. She has a knack for transforming complex problems...
Tuesday April 1, 2025 15:20 - 15:45 BST
Level 1 | Hall Entrance N10 | Room G

15:55 BST

AI, CERN, and the Quest for GPU Custody: How CERN Leverages DRA for Efficient GPU Sharing - Diana Gaponcic, CERN & Jan-Philip Gehrcke, NVIDIA
Tuesday April 1, 2025 15:55 - 16:20 BST
Dynamic Resource Allocation (DRA) is quickly gathering momentum to become the go-to way of advertising GPUs on Kubernetes clusters. In this talk, we will present the current state of the project, the latest implementation updates, and feature additions. We will walk through how to get started with DRA, and why this is relevant for any engineer trying to improve the GPU offering on their clusters. We continue with configuring time-slicing, MPS, and MIG, and explain how to build more custom layouts on top.

Next, we will show how DRA is used at CERN to colocate machine learning workloads on the same GPU. We start by explaining how to choose the best-fitted sharing mechanism depending on the performance requirements. We present extensive training and inference benchmarking results, and how DRA comes into play to make the system flexible and easy to use. Lastly, we go through GPU sharing tradeoffs, and how in the end this approach can help save resources.
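The sharing tradeoffs the abstract walks through can be compressed into a toy selection heuristic; the thresholds below are illustrative guesses, not NVIDIA or CERN guidance:

```python
# Heuristic for picking a GPU-sharing mechanism, following the tradeoffs
# discussed above: MIG for hard isolation or large memory footprints,
# MPS for latency-sensitive concurrent work, time-slicing for best-effort
# batch jobs. Thresholds here are purely illustrative.

def pick_sharing(needs_isolation, latency_sensitive, mem_fraction):
    if needs_isolation or mem_fraction > 0.5:
        return "MIG"           # hardware-partitioned memory and SMs
    if latency_sensitive:
        return "MPS"           # concurrent kernels, lower queueing delay
    return "time-slicing"      # simplest, best-effort sharing

if __name__ == "__main__":
    print(pick_sharing(True, False, 0.3))   # → MIG
    print(pick_sharing(False, True, 0.2))   # → MPS
    print(pick_sharing(False, False, 0.1))  # → time-slicing
```

With DRA, whichever mechanism is chosen is then expressed as a resource claim rather than baked into node configuration.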
Speakers
Diana Gaponcic
Computing Engineer, CERN
Diana is a Computing Engineer in the CERN IT department. After an internship at CERN focusing on containerization of ETL applications she later joined the Kubernetes team, working on the GitOps and monitoring infrastructure. Her current focus is on optimizing the usage of GPUs and...

Jan-Philip Gehrcke
tbd, NVIDIA
tbd
Tuesday April 1, 2025 15:55 - 16:20 BST
Level 1 | Hall Entrance N10 | Room G

16:30 BST

Open Source Tools To Empower Ethical and Robust AI Systems - Vicente Herrera & Alberto Rodríguez Fernandez, Control Plane
Tuesday April 1, 2025 16:30 - 16:55 BST
In this talk, Vicente Herrera will present open source tools for evaluating and securing AI models that are essential to building responsible AI systems, along with an ontology explaining where each tool can assist in these tasks.

He will show tools like Garak, which helps identify undesirable behaviors; LLM Guard and LLM Canary, which provide detection and prevention of adversarial attacks and unintended data disclosures; and Promptfoo, which optimizes prompt engineering and testing, leading to more reliable and consistent AI outputs.
For adversarial robustness, Counterfit, the Adversarial Robustness Toolbox, and BrokenHill provide solutions to assess AI models against malicious threats. Regarding fairness and compliance, AI Fairness 360 and Audit AI are important for understanding how models can be just and accountable.

The final goal is to be able to choose a model not only by how big it is or how good its knowledge-evaluation score is, but also by how robust and fair it is.
Speakers
Vicente Herrera
Principal Consultant, Control Plane
Principal Consultant at Control Plane, focusing on Kubernetes and AI cybersecurity for fintech organizations. Core member of AI Readiness Group in FINOS, collaborating in defining security risks, controls and mitigations. Lecturer at Loyola University in Seville for the Master's program...
Tuesday April 1, 2025 16:30 - 16:55 BST
Level 1 | Hall Entrance N10 | Room G

17:00 BST

⚡ Lightning Talk: Benchmarking Your Distributed ML Training on the K8s Platform - Liang Yan, CoreWeave
Tuesday April 1, 2025 17:00 - 17:10 BST
Kubernetes is widely adopted for inference workloads, but distributed ML training still presents challenges, such as dynamic resource scaling, GPU scheduling, and efficient inter-node communication. Recent advancements, including KubeRay, Kubeflow, and Slurm integration, have expanded Kubernetes' capabilities for training workloads, making it a more viable option for complex, large-scale ML tasks.

This session focuses on the next step: benchmarking these tools to evaluate and optimize their performance for distributed ML training. We’ll review existing solutions, discuss the design and implementation of our benchmarking platform, and demonstrate how it provides actionable insights to improve throughput, scalability, and efficiency.
Speakers
Liang Yan
Sr. Software Engineer, CoreWeave
Liang Yan is a senior software engineer at CoreWeave, specializing in AI infrastructure, heterogeneous architecture acceleration, and cloud-based distributed machine learning systems. He collaborates closely with upstream communities and leading vendors like NVIDIA, AMD and ARM, delivering...
Tuesday April 1, 2025 17:00 - 17:10 BST
Level 1 | Hall Entrance N10 | Room G

17:15 BST

⚡ Lightning Talk: {”spec”: “nodeSelector”: "EuroHPC"} - Diego Ciangottini, INFN
Tuesday April 1, 2025 17:15 - 17:25 BST
All you need to run your application on a supercomputer is to target a specific node on your local Kubernetes cluster. If you are really an expert and want to play with the MPI capabilities of a SLURM batch system at a remote HPC center, you just have to pass your job parameters in the pod annotations.

This is possible today, and at INFN we are proposing a community ecosystem to streamline the adoption of Virtual Kubelet technology. From beefy machines in your basement to batch systems and PaaS/CaaS services, interLink provides a common, cloud-native interface to run Kubernetes pods where Kubernetes is not an option.

We'll present how this is possible and how the first real scientific use cases are already running their payloads at EuroHPC centers. We'll demo ML training and GenAI frameworks running seamlessly on the Leonardo and Vega supercomputers, all from a single Kubernetes cluster.
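A hypothetical sketch of the pattern in the title — selecting the virtual node and passing SLURM parameters via annotations — might build a pod spec like this. The annotation prefix and node name are placeholders, not interLink's real keys; consult the interLink documentation for the actual conventions:

```python
# Sketch of a pod spec targeting a virtual-kubelet node that fronts an HPC
# center, with SLURM job parameters passed as annotations, as described
# above. The label values and annotation keys are hypothetical placeholders.
import json

def hpc_pod(name, image, slurm_opts):
    """Build a pod manifest dict pinned to a (hypothetical) HPC virtual node."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            "annotations": {
                f"slurm-job.example/{k}": str(v) for k, v in slurm_opts.items()
            },
        },
        "spec": {
            "nodeSelector": {"kubernetes.io/hostname": "eurohpc-vk"},
            "containers": [{"name": "main", "image": image}],
        },
    }

if __name__ == "__main__":
    pod = hpc_pod("mpi-train", "ghcr.io/example/train:latest",
                  {"ntasks": 8, "partition": "gpu"})
    print(json.dumps(pod, indent=2))
```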
Speakers
Diego Ciangottini
Technologist, INFN
Diego Ciangottini is a physicist and received his PhD from the University of Perugia, Italy in 2012. He now works as a technologist at INFN (the Italian National Institute for Nuclear Physics), researching cloud-native solutions for the scientific use cases of the institute. In that...
Tuesday April 1, 2025 17:15 - 17:25 BST
Level 1 | Hall Entrance N10 | Room G

