Inference
33 sessions
08:31 Project Lightning Talk: How To Run Kubernetes Pods On My Slurm-Based HPC Center Elicium 2 · Diego Ciangottini
11:30 Hands-On Workshop to Build and Scale GenAI Inference on Kubernetes Hosted by Amazon Web Services Leonardo Royal Hotel Amsterdam, Paul van Vlissingenstraat 24, Amsterdam NH 1096 BK Room: Amstel 1-2
12:33 Project Lightning Talk: Next-Gen AI Orchestration With Volcano On Kubernetes Elicium 2 · Zhonghu Xu
13:36 Project Lightning Talk: Evolving KServe: The Unified Model Inference Platform For Both Predictive And Generative AI Elicium 2 · Yuan Tang
08:37 Keynote: Rules of the Road for Shared GPUs: AI Inference Scheduling at Wayve Hall 12 · Mukund Muralikrishnan
08:42 Sponsored Keynote: From Complexity to Clarity: Engineering an Invisible Kubernetes Hall 12 · Jesse Butler
08:56 Keynote: From Inference to Agents: Where Open Source AI Is Headed Hall 12 · Jonathan Bryce, Brian Stevens, Mark Collier, Lin Sun
11:00 Intelligent Routing for Optimized Inference Hall 7 | Room B · Antonio Berben, Felipe Vicens
11:00 Your Models Are Vulnerable: How KitOps Turns KServe Into a Zero-Trust Inference Platform Amtrium 1+2 · Brad Micklea, Gavrish Prabhu
13:30 Cloud Native Theater | Istio Day: Running State of the Art Inference with Istio and LLM-D Hall 1-5 | Tram Zone | Cloud Native Theater · Jackie Maertens
13:30 🚨 Contribfest: Testing the Waters: Getting Started With Kgateway G106 · Nina Polshakova, Mayowa Fajobi, David Jumani, Steven Thwaites
14:15 Schema Inference and Automation: A New Era for Telemetry Management Hall 12 · Nicolas Takashi, Arthur Silva Sens
14:15 To Swap or Not To Swap: Memory Management Design Patterns for AI Workloads in Kubernetes 1.34+ Amtrium 1+2 · Nic Vermande
15:15 LLM Inference at Scale: Orchestrating Prefill-Decode Disaggregation Forum · Zhonghu Xu
15:15 Slinky Expanded: Slurm, Kubernetes, and DRA Amtrium 1+2 · Praveen Krishna, Marlow Warnicke
16:00 SIG Network: The State of Networking for AI on Kubernetes E103-105 · David Martin, Haiyan Meng, Bowei Du, Kellen Swain, Nadia Pinaeva
16:00 Volcano: Orchestrating the Full AI Lifecycle – From Training To Inference and Agents E106-108 · Chen Zicong, Hajnal Máté
17:15 Sponsored Demo: Beyond Training: Volcano for Inference & Agents Hall 1-5 | Tram Zone | Demo Theater
08:24 Sponsored Keynote: Inference and Sovereign AI: Scaling Cloud-Native AI with Control and Compliance Hall 12 · Karena Angell, Vincent Caldeira
10:45 Route, Serve, Adapt, Repeat: Adaptive Routing for AI Inference Workloads in Kubernetes Auditorium · Nir Rozenbaum, Kellen Swain
12:15 Sponsored Demo: Building cross-cloud AI inference on Kubernetes with OSS Hall 1-5 | Tram Zone | Demo Theater
12:15 🪧 Poster Session: Efficient Inference for Training Hurricane Data and Predicting Future Movement Hall 1-5 | Gouda Zone | Poster Pavilion · Avery Yang
14:30 Sponsored Demo: 7 Things You Need to Run Production AI Workloads on K8s Hall 1-5 | Tram Zone | Demo Theater
14:30 Project Demo: Transforming KServe Into a Zero Trust Inference Platform with Modelkits Hall 1-5 | Gouda Zone | Project Pavilion
15:00 Optimizing LLM Inference for the Rest of Us F002-005 · Abdel Sghiouar
10:00 Achieving Resilient Multi-Cluster AI Inference on Kubernetes With Karmada and KubeRay Auditorium · Wei-Cheng Lai, Han-Ju Chen
12:45 Cloud Native at the Far(m) Edge: Running Kubernetes and AI on Tractors Auditorium · Mauro Morales, Jordan Karapanagiotis
12:45 Evolving KServe: The Unified Model Inference Platform for Both Predictive and Generative AI G102-103 · Filippe Spolti, Jooho Lee
13:30 BoF: Infrastructure Optimization for GPUs / Inference / Training / Networking G106
13:30 Envoy in the Era of Agentic Workloads G102-103 · Yan Avlasov, Erica Hughberg
13:30 Making Topology-Aware Scheduling Practical for AI Workloads: From Discovery to Simulation at Scale Hall 8 | Room D · Weizhou Lan
13:30 Redefining SLIs for LLM Inference: Managing Hybrid Cloud with vLLM & LLM-D Hall 7 | Room A · Christopher Nuland, Hilliary Lipsig
14:15 BoF: AI Observability G106