Session 1: LLMs for Databases
KathDB: Explainable Multimodal Database Management System with Human-AI Collaboration
Waiting to Decompress: The Economics of LLM-Based Compression
Andreas Kipf, Tobias Schmidt, Ping-Lin Kuo, Skander Krid, Moritz Rengert, Luca Heller, Andreas Zimmerer, Mihail Stoian, Varun Pandey, Alexander van Renen
BridgeScope: A Universal Toolkit for Bridging Large Language Models and Databases
Lianggui Weng, Dandan Liu, Rong Zhu, Bolin Ding, Jingren Zhou
Making Prompts First-Class Citizens for Adaptive LLM Pipelines
Uğur Çetintemel, Shu Chen, Alexander W. Lee, Deepti Raghavan, Duo Lu, Andrew Crotty
Deep Research is the New Analytics System: Towards Building the Runtime for AI-Driven Analytics
Matthew Russo, Tim Kraska
Session 2: Data Platform Benchmarking and Optimization Techniques
End-to-End Declarative Data Analytics: Co-designing Engines, Interfaces, and Cloud Infrastructure
Pinghe Li, Tom Kuchler, Marko Kabić, Tobias Stocker, Gustavo Alonso, Ana Klimovic
Survivorship Bias in Industrial Database Workloads
Ryan Marcus, Jeffrey Tao, Peizhi Wu, Zijie Zhao
A Multi-tenant Relational OLTP Database at Salesforce
Vaibhav Arora, Subho Chatterjee, Terry Chong, Thomas Fanghaenel, Pat Helland, Jamie Martin, Kaushal Mittal, Nat Wyatt
I Can't Believe It's Not Yannakakis: Pragmatic Bitmap Filters in Microsoft SQL Server
Hangdong Zhao, Yuanyuan Tian, Rana Alotaibi, Bailu Ding, Nicolas Bruno, Jesús Camacho-Rodríguez, Vassilis Papadimos, Ernesto Cervantes Juárez, Cesar Galindo-Legaria, Carlo Curino
Fast Vector Search in PostgreSQL: A Decoupled Approach
Jiayi Liu, Yunan Zhang, Chenzhe Jin, Aditya Gupta, Shige Liu, Jianguo Wang
Session 3: Text-to-SQL, Agents, LLMs, Oh My!
Text-to-SQL Benchmarks are Broken: An In-Depth Analysis of Annotation Errors
Tengjun Jin, Yoojin Choi, Yuxuan Zhu, Daniel Kang
Leveraging Query Optimizers to Verify the Soundness of LLM-based Query Rewrites for Real-World Workloads, and More
Vivek Narasayya, Surajit Chaudhuri
BenchPress: A Human-in-the-Loop Annotation System for Rapid Text-to-SQL Benchmark Curation
Fabian Wenz, Omar Bouattour, Devin Yang, Justin Choi, Cecil Gregg, Nesime Tatbul, Çağatay Demiralp
Please Don't Kill My Vibe: Empowering Agents with Data Flow Control
Charlie Summers, Haneen Mohammed, Eugene Wu
Supporting Our AI Overlords: Redesigning Data Systems to be Agent-First
Shu Liu, Soujanya Ponnapalli, Shreya Shankar, Sepanta Zeighami, Alan Zhu, Shubham Agarwal, Ruiqi Chen, Samion Suwito, Shuo Yuan, Ion Stoica, Matei Zaharia, Alvin Cheung, Natacha Crooks, Joseph E. Gonzalez, Aditya G. Parameswaran
Session 4: Distributed Coordination and Consistency
Consistency and Correctness in Data-Oriented Workflow Systems
Michael Stonebraker, Xinjing Zhou, Peter Kraft, Qian Li
Event Horizon: Asymmetric Dependencies for Fast Geo-Distributed Operations
Jonathan Arns, Harald Ng, Kyriakos Psarakis, Asterios Katsifodimos, Paris Carbone
Privacy Meets Regulations: Shaping the Future of Work
Mohammad Javad Amiri, Tristan Allard, Boon Thau Loo, Divyakant Agrawal, Amr El Abbadi
Rosé: Flexible Replication With Strong Semantics For Partitioned Databases
Ioannis Zarkadas, Kelly Kostopoulou, Thomas Graham, Junfeng Yang, Philip A. Bernstein, Asaf Cidon, Tamer Eldeeb
Session 5: SQL and Data Modeling
On the Vexing Difficulty of Evaluating IN Predicates
Altan Birler, Thomas Neumann
Raqlet: Cross-Paradigm Compilation for Recursive Queries
Amir Shaikhha, Youning Xia, Meisam Tarabkhah, Jazal Saleem, Anna Herlihy
Semantic Data Modeling, Graph Query, and SQL, Together at Last?
Jeff Shute, Colin Zheng, Romit Kudtarkar
Database Research needs an Abstract Relational Query Language
Wolfgang Gatterbauer, Diandre Miguel B. Sabale
Session 6: Data Integration and Wrangling
Towards Scalable Visual Data Wrangling via Direct Manipulation
El Kindi Rezig, Mir Mahathir Mohammad, Nicolas Baret, Ricardo Mayerhofer, Andrew McNutt, Paul Rosen
The Pneuma Project: Reifying Information Needs as Relational Schemas to Automate Discovery, Guide Preparation, and Align Data with Intent
Muhammad Imam Luthfi Balaka, Raul Castro Fernandez
A Vision for Autonomous Data Agent Collaboration: From Query-by-Integration to Query-by-Collaboration
Timo Eckmann, Carsten Binnig
Session 7: Memory, I/O, and Data Movement in Modern Data Systems
Flexible I/O for Database Management Systems with xNVMe
Emil Houlborg, Andreas Nicolaj Tietgen, Simon A. F. Lund, Marcel Weisgut, Tilmann Rabl, Javier González, Vivek Shah, Pınar Tözün
Declarative Memory Services
Jeronimo Castrillon, Jana Giceva, Yu Hua, Kimberly Keeton, Akhil Shekar, Kevin Skadron, Tianzheng Wang, Huanchen Zhang
Data Movement-Aware GPU Sharing for Data-Intensive Systems
Yi Jiang, Hamish Nicholson, Viktor Sanca, Anastasia Ailamaki
Cloudspecs: Cloud Hardware Evolution Through the Looking Glass
Till Steinert, Maximilian Kuschewski, Viktor Leis
Session 8: Hardware-Accelerated Query Processing
Rethinking Analytical Processing in the GPU Era
Bobbi Yogatama, Yifei Yang, Kevin Kristensen, Devesh Sarda, Abigale Kim, Adrian Cockcroft, Yu Teng, Joshua Patterson, Gregory Kimball, Wes McKinney, Weiwei Gong, Xiangyao Yu
Raster is Faster: Rethinking Ray Tracing in Database Indexing
Harish Doraiswamy, Jayant R. Haritsa
Does A Fish Need a Bicycle? The Case for On-Chip NPUs in DBMS
Alexander Baumstark, Kai-Uwe Sattler
Hash Joins Meet CXL: A Fresh Look
Wentao Huang, Mian Lu, Kian-Lee Tan