I am a researcher and engineer with expertise in data infrastructure. My work spans query optimization, query processing, index tuning, transaction processing, and vector search. I co-authored Extensible Query Optimizers in Practice, a book on query optimizer design. I received my Ph.D. in Computer Science from Cornell University.
2025.5--present: Co-founder, stealth startup -- United States
2016.7--2025.5: Principal Researcher, Microsoft Research -- Redmond, WA
2013.10--2014.5: Research Intern, Microsoft Research -- Redmond, WA
2013.5--2013.8: Software Engineer Intern, LinkedIn -- Mountain View, CA
2012.5--2012.8: Software Engineer Intern, Amazon -- Seattle, WA
2009.2--2009.7: Research Intern, Microsoft Research Asia -- Beijing
2007.9--2007.11: Software Engineer Intern, Google -- Shanghai
Education
Cornell University (2010-2016)
Doctor of Philosophy in Computer Science
Research Area: Databases
Advisor: Prof. Johannes Gehrke
Fudan University (2006-2010)
Bachelor of Science, Department of Computer Science
Book
Bailu Ding, Vivek Narasayya, Surajit Chaudhuri, Extensible Query Optimizers in Practice, Foundations and Trends in Databases 2024 (paper)
Publications
Hangdong Zhao, Yuanyuan Tian, Rana Alotaibi, Bailu Ding, Nicolas Bruno, Jesús Camacho-Rodríguez, Vassilis Papadimos, Ernesto Cervantes Juárez, Cesar Galindo-Legaria, Carlo Curino, I Can't Believe It's Not Yannakakis: Pragmatic Bitmap Filters in Microsoft SQL Server, Conference on Innovative Data Systems Research (CIDR) 2026 (paper)
Gwangoo Yeo, Zhiyang Shen, Wei Cui, Matteo Interlandi, Rathijit Sen, Bailu Ding, Qi Chen, Minsoo Rhu, ZipFlow: a Compiler-based Framework to Unleash Compressed Data Movement for Modern GPUs, arXiv 2026 (paper)
Yaoqi Chen, Jinkai Zhang, Baotong Lu, Qianxi Zhang, Chengruidong Zhang, Jingjia Luo, Di Liu, Huiqiang Jiang, Qi Chen, Jing Liu, Bailu Ding, Xiao Yan, Jiawei Jiang, Chen Chen, Mingxing Zhang, Yuqing Yang, Fan Yang, Mao Yang, RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference, arXiv 2025 (paper)
Jiongli Zhu, Yue Wang, Bailu Ding, Philip A. Bernstein, Vivek Narasayya, Surajit Chaudhuri, MINT: Multi-Vector Search Index Tuning, IEEE International Conference on Data Engineering (ICDE) 2026, arXiv 2025 (paper)
Yinan Li, Bailu Ding, Ziyun Wei, Lukas M. Maas, Momin Al-Ghosien, Spyros Blanas, Nicolas Bruno, Carlo Curino, Matteo Interlandi, Craig Peeper, Kaushik Rajan, Surajit Chaudhuri, Johannes Gehrke, Scaling GPU-Accelerated Databases Beyond GPU Memory Size, Proceedings of the VLDB Endowment 2025 (paper)
Di Liu, Meng Chen, Baotong Lu, Huiqiang Jiang, Zhenhua Han, Qianxi Zhang, Qi Chen, Chengruidong Zhang, Bailu Ding, Kai Zhang, Chen Chen, Fan Yang, Yuqing Yang, Lili Qiu, RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval, ENLPS @ NeurIPS 2025 (Best Paper Award), arXiv 2024 (paper)
Bailu Ding, Jiaqi Zhai, Retrieval with Learned Similarities, The Web Conference 2025 (paper)
Bailu Ding, Surajit Chaudhuri, Johannes Gehrke, Vivek Narasayya, DSB: A Decision Support Benchmark for Workload-Driven and Traditional Database Systems, Proceedings of the VLDB Endowment 2021 (paper)
Bailu Ding, Surajit Chaudhuri, Vivek Narasayya, Bitvector-aware Query Optimization for Decision Support Queries, ACM SIGMOD/PODS International Conference on Management of Data 2020 (paper)
Lin Ma, Bailu Ding, Sudipto Das, Adith Swaminathan, Active Learning for ML Enhanced Database Systems, ACM SIGMOD/PODS International Conference on Management of Data 2020 (paper)
Bailu Ding, Sudipto Das, Ryan Marcus, Wentao Wu, Surajit Chaudhuri, Vivek Narasayya, AI Meets AI: Leveraging Query Executions to Improve Index Recommendations, ACM SIGMOD/PODS International Conference on Management of Data 2019 (paper)
Bailu Ding, Lucja Kot, Johannes Gehrke, Improving Optimistic Concurrency Control Through Transaction Batching and Operation Reordering, Proceedings of the VLDB Endowment 2019 (paper)
Chi Wang, Bailu Ding, Fast Approximation of Empirical Entropy via Subsampling, ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2019 (paper)
Bailu Ding, Sudipto Das, Wentao Wu, Surajit Chaudhuri, Vivek Narasayya, Plan Stitch: Harnessing the Best of Many Plans, Proceedings of the VLDB Endowment 2018 (paper)
Bailu Ding, Lucja Kot, Alan Demers, Johannes Gehrke, Centiman: Elastic, High Performance Optimistic Concurrency Control by Watermarking, ACM Symposium on Cloud Computing (SoCC) 2015 (paper)
Philip A. Bernstein, Sudipto Das, Bailu Ding, Markus Pilman, Optimizing Optimistic Concurrency Control for Tree-Structured, Log-Structured Databases, ACM SIGMOD/PODS International Conference on Management of Data 2015 (paper)
Sudip Roy, Lucja Kot, Gabriel Bender, Bailu Ding, Hossein Hojjat, Christoph Koch, Nate Foster, Johannes Gehrke, The Homeostasis Protocol: Avoiding Transaction Coordination Through Program Analysis, ACM SIGMOD/PODS International Conference on Management of Data 2015 (paper)
Bailu Ding, Jiang-Ming Yang, Chong Wang, Rui Cai, Zhiwei Li, Lei Zhang, Who Talks to Whom: Modeling Latent Structures in Dialogue Documents, NIPS 2009 Workshop on Applications for Topic Models: Text and Beyond (paper)
Zhongzhi Zhang, Jihong Guan, Bailu Ding, Liang Chen, Shuigeng Zhou, Contact Graphs of Disk Packings as a Model of Spatial Planar Networks, New Journal of Physics 11 (8), 083007 (paper)
Yi Qi, Zhongzhi Zhang, Bailu Ding, Shuigeng Zhou, Jihong Guan, Structural and Spectral Properties of a Family of Deterministic Recursive Trees: Rigorous Solutions, Journal of Physics A: Mathematical and Theoretical 42 (16), 165103 (paper)
Patents
Bailu Ding, Surajit Chaudhuri, Vivek Narasayya, Transforming queries using bitvector aware optimization, US Patent App. 16/917,489
Bailu Ding, Sudipto Das, Ryan Marcus, Lin Ma, Adith Swaminathan, Surajit Chaudhuri, Vivek Narasayya, Leveraging Query Executions to Improve Index Recommendations, US Patent 11,138,266
Bailu Ding, Sudipto Das, Wentao Wu, Surajit Chaudhuri, Vivek Narasayya, Execution Plan Stitching, US Patent 10,810,202
Service
Program Chair
SIGMOD 2027 Industry Track, Co-chair
SMDB@ICDE, Co-chair (2022-2023)
Review Board
VLDB Journal (2025-present)
Program Committee
VLDB: 2020-2025; Distinguished Reviewer (2024)
SIGMOD: 2020-2026; Distinguished Reviewer (2020); Industry Track (2023); Reproducibility (2023)