Yuanyuan Tian (田媛媛)

Partner Scientist Manager & Graph Architect at Microsoft.
ACM Distinguished Member.

Previously Principal Research Staff Member at IBM Almaden Research Center. PhD and MS from the University of Michigan, BS from Peking University.

CV · Google Scholar · [fullname without space]@microsoft[dot]com

Current Research

Yuanyuan leads research on graph queries and analytics, query and workload optimization, and ML-for-Systems in the Gray Systems Lab (GSL). Her broader interests include Systems-for-ML, HTAP, big data, and cloud computing.

Graph Queries and Analytics

Yuanyuan has worked on graphs for most of her career, producing two books and more than 20 papers. She is currently building scale-out graph query and analytics platforms, working with Azure data product teams and the Liquid team at LinkedIn. She founded and led IBM Db2 Graph and is an architect for Graph in Microsoft Fabric. Her PhD thesis was on querying graph databases.

Selected: Graph Databases Survey (SIGMOD Record'24, '22), Graph BI Benchmarking (DaMoN'23), Db2 Graph (SIGMOD'20, VLDB'19), Large-scale Graph Processing (ICDE'15, PVLDB'13), Graph Summarization (CIKM'14, ICDE'10, SIGMOD'08), Graph Matching (ICDE'08, Bioinformatics'07)

Workload Optimization

Yuanyuan and her team rethink classical query optimization by advocating for Query Optimizer as a Service (QOaaS). The idea is to decouple the query optimizer (QO) from the engine. This enables independent deployment, workload-level optimizations (index/view selection, ML-driven enhancements), shared development costs across engines, and multi-engine federation, in which sub-plans are routed to the best engine.

Selected: Industry Perspective for QO (SIGMOD Record'26), Bitmap Filter in SQL Server (CIDR'26), QOaaS (CIDR'25), Workload Forecasting (SIGMOD'24)

ML-for-Systems

Yuanyuan is co-leading the ML-for-Systems research in GSL. She and her team work with Azure data product teams on applying data-driven approaches to automate data services — spanning cloud infrastructure, query engines, and service layers.

Selected: Autonomous Data Service Vision (SIGMOD'23), MLOS (VLDB'24), Semantic Equivalence Detection (SIGMOD'24)

Past Research

SQL for Big Data

Her past work in this area covers SQL-on-Hadoop, hybrid warehouses, and HTAP. Her SQL-on-Hadoop work was tied to IBM Db2 Big SQL, and the Wildfire HTAP system she co-developed became IBM Db2 Event Store.

Selected: HTAP for Big Data (VLDB'20, BigData'19, SIGMOD'19, EDBT'19, CIDR'17, SIGMOD'16 Demo), Hybrid Warehouses (TODS'16, EDBT'15), CoHadoop (PVLDB'11), Hadoop Joins (SIGMOD'10)

Systems for Machine Learning

Yuanyuan is the co-inventor and a lead developer of SystemML, now Apache SystemDS. She also worked on integrating SQL and ML systems, and temporally-biased sampling for online model management.

Selected: Sampling for Online Model Management (Information Processing Letters'23, TODS'19, SIGMOD Record'19, EDBT'18), Integration of SQL and ML (EDBT'15), SystemML (SIGMOD'15, IEEE DE Bulletin'14, PVLDB'14, ICDE'12, ICDE'11)

Books

Systems for Big Graph Analytics

D. Yan, Y. Tian, J. Cheng

SpringerBriefs in Computer Science, Springer, 2017

Big Graph Analytics Platforms

D. Yan, Y. Bu, Y. Tian, A. Deshpande

Foundations and Trends in Databases, Vol. 7: No. 1-2, pp 1-195, 2017

Selected Awards

Invited Talks

Panels

Professional Service

Editor: SIGMOD PC Advisory Board Member · Associate Editor for VLDB 2025 · Associate Editor for SIGMOD 2024 · Associate Editor for Frontiers in Big Data (since 2021) · Associate Editor for VLDB Journal (2019–2025) · Associate Editor for PVLDB Vol. 11 (VLDB 2018) · Section Editor for Encyclopedia on Big Data Technologies.

Chair: SIGMOD 2027 PC Chair · VLDB 2025 DEI Chair · EDBT 2025 Industrial & Application Chair · SoCC 2023 PC Chair · VLDB 2023 Industry Chair · EDBT 2023 Demo Chair · CIDR 2023 Diversity & Inclusion Chair · CIDR 2022 Diversity & Inclusion Chair · VLDB 2021 Demo Chair · IEEE BigData 2019 Industry and Government Chair · VLDB 2019 Workshop Chair · ICDE 2017 Demo Chair · CIKM 2013 Poster Chair

Workshop Chair: 3rd Workshop on Large Scale Network Analysis (LSNA 2014) · 5th Workshop on Graph Data Management (GDM 2014) · 2nd LSNA (2013) · 4th GDM (2013) · 1st LSNA (2012)

PC Member: CIDR 2023 · SIGMOD 2023 Industry · SIGMOD 2022 Industry · CIDR 2022 · CIDR 2021 · VLDB 2020 Demo · SIGMOD 2020 · VLDB 2019 · SIGMOD 2018 · VLDB 2017 · VLDB 2016 Industrial · TKDE 2016 Poster · VLDB 2015 · ICDE 2014 · WISE 2013 · SIGMOD 2012 · GDM 2012 · VLDB 2011 Industrial · DBSocial 2011 · GDM 2011 · ICDE 2011 · GDM 2010 · VLDB 2009

Reviewer for Journals: VLDB Journal (2014, 2017) · TODS (2013, 2015) · Statistical Analysis and Data Mining (2009) · Information Systems (2010, 2011, 2013) · ACM TIST (2010) · Distributed and Parallel Databases (2012)

Reviewer for Books: Data Processing Techniques in The Era of Big Data

Reviewer for Grants: NSF Advisory Panel (2013, 2016) · Research Grants Council of Hong Kong (2010, 2011)

Reviewer for Awards: The NCWIT Award for Aspirations in Computing