
Welcome to Yuanyuan Tian’s Home Page!

Yuanyuan Tian is currently a Principal Scientist Manager and a Graph Architect at Microsoft Gray Systems Lab and an ACM Distinguished Member. Before Microsoft, she was a Principal Research Staff Member at IBM Almaden Research Center. She received her PhD degree in Computer Science & Engineering in 2008 and her MS degree in Computer Science & Engineering in 2005, both from the University of Michigan, and her BS degree in Computer Science & Technology with honors in 2003 from Peking University.

[Download CV]

Contact: [fullname without space]@microsoft[dot]com

Research Areas

Yuanyuan is currently leading the graph queries & analytics, query/workload optimization, and ML-for-Systems research efforts in GSL. Her other research interests include Systems-for-ML, HTAP, big data, and cloud computing.

Graph Queries and Analytics: Yuanyuan has a long-standing interest in graph queries and analytics. She has written two books and 20+ papers on graphs. Her current research includes building scale-out graph query and analytics platforms and designing distributed graph algorithms. Yuanyuan is currently a Graph Architect working with Azure data product teams and the LIquid team at LinkedIn on various graph projects. She was the founder and technical lead of IBM Db2 Graph, which is IBM’s graph database product. Her PhD thesis was on querying graph databases.

Selected Papers: Graph Databases Survey (SIGMOD Record’22), Graph BI Queries Benchmarking (DaMoN’23), IBM Db2 Graph (SIGMOD’20, VLDB’19), Dynamic Graph Analysis (ICDE’15), Giraph++ (PVLDB’13), Graph Summarization (CIKM’14, ICDE’10, SIGMOD’08), Graph Matching (ICDE’08, Bioinformatics’07)

Query/Workload Optimization: Yuanyuan and her team approach the classical query optimization (QO) problem from a fresh perspective, proposing a bold shift in direction. Instead of relying on bespoke QOs or library-sharing frameworks like Calcite, they advocate for reimagining the QO stack by fully embracing Query Optimizer as a Service (QOaaS). This approach decouples the QO from the engine’s query processing, enabling independent deployment and experimentation. QOaaS centralizes workload-level optimizations such as index/view selection and ML-driven QO enhancements, accelerates development by sharing costs across engines, and lays the foundation for multi-engine federation, where each query sub-plan is executed by the best-suited engine.

Selected Papers: QOaaS Vision (CIDR’25), Workload Forecasting (SIGMOD’24).

ML-for-Systems: Yuanyuan is co-leading the ML-for-Systems research area in GSL. She and her team work with various Azure data product teams on applying data-driven approaches to automate many aspects of data services on Azure. Their work spans cloud infrastructure, query engines, and service layers of the cloud stack, driving innovation and efficiency across the platform.

Selected Papers: Autonomous Data Service Vision (SIGMOD’23), MLOS (VLDB’24), Semantic Equivalence Detection (SIGMOD’24).

Past Research

SQL for Big Data: Yuanyuan’s work in this area includes SQL-on-Hadoop, Hybrid Warehouse (integration between Hadoop and Enterprise Data Warehouses), and HTAP (Hybrid Transactional and Analytical Processing) for Big Data. She collaborated closely with the IBM software group. Her work on SQL-on-Hadoop was tied to the IBM Db2 Big SQL product, and the Wildfire HTAP system she co-developed was released as the IBM Db2 Event Store product.

Selected Papers: HTAP for Big Data (VLDB’20, BigData’19, SIGMOD’19, EDBT’19, CIDR’17, SIGMOD’16 Demo), Hybrid Warehouses (TODS’16, EDBT’15), CoHadoop (PVLDB’11), Hadoop Joins (SIGMOD’10)

System Support for Machine Learning: Yuanyuan is a co-inventor and lead developer of SystemML, a large-scale machine learning system that is now a top-level Apache open source project. Recently, she has worked on integrating big SQL and big ML systems, and on designing novel distributed time-biased sampling algorithms for online ML model management.

Selected Papers: Time-biased Sampling for Online Model Management (Information Processing Letters’23, TODS’19, SIGMOD Record’19, EDBT’18), Integration of SQL and ML (EDBT’15), SystemML on YARN (SIGMOD’15), SystemML Optimizer (IEEE DE Bulletin’14), ParFor in SystemML (PVLDB’14), Numerical Stability in SystemML (ICDE’12), SystemML Architecture (ICDE’11).

Selected Awards

Invited Talks

Professional Service

Editor: Associate Editor for VLDB 2026, Associate Editor for VLDB 2025, Associate Editor for SIGMOD 2024, Associate Editor for Frontiers in Big Data (since 2021), Associate Editor for VLDB Journal (since 2019), Associate Editor for PVLDB Vol. 11 (VLDB 2018), Section Editor for Encyclopedia on Big Data Technologies.

Chair: SIGMOD 2027 PC Chair, VLDB 2025 DEI Chair, EDBT 2025 Industrial & Application Chair, SoCC 2023 PC Chair, VLDB 2023 Industry Chair, EDBT 2023 Demo Chair, CIDR 2023 Diversity & Inclusion Chair, CIDR 2022 Diversity & Inclusion Chair, VLDB 2021 Demo Chair, IEEE BigData 2019 Industry and Government Chair, VLDB 2019 Workshop Chair, ICDE 2017 Demo Chair, CIKM 2013 Poster Chair

Workshop Chair: 3rd Workshop on Large Scale Network Analysis (LSNA 2014), 5th Workshop on Graph Data Management (GDM 2014), 2nd Workshop on Large Scale Network Analysis (LSNA 2013), 4th Workshop on Graph Data Management (GDM 2013), 1st Workshop on Large Scale Network Analysis (LSNA 2012)

Panelist:

PC Member: CIDR 2023, SIGMOD 2023 Industry Track, SIGMOD 2022 Industry Track, CIDR 2022, CIDR 2021, VLDB 2020 Demo Track, SIGMOD 2020, VLDB 2019, SIGMOD 2018, VLDB 2017, VLDB 2016 Industrial Track, TKDE 2016 Poster Track, VLDB 2015, ICDE 2014, WISE 2013, SIGMOD 2012, GDM 2012, VLDB 2011 Industrial Track, DBSocial 2011, GDM 2011, ICDE 2011, GDM 2010, VLDB 2009.

Reviewer for Journals: VLDB Journal (2014, 2017), TODS (2013, 2015), Statistical Analysis and Data Mining (2009), Information Systems (2010, 2011, 2013), ACM Transactions on Intelligent Systems and Technology (2010), Distributed and Parallel Databases (2012).

Reviewer for Books: Data Processing Techniques in The Era of Big Data.

Reviewer for Research Grants: Research Grants Council (RGC) of Hong Kong (2010, 2011).

Reviewer for Awards: The NCWIT Award for Aspirations in Computing.