
Welcome to Yuanyuan Tian’s Home Page!

Yuanyuan Tian is currently a Principal Scientist at Microsoft Gray Systems Lab and an ACM Distinguished Member. Before Microsoft, she was a Principal Research Staff Member at IBM Almaden Research Center. She received her PhD (2008) and MS (2005) degrees in Computer Science & Engineering, both from the University of Michigan, and her BS degree in Computer Science & Technology with honors (2003) from Peking University.

[Download CV]

Contact: [fullname without space]@microsoft[dot]com

Research Areas

Yuanyuan is currently co-leading the ML-for-Systems research efforts in the Gray Systems Lab (GSL). Her other research interests include Systems for ML, HTAP, big data, and graph databases.

Past Research

SQL for Big Data: Yuanyuan’s work in this area includes SQL-on-Hadoop, Hybrid Warehouse (integration between Hadoop and Enterprise Data Warehouses), and HTAP (Hybrid Transactional and Analytical Processing) for Big Data. She collaborated closely with the IBM software group. Her work on SQL-on-Hadoop was tied to the IBM Db2 Big SQL product, and the Wildfire HTAP system she co-developed was released as the IBM Db2 Event Store product.

Selected Papers: HTAP for Big Data (VLDB’20, BigData’19, SIGMOD’19, EDBT’19, CIDR’17, SIGMOD’16 Demo), Hybrid Warehouses (TODS’16, EDBT’15), CoHadoop (PVLDB’11), Hadoop Joins (SIGMOD’10)

Graph Analytics: Yuanyuan has a long-standing interest in graph analytics. She has written two books on large-scale graph processing. Her current research includes building distributed graph-processing systems, designing distributed graph algorithms, and social network analysis. Her PhD thesis was on querying graph databases. She was the founder and technical lead of a project on supporting graph analytics inside relational databases, which was released as IBM Db2 Graph.

Graph Processing/Databases Papers: IBM Db2 Graph (SIGMOD’20, VLDB’19), Dynamic Graph Analysis (ICDE’15), Giraph++ (PVLDB’13), Graph Summarization (CIKM’14, ICDE’10, SIGMOD’08), Graph Matching (ICDE’08, Bioinformatics’07)

Social Network Analysis Papers: Topic-Specific Influence Analysis (WSDM’14), Event-Based Social Network (SIGKDD’12).

System Support for Machine Learning: Yuanyuan is a co-inventor and a lead developer of SystemML, a large-scale machine learning system that is now a top-level Apache open source project. Recently, she has worked on integrating big SQL and big ML systems, and on designing novel distributed time-biased sampling algorithms for online ML model management.

Selected Papers: Time-biased Sampling for Online Model Management (TODS’19, SIGMOD Record’19, EDBT’18), Integration of SQL and ML (EDBT’15), SystemML on YARN (SIGMOD’15), SystemML Optimizer (IEEE DE Bulletin’14), ParFor in SystemML (PVLDB’14), Numerical Stability in SystemML (ICDE’12), SystemML Architecture (ICDE’11).

Selected Awards

Invited Talks & Panels

Professional Service

Editor: Associate Editor for SIGMOD 2024, Associate Editor for Frontiers in Big Data (since 2021), Associate Editor for VLDB Journal (since 2019), Associate Editor for PVLDB Vol. 11 (VLDB 2018), Section Editor for Encyclopedia on Big Data Technologies.

Chair: EDBT 2025 Industrial & Application Chair, SoCC 2023 PC Chair, VLDB 2023 Industry Chair, EDBT 2023 Demo Chair, CIDR 2023 Diversity & Inclusion Chair, CIDR 2022 Diversity & Inclusion Chair, VLDB 2021 Demo Chair, IEEE BigData 2019 Industry and Government Chair, VLDB 2019 Workshop Chair, ICDE 2017 Demo Chair, CIKM 2013 Poster Chair

Workshop Chair: 3rd Workshop on Large Scale Network Analysis (LSNA 2014), 5th Workshop on Graph Data Management (GDM 2014), 2nd Workshop on Large Scale Network Analysis (LSNA 2013), 4th Workshop on Graph Data Management (GDM 2013), 1st Workshop on Large Scale Network Analysis (LSNA 2012)

Panelist:

PC Member: CIDR 2023, SIGMOD 2023 Industry Track, SIGMOD 2022 Industry Track, CIDR 2022, CIDR 2021, VLDB 2020 Demo Track, SIGMOD 2020, VLDB 2019, SIGMOD 2018, VLDB 2017, VLDB 2016 Industrial Track, TKDE 2016 Poster Track, VLDB 2015, ICDE 2014, WISE 2013, SIGMOD 2012, GDM 2012, VLDB 2011 Industrial Track, DBSocial 2011, GDM 2011, ICDE 2011, GDM 2010, VLDB 2009.

Reviewer for Journals: VLDB Journal (2014, 2017), TODS (2013, 2015), Statistical Analysis and Data Mining (2009), Information Systems (2010, 2011, 2013), ACM Transactions on Intelligent Systems and Technology (2010), Distributed and Parallel Databases (2012).

Reviewer for Books: Data Processing Techniques in The Era of Big Data.

Reviewer for Research Grants: Research Grants Council (RGC) of Hong Kong (2010, 2011).

Reviewer for Awards: The NCWIT Award for Aspirations in Computing.