Welcome to Yuanyuan Tian’s Home Page!
Yuanyuan Tian is currently a Principal Scientist at Microsoft Gray Systems Lab and an ACM Distinguished Member. Before Microsoft, she was a Principal Research Staff Member at IBM Almaden Research Center. She received her PhD (2008) and MS (2005) degrees in Computer Science & Engineering, both from the University of Michigan, and her BS degree in Computer Science & Technology with honors in 2003 from Peking University.
Contact: [fullname without space]@microsoft[dot]com
Research Areas
Yuanyuan is currently co-leading the ML-for-Systems and graph analytics research efforts in GSL. Her other research interests include Systems for ML, HTAP, big data, and query optimization.
Past Research
SQL for Big Data: Yuanyuan’s work in this area includes SQL-on-Hadoop, Hybrid Warehouse (integration between Hadoop and Enterprise Data Warehouses), and HTAP (Hybrid Transactional and Analytical Processing) for Big Data. She collaborated closely with the IBM software group. Her work on SQL-on-Hadoop was tied to the IBM Db2 Big SQL product, and the Wildfire HTAP system she co-developed has been released as the IBM Db2 Event Store product.
Selected Papers: HTAP for Big Data (VLDB’20, BigData’19, SIGMOD’19, EDBT’19, CIDR’17, SIGMOD’16 Demo), Hybrid Warehouses (TODS’16, EDBT’15), CoHadoop (PVLDB’11), Hadoop Joins (SIGMOD’10)
Graph Analytics: Yuanyuan has a long-standing interest in graph analytics. She has written two books on large-scale graph processing. Her current research includes building distributed graph-processing systems, designing distributed graph algorithms, and social network analysis. Her PhD thesis was on querying graph databases. She was the founder and technical lead of a project on supporting graph analytics inside relational databases, which has been released as IBM Db2 Graph.
Graph Processing/Databases Papers: IBM Db2 Graph (SIGMOD’20, VLDB’19), Dynamic Graph Analysis (ICDE’15), Giraph++ (PVLDB’13), Graph Summarization (CIKM’14, ICDE’10, SIGMOD’08), Graph Matching (ICDE’08, Bioinformatics’07)
Social Network Analysis Papers: Topic-Specific Influence Analysis (WSDM’14), Event-Based Social Network (SIGKDD’12).
System Support for Machine Learning: Yuanyuan is the co-inventor and a lead developer of SystemML, a large-scale machine learning system that is now a top-level Apache open source project. Recently, she has worked on integrating big SQL and big ML systems, and on designing novel distributed time-biased sampling algorithms for online ML model management.
Selected Papers: Time-biased Sampling for Online Model Management (TODS’19, SIGMOD Record’19, EDBT’18), Integration of SQL and ML (EDBT’15), SystemML on YARN (SIGMOD’15), SystemML Optimizer (IEEE DE Bulletin’14), ParFor in SystemML (PVLDB’14), Numerical Stability in SystemML (ICDE’12), SystemML Architecture (ICDE’11).
Selected Awards
- 2023 DaMoN 2023 Best Short Paper Award, “Microarchitectural Analysis of Graph BI Queries on RDBMS”, DaMoN 2023
- 2020 ACM Distinguished Member
- 2020 Outstanding Technical Achievement Award for contribution to IBM Db2 Event Store, IBM
- 2019 IBM A-Level Accomplishment for contribution to IBM Db2 Event Store, IBM Research
- 2019 Invention Achievement Award, IBM
- 2019 VLDB 2019 Distinguished Reviewer Award, VLDB 2019
- 2019 SIGMOD 2019 Research Highlight Award, “Online Model Management via Temporally Biased Sampling”, SIGMOD 2019
- 2019 Research Division Award for the work in declarative machine/deep learning, IBM
- 2019 Outstanding Technical Achievement Award for the work in large-scale graph analytics and infrastructure, IBM
- 2018 EDBT Best Paper Award, “Temporally-Biased Sampling for Online Model Management”, EDBT 2018
- 2018 IBM A-Level Accomplishment for the work in large scale graph analytics and infrastructure, IBM Research
- 2018 IBM A-Level Accomplishment for the work in declarative machine/deep learning (SystemML), IBM Research
- 2016 Outstanding Technical Achievement Award for the work in join algorithms for big data, IBM
- 2016 Eminence & Excellence Award, IBM Research
- 2015 IBM A-Level Accomplishment for the work in join algorithms for big data, IBM Research
- 2015 IBM A-Level Accomplishment for the contributions to the SystemML project, IBM Research
- 2013 High Value Patent Application Award, IBM Research
- 2012 Eminence & Excellence Award, IBM Research
- 2011 Eminence & Excellence Award, IBM Research
- 2008 Distinguished Achievement Award, University of Michigan
- 2007 2nd Place, CSE Honor Competition, University of Michigan
- 2007 Rackham Predoctoral Fellowship, University of Michigan
- 2003 Rackham Graduate Fellowship, University of Michigan
Invited Talks
- Graph Databases and AI, interview by Kyle Polich on Data Skeptic Podcast, October 21, 2025. Listen here or wherever you listen to podcasts.
- Towards Autonomous Data Services on Azure, Microsoft Sponsor Talk, VLDB 2024, Aug 2024.
- The World of Graph Databases from An Industry Perspective, [YouTube Video], 16th LDBC TUC meeting, Jun 2023
- Towards Autonomous Data Services on Azure: A GSL Journey, Sky Seminar, UC Berkeley, Sep 2022
- Leading Women in Tech Q&A, Ontra, Jan 2022.
- Db2 Graph Query Drill Down, Db2 Technical Advisory Board Meeting, Apr 2020
- Big Data Analytics: From SQL to Machine Learning and Graph Analysis (Keynote), [Slides] [YouTube Video], BigDas Workshop, SIGKDD’2017, Aug 2017.
- Hybrid Transactional/Analytical Processing (Tutorial), [YouTube Video], SIGMOD’2017, May 2017.
- Big Graph Analytics Platforms (Tutorial), [Slides], SIGMOD’2016, June 2016.
- Giraph++: From “Think Like a Vertex” to “Think Like a Graph”, Facebook, Nov 2013.
- Large Scale Topic-specific Influence Analysis on Microblogs, UC Santa Barbara, May 2013.
- Large Scale Topic-specific Influence Analysis on Microblogs, UC Santa Cruz, May 2013.
- SystemML: Large Scale Machine Learning on MapReduce, Peking University, Beijing, China, Aug 2012.
- SystemML: Large Scale Machine Learning on MapReduce, IBM China Research Lab, Beijing, China, Aug 2012.
- SystemML: Large Scale Machine Learning on MapReduce, University of Maryland, College Park, Maryland, Apr 2012.
Professional Service
Editor: Associate Editor for VLDB 2025, Associate Editor for SIGMOD 2024, Associate Editor for Frontiers in Big Data (since 2021), Associate Editor for VLDB Journal (since 2019), Associate Editor for PVLDB Vol. 11 (VLDB 2018), Section Editor for Encyclopedia on Big Data Technologies.
Chair: EDBT 2025 Industrial & Application Chair, SoCC 2023 PC Chair, VLDB 2023 Industry Chair, EDBT 2023 Demo Chair, CIDR 2023 Diversity & Inclusion Chair, CIDR 2022 Diversity & Inclusion Chair, VLDB 2021 Demo Chair, IEEE BigData 2019 Industry and Government Chair, VLDB 2019 Workshop Chair, ICDE 2017 Demo Chair, CIKM 2013 Poster Chair
Workshop Chair: 3rd Workshop on Large Scale Network Analysis (LSNA 2014), 5th Workshop on Graph Data Management (GDM 2014), 2nd Workshop on Large Scale Network Analysis (LSNA 2013), 4th Workshop on Graph Data Management (GDM 2013), 1st Workshop on Large Scale Network Analysis (LSNA 2012)
Panelist:
- PhD Mentoring Panel (Moderator), VLDB 2024, August 2024.
- The Future of Graph Analytics, SIGMOD 2024, June 2024.
- AI for Systems, SIGMOD 2024, June 2024.
- FinBench Panel, 16th LDBC TUC meeting, Jun 2023.
- Women in DB: Discussion and Socialization, Organizer, CIDR 2022.
- Women in DB round table, VLDB 2021, Aug 2021.
- ICDE PhD Symposium Panel, ICDE 2021, Apr 2021.
- Round Table on Graph Databases, VLDB 2020.
- Deep Dive: In-Database Graph analytics with Db2, IBM DB2 Nebula (11.5.4) Webinar Series, Jun 2020.
- “Women in DB: Experiences and Perspectives” event, Organizer, SIGMOD 2020.
- NSF Advisory Panel, 2013 & 2016.
- NSF Career Mentoring Panel, ICDE 2012. My 2 cents on How to Be Competitive for Industrial Research Jobs, presented at this career panel.
PC Member: CIDR 2023, SIGMOD 2023 Industry Track, SIGMOD 2022 Industry Track, CIDR 2022, CIDR 2021, VLDB 2020 Demo Track, SIGMOD 2020, VLDB 2019, SIGMOD 2018, VLDB 2017, VLDB 2016 Industrial Track, TKDE 2016 Poster Track, VLDB 2015, ICDE 2014, WISE 2013, SIGMOD 2012, GDM 2012, VLDB 2011 Industrial Track, DBSocial 2011, GDM 2011, ICDE 2011, GDM 2010, VLDB 2009.
Reviewer for Journals: VLDB Journal (2014, 2017), TODS (2013, 2015), Statistical Analysis and Data Mining (2009), Information Systems (2010, 2011, 2013), ACM Transactions on Intelligent Systems and Technology (2010), Distributed and Parallel Databases (2012).
Reviewer for Books: Data Processing Techniques in The Era of Big Data.
Reviewer for Research Grants: Research Grants Council (RGC) of Hong Kong (2010, 2011).
Reviewer for Awards: The NCWIT Award for Aspirations in Computing.