Data Science in Action.
How are data science workflows implemented in real life?
Past literature on understanding data science has primarily focused on data scientists and less on data science workflows themselves. My research focuses on empirically understanding real-world data science workflows implemented in notebooks. In particular, I aim to develop methods to support large-scale analysis and understanding of data science workflows.
Related Publications
[1] Ramasamy et al., Workflow analysis of data science code in public GitHub repositories, Empirical Software Engineering Journal 28, Article number: 7 (2023). Read more in the blog.
[6] Collaboration with JetBrains, “Observing Fine-Grained Changes in Jupyter Notebooks During Development Time” (2025).