We hope to gather the growing community of researchers and practitioners in HCI and allied fields who are building new insights, methods, and collaborative practices around data science. Topics of interest include (but are not limited to):
We encourage you to contact us if you have any questions about the workshop or themes!
With the rise of big data, there has been an increasing need to understand who is working in data science, and how they are doing their work. HCI and CSCW researchers have begun to examine these questions. In this workshop, we invite researchers to share their observations, experiences, hypotheses, and insights, in the hopes of developing a taxonomy of work practices and open issues in the behavioral and social study of data science and data science workers.
The outcomes of data science work are increasingly influential in much of the world. How people work in data science is equally important if we are to understand, support, and critique (as appropriate) these important societal forces. We invite our colleagues to work with us toward a deeper understanding of how humans perform the work of data science.
Extraordinary claims are made about the promises and current successes of data science. While some of these claims are stated for the future, Agarwal and Dhar editorialize that “This is powerful… we are in principle already there”. At a different extreme, scholars have criticized the “mythology” of such claims.
Further complicating these discussions, there is considerable diversity in the tools and methods, challenges, and job roles involved in data science. Any detailed study will necessarily be partial and contextualized, adding depth of description and understanding but potentially lacking a broader view.
Several studies in HCI and CSCW have begun to look at facets of these diverse topics. Passi and Jackson described an ongoing tension over the use of algorithmic rules. They proposed that data science students can learn to practice a kind of data vision that treats rules more as guidance (“rules-based”) than as formal constraint (“rules-bound”).
Dealing with data, or “data wrangling,” has been estimated to take up 80-90% of the effort in a typical data science project. Understanding how people approach their data is therefore important.
Bilis contrasted two views of the analyst’s relationship with data. In one view, the analyst takes a relatively passive stance and receives data as “given” by the environment (“donné”). In a second view, the analyst takes a more active role in capturing data (“capta”). Pine and Liboiron made a similar point, claiming that in some cases “human-computer interactions start before the data reaches the computer because various measurement interfaces are the invisible premise of data and databases” (emphasis in their original text).
Mentis et al. described curatorial practices with data among transplant surgeons as they engaged in the necessary work of “crafting the image” for one another, and Taylor et al. offered similar observations of how a local community “enacted a multiplicity of ‘small worlds’” in their data. Feinberg describes the “design” of data, and Patel et al. similarly describe the creation of features for analysis. Muller et al. documented the sometimes necessary processes of creating data, including the creation of ground truth data.
Recording the outcomes and managing the diverse experimental analytic histories of data science work are also challenging. Despite the promise of literate programming [KNUTH84], people engaged in data science tend to scant their documentation, apparently because of a tension between dynamic exploration and time-consuming explanation.
Your submission abstract should be a single PDF file of 2 to 4 pages in total and include the following information:
Because you will submit your abstract via email, please include a brief paragraph in the email about the following. This will help us organize the workshop around interdisciplinary interests a bit better!
We encourage authors to use the ACM SIGCHI Extended Abstract Format for their submissions.
Paper submission deadline: February 12, 2019 (11:59 pm EST)
Notification of acceptance: March 1, 2019
Workshop at CHI in Glasgow: May 4 or 5, 2019 (TBD)
Applications are open! Please email your submissions directly to email@example.com
Michael Muller works as a researcher at IBM Research AI, where he studies data science work and collaborates with data science workers to design future tools for data science.
Bonnie John is a Senior Interaction Designer at Bloomberg, where she uses user-centered methods to design and evaluate tools for financial data scientists and collaborates with Project Jupyter.
Melanie Feinberg is an associate professor at the School of Information and Library Science (SILS) at the University of North Carolina at Chapel Hill. She studies the practices by which data is made, and the characteristics of data as both design artifact and design material.
Mary Beth Kery is a PhD student at the Human-Computer Interaction Institute at Carnegie Mellon University. Her research focuses on studying programmer behavior and designing new kinds of programming tools to support exploratory data science work.
Timothy George works as a UI/UX designer for Project Jupyter, where he designs next-generation data science tools. He also works to develop open standards, protocols, and practices for practicing data scientists.
Samir Passi is a PhD candidate in the Department of Information Science at Cornell University. His research focuses on the forms of human work in data science learning, research, and practice. He studies such forms of work ethnographically in the context of academic as well as corporate data science.
Steven Jackson is an Associate Professor and Chair of Information Science at Cornell University. His work addresses questions of ethics, policy and practice in emerging computing fields.