Introduction
This exercise asks you to consider creating a “datasheet” for one dataset used in your project, based on the Datasheets pre-reading.
This framework includes 52 questions, which is a lot! You do not need to answer the questions in this exercise; instead, decide which questions you would answer if you were to create a datasheet for your dataset. Think of it as picking the subset of the 52 questions that is most relevant to your project (you might pick 5, or 20, or 45, or all 52!).
Your task
This activity has 2 parts:
Part 1
- [ ] Individually, pick your questions. Go to the worksheet below and take a few minutes to complete the "Relevant to my dataset?" column, as follows:
    - For each question, click in the empty cell and select the option that best describes it:
        - Does your dataset documentation already address this question? Select "Already documented".
        - Could you answer this question, and do you think the answer should be documented? Select "Yes".
        - Is this question irrelevant to your project? Select "No".
        - Not sure whether you would be able to answer this question? Select "Maybe".
<aside>
💡 52 questions is a lot, so in this session we're going to start with just 7 questions (the first question from each section).
</aside>
Datasheets Framework Worksheet
<aside>
💡 If you want to see all 52 questions, simply remove the filter! To do so, click the Filter button above the question grid, then click "..." and choose Remove.
Need an easy way to share this framework with your team? For a full spreadsheet version of all 52 questions in the Datasheets framework, click here.
</aside>
Part 2
- [ ] Small group discussion. After everyone in your group has picked their subset of questions, discuss the following questions together for about 20 minutes. We encourage you to share your screen during the discussion so that you can see how your peers selected questions!
- What was it like to select these questions? How many fell into each category for you?
- What norms (if any) exist in your team/company/industry for documenting datasets and sharing that documentation with all stakeholders? How are the datasets and contexts in your project documented?
- What power dynamics are reflected in the collection, use, and distribution of data in your current project?
- How have impacted individuals/communities been consulted in the design of the data collection, metrics, and/or overall project? How might you engage them more deeply?
My Notes
<aside>
💡 Use this space to type your notes and takeaways from this exercise!
</aside>