Final Report Prompts
First Draft
Please note that the first draft of the report should be submitted in the MDPI Rticles format with all citation metadata included in a separate .bib file and referenced in-line.
Introduction
Your introduction should be 300-500 words and should introduce your project to an audience unfamiliar with our capstone course. This section will be some combination of your problem definition, sponsor description, the scope of your solution, and an outline of the remaining sections of the report.
As you prepare your introduction, you should think about how you can synthesize some of what you wrote in these earlier sections into a piece of writing that will frame your project to an unfamiliar audience.
Note that this will require more than just copying and pasting previously written sections of the report. You should think carefully about how to integrate this writing in a meaningful way to introduce your project.
Data Description
In 300-500 words, introduce the data resources that you will be working with to a lay audience. Some teams might not yet know all of these data sources, so you need only detail the ones that you know you will be working with.
I have found in the past that detailing datasets to an audience unfamiliar with that data is really challenging. We have a tendency to want to describe the data’s technical details rather than summarizing more generally what the data represents. I’ve included an example below to help you think through how to approach this. At the very least this section should include:
- What the data represents
- Who produced it
- Its observational unit
- Some variables that describe that observational unit (but be careful about getting too technical here)
- A bit of its history/context
- A few data limitations
To address this problem, we analyzed a dataset put out by the US EPA [<– who produced it] that documents the environmental compliance and enforcement history of every EPA-regulated facility in the US, including prisons [<– what it represents]. For every EPA-regulated facility in the US in a given year [<– observational unit], the dataset reports information such as the permits the facility has been awarded, enforcement actions taken against the facility, and penalties assessed against it [<– variables]. The EPA has been integrating this data from a number of different compliance databases for major federal regulations (such as the Clean Air Act and the Safe Drinking Water Act) for over ten years [<– history/context]. The data tends to be more comprehensive for larger facilities than for smaller facilities and does not include information about compliance with all environmental laws [<– limitations].
Detailed Methodology
This section should be about 800-1000 words. This is going to look quite different for each of the projects. This section might include things like:
- How did you acquire the data?
- How did you clean and format the data?
- How did you integrate the data?
- How did you determine which metrics to use or how to calculate certain metrics?
- How did you analyze the data?
- What assumptions did you have to make when drawing conclusions from the data?
- How did you automate certain processes?
- What was the purpose of certain key functions you wrote?
- Why did you select certain plots to visualize the data?
As you are drafting this section, I encourage you to think about the key information a non-technical audience would need in order to understand how you plugged along towards the development of your deliverables. Remember that this is a draft, and aspects may change or be extended as you continue the project.
Findings/Outputs
This section should be about 600-800 words. This is going to look quite different for each of the projects, depending on the scope of your deliverables. This section might include things like:
- What were the primary findings of your statistical analyses (in relation to original research questions)? (e.g. “We found that…[give us key numbers, stats, and summaries of figures].”)
- What were some of the secondary findings of your analyses (perhaps beyond the scope of the original research questions)?
- How might we contextually interpret the project findings? (e.g. “This indicates that…”)
- What products did you produce (e.g. R packages, shiny dashboards), and what were some of their key features?
- In what ways did your deliverables match the expectations of the project sponsor?
Conclusion
Your conclusion should be about 300-500 words and do four things:
- Briefly summarize what you completed in your project
- Discuss how your project addressed (or didn’t address) the problems outlined in your problem statement
- Outline some of the limitations of your approach
- Present some suggestions for further work (i.e. How could this project be extended with more time and resources?)
Final Draft
Be sure that the final draft includes a brief abstract summarizing the contents of the report, along with keywords.
Ethics Statement
In 400-500 words, identify ethical issues that emerged in the course of your project, along with how your team grappled with/addressed them. Issues may relate to:
- Data availability
- Reductionist data categorization or semantics
- Privacy and/or confidentiality concerns
- Potentially incorrect assumptions your team needed to make (or use of proxies)
- Possibility of algorithmic biases
- Reliance on certain data infrastructures
- Your team’s cultural/social identities (and what identities are not at the table)
- Your team’s knowledge of/relationship with the topic or organization
- Challenges in public data communication
I imagine that most teams will focus on 1-3 of these topics - not all of them.
For each topic you cover in this section, you should be sure to discuss four things: 1) What was the issue? 2) Why did the issue come up? 3) What social harm might emerge from the issue? 4) How did your team respond to the issue? However, I encourage you not to think of this as a checklist. The text of this section should be well-integrated into the flow of your report.