Final Principles of Computing Project
Project ID: 23
Course: CSE 10001 - Principles of Computing
Instructor: Professor Shreya Kumar
Authors
Sara Lessie
Business Analytics and Economics
Class of 2025
Tucson, Arizona
Sophia Alexa Ochoa
Visual Communications Design and Digital Marketing
Class of 2025
Manila, Philippines
Data Topic and Questions
Many Americans are aware that education is determined at the state rather than the federal level. Because of this, the quality and content of a child's education can vary drastically. Our topic
investigates the potential danger of educational redlining and whether the external environment of a child's neighborhood carries over into their performance in the classroom. To do so, we look at the
greater New York City area. Delving into the five boroughs of NYC, we aim to find out whether they have done their duty in providing equal care to students, or whether there is some lag within the city.
- Are there any correlations between the average household income of a district and the high school’s average SAT score?
- Which district has the highest SAT score average? Which district has the lowest?
- Does the size of student enrollment have any correlation with its respective high school's SAT scores?
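For readers unfamiliar with how a correlation is measured, here is a minimal sketch of the Pearson correlation coefficient, one common way to quantify questions like the first and third. It uses only the Python standard library. The function name `pearson_r` and the numbers below are made up for illustration; they are not values from our data, and this is not necessarily the exact calculation used in the project.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (unnormalized; the 1/n factors cancel out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

# Made-up illustrative numbers, NOT values from the project's data.
household_income = [30_000, 45_000, 60_000, 75_000, 90_000]
avg_sat_score = [1150, 1220, 1300, 1340, 1450]

r = pearson_r(household_income, avg_sat_score)
print(f"r = {r:.3f}")
```

A value of r near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates little linear relationship. A strong correlation on its own does not show that income causes higher or lower scores.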
About the Data
We used two main sources of data: SAT data and borough demographic data. To clean the SAT scores data, we reformatted the demographic values by removing symbols such as ‘%’ and spaces, then converted them to the proper data type so that they were ready for analysis.
We also dropped nulls so that they would not interfere with our analysis. As for the borough data, each table needed to be sorted for key variables (Name, Indicator, Borough, and Year) and joined with the tables for the other boroughs. Finally, the borough and scores data were
joined so that SAT scores could be related to borough demographic data. For analysis, we used techniques such as grouping by borough and pivoting the time series data. One caveat is that although the test scores come from 2014, our demographic data does not
include 2014 itself. However, its time range spans 2014, with observations on both sides of that year, which provides context regarding the demographic compositions. Additionally, some boroughs are bigger than others, leading to uneven samples of schools. The data for each
borough is still representative of that borough.
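The cleaning, joining, and grouping steps described above can be sketched roughly as follows. This is a simplified illustration using only the Python standard library, not our actual project code. The column names (`pct_tested`, `avg_sat`, `household_income`), the helper `parse_percent`, and every value in the sample rows are hypothetical.

```python
from collections import defaultdict

def parse_percent(raw):
    """Turn a string such as ' 61.5 %' into the float 61.5; return None if missing or invalid."""
    if raw is None:
        return None
    text = raw.replace("%", "").strip()
    if not text:
        return None
    try:
        return float(text)
    except ValueError:
        return None

# Hypothetical school rows; names and values are invented for illustration.
schools = [
    {"school": "A", "borough": "Bronx", "pct_tested": "55.0%", "avg_sat": 1200},
    {"school": "B", "borough": "Bronx", "pct_tested": " 61.5 %", "avg_sat": 1260},
    {"school": "C", "borough": "Queens", "pct_tested": "70%", "avg_sat": 1400},
    {"school": "D", "borough": "Queens", "pct_tested": None, "avg_sat": None},
]

# Hypothetical borough-level indicator, keyed by borough name.
borough_income = {"Bronx": 40_000, "Queens": 70_000}

# 1. Clean: convert percent strings to floats and drop rows with nulls.
cleaned = []
for row in schools:
    pct = parse_percent(row["pct_tested"])
    if pct is None or row["avg_sat"] is None:
        continue
    cleaned.append({**row, "pct_tested": pct})

# 2. Join: attach the borough-level indicator to each school row.
joined = [
    {**row, "household_income": borough_income[row["borough"]]}
    for row in cleaned
    if row["borough"] in borough_income
]

# 3. Group by borough and average the SAT scores.
scores_by_borough = defaultdict(list)
for row in joined:
    scores_by_borough[row["borough"]].append(row["avg_sat"])
borough_avg_sat = {b: sum(v) / len(v) for b, v in scores_by_borough.items()}

print(borough_avg_sat)
```

Dropping school D in step 1 is what "dropping nulls" means here: a row with a missing score or percentage is excluded before any averages are computed.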
The SAT scores data comes from the College Board and refers to the 2014 exam scores. The data contains variables such as School ID, name, state, longitude and latitude, race demographics, and average score broken down by section (reading, math, and writing).
The data regarding the boroughs comes from the NYU Furman Center. There is a separate table for each of the five NYC boroughs: Queens, the Bronx, Brooklyn, Staten Island, and Manhattan. Each table contains a large number of indicators, broken down by
subject (such as Demographics or Housing Market Conditions). Each statistic is given for 2000, 2006, 2010, 2019, and 2021.
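To make the shape of the borough data more concrete, the sketch below pivots long-format rows (one row per borough, indicator, and year) into a table keyed by borough with one value per year, and then finds which available years sit on either side of 2014. It uses only the Python standard library. The function names `pivot_by_year` and `bracketing_years`, the indicator names, and the values are hypothetical; only the list of years comes from the description above.

```python
# Hypothetical long-format rows: (borough, indicator, year, value).
# The indicator names and values are invented for illustration only.
rows = [
    ("Bronx", "Example indicator", 2010, 10.0),
    ("Bronx", "Example indicator", 2019, 14.0),
    ("Queens", "Example indicator", 2010, 20.0),
    ("Queens", "Example indicator", 2019, 26.0),
    ("Queens", "Other indicator", 2019, 99.0),
]

def pivot_by_year(rows, indicator):
    """Reshape rows for one indicator into {borough: {year: value}}."""
    table = {}
    for borough, name, year, value in rows:
        if name == indicator:
            table.setdefault(borough, {})[year] = value
    return table

def bracketing_years(years, target):
    """Return (latest year <= target, earliest year >= target); None where no such year exists."""
    before = max((y for y in years if y <= target), default=None)
    after = min((y for y in years if y >= target), default=None)
    return before, after

wide = pivot_by_year(rows, "Example indicator")

available_years = [2000, 2006, 2010, 2019, 2021]  # years listed in the text
nearest = bracketing_years(available_years, 2014)

print(wide)
print(nearest)
```

With the years listed above, the observations closest to the 2014 test scores are 2010 and 2019, which is why the demographic figures serve as context rather than as an exact match for the exam year.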
Significance of this Project
We chose this project because American education is extremely far-reaching. Taxpayer dollars play a significant role in a school's overall quality, and no student should suffer because of where they grew up. As we have taken courses at ND on American education
from both a sociology and an economics viewpoint, we felt compelled to merge our background knowledge in researching and creating this website. Our collection of data visualizations can be viewed by anyone. However, we especially want to help American audiences, particularly
New York residents, understand how the economics of a city can impact a student’s education. We intend to encourage financial equity in all districts in New York City. While government reform is a large step in ensuring a better future for students, awareness is the
key to creating discussions about our country's welfare.
Code Walkthrough Video