Collaborative Startup Helping People in Disadvantaged Communities Learn Entry-level Data Science Skills
Across businesses and organizations of all sizes, there are rapidly growing opportunities for data science workers.
Many people, however, particularly those from economically disadvantaged communities, are often excluded from the training opportunities necessary to be competitive for these jobs.
To address this issue and increase the diversity of the data science field, Georgia Tech has launched DataWorks.
The new hybrid program – which recently earned a $1.5 million National Science Foundation (NSF) grant – works closely with community partners to hire and train people from under-resourced communities in Atlanta to do data science work.
“There are so many tasks peripheral to computer science that, while not requiring a degree to perform, are critically important to the CS community,” said DataWorks founder and School of Interactive Computing (IC) Associate Professor Betsy DiSalvo.
Part startup company, part outreach effort, and part research platform, DataWorks provides its employees with on-the-job training to learn entry-level data wrangling skills.
To learn data skills like cleaning, linking, and reformatting, employees use real-world “messy” data – provided mostly by Atlanta non-profit organizations. Once cleaned, the data are returned to the organization that provided them to help it fulfill its mission and business objectives.
“Rather than just teaching, we think it’s important to have people situated in a real work environment. In doing so, they feel like what they are doing is more valued. It also allows them to see themselves as being a part of this industry,” said DiSalvo.
For one of its first pro bono projects, DataWorks employees worked with Enterprise Community Partners, Inc. on an affordable housing database project.
“The DataWorks team completed the detail-oriented work of pulling data from public reports, aligning it with other public data, and producing one complete dataset for our affordable housing database and website project,” said Sara Haas, southeast market director for Enterprise Community Partners.
“With DataWorks’ help, we’re helping to level the playing field by providing residents, community advocates, public partners, and nonprofit developers access to the same type of data that others have,” said Haas.
DataWorks started in January with four part-time workers, who were recruited through the west Atlanta organization Raising Expectations, a non-profit mentoring and tutoring program. DiSalvo had hoped to have up to 10 employees by this summer, but then the pandemic hit. At the time, she wasn’t sure the program would survive.
But working remotely, DataWorks continued building its reputation and training its employees through pro bono work. It has also recently picked up a few paying clients from the private sector, as well as a new contract with Atlanta’s Center for Civic Innovation.
Along with new clients, the program also has new funding. PricewaterhouseCoopers (PwC) recently donated $25,000 to the program, and the collaboration also includes a donation of 300 volunteer hours from PwC.
DiSalvo and her co-principal investigators – School of IC Associate Professor Carl DiSalvo and Georgia State University Assistant Professor Ben Shapiro – earned the $1.5 million NSF grant for the program in August.
“We’re actually down to three employees now, but one left to take a full-time job with Georgia Tech so we consider that a win,” DiSalvo said happily. She added that with the new NSF funding – and the additional work – she expects to move forward with hiring additional DataWorks employees in the near term.
The NSF grant is part of the Connected Communities program. Along with continuing to hire and train people from Atlanta’s under-resourced communities, the grant will fund research into the program, with the goal of building more training tools and programs for employees that could scale to other cities.
One aspect of the research will address bias. DiSalvo says she wants to better understand how having different groups of people doing peripheral data work ultimately impacts outputs. Another research question will look at what kind of structures can be developed to do this kind of community engagement work within an institution like Georgia Tech.
“We know there are a lot of grassroots communities that could take advantage of data and they don’t because there just aren’t structures in place for them to do it,” said DiSalvo.
DataWorks is currently part of Georgia Tech’s Constellations Center for Equity in Computing, which is housed in the College of Computing.
“Constellations focuses on increasing equity in computing. DataWorks extends our reach by providing a pathway to computing opportunities for those who wish to acquire the necessary skills while applying them in an entry-level position,” said Cedric Stallworth, assistant dean for Outreach, Enrollment and Community in the College of Computing.
Beyond DataWorks, DiSalvo has been helping people acquire entry-level skills since she was a Georgia Tech Ph.D. student. For her dissertation project, she created Glitch, a program in which young Black men were hired from the community to test video games. As with the work at DataWorks, testing video games is entry-level work that doesn’t require a degree or special skills other than knowing how to play video games.
“These young men were testing real games for real companies. During the three years that Glitch was active, 33 young men, mostly from lower-income neighborhoods, participated in the project. More than 50 percent went on to major in computer science or a related field,” said DiSalvo.
According to Stallworth, programs like Glitch and DataWorks are key to developing a social climate of inclusivity and opportunity for people in underrepresented communities.
“We must be creative and diligent in our efforts to widen the doorway that leads to success in computing,” said Stallworth.