Can you change the world for the better in 24 hours? That was the task 39 teams took on at the Bayes Hack data-science challenge in November.
Bayes Impact is a Y Combinator-backed nonprofit that runs programs to bring data-science solutions to high-impact social problems. In addition to a 12-month full-time fellowship that supports leading data scientists working with civic and nonprofit organizations such as the Gates Foundation, Johns Hopkins and the White House, the organization runs an annual 24-hour hackathon that brings together data scientists and engineers to tackle social problems.
Starting from a set of 20 challenge problems proposed by government and nonprofit organizations, teams drawn from Silicon Valley’s top data-science talent looked for impactful ways to use already available data to solve pressing social problems.
Google Cloud Platform sponsored the event with a $500 Google Cloud Starter Pack credit for each team and a prize of $100K in Google Cloud Platform credits for the winning team.
With only 24 hours and large quantities of data to process, teams used tools such as Google Compute Engine and BigQuery to quickly chew through terabytes of information, looking for ways to make a meaningful impact on people’s lives.
The winning team, composed of five Bay Area data scientists, used their data savvy and their Cloud Platform credits to identify prostitution rings by analyzing patterns of phone numbers and text in postings to adult escort websites. Using a cluster of Compute Engine nodes, the team processed a dataset provided by the nonprofit group Thorn. They indexed 38,600 phone numbers and combined that index with a heuristic phrase-matching strategy to detect 143 separate networks or cells operating in the US.
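The team's code is not shown in this post. As a rough illustration of the phone-number side of the idea, here is a minimal Python sketch that links postings sharing a phone number and reports the connected groups. The function names, the US-style phone regex, and the sample postings (with fictional 555 numbers) are all assumptions made for this example. The phrase-matching heuristic is left out, since the post gives no detail on it.

```python
import re
from collections import defaultdict

# Assumed pattern for 10-digit US-style numbers, e.g. (415) 555-0100 or 415.555.0100.
PHONE_RE = re.compile(r"\(?\b(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})\b")


def extract_phones(text):
    """Return the set of normalized 10-digit phone numbers found in text."""
    return {"".join(m.groups()) for m in PHONE_RE.finditer(text)}


def group_postings(postings):
    """Group posting ids that are linked, directly or transitively,
    by shared phone numbers. `postings` maps posting id -> text."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Treat postings and phone numbers as nodes in one graph.
    for post_id, text in postings.items():
        node = ("post", post_id)
        find(node)  # register postings that contain no phone number
        for phone in extract_phones(text):
            union(node, ("phone", phone))

    groups = defaultdict(set)
    for post_id in postings:
        groups[find(("post", post_id))].add(post_id)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


if __name__ == "__main__":
    sample = {
        1: "call (415) 555-0100",
        2: "415.555.0100 or 415-555-0199",
        3: "text 4155550199",
        4: "650 555 0123",
    }
    print(group_postings(sample))
```

In the sample, postings 1 and 2 share one number and postings 2 and 3 share another, so all three land in the same group while posting 4 stays on its own. A real pipeline would need more robust number extraction and some way to fold in text similarity.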
“Realizing that it was going to take 76 days to process the data on a local laptop, we saw this as a place to use our Cloud Platform credits,” notes Peter Reinhardt, the lead for the winning team. “We found it really straightforward to get SSH access to our first compute instance right from the console. Once that was running, we were able to use that image to quickly bring up 10 machines, and went from nothing to a high powered compute cluster in just over half an hour.”
Paul Duan, President of Bayes Impact, observed that Cloud Platform “enabled the participants to get going quickly and focus on their application without having to spend too much time setting up infrastructure.”
It is estimated that 100,000 to 300,000 children are at risk of commercial sexual exploitation in the United States and one million children are exploited by the global commercial sex trade each year.* As the winning entry, the team’s work will be adopted and expanded as a resident Bayes Impact project.
Companies use data science and Google’s Big Data tools to quickly answer tough, data-intensive questions. Bayes Impact and Google worked together to show what is possible when human and technology resources are brought to bear on social problems.
Posted by Preston Holmes, Google Cloud Platform Solutions Architect
*U.S. Department of State, The Facts About Child Sex Tourism: 2005.