Today’s guest blog comes from Graham Polley, Senior Consultant for Shine Technologies, a digital consultancy in Melbourne, Australia. Shine builds custom enterprise software for companies in many industries, including online retailers, telecom providers, and energy businesses.
Wrestling with large data sets reminds me of that memorable line from Jaws when police chief Brody sees the enormous great white shark for the first time: “You’re gonna need a bigger boat”. That line pops into my head whenever we have a new project at Shine Technologies that involves processing and reporting on massive amounts of client data. Where do we get that ‘bigger boat’ we need to help businesses make sense of the billions of ad clicks, ad impressions, and other data that can guide business decisions?
Four or five years ago, without any kind of ‘bigger boat’ available, we simply couldn’t grind through terabytes of data without a great deal of expensive hardware and time. We’d have to provision new servers, which could take weeks or even months, not to mention the costs of licensing and system administration. We could rarely analyze all the data at hand because it would overwhelm network resources, so we’d usually end up analyzing just 10% or 20%, which didn’t give us complete answers to client questions or provide any discernible insights.
When one of our biggest clients, a national telecommunications provider in Australia, needed to analyze a large amount of their business data in real time, we chose Google’s DoubleClick for Publishers product. We realized we could configure DoubleClick to store the data in Google Cloud Storage, and then point Google BigQuery to those files for analysis, with just a couple of clicks.
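That pipeline, files landing in Cloud Storage and BigQuery pointed at them, can also be driven programmatically. Here is a minimal sketch using the google-cloud-bigquery Python library; the project, bucket, dataset, and table names are hypothetical, and valid Google Cloud credentials are assumed:

```python
def table_id(project, dataset, table):
    """Build a fully qualified BigQuery table identifier."""
    return f"{project}.{dataset}.{table}"

def load_report_files(project="shine-demo", dataset="dfp", table="ad_impressions"):
    """Kick off a BigQuery load job over a wildcard of Cloud Storage files.

    Needs the google-cloud-bigquery package and application credentials;
    the gs:// URI below is a placeholder for the exported report files.
    """
    from google.cloud import bigquery  # imported lazily; requires credentials

    client = bigquery.Client(project=project)
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,   # skip the header row in each export file
        autodetect=True,       # let BigQuery infer the schema
    )
    load_job = client.load_table_from_uri(
        "gs://shine-demo-dfp-exports/impressions-*.csv",
        table_id(project, dataset, table),
        job_config=job_config,
    )
    load_job.result()  # block until the load job completes
    return load_job.output_rows
```

Because the load job reads straight from Cloud Storage, no data ever passes through your own machines, which is exactly why provisioning hardware drops out of the picture.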
Finally, we thought, we’ve found something that can scale effortlessly, keep costs down, and (most importantly) allow us to analyze all of our client’s data rather than only small chunks of it. BigQuery boasts impressive speeds, is easy to use, and comes with a very short learning curve. We don’t need to provision any hardware or spin up complex Hadoop clusters, and it offers a really nice SQL-like interface that makes it possible even for non-technical people, such as business analysts, to easily interrogate the data and draw insights from it.
When the same client came to us with a particularly complex problem, we immediately knew that BigQuery had our backs. They wanted us to stream millions of ad impressions from their large portfolio of websites into a database, and generate analytics about that data using some visually compelling charts – in real time. Using its streaming functionality, we started pumping the data in, and it went off without a hitch: we sat back and watched as millions of rows flowed into BigQuery. When it came to interrogating and analyzing the data, we experienced consistent results in the 20-25 second range for grinding through our massive data set of 2 billion rows using relatively complex queries to aggregate the data.
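The streaming path above can be sketched as follows. This is an illustrative outline, not our production code: the event shape, table name, and schema are hypothetical, the google-cloud-bigquery library and credentials are assumed, and the aggregation query is just one example of the kind of query we run:

```python
def to_row(event):
    """Map a raw impression event onto the (hypothetical) table schema."""
    return {
        "timestamp": event["ts"],
        "site": event["site"],
        "ad_unit": event["ad_unit"],
        "creative_id": event["creative"],
    }

def batches(rows, size=500):
    """Split rows into chunks; ~500 rows per request is a commonly
    recommended batch size for BigQuery streaming inserts."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]

def stream_events(events, table="shine-demo.dfp.ad_impressions"):
    """Push impression events into BigQuery via streaming inserts."""
    from google.cloud import bigquery  # lazy import; requires credentials

    client = bigquery.Client()
    for batch in batches([to_row(e) for e in events]):
        errors = client.insert_rows_json(table, batch)
        if errors:
            raise RuntimeError(f"streaming insert failed: {errors}")

# An example of the kind of aggregation that feeds the real-time charts
# (table and column names are hypothetical):
AGGREGATE_SQL = """
SELECT site, ad_unit, COUNT(*) AS impressions
FROM `shine-demo.dfp.ad_impressions`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
GROUP BY site, ad_unit
ORDER BY impressions DESC
"""
```

Rows streamed this way become available for querying almost immediately, which is what makes the real-time dashboards possible without waiting for batch loads.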
BigQuery’s streaming capability lets us analyze our client’s data instantly, empowering them with ‘real-time insights’ rather than making them wait for slower batch jobs to complete. The client can now instantly see how ad campaigns are performing, and change the ad creative or target audience on the fly in order to achieve better results.
Simply put, without BigQuery it just would not have been possible to pull this off. This is bleeding-edge technology, and the idea of doing something similar in the past with a relational database management system (RDBMS) was simply inconceivable.
The success of this project opened up a lot of doors for us. After we blogged about it, we received several requests from prospective clients wanting to know if we could apply the same technology to their own big data projects, and Google invited us to become a Google for Work Services partner. Our clients are continuously coming up with more ideas for driving insights from their data, and by using BigQuery we can easily keep up with them.
Big data can seem like that great white shark in Jaws – unmanageable and wild unless you have the right tools at your disposal to tame it. BigQuery has become our go-to solution for reeling in data, processing it, and discovering the value within.
– Contributed by Graham Polley, Senior Consultant, Shine Technologies
Learn more about Shine Technologies and the business impact of BigQuery. Watch as BigQuery takes on Shine Technologies’ 30 Billion Row, 30 Terabyte Challenge.
Feed Source: Google Cloud Platform Blog
Article Source: Shine Technologies Reels in Big Data Using Google’s BigQuery