
Overview
What we did
- Developed a cloud-based analytics solution
- Designed and implemented a data pipeline that used Apache NiFi to integrate data from approximately 60 sources, including:
  - MySQL database
  - Oracle database
  - A number of APIs
  - Omniture and DoubleClick for Publishers marketing and web analytics
  - Pardot B2B Marketing Analytics
  - Tableau reports
  - Nielsen ratings and viewing data
- Exported data from the Teradata data warehouse to the Apache Hadoop Distributed File System (HDFS), orchestrated by Google Cloud Dataproc, then transferred it to Google Cloud Storage and loaded it into Google BigQuery
- Created a data dictionary using Apache Avro to define the company’s various data types
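To illustrate what an Avro-based data dictionary entry might look like, here is a minimal sketch in Python. The record name, namespace, and field names are hypothetical and are not taken from the project; the `conforms` helper is a simplified stand-in for a real Avro library and only checks the few primitive types used here.

```python
import json

# Hypothetical Avro record schema. The record and field names are
# illustrative and do not come from the project's actual data dictionary.
PAGE_VIEW_SCHEMA = {
    "type": "record",
    "name": "PageView",
    "namespace": "example.analytics",
    "fields": [
        {"name": "source_system", "type": "string"},
        {"name": "event_time_ms", "type": "long"},
        # A union with "null" makes the field optional in Avro.
        {"name": "viewer_count", "type": ["null", "int"], "default": None},
    ],
}

# Python types accepted for the Avro primitive types used above.
PRIMITIVES = {"null": type(None), "string": str, "int": int, "long": int}


def _matches(value, avro_type):
    """Check one value against one Avro primitive type name."""
    py_type = PRIMITIVES[avro_type]
    # bool is a subclass of int in Python, so reject it for numeric types.
    if py_type is int and isinstance(value, bool):
        return False
    return isinstance(value, py_type)


def conforms(record, schema):
    """Return True if record has exactly the schema's fields with matching types.

    Simplified: every field must be present, and integer ranges are not checked.
    """
    fields = schema["fields"]
    if set(record) != {f["name"] for f in fields}:
        return False
    for field in fields:
        avro_type = field["type"]
        # A union is written as a list; a value may match any branch.
        branches = avro_type if isinstance(avro_type, list) else [avro_type]
        if not any(_matches(record[field["name"]], b) for b in branches):
            return False
    return True


# Avro schemas are plain JSON, so the dictionary entry can be stored as text.
schema_json = json.dumps(PAGE_VIEW_SCHEMA, indent=2)
```

Keeping each source's record layout in a schema like this gives every downstream step one shared definition of field names and types.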
Technologies used
- Google Cloud
- Google BigQuery
- Google Cloud Storage
- Google Cloud Dataproc
- Apache NiFi
Key outcomes
Pythian’s experience with migrations and analytics solutions, combined with technical expertise in the selected technologies, resulted in a flexible and cost-efficient data analytics solution. Google BigQuery bills storage and compute separately, and on-demand pricing lets the customer pay only for the storage and compute they use. Query response times are also faster with Google BigQuery than with Teradata, and BigQuery can adapt to any data type or format, and convert between formats, without additional charges.
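The effect of billing storage and compute separately can be sketched with simple arithmetic. The function and the rates below are placeholders chosen for illustration, not actual Google pricing or figures from this project.

```python
def monthly_cost(stored_gb, scanned_tb, storage_rate_per_gb, query_rate_per_tb):
    """Return (storage_cost, compute_cost, total) for one month.

    Storage is charged on data kept; compute is charged on data scanned
    by queries. The two charges are independent of each other.
    """
    storage_cost = stored_gb * storage_rate_per_gb
    compute_cost = scanned_tb * query_rate_per_tb
    return storage_cost, compute_cost, storage_cost + compute_cost


# Placeholder rates, for illustration only.
STORAGE_RATE = 0.5  # currency units per GB stored per month (assumed)
QUERY_RATE = 4.0    # currency units per TB scanned (assumed)

# A busy month and an idle month with the same amount of stored data.
busy = monthly_cost(1000, 10, STORAGE_RATE, QUERY_RATE)
idle = monthly_cost(1000, 0, STORAGE_RATE, QUERY_RATE)
```

In the idle month the compute charge drops to zero while the storage charge is unchanged, which is the behavior that on-demand pricing with separate storage and compute implies.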
