Pythian tears down data silos with Hadoop for financial client

Business Needs

One of Europe’s leading financial services companies needed to cost-effectively scale its data platform to keep up with increasing demands on its systems, including its mobile payment system, while integrating data from disparate sources so users across the organization could gain new insights and introduce new services. To drive innovative use of data across the organization, the company had to achieve these main objectives:
  • To increase their ability to cost-effectively scale and maintain performance levels of their existing mobile payment application as their transaction velocity grew.
  • To integrate a large number of disparate data sources into a single database for use by application developers and analytics professionals across the organization, while keeping it separate from their core banking transaction systems.
  • To implement a multi-tenant platform that would host the applications of several different teams in the bank, while keeping data security, fair resource sharing and enforcement of usage restrictions as top priorities.
  • To set up and configure a data governance strategy that could annotate, classify and control the vast amounts of data imported into the data lake.
The company recognized early that business users would want access to data to create data models for specialized functionality and to create new services. With many database servers and more than 400 continually growing databases, the company needed to scale its capacity easily as demand for services increased. The company wanted an enterprise data hub that would offer “big data as a service” to the various business units. Protecting sensitive customer information is of paramount importance in the banking industry, so advanced data protection and security were key requirements.


The company asked Pythian to undertake the complex task of developing a single point of integration for multiple data sources in a secure, multi-tenant architecture. The company had selected Hadoop because of its scalability and because it would allow them to combine many data sources and would enable departments across the business to access a much richer data store. Pythian recommended Cloudera Enterprise based on its ability to provide the best overall enterprise-level support, along with its security and audit features.
Pythian was tasked with building the enterprise data lake in less than 90 days. The platform had to be flexible enough to satisfy the requirements of many different departments and stakeholders, incorporating the high levels of security required in the banking industry. Pythian designed a secure, multi-tenant architecture to efficiently ingest and process data from more than 400 databases across the enterprise and from external data stores.
They also created a sandbox area to enable administrators to apply constraints to manage usage and access without keeping business teams from getting access to the data they needed. Rigorous security policies were applied to provide users better access to data sources in a controlled environment. Permissions could be set for individual users, who would also be assigned a secure space to work, along with usage and consumption limits.
The new solution enabled the bank to better understand customer behavior and build enhanced personalized services, such as a new fraud detection system that accessed data from many different systems. Also, the new platform offered the processing power to enable the launch of a new mobile payments application that could scale to handle a massive number of daily transactions. Hadoop made it possible to meet the company’s requirement to keep this data separate from, but synchronized with, the core banking transaction systems.
Lastly, by aggregating its data with Hadoop, the company could reduce the number of servers, along with management and overhead costs. The ability to scale dynamically meant they could also maintain a high level of performance on their transaction-heavy applications.


Many different departments are already finding new ways to use the data through a completely synchronized, parallel data set. This means they can develop additional functionality to enrich the customer experience without touching the core system. One year into the project, the company has expanded the two production clusters from 11 nodes to 27 nodes to handle the additional data produced by the more than ten teams that have onboarded. HDFS usage is at 300 TB and growing. Where departments once had to manually request data from different groups, they now have access to integrated data from different sources. This access to better information will allow the company to create data models and continually improve both decision making and business processes. The impact on the business has included:
  • Bottom-line savings: lower system provisioning and maintenance costs, increased productivity and faster time to market
  • Increased competitive advantage in terms of new services delivering additional functionality to clients
  • Mitigated risk for the bank and its clients through a new fraud detection system
  • New customer experiences in emerging payment channels


"Pythian helped us implement a scalable big data platform that integrates data from across our organization. The new solution has enabled our teams to build a single integrated data repository and tear down existing data silos. A new cross-data analysis capability has enabled us to improve the customer experience and enhance our security and fraud detection models, while letting us deliver innovative new data-driven services"

