Big Data App on AWS Optimizes Analytics for One of Big 4

Big Data Application on AWS Helps a Global Audit Firm Automate Data Processing

One of the Big 4




100 000+

About the Customer:

One of the leading audit and professional services firms that supports clients globally by providing a full range of consulting, risk management, legal, tax, and financial advisory services.

Business Challenge

Our client, one of the Big 4, serves companies around the world and over the years has built up numerous databases of essential contacts (customers, their employees, third-party providers, their personal and corporate data, the relationships between the parties, etc.) along with related financial and corporate data.

Because the data sources were heterogeneous, finding the right contacts and mapping their relationships just to initiate communication was laborious and time-consuming.

In addition, financial data analysis and business analytics were complicated by poor data quality: missing values, inconsistencies, and differing data types and formats.

The client therefore decided to set up an advanced big data solution that would unify their network of contacts and automate data-intensive processes. The solution also had to make their data storage and analytics flexible and scalable and, most importantly, ensure the credibility of mission-critical data.

Infopulse has been a reliable partner of the global audit and consulting company for many years across a whole array of projects, including data engineering and digital transformation. When the client approached Infopulse for this project, we already had a good understanding of the client’s domain and policies, and our experts were familiar with the scale of the data and the client’s specific analytics needs.



Solution

Infopulse developed a corporate analytics application on AWS services that operates as a single big data platform, consolidating all the client’s contacts and mapping their interactions. The platform aggregates and analyzes massive volumes of financial and contact data that come from diverse sources and are generated and used by multiple users across the audit company.

The source data lived in different databases, which created a range of complexities. Our data engineers had to manually resolve data inconsistencies, including conflicts, duplicates, and errors that could hamper data analysis and reporting. It was also essential to deliver data changes daily, though not all source databases could report what had changed. Infopulse data architects developed custom solutions and modifications, such as building a pseudo-delta of the data, to ensure that changes to large volumes of data were picked up on time.
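The case study does not describe how the pseudo-delta was built. One common way to derive changes from a source that can only hand over full extracts is to fingerprint every row and compare today's snapshot with yesterday's. The sketch below illustrates that idea only; the function names, the record ids, and the sample contacts are all hypothetical.

```python
import hashlib
import json


def row_fingerprint(row: dict) -> str:
    """Stable hash of a row's content; key order does not affect it."""
    payload = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def pseudo_delta(previous: dict, current: dict) -> dict:
    """Compare two full snapshots keyed by record id.

    Returns the ids inserted, updated, or deleted since the previous snapshot.
    """
    prev_hashes = {key: row_fingerprint(row) for key, row in previous.items()}
    curr_hashes = {key: row_fingerprint(row) for key, row in current.items()}

    inserted = sorted(k for k in curr_hashes if k not in prev_hashes)
    deleted = sorted(k for k in prev_hashes if k not in curr_hashes)
    updated = sorted(
        k for k in curr_hashes
        if k in prev_hashes and curr_hashes[k] != prev_hashes[k]
    )
    return {"inserted": inserted, "updated": updated, "deleted": deleted}


# Hypothetical sample data: two daily extracts of the same contact table.
yesterday = {
    "c1": {"name": "Ann Lee", "email": "ann@example.com"},
    "c2": {"name": "Bob Ray", "email": "bob@example.com"},
}
today = {
    "c1": {"name": "Ann Lee", "email": "ann.lee@example.com"},
    "c3": {"name": "Cy Poe", "email": "cy@example.com"},
}

delta = pseudo_delta(yesterday, today)
print(delta)
```

Only the rows listed in the result would need to be pushed downstream, which keeps the daily load small even when the source delivers the whole table each time.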

Before the solution found its place in the client’s business ecosystem, Infopulse delivered the following services:

  • Enabled data unification to eliminate duplicates and enhance data accuracy
  • Provided the ability to rapidly analyze financial and contact data regarding customers’ personnel and their interactions with other departments and executives
  • Enabled retrieving data from varied systems using APIs, file replication, direct access to databases, etc.
  • Ensured that large volumes of intermediate data are stored in Amazon S3 object storage, which functions as a highly scalable and resilient data lake
  • Used the following services to store the contact and financial data of companies that worked with our client:
    • Amazon Neptune, a cost-efficient graph database, to build a network of customers (employees, managers, etc.) and their relationships with the different parties involved
    • Amazon Aurora, a relational database in the Amazon RDS family, and Amazon Athena, a serverless SQL query service, to store and analyze interactions
  • Applied AWS Glue jobs for efficient data transformation and movement, making it faster to prepare clean and well-formatted data
  • Implemented change data capture (CDC) to track changes in source data
  • Enabled quick data search based on different attributes (email, alias, or contact data) using Amazon OpenSearch Service
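To illustrate why a graph model suits the network of contacts described in the list above, here is a minimal in-memory sketch of the kind of question a graph database answers: how is one person connected to another? It deliberately does not use Amazon Neptune or any of its query languages; the edge list, relation names, and people are hypothetical, and a breadth-first search stands in for a real graph query.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency list from (node, relation, node) triples."""
    graph = {}
    for a, relation, b in edges:
        graph.setdefault(a, []).append((relation, b))
        graph.setdefault(b, []).append((relation, a))
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest chain of nodes or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _relation, neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hypothetical relationships between people and companies.
edges = [
    ("Ann", "works_at", "Acme"),
    ("Bob", "works_at", "Acme"),
    ("Bob", "reports_to", "Dana"),
    ("Eve", "works_at", "Globex"),
]
graph = build_graph(edges)

print(shortest_path(graph, "Ann", "Dana"))
print(shortest_path(graph, "Ann", "Eve"))
```

In a relational schema the same question would need a chain of self-joins whose depth is not known in advance, which is the usual argument for keeping relationship data in a graph store.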

The architecture of a Big Data Application on AWS


Business Value

Infopulse helped One of the Big 4 consolidate all contact and financial data, creating a golden record of all the people involved and their connections. Data quality increased considerably, and the data can now be fully trusted thanks to its completeness, improved accuracy, and continuous updates.
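The source does not say which matching or survivorship rules were used to build the golden record. As a rough illustration of the concept, the sketch below groups contact records by a normalized email address and lets the most recently updated non-empty value win for each field. The field names, the matching key, and the sample records are assumptions made for this example.

```python
def normalize_email(email: str) -> str:
    """Matching key: trimmed, lowercased email address."""
    return email.strip().lower()


def build_golden_records(records):
    """Merge duplicate contacts into one record per normalized email.

    Records are applied oldest first, so newer non-empty values overwrite
    older ones while empty values never erase known data.
    """
    golden = {}
    for record in sorted(records, key=lambda r: r["updated"]):
        key = normalize_email(record["email"])
        merged = golden.setdefault(key, {"email": key})
        for field, value in record.items():
            if field == "email":
                continue
            if value not in (None, ""):
                merged[field] = value
    return golden


# Hypothetical records for the same person arriving from two source systems.
records = [
    {"email": " Ann@Example.com", "name": "Ann Lee", "phone": "", "updated": "2023-01-10"},
    {"email": "ann@example.com", "name": "Ann Lee", "phone": "555-0100", "updated": "2023-03-02"},
    {"email": "bob@example.com", "name": "Bob Ray", "phone": None, "updated": "2023-02-01"},
]

golden = build_golden_records(records)
for key, record in golden.items():
    print(key, record)
```

Real-world matching usually needs more than one key (name variants, aliases, phone numbers), so this should be read as the simplest possible instance of the idea.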

Other benefits the client noticed include:

  • Simplified access to data for 100 000+ business users (even if records are requested simultaneously)
  • Automation of data processing and data management activities
  • Improved big data analytics and time to insights, thus accelerating decision-making
  • Better discovery of interactions to identify contacts who used or still use the services of the audit firm and their relations to other potential customers
  • Easier outreach thanks to the ability to find the person responsible for a specific division or function, whether a business user or a customer
  • Improved communication and collaboration between the client’s personnel and customers


Amazon S3
Amazon Athena
Amazon Aurora
AWS Glue
Amazon OpenSearch Service
Amazon Neptune
AWS Lambda
AWS AppSync
AWS Step Functions


Transform your data processing and analytics capabilities with refined Big Data applications on AWS.
