Online Advertising Technology Firm

CHALLENGE

An online advertising technology firm sought Relus Cloud's help with a massive data and analytics migration project. Previously, the Company’s team ran its data platform in a mixed environment consisting of on-premises data centers and SoftLayer cloud services. This environment cost the Company close to $7 million a year to maintain. Beyond the cost, it offered no elasticity, flexibility, or resiliency. The Company engaged Relus Cloud not only to build the platform, but also to teach its technical staff how to support the new AWS environment. To stay competitive and maintain its status as an industry leader, the Company needed to alter its existing testing environment to support its plans for moving all of its massive workloads into AWS.

OPPORTUNITY

The online advertising technology firm chose Relus Cloud as the partner best able to provide the skills, experience, insight, and resources needed for a successful AWS and data and analytics implementation. Relus Cloud supported the engagement with a comprehensive team of data architects, data engineers, and project managers who kept the project on track and helped deliver it on time and under budget.

SOLUTION

The Relus Cloud team provided an assessment of the Company’s existing AWS environment. Part of this assessment was to review the environment’s overall architecture and provide remediation plans based on the five pillars (operations, security, reliability, performance, and cost) of the AWS Well-Architected Framework.

The assessment determined that Amazon EC2, Elastic Load Balancing (ELB), and Auto Scaling groups were needed across Availability Zones to add resiliency and help balance the load for all application services. Relus Cloud assisted in testing and provided input on instance sizing and scaling policies. The Relus Cloud team used an “Automation First” strategy and implemented Terraform to automate the client’s infrastructure in AWS. This allowed the Company to quickly spin up environments in different domains and spin them down to save on costs.
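To make the scaling-policy discussion concrete, the following is a minimal sketch of how a proportional scaling rule and an even spread across Availability Zones might be reasoned about. It is an illustration only: the engagement used Terraform and AWS Auto Scaling, not this code, and the function names, metric values, and zone names here are hypothetical. The proportional rule is a simplification and not the exact algorithm AWS applies.

```python
import math


def desired_capacity(current_instances, metric_value, target_value,
                     min_size, max_size):
    """Estimate a fleet size proportionally, clamped to the group's limits.

    Simplified assumption: capacity scales linearly with the ratio of the
    observed metric to its target (for example, average CPU utilization).
    """
    if current_instances <= 0 or target_value <= 0:
        return min_size
    raw = math.ceil(current_instances * metric_value / target_value)
    return max(min_size, min(max_size, raw))


def spread_across_zones(total_instances, zones):
    """Distribute instances as evenly as possible across the given zones."""
    base, extra = divmod(total_instances, len(zones))
    # The first `extra` zones receive one additional instance each.
    return {zone: base + (1 if i < extra else 0)
            for i, zone in enumerate(zones)}


if __name__ == "__main__":
    size = desired_capacity(4, metric_value=90.0, target_value=60.0,
                            min_size=2, max_size=10)
    print(size, spread_across_zones(size, ["zone-a", "zone-b", "zone-c"]))
```

A sketch like this can help when testing instance sizing: it shows how the minimum and maximum group sizes bound whatever the metric suggests, and how losing one zone affects the remaining capacity.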

In addition to bringing the infrastructure up to AWS best practices and standards, the Relus Cloud team implemented a data lake strategy for the Company’s data platform. This allowed the client to ingest billions of data sets from all of their different on-premises data centers into Amazon S3. Relus Cloud implemented a data pipeline to process ingested data and create rollups and summarized data sets for revenue information. As part of the data pipeline, Relus Cloud’s team implemented a 67-node EMR cluster to process more than nine billion records per day ingested into an S3 bucket. The team also implemented an AWS Glue Data Catalog as a universal metadata store, used AWS Lambda for file transfers, and deployed Apache Airflow for job scheduling and monitoring. As part of setting up the data pipeline, the Relus Cloud team assisted in migrating over 250 Hive jobs to the Company’s new AWS environment.
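The rollup step described above can be sketched in a few lines. This is a hedged illustration of the general shape of a revenue rollup, not the Company's pipeline, which ran as Hive jobs on EMR. The record fields (event_date, campaign_id, revenue) are hypothetical, since the source does not describe the actual schema.

```python
from collections import defaultdict
from decimal import Decimal


def rollup_revenue(records):
    """Summarize raw records into totals per (event_date, campaign_id).

    Revenue is parsed with Decimal to avoid floating-point drift when
    summing many small currency amounts.
    """
    totals = defaultdict(lambda: {"events": 0, "revenue": Decimal("0")})
    for rec in records:
        key = (rec["event_date"], rec["campaign_id"])
        totals[key]["events"] += 1
        totals[key]["revenue"] += Decimal(rec["revenue"])
    return dict(totals)


if __name__ == "__main__":
    sample = [
        {"event_date": "2020-01-01", "campaign_id": "c1", "revenue": "1.00"},
        {"event_date": "2020-01-01", "campaign_id": "c1", "revenue": "0.50"},
        {"event_date": "2020-01-01", "campaign_id": "c2", "revenue": "2.25"},
    ]
    for key, summary in sorted(rollup_revenue(sample).items()):
        print(key, summary["events"], summary["revenue"])
```

For a sense of scale, nine billion records per day averages to roughly 104,000 records per second, which is why a grouping step like this is distributed across a cluster rather than run on a single machine.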

AWS Glue was central to the overall solution, serving as the universal metadata store for EMR. Relus Cloud implemented scheduled jobs to crawl S3 buckets for metadata discovery and automatic schema inference. This meant the Company did not have to worry about downstream services being affected by slowly changing dimension (SCD) schemas. Relus Cloud implemented a bi-directional approach: an external table created within the EMR cluster would be automatically discovered and cataloged within AWS Glue, and any schema changes identified by the scheduled AWS Glue crawlers were likewise available to query through Hive.
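The idea behind scheduled schema inference can be illustrated with a small sketch that infers column types from sample records and compares them with a previously cataloged schema. This is not how AWS Glue crawlers are implemented; the type names, the widen-to-string rule, and both function names are assumptions made for illustration.

```python
# Hypothetical mapping from Python types to Hive-style type names.
_TYPE_NAMES = {bool: "boolean", int: "bigint", float: "double", str: "string"}


def infer_schema(records):
    """Infer a column -> type-name mapping from sample records."""
    schema = {}
    for rec in records:
        for column, value in rec.items():
            type_name = _TYPE_NAMES.get(type(value), "string")
            # Widen to "string" when samples disagree on a column's type.
            if schema.get(column, type_name) != type_name:
                type_name = "string"
            schema[column] = type_name
    return schema


def diff_schemas(cataloged, inferred):
    """Report columns added, removed, or retyped since the last crawl."""
    added = {c: t for c, t in inferred.items() if c not in cataloged}
    removed = sorted(c for c in cataloged if c not in inferred)
    retyped = {c: (cataloged[c], t) for c, t in inferred.items()
               if c in cataloged and cataloged[c] != t}
    return {"added": added, "removed": removed, "retyped": retyped}


if __name__ == "__main__":
    old = {"id": "bigint", "cost": "double", "old_flag": "string"}
    new = infer_schema([
        {"id": 1, "cost": 0.5},
        {"id": 2, "cost": "n/a", "region": "us"},
    ])
    print(diff_schemas(old, new))
```

Running a comparison like this on a schedule is one way to surface schema drift before it reaches downstream queries.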

Finally, Relus Cloud configured AWS CloudTrail and Amazon CloudWatch to enable logging and resource monitoring on the environment. As a result, the Company is notified of unwanted access attempts and of anything that affects the environment’s performance.
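As an illustration of how unwanted access attempts might be picked out of audit logs, the sketch below filters CloudTrail-style event records for denied API calls. The field names follow the CloudTrail record format (eventTime, eventSource, eventName, errorCode, userIdentity, sourceIPAddress), but the matching rule and the sample events are assumptions; the source does not describe the alerting logic that was actually configured.

```python
# Substrings assumed to indicate a denied or unauthorized API call.
DENIED_MARKERS = ("AccessDenied", "Unauthorized")


def find_denied_calls(events):
    """Return summaries of events whose errorCode signals a denied call."""
    findings = []
    for event in events:
        error_code = event.get("errorCode", "")
        if any(marker in error_code for marker in DENIED_MARKERS):
            findings.append({
                "time": event.get("eventTime"),
                "action": f'{event.get("eventSource")}:{event.get("eventName")}',
                "principal": event.get("userIdentity", {}).get("arn"),
                "source_ip": event.get("sourceIPAddress"),
            })
    return findings


if __name__ == "__main__":
    sample_events = [
        {"eventTime": "2020-01-01T00:00:00Z",
         "eventSource": "s3.amazonaws.com", "eventName": "GetObject",
         "errorCode": "AccessDenied",
         "userIdentity": {"arn": "arn:aws:iam::111122223333:user/example"},
         "sourceIPAddress": "203.0.113.10"},
        {"eventTime": "2020-01-01T00:01:00Z",
         "eventSource": "ec2.amazonaws.com", "eventName": "DescribeInstances",
         "userIdentity": {"arn": "arn:aws:iam::111122223333:user/example"},
         "sourceIPAddress": "203.0.113.10"},
    ]
    for finding in find_denied_calls(sample_events):
        print(finding)
```

In practice a filter like this would feed a notification channel; here it simply returns the matching events so the rule itself is easy to inspect.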

Services leveraged:

  • Amazon VPC & VPC Peering

  • VPN

  • Amazon EC2

  • Amazon S3

  • Amazon SQS

  • AWS CloudTrail

  • Amazon CloudWatch

  • Amazon Route 53

  • Amazon EMR

    • Hive

    • Presto

    • Zeppelin

    • Hue

  • AWS Glue Data Catalog

  • AWS Lambda

  • Amazon RDS 

BENEFITS

The online advertising technology firm realized several benefits following the successful implementation of Relus Cloud’s AWS and data and analytics strategies. Using AWS, the Relus Cloud team created a modern data platform for the Company that can grow as the business does. As a result of the engagement, the Company has a fully scalable data platform on AWS that can handle over 20 billion transactions per day. With Relus Cloud's help, the Company can continue to innovate without being constrained by its infrastructure. The Company estimates that it saves $5 million per year in infrastructure costs because of its move to AWS and the guidance of the Relus Cloud team.
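The headline figures can be put in perspective with some simple arithmetic. The sketch below uses only numbers stated in this case study (over 20 billion transactions per day, close to 7 million dollars in previous annual cost, and an estimated 5 million dollars in annual savings). Both cost figures are approximate, so the results are rough indications and not audited values. The function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_per_second(per_day):
    """Convert a daily volume into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


def savings_fraction(previous_annual_cost, annual_savings):
    """Express annual savings as a fraction of the previous annual cost."""
    return annual_savings / previous_annual_cost


if __name__ == "__main__":
    rate = average_per_second(20_000_000_000)
    share = savings_fraction(7_000_000, 5_000_000)
    print(f"about {rate:,.0f} transactions per second on average")
    print(f"savings of roughly {share:.0%} of the previous annual cost")
```

Averaged over a full day, 20 billion transactions works out to a little over 230,000 per second, and the estimated savings correspond to roughly 70 percent of the previous annual spend.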