What is a Customer Data Platform (CDP)?

Ashish Agarwal
3 min read · Dec 8, 2022


What is CDP — Copyrights Protected

What is a Customer Data Platform (CDP)?

A CDP is enterprise software that combines data from multiple sources into a single, centralised customer profile. It holds real-time data on all of a customer's touch points and interactions with your product. This data can then be segmented in multiple ways to create customised marketing campaigns.

The centralised customer profile is sometimes referred to as a 360º customer profile.

Customer data can include contact details (name, address, phone number, email), demographics (age, gender, location), order history, buying preferences, etc.

Customer data can be collected from sources like CRM systems, transactional systems, web forms, chatbots, emails, social media, ecommerce sites, loyalty systems, etc.

What are some of the essential features of a CDP?

To be usable for marketing campaigns, a CDP should offer the following features:

a) Data Ingestion

To create a single, centralised customer profile, a CDP needs to connect to different data sources via SFTP, S3/GCS or APIs. Customer data can be ingested into the CDP using either off-the-shelf connector products like Fivetran, StitchData or Workato, or custom-made connectors.
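Whichever connector is used, the ingested records usually need to be mapped onto one common schema before they can be merged. Here is a minimal sketch of that step, assuming two hypothetical sources (a CRM CSV export and a web form JSON feed). The field names, sample data and helper functions are made up for illustration and do not come from any particular connector product.

```python
import csv
import io
import json

# Hypothetical exports from two sources; the field names are illustrative only.
CRM_CSV = """Email,Full Name,Phone
Jane.Doe@Example.com,Jane Doe,555-0100
"""
WEB_FORM_JSON = '[{"email_address": "raj@example.com", "name": "Raj Patel", "zip": "94105"}]'


def from_crm(row):
    """Map one CRM CSV row onto the common schema."""
    return {
        "source": "crm",
        "email": row["Email"].strip().lower(),
        "name": row["Full Name"].strip(),
        "phone": row.get("Phone") or None,
        "zip": None,  # this CRM export carries no zip code
    }


def from_web_form(rec):
    """Map one web form JSON record onto the common schema."""
    return {
        "source": "web_form",
        "email": rec["email_address"].strip().lower(),
        "name": rec["name"].strip(),
        "phone": rec.get("phone"),
        "zip": rec.get("zip"),
    }


def ingest():
    """Read both sources and return records in the common schema."""
    records = [from_crm(r) for r in csv.DictReader(io.StringIO(CRM_CSV))]
    records += [from_web_form(r) for r in json.loads(WEB_FORM_JSON)]
    return records


records = ingest()
```

Normalising emails to lower case at this stage makes the later exact-match step simpler.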

b) Identity Resolution

Identity data forms the foundation of any CDP. A marketer needs to tie each customer's behaviours to a single customer view so that the right campaigns can be run. Identity resolution allows businesses to uniquely identify each customer and avoid costly duplication. Data clean-up and de-duplication algorithms are used to create the 360º customer profile.

How is a 360º customer profile built?

Typically, the following two rules are used to build a master customer record:

  • Exact match on the email id
  • Fuzzy match using a combination of First Name, Last Name, Address and Zip

Further, during fuzzy matching, string distance algorithms like Levenshtein and Jaro-Winkler are used to measure how 'similar' two values are.

Libraries (or APIs) from vendors such as Melissa can be used for address verification.
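The two matching rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production matcher: it implements only Levenshtein distance (not Jaro-Winkler), compares the fuzzy fields as one concatenated string, and uses a similarity threshold of 0.85 that is an arbitrary choice for this example. The record field names are hypothetical as well.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of inserts, deletes and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Normalise edit distance to a 0..1 score (1.0 means identical)."""
    if not a and not b:
        return 1.0
    return 1 - levenshtein(a, b) / max(len(a), len(b))


FUZZY_FIELDS = ("first_name", "last_name", "address", "zip")  # assumed field names
THRESHOLD = 0.85  # assumed cut-off; tune it on real data


def fuzzy_key(record: dict) -> str:
    """Concatenate the fuzzy-match fields into one lower-cased string."""
    return " ".join(record.get(f, "").strip().lower() for f in FUZZY_FIELDS).strip()


def same_customer(r1: dict, r2: dict, threshold: float = THRESHOLD) -> bool:
    # Rule 1: exact match on email (ignoring case and surrounding spaces).
    e1 = r1.get("email", "").strip().lower()
    e2 = r2.get("email", "").strip().lower()
    if e1 and e1 == e2:
        return True
    # Rule 2: fuzzy match on name, address and zip.
    k1, k2 = fuzzy_key(r1), fuzzy_key(r2)
    if not k1 or not k2:
        return False  # nothing to compare
    return similarity(k1, k2) >= threshold


jane_crm = {"email": "jane@example.com", "first_name": "Jane", "last_name": "Doe",
            "address": "12 Main Street", "zip": "94105"}
jane_web = {"email": "", "first_name": "Jane", "last_name": "Doe",
            "address": "12 Main Stret", "zip": "94105"}  # typo in the address
raj = {"email": "raj@example.com", "first_name": "Raj", "last_name": "Patel",
       "address": "9 Oak Ave", "zip": "10001"}

print(same_customer(jane_crm, jane_web))  # fuzzy rule applies
print(same_customer(jane_crm, raj))
```

In practice, records that match would then be merged into one master record, and the threshold would be tuned to balance missed duplicates against false merges.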

c) Audience Building or Cohort Analysis or Segmentation

A CDP synthesises customer behaviours to create groups of look-alike customers. These groups are known as cohorts or segments in marketing parlance. Segmentation allows marketers to deliver campaigns to cohorts that would otherwise have been missed.

How are segments created?

Segments are created on the basis of customer attributes like city, address, gender, purchase amounts, products (or product types) purchased, etc.
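As a rough illustration of attribute-based segmentation, the sketch below treats each segment as a named rule over a customer's attributes. The sample customers, attribute names, segment names and the 1000 spend cut-off are all assumptions made up for this example.

```python
# Hypothetical customer profiles; attribute names are illustrative only.
customers = [
    {"id": 1, "city": "Pune", "gender": "F", "total_purchases": 1200.0},
    {"id": 2, "city": "Mumbai", "gender": "M", "total_purchases": 300.0},
    {"id": 3, "city": "Pune", "gender": "M", "total_purchases": 80.0},
]

# Each segment is a name plus a rule that decides whether a customer belongs.
SEGMENT_RULES = {
    "high_spenders": lambda c: c["total_purchases"] >= 1000,
    "pune_customers": lambda c: c["city"] == "Pune",
}


def build_segments(customers, rules):
    """Return {segment name: [customer ids]}; a customer may be in several segments."""
    segments = {name: [] for name in rules}
    for customer in customers:
        for name, rule in rules.items():
            if rule(customer):
                segments[name].append(customer["id"])
    return segments


segments = build_segments(customers, SEGMENT_RULES)
print(segments)
```

A campaign tool can then target the list of customer ids in each segment.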

d) Analytics

Marketing and sales teams need rich dashboards and reports to gain insights that eventually drive decision making. They get access, via queries, to data that directly impacts sales. These queries run on the curated data stored in a database or data warehouse system like Snowflake. Off-the-shelf Business Intelligence tools like Looker, Tableau, Metabase and Apache Superset can be used to view reports or create custom dashboards.

Further, Machine Learning models can be used to uncover insights from the curated data. TensorFlow, Orange and Qubole are some of the off-the-shelf tools for ML.
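To give a feel for the kind of query that sits behind a dashboard, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in for a real warehouse. The orders table, its columns and the sample rows are assumptions for illustration; against a real warehouse the SQL would look similar, but the connection code would differ.

```python
import sqlite3

# In-memory SQLite database standing in for the curated warehouse tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, city TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Pune", 1200.0), (2, "Mumbai", 300.0), (1, "Pune", 250.0), (3, "Mumbai", 450.0)],
)


def revenue_by_city(conn):
    """Distinct customers and total revenue per city, highest revenue first."""
    return conn.execute(
        """
        SELECT city,
               COUNT(DISTINCT customer_id) AS customers,
               SUM(amount) AS revenue
        FROM orders
        GROUP BY city
        ORDER BY revenue DESC
        """
    ).fetchall()


for city, n_customers, revenue in revenue_by_city(conn):
    print(city, n_customers, revenue)
```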

The following is a representative diagram of a scalable data pipeline for a CDP, depicting how several technologies work with each other.

Model scalable Data Pipeline Copyrights Protected
