Subtypes of First-party Data
An in-depth breakdown of each subtype and its purpose: event data, entity data, and identity data.
In the previous post, I described the characteristics of first-party data and zero-party data and touched upon what causes confusion between the two. If you haven’t had the chance to go through it, I highly recommend doing so before proceeding.
In this post, I dig deeper into the subtypes of first-party data.
And if you’re ready, let’s dive in.
This is an excerpt from a guide originally posted on the astorik learning hub.
As a reminder, data that an organization or brand collects implicitly from a user is first-party data. This data can be further split into three subtypes: event data, entity data, and identity data.
Understanding the differences between these subtypes helps answer the following questions:
Which types of data do we really need?
What tools do we use to collect and store each type of data?
How do we make each type of data actionable?
Event data
Also referred to as behavioral data or product-usage data, event data helps you understand how a user navigates through a product, where they get stuck, which features they use or don’t use, and at what point they become activated.
Every user action — a click, a tap, or a swipe — can be recorded as an event, along with additional data points called properties that provide more context about that event.
If a logged-in user visits a page inside an app, the event Page_Viewed, along with properties such as user_id, timestamp, and page_title, helps answer the following:
Which users viewed a certain page or screen
How many times a certain page was viewed
The exact times when a certain page was viewed
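To make this concrete, here is a minimal Python sketch of how those three questions could be answered from recorded events. The sample events, their values, and the helper name page_view_summary are assumptions made up for illustration; a real analytics tool would do this with its own query layer.

```python
# Hypothetical sample of tracked events; all values are made up for illustration.
events = [
    {"event": "Page_Viewed", "user_id": "u1", "timestamp": "2023-01-05T09:00:00Z", "page_title": "Pricing"},
    {"event": "Page_Viewed", "user_id": "u2", "timestamp": "2023-01-05T09:05:00Z", "page_title": "Pricing"},
    {"event": "Page_Viewed", "user_id": "u1", "timestamp": "2023-01-05T09:10:00Z", "page_title": "Dashboard"},
    {"event": "Page_Viewed", "user_id": "u1", "timestamp": "2023-01-06T14:30:00Z", "page_title": "Pricing"},
]


def page_view_summary(events, page_title):
    """Summarize Page_Viewed events for a single page (hypothetical helper)."""
    views = [
        e for e in events
        if e["event"] == "Page_Viewed" and e["page_title"] == page_title
    ]
    return {
        # Which users viewed the page
        "viewers": sorted({e["user_id"] for e in views}),
        # How many times the page was viewed
        "view_count": len(views),
        # The exact times when the page was viewed
        "viewed_at": sorted(e["timestamp"] for e in views),
    }


summary = page_view_summary(events, "Pricing")
print(summary)
```

Each key in the returned dictionary maps to one of the questions above, which is why even a handful of properties on an event goes a long way.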
Event data is the most common subtype of first-party data and has become table stakes for both B2B and B2C brands to understand user behavior. If you’d like to get a deeper understanding of the components of event data, check out this guide.
Confusion alert: Sometimes, codeless event tracking is referred to as implicit tracking and instrumenting events via code is called explicit tracking. Either way, users don’t share this data intentionally or explicitly, and therefore, irrespective of the technology used, event data falls under first-party data.
Entity data
Any piece of data containing some information or trait about an entity — an individual or group — falls under entity data. And when this data is collected implicitly, it falls under first-party entity data.
User is the most common entity that brands interact with and collect data about. The User table comprises columns to store user properties such as name, email, gender, occupation, is_active, is_customer, and so on.
B2B Context
In B2B, Account is a group entity that represents the customer (an organization) and stores data points associated with an entire account and not just a user who belongs to that account. Common account properties include account_name, account_status, is_customer, is_partner, number_of_users, subscription_type, and so on.
On the other hand, since an individual can be a lead, user, partner, or even just a billing admin or a combination of these, it’s useful to create an entity table such as Person that stores every piece of data pertaining to an individual.
Based on their actions, an individual can belong to one or more audiences as follows:
A person who subscribes to the blog is a subscriber or lead
A person who creates a free account is a user who belongs to an account where subscription_type is free
A person who upgrades a free account can be a user or a billing_contact of a paid account
A person who signs up for the partner program is an evangelist and a user of an account where is_partner is true
And so on
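The mapping from actions to audiences above can be sketched in Python as follows. This is only an illustration under stated assumptions: the Person and Account classes mirror the entities described here, but the fields is_blog_subscriber and is_billing_contact, the "free" and "paid" values, and the audiences function are hypothetical names I introduced, not a prescribed schema.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Account:
    account_name: str
    subscription_type: str       # assumed values: "free" or "paid"
    is_partner: bool = False


@dataclass
class Person:
    email: str
    is_blog_subscriber: bool = False   # hypothetical flag
    is_billing_contact: bool = False   # hypothetical flag
    account: Optional[Account] = None  # None if the person has no account yet


def audiences(person: Person) -> Set[str]:
    """Return every audience a person belongs to (hypothetical rules)."""
    result = set()
    if person.is_blog_subscriber:
        result.add("subscriber")
    if person.account is not None:
        result.add("user")
        if person.account.is_partner:
            result.add("evangelist")
        if person.is_billing_contact and person.account.subscription_type == "paid":
            result.add("billing_contact")
    return result


partner_account = Account("Acme", subscription_type="paid", is_partner=True)
person = Person("jo@example.com", is_blog_subscriber=True, account=partner_account)
print(audiences(person))
```

Because the function returns a set, one person can sit in several audiences at once, which is the point of keeping a single Person entity.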
A combination of a person property such as number_of_docs (for a collaboration tool, say) and an account property such as number_of_users helps B2B brands determine who their most active users are and which accounts to upsell to. If you’d like to dig deeper into entity data in the context of B2B SaaS, this guide is what you need.
P.S. It’s useful to note that entities such as Account and Person are referred to as objects inside a CRM.
B2C Context
An e-commerce store needs to maintain a record of the products viewed by a user in order to offer recommendations, forecast demand, or run remarketing campaigns. This piece of entity data (the user being the entity) can be stored in the products_viewed column of the user table.
Marketplaces deal with an additional entity, Merchant (or Seller), whose details and attributes such as merchant_category, product_categories, and serviceable_pincodes are stored in the Merchant table.
Additionally, every unique product (SKU) that a store sells is also an entity, and the feedback or reviews for that product are stored in the table dedicated to that SKU. This is what enables marketplaces like Amazon to display consolidated reviews of a product even though that product is sold by multiple merchants (who may or may not serve the same pin codes).
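As a rough sketch of why treating the SKU as its own entity helps, the Python snippet below groups reviews by SKU rather than by merchant. The review records, field names such as merchant_id and rating, and the consolidate_reviews function are all hypothetical; this is not how any particular marketplace implements it.

```python
from collections import defaultdict

# Hypothetical review records; the same SKU is sold by two different merchants.
reviews = [
    {"sku": "SKU-1", "merchant_id": "m1", "rating": 5},
    {"sku": "SKU-1", "merchant_id": "m2", "rating": 3},
    {"sku": "SKU-2", "merchant_id": "m1", "rating": 4},
]


def consolidate_reviews(reviews):
    """Group ratings by SKU, ignoring which merchant made the sale."""
    ratings_by_sku = defaultdict(list)
    for review in reviews:
        ratings_by_sku[review["sku"]].append(review["rating"])
    return {
        sku: {
            "review_count": len(ratings),
            "average_rating": sum(ratings) / len(ratings),
        }
        for sku, ratings in ratings_by_sku.items()
    }


consolidated = consolidate_reviews(reviews)
print(consolidated["SKU-1"])
```

Since the grouping key is the SKU, reviews collected through different merchants end up in the same bucket.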
Identity data
Derived from the user's browser or device or fetched using data enrichment tools, first-party identity data helps identify who the user is.
When I visit a brand's website, my IP address is collected automatically, and I cannot choose not to share it (all I can do is use a VPN or a proxy to supply a different IP address). The brand is able to derive my location from my IP address to offer me a personalized experience or, for that matter, block me from accessing their product altogether.
Other identity attributes that can be derived implicitly include a user’s system preferences (such as light or dark mode), OS (Windows, macOS, iOS, or Android), as well as a host of demographic and firmographic data points that are made available by enrichment tools.
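For instance, the OS can often be inferred from the User-Agent header that a browser sends with every request. The Python sketch below is a deliberately simplified illustration: the detect_os function is hypothetical, the substring checks are assumptions that cover only common cases, and user-agent strings can be spoofed or trimmed by browsers, so production systems typically rely on a dedicated parsing library.

```python
def detect_os(user_agent: str) -> str:
    """Guess the OS from a User-Agent string (simplified, hypothetical helper)."""
    ua = user_agent.lower()
    # Order matters: iPhone user agents contain "like Mac OS X",
    # so the iOS check must come before the macOS check.
    if "iphone" in ua or "ipad" in ua:
        return "iOS"
    if "android" in ua:
        return "Android"
    if "windows" in ua:
        return "Windows"
    if "mac os x" in ua or "macintosh" in ua:
        return "macOS"
    return "unknown"


sample_user_agent = (
    "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) "
    "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 Mobile/15E148 Safari/604.1"
)
print(detect_os(sample_user_agent))
```

The user never types any of this in, which is what makes it first-party identity data rather than something shared explicitly.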
Enrichment tools can also fetch one’s name and email address, but this data is never 100% accurate or up to date.
First-party data, therefore, has a lot of applications and is a vital component for brands to build a strong audience data strategy.
Your thoughts matter more than mine
The results from my poll are in and two-thirds of the participants agree that the data referred to as “customer data” isn’t just about customers.
If you have additional thoughts or ideas about any of this stuff, you’d be doing me a favor by sharing them on this LinkedIn post.
And if you’d like to dig deeper and learn about the subtypes of zero-party data, here you go.