Once again, what exactly is a warehouse-native app?
This time, hear it from George Xing, who helped build the analytics function at Lyft and deeply understands the priorities and constraints of data teams.
In this episode of Data Beats, George explains how the warehouse-native architecture is superior to its predecessor and what benefits organizations can reap by adopting tools that are built on top of the data warehouse.
George also has some advice for organizations looking to get started with a warehouse-native app.
Let’s dive in:
Q. George, why don't you tell us what is a warehouse-native app?
Warehouse-native apps are business applications that run on top of the customer's cloud data warehouse, so something like Snowflake, Redshift, or BigQuery, and rely on that piece of infrastructure as the source of truth for customer data. This is different from managed applications, or the traditional way of building SaaS, where the vendor stores a copy of the customer data on its own system and manages it that way. One of the other advantages is that you don't have to be tied to a fixed schema: you can leverage the business relationships that the customer has already defined in their cloud data warehouse. Warehouse-native apps are schema-agnostic, and they also remove the need for pipelines: they move data in and out of the cloud data warehouse without additional ETL tools.
Q. What is leading to such a shift in the way SaaS tools are being built? You mentioned some of the benefits which sort of answers the question, but what else?
One of the things that we see is that more and more SaaS products are relying on data as a core differentiator. In our space, marketing, ROI is really driven by the use of customer data. The other big trend is that customer data is getting centralized in cloud data warehouses, because that's where you can see all your touchpoints and that's where your source of truth is. And so, in order to connect those two, the obvious kind of architecture is to move the software to the data that is housed in the source of truth.
With warehouse-native apps, you don't have to replicate your data which obviously brings cost savings.
Q. What are the other core benefits of warehouse-native apps over a traditional-managed app?
There are so many, but I think some of the ones that stand out: one is just speed of setup. One of the biggest challenges of working with a data-intensive application today is that you have to first send all your customer data to the vendor in order to just get started or get any value, whereas with a warehouse-native app, your data's already there and you just connect the application directly to your data via a standard database connection.
And so it simplifies setup and cuts the time to get set up from sometimes months to days. The other piece of this is just having more access for personalization, as I mentioned. If your warehouse is the source of truth for all your data, then you also have more data, richer data there. You can leverage a lot more of those data points for personalization. In the case of Supergrain, that would be targeting for emails.
That would be more personalized messages. And obviously, that drives more business ROI. And then I think the third piece of this gets at the pipelines. Normally you have to move data from your vendor back into your warehouse for analysis and reporting and BI tools. Warehouse-native apps simplify that because they write directly back to the warehouse, meaning the reporting is both more complete, it's faster, and it's completely managed. So you don't need to set up a separate pipeline, which also introduces costs.
Q. What are the prerequisites in terms of the data stack to adopt a warehouse-native app?
Obviously, having a warehouse is prerequisite number one, but the other piece of a data warehouse is data. So you need some way of getting data in there. And, you know, we talk to a lot of customers that have various ways, they use various ETL tools to load data in.
They also have other tools for doing transformation on top of the warehouse. And typically what that means is there's also somebody who's managing the data warehouse, either a data team or an engineering team of some sort. They're kind of responsible for the governance of the data that's in the warehouse.
And there's a nice kind of handoff between the work that they're doing and the application side, which is what Supergrain does. We're essentially making a lot of the work that they already do more accessible to downstream go-to-market teams.
Q. Since warehouse-native apps don't store any customer data, won't marketing campaigns break if the data changes at source or if there's an issue connecting to the customer's data warehouse?
Yeah, so I think the way that I would think about this is not, you know, warehouse-native apps versus traditional apps in terms of data storage. I would say that marketing campaigns today break a lot more often than people expect or would think because source data breaks and it breaks regardless of where the data is actually stored. You know, there's just a lot of dependencies between various different systems.
The data might not be formatted in the right way. And once it lands in a destination, then it might not render. And we've all seen those examples where emails go out with the unfilled merge tags, with the curly braces. The question is, how do we solve that problem? And in our view, it's really through better data governance. So you need to have full visibility over the pipeline.
You need better lineage, you need better observability. And the problem is today with the model of moving and copying your data into a third-party system, you lose all visibility and the ability to trace the dependencies along that pipeline.
One of the benefits of warehouse-native apps is that we rely on the existing tooling in the data warehouse ecosystem today, all the tools already available to data teams to actually manage that pipeline. So it's definitely a big problem, but we feel that warehouse-native apps are actually at a much greater advantage when it comes to solving some of these problems.
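To make the merge-tag failure mentioned above concrete, here is a minimal Python sketch of a pre-send check. It is an illustration only, not how Supergrain or any particular email tool works: the `{{ tag }}` syntax, the function names `render` and `unfilled_tags`, and the sample row are all assumptions made for this example.

```python
import re

# Assumed merge-tag syntax for this sketch: {{ tag_name }}
MERGE_TAG = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, row):
    """Fill merge tags from `row`; leave a tag untouched if its value is missing or None."""
    def substitute(match):
        value = row.get(match.group(1))
        return match.group(0) if value is None else str(value)

    return MERGE_TAG.sub(substitute, template)


def unfilled_tags(rendered):
    """Return the names of merge tags that are still present after rendering."""
    return MERGE_TAG.findall(rendered)


# Hypothetical customer row where the `plan` column is missing at the source.
message = render("Hi {{ first_name }}, your plan is {{plan}}.", {"first_name": "Ada"})
missing = unfilled_tags(message)
if missing:
    print(f"Not sending; unfilled merge tags: {missing}")
```

A check like this only catches the symptom at send time. The governance, lineage, and observability tooling described in the answer is what helps trace why the source column was empty in the first place.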
Q. Can you briefly explain the segmentation capabilities of a warehouse-native engagement tool like Supergrain? Are there any limitations here?
So, what Supergrain offers in our application is a no-code visual builder for marketing and growth teams to create customer segments. And so you can filter for various criteria, people who have done certain actions or have certain properties. And essentially what we do is we compile that into SQL under the hood and we execute that against the data warehouse.
What that means is we can also support all kinds of joins. We can support complex account hierarchies. We can support multiple entities, funnel criteria, and just a lot more complex sophisticated segments and business logic that you normally wouldn't be able to do in a traditional segment builder type of product that doesn't run on a data warehouse.
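The idea of compiling a visually built segment into SQL can be sketched in a few lines of Python. This is a hedged illustration, not Supergrain's actual implementation: the `PropertyFilter` and `DidEvent` types, the `users` and `events` tables, and their columns are hypothetical, and an in-memory SQLite database stands in for a cloud data warehouse.

```python
import sqlite3
from dataclasses import dataclass

# Hypothetical allow-lists; a real builder would read these from the warehouse schema.
USER_COLUMNS = {"plan", "country"}
OPERATORS = {"=", "!=", ">", ">=", "<", "<="}


@dataclass(frozen=True)
class PropertyFilter:
    """'Users who have a certain property', e.g. plan = 'pro'."""
    column: str
    op: str
    value: object


@dataclass(frozen=True)
class DidEvent:
    """'Users who have done a certain action', e.g. completed a checkout."""
    event_name: str


def compile_segment(filters):
    """Compile segment criteria into a parameterized SQL query and its parameters."""
    clauses, params = [], []
    for f in filters:
        if isinstance(f, PropertyFilter):
            # Column and operator are validated because they are interpolated into SQL.
            if f.column not in USER_COLUMNS or f.op not in OPERATORS:
                raise ValueError(f"unsupported filter: {f}")
            clauses.append(f"u.{f.column} {f.op} ?")
            params.append(f.value)
        elif isinstance(f, DidEvent):
            clauses.append(
                "EXISTS (SELECT 1 FROM events e "
                "WHERE e.user_id = u.user_id AND e.name = ?)"
            )
            params.append(f.event_name)
        else:
            raise TypeError(f"unknown criterion: {f!r}")
    where = " AND ".join(clauses) or "1 = 1"
    return f"SELECT u.user_id FROM users u WHERE {where} ORDER BY u.user_id", params


def demo():
    """Run a compiled segment against a tiny in-memory stand-in for the warehouse."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (user_id INTEGER PRIMARY KEY, plan TEXT, country TEXT);
        CREATE TABLE events (user_id INTEGER, name TEXT);
        INSERT INTO users VALUES (1, 'pro', 'US'), (2, 'free', 'US'), (3, 'pro', 'DE');
        INSERT INTO events VALUES (1, 'checkout'), (2, 'checkout'), (3, 'signup');
        """
    )
    sql, params = compile_segment([PropertyFilter("plan", "=", "pro"), DidEvent("checkout")])
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(demo())  # user IDs on the 'pro' plan who have a 'checkout' event
```

Because the output is ordinary SQL, the same approach extends to joins across multiple entities or account hierarchies: each new criterion type only needs to emit another clause or subquery against tables the data team already maintains.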
Q. Do the visual segmentation capabilities replace the need to build data models in SQL?
It doesn't. As we talked about earlier, you still need the data models and the data teams who are managing those data models to define things like key business metric definitions, key dimensions, and key business entities, like what a user is. That's something that is very business-specific that Supergrain isn't going to know out of the box. And so, essentially what we do is expose a UI on top of the work that the data team's already doing to enable the marketing and growth teams to self-serve with our visual builder.
Q. Do warehouse-native apps make reverse ETL workflows redundant?
Our view is no, and the reason I say that is there will be plenty of applications that will be built on top of the data warehouse as warehouse-native apps, but there will also be plenty of applications that aren't, and that still need data from the warehouse.
We see this a lot, where there are a number of systems and tools that don't really need the benefits that we talked about, but still need some data from the warehouse. Reverse ETL tools are a really, really good complement in that scenario.
So we see a future where a lot of the core business applications like customer engagement, which is the category that Supergrain is building in, will have warehouse-native apps, but there's a very, very long tail of applications that still serve important business use cases where reverse ETL tools fulfill a very, very important need.
Q. Last question — what's the one piece of advice you have for companies that are looking to get started with a warehouse-native app?
I think the biggest piece of advice is actually less from a technology standpoint and more from a mindset perspective!
Obviously, you need a data warehouse. You wanna make sure that you check that box, but the other one is that as a company you should have made a concerted investment in the data warehouse as the single source of truth for your data. And you wanna turn that into a data platform. And as you've made those investments, whether it's in data or the teams, and kind of aligned the downstream customers of data to that mindset, then it becomes a lot easier to adopt a warehouse-native application like Supergrain.