May 15, 2023

How Google Uses Ibis for its Data Validation Tool (DVT)

Kae Suarez

Ibis is a fantastic tool for portable data analytics, and we are proud to support it. Today we’ll explore how the community is using Ibis and extracting value from it.

In 2021, Google launched the Data Validation Tool (DVT). At a company of Google’s scale, large quantities of data frequently move between locations and technologies, so verifying that each move succeeded became necessary to maintain agility. For Google in particular, this work helps customers move to Google Cloud from elsewhere, a process known as data ingress. To handle cross-backend validation at this scale, DVT employs Ibis, which can connect to multiple databases simultaneously and compare data across sources.
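To make the idea of cross-source validation concrete, here is a minimal sketch of the pattern: compute the same aggregates on a source and a target, then compare them. This is not DVT's code and it does not use the Ibis API. It uses Python's standard-library sqlite3 module with two in-memory databases standing in for two different systems, and the `orders` table, its columns, and the helper names (`make_db`, `summarize`, `validate`) are all hypothetical. In a real setup, a library like Ibis would replace the hand-written SQL with one backend-agnostic expression that runs against each connection.

```python
import sqlite3


def make_db(rows):
    """Create an in-memory SQLite database with a hypothetical `orders` table."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
    con.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return con


def summarize(con, table):
    """Compute simple aggregates used as validation metrics."""
    # The table name is interpolated directly, which is acceptable for a
    # hard-coded illustrative name but not for untrusted input.
    count, total = con.execute(
        f"SELECT COUNT(*), COALESCE(SUM(amount), 0) FROM {table}"
    ).fetchone()
    return {"row_count": count, "amount_sum": total}


def validate(source, target, table):
    """Return {metric: (source_value, target_value, values_match)}."""
    src = summarize(source, table)
    tgt = summarize(target, table)
    return {name: (src[name], tgt[name], src[name] == tgt[name]) for name in src}


# Stand-ins for two systems; the target is missing one row on purpose.
source = make_db([(1, 10), (2, 20), (3, 30)])
target = make_db([(1, 10), (2, 20)])

report = validate(source, target, "orders")
for name, (src_value, tgt_value, ok) in report.items():
    status = "PASS" if ok else "FAIL"
    print(f"{name}: source={src_value} target={tgt_value} {status}")
```

Both metrics fail in this run because the target is missing a row. Aggregate comparisons like these are cheap to compute on each side, so only a few summary numbers need to cross between systems rather than the full tables.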

Here’s how Google explained it when they launched DVT:

Today, we are excited to announce the Data Validation Tool (DVT), an open-sourced Python CLI tool that provides an automated and repeatable solution for validation across different environments. The tool uses the Ibis framework to connect to a large number of data sources including BigQuery, Cloud Spanner, Cloud SQL, Teradata, and more.

Because Ibis offers this interface in open source, DVT was straightforward to implement. Development is ongoing today, and the project is available on GitHub.

It’s exciting to see how Ibis is used “in the real world,” and we look forward to seeing and sharing more!

Photo by William Daigneault