The Data Thread Spotlight: Simplify Your Analytical Workflows with Ibis

Patrick Clarke · Jun 15, 2022

Next week at The Data Thread conference, we’re highlighting a number of open source software libraries connected with the Apache Arrow ecosystem. One library of particular significance is Ibis, because it will soon be able to integrate with Substrait, a cross-language specification for describing compute operations that works with compute engines to make data processing more efficient.

If you’re unfamiliar with Ibis, it’s a Python library that lets users query data using engine-agnostic Python expressions. Adding Substrait support will expand this agnosticism: if an engine supports Substrait, then Ibis should theoretically support that engine.

As part of the programming for this inaugural conference, a series of short tutorials will be released to introduce Ibis, outline how the tool can be used to improve your data workflows, and spotlight some practical use cases.

What is Ibis? + Simple Demo with Patrick Clarke

Patrick will talk about Ibis and walk through a simple example of de-stringifying your queries to make parameterization simpler. Additionally, there will be a brief intro to f-string SQL versus Ibis expressions.

Querying Sales Data with Ibis with Patrick Clarke

Patrick provides a quick overview of a small project he worked on to familiarize himself with Ibis. Using mock sales data, he constructs a sales data query from scratch using f-strings and Ibis expressions, and then discusses how Ibis can make parameterizing queries much easier.



How to Get Rid of Stringly-Typed Analytics with Phillip Cloud

Phillip introduces Ibis and walks through some basics to discuss how the tool can be used to avoid some of the pitfalls of using formatted strings to query our data.
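
To make the formatted-string pitfall concrete, here is a minimal sketch using only Python's built-in sqlite3 module. It is not taken from any of the tutorials and it does not use Ibis itself; the mock sales table, its values, and the function names are all hypothetical, invented for illustration.

```python
import sqlite3

# Hypothetical mock sales table, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 100.0), ("West", 250.0), ("O'Brien County", 75.0)],
)


def total_fstring(region):
    """Build the query by string formatting: the 'stringly-typed' approach."""
    query = f"SELECT SUM(amount) FROM sales WHERE region = '{region}'"
    return conn.execute(query).fetchone()[0]


def total_bound(region):
    """Pass the value as a bound parameter instead of pasting it into the SQL."""
    query = "SELECT SUM(amount) FROM sales WHERE region = ?"
    return conn.execute(query, (region,)).fetchone()[0]


# Works for a simple value.
east_total = total_fstring("East")

# An apostrophe in the value ends the string literal early and breaks the SQL.
try:
    total_fstring("O'Brien County")
    fstring_failed = False
except sqlite3.OperationalError:
    fstring_failed = True

# The same value is handled correctly when it never touches the query text.
obrien_total = total_bound("O'Brien County")
```

The f-string version fails as soon as a value contains a quote character, because the value becomes part of the query text. Building queries from Python expressions, as the Ibis tutorials describe, avoids this class of problem by keeping parameters as ordinary Python values rather than fragments of a string.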



Marlene will be using Ibis and data from DataSF’s Civic Art Collection to explore the San Francisco art scene. Ibis provides a more Pythonic way to interact with data. She will be looking at how to carry out joins, filters, and other operations with Ibis’ familiar pandas-like syntax. The backend used is MySQL, and she will briefly share which other backends Ibis supports. Learn more about the most popular civic art in San Francisco while also discovering more about this helpful library.

You can read more about Marlene’s work with Ibis on her blog.



Ibis as an Interface for Analytics with Hussain Sultan

Hussain will demonstrate how Ibis can improve the experience of analytical Python developers working with their data by walking us through a demo in which Ibis interacts with DuckDB through Substrait.



How to Access this Content

These resources are coming soon, and we hope you will be just as eager to dive into Ibis as we were. For early access to the videos, register to attend The Data Thread.

Ibis will be updated to version 3.1.0 later this summer. To learn more about the Ibis project, visit ibis-project.org.