Mar 21, 2023

New Release: Ibis 5.0 Has Landed

Patrick Clarke

[Image: a single firework in the sky, in red and gold]

For those of you who missed our previous post, Ibis 5.0 brings a host of new features and UX updates. This release strengthens Ibis’s position as the DataFrame API for Python, helping you go from code to dev to prod with ease.

In this post, we’re highlighting five key features of the Ibis 5.0 release:

  • A new Apache Druid backend
  • Table-wide operations (selectors)
  • pivot_longer, the un-pivot function
  • New convenience modules for interactive analysis
  • Support for writing CSVs and Parquet files

Together, these updates make data exploration with Ibis easier.

Ibis Connects to Apache Druid Backend

On February 24, 2023, a user requested Apache Druid support. Voltron Data’s Phillip Cloud, with the help of Gil Forsyth and Krisztián Szűcs, created and merged the Druid backend within six hours. Ibis users can now write Ibis expressions against their Druid databases after installing the Druid backend through pip (pip install ibis-framework[druid]) or conda (conda install ibis-druid).

Ibis Column Selectors Make Table-Wide Operations Convenient

Column selectors (“selectors”) are now available, making table-wide operations easier to write. A selector is a convenience function that picks every column sharing some property, such as a data type or a name pattern. You can then apply a function to all of the matching columns at once instead of listing them by hand. Read more about how selectors can simplify your workflows on the Ibis Project blog.

Ibis Makes Interactive Analysis Easier

The selectors blog post also highlights two new UX features in 5.0: the ibis.interactive and ibis.examples (ex) modules. The interactive module (ibis.interactive) is meant to be star-imported (from ibis.interactive import *). It sets up defaults for data exploration, replacing boilerplate imports and option setting.

The interactive module also imports the examples module (ibis.examples, available as ex), which is a repository of example data. Use it to download a variety of tabular datasets and get started with Ibis immediately:

>>> from ibis.interactive import *
>>> cabbages = ex.cabbages.fetch()
>>> cabbages
┏━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━┓
 cult    date    head_wt  vit_c 
┡━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━┩
 string  string  float64  int64 
├────────┼────────┼─────────┼───────┤
 c39     d16         2.5     51 
 c39     d16         2.2     55 
 c39     d16         3.1     45 
 c39     d16         4.3     42 
 c39     d16         2.5     53 
 c39     d16         4.3     50 
 c39     d16         3.8     50 
 c39     d16         4.3     52 
 c39     d16         1.7     56 
 c39     d16         3.1     49 
                            
└────────┴────────┴─────────┴───────┘

Ibis Feature Engineering by Melting

Ibis 5.0 ships with pivot_longer, the equivalent of pandas’ melt function: it un-pivots data by turning column names into values. This makes reshaping wide tables for feature engineering much easier out of the box.

Ibis Expressions for CSV and Parquet Data File Movement

Users can now also write expression results directly to Parquet or CSV files with the to_parquet and to_csv methods, making Ibis a more robust API for moving data around.

Along with these changes, more features, quality-of-life updates, and bug fixes have shipped with Ibis v5.0. Download or update Ibis and dive into your data today.

If you want to dive deeper and access more resources, check out our Ibis page. If you’re working with the project and want to accelerate your success, explore our Enterprise Support options.

Photo by Isaac Wolf