Voltron Data Logo
About

Learn more about our company

Contact Us

Get in touch with our team

Theseus

  • How it Works

    Core concepts and architecture overview

  • Control Plane

    Kubernetes deployment guide and best practices

  • Query Profiler

    Analyze and optimize query performance

Arrow
Loading...

In-memory columnar data processing

Ibis
Loading...

Python dataframe API for multiple backends

RAPIDS
Loading...

GPU-accelerated data science and analytics

Dev Blog

Latest updates and technical insights

Benchmarks Report

Read about our 2024 benchmarks for our data engine, Theseus.

The Composable Codex

A 5-part guide to understanding composable

Try Theseus

Product

  • How it Works
  • Control Plane
  • Query Profiler

Resources

  • Blog
  • Composable Codex
  • Benchmarks

Getting Started

  • Test Drive

Theseus

Built for AI workloads, Theseus is a high-performance SQL engine with GPU acceleration.

© 2025 Theseus. All rights reserved.

Terms of ServicePrivacy PolicyCookie Policy

When Scale Matters, Don't Wait on pandas...

Kae Suarez

April 19, 2023

pandas: The Python Dataframe Library

pandas has been a critical part of a Pythonic life with data for a long time. With a popular API that truly sets a Python data-processing application apart from, say, C++, it’s understandable that pandas has been, and will remain, a good friend to many.

However, at scale, data outgrows pandas. Certainly, there are solutions that claim to scale pandas, but some of these are closed-source and carry a risk of vendor lock-in. Open source software is a key component of a modern stack, and this is why we see Ibis as the premier dataframe API option for scale.

It’s tempting to stay with your current toolset and hold off on adopting new technology. After all, your current tools will get there eventually, and they have carried many programmers and data scientists far. But the world of Big Data moves quickly, and there may be a significant opportunity cost in waiting for tools to catch up to modern needs.

Don’t Wait on pandas…

…to Decouple API from Compute.

When you use most tools, you get their compute options. Many mimic the pandas dataframe API but implement their own specific compute engine, which leaves you locked in wherever you go. Ibis is the only truly portable Python dataframe interface: it never ties you down, and it lets you move between several popular engines (ever heard of Trino?).
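
As a rough illustration of what decoupling the API from compute means, the sketch below describes one aggregation once and hands it to two interchangeable engines. All names here (SumBy, PythonBackend, SQLiteBackend) are hypothetical and built only on the Python standard library. This is a toy under those assumptions, not Ibis's API and not how Ibis is implemented.

```python
import sqlite3
from dataclasses import dataclass


# Hypothetical names throughout: a toy illustration, not Ibis's API.
@dataclass(frozen=True)
class SumBy:
    """Backend-neutral description of: total `value` per `key` in `table`."""
    table: str
    key: str
    value: str


class PythonBackend:
    """Runs the expression in-process over lists of dict rows."""

    def __init__(self, tables):
        self.tables = tables

    def execute(self, expr):
        totals = {}
        for row in self.tables[expr.table]:
            group = row[expr.key]
            totals[group] = totals.get(group, 0) + row[expr.value]
        return sorted(totals.items())


class SQLiteBackend:
    """Compiles the same expression to SQL and lets SQLite do the work."""

    def __init__(self, connection):
        self.connection = connection

    def compile(self, expr):
        return (
            f'SELECT "{expr.key}", SUM("{expr.value}") '
            f'FROM "{expr.table}" GROUP BY "{expr.key}" ORDER BY "{expr.key}"'
        )

    def execute(self, expr):
        return self.connection.execute(self.compile(expr)).fetchall()


# Made-up sample data, loaded into both engines.
rows = [
    {"city": "Austin", "sales": 3},
    {"city": "Boston", "sales": 5},
    {"city": "Austin", "sales": 4},
]
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (city TEXT, sales INTEGER)")
con.executemany("INSERT INTO orders VALUES (:city, :sales)", rows)

# The expression is written once; only the backend object changes.
expr = SumBy(table="orders", key="city", value="sales")
local_result = PythonBackend({"orders": rows}).execute(expr)
sqlite_result = SQLiteBackend(con).execute(expr)
print(local_result)
print(sqlite_result)
```

The point of the design is that the user-facing description (here, SumBy) carries no engine-specific detail, so swapping the engine does not require rewriting the analysis.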

…to be the Fastest Engine.

The classics are improving with time and becoming better at handling larger data, especially with efforts to rely more heavily on Apache Arrow. But they aren’t there yet. Ibis already targets fast columnar engines, such as Snowflake. It even supports pandas as a backend.

…to Scale to Many Nodes.

pandas is a one-node tool, and very good at that! However, it’s specialized for local use, not for scaling. The efforts that try to make it scale, even Dask, are limited by design decisions in the pandas API. There are already columnar engines that scale right now (such as, again, Snowflake), and Ibis provides access to them!

Grow Right Now with Ibis

Ibis is ready today, which means you don’t need to wait to take advantage of the modern features you need. Whether your data is gigabyte-big, terabyte-big, or petabyte-big, scaling out across nodes and up with data size isn’t the future; it’s the present. Ibis is flexible, and it is the front door to this field of fast, scalable compute, without giving up on intuitive dataframe APIs.

All of this is achieved with open source, and behind Ibis is an active team that is driven by innovation and pushing new features forward. The progress happening daily on the Ibis project is evident and shows how powerful open source is when it comes to transparency, innovation, and collaboration.

To be part of Ibis’s growth, check out the contribution guide. If you want to explore what Ibis can do for you, check out the Ibis page on our website or explore our Enterprise Support options.

Photo by Sam Poullain
