Mar 17, 2025

NVIDIA’s Spark-RAPIDS Hits The Wall — Constrained by Spark’s CPU Architecture

NEW Theseus Benchmarks 10X Spark-RAPIDS

For years, NVIDIA has pushed Spark-RAPIDS as the future of GPU-accelerated analytics. But after extensive benchmarking, we’ve found that Spark-RAPIDS remains tied to the same CPU constraints that hold back Spark, Snowflake, and Databricks. The reason is simple — these engines were designed for CPUs. No matter how much engineering effort is poured into accelerating Spark with GPUs through Spark-RAPIDS, it will always be bottlenecked by Spark’s CPU-first architecture.

TPC-H benchmarks on real-world infra

We ran “real-world” benchmark comparisons using our local NVIDIA DGX A100 cluster—widely available hardware representative of the infrastructure enterprises have deployed today. Our tests measure cost/performance, and the results speak for themselves: Theseus delivers over 10X better performance than Spark-RAPIDS on the same hardware and data. This is what happens when you build a SQL engine for GPUs from the ground up.

This chart compares the cluster cost per hour (x-axis) with the total query runtime (y-axis) across all 22 TPC-H queries. Lower points on the chart represent faster runtimes, and costs increase as you move from left to right. Theseus achieves a 10X improvement in cost/performance over Spark-RAPIDS. Most impressively, Theseus completes the 100TB TPC-H benchmark in less time and for less money than Spark-RAPIDS takes to complete the 30TB TPC-H benchmark.
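
For readers who want the math behind a chart like this: the total cost of a run is just the cluster’s hourly price multiplied by how long the full suite takes. The sketch below uses made-up hourly prices and runtimes purely to show how a cost/performance multiple falls out of those two numbers; the figures are placeholders, not our measured data.

```python
# Minimal sketch of the cost/performance math behind this kind of chart.
# All dollar figures and runtimes below are placeholders, not measured results.

def run_cost(cluster_cost_per_hour: float, runtime_seconds: float) -> float:
    """Total cost of completing the full query suite on a given cluster."""
    return cluster_cost_per_hour * (runtime_seconds / 3600.0)

# Hypothetical example: two engines on clusters with different hourly prices.
engine_a = {"cost_per_hour": 120.0, "runtime_s": 900.0}     # placeholder values
engine_b = {"cost_per_hour": 100.0, "runtime_s": 10_800.0}  # placeholder values

cost_a = run_cost(engine_a["cost_per_hour"], engine_a["runtime_s"])
cost_b = run_cost(engine_b["cost_per_hour"], engine_b["runtime_s"])

print(f"engine A total cost: ${cost_a:.2f}")   # $30.00 in this made-up example
print(f"engine B total cost: ${cost_b:.2f}")   # $300.00 in this made-up example
print(f"cost/performance advantage: {cost_b / cost_a:.1f}X")
```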

This is what happens when you accelerate legacy systems instead of building from the ground up for accelerated hardware. No amount of shoehorning GPUs into Spark can overcome its fundamental constraints. While NVIDIA’s Blackwell GPUs promise to revolutionize AI workloads, they won’t rewrite legacy CPU-based code to jump this wall. If NVIDIA wants true GPU-native data processing, it may need to rethink its entire approach—or just use Theseus and see what GPU acceleration looks like.

NEW TPC-DS benchmarks on the same infra

TPC-H and TPC-DS serve different purposes in benchmarking data analytics platforms. TPC-H is a strong measure of raw compute performance, where SQL optimizers have limited impact on query runtime across the benchmark as a whole. TPC-DS, on the other hand, more aggressively stresses the query optimizer and an engine’s ability to simplify a complex query plan for better performance. The Theseus SQL optimizer, while making great strides over the last few months thanks to hard work from our optimizer team, is still in its infancy compared to more mature optimizers like Spark’s. Even so, Theseus’s raw performance as a GPU-native engine lets it outperform the better-optimized query plans of Spark and Spark-RAPIDS.

This is our first TPC-DS benchmark, and while we expect even better results over time, the initial performance is already impressive. Using the same cost-performance comparison as TPC-H, Theseus maintains more than a 6X advantage over Spark-RAPIDS, further reinforcing the benefits of GPU-native query processing.

Benefits of an accelerator-native engine

Every new generation of GPU hardware delivers massive gains for software built to leverage accelerated infrastructure. There’s no doubt that NVIDIA’s Blackwell GPUs will be a game-changer for large-scale data analytics. With 10 TB/s chip-to-chip interconnects and a next-gen decompression engine backed by 900 GB/s of bidirectional NVLink bandwidth, Blackwell is designed to move massive datasets with unprecedented efficiency. At petabyte scale, these hardware advancements are critical, enabling seamless integration of accelerated storage, networking, and compute into fully optimized, high-speed data pipelines.

But here’s the problem: hardware alone isn’t enough. While Spark-RAPIDS will see some gains from Blackwell’s improvements, it remains bottlenecked by Spark’s CPU-first legacy architecture, which prevents it from fully utilizing GPU memory, interconnects, and parallelism in the way Blackwell was engineered to support. No amount of GPU retrofitting will unlock that potential.

That’s where Theseus has the edge. Unlike Spark-RAPIDS, which retrofits Spark to work with GPUs, Theseus was built from the ground up as an accelerator-native engine. We didn’t merely adopt the CUDA-Verse — we were born in it, molded by it. Every component in Theseus is built to fully exploit NVIDIA’s hardware and software ecosystem: NVLink, UCX, GPUDirect Storage, NVComp, libcudf (co-designed by our founders), and beyond.
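
To make “GPU-native” a little more concrete, here is a minimal sketch using cuDF, the Python layer over libcudf (one of the building blocks named above). It is only an illustration of columnar work executing on the GPU, not Theseus code; the file path is a placeholder, and it assumes a CUDA-capable GPU with the cudf package installed.

```python
# Minimal sketch using cuDF, the Python layer over libcudf.
# Illustration only; not Theseus code. Assumes a CUDA-capable GPU and the
# `cudf` package; the file path is a placeholder.
import cudf

# Read a Parquet file directly into GPU memory.
orders = cudf.read_parquet("orders.parquet")

# A simple group-by aggregation, executed entirely on the GPU by libcudf kernels.
revenue_by_status = (
    orders.groupby("order_status")
          .agg({"total_price": "sum"})
          .sort_values("total_price", ascending=False)
)

print(revenue_by_status.head())
```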

Theseus is ready for Blackwell’s new capabilities, from 900 GB/s interconnect speeds to its decompression engine, HBM bandwidth, and NVLink Switch. When GPUs are at the center of software design, optimizations happen at the lowest levels of the stack, where they matter most. Theseus isn’t just another SQL engine that happens to run on GPUs. It’s a SQL engine purpose-built to extract every ounce of performance from them.

The Cherry Pick

If you read our last benchmarking report, you’ll know we’re not fans of using a simple column chart to compare two systems (see our methodologies). That format invites cherry-picking results that flatter the publishing vendor. Unfortunately, everyone does it, so until the industry adopts what we think is a better way to benchmark (the SPACE chart), we figured we’d show how good we look when we cherry-pick, too.

Theseus outperforms Spark-RAPIDS by over 80X. It’s starting to feel like a mic-drop moment.

Figure 3. Performance Comparison of Voltron Data Theseus and Spark-RAPIDS running TPC-DS Query 48 at SF100000 using 6 nodes of an NVIDIA DGX A100 Cluster

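To make the cherry-picking point concrete, here is a small sketch with entirely made-up runtimes (not our measured results) showing how the single best query can tell a very different story than the suite as a whole:

```python
# Why single-query column charts can mislead.
# All runtimes below are made-up placeholder values, not measured results.

runtimes_fast = {"q1": 2.0, "q2": 5.0, "q3": 1.5, "q48": 0.5}      # seconds, hypothetical
runtimes_slow = {"q1": 20.0, "q2": 30.0, "q3": 12.0, "q48": 40.0}  # seconds, hypothetical

# Cherry-picked view: report only the query with the largest speedup.
best = max(runtimes_fast, key=lambda q: runtimes_slow[q] / runtimes_fast[q])
print(f"best single-query speedup ({best}): "
      f"{runtimes_slow[best] / runtimes_fast[best]:.0f}X")

# Whole-suite view: compare total time to finish every query.
print(f"full-suite speedup: "
      f"{sum(runtimes_slow.values()) / sum(runtimes_fast.values()):.1f}X")
```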

Today, we unveiled new Theseus benchmark results for both TPC-H and TPC-DS, reaffirming its 10X cost-performance advantage over Spark-RAPIDS. But we’re just getting started. We’re now testing Theseus on NVIDIA’s latest Blackwell B200 GPUs, and our engineers are eager to optimize for the GH200 Grace Hopper and GB200 Grace Blackwell Superchips to push performance even further.

We’re also rolling out key updates to expand Theseus’ capabilities:

  • NEW Multi-Engine Support – We’re adding DuckDB support as a CPU fallback via our composable data standards, with plans to evaluate other engines for smaller dataset price-performance. This broadens Theseus’ applicability and will provide an easier onramp for customers.

  • NEW Multi-Silicon Support – Customers are asking for more silicon options, so we’re bringing Theseus to ARM, Grace Hopper, Grace Blackwell, and MI300X architectures.

  • NEW Spark-Connect Integration – Our engineers are working to enable Spark-Connect support, allowing customers to run legacy Spark code seamlessly while dramatically reducing onboarding friction and transition costs (see the sketch after this list).
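
As a rough sketch of what that onboarding path could look like once Spark-Connect support lands: existing PySpark code keeps its DataFrame logic and simply points its session at a remote Spark Connect endpoint. The endpoint address and data path below are placeholders, and the server side is still in development, so treat this as an illustration of the standard Spark Connect client pattern (PySpark 3.4+), not a shipped feature.

```python
# Rough sketch of the Spark Connect client pattern (PySpark 3.4+).
# The endpoint and data path are placeholders; Theseus-side Spark-Connect
# support is still in development, so this shows only the client API.
from pyspark.sql import SparkSession

# Point an existing PySpark application at a remote Spark Connect endpoint.
spark = (
    SparkSession.builder
    .remote("sc://example-endpoint:15002")  # placeholder address
    .getOrCreate()
)

# Unmodified Spark DataFrame code runs against whatever engine serves the endpoint.
df = spark.read.parquet("s3://example-bucket/lineitem/")  # placeholder path
df.groupBy("l_returnflag").count().show()
```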

Wrap Up

The future of accelerated SQL analytics is here. Stay tuned as we continue to push the boundaries of performance, efficiency, and cost savings.

For more details on our benchmarking process, see our Theseus Benchmarking Report, where we break down the results, methodologies, FAQs, and system specifications.

Remember, it’s all about how fast your queries run and what it costs to run them. Theseus delivers. Join our mission to slash the cost of analytics and give Theseus a try.