Theseus runs on AMD GPUs

Theseus targets hipDF to run on AMD hardware

For more than a decade, “accelerated analytics” was shorthand for one vendor, NVIDIA, and one software stack, CUDA. The CHIPS Act and similar diversification efforts promise alternative silicon, but those goals are unrealistic without a software ecosystem equivalent to CUDA. Today, we deliver a concrete step toward that ecosystem: Theseus now runs on both CUDA and ROCm.

CUDA powers roughly ninety-eight percent of datacenter GPUs. NVIDIA’s CUDA DataFrame library (cuDF) anchors an ever-growing catalogue of data analytics libraries and has become the default standard for running large-scale data pipelines on GPUs. Built on Apache Arrow and using RAPIDS cuDF to target NVIDIA GPUs, Theseus has delivered benchmark results that exceed NVIDIA’s Spark RAPIDS performance by 10x, further widening the gap over software developed CPU-first. As part of AMD’s ROCm-DS Early Access Program, our engineers integrated Theseus’s execution layer with AMD’s emerging hipDF DataFrame library. With Theseus on ROCm, customers now have a second, production-ready road to petabyte-scale SQL.

Why AMD, why now?

The hardware benefit is clear. On the software side, AMD’s strategy is to fast-follow CUDA development, quite literally. AMD’s response to CUDA is the Heterogeneous-compute Interface for Portability, or HIP. HIP is a source-level compatibility layer that maps CUDA runtime calls onto AMD’s ROCm driver stack, allowing the same codebase to be compiled for AMD infrastructure. AMD’s latest HIP project, hipDF (docs, GitHub), is a cuDF-compatible ROCm DataFrame library. It targets enterprise data analytics use cases (think jobs typically run on Spark, BigQuery, Databricks, or Snowflake) and accelerates them on the AMD stack. hipDF is currently in alpha, with a general availability (GA) release planned for later in 2025.
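To make “source-level compatibility” concrete, here is a toy sketch of the renaming idea in Python. It is not AMD’s tooling and not how Theseus was ported. The function name `toy_hipify` and the sample snippet are made up for illustration. The handful of CUDA-to-HIP name pairs in the table are real runtime API names, but the real HIP tooling covers far more of the API and handles many cases a plain text substitution cannot.

```python
import re

# A small, illustrative subset of CUDA runtime names and their HIP counterparts.
# The real HIP tooling maps far more of the API than this.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaFree": "hipFree",
}

# Try longer names first so cudaMemcpyHostToDevice is not cut short at cudaMemcpy.
_PATTERN = re.compile(
    "|".join(re.escape(name) for name in sorted(CUDA_TO_HIP, key=len, reverse=True))
)


def toy_hipify(source: str) -> str:
    """Rename known CUDA identifiers to HIP ones (hypothetical helper, illustration only)."""
    return _PATTERN.sub(lambda match: CUDA_TO_HIP[match.group(0)], source)


# A made-up CUDA fragment, held as text; it is never compiled here.
cuda_snippet = """#include <cuda_runtime.h>
cudaMalloc(&device_buf, num_bytes);
cudaMemcpy(device_buf, host_buf, num_bytes, cudaMemcpyHostToDevice);
cudaDeviceSynchronize();
cudaFree(device_buf);
"""

hip_snippet = toy_hipify(cuda_snippet)
print(hip_snippet)
```

The point of the sketch is only that the two runtime APIs line up closely enough for one codebase to target both stacks. Anything beyond simple renaming, such as vendor-specific intrinsics or library dependencies, needs real porting work.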

This is no easy undertaking, but we’ve done it before. Back in late 2017, the RAPIDS project was conceived (and architected) at the Berkeley RISELab, with contributing members in attendance from BlazingSQL, the Apache Arrow community, Anaconda, and NVIDIA, among others. These pioneers went on to form the RAPIDS ecosystem and ultimately founded Voltron Data. At the time, BlazingSQL was essentially a RAPIDS company: NVIDIA contracted it to contribute low-level code to the project, and it also built a distributed SQL framework on NVIDIA GPUs that battle-tested and hardened the RAPIDS stack for commercial use cases. That same BlazingSQL team is driving Theseus engineering today. Theseus is an evolution of BlazingSQL, and that expertise is what Voltron Data brings to the hipDF collaboration.

libcuDF was a catalyst for NVIDIA software development

NVIDIA uses the RAPIDS cuDF technology stack broadly across its product suites (e.g., NVIDIA Morpheus for cybersecurity, NVIDIA Clara for healthcare, and NVIDIA NeMo for generative AI). This doesn’t stop at the product level. NVIDIA also offers a suite of software libraries that depend on cuDF for data processing: cuML for machine learning, cuGraph for graph analytics, cuVS for vector search, and cuCIM for image processing. These libraries use libcuDF primitives to minimize overhead and run jobs faster on GPU hardware. Lastly, because hipDF, like cuDF, is built on Arrow, integration with popular CPU-based data processing engines like pandas, Polars, and DuckDB is also within reach. This broadens the user base and eases onboarding to AMD GPUs.

Theseus Compute Mesh

Voltron Data used this same approach when developing Theseus. We define it as a “compute mesh” that unifies hardware, languages, and applications. Each of our design decisions was made deliberately to support this vision:

Accelerator-Native: Distributed query engine built from the ground up to take advantage of full system hardware acceleration.
Composable: Built on open source standards that enable full-stack interoperability.
Evolutionary: A composable engine that seamlessly evolves with new hardware and languages.

Who wins?

AMD wins. hipDF is a catalyst opening new market opportunities, user communities, and ecosystems.
Federal agencies and national labs shed single-supplier risk and curb procurement costs while complying with open-standards mandates.
Cloud providers can roll out MI300 instances with turnkey SQL acceleration, expanding beyond AI training and inference.
Enterprises in finance, retail, telecom, and healthcare run petabyte-scale BI and risk analysis on AMD hardware tomorrow, using the same SQL they wrote yesterday.
Hyperscalers gain a heterogeneous execution model ready to run on any future accelerator.
The open-source community receives a fully permissive, ROCm-native alternative to RAPIDS cuDF, igniting fresh innovation.

What’s Next?

We used AMD’s HIP framework to integrate Theseus with ROCm and connect it to the early-access version of hipDF. Theseus now runs on AMD GPUs, which demonstrates how mature HIP has become in supporting projects on both CUDA and ROCm. After hipDF reaches GA, we will update to the latest Theseus release and run the TPC-H and TPC-DS benchmarks at scale factor 10,000 (SF10000) on AMD MI300 hardware.

Wrap up

Theseus is built from the ground up for accelerated systems. Its secret sauce is the ability to spill out of GPU memory, shuffle over accelerated networks, and keep GPU pipelines fully saturated throughout query execution. Theseus brings these capabilities to ROCm-DS, giving AMD an immediate, production-grade answer to RAPIDS cuDF and unlocking AMD GPUs for analytics.

Voltron Data is excited to participate in AMD’s hipDF early access program and share more Theseus benchmarks on more AMD hardware. If you’re interested in learning more about AMD’s hipDF project, check out their blog, Introducing ROCm-DS: GPU-Accelerated Data Science for AMD Instinct™ GPUs.