Voltron Data
For more than a decade, “accelerated analytics” was shorthand for one vendor, NVIDIA, and one software stack, CUDA. The CHIPS Act and similar diversification efforts promise alternative silicon, but those goals are unrealistic without a software ecosystem equivalent to CUDA. Today, we deliver a concrete step toward that ecosystem: Theseus now runs on both CUDA and ROCm.
CUDA powers roughly ninety-eight percent of datacenter GPUs. NVIDIA’s CUDA DataFrame library (cuDF) anchors an ever-growing catalogue of data analytics libraries and has become the de facto standard for running large-scale data pipelines on GPUs. Built on Apache Arrow and using RAPIDS cuDF to target NVIDIA GPUs, Theseus delivered benchmark results that exceed NVIDIA’s Spark RAPIDS performance by 10X, further widening the gap with software developed CPU-first. As part of AMD’s ROCm-DS Early Access Program, our engineers integrated Theseus’s execution layer with AMD’s emerging hipDF DataFrame library. With Theseus on ROCm, customers now have a second, production-ready road to petabyte-scale SQL.
While the hardware benefit is clear, AMD’s software strategy attempts to fast-follow CUDA development, quite literally. AMD’s response to CUDA is the Heterogeneous-compute Interface for Portability, or HIP. HIP is a source‑level compatibility layer that translates CUDA runtime calls to AMD’s ROCm driver stack, allowing the same codebase to be compiled for AMD infrastructure. AMD’s latest HIP project, hipDF (docs, github), targets enterprise data analytics use cases (think jobs typically run on Spark, BigQuery, Databricks, or Snowflake) and accelerates them on the AMD stack. hipDF is a cuDF-compatible ROCm dataframe library. It remains in an alpha release, with a general availability (GA) release planned for later in 2025.
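To make the idea of source-level compatibility concrete, here is a minimal sketch of the kind of renaming involved. It is a toy and not AMD's tooling (AMD ships its own hipify tools for this job): the function name `toy_hipify` and the sample snippet are hypothetical, and the table covers only a handful of CUDA runtime names whose HIP counterparts are well known.

```python
import re

# A small, hand-picked subset of CUDA-to-HIP renames. The table is
# illustrative only; real porting tools cover far more of the API.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaFree": "hipFree",
}

# Word boundaries ensure only whole identifiers from the table are renamed,
# so names outside the table (e.g. cudaMemcpyAsync) are left untouched.
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(name) for name in CUDA_TO_HIP) + r")\b"
)


def toy_hipify(source: str) -> str:
    """Rename the known CUDA runtime identifiers in `source` to HIP names."""
    return _PATTERN.sub(lambda match: CUDA_TO_HIP[match.group(0)], source)


# Hypothetical CUDA host-side snippet, used only as sample input text.
cuda_src = """#include <cuda_runtime.h>
cudaMalloc(&d_x, n * sizeof(float));
cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);
cudaDeviceSynchronize();
cudaFree(d_x);
"""

hip_src = toy_hipify(cuda_src)
print(hip_src)
```

Because the HIP names mirror the CUDA ones so closely, most of a port is mechanical renaming like this, which is what lets a single codebase target both stacks.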
This is no easy undertaking, but we have done it before. Back in late 2017, the RAPIDS project was conceived (and architected) at the Berkeley RISELab, with contributing members in attendance from BlazingSQL, the Apache Arrow community, Anaconda, and NVIDIA, among others. These pioneers went on to form the RAPIDS ecosystem and ultimately founded Voltron Data. At the time, BlazingSQL was essentially a RAPIDS company: it was contracted by NVIDIA to contribute low-level code to the project, and it also created a distributed SQL framework on NVIDIA GPUs that battle-tested and hardened the RAPIDS stack for commercial use cases. That same BlazingSQL team is driving Theseus engineering today. Theseus is an evolution of BlazingSQL, and that expertise is what Voltron Data brings to the hipDF collaboration.
NVIDIA uses the RAPIDS cuDF technology stack broadly throughout its product suite (e.g., NVIDIA Morpheus for cybersecurity, NVIDIA Clara for healthcare, and NVIDIA NeMo for generative AI). But this doesn’t stop at the product level. NVIDIA also offers a suite of software libraries that depend on cuDF for data processing, covering machine learning (cuML), graph analytics (cuGraph), vector search (cuVS), and image processing (cuCIM). These libraries use libcudf primitives to minimize overhead and run jobs faster on GPU hardware. Lastly, because hipDF, like cuDF, is built on Arrow, integration with popular CPU-based data processing engines like pandas, polars, and DuckDB is also within reach. This broadens the user base and makes onboarding to AMD GPUs easier.
Voltron Data used this same approach when developing Theseus. We define it as a “compute mesh” that unifies hardware, languages, and applications, and we made our design decisions deliberately to support that vision.
We used AMD’s HIP framework to integrate Theseus with ROCm and connect it to the early-access version of hipDF. Theseus now simply runs on AMD GPUs, demonstrating how mature HIP has become in supporting projects on both CUDA and ROCm. After hipDF reaches GA, we will update to the latest Theseus release and run TPC-H and TPC-DS benchmarks at SF10000 on AMD MI300 hardware.
Theseus is built from the ground up for accelerated systems. Its secret sauce is its ability to spill out of GPU memory, shuffle over accelerated networks, and keep GPU pipelines fully saturated throughout query execution. Theseus brings these capabilities to ROCm-DS, giving AMD an immediate, production-grade answer to RAPIDS cuDF and unlocking AMD GPUs for analytics.
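For readers unfamiliar with spilling, the sketch below is a toy model of the general idea, not Theseus's implementation. The class `SpillingBufferPool`, its byte budget, and its least-recently-used eviction policy are all assumptions made for illustration: when a new buffer does not fit in the "device" budget, the oldest buffers are moved to "host" storage and brought back on demand.

```python
from collections import OrderedDict


class SpillingBufferPool:
    """Toy model: a fixed 'device' byte budget with LRU spill to 'host'."""

    def __init__(self, device_capacity: int):
        self.device_capacity = device_capacity
        self.device = OrderedDict()  # name -> bytes, least recently used first
        self.host = {}               # buffers spilled out of the device budget
        self.device_used = 0

    def _make_room(self, needed: int) -> None:
        if needed > self.device_capacity:
            raise MemoryError("buffer is larger than the whole device budget")
        # Spill least recently used buffers until the new one fits.
        while self.device_used + needed > self.device_capacity:
            victim, data = self.device.popitem(last=False)
            self.host[victim] = data
            self.device_used -= len(data)

    def put(self, name: str, data: bytes) -> None:
        # Drop any stale copy first so the byte accounting stays correct.
        if name in self.device:
            self.device_used -= len(self.device.pop(name))
        self.host.pop(name, None)
        self._make_room(len(data))
        self.device[name] = data
        self.device_used += len(data)

    def get(self, name: str) -> bytes:
        if name in self.device:
            self.device.move_to_end(name)  # mark as most recently used
            return self.device[name]
        data = self.host.pop(name)         # raises KeyError for unknown names
        self._make_room(len(data))         # may spill something else
        self.device[name] = data
        self.device_used += len(data)
        return data


pool = SpillingBufferPool(device_capacity=8)
pool.put("a", b"1234")
pool.put("b", b"5678")
pool.put("c", b"abcd")   # budget is full, so "a" is spilled to host
restored = pool.get("a") # "a" comes back, and "b" is spilled in its place
```

A production engine would also have to overlap these transfers with computation to keep the GPU busy, which this sketch does not attempt.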
Voltron Data is excited to participate in AMD’s hipDF early access program and share more Theseus benchmarks on more AMD hardware. If you’re interested in learning more about AMD’s hipDF project, check out their blog, Introducing ROCm-DS: GPU-Accelerated Data Science for AMD Instinct™ GPUs.