Save Time. Save Energy. Save Space. Save Money.
90x
Performance Increase.
Jobs done in 2 minutes rather than 3 hours.
100x
Decrease in Server Count.
Go from 200 servers down to 2.
100x
Cost Savings.
Spend $2 rather than $200 to run the same job.
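The multiples above follow directly from the quoted figures. As a quick sanity check, here is a minimal Python sketch that recomputes them. The variable and function names are illustrative only, and the inputs are just the numbers stated on this page, not additional benchmark data.

```python
# Figures quoted in the headline above (illustrative variable names).
baseline_minutes = 3 * 60      # "3 hours"
accelerated_minutes = 2        # "2 minutes"
baseline_servers = 200         # "200 servers"
accelerated_servers = 2        # "down to 2"
baseline_cost_usd = 200        # "$200"
accelerated_cost_usd = 2       # "$2"


def ratio(before, after):
    """Return how many times smaller `after` is than `before`."""
    return before / after


speedup = ratio(baseline_minutes, accelerated_minutes)
server_reduction = ratio(baseline_servers, accelerated_servers)
cost_reduction = ratio(baseline_cost_usd, accelerated_cost_usd)

print(f"Performance increase: {speedup:.0f}x")               # 90x
print(f"Decrease in server count: {server_reduction:.0f}x")  # 100x
print(f"Cost savings: {cost_reduction:.0f}x")                # 100x
```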
How did we achieve these results?
Explore the Theseus Benchmarking Report.
Scale to 100 TB queries and beyond
Theseus uses hardware acceleration to brute force its way quickly through massive datasets.
Put your GPUs to work.
Minimize GPU downtime by running both analytics and AI workloads on a unified accelerated architecture.
Solve time-sensitive problems too big for Spark and Presto.
CPU-based engines have reached peak performance. Leverage Theseus to keep up with AI demand and fundamentally change how data systems are accelerated.
Work on full-sized problems.
No downsampling. Theseus spills efficiently beyond GPU memory, so you can use fewer servers to process hundreds of terabytes of data.
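This page does not describe how Theseus spills internally. Purely to illustrate the general idea of spilling (work within a fixed memory budget, write partial results out when the budget is exceeded, and merge them at the end), here is a generic, CPU-only Python sketch. The function name `aggregate_with_spill`, the `max_keys_in_memory` parameter, and the use of temporary files are all assumptions made for this example and are not Theseus APIs.

```python
import pickle
import tempfile
from collections import Counter


def aggregate_with_spill(records, max_keys_in_memory):
    """Sum values per key, spilling partial sums to disk when too many
    distinct keys are held in memory at once (hypothetical helper)."""
    partial = Counter()   # in-memory partial sums
    spill_files = []      # temporary files holding spilled partial sums

    for key, value in records:
        partial[key] += value
        if len(partial) > max_keys_in_memory:
            # Budget exceeded: write the partial sums out and start fresh.
            spill = tempfile.TemporaryFile()
            pickle.dump(dict(partial), spill)
            spill.seek(0)
            spill_files.append(spill)
            partial.clear()

    # Merge what is left in memory with everything that was spilled.
    # For simplicity this final merge holds all keys in memory at once;
    # a real engine would merge partition by partition instead.
    totals = Counter(partial)
    for spill in spill_files:
        totals.update(pickle.load(spill))
        spill.close()
    return dict(totals)


records = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
result = aggregate_with_spill(records, max_keys_in_memory=2)
print(result)
```

With a budget of two keys, the third distinct key triggers one spill, and the merged totals are the same as if everything had fit in memory.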
100 TB with no sorting, indexing, caching, or warm-up runs.
See How
Our Engine. Your Product.
Build new or enhance data analytics products on top of Theseus to go to market faster.
Modular & secure installation
Tune and configure Theseus to meet your needs, not the other way around. Deploy on-premises or in the cloud, wherever Kubernetes is supported. Your Data. Your Kubernetes. Your Authentication.
Embeddable in your product
Save your customers time and money by adding Theseus to your product offering. Retain control over your brand and vision while Theseus does the heavy lifting.
Raw data-ready
Indexing not required. Suitable for tabular machine or log data. No sorting, indexing, caching, or warm-up runs (cold queries).
Data Preprocessing for Machine Learning
Accelerate preprocessing tasks to free up resources and kickstart training. Scale beyond big data bottlenecks by transforming, sampling, and labeling ML workloads with the massive parallelism of multiple distributed GPUs.
Become an accelerator-native design partner today.
Interoperability through Open Standards
Composes seamlessly with established and emergent open source standards and tools. Built by the leading company behind Apache Arrow, ADBC, Ibis, and Substrait.
Use your tools and language of choice
Write code in Python or SQL. Read from popular formats (Avro, CSV, Parquet…), and table systems (Hive, Iceberg…).
Don’t move your data
Built with open standards like Arrow and Arrow Database Connectivity, Theseus allows you to operate on your data wherever it is.
Theseus’ open standards, like Ibis, allow developers to write code once and deploy it across an array of execution engines without changing a single line of code.
Built for Composability
Voltron Data is solely focused on delivering a scalable data engine that unifies hardware, languages, and frameworks to solve the eventual scale problem that organizations definitively will hit with existing data platforms.
Mike Leone
Principal Analyst, TechTarget
What’s needed is a key to unlock this data at scale through an analytics acceleration engine that brings the benefits of hardware acceleration to data processing. That’s where vendors like Voltron Data come into play.
Brad Shimmin
Chief Analyst, AI Platforms, Analytics, and Data Management at Omdia
As the average enterprise now accesses over a thousand data sources, businesses must invest their data processing capabilities to support the next order of magnitude for analytics and AI demands. Voltron Data has taken an important step forward with this maiden voyage of Theseus to solve all of these data issues for the Era of AI.
Hyoun Park
CEO and Principal Analyst at Amalgam Insights
FAQ
Is Theseus right for your organization?
Anywhere that supports Kubernetes. Theseus can be deployed on-premises or in the cloud, wherever Kubernetes is supported. We regularly deploy on GKE, EKS, and AKS, but because Theseus is an accelerator-native engine, customers who deploy on-premises see the greatest TCO and performance gains.
Theseus is not a database. Theseus doesn’t store data. Theseus works with data in the Arrow memory format, so as long as your data speaks Arrow, Theseus can operate on it wherever it is. Theseus can read CSV files from network-attached storage, Parquet files from S3, ORC and Avro files, and data from warehouses like Snowflake.
Theseus delivers exceptional value for queries exceeding 30TB. There are numerous products and open-source projects for data under 30TB. We understand you have workloads below and above 30TB, so we've designed Theseus to work alongside other solutions through our supported open-source projects and standards.
If your team likes Python, they can leverage Ibis, a popular data frame library that supports over 20 different engines (including Theseus). If your team likes SQL, Theseus can support a myriad of different dialects, from PostgreSQL to Standard SQL (function support notwithstanding) via Ibis.
If you wrote your Spark jobs in Ibis (which supports Spark), then yes; otherwise, not yet.
On the largest queries, it’s expensive not to run on GPUs. As our benchmark report demonstrates, the costs for incredibly large queries quickly balloon out of control. Queries on GPUs can dramatically reduce costs. Moreover, the initial investment in GPUs for AI/ML use cases, where GPUs are often underutilized, can be repurposed for analytics. Move your most demanding analytics workloads to GPUs and get more value out of your investment. Our benchmarks show analytics workloads are dramatically less expensive on Theseus compared to CPU-native engines — over 50x cheaper than Spark.
We support all NVIDIA data center class GPUs from Volta onwards. The more HBM on the GPU, the better.