Theseus for RAG Workflows
Maximize Data Freshness and Tokenize on the Fly
Tokenize and search live data at query runtime to retrieve the freshest insights and power enterprise-scale RAG with no separate indexing or ETL overhead.
Embed vector search directly in SQL via UDFs and eliminate external pipelines and orchestration.
Scale to petabyte-sized datasets with GPU acceleration, delivering results in seconds, not hours.
Feed live production data as structured context to LLMs for up-to-the-minute, domain-specific insights.
Build RAG Pipelines with SQL Statements
```python
from pprint import pprint

k = 100
user_question = "Where are earthquakes causing damage?"

# Retrieve the k nearest articles to the question, directly in SQL.
result = con.sql(f"""
    SELECT
        source_url,
        source_text,
        rag.find_nearest_neighbor_distances(
            embedding, '{user_question}'
        ) AS distance_result
    FROM gdelt_text_embeddings
    ORDER BY distance_result ASC
    LIMIT {k}
""").to_pyarrow()

agent_response = chat.ask(user_question, result['source_text'])

pprint(agent_response[0].as_py())
```
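In the snippet above, `con` is assumed to be an open Theseus connection and `chat` a small helper that sends the retrieved `source_text` rows plus the question to an LLM; neither is defined by the snippet itself. The sketch below shows one possible shape for such a helper, using the OpenAI Python client as a stand-in backend. The `Chat` class, model name, and prompt are illustrative assumptions, not Theseus APIs.

```python
# Illustrative only: a minimal chat helper compatible with the
# chat.ask(user_question, result['source_text']) call above.
# The OpenAI client is a stand-in; any LLM backend would work.
import pyarrow as pa
from openai import OpenAI


class Chat:
    def __init__(self, model="gpt-4o-mini"):
        self.client = OpenAI()  # reads OPENAI_API_KEY from the environment
        self.model = model

    def ask(self, question, context):
        # `context` is the retrieved source_text column (a pyarrow ChunkedArray);
        # join the rows into a single context block for the prompt.
        context_block = "\n\n".join(chunk.as_py() for chunk in context)
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[
                {"role": "system",
                 "content": "Answer the question using only the provided context."},
                {"role": "user",
                 "content": f"Context:\n{context_block}\n\nQuestion: {question}"},
            ],
        )
        # Return a pyarrow array so the calling code's [0].as_py() works unchanged.
        return pa.array([response.choices[0].message.content])


chat = Chat()
```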
Drawbacks of Traditional RAG Approaches
Traditional RAG setups require heavy Python orchestration and external vector stores (e.g., Pinecone, FAISS, Chroma), which complicate SQL integration and drive up retrieval costs at scale.
Introducing SQL operations such as joins, sorts, aggregations, or filters across multiple sources degrades RAG pipeline performance.
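For comparison, a typical external-store setup looks roughly like the sketch below: documents are embedded in Python, pushed into a separate vector index, and queried outside the database, so any joins or filters against other tables require extra round trips back to SQL. FAISS and sentence-transformers are illustrative stand-ins here, not a specific required stack.

```python
# Illustrative traditional setup: embed documents in Python, index them in an
# external vector store (FAISS here), and query it outside the SQL engine.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "Earthquake damages buildings in coastal city.",
    "Central bank raises interest rates.",
    "Aftershocks reported near the epicenter.",
]

# Embed and normalize so inner product behaves like cosine similarity.
doc_vectors = model.encode(documents, normalize_embeddings=True)
index = faiss.IndexFlatIP(doc_vectors.shape[1])
index.add(np.asarray(doc_vectors, dtype=np.float32))

# Retrieval happens here, outside the database; joining the results against
# other tables means shuttling data between the vector store and SQL.
query = model.encode(["Where are earthquakes causing damage?"],
                     normalize_embeddings=True)
scores, ids = index.search(np.asarray(query, dtype=np.float32), k=2)
print([documents[i] for i in ids[0]])
```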
Advantages of Using Theseus
Theseus combines SQL-native vector search with GPU-accelerated query execution for production-scale, structured, performance-critical applications.
|  | Theseus | Others |
|---|---|---|
| Retrieval Method | SQL dialect, structured & vector | Similarity search with embeddings |
| Infrastructure | GPU-accelerated, SQL-native engine | Python libraries, vector DBs |
| Scale | Petabyte scale, structured and semi-structured | Document-level, small-to-medium scale |
| Target User | Data engineers, SQL analysts | AI developers, data scientists |
| Use Cases | Enterprise analytics, SQL pipelines | Document retrieval, chatbots, QA systems |
Example RAG Pipeline with JIT Tokenization and Embedding

1. Pull raw data (CSV, Parquet, JSON) into GPU memory: for example, news articles with metadata and URLs to the original sources.
2. Generate embeddings in situ: scrape text from the news articles and generate embeddings with a GPU tokenizer, using tools such as Hugging Face and NVIDIA Triton Inference Server.
3. Search embeddings in situ: use a vector search tool or library such as Pinecone, Qdrant, AstraDB, or NVIDIA cuVS to find the articles most relevant to the question asked.
4. Run inference/LLM in situ: feed the relevant articles alongside the user question and generate a response using LangChain, Ray Serve, or AWS Bedrock (an end-to-end sketch follows this list).
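The four steps can be prototyped end to end in a few lines. The sketch below is illustrative only: in Theseus these steps run in situ on the GPU inside the SQL engine, while here pandas, a Hugging Face model, and brute-force cosine search stand in to show the data flow. The file name, model choice, and column names are assumptions for the example.

```python
# Illustrative end-to-end prototype of the four steps above.
import numpy as np
import pandas as pd
import torch
from transformers import AutoModel, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

# 1. Pull raw data: news articles with metadata and source URLs.
articles = pd.read_parquet("news_articles.parquet")  # columns: source_url, source_text

# 2. Tokenize and embed the article text (on the GPU when available).
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2").to(device)

def embed(texts):
    batch = tokenizer(texts, padding=True, truncation=True,
                      return_tensors="pt").to(device)
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state
    # Mean-pool token embeddings, then L2-normalize for cosine similarity.
    mask = batch["attention_mask"].unsqueeze(-1)
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
    return torch.nn.functional.normalize(pooled, dim=1).cpu().numpy()

doc_vectors = embed(articles["source_text"].tolist())

# 3. Search embeddings for articles relevant to the user question
#    (brute-force cosine similarity here; swap in cuVS, Pinecone, or Qdrant at scale).
question = "Where are earthquakes causing damage?"
scores = doc_vectors @ embed([question])[0]
top_ids = np.argsort(scores)[::-1][:5]
context = "\n\n".join(articles["source_text"].iloc[i] for i in top_ids)

# 4. Feed the retrieved articles plus the question to an LLM
#    (via LangChain, Ray Serve, AWS Bedrock, or the chat helper shown earlier).
prompt = f"Context:\n{context}\n\nQuestion: {question}"
print(prompt)
```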