Versatility of Exasol with Agentic Engineering | Exasol
Exasol is an analytical database. It’s built for joins, aggregations, window functions, and the kind of queries that chew through billions of rows before your coffee gets cold.
But Exasol can do much more than that. It is highly extensible and malleable. I pushed its limits and solved a problem you would not expect an analytical database to solve.
The problem
Semantic search is essential for data-driven systems because it returns meaningful results for a search query.
Semantic search is a search technique that focuses on understanding the contextual meaning and intent behind a user’s query, rather than only matching keywords. For example, a search for “How to change the password” would return documents like “account recovery steps” or “forgotten credentials”, even though the words in the results do not exactly match the words in the query. Users don’t need to know the exact wording of the document they want. Semantic search handles this for them.
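To make the idea concrete, here is a minimal sketch of ranking documents by meaning rather than by keywords. The three-dimensional vectors are hand-made stand-ins, not output from a real model, and cosine similarity is a common choice of distance that I am assuming here for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made toy vectors (assumption): a real embedding model would
# produce hundreds of dimensions for each of these strings.
DOCS = {
    "account recovery steps": [0.9, 0.1, 0.0],
    "forgotten credentials": [0.8, 0.3, 0.1],
    "quarterly sales report": [0.0, 0.2, 0.9],
}

# Toy vector standing in for the query "How to change the password".
QUERY = [0.85, 0.2, 0.05]


def rank(query_vec, docs):
    """Return document names, most similar to the query first."""
    return sorted(
        docs,
        key=lambda name: cosine_similarity(query_vec, docs[name]),
        reverse=True,
    )


if __name__ == "__main__":
    for name in rank(QUERY, DOCS):
        print(name, round(cosine_similarity(QUERY, DOCS[name]), 3))
```

The two password-related documents score close to 1 and the sales report scores far lower, even though none of them shares a keyword with the query.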
This requires the data to be represented as vectors, specifically embeddings. A vector is a list of floating-point numbers. An embedding is a vector that represents a piece of data (text, images, and other modalities) as a point in a high-dimensional space, positioned so that semantically related items end up close together. Embeddings are calculated by machine learning models designed specifically for that purpose.
Taken together, these requirements usually call for a whole new architecture that lives separately from the database. But with Exasol’s extensible architecture and features, all of this is possible from within the database.
Discovery
Exasol has three seams. A seam is a place where external code can get inside the database. These seams let users and agents extend the functionality of the database, and they explain what a user or agent can do when you hand them the platform and some docs.
Most databases give you one extensibility hatch. If you’re lucky, it’s a stored-procedure language. If you’re less lucky, it’s a narrow plugin API with a handful of approved hooks. You can do things, but the shape of what you can do is tightly constrained by whatever the original designers pictured. With Exasol you get these three seams; they are all wide.
Virtual Schemas are the most important part of this whole story. The idea: an adapter sits between Exasol and an external data platform, and that external data platform now looks like a SQL table. When a query hits the virtual table, Exasol hands the adapter the parsed query (columns, filters, limits), and the adapter does whatever it wants: query a database, hit an HTTP API, run a model. It hands rows back. Exasol doesn’t care how the rows were made. It treats them like native data.
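The request-and-response flow can be sketched roughly as follows. This is not the adapter from this project: it is written in Python purely for illustration, `fetch_rows` is a hypothetical stand-in for the external system, and the JSON field names (`pushdownRequest`, `limit`, `numElements`) reflect my reading of the Virtual Schema API and should be checked against the official documentation.

```python
import json


def fetch_rows(limit):
    """Hypothetical stand-in for the external system (API, model, database)."""
    rows = [
        (1, "account recovery steps"),
        (2, "forgotten credentials"),
        (3, "quarterly sales report"),
    ]
    return rows if limit is None else rows[:limit]


def sql_literal(value):
    """Render a Python value as a SQL literal, escaping single quotes."""
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def adapter_call(request):
    """Take the JSON request from the database and answer with JSON.

    Only the pushdown request type is handled in this sketch.
    """
    req = json.loads(request)
    if req["type"] != "pushdown":
        raise ValueError("unsupported request type: " + req["type"])

    # Assumed request layout: an optional limit nested in the pushdown request.
    pushdown = req.get("pushdownRequest", {})
    limit = pushdown.get("limit", {}).get("numElements")

    rows = fetch_rows(limit)
    if not rows:
        # No rows: return a query with the same columns and an empty result.
        sql = "SELECT 0 AS ID, '' AS TITLE FROM DUAL WHERE FALSE"
    else:
        values = ", ".join(
            "(" + ", ".join(sql_literal(v) for v in row) + ")" for row in rows
        )
        sql = f"SELECT * FROM (VALUES {values}) AS t(ID, TITLE)"

    # The adapter replies with SQL that the database runs in place of the
    # virtual table, which is how rows made anywhere end up looking native.
    return json.dumps({"type": "pushdown", "sql": sql})
```

The design point is that the adapter never returns rows directly. It returns a SQL statement that produces them, so the database can keep planning and executing the rest of the query as usual.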
UDFs (user-defined functions) are functions that users can write and run inside Exasol. Python, Lua, Java, R – pick your poison, write code, run it inside a SELECT statement. I used Python UDFs to ingest data: read rows, call an embedding model, push the vectors into Qdrant. Python UDFs have the full Python runtime behind them, so you don’t have to install anything. If you need extra libraries, as we did to run the embedding model here, you can customize the SLC (script language container) that sits behind every UDF. Find out more about customizing SLCs here.
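The ingest loop described above (read rows, embed, push) might look roughly like the sketch below. It is not the UDF from this project. The column names `DOC_ID` and `DOC_TEXT`, the `embed_text` and `upsert_point` helpers, and the `FakeContext` class are all hypothetical, added so the loop can be run outside the database. Inside Exasol, the body of `run` would live in a script definition and the database would supply `ctx`.

```python
UPSERTED = []  # stand-in for the vector database, for local runs only


def embed_text(text):
    """Placeholder embedding (assumption): a real UDF would call a model."""
    return [float(len(text)), float(text.count(" "))]


def upsert_point(point_id, vector, payload):
    """Placeholder for the write to the vector database."""
    UPSERTED.append({"id": point_id, "vector": vector, "payload": payload})


def run(ctx):
    """Entry point of a set-style UDF: walk the input rows one at a time."""
    while True:
        vector = embed_text(ctx.DOC_TEXT)
        upsert_point(ctx.DOC_ID, vector, {"text": ctx.DOC_TEXT})
        ctx.emit(ctx.DOC_ID, len(vector))  # report id and vector size
        if not ctx.next():  # advance to the next row, stop when exhausted
            break


class FakeContext:
    """Minimal local imitation of the context object passed to run()."""

    def __init__(self, rows):
        self._rows = rows
        self._pos = 0
        self.emitted = []

    def __getattr__(self, name):
        # Reached only for names not found normally: treat them as columns.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._rows[self._pos][name]
        except KeyError:
            raise AttributeError(name)

    def next(self):
        if self._pos + 1 < len(self._rows):
            self._pos += 1
            return True
        return False

    def emit(self, *values):
        self.emitted.append(values)


if __name__ == "__main__":
    ctx = FakeContext([
        {"DOC_ID": 1, "DOC_TEXT": "reset password"},
        {"DOC_ID": 2, "DOC_TEXT": "sales"},
    ])
    run(ctx)
    print(ctx.emitted)
```

Keeping the embedding and upsert steps behind small functions makes it easy to swap the placeholders for real network calls without touching the row loop.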
Script pre-processors are the third. They rewrite SQL before it runs, so you can intercept and transform queries right before they get executed. I didn’t use them here. But they’re sitting there, fully documented, for whoever wants to build something weirder.
Our Solution
Use Exasol’s Virtual Schemas and point them at an external vector database (Qdrant in this case) to bring its vector functionality into Exasol. Use Exasol’s UDFs (user-defined functions) to create the embeddings (data converted into vectors) and to insert them into the vector database. We planned to create embeddings for table columns using an external machine learning model running on an Ollama instance. Ollama is an open-source server for running open-source machine learning models. You can also store embedding models in Exasol via BucketFS and invoke them with a UDF.
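As a rough illustration of the two network calls this plan implies, the sketch below builds (but does not send) an embedding request for Ollama and an upsert request for Qdrant. The host names, ports, model name, and collection name are assumptions for the example, and the endpoint shapes follow my understanding of the public Ollama and Qdrant HTTP APIs, so verify them against the versions you run.

```python
import json
import urllib.request

# Assumed defaults: adjust to your own deployment.
OLLAMA_URL = "http://localhost:11434"
QDRANT_URL = "http://localhost:6333"


def ollama_embed_request(text, model="nomic-embed-text"):
    """Build the request that asks Ollama to embed one piece of text."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL + "/api/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def qdrant_upsert_request(collection, point_id, vector, payload):
    """Build the request that stores one vector and its payload in Qdrant."""
    body = json.dumps(
        {"points": [{"id": point_id, "vector": vector, "payload": payload}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{QDRANT_URL}/collections/{collection}/points",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    req = ollama_embed_request("How to change the password")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req), then read the vector
    # from the JSON response and pass it to qdrant_upsert_request.
```

Building the requests separately from sending them keeps the sketch runnable without either service, and it makes the payloads easy to inspect before wiring them into a UDF.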
Now, I am not a Lua developer (Lua being the programming language required to write this adapter). I’ve never written a line of Lua outside of this project. And the specific extensibility mechanism I needed to use – Virtual Schemas – expects adapters written in Lua. Historically that would have meant one of two things: filing a request with engineering and waiting a quarter, or spending a month learning Lua well enough to not embarrass myself.
I did neither. What I had instead was this:
Exasol’s Virtual Schema documentation. Thorough. Annoying to...