AI Powered Database Queries

by Markus Egger | First published: July 16, 2024 - Last Updated: August 16, 2024

We have added a feature to quite a few applications now that I am very excited about: AI database queries.

This is different from AI Copilot-style features. Copilots typically provide a chat-style interface (or sometimes a single-button interface) that answers questions, ideally grounded in your own data. For instance, Copilot features within our own applications that I have previously demonstrated (including on this blog) can answer questions such as "who attended our recent AI training class, and what do you know about the attendees?". This usually produces a conversational text response with interesting information and - usually - the ability to click into further details. Such answers can typically include a handful of data points, similar to the amount of information you could cover in a chat with a human.

But what if you want to query millions of records?

That is the domain of database queries, such as SELECT * FROM Customers WHERE.... Handling very large amounts of data is the strength of database technologies, whether that is SQL Server, Oracle, a NoSQL database, or many others. But what if you wanted to use AI to query such a database, running a query from an AI prompt like this: Customers from the greater Houston area who attended one of our events in-person since the start of the pandemic.

And that is exactly the example I am showing here. We have developed technology that can take this kind of AI prompt (in English, or any other language), turn it into an SQL query behind the scenes, and then run the query with raw SQL power (or whatever database technology you prefer). This is fundamentally different from typical copilots in that it can handle large amounts of data (or even streaming data), because once the AI generates an SQL query that satisfies the criteria, it is just that: an SQL query.

Here is the video that shows my demo:

Note that your specific requirements may be very different, but the same technique could be used to craft a solution that satisfies any scenario or database technology.

How does it work?

We have a custom implementation of this (and yes, I know I have previously argued that this is not a good idea… see the security issues below). We take the AI prompt and hand it to an LLM (currently GPT-4o, but we change this regularly), which is grounded in a special data dictionary made specifically to explain the relevant data structures to the AI in a way that makes sense to an AI, and in a way that is also specific to the scenario. (In other words: in this example, we describe the data structures specifically for querying people and some 70+ related tables that connect to people. In other scenarios, the dictionary may be designed completely differently.) We then use a rather sophisticated prompt that allows the LLM to create an SQL query that matches the request.
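To make the idea of a scenario-specific data dictionary concrete, here is a minimal sketch of what such grounding might look like. The table and column names (Customers, EventAttendance, and so on) are hypothetical placeholders, not the actual schema, and the prompt text is an assumption about how such grounding could be phrased:

```python
import json

# Hypothetical, trimmed-down data dictionary: each entry explains a table
# to the LLM in plain language, scoped to the "people" scenario only.
DATA_DICTIONARY = {
    "Customers": {
        "description": "One row per person we track. Primary key: CustomerId.",
        "columns": {
            "CustomerId": "Primary key (int).",
            "City": "City of the customer's primary address.",
            "State": "Two-letter US state code, e.g. 'TX'.",
        },
    },
    "EventAttendance": {
        "description": "Links customers to events they attended.",
        "columns": {
            "CustomerId": "FK to Customers.CustomerId.",
            "EventDate": "Date of the event (ISO 8601).",
            "AttendanceType": "'InPerson' or 'Virtual'.",
        },
    },
}

def build_system_prompt(dictionary: dict) -> str:
    """Assemble the grounding prompt handed to the LLM alongside the user's
    natural-language criteria. Only scenario-relevant tables are included,
    which also limits what the AI knows about the database."""
    return (
        "You translate natural-language criteria into a single SQL SELECT "
        "statement that returns one column of Customers.CustomerId values.\n"
        "Use only these tables:\n"
        + json.dumps(dictionary, indent=2)
    )

prompt = build_system_prompt(DATA_DICTIONARY)
print("EventAttendance" in prompt)  # True
```

The key design point is that the dictionary is prose written for the model, not a raw schema dump: descriptions carry the semantics ("one row per person", valid values for AttendanceType) that the AI needs to write a correct query.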

Or at least we hope that it does 😃. Our next step is to use another AI to verify that the resulting SQL query actually satisfies the original criteria. If it doesn't, we go back to square one and start over.
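The generate-then-verify loop described above could be sketched like this. The two callables stand in for real LLM calls (the actual prompts and models are not shown in the post), and the retry limit of three is my own assumption:

```python
def run_with_verification(nl_prompt, generate_sql, verify_sql, max_attempts=3):
    """Generate a query, ask a second model whether it satisfies the
    original natural-language criteria, and start over on failure.
    Both callables are stand-ins for real LLM calls."""
    for attempt in range(1, max_attempts + 1):
        sql = generate_sql(nl_prompt, attempt)
        if verify_sql(nl_prompt, sql):
            return sql
    raise RuntimeError(f"No verified query after {max_attempts} attempts")

# Toy stand-ins: the "generator" gets it right on the second try, and the
# "verifier" just checks that the query filters on in-person attendance.
attempts = []

def fake_generate(nl_prompt, attempt):
    attempts.append(attempt)
    if attempt == 1:
        return "SELECT CustomerId FROM Customers"
    return ("SELECT CustomerId FROM EventAttendance "
            "WHERE AttendanceType = 'InPerson'")

def fake_verify(nl_prompt, sql):
    return "InPerson" in sql

sql = run_with_verification("in-person attendees", fake_generate, fake_verify)
print(attempts)  # [1, 2]
```

Capping the retries matters: if the verifier keeps rejecting, you want to fail loudly rather than loop on LLM calls forever.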

Security

Once we have a query, we use yet another AI to inspect it for potential security problems. This helps us identify standard issues, and we abandon the query if it appears to be problematic. (Note that our custom data-dictionary approach also reduces our attack surface, since the AI is only aware of the data structures relevant to the scenario in question.) We then take another step, where we AI-analyze the query to see whether it intends to do something that is not desired. For instance, since we are creating a query that only returns records, we can inspect it to make sure it isn't attempting to modify data. (Also, the security context we run this under cannot modify data anyway… but it never hurts to double-check, right? 😃 )
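Alongside the AI-based review, a cheap static screen for "read-only" queries is easy to add as one more layer. This is my own illustrative sketch, not the author's implementation, and a keyword scan alone is by no means sufficient security - it only complements the AI inspection and the restricted security context:

```python
import re

# Keywords that should never appear in a read-only query.
FORBIDDEN = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|TRUNCATE|MERGE|EXEC|GRANT)\b",
    re.IGNORECASE,
)

def looks_read_only(sql: str) -> bool:
    """Cheap static screen run in addition to the AI review: the query must
    be a single statement, start with SELECT, and contain no
    data-modifying keywords."""
    statements = [s for s in sql.split(";") if s.strip()]
    if len(statements) != 1:
        return False  # reject stacked statements like "SELECT 1; DROP ..."
    if not statements[0].lstrip().upper().startswith("SELECT"):
        return False
    return not FORBIDDEN.search(sql)

print(looks_read_only("SELECT CustomerId FROM Customers"))  # True
print(looks_read_only("SELECT 1; DROP TABLE Customers"))    # False
print(looks_read_only("DELETE FROM Customers"))             # False
```

Defense in depth is the point: even if this filter and the AI review both miss something, the database login the query runs under still has no write permissions.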

Our query also has to satisfy some very specific criteria. In this example, it can only return a list of primary key values in a single column. If the resulting structure were anything other than that, it would simply fail, due to how we run the query (in this example, as a sub-query). Finally, if we are satisfied that everything looks good, we run the query in a “blast chamber”: a small fake version of the database with the same structure as our production database. This allows us to safely run the query in an environment where even really bad outcomes would not matter.
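A blast-chamber check combined with the single-column shape rule could be sketched as follows. Here I use an in-memory SQLite database purely for illustration; the real system presumably mirrors the production schema in whatever engine it targets, and the table names are the same hypothetical ones as above:

```python
import sqlite3

def blast_chamber_check(sql: str) -> bool:
    """Run the candidate query against a small throwaway copy of the schema.
    The query must execute successfully and return exactly one column,
    since it will later be embedded as a sub-query returning primary keys."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE Customers (CustomerId INTEGER PRIMARY KEY, City TEXT);
        CREATE TABLE EventAttendance (
            CustomerId INTEGER, EventDate TEXT, AttendanceType TEXT);
        INSERT INTO Customers VALUES (1, 'Houston'), (2, 'Austin');
    """)
    try:
        cursor = db.execute(sql)
        return len(cursor.description) == 1  # exactly one result column
    except sqlite3.Error:
        return False  # bad SQL fails safely in the fake database
    finally:
        db.close()

print(blast_chamber_check("SELECT CustomerId FROM Customers"))        # True
print(blast_chamber_check("SELECT CustomerId, City FROM Customers"))  # False
print(blast_chamber_check("SELECT Nope FROM Missing"))                # False
```

Because the chamber contains only throwaway data with the production structure, even a query that does something really bad affects nothing that matters.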

When we are finally satisfied that all of this works, we keep the resulting query in a safe place on the server and return a token to the client that represents that query. (This entire process usually takes about five seconds or so.) The token can then be used to run the query later (either immediately as a next step, or at any later point, depending on how long we want to keep the query available). It is important that the client never gets to see the actual query. The client can only say “run query XYZ”; it will never see the actual SQL, for security reasons.
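The token mechanism can be sketched with a simple server-side store. The class and token format here are my own assumptions (the post does not describe the implementation); the essential property is only that the vetted SQL stays on the server and the client holds nothing but an opaque handle:

```python
import secrets

class QueryStore:
    """Server-side store for vetted queries. The client only ever receives
    an opaque token and can say 'run query <token>'; the raw SQL never
    leaves the server."""

    def __init__(self):
        self._queries = {}

    def register(self, sql: str) -> str:
        token = secrets.token_urlsafe(16)  # unguessable handle for the client
        self._queries[token] = sql
        return token

    def run(self, token: str, execute):
        sql = self._queries.get(token)
        if sql is None:
            raise KeyError("Unknown or expired query token")
        return execute(sql)  # execution happens server-side

store = QueryStore()
token = store.register("SELECT CustomerId FROM Customers")
print(store.run(token, lambda sql: f"ran: {sql}"))
# ran: SELECT CustomerId FROM Customers
```

Expiring entries from the store is also where you would control how long a generated query remains runnable.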

Note that all these steps are very important. Without them, running such a query can be very problematic security-wise. (This is why I previously said never to do this 😃.) Depending on your exact setup, your admin may still require you to run this against a duplicate of your database.

Out-of-the-box Support for AI Queries

Currently, a number of database vendors are working on similar features to be added to their core products. (I would even assume that most of them are). These implementations may make these scenarios much easier.

I am not entirely sure how I feel about that. As you can see, we take numerous steps to reduce the attack surface, remove the ability to do this too generically (so hacking attacks are more likely to fail), and expose only the parts of our database required for a specific scenario. The purpose-crafted AI data dictionary also provides a great way to “explain” data structures to the AI in a form it not only understands, but that is also specific to the scenario at hand.

I think that approach will be superior to more generic features supported by a database directly. It is also more work though (although - like many AI tasks - it is NOT a huge effort either). I am curious to see how all that will play out in the long run.