SQL 2025 and AI – chatting with your data part 1

With SQL Server 2025 embracing AI as a core focus, I was inspired by Bob Ward’s demo and Davide Mauri’s post to try out the new vector features myself. In this two-part series, I explore how SQL now supports semantic search and hybrid RAG setups — all running locally with Ollama.

Part 1: vector basics in SQL Server

Part 2 (coming soon): hybrid RAG with local LLMs

Let's begin 🙂

First, what’s all the fuss about?

SQL Server 2025 ships with some new features that enable RAG applications to be built on top of the familiar SQL Server ecosystem. It’s Microsoft trying to keep SQL Server relevant when everything under the sun seems to be eating into its market share.

RAG (Retrieval-Augmented Generation) is an architecture that combines retrieval of external data with generation using a large language model (LLM), like GPT or LLaMA.

What RAG Does:

Instead of asking an LLM to “guess” or hallucinate facts, RAG retrieves relevant context from an external data source (such as a database, PDF, or website) and then asks the LLM to generate an answer based on that real-world context. In simple terms, and in SQL Server’s case, this means you can search based on AI-enhanced semantic reasoning. It’s actually pretty cool.

RAG Architecture

  1. User Question → “What are the symptoms of Lyme disease?”
  2. Embed the Question → Convert the question into a vector.
  3. Vector Search → Look up the most semantically similar documents/snippets from a vector store (SQL 2025, Azure SQL, Managed Instance)
  4. Context Retrieval → Get top N relevant documents/snippets.
  5. Prompt Construction → Inject retrieved context into an LLM prompt.
  6. LLM Generation → “Based on these docs, answer the user question.”
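The six steps above can be sketched in a few lines of Python. Everything here is a toy stand-in: `embed` is a throwaway character-frequency "embedding" rather than a real model, and `vector_search` is a brute-force scan rather than SQL Server's vector store, but it makes the flow concrete.

```python
def embed(text: str) -> list[float]:
    # Step 2 (toy version): turn text into a vector. A real setup would
    # call an embedding model such as nomic-embed-text instead.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def vector_search(query_vec, docs, top_n=2):
    # Steps 3-4: rank documents by cosine similarity to the query vector
    # and keep the top N.
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(y * y for y in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0
    ranked = sorted(docs, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:top_n]

def build_prompt(question, snippets):
    # Step 5: inject the retrieved context into an LLM prompt.
    context = "\n".join(f"- {s}" for s in snippets)
    return f"Based on these docs:\n{context}\nAnswer the user question: {question}"

question = "What are the symptoms of Lyme disease?"
docs = ["Lyme disease symptoms include fever and rash.",
        "Mountain bikes have suspension forks."]
prompt = build_prompt(question, vector_search(embed(question), docs))
# Step 6 would send `prompt` to an LLM for generation.
```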

Why is RAG useful?

  • Reduces hallucinations (answers grounded in facts)
  • Keeps LLMs up-to-date (by connecting to live data)
  • Enables domain-specific Q&A (using internal documents)
  • Lightweight alternative to fine-tuning

SQL 2025’s AI Features to enable RAG

Vector Data Type

Store vector data optimized for operations such as similarity search and machine learning applications. Vectors are stored in an optimized binary format but are exposed as JSON arrays for convenience. Each element of the vector is stored as a single-precision (4-byte) floating-point value.
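To get a feel for the storage cost that description implies, here is a small Python illustration of packing a vector as single-precision (4-byte) floats next to its JSON-array form. This is not SQL Server's actual internal binary format, which the engine doesn't expose; it just shows the 4-bytes-per-element arithmetic.

```python
import json
import struct

# Illustrative only: 4 bytes per element when packed as float32.
vec = [0.12, -0.98, 0.33]

packed = struct.pack(f"{len(vec)}f", *vec)  # single-precision binary form
as_json = json.dumps(vec)                   # the convenient JSON-array view

print(len(packed))  # 3 elements x 4 bytes
print(as_json)
```

A 768-dimension embedding (the size nomic-embed-text produces) would therefore cost 768 × 4 = 3,072 bytes per row before any overhead.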

Vector Functions

New scalar functions perform operations on vectors in binary format, allowing applications to store and manipulate vectors in the SQL Database Engine.

Vector Index

Create and manage approximate vector indexes to quickly and efficiently find vectors similar to a given reference vector.

External AI Models

Manage external AI model objects for embedding tasks (creating vector arrays) by accessing REST AI inference endpoints.

So in SQL Server’s context there’s nothing fancy going on in the engine: we have a new data type and the ability to call external REST APIs from within SQL (that is pretty cool).

Vector Search Demo

The intention here is to get vector search running on my own PC, using the demo from Bob Ward and utilising Ollama to generate the embeddings. Once that was working, I tried to wrap it in some natural language so a user could ‘chat’ with the embedded data.

The RAG flow looks like this:

Setup

Below are some hugely simplified steps to get the basics of embedding generation going locally, leaning heavily on Bob’s work.

  • Install SQL Server 2025 public preview.
  • Install Ollama.
  • Pull nomic-embed-text.
  • Create a proxy API using HTTPS (SQL Server can only call external APIs running over HTTPS).
  • Create an external AI model.
  • Create a table with the vector data type.
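To make the proxy step concrete, here is a rough sketch of what it forwards. Ollama's `/api/embeddings` endpoint takes a model name and a prompt and returns a JSON object with an `embedding` array. The localhost URL assumes a default Ollama install, and the HTTPS wrapping that SQL Server requires is deliberately left out of this sketch.

```python
import json
import urllib.request

# Default Ollama endpoint on a local install (plain HTTP; the real proxy
# would sit in front of this and serve HTTPS to SQL Server).
OLLAMA_URL = "http://localhost:11434/api/embeddings"

def build_request(text: str, model: str = "nomic-embed-text") -> bytes:
    # Request body shape expected by Ollama's embeddings endpoint.
    return json.dumps({"model": model, "prompt": text}).encode("utf-8")

def parse_response(body: bytes) -> list[float]:
    # Ollama returns {"embedding": [ ...floats... ]}.
    return json.loads(body)["embedding"]

def embed(text: str) -> list[float]:
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_request(text),
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return parse_response(resp.read())
```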

Adventureworks – the dummy data

Production.ProductDescription contains a bunch of text data relating to different products, and forms the source of the embeddings.

If we take a look at it we can see it seems to be bike related, so let’s query it using traditional methods.

Are there any actual bike products?

Yes, so a very simple wildcard search can work. But what if we want to expand on that?

What’s the fastest bike?

Natural language prompts don’t work well at all. Existing data types and indexes struggle with semantic lookups.

Now let’s use the new index:

You can see here that the proc brings back results ranked by closest approximate similarity to the natural language expression. We are able to convert any sentence into something SQL can actually filter on.

What’s going on under the hood here?

https://learn.microsoft.com/en-us/sql/relational-databases/vectors/vectors-sql-server?view=sql-server-ver17

Embeddings are vectors that represent important features of data. Embeddings are often learned by using a deep learning model, and machine learning and AI models utilize them as features. Embeddings can also capture semantic similarity between similar concepts. For example, in generating an embedding for the words person and human, we would expect their embeddings (vector representation) to be similar in value since the words are also semantically similar.

In our example I have used nomic-embed-text to generate embeddings based on product descriptions.

Vector search refers to the process of finding all vectors in a dataset that are similar to a specific query vector. Therefore, a query vector for the word human searches the entire dataset for similar vectors, and thus similar words: in this example it should find the word person as a close match. This closeness, or distance, is measured using a distance metric such as cosine distance. The closer vectors are, the more similar they are.
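Since cosine distance is the metric mentioned above, here is a minimal Python implementation of it (1 minus cosine similarity): vectors pointing the same way have distance 0, and orthogonal vectors have distance 1.

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    # Cosine distance = 1 - cosine similarity. 0 means identical direction;
    # larger values mean the vectors (and so the concepts) are less alike.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

print(cosine_distance([1.0, 0.0], [2.0, 0.0]))  # same direction -> 0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 1.0
```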


Exact Nearest Neighbour (ENN)

Exact Nearest Neighbour search compares a query vector to every vector in the database to find the most similar ones. It uses a brute-force method that guarantees perfect accuracy by computing exact distances (e.g., cosine similarity or Euclidean distance). This approach is reliable for small datasets or when absolute precision is required, but it becomes inefficient and slow as the dataset grows, making it unsuitable for real-time or large-scale use cases.
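The brute-force approach described above fits in a few lines of Python. This sketch uses tiny hand-made 2D vectors and the person/human example from earlier; real embeddings would have hundreds of dimensions, which is exactly why the full scan stops scaling.

```python
import math

def cosine_distance(a, b):
    # Exact distance computation: no shortcuts, no index.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def enn_search(query, dataset, k=1):
    # ENN: compare the query against EVERY stored vector (O(n) per query),
    # then keep the k closest. Guaranteed exact, but it doesn't scale.
    scored = sorted(dataset, key=lambda item: cosine_distance(query, item[1]))
    return [item_id for item_id, _ in scored[:k]]

dataset = [("person",  [0.9,  0.1]),
           ("human",   [0.85, 0.15]),
           ("bicycle", [0.1,  0.9])]

print(enn_search([0.88, 0.12], dataset, k=2))
```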

Approximate Nearest Neighbour (ANN)

Approximate Nearest Neighbour search is designed to return very similar — but not guaranteed exact — results by using efficient indexing algorithms like HNSW, IVF, or product quantization. It significantly speeds up similarity searches by reducing the number of comparisons, making it ideal for large datasets and real-time applications such as semantic search or recommendation engines. While slightly less accurate, ANN offers a powerful trade-off between performance and precision.

Feature        | ENN (Exact Nearest Neighbour)  | ANN (Approximate Nearest Neighbour)
Accuracy       | Perfect                        | Close, not guaranteed
Speed          | Slower                         | Fast
Scalability    | Limited (small datasets)       | Excellent (large datasets)
Use case       | High-precision queries         | Real-time, large-scale search
Method         | Full scan                      | Indexed/partitioned search
Tools/Examples | Plain SQL, pgvector (no index) | FAISS, Milvus, pgvector with ivfflat

Both the index we used (which is of type diskann) and the proc use approximate nearest neighbour search.

Here’s a concise version of the Python behind the scenes, which creates an API that a Blazor webapp can use. ChatGPT is amazing 😉

And the embedding API used by SQL Server:

With just a few lines of SQL and FastAPI glue, SQL Server 2025 can now power smart, semantically aware applications, with no fine-tuning required. That’s all well and good, but what I found most interesting was the ability to use open-source models through Ollama. If anything, this was an excuse to start playing around with local LLMs, which are where the brains of all this lie and, in truth, the most interesting part. In Part 2, I’ll take this a step further by combining traditional, precise SQL queries with semantic ones for hybrid precision. Stay tuned!

