Some use cases are best served by multiple types of data access, including SQL, vector search, geospatial queries, and key-value access. One approach is to chain together multiple data systems, one for each access method. The Couchbase approach, however, makes it possible to combine these different types of queries in a single platform to solve real-world problems.

This article walks through aspects of the demo application “What is This Thing?” (aka “WITT”). For more context and background, check out:

This blog post is part of the 2024 C# Advent. However, you don’t need to understand C# to read this post: the concepts are applicable to any of the many SDKs available for Couchbase.

Vector Search: The Basics

Vector search is useful for applications that need to find similar items. For example, embeddings created by AI models can be indexed and searched. Each item in WITT is modeled like this:

Note: The image of the item is stored as a base64-encoded string. In a production project, I’d recommend using file storage, S3, etc., rather than storing it in the database.

The imageVector value is generated by uploading the image to an AI image model, such as Azure Computer Vision, which returns a vector embedding.
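As a rough illustration of that shape, here is a minimal Python sketch (WITT itself is written in C#). Only the image and imageVector fields come from the description above. The name and desc fields, the sample values, and the tiny 4-dimension vector are assumptions for illustration; real embeddings are much longer.

```python
import base64
import json
from dataclasses import dataclass, asdict


@dataclass
class Item:
    name: str                  # hypothetical field, not confirmed by the post
    desc: str                  # hypothetical field, not confirmed by the post
    image: str                 # base64-encoded image bytes
    imageVector: list[float]   # embedding returned by the AI image model


def make_item(name, desc, image_bytes, vector):
    """Build an item document, base64-encoding the raw image bytes."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return Item(name, desc, encoded, list(vector))


# Placeholder bytes and a toy vector stand in for a real image and embedding.
item = make_item("Pen", "A ballpoint pen", b"\x89PNG...", [0.12, -0.03, 0.88, 0.41])
print(json.dumps(asdict(item), indent=2))
```

The JSON printed here is the kind of document that would be stored, with the vector field being the one a vector search index targets.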

Note: One of the features of the just-announced Capella AI services is model hosting, which will reduce the latency of this step, increase privacy and flexibility, and potentially reduce costs.

Image Embeddings and Nearest Neighbor Search

With a vector search index on the imageVector field, Couchbase can perform nearest neighbor searches. In this case, that search would find items that are visually similar (according to the AI model). So, if a user has an image, and they want to find an item in Couchbase that is most similar to that image, a vector search index can do this:
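To make "nearest neighbor" concrete, here is a minimal Python sketch that does the same job by brute force: it compares the query vector to every stored vector and keeps the closest. This is not how Couchbase implements it (an index avoids scanning every item and is typically approximate), and cosine similarity is my choice of metric for the sketch, not something the post specifies. The item names and 3-dimension vectors are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_items(query_vector, items, k=1):
    """Return the k items whose imageVector is most similar to query_vector."""
    ranked = sorted(
        items,
        key=lambda item: cosine_similarity(query_vector, item["imageVector"]),
        reverse=True,
    )
    return ranked[:k]


# Toy data: real embeddings have far more dimensions.
items = [
    {"name": "Pen", "imageVector": [0.9, 0.1, 0.0]},
    {"name": "Stapler", "imageVector": [0.1, 0.9, 0.2]},
    {"name": "Mug", "imageVector": [0.0, 0.2, 0.9]},
]

# Pretend this vector came from the AI model for the user's uploaded photo.
uploaded = [0.8, 0.2, 0.1]
print(nearest_items(uploaded, items, k=1)[0]["name"])
```

Brute force is fine for a handful of items, but its cost grows with every item added, which is the problem a vector search index exists to solve.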

Here’s the code in WITT that, for a given image, requests a vector embedding from Azure Computer Vision:

There are probably frameworks that can handle this call too, but for this simple demo, which only requires a single REST call, I found this to be sufficient. If you want to use something other than Azure with this demo, you need to implement IEmbeddingService.
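The idea of swapping providers behind an interface can be sketched in Python as follows. This is an analogue, not WITT's code: the method name get_image_embedding, its signature, and the FakeEmbeddingService class are all hypothetical, since the actual members of IEmbeddingService are not shown here. The fake produces a deterministic vector from a hash, so it has no notion of visual similarity and is only useful for wiring and tests.

```python
import hashlib
from typing import Protocol


class EmbeddingService(Protocol):
    """Hypothetical Python analogue of an embedding-service interface."""

    def get_image_embedding(self, image_base64: str) -> list[float]:
        ...


class FakeEmbeddingService:
    """Stand-in provider: derives a small deterministic vector from a hash.

    A real implementation would call an image model instead
    (Azure Computer Vision, in WITT's case).
    """

    def __init__(self, dimensions: int = 8):
        # A SHA-256 digest has 32 bytes, so that is the upper limit here.
        self.dimensions = min(dimensions, 32)

    def get_image_embedding(self, image_base64: str) -> list[float]:
        digest = hashlib.sha256(image_base64.encode("utf-8")).digest()
        # Scale each of the first `dimensions` bytes into the range [0, 1].
        return [byte / 255 for byte in digest[: self.dimensions]]


def embed_upload(service: EmbeddingService, image_base64: str) -> list[float]:
    """Application code depends only on the interface, not on a provider."""
    return service.get_image_embedding(image_base64)


vector = embed_upload(FakeEmbeddingService(), "aGVsbG8=")
print(len(vector))
```

Because embed_upload only knows about the interface, replacing the fake with a class that calls a different AI provider requires no changes to the calling code.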

Multi-Purpose Queries with SQL++

Many databases with vector search can perform a very similar operation. What Couchbase enables you to do is to perform multiple types of data operations with a single platform, a single pool of data. For instance, given a geospatial location (which can be retrieved through a web browser), you can not only query to find a similar item by image, but also combine that with a geospatial search, all through a single SQL++ query:

Note: this query was edited for brevity’s sake. Check out DataLayer.cs for a more complete view of the queries.

The result of this query is a “most likely match” for a given image. For example, here is the top result when uploading a picture of a pen:

The quality of matches will depend on:

    • The quality of the AI model
    • The quality/quantity of the images for a given item

In my limited testing, I’ve found the Azure Computer Vision model to be very good for matching relevant images.

The result will also contain nearby stores where the item is available for purchase.
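To show what "best match plus nearby stores" means as plain logic, here is a Python sketch that computes the same kind of answer in application code. In WITT this work happens inside one SQL++ query, using indexes; the sketch only illustrates the combination. The store fields (name, lat, lon, stock), the 10 km radius, the coordinates, and the use of cosine similarity and the haversine formula are all assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def identify(query_vector, user_lat, user_lon, items, stores, radius_km=10.0):
    """Best-matching item, plus stores within radius_km that stock it."""
    best = max(items, key=lambda i: cosine_similarity(query_vector, i["imageVector"]))
    nearby = [
        s["name"]
        for s in stores
        if best["name"] in s["stock"]
        and haversine_km(user_lat, user_lon, s["lat"], s["lon"]) <= radius_km
    ]
    return {"item": best["name"], "stores": nearby}


# Hypothetical sample data.
items = [
    {"name": "Pen", "imageVector": [0.9, 0.1, 0.0]},
    {"name": "Mug", "imageVector": [0.0, 0.2, 0.9]},
]
stores = [
    {"name": "Downtown Office Supply", "lat": 39.97, "lon": -83.00, "stock": ["Pen"]},
    {"name": "Far Away Stationery", "lat": 39.1031, "lon": -84.5120, "stock": ["Pen"]},
    {"name": "Corner Kitchenware", "lat": 39.95, "lon": -82.99, "stock": ["Mug"]},
]

result = identify([0.8, 0.2, 0.1], 39.9612, -82.9988, items, stores)
print(result)
```

Doing this in application code means pulling items and stores out of the database first, which is exactly the round-tripping that a single combined query avoids.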

Beyond Vector Search and Geospatial

This query showed Couchbase’s ability to combine vector search AND geospatial search into a single operation. It also contained a CTE, JOINs, and a subquery.

Within a single query, you can also perform:

Here’s the marketing section: Some databases may only be able to perform a subset of these operations, and require you to bring in other tools when you need additional functionality. This increases your costs, latency, and complexity. With Couchbase, you can keep your application simpler, faster, and cheaper. Marketing section over.

Technical Highlights

The WITT demo application referenced is built with:

    • React UI frontend
    • ASP.NET Core backend
    • Azure Computer Vision
    • Couchbase .NET SDK

You can also check out What is This Thing? as a public demo. Keep in mind that it is all built with free-tier hosting (Azure and Capella Free Tier) and that it is still actively being developed. If you notice some slowness or downtime, that could be because of too much traffic, sorry!

Author

Posted by Matthew Groves

Matthew D. Groves is a guy who loves to code. It doesn't matter if it's C#, jQuery, or PHP: he'll submit pull requests for anything. He has been coding professionally ever since he wrote a QuickBASIC point-of-sale app for his parents' pizza shop back in the 90s. He currently works as a Senior Product Marketing Manager for Couchbase. His free time is spent with his family, watching the Reds, and getting involved in the developer community. He is the author of AOP in .NET and Pro Microservices in .NET, a Pluralsight author, and a Microsoft MVP.
