We are thrilled to announce the General Availability (GA) of Couchbase Shell (cbsh), a powerful command-line tool built on top of Nushell and designed to make your interactions with Couchbase easier and more efficient. Couchbase Shell supports vector search, which can power GenAI applications by letting them retrieve semantically similar items based on vector embedding representations of those items in a multi-dimensional space. With this release, cbsh lets users create vector indexes and run vector searches (such as similarity search) right from the command line. This enables use cases such as testing and tuning model parameters, ad hoc vector querying, and scripting, all from a simple command-line interface.

What is Couchbase Shell?

Couchbase Shell is an open source CLI tool tailored for developers and administrators working with Couchbase Capella and Couchbase Server. It allows users to quickly monitor, query, load data, export data, and run vector searches from a simple command-line tool that is easily extensible and has modern features such as syntax highlighting, intelligent autocomplete, contextual help, and helpful error messages.

The tool is available as a Couchbase Community supported project.

It is compatible with Linux, macOS, and Windows (see the installation documentation for the complete list).

Key features

    • Command pipelines, syntax highlighting and auto-completer
    • Connection Management
    • Loading Data
    • Exporting Data
    • Vector Search
    • Key-Value (KV) Operation Support
    • Query Data using SQL++

Here are some of the standout features of the Couchbase Shell. 

Command pipelines, syntax highlighting and auto-completer

In Nushell, pipelines allow you to combine many commands, similar to Linux pipes (|). cbsh builds upon this by adding custom Couchbase commands to interact with your cluster. Additionally, depending on your shell, it enables syntax highlighting and suggests auto-completions. For example, the following uses basic Nushell commands to open a local JSON file, then format it into a table:

Note that the syntax highlighting may look different for you, depending on the shell of your choice.

Once formatted in this way, you can pipe the result into the custom cbsh doc upsert command to insert the JSON into your Couchbase cluster. The full pipeline to open the doc, format it, and then upsert it is:

Connection management

Couchbase Shell simplifies connection management, allowing you to establish and manage connections to Couchbase with minimal effort. Users have two options to connect to Couchbase:

Command inline connection management (CLI arguments)

Connecting to Couchbase via CLI arguments is straightforward and allows you to quickly start working with your Couchbase clusters from the command line. The primary arguments you’ll need to provide are the cluster connection string, your username, and your password. Additionally, you can specify other parameters such as the bucket you wish to interact with, the specific scope and collection, and even the authentication mechanism if needed.

Basic connection example

To connect to a Couchbase cluster, you can use the following command:

Connecting to a specific bucket

To connect to a bucket on a Couchbase cluster, you can use the following command:

Connection via config file

The first time that you run ./cbsh you will receive a prompt asking if you’d like to create a config file. If you choose “yes”, the shell will take you through a series of prompts for information about your default cluster. If you choose “no”, it will try to connect to a local cluster running on localhost using the “Administrator” username and the password “password”. The config file must be called config and be placed in a .cbsh directory, either in your home directory or in the directory from which the shell is being run. More details can be found on our documentation website.

The configuration example below shows how to define two different clusters: one in Capella and one local Couchbase cluster. Optionally, users can also add a large language model (LLM) configuration, which we will describe later in the Vector Search section.

Loading data

Loading data into Couchbase using Couchbase Shell is a straightforward process that allows you to quickly populate your database. You can load data from:

    • Single document JSON files
    • Multiple documents JSON files
    • CSV files
    • Any CLI call that outputs structured text supported by Nushell

Users can use the open or from commands to first load data into Couchbase Shell, which can then be sent to Couchbase Server using the doc import or doc upsert commands.
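As a rough illustration of the load step, the Python sketch below (not cbsh syntax) parses CSV text into rows and wraps each row in an id/content shape before it would be sent to the cluster. The helper name csv_to_docs, the sample data, and the choice of id column are all hypothetical, and the actual write to Couchbase is left out.

```python
import csv
import io
import json


def csv_to_docs(csv_text, id_column):
    """Turn CSV text into a list of {"id": ..., "content": {...}} rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    docs = []
    for row in reader:
        # The chosen column becomes the document ID; the whole row is the body.
        docs.append({"id": row[id_column], "content": dict(row)})
    return docs


# Hypothetical sample data, used only for this illustration.
sample = "code,name,country\nAF,Air France,France\nBA,British Airways,United Kingdom\n"
docs = csv_to_docs(sample, "code")
print(json.dumps(docs, indent=2))
```

Each resulting row carries a document ID and a body, which is the information an import or upsert step needs.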

Visit our documentation for code samples and load data recipes.

Exporting data

Exporting data from Couchbase using Couchbase Shell is a powerful way to back up your data, move it between environments, or simply extract it for analysis. The cbsh tool provides a straightforward way to export data directly from your Couchbase cluster into JSON files, making it easy to handle data outside of the database environment. The export counterparts to open and from are save and to. You can use both commands to take tabular data from the shell and store it in files of the target format you need.

Visit our documentation for code samples and export data recipes.

Vector search

The headline feature of this release is support for vector search. This capability allows you to perform similarity searches on a given corpus of documents, all from the command line. It is especially useful for trying out and testing your models and for running ad hoc vector searches.

Vector search in Couchbase is powered by the integration of AI/ML models that convert text, images, or other data types into vectors. These vectors can then be compared to find similar items, providing a more relevant search experience compared to traditional keyword-based approaches.
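To make the idea of comparing vectors concrete, here is a minimal Python sketch of a brute-force similarity search. It assumes cosine similarity as the metric and uses tiny hand-written vectors; it is not how Couchbase implements its vector index, and the function names and sample data are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    """Return the k (doc_id, score) pairs most similar to the query vector."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy three-dimensional "embeddings"; real models return hundreds or thousands of dimensions.
vectors = {
    "landmark_1": [0.9, 0.1, 0.0],
    "landmark_2": [0.0, 1.0, 0.0],
    "landmark_3": [0.8, 0.2, 0.1],
}
results = top_k([1.0, 0.0, 0.0], vectors, k=2)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

A real vector index avoids scoring every document, but the ranking idea is the same: items whose vectors point in a similar direction to the query are returned first.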

Before starting with vector search, users need to define which large language model (LLM) they want to use. Currently, cbsh supports the following LLM providers:

    1. OpenAI
    2. Gemini (Google)
    3. Bedrock (AWS)

Visit the LLM documentation for configuration samples.

Typically, vector search is a three-step process:

1 – Generate embeddings for fields in a collection

This is typically done when documents are created or updated or, if you already have an existing dataset, as a bulk operation. With cbsh, this can be accomplished using the vector enrich-doc command. For example, the following pipeline has three parts piped together:

    1. Query for documents in landmark collection.
    2. Send the content field from the previous part’s result set to the LLM to generate vector embeddings. In this step, the specified field is sent to the LLM’s endpoint and the response is captured. Note that you must have an LLM definition configured in the config file. If you want to experiment with multiple models, you can define multiple LLMs in the config file and switch between them using the cb-env LLM <identifier> command. Each LLM returns a default number of dimensions, but you can override it using the --dimensions option. By default, this command stores the vector in a field named after the source field with Vector appended (fieldVector), which is contentVector in our example since the name of our field is content. You can override the default name of the vector field by using the --vectorField option.
    3. Save the vector embeddings by upserting them back to the database.

This command sends the specified field (content) to the LLM defined in the config file, which generates and returns vector embeddings. The returned vector embeddings are saved in the document, assuming the piped document has an id and content field. If not, the user can specify a custom ID and content field.

Note that this command assumes that the default bucket is set to travel-sample.
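The enrichment step can be pictured with the following Python sketch. It is not cbsh code: toy_embed is a deterministic stand-in for a real LLM embedding call, the id/content row shape is an assumption made for illustration, and enrich_doc is a hypothetical helper that mirrors the default naming described above (content becomes contentVector).

```python
def toy_embed(text, dimensions=4):
    """Stand-in for an LLM embedding call. NOT a real model."""
    vector = [0.0] * dimensions
    for index, char in enumerate(text):
        # Spread character codes over the vector slots so the output is deterministic.
        vector[index % dimensions] += ord(char) / 1000.0
    return vector


def enrich_doc(row, field="content", vector_field=None, dimensions=4):
    """Return a copy of the row with an embedding of `field` added to its body."""
    # Default vector field name: the source field name with "Vector" appended.
    target = vector_field or f"{field}Vector"
    content = dict(row["content"])
    content[target] = toy_embed(content[field], dimensions)
    return {"id": row["id"], "content": content}


# Hypothetical landmark document, shaped as an id plus a body.
row = {"id": "landmark_1", "content": {"name": "Tower", "content": "A tall stone tower."}}
enriched = enrich_doc(row)
print(sorted(enriched["content"].keys()))
print(len(enriched["content"]["contentVector"]))
```

The enriched row keeps its original ID, so writing it back overwrites the stored document with a version that also carries the vector.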

2 – Create a vector index for saved embeddings

This is always run after step 1. With cbsh, this can be accomplished using the create-index command. The following command will create a new vector index named landmark-contentVector-index over vectors with dimension 1024. Note that the dimension specified here must match the dimensions value in step 1. If you use your LLM’s default dimension, you can find this number in the LLM’s documentation or simply count the number of vector elements that were generated in step 1:
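Counting the vector elements is easy to script. The Python sketch below (again, not cbsh syntax) checks that every stored vector has the same length and that this length matches the dimension planned for the index. The helper name vector_dimension and the sample documents are hypothetical; 1024 is the dimension used in this walkthrough.

```python
def vector_dimension(docs, vector_field="contentVector"):
    """Return the shared length of all stored vectors, or raise if they differ."""
    lengths = {len(doc[vector_field]) for doc in docs if vector_field in doc}
    if not lengths:
        raise ValueError(f"no documents contain {vector_field!r}")
    if len(lengths) > 1:
        raise ValueError(f"inconsistent vector lengths: {sorted(lengths)}")
    return lengths.pop()


# Hypothetical enriched documents with placeholder vectors.
docs = [
    {"name": "a", "contentVector": [0.1] * 1024},
    {"name": "b", "contentVector": [0.2] * 1024},
]
index_dimension = 1024
if vector_dimension(docs) != index_dimension:
    raise SystemExit("index dimension does not match the stored embeddings")
print("dimension check passed")
```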

3 – Generate vector embedding for search keyword

Generate a vector embedding for a search keyword using the vector enrich-text command, then conduct a vector search against the vector index using the vector search command:

Finally, the result of a vector search can be piped into a doc or subdoc get to retrieve the contents of the found documents. Additionally, we can make the output prettier by only printing relevant fields:

Key-Value (KV) operation support

Couchbase Shell natively supports key-value operations. Key-value operations are unique to Couchbase and provide very fast CRUD operations for documents stored in Couchbase.

cbsh command to read documents via KV service

You can retrieve a document with doc get:

To distinguish the actual content from the metadata, the content is nested in the content field. If you want to have everything at the top level, you can pipe to the flatten command:
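To illustrate what flattening does to a row shaped like this, here is a small Python sketch (not cbsh syntax). The row layout, the sample values, and the flatten_row helper are hypothetical. Prefixing colliding keys is a choice made for this sketch and is not a statement about how the shell resolves name clashes.

```python
def flatten_row(row, nested_field="content"):
    """Lift the keys of row[nested_field] to the top level of a new dict."""
    flat = {key: value for key, value in row.items() if key != nested_field}
    nested = row.get(nested_field)
    if not isinstance(nested, dict):
        # Nothing to lift: keep the value under its original key.
        flat[nested_field] = nested
        return flat
    for key, value in nested.items():
        # Prefix colliding keys so metadata such as the document ID is never overwritten.
        target = key if key not in flat else f"{nested_field}_{key}"
        flat[target] = value
    return flat


# Hypothetical result row: metadata at the top level, document body under "content".
row = {
    "id": "airline_10",
    "content": {"id": 10, "name": "40-Mile Air", "country": "United States"},
    "cas": 1712321628245917696,
}
flat = flatten_row(row)
print(flat)
```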

You can also get multiple documents with a command like the following:

cbsh command to write documents via KV service

Documents can be mutated with doc insert, doc upsert and doc replace.

All three commands take similar arguments. If you only want to mutate a single document, passing in the ID and the content as arguments is the simplest way:

Documents can be removed with doc remove.
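The difference between the mutation commands comes down to how each treats a document that does or does not already exist. The following Python sketch models those semantics with an in-memory dictionary. It is a toy model, not a Couchbase client: the class and exception names are hypothetical.

```python
class DocumentExists(Exception):
    """Raised when inserting an ID that is already present."""


class DocumentNotFound(Exception):
    """Raised when the target ID is missing."""


class ToyKVStore:
    """In-memory model of insert / upsert / replace / remove semantics."""

    def __init__(self):
        self._docs = {}

    def insert(self, doc_id, content):
        # Insert only succeeds if the document does not exist yet.
        if doc_id in self._docs:
            raise DocumentExists(doc_id)
        self._docs[doc_id] = content

    def upsert(self, doc_id, content):
        # Upsert creates the document or overwrites it.
        self._docs[doc_id] = content

    def replace(self, doc_id, content):
        # Replace only succeeds if the document already exists.
        if doc_id not in self._docs:
            raise DocumentNotFound(doc_id)
        self._docs[doc_id] = content

    def remove(self, doc_id):
        if doc_id not in self._docs:
            raise DocumentNotFound(doc_id)
        del self._docs[doc_id]

    def get(self, doc_id):
        if doc_id not in self._docs:
            raise DocumentNotFound(doc_id)
        return self._docs[doc_id]


store = ToyKVStore()
store.insert("my-doc", {"hello": "world"})
store.upsert("my-doc", {"hello": "again"})
store.replace("my-doc", {"hello": "replaced"})
print(store.get("my-doc"))
store.remove("my-doc")
```

In this model, upsert is the safe default when you do not care whether the document exists, while insert and replace make that expectation explicit.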

There are many more KV operations you can do such as subdoc get. Please visit our documentation to learn more about KV operations.

Query data using SQL++

Running SQL++ queries using the Couchbase Shell is one of the most powerful features of the tool, enabling you to interact with your Couchbase data in a flexible and efficient manner. SQL++, an extension of SQL designed for JSON data, allows you to perform complex queries, including joins, aggregates, and subqueries, directly from the command line with cbsh.

For example, we can see how many airlines are operating in ‘France’ in the travel-sample data:

SQL++ is even more powerful as users can use named parameters and also use piping (|) to redirect the query command resultset to other commands. Visit our documentation for more details.

Get started

To help you get started with cbsh, we’ve prepared a detailed getting started guide on our documentation website. Here’s a quick overview of how to begin:

    • Get cbsh: Download cbsh for your operating system from the Couchbase Shell website.
    • Connect to your cluster: The recommended way is to create a config file, as specified in the config file documentation.
    • Perform CRUD operations, run queries or vector searches, and leverage Couchbase’s powerful features. To get you started quickly, our documentation provides recipes for some common use cases.

Community and open source support

We believe in the power of community and open-source development. Couchbase cbsh is open source, and we encourage you to contribute, provide feedback, and join the conversation. Join the Couchbase Forums or Couchbase Discord.

Further reading

To learn more, check out our documentation website. It goes into more detail on various supported commands and configurations, especially around connection credentials, and examples of piping commands together to achieve desired results.

Happy command shell!

The Couchbase Team



Author

Posted by Vishal Dhiman, Sr. Product Manager
