Build Your First Open Source AI Agent with Couchbase

If 2024 was the year of AI chatbots, then 2025 is the year of AI agents. At first glance, they may seem similar, but nothing could be further from the truth. While you may interact with an AI agent the same way you interact with an AI chatbot, perhaps through a web interface, the differences between them are stark.

AI agents can act autonomously to fulfill your request. That’s right, I said it: autonomously. The AI acts on your behalf, figuring out what to do and what its next steps should be without your intervention.

Want to find a good restaurant for a date night and book a table? Let your AI agent do it for you, from the research to the booking.

Need to track new customer engagements in your CRM and send customized and personalized email outreach? Let your AI agent do it for you, from the tracking to composing the email all the way to sending it.

Essentially, agents can handle most tasks you would otherwise do yourself in a browser. You can even have multiple agents working in tandem. All they need is a task from you, and they will work out the steps to complete it.

You may think that building or using agentic AI will cost you a fortune, and you wouldn’t be wrong for thinking that. Companies, some of the largest in the space, are offering these services for very high monthly fees. Yet, with some knowledge of Python and JavaScript, you can build a full stack AI agent with a web user interface for free. Best of all, you can easily incorporate Couchbase for performant data storage, retrieval and full text search capabilities. Couchbase Capella, the fully managed database as a service, even offers a free forever tier that you can use in your application.

Did I pique your interest? Well, great news! You don’t even have to build it from scratch. You can find the full application in my open source browser agent GitHub repository – with instructions on how to start using it.

Let’s take a look at the architecture of the application, and how it is built. A better understanding of how it works will empower you to adapt it for your own personal use case.

The application is made up of a React frontend and a Python backend. We’ll focus on the backend, as that is where the work of the agent and the interactions with Couchbase happen. The frontend, which follows a mobile-first design, is available for you to explore on GitHub.

Exploring how it works

The frontend is responsible for a smooth and responsive user experience. The backend is where the AI agent operates: it receives user requests, runs the agent, and reads and writes data in Couchbase.

The AI agent with browser use

At the core of the AI agent is the integration of the open-source Python library from Browser Use. This library empowers the agent to perform a variety of tasks by simulating browser actions. Whether it’s navigating to a restaurant’s website to book a table or accessing the CRM to track customer engagements, Browser Use provides the necessary tools for the agent to interact with web resources effectively.

Initializing a Browser Use AI agent only takes a few lines of code in your Python backend:

from browser_use import Agent

# `llm` is assumed to be a chat model instance configured elsewhere in the backend.

async def run_browser_agent(task: str):
    """
    Runs the browser agent and returns its results in a JSON-serializable format.
    """
    agent = Agent(task=task, llm=llm, use_vision=True, max_failures=3, retry_delay=5)
    history = await agent.run(max_steps=50)

    # Flatten each step's result(s) into plain dictionaries
    serialized_results = []
    if history and history.history:
        for step in history.history:
            if step.result:
                if isinstance(step.result, list):
                    for res in step.result:
                        serialized_results.append(vars(res))
                else:
                    # A single result object rather than a list
                    serialized_results.append(vars(step.result))

    return serialized_results

The Agent is instantiated with several parameters, including:

- The specific task you want it to do
- The large language model it will use to analyze the results of its independent research (the project currently supports Anthropic, OpenAI, DeepSeek, Qwen, Azure, Gemini and others)
- A boolean value for vision, which gives the agent the ability to also analyze images from its research
- The number of failures you will tolerate before exiting the task unsuccessfully
- How long to wait in seconds in between retries
- How many steps you give permission for the agent to execute in order to attempt to complete the task

The rest of the function is concerned only with returning the results: it walks the agent's history and converts each step's result into a dictionary that can be serialized as JSON.
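If you want to see what that flattening step does without launching a browser, here is a minimal sketch of the same loop run against stand-in objects. The `serialize_history` name, the `SimpleNamespace` stand-ins, and the `extracted_content` and `is_done` values are illustrative assumptions and are not taken from the project.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def serialize_history(history: Any) -> List[Dict[str, Any]]:
    """Flatten an agent history into a list of plain dictionaries."""
    serialized: List[Dict[str, Any]] = []
    if not history or not getattr(history, "history", None):
        return serialized
    for step in history.history:
        if not step.result:
            continue
        # A step may carry one result or a list of results; normalize to a list.
        results = step.result if isinstance(step.result, list) else [step.result]
        serialized.extend(dict(vars(res)) for res in results)
    return serialized


# Hypothetical stand-ins that mimic the shape the loop expects.
fake_history = SimpleNamespace(
    history=[
        SimpleNamespace(result=[SimpleNamespace(extracted_content="Found 3 restaurants", is_done=False)]),
        SimpleNamespace(result=None),  # a step with no result is skipped
        SimpleNamespace(result=SimpleNamespace(extracted_content="Table booked", is_done=True)),
    ]
)

print(serialize_history(fake_history))
```

Normalizing a single result into a one-item list keeps both cases on the same code path, which is one way to avoid handling the two shapes separately.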

That’s all it takes to create an agent and put it out in the world. A few lines of Python, really.

When the AI agent receives a task, such as booking a restaurant or sending personalized emails, it utilizes the Browser Use library to perform the necessary actions online. This involves navigating websites, filling out forms, and even parsing information, all executed programmatically to fulfill user requests efficiently.

Data persistence and search are fundamental features of modern applications. Not only can you persist data for the user, such as chat history, but you can also feed that persisted data back to your agent as additional context, a technique called Retrieval Augmented Generation (RAG). I recommend reading this article if you’re interested in exploring that topic further.
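To make the RAG idea a little more concrete, here is a rough sketch of the "augmentation" half only: taking chats that have already been retrieved and prepending them to the task the agent receives. The `build_task_with_context` function and the `name` and `summary` keys are hypothetical and are not part of the project. A real RAG pipeline would also need a retrieval step that ranks stored data by relevance.

```python
from typing import Dict, List


def build_task_with_context(
    task: str,
    past_chats: List[Dict[str, str]],
    max_snippets: int = 3,
) -> str:
    """Prepend a few previously retrieved chat snippets to the agent's task."""
    # Assumes each chat dict has hypothetical "name" and "summary" keys.
    snippets = [f"- {chat['name']}: {chat['summary']}" for chat in past_chats[:max_snippets]]
    if not snippets:
        # Nothing retrieved, so the task is passed through unchanged.
        return task
    context = "\n".join(snippets)
    return f"Relevant history:\n{context}\n\nTask: {task}"


retrieved = [{"name": "Date night", "summary": "Prefers Italian, table for two"}]
print(build_task_with_context("Book a table for Friday", retrieved))
```

The resulting string could then be passed as the `task` argument when the agent is created.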

For now, let’s introduce data persistence and search functionality into your agent user interface with Couchbase.

Interacting with Couchbase for Full Text Search

Couchbase plays a pivotal role in managing and retrieving chat data efficiently. Leveraging Couchbase Full Text Search (FTS) capabilities, the backend can swiftly search through extensive chat histories to find relevant conversations based on user queries. Here’s a streamlined example of how the backend interacts with Couchbase for FTS using the Couchbase Python SDK:

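import logging
from typing import List
from couchbase.search import QueryStringQuery, SearchOptions

# `cluster` is assumed to be a connected Couchbase Cluster object created elsewhere in the backend.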

def search_chats(user_id: str, search_text: str) -> List[dict]:
    """
    Searches chats for a given user based on the search_text using Couchbase FTS.

    Args:
        user_id (str): The ID of the user.
        search_text (str): The text to search within chat names.

    Returns:
        List[dict]: A list of chat documents that match the search criteria.
    """

    try:
        # Execute the search query against the specified FTS index
        search_result = cluster.search_query(
            "bucket-name.scope-name.search-index-name",
            QueryStringQuery(search_text),
            SearchOptions(fields=["chat_id", "user_id", "name", "messages.content", "messages.timestamp", "messages.sender"])
        )

        results = []

        for row in search_result.rows():
            chat = row.fields.copy()

            # Extract message details
            messages_content = chat.get("messages.content", [])
            messages_timestamp = chat.get("messages.timestamp", [])
            messages_sender = chat.get("messages.sender", [])

            # Ensure all message lists are aligned
            if len(messages_content) == len(messages_timestamp) == len(messages_sender):
                chat["messages"] = [
                    {
                        "content": content,
                        "timestamp": timestamp,
                        "sender": sender
                    }
                    for content, timestamp, sender in zip(messages_content, messages_timestamp, messages_sender)
                ]
            else:
                logging.warning("Mismatch in message fields. Including only content.")
                chat["messages"] = [{"content": content} for content in messages_content]

            # Clean up redundant fields
            chat.pop("messages.content", None)
            chat.pop("messages.timestamp", None)
            chat.pop("messages.sender", None)

            results.append(chat)

        # Filter chats belonging to the specified user
        user_chats = [chat for chat in results if chat.get("user_id") == user_id]

        logging.info(f"Search returned {len(user_chats)} chats for user '{user_id}'.")

        return user_chats

    except Exception as e:
        logging.error(f"Error during search: {e}")
        return []

In this snippet, the search_chats function retrieves the chat conversations that match the user's input. When a user enters a search query, the function sends a search request to the FTS index in Couchbase. It specifies the fields to retrieve so that chat_id, user_id, name, and the message details are all included in the search results.

As the function processes each search result, it zips the message content, timestamps, and senders back into a single list of message objects. If the three lists differ in length, it logs a warning and falls back to returning only the content. Finally, it keeps only the chats that belong to the requesting user, so the frontend receives consistently structured data that is ready to display.
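The reshaping step can be tried on its own, without a Couchbase cluster. Below is a minimal sketch that pulls the same logic out of search_chats into a standalone helper. The `merge_message_fields` name and the sample data are assumptions made for illustration.

```python
import logging
from typing import Any, Dict


def merge_message_fields(fields: Dict[str, Any]) -> Dict[str, Any]:
    """Combine the flattened messages.* lists into one list of message dicts."""
    chat = dict(fields)  # copy so the caller's dict is left untouched
    contents = chat.pop("messages.content", [])
    timestamps = chat.pop("messages.timestamp", [])
    senders = chat.pop("messages.sender", [])

    if len(contents) == len(timestamps) == len(senders):
        chat["messages"] = [
            {"content": c, "timestamp": t, "sender": s}
            for c, t, s in zip(contents, timestamps, senders)
        ]
    else:
        # Lists are misaligned, so pairing them up would mix messages together.
        logging.warning("Mismatch in message fields. Including only content.")
        chat["messages"] = [{"content": c} for c in contents]
    return chat


sample = {
    "chat_id": "c1",
    "messages.content": ["Hi", "Hello!"],
    "messages.timestamp": ["10:00", "10:01"],
    "messages.sender": ["user", "agent"],
}
print(merge_message_fields(sample))
```

Falling back to content-only messages means a partially indexed chat still shows up in the results, rather than being dropped or shown with mismatched senders.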

The interaction between the AI agent and Couchbase ensures that all user interactions are stored securely and can be retrieved swiftly when needed. Whether it’s fetching chat histories or updating user preferences, Couchbase provides the scalability and performance required to handle a growing user base without compromising on speed or reliability.

When a user initiates a search from the frontend, the React application sends the query to the backend. The backend processes this request, queries Couchbase to fetch the relevant chats, and sends the results back to the frontend. This round trip gives users the matching conversations from their own chat history.
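As a rough, framework-agnostic sketch of the backend half of that round trip, the handler below validates the incoming parameters, calls a search function, and returns a JSON body. The `handle_search_request` name, the `q` parameter, and the `fake_search` stand-in are all assumptions. The actual project may wire this up differently in its web framework.

```python
import json
from typing import Callable, Dict, List


def handle_search_request(
    params: Dict[str, str],
    search_fn: Callable[[str, str], List[dict]],
) -> str:
    """Validate the query parameters, run the search, and return a JSON body."""
    user_id = params.get("user_id", "").strip()
    query = params.get("q", "").strip()
    if not user_id or not query:
        return json.dumps({"error": "user_id and q are required", "chats": []})
    chats = search_fn(user_id, query)
    return json.dumps({"chats": chats})


# Stand-in for search_chats so the sketch runs without a Couchbase cluster.
def fake_search(user_id: str, search_text: str) -> List[dict]:
    return [{"chat_id": "c1", "user_id": user_id, "name": f"Notes on {search_text}"}]


print(handle_search_request({"user_id": "u1", "q": "restaurants"}, fake_search))
```

Passing the search function in as an argument is a design choice made here so the handler can be tested with a stand-in. In the application, search_chats would be the function supplied.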

By combining the capabilities of the Browser Use library with the data management features of Couchbase, the application delivers a comprehensive AI agent experience. Users can manage their online tasks, engage in meaningful conversations with their agent, and rely on the system to handle complex operations behind the scenes.

This architecture not only provides immediate functionality but also lays a solid foundation for future enhancements. As the application evolves, the integration between the frontend, AI agent, and Couchbase ensures that it remains adaptable, scalable, and responsive to user needs.

Wrapping up

The next time you see AI agent services out in the wild and you feel the FOMO because your credit card just can’t quite take the monthly bill, remember that with just a bit of Python and a bit of JavaScript, you can have your own agent at your disposal.

Want to get started? Head on over to the project on GitHub, clone it to your computer, and follow the instructions in the README to give it a spin.

What time-saving tasks will you have your agent do for you? Join our active and growing Discord community to share what you’ve built!

- Start using the AI capabilities in Couchbase Capella today, for free

Ben Greenberg, Senior Developer Evangelist
