On April 5, 2022, the US Patent and Trademark Office granted a second patent to Couchbase for its novel approach to optimizing document-oriented database queries on arrays! This feature has been available since Couchbase Server 7.1 and Couchbase Capella 7.0, and the patent recognizes our innovation in cost-based optimization for document-oriented databases.
Query optimization has been an active field in relational data systems since the 1970s. Consistent with our record of bringing innovations to market, Couchbase has been recognized for deeply technical work in bringing query optimization to semi-structured data in JSON format. The Couchbase engineering team has been at the forefront of improving document database performance for the past decade. Our engineers’ commitment to excellence is the reason some of the largest enterprises in the world now trust Couchbase for their mission-critical applications. As part of this commitment, we have recently patented a novel approach to cost-based optimization (CBO) for document-oriented database queries on arrays. Couchbase engineering continues to bring the power of cost-based optimization to NoSQL, and this patent grant recognizes that work.
We congratulate Bingjie Miao, Keshav Murthy, Marco Greco, and Prathibha Bisarahalli for their continued impressive work in the field of cost-based optimization!
This post will dive into cost-based optimization (CBO), why it matters, and why CBO for queries to document databases is unique to Couchbase.
What is Cost-based Optimization?
Cost-based optimization (or CBO) is a process for selecting the most efficient way to execute a database query by considering the cost of memory, CPU, network transport, and disk usage. CBO compares the cost of alternative query routes and then selects the query-execution plan with the least cost.
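To make the idea concrete, here is a minimal Python sketch of that comparison. It is an illustration only, not Couchbase's optimizer: the plan names, the cost figures, and the simple weighted sum are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanCost:
    """Estimated resource usage for one candidate plan (hypothetical units)."""
    name: str
    cpu: float
    memory: float
    network: float
    disk: float

    def total(self) -> float:
        # Assumption: every resource is weighted equally. A real optimizer
        # would use a calibrated cost model instead of a plain sum.
        return self.cpu + self.memory + self.network + self.disk


def choose_plan(plans):
    """Return the candidate plan with the lowest estimated total cost."""
    return min(plans, key=lambda plan: plan.total())


# Hypothetical candidates for a single query.
candidates = [
    PlanCost("primary_scan", cpu=50.0, memory=20.0, network=5.0, disk=200.0),
    PlanCost("secondary_index_scan", cpu=5.0, memory=2.0, network=1.0, disk=10.0),
    PlanCost("covering_index_scan", cpu=3.0, memory=2.0, network=1.0, disk=4.0),
]

best = choose_plan(candidates)
print(best.name, best.total())
```

The selection step is just "take the minimum". The hard part in practice is producing trustworthy cost estimates for each plan.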
Keshav Murthy, our VP of Engineering and one of the patent authors, uses the following analogy to explain what CBO is:
One way to grasp CBO is to consider an airplane flight plan: a plane can take any number of paths to go from San Francisco to São Paulo, but there are only a few optimal paths when you consider fuel costs, wind resistance, air traffic, etc. Similarly, a database query needs a query plan. There are many ways to run the query, but only a few optimal plans.
One way to select a query path is rule-based optimization (RBO), which makes query path decisions based on fixed rules (e.g., always prefer indexes with the most keys). However, RBO can get messy and inefficient very quickly because the rules ignore the actual data, and it rarely yields the optimal query path. In the NoSQL database world, most databases still rely on rule-based optimization.
Cost-based optimization takes a user-submitted query, evaluates the candidate query plans (potentially millions of them), and uses statistics about the data to choose the most performant and resource-efficient plan for query execution.
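The difference between the two approaches can be sketched in a few lines of Python. This is a toy example under stated assumptions, not how Couchbase implements either optimizer: the index names, key counts, and selectivity values are hypothetical, and "cost" is reduced to the estimated number of rows an index scan would return.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexStats:
    """Hypothetical statistics for one index that could serve a query."""
    name: str
    key_count: int      # number of keys in the index definition
    selectivity: float  # estimated fraction of documents matching the predicate


TOTAL_DOCS = 1_000_000  # assumed collection size


def rule_based_choice(indexes):
    """Static rule from the text: always prefer the index with the most keys."""
    return max(indexes, key=lambda ix: ix.key_count)


def estimated_rows(ix, total_docs=TOTAL_DOCS):
    """Rows the scan is expected to return, derived from statistics."""
    return ix.selectivity * total_docs


def cost_based_choice(indexes):
    """Prefer the index expected to return the fewest rows."""
    return min(indexes, key=estimated_rows)


indexes = [
    IndexStats("idx_city_state_country", key_count=3, selectivity=0.20),
    IndexStats("idx_customer_id", key_count=1, selectivity=0.0001),
]

print("rule-based:", rule_based_choice(indexes).name)
print("cost-based:", cost_based_choice(indexes).name)
```

With these made-up numbers, the rule picks the three-key index even though it matches a fifth of the documents, while the statistics-driven choice picks the single-key index that matches far fewer.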
Why does Cost-based Optimization matter?
With CBO, queries use less memory, less disk, less IO, fewer partitions, and less overflow, which leads to lower latency and lower cost for users. This is particularly meaningful for databases that handle a large number of transactions, where even minor performance improvements can have a significant impact.
Keshav Murthy went on to explain why CBO matters with another analogy:
When it matters — like getting to your kid’s recital or a ballgame on time — would you use a static direction map that doesn’t account for the traffic? Google Maps’ route optimizer will optimize for time. The optimizers develop a plan to execute the query with the least resources: CPU and memory. Knowing this, why would you accept a static rule (or query shape!) based optimization of your business-critical database workload?
The database query optimizer makes decisions. These decisions have major implications for query performance, system throughput, and your ability to meet your SLAs. A database with a better optimizer makes it easier to develop and manage applications and to meet those SLAs.
How CBO for document-oriented database queries on arrays is unique to Couchbase
Cost-based optimization (CBO) for SQL has existed for more than 40 years and has been critical to the success of RDBMS and developer productivity. However, CBO was not generally available for document database queries until Couchbase implemented CBO for SQL++ (formerly known as N1QL) with the Couchbase Server 6.5 release in 2019. Since then, our customers have enjoyed the performance benefits of CBO for their queries, which is particularly important for the many customers who rely on the high performance of Couchbase to power their most mission-critical applications.
The patent grant represents a technical commitment from Couchbase to deliver the best elements of SQL for our NoSQL database platform. And with the recent patent grant, Couchbase is the only document database provider that intelligently executes cost-based optimization for NoSQL database queries, which has enormous implications for performance and cost. Before deciding on a NoSQL database, ask your vendor: Do you have a cost-based optimizer?
Congratulations to our engineering team for their continued hard work to evolve the standard of excellence for document databases.
Learn More about Cost-Based Optimization for Couchbase!
Watch a short video or read the documentation for an overview of Cost-Based Optimization in N1QL.
For a deep dive into CBO for SQL++, I recommend reading the following blog posts by Keshav Murthy, our VP of Engineering:
Thank you for reading!