Could Someone Give Me Advice for Optimizing Couchbase Queries in a High-Load Environment?

Hello there, :wave:

I am new to Couchbase and currently working on a project where we need to optimize query performance in a high-load production environment. We are using Couchbase Server 7.x and have a cluster of five nodes with both KV and Query services enabled.

Our application processes real-time data from multiple sources, and we have observed that some of our N1QL queries are slower than expected during peak load times.

What are the best practices for designing secondary indexes to minimize latency? How do you decide between covering indexes and non-covering indexes for frequently used queries? :thinking:

Are there any recommended settings for memory or resource allocation specifically for the query service? How can we identify and resolve bottlenecks caused by query execution plans? :thinking:

Also, I have gone through this post: https://www.couchbase.com/forums/t/how-to-optimize-data-replication-performance-in-couchbase-server-salesforce-commerce-cloud, which definitely helped me out a lot.

Does anyone have experience using the Couchbase Query Workbench or Analytics Service for this kind of workload? :thinking:

Thanks in advance for your help and assistance. :innocent:

You haven’t stated your version precisely (there are differences between 7.0 and 7.6, for example) or your edition, but since you note XDCR I assume you’re using Enterprise Edition (EE). Community Edition (CE) is more constrained than EE, naturally.

See:

https://developer.couchbase.com/learn/n1ql-query-performance-guide

https://developer.couchbase.com/tutorial-tuning-tips-and-advice?learningPath=learn/n1ql-query-performance-guide

Generally, try to ensure your queries can be covered, and if possible that any ordering can be pushed down to the index. Keep the number of indices reasonable, or provide index hints: if there are a great many indices to choose from, the time spent planning grows. You may also experiment with join-type hints, though if you have the cost-based optimizer (CBO) enabled and fully up-to-date statistics, the optimizer should be able to pick the best plan.
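As a quick sketch (the keyspace, index, and field names here are invented for illustration), a covering index with a matching key order lets both the filtering and the ordering be served entirely by the index:

```sql
-- Every field the query references is an index key, so the query is
-- covered (no document fetch), and the ORDER BY matches the key order.
CREATE INDEX ix_events_source_ts ON events(source, ts, status);

SELECT e.ts, e.status
FROM events e
WHERE e.source = "sensor-a" AND e.ts > "2024-01-01"
ORDER BY e.ts;   -- pushed down: with source fixed, the index emits this order

-- If many candidate indices are slowing down planning, hint one explicitly:
SELECT e.ts
FROM events e USE INDEX (ix_events_source_ts)
WHERE e.source = "sensor-a";
```

You can verify coverage with EXPLAIN: a covered plan shows the IndexScan marked as covering, with no Fetch operator.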

Standard indexing practice is recommended: typically order keys from least to most selective (i.e. continent before country before province/state before city…). Where queries will always make use of the results of a function on a field, index that function if possible (e.g. LOWER(name)).
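To illustrate both points (keyspace and field names are assumed, not from your schema):

```sql
-- Key order follows the advice above: continent → country → city.
CREATE INDEX ix_geo ON places(continent, country, city);

-- Functional index, so queries filtering on LOWER(name) can use it:
CREATE INDEX ix_lower_name ON places(LOWER(name));

-- This predicate matches the indexed expression exactly:
SELECT p.* FROM places p WHERE LOWER(p.name) = "montreal";
```

Note the predicate must use the same expression as the index definition (LOWER(name) here) for the functional index to apply.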

Avoid LIKE predicates with a leading wild-card as these cannot offer index filtering.
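For example (assuming a suitable index on LOWER(name) exists):

```sql
-- Leading wild-card: no usable prefix, so the index cannot filter. Avoid.
SELECT p.name FROM places p WHERE LOWER(p.name) LIKE "%ville";

-- Trailing wild-card: the fixed prefix supports an index range scan. OK.
SELECT p.name FROM places p WHERE LOWER(p.name) LIKE "mont%";
```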

Make use of prepared statements to avoid having to parse & plan queries repeatedly, but be mindful that prepared queries with parameters can’t take advantage of partial indices. You can mix-n-match so if you have a common filter that’ll always be a particular value and that value is used for partial indexing, supply that as a literal in your prepared statement alongside parameters for the variable conditions. This may mean preparing multiple statements with different literal filters. e.g. ... WHERE eu = false AND id = $id ... plus ... WHERE eu = true AND id = $id ....
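A sketch of that mix-and-match approach, using invented keyspace/field names (`users`, `eu`, `id`):

```sql
-- Partial indices split on the constant filter value:
CREATE INDEX ix_eu_true  ON users(id) WHERE eu = true;
CREATE INDEX ix_eu_false ON users(id) WHERE eu = false;

-- One prepared statement per literal; $id stays a parameter. The literal
-- eu predicate lets the planner match the corresponding partial index.
PREPARE p_eu_true  FROM SELECT u.* FROM users u WHERE u.eu = true  AND u.id = $id;
PREPARE p_eu_false FROM SELECT u.* FROM users u WHERE u.eu = false AND u.id = $id;

-- At runtime, pick the statement and supply $id as a named parameter,
-- e.g. in cbq:  \SET -$id "u-123";  EXECUTE p_eu_true;
```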

Avoid queries with a large number of full document fetches, particularly if the documents are large (e.g. several MiB). Retrieving full documents through the Query service means two “hops” for the data. Where many documents (or large ones) are to be retrieved, a covered query to fetch the document keys followed by direct KV access to retrieve those documents is often more efficient, simply because the extra hop for the bulk of the data is eliminated.
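The two-step pattern looks like this (keyspace and index names are made up):

```sql
-- Step 1: covered query returns only the keys, served from the index
-- (assumes an index such as: CREATE INDEX ix_status ON orders(status)).
SELECT RAW META(o).id
FROM orders o
WHERE o.status = "pending";

-- Step 2: feed the returned keys to plain KV gets in your SDK
-- (e.g. collection.get(key)), so the document bodies travel directly
-- from the Data service to the application, skipping the Query hop.
```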

Memory depends on what you have available and what your queries need; the more you’re able to run covered queries, the less memory the Query service will need. If you have to have an ORDER BY processed in Query and you’re on 7.6+, consider whether you have enough memory for the Query service to avoid spilling sorts to disk (and if so, disable that feature). (Pre-7.6, if the Query service performs ordering it is done only in memory.) In-memory sorting can require a lot of memory if the result set is large. (For optimum performance, ensure ordering can always be pushed down to an index, i.e. be pre-computed.)
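To illustrate that last point (names assumed): an index whose key order matches the query’s ORDER BY means the sort is pre-computed and consumes no Query-service sort memory:

```sql
-- Descending key matches the query's ORDER BY direction.
CREATE INDEX ix_logs_level_ts ON logs(level, ts DESC);

-- With `level` fixed by an equality predicate, the index already yields
-- rows in ts-descending order, so no Query-side sort is performed.
SELECT l.ts, l.msg
FROM logs l
WHERE l.level = "error"
ORDER BY l.ts DESC;
```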

(You can constrain the amount of memory a request is allowed to use by using a memory quota - but this is intended to prevent “bad” queries affecting the rest of the workload and not to adjust relative performance of queries.)
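The quota can be set per request; a cbq-shell sketch (the 128 MB value is arbitrary):

```sql
-- cbq shell: cap each subsequent request in this session at ~128 MB.
-- A request exceeding its quota is stopped rather than starving the
-- rest of the workload; this is a guard rail, not a tuning knob.
\SET -memory_quota 128;
```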

If you’re investigating the performance of a Query, be sure to gather a profile with timings (rather than just a plain EXPLAIN plan). That way you can see not only the plan but timing and counts of documents flowing through each operator. Make use of completed requests to gather instances where the runtime is abnormal. (Manage and Monitor Queries | Couchbase Docs)
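Gathering a profile from the cbq shell looks like this (the SELECT is a placeholder for the query under investigation):

```sql
-- Request per-operator timings alongside the results
-- (values: "off", "phases", "timings").
\SET -profile "timings";

SELECT t.* FROM travel t LIMIT 10;   -- your slow query here

-- The response now includes a `profile` section with per-operator
-- execution timings and item counts to inspect.
```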

A few points on reading a profile:

  • Typically, ensure the right (or best) index is being used and that the index scan produces as few items as possible.

  • Be primarily concerned with the “execTime” (time spent in Query) and “servTime” (time spent in dependent services, e.g. the Index service performing the actual index scan), and with item counts.

    • A large number of items flowing out of a FETCH and into a FILTER for only a few to flow out of the FILTER suggests there could be better indexing reducing the need to fetch documents that will simply be excluded by the filtering.
    • It could also suggest a revision to the query - e.g. not using leading wild-card LIKE patterns.
  • A high “execTime” in the STREAM operator indicates time sending results to the client and may indicate a stalled client (i.e. look at the application and not the server).

etc.
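For orientation, a single operator entry in the profile output looks roughly like the fragment below (shape from memory, values invented) — the item counts and times discussed above appear per operator:

```json
{
  "#operator": "Filter",
  "#stats": {
    "#itemsIn": 120000,
    "#itemsOut": 42,
    "execTime": "1.8s"
  }
}
```

A Filter with 120,000 items in and 42 out, as here, is exactly the fetch-then-discard pattern that better indexing would eliminate.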

HTH.

