In Part I, we covered the differences between indexing with local and global indexes. We concluded with the two storage options Couchbase Server provides for global secondary indexes: standard GSI and memory optimized GSI. Let's dive into memory optimized GSI now.
What is a Memory Optimized Global Secondary Index?
In the previous post covering global vs local indexes, we talked extensively about how global indexes reduce query latency compared to local indexes. We also saw how challenging it can be to maintain these global indexes: they require a subset of nodes to keep up with the mutations from a large cluster of nodes. In the case of Couchbase Server, 10-100K ops/sec is the norm for a cluster.
Challenges do not end there! Global indexing can be even more challenging in cases like array indexes (read more about array indexes here). Array indexes index the elements of an embedded array in a JSON document, so a single mutation to a document with an array is amplified into many index writes. The memory optimized global secondary index (MOI) is purpose built to solve the challenges of these most demanding applications: travel itineraries, scoreboards and fraud detection. No problem!
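To make the write amplification concrete, here is a minimal Python sketch. The document shape, the `stops` field and the `index_entries` helper are all made up for illustration, and the sketch assumes one index entry per distinct array element; it is not how the indexer is actually implemented.

```python
def index_entries(doc_id, doc, array_field):
    """Return one (element, doc_id) index entry per distinct array element.

    Hypothetical helper: it only illustrates how one document mutation
    fans out into several index writes.
    """
    return sorted((elem, doc_id) for elem in set(doc.get(array_field, [])))


# A made-up travel itinerary document with an embedded array.
itinerary = {"traveler": "ann", "stops": ["SFO", "JFK", "LHR", "IST"]}

# A single write to this document produces four index entries.
entries = index_entries("itinerary::1", itinerary, "stops")
print(len(entries))
```

The longer the array, the more index writes a single document mutation generates, which is why array indexes put so much pressure on the index nodes.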
MOI can provide over 10x better latency and throughput than standard GSI under high mutation rates, and here is how it does that:
- Lock-free processing allows massive concurrency when applying incoming mutations to the index.
- A skiplist structure optimizes in-memory storage, as opposed to the B+Tree structure used by disk-oriented indexes.
- Keeping the index entirely in memory means MOI is not bound by disk speeds when storing the index; instead it takes regular snapshots to disk for recovery only.
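For readers who have not met skiplists before, the sketch below shows the basic idea in Python: a sorted linked list with randomly chosen "express lanes" that let searches skip ahead. It is a plain single-threaded textbook skiplist, not the lock-free structure MOI uses, and the class and parameter names (`SkipList`, `MAX_LEVEL`, `P`) are my own.

```python
import random


class _Node:
    __slots__ = ("key", "forward")

    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level  # forward[i] is the next node at level i


class SkipList:
    MAX_LEVEL = 16  # illustrative cap on tower height
    P = 0.5         # chance of promoting a node one level up

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._head = _Node(None, self.MAX_LEVEL)
        self._level = 1

    def _random_level(self):
        level = 1
        while level < self.MAX_LEVEL and self._rng.random() < self.P:
            level += 1
        return level

    def insert(self, key):
        # Remember the last node visited on each level; those are the
        # nodes whose forward pointers must be spliced.
        update = [self._head] * self.MAX_LEVEL
        node = self._head
        for i in range(self._level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        level = self._random_level()
        self._level = max(self._level, level)
        new = _Node(key, level)
        for i in range(level):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def __contains__(self, key):
        node = self._head
        for i in range(self._level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key

    def __iter__(self):
        # Level 0 links every node in sorted order, so range scans are cheap.
        node = self._head.forward[0]
        while node is not None:
            yield node.key
            node = node.forward[0]


index = SkipList(seed=42)
for key in ["LHR", "SFO", "IST", "JFK"]:
    index.insert(key)
print(list(index))
```

Inserts only touch a handful of pointers and never rebalance or split pages, which is part of why skiplists suit in-memory, highly concurrent workloads.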
Memory Optimized vs. Standard Global Secondary Indexes
Memory optimized indexes were added in 4.5 as an additional storage option for GSIs. Standard global secondary indexes have been available since version 4.0. Administrators can configure GSI with either the standard GSI storage, which uses ForestDB underneath and suits indexes that cannot fit in memory, or the memory optimized GSI storage for faster in-memory indexing and queries.
Typically, indexes are created to lower query latencies, and keeping indexes in memory reduces latencies a great deal more! MOI is designed for the lowest latency and highest throughput needs, and it requires machines with enough memory to keep the whole index in RAM. Standard GSI can spill to disk when memory runs out, so IO subsystem performance becomes extremely important for standard GSI to perform well. Unlike standard GSI, MOI does not require a high performance IO subsystem. Because MOI runs at in-memory speeds, initial and ongoing indexing times are faster with MOI than with standard GSI.
The following table summarizes the major differences between standard and memory optimized GSIs:
Creating and Managing Memory Optimized GSIs
Regardless of the storage type, CREATE INDEX is the way to create global secondary indexes in Couchbase Server. In fact, there are no MOI-specific options in the CREATE INDEX statement. In general, high-availability and partitioning mechanics stay the same between standard and memory optimized GSI. However, it is important to note that MOI comes with additional stats and alerts to help with placing and managing indexes.
Placing Memory Optimized Indexes in Couchbase Server Cluster
Memory optimized indexes provide two important stats that can guide the placement of MOI with the nodes option on CREATE INDEX.
- MAX Index RAM Used %: Reports the maximum percentage of the index RAM quota in use across the cluster and on each node, both in real time and with a history over minutes, hours, days, weeks and more.
- Remaining Index RAM: Reports the free index RAM quota for the cluster as a total and on each node, both in real time and with a history over minutes, hours, days, weeks and more.
When placing the next memory optimized index, you can look at the memory available on each node and place your index based on its expected size in memory.
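The placement decision can be sketched as a small Python helper. Everything here is illustrative: the node names and numbers are invented, `pick_node` and `create_index_statement` are hypothetical helpers, and the statement assumes the target node is passed as a `nodes` entry in a WITH clause, so check the N1QL reference for your version before relying on the exact syntax.

```python
import json


def pick_node(remaining_ram_mb, index_size_mb):
    """Pick the node with the most Remaining Index RAM that still fits the index.

    remaining_ram_mb maps node address -> free index RAM quota in MB,
    as read off the Remaining Index RAM stat. Returns None if no node fits.
    """
    candidates = {n: free for n, free in remaining_ram_mb.items()
                  if free >= index_size_mb}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


def create_index_statement(name, bucket, fields, node):
    """Build a CREATE INDEX statement pinned to one node (assumed syntax)."""
    with_clause = json.dumps({"nodes": [node]})
    return "CREATE INDEX `{}` ON `{}`({}) WITH {};".format(
        name, bucket, ", ".join(fields), with_clause)


# Made-up readings of Remaining Index RAM per index node, in MB.
remaining = {"10.0.0.1:8091": 512, "10.0.0.2:8091": 4096, "10.0.0.3:8091": 2048}

node = pick_node(remaining, index_size_mb=1500)
statement = create_index_statement("idx_stops", "travel", ["stops"], node)
print(statement)
```

Choosing the node with the most headroom is just one simple policy; you may prefer to balance by query load or keep related indexes together.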
Alerts with Memory Optimized Indexes
Running out of memory pauses indexing with MOI, so it is important for administrators to be able to see when a node is approaching its RAM quota. The MAX Index RAM Used % stat (discussed above) is built exactly for that. There is also an alert that notifies the interactive user in the Web Console or notifies admins through email. The alert, “approaching full indexer RAM quota”, fires if over 75% of the Indexer RAM Quota is in use on any node in the cluster. You can configure the alerts in the Web Console under Settings.
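If you want to run the same check in your own monitoring scripts, the rule is simple arithmetic. The sketch below is a hypothetical Python version: the node names and numbers are invented, and how you collect used and quota values per node is left out.

```python
ALERT_THRESHOLD_PCT = 75.0  # threshold quoted above for the built-in alert


def ram_used_pct(used_mb, quota_mb):
    """Percentage of the indexer RAM quota currently in use."""
    if quota_mb <= 0:
        raise ValueError("quota_mb must be positive")
    return 100.0 * used_mb / quota_mb


def nodes_over_threshold(usage, threshold_pct=ALERT_THRESHOLD_PCT):
    """Return the nodes whose index RAM usage exceeds the threshold.

    usage maps node name -> (used_mb, quota_mb).
    """
    return sorted(node for node, (used, quota) in usage.items()
                  if ram_used_pct(used, quota) > threshold_pct)


# Made-up readings: node-a is at about 73%, node-b at about 78%.
usage = {"node-a": (3000, 4096), "node-b": (3200, 4096)}
alerting = nodes_over_threshold(usage)
print(alerting)
```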
The technical documentation provides more detailed information on MOI and GSI in general. You can read more about how to select the storage mode for GSI and how to administer the Index service and GSIs in the admin guide here, and find the architecture guide for the Index service and indexers here.
In Part III of the series, we will talk about the new Circular Write Mode for standard global secondary indexes and how it improves IO performance when indexing data in Couchbase Server 4.5.
Happy hacking
-Cihan