Hi,
[1]
I have around 300 million documents in my test system, similar to below
{
“ownerId” : 1,
“playerId” : 500,
“gems” : 750
}
There are about 50 owners, with the top 4 owners having around 30 million documents each. The rest are around the 1-5 million document mark per ownerId.
For a specific ownerId I am looking for playerId that either has an exact amount of gems or a range of gems.
SELECT playerId FROM gemshop WHERE ownerId = 1 AND gems = 750
SELECT playerId FROM gemshop WHERE ownerId = 1 AND gems > 700
To speed this up I created covering indexes, a generic to cover all and another for the large owners
CREATE INDEX idx_owner ON gemshop (ownerId, gems, playerId)
CREATE INDEX idx_owner_1 ON gemshop (ownerId, gems, playerId) WHERE ownerId = 1
These are standard GSI index, not memory. The machines where the indexes ran have 128GB of memory and the indexes are around 10-15GB each (indexes are created on all index nodes).
When I run my queries, the results are a little inconsistent. When I run it the first time for a range value (>750) it takes a few seconds to a minute to complete. If I run the query again it completes in less than a second. If I change the value to search on (for example to < 900), it takes long on the first search and then fast on subsequent re-searches for that value.
At the time of the searching there are no changes being done to the documents.
Is this expected that the times are so volatile? Is there anything else I can do to make the search time a little more deterministic (without changing to MOI)? Would prepared statements help?
[2]
Any other ideas on what I could do improve the doc to aid the searching? I was thinking of adding a bucketing value to the document, that will save a “rolled down to the nearest 500” value.
{
“ownerId” : 1,
“playerId” : 500,
“gems” : 750,
“atLeast” : 500
}
That way my query might be more selective
CREATE INDEX idx_owner ON gemshop (ownerId, atLeast, gems, playerId)
SELECT playerId FROM gemshop WHERE ownerId = 1 AND gems = 750 AND atLeast = 500
SELECT playerId FROM gemshop WHERE ownerId = 1 AND gems > 700 AND atLeast = 500
Apologies if the JSON or queries have syntax errors, this was typed in the browser.