CBL 2.8 Queries based on Indices works very slow

Hello!
I am using following code for creating an index:

database.createIndex(
“PledgeTaskWithStatusIndex”,
IndexBuilder.valueIndex(
ValueIndexItem.property(“type”),
ValueIndexItem.property(PledgeObjectTask.IsApprovedByUserField),
ValueIndexItem.property(PledgeObjectTask.TaskGroupInfoField),
ValueIndexItem.property(PledgeObjectTask.PledgeTaskTypeField)
));

When I get new data, queries works very slow, but query.explain() shows that it uses the index.
After some time queries works OK.

What is the reason for it? Do indices need to be manually rebuilt as in CBL 1.4?
I cannot find any documentation regarding this matter

UPDATE: I found out that queries works fine after app restart.
I issue database.createIndex command at every application start.
Does it mean that when I got new data i have to issue database.createIndex again?
Is it so?

UPDATE 2: Confirmed!
I did pull replication and if I issue createIndex again then everything works fine, otherwise indices are not rebuilt automatically.
Is this OK?
I could not find any useful documentation regarding this matter even in sample apps?
How to create an application, based on CBL if its like black box, no infomation available.

CBL 1.4 is a very old version of Couchbase Lite. It is unsupported. Please look into updating to version 2.7.

In short my question is: Do we need to rebuild indices manually in CBL 2.8?
(I mentioned CBL 1.4 as an example, where we do it manually)

Ok… so you are using 2.8.3? On what platform?

I will get one of the Core developers to chime in on when indices are rebuilt.

You do not need to rebuild indexes unless of course you need to change the properties being indexed.

createIndex does nothing if an index with that name exists already. I have no idea why it would make a difference in your app to call it redundantly, except that if you’re running your tests with a fresh database every time, removing the createIndex call would mean you don’t have any index in that test run.

In CBL 2 (unlike 1.x) indexes are updated immediately as documents are added/modified/deleted. If queries are slow, it’s possible that new pages need to be read from storage, and afterwards the query speeds up because the pages are already cached.

Could you show use the complete output of the explain method?

The troubleshooting guide should be a helpful reference to understanding the query plan

2 Likes

Explain:

SELECT DISTINCT fl_result(fl_value(_doc.body, ‘taskGroupInfo.groupedRecordId’)) FROM kv_default AS _doc WHERE (NOT (fl_value(_doc.body, ‘type’) = fl_null() OR fl_value(_doc.body, ‘type’) IS NULL) AND ((fl_value(_doc.body, ‘type’) = ‘PledgeObjectTask’ AND fl_value(_doc.body, ‘IsApprovedByUser’) = fl_bool(0)) AND NOT (fl_value(_doc.body, ‘taskGroupInfo’) = fl_null() OR fl_value(_doc.body, ‘taskGroupInfo’) IS NULL))) AND (_doc.flags & 1 = 0) ORDER BY fl_value(_doc.body, ‘taskGroupInfo.groupedRecordId’) DESC

5|0|0| SEARCH TABLE kv_default AS _doc USING INDEX PledgeTaskWithStatusIndex (=? AND =?)
41|0|0| USE TEMP B-TREE FOR DISTINCT
42|0|0| USE TEMP B-TREE FOR ORDER BY

{“WHAT”:[[“.taskGroupInfo.groupedRecordId”]],“ORDER_BY”:[[“DESC”,[“.taskGroupInfo.groupedRecordId”]]],“WHERE”:[“AND”,[“NOT”,[“OR”,[“IS”,[“.type”],null],[“IS”,[“.type”],[“MISSING”]]]],[“AND”,[“AND”,[“=”,[“.type”],“PledgeObjectTask”],[“=”,[“.IsApprovedByUser”],false]],[“NOT”,[“OR”,[“IS”,[“.taskGroupInfo”],null],[“IS”,[“.taskGroupInfo”],[“MISSING”]]]]]],“DISTINCT”:true}

Query itself:

Query query = QueryBuilder
.selectDistinct(SelectResult.property(“taskGroupInfo.groupedRecordId”).as(“recordId”))
.from(DataSource.database(DatabaseManager.getDatabase()))
.where(
Expression.property(“type”).notNullOrMissing()
.and(Expression.property(“type”).equalTo(Expression.string(PledgeObjectTask.Type))
.and(Expression.property(PledgeObjectTask.IsApprovedByUserField)
.equalTo(Expression.booleanValue(!inProgress)))
.and(Expression.property(PledgeObjectTask.TaskGroupInfoField).notNullOrMissing()))
).orderBy(Ordering.property(“taskGroupInfo.groupedRecordId”).descending());

Index creation:

database.createIndex(
“PledgeTaskWithStatusIndex”,
IndexBuilder.valueIndex(
ValueIndexItem.property(“type”),
ValueIndexItem.property(PledgeObjectTask.IsApprovedByUserField),
ValueIndexItem.property(PledgeObjectTask.TaskGroupInfoField),
ValueIndexItem.property(PledgeObjectTask.PledgeTaskTypeField)
));

P.S.: It is not a test run, i am not creating new database every time

Yes. Latest CBL on Android

Yesterday I was looking at another developer’s somewhat similar query performance issue — where it’s slow the first time but fast after that. They noticed that the query plan shown by explain changes after the first time they run the query. This indicates that SQLite’s query optimizer has decided on a better query plan after looking at the statistics it keeps about the database.

Have you tried running Query.explain again, after the query starts being fast, to see if the query plan is different?

I did pull replication and if I issue createIndex again then everything works fine,

I found out that queries works fine after app restart.

That fits this theory. After creating an index, and also just before closing the database, CBL calls SQLite’s pragma optimize, which updates the internal statistics that drive the query optimizer.

So, I have three suggestions:

(1) Call Database.compact after the sync. This also runs pragma optimize, so it should have the same effect of optimizing the query.

(2) Clean up your query. The part where you check that .type is not null or missing is unnecessary, because you also require that .type is PledgeObjectTask. I don’t know if SQLite figures that out, or if it slows down the query.

(3) The index would work better, for this query, if the third term were .taskGroupInfo.groupedRecordId; that would let SQLite get the docs from the index in sorted order without having to use a separate sort (“USE TEMPB-TREE FOR ORDER BY”). It doesn’t need a fourth term. Again, that’s for this query. If you have other queries that you want to use this index, making that change could be bad for them. You’ll have to weigh the trade-offs, and of course you can create another index but that will slow down document updates…

1 Like