I’m using Couchbase to store blacklists of disallowed passwords, with sizes up to ~15 MB.
Each list contains a JSON array that I’ve indexed to allow fast “any-in-satisfies” lookups.
If I check whether some password is contained in some blacklist, the query finishes after ~5 ms when I execute it via the web UI or cbq. When I execute the same query via the SDK, it takes more than 1000 ms!
Reproducing this issue:
I’m running a new Couchbase EE 6.5 instance.
The dataset can be downloaded here
You should be able to import it like this:
/opt/couchbase/bin/cbimport json -u <user> -p <password> -b <bucket-name> -f lines -c http://localhost:8091 --generate-key %_id_% -d file://blacklists
I’ve created the following index to allow a quick search for passwords. No primary index exists.
CREATE INDEX blacklisted_passwords_idx ON `blacklist`(DISTINCT ARRAY LOWER(pw) FOR pw IN passwords END) WHERE type="password-blacklist"
The following query completes within a few milliseconds when run from the web UI or cbq:
SELECT RAW COUNT(*) FROM `blacklist` USE INDEX (blacklisted_passwords_idx) WHERE ANY pw IN passwords SATISFIES LOWER(pw) = LOWER("p4ssword") END AND type="password-blacklist"
When I execute the query via the SDK (Scala in my case), it takes ~1200 ms to complete!
import com.couchbase.client.scala.Cluster
import com.couchbase.client.scala.query.{QueryOptions, QueryParameters}

object TestApp extends App {
  val cluster = Cluster.connect(
    "127.0.0.1",
    "",
    ""
  ).get

  val bucket = this.cluster.bucket("blacklist")
  val collection = this.bucket.defaultCollection

  val n1ql = s"""SELECT RAW COUNT(*) FROM `blacklist`
                |USE INDEX (blacklisted_passwords_idx)
                |WHERE ANY pw IN passwords SATISFIES LOWER(pw) = LOWER($$password) END
                |  AND type="password-blacklist"
                |""".stripMargin

  var start = 0L
  for (i <- 0 until 10) {
    println("Starting…")
    start = System.currentTimeMillis()
    val result = cluster.query(n1ql, QueryOptions().parameters(QueryParameters.Named("password" -> "p4ssword")).metrics(true)).get
    println(s"Took ${System.currentTimeMillis() - start} ms")
    println("executionTime: " + result.metaData.metrics.get.executionTime.toMillis + " ms")
    println("elapsedTime: " + result.metaData.metrics.get.elapsedTime.toMillis + " ms")
  }
}
This yields the following results:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See SLF4J Error Codes for further details.
Starting…
Took 3015 ms
executionTime: 1273 ms
elapsedTime: 1273 ms
Starting…
Took 1353 ms
executionTime: 1350 ms
elapsedTime: 1350 ms
Starting…
Took 1194 ms
executionTime: 1189 ms
elapsedTime: 1189 ms
Starting…
Took 1267 ms
executionTime: 1263 ms
elapsedTime: 1263 ms
Starting…
Took 1248 ms
executionTime: 1245 ms
elapsedTime: 1245 ms
…
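In case the measurement itself is in question: the "Took" value is plain wall-clock time taken on the client around the query call, while executionTime and elapsedTime are reported by the server in the query metrics. Below is a minimal standalone sketch of the client-side timing pattern only. The `timed` helper and the `TimingSketch` object are hypothetical names I made up for illustration, and a short sleep stands in for the `cluster.query(...)` call, so it runs without a Couchbase cluster.

```scala
object TimingSketch {
  // Hypothetical helper: evaluates `block` once and returns its result
  // together with the elapsed wall-clock time in milliseconds.
  def timed[A](block: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = block
    val elapsedMs = (System.nanoTime() - start) / 1000000L
    (result, elapsedMs)
  }

  def main(args: Array[String]): Unit = {
    for (i <- 0 until 3) {
      // Stand-in workload; in the real test this is the cluster.query(...) call.
      val (value, ms) = timed { Thread.sleep(20); i * 2 }
      println(s"Run $i returned $value, took $ms ms")
    }
  }
}
```

Comparing this client-side number with the server-reported executionTime separates network and SDK overhead from time spent in the query service. In the results above the two are nearly identical after the first iteration, so the time appears to be spent on the server side.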