Back in Couchbase 4.0, N1QL was introduced as a way to give developers more flexibility when querying their NoSQL data stored within Couchbase Server. You could run SQL queries against JSON data and everything would be amazing. However, with N1QL as good as it is, there were features left to be desired. Fast forward to Couchbase 5.0, many of these features are now a reality. Take for example the ability to use subqueries within N1QL queries. This feature has already existed, but Couchbase Server 5.0 brings subquery expressions, where subqueries can be used in generic expressions.
At the time of writing this, January 2017, Couchbase 5.0 is available through a developer build. While the features are coming, they are not yet in a production ready build. More information about developer builds can be found in this blog announcement on the subject.
Let’s take a look what sub-querying over a nested document means. For the basis of this example we’re going to work with the following two documents:
The document below will have a document key of order::1 and acts as a receipt for a person:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
{ "type": "order", "customerId": "nraboy", "order_number": "12345", "products": [ { "productId": "pokemon-red", "quantity": 1, "price": 39.99 }, { "productId": "pokemon-yellow", "quantity": 1, "price": 39.99 } ] } |
The document below will have a document key of order::2 and will act as a receipt for a different person:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
{ "type": "order", "customerId": "mgroves", "order_number": "34532", "products": [ { "productId": "pokemon-blue", "quantity": "1", "price": 39.99 }, { "productId": "zelda-ocarina-time", "quantity": 2, "price": 59.99 } ] } |
The important thing to note and the basis behind this example is the fact that we have nested objects within an array making the document complex. The complexity is what we are after.
So now let’s say we wanted to get the total that each person spent on video games. Typically this would be done using a subquery with some aggregate function. The following is possible in Couchbase Server 5.0:
1 2 3 4 |
SELECT customerId, (SELECT RAW SUM(products.price * products.quantity) FROM default.products AS products)[0] AS total FROM default; |
Of course subqueries are not limited to just aggregation. You can use on nested data as part of an EXISTS
condition, MERGE
, or anything else. Let’s say we wanted to figure out if pokemon-blue exists on anyone’s receipt. We could do something like this to query for that data:
1 2 3 4 |
SELECT customerId, order_number FROM default WHERE "pokemon-blue" IN (SELECT RAW products.productId FROM default.products AS products); |
However, just because you can use subqueries, doesn’t mean you should in all scenarios as there are other ways that might yield better performance. Instead of using a subquery in the above example, maybe try the following instead:
1 2 3 4 |
SELECT customerId, order_number FROM default WHERE SOME product IN default.products SATISFIES product.productId = "pokemon-blue" END; |
Both queries accomplish the same task, but one will give you better performance than the other. It really comes down to what your needs are as a result and then you can define a query that matches.
Let’s see another example of where subquery expressions might work. What if we wanted to use them in a LET
clause like the following:
1 2 3 4 5 |
SELECT customerId, SUM(total) FROM default LET total = (SELECT RAW SUM(products.price * products.quantity) FROM default.products)[0] GROUP BY customerId; |
Assuming we had more than one document for a customer, the totals would be aggregated and we would see the total purchase history for that particular customer. We use LET
versus LETTING
because we want the values to be determined before trying to group and aggregate them.
Other scenarios where subquery expressions might make sense are in the MERGE
and LETTING
clauses.
For more information on using N1QL with Couchbase, visit the Couchbase Developer Portal.