How you design and query the most selling products?

Hello,

I have learned to query the most selling products in N1QL Tutorial

https://query-tutorial.couchbase.com/tutorial/#40

but my question is, is the design good ? He group by all products on the query and it takes a lot of times, I know there is no KV or index in the tutorial examples but is this good ?

Or would you recommend when I have in my products document a sold field and then when every one buys anything I make it +1. So I dont have to group by products and join anything. Or is this bad what you think ?

The tutorial example is a “good” design following typical database design principles. IMO it would almost always be better to calculate aggregates than maintain them as fields as you suggest. But I concede that whether or not any design is the “right” design for a given scenario is always going to be up for debate.

If you maintain counts independently of the actual items then you have the potential for these to be out of sync, you create a concurrency-limiting bottleneck, add potentially unused processing overhead, may have application version issues, etc. Perhaps none of these are a problem for the simplest of scenarios but would likely be issues for most.

Furthermore, it is just an example. What if you want to know the total sales by product on Tuesdays? Are you going to maintain another count especially for that? With the statement given in the example you can simply add a filter if and when you need it.

What if you only need the total once a year? Is it a good design for every transaction to have the overhead of maintaining it throughout the year instead of a single calculation once? And if you need 27 (random example) varieties of aggregates for reporting purposes? How much in the way of aggregate calculation per transaction is appropriate? How long will it take to process an order that contains hundreds or thousands of different products?

What if your back-end reporting needs change after the application is deployed? Such changes in reporting may not be optional (they could be needed for things like regulatory compliance, for instance) but can be easily accommodated when calculating rather than maintaining aggregates.

As you acknowledge, the issue with the tutorial examples is the lack of a proper server back-end. The demo back-end is deliberately an emulation as the tutorial is only for introducing SQL++, not the database itself. It lacks any of the optimisations the server has to more efficiently process such statements. I’d urge you to try similar in the playground before trying to judge the efficiency/practicality of the statements. Reserve the tutorial for introducing the language and its concepts.

HTH.

1 Like

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.