Hi,
I have the below query, which gives me the results I need. The problem is that the query runs for 6+ secs. My intention is to try and re-write the query to make it more efficient and use indexes where possible.
WITH available_cb_docs AS ( WITH available_daily_docs AS (
SELECT RAW td.DocNum
FROM tranData td
WHERE td.`$MdfdTmstmp` BETWEEN "{report_date}" and "{report_end_date}"
AND td.RteId = "{report_route}"
AND td.`$Type`="DailyReport"
UNION ALL
SELECT RAW mch.DocNum
FROM tranData td
UNNEST td.ActvCustOrdr AS mch
WHERE td.`$MdfdTmstmp` BETWEEN "{report_date}" and "{report_end_date}"
AND td.RteId = "{report_route}"
AND td.`$Type`="DailyReport"
)
SELECT DISTINCT raw DocNum
FROM tranData
WHERE DocNum IN (available_daily_docs)and `$MdfdTmstmp` between "{previous_date}" and "{report_end_date}") ,
available_daily_report_docs AS (
SELECT RAW td.DocNum
FROM tranData td
WHERE td.`$MdfdTmstmp` BETWEEN "{report_date}" AND "{report_end_date}"
AND td.RteId = "{report_route}"
AND td.`$Type`="DailyReport"
UNION ALL
SELECT RAW mch.DocNum
FROM tranData td
UNNEST td.ActvCustOrdr AS mch
WHERE td.`$MdfdTmstmp` BETWEEN "{report_date}" AND "{report_end_date}"
AND td.RteId = "{report_route}"
AND td.`$Type`="DailyReport"
)
SELECT array_symdiff(available_daily_report_docs, available_cb_docs) AS cust_order_missing_docs;
One of the things I want to do is replace the Union query with a Variable so it can be executed only one. The problem I run into is with the DocNum IN (available_daily_docs)
part. For whatever reason, whenever there is an IN clause, Analytics does not want to pick the index defined on DocNum. If I do DocNum = "123"
, the index is used and response is in milliseconds. But if I do DocNum in ["123", "456"]
, it takes 6 secs and does not use the index.
Any suggestion on optimizing this?