Creating a Domain using derived tables is a huge problem for those of use using BigQuery for a back end. Because BigQuery charges users based on the amount of processing that must be done, is it getting very expensive to use JasperSoft with BigQuery, in the realm of 2 or 3 orders of magnitude increase in BigQuery cost.
Big Query allows us to partition our data (it is practically the only performance tool that we have available... there are no indexes!), but that means that any query on that partitioned table requires the partition field to be included as a filter.
Example:
I have a table partitioned by event_timestamp
, and is therefore required. Any query on this table must be filtered by event_timestamp
, even if the filter is WHERE event_timestamp IS NOT NULL
. Because any date range is possible, I cannot further restrict the derived table SQL.
Now, assume that an Ad Hoc report is created, and I filter by today's date. The way that the report sql is genereated, Big Query will first need to pull all rows from every partition, and then it will filter for the event_timestamp
that I'm actually looking for. If, instead, I were able to parameterize the derived table, then the parameter (event_timestamp
) could be required (rather an using the IS NOT NULL
filter shown earlier), the end result being that only the intended partition will be searched.
Alternative example with SQL:
Given this derived table definition, where event_timestamp
is the partition field:
SELECT id, event_timestamp FROM example WHERE event_timestamp IS NOT NULL
Currently, an Ad Hoc report's query would look something like this:
SELECT derived_table.
id, derived_table.
event_timestamp
FROM (
SELECT id, event_timestamp FROM example WHERE event_timestamp IS NOT NULL) derived_table
WHERE event_timestamp = "2021-09-28"
It would be much better to be in this form:
SELECT id, event_timestamp
FROM example
WHERE event_timestamp = "2021-09-28"
Components | Domains |
Thanks for bringing this to our notice. We will get back to you shortly.