BigQuery user-defined aggregation function?
I know I can define a user-defined function (UDF) to do custom calculations. I also know that I can use the built-in aggregate functions to reduce a set of values to a single value when using a GROUP BY clause.
Is it possible to define a custom aggregate UDF for use with a GROUP BY clause?
1 answer
It turns out that this is possible (as long as each group being aggregated fits comfortably in memory) with a little "glue": the ARRAY_AGG function.
The steps are as follows:
- Create a UDF with an input parameter of type ARRAY<T>, where T is the type of the value you want to aggregate.
- Use the ARRAY_AGG function in the query with the GROUP BY clause to generate an array of T, and pass it to the UDF.
As a specific example:
CREATE TEMP FUNCTION aggregate_fruits(fruits ARRAY<STRING>)
RETURNS STRING
LANGUAGE js AS """
  return "my fruit bag contains these items: " + fruits.join(",");
""";
WITH fruits AS
  (SELECT "apple" AS fruit
   UNION ALL SELECT "pear" AS fruit
   UNION ALL SELECT "banana" AS fruit)
SELECT aggregate_fruits(ARRAY_AGG(fruit))
FROM fruits
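
The query above aggregates the whole table into a single row, so it never uses GROUP BY. Below is a minimal sketch of the same pattern with an actual GROUP BY. The owner column and the sample rows are hypothetical, added only for illustration. The ORDER BY and IGNORE NULLS modifiers inside ARRAY_AGG are optional additions of mine that make the array order deterministic and skip NULL values.

```sql
-- Same UDF as above: takes the whole group as an array and returns one value.
CREATE TEMP FUNCTION aggregate_fruits(fruits ARRAY<STRING>)
RETURNS STRING
LANGUAGE js AS """
  return "my fruit bag contains these items: " + fruits.join(",");
""";

-- Hypothetical sample data: the owner column is an assumption for illustration.
WITH fruits AS (
  SELECT "alice" AS owner, "apple" AS fruit
  UNION ALL SELECT "alice", "pear"
  UNION ALL SELECT "bob", "banana"
)
SELECT
  owner,
  -- ARRAY_AGG collects each group's values; the UDF reduces them to one value.
  aggregate_fruits(ARRAY_AGG(fruit IGNORE NULLS ORDER BY fruit)) AS bag
FROM fruits
GROUP BY owner;
```

Each group is materialized as a full array before the UDF sees it, so the memory caveat mentioned earlier applies per group.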