BigQuery user-defined aggregation function?

I know I can define a custom function to do custom calculations. I also know that I can use the built-in aggregate functions to reduce a set of values to a single value when using a GROUP BY clause.

Is it possible to define a custom aggregate UDF for use with a GROUP BY clause?



1 answer


It turns out that this is possible (as long as each group being aggregated has a reasonable size in memory) with a little "glue", namely the ARRAY_AGG function.

The steps are as follows:

  1. Create a UDF with an input parameter of type ARRAY<T>, where T is the type of the value you want to aggregate.
  2. Use the ARRAY_AGG function in a query with a GROUP BY clause to generate an array of T and pass it to the UDF.


As a specific example:

CREATE TEMP FUNCTION aggregate_fruits(fruits ARRAY<STRING>)
RETURNS STRING
LANGUAGE js AS """
  // fruits arrives as a JavaScript array of strings
  return "my fruit bag contains these items: " + fruits.join(",");
""";

WITH fruits AS
(SELECT "apple" AS fruit
UNION ALL SELECT "pear" AS fruit
UNION ALL SELECT "banana" AS fruit)

-- ARRAY_AGG collects the rows into one array, which the UDF reduces to a single value
SELECT aggregate_fruits(ARRAY_AGG(fruit)) AS fruit_bag
FROM fruits;
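
The example above aggregates the whole table as one group. Here is a minimal sketch of the same pattern with an actual GROUP BY, computing a per-group median. The function name median_of, the sales table, and its price column are made up for illustration; FLOAT64 is used because it maps directly to a JavaScript number.

```sql
-- Hypothetical UDF: reduces an array of numbers to its median.
CREATE TEMP FUNCTION median_of(vals ARRAY<FLOAT64>)
RETURNS FLOAT64
LANGUAGE js AS """
  if (vals === null || vals.length === 0) return null;
  // Copy before sorting so the input array is left untouched.
  var sorted = vals.slice().sort(function (a, b) { return a - b; });
  var mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
""";

-- Made-up sample data: several prices per fruit.
WITH sales AS (
  SELECT "apple" AS fruit, 1.0 AS price
  UNION ALL SELECT "apple", 3.0
  UNION ALL SELECT "apple", 10.0
  UNION ALL SELECT "pear", 2.0
  UNION ALL SELECT "pear", 6.0
)

-- ARRAY_AGG builds one array per group; the UDF turns each array into one value.
SELECT
  fruit,
  median_of(ARRAY_AGG(price IGNORE NULLS)) AS median_price
FROM sales
GROUP BY fruit
ORDER BY fruit;
-- apple -> 3.0, pear -> 4.0
```

Each group's array is materialized before the UDF runs, so the memory caveat mentioned above applies per group.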

      
