Is it possible in ClickHouse to save the state of HyperLogLog / uniqState() directly via an INSERT query?

We can use the AggregatingMergeTree table engine, which aggregates rows that share the same key.

Typically, in aggregated data we do not want to store every unique ID, yet we still want to be able to count them. We also want to be able to run a further aggregation later to get a unique count across these rows (by grouping the rows in a SELECT query). A HyperLogLog-style sketch is convenient here; in ClickHouse it is available through the uniqState function.

I would like to write the HyperLogLog state directly through an INSERT query, sending it to the ClickHouse table from my client application. Is that possible?



1 answer


I managed to do this using nothing but ClickHouse queries. It works really well!



CREATE TABLE demo_db.aggregates
(
    name String,
    date Date,
    ids AggregateFunction(uniq, UInt8)
) ENGINE = MergeTree(date, date, 8192);

-- Passing the set of ids through uniqState in the INSERT ... SELECT
-- stores the serialized uniq aggregate state, not the raw ids.
-- The literals 1, 5, 6, 7 are inferred as UInt8, matching the column type.
INSERT INTO demo_db.aggregates SELECT
    'Demo',
    toDate('2016-12-03'),
    uniqState(arrayJoin([1, 5, 6, 7]));

-- uniqMerge combines the stored states within each group
-- and returns the unique count over the grouped rows.
SELECT
    name,
    date,
    uniqMerge(ids) AS unique_ids
FROM demo_db.aggregates
GROUP BY name, date;
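
Since the question mentions AggregatingMergeTree, here is a minimal sketch of the same idea on that engine, showing the stored states being merged again across several days. The table name aggregates_amt, the UInt64 id type, and the sample ids are assumptions made for illustration, and the newer ORDER BY form of the engine declaration is used instead of the legacy one above.

```sql
-- Hypothetical table: same columns as above, but on AggregatingMergeTree,
-- so rows with the same (name, date) key have their states merged in the background.
CREATE TABLE demo_db.aggregates_amt
(
    name String,
    date Date,
    ids AggregateFunction(uniq, UInt64)
) ENGINE = AggregatingMergeTree()
ORDER BY (name, date);

-- Day 1: ids 1, 5, 6, 7. The explicit toUInt64 cast makes the state type
-- match the column type AggregateFunction(uniq, UInt64).
INSERT INTO demo_db.aggregates_amt
SELECT 'Demo', toDate('2016-12-03'), uniqState(toUInt64(arrayJoin([1, 5, 6, 7])));

-- Day 2: ids 6, 7, 8, overlapping with day 1.
INSERT INTO demo_db.aggregates_amt
SELECT 'Demo', toDate('2016-12-04'), uniqState(toUInt64(arrayJoin([6, 7, 8])));

-- Roll both days up into one unique count per name.
-- Overlapping ids are counted once, so the result is lower than
-- the sum of the two per-day counts.
SELECT
    name,
    uniqMerge(ids) AS unique_ids
FROM demo_db.aggregates_amt
GROUP BY name;
```

Background merges happen at an unspecified time, so queries should always read the column through uniqMerge with a GROUP BY rather than assume the rows have already been collapsed.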

      
