codehaus


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

"single or null" aggregation function for elastic


Hello,

I’m looking for a way to simulate “single or null” (on distinct values)
aggregation function in elastic.
Example of a query

-- for multiple distinct values return nullselect date,
single_value(value) from table group by date

Some Options

   1. ANY_VALUE. For multiple values returns one of them (I need null).
   2. SINGLE_VALUE. For multiple (or empty) values throws exception (I need
   null).
   3. COLLECT / JSON_ARRAYAGG (see below).
   4. Manually add column COUNT(distinct ...) and post-process the result.

COLLECT / JSON_ARRAYAGG

select date, collect(distinct value) from table group by date

This query might potentially work but the problem with elastic is that it
doesn’t return all values by default (similar issue to scrolling. see size
<https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation-size>).
Currently, scrolling is not possible with aggregations.
Therefore I have to impose usage of DISTINCT / LIMIT with COLLECT :

-- enforce DISTINCT and LIMITselect date, collect(distinct value limit
2) from table group by date

Do you think it is reasonable to enforce such restriction to make it work
with elastic ?
Adding COUNT function

One can append count(distinct value) aggregation and check if it is equal
to 1.

-- use combination of ANY_VALUE and COUNTselect date,
any_value(value), count(distinct value) from table group by date

What do you think ?

Regards,
Andrei.