Bigquery - filter only unique results
My database looks like this:
Entry-Key Name Surname Age
10a Smith Alex 35
11b Finn John 41
10a Smith Al 35
10c Finn Berta 28
11b Fin John 41
I need to get unique strings from it. The group does not work as expected because sometimes there are inaccuracies in the First / Last Name columns.
It felt like I was only grouping Entry-Keys and then I find the first occurrence of the key in the table and take only that row. I know how to do this in Excel, but since the database contains about 100,000 rows, Excel is not a real option.
the idea is to finally get this table:
10a Smith Alex 35
11b Finn John 41
12c Finn Berta 28
Please, help!
+3
source to share
1 answer
For your logic, you can run the following query:
select key, first(name), first(surname), first(age) from
(select '10a' as key, 'Smith' as name, 'Alex' as surname, 35 as age),
(select '11b' as key, 'Finn' as name, 'John' as surname, 41 as age),
(select '10a' as key, 'Smith' as name, 'Al' as surname, 35 as age),
(select '10c' as key, 'Finn' as name, 'Berta' as surname, 28 as age),
(select '11b' as key, 'Fin' as name, 'John' as surname, 41 as age),
group by key
This returns:
+-----+-----+-------+-------+-----+---+
| Row | key | f0_ | f1_ | f2_ | |
+-----+-----+-------+-------+-----+---+
| 1 | 10a | Smith | Alex | 35 | |
| 2 | 11b | Finn | John | 41 | |
| 3 | 10c | Finn | Berta | 28 | |
+-----+-----+-------+-------+-----+---+
+2
source to share