Bigquery - filter only unique results

My database looks like this:

Entry-Key     Name     Surname     Age
10a           Smith    Alex        35
11b           Finn     John        41
10a           Smith    Al          35
10c           Finn     Berta       28
11b           Fin      John        41

      

I need to get unique strings from it. The group does not work as expected because sometimes there are inaccuracies in the First / Last Name columns.

It felt like I was only grouping Entry-Keys and then I find the first occurrence of the key in the table and take only that row. I know how to do this in Excel, but since the database contains about 100,000 rows, Excel is not a real option.

the idea is to finally get this table:

10a           Smith    Alex        35
11b           Finn     John        41
12c           Finn     Berta       28

      

Please, help!

+3


source to share


1 answer


For your logic, you can run the following query:

select key, first(name), first(surname), first(age) from 
(select '10a' as key,           'Smith' as name,    'Alex' as surname,        35 as age),
(select '11b' as key,           'Finn' as name,     'John' as surname,        41 as age),
(select '10a' as key,           'Smith' as name,    'Al' as surname,          35 as age),
(select '10c' as key,           'Finn' as name,     'Berta' as surname,       28 as age),
(select '11b' as key,           'Fin' as name,      'John' as surname,        41 as age),
group by key

      



This returns:

+-----+-----+-------+-------+-----+---+
| Row | key |  f0_  |  f1_  | f2_ |   |
+-----+-----+-------+-------+-----+---+
|   1 | 10a | Smith | Alex  |  35 |   |
|   2 | 11b | Finn  | John  |  41 |   |
|   3 | 10c | Finn  | Berta |  28 |   |
+-----+-----+-------+-------+-----+---+

      

+2


source







All Articles