Bigquery - filter only unique results

Question

My database looks like this:

Entry-Key     Name     Surname     Age
10a           Smith    Alex        35
11b           Finn     John        41
10a           Smith    Al          35
10c           Finn     Berta       28
11b           Fin      John        41

I need to get the unique rows from it. GROUP BY does not work as expected because the Name / Surname columns sometimes contain inaccuracies.

My idea is to group only by Entry-Key, then find the first occurrence of each key in the table and take only that row. I know how to do this in Excel, but since the database contains about 100,000 rows, Excel is not a real option.

The goal is to end up with this table:

10a           Smith    Alex        35
11b           Finn     John        41
10c           Finn     Berta       28

Please, help!

unique google-bigquery

Ilja · June 24 '15 at 12:35

1 answer

Pentium10 · Accepted Answer · 2015-06-24T12:59:34+0000

For your logic, you can run the following query:

-- BigQuery legacy SQL: a comma between subqueries in FROM acts as UNION ALL.
-- The inline subqueries stand in for the real table.
select key, first(name), first(surname), first(age) from 
(select '10a' as key,           'Smith' as name,    'Alex' as surname,        35 as age),
(select '11b' as key,           'Finn' as name,     'John' as surname,        41 as age),
(select '10a' as key,           'Smith' as name,    'Al' as surname,          35 as age),
(select '10c' as key,           'Finn' as name,     'Berta' as surname,       28 as age),
(select '11b' as key,           'Fin' as name,      'John' as surname,        41 as age)
group by key

This returns:

+-----+-----+-------+-------+-----+
| Row | key |  f0_  |  f1_  | f2_ |
+-----+-----+-------+-------+-----+
|   1 | 10a | Smith | Alex  |  35 |
|   2 | 11b | Finn  | John  |  41 |
|   3 | 10c | Finn  | Berta |  28 |
+-----+-----+-------+-------+-----+
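
To sanity-check the expected result on a small sample outside BigQuery, the same "keep the first row for each key" logic can be sketched in plain Python. This is only an illustration and not part of the original answer; the function name `first_row_per_key` and the tuple layout (key first, then name, surname, age) are assumptions made for this example.

```python
def first_row_per_key(rows):
    """Keep only the first row seen for each key (the first tuple element)."""
    first_seen = {}
    for row in rows:
        key = row[0]
        # setdefault stores the row only if this key has not been seen yet
        first_seen.setdefault(key, row)
    # dicts preserve insertion order (Python 3.7+), so keys stay in first-seen order
    return list(first_seen.values())


# Sample rows copied from the question
rows = [
    ("10a", "Smith", "Alex", 35),
    ("11b", "Finn", "John", 41),
    ("10a", "Smith", "Al", 35),
    ("10c", "Finn", "Berta", 28),
    ("11b", "Fin", "John", 41),
]

unique_rows = first_row_per_key(rows)
for row in unique_rows:
    print(*row)
```

Here "first" simply means the order of the input list, so the later, misspelled variants of a key ("Al", "Fin") are dropped.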
