Multiple calls to array_agg() in one query

I am trying to accomplish something with my query, but it is not working. My application was built on MongoDB, so it expects to receive arrays in a single field. We now need to switch to Postgres, and I don't want to change my application code, so that v1 keeps working.

To get arrays in one field in Postgres I used the function array_agg(), and it has worked fine so far. However, I am now at the point where I need another array in a field, from a different table.

For example:

I have employees. Employees have multiple addresses and multiple working days.

SELECT name, age, array_agg(ad.street) FROM employees e 
JOIN address ad ON e.id = ad.employeeid
GROUP BY name, age

      

This works for me and returns, for example:

| name  | age | array_agg(ad.street)     |
| peter | 25  | {1st street, 2nd street} |

      

Now I want to join another table for working days, so I wrote:

SELECT name, age, array_agg(ad.street), array_agg(wd.day) FROM employees e
JOIN address ad ON e.id = ad.employeeid 
JOIN workingdays wd ON e.id = wd.employeeid
GROUP BY name, age

      

This leads to:

| peter | 25 | {1st street, 1st street, 1st street, 1st street, 1st street, 2nd street, 2nd street, 2nd street, 2nd street, 2nd street} | {Monday,Tuesday,Wednesday,Thursday,Friday,Monday,Tuesday,Wednesday,Thursday,Friday} |

      

But I need this:

| peter | 25 | {1st street, 2nd street} | {Monday,Tuesday,Wednesday,Thursday,Friday} |

      

I understand this has something to do with my joins, because joining several tables multiplies the rows, but I don't know how to fix it. Can anyone give me the correct advice?



2 answers


DISTINCT is often applied to repair queries that are rotten on the inside, and the result is often slow and/or wrong. Don't multiply rows to begin with; then you don't have to sort out unwanted duplicates at the end.

Joining to multiple n-tables ("has many") at once multiplies the rows in the result set. It acts like a CROSS JOIN, or Cartesian product, by proxy.
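The multiplication is easy to see by counting rows before any aggregation. The following is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for Postgres so that it runs without a server, with the table and column names from the question and the sample data for "peter".

```python
import sqlite3

# In-memory SQLite database as a stand-in for Postgres (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees   (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE address     (employeeid INTEGER, street TEXT);
    CREATE TABLE workingdays (employeeid INTEGER, day TEXT);
    INSERT INTO employees VALUES (1, 'peter', 25);
    INSERT INTO address VALUES (1, '1st street'), (1, '2nd street');
    INSERT INTO workingdays VALUES
        (1, 'Monday'), (1, 'Tuesday'), (1, 'Wednesday'), (1, 'Thursday'), (1, 'Friday');
""")

# Joining both child tables at once pairs every address with every working day.
joined_rows = conn.execute("""
    SELECT e.name, ad.street, wd.day
    FROM   employees e
    JOIN   address     ad ON ad.employeeid = e.id
    JOIN   workingdays wd ON wd.employeeid = e.id
""").fetchall()

print(len(joined_rows))  # 2 addresses * 5 days = 10 rows for one employee
```

Any aggregate computed over these 10 rows sees each street 5 times and each day twice, which matches the duplicated arrays in the question.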

There are various ways to avoid this error.

Aggregate first, join later

Technically, the query works as long as you join to only one table with multiple rows at a time before you aggregate:

SELECT e.id, e.name, e.age, e.streets, array_agg(wd.day) AS days
FROM  (
   SELECT e.id, e.name, e.age, array_agg(ad.street) AS streets
   FROM   employees e
   JOIN   address  ad ON ad.employeeid = e.id
   GROUP  BY e.id    -- id is enough if it is defined as the PK
   ) e
JOIN   workingdays wd ON wd.employeeid = e.id
GROUP  BY e.id, e.name, e.age, e.streets;  -- every non-aggregated subquery column must be listed

      

It is also best to include the primary key id in the GROUP BY, because name and age are not necessarily unique. Otherwise you might merge two employees by mistake.

But you can also aggregate in a subquery before you join, which is the better choice if you don't have selective WHERE conditions on employees:

SELECT e.id, e.name, e.age, ad.streets, array_agg(wd.day) AS days
FROM   employees e
JOIN  (
   SELECT employeeid, array_agg(street) AS streets  -- alias ad is not visible inside the subquery
   FROM   address
   GROUP  BY 1
   ) ad ON ad.employeeid = e.id
JOIN   workingdays wd ON e.id = wd.employeeid
GROUP  BY e.id, e.name, e.age, ad.streets;

      

Or aggregate both:



SELECT name, age, ad.streets, wd.days
FROM   employees e
JOIN  (
   SELECT employeeid, array_agg(street) AS streets
   FROM   address
   GROUP  BY 1
   ) ad ON ad.employeeid = e.id
JOIN  (
   SELECT employeeid, array_agg(day) AS days
   FROM   workingdays
   GROUP  BY 1
   ) wd ON wd.employeeid = e.id;

      

The last variant is generally the fastest if you are retrieving all or most of the rows in the underlying tables.
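To check that aggregating each child table in its own subquery avoids the duplicates, here is a small sketch on the sample data from the question. It runs on Python's built-in sqlite3 module instead of Postgres, and since SQLite has no array_agg(), it substitutes group_concat(), which returns a comma-separated string; both substitutions are assumptions made only so the example runs without a server.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Postgres (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees   (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE address     (employeeid INTEGER, street TEXT);
    CREATE TABLE workingdays (employeeid INTEGER, day TEXT);
    INSERT INTO employees VALUES (1, 'peter', 25);
    INSERT INTO address VALUES (1, '1st street'), (1, '2nd street');
    INSERT INTO workingdays VALUES
        (1, 'Monday'), (1, 'Tuesday'), (1, 'Wednesday'), (1, 'Thursday'), (1, 'Friday');
""")

# Each child table is collapsed to one row per employee before the join,
# so the joins cannot multiply rows. group_concat() replaces array_agg() here.
row = conn.execute("""
    SELECT e.name, ad.streets, wd.days
    FROM   employees e
    JOIN  (SELECT employeeid, group_concat(street) AS streets
           FROM   address
           GROUP  BY employeeid) ad ON ad.employeeid = e.id
    JOIN  (SELECT employeeid, group_concat(day) AS days
           FROM   workingdays
           GROUP  BY employeeid) wd ON wd.employeeid = e.id
""").fetchone()

streets = row[1].split(",")
days = row[2].split(",")
print(len(streets), len(days))  # 2 5
```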

Please note that using JOIN rather than LEFT JOIN removes employees from the result that have no address or no working days. That may or may not be intended. Switch to LEFT JOIN to keep all employees in the result.
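The difference between the two join types can be shown with one employee who has no address. This is a sketch only: it again uses Python's built-in sqlite3 module as a stand-in for Postgres, and the second employee "paul" is a made-up row added for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Postgres (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE address   (employeeid INTEGER, street TEXT);
    INSERT INTO employees VALUES (1, 'peter'), (2, 'paul');  -- 'paul' is hypothetical
    INSERT INTO address VALUES (1, '1st street'), (1, '2nd street');
""")

# Plain JOIN: an employee without any address row drops out of the result.
inner_names = [row[0] for row in conn.execute("""
    SELECT e.name
    FROM   employees e
    JOIN  (SELECT employeeid FROM address GROUP BY employeeid) ad
           ON ad.employeeid = e.id
    ORDER  BY e.id
""")]

# LEFT JOIN: every employee is kept; the missing side is NULL.
left_names = [row[0] for row in conn.execute("""
    SELECT e.name
    FROM   employees e
    LEFT   JOIN (SELECT employeeid FROM address GROUP BY employeeid) ad
           ON ad.employeeid = e.id
    ORDER  BY e.id
""")]

print(inner_names)  # ['peter']
print(left_names)   # ['peter', 'paul']
```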

Correlated subqueries / LATERAL join

For a small selection, consider correlated subqueries instead:

SELECT name, age
    , (SELECT array_agg(street) FROM address WHERE employeeid = e.id) AS streets
    , (SELECT array_agg(day) FROM workingdays WHERE employeeid = e.id) AS days
FROM   employees e
WHERE  e.name = 'peter';  -- very selective

      

Or, with Postgres 9.3 or newer, you can use LATERAL joins:

SELECT e.name, e.age, a.streets, w.days
FROM   employees e
LEFT   JOIN LATERAL (
   SELECT array_agg(street) AS streets  -- no GROUP BY: the aggregate returns exactly one row
   FROM   address
   WHERE  employeeid = e.id
   ) a ON true
LEFT   JOIN LATERAL (
   SELECT array_agg(day) AS days
   FROM   workingdays
   WHERE  employeeid = e.id
   ) w ON true
WHERE  e.name = 'peter';  -- very selective

      

Either query keeps all employees in the result.



If you want values that don't repeat, use DISTINCT, for example:



SELECT name, age, array_agg(DISTINCT ad.street), array_agg(DISTINCT wd.day) FROM employees e 
JOIN address ad ON e.id = ad.employeeid 
JOIN workingdays wd ON e.id = wd.employeeid
GROUP BY name, age

      
