Multiple array_agg() calls in a single query
I am trying to achieve something with my query, but it is not quite working. My application used to run on MongoDB, so it is used to getting arrays in a field. Now we have had to switch to Postgres, and I don't want to change my application code, so that v1 keeps working.
To get arrays in one field in Postgres, I used the array_agg() function, and that has worked fine so far. However, I am now at a point where I need another array in a field, from a different table.
For example:
I have my employees. Employees have multiple addresses and multiple working days.
SELECT name, age, array_agg(ad.street) FROM employees e
JOIN address ad ON e.id = ad.employeeid
GROUP BY name, age
This works and returns, for example:
| name  | age | array_agg(ad.street)     |
| peter | 25  | {1st street, 2nd street} |
Now I want to join another table for the working days, so I wrote:
SELECT name, age, array_agg(ad.street), array_agg(wd.day) FROM employees e
JOIN address ad ON e.id = ad.employeeid
JOIN workingdays wd ON e.id = wd.employeeid
GROUP BY name, age
This leads to:
| peter | 25 | {1st street, 1st street, 1st street, 1st street, 1st street, 2nd street, 2nd street, 2nd street, 2nd street, 2nd street}| {Monday,Tuesday,Wednesday,Thursday,Friday,Monday,Tuesday,Wednesday,Thursday,Friday}
But I need this:
| peter | 25 | {1st street, 2nd street}| {Monday,Tuesday,Wednesday,Thursday,Friday}
I realize this has something to do with my joins, since joining several tables multiplies the rows, but I don't know how to solve it. Can anyone point me in the right direction?
DISTINCT is often applied to repair queries that are rotten on the inside, and that is often slow and / or incorrect. Don't multiply rows to begin with, then you don't have to sort out unwanted duplicates at the end.
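Joining to multiple n-tables ("has many") at once multiplies the rows in the result set. It acts like a CROSS JOIN or Cartesian product by proxy: every address row of an employee is paired with every working-day row of the same employee, so 2 addresses and 5 working days produce 10 rows.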
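To make the multiplication visible, here is a minimal sketch that counts the joined rows. The inline sample data is an assumption that mirrors the example in the question (one employee, 2 addresses, 5 working days); the column aliases joined_rows, distinct_streets and distinct_days are made up for illustration.

```sql
-- Hypothetical sample data mirroring the question: 2 addresses, 5 working days.
WITH employees(id, name, age) AS (
   VALUES (1, 'peter', 25)
)
, address(employeeid, street) AS (
   VALUES (1, '1st street'), (1, '2nd street')
)
, workingdays(employeeid, day) AS (
   VALUES (1, 'Monday'), (1, 'Tuesday'), (1, 'Wednesday'), (1, 'Thursday'), (1, 'Friday')
)
SELECT e.id
     , count(*)                  AS joined_rows       -- 2 x 5 = 10
     , count(DISTINCT ad.street) AS distinct_streets  -- 2
     , count(DISTINCT wd.day)    AS distinct_days     -- 5
FROM   employees e
JOIN   address     ad ON ad.employeeid = e.id
JOIN   workingdays wd ON wd.employeeid = e.id
GROUP  BY e.id;
```

Any aggregate computed over this join sees all 10 rows, which is why each array in the question contains every value several times.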
There are various ways to avoid this error.
Aggregate first, join later
Technically, the query works as long as you join to only one table with multiple rows at a time before you aggregate:
SELECT e.id, e.name, e.age, e.streets, array_agg(wd.day) AS days
FROM (
   SELECT e.id, e.name, e.age, array_agg(ad.street) AS streets
   FROM employees e
   JOIN address ad ON ad.employeeid = e.id
   GROUP BY e.id -- id is enough if it is defined as the PK
   ) e
JOIN workingdays wd ON wd.employeeid = e.id
GROUP BY e.id, e.name, e.age, e.streets; -- the subquery has no PK, so list every output column
It is also best to include the primary key id in the SELECT list and in GROUP BY, because name and age are not necessarily unique. Otherwise you might merge two employees by mistake.
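The risk of merging employees can be sketched with two people who happen to share a name and an age. The temporary tables demo_employees and demo_address and their rows are hypothetical, chosen only for this illustration.

```sql
-- Hypothetical demo tables; names and rows are assumptions for illustration only.
CREATE TEMP TABLE demo_employees (id int PRIMARY KEY, name text, age int);
CREATE TEMP TABLE demo_address   (employeeid int, street text);

INSERT INTO demo_employees VALUES (1, 'peter', 25), (2, 'peter', 25);
INSERT INTO demo_address   VALUES (1, '1st street'), (2, '9th street');

-- Grouping by name and age: both employees collapse into a single row.
SELECT e.name, e.age, array_agg(ad.street) AS streets
FROM   demo_employees e
JOIN   demo_address ad ON ad.employeeid = e.id
GROUP  BY e.name, e.age;

-- Grouping by the primary key: one row per employee.
SELECT e.id, e.name, e.age, array_agg(ad.street) AS streets
FROM   demo_employees e
JOIN   demo_address ad ON ad.employeeid = e.id
GROUP  BY e.id;
```

The first query returns one row whose array holds the streets of both employees; the second returns two rows. Because id is declared as the primary key here, grouping by e.id alone is enough to select e.name and e.age as well.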
But you can also aggregate in a subquery before you join. That is the better approach unless you have selective WHERE conditions on employees:
SELECT e.id, e.name, e.age, ad.streets, array_agg(wd.day) AS days
FROM employees e
JOIN (
   SELECT employeeid, array_agg(street) AS streets
   FROM address
   GROUP BY 1
   ) ad ON ad.employeeid = e.id
JOIN workingdays wd ON e.id = wd.employeeid
GROUP BY e.id, e.name, e.age, ad.streets;
Or aggregate both:
SELECT name, age, ad.streets, wd.days
FROM employees e
JOIN (
   SELECT employeeid, array_agg(street) AS streets
   FROM address
   GROUP BY 1
   ) ad ON ad.employeeid = e.id
JOIN (
   SELECT employeeid, array_agg(day) AS days
   FROM workingdays
   GROUP BY 1
   ) wd ON wd.employeeid = e.id;
The latter is generally faster if you are retrieving all or most of the rows in the underlying tables.
Note that using JOIN rather than LEFT JOIN removes employees from the result if they have no address or no working days. That may or may not be intended. Switch to LEFT JOIN to keep all employees in the result.
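As a rough sketch of the LEFT JOIN variant, the query below aggregates both tables in subqueries and keeps an employee who has no related rows. The sample data, including the employee 'mary', is an assumption made for this illustration.

```sql
-- Hypothetical sample data: mary has no address and no working days.
WITH employees(id, name, age) AS (
   VALUES (1, 'peter', 25), (2, 'mary', 31)
)
, address(employeeid, street) AS (
   VALUES (1, '1st street'), (1, '2nd street')
)
, workingdays(employeeid, day) AS (
   VALUES (1, 'Monday'), (1, 'Tuesday')
)
SELECT e.name, e.age, ad.streets, wd.days
FROM   employees e
LEFT   JOIN (
   SELECT employeeid, array_agg(street) AS streets
   FROM   address
   GROUP  BY 1
   ) ad ON ad.employeeid = e.id
LEFT   JOIN (
   SELECT employeeid, array_agg(day) AS days
   FROM   workingdays
   GROUP  BY 1
   ) wd ON wd.employeeid = e.id
ORDER  BY e.id;
```

Both employees are returned; for mary, streets and days are NULL. With plain JOIN she would be missing from the result. If the application expects an empty array rather than NULL, wrapping the column as COALESCE(ad.streets, '{}') is one option.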
Correlated subqueries / LATERAL join
For a small selection, consider correlated subqueries instead:
SELECT name, age
     , (SELECT array_agg(street) FROM address WHERE employeeid = e.id) AS streets
     , (SELECT array_agg(day) FROM workingdays WHERE employeeid = e.id) AS days
FROM employees e
WHERE e.name = 'peter'; -- very selective
Or, with Postgres 9.3 or newer, you can use a LATERAL join:
SELECT e.name, e.age, a.streets, w.days
FROM employees e
LEFT JOIN LATERAL (
   SELECT array_agg(street) AS streets
   FROM address
   WHERE employeeid = e.id
   ) a ON true
LEFT JOIN LATERAL (
   SELECT array_agg(day) AS days
   FROM workingdays
   WHERE employeeid = e.id
   ) w ON true
WHERE e.name = 'peter'; -- very selective
Either query keeps all employees in the result, including those without an address or working days.