Reasons not to use GROUP_CONCAT?

I just discovered the amazingly useful MySQL function GROUP_CONCAT. It seems so useful and so simple that I am actually afraid to use it, mainly because I have been doing web programming for quite a while and have never seen it used anywhere. A sample of amazing usage would be as follows.

The table clients contains customers (you don't say ...), one row per customer with a unique ID.
The table currencies has 3 columns: client_id, currency and amount.

Now if I wanted to get the name of user 15 from the table clients along with their balances, with the "old" method of overwriting the array, I would have to use the following SQL

SELECT id, name, currency, amount 
FROM clients LEFT JOIN currencies ON clients.id = client_id 
WHERE clients.id = 15

      

Then in PHP I would have to iterate over the result set and overwrite the array (which I am really not a big fan of, especially on massive result sets), for example

$result = array();
foreach($stmt->fetchAll() as $row){
    $result[$row['id']]['name'] = $row['name'];
    $result[$row['id']]['currencies'][$row['currency']] = $row['amount'];
}

      

However with the newly discovered function I can use this

SELECT id, name, GROUP_CONCAT(currency) AS currencies, GROUP_CONCAT(amount) AS amounts 
FROM clients LEFT JOIN currencies ON clients.id = client_id 
WHERE clients.id = 15
GROUP BY clients.id

      

Then, at the application level, everything is so cool and beautiful.

$results = $stmt->fetchAll();
foreach($results as $k => $v){
    $results[$k]['currencies'] = array_combine(explode(',', $v['currencies']), explode(',', $v['amounts']));
}

      

The question I would like to ask is whether there are any performance disadvantages, or any disadvantages at all, to using this feature. To me it just feels like sheer awesomeness, which makes me think there must be a reason people don't use it more often.

EDIT:

Ultimately, what I want to ask is what other options there are besides array rewriting to get a multidimensional array from a MySQL result set, because if I select 15 columns it is really a pain in the neck to write that beast.

+3




2 answers


  • Using GROUP_CONCAT() usually invokes the group-by logic and creates temporary tables, which are usually bad for performance. Sometimes you can add the right index to avoid the temporary table in a group-by query, but not in every case.

  • As @MarcB points out, the default length limit for a group-concatenated string is quite short (1024 bytes), and many people have been confused by truncated lists. You can increase the limit with group_concat_max_len.

  • Exploding a string into an array does not come for free in PHP. Just because you can do it in a single function call in PHP doesn't mean it's best for performance. I haven't benchmarked the difference, but I doubt you have either.

  • GROUP_CONCAT() is a MySQLism. It is not widely supported by other SQL products. In some cases (e.g. SQLite) they have a GROUP_CONCAT() function, but it doesn't work exactly the same as it does in MySQL, so this can lead to confusing bugs if you need to support multiple RDBMS back-ends. Of course, if you don't have to worry about porting, that's not a problem.

  • If you want to get multiple columns from the table currencies, you need multiple GROUP_CONCAT() expressions. Are the lists guaranteed to be in the same order? That is, does the third field in one list match the third field in the next list? The answer is no, unless you specify the ordering with an ORDER BY clause inside each GROUP_CONCAT().
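
The ordering and truncation points above can be sketched roughly like this. It assumes the clients and currencies tables from the question; the '|' separator, the 65536 limit and the helper name splitCurrencyLists() are my own choices for illustration, not anything MySQL or PHP prescribes.

```php
<?php
// Sketch only: table and column names come from the question; the helper
// name, the '|' separator and the limit value are assumptions.

// The same ORDER BY inside each GROUP_CONCAT() keeps both lists aligned.
$sql = <<<'SQL'
SELECT clients.id, clients.name,
       GROUP_CONCAT(currency ORDER BY currency SEPARATOR '|') AS currencies,
       GROUP_CONCAT(amount   ORDER BY currency SEPARATOR '|') AS amounts
FROM clients
LEFT JOIN currencies ON clients.id = client_id
WHERE clients.id = 15
GROUP BY clients.id
SQL;

// Run this on the same connection before the query to raise the limit.
$raiseLimit = 'SET SESSION group_concat_max_len = 65536';

function splitCurrencyLists(?string $currencies, ?string $amounts, string $sep = '|'): array
{
    // A client without any currency rows yields NULL for both lists.
    if ($currencies === null || $amounts === null) {
        return [];
    }
    $keys = explode($sep, $currencies);
    $values = explode($sep, $amounts);
    // GROUP_CONCAT() skips NULL values and may truncate, so the two
    // lists can end up with different lengths; fail loudly if they do.
    if (count($keys) !== count($values)) {
        throw new UnexpectedValueException('Currency and amount lists differ in length.');
    }
    return array_combine($keys, $values);
}
```

Even with this in place, the separator must never occur in the data, which is one more thing the plain joined result set does not have to worry about.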

I usually prefer your first approach: use the normal result set and iterate over the rows, building a new array indexed by the client id and adding currencies to it. It is a straightforward solution, keeps the SQL simple and easier to optimize, and works better if you have multiple columns to retrieve.
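
If writing that loop by hand for many columns is the painful part, it can be wrapped once in a small helper. This is only a sketch: nestRows() and its parameter names are hypothetical, and it assumes each row is an associative array as in the question's first query.

```php
<?php
// Sketch only: nestRows() and its parameters are hypothetical names.
function nestRows(
    iterable $rows,
    string $idCol,       // column identifying the parent row, e.g. 'id'
    array $parentCols,   // parent columns to copy once, e.g. ['name']
    string $keyCol,      // child column used as array key, e.g. 'currency'
    string $valueCol,    // child column used as value, e.g. 'amount'
    string $into         // key under which children are collected
): array {
    $result = [];
    foreach ($rows as $row) {
        $id = $row[$idCol];
        if (!isset($result[$id])) {
            // Copy the parent columns only the first time this id is seen.
            $result[$id] = array_intersect_key($row, array_flip($parentCols));
            $result[$id][$into] = [];
        }
        // A LEFT JOIN gives NULL child columns for parents without children.
        if ($row[$keyCol] !== null) {
            $result[$id][$into][$row[$keyCol]] = $row[$valueCol];
        }
    }
    return $result;
}
```

Adding a 15th parent column then means adding one name to $parentCols, for example nestRows($rows, 'id', ['name'], 'currency', 'amount', 'currencies').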



I'm not trying to say that GROUP_CONCAT() is bad! It is very useful in many cases. But trying to make a one-size-fits-all rule to use (or avoid) any function or language feature is simplistic.

+6




The biggest problem I see with using GROUP_CONCAT is that it is very specific to MySQL: if you want to port your code to work with any other platform, you will have to rewrite all the queries that use GROUP_CONCAT. Your first query, by contrast, is much more portable; you can probably run it against any underlying RDBMS engine without changing a single character.



If you are OK with working only with MySQL (say, because you are writing a tool that needs to be specific to MySQL), queries with GROUP_CONCAT are likely to be faster, because the RDBMS does more of the work for you while keeping the amount of data transferred small.

+2








