Difference between concatMap f xs and concat $ map f xs?

Presumably they do the same, concatMap f xs

and concat $ map f xs

. Why should I choose one by one?

I guess it might be an optimization. If so, is this still the case with GHC 7.8?

+3


source to share


1 answer


In this case concatMap f xs = concat (map f xs)

, as you suspect. Thus, for the purpose of correctness, you should consider them interchangeable. However, we can study their definitions to find out a little more.

concatMap               :: (a -> [b]) -> [a] -> [b]
concatMap f             =  foldr ((++) . f) []

concat :: [[a]] -> [a]
concat = foldr (++) []

      

In particular, it means it concat . map f

expands to foldr (++) [] . map f

. Now, using the thing known as the "universal property fold

"
, we can see that foldr g z . map f = foldr (g . f) z

for any ( g

, z

, f

), for example, the choice of ( (++)

, f

, []

), which we use above. This demonstrates that concatMap f = concat . map f

how we want it. [0]

So why are they defined differently? Because foldr ((++) . f) []

it will always be faster than foldr (++) [] . map f

because in a truly pathological case, the latter offers two separate recursions. Due to laziness, it is unlikely that two recursions will ever be executed, what gives?



The real reason is that more complex fusion laws are available to the compiler, such as those that combine two sequential ones foldr

or that define interactions between foldr

and unfoldr

. They are quite subtle to use, as they depend on being able to look at the surface syntax of a piece of code and spot possible simplifications. A lot of work is in getting the merger laws consistent.

But one thing we can do is encourage people to use higher order combinators with pre-defined optimization laws. Since it foldr (++) [] . map f

will never be faster than foldr ((++) . f) []

, we can use a shortcut and first apply a simplification of the universal law. This will improve the likelihood that merger laws will be laid off elsewhere to best optimize the pipeline for listing.

[0] Why does this law work? Roughly speaking, the universal law foldr

says that if you have any function q

such that q [] = z

and q (a:as) = f a (q as)

, then there q

must be and foldr f z

. Since it q = foldr g z . map f

can be shown as q [] = z

and q (a:as) = g (f a) (q as)

, then it should be the same foldr (g . f) z

as we want.

+9


source







All Articles