List processing in Haskell

Question

I am teaching myself Haskell, have run into a problem, and need help.

Background:

type AInfo  =  (Char, Int)
type AList  =  [AInfo]       -- let's say [('a', 2), ('b', 5), ('a', 1), ('w', 21)]

type BInfo  =  Char
type BList  =  [BInfo]       -- let's say ['a', 'a', 'c', 'g', 'a', 'w', 'b']

One quick edit: the above information is for illustration purposes only. The actual list elements are a little more complicated. Also, the lists are not static; they are dynamic (hence the use of the IO monad), and I need to keep / pass / "return" / access and modify the lists while the program is running.

I want to do the following:

Check every AList element against every BList element. Wherever the character of the AList element (pair) equals the character in BList, add one to the Int value of the AList element (pair) and remove the character from BList.

So this means that after the first AList item is checked against all BList items, the list values should be:

AList [('a', 5), ('b', 5), ('a', 1), ('w', 21)]

BList ['c', 'g', 'w', 'b']

And in the end, the values of the lists should be:

AList [('a', 5), ('b', 6), ('a', 1), ('w', 22)]

BList ['c', 'g']

Of course, all of this happens in the IO monad.

What I have tried:

Using mapM and a recursive helper function. I looked at both:

Every AList element checked against every BList element: mapM (myHelpF1 alist) blist
Every BList element checked against every AList element: mapM (myHelpF2 alist) blist

Passing both lists to a function and using complicated if / then / else and helper function calls (it feels like I am forcing Haskell to be iterative; messy, convoluted code that does not feel right).

Thinking about using filter on the character value of the AList element and BList to create a third list of Bool and counting the number of True values. Update the Int value. Then use filter on BList to remove the BList elements that ...... (again, it does not feel right, not very Haskell-like).

Things I think I know about the problem:

The solution may well be trivial. So much so that more experienced Haskellers will mumble "what a noob" to themselves as they type in an answer.

Any pointers would be greatly appreciated. (mumble ....)

+3

haskell

user1872391 Jan 25. 13 at 5:25 am

4 answers

luqui · Answer 1 · 2013-01-25T06:07:32+0000

A few pointers:

Do not use [(Char, Int)] for "AList". The data structure you are looking for is a finite map: Map Char Int. In particular, look at member and insertWith. toList and fromList convert to and from the representation you currently have for AList, so even if you are stuck with that representation, you can convert to a Map for this algorithm and convert back at the end. (This will be more efficient than staying in a list, because you are doing so many lookups, and the finite map API is easier to work with than lists.)

I would approach the problem in two phases: (1) partition the elements of blist by whether they are in the map, (2) insertWith the elements that are already in the map. Then you can return the resulting map and the other partition.

I would also get rid of unnecessary assumptions such as the keys being Char. You can just say that they are of any type k (for "key") that satisfies the constraints you need (namely that you can put it in a Map, which requires it to be an instance of Ord). You do this with lowercase type variables:

import qualified Data.Map as Map

sieveList :: (Ord k) => Map.Map k Int -> [k] -> (Map.Map k Int, [k])

Writing algorithms in greater generality helps you catch bugs, because it prevents you from relying on assumptions you do not need.

Oh, also, this program has no business being in the IO monad. This is pure code.
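The two phases described above could be sketched roughly as follows. This is only one possible reading of the outline, not necessarily what the answer's author had in mind: the body of sieveList is an assumption, and it takes a Map that has already been built, so the duplicate keys in the question's list have to be resolved before calling it.

```haskell
import qualified Data.Map as Map
import Data.List (partition)

-- Phase 1: split the list by whether each element is a key of the map.
-- Phase 2: bump the counter once for every element that was a key.
sieveList :: Ord k => Map.Map k Int -> [k] -> (Map.Map k Int, [k])
sieveList m xs = (m', rest)
  where
    (hits, rest) = partition (`Map.member` m) xs
    m' = foldr (\k acc -> Map.insertWith (+) k 1 acc) m hits
```

For example, sieveList (Map.fromList [('a', 2), ('b', 5), ('w', 21)]) "aacgawb" keeps 'c' and 'g' in the leftover list and adds the other characters to their counters.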

The internet · Answer 2 · 2013-01-25T06:36:23+0000

While I'm by no means a Haskell expert, I have a partial attempt that performs the operation once and returns the result. Maybe you can work out how to map it over the rest to get your solution. Note that addWhile only updates the first occurrence of an item in lista; if the item occurs twice, it just adds 0 to the later occurrence. Criticism of the code is more than welcome.

import Data.List
type AInfo = (Char, Int)
type AList = [AInfo]

type BInfo = Char
type BList = [BInfo]

lista = ([('a', 2), ('b',5), ('a', 1), ('w', 21)] :: AList)
listb = ['a','a','c','g','a','w','b']

--step one, get the head, and its occurrences
items list = (eleA, eleB) where
        eleA = length $ filter (\x -> x == (head list)) list
        eleB = head list

-- remove every occurrence of ele (dropWhile would only strip leading ones)
getRidOfIt list ele = filter (\x -> x /= ele) list --drop like its hot

--add to lista
addWhile :: [(Char, Int)] -> Char -> Int -> [(Char,Int)]
addWhile [] _ _ = []
-- once the first match has been updated, carry 0 so later duplicates stay unchanged
addWhile ((x,y):xs) letter times = if x == letter then (x,y+times) : addWhile xs letter 0
                                   else (x,y) : addWhile xs letter times

--first answer
firstAnswer = addWhile lista (snd $ items listb) (fst $ items listb)
--[('a',5),('b',5),('a',1),('w',21)]

Father deus · Answer 3 · 2013-01-25T06:37:48+0000

import Data.List

type AInfo  =  (Char, Int)
type AList  =  [AInfo]

type BInfo  =  Char
type BList  =  [BInfo]

process :: AList -> BList -> AList
process [] _ = []
process (a:as) b = if is_in a b then (fst a,snd a + 1):(process as (delete (fst a) b)) else a:process as b where
        is_in f [] = False
        is_in f (s:ss) = if fst f == s then True else is_in f ss

*Main> process [('a',5),('b',5),('a',1),('b',21)] ['c','b','g','w','b']
[('a',5),('b',6),('a',1),('b',22)]
*Main> process [('a',5),('b',5),('a',1),('w',21)] ['c','g','w','b']
[('a',5),('b',6),('a',1),('w',22)]

Probably an important disclaimer: I'm rusty in Haskell to the point of ineptitude, but as a relaxing midnight exercise, I wrote this thing. It should do what you want, although it doesn't return BList. With a little change, you can get it to return an (AList, BList) tuple, but in my opinion you would be better off using an imperative language if this kind of manipulation is required.

Alternatively, there may be a neat solution that I am too ignorant of Haskell to know.
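A minimal sketch of what the "little change" mentioned above might look like. The name processBoth is hypothetical, and unlike process it counts every occurrence of a character (rather than deleting a single one), which is an assumption made here to match the expected output in the question.

```haskell
type AList = [(Char, Int)]
type BList = [Char]

-- Walk AList left to right, threading the shrinking BList through the recursion.
processBoth :: AList -> BList -> (AList, BList)
processBoth [] bs = ([], bs)
processBoth ((c, n) : rest) bs = ((c, n + hits) : rest', bs'')
  where
    hits = length (filter (== c) bs)      -- how often c occurs in BList
    bs' = filter (/= c) bs                -- BList with every c removed
    (rest', bs'') = processBoth rest bs'  -- continue with the remaining pairs
```

With the question's data, processBoth [('a',2),('b',5),('a',1),('w',21)] "aacgawb" evaluates to ([('a',5),('b',6),('a',1),('w',22)],"cg").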

Will ness · Answer 4 · 2013-01-25T10:11:05+0000

The operation you are describing is pure, as @luqui points out, so we just define it as a pure Haskell function. It can be used inside a monad (including IO) with fmap (or do).
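As a brief aside before the actual definition, here is a small sketch of using a pure function from IO. It assumes the "dynamic" list lives in an IORef, which the question does not say; bump and demo are hypothetical names used only for illustration.

```haskell
import Data.IORef (modifyIORef, newIORef, readIORef)

-- A pure update: add one to every pair whose key equals c.
bump :: Char -> [(Char, Int)] -> [(Char, Int)]
bump c = map (\(k, n) -> if k == c then (k, n + 1) else (k, n))

-- The IO code only stores and fetches; all the list logic stays pure.
demo :: IO [(Char, Int)]
demo = do
  ref <- newIORef [('a', 2), ('b', 5)]
  modifyIORef ref (bump 'a')        -- apply the pure function to the stored list
  fmap (bump 'b') (readIORef ref)   -- or map it over the result of an IO action
```

Here demo yields [('a',3),('b',6)]: the first update goes through the reference, the second is applied with fmap to the value read back.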

import Data.List

combine alist blist = (reverse a, b4) where

First, we sort and count list B:

  b = map (\g->(head g,length g)) . group . sort $ blist

We need the import for group and sort. Then we fold along alist and do the following:

  (a,b2) = foldl g ([],b) alist
  g (acc,b) e@(x,c) = case pick x b of 
                        Nothing -> (e:acc,b)
                        Just (n,b2) -> ((x,c+n):acc,b2)
  b3 = map fst b2
  b4 = [ c | c <- blist, elem c b3 ]

Now pick, as used above, should be

  pick x [] = Nothing
  pick x ((y,n):t) 
     | x==y = Just (n,t)
     | otherwise = case pick x t of Nothing -> Nothing
                                    Just (k,r) -> Just (k, (y,n):r)

pick does a linear search, of course, so if performance (speed) becomes an issue, b should be changed to allow for binary search (a tree etc., for example Map). The calculation of b4, which is filter (`elem` b3) blist, is another potential performance problem because of its repeated checks for presence in b3. Again, checking for presence in trees is faster than in lists.
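One way the Map-based variant hinted at above might look, as a sketch only. combineMap is a hypothetical name, and using Data.Map with mapAccumL in place of pick and foldl is a swapped-in technique, not the answer's code.

```haskell
import qualified Data.Map as Map
import Data.List (mapAccumL)

combineMap :: Ord k => [(k, Int)] -> [k] -> ([(k, Int)], [k])
combineMap alist blist = (alist', filter (`Map.member` leftover) blist)
  where
    -- count the occurrences of every element of blist
    counts = Map.fromListWith (+) [(x, 1) | x <- blist]
    -- walk alist, moving each count to the first pair with a matching key
    (leftover, alist') = mapAccumL step counts alist
    step m (x, c) = case Map.lookup x m of
      Nothing -> (m, (x, c))
      Just n  -> (Map.delete x m, (x, c + n))
```

Both the lookup per alist element and the final membership test on blist are then logarithmic in the number of distinct keys instead of linear.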

Testing:

> combine [('a', 2), ('b',5), ('a', 1), ('w', 21)] "aacgawb"

([('a',5),('b',6),('a',1),('w',22)],"cg")

edit: you probably want it the other way around, rolling along blist, updating alist and producing (or not) the elements of blist in the result (b4 in my code). That way the algorithm will operate in a more local manner on a long input stream (assuming your blist is long, although you didn't say so). As written above, it will have a space problem, consuming the input stream blist several times over. I'll keep it as is, as an illustration and food for thought.

So, if you decide to go the second route, first convert your alist into a Map (beware of duplicates!). Then scan (with scanl) over blist, using updateLookupWithKey to update the counts map and at the same time decide for each member of blist, one by one, whether to output it or not. The accumulator type will thus have to be (Maybe a, Map a Int), with a your element type (blist :: [a]):

import Data.Map (fromList, updateLookupWithKey)

-- for reference: scanl :: (acc -> a -> acc) -> acc -> [a] -> [acc]
-- alist and blist are the two input lists

scanning = tail $ scanl g (Nothing, fromList $ reverse alist) blist
g (_,cmap) a = case updateLookupWithKey (\_ c->Just(c+1)) a cmap of
                 (Just _, m2) -> (Nothing, m2)   -- key present: count updated
                 _            -> (Just a, cmap)  -- not present in counts
new_b_list = [ a | (Just a,_) <- scanning ]
last_counts = snd $ last scanning

You will need to merge toList last_counts with the original alist if you need to keep the old duplicates there (why would you?).
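For reference, the scan-based route can be packaged as a single self-contained function. This is a sketch under the same assumptions as above (first occurrence wins for duplicate keys in alist); the name sieveStream is hypothetical, and catMaybes replaces the list comprehension.

```haskell
import qualified Data.Map as Map
import Data.Maybe (catMaybes)

sieveStream :: Ord a => [(a, Int)] -> [a] -> (Map.Map a Int, [a])
sieveStream alist blist = (finalCounts, catMaybes (map fst steps))
  where
    -- fromList keeps the last value per key, so reversing makes the first one win
    start = Map.fromList (reverse alist)
    -- steps always contains the initial accumulator, so last is safe here
    steps = scanl step (Nothing, start) blist
    finalCounts = snd (last steps)
    step (_, counts) x =
      case Map.updateLookupWithKey (\_ c -> Just (c + 1)) x counts of
        (Just _, counts') -> (Nothing, counts')  -- known key: count it, emit nothing
        (Nothing, _)      -> (Just x, counts)    -- unknown key: pass it through
```

Applied to the question's data it returns the counts a=5, b=6, w=22 together with the leftover "cg"; the second ('a', 1) pair is dropped by the conversion to a Map.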
