List processing in Haskell
I am teaching myself Haskell and have run into a problem I need help with.
Background:
type AInfo = (Char, Int)
type AList = [AInfo] (let's say [('a', 2), ('b', 5), ('a', 1), ('w', 21)])
type BInfo = Char
type BList = [BInfo] (let's say ['a', 'a', 'c', 'g', 'a', 'w', 'b'])
One quick edit: the above information is for illustration purposes only. The actual list elements are a little more complicated. Also, the lists are not static; they are dynamic (hence the use of the IO monad), and I need to keep / pass / "return" / access and modify the lists while the program is running.
I want to do the following:
For each AList element, check it against all BList elements; wherever the character of the AList element (pair) equals a character in BList, add 1 to the Int value of the AList element (pair) and remove that character from BList.
So this means that after the first AList element has been checked against all BList elements, the values of the lists should be:
AList [('a', 5), ('b', 5), ('a', 1), ('w', 21)]
BList ['c', 'g', 'w', 'b']
And in the end, the values of the lists should be:
AList [('a', 5), ('b', 6), ('a', 1), ('w', 22)]
BList ['c', 'g']
Of course, all of this happens in the IO monad.
What I have tried:
- Using mapM and a recursive helper function. I looked at both directions: each AList element checked against every BList element, mapM (myHelpF1 alist) blist, and each BList element checked against every AList element, mapM (myHelpF2 alist) blist.
- Passing both lists to a function and using complicated if / then / else and helper function calls (feels like I'm forcing Haskell to be iterative; messy, convoluted code, doesn't feel right.)
- I thought about using filter with the character value of the AList element and BList to create a third list of Bool and count the number of True values. Update the Int value. Then use filter on BList to remove the BList elements that ...... (again, doesn't feel right, not very Haskell-like.)
Things I think I know about the problem:
The solution may well be trivial. So much so that more experienced Haskellers will mumble "what a noob" to themselves as they type in an answer.
Any pointers would be greatly appreciated. (mumble ....)
Several pointers:
Do not use [(Char, Int)] for your "AList". The data structure you are looking for is a finite map: Map Char Int. In particular, look at member and insertWith. toList and fromList convert to and from the representation you currently have for AList, so even if you are stuck with that representation, you can convert to a Map for this algorithm and convert back at the end. (That will be more efficient than staying in lists, because you are doing so many lookups, and the finite map API is easier to work with than lists.)
I would approach the problem in two phases: (1) partition the elements of blist by whether they are in the map, (2) insertWith the elements that are already in the map. Then you can return the resulting map and the other partition.
I would also get rid of meaningless assumptions such as the keys being Char. You can just say they are any type k (for "key") that satisfies the constraints you need (namely that you can put it in a Map, which requires it to be Orderable). You do this with lowercase type variables:
import qualified Data.Map as Map
sieveList :: (Ord k) => Map.Map k Int -> [k] -> (Map.Map k Int, [k])
Writing algorithms more generally helps you catch errors, because it doesn't let you rely on assumptions you don't need.
Oh, also, this program has no business being in the IO monad. This is pure code.
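The two phases described above could be filled in roughly as follows. This is only a sketch of one way to implement the sieveList signature, not necessarily what the answerer had in mind; it assumes each matching element of the list should add 1 to its key's count, as in the question's example.

```haskell
import qualified Data.Map as Map
import Data.List (foldl', partition)

-- Phase 1: split the keys by whether they occur in the map.
-- Phase 2: add 1 to the count of every key that does occur.
sieveList :: Ord k => Map.Map k Int -> [k] -> (Map.Map k Int, [k])
sieveList counts keys = (counts', leftover)
  where
    (present, leftover) = partition (`Map.member` counts) keys
    counts' = foldl' (\m k -> Map.insertWith (+) k 1 m) counts present
```

Note that Map.fromList keeps only one entry per key, so duplicate pairs such as the two 'a' entries in the question would collapse into one when converting the list to a Map.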
While I'm by no means a Haskell expert, I have a partial attempt that performs this operation once. Maybe you can work out how to apply it to the remaining elements to get your solution. Note that addWhile is written so that only the first occurrence of an item in lista is updated; if the item exists twice, the second occurrence just gets 0 added to it. Criticism of the code is more than welcome.
import Data.List

type AInfo = (Char, Int)
type AList = [AInfo]
type BInfo = Char
type BList = [BInfo]

lista = ([('a', 2), ('b',5), ('a', 1), ('w', 21)] :: AList)
listb = ['a','a','c','g','a','w','b']

--step one, get the head, and its occurrences
items list = (eleA, eleB) where
    eleA = length $ filter (\x -> x == (head list)) list
    eleB = head list

--remove every occurrence of ele (dropWhile would only strip the leading run)
getRidOfIt list ele = filter (\x -> x /= ele) list

--add to lista: the first matching pair gets the count, later matches get 0
addWhile :: [(Char, Int)] -> Char -> Int -> [(Char,Int)]
addWhile [] _ _ = []
addWhile ((x,y):xs) letter times =
    if x == letter then (x,y+times) : addWhile xs letter 0
    else (x,y) : addWhile xs letter times

--first answer
firstAnswer = addWhile lista (snd $ items listb) (fst $ items listb)
--[('a',5),('b',5),('a',1),('w',21)]
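One possible way to extend the single step above to every element is sketched below. The names addFirst and sieve are made up for this illustration and are not from the answer; the sketch assumes, as in the question's example, that only the first pair with a given key receives the count.

```haskell
import Data.List (nub)

-- Add `times` to the first pair whose key matches; leave the rest untouched.
addFirst :: Eq k => k -> Int -> [(k, Int)] -> [(k, Int)]
addFirst _ _ [] = []
addFirst key times ((x, y) : rest)
  | x == key  = (x, y + times) : rest
  | otherwise = (x, y) : addFirst key times rest

-- Repeat the step for every distinct element of blist that is a key in alist,
-- and return the elements of blist that matched no key.
sieve :: Eq k => [(k, Int)] -> [k] -> ([(k, Int)], [k])
sieve alist blist = (foldl bump alist matched, filter (`notElem` keys) blist)
  where
    keys       = map fst alist
    matched    = filter (`elem` keys) (nub blist)
    bump acc k = addFirst k (length (filter (== k) blist)) acc
```

This stays with plain lists, so every lookup is linear; it is meant to show the shape of the computation, not an efficient implementation.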
import Data.List
type AInfo = (Char, Int)
type AList = [AInfo]
type BInfo = Char
type BList = [BInfo]
process :: AList -> BList -> AList
process [] _ = []
process (a:as) b =
    if is_in a b
        then (fst a, snd a + 1) : process as (delete (fst a) b)
        else a : process as b
  where
    is_in f [] = False
    is_in f (s:ss) = if fst f == s then True else is_in f ss
*Main> process [('a',5),('b',5),('a',1),('b',21)] ['c','b','g','w','b']
[('a',5),('b',6),('a',1),('b',22)]
*Main> process [('a',5),('b',5),('a',1),('w',21)] ['c','g','w','b']
[('a',5),('b',6),('a',1),('w',22)]
Probably an important disclaimer: I'm rusty in Haskell to the point of ineptitude, but as a relaxing midnight exercise, I wrote this thing. It should do what you want, although it doesn't return BList. With a little change, you can get it to return a (AList, BList) tuple, but in my opinion you would be better off using an imperative language if this kind of manipulation is required.
Alternatively, there may be a neat solution that I am too ignorant of Haskell to know about.
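The "little change" needed to return an (AList, BList) tuple might look like the sketch below. The name processBoth is invented for this illustration; it keeps the behaviour of process above (each matching pair gains 1 and consumes one occurrence from the second list) and simply threads the remaining list back out.

```haskell
import Data.List (delete)

-- Like process, but also returns whatever is left of the second list.
processBoth :: [(Char, Int)] -> [Char] -> ([(Char, Int)], [Char])
processBoth [] bs = ([], bs)
processBoth (a@(c, n) : rest) bs
  | c `elem` bs =
      let (as', bs') = processBoth rest (delete c bs)  -- consume one occurrence
      in ((c, n + 1) : as', bs')
  | otherwise =
      let (as', bs') = processBoth rest bs
      in (a : as', bs')
```

With the second GHCi input shown above, processBoth [('a',5),('b',5),('a',1),('w',21)] "cgwb" evaluates to ([('a',5),('b',6),('a',1),('w',22)],"cg").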
The operation you are describing is pure, as @luqui points out, so we just define it as a pure Haskell function. It can be used inside a monad (including IO) with fmap (or do).
import Data.List
combine alist blist = (reverse a, b4) where
First, we sort and count blist:
    b = map (\g->(head g,length g)) . group . sort $ blist
We need the import for group and sort. Then we fold over alist and do the following:
    (a,b2) = foldl g ([],b) alist
    g (acc,b) e@(x,c) = case pick x b of
        Nothing -> (e:acc,b)
        Just (n,b2) -> ((x,c+n):acc,b2)
    b3 = map fst b2
    b4 = [ c | c <- blist, elem c b3 ]
Now pick, as used above, should be
pick x [] = Nothing
pick x ((y,n):t)
    | x==y = Just (n,t)
    | otherwise = case pick x t of
        Nothing -> Nothing
        Just (k,r) -> Just (k, (y,n):r)
pick does a linear search, of course, so if performance (speed) becomes a problem, b should be changed to allow a binary search (a tree etc., such as Map). Computing b4, which is filter (`elem` b3) blist, is another potential performance problem because of its repeated membership checks in b3. Again, checking for presence in trees is faster than in lists.
Testing:
> combine [('a', 2), ('b',5), ('a', 1), ('w', 21)] "aacgawb"
([('a',5),('b',6),('a',1),('w',22)],"cg")
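To illustrate the remark at the top of this answer about using the pure function inside IO with fmap or do, here is a small sketch. It repeats combine and pick in assembled form so that it stands alone (pick's case expression is written with fmap, which behaves the same way); readBList is a made-up placeholder for whatever real input action the program has, and viaFmap and viaDo are illustrative names.

```haskell
import Data.List (group, sort)

-- combine and pick, assembled from the pieces above.
combine :: [(Char, Int)] -> String -> ([(Char, Int)], String)
combine alist blist = (reverse a, b4)
  where
    b = map (\g -> (head g, length g)) . group . sort $ blist
    (a, b2) = foldl step ([], b) alist
    step (acc, bs) e@(x, c) = case pick x bs of
      Nothing       -> (e : acc, bs)
      Just (n, bs') -> ((x, c + n) : acc, bs')
    b3 = map fst b2
    b4 = filter (`elem` b3) blist

pick :: Eq k => k -> [(k, v)] -> Maybe (v, [(k, v)])
pick _ [] = Nothing
pick x ((y, n) : t)
  | x == y    = Just (n, t)
  | otherwise = fmap (\(k, r) -> (k, (y, n) : r)) (pick x t)

-- Hypothetical input source standing in for real I/O such as getLine.
readBList :: IO String
readBList = return "aacgawb"

-- Lift the pure function over the IO action with fmap ...
viaFmap :: [(Char, Int)] -> IO ([(Char, Int)], String)
viaFmap alist = fmap (combine alist) readBList

-- ... or bind the input in do-notation and apply the function directly.
viaDo :: [(Char, Int)] -> IO ([(Char, Int)], String)
viaDo alist = do
  blist <- readBList
  return (combine alist blist)
```

Either way the list logic stays pure; only reading the input lives in IO.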
edit: you probably want to go the other way around: traverse blist, updating alist and producing (or not) the elements of blist in the result (b4 in my code). That way the algorithm will operate in a more local manner on a long input stream (assuming your blist is long, though you didn't say so). As written above, it will have a space problem, because it consumes the input stream blist several times over. I'll keep it as an illustration, food for thought.
So, if you decide to go the second route, convert alist to a Map first (beware of duplicates!). Then scan (with scanl) over blist, using updateLookupWithKey to update the counts map and at the same time decide for each member of blist, one by one, whether to output it or not. So the accumulator type should be (Maybe a, Map a Int), with a your element type (blist :: [a]):
-- needs: import Data.Map (fromList, updateLookupWithKey)
-- scanl :: (acc -> a -> acc) -> acc -> [a] -> [acc]
scanning = tail $ scanl g (Nothing, fromList $ reverse alist) blist
g (_,cmap) a = case updateLookupWithKey (\_ c->Just(c+1)) a cmap of
    (Just _, m2) -> (Nothing, m2)   -- seen before
    _            -> (Just a, cmap)  -- not present in counts
new_b_list = [ a | (Just a,_) <- scanning ]
last_counts = snd $ last scanning
You will need to merge toList last_counts with the original alist if you need to keep the old duplicates (why would you?).
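For reference, the scanning approach can be packaged as a single self-contained function along the following lines. This is a sketch: the wrapper name scanCounts is made up, the tail is dropped so that an empty blist does not make last fail, and the result is returned as toList of the final map, so duplicate keys in alist are not preserved.

```haskell
import qualified Data.Map as Map

-- Scan over blist once, carrying (element to emit, counts so far).
scanCounts :: Ord a => [(a, Int)] -> [a] -> ([(a, Int)], [a])
scanCounts alist blist = (Map.toList lastCounts, newBList)
  where
    -- reverse so that the first of any duplicate keys in alist wins in fromList
    start = (Nothing, Map.fromList (reverse alist))
    steps = scanl step start blist
    step (_, cmap) x =
      case Map.updateLookupWithKey (\_ c -> Just (c + 1)) x cmap of
        (Just _, m2) -> (Nothing, m2)   -- key present: count it, drop x
        _            -> (Just x, cmap)  -- key absent: pass x through
    newBList   = [x | (Just x, _) <- steps]
    lastCounts = snd (last steps)
```

Because newBList is built lazily from the scan, elements of blist that match no key can be emitted as soon as they are seen, which is the locality benefit described in the edit above.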