Creating lists of anagrams from a list of words

I want to find a list of lists of anagrams from a list of words. Should I use a different loop in my code or recursion?

some_list = ['bad', 'app', 'sad', 'mad', 'dab','pge', 'bda', 'ppa', 'das', 'dba']

new_list = [some_list[0]]
i = 0
while i+1 < len(some_list):
    if (''.join(sorted(some_list[0]))) == (''.join(sorted(some_list[i+1]))):
        new_list.append(some_list[i+1])
        i = i+1
    else:
        i = i+1

print(new_list)

      


  • My conclusion ['bad', 'dab', 'bda', 'dba']

    . But I also want more lists of other anagrams from some_list

    .

I want the result to be: - ['app', 'ppa']

 - ['bad', 'dab', 'bda', 'dba']

 -['sad', 'das']

+3


source to share


6 answers


I recommend that you write Python, not Java or any other language that you imitate. Here's your main Python code, with a regular loop and no extra stuff:

new_list = [some_list[0]]
for word in some_list[1:]:
    if sorted(some_list[0]) == sorted(word):
        new_list.append(word)

      

I don't see any use for recursion, but yes, you could wrap an outer loop around this to find other groups of anagrams.


Although I would do it using the helpful itertools.groupby :

for _, group in groupby(sorted(some_list, key=sorted), sorted):
    group = list(group)
    if len(group) > 1:
        print(group)

      

What prints:



['bad', 'dab', 'bda', 'dba']
['sad', 'das']
['app', 'ppa']

      


Alternative solution for modified question when sorting groups:

groups = (list(group) for _, group in groupby(sorted(some_list, key=sorted), sorted))
print([group for group in sorted(groups) if len(group) > 1])

      

Output:

[['app', 'ppa'], ['bad', 'dab', 'bda', 'dba'], ['sad', 'das']]

      

+5


source


Your problem is that you are looping once over your list as you need to loop over all the words.

But I suggest another way for this task, you can use the itertools.groupby

sorted function as well with operator.itemgetter

:

some_list = ['bad', 'app', 'sad', 'mad', 'dab','pge', 'bda', 'ppa', 'das', 'dba']

from operator import itemgetter
from itertools import groupby 
s=sorted([(i,''.join(sorted(j))) for i,j in enumerate(some_list)],key=itemgetter(1))
inds= [zip(*g)[0] for _,g in groupby(s,itemgetter(1))]
print [itemgetter(*i)(some_list) for i in inds]

      

Result:

[('bad', 'dab', 'bda', 'dba'), 'mad', ('sad', 'das'), ('app', 'ppa'), 'pge']

      

All I have done here is create a list of sorted words with these indices using sorted

and enumerate

:



sorted([(i,''.join(sorted(j))) for i,j in enumerate(some_list)],key=itemgetter(1))
[(0, 'abd'), (4, 'abd'), (6, 'abd'), (9, 'abd'), (3, 'adm'), (2, 'ads'), (8, 'ads'), (1, 'app'), (7, 'app'), (5, 'egp')]

      

then we need to group these pairs based on the second element and get the first element (indices), so we will have the following list of tuples:

[(0, 4, 6, 9), (3,), (2, 8), (1, 7), (5,)]

      

that each tuple contains word indices, that these sorted views are the same.

and finally, all you need is to collect the elements of the main list based on the previous indices:

[itemgetter(*i)(some_list) for i in inds]
[('bad', 'dab', 'bda', 'dba'), 'mad', ('sad', 'das'), ('app', 'ppa'), 'pge']

      

+3


source


1) Create a function anagrams(word)

that returns a list of anagrams for one word like your code.
2) a map

function above your wordlist.

0


source


Here's the solution:

from itertools import groupby
some_list = ['bad', 'app', 'sad', 'mad', 'dab','pge', 'bda', 'ppa', 'das', 'dba']
some_list_ordered = map( lambda x : "".join( sorted( x) ), some_list )
some_lists = sorted(zip( some_list_ordered, some_list ) )
anagrams = filter( lambda x : len( x ) > 1, [ zip(*v)[1]  for k,v in groupby( some_lists, lambda x : x[0] ) ] )    

for a in anagrams:
    print a

#('bad', 'bda', 'dab', 'dba')
#('das', 'sad')
#('app', 'ppa')

      

0


source


A natural way to do it, if you can afford the extra Slovak memory overhead, it seems to me:

words = ['bad', 'app', 'sad', 'mad', 'dab','pge', 'bda', 'ppa', 'das', 'dba']

anagrams = {}
for word in words:
    sword = ''.join(sorted(word))
    try:
        anagrams[sword].append(word)
    except KeyError:
        anagrams[sword] = [word]

anagrams_list = [v for v in anagrams.values() if len(v) > 1]
print anagrams_list

      

Output:

[['app', 'ppa'], ['bad', 'dab', 'bda', 'dba'], ['sad', 'das']]

      

EDIT: As mentioned in the comment below, you can replace the block try...except

with a method dict

setdefault

if the syntax doesn't bother you:

words = ['bad', 'app', 'sad', 'mad', 'dab','pge', 'bda', 'ppa', 'das', 'dba']

anagrams = {}
for word in words:
    sword = ''.join(sorted(word))
    anagrams.setdefault(sword, []).append(word)

anagrams_list = [v for v in anagrams.values() if len(v) > 1]
print anagrams_list

      

0


source


You can group words in a dict using the sorted word as a key, filtering out words whose values ​​do not have at least two elements, using OrderedDict to preserve order:

some_list = ['bad', 'app', 'sad', 'mad', 'dab','pge', 'bda', 'ppa', 'das', 'dba']


from collections import OrderedDict

od = OrderedDict()
for ele in some_list:
    srt = "".join(sorted(ele))
    od.setdefault(srt,[]).append(ele)

print(filter(lambda x: len(x) > 1, od.values()))


[['bad', 'dab', 'bda', 'dba'], ['app', 'ppa'], ['sad', 'das']]

      

Or using a loop and adding to the list, using the temp list to collect common words:

new_list = []
from collections import OrderedDict
for ele in OrderedDict.fromkeys("".join(sorted(ele)) for ele in some_list):
    temp = []
    for s in some_list:
        if ele == ''.join(sorted(s)):
            temp.append(s)
    if len(temp) > 1:
        new_list.append(temp)

      

If the order doesn't matter the defaultdict will be more efficient:

from collections import defaultdict

d = defaultdict(list)
for ele in some_list:
    d[''.join(sorted(ele))].append(ele)

print(filter(lambda x: len(x) > 1, d.values()))
[['app', 'ppa'], ['bad', 'dab', 'bda', 'dba'], ['sad', 'das']]

      

0


source







All Articles