Remove double values in an array using Python

Question

I have a problem with a small Python script I wrote. In short:

What I have: an array consisting of arrays consisting of integers:

finalSubLines = [[0,44,52,52,57],[12,154,25,154],[41,42,43,43,74]]

What I want from this program: iterate over all sub-arrays, sort them and remove duplicate values (e.g. 52 in sub-array 0, 154 in sub-array 1 and 43 in sub-array 2).

My script:

finalSubLines = [[0,44,52,52,57],[12,154,25,154],[41,42,43,43,74]]
print "\n"

print "list:",finalSubLines,"\n========================"

for i in range (0, len(finalSubLines)):
    finalSubLines[i].sort()
    print "iterate until index:",len(finalSubLines[i])-2
    print "before del-process:",finalSubLines[i]
    for j in range (0, len(finalSubLines[i])-2):
        print "j=",j
        if finalSubLines[i][j] == finalSubLines[i][j+1]:
            del finalSubLines[i][j]     
    print "after del-process:",finalSubLines[i]
    print "-------------"
print finalSubLines

This is what I get:

Problem:

I don't understand why the range of the first for-loop has to be len(finalSubLines) and not len(finalSubLines) - 1. I tried the latter, but then the last sub-array was not reached.
The main problem is that the second for-loop does not reach all the elements (blue rectangles in the image). As a result, the value 154 is not removed (see the red rectangle in the image). Why is this happening?

There might be an easier way to get what I need, but since I'm completely new to scripting I don't know better ^^

python arrays

Elias Dec 21 14 at 13:08

3 answers

Martijn Pieters · Answer 1 · 2014-12-21T13:11:21+0000

An easy way would be to use sets:

[sorted(set(sub)) for sub in finalSubLines]

Demo:

>>> finalSubLines = [[0,44,52,52,57],[12,154,25,154],[41,42,43,43,74]]
>>> [sorted(set(sub)) for sub in finalSubLines]
[[0, 44, 52, 57], [12, 25, 154], [41, 42, 43, 74]]

Your loop doesn't account for the fact that by removing elements, your lists get shorter; what was once at index i + 1 has moved to index i, but your loop happily moves on to index i + 1, and the value there was once at index i + 2. This way you skip items.

See Loop "Forgets" to remove some of the items for a more detailed description of what's going on.
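If you would rather keep an explicit loop, here is a minimal sketch of one way to avoid the skipped indices: walk each sorted sub-list from the end, so a deletion never shifts the positions still to be visited. The helper name dedupe_sorted_in_place is made up for this illustration, and the code is written for Python 3 (the question's script uses Python 2 print statements). It also touches on the range question: range stops before its end value, so range(0, len(finalSubLines)) already covers every sub-array, whereas the inner range(0, len(...) - 2) stops one pair short, and in the sorted sub-array [12, 25, 154, 154] the final pair is never compared.

```python
def dedupe_sorted_in_place(values):
    """Sort the list and delete adjacent duplicates without skipping items.

    Hypothetical helper for illustration; not part of the original script.
    """
    values.sort()
    # range(len(values) - 1, 0, -1) yields len-1, ..., 1; the stop value 0
    # is excluded, which is fine because index 0 has no left neighbour.
    # Going backwards means a deletion only moves items already checked.
    for j in range(len(values) - 1, 0, -1):
        if values[j] == values[j - 1]:
            del values[j]
    return values


final_sub_lines = [[0, 44, 52, 52, 57], [12, 154, 25, 154], [41, 42, 43, 43, 74]]
for sub in final_sub_lines:
    dedupe_sorted_in_place(sub)

print(final_sub_lines)
# → [[0, 44, 52, 57], [12, 25, 154], [41, 42, 43, 74]]
```

For this task the set-based one-liner is shorter; the loop is mainly useful for seeing why iterating backwards sidesteps the index shift.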

Padraic Cunningham · Answer 2 · 2014-12-21T13:20:54+0000

If you really want to remove items that appear exactly twice:

finalSubLines = [[0,44,52,52,57],[12,154,25,154],[41,42,43,43,74]]

from collections import Counter
counts = [Counter(sub) for sub in finalSubLines]

print([sorted(k for k,v in c.iteritems() if v != 2) for c in counts])
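Note that iteritems() only exists on Python 2 dictionaries. As a rough sketch of the same idea for Python 3, assuming the goal really is to drop every value that occurs exactly twice (so 52, 154 and 43 disappear entirely rather than being reduced to one copy), items() can be used instead. The variable names below are chosen for this example only.

```python
from collections import Counter

final_sub_lines = [[0, 44, 52, 52, 57], [12, 154, 25, 154], [41, 42, 43, 43, 74]]

# Count occurrences per sub-list, then keep only the values whose count
# is not exactly 2. Counter.items() works on Python 3 (and Python 2).
result = [
    sorted(value for value, count in Counter(sub).items() if count != 2)
    for sub in final_sub_lines
]

print(result)
# → [[0, 44, 57], [12, 25], [41, 42, 74]]
```

This differs from the set-based answer, which keeps one copy of each duplicated value.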

Ash · Answer 3 · 2014-12-21T14:17:22+0000

Try the following:

tempList = []
finalSubLines = [[0,44,52,52,57],[12,154,25,154],[41,42,43,43,74]]
for item in finalSubLines:
    # set() drops duplicates; sorted() returns them as an ordered list
    tempList.append(sorted(set(item)))
print tempList
