Remove duplicate values in an array using Python

I have a problem with a small Python script I wrote. In short:

What I have: an array consisting of arrays consisting of integers:

finalSubLines = [[0,44,52,52,57],[12,154,25,154],[41,42,43,43,74]]

      

What I want from this program: iterate over all sub-arrays, sort them, and remove duplicate values (e.g. 52 in sub-array 0, 154 in sub-array 1, and 43 in sub-array 2).

My script:

finalSubLines = [[0,44,52,52,57],[12,154,25,154],[41,42,43,43,74]]
print "\n"

print "list:",finalSubLines,"\n========================"

for i in range (0, len(finalSubLines)):
    finalSubLines[i].sort()
    print "iterate until index:",len(finalSubLines[i])-2
    print "before del-process:",finalSubLines[i]
    for j in range (0, len(finalSubLines[i])-2):
        print "j=",j
        if finalSubLines[i][j] == finalSubLines[i][j+1]:
            del finalSubLines[i][j]     
    print "after del-process:",finalSubLines[i]
    print "-------------"
print finalSubLines

      

This is what I get:

Problem:

  • I don't understand why the range of the first for-loop has to be len(finalSubLines) and not len(finalSubLines) - 1. I tried the latter, but then the last sub-array was not reached.
  • The main problem is that the second for-loop will not reach all the elements (blue rectangles in the image). Therefore the value 154 will not be removed (see the red rectangle in the picture). Why is this happening?

There might be an easier way to get what I need, but since I'm completely new to scripting I don't know better ^^



3 answers


An easy way would be to use sets:

[sorted(set(sub)) for sub in finalSubLines]

      

Demo:



>>> finalSubLines = [[0,44,52,52,57],[12,154,25,154],[41,42,43,43,74]]
>>> [sorted(set(sub)) for sub in finalSubLines]
[[0, 44, 52, 57], [12, 25, 154], [41, 42, 43, 74]]

      

Your loop doesn't account for the fact that by removing elements, your lists get shorter: what was once at index i + 1 moves to index i, but your loop happily jumps to index i + 1, which now holds the value that was once at index i + 2. This way you skip items.

See Loop "Forgets" to remove some of the items for a more detailed description of what's going on.
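To make this concrete, here is a minimal sketch of an in-place fix. The helper name `dedupe_sorted_in_place` is hypothetical and not part of the question's script. It uses a `while` loop that re-reads the list length on every pass and only advances the index when nothing was deleted. Two further points it illustrates: `range(0, n)` yields 0 through n - 1, which is why the outer loop needs `len(finalSubLines)` and not `len(finalSubLines) - 1`; and the original inner bound `len(finalSubLines[i]) - 2` stops one index too early, so the last pair (the two 154s after sorting) is never compared.

```python
def dedupe_sorted_in_place(values):
    """Sort `values` and delete adjacent duplicates in place (hypothetical helper)."""
    values.sort()
    j = 0
    # Re-check the length on every pass, because deletions shrink the list.
    while j < len(values) - 1:
        if values[j] == values[j + 1]:
            # Stay on the same index: a new element has just moved into j + 1.
            del values[j]
        else:
            j += 1
    return values


finalSubLines = [[0, 44, 52, 52, 57], [12, 154, 25, 154], [41, 42, 43, 43, 74]]

# range(0, n) yields 0 .. n - 1, so every sub-list is visited.
for i in range(0, len(finalSubLines)):
    dedupe_sorted_in_place(finalSubLines[i])

print(finalSubLines)
```

For the sample data this should leave `[[0, 44, 52, 57], [12, 25, 154], [41, 42, 43, 74]]`, the same result as the set-based one-liner, while modifying the original lists instead of building new ones.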



If you really want to remove items that appear exactly twice:



finalSubLines = [[0,44,52,52,57],[12,154,25,154],[41,42,43,43,74]]

from collections import Counter
counts = [Counter(sub) for sub in finalSubLines]

print([sorted(k for k,v in c.iteritems() if v != 2) for c in counts])
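Note that `iteritems()` exists only on Python 2 dictionaries. A sketch of the same idea using `items()`, which is available in both Python 2 and Python 3 (the variable name `result` is introduced here only for illustration):

```python
from collections import Counter

finalSubLines = [[0, 44, 52, 52, 57], [12, 154, 25, 154], [41, 42, 43, 43, 74]]

# Keep only the values whose count is not exactly 2.
# Unlike the set-based approach, this drops 52, 154 and 43 entirely.
result = [
    sorted(k for k, v in Counter(sub).items() if v != 2)
    for sub in finalSubLines
]
print(result)
```

For the sample data this yields `[[0, 44, 57], [12, 25], [41, 42, 74]]`, which differs from plain deduplication: values that occur exactly twice disappear instead of being kept once.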

      



Try the following:

tempList = []
finalSubLines = [[0,44,52,52,57],[12,154,25,154],[41,42,43,43,74]]
for item in finalSubLines:
    # set() drops the duplicates; sorted() returns them as an ordered list
    tempList.append(sorted(set(item)))
print tempList

      
