Correcting values in a dictionary
I created a dictionary using this code:
import collections
exons = collections.defaultdict(list)
with open('test_coding.txt') as f:
    for line in f:
        chrom, start, end, isoform = line.split()
        exons[isoform].append((int(start), int(end)))
This code creates a dictionary that looks like this:
{'NM_100': [(75, 90), (100, 120)], 'NM_200': [(25, 50), (55, 75), (100, 125), (155, 200)]}
from this file:
chr1 75 90 NM_100
chr1 100 120 NM_100
chr2 25 50 NM_200
chr2 55 75 NM_200
chr2 100 125 NM_200
chr2 155 200 NM_200
What I want to do is subtract the first value in each list (75 in the first case, 25 in the second) from every value in that list, giving this output:
{'NM_100': [(0, 15), (25, 45)], 'NM_200': [(0, 25), (30, 50), (75, 100), (130, 175)]}
I thought I needed to build my dictionary a different way. My attempt is below, but I can't get this function to work correctly.
def read_exons(line):
    parts = iter(line.split()) #I think the problem is here
    chrom = next(parts)
    start = next(parts)
    end = next(parts)
    base = start[0] #and here
    return name, [(s-base, e-base) for s, e in zip(start, end)]
with open('testing_coding.txt') as f:
    exons = dict(read_exons(line) for line in f
                 if not line.strip().startswith('#'))
Any suggestions?
If you want to do this conversion while reading the file, you can keep a second dictionary that maps each isoform to the first start value seen for it, and subtract that value from every entry as you go.
The problem with doing this without a separate dictionary or list is that once you subtract for the first line, the stored start of the first element becomes 0, so every later value read for that isoform would have 0 subtracted from it. The alternative is to build the dict first and then iterate over it to do the subtraction.
Example -
import collections
exons = collections.defaultdict(list)
firstvalues = {}
with open('test_coding.txt') as f:
    for line in f:
        chrom, start, end, isoform = line.split()
        if isoform not in firstvalues:
            firstvalues[isoform] = int(start)
        exons[isoform].append((int(start) - firstvalues[isoform], int(end) - firstvalues[isoform]))
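The pitfall described above, reading the base back from the list after it has already been shifted, can be reproduced with a short sketch. The in-memory `lines` list and the name `naive` are assumptions made for illustration; they stand in for the file and are not part of the original code.

```python
import collections

# Two sample rows standing in for the first isoform in test_coding.txt.
lines = ["chr1 75 90 NM_100", "chr1 100 120 NM_100"]

naive = collections.defaultdict(list)
for line in lines:
    chrom, start, end, isoform = line.split()
    start, end = int(start), int(end)
    # Pitfall: the base is taken from the stored list, whose first start
    # has already been shifted to 0 after the first row was processed.
    base = naive[isoform][0][0] if naive[isoform] else start
    naive[isoform].append((start - base, end - base))

print(dict(naive))  # {'NM_100': [(0, 15), (100, 120)]}
```

The second tuple stays at (100, 120) instead of becoming (25, 45), which is why the original start is remembered separately in `firstvalues`.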
My approach is to save the value you want to subtract for each key, then apply it to every tuple with map, storing the result back in the same dictionary:
exons = {'NM_100': [(75, 90), (100, 120)], 'NM_200': [(25, 50), (55, 75), (100, 125), (155, 200)]}
for k, v in exons.items():
    x = v[0][0]  # save the first element of the first tuple of each list
    for i, t in enumerate(v):
        exons[k][i] = tuple(map(lambda s: s - x, t))  # tuple() keeps the original format of the exons dictionary
Output:
>>> exons
{'NM_100': [(0, 15), (25, 45)], 'NM_200': [(0, 25), (30, 50), (75, 100), (130, 175)]}
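If the original coordinates are still needed afterwards, the same second-pass idea could be written so that it returns a new dictionary rather than updating in place. This is only a sketch: `rebase` is a hypothetical helper name, and treating an empty list as having base 0 is an assumption.

```python
def rebase(exons):
    """Return a new dict with each list shifted so its first start is 0."""
    shifted = {}
    for isoform, spans in exons.items():
        # Assumption: an empty list has nothing to shift, so use 0 as the base.
        base = spans[0][0] if spans else 0
        shifted[isoform] = [(s - base, e - base) for s, e in spans]
    return shifted


original = {'NM_100': [(75, 90), (100, 120)],
            'NM_200': [(25, 50), (55, 75), (100, 125), (155, 200)]}
relative = rebase(original)
print(relative)
```

Because the base is read once per key before any tuple is rewritten, the input dictionary is left untouched.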