Understanding a Python function
I need help understanding a function I want to use, but I'm not really sure what some parts of it do. I understand that the function creates dictionaries from the reads of a FASTA file. From what I understand, it is supposed to generate vocabulary and suffix dictionaries that are eventually used to extend the ends of overlapping DNA sequences. Code:
def makeSuffixDict(reads, lenSuffix = 20, verbose = True):
    lenKeys = len(reads[0]) - lenSuffix
    dict = {}
    multipleKeys = []
    i = 1
    for read in reads:
        if read[0:lenKeys] in dict:
            multipleKeys.append(read[0:lenKeys])
        else:
            dict[read[0:lenKeys]] = read[lenKeys:]
        if verbose:
            print("\rChecking suffix", i, "of", len(reads), end = "", flush = True)
            i += 1
    for key in set(multipleKeys):
        del(dict[key])
    if verbose:
        print("\nCreated", len(dict), "suffixes with length", lenSuffix, \
            "from", len(reads), "Reads. (", len(reads) - len(dict), \
            "unambiguous)")
    return(dict)
Additional information: reads is created with reads = readFasta("smallReads.fna", verbose = True).
This is how the function is called:
if __name__ == "__main__":
    reads = readFasta("smallReads.fna", verbose = True)
    suffixDicts = makeSuffixDicts(reads, 10)
The smallReads.fna file contains strings of DNA bases:
> read 1
TTATGAATATTACGCAATGGACGTCCAAGGTACAGCGTATTTGTACGCTA
> read 2
AACTGCTATCTTTCTTGTCCACTCGAAAATCCATAACGTAGCCCATAACG
> read 3
TCAGTTATCCTATATACTGGATCCCGACTTTAATCGGCGTCGGAATTACT
Here are the parts I don't understand:
lenKeys = len(reads[0]) - lenSuffix
What does the value [0] mean? From what I understand, "len" returns the number of elements in a list. Why is reads automatically a list? Edit: it seems the contents of a FASTA file can be stored as a list. Can anyone confirm this?
if read[0:lenKeys] in dict:
Does this mean "from 0 to lenKeys"? I'm still confused about the meaning. There is a similar line in another function: if read[-lenKeys:] in dict:
What does the "-" do?
def makeSuffixDict(reads, lenSuffix = 20, verbose = True):
Here I don't understand the parameters: how can reads be a parameter? What is lenSuffix = 20 in the context of this function, other than the value subtracted from len(reads[0])? What is verbose? I've read about a "verbose mode" that ignores whitespace, but I've never seen it used as a parameter and then as a variable.
The tone of your question makes me think you are confusing things that are built into the language (the len function and so on) with things that were defined by the original programmer (such as reads, verbose, etc.).
def some_function(these, are, arbitrary, parameters):
    pass
This function defines a set of parameters. They mean nothing, except for the meaning that I implicitly give them. For example, if I write:
def reverse_string(s):
    pass
then s is probably a string, right? In your example, we have:
def makeSuffixDict(reads, lenSuffix = 20, verbose = True):
    lenKeys = len(reads[0]) - lenSuffix
    ...
From these two lines, we can infer a few things:
- the function will probably return a dictionary (judging by its name)
- lenSuffix is an int, and verbose is a bool (judging by their default values)
- reads can be indexed (string? list? tuple?)
- the elements inside reads have a length (string? list? tuple?)
Since Python is dynamically typed, this is ALL WE CAN KNOW about a function so far. The rest is explained by its documentation or the way it was called.
That said, let me go through your questions in order:
- What does the value [0] mean?
some_object[0] retrieves the first item in the container: [1,2,3][0] == 1 and "Hello, World!"[0] == "H". This is called indexing and is controlled by the magic method __getitem__. In your function, reads[0] is simply the first read.
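As a small illustration of indexing, here is a sketch. The class name Reads is made up for this example and is not part of your code; it only shows that square brackets end up calling __getitem__.

```python
# Indexing built-in containers: position 0 is the first item.
numbers = [1, 2, 3]
greeting = "Hello, World!"
first_number = numbers[0]
first_char = greeting[0]


class Reads:
    """Hypothetical container, only here to show what [] calls under the hood."""

    def __init__(self, sequences):
        self._sequences = list(sequences)

    def __getitem__(self, index):
        # Python translates obj[index] into obj.__getitem__(index).
        return self._sequences[index]


reads = Reads(["TTATGA", "AACTGC", "TCAGTT"])  # toy data, not your file
first_read = reads[0]
print(first_number, first_char, first_read)
```

Lists, tuples and strings already implement __getitem__, which is why reads[0] works no matter which of those types reads happens to be.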
- From what I understand, "len" returns the number of elements in the list.
len is a built-in function that returns the length of an object. It is controlled by the magic method __len__. len('abc') == 3, and also len([1, 2, 3]) == 3. Note that len(['abc']) == 1, since it measures the length of the list, not of the string inside it.
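To connect this to the first line of your function, here is a minimal sketch with toy reads (shortened, made-up sequences). The class ReadCollection is hypothetical and only demonstrates that len() delegates to __len__.

```python
class ReadCollection:
    """Hypothetical class showing that len() delegates to __len__."""

    def __init__(self, sequences):
        self._sequences = list(sequences)

    def __len__(self):
        return len(self._sequences)


reads = ["TTATGAATAT", "AACTGCTATC", "TCAGTTATCC"]  # toy reads, 10 characters each

number_of_reads = len(reads)            # length of the list
length_of_first_read = len(reads[0])    # length of the first string in the list
lenSuffix = 4
lenKeys = len(reads[0]) - lenSuffix     # same arithmetic as in makeSuffixDict
collection_size = len(ReadCollection(reads))

print(number_of_reads, length_of_first_read, lenKeys, collection_size)
```

So len(reads[0]) - lenSuffix means "the length of the first read minus the suffix length", which the function then uses as the key length for every read.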
- Why is reads automatically a list?
reads is a parameter. Its value is whatever the caller passes in (in your case, the result of readFasta). The function seems to expect a list, but that's not a hard and fast rule!
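The source of readFasta is not shown in the question, so the following is only a guess at what such a function might do. The name read_fasta_sketch and its behaviour are assumptions for illustration: it takes an open file-like object and returns a plain list of sequence strings.

```python
import io


def read_fasta_sketch(handle):
    """Hypothetical stand-in for readFasta: collect sequences into a list of strings."""
    reads = []
    current = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header line starts a new record; store the previous one, if any.
            if current:
                reads.append("".join(current))
                current = []
        else:
            current.append(line)
    if current:
        reads.append("".join(current))
    return reads


# Two records copied from the question, fed in from memory instead of a file.
fasta_text = """> read 1
TTATGAATATTACGCAATGGACGTCCAAGGTACAGCGTATTTGTACGCTA
> read 2
AACTGCTATCTTTCTTGTCCACTCGAAAATCCATAACGTAGCCCATAACG
"""
reads = read_fasta_sketch(io.StringIO(fasta_text))
print(type(reads), len(reads), len(reads[0]))
```

If the real readFasta works along these lines, then reads is a list of strings, reads[0] is the first sequence, and len(reads[0]) is its length in bases.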
- (various questions about slicing)
Slicing has the form some_container[start_idx : end_idx [ : step_size]]. It does pretty much what you would expect: "0123456"[0:3] == "012". Slice indices are zero-based and lie between elements, so [0:1] selects the same element as [0], except that a slice returns a sequence of the same type as the container rather than an individual item (therefore [1, 2, 3][0] == 1, but [1, 2, 3][0:1] == [1]; for a string, 'abc'[0:1] == 'a', which is still a string). If you omit the starting or ending index, it is treated as the start or end of the sequence, respectively. I will not go into step size. Negative indices count from the back, so '0123456'[-3:] == '456'. Note that [-0] is not the last value, [-1] is. Contrast this with [0], which is the first value.
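The slices from your function can be tried on a short made-up string. This is a sketch with toy values; the digits make it easy to see which positions each slice picks.

```python
read = "0123456789"   # toy read, each character shows its own position
lenSuffix = 3
lenKeys = len(read) - lenSuffix   # 10 - 3

prefix = read[0:lenKeys]      # from position 0 up to, but not including, lenKeys
suffix = read[lenKeys:]       # from position lenKeys to the end
tail = read[-lenKeys:]        # the last lenKeys characters, counted from the back

# Indexing returns one item; slicing returns a sequence of the same type.
items = [1, 2, 3]
single_item = items[0]
one_item_slice = items[0:1]

print(prefix, suffix, tail, single_item, one_item_slice)
```

Note that prefix + suffix gives back the whole read, which is why the function can use read[0:lenKeys] as the dictionary key and read[lenKeys:] as the value.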
- How can reads be a parameter?
Because the function is defined as makeSuffixDict(reads, ...). That makes reads a parameter.
- What is lenSuffix = 20 in the context of this function?
It looks like the length of the expected suffix! The = 20 is a default value, used when the caller does not pass one.
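A default value can be seen in action with a small sketch. The helper split_read is hypothetical (it is not in your code) and just repeats the key/suffix split for a single read.

```python
def split_read(read, lenSuffix=20):
    """Hypothetical helper: split one read into (key, suffix) like makeSuffixDict does."""
    lenKeys = len(read) - lenSuffix
    return read[:lenKeys], read[lenKeys:]


read = "A" * 30 + "C" * 20   # toy read of 50 characters

key_default, suffix_default = split_read(read)       # lenSuffix falls back to 20
key_ten, suffix_ten = split_read(read, 10)           # positional override
key_five, suffix_five = split_read(read, lenSuffix=5)  # keyword override

print(len(suffix_default), len(suffix_ten), len(suffix_five))
```

This is also what happens in the call from the question: passing 10 as the second argument replaces the default of 20.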
- What is verbose?
verbose doesn't mean anything by itself. It is just another parameter. It looks like the author included a verbose flag so that you can get output while the function is running. Note that the if verbose blocks don't seem to do anything except provide feedback to the user.
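To check that claim, here is a condensed restatement of the function as a sketch. The name make_suffix_dict_sketch and the toy reads are assumptions for illustration, and the progress message is simplified. It runs once with verbose on and once with it off, then compares the results.

```python
import contextlib
import io


def make_suffix_dict_sketch(reads, lenSuffix=20, verbose=True):
    """Condensed restatement of makeSuffixDict, for illustration only."""
    lenKeys = len(reads[0]) - lenSuffix
    suffixes = {}
    duplicated = set()
    for i, read in enumerate(reads, start=1):
        key = read[:lenKeys]
        if key in suffixes:
            duplicated.add(key)            # same key seen before: ambiguous
        else:
            suffixes[key] = read[lenKeys:]
        if verbose:
            print("Checking read", i, "of", len(reads))
    for key in duplicated:
        del suffixes[key]                  # drop every ambiguous key
    return suffixes


reads = ["AAAACC", "GGGGTT", "AAAAGG"]     # toy reads, not from smallReads.fna

# Capture the printed progress so it can be inspected afterwards.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    loud = make_suffix_dict_sketch(reads, lenSuffix=2, verbose=True)
quiet = make_suffix_dict_sketch(reads, lenSuffix=2, verbose=False)

printed_lines = buffer.getvalue().splitlines()
print(loud, quiet, len(printed_lines))
```

Both calls return the same dictionary; the flag only decides whether progress lines are printed. The key "AAAA" appears twice in the toy data, so it is removed as ambiguous.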