Understanding Python Function
I need help understanding a function I want to use, because I'm not sure what some parts of it do. I understand that the function builds dictionaries from the reads of a FASTA file. As far as I can tell, it is meant to generate dictionaries, including suffix dictionaries, that are eventually used to extend reads at their ends (overlapping DNA sequences). Code:
def makeSuffixDict(reads, lenSuffix = 20, verbose = True):
    lenKeys = len(reads[0]) - lenSuffix
    dict = {}
    multipleKeys = []
    i = 1
    for read in reads:
        if read[0:lenKeys] in dict:
            multipleKeys.append(read[0:lenKeys])
        else:
            dict[read[0:lenKeys]] = read[lenKeys:]
        if verbose:
            print("\rChecking suffix", i, "of", len(reads), end = "", flush = True)
            i += 1
    for key in set(multipleKeys):
        del(dict[key])
    if verbose:
        print("\nCreated", len(dict), "suffixes with length", lenSuffix, \
            "from", len(reads), "Reads. (", len(reads) - len(dict), \
            "unambigous)")
    return(dict)
Additional Information: reads = readFasta("smallReads.fna", verbose = True)
This is how the function is called:
if __name__ == "__main__":
    reads = readFasta("smallReads.fna", verbose = True)
    suffixDicts = makeSuffixDicts(reads, 10)
The smallReads.fna file contains strings of bases (DNA):
> read 1
TTATGAATATTACGCAATGGACGTCCAAGGTACAGCGTATTTGTACGCTA
> read 2
AACTGCTATCTTTCTTGTCCACTCGAAAATCCATAACGTAGCCCATAACG
> read 3
TCAGTTATCCTATATACTGGATCCCGACTTTAATCGGCGTCGGAATTACT
Here are the parts I don't understand:
lenKeys = len(reads[0]) - lenSuffix
What does the value [0] mean? From what I understand, "len" returns the number of elements in a list. Why is "reads" automatically a list? Edit: It seems the contents of a FASTA file can be held in a list. Can anyone confirm this?
if read[0:lenKeys] in dict:
Does this mean "from 0 to lenKeys"? I'm still confused about the meaning. There is a similar line in another function: if read[-lenKeys:] in dict:
What does the "-" do?
def makeSuffixDict(reads, lenSuffix = 20, verbose = True):
Here I don't understand the parameters: how can reads be a parameter? What is lenSuffix = 20 in the context of this function, other than a value subtracted from len(reads[0])? What is verbose? I've read about a "verbose mode" that ignores whitespace, but I've never seen it used as a parameter and then as a variable.
The tone of your question makes me think you are confusing things that belong to the language (built-in functions such as len, and so on) with things that were defined by the original programmer (such as reads, verbose, etc.).
def some_function(these, are, arbitrary, parameters):
    pass
This function defines a set of parameters. They mean nothing except for the meaning I implicitly give them. For example, if I write:
def reverse_string(s):
    pass
s is probably a string, right? In your example, we have:
def makeSuffixDict(reads, lenSuffix = 20, verbose = True):
    lenKeys = len(reads[0]) - lenSuffix
    ...
From these two lines, we can conclude a few things:
- the function probably returns a dictionary (judging by its name)
- lenSuffix is an int, and verbose is a bool (judging by their default values)
- reads can be indexed (string? list? tuple?)
- the elements inside reads have a length (string? list? tuple?)
Since Python is dynamically typed, this is ALL WE CAN KNOW about a function so far. The rest is explained by its documentation or the way it was called.
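To make the point about dynamic typing concrete, here is a minimal sketch. first_item_length is a hypothetical helper (it is not part of the code in the question) that only mirrors the len(reads[0]) expression. It works with anything that can be indexed and whose first element has a length:

```python
def first_item_length(reads):
    # Hypothetical helper: same expression as len(reads[0]) in the question.
    return len(reads[0])


print(first_item_length(["ACGT", "TTGA"]))   # list of strings  -> 4
print(first_item_length(("ACGT", "TTGA")))   # tuple of strings -> 4
print(first_item_length([[1, 2, 3], [4]]))   # list of lists    -> 3
print(first_item_length("ACGT"))             # plain string     -> 1 ("A" has length 1)
```

Nothing in the function signature says which of these the author had in mind; only the calling code or the documentation can tell you that.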
That said, let me go through your questions in order:
- What does the value [0] mean?
some_object[0] retrieves the first item in a container: [1, 2, 3][0] == 1, and "Hello, World!"[0] == "H". This is called indexing and is controlled by the magic method __getitem__.
- From what I understand, "len" returns the number of elements in the list.
len is a built-in function that returns the length of an object. It is controlled by the magic method __len__. len('abc') == 3, and also len([1, 2, 3]) == 3. Note that len(['abc']) == 1, since it measures the length of the list, not of the string inside it.
- Why is "reads" automatically a list?
reads is a parameter. It is whatever the calling code passes in. The function does seem to expect a list, but that's not a hard and fast rule!
- (various questions about slicing)
Slicing is written some_container[start_idx : end_idx [: step_size]]. It does pretty much what you would expect: "0123456"[0:3] == "012". Slice indices are zero-based and lie between elements, so [0:1] selects the same element as [0], except that a slice returns a sequence of the same type rather than an individual element (so [1, 2, 3][0] == 1, but [1, 2, 3][0:1] == [1]; for strings both forms give a one-character string, 'abc'[0] == 'abc'[0:1] == 'a'). If you omit the start or end index, it defaults to the beginning or end of the sequence, respectively. I won't go into step size here.
Negative indices count from the back, so '0123456'[-3:] == '456'. Note that [-0] is not the last value; [-1] is. Contrast this with [0], which is the first value.
- How can reads be a parameter?
Because the function is defined as makeSuffixDict(reads, ...). That is what makes it a parameter.
- What is lenSuffix = 20 in the context of this function?
It looks like the length of the expected suffix!
- What is verbose?
verbose doesn't mean anything by itself. It is just another parameter. It looks like the author included a verbose flag so that you can get output while the function is running. Note that the if verbose blocks don't seem to do anything except provide feedback to the user.
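If it helps, the indexing, len, slicing and flag ideas from this list can be tried on a couple of made-up reads. This is only a sketch under stated assumptions: the reads are invented 10-base strings, and split_reads is a hypothetical, simplified helper that does not drop duplicate keys the way makeSuffixDict does.

```python
reads = ["ACGTTGCAAT", "TTGACCGGTA"]   # made-up reads, each 10 bases long
lenSuffix = 4

read = reads[0]                        # indexing: the first read
lenKeys = len(read) - lenSuffix        # 10 - 4 == 6

key = read[0:lenKeys]                  # the first 6 characters
suffix = read[lenKeys:]                # everything from index 6 to the end
tail = read[-lenKeys:]                 # the last 6 characters

print(key, suffix, tail)               # ACGTTG CAAT TGCAAT


def split_reads(reads, lenSuffix=4, verbose=True):
    """Hypothetical helper: map each key to its suffix, optionally reporting progress."""
    result = {}
    for read in reads:
        lenKeys = len(read) - lenSuffix
        result[read[:lenKeys]] = read[lenKeys:]
        if verbose:
            # Only feedback for the user; the returned dictionary is unaffected.
            print("stored", read[:lenKeys], "->", read[lenKeys:])
    return result


quiet = split_reads(reads, verbose=False)   # same result, no progress output
print(quiet)
```

Calling split_reads with verbose=True returns the same dictionary; the flag only controls whether the progress lines are printed.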