Complexity of computing the similarity between two sequences

Question

What is the computational complexity of the best-known algorithm for computing the similarity between two sequences (as in DNA or protein alignment, or approximate string matching)?

The similarity is based on:

alignment scored with substitution matrices (global or position-specific, over the 20 characters of the protein alphabet or the 4 characters of the DNA alphabet)
gap penalties
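For reference, the textbook baseline for this kind of scoring is dynamic programming in the style of Needleman-Wunsch, which takes O(m·n) time for sequences of lengths m and n. The sketch below is only an illustration under stated assumptions: it uses a linear gap penalty, and the names `global_alignment_score` and `simple_dna_score` as well as the +1 / -1 / -2 scores are hypothetical, not taken from any particular tool.

```python
def global_alignment_score(a, b, substitution, gap=-2):
    """Global alignment score with a linear gap penalty (Needleman-Wunsch style).

    Time is O(len(a) * len(b)); memory is O(len(b)) because only the
    previous row of the DP table is kept.
    """
    # Row 0: aligning the empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]  # column 0: prefix of `a` against the empty string
        for j, cb in enumerate(b, start=1):
            curr.append(max(
                prev[j - 1] + substitution(ca, cb),  # align ca with cb
                prev[j] + gap,                       # ca aligned to a gap
                curr[j - 1] + gap,                   # cb aligned to a gap
            ))
        prev = curr
    return prev[-1]


def simple_dna_score(x, y):
    # Hypothetical toy substitution scores: +1 for a match, -1 for a mismatch.
    # A real substitution matrix (e.g. for proteins) would be a lookup table.
    return 1 if x == y else -1


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATACA", simple_dna_score))
```

The two nested loops are what make the cost quadratic; replacing `simple_dna_score` with a table lookup changes the scores but not the complexity.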

Is the linear-time Burrows-Wheeler transform, as used in the short-read aligners Bowtie and BWA, the current state of the art, or are there sublinear algorithms for the same problem?

[Edit]: I am thinking about applying LSH (locality-sensitive hashing) for approximate matching, which would be sublinear assuming the reference dataset is preprocessed / indexed.
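A minimal sketch of what the LSH idea could look like, assuming MinHash over k-mers as the hash family (one common choice for Jaccard similarity, not necessarily the one intended here). The function names, `k=4` and `num_hashes=64` are illustrative assumptions. Building a signature still reads the whole sequence once; only the comparison afterwards is independent of sequence length, and a truly sublinear lookup would additionally need the signatures banded into an index, which is not shown.

```python
import hashlib


def kmers(seq, k=4):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def minhash_signature(seq, k=4, num_hashes=64):
    """Preprocessing step: one pass over seq, O(len(seq) * num_hashes) work."""
    shingles = kmers(seq, k)
    if not shingles:
        raise ValueError("sequence is shorter than k")
    signature = []
    for seed in range(num_hashes):
        # A different salt per seed simulates an independent hash function.
        salt = seed.to_bytes(8, "little")
        signature.append(min(
            hashlib.blake2b(s.encode(), digest_size=8, salt=salt).digest()
            for s in shingles
        ))
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching positions; estimates Jaccard similarity of the k-mer sets.

    Costs O(num_hashes), regardless of the original sequence lengths.
    """
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


if __name__ == "__main__":
    reference = minhash_signature("GATTACAGATTACAGGCTA")
    query = minhash_signature("GATTACAGATTACAGGCTT")
    print(estimate_jaccard(reference, query))
```

Note that this estimates set similarity of k-mers, which is a proxy for alignment score and ignores substitution matrices and gap penalties.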

+3

algorithm complexity-theory bioinformatics dna-sequence

alex 09 Feb At 3:01 am

1 answer

zad · Answer 1 · 2013-02-09T03:14:12+0000

I assume that at some point you have to read the entire sequence, so there can be no sublinear-time algorithm.
