

PREV CLASS NEXT CLASS  FRAMES NO FRAMES  
SUMMARY: NESTED  FIELD  CONSTR  METHOD  DETAIL: FIELD  CONSTR  METHOD 
java.lang.Object org.jscience.biology.alignment.PairwiseAlignmentAlgorithm org.jscience.biology.alignment.NeedlemanWunsch
public class NeedlemanWunsch
This class implements the classic global alignment algorithm (with linear gap penalty function) due to S.B.Needleman and C.D.Wunsch (1970).
It is based on a dynamic programming approach. The idea consists of, given two sequences A and B of sizes n and m, respectively, building an (n+1 x m+1) matrix M that contains the similarity of prefixes of A and B. Every position M[i,j] in the matrix holds the score between the subsequences A[1..i] and B[1..j]. The first row and column represent alignments with spaces.
Starting from row 0, column 0, the algorithm computes each position M[i,j] with the following recurrence:
M[0,0] = 0 M[i,j] = max { M[i,j1] + scoreInsertion (B[j]), M[i1,j1] + scoreSubstitution (A[i], B[j]), M[i1,j] + scoreDeletion(A[i]) }
In the end, the value at the last position (last row, last column) will contain
the similarity between the two sequences. This part of the algorithm is accomplished
by the computeMatrix
method. It has quadratic space complexity
since it needs to keep an (n+1 x m+1) matrix in memory. And since the work of computing
each cell is constant, it also has quadratic time complexity.
After the matrix has been computed, the alignment can be retrieved by tracing a path
back in the matrix from the last position to the first. This step is performed by
the buildOptimalAlignment
method, and since the path can
be roughly as long as (m + n), this method has O(n) time complexity.
If the similarity value only is needed (and not the alignment itself), it is easy to
reduce the space requirement to O(n) by keeping just the last row or column in memory.
This is precisely what is done by the computeScore
method. Note
that it still requires O(n^{2}) time.
For a more efficient approach to the global alignment problem, see the CrochemoreLandauZivUkelson algorithm. For local alignment, see the SmithWaterman algorithm.
SmithWaterman
,
CrochemoreLandauZivUkelson
,
CrochemoreLandauZivUkelsonLocalAlignment
,
CrochemoreLandauZivUkelsonGlobalAlignment
Field Summary  

protected int[][] 
matrix
The dynamic programming matrix. 
protected CharSequence 
seq1
The first sequence of an alignment. 
protected CharSequence 
seq2
The second sequence of an alignment. 
Fields inherited from class org.jscience.biology.alignment.PairwiseAlignmentAlgorithm 

alignment, APPROXIMATE_MATCH_TAG, GAP_CHARACTER, GAP_TAG, MATCH_TAG, MISMATCH_TAG, score, score_computed, scoring, sequences_loaded, use_match_tag 
Constructor Summary  

NeedlemanWunsch()

Method Summary  

protected PairwiseAlignment 
buildOptimalAlignment()
Builds an optimal global alignment between the loaded sequences. 
protected void 
computeMatrix()
Computes the dynamic programming matrix. 
protected PairwiseAlignment 
computePairwiseAlignment()
Builds an optimal global alignment between the loaded sequences after computing the dynamic programming matrix. 
protected int 
computeScore()
Computes the score of the best global alignment between the two sequences using the scoring scheme previously set. 
protected void 
loadSequencesInternal(java.io.Reader input1,
java.io.Reader input2)
Loads sequences into CharSequence instances. 
protected void 
unloadSequencesInternal()
Frees pointers to loaded sequences and the dynamic programming matrix so that their data can be garbage collected. 
Methods inherited from class org.jscience.biology.alignment.PairwiseAlignmentAlgorithm 

getPairwiseAlignment, getScore, loadSequences, max, max, max, scoreDeletion, scoreInsertion, scoreSubstitution, setScoringScheme, unloadSequences, useMatchTag 
Methods inherited from class java.lang.Object 

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait 
Field Detail 

protected CharSequence seq1
protected CharSequence seq2
protected int[][] matrix
seq1
and j characters of
seq2
.
Constructor Detail 

public NeedlemanWunsch()
Method Detail 

protected void loadSequencesInternal(java.io.Reader input1, java.io.Reader input2) throws java.io.IOException, InvalidSequenceException
CharSequence
(please
check the specification of that class for specific requirements).
loadSequencesInternal
in class PairwiseAlignmentAlgorithm
input1
 Input for first sequenceinput2
 Input for second sequence
java.io.IOException
 If an I/O error occurs when reading the sequences
InvalidSequenceException
 If the sequences are not validCharSequence
protected void unloadSequencesInternal()
unloadSequencesInternal
in class PairwiseAlignmentAlgorithm
PairwiseAlignmentAlgorithm.unloadSequences()
protected PairwiseAlignment computePairwiseAlignment() throws IncompatibleScoringSchemeException
buildOptimalAlignment
method
after the computeMatrix
method computes the dynamic programming
matrix.
computePairwiseAlignment
in class PairwiseAlignmentAlgorithm
IncompatibleScoringSchemeException
 If the scoring scheme is not compatible
with the loaded sequences.computeMatrix()
,
buildOptimalAlignment()
protected void computeMatrix() throws IncompatibleScoringSchemeException
IncompatibleScoringSchemeException
 If the scoring scheme is not compatible
with the loaded sequences.protected PairwiseAlignment buildOptimalAlignment() throws IncompatibleScoringSchemeException
computeMatrix
method.
IncompatibleScoringSchemeException
 If the scoring scheme
is not compatible with the loaded sequences.computeMatrix()
protected int computeScore() throws IncompatibleScoringSchemeException
computeScore
in class PairwiseAlignmentAlgorithm
IncompatibleScoringSchemeException
 If the scoring scheme is not compatible
with the loaded sequences.PairwiseAlignmentAlgorithm.getScore()


PREV CLASS NEXT CLASS  FRAMES NO FRAMES  
SUMMARY: NESTED  FIELD  CONSTR  METHOD  DETAIL: FIELD  CONSTR  METHOD 