Installing the software
The TD software is written in C and computes the transformation distance between DNA sequences. To obtain the software, you may either download the C sources and compile them, or directly download an executable if one is available for your platform.

Compiling the sources

  • Unpack the archive: tar -xzvf td2.0.tgz (this creates the directory td2.0)
  • Go to the directory td2.0
  • Compile the software with: make
TD is then available.
How does it work?
The software computes the transformation distance either from a set of DNA sequences (FASTA format) or from a set of links previously computed by TD itself.

Several file formats:

The method is based on the computation of a set of common factors, called links. This set is computed automatically from a pair of sequences. You can choose the minimal length of these links and adjust their structure with two parameters: a link is composed of a series of identity blocks separated by error blocks, in which substitutions are allowed. You can choose the minimal length of the identity blocks and the maximal length of the error blocks. The example below shows a link made of three identity blocks (uppercase), separated by two error blocks (lowercase):

ATTCGtgGGCTCCGatgGGTGA

ATTCGctGGCTCCGttaGGTGA
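
To make the block structure concrete, here is a minimal Python sketch that splits two aligned strings of equal length into identity blocks and error blocks. It only illustrates the definition above and is not TD's actual algorithm: the name split_blocks, the parameter min_identity, and the rule that a run of matches shorter than min_identity is absorbed into the surrounding error block are all assumptions made for this example.

```python
def split_blocks(a, b, min_identity=4):
    """Split two aligned strings into ('I', text) and ('E', text) blocks.

    Hypothetical helper (not part of TD): runs of matching positions
    shorter than min_identity are merged into the neighbouring error block.
    The text of each block is taken from the first sequence, uppercased.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    a, b = a.upper(), b.upper()

    # Step 1: raw runs of matching / mismatching positions,
    # stored as [is_match, start, end) with an exclusive end.
    runs = []
    for i, (x, y) in enumerate(zip(a, b)):
        match = (x == y)
        if runs and runs[-1][0] == match:
            runs[-1][2] = i + 1
        else:
            runs.append([match, i, i + 1])

    # Step 2: demote match runs that are too short, merging adjacent
    # blocks of the same kind as we go.
    blocks = []
    for match, start, end in runs:
        kind = "I" if match and end - start >= min_identity else "E"
        if blocks and blocks[-1][0] == kind:
            blocks[-1] = (kind, blocks[-1][1], end)
        else:
            blocks.append((kind, start, end))
    return [(kind, a[s:e]) for kind, s, e in blocks]


blocks = split_blocks("ATTCGtgGGCTCCGatgGGTGA", "ATTCGctGGCTCCGttaGGTGA")
print(blocks)
# [('I', 'ATTCG'), ('E', 'TG'), ('I', 'GGCTCCG'), ('E', 'ATG'), ('I', 'GGTGA')]
```

Note that the second error block contains one matching position (t against t). Because that match is shorter than min_identity, the sketch keeps it inside the error block, which is why the example link has three identity blocks rather than four.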

By default, the software computes the upper triangular matrix of comparisons:

td seq1.seq seq2.seq seq3.seq
computes only 1 vs. 2, 1 vs. 3 and 2 vs. 3. You may ask the program to compute the entire matrix (see the -m option).
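
The set of comparisons can be enumerated with a short Python sketch. The function comparisons is a hypothetical helper written for this illustration; it mirrors the u, l and c values of the -m option listed under Parameters, and it assumes that the entire matrix contains both directions of each pair but no comparison of a sequence with itself.

```python
from itertools import combinations, permutations


def comparisons(names, mode="u"):
    """Return the (row, column) pairs compared for a given matrix mode.

    Hypothetical helper, not part of TD. Assumes mode 'c' excludes
    self-comparisons.
    """
    if mode == "u":  # upper triangle: row comes before column
        return list(combinations(names, 2))
    if mode == "l":  # lower triangle: row comes after column
        return [(b, a) for a, b in combinations(names, 2)]
    if mode == "c":  # entire matrix, both directions
        return list(permutations(names, 2))
    raise ValueError("mode must be 'u', 'l' or 'c'")


print(comparisons(["seq1.seq", "seq2.seq", "seq3.seq"]))
# [('seq1.seq', 'seq2.seq'), ('seq1.seq', 'seq3.seq'), ('seq2.seq', 'seq3.seq')]
```

For n sequences the upper triangle contains n(n-1)/2 comparisons, half as many as the entire matrix under the assumption above.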

Parameters:

Option  Required arguments   Description                                       Remarks
-f      -                    Indicates that the data is a set of links         -
-w      [ 1 | 2 | 3 ]        Chooses the weight function                       -
-l      integer              Minimal link length                               8 by default
-b      integer              Minimal identity block length                     Must be less than l; 4 by default
-e      integer              Maximal error block length                        0 by default
-m      [ u | l | c | N ]    Chooses which matrix to compute                   u by default
                             (u: upper triangular matrix,
                             l: lower triangular matrix,
                             c: entire matrix,
                             N: no matrix, only the set of links)
-o      filename             Basename for the result files                     The data filename by default
                             ('auto' generates a filename
                             from the parameters)
-V      -                    Verbose                                           -
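
As an illustration of how these options combine, the following Python sketch assembles a td command line and checks the one constraint stated in the table (the minimal identity block length must be less than the minimal link length). The function build_td_command and its parameter names are hypothetical; the sketch assumes the executable is called td and that options precede the data files, as in the examples of this document.

```python
def build_td_command(files, min_link=8, min_block=4, max_error=0,
                     matrix="u", verbose=False):
    """Build an argument list for td from the documented options.

    Hypothetical helper, not shipped with TD. Defaults follow the
    'Remarks' column of the parameter table.
    """
    if min_block >= min_link:
        raise ValueError("-b must be less than -l")
    if max_error < 0:
        raise ValueError("-e must not be negative")
    if matrix not in ("u", "l", "c", "N"):
        raise ValueError("-m must be one of u, l, c, N")

    cmd = ["td", "-l", str(min_link), "-b", str(min_block),
           "-e", str(max_error)]
    if matrix != "u":  # 'u' is the default, so it can be omitted
        cmd += ["-m", matrix]
    if verbose:
        cmd.append("-V")
    return cmd + list(files)


print(" ".join(build_td_command(["multiseq.seq"], min_link=6,
                                min_block=2, max_error=1, verbose=True)))
# td -l 6 -b 2 -e 1 -V multiseq.seq
```

The returned list can be passed to subprocess.run(cmd, check=True) from the Python standard library to launch the program, provided td is on the search path.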
Output
TD produces three result files:
  • XXXX.links : the set of links
  • XXXX.res : the set of comparisons with selected links
  • XXXX.matrix : the distance matrix (PHYLIP format)

For example, the command

td -l 6 -b 2 -e 1 -V multiseq.seq

prints the following messages:

GRAPH : SCRIPT GRAPH
WEIGHT: 1
AUTODELIM: 1
TYPE : fasta
MFL : 6
B : 2
E : 1
DATA : multiseq.seq
NBSEQ : 3
seq1 and seq2 , 188 / 11 / 11, (estimated time: 0)
MFL : 6, UPPER: 13 / 69 (0.000000) = 117,
seq1 and seq3 , 86 / 9 / 9, (estimated time: 0)
MFL : 6, UPPER: 11 / 30 (0.000000) = 50,
seq2 and seq3 , 143 / 10 / 10, (estimated time: 0)
MFL : 6, UPPER: 12 / 45 (0.000000) = 51,
Results printed in multiseq.seq.[links|res|matrix]

and the three files multiseq.seq.links, multiseq.seq.res and multiseq.seq.matrix.
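
Since the .matrix file is in PHYLIP format, it can be loaded for further processing with a few lines of Python. The sketch below is a minimal reader written under assumptions: the first line holds the number of sequences, and each following line holds a name and its distances separated by whitespace. The exact layout written by TD (square or triangular) is not specified here, so the function read_phylip_distances, a hypothetical name, simply keeps whatever values each row contains. The sample values are invented for the demonstration and are not TD results.

```python
def read_phylip_distances(text):
    """Parse a PHYLIP-style distance matrix into (names, rows).

    Hypothetical helper: assumes whitespace-separated fields and one
    matrix row per line; rows may be full or triangular.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n = int(lines[0].split()[0])  # first line: number of sequences
    names, rows = [], []
    for ln in lines[1:n + 1]:
        fields = ln.split()
        names.append(fields[0])
        rows.append([float(x) for x in fields[1:]])
    if len(names) != n:
        raise ValueError("expected %d rows, found %d" % (n, len(names)))
    return names, rows


# Invented sample data, for illustration only.
sample = """3
seq1 0.0 1.0 2.0
seq2 1.0 0.0 3.0
seq3 2.0 3.0 0.0
"""
names, rows = read_phylip_distances(sample)
print(names)       # ['seq1', 'seq2', 'seq3']
print(rows[0][2])  # 2.0
```

To read an actual result file, pass its contents to the function, for example read_phylip_distances(open("multiseq.seq.matrix").read()).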

Please send any comments to Jean-Stephane.Varré
2/12/2006 Jean-Stéphane Varré