Using phred/polyphred/phrap/consed
Before you start, become familiar with basic UNIX commands for listing files, moving around the directory structure, and viewing output files. Many free tutorials are available on the web; a Google search for ‘Unix Tutorials’ will find some of them.

Using UNIX:

A few of the many UNIX tutorials available are:
http://www.ee.surrey.ac.uk/Teaching/Unix
http://www.math.utah.edu/lab/unix/unixtutorial.html
http://www.unixtools.com/tutorials.html

Some short QuickTime movies are also available from the UA Computer-Based Training site (type UNIX in the ‘Find a Course’ box):
http://uacbt.arizona.edu/default.htm
U of A users do not need to purchase these; you only need your UA NetID and password.

Trimming file names:

The trimlabel script trims the first letter and two digits from a filename. It renames every file in the current directory, so run it from the directory that holds the files you want to trim.
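The trimlabel script itself is not shown here, so the following is only a minimal Python sketch of the renaming it is described as doing. The function names trimmed_name and trim_labels are hypothetical, and the pattern (one leading letter followed by exactly two digits, such as a plate well label) is an assumption based on the description above. With dry_run left on, nothing is renamed and you only get the list of planned changes.

```python
import os
import re

# Assumed pattern: one leading letter followed by exactly two digits, e.g. "A01".
PREFIX = re.compile(r"^[A-Za-z]\d{2}")


def trimmed_name(name):
    """Return name without the leading letter-plus-two-digits prefix, if present."""
    return PREFIX.sub("", name, count=1)


def trim_labels(directory=".", dry_run=True):
    """Rename matching files in directory and return the (old, new) pairs."""
    renamed = []
    taken = set()  # new names already claimed during this run
    for old in sorted(os.listdir(directory)):
        new = trimmed_name(old)
        src = os.path.join(directory, old)
        dst = os.path.join(directory, new)
        # Skip non-files, names without the prefix, empty results, and name clashes.
        if (not os.path.isfile(src) or new == old or not new
                or new in taken or os.path.exists(dst)):
            continue
        taken.add(new)
        if not dry_run:
            os.rename(src, dst)
        renamed.append((old, new))
    return renamed
```

Calling trim_labels(".") first and checking the returned pairs before repeating the call with dry_run=False is a cautious way to avoid renaming the wrong files.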

Setting up to run phred/polyphred/phrap/consed:

The programs expect a particular set of directories, and the script mkpolydirs will create them for you. Once it has run, move your chromatogram files into chromat_dir.
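For readers who want to see what this setup step amounts to, here is a rough Python sketch. It is not the mkpolydirs script. Only chromat_dir and edit_dir are named in this guide; phd_dir and poly_dir in the default list are assumptions about the layout, and the function names make_project_dirs and move_chromatograms are hypothetical.

```python
import os
import shutil

# chromat_dir and edit_dir are named in this guide; phd_dir and poly_dir are
# assumptions about what mkpolydirs creates.
DEFAULT_DIRS = ("chromat_dir", "phd_dir", "poly_dir", "edit_dir")


def make_project_dirs(project, names=DEFAULT_DIRS):
    """Create the working directories under project and return their paths."""
    paths = [os.path.join(project, name) for name in names]
    for path in paths:
        os.makedirs(path, exist_ok=True)
    return paths


def move_chromatograms(source_dir, project):
    """Move every regular file in source_dir into project/chromat_dir."""
    dest = os.path.join(project, "chromat_dir")
    moved = []
    for name in sorted(os.listdir(source_dir)):
        src = os.path.join(source_dir, name)
        if os.path.isfile(src):  # leave subdirectories where they are
            shutil.move(src, os.path.join(dest, name))
            moved.append(name)
    return moved
```

If you use the real mkpolydirs script, only the second function is relevant, and a plain mv command does the same job.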

Using phred to improve base-calling:

phred -id chromat_dir -trim_alt "" -trim_scf -cd scf_dir

The scf_dir will contain SCF (Standard Chromatogram Format) chromatograms that also hold the phred base calls and quality values. These files are about half the size of the original chromatograms. To save disk space you can gzip the original chromatograms or move them to another system. The SCF files can be imported into Sequencher. To use them with polyphred, phrap, and consed, run the scf2chromat script: it renames the original chromat_dir (if it is still present) and then renames scf_dir to chromat_dir, the name that phrap and the other programs require.
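The directory swap described for scf2chromat can be sketched in Python as follows. This is an illustration of the two renames, not the script itself. The function name swap_scf_into_chromat is hypothetical, and the backup name chromat_dir.orig is an assumption, since this guide does not say what the original directory is renamed to.

```python
import os


def swap_scf_into_chromat(project=".", backup_name="chromat_dir.orig"):
    """Set aside the original chromat_dir, then rename scf_dir to chromat_dir."""
    chromat = os.path.join(project, "chromat_dir")
    scf = os.path.join(project, "scf_dir")
    backup = os.path.join(project, backup_name)  # assumed backup name
    if not os.path.isdir(scf):
        raise FileNotFoundError("scf_dir not found; run phred with -cd scf_dir first")
    if os.path.isdir(chromat):
        # Refuse to overwrite an earlier backup.
        if os.path.exists(backup):
            raise FileExistsError(backup + " already exists")
        os.rename(chromat, backup)
    os.rename(scf, chromat)
    return chromat
```

The existence checks come before any rename so that a failed run leaves both directories under their original names.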

Running phred/polyphred/phrap/consed:

Move down into the edit_dir (using the command cd edit_dir) and type polyphredPhrap. Lots of output will scroll by. When it has finished, run consed. In consed, open the most recent assembly, then double-click on a contig. From the contig view, Navigate to Tags and select polyPhredRank1. This will let you view each polymorphic site. Repeat for polyPhredRank2, polyPhredRank3, and so on.

NOTE: if your sequences are purely homozygous (e.g. cloned DNA or mitochondrial sequences), then refcomp is more appropriate than polyphred.

consed Output:

Output from consed appears to be quite limited: you can only export consensus sequences or an ace-format assembly, which does not show the aligned sequences clearly. It might be possible to write a BioPerl script that produces nicer output.

polyphred Output:

In the edit_dir you will see a file whose name ends in ‘.polyphred.out’. That file contains detailed information on the predicted polymorphisms. The BEGIN_POLY tag marks a region that lists the position, flanking sequence, and rank of each polymorphism. (Rank 1 sites are the most likely to be true polymorphisms; confidence falls as the rank number rises.) The BEGIN_GENOTYPE region lists the position in the consensus, the position in the read, the name of the read, the two most significant bases, and the rank. To change the sensitivity of polyphred, see the complete polyphred documentation for the available options.
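If you want to pull these regions out of the file with a script, a minimal Python sketch follows. It assumes that every BEGIN_ tag has a matching END_ tag on its own line (for example END_POLY and END_GENOTYPE) and that the fields on each row are separated by whitespace; neither detail is confirmed by this guide, so check them against your own output. The function name read_polyphred_sections is hypothetical. The sketch does not interpret the columns and does not track nested regions; it only groups the rows under the tag that opened them.

```python
def read_polyphred_sections(lines):
    """Group whitespace-split rows by the BEGIN_<NAME> tag that precedes them."""
    sections = {}
    current = None  # name of the region being read, or None when outside one
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("BEGIN_"):
            current = line.split()[0][len("BEGIN_"):]
            sections.setdefault(current, [])
        elif line.startswith("END_"):
            # Assumed closing tag; rows after it are ignored until the next BEGIN_.
            current = None
        elif current is not None:
            sections[current].append(line.split())
    return sections
```

Because the function accepts any iterable of lines, you can pass it an open file handle for your ‘.polyphred.out’ file and then look at sections["POLY"] and sections["GENOTYPE"].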