Using substitution probabilities to improve position-specific scoring matricesComput. Appl. Biosci., Vol. 12, No. 2. (1 April 1996), pp. 135-143.
|
Reviews
[Write a review of this article]
There are no reviews of this article
Find related articles from these CiteULike users
Find related articles with these CiteULike tags
AbstractEach column of amino acids in a multiple alignment of protein sequences can be represented as a vector of 20 amino acid counts. For alignment and searching applications, the count vector is an imperfect representation of a position, because the observed sequences are an incomplete sample of the full set of related sequences. One general solution to this problem is to model unobserved sequences by adding artificial pseudo-counts' to the observed counts. We introduce a simple method for computing pseudo-counts that combines the diversity observed in each alignment position with amino acid substitution probabilities. In extensive empirical tests, this position-based method out-performed other pseudo-count methods and was a substantial improvement over the traditional average score method used for constructing profiles. 10.1093/bioinformatics/12.2.135
BibTeX record
RIS record