新規登録 | ログイン | FAQ      [?] 
CiteULike is a free online bibliography manager. Register and you can start organising your references online.
Recent | Unread | Search | Authors | Tags | Export

Probabilistic and statistical properties of words: An overview

by: G Reinert, S Schbath, MS Waterman
Journal of Computational Biology, Vol. 7, No. 1-2. (2000), pp. 1-46.


View FullText article


X Reviews [Write a review of this article]

There are no reviews of this article

X Find related articles from these CiteULike users

X Find related articles with these CiteULike tags

X Abstract

In the following, an overview is given on statistical and probabilistic properties of words, as occurring in the analysis of biological sequences. Counts of occurrence, counts of clumps, and renewal counts are distinguished, and exact distributions as well as normal approximations, Poisson process approximations, and compound Poisson approximations are derived. Here, a sequence is modelled as a stationary ergodic Markov chain; a test for determining the appropriate order of the Markov chain is described. The convergence results take the error made by estimating the Markovian transition probabilities into account. The main tools involved are moment generating functions, martingales, Stein's method, and the Chen-Stein method. Similar results are given for occurrences of multiple patterns, and, as an example, the problem of unique recoverability of a sequence from SBH chip data is discussed. Special emphasis lies on disentangling the complicated dependence structure between word occurrences, due to self-overlap as well as due to overlap between words. The results can be used to derive approximate, and conservative, confidence intervals for tests.


X BibTeX record

X RIS record



RIS BibTeX
CiteULike organises scholarly (or academic) papers or literature and provides bibliographic (which means it makes bibliographies) for universities and higher education establishments. It helps undergraduates and postgraduates. People studying for PhDs or in postdoctoral (postdoc) positions. The service is similar in scope to EndNote or RefWorks or any other reference manager like BibTeX, but it is a social bookmarking service for scientists and humanities researchers.