Microbiology Publications and Other Works

Dynamics of domain coverage of the protein sequence universe

Bhanu Rekapalli, University of Tennessee - Knoxville
Kristin Wuichet, University of Tennessee - Knoxville
Gregory D. Peterson, University of Tennessee - Knoxville
Igor B. Zhulin, University of Tennessee - Knoxville

Document Type

Article

Publication Date

11-16-2012

Abstract

Background

The currently known protein sequence space consists of millions of sequences in public databases and is rapidly expanding. Assigning sequences to families leads to a better understanding of protein function and the nature of the protein universe. However, a large portion of the current protein space remains unassigned and is referred to as its “dark matter”.

Results

Here we suggest that the true size of “dark matter” is much larger than stated by current definitions. We propose an approach to reducing the size of “dark matter” by identifying and subtracting regions in protein sequences that are unlikely to contain any domain.

Conclusions

Recent improvements in computational domain modeling are decreasing the relative size of “dark matter”, albeit slowly; however, its absolute size increases substantially with the growth of sequence data.

Recommended Citation

BMC Genomics 2012, 13:634 doi:10.1186/1471-2164-13-634


Included in

Microbiology Commons
