therapies. However, analysis of proteomes is much more complicated and challenging than sequencing genomes for three reasons. First, in higher eukaryotes, a single gene often produces many different forms of the protein, including alternative spliceoforms and diverse post-translational modifications such as N-terminal processing, glycosylation, phosphorylation, ubiquitination, and sulfation, and these different forms usually have important functional differences. Second, genomes are largely static during the lifetime of a cell or organism, but proteomes are highly dynamic and can change substantially, often over very short time scales, in response to external stimuli and during development, growth, and aging. One frequently cited illustration of the dramatic impact of proteome changes in a single organism is the development of a caterpillar into a butterfly: a single genome can result in two different proteomes with strikingly different phenotypes. Although sequencing a genome is a finite problem, the constant dynamics of proteomes mean that an essentially infinite number of proteomes can be produced during the lifetime of a cell or organism. Third, the technologies available for analyzing proteomes are currently not as robust as nucleic acid-based methods, and further improvements are needed to optimally analyze the most complex proteomes.


As the above discussion suggests, the challenges in proteome analysis increase in higher organisms. Most prokaryotes have simple genomes ranging from fewer than 1,000 open reading frames (ORFs) to about 4,500 ORFs, and there is minimal post-translational processing of proteins in these systems. Therefore, the maximum number of unique proteins that can be produced is close to the number of ORFs, which keeps these proteomes relatively simple. Yeast, the simplest eukaryote, is more complex, with over 6,000 ORFs and moderate amounts of post-translational processing. In contrast, human proteomes are extremely complex, with between 35,000 and 50,000 ORFs (not all ORFs have been identified at present), alternative splicing of many mRNAs, and extensive, variable post-translational modifications of many proteins. It is too early in the history of proteome analysis of human cells and tissues to accurately estimate the total number of structurally and functionally distinct protein components that can be produced from a single genome, but a reasonable estimate is probably on the order of 1 million or more distinct components. The average number of unique protein components that occur at any given time in a single human cell type is similarly not well defined, but estimates of 20,000 to 50,000 or more seem likely for most cells.

Wide differences in protein abundance levels (dynamic range) in most cells and organisms further complicate protein profile comparisons because the dynamic range of current separation and detection methods is more limited than the range of protein abundances in most proteomes. The range of protein abundances in even simple prokaryotes will often exceed the dynamic range of protein-profiling methods. Protein copy numbers in a single cell will frequently range from about 10,000,000 copies for major structural proteins such as actin to <100 copies for low-abundance regulatory proteins and some enzymes, for a dynamic range of >10^5. Biological fluids usually have even wider dynamic ranges than cells and tissues. Human plasma or serum has a few major proteins that comprise more than 90% of the total protein in the sample. Albumin, the most abundant protein, is present at about 40 mg/ml, while cytokines and other low-abundance proteins are present at a few pg/ml, for a dynamic range of about 10^10.
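The two dynamic-range figures above follow from simple ratios, and a minimal sketch of the arithmetic is given here. The function name is hypothetical, and the value of 4 pg/ml for a low-abundance plasma protein is an assumption chosen only to stand in for "a few pg/ml".

```python
import math


def dynamic_range_orders(high, low):
    """Return the dynamic range as orders of magnitude, log10(high / low).

    Both values must be in the same units (copies per cell, g/ml, ...).
    """
    if high <= 0 or low <= 0:
        raise ValueError("abundances must be positive")
    return math.log10(high / low)


# Single cell: about 10,000,000 copies (e.g., actin) versus about 100 copies.
cell_orders = dynamic_range_orders(10_000_000, 100)

# Plasma: albumin at about 40 mg/ml versus a low-abundance protein.
# 4 pg/ml is an assumed illustrative value for "a few pg/ml".
albumin_g_per_ml = 40 * 1e-3        # mg/ml converted to g/ml
low_abundance_g_per_ml = 4 * 1e-12  # pg/ml converted to g/ml
plasma_orders = dynamic_range_orders(albumin_g_per_ml, low_abundance_g_per_ml)

print(f"cell:   about 10^{cell_orders:.0f}")
print(f"plasma: about 10^{plasma_orders:.0f}")
```

Expressing the range in orders of magnitude makes it easy to compare against the usable range of a given separation or detection method, which has to be supplied separately because the text does not quote one.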

The first critical factor in a successful proteome analysis experiment is selection of an experimental design that minimizes protein changes unrelated to the biological question being addressed (i.e., protein profile "noise"). This requires selecting samples for comparison in which the only factor that should affect protein levels is the desired experimental parameter, and using sample processing conditions that do not introduce artifactual changes in protein profiles. In this context it is important to recognize that most experimental manipulations, whether they reflect a biological question or are caused by poorly controlled experimental parameters, will typically result in quantitative changes of many proteins rather than just a few. Extraneous changes introduced by faulty experimental design can result in much wasted effort spent on unnecessary protein identification and characterization. Most importantly, it is often difficult or impossible to distinguish between observed changes caused by the biological question being tested and changes caused by experimental design problems.

One example of an experiment that would probably have extensive extraneous protein profile noise is the evaluation of protein changes related to tumor progression by comparing different cancer cell lines derived from different-stage tumors. Because most cancer cells have impaired DNA repair and apoptosis control, cell lines are usually aneuploid (having an abnormal number of chromosomes) and have extensive random chromosomal translocations. These result in very different basic protein profiles in which many changes may not be related to the stage of tumor development. Also, when working with cell lines, it is essential that growth conditions and external signals, such as extent of confluency, nutrient depletion, and pH changes, be maintained as consistently as possible to avoid introducing unnecessary changes into protein profiles. Sample harvesting and processing are other key steps, and potential variable artifacts that can occur at these stages include differing extents of solubilization, proteolysis, oxidation, and dephosphorylation.

For many studies involving higher organisms, direct analysis of tissue specimens rather than cells in culture frequently has the advantage of more closely reflecting events in the intact organism. However, the presence of multiple cell types in a specimen further increases the number of protein components present in a sample, and if the proportions of cell types change between compared samples, many observed protein changes may be due to the differing proportions of cell types rather than to changes within the targeted cell type. To circumvent or at least minimize this problem, laser capture microdissection, fluorescence-activated cell sorting (FACS), and magnetic bead capture are some of the major methods that have been developed to isolate more homogeneous cell populations from tissues.
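The cell-type proportion effect described above can be illustrated with a small numerical sketch. All numbers, cell-type labels, and function names here are hypothetical and chosen only for illustration; the model assumes that the measured bulk level of a protein is the proportion-weighted average of its level in each cell type.

```python
def bulk_abundance(fractions, per_cell_levels):
    """Proportion-weighted average of a protein's level across cell types.

    fractions:       cell type -> fraction of the specimen (should sum to 1)
    per_cell_levels: cell type -> protein level within that cell type
    """
    total = sum(fractions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("cell-type fractions must sum to 1")
    return sum(fractions[c] * per_cell_levels[c] for c in fractions)


# Hypothetical protein: its level within each cell type is identical in both
# specimens, so there is no true change inside the targeted cell type.
levels = {"target_cell": 100.0, "other_cell": 10.0}

# Only the cell-type composition differs between the two specimens.
specimen_1 = {"target_cell": 0.5, "other_cell": 0.5}
specimen_2 = {"target_cell": 0.8, "other_cell": 0.2}

bulk_1 = bulk_abundance(specimen_1, levels)
bulk_2 = bulk_abundance(specimen_2, levels)
apparent_fold_change = bulk_2 / bulk_1

print(f"bulk level, specimen 1: {bulk_1:.1f}")
print(f"bulk level, specimen 2: {bulk_2:.1f}")
print(f"apparent fold change:   {apparent_fold_change:.2f}")
```

In this sketch the bulk measurement shows an apparent increase even though the protein level within each cell type is unchanged, which is the artifact that isolating more homogeneous cell populations is meant to reduce.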
