As the process of tumor progression proceeds from the normal cellular state to a preneoplastic condition and finally to the fully invasive form, the molecular characteristics of the cell change as well. These characteristics can be considered a molecular fingerprint of the cell at each stage of progression and, analogous to fingerprinting a criminal, can be used as markers of the progression process. Based on this premise, the Cancer Genome Anatomy Project (CGAP) was initiated with the broad goal of determining the comprehensive molecular characterization of normal, premalignant, and malignant tumor cells, thus making a reality the identification of all major mechanisms leading to tumor initiation and progression [Strausberg, R].
The expectation is that determining the genetic fingerprints of cancer progression will allow for (1) correlation of disease progression with therapeutic outcome; (2) improved evaluation of disease treatment; (3) stimulation of novel approaches to prevention, detection, and therapy; and (4) enhanced diagnostic tools for clinical applications.
Whereas acquiring the comprehensive molecular analysis of cancer progression may take years, results from initial, short-term goals are currently being realized and are proving very fruitful. The first of the initial CGAP goals is to establish a Tumor Gene Index (TGI) to serve as a catalogue of all genes expressed in the cancer progression process, with special reference to tumor type and stage of progression http: The TGI is being established by constructing cDNA libraries from pathological tissue, followed by high-throughput library sequencing.
The TGI uses existing infrastructure and amounts to a catalogue of all the genes that are expressed across the entire spectrum of cancer, with special attention to prostate, breast, ovarian, lung, and colon cancers.
A secondary goal in developing the TGI was to identify the remaining members of the unique human gene set, represented by the UniGene set of genes [ 4,5 ]; http: Standard bulk tissue cDNA libraries from RNA derived from large tumor tissues function primarily to create a general picture of those genes expressed in the tumor process, in addition to driving the process of gene discovery to aid the UniGene effort. More than 80 bulk tissue cDNA libraries (normalized as well as non-normalized) from a wide range of tumor types and histologies have been sequenced, and to date more than ... ESTs have been deposited in the TGI.
In addition, more than 11, novel genes have been discovered thus far to supplement the UniGene set. Although these results have proven extremely useful, a serious drawback to the sequencing of bulk tissue cDNA libraries is the lack of gene expression information in the context of tumor biology.
This is primarily due to the cellular heterogeneity found in bulk tissue. Histological examination reveals that the prostate is a complex organ comprising multiple cell types. Yet it is the epithelium that gives rise to life-threatening prostate cancer [ 11,12 ].
Armed with this information, it is quite easy to understand why any attempts to determine a prostate epithelial-specific gene profile would fail by sequencing a bulk tissue cDNA library from a normal prostate gland. Furthermore, bulk tumor tissue will undoubtedly contain inflammatory, structural, and endothelial cells regardless of the percent of tumor cells in the tissue as determined histologically.
It was this realization that led to the development of laser capture microdissection (LCM) [ 13,14 ]. LCM is a process by which one is able to procure selected groups of cells, or even individual cells, from a heterogeneous population of cells in standard pathology preparations.
These libraries make it possible, for the first time, to perform large-scale, in vivo gene expression profiling from a specific cell type. To begin addressing the issue of gene expression and profiling in the process of prostate cancer progression, a total of 15 cDNA libraries have been constructed from normal epithelium, prostatic intraepithelial neoplasia (PIN) lesions, invasive tumor cells, and metastatic prostate lesions.
Many of these libraries were constructed from cells dissected from the same patient and pathology preparation. More than 30, clones have been sequenced from these libraries, representing UniGene clusters. Not only are these sequences useful for prostate tissue-specific and prostate cancer stage-specific expression analysis, they are also useful for gene discovery, as evidenced by the establishment of greater than UniGene clusters. Thus these libraries possess the potential to discover weakly expressed, tissue-specific, and cell-specific transcripts not easily found in bulk tissue libraries.
With the recent surge in genetic information available to the cancer researcher, it is apparent that useful bioinformatics packages need to be developed to address these issues.
The CGAP Website has been actively pursuing this endeavor by trying to deliver tools that allow the individual investigator to tease out interesting gene expression data from all of the cDNA libraries that currently exist in the TGI. One such utility, the Digital Differential Display (DDD), uses the Fisher exact test [ 17 ] to compare one library to another, a pool of libraries to a single library, or a pool of libraries against another pool.
This allows for flexibility in designing an experiment in silico, and many questions can be asked using this function. For example, one may obtain a list of tissue-specific genes for the prostate by constructing several pools for the DDD to analyze http: Detailed instructions for using DDD can also be found at this site.
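The kind of comparison described above can be illustrated with a two-sided Fisher exact test on a 2x2 table of EST counts: ESTs belonging to one gene versus all other ESTs, in one pool versus another. The following is a minimal sketch and not the DDD implementation; the function name, the two-sided convention, and the example counts are assumptions made purely for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    a: ESTs of the gene in pool A      b: all other ESTs in pool A
    c: ESTs of the gene in pool B      d: all other ESTs in pool B
    Returns the probability, under fixed margins, of any table that is
    no more likely than the observed one.
    """
    row_a, row_b = a + b, c + d          # pool sizes
    gene_total = a + c                   # size of the gene's cluster
    denom = comb(row_a + row_b, gene_total)

    lo = max(0, gene_total - row_b)      # fewest gene ESTs pool A can hold
    hi = min(row_a, gene_total)          # most gene ESTs pool A can hold
    probs = {
        x: comb(row_a, x) * comb(row_b, gene_total - x) / denom
        for x in range(lo, hi + 1)
    }
    p_obs = probs[a]
    # Small relative tolerance so that ties are counted despite rounding.
    p = sum(v for v in probs.values() if v <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# Hypothetical counts: 12 of 5,000 ESTs in a prostate pool versus
# 2 of 20,000 ESTs in a diverse control pool.
p_value = fisher_exact_two_sided(12, 5000 - 12, 2, 20000 - 2)
print(f"two-sided p = {p_value:.3g}")
```

Pooling libraries before the test simply means summing the counts that go into each cell of the table, which is why a single library, or a pool, can sit on either side of the comparison.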
To find tissue-specific genes, a control pool should consist of libraries specific to several tissues different from each other and from the tissue of interest, and the pools of interest should contain libraries which are as narrowly focussed as possible.
In addition, a control pool should consist of several diverse libraries with many sequences. Choosing libraries too similar to each other for the control pool (for instance, several different libraries constructed from brain tissue) would simply identify genes not expressed in brain tissue and not a superset of genes specific to prostate tissue.
A similar difficulty would arise were the control pool to contain libraries with few ESTs. One would obtain genes not expressed in the small control pool, of which the tissue-specific genes would be a small fraction. Because the pools are easily edited, one may test to ensure that the results are independent of the choice of control pool by modifying the control pool at a later stage in the analysis.
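The problem with a small control pool can be made concrete with a short calculation. The sketch below assumes, as a simplification not stated in the text, that ESTs are sampled independently, so that a gene making up a fraction `freq` of transcripts is missed entirely by a pool of `pool_size` ESTs with probability `(1 - freq) ** pool_size`. The function name and the example numbers are hypothetical.

```python
def prob_absent(freq, pool_size):
    """Chance that a gene at transcript fraction `freq` contributes no EST
    to a pool of `pool_size` independently sampled ESTs (assumed model)."""
    return (1.0 - freq) ** pool_size


# A gene expressed at 1 transcript in 10,000, checked against control
# pools of increasing size.
for size in (1_000, 10_000, 100_000):
    print(f"{size:>7} ESTs: P(absent) = {prob_absent(1e-4, size):.3g}")
```

Under this model, a moderately expressed gene is usually missing from a pool of a thousand ESTs by chance alone, so "absent from the control" says little about tissue specificity until the control pool is large.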
Next, one chooses libraries for three prostate-specific pools; this is preferable to grouping the diverse libraries in a single pool because differences between the pools indicate the extent to which any gene is specific to normal, neoplastic, or preneoplastic tissue.
Because one of the pools is a diverse control, all prostate-specific genes expressed in one of the tissue-specific pools are listed; furthermore, differences between the pools are also listed. Note that comparing only prostate-specific normal and cancerous libraries would produce very few significant differences. Although the use of these as separate pools compared with the control pool displays differences that are not statistically significant, these genes would nonetheless amount to candidate genes involved in prostate cancer progression and may prove very useful to the cancer biologist as potential leads for experimental follow-up.
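The pooling step itself is simple bookkeeping, sketched below under stated assumptions: each library is represented as a mapping from a gene (or cluster) label to its EST count, and a pool is the sum of its libraries. The library contents, gene labels, and function names are hypothetical and are not taken from the TGI.

```python
from collections import Counter


def pool_counts(libraries):
    """Sum per-gene EST counts over all libraries in a pool."""
    total = Counter()
    for library in libraries:
        total.update(library)
    return total


def expression_fraction(pool, gene):
    """Fraction of the pool's ESTs that belong to `gene`."""
    size = sum(pool.values())
    return pool[gene] / size if size else 0.0


# Hypothetical libraries grouped into three prostate pools and one control.
pools = {
    "normal": pool_counts([{"geneA": 8, "geneB": 2}, {"geneA": 5, "geneC": 1}]),
    "preneoplastic": pool_counts([{"geneA": 3, "geneB": 4}]),
    "tumor": pool_counts([{"geneA": 1, "geneB": 9, "geneC": 2}]),
    "control": pool_counts([{"geneC": 20, "geneD": 30}, {"geneD": 25}]),
}

for name, pool in pools.items():
    print(name, round(expression_fraction(pool, "geneA"), 3))
```

Keeping the three prostate pools separate, rather than merging them, is what lets a per-pool fraction differ between normal, preneoplastic, and tumor tissue for the same gene.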
To see whether the differences found are due to idiosyncrasies of the libraries chosen, we can expand the pools by adding the following: There are two genes with significant differences between states in the small and large pools: Although not statistically significant according to the Fisher test, we note that kallikrein also has different expression levels in normal and precancerous tissues, as do several ribosomal proteins.
MSMB and kallikrein have been implicated in prostate cancer [ 18-21 ]. This suggests that UniGene clusters with similar expression profiles would be potential candidates for the molecular fingerprinting of the stages of prostate cancer.
Note that UniGene cluster identifiers, although superficially very convenient as referents, are not guaranteed to be stable for archival purposes. This is because clusters may split or merge together with the addition of new sequences.
Thus, it is safest to store the list of accession numbers in a cluster of interest.
Result page from the DDD analysis indicating 7 statistically significant prostate-specific transcripts. Many more transcripts were found, and those can be found on the DDD website as described in the text.
The Fisher exact test, which is used to assess whether the difference in expression levels is significant, is known to be conservative.
It is therefore useful to have an independent tool to examine differences in expression level. The gene expression comparison utility http: One difficulty particularly relevant in seeking novel ESTs is that the Fisher exact test will not find a significant difference in expression levels for small clusters. The exact definition of small clusters depends on the total number of sequences in the pools being compared but, for instance, clusters of size 1 are never statistically significant.
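The effect of cluster size can be checked directly. The sketch below computes the two-sided Fisher exact p-value for the most favorable case, in which every EST of a cluster falls in pool A and none in pool B. The equal pool sizes of 10,000 ESTs and the 0.05 cutoff mentioned afterwards are assumptions chosen for illustration; the function name is hypothetical.

```python
from math import comb


def p_if_all_in_pool_a(cluster_size, pool_a_size, pool_b_size):
    """Two-sided Fisher exact p-value when all ESTs of a cluster are in pool A."""
    denom = comb(pool_a_size + pool_b_size, cluster_size)
    # Probability of x cluster ESTs landing in pool A, for every possible x.
    probs = [
        comb(pool_a_size, x) * comb(pool_b_size, cluster_size - x) / denom
        for x in range(cluster_size + 1)
    ]
    p_obs = probs[cluster_size]
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))


for k in range(1, 8):
    print(k, round(p_if_all_in_pool_a(k, 10_000, 10_000), 4))
```

With equal pools, a singleton cluster gives p = 1, and even a cluster of 5 ESTs found only in one pool gives p of about 0.06; 6 such ESTs are needed to reach about 0.03. Unequal pool sizes shift these values, which is why the definition of a small cluster depends on the totals being compared.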
Thus, the Fisher exact test and the DDD interface to the test will tend to identify larger clusters and thus already characterized genes.
Display of a gene expression profile analysis from the following microdissected cDNA libraries:
Two recent studies have demonstrated the usefulness of these prostate cDNA libraries [ 22,23 ]. A combination of computer-based analysis and laboratory analysis identified a number of genes from the prostate libraries within the TGI that have shown patterns of prostate-specific expression.
The investigators suggest that the procedure they used can be easily applied to the discovery of genes expressed in other organs or tumors. The CGAP website has historically been dynamic and is in continuous flux according to the data present in the TGI; thus all utilities are subject to continual improvements and upgrades. The example outlined in this article focused on prostate cancer.
The immediate CGAP goal is to complete construction and sequencing of analogous cDNA libraries from microdissected cells representing all stages of ovarian, lung, colon, and breast cancers. Thus, analysis of the gene expression profiles of these first 5 cancers will undoubtedly render unprecedented bioinformation to the cancer community. More tumors will likely be added to this list once these 5 are completed. A future goal for the analysis of gene expression in cancer progression is the development and use of serial analysis of gene expression (SAGE) cDNA libraries from cancer tissue [ 24 ].
A number of these libraries have recently been constructed and sequenced by CGAP, and utilities to analyze these data are starting to emerge http: Due to the larger amount of data that can be obtained by sequencing SAGE libraries, greater statistical confidence can be ascribed to computer-generated gene expression analyses of these data.
However, because these libraries were generated from bulk tumor tissue and not microdissected cells, direct comparison of tumor to preneoplastic or normal cellular states cannot be made.
Thus, an ideal gene expression analysis of cancer progression might be the application of SAGE technology to microdissected cells. The GAI goal is to discover and catalogue single nucleotide polymorphisms (SNPs) in cDNA sequence that correlate with cancer initiation and progression, whereas the goal of cCAP is to develop a set of tools that will allow for the expedient definition and detailed characterization of chromosomal alterations associated with cancer initiation and progression.
Expansion to model organisms is beginning to take shape within CGAP as well. A mouse TGI will be established in the near future and will mirror the current human TGI in that both bulk tumor tissue and microdissected cells will be used to generate cDNA libraries for high-throughput sequencing.
Like the human TGI, the 5 cancers of focus for the mouse are prostate, breast, lung, colon, and ovarian cancers. In conclusion, the CGAP encompasses an entire approach to understanding cancer at the molecular level. Even in its infancy it shows great promise for uncovering important gene expression changes involved in cancer initiation and progression. An example of how such changes can be discovered has been outlined here.
In the near future one could envision that as the TGI grows linearly, possibilities for bioinformatics could expand exponentially. With addition of the new CGAP initiatives discussed here, the National Cancer Institute optimistically looks forward to uncovering the molecular changes that lead to cancer initiation and progression.
There are no restrictions on its use.
Neoplasia v. Received Dec 21; Accepted Dec
Footnotes
1 This is a US government work.
References
New opportunities for uncovering the molecular basis of cancer.
An integrated molecular analysis of genomes and their expression.
Boguski M, Schuler G. ESTablishing a human transcript map.
Pieces of the puzzle: Expressed sequence tags and the catalog of human genes.
An STS-based map of the human genome.
A radiation hybrid map of the human genome.
A gene map of the human genome.