Human Genome Project

Human Genome Project

Human Genome Project, international scientific effort to map all of the genes on the 23 pairs of human chromosomes and to sequence the 3.1 billion DNA base pairs that make up the chromosomes (see nucleic acid). Begun in 1990 with the goal of enabling scientists to understand the basis of genetic diseases and to gain insight into human evolution, the project was largely completed in 2000, when 85% of the human genome was decoded, and ended in 2003 with 99% decoded; detailed analyses of all the chromosome pairs were published by 2006. In the process, scientists identified genes for cystic fibrosis, neurofibromatosis, Huntington's disease, and an inherited form of breast cancer. In addition, the project decoded the genomes of the bacterium E. coli, a fruit fly, and a nematode worm (see phylum Nematoda) in order to study genetic similarities among species; a mouse genome was also decoded.

The Human Genome Project involved laboratories in the United States, France, Great Britain, Germany, and Japan. It was financed in the United States by the National Institutes of Health and by the Department of Energy and in Great Britain by the Wellcome Trust of London. A comparable project using new DNA-sequencing machines was begun as a private industry venture in the United States in 1998, with a stated goal of completing the mapping of the genome in three years.

Early in 2001 scientists from both teams jointly announced the “completion” of the mapping of the human genome, indicating that they had identified an estimated 30,000 protein-coding genes (instead of the expected 100,000), constituting just 1% of the total human DNA. Subsequent comparison of the two teams' data indicated that, because of differences in the genes identified by the teams, there might in fact be as many as 40,000 human genes. A later, more refined estimate (2004) based on additional work on the genome was that there are between 20,000 and 25,000 genes; more recently, that range has been narrowed to between around 20,000 and somewhat more than 21,000. Scientists have also identified stretches of the genome that code for RNA that is not used to produce protein; there are more than 25,000 of these RNA-producing, or noncoding, genes.

Work continues on further refining the sequencing of the genes on the chromosomes, eliminating the remaining gaps in the genome map, and identifying the extent of variation in the human genome. In 2007 the first sequences of human individuals (James D. Watson and J. Craig Venter, who led the public and private human genome sequencing efforts, respectively) were released; Venter's genome was the first full (diploid) individual human genome. The NIH's National Center for Biotechnology Information maintains GenBank, a database of publicly available genetic sequences from the genomes of plants and animals, including some extinct species.


See studies by J. Sulston and G. Ferry (2002) and J. Shreeve (2004).

The Columbia Electronic Encyclopedia™ Copyright © 2022, Columbia University Press. Licensed from Columbia University Press. All rights reserved.

Human Genome Project

An organized international scientific endeavor to determine the complete structure of the human genetic material deoxyribonucleic acid (DNA) and understand its function. See Human genetics


The idea for the Human Genome Project (HGP) first arose in the mid-1980s. Several scientific groups met to discuss the feasibility, and various reports were published. The most influential report was prepared by the National Research Council (NRC) of the U.S. National Academy of Sciences. It proposed a detailed scientific strategy that persuaded many scientists that the project was possible. October 1, 1990, was declared the official start time for the HGP in the United States; significant funding had become available and research groups were starting their work. Major contributions to the HGP have been made by the United Kingdom, France, Japan, and Germany, with smaller contributions from many other quarters. Coordination among the countries has been informal, relying largely on scientist-to-scientist collaborations, but has proved to be very effective.

Scientific strategy

First, markers are placed on the chromosomes by genetic mapping, that is, observing how the markers are inherited in families. Second, a physical map is created from overlapping cloned pieces of the DNA. Third, the sequence of each piece is determined, and the sequences are lined up by computer until a continuous sequence along the whole chromosome is obtained. The second and third steps can be reversed or done in parallel. As the pieces are sequenced, the sequences at the overlapping ends can be used to help order the pieces. If the sequencing is done before the pieces are mapped, the process is called whole-genome shotgun sequencing. See Deoxyribonucleic acid (DNA), Gene

Because the human genome is so big (human DNA consists of about 3 billion nucleotides connected end to end in a linear array), it was necessary to break the task down into manageable chunks (see illustration).

Illustration: Steps in analyzing a genome
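
The third step of the strategy, lining up the sequences of overlapping pieces by computer, can be illustrated with a toy example. The following is only a minimal sketch of a greedy overlap merge, not the method actually used by the HGP; the function names, the `min_len` threshold, and the sample fragments are all hypothetical, and real assemblers must also cope with sequencing errors and repeated sequence elements.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of `a` that equals a prefix
    of `b`, or 0 if no such overlap of at least `min_len` bases exists."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two pieces with the largest overlap.

    Returns the remaining pieces: one continuous sequence if every piece
    could be joined, or several if gaps remain between them.
    """
    pieces = list(dict.fromkeys(fragments))  # drop exact duplicates
    while len(pieces) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlapping ends left: a gap in the map
        merged = pieces[best_i] + pieces[best_j][best_k:]
        pieces = [p for n, p in enumerate(pieces) if n not in (best_i, best_j)]
        pieces.append(merged)
    return pieces


# Hypothetical fragments whose ends overlap by three bases.
print(greedy_assemble(["ATGCGT", "CGTACC", "ACCTGA"]))
```

When the fragments share no overlapping ends, the function returns them unmerged, which corresponds to a gap that mapping information would be needed to close.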

Model organisms

An important element of the overall strategy was to include the study of model organisms in the HGP. There were two reasons for this: (1) Simpler organisms provide good practice material. (2) Comparisons between model organisms and humans yield very valuable scientific information. The HGP initially adopted five model organisms to have their DNA sequenced: the bacterium Escherichia coli, the yeast Saccharomyces cerevisiae, the roundworm Caenorhabditis elegans, the fruitfly Drosophila melanogaster, and the laboratory mouse Mus musculus. The mouse genome is just as complex as the human genome, but the mouse offers the advantages that it can be bred and other experiments can be conducted that are not possible on humans.


How many genes there are is probably the most common question regarding the human genome. The first two human chromosomes to be sequenced, chromosomes 22 and 21, provided some interesting observations. Although the two chromosomes are approximately the same length, chromosome 22 has more than twice as many genes as chromosome 21. Extrapolation of the number of genes found on chromosomes 22 and 21 led to the estimate that the whole human genome contains about 36,000 genes. This was quite a surprise because previous estimates were 80,000 to 100,000 genes. Preliminary examination of the draft sequence of the entire human genome confirmed that the number of genes is much lower than previously thought. This does not necessarily mean that the human genome is less complex, because many genes can produce more than one protein by alternative splicing of their exons (protein-encoding regions of the gene) as the RNA transcript is processed, before its translation into the constituents of proteins. See Chromosome, Genetic code

Another fascinating feature of the human genome sequence is the large fraction that consists of repeated sequence elements; 40% of chromosome 21 and 42% of chromosome 22 are composed of repeats. The function of any of these repeats is not yet known, but elucidating their distribution in the genome may help to reveal it.

Another statistic that is of interest is the base composition, the percent of the DNA that is made of guanine-cytosine (GC) base pairs as opposed to adenine-thymine (AT) base pairs. Chromosome 22 has a 48% GC content, whereas chromosome 21 has 41% and the average over the genome is 42%. Again, the significance of this is not yet known, but higher GC content seems to correlate with higher gene density.
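
As an illustration of how base composition is computed, the sketch below counts the G and C bases in a sequence and reports them as a percentage of all unambiguous bases. The function name and the choice to skip ambiguous symbols such as N are assumptions made for this example, not details taken from the HGP analyses.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of G and C bases among the A, C, G, T bases
    in `sequence`. Ambiguous symbols (for example N) are ignored."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Hypothetical short sequence: 4 of its 8 bases are G or C.
print(gc_content("GGCCAATT"))
```

Because each G pairs with a C on the opposite strand, counting G and C on one strand gives the same percentage as counting GC base pairs in the double helix.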

The type of analysis performed initially on chromosomes 21 and 22 has been extended to the entire human sequence. However, a full understanding will take decades to achieve.

Future research

With the complete sets of genes of organisms available, how genes are turned on and off and how genes interact with each other can be studied. What the different genes do and how they affect human health must also be learned. Consequently, much effort is now directed to studying the regulation of gene expression and annotating the sequence with useful biological information about function.

Another key challenge is to understand how DNA function varies with differences in the DNA sequence. Each human being has a unique DNA sequence which differs from that of any other human being by about 0.1%, regardless of ethnic origin. Yet this small difference affects characteristics such as how humans look and to what diseases they are susceptible. The differences also provide clues about the evolution of the human species and the historical migration patterns of people across the world. See Molecular biology, Nucleic acid
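
To give a sense of scale, the figures quoted in this entry (about 3 billion nucleotides and a difference of about 0.1% between individuals) imply roughly 3 million differing positions between any two people. The sketch below works through that arithmetic and shows how a percentage difference could be computed for two aligned sequences; the function names and the sample sequences are hypothetical.

```python
GENOME_SIZE = 3_000_000_000  # "about 3 billion nucleotides", as stated in the text
DIFF_FRACTION = 0.001        # "about 0.1%", as stated in the text


def expected_differences(genome_size: int, fraction: float) -> int:
    """Approximate number of positions at which two genomes differ."""
    return round(genome_size * fraction)


def percent_difference(a: str, b: str) -> float:
    """Percentage of positions that differ between two aligned sequences
    of equal length."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return 100.0 * mismatches / len(a)


print(expected_differences(GENOME_SIZE, DIFF_FRACTION))
print(percent_difference("ACGT", "ACGA"))  # hypothetical aligned sequences
```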

McGraw-Hill Concise Encyclopedia of Bioscience. © 2002 by The McGraw-Hill Companies, Inc.

Human Genome Project

a multi-national project to map the human GENOME (i.e. every gene on every human chromosome). Initiated in 1990, the project aims to complete its mapping in the early decades of the present century. The traditional issues in nature and nurture – see NATURE-NURTURE DEBATE – arise in relation to the Project. There are those who expect that, from the initial identification of genes, links between specific genes and bodily functions, disease, etc. will also be widely established and provide greatly enhanced understanding and a capacity to intervene. Others point to the limitations and dangers likely to be associated with such a reductionistic account. The likelihood is that while some matching of genes with functions will be tight (e.g. as is already clear for some genetically inherited diseases), many other areas will continue to require explanation beyond such reductionistic accounts.
Collins Dictionary of Sociology, 3rd ed. © HarperCollins Publishers 2000

Human Genome Project

A bioinformatics project that has identified the 30,000 genes in human DNA. Coordinated by the U.S. Department of Energy and the National Institutes of Health, the U.S. Human Genome Project started in 1990 and released its findings in February 2001 along with findings from a separate project by Celera Genomics Group. There are similar projects in other countries as well. The purpose is to store the three billion chemical base pairs (the DNA sequence) derived from these analyses in databases for use in biomedical research. See micro array.

The End Goal
The goal of the Human Genome Project is to determine the relationships between DNA makeup and human traits and predispositions. Although sequencing has been extremely expensive, costs are approaching the $1,000 level per human genome, enabling "personalized genomics" to dramatically alter the course of medicine. See Personal Genome Project.

A Human Component Dictionary
This information is not a blueprint of the human being; rather, it is a dictionary of components. Although it was once believed that each gene made only one protein, it is now believed that each gene creates numerous proteins, and working out which proteins each gene produces is expected to take years. Part of the U.S. government project is to study the ethical and legal impact that this information will have on society. An abundance of information can be found at
Copyright © 1981-2019 by The Computer Language Company Inc. All Rights reserved. THIS DEFINITION IS FOR PERSONAL USE ONLY. All other reproduction is strictly prohibited without permission from the publisher.