
Human genetic variation

Article snapshot taken from [REDACTED] under a Creative Commons Attribution-ShareAlike license.

This is an old revision of this page, as edited by Wet dog fur (talk | contribs) at 08:05, 20 January 2009 (Geographic variation: removed space). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.


Human genetic variation is the genetic variability of humans: the differences observed between the genomes of individuals and the variation in allele frequencies observed between groups of humans. Variation can be measured at both the individual level (differences between individual people) and the population level (differences between populations living in different regions).

The study of human genetic variation has both evolutionary significance and medical applications. It can help scientists understand ancient human population migrations as well as how different human groups are biologically related to one another. From a medical perspective, the study of human genetic variation may be important because some disease-causing alleles occur at a greater frequency in people from specific geographic regions.

Types of human variation

"Genetic variation among individual humans occurs on many different scales, ranging from gross alterations in the human karyotype to single nucleotide changes." Nucleotide diversity is based on single mutations known as single nucleotide polymorphisms (SNPs).

Variants of a gene (polymorphisms) are called alleles. Any individual human has only two copies of any given gene, one inherited from their mother and the other from their father, but many more versions of the gene may exist locally or within their own family. Offspring share 50% of their genes with each parent, and 50% of their genes on average with each sibling (see Coefficient of relationship). In a given population certain alleles may be more abundant than others, leading to variation in allele frequencies between populations. The more geographically distant populations are from each other, the more differences there are between them.

With a genome of approximately 3 billion nucleotides, two humans differ on average at approximately 3 million nucleotides. Most of these single nucleotide polymorphisms (SNPs) are neutral, but some are functional and influence the phenotypic differences between humans. It is estimated that about 10 million SNPs exist in human populations, where the rarer SNP allele has a frequency of at least 1% (see International HapMap Project).
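As a quick check of the figures above, the following sketch converts the approximate counts into a per-site difference rate. The constant and function names are illustrative only, and the inputs are the rounded estimates quoted in the text rather than measured values.

```python
# Rounded estimates quoted in the text; both are approximate.
GENOME_SIZE = 3_000_000_000       # ~3 billion nucleotides
PAIRWISE_DIFFERENCES = 3_000_000  # ~3 million sites differing between two people


def difference_rate(differences: int, genome_size: int) -> float:
    """Fraction of sites at which two genomes differ."""
    return differences / genome_size


rate = difference_rate(PAIRWISE_DIFFERENCES, GENOME_SIZE)
print(f"Difference rate: {rate:.2%}")
print(f"Roughly one difference per {round(1 / rate)} sites")
```

Under these assumptions the rate is about one site in a thousand, which corresponds to the nucleotide diversity figure of about 0.1% commonly cited for humans.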

A better understanding of the structure of the genome has been gained fairly recently with the publication of two full sequences of individual genomes. This represents a new development because the Human Genome Project and a parallel project by Celera Genomics produced two haploid sequences, both of which were an amalgamation of sequences from many individuals. Recently the diploid sequences of both Craig Venter and James Watson have been published. Analysis of diploid sequences has shown that non-SNP variation accounts for much more human genetic variation than single nucleotide diversity. This non-SNP variation is called copy number variation and results from deletions, inversions, insertions and duplications. It is estimated that approximately 0.4% of the genomes of unrelated people typically differ with respect to copy number. When copy number variation is included, human-to-human genetic variation is estimated to be at least 0.5% (99.5% similarity). Copy number variations that result in interindividual differences are not necessarily completely inherited, but can also arise during development.

Distinctiveness of human variation

Data gathered to date suggest that human variation exhibits several distinctive characteristics. First, compared with many other mammalian species, humans are genetically less diverse—a counterintuitive finding, given our large population and worldwide distribution (Li and Sadler 1991; Kaessmann et al. 2001).

The nucleotide diversity between two random humans is about 0.1%, that is, one difference per 1,000 base pairs, which is small compared with other large primates. For example, the chimpanzee subspecies living in central and western Africa have higher levels of diversity than do humans (Ebersberger et al. 2002; Yu et al. 2003; Fischer et al. 2004). Although chimpanzees have a restricted geographical range and small population numbers, their nucleotide diversity is greater than that of humans, with one difference between individuals per 500 base pairs. This is often taken as evidence for the recent African origin of our species; it also makes it difficult to claim any great divergence between individuals or groups of humans, given our overall relative homogeneity as a species.

Geographic variation

There are at least two reasons why genetic variation is geographically distributed:

  • Natural selection may confer an adaptive advantage on individuals in a specific environment. For example, dark skin pigmentation protects from high levels of ultraviolet radiation, whereas a low level of melanin in the skin may confer an advantage in regions with low levels of UV light. Alleles under selection are likely to occur only in those geographic regions where they confer an advantage.
  • The second main cause of geographically distributed genetic variation is non-uniform sampling of a population, chiefly through the founder effect. This occurs when a small group of individuals migrates from a larger group and founds a new population: if the migrating group represents only a small subset of the parental population, it will not be genetically representative of the parental population (sampling error). Small founding populations are also subject to genetic drift, which may further alter allele frequencies. An example is the human migration out of Africa: it has been theorised that the migrants carried only a small fraction of the genetic variation present in East Africa, and that this is the cause of the observed lower levels of diversity in all indigenous non-African humans.
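The sampling error behind the founder effect can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: the source population, its allele labels and counts, and the founder group size are all hypothetical, and a single locus is modelled as a list of gene copies.

```python
import random


def allele_frequencies(population):
    """Return {allele: frequency} for a list of allele labels."""
    counts = {}
    for allele in population:
        counts[allele] = counts.get(allele, 0) + 1
    total = len(population)
    return {allele: n / total for allele, n in counts.items()}


def founder_sample(source_population, founder_count, rng):
    """Draw founder gene copies without replacement from the source population."""
    return rng.sample(source_population, founder_count)


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical source population: 10,000 gene copies at one locus,
# with one common allele and several progressively rarer ones.
source = ["A"] * 7000 + ["B"] * 2000 + ["C"] * 700 + ["D"] * 250 + ["E"] * 50

founders = founder_sample(source, 20, rng)

source_freqs = allele_frequencies(source)
founder_freqs = allele_frequencies(founders)
print("source:  ", source_freqs)
print("founders:", founder_freqs)
```

Because the founders are a subset of the source, they can only carry alleles that already exist there; with so few gene copies, rare alleles are often lost entirely and the frequencies of the remaining ones shift by chance.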

Generally, more recent neutral polymorphisms caused by mutation are likely to be relatively geographically localised, while older polymorphisms are more likely to be shared by all human groups. A large majority of the observed genetic variation is nevertheless distributed within any geographic region rather than between regions, though it is usually possible to accurately identify the geographic origins of any individual's ancestors by genetic means.

The details of the distribution of variants within and among human populations are impossible to describe succinctly because of the difficulty of defining a "population," the clinal nature of variation, and heterogeneity across the genome (Long and Kittles 2003). In general, however, 6%–10% of genetic variation occurs between large groups living on different continents and 5%–6% between localised populations within the same continent, while the remaining ~85% exists within populations (Lewontin 1972; Jorde et al. 2000a; Hinds et al. 2005). Long and Kittles (2003) point out that this estimate is somewhat misleading, because the figure of ~85% of diversity existing within populations is an average over all human populations. The recent African origin theory predicts that there is a great deal more diversity within Africa than outside it, and that diversity should decrease the further from Africa a population is sampled. Long and Kittles show that African populations do indeed contain about 100% of human genetic diversity, whereas diversity is much reduced in populations outside Africa; for example, their population from New Guinea captures only about 70% of human variation. This distribution of genetic variation differs from the pattern seen in many other mammalian species, for which existing data suggest greater differentiation between groups (Templeton 1998; Kittles and Weiss 2003).

Our history as a species also has left genetic signals in regional populations. For example, in addition to having higher levels of genetic diversity, populations in Africa tend to have lower amounts of linkage disequilibrium than do populations outside Africa, partly because of the larger size of human populations in Africa over the course of human history and partly because the number of modern humans who left Africa to colonize the rest of the world appears to have been relatively low (Gabriel et al. 2002). In contrast, populations that have undergone dramatic size reductions or rapid expansions in the past and populations formed by the mixture of previously separate ancestral groups can have unusually high levels of linkage disequilibrium (Nordborg and Tavare 2002).

In the field of population genetics, the distribution of neutral polymorphisms among contemporary humans is believed to reflect human demographic history. Humans are thought to have passed through a population bottleneck before a rapid expansion coinciding with migrations out of Africa, leading to an African-Eurasian divergence around 100,000 years ago (ca. 5,000 generations), followed by a European-Asian divergence about 40,000 years ago (ca. 2,000 generations). Richard G. Klein, Nicholas Wade and Spencer Wells, among others, have postulated that modern humans did not leave Africa and successfully colonize the rest of the world until as recently as 60,000 to 50,000 years B.P., pushing back the dates for subsequent population splits as well.

The rapid expansion of a previously small population has two important effects on the distribution of genetic variation. First, the so-called founder effect occurs when founder populations bring only a subset of the genetic variation from their ancestral population. Second, as founders become more geographically separated, the probability that two individuals from different founder populations will mate becomes smaller. The effect of this assortative mating is to reduce gene flow between geographical groups, and to increase the genetic distance between groups. The expansion of humans from Africa affected the distribution of genetic variation in two other ways. First, smaller (founder) populations experience greater genetic drift because of increased fluctuations in neutral polymorphisms. Second, new polymorphisms that arose in one group were less likely to be transmitted to other groups as gene flow was restricted.
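The claim that smaller populations experience greater genetic drift can be sketched with a simple neutral Wright-Fisher-style simulation. This is an illustrative model rather than the method of any study cited here; the population sizes, number of generations, starting frequency and seed are all assumptions.

```python
import random
import statistics


def wright_fisher(initial_freq, num_copies, generations, rng):
    """Simulate neutral drift of one allele and return its final frequency.

    Each generation, every one of `num_copies` gene copies is drawn
    independently according to the previous generation's allele frequency.
    """
    freq = initial_freq
    for _ in range(generations):
        carriers = sum(1 for _ in range(num_copies) if rng.random() < freq)
        freq = carriers / num_copies
    return freq


def final_frequency_variance(num_copies, replicates=100, generations=20, seed=1):
    """Variance of the final allele frequency across independent replicates."""
    rng = random.Random(seed)
    finals = [wright_fisher(0.5, num_copies, generations, rng)
              for _ in range(replicates)]
    return statistics.pvariance(finals)


small = final_frequency_variance(num_copies=20)    # small founder population
large = final_frequency_variance(num_copies=1000)  # large population
print(f"variance of final frequency, small population: {small:.4f}")
print(f"variance of final frequency, large population: {large:.4f}")
```

Starting from the same frequency, replicate small populations end up much further apart from one another than replicate large populations do, which is the sense in which drift is stronger in founder groups.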

Many other geographic, climatic, and historical factors have contributed to the patterns of human genetic variation seen in the world today. For example, population processes associated with colonization, periods of geographic isolation, socially reinforced endogamy, and natural selection all have affected allele frequencies in certain populations (Jorde et al. 2000b; Bamshad and Wooding 2003). In general, however, the recency of our common ancestry and continual gene flow among human groups have limited genetic differentiation in our species.

Genetic clustering

Gene clusters from Rosenberg (2006) for N=7. (Cluster analysis divides a dataset into any prespecified number of clusters.) Individuals have genes from multiple clusters. The cluster prevalent only among the Kalash people (yellow) only splits off at N=7 and greater.

Genetic data can be used to infer population structure and assign individuals to groups that often correspond with their self-identified geographical ancestry. Recently, Lynn Jorde and Steven Wooding argued that "Analysis of many loci now yields reasonably accurate estimates of genetic similarity among individuals, rather than populations. Clustering of individuals is correlated with geographic origin or ancestry."

In 2003 A. W. F. Edwards wrote a paper called Lewontin's Fallacy, rebutting the argument that, because most of the variation is within groups, classification of humans is not possible. He claimed that this conclusion ignores the fact that most of the information that distinguishes populations is hidden in the correlation structure of the data and not simply in the variation of the individual factors. Edwards concludes that "It is not true that 'racial classification is ... of virtually no genetic or taxonomic significance' or that 'you can't predict someone’s race by their genes'." Likewise, Neil Risch of Stanford University has proposed that self-identified race/ethnic group could be a valid means of categorization in the USA for public health and policy considerations.

Researchers such as Neil Risch and Noah Rosenberg have argued that a person's biological and cultural background may have important implications for medical treatment decisions. For example, an opinion paper by Neil Risch's group in 2002 states:

Both for genetic and non-genetic reasons, we believe that racial and ethnic groups should not be assumed to be equivalent, either in terms of disease risk or drug response... Whether African Americans, Hispanics, Native Americans, Pacific Islanders or Asians respond equally to a particular drug is an empirical question that can only be addressed by studying these groups individually.

Another 2002 paper, by Noah Rosenberg's group, makes a similar claim:

The structure of human populations is relevant in various epidemiological contexts. As a result of variation in frequencies of both genetic and nongenetic risk factors, rates of disease and of such phenotypes as adverse drug response vary across populations. Further, information about a patient’s population of origin might provide health care practitioners with information about risk when direct causes of disease are unknown.

This work used samples from the Human Genome Diversity Project (HGDP), a project that has collected samples from individuals from 52 ethnic groups from various locations around the world. The HGDP has itself been criticised for collecting samples on an "ethnic group" basis, on the grounds that ethnic groups represent constructed categories rather than categories which are solely natural or biological. The molecular anthropologist Jonathan Marks states:

As any anthropologist knows, ethnic groups are categories of human invention, not given by nature. Their boundaries are porous, their existence historically ephemeral. There are the French, but no more Franks; there are the English, but no Saxons; and Navajos, but no Anasazi...we cannot really know the nature of the actual relationship of the modern group to the ancient one...The worst mistake you can make in human biology is to confuse constructed categories with natural ones. And to overload a big project with cultural categories as the overall sampling strategy would be a serious problem

In the same issue of Science that published the Rosenberg data, Mary-Claire King and Arno G. Motulsky give a similar warning regarding the HGDP data:

The identification of clusters corresponding to the major geographic regions may depend on the sampling of individuals from well-defined, relatively homogeneous populations. If individuals were sampled from a worldwide 'grid' (or a worldwide grid weighted by population density), the clusters might be much less precisely defined. Does the correspondence of worldwide genetic clusters and major geographic regions suggest borders around genetic clusters analogous to the physical borders—oceans, mountain ranges, and deserts—separating geographic regions? No. Both the results of Rosenberg and colleagues and those of previous studies indicate that unlike separations between geographic regions, differences in allele frequencies are gradual.

Another study by Neil Risch in 2005 used 326 microsatellite markers and self-identified race/ethnic group (SIRE) to represent discrete "populations". Individuals involved in the study had to choose one of four categories: white (European American), African-American (black), Asian or Hispanic. The study showed distinct and non-overlapping clustering of the white, African-American and Asian samples. The results were claimed to confirm the integrity of self-described ancestry: "We have shown a nearly perfect correspondence between genetic cluster and SIRE for major ethnic groups living in the United States, with a discrepancy rate of only 0.14%." The authors also warned, however: "This observation does not eliminate the potential for confounding in these populations. First, there may be subgroups within the larger population group that are too small to detect by cluster analysis. Second, there may not be discrete subgrouping but continuous ancestral variation that could lead to stratification bias. For example, African Americans have a continuous range of European ancestry that would not be detected by cluster analysis but could strongly confound genetic case-control studies." (Tang, 2005)

Studies such as those by Risch and Rosenberg use a computer program called STRUCTURE to find human populations (gene clusters). It is a statistical program that works by placing individuals into one of two clusters based on their overall genetic similarity; many possible pairs of clusters are tested per individual to generate multiple clusters. These populations are based on multiple genetic markers that are often shared between different human populations, even over large geographic ranges. The notion of a genetic cluster is that people within the cluster have, on average, allele frequencies more similar to each other than to those of people in other clusters (Edwards, 2003; but see also infobox "Multi Locus Allele Clusters"). In a test of idealised populations, STRUCTURE was found to consistently underestimate the number of populations in the data set when high migration rates between populations and slow mutation rates (such as those of single nucleotide polymorphisms) were considered.

Nevertheless, the Rosenberg et al. (2002) paper shows that individuals can be assigned to specific clusters with a high degree of accuracy. One of the underlying questions regarding the distribution of human genetic diversity is the degree to which genes are shared between the observed clusters. It has been observed repeatedly that the majority of variation in the global human population is found within populations. This variation is usually calculated using Sewall Wright's fixation index (FST), which estimates the proportion of the total genetic variation that is attributable to differences between groups. The degree of human genetic variation differs a little depending on the gene type studied, but in general it is common to claim that ~85% of genetic variation is found within groups, ~6-10% between groups within the same continent and ~6-10% between continental groups. For example, the Human Genome Project states "two random individuals from any one group are almost as different as any two random individuals from the entire world." On the other hand, Edwards (2003) claims in his essay "Lewontin's Fallacy" that: "It is not true, as Nature claimed, that 'two random individuals from any one group are almost as different as any two random individuals from the entire world'" and Risch et al. (2002) state "Two Caucasians are more similar to each other genetically than a Caucasian and an Asian." It should be noted that these statements are not the same. Risch et al. simply state that two indigenous individuals from the same geographical region are more similar to each other than either is to an indigenous individual from a different geographical region, a claim few would argue with. Jorde et al. put it like this:

The picture that begins to emerge from this and other analyses of human genetic variation is that variation tends to be geographically structured, such that most individuals from the same geographic region will be more similar to one another than to individuals from a distant region.

Edwards, by contrast, claims that it is not true that the differences between individuals from different geographical regions represent only a small proportion of the variation within the human population (he claims that within-group differences between individuals are not almost as large as between-group differences). Bamshad et al. (2004) used the data from Rosenberg et al. (2002) to investigate the extent of genetic differences between individuals within continental groups relative to genetic differences between individuals from different continental groups. They found that, though these individuals could be classified very accurately into continental clusters, there was a significant degree of genetic overlap at the individual level: using 377 loci, individual Europeans were more genetically similar to East Asians than to other Europeans about 38% of the time.
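As a rough illustration of the fixation index (FST) used in the apportionment figures above, the sketch below computes it for a single biallelic locus from hypothetical allele frequencies. It uses one common heterozygosity-based formulation, FST = (HT - HS) / HT, with equally weighted subpopulations; published estimators differ in detail, and the example frequencies are invented for illustration.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def fst(subpop_freqs):
    """FST = (H_T - H_S) / H_T for equally weighted subpopulations.

    H_S is the mean expected heterozygosity within subpopulations and
    H_T is the expected heterozygosity of the pooled population.
    """
    mean_p = sum(subpop_freqs) / len(subpop_freqs)
    h_total = expected_heterozygosity(mean_p)
    h_within = (sum(expected_heterozygosity(p) for p in subpop_freqs)
                / len(subpop_freqs))
    if h_total == 0.0:
        return 0.0  # no variation at all, so nothing to apportion
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies in different sets of populations
print(fst([0.5, 0.5, 0.5]))  # identical frequencies: no between-group share
print(fst([0.3, 0.5, 0.7]))  # moderate differences between groups
print(fst([0.0, 1.0]))       # fixed differences: all variation is between groups
```

In this formulation a value near 0.1 means that roughly 10% of the variation at the locus lies between groups and roughly 90% within them, which is the kind of apportionment the percentages in the text describe.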

The results obtained by clustering analyses are dependent on several criteria:

  • The clusters produced are relative, not absolute. Each cluster is the product of comparisons between sets of data derived for the study, so results are highly influenced by sampling strategies. (Edwards, 2003)
  • The geographic distribution of the populations sampled. Because human genetic diversity is marked by isolation by distance, populations from geographically distant regions will form much more discrete clusters than those from geographically close regions. (Kittles and Weiss, 2003)
  • The number of genes used. The more genes used in a study, the greater the resolution produced and therefore the greater the number of clusters that will be identified. (Tang, 2005)

Distribution of European clusters identified by Bauchet. When two clusters are identified there is a north-southeast cline that may be due to demic diffusion during the European Neolithic.

Additionally, two studies of European population clusters have been produced. Seldin et al. (2006) identified three European clusters using 5,700 genome-wide polymorphisms. Bauchet et al. (2007) used 10,000 polymorphisms to identify five distinct clusters in the European population: a south-eastern European cluster (including samples from southern Italian, Armenian, Ashkenazi Jewish and Greek "populations"); a northern European cluster (including samples from German, eastern English, Polish and western Irish "populations"); a Basque cluster (including samples from Basque "populations"); a Finnish cluster (including samples from Finnish "populations"); and a Spanish cluster (including samples from Spanish "populations"). Most "populations" contained individuals from clusters other than the dominant cluster for that population, and there were also individuals with membership of several clusters. The results of this study are presented on a map of Europe (Bauchet, 2007).

The existence of allelic clines and the observation that the bulk of human variation is continuously distributed have led some scientists to conclude that any categorization schema attempting to partition that variation meaningfully will necessarily create artificial truncations (Kittles & Weiss 2003). It is for this reason, Reanne Frank argues, that attempts to allocate individuals into ancestry groupings based on genetic information have yielded varying results that are highly dependent on methodological design. Serre and Pääbo (2004) make a similar claim:

The absence of strong continental clustering in the human gene pool is of practical importance. It has recently been claimed that “the greatest genetic structure that exists in the human population occurs at the racial level” (Risch et al. 2002). Our results show that this is not the case, and we see no reason to assume that “races” represent any units of relevance for understanding human genetic history.

In a response to Serre and Pääbo (2004), Rosenberg et al. (2005) make three relevant observations. Firstly, they maintain that their clustering analysis is robust. Secondly, they agree with Serre and Pääbo that membership of multiple clusters can be interpreted as evidence for clinality (isolation by distance), though they also comment that it may be due to admixture between neighbouring groups (small island model). Thirdly, they comment that evidence of clusteredness is not evidence for any concept of "biological race".

Serre and Pääbo argue that human genetic diversity consists of clines of variation in allele frequencies. We agree and had commented on this issue in our original paper: “In several populations, individuals had partial membership in multiple clusters, with similar membership coefficients for most individuals. These populations might reflect continuous gradations across regions or admixture of neighboring groups.” (Rosenberg, 2002) At the same time, we find that human genetic diversity consists not only of clines, but also of clusters, which STRUCTURE observes to be repeatable and robust....Our evidence for clustering should not be taken as evidence of our support of any particular concept of “biological race.” In general, representations of human genetic diversity are evaluated based on their ability to facilitate further research into such topics as human evolutionary history and the identification of medically important genotypes that vary in frequency across populations. Both clines and clusters are among the constructs that meet this standard of usefulness: for example, clines of allele frequency variation have proven important for inference about the genetic history of Europe, and clusters have been shown to be valuable for avoidance of the false positive associations that result from population structure in genetic association studies. The arguments about the existence or nonexistence of “biological races” in the absence of a specific context are largely orthogonal to the question of scientific utility, and they should not obscure the fact that, ultimately, the primary goals for studies of genetic variation in humans are to make inferences about human evolutionary history, human biology, and the genetic causes of disease.

Similarly, Witherspoon et al. (2007) have shown that, while it is possible to classify people into genetic clusters, this does not resolve the observation that two individuals from different populations are often genetically more similar to each other than two individuals from the same population:

Discussions of genetic differences between major human populations have long been dominated by two facts: (a) Such differences account for only a small fraction of variance in allele frequencies, but nonetheless (b) multilocus statistics assign most individuals to the correct population. This is widely understood to reflect the increased discriminatory power of multilocus statistics. Yet Bamshad et al. (2004) showed, using multilocus statistics and nearly 400 polymorphic loci, that (c) pairs of individuals from different populations are often more similar than pairs from the same population. If multilocus statistics are so powerful, then how are we to understand this finding?
All three of the claims listed above appear in disputes over the significance of human population variation and "race"...The Human Genome Project (2001, p. 812) states that "two random individuals from any one group are almost as different as any two random individuals from the entire world."

Risch et al. (2002) state that "two Caucasians are more similar to each other genetically than a Caucasian and an Asian", but Bamshad et al. (2004) used the same data set as Rosenberg et al. (2002) to show that, when only 377 microsatellite markers are analysed, Europeans are more similar to Asians than to other Europeans 38% of the time.

If a landmass is considered with variation distributed in one dimension (west-east). Top: Distribution of genetic variation if a small island model is considered; there are two "populations" with a narrow region of hybridisation where migration occurs. This pattern is "clustered".
Bottom: Distribution of genetic variation if isolation by distance is considered; all variation is gradual over the extent of the landmass. This pattern is "clinal".
Percentage similarity between two individuals from different clusters when 377 microsatellite markers are considered.

                      Africans  Europeans  Asians
Europeans             36.5
Asians                35.5      38.3
Indigenous Americans  26.1      33.4       35

In agreement with the observation of Bamshad et al. (2004), Witherspoon et al. (2007) have shown that many more than 326 or 377 microsatellite loci are required in order to show that individuals are always more similar to individuals in their own population group than to individuals in different population groups, even for three distinct populations.

In 2007 Witherspoon et al. sought to investigate these apparently contradictory observations. In their paper "Genetic similarities within and between human populations", they expand upon the observation of Bamshad et al. (2004). They show that the observed clustering of human populations into relatively discrete groups is a product of using what they call "population trait values". This means that each individual is compared to the "typical" trait for several populations, and assigned to a population based on the individual's overall similarity to one of the populations as a whole: "population membership is treated as an additive quantitative genetic trait controlled by many loci of equal effect, and individuals are divided into populations on the basis of their trait values." They therefore claim that clustering analyses cannot necessarily be used to make inferences about the similarity or dissimilarity of individuals between or within clusters, but only about the similarity or dissimilarity of individuals to the "trait values" of any given cluster. The paper measures the rate of misclassification using these "trait values" and calls this the "population trait value misclassification rate" (CT). It investigates the similarities between individuals using what the authors term the "dissimilarity fraction" (ω): "the probability that a pair of individuals randomly chosen from different populations is genetically more similar than an independent pair chosen from any single population." Witherspoon et al. show that two individuals can be more genetically similar to each other than to the typical genetic type of their own respective populations, and yet be correctly assigned to their respective populations.
An important observation is that the likelihood that two individuals from different populations will be more similar to each other genetically than two individuals from the same population depends on several criteria, most importantly the number of genes studied and the distinctiveness of the populations under investigation.

Given 10 loci, three distinct populations, and the full spectrum of polymorphisms, the answer is ω ~ 0.3, or nearly one-third of the time. With 100 loci, the answer is ~20% of the time and even using 1000 loci, ω ~ 10%. However, if genetic similarity is measured over many thousands of loci, the answer becomes never when individuals are sampled from geographically separated populations.

By geographically separated populations, they mean sampling of people only from distant geographical regions while omitting intermediate regions, in this case Europe, sub-Saharan Africa, and East Asia. They continue:

On the other hand, if the entire world population were analyzed, the inclusion of many closely related and admixed populations would increase ω... In a similar vein, Romualdi et al. (2002) and Serre and Paabo (2004) have suggested that highly accurate classification of individuals from continuously sampled (and therefore closely related) populations may be impossible.... Classification methods typically make use of aggregate properties of populations, not just properties of individuals or even of pairs of individuals... The Structure classification algorithm (Pritchard et al. 2000) also relies on aggregate properties of populations, such as Hardy–Weinberg and linkage equilibrium. In contrast, the pairwise distances used to compute ω make no use of population-level information and are strongly affected by the high level of within-groups variation typical of human populations. This accounts for the difference in behavior between ω and the classification results.

Witherspoon et al. also add:

[The fact that,] given enough genetic data, individuals can be correctly assigned to their populations of origin is compatible with the observation that most human genetic variation is found within populations, not between them. It is also compatible with our finding that, even when the most distinct populations are considered and hundreds of loci are used, individuals are frequently more similar to members of other populations than to members of their own population.
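
As a rough illustration of how the dissimilarity fraction ω defined above responds to the number of loci, the following Python sketch simulates two hypothetical populations and estimates ω by repeated sampling. It is not the procedure used by Witherspoon et al.: the allele frequencies, the `divergence` parameter, the function names, and the simple allele-count distance are all assumptions made for illustration.

```python
import random


def make_frequencies(n_loci, divergence, rng):
    """Hypothetical allele frequencies for two populations at biallelic loci."""
    freqs = []
    for _ in range(n_loci):
        base = rng.uniform(0.1, 0.9)
        shift = rng.uniform(-divergence, divergence)
        # Push the two populations apart symmetrically, keeping valid frequencies.
        p1 = min(max(base + shift, 0.01), 0.99)
        p2 = min(max(base - shift, 0.01), 0.99)
        freqs.append((p1, p2))
    return freqs


def sample_individual(freqs, pop, rng):
    """Diploid genotype: number of copies (0, 1 or 2) of one allele per locus."""
    return [(rng.random() < f[pop]) + (rng.random() < f[pop]) for f in freqs]


def distance(a, b):
    """Sum over loci of the absolute difference in allele counts (an assumed measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def dissimilarity_fraction(n_loci, divergence=0.15, n_trials=500, seed=0):
    """Estimate omega: how often a between-population pair is strictly more
    similar than an independently drawn within-population pair."""
    rng = random.Random(seed)
    freqs = make_frequencies(n_loci, divergence, rng)
    hits = 0
    for _ in range(n_trials):
        between = distance(sample_individual(freqs, 0, rng),
                           sample_individual(freqs, 1, rng))
        pop = rng.randrange(2)  # the within pair comes from one population
        within = distance(sample_individual(freqs, pop, rng),
                          sample_individual(freqs, pop, rng))
        if between < within:
            hits += 1
    return hits / n_trials
```

Comparing `dissimilarity_fraction(10)` with `dissimilarity_fraction(1000)` should show ω falling as loci are added, in line with the trend quoted above. The absolute values depend entirely on the assumed frequencies and should not be read as estimates for real populations.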

Race

New data on human genetic variation has reignited the debate surrounding race. Most of the controversy surrounds the question of how to interpret this new data, and whether conclusions based on existing data are sound. A large majority of researchers endorse the view that continental groups do not constitute different subspecies. However, other researchers still debate whether evolutionary lineages should rightly be called "races". These questions are particularly pressing for biomedicine, where self-described race is often used as an indicator of ancestry (see race in biomedicine below).

Although the genetic differences among human groups are relatively small, differences in certain genes such as Duffy, ABCC11, and SLC24A5, known as ancestry-informative markers (AIMs), can nevertheless be used to reliably situate many individuals within broad, geographically based groupings or self-identified races. For example, computer analyses of hundreds of polymorphic loci sampled in globally distributed populations have revealed the existence of genetic clustering that is roughly associated with groups that historically have occupied large continental and subcontinental regions (Rosenberg et al. 2002; Bamshad et al. 2003).

Some commentators have argued that these patterns of variation provide a biological justification for the use of traditional racial categories. They argue that the continental clusterings correspond roughly with the division of human beings into sub-Saharan Africans; Europeans, Western Asians, Southern Asians and Northern Africans; Eastern Asians, Southeast Asians, Polynesians and Native Americans; and other inhabitants of Oceania (Melanesians, Micronesians and Australian Aborigines) (Risch et al. 2002). Other observers disagree, saying that the same data undercut traditional notions of racial groups (King and Motulsky 2002; Calafell 2003; Tishkoff and Kidd 2004). They point out, for example, that major populations considered races or subgroups within races do not necessarily form their own clusters. For instance, samples taken from India and Pakistan affiliate with Europeans or eastern Asians rather than separating into a distinct cluster.

Furthermore, because human genetic variation is clinal, many individuals affiliate with two or more continental groups. Thus, the genetically based "biogeographical ancestry" assigned to any given person generally will be broadly distributed and will be accompanied by sizable uncertainties (Pfaff et al. 2004).

In many parts of the world, groups have mixed in such a way that many individuals have relatively recent ancestors from widely separated regions. Although genetic analyses of large numbers of loci can produce estimates of the percentage of a person's ancestors coming from various continental populations (Shriver et al. 2003; Bamshad et al. 2004), these estimates may assume a false distinctiveness of the parental populations, since human groups have exchanged mates from local to continental scales throughout history (Cavalli-Sforza et al. 1994; Hoerder 2002). Even with large numbers of markers, information for estimating admixture proportions of individuals or groups is limited, and estimates typically will have wide confidence intervals (CIs) (Pfaff et al. 2004).
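
The uncertainty of such estimates can be illustrated with a minimal Python sketch. It assumes a simplified model that is not taken from the cited studies: two parental populations A and B with known allele frequencies at each marker, independent markers, and an individual whose expected allele frequency is a mixture weighted by the admixture proportion m. The function names, the grid search, and the likelihood-based approximate 95% interval (values of m within 1.92 log-likelihood units of the maximum) are choices made for illustration.

```python
import math


def log_likelihood(m, genotypes, freqs_a, freqs_b):
    """Binomial log-likelihood of allele counts (0, 1 or 2 per marker) given
    admixture proportion m from population A; constant terms are omitted."""
    total = 0.0
    for g, pa, pb in zip(genotypes, freqs_a, freqs_b):
        q = m * pa + (1 - m) * pb           # expected allele frequency
        q = min(max(q, 1e-9), 1 - 1e-9)     # guard against log(0)
        total += g * math.log(q) + (2 - g) * math.log(1 - q)
    return total


def estimate_admixture(genotypes, freqs_a, freqs_b, steps=100):
    """Grid-search estimate of m with an approximate 95% likelihood interval."""
    grid = [i / steps for i in range(steps + 1)]
    scores = [log_likelihood(m, genotypes, freqs_a, freqs_b) for m in grid]
    best = max(scores)
    m_hat = grid[scores.index(best)]
    # Keep every m whose log-likelihood is within 1.92 units of the maximum.
    supported = [m for m, s in zip(grid, scores) if best - s <= 1.92]
    return m_hat, min(supported), max(supported)
```

Under these assumptions, running `estimate_admixture` on a handful of markers gives a wider interval than running it on many markers with the same frequencies. Real analyses must also cope with uncertain parental frequencies and linked markers, which this sketch ignores.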

Racial admixture

Triangle plot shows average admixture of five North American ethnic groups. Individuals that self-identify with each group can be found at many locations on the map, but on average groups tend to cluster differently.
Main article: Miscegenation § Genetic studies of racial admixture

Miscegenation between two populations reduces the genetic distance between them. During the Age of Discovery, which began in the early 15th century, European explorers sailed across the globe and reached all the major continents. In the process they came into contact with many populations that had been isolated for thousands of years. It is generally accepted that the Tasmanian aboriginals were the most isolated group on the planet. They were driven to extinction by European explorers; however, a number of their descendants survive today as a result of admixture with Europeans. This is an example of how modern migrations have begun to reduce the genetic divergence of the human race.

The demographic composition of the Old World has not changed significantly since the Age of Discovery. However, the demographics of the New World changed radically within a short time following the voyage of Columbus. The colonization of the Americas brought Native Americans into contact with the distant populations of Europe, Africa, and Asia. As a result, many countries in the Americas have significant and complex multiracial populations. Furthermore, many who identify themselves by only one race still have multiracial ancestry.

Variation in physical traits

Further information: skin color, hair color, eye color, body hair, human height, human weight, and human intelligence

The distribution of many physical traits resembles the distribution of genetic variation within and between human populations (American Association of Physical Anthropologists 1996; Keita and Kittles 1997). For example, ~90% of the variation in human head shapes occurs within continental groups, and ~10% separates groups, with a greater variability of head shape among individuals with recent African ancestors (Relethford 2002).
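
The idea of splitting variation into a within-group and a between-group share can be made concrete with a short Python sketch. It uses a plain sum-of-squares partition, which is an assumption chosen for simplicity and is not the quantitative genetic method of Relethford (2002). The function name and the input layout (one list of trait measurements per group) are hypothetical.

```python
def variance_apportionment(groups):
    """Return (within_fraction, between_fraction) of the total sum of squares
    for a list of groups, each a list of numeric trait values."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        raise ValueError("all values are identical; nothing to apportion")
    # Between-group part: squared deviation of each group mean from the
    # grand mean, weighted by group size.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    between = ss_between / ss_total
    return 1.0 - between, between
```

A trait whose group means barely differ returns a between-group fraction near 0, while a trait that is uniform within groups but differs between them returns a fraction near 1.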

Skin color

A prominent exception to the common distribution of physical characteristics within and among groups is skin color. Approximately 10% of the variance in skin color occurs within groups, and ~90% occurs between groups (Relethford 2002). This distribution of skin color and its geographic patterning, with people whose ancestors lived predominantly near the equator having darker skin than those with ancestors who lived predominantly in higher latitudes, indicate that this attribute has been under strong selective pressure. Darker skin appears to be strongly selected for in equatorial regions to prevent sunburn, skin cancer, the photolysis of folate, and damage to sweat glands (Sturm et al. 2001; Rees 2003). A leading hypothesis for the selection of lighter skin in higher latitudes is that it enables the body to form greater amounts of vitamin D, which helps prevent rickets (Jablonski 2004). Evidence for this includes the finding that a substantial portion of the difference in skin color between Europeans and Africans resides in a single gene, SLC24A5, the threonine-111 allele of which was found at a frequency of 98.7 to 100% in several European samples, while the alanine-111 form was found in 93 to 100% of samples of Africans, East Asians and Indigenous Americans (Lamason et al. 2005). However, the vitamin D hypothesis is not universally accepted (Aoki 2002), and lighter skin in high latitudes may correspond simply to an absence of selection for dark skin (Harding et al. 2000). Melanin, which serves as the pigment, is located in the epidermis of the skin, and its level depends on inherited gene expression.

Because skin color has been under strong selective pressure, similar skin colors can result from convergent adaptation rather than from genetic relatedness. Sub-Saharan Africans, populations from southern India, and Indigenous Australians have similar skin pigmentation, but genetically they are no more similar than are other widely separated groups. Furthermore, in some parts of the world in which people from different regions have mixed extensively, the connection between skin color and ancestry has been substantially weakened (Parra et al. 2004). In Brazil, for example, skin color is not closely associated with the percentage of recent African ancestors a person has, as estimated from an analysis of genetic variants differing in frequency among continental groups (Parra et al. 2003).

Considerable speculation has surrounded the possible adaptive value of other physical features characteristic of groups, such as the constellation of facial features observed in many eastern and northeastern Asians (Guthrie 1996). However, any given physical characteristic generally is found in multiple groups (Lahr 1996), and demonstrating that environmental selective pressures shaped specific physical features will be difficult, since such features may have resulted from sexual selection for individuals with certain appearances or from genetic drift (Roseman 2004).

Epigenetics

Epigenetic variation is a further source of variation among humans. "This type of variation arises from chemical tags that attach to DNA and affect how it gets read. The chemical tags, called epigenetic markings, act as switches that control how genes can be read." At some alleles, the epigenetic state of the DNA, and the associated phenotype, can be inherited transgenerationally.

References

  1. Lynn B Jorde & Stephen P Wooding, 2004, "Genetic variation, classification and 'race'" in Nature Genetics 36, S28–S33
  2. Edwards AW (2003) "Human genetic diversity: Lewontin's fallacy." BioEssays 25(8):798–801.
  3. Genetic Structure, Self-Identified Race/Ethnicity, and Confounding in Case-Control Association Studies by Hua Tang, Tom Quertermous, Beatriz Rodriguez, Sharon L. R. Kardia, Xiaofeng Zhu, Andrew Brown, James S. Pankow, Michael A. Province, Steven C. Hunt, Eric Boerwinkle, Nicholas J. Schork, and Neil J. Risch Am J Hum Genet. 2005 February; 76(2): 268–275.
  4. Categorization of humans in biomedical research: genes, race and disease by Neil Risch, Esteban Burchard, Elad Ziv and Hua Tang. Genome Biology 2002, 3:comment
  5. Noah A. Rosenberg, Jonathan K. Pritchard, James L. Weber, Howard M. Cann, Kenneth K. Kidd, Lev A. Zhivotovsky, Marcus W. Feldman. Genetic Structure of Human Populations. Science (2002) 298:2381-5
  6. Risch, N., Burchard, E., Ziv, E. & Tang, H. Categorization of humans in biomedical research: genes, race, and disease. Genome Biol. 3, 1–12 (2002)
  7. Noah A. Rosenberg, Jonathan K. Pritchard, James L. Weber, Howard M. Cann, Kenneth K. Kidd, Lev A. Zhivotovsky, Marcus W. Feldman. Genetic Structure of Human Populations. Science (2002) 298:2381-5
  8. Marks, J. (2002) What it means to be 98% chimpanzee (paperback ed.) pp. 202–203. Berkeley: University of California Press.
  9. Mary-Claire King and Arno G. Motulsky Mapping Human History. Science (2002) 298: pp. 2342 - 2343. doi:10.1126/science.1080373
  10. "Genetic Similarities Within and Between Human Populations" (2007) by D.J. Witherspoon, S. Wooding, A.R. Rogers, E.E. Marchani, W.S. Watkins, M.A. Batzer and L.B. Jorde. Genetics. 176(1): 351–359.
  11. Waples, R. S. and Gaggiotti, O. What is a population? An empirical evaluation of some genetic methods for identifying the number of gene pools and their degree of connectivity. Molecular Ecology (2006) 15: 1419–1439. doi:10.1111/j.1365-294X.2006.02890.x
  12. Genetic Similarities Within and Between Human Populations by D. J. Witherspoon, S. Wooding, A. R. Rogers, E. E. Marchani, W. S. Watkins, M. A. Batzer, and L. B. Jorde. Genetics. 2007 May; 176(1): 351–359.
  13. Back with a Vengeance: the Reemergence of a Biological Conceptualization of Race in Research on Race/Ethnic Disparities in Health Reanne Frank
  14. Rosenberg NA, Mahajan S, Ramachandran S, Zhao C, Pritchard JK, et al. (2005) Clines, Clusters, and the Effect of Study Design on the Inference of Human Population Structure. PLoS Genet 1(6): e70 doi:10.1371/journal.pgen.0010070
  15. Bamshad, Wooding, Salisbury and Stephens (2004) Deconstructing the relationship between genetics and race. Nature Reviews Genetics 5:598–609. doi:10.1038/nrg1401
  16. The table gives the percentage likelihood that two individuals from different clusters are genetically more similar to each other than to someone from their own population when 377 microsatellite markers are considered from Bamshad et al. (2004)doi:10.1038/nrg1401, original data from Rosenberg (2002).
  • Altmüller J, Palmer LJ, Fischer G, Scherb H, Wjst M (2001) Genomewide scans of complex human diseases: true linkage is hard to find. Am J Hum Genet 69:936–950
  • Aoki K (2002) Sexual selection as a cause of human skin colour variation: Darwin's hypothesis revisited. Ann Hum Biol 29:589–608
  • Bamshad, Michael; Wooding, Stephen; Salisbury, Benjamin A.; Stephens, J. Claiborne (2004). Deconstructing The Relationship Between Genetics And Race. Nature Reviews Genetics 5, 598–609.
  • Bamshad M, Wooding SP (2003) Signature of natural selection in the human genome. Nat Rev Genet 4:99–111
  • Bamshad MJ, Wooding S, Watkins WS, Ostler CT, Batzer MA, Jorde LB (2003) Human population genetic structure and inference of group membership. Am J Hum Genet 72:578–589
  • Cann, Rebecca, M. Stoneking, A. Wilson 1987 "Mitochondrial DNA and Human Evolution" in Nature 325(January) 31-36.
  • Cardon LR, Abecasis GR (2003) Using haplotype blocks to map human complex trait loci. Trends Genet 19:135–140
  • Cavalli-Sforza LL, Feldman MW (2003) The application of molecular genetic approaches to the study of human evolution. Nat Genet Suppl 33:266–275
  • Collins FS (2004) What we do and don't know about "race," "ethnicity," genetics and health at the dawn of the genome era. Nat Genet 36:S13–S15
  • Collins FS, Green ED, Guttmacher AE, Guyer MS, for the US National Human Genome Research Institute (2003) A vision for the future of genomics research. Nature 422:835–847
  • Ebersberger I, Metzler D, Schwarz C, Pääbo S (2002) Genomewide comparison of DNA sequences between humans and chimpanzees. Am J Hum Genet 70:1490–1497
  • Edwards, AW (2003). Human genetic diversity: Lewontin's fallacy Bioessays 25, 798–801.
  • Foster MW, Sharp RR (2004) Beyond race: towards a whole-genome perspective on human populations and genetic variation. Nat Rev Genet 5:790–796
  • Foster MW, Sharp RR, Freeman WL, Chino M, Bernsten D, Carter TH (1999) The role of community review in evaluating the risks of human genetic variation research. Am J Hum Genet 64:1719–1727
  • Gabriel SB, Schaffner SF, Nguyen H, Moore JM, Roy J, Blumenstiel B, Higgins J, DeFelice M, Lochner A, Faggart M, Liu-Cordero SN, Rotimi C, Adeyemo A, Cooper R, Ward R, Lander ES, Daly MJ, Altshuler D (2002) The structure of haplotype blocks in the human genome. Science 296:2225–2229
  • Harding RM, Healy E, Ray AJ, Ellis NS, Flanagan N, Todd C, Dixon C, Sajantila A, Jackson IJ, Birch-Machin MA, Rees JL (2000) Evidence for variable selective pressures at MC1R. Am J Hum Genet 66:1351–1361
  • Ingman M, Kaessmann H, Pääbo S, Gyllensten U (2000) Mitochondrial genome variation and the origin of modern humans. Nature 408:708–713
  • International HapMap Consortium (2003) The International HapMap Project. Nature 426:789–796
  • ——— (2004) Integrating ethics and science in the International HapMap Project. Nat Rev Genet 5:467–475
  • International Human Genome Sequencing Consortium (2001) Initial sequencing and analysis of the human genome. Nature 409:860–921
  • Jorde LB, Bamshad M, Rogers AR (1998) Using mitochondrial and nuclear DNA markers to reconstruct human evolution. BioEssays 20:126–136
  • Jorde LB, Watkins WS, Bamshad MJ, Dixon ME, Ricker CE, Seielstad MT, Batzer MA (2000a) The distribution of human genetic diversity: a comparison of mitochondrial, autosomal, and Y-chromosome data. Am J Hum Genet 66:979–988
  • Jorde LB, Watkins WS, Kere J, Nyman D, Eriksson AW (2000b) Gene mapping in isolated populations: new roles for old friends? Hum Hered 50:57–65
  • Jorde, Lynn B.; Wooding, Stephen P. (2004). Genetic variation, classification and race. Nature Genetics 36, S28–S33.
  • Kaessmann H, Heissig F, von Haeseler A, Pääbo S (1999) DNA sequence variation in a non-coding region of low recombination on the human X chromosome. Nat Genet 22:78–81
  • Kaessmann H, Wiebe V, Weiss G, Pääbo S (2001) Great ape DNA sequences reveal a reduced diversity and an expansion in humans. Nat Genet 27:155–156
  • Keita SOY, Kittles RA (1997) The persistence of racial thinking and the myth of racial divergence. Am Anthropol 99:534–544
  • Lewontin RC (1972) The apportionment of human diversity. Evol Biol 6:381–398
  • Mountain JL, Risch N (2004) Assessing genetic contributions to phenotypic differences among "racial" and "ethnic" groups. Nat Genet Suppl 36:S48–S53
  • Pääbo S (2003) The mosaic that is our genome. Nature 421:409–412
  • Ramachandran Sohini, Deshpande Omkar, Roseman Charles C., Rosenberg Noah A., Feldman Marcus W., and Cavalli-Sforza L. Luca (2005) Support from the relationship of genetic and geographic distance in human populations for a serial founder effect originating in Africa DOI:10.1073/pnas.0507611102
  • Relethford JH (2002) Apportionment of global human genetic diversity based on craniometrics and skin color. Am J Phys Anthropol 118:393–398
  • Sankar P, Cho MK (2002) Toward a new vocabulary of human genetic variation. Science 298:1337–1338
  • Sankar P, Cho MK, Condit DM, Hunt LM, Koenig B, Marshall P, Lee SS, Spicer P (2004) Genetic research and health disparities. JAMA 291:2985–2989
  • Serre D, Pääbo S (2004) Evidence for gradients of human genetic diversity within and among continents. Genome Res 14:1679–1685
  • Templeton AR (1998) Human races: a genetic and evolutionary perspective. Am Anthropol 100(3):632–650

  • Weiss KM (1998) Coming to terms with human variation. Annu Rev Anthropol 27:273–300
  • Weiss KM, Terwilliger JD (2000) How many diseases does it take to map a gene with SNPs? Nat Genet 26:151–157
  • Yu N, Jensen-Seaman MI, Chemnick L, Kidd JR, Deinard AS, Ryder O, Kidd KK, Li WH (2003) Low nucleotide diversity in chimpanzees and bonobos. Genetics 164:1511–1518
  • Ziętkiewicz E, Yotova V, Gehl D, Wambach T, Arrieta I, Batzer M, Cole DEC, Hechtman P, Kaplan F, Modiano D, Moisan J-P, Michalski R, Labuda D (2003) Haplotypes in the dystrophin DNA segment point to a mosaic origin of modern human diversity. Am J Hum Genet 73:994–1015
