Multiple lines of evidence indicate that modern humans evolved within the last 200,000 years and spread out of Africa starting about 60,000 years ago. Before that, however, the details get a little complicated. We're still debating which ancestral group gave rise to our lineage. Somewhere around 600,000 years ago, that lineage split off the branch that led to Neanderthals and Denisovans, and both of those groups later interbred with modern humans after some modern humans left Africa.
Working out what we know today has required combining fossils, ancient DNA, and modern genomes. A new study claims that there was another complicating event in humanity’s past: a period of near-extinction in which nearly 99 percent of our ancestral lineage died off. However, because this finding rests on an entirely new approach to analyzing modern genomes, it may be difficult to verify.
track diversity
Unless a population is small and inbred, it will have genetic diversity: a collection of DNA differences ranging from individual bases to large chromosomal rearrangements. These differences are what DNA testing services track when they estimate where your ancestors likely came from. Some genetic differences arose recently, while others have been in our lineage since before modern humans existed.
These differences form the basis of a new study that analyzed multiple human genomes based on several well-established principles.
First, given enough genomes, we can work out what the ancestral state of different regions of a chromosome was. For example, a mutation that is present only in a set of closely related individuals, and not in anyone else, probably arose in the common ancestor of that set. This means that the ancestral state of that chromosomal region did not carry the mutation.
Second, since we know the rate at which new mutations occur in modern humans, we can use these differences to create a molecular clock. In other words, we can take the number of mutations that separate the current state from the ancestral one, compare it to the mutation rate, and estimate when that ancestral state was last present in the population.
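The arithmetic behind a molecular clock can be sketched in a few lines of Python. This is only an illustration of the general idea, not the method used in the study. The function name, the mutation rate of 1.25e-8 per base per generation, and the 25-year generation time are all assumptions chosen for the example.

```python
def years_since_ancestral_state(num_differences, region_length_bp,
                                mutation_rate_per_bp_per_gen, years_per_generation):
    """Estimate how long ago the ancestral state existed along one lineage."""
    # Expected number of new mutations in this region per generation.
    mutations_per_generation = region_length_bp * mutation_rate_per_bp_per_gen
    # Generations needed to accumulate the observed differences.
    generations = num_differences / mutations_per_generation
    return generations * years_per_generation


# Assumed illustrative values: 100 differences in a 1,000,000-base region.
estimate = years_since_ancestral_state(
    num_differences=100,
    region_length_bp=1_000_000,
    mutation_rate_per_bp_per_gen=1.25e-8,  # assumed rate, not taken from the study
    years_per_generation=25,               # assumed generation time
)
print(f"Roughly {estimate:,.0f} years")  # about 200,000 years with these inputs
```

Because mutations arise at random, any single region gives only a rough date; the estimate is an expectation, not an exact age.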
Finally, the number of mutations present in a population is related to its size. In a small population, it becomes difficult to avoid inbreeding, which leads to a loss of genetic diversity. In addition, a small population simply has a low total number of chromosomes, which limits the potential for diversity. The reverse is also true: a large population can support more diversity.
Taken together, these principles give an overview of what the researchers behind the new study did. They took the mutations present in today’s genomes and used them to determine which ancestral states existed and when they likely existed. Knowing how many different ancestral states existed at a given time also allowed them to estimate the population size at that time.
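One standard way to see the link between population size and diversity is the textbook result for genetic drift in an idealized population, where expected heterozygosity shrinks by a factor of (1 - 1/(2N)) each generation. The sketch below uses that formula; it ignores new mutations and is not taken from the study. The function name and the starting values are assumptions for illustration.

```python
def expected_heterozygosity(initial, effective_size, generations):
    """Expected heterozygosity after drift alone in an idealized population."""
    # Each generation, a fraction 1/(2N) of the remaining diversity is lost.
    return initial * (1.0 - 1.0 / (2.0 * effective_size)) ** generations


# Same starting diversity, same number of generations, very different sizes.
small_population = expected_heterozygosity(0.5, effective_size=50, generations=100)
large_population = expected_heterozygosity(0.5, effective_size=50_000, generations=100)

print(f"N = 50:     {small_population:.3f}")
print(f"N = 50,000: {large_population:.3f}")
```

With these inputs, the small population loses most of its starting diversity within 100 generations, while the large one retains nearly all of it.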
does this actually work?
Since all this work is based on probabilities, the result for any individual part of a chromosome has a fairly high chance of being wrong. But those individual mistakes should all be wrong in different ways. So, given a sufficient number of whole genomes, the real signal should emerge from the noise of the individual errors. The big question is whether the algorithm devised by the authors can recognize that signal, and whether there is enough data to make that possible.
The researchers make their case by creating several model populations that undergo different forms of change (for example, a constant population size, steady growth, or stagnation followed by growth). They then ran various published algorithms on this data, including their own software, FitCoal. Most of them produced some serious errors, though some were better than others. FitCoal consistently outperformed them all, and in most cases produced population size estimates that were difficult to distinguish from the true sizes of the model populations.
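The idea that a signal can emerge from many unreliable estimates can be illustrated with a toy simulation. It assumes the errors are independent and unbiased, which is the key condition; averaging does nothing to remove a systematic bias. The function name, noise level, and population size below are invented for the example and have nothing to do with the study's data.

```python
import random
import statistics


def noisy_estimates(true_value, relative_noise, count, seed=0):
    """Generate independent, unbiased estimates with Gaussian error."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    spread = relative_noise * true_value
    return [rng.gauss(true_value, spread) for _ in range(count)]


true_size = 10_000  # hypothetical true population size
estimates = noisy_estimates(true_size, relative_noise=0.5, count=10_000)

# Any single estimate can be far off, but the average lands close to the truth.
combined = statistics.mean(estimates)
worst_single_error = max(abs(e - true_size) for e in estimates)

print(f"Worst single estimate is off by {worst_single_error:,.0f}")
print(f"Average of all estimates: {combined:,.0f}")
```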
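To make the three kinds of model population concrete, here is a minimal sketch of how such size histories might be written down. It only generates population sizes over time; it does not simulate genomes and is unrelated to how FitCoal or the study's own simulations work. All function names and numbers are hypothetical.

```python
def constant_size(size, generations):
    """Population that stays the same size throughout."""
    return [float(size)] * generations


def steady_growth(start, rate, generations):
    """Population that grows by a fixed fraction every generation."""
    return [start * (1.0 + rate) ** g for g in range(generations)]


def stagnation_then_growth(start, rate, flat_generations, generations):
    """Population that holds steady, then grows for the remaining time."""
    flat = [float(start)] * min(flat_generations, generations)
    growing = [start * (1.0 + rate) ** g
               for g in range(1, generations - len(flat) + 1)]
    return flat + growing


histories = {
    "constant": constant_size(10_000, 200),
    "growth": steady_growth(10_000, 0.01, 200),
    "stagnation then growth": stagnation_then_growth(10_000, 0.01, 100, 200),
}
for name, sizes in histories.items():
    print(f"{name}: starts at {sizes[0]:,.0f}, ends at {sizes[-1]:,.0f}")
```

An inference method can then be judged by how closely the history it reconstructs from simulated genomes matches the known history that produced them.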
Reassuringly, most of the other algorithms produced results similar to FitCoal’s, but with significantly larger margins of error.
However, the accuracy of the algorithm is likely to be the most controversial aspect of this study going forward. Unless someone finds an error in the code, we’ll have to rely on comparisons with other software. Unfortunately, this kind of software is very computationally expensive. Adding more genomes to the analysis may provide some clarity, since working with more data may lead to more accurate results. However, additional genomes will also make the computational challenge worse.