The Genomic Origins of the Bronze Age Tarim Basin Mummies

2021-11-13 06:33:08 By : Mr. Harry Wang


Nature volume 599, pages 256–261 (2021)

The identity of the earliest inhabitants of Xinjiang, in the heart of Inner Asia, and the languages they spoke have long been debated1. Here we present genomic data from 5 individuals from the Junggar Basin dating to around 3000–2800 BC and 13 individuals from the Tarim Basin dating to around 2100–1700 BC, representing the earliest human remains yet discovered in northern and southern Xinjiang, respectively. We find that the Early Bronze Age Junggar individuals show predominantly Afanasievo ancestry with an additional local contribution, whereas the Early–Middle Bronze Age Tarim individuals carry only local ancestry. The dental calculus of the Tarim individuals from the Xiaohe site further shows strong evidence of milk proteins, indicating that the site relied on dairy pastoralism from its founding. Our results do not support previous hypotheses about the origin of the Tarim mummies, which held that they were Proto-Tocharian-speaking pastoralists descended from the Afanasievo1,2, or that they originated among the Bactria–Margiana Archaeological Complex3 or the Inner Asian Mountain Corridor cultures4. Instead, although Afanasievo migrants may plausibly have introduced Tocharian to the Junggar Basin during the Early Bronze Age, we find that the earliest Tarim Basin cultures appear to have arisen from a genetically isolated local population that adopted the pastoralist and agricultural practices of its neighbors, which allowed it to settle and thrive along the shifting riverine oases of the Taklimakan Desert.

As part of the Silk Road and located at the meeting point of Eastern and Western cultures, the Xinjiang Uygur Autonomous Region (hereafter Xinjiang) has long been an important crossroads for trans-Eurasian cultural, agricultural and linguistic exchanges1,5,6,7,8,9. The Tianshan Mountains divide Xinjiang in two: northern Xinjiang contains the Junggar Basin, and southern Xinjiang contains the Tarim Basin (Figure 1). The Junggar Basin in the north consists of the Gurbantunggut Desert, surrounded by extensive grasslands, and has traditionally been home to mobile pastoralists. Southern Xinjiang consists of the Tarim Basin, a dry inland sea that now forms the Taklimakan Desert. Although most of it is uninhabitable, the Tarim Basin also contains small oases and riverine corridors fed by runoff from melting glacier ice and snow in the surrounding mountains4,10,11.

Figure 1. a, Overview of the main Eurasian geographic regions, features and archaeological sites discussed in the text; the new sites analyzed in this study are shown in gray. b, Enlarged view of Xinjiang and the six new sites analyzed in this study. c, Timeline of the sites in a. The timeline is organized by region and shows the median date for each group studied. The base maps in a and b are from the Natural Earth public domain map dataset (https://www.naturalearthdata.com/downloads/10m-raster-data/10m-cross-blend-hypso/). In the group labels, the suffix denotes the archaeological time period of each group: N, Neolithic; EN, MN and LN, Early, Middle and Late Neolithic, respectively; EN, Eneolithic for Geoksyur, Parkhai and Sarazm; CA, Chalcolithic Age; BA, Bronze Age; MBA, Middle Bronze Age; EIA, Early Iron Age. MA-1, Mal'ta; EHG, Eastern European hunter-gatherer.

In and around the Junggar Basin12, pastoralist Early Bronze Age (EBA) Afanasievo (3000–2600 BC) and Chemurchek (or Qiemu'erqieke) (2500–1700 BC) sites have been plausibly linked to the Afanasievo herders of the Altai–Sayan region of southern Siberia (3150–2750 BC), who in turn have close genetic ties to the Yamnaya (3500–2500 BC) of the Pontic–Caspian steppe, located 3,000 km to the west13,14,15. Linguists have hypothesized that the Afanasievo dispersal carried the now-extinct Tocharian branch of the Indo-European language family eastward, separating it from the other Indo-European languages by the third or fourth millennium BC (ref. 14). However, although Afanasievo-related ancestry has been confirmed among Iron Age Junggar populations (around 200–400 BC)7, and Tocharian is recorded in Buddhist texts from the Tarim Basin dating to AD 500–1000 (ref. 13), little is known about the earlier populations of Xinjiang and their possible genetic relationships with the Afanasievo or other groups.

Since the late 1990s, hundreds of naturally mummified human remains dating to around 2000 BC to AD 200 have been discovered in the Tarim Basin. They have drawn wide attention because of their so-called Western physical appearance, their felted and woven woolen clothing, and their agropastoral economy, which included cattle, sheep/goats, wheat, barley, millet and even kefir cheese16,17,18,19. Such mummies have now been found throughout the Tarim Basin, and the earliest are those found in the lowest layers of the cemeteries (Figure 1, Extended Data Figure 1 and Extended Data Table 1). These and related Bronze Age sites are grouped within the Xiaohe archaeological horizon on the basis of their shared material culture13,16,20.

Scholars have proposed a variety of contrasting hypotheses to explain the origins and Western elements of the Xiaohe horizon, including the Yamnaya/Afanasievo steppe hypothesis16, the Bactrian oasis hypothesis21 and the Inner Asian Mountain Corridor (IAMC) island biogeography hypothesis4. The Yamnaya/Afanasievo steppe hypothesis posits that Afanasievo-related EBA populations in the Altai–Sayan Mountains spread through the Junggar Basin into the Tarim Basin and then founded the agropastoralist communities that make up the Xiaohe horizon around 2000 BC (refs. 16,22,23). In contrast, the Bactrian oasis hypothesis holds that the Tarim Basin was first colonized by farmers of the Bactria–Margiana Archaeological Complex (BMAC) (around 2300–1800 BC), who migrated from the desert oases of Afghanistan, Turkmenistan and Uzbekistan through the mountains of Central Asia. Support for this hypothesis rests mainly on similarities between the agricultural and irrigation systems of the two regions, which reflect adaptation to a desert environment, and on evidence for the ritual use of Ephedra at both locations3,21. The IAMC island biogeography hypothesis likewise posits a Central Asian mountain origin for the Xiaohe founder population, but links it to the seasonal movements of agropastoralists in the IAMC to the west and north of the Tarim Basin4,24,25. In contrast to these three migration models, the greater IAMC, which spans the Hindu Kush to the Altai Mountains, may instead have served as a geographic arena through which cultural ideas, rather than populations, primarily moved25.

Recent archaeogenomic studies have shown that the Bronze Age Afanasievo of southern Siberia and the IAMC/BMAC populations of Central Asia have distinguishable genetic profiles15,26, and that these profiles also differ from those of the pre-agropastoralist hunter-gatherer populations of Inner Asia2,5,7,27,28,29,30. An archaeogenomic study of Bronze Age Xinjiang populations therefore offers a powerful approach for reconstructing the population history of the Junggar and Tarim basins and the origins of the Bronze Age Xiaohe horizon. By examining the skeletal material of 33 Bronze Age individuals from sites in the Junggar (Nileke, Ayituohan and Songshugou) and Tarim (Xiaohe, Gumugou and Beifang) basins, we successfully retrieved ancient genome sequences from 5 EBA Junggar individuals (3000–2800 BC) culturally assigned as Afanasievo, and genome-wide data from 13 Early–Middle Bronze Age (EMBA) Tarim individuals (2100–1700 BC) belonging to the Xiaohe horizon (Extended Data Table 1 and Supplementary Data 1A). We also report the dental calculus proteomes of 7 individuals from the basal layers of the Xiaohe site in the Tarim Basin (Extended Data Table 2). To our knowledge, these individuals represent the earliest human remains excavated in the region to date.

Through whole-genome sequencing or DNA enrichment for a panel of approximately 1.2 million single nucleotide polymorphisms (SNPs) (the 1,240k panel), we obtained genome-wide data for 18 of the 33 individuals attempted (Supplementary Data 1A). Overall, endogenous DNA was well preserved, with minimal levels of contamination (Extended Data Table 1 and Supplementary Data 1A). To explore the genetic profiles of ancient Xinjiang populations, we first calculated the principal components of present-day Eurasian and Native American populations and then projected the ancient individuals onto them. The ancient Xinjiang individuals form several distinct clusters distributed along principal component 1 (PC1) (Figure 2), the main principal component separating eastern and western Eurasian populations. The EBA Junggar individuals from the Ayituohan and Songshugou sites near the Altai Mountains (Dzungaria_EBA1) fall close to the EBA Afanasievo steppe herders of the Altai–Sayan Mountains to the north. Genetic clustering with ADMIXTURE further supports this observation (Extended Data Figure 3). Contemporaneous individuals from the Nileke site (Dzungaria_EBA2) near the Tianshan Mountains are shifted slightly along PC1 toward the later Tarim individuals. In contrast to the EBA Junggar individuals, the EMBA individuals from the Xiaohe and Gumugou sites in the eastern Tarim Basin (Tarim_EMBA1) form a tight cluster close to pre-Bronze Age central steppe and Siberian individuals who carry a high level of Ancient North Eurasian (ANE) ancestry (for example, Botai_CA). A contemporaneous individual from the Beifang site in the Tarim Basin (Tarim_EMBA2) is shifted slightly from Tarim_EMBA1 toward EBA individuals from the Baikal region.

Figure 2. Principal component analysis of ancient individuals projected onto present-day Eurasian and Native American populations; the inset shows ancient individuals projected onto Eurasian populations only.

Outgroup f3 statistics support a close genetic link between the Junggar and Tarim groups (Extended Data Figure 2a). Nevertheless, both Junggar groups differ significantly from the Tarim groups: they show excess affinity with various western Eurasian populations and share fewer alleles with ANE-related groups (Extended Data Figure 2b, c). To understand this mixed genetic profile, we used qpAdm to explore admixture models of the Junggar groups with either Tarim_EMBA1 or a terminal Pleistocene individual (AG3) from the Siberian site of Afontova Gora31 as a source (Supplementary Data 1D). AG3 is a distal representative of ANE ancestry and shows high affinity with Tarim_EMBA1. Although the Tarim_EMBA1 individuals lived a millennium later than the Junggar groups, they are genetically more distant from the Afanasievo than the Junggar groups are, which suggests that they carry a higher proportion of local autochthonous ancestry. Here we use autochthonous to mean a genetic profile that has been present in a region for millennia, rather than one associated with more recently arrived groups.

We found that Dzungaria_EBA1 and Dzungaria_EBA2 are both best described by three-way admixture models (Figure 3c, Extended Data Table 3 and Supplementary Data 1D) in which they derive most of their ancestry from Afanasievo (about 70% in Dzungaria_EBA1 and about 50% in Dzungaria_EBA2), with the remainder best modeled as a mixture of AG3/Tarim_EMBA1 (19–36%) and Baikal_EBA (9–21%). When we used Neolithic and Bronze Age populations from the IAMC as a source, the models failed when Afanasievo was not included, and when Afanasievo was included no contribution was assigned to the IAMC groups (Supplementary Data 1D). Afanasievo ancestry, without any IAMC contribution, is therefore sufficient to explain the western Eurasian component of the Junggar individuals. We also found that the Chemurchek, an EBA pastoralist culture that succeeded the Afanasievo in both the Junggar Basin and the Altai Mountains, derive about two-thirds of their ancestry from Dzungaria_EBA1 and the remainder from Tarim_EMBA1 and IAMC/BMAC-related sources (Figure 3, Extended Data Table 3, Supplementary Data 1F and Supplementary Text 5). This helps to explain both the IAMC/BMAC-related ancestry previously noted in Chemurchek individuals30 and their reported cultural and genetic links to Afanasievo groups32. Taken together, these results indicate that the early dispersal of Afanasievo herders into the Junggar Basin was accompanied by substantial genetic mixing with local autochthonous populations, a pattern distinct from the initial formation of the Afanasievo culture in southern Siberia.

Figure 3. a, qpAdm-based estimates of the ancestry proportions of Dzungaria_EBA and Tarim_EMBA from three ancestry sources (AG3, Afanasievo and Baikal_EBA) (Supplementary Data 1D, E). Unlike the Dzungaria_EBA individuals, the Tarim_EMBA individuals are adequately modeled without ancestry from EBA Eurasian steppe pastoralists (for example, Afanasievo). b, Genetic admixture dates for key Bronze Age populations of Inner Asia, including Dzungaria_EBA1 (n = 3), Chemurchek (n = 3), Kumsay_EBA (n = 4), Mereke_MBA (n = 2), Dali_EBA (n = 1) and Tarim_EMBA1 (n = 12). The blue shading represents the radiocarbon dating range of Yamnaya and Afanasievo individuals. The orange circles and associated vertical bars represent the mean and standard deviation of the median radiocarbon dates, respectively. The circle above each orange circle represents the estimated admixture date, assuming a generation time of 29 years, and its vertical bar represents the sum of the standard errors of the estimated admixture date and the radiocarbon date. c, Representation of ancient Eurasian populations based on qpAdm admixture models (Supplementary Data 1D–I). For Dzungaria_EBA1 and Geoksyur_EN, we show their three-way admixture models that include Tarim_EMBA1 as a source. For later populations in Xinjiang, the IAMC and nearby regions, we used these two groups as sources and assigned each a color (Dzungaria_EBA1 is blue; Geoksyur_EN is magenta). The base map in c is from the Natural Earth public domain map dataset (https://www.naturalearthdata.com/downloads/10m-raster-data/10m-gray-earth/).

Although the Tarim_EMBA1 and Tarim_EMBA2 groups are separated geographically by more than 600 km of desert, they form a homogeneous population that underwent a substantial population bottleneck, as suggested by their high genetic affinity in the absence of close kinship and by the limited diversity of their uniparental haplogroups (Figure 1 and Figure 2, Extended Data Figure 4, Extended Data Table 1, Supplementary Data 1B and Supplementary Text 4). Using qpAdm, we modeled the Tarim Basin individuals as a mixture of two ancient autochthonous Asian genetic groups: the ANE, represented by an Upper Paleolithic individual from the Afontova Gora site in the upper Yenisei region of Siberia (AG3) (about 72%), and ancient Northeast Asians, represented by Baikal_EBA (about 28%) (Supplementary Data 1E and Figure 3a). Tarim_EMBA2 from Beifang can also be modeled as a mixture of Tarim_EMBA1 (about 89%) and Baikal_EBA (about 11%). For both Tarim groups, admixture models consistently failed when the Afanasievo or IAMC/BMAC groups were used as a western Eurasian source (Supplementary Data 1E), which rejects a western Eurasian genetic contribution from nearby groups with herding and/or farming economies. We estimated a deep formation date for the Tarim_EMBA1 genetic profile, consistent with the absence of western Eurasian EBA admixture: the origin of this gene pool is placed 183 generations before the sampled Tarim Basin individuals, or 9,157 ± 986 years ago assuming an average generation time of 29 years (Figure 3b). Taken together, these findings indicate that the earliest individuals of the Xiaohe horizon belong to an ancient and isolated autochthonous Asian gene pool. This autochthonous ANE-related gene pool is likely to have formed the genetic substrate of the pre-pastoralist ANE-related populations of Central Asia and southern Siberia (Figure 3c, Extended Data Figure 2 and Supplementary Text 5).
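The relationship between the two figures quoted for the formation date (183 generations, and about 9,157 years ago) can be checked with a short calculation. This is a sketch only: it assumes that "years ago" is counted back from the present and that the difference between the two figures is the age of the sampled individuals. The resulting sample age is inferred from the two reported numbers, not stated in the text, and the function names are illustrative.

```python
GENERATION_YEARS = 29  # average generation time assumed in the text


def generations_to_years(generations, generation_years=GENERATION_YEARS):
    """Convert an interval in generations into years."""
    return generations * generation_years


def formation_age(generations, sample_age_years, generation_years=GENERATION_YEARS):
    """Years before present: time before sampling plus the age of the samples."""
    return generations_to_years(generations, generation_years) + sample_age_years


years_before_sampling = generations_to_years(183)   # 183 * 29 = 5307
implied_sample_age = 9157 - years_before_sampling    # 3850 (inferred, not reported)
```

An implied sample age of roughly 3,850 years is in line with the 2100–1700 BC range given for the Tarim individuals, which suggests the two figures are related in the way assumed here.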

Although the harsh environment of the Tarim Basin may have acted as a strong barrier to gene flow into the region, it was not a barrier to the flow of ideas or technologies: foreign innovations such as dairy pastoralism and wheat and millet agriculture came to form the basis of the Bronze Age Tarim economy. Woolen fabrics, the horns and bones of cattle, sheep and goats, livestock manure, and milk and kefir-like dairy products33,34,35,36 have been recovered from the upper layers of the Xiaohe and Gumugou cemeteries, as have wheat and millet seeds and bundles of Ephedra twigs34,37,38. Famously, many of the mummies dating to 1650–1450 BC were even buried with lumps of cheese35. Until now, however, it has been unclear whether this pastoralist lifestyle also characterized the earliest layers at Xiaohe.

To better understand the dietary economy of the earliest archaeological period, we analyzed the dental calculus proteomes of seven individuals from the Xiaohe site dating to around 2000–1700 BC. All seven individuals were strongly positive for ruminant-milk-specific proteins (Extended Data Table 2), including β-lactoglobulin, α-S1-casein and α-lactalbumin (Extended Data Figure 5), and peptide recovery was sufficient to provide taxonomically diagnostic matches to cattle (Bos), sheep (Ovis) and goat (Capra) milk (Extended Data Figure 5, Extended Data Table 2 and Supplementary Data 3). These results confirm that dairy products were consumed by individuals of autochthonous ancestry (Tarim_EMBA1) buried in the lowest levels of the Xiaohe cemetery (Extended Data Table 2). Importantly, however, and contrary to a previous hypothesis36, none of the Tarim individuals was genetically lactase persistent (Supplementary Data 1J). Instead, the Tarim mummies add to growing evidence that prehistoric dairy pastoralism in Inner and East Asia spread independently of lactase persistence genotypes28,30.

Although human activity in Xinjiang can be traced back to around 40,000 years ago24,39, the earliest evidence of sustained human habitation in the Tarim Basin dates only to the late third to early second millennium BC. There, at the sites of Xiaohe, Gumugou and Beifang, well-preserved mummified human remains buried in wooden coffins and associated with rich assemblages of organic grave goods represent the earliest known archaeological cultures of the region. Since their first discovery in the early twentieth century and the subsequent large-scale excavations that began in the 1990s (ref. 16), the Tarim mummies have been at the center of debates about their origins, their relationship to other Bronze Age steppe (Afanasievo), oasis (BMAC) and mountain (IAMC, Chemurchek) groups, and their potential connection to the spread of Indo-European languages into the region3,4,40.

The paleogenomic and proteomic data we present here suggest a population history that is very different from, and more complex than, what was previously proposed. Although the IAMC may have been a vector for transmitting cultural and economic elements into the Tarim Basin, the known sites of the IAMC do not provide a direct source of ancestry for the Xiaohe population. Instead, the Tarim mummies belong to an isolated gene pool whose Asian origins can be traced to the early Holocene. This gene pool probably once had a much wider geographic distribution, and it left a substantial genetic footprint in the EMBA populations of the Junggar Basin, the IAMC and southern Siberia. The so-called Western physical features of the Tarim mummies are probably due to their connection to the Pleistocene ANE gene pool. Their extreme genetic isolation contrasts with the EBA Junggar, IAMC and Chemurchek populations, which experienced substantial genetic interaction with nearby populations that mirrors their cultural links, and it points to the role of extreme environments as barriers to human migration.

In contrast to their marked genetic isolation, however, the populations of the Xiaohe horizon were culturally cosmopolitan, incorporating diverse economic elements and technologies with distant origins. They made cheese from ruminant milk using a kefir-like fermentation37, which they may have learned from descendants of the Afanasievo, and they cultivated wheat, barley and millet37,41, crops that were originally domesticated in the Near East and northern China and were introduced to Xinjiang no earlier than 3500 BC (refs. 8,42), probably through their IAMC neighbors24. They buried their dead with Ephedra twigs in a style reminiscent of the BMAC oasis cultures of Central Asia. They also developed distinctive cultural elements not found among other cultures in Xinjiang or elsewhere, such as boat-shaped wooden coffins covered with cattle hides and marked by wooden poles or oars, and a clear preference for woven baskets over pottery43,44. Taken together, these findings suggest that the tightly knit population that founded the Xiaohe horizon was well aware of the different technologies and cultures outside the Tarim Basin, and that it developed its own unique culture in response to the extreme challenges of the Taklimakan Desert and its fertile riverine oases4.

This study clarifies in detail the origins of the Bronze Age populations of the Junggar and Tarim basins in Xinjiang. Notably, our results do not support any hypothesis that derives the Bronze Age Tarim mummies from a substantial migration of steppe or mountain agropastoralists; instead, we find that the Tarim mummies represent a culturally cosmopolitan but genetically isolated autochthonous population. This finding is consistent with earlier arguments that the IAMC served as a geographic corridor and vector for regional cultural interaction, connecting disparate populations from the fourth to the second millennium BC (refs. 24,25). The arrival and admixture of Afanasievo populations in the Junggar Basin of northern Xinjiang around 3000 BC may plausibly have introduced Indo-European languages to the region. However, the material culture and genetic profile of the Tarim mummies from around 2100 BC onward call into question simplistic assumptions about the link between genetics, culture and language, and they leave unanswered the question of whether the Bronze Age Tarim populations spoke a form of Proto-Tocharian. Future archaeological and paleogenomic research on later Tarim Basin populations, and most importantly on the sites and periods in which first millennium AD Tocharian texts have been found, is necessary to understand the later population history of the Tarim Basin. Finally, the paleogenomic characterization of the Tarim mummies has unexpectedly revealed one of the few known Holocene genetic descendants of the once widespread Pleistocene ANE ancestry profile. The Tarim mummy genomes therefore provide a key reference point for the genetic modeling of Holocene populations and for reconstructing the population history of Asia.

The archaeological human remains studied in this manuscript were excavated by the Xinjiang Institute of Cultural Relics and Archaeology from 1979 to 2017 and were studied on the basis of a written agreement.

Of the 18 individuals reported in this study, 10 were directly dated by accelerator mass spectrometry (AMS) at Beta Analytic in Miami, USA, and/or at Lanzhou University in China. To confirm the reliability of our AMS dating results, 4 of the 10 individuals were dated at both Beta Analytic and Lanzhou University. Concordant dates were obtained in all cases (Supplementary Data 1C). The dated samples were calibrated against the IntCal20 database45 using the OxCal v.4.4 program46. The ages of all samples are consistent with the time periods estimated from the archaeological stratigraphy and the excavated grave goods.

Ancient DNA work was carried out in dedicated clean-room facilities at the ancient DNA laboratories of Jilin University, Changchun, and the Institute of Vertebrate Paleontology and Paleoanthropology, Beijing (Extended Data Table 1 and Supplementary Data 1A). For each of the 33 individuals initially screened in this study, approximately 50 mg of dentine or bone powder was obtained from teeth or bones. DNA was extracted following an established protocol47 with minor modifications (https://doi.org/10.17504/protocols.io.baksicwe). A subset of the DNA extracts (n = 16) underwent partial repair with uracil-specific excision reagent following the method described in ref. 48 (Extended Data Table 1 and Supplementary Data 1A). All 33 DNA extracts were built into double-stranded, dual-indexed Illumina libraries. Libraries prepared at Jilin (n = 26) were shotgun sequenced directly on an Illumina HiSeq X10 or HiSeq 4000 instrument using 2 × 150 base pair (bp) chemistry, and those with more than 10% endogenous human DNA (n = 12) were sent for deeper sequencing. One of these 12 individuals (XHBM1) was later excluded from the study because of a high level of modern human DNA contamination (Supplementary Data 1A). For libraries prepared at the Institute of Vertebrate Paleontology and Paleoanthropology, samples with 0.1% or more human DNA in the initial screening (n = 7) were further enriched for approximately 1.2 million nuclear SNPs and then sequenced on an Illumina HiSeq 4000 instrument using 2 × 150-bp chemistry. In total, 18 individuals yielded enough high-quality ancient genomic data for downstream analysis (Extended Data Table 1).

Raw read data were processed using EAGER v.1.92.55 (ref. 49), a pipeline designed for processing ancient DNA sequence data. Specifically, raw reads were trimmed of Illumina adapter sequences, and overlapping pairs were collapsed into single reads using AdapterRemoval 2.2.0 (ref. 50). Merged reads were mapped to the human reference genome (hs37d5; GRCh37 with decoy sequences) using the aln/samse programs in BWA v.0.7.12 (ref. 51). PCR duplicates were removed using DeDup v.0.12.2 (ref. 49). To minimize the impact of postmortem DNA damage on genotyping, we trimmed the BAM files generated from samples treated (n = 11) or not treated (n = 7) with uracil-DNA glycosylase (UDG) by soft-masking up to 10 bp at both ends of each read with the trimbam function of bamUtils v.1.0.13 (ref. 52), according to the DNA misincorporation pattern of each library as tabulated by mapDamage v.2.0.9 (ref. 53). For each SNP in the 1,240k panel, a single base was randomly sampled from a high-quality read (base and mapping quality scores of 30 or higher) to represent a pseudo-diploid genotype, using pileupCaller v.1.4.0.5, downloaded from https://github.com/stschiff/sequenceTools, in random haploid calling mode (--randomHaploid). For transition SNPs (C/T and G/A), trimmed BAM files were used. For transversion SNPs, untrimmed BAM files were used.
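The random haploid calling step can be illustrated with a small sketch. It is not pileupCaller's implementation: the function names and the tuple layout for pileup observations are hypothetical, and only the quality threshold of 30 and the trimmed/untrimmed rule for transitions and transversions come from the text.

```python
import random

MIN_QUALITY = 30  # base and mapping quality threshold stated in the text

# Transition allele pairs named in the text (C/T and G/A)
TRANSITIONS = {frozenset("CT"), frozenset("GA")}


def pseudo_haploid_call(observations, rng=random):
    """Pick one base at random from the high-quality observations at a SNP.

    observations: list of (base, base_quality, mapping_quality) tuples
    (a hypothetical layout). Returns None when nothing passes the filter.
    """
    passing = [base for base, base_q, map_q in observations
               if base_q >= MIN_QUALITY and map_q >= MIN_QUALITY]
    if not passing:
        return None  # treated as a missing genotype
    return rng.choice(passing)


def bam_for_snp(allele1, allele2):
    """Transitions are called from trimmed BAMs, transversions from untrimmed ones."""
    is_transition = frozenset((allele1, allele2)) in TRANSITIONS
    return "trimmed" if is_transition else "untrimmed"
```

Sampling one read per site, rather than calling diploid genotypes, avoids a bias toward the reference allele at low coverage, at the cost of discarding heterozygosity information.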

We assessed the authenticity of our ancient DNA data as follows. First, we calculated the proportion of C-to-T deamination errors at the 5' and 3' ends of the sequencing reads and found that all samples showed the postmortem damage pattern characteristic of ancient DNA (Supplementary Data 1A). We then estimated mitochondrial DNA contamination for all individuals using the Schmutzi v.1.5.1 program. To do this, we mapped adapter-trimmed reads to the revised Cambridge Reference Sequence (rCRS) of the human mitochondrial genome (NC_012920.1) with a 500-bp extension to preserve reads spanning the origin, and then wrapped the alignment back to the regular rCRS using circularmapper v.1.1 (ref. 49). We ran the contDeam and schmutzi modules of the Schmutzi program against the worldwide allele frequency database of 197 individuals to estimate the mitochondrial DNA contamination rate. Finally, we estimated the nuclear contamination rate in males using ANGSD v.0.910 (ref. 55), on the principle that males have only one copy of the X chromosome, so contamination introduces extra mismatches among reads at polymorphic sites but not at flanking monomorphic sites.
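The first authenticity check, the C-to-T rate near the 5' end of reads, can be sketched as follows. This is a simplified illustration rather than the mapDamage procedure: it assumes reads are already paired with their reference sequence, oriented 5' to 3', and free of indels, and the function name and input layout are hypothetical.

```python
def c_to_t_rate_by_position(alignments, max_pos=5):
    """Fraction of reference C sites read as T at each of the first max_pos positions.

    alignments: list of (reference_seq, read_seq) string pairs, same orientation,
    no indels (a simplifying assumption). Positions with no reference C give None.
    """
    c_sites = [0] * max_pos
    c_to_t = [0] * max_pos
    for ref, read in alignments:
        for i in range(min(max_pos, len(ref), len(read))):
            if ref[i] == "C":
                c_sites[i] += 1
                if read[i] == "T":
                    c_to_t[i] += 1
    return [t / n if n else None for t, n in zip(c_to_t, c_sites)]


example = [("CACG", "TACG"), ("CCGT", "CTGT"), ("CAGT", "CAGT"), ("ACGT", "ACGT")]
rates = c_to_t_rate_by_position(example, max_pos=2)
```

In authentic ancient DNA the rate is expected to be highest at the first position and to fall off toward the interior of the read; a flat, near-zero profile would point to modern contamination.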

We compared the genome sequences of the ancient individuals with two worldwide genotype panels, one based on the Affymetrix Axiom Genome-wide Human Origins 1 Array (HumanOrigins; 593,124 autosomal SNPs)56,57,58 and the other based on the 1,240k dataset (1,233,013 autosomal SNPs, including all HumanOrigins SNPs)59. We augmented both datasets by adding the Simons Genome Diversity Panel60 and published ancient genomes (Supplementary Data 2A).

We determined genetic relatedness between ancient individuals using the pairwise mismatch rate (pmr)61 and lcMLkin v0.5.0 (ref. 62). We calculated the pmr for all pairs of ancient individuals in this study using the autosomal SNPs in the 1,240k panel, and kept only pairs of individuals with at least 8,000 SNPs covered in both to remove noisy estimates from low-coverage samples. We used lcMLkin to validate our observations from the pmr analysis and to distinguish between parent–offspring and full-sibling pairs.
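The pairwise mismatch rate can be sketched in a few lines. This is a minimal illustration under assumed inputs (one dictionary of pseudo-haploid calls per individual, keyed by SNP identifier); the function name is hypothetical, and only the 8,000-SNP overlap threshold comes from the text.

```python
MIN_OVERLAP = 8000  # minimum number of SNPs covered in both individuals (from the text)


def pairwise_mismatch_rate(calls_a, calls_b, min_overlap=MIN_OVERLAP):
    """Return (pmr, n_shared); pmr is None when the overlap is below the threshold.

    calls_a, calls_b: dicts mapping SNP id -> pseudo-haploid allele.
    """
    shared = calls_a.keys() & calls_b.keys()
    n_shared = len(shared)
    if n_shared < min_overlap:
        return None, n_shared  # too few shared SNPs for a stable estimate
    mismatches = sum(1 for snp in shared if calls_a[snp] != calls_b[snp])
    return mismatches / n_shared, n_shared
```

With pseudo-haploid data, related pairs show a lower mismatch rate than the baseline observed among unrelated individuals of the same population, which is why pairs are interpreted relative to that baseline rather than by an absolute cutoff.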

We aligned adapter-trimmed reads to the rCRS (NC_012920.1) and generated the mitochondrial consensus sequence of each ancient individual using Geneious software v.11.1.3 (ref. 63; https://www.geneious.com/). We assigned each consensus sequence to a specific haplogroup using HaploGrep2 (ref. 64). For the Y chromosome, we used lineage-informative SNPs from the 2016 tree of the International Society of Genetic Genealogy (ISOGG) (https://isogg.org/tree/2016/index16.html). For these SNPs, we called the genotype of each individual using the mpileup and call modules of bcftools v.1.7 (ref. 51), after removing reads with mapping quality scores <30 (-q 30) and bases with quality scores <30 (-Q 30). We then removed all heterozygous genotype calls, manually compared the genotype calls with the ISOGG SNPs, and assigned each individual to a specific Y haplogroup. Before variant calling, we filtered the alignment data using the pysam library v.0.15.2 (https://pysam.readthedocs.io/en/latest/) to reduce false-positive variants caused by postmortem damage and modern human contamination. We kept an observed base only if it came from a read shorter than 100 bp and was more than 10 bp from the ends of the read. For transition SNPs, we further removed aligned bases that came from reads with no postmortem damage pattern (that is, no C-to-T or G-to-A substitutions). We determined the Y haplogroup of each individual primarily on the basis of transversion SNPs, and considered transitions additionally when the transversions were not sufficient.
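The base-level filter applied before Y-chromosome variant calling can be written as a single predicate. This sketch does not use pysam and is not the authors' script: the function name, the 0-based position convention, and the reading of "more than 10 bp from the ends" as distance to the nearest read end are all assumptions.

```python
MAX_READ_LENGTH = 100   # keep bases only from reads shorter than this (from the text)
MIN_END_DISTANCE = 10   # base must be more than this many bp from a read end (from the text)


def keep_base(read_length, position, is_transition_snp=False, read_has_damage=True):
    """Decide whether an observed base is kept for Y-haplogroup genotyping.

    position is the 0-based index of the base within the read (an assumed convention).
    """
    if read_length >= MAX_READ_LENGTH:
        return False  # long reads are more likely to be modern contaminants
    distance_to_nearest_end = min(position, read_length - 1 - position)
    if distance_to_nearest_end <= MIN_END_DISTANCE:
        return False  # read ends carry most of the deamination damage
    if is_transition_snp and not read_has_damage:
        return False  # at transitions, require a read that shows a damage pattern
    return True
```

The two halves of the filter pull in different directions on purpose: excluding read ends removes damage-induced false variants, while requiring damage somewhere in the read at transition SNPs enriches for authentically ancient molecules.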

We performed principal component analysis as implemented in smartpca v.16000 (ref. 65), using a set of 2,077 present-day Eurasian individuals from the HumanOrigins dataset (Supplementary Data 2B) and the options "lsqproject: YES" and "shrinkmode: YES". Unsupervised admixture analysis was performed using ADMIXTURE v.1.3.0 (ref. 66). For ADMIXTURE, we removed genetic markers with a minor allele frequency below 1% and pruned for linkage disequilibrium using the --indep-pairwise 200 25 0.2 option in PLINK v.1.90 (ref. 67). We used outgroup f3 statistics68 to measure the genetic relationship between the target population and a set of Eurasian populations since their divergence from an African outgroup. We calculated f4 statistics58 with the "f4mode: YES" option in the ADMIXTOOLS package. The f3 and f4 statistics were calculated using qp3Pop v.435 and qpDstat v.755 in the ADMIXTOOLS package.
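The quantities behind the f3 and f4 statistics can be sketched from population allele frequencies. This is a simplified illustration, not the ADMIXTOOLS estimators: qp3Pop and qpDstat work from genotype counts, apply finite-sample corrections, and report block jackknife standard errors, all of which are omitted here. The function names are illustrative.

```python
def f4(freqs_a, freqs_b, freqs_c, freqs_d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    products = [(a - b) * (c - d)
                for a, b, c, d in zip(freqs_a, freqs_b, freqs_c, freqs_d)]
    return sum(products) / len(products)


def outgroup_f3(freqs_outgroup, freqs_x, freqs_y):
    """f3(Outgroup; X, Y): mean over SNPs of (x - o) * (y - o).

    Larger values indicate more genetic drift shared by X and Y
    after their split from the outgroup.
    """
    products = [(x - o) * (y - o)
                for o, x, y in zip(freqs_outgroup, freqs_x, freqs_y)]
    return sum(products) / len(products)
```

In this form, outgroup f3 is the special case f4(O, X; O, Y), which is why the two statistics are usually computed by the same family of tools.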

We assessed whether the Bronze Age Xinjiang individuals descended from genetically related parents by estimating runs of homozygosity (ROH). ROH are segments of the genome in which the two chromosomes of an individual are identical to each other because of recent common ancestry. The presence of long ROH segments therefore strongly suggests that an individual's parents were related. We applied the hapROH method (ref. 69) using the Python library hapROH v.0.3a4 with default parameters. This method was developed to identify ROH from the low-coverage genotype data typical of ancient DNA, and retains enough power to identify ROH in individuals with coverage as low as 0.5× (ref. 69). We report the summed length of ROH longer than 4, 8, 12 and 20 cM, and visualized the results using DataGraph v.4.5.1.
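The summary reported above, total ROH length above each cut-off, can be sketched as follows. This is not a hapROH call: it assumes the ROH segments have already been identified and are given as a list of lengths in cM, and the function name and the strict "longer than" comparison are assumptions made for the example.

```python
def roh_sums(segment_lengths_cm, thresholds=(4, 8, 12, 20)):
    """Sum the lengths (cM) of ROH segments longer than each threshold."""
    return {
        t: sum(length for length in segment_lengths_cm if length > t)
        for t in thresholds
    }


# Hypothetical individual with four ROH segments (lengths in cM).
example = roh_sums([3.0, 5.0, 9.0, 25.0])
# example == {4: 39.0, 8: 34.0, 12: 25.0, 20: 25.0}
```

A large sum in the longest category (>20 cM) is the pattern that would point to closely related parents, whereas many short segments are more typical of a small background population.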

We used the qpWave/qpAdm programs (qpWave v.410 (ref. 70) and qpAdm v.810 (ref. 57)) to model the ancestry of our ancient Xinjiang populations. Unless explicitly stated otherwise, we used the following eight populations in the 1,240k data set as the base set of outgroups (Base): Mbuti (n = 5), Natufian (n = 6), Onge (n = 2), Iran_N (n = 5), Villabruna (n = 1), Mixe (n = 3), Ami (n = 2) and Anatolia_N (n = 23). This set includes an African outgroup (Mbuti), early Holocene Levantine hunter-gatherers (Natufian), Andaman Islanders (Onge), early Neolithic Iranians from the Tepe Ganj Dareh site (Iran_N), late Pleistocene western European hunter-gatherers (Villabruna), Central American natives (Mixe), an indigenous group from Taiwan (Ami) and Neolithic farmers from Anatolia (Anatolia_N). To compare competing models, we also adopted a "rotating" approach, in which the sources of one model are added to the outgroups of the competing models. The outgroups used are specified for every qpAdm model.
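The rotating approach can be sketched as a small bookkeeping routine that builds, for each competing model, the list of target plus sources and the list of outgroups. The function name and the two example models below are illustrative assumptions, not the models tested in the study; only the eight Base outgroups are taken from the text.

```python
BASE_OUTGROUPS = [
    "Mbuti", "Natufian", "Onge", "Iran_N",
    "Villabruna", "Mixe", "Ami", "Anatolia_N",
]


def rotating_models(target, candidate_models, base=BASE_OUTGROUPS):
    """Build target/source and outgroup lists for competing qpAdm models.

    candidate_models is a list of source lists. Sources that a model does
    not use are appended to that model's outgroups, so each model is also
    tested against the sources of its competitors.
    """
    # Collect every source used by any model, keeping first-seen order.
    all_sources = []
    for sources in candidate_models:
        for source in sources:
            if source not in all_sources:
                all_sources.append(source)

    models = []
    for sources in candidate_models:
        rotated = [s for s in all_sources if s not in sources]
        models.append({
            "left": [target, *sources],   # target followed by its sources
            "right": [*base, *rotated],   # Base plus competing sources
        })
    return models


# Two hypothetical competing two-way models for one target.
models = rotating_models(
    "Dzungaria_EBA1",
    [["Afanasievo", "Baikal_EBA"], ["Afanasievo", "Tarim_EMBA1"]],
)
```

A model that still fits after a competing source has been moved to the outgroups is harder to reject than one tested against the Base set alone, which is the motivation for rotating.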

We dated the admixture events of the ancient populations using DATES v.753 (ref. 26), under the simplifying assumption that gene flow occurred as a single pulse, assuming a generation time of 29 years and using pseudo-haploid genotype data (ref. 58). DATES measures the decay of ancestry covariance to infer the time since admixture and estimates jackknife standard errors. In the parameter file, we used the options binsize: 0.001, maxdis: 0.5, runmode: 1, qbin: 10 and lovalfit: 0.45 for every run on pseudo-haploid genotype data. For each target population, we selected a pair of reference populations that we had identified as good sources in the qpAdm analysis. Where the qpAdm source had a limited sample size or SNP coverage, we chose a substitute with a similar genetic profile but better data quality to increase the statistical power of the DATES analysis (Supplementary Data 1D-G). For Dzungaria_EBA1 and Chemurchek, we used Afanasievo (n = 20) and Baikal_EBA (n = 9) as references. For Kumsay_EBA and Mereke_MBA, we used Afanasievo (n = 20) and Baikal_EN (n = 15). For Dali_EBA, we used Tarim_EMBA1 (n = 12) and Baikal_EBA (n = 9). For Tarim_EMBA1, we used West_Siberia_N (n = 3) and DevilsCave_N (n = 4).
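The logic of dating a single-pulse admixture event can be sketched as an exponential decay fit followed by a conversion from generations to years. This is a simplified stand-in and not the DATES algorithm: it assumes the ancestry covariance decays as exp(-n d), where n is the number of generations since admixture and d is the genetic distance in Morgans, and it fits n by ordinary least squares on the log scale. The function names and the synthetic data are assumptions made for the example; only the 29-year generation time comes from the text.

```python
import math


def fit_generations(distances_morgans, covariances):
    """Estimate generations since admixture from a decay curve.

    Fits log(covariance) = intercept - n * distance by least squares and
    returns n. All covariances must be positive.
    """
    xs = list(distances_morgans)
    ys = [math.log(c) for c in covariances]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return -slope


def generations_to_years(generations, years_per_generation=29):
    """Convert generations before the sample's lifetime into years."""
    return generations * years_per_generation


# Synthetic decay curve for an admixture event 50 generations earlier.
distances = [0.005 * i for i in range(1, 21)]
curve = [0.1 * math.exp(-50 * d) for d in distances]
estimate = fit_generations(distances, curve)
```

Under these assumptions, 50 generations correspond to 1,450 years before the individuals lived; the calendar date of admixture then follows by adding this interval to the age of the samples.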

Total protein was extracted from the dental calculus of seven Xiaohe individuals excavated from layers 4 and 5 (Extended Data Table 2). Only individuals with calculus deposits of >5 mg were analysed, and 5-10 mg of calculus was processed for each sample. After decalcification in 0.5 M EDTA, samples were extracted and digested using a filter-aided sample preparation protocol (ref. 71). Following a previously described protocol, the extracted peptides were analysed by liquid chromatography with tandem mass spectrometry (MS/MS) using a Q-Exactive mass spectrometer (Thermo Scientific) coupled to an ACQUITY UPLC M-Class system (Waters AG). Potential contamination and sample carryover were monitored using extraction blanks and injection blanks between each sample.

Tandem mass spectra were converted to Mascot generic files by MSConvert v.3.0.11781 using the 100 most intense peaks from each MS/MS spectrum. All MS/MS samples were analysed using Mascot (Matrix Science; v.2.6.0). Mascot was set up to search the SwissProt Release 2019_08 database (560,823 entries), assuming trypsin as the digestion enzyme. Mascot was searched with a fragment ion mass tolerance of 0.050 Da and a precursor ion tolerance of 10.0 ppm. Carbamidomethylation of cysteine was specified in Mascot as a fixed modification. Deamidation of asparagine and glutamine and oxidation of methionine and proline were specified as variable modifications. Replicate analyses were performed on a subset of samples (Supplementary Data 3), and the results were combined using multidimensional protein identification technology (MudPIT) before analysis.
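The two mass tolerances work on different scales: the precursor tolerance is relative (parts per million) and the fragment tolerance is absolute (daltons). The sketch below shows the arithmetic only; it is not Mascot's matching code, and the function names and m/z values are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def precursor_matches(observed_mz, theoretical_mz, tol_ppm=10.0):
    """True if the precursor ion lies within the ppm tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def fragment_matches(observed_mz, theoretical_mz, tol_da=0.050):
    """True if the fragment ion lies within the absolute Da tolerance."""
    return abs(observed_mz - theoretical_mz) <= tol_da


# A 0.005 offset at m/z 1000 is 5 ppm, inside the 10 ppm window.
ok_precursor = precursor_matches(1000.005, 1000.0)
# A 0.02 offset at m/z 1000 is 20 ppm, outside the window.
bad_precursor = precursor_matches(1000.02, 1000.0)
```

Because the precursor window scales with mass, 10 ppm corresponds to 0.01 at m/z 1000 but only 0.005 at m/z 500, whereas the 0.050 Da fragment window is the same at every mass.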

Scaffold (v.4.9.0, Proteome Software) was used to validate MS/MS-based peptide and protein identifications. Peptide identifications were accepted if they could be established at greater than 86.0% probability, to achieve a false discovery rate (FDR) of less than 1.0%, by the Peptide Prophet algorithm (ref. 71) with Scaffold delta-mass correction. Protein identifications were accepted if they could be established at an FDR of less than 5.0% and contained at least two unique peptides. The final protein and peptide FDRs were 1.8% and 0.99%, respectively. Protein probabilities were assigned by the Protein Prophet algorithm (ref. 72). After establishing the presence of the milk proteins β-lactoglobulin and α-S1-casein using these criteria, we expanded our analysis to accept further milk proteins on the basis of single-peptide identifications with high-scoring peptide-spectrum matches (PSMs; score >60). This led to the additional identification of α-lactalbumin. Proteins that contained similar peptides and could not be differentiated on the basis of MS/MS analysis alone were grouped to satisfy the principles of parsimony. All samples yielded proteomes typical of a dental calculus oral microbiome, and damage-associated modifications (N and Q deamidation) characteristic of ancient proteins were observed (Supplementary Data 3).
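The two-tier acceptance rule described above can be sketched as a simple filter. This is an illustration of the stated criteria, not Scaffold's implementation: the `ProteinHit` record, the `accept` function and the `MILK_PROTEINS` set are hypothetical, and the sketch assumes that FDR control has already been evaluated upstream and is passed in as a flag.

```python
from dataclasses import dataclass

# Milk proteins named in the text (set membership is an assumption here).
MILK_PROTEINS = {"beta-lactoglobulin", "alpha-S1-casein", "alpha-lactalbumin"}


@dataclass
class ProteinHit:
    """Summary of one candidate protein identification (hypothetical)."""
    name: str
    unique_peptides: int      # number of unique accepted peptides
    best_psm_score: float     # highest PSM score among its peptides
    passes_protein_fdr: bool  # protein established at FDR < 5.0%


def accept(hit: ProteinHit, min_peptides: int = 2,
           single_peptide_score: float = 60.0) -> bool:
    """Return True if the hit meets either acceptance tier."""
    # Tier 1: standard rule, FDR-controlled with at least two unique peptides.
    if hit.passes_protein_fdr and hit.unique_peptides >= min_peptides:
        return True
    # Tier 2: expanded rule, a milk protein supported by a single peptide
    # whose PSM score exceeds 60.
    return (
        hit.name in MILK_PROTEINS
        and hit.unique_peptides == 1
        and hit.best_psm_score > single_peptide_score
    )
```

In the study the second tier was applied only after β-lactoglobulin and α-S1-casein had been established under the first; that ordering is not modelled in this sketch.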

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

The DNA sequences reported in this paper have been deposited in the European Nucleotide Archive under the accession number PRJEB46875. Haploid genotype data for the ancient individuals in this study on the 1,240k panel are available in EIGENSTRAT format at https://edmond.mpdl.mpg.de/imeji/collection/OMm2fpu0jR3jSqnY. The protein spectra have been deposited in the ProteomeXchange Consortium via the PRIDE partner repository under the accession number PXD027706. The publicly available SwissProt Release 2019_08 can be accessed through the UniProt Knowledgebase (https://www.uniprot.org). The base maps used in Figs. 1 and 3 are in the public domain and can be accessed through the Natural Earth website (https://www.naturalearthdata.com/downloads/10m-raster-data/).

All analyses performed in this study are based on publicly available software programs. Specific version information and non-default parameters are described in the Methods.

Peyrot, M. in Aspects of Globalisation: Mobility, Exchange and the Development of Multi-Cultural States 12–17 (2017).

Damgaard, P. de B. et al. 137 ancient human genomes from across the Eurasian steppes. Nature 557, 369–374 (2018).

Hemphill, B. E. & Mallory, J. P. Horse-mounted invaders from the Russo-Kazakh steppe or agricultural colonists from western Central Asia? A craniometric investigation of the Bronze Age settlement of Xinjiang. Am. J. Phys. Anthropol. 124, 199–222 (2004).

Betts, A., Jia, P. & Abuduresule, I. A new hypothesis for early Bronze Age cultural diversity in Xinjiang, China. Archaeol. Res. Asia 17, 204–213 (2019).

Li, C. et al. Evidence that a West-East admixed population lived in the Tarim Basin as early as the early Bronze Age. BMC Biol. 8, 15 (2010).

Li, C. et al. Analysis of ancient human mitochondrial DNA from the Xiaohe cemetery: insights into prehistoric population movements in the Tarim Basin, China. BMC Genet. 16, 78 (2015).

Ning, C. et al. Ancient genomes reveal Yamnaya-related ancestry and a potential source of Indo-European speakers in Iron Age Tianshan. Curr. Biol. 29, 2526–2532 (2019).

Zhou, X. et al. 5,200-year-old cereal grains from the eastern Altai Mountains redate the trans-Eurasian crop exchange. Nat. Plants 6, 78–87 (2020).

Wang, T. et al. Tianshanbeilu and the Isotopic Millet Road: reviewing the late Neolithic/Bronze Age radiation of human millet consumption from north China to Europe. Natl Sci. Rev. 6, 1024–1039 (2019).

Zhang, Y. et al. Holocene environmental changes around Xiaohe Cemetery and its effects on human occupation, Xinjiang, China. J. Geogr. Sci. 27, 752–768 (2017).

Hong, Z., Jian-Wei, W., Qiu-Hong, Z. & Yun-Jiang, Y. A preliminary study of oasis evolution in the Tarim Basin, Xinjiang, China. J. Arid Environ. 55, 545–553 (2003).

Jia, P. & Betts, A. A re-analysis of the Qiemu'erqieke (Shamirshak) cemeteries, Xinjiang, China. J. Indo-Eur. Stud. 38, 275–317 (2010).

Peyrot, M. The deviant typological profile of the Tocharian branch of Indo-European may be due to Uralic substrate influence. Indo-Eur. Linguist. 7, 72–121 (2019).

Bouckaert, R. et al. Mapping the origins and expansion of the Indo-European language family. Science 337, 957–960 (2012).

Allentoft, M. E. et al. Population genomics of Bronze Age Eurasia. Nature 522, 167–172 (2015).

Mallory, J. P. & Mair, V. H. The Tarim Mummies: Ancient China and the Mystery of the Earliest Peoples from the West (Thames & Hudson, 2000).

Barber, E. W. The Mummies of Ürümchi (W. W. Norton & Co., 1999).

Mair, V. H. Prehistoric Caucasoid corpses of the Tarim Basin. J. Indo-Eur. Stud. 23, 281–307 (1995).

Mair, V. H. in The Bronze Age and Early Iron Age Peoples of Eastern Central Asia Vol. 2 835–855 (Institute for the Study of Man and Univ. Pennsylvania Museum, 1998).

Mallory, J. P. The Problem of Tocharian Origins: An Archaeological Perspective (Univ. Pennsylvania Press, 2015).

Chen, K. & Hiebert, F. T. The late prehistory of Xinjiang in relation to its neighbors. J. World Prehist. 9, 243–300 (1995).

Han, K. Craniometric study of ancient individuals from the Gumugou site in Xinjiang. Kaogu Xuebao 361–384 (1986).

Kuzmina, E. E. in Archaeology, Migration and Nomadism, Linguistics Vol. 1 63–93 (Univ. Pennsylvania Museum Press, 1998).

Li, Y. Agriculture and palaeoeconomy in prehistoric Xinjiang, China (3000–200 BC). Veg. Hist. Archaeobot. 30, 287–303 (2021).

Frachetti, M. D. Multiregional emergence of mobile pastoralism and nonuniform institutional complexity across Eurasia. Curr. Anthropol. 53, 2–38 (2012).

Narasimhan, V. M. et al. The formation of human populations in South and Central Asia. Science 365, eaat7487 (2019).

Feng, Q. et al. Genetic history of Xinjiang's Uyghurs suggests Bronze Age multiple-way contacts in Eurasia. Mol. Biol. Evol. 34, 2572–2582 (2017).

Jeong, C. et al. Bronze Age population dynamics and the rise of dairy pastoralism on the eastern Eurasian steppe. Proc. Natl Acad. Sci. USA 115, E11248–E11255 (2018).

Yu, H. et al. Paleolithic to Bronze Age Siberians reveal connections with first Americans and across Eurasia. Cell 181, 1232–1245 (2020).

Jeong, C. et al. A dynamic 6,000-year genetic history of Eurasia's eastern steppe. Cell 183, 890–904 (2020).

Fu, Q. et al. The genetic history of Ice Age Europe. Nature 534, 200–205 (2016).

Wang, C.-C. et al. Genomic insights into the formation of human populations in East Asia. Nature 591, 413–419 (2021).

Li, J.-F. et al. Buried in sands: environmental analysis at the archaeological site of Xiaohe Cemetery, Xinjiang, China. PLoS ONE 8, e68957 (2013).

Qiu, Z. et al. Paleo-environment and paleo-diet inferred from Early Bronze Age cow dung at Xiaohe Cemetery, Xinjiang, NW China. Quat. Int. 349, 167–177 (2014).

Yang, Y. et al. Proteomics evidence for kefir dairy in Early Bronze Age China. J. Archaeol. Sci. 45, 178–186 (2014).

Xie, M. et al. Identification of a dairy product in the grass woven basket from Gumugou Cemetery (3800 BP, northwestern China). Quat. Int. 426, 158–165 (2016).

Yang, R. et al. Investigation of cereal remains at the Xiaohe Cemetery in Xinjiang, China. J. Archaeol. Sci. 49, 42–47 (2014).

Zhang, G. et al. Ancient plant use and palaeoenvironmental analysis at the Gumugou Cemetery, Xinjiang, China: implication from desiccated plant remains. Archaeol. Anthropol. Sci. 9, 145–152 (2017).

Yu, J. & He, J. Major discoveries from the excavation of the Tongtian Cave site in Jimunai (in Chinese). Wenwubao 8 (2017).

Hollard, C. et al. New genetic evidence of affinities and discontinuities between Bronze Age Siberian populations. Am. J. Phys. Anthropol. 167, 97–107 (2018).

Li, C. et al. Ancient DNA analysis of desiccated wheat grains excavated from a Bronze Age cemetery in Xinjiang. J. Archaeol. Sci. 38, 115–119 (2011).

Stevens, C. J. & Fuller, D. Q. The spread of agriculture in eastern Asia: archaeological bases for hypothetical farmer/language dispersals. Lang. Dyn. Change 7, 152–186 (2017).

Abuduresule, I. Archaeological report on the Xiaohe Cemetery in 2003. Wenwu 4–42 (2007).

Abuduresule, Y., Li, W. & Hu, X. in The Cultures of Ancient Xinjiang, Western China: Crossroads of the Silk Roads 19–51 (Archaeopress, 2019).

Reimer, P. J. et al. The IntCal20 Northern Hemisphere radiocarbon age calibration curve (0–55 cal kBP). Radiocarbon 62, 725–757 (2020).

Ramsey, C. B. Methods for summarizing radiocarbon datasets. Radiocarbon 59, 1809–1833 (2017).

Dabney, J. et al. Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments. Proc. Natl Acad. Sci. USA 110, 15758–15763 (2013).

Rohland, N., Harney, E., Mallick, S., Nordenfelt, S. & Reich, D. Partial uracil-DNA-glycosylase treatment for screening of ancient DNA. Philos. Trans. R. Soc. B 370 (2015).

Peltzer, A. et al. EAGER: efficient ancient genome reconstruction. Genome Biol. 17, 60 (2016).

Schubert, M., Lindgreen, S. & Orlando, L. AdapterRemoval v2: rapid adapter trimming, identification, and read merging. BMC Res. Notes 9 (2016).

Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

Jun, G., Wing, M. K., Abecasis, G. R. & Kang, H. M. An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data. Genome Res. 25, 918–925 (2015).

Jónsson, H., Ginolhac, A., Schubert, M., Johnson, P. L. F. & Orlando, L. mapDamage2.0: fast approximate Bayesian estimates of ancient DNA damage parameters. Bioinformatics 29, 1682–1684 (2013).

Renaud, G., Slon, V., Duggan, A. T. & Kelso, J. Schmutzi: estimation of contamination and endogenous mitochondrial consensus calling for ancient DNA. Genome Biol. 16, 224 (2015).

Korneliussen, T. S., Albrechtsen, A. & Nielsen, R. ANGSD: analysis of next generation sequencing data. BMC Bioinformatics 15, 356 (2014).

Jeong, C. et al. The genetic history of admixture across inner Eurasia. Nat. Ecol. Evol. 3, 966–976 (2019).

Lazaridis, I. et al. Genomic insights into the origin of farming in the ancient Near East. Nature 536, 419–424 (2016).

Patterson, N. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012).

Fu, Q. et al. An early modern human from Romania with a recent Neanderthal ancestor. Nature 524, 216–219 (2015).

Mallick, S. et al. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature 538, 201–206 (2016).

Kennett, D. J. et al. Archaeogenomic evidence reveals prehistoric matrilineal dynasty. Nat. Commun. 8, 14115 (2017).

Lipatov, M., Sanjeev, K., Patro, R. & Veeramah, K. R. Maximum likelihood estimation of biological relatedness from low coverage sequencing data. Preprint at https://doi.org/10.1101/023374 (2015).

Kearse, M. et al. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28, 1647–1649 (2012).

Weissensteiner, H. et al. HaploGrep 2: mitochondrial haplogroup classification in the era of high-throughput sequencing. Nucleic Acids Res. 44, W58–W63 (2016).

Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, e190 (2006).

Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).

Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience 4 (2015).

Raghavan, M. et al. Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans. Nature 505, 87–91 (2014).

Ringbauer, H., Novembre, J. & Steinrücken, M. Detecting runs of homozygosity from low-coverage ancient DNA. Preprint at https://doi.org/10.1101/2020.05.31.126912 (2020).

Reich, D. et al. Reconstructing Native American population history. Nature 488, 370–374 (2012).

Keller, A., Nesvizhskii, A. I., Kolker, E. & Aebersold, R. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal. Chem. 74, 5383–5392 (2002).

Nesvizhskii, A. I., Keller, A., Kolker, E. & Aebersold, R. A statistical model for identifying proteins by tandem mass spectrometry. Anal. Chem. 75, 4646–4658 (2003).

We thank the Xinjiang Institute of Cultural Relics and Archaeology and Renmin University of China for providing the valuable samples for this study; Lanzhou University for providing AMS dating results; K. Wang, H. Yu and G. A. Gnecchi-Ruscone for helpful comments on the genetic landscape of the Eurasian steppe; and T. Hermes and R. Flad for helpful comments on the broader archaeological context of the region. This work was supported by the National Key Research and Development Program (2016YFE0203700 and 2018YFA0606402), the National Natural Science Foundation of China (42072018, 41925009), the European Central Research Council Fundamental Research Fund, the European Union Horizon 2020 Research and Innovation Program (grant agreement numbers 804884-DAIRYCULTURES and 646612-Eurasia3angle), the Key Research Base of Humanities and Social Sciences of the Ministry of Education (16JJD780005), the National Research Foundation of Korea grant funded by the Korean government (MSIT; 2020R1C1C1003879) and the Max Planck Society.

Open access funding provided by the Max Planck Society.

These authors contributed equally: Fan Zhang, Chao Ning, Ashley Scott

College of Life Sciences, Jilin University, Changchun, China

Fan Zhang, Linyuan Fan, Pengcheng Ma, Chunxiang Li, Yang Xu, Sihao Wu, Hui Zhou and Yinqiu Cui

Max Planck Institute for the Science of Human History, Jena, Germany

Chao Ning, Ashley Scott, Rasmus Bjørn, Martine Robbeets, Johannes Krause and Christina Warinner

Key Laboratory of Vertebrate Evolution and Human Origin, Institute of Vertebrate Paleontology and Paleoanthropology, Center for Excellence in Life and Paleoenvironmental Research, Chinese Academy of Sciences, Beijing, China

Qiaomei Fu, Wenjun Wang, Peng Cao, Feng Liu, Qingyan Dai, Xiaotian Feng, Ruowei Yang and Vikas Kumar

Xinjiang Institute of Cultural Relics and Archaeology, Urumqi, China

Wenying Li, Idilisi Abuduresule, Xingjun Hu, Qiurong Ruan and Alipujiang Niyazi

School of Archaeology, Jilin University, Changchun, China

Dong Wei and Hong Zhu

Key Laboratory of Western Environmental System of Ministry of Education, School of Earth and Environmental Sciences, Lanzhou University, Lanzhou

Key Laboratory of Cenozoic Geology and Environment, Institute of Geology and Geophysics, Chinese Academy of Sciences, Beijing, China

School of Pharmacy, Jilin University, Changchun

Institute of Archaeological Sciences, Fudan University, Shanghai, China

Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany

Department of Anthropology, Harvard University, Cambridge, Massachusetts, USA

College of Biological Sciences, Seoul National University, South Korea

Key Laboratory of Past Life and Environmental Evolution in Northeast Asia, Ministry of Education, Jilin University, Changchun, China

China Frontier Archaeology Research Center, Jilin University, Changchun, China

YC, CJ, CW, CN and JK conceived and supervised this research. FZ, AS, LF, PC, RY, FL and QD performed the laboratory work. QF, DW, WL, XH, QR, IA, CL, SG, YX, S. Wu, S. Wen, H. Zhu, H. Zhou and AN provided archaeological materials and related information. RB and MR provided the linguistic background, and GD and ZT assisted with the AMS dating. CN, FZ, AS, CW, CJ, QF, PM, XF, WW and VK analysed the data. CN, CW, CJ, YC, FZ and AS wrote the manuscript with input from all co-authors.

Correspondence to Chao Ning, Johannes Krause, Christina Warinner, Choongwon Jeong or Yinqiu Cui.

The authors declare no competing interests.

Peer review information Nature thanks Paula Dupuy, Michael Frachetti and other anonymous reviewers for their contributions to the peer review of this work. Peer review reports are available.

Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A, Wooden carvings excavated from the upper layer of the double-layer mud coffin XHM75. B, A paddle-shaped board placed in front of a male burial. C, Wooden poles placed in front of a female burial. D, Burial XHM66 in layer 4 of the Xiaohe Cemetery, showing the typical features of early burials, including a boat-shaped coffin and mummified remains wearing woollen clothing. This type of burial is common in Bronze Age cemeteries throughout the Tarim Basin, including Beifang and Gumugou. E, Side view of the Xiaohe Cemetery, showing wooden grave markers and fences.

A, The top 5 outgroup f3 statistics of the form f3(Target, X; Mbuti), with 361 worldwide populations as the comparison population X and 8 populations from this study and the Eurasian steppe as targets: Dzungaria_EBA1, Dzungaria_EBA2, Chemurchek, Dzungaria_EIA, Okunevo_EMBA, Kazakhstan_EMBA, Botai_CA and West_Siberia_N. Horizontal bars represent ±1 standard error of the mean (s.e.m.), calculated by 5 cM block jackknifing. B, f4 statistics of the form f4(Mbuti, X; Dzungaria_EBA1, Tarim_EMBA1). C, f4 statistics of the form f4(Mbuti, X; Dzungaria_EBA2, Tarim_EMBA1). In B and C, X is one of the 361 worldwide populations, and we show the top and bottom 15 f4 statistics. Horizontal bars represent point estimates ±3 (thin) and ±1 (thick) s.e.m., estimated by 5 cM block jackknifing. f4 statistics that deviate from zero by three s.e.m. or more are marked in red.

We used AncestryPainter (https://www.picb.ac.cn/PGG/resource.php) to plot the estimated ancestry components for K = 8. Dzungaria_EBA individuals show ancestry patterns similar to those of Afanasievo and Yamnaya, whereas Tarim_EMBA individuals show patterns similar to those of AG3, West_Siberia_N and Botai_CA from the Eurasian steppe.

A, Comparison of individual-based f3 statistics for the ancient Xinjiang populations and neighbouring populations, including Tarim_EMBA1 (n = 12), Tarim_EMBA2 (n = 1), ANE (n = 3), Dzungaria_EBA1 (n = 3), Dzungaria_EBA2 (n = 2), West_Siberia_N (n = 3) and Botai_CA (n = 3); affinity is highest between individuals from the Tarim Basin. In each box plot, the box marks the 25th and 75th percentiles of the distribution and the horizontal line inside the box marks the median. The whiskers depict the maximum and minimum values. B, The cumulative distribution of ROH tracts indicates that the Tarim_EMBA individuals were not the offspring of closely related parents. … (n = 15), Botai_CA (n = 3), Dzungaria_EBA (n = 5), Dzungaria_EIA (n = 10), Sintashta_MLBA (n = 51), Tarim_EMBA (n = 13), West_Siberia_N (n = 3), as well as present-day isolated populations such as the Papuans and Karitiana. The Tarim_EMBA individuals all show greatly reduced pairwise mismatch rate (pmr) values, equivalent to those of first-degree relatives in Afanasievo or Sintashta_MLBA. The red dotted lines mark the expected pmr values for a given coefficient of relationship (r), from 0 (unrelated) through 1/4 (second-degree relatives) to 1/2 (first-degree relatives), based on the mean pmr in these populations. In each box plot, the box represents the interquartile range (25th to 75th percentiles) and the horizontal line inside the box represents the median. Filled and open circles represent outliers (beyond 1.5 times the IQR) and extreme outliers (beyond 3 times the IQR), respectively. The whiskers depict the smallest and largest non-outlier observations. D, Y-chromosome phylogeny of the Bronze Age male individuals from Xinjiang. The Xiaohe males belong to a branch distinct from that of Bronze Age Western steppe herders such as Afanasievo and Yamnaya. The Beifang individual sits at a more basal position than the Xiaohe individuals; because of its low coverage its phylogenetic position cannot be fixed, and the closest possible position is marked with an asterisk.
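The expected pmr values marked for each coefficient of relationship can be illustrated with a short calculation. The sketch assumes the relation commonly used for pseudo-haploid data, in which the expected mismatch rate falls from the population baseline b to b(1 − r/2) for a pair with coefficient of relationship r; the caption does not state the formula, so this relation, the function name and the baseline value are assumptions made for the example.

```python
def expected_pmr(baseline_pmr, r):
    """Expected pairwise mismatch rate for a pair with relatedness r.

    baseline_pmr is the mean pmr between unrelated individuals of the
    same population; r is the coefficient of relationship (0 to 1).
    """
    return baseline_pmr * (1 - r / 2)


# Hypothetical baseline pmr of 0.24 for unrelated pairs.
unrelated = expected_pmr(0.24, 0)          # r = 0
second_degree = expected_pmr(0.24, 0.25)   # r = 1/4
first_degree = expected_pmr(0.24, 0.5)     # r = 1/2
```

Under this assumption a first-degree pair is expected at three quarters of the baseline, so a whole population whose unrelated pairs sit near that level has unusually low genetic diversity rather than unusually many relatives.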

A, b- and y-ion series for the frequently observed β-lactoglobulin peptide TPEVD(D/N/K)EALEK, which contains a taxon-specific polymorphic residue: D, Bovinae; N, Ovis; K, Capra. See SI appendix. B, Taxonomic assignments of β-lactoglobulin (black), α-S1-casein (dark grey) and α-lactalbumin peptide-spectrum matches (PSMs), shown as scaled pie charts on a cladogram of bovids. The number in parentheses indicates the number of PSMs (excluding duplicates) assigned to each node. †The bovid node includes 13 PSMs assigned to bovids and 21 PSMs assigned to bovids excluding Capra.

This PDF file includes five sections of supplementary text: (1) the environmental background of Xinjiang; (2) archaeological sites and environment; (3) the linguistic background of Xinjiang population history; (4) a detailed description of the genetic isolation of the Tarim group; and (5) the Tarim mummies and the pre-pastoralist genetic substrate of Central Asia.

Sample information, qpAdm modelling results and phenotypic traits of the studied individuals.

Ancient and present-day populations analysed in this study.

Milk peptides identified in Xiaohe dental calculus samples.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third-party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Zhang, F., Ning, C., Scott, A. et al. The genomic origins of the Bronze Age Tarim Basin mummies. Nature 599, 256–261 (2021). https://doi.org/10.1038/s41586-021-04052-7

DOI: https://doi.org/10.1038/s41586-021-04052-7

Nature ISSN 1476-4687 (online) ISSN 0028-0836 (print)