It performs k-mer-based detection, mapping and alignment, variant calling, consensus sequence generation, and lineage/clade analysis using Pangolin and NextClade. Access the DRAGEN COVID Lineage App on BaseSpace Sequence Hub. 2, bottom) show that SARS-CoV-2 is unlikely to have acquired the variable loop from an ancestor of Pangolin-2019 because these two sequences are approximately 10–15% divergent throughout the entire S protein (excluding the N-terminal domain). Scientists defined the pangolin lineage of this variant to be B.1.1.523 and it was originally recognized as a variant under monitoring on July 14, 2021. Software package for assigning SARS-CoV-2 genome sequences to global lineages. To begin characterizing any ancestral relationships for SARS-CoV-2, NRRs of the genome must be identified so that reliable phylogenetic reconstruction and dating can be performed. Divergence dates between SARS-CoV-2 and the bat sarbecovirus reservoir were estimated as 1948 (95% highest posterior density (HPD): 1879–1999), 1969 (95% HPD: 1930–2000) and 1982 (95% HPD: 1948–2009), indicating that the lineage giving rise to SARS-CoV-2 has been circulating unnoticed in bats for decades. For coronaviruses, however, recombination means that small genomic subregions can have independent origins, identifiable if sufficient sampling has been done in the animal reservoirs that support the endemic circulation, co-infection and recombination that appear to be common. To gauge the length of time this lineage has circulated in bats, we estimate the time to the most recent common ancestor (TMRCA) of SARS-CoV-2 and RaTG13.
Bayesian evolutionary rate and divergence date estimates were shown to be consistent for these three approaches and for two different prior specifications of evolutionary rates based on HCoV-OC43 and MERS-CoV. All authors contributed to analyses and interpretations. Our most conservative approach attempted to ensure that putative NRRs had no mosaic or phylogenetic incongruence signals. Root-to-tip divergence as a function of sampling time for non-recombinant regions NRR1 and NRR2 and recombination-masked alignment set NRA3. This boundary appears to be rarely crossed. (2020) with additional (and higher quality) snake coding sequence data and several miscellaneous eukaryotes with low genomic GC content failed to find any meaningful clustering of SARS-CoV-2 with snake genomes (a). Because 3SEQ identified ten BFRs >500 nt, we used GARD's (v.2.5.0) inference on 10, 11 and 12 breakpoints. Given what was known about the origins of SARS, as well as identification of SARS-like viruses circulating in bats that had binding sites adapted to human receptors (refs. 29–31), appropriate measures should have been in place for immediate control of outbreaks of novel coronaviruses.
pango-designation: repository for suggesting new lineages that should be added to the current scheme. pangolin: software package for assigning SARS-CoV-2 genome sequences to global lineages. Humans' selfish, speciesist treatment of these animals could be the very reason why the novel coronavirus exists. The plots are based on maximum likelihood tree reconstructions with a root position that maximises the residual mean squared for the regression of root-to-tip divergence and sampling time. When the genomic data included both coding and non-coding regions we used a single GTR+Γ substitution model; for concatenated coding genes we partitioned the alignment by codon position and specified an independent GTR+Γ model for each partition with a separate gamma model to accommodate inter-site rate variation. The web application was developed by the Centre for Genomic Pathogen Surveillance. This provides compelling support for the SARS-CoV-2 lineage being the consequence of a direct or nearly-direct zoonotic jump from bats, because the key ACE2-binding residues were present in viruses circulating in bats. The key to successful surveillance is knowing which viruses to look for and prioritizing those that can readily infect humans (ref. 47). Region B showed no PI signals within the region, except one including sequence SC2018 (Sichuan), and thus this sequence was also removed from the set. Furthermore, the other key feature thought to be instrumental in the ability of SARS-CoV-2 to infect humans (a polybasic cleavage site insertion in the S protein) has not yet been seen in another close bat relative of the SARS-CoV-2 virus.
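The regression of root-to-tip divergence on sampling time mentioned above can be sketched in a few lines. This is a minimal illustration and not TempEst's implementation: the function name `root_to_tip_regression` and the example numbers are hypothetical, and it assumes the root-to-tip distances have already been read off a rooted maximum likelihood tree.

```python
from typing import Sequence, Tuple


def root_to_tip_regression(
    dates: Sequence[float], divergences: Sequence[float]
) -> Tuple[float, float, float]:
    """Least-squares fit of divergence = slope * date + intercept.

    Returns (slope, x_intercept, r_squared). The slope is a crude rate
    estimate (substitutions per site per year) and the x-intercept is a
    crude estimate of the date of the root.
    """
    n = len(dates)
    if n != len(divergences) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(dates) / n
    mean_y = sum(divergences) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in divergences)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, divergences))
    if sxx == 0 or syy == 0:
        raise ValueError("dates and divergences must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    x_intercept = -intercept / slope if slope != 0 else float("nan")
    return slope, x_intercept, r_squared


if __name__ == "__main__":
    # Hypothetical sampling years and root-to-tip distances, for illustration only.
    years = [2000.0, 2005.0, 2010.0, 2015.0]
    distances = [0.010, 0.015, 0.020, 0.025]
    print(root_to_tip_regression(years, distances))
```

A low R² from such a fit is one informal indication of weak temporal signal, which is the situation in which informative rate priors become necessary.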
Identifying the origins of an emerging pathogen can be critical during the early stages of an outbreak, because it may allow for containment measures to be precisely targeted at a stage when the number of daily new infections is still low. The presence in pangolins of an RBD very similar to that of SARS-CoV-2 means that we can infer this was also probably in the virus that jumped to humans. And this genotype pattern led to creating a new Pangolin lineage named B.1.640.2, a phylogenetic sister group to the old B.1.640 lineage renamed B.1.640.1. Note that six of these sequences fall under the terms of use of the GISAID platform. To avoid artefacts due to recombination, we focused on NRR1 and NRR2 and the recombination-masked alignment NRA3 to infer time-measured evolutionary histories. These authors contributed equally: Maciej F. Boni, Philippe Lemey. The inset represents divergence time estimates based on NRR1, NRR2 and NRA3. Scientists trying to trace the ancestry of SARS-CoV-2, the virus responsible for COVID-19, have found the pangolin is unlikely to be the source of the virus responsible for the current pandemic. The boxplots show divergence time estimates (posterior medians) for SARS-CoV-2 (red) and the 2002–2003 SARS-CoV virus (blue) from their most closely related bat virus.
To examine temporal signal in the sequenced data, we plotted root-to-tip divergence against sampling time using TempEst (ref. 39) v.1.5.3 based on a maximum likelihood tree. Biazzo et al. obtained the genome sequences of 10 SARS-CoV-2 virus strains through nanopore sequencing of nasopharyngeal swabs in Malta and analyzed the assembled genomes with the pangolin software; the strains were assigned to the B.1 lineage, indicating that SARS-CoV-2 was widely spread in Europe (Biazzo et al., 2021). Of importance for future spillover events is the appreciation that SARS-CoV-2 has emerged from the same horseshoe bat subgenus that harbours SARS-like coronaviruses. N. China corresponds to Jilin, Shanxi, Hebei and Henan provinces, and the N. China clade also includes one sequence sampled in Hubei Province in 2004. Phylogenetic trees and exact breakpoints for all ten BFRs are shown in Supplementary Figs. Using a third consensus-based approach for identifying recombinant regions in individual sequences, with six different recombination detection methods in RDP5 (ref.
However, the coronavirus isolated from pangolin is 99% similar in a specific region of the S protein, which corresponds to the 74 amino acids involved in the ACE (Angiotensin Converting Enzyme ... To evaluate the performance procedure, we confirmed that the recombination masking resulted in (1) a markedly different outcome of the PHI test (ref. 64), (2) removal of well-supported (bootstrap value >95%) incompatible splits in Neighbor-Net (ref. 65) and (3) a near-complete reduction of mosaic signal as identified by 3SEQ. We demonstrate that the sarbecoviruses circulating in horseshoe bats have complex recombination histories as reported by others (refs. 15, 20–26). Conservatively, we combined the three BFRs >2 kb identified above into non-recombining region 1 (NRR1). 1 Phylogenetic relationships in the C-terminal domain (CTD). DOI: https://doi.org/10.1038/s41564-020-0771-4. This statement informs us of the possibility that a virus has spilled over from a very rare and shy reptile-looking mammal. The unsampled diversity descended from the SARS-CoV-2/RaTG13 common ancestor forms a clade of bat sarbecoviruses with generalist properties (with respect to their ability to infect a range of mammalian cells) that facilitated its jump to humans and may do so again. The first available sequence data (ref. 6) placed this novel human pathogen in the Sarbecovirus subgenus of Coronaviridae (ref. 7), the same subgenus as the SARS virus that caused a global outbreak of >8,000 cases in 2002–2003.
The most parsimonious explanation for these shared ACE2-specific residues is that they were present in the common ancestors of SARS-CoV-2, RaTG13 and Pangolin Guangdong 2019, and were lost through recombination in the lineage leading to RaTG13. The new paper finds that the genetic sequences of several strains of coronavirus found in pangolins were between 88.5 percent and 92.4 percent similar to those of the novel coronavirus. The genetic distances between SARS-CoV-2 and Pangolin Guangdong 2019 are consistent across all regions except the N-terminal domain, implying that a recombination event between these two sequences in this region is unlikely. c, Maximum likelihood phylogenetic trees rooted on a 2007 virus sampled in Kenya (BtKy72; root truncated from images), shown for five BFRs of the sarbecovirus alignment. matics program called Pangolin was developed. Region B is 5,525 nt long. ISSN 2058-5276 (online). GARD identified eight breakpoints that were also within 50 nt of those identified by 3SEQ. Here, we analyse the evolutionary history of SARS-CoV-2 using available genomic data on sarbecoviruses. We considered (1) the possibility that BFRs could be combined into larger non-recombinant regions and (2) the possibility of further recombination within each BFR.
In this approach, we considered a breakpoint as supported only if it had three types of statistical support: from (1) mosaic signals identified by 3SEQ, (2) PI signals identified by building trees around 3SEQ's breakpoints and (3) the GARD algorithm (ref. 35), which identifies breakpoints by identifying PI signals across proposed breakpoints. Background & objectives: Several phylogenetic classification systems have been devised to trace the viral lineages of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Aside from RaTG13, Pangolin-CoV is the most closely related CoV to SARS-CoV-2. Region C showed no PI signals within it. Next, we (1) collected all breakpoints into a single set, (2) complemented this set to generate a set of non-breakpoints, (3) grouped non-breakpoints into contiguous BFRs and (4) sorted these regions by length. Across a large region of the virus genome, corresponding approximately to ORF1b, it did not cluster with any of the known bat coronaviruses, indicating that recombination probably played a role in the evolutionary history of these viruses (refs. 5, 7). The coronavirus genome that these researchers had assembled, from pangolin lung-tissue samples, contained some gene regions that were ninety-nine per cent similar to equivalent parts of the SARS ... Pangolin relies on a novel algorithm called pangoLEARN. The 2009 influenza pandemic and subsequent outbreaks of MERS-CoV (2012), H7N9 avian influenza (2013), Ebola virus (2014) and Zika virus (2015) were met with rapid sequencing and genomic characterization. BEAST inferences made use of the BEAGLE v.3 library (ref. 68) for efficient likelihood computations. Because these subclades had different phylogenetic relationships in region D (Supplementary Fig.
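The four-step construction of breakpoint-free regions (BFRs) described above (collect breakpoints, take the complement, group into contiguous runs, sort by length) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `breakpoint_free_regions` is hypothetical, positions are assumed to be 1-based, and each breakpoint is assumed to be given as an inclusive interval of alignment positions to reflect uncertainty in its exact location.

```python
from typing import Iterable, List, Tuple


def breakpoint_free_regions(
    alignment_length: int, breakpoints: Iterable[Tuple[int, int]]
) -> List[Tuple[int, int]]:
    """Return BFRs as 1-based inclusive (start, end) pairs, longest first."""
    # Step 1: collect every position covered by any breakpoint interval.
    in_breakpoint = [False] * (alignment_length + 1)  # index 0 is unused
    for low, high in breakpoints:
        for pos in range(max(1, low), min(alignment_length, high) + 1):
            in_breakpoint[pos] = True

    # Steps 2 and 3: the complement is every unmarked position; walk along
    # the alignment and group consecutive unmarked positions into regions.
    regions: List[Tuple[int, int]] = []
    start = None
    for pos in range(1, alignment_length + 2):  # one step past the end closes the last run
        free = pos <= alignment_length and not in_breakpoint[pos]
        if free and start is None:
            start = pos
        elif not free and start is not None:
            regions.append((start, pos - 1))
            start = None

    # Step 4: sort regions by length, longest first.
    regions.sort(key=lambda r: r[1] - r[0] + 1, reverse=True)
    return regions


if __name__ == "__main__":
    # Hypothetical toy alignment of 100 positions with two breakpoints.
    print(breakpoint_free_regions(100, [(20, 22), (60, 60)]))
```

Sorting longest-first makes it easy to apply a length threshold afterwards, for example keeping only regions longer than 500 nt.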
The Pango dynamic nomenclature is a popular system for classifying and naming genetically distinct lineages of SARS-CoV-2, including variants of concern, and is based on the analysis of complete or near-complete virus genomes. 3 Priors and posteriors for evolutionary rate of SARS-CoV-2. Specifically, using a formal Bayesian approach (ref. 42; see Methods), we estimate a fast evolutionary rate (0.00169 substitutions per site per year, 95% highest posterior density (HPD) interval (0.00131, 0.00205)) for SARS viruses sampled over a limited timescale (1 year), a slower rate (0.00078 (0.00063, 0.00092) substitutions per site per year) for MERS-CoV on a timescale of about 4 years and the slowest rate (0.00024 (0.00019, 0.00029) substitutions per site per year) for HCoV-OC43 over almost five decades. Sequences were aligned by MAFFT (ref. 58) v.7.310, with a final alignment length of 30,927, and used in the analyses below. It is RaTG13 that is more divergent in the variable-loop region (Extended Data Fig. A deep dive into the genetics of the novel coronavirus shows it seems to have spent some time infecting both bats and pangolins before it jumped into humans, researchers said.

# File containing the ID of the samples, the Sequence of the haplotype, the Continent, the country, the Region, the Data, the Lineage of Pangolin and Nextstrain clade, and the haplotype number
# In this order
# Could be obtained from the database

This is evidence for numerous recombination events occurring in the evolutionary history of the sarbecoviruses (refs. 22, 33); specifying all past events in their correct temporal order (ref. 34) is challenging and not shown here. Boxplots show interquartile ranges, white lines are medians and box whiskers show the full range of posterior distribution.
Without better sampling, however, it is impossible to estimate whether or how many of these additional lineages exist. Another similarity between SARS-CoV and SARS-CoV-2 is their divergence time (40–70 years ago) from currently known extant bat virus lineages (Fig. Using both prior distributions, this results in six highly similar posterior rate estimates for NRR1, NRR2 and NRA3, centred around 0.00055 substitutions per site per year. Phylogenies of subregions of NRR1 depict an appreciable degree of spatial structuring of the bat sarbecovirus population across different regions (Fig. The assumption of long-term purifying selection would imply that coronaviruses are in endemic equilibrium with their natural host species, horseshoe bats, to which they are presumably well adapted. Sarbecovirus, HCoV-OC43 and SARS-CoV data were assembled from GenBank to be as complete as possible, with sampling year as an inclusion criterion. As of December 2, 2021, SJdRP, a medium-sized city in the Northwest region of São Paulo state, Brazil (Fig. The rate of genome generation is unprecedented, yet there is currently no coherent nor accepted scheme for naming the expanding ... 4), but also by markedly different evolutionary rates. In the variable-loop region, RaTG13 diverges considerably with the TMRCA, now outside that of SARS-CoV-2 and the Pangolin Guangdong 2019 ancestor, suggesting that RaTG13 has acquired this region from a more divergent and undetected bat lineage. EPI_ISL_410538, EPI_ISL_410539, EPI_ISL_410540, EPI_ISL_410541 and EPI_ISL_410542) for the use of sequence data via the GISAID platform.
In early January, the aetiological agent of the pneumonia cases was found to be a coronavirus (ref. 3), subsequently named SARS-CoV-2 by an International Committee on Taxonomy of Viruses (ICTV) Study Group (ref. 4) and also named hCoV-19 by Wu et al. (ref. 5). In addition, sequences NC_014470 (Bulgaria 2008), CoVZXC21, CoVZC45 and DQ412042 (Hubei-Yichang) needed to be removed to maintain a clean non-recombinant signal in A′. (Yes, Pango is a tongue-in-cheek reference to pangolins, which were briefly suspected to have had a role in the coronavirus's origin; several of the team's computational tools are named after. Based on the identified breakpoints in each genome, only the major non-recombinant region is kept in each genome while other regions are masked. However, for several reasons, nucleotide sequences may be generated that cover only the spike gene of SARS-CoV-2. The SARS-CoV divergence times are somewhat earlier than dates previously estimated (ref. 15) because previous estimates were obtained using a collection of SARS-CoV genomes from human and civet hosts (as well as a few closely related bat genomes), which implies that evolutionary rates were predominantly informed by the short-term SARS outbreak scale and probably biased upwards. We thank all authors who have kindly deposited and shared genome data on GISAID.
Regions A–C were further examined for mosaic signals by 3SEQ, and all showed signs of mosaicism. These means are based on the mean rates estimated for MERS-CoV and HCoV-OC43, respectively, while the standard deviations are set ten times higher than empirical values to allow greater prior uncertainty and avoid strong bias (Extended Data Fig. It is clear from our analysis that viruses closely related to SARS-CoV-2 have been circulating in horseshoe bats for many decades. Because 3SEQ is the most statistically powerful of the mosaic methods (ref. 61), we used it to identify the best-supported breakpoint history for each potential child (recombinant) sequence in the dataset. COVID-19 lineage names can be confusing to navigate; there are many aliases and if you want to catch them all to examine further in data analyses it helps to ... Temporal signal was tested using a recently developed marginal likelihood estimation procedure (ref. 41; Supplementary Table 1). Uncertainty measures are shown in Extended Data Fig. For the current pandemic, the novel pathogen identification component of outbreak response delivered on its promise, with viral identification and rapid genomic analysis providing a genome sequence and confirmation, within weeks, that the December 2019 outbreak first detected in Wuhan, China was caused by a coronavirus (ref. 3). The fact that they are geographically relatively distant is in agreement with their somewhat distant TMRCA, because the spatial structure suggests that migration between their locations may be uncommon.
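The prior construction described above (a prior mean taken from an empirical rate estimate, with the standard deviation set ten times higher than the empirical value) can be illustrated with a short sketch. The text does not say how the empirical standard deviations were obtained, so the step below, which approximates the posterior standard deviation from the width of the 95% HPD interval under a normal approximation, is an assumption made purely for illustration; the function name `normal_rate_prior` is likewise hypothetical. The input numbers are the MERS-CoV and HCoV-OC43 rate estimates quoted elsewhere in the text.

```python
from typing import Tuple


def normal_rate_prior(
    mean: float, hpd_low: float, hpd_high: float, inflation: float = 10.0
) -> Tuple[float, float]:
    """Return (prior_mean, prior_sd) for a normal prior on the evolutionary rate.

    Assumption: the empirical posterior is roughly normal, so its standard
    deviation is about (95% interval width) / (2 * 1.96). The prior standard
    deviation is that value multiplied by `inflation`.
    """
    if hpd_high <= hpd_low:
        raise ValueError("hpd_high must exceed hpd_low")
    empirical_sd = (hpd_high - hpd_low) / (2 * 1.96)
    return mean, inflation * empirical_sd


if __name__ == "__main__":
    # Rates in substitutions per site per year, as quoted in the text.
    print("MERS-CoV-based prior:", normal_rate_prior(0.00078, 0.00063, 0.00092))
    print("HCoV-OC43-based prior:", normal_rate_prior(0.00024, 0.00019, 0.00029))
```

Inflating the standard deviation in this way keeps the prior centred on a plausible long-term rate while leaving the data room to move the posterior.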
We used an uncorrelated relaxed clock model with log-normal distribution for all datasets, except for the low-diversity SARS data for which we specified a strict molecular clock model. Due to the absence of temporal signal in the sarbecovirus datasets, we used informative prior distributions on the evolutionary rate to estimate divergence dates. Evolutionary rate estimation can be profoundly affected by the presence of recombination (ref. 50). MERS-CoV data were subsampled to match sample sizes with SARS-CoV and HCoV-OC43. After removal of A1 and A4, we named the new region A′. However, on closer inspection, the relative divergences in the phylogenetic tree (Fig. Early detection via genomics was not possible during Southeast Asia's initial outbreaks of avian influenza H5N1 (1997 and 2003–2004) or the first SARS outbreak (2002–2003). These datasets were subjected to the same recombination masking approach as NRA3 and were characterized by a strong temporal signal (Fig. The authors declare no competing interests. Its genome is closest to that of severe acute respiratory syndrome-related coronaviruses from horseshoe bats, and its receptor-binding domain is closest to that of pangolin viruses.
Indeed, the rates reported by these studies are in line with the short-term SARS rates that we estimate (Fig. Unfortunately, a response that would achieve containment was not possible. The genetic distances between SARS-CoV-2 and RaTG13 (bottom) demonstrate that their relationship is consistent across all regions except for the variable loop. While there is involvement of other mammalian species (specifically pangolins for SARS-CoV-2) as a plausible conduit for transmission to humans, there is no evidence that pangolins are facilitating adaptation to humans. Five example sequences with incongruent phylogenetic positions in the two trees are indicated by dashed lines. Genetic lineages of SARS-CoV-2 have been emerging and circulating around the world since the beginning of the COVID-19 pandemic. While such models have recently been made available, we lack the information to calibrate the rate decline over time (for example, through internal node calibrations (ref. 44)). Pink, green and orange bars show BFRs, with region A (nt 13,291–19,628) showing two trimmed segments yielding region A′ (nt 13,291–14,932, 15,405–17,162, 18,009–19,628). In the presence of time-dependent rate variation, a widely observed phenomenon for viruses (refs. 43, 44, 52), slower prior rates appear more appropriate for sarbecoviruses that currently encompass a sampling time range of about 18 years.
We call this approach breakpoint-conservative, but note that this has the opposite effect to the construction of NRR1 in that this approach is the most likely to allow breakpoints to remain inside putative non-recombining regions. Evolutionary origins of the SARS-CoV-2 sarbecovirus lineage responsible for the COVID-19 pandemic, https://doi.org/10.1038/s41564-020-0771-4. There are outstanding evolutionary questions on the recent emergence of human coronavirus SARS-CoV-2 including the role of reservoir species, the role of recombination and its time of divergence from animal viruses. 1a-c), has the third-highest number of confirmed COVID-19 cases in the state of São Paulo. Our third approach involved identifying breakpoints and masking minor recombinant regions (with gaps, which are treated as unobserved characters in probabilistic phylogenetic approaches). 4), that region and shorter BFRs were not included in combined putative non-recombinant regions.