Rare microbial relict sheds light on an ancient eukaryotic supergroup
