000070929 001__ 70929
000070929 005__ 20190709135549.0
000070929 0247_ $$2doi$$a10.3389/fmicb.2018.00771
000070929 0248_ $$2sideral$$a106491
000070929 037__ $$aART-2018-106491
000070929 041__ $$aeng
000070929 100__ $$aVinuesa, P.
000070929 245__ $$aGET_PHYLOMARKERS, a software package to select optimal orthologous clusters for phylogenomics and inferring pan-genome phylogenies, used for a critical geno-taxonomic revision of the genus Stenotrophomonas
000070929 260__ $$c2018
000070929 5060_ $$aAccess copy available to the general public$$fUnrestricted
000070929 5203_ $$aThe massive accumulation of genome-sequences in public databases promoted the proliferation of genome-level phylogenetic analyses in many areas of biological research. However, due to diverse evolutionary and genetic processes, many loci have undesirable properties for phylogenetic reconstruction. These, if undetected, can result in erroneous or biased estimates, particularly when estimating species trees from concatenated datasets. To deal with these problems, we developed GET_PHYLOMARKERS, a pipeline designed to identify high-quality markers to estimate robust genome phylogenies from the orthologous clusters, or the pan-genome matrix (PGM), computed by GET_HOMOLOGUES. In the first context, a set of sequential filters are applied to exclude recombinant alignments and those producing anomalous or poorly resolved trees. Multiple sequence alignments and maximum likelihood (ML) phylogenies are computed in parallel on multi-core computers. A ML species tree is estimated from the concatenated set of top-ranking alignments at the DNA or protein levels, using either FastTree or IQ-TREE (IQT). The latter is used by default due to its superior performance revealed in an extensive benchmark analysis. In addition, parsimony and ML phylogenies can be estimated from the PGM. We demonstrate the practical utility of the software by analyzing 170 Stenotrophomonas genome sequences available in RefSeq and 10 new complete genomes of Mexican environmental S. maltophilia complex (Smc) isolates reported herein. A combination of core-genome and PGM analyses was used to revise the molecular systematics of the genus. An unsupervised learning approach that uses a goodness of clustering statistic identified 20 groups within the Smc at a core-genome average nucleotide identity (cgANIb) of 95.9% that are perfectly consistent with strongly supported clades on the core- and pan-genome trees. In addition, we identified 16 misclassified RefSeq genome sequences, 14 of them labeled as S. maltophilia, demonstrating the broad utility of the software for phylogenomics and geno-taxonomic studies. The code, a detailed manual and tutorials are freely available for Linux/UNIX servers under the GNU GPLv3 license at https://github.com/vinuesa/get_phylomarkers. A docker image bundling GET_PHYLOMARKERS with GET_HOMOLOGUES is available at https://hub.docker.com/r/csicunam/get_homologues/, which can be easily run on any platform.
000070929 536__ $$9info:eu-repo/grantAgreement/ES/CSIC/ARAID/200720I038$$9info:eu-repo/grantAgreement/ES/MINECO/AGL2013-48756-R
000070929 540__ $$9info:eu-repo/semantics/openAccess$$aby$$uhttp://creativecommons.org/licenses/by/3.0/es/
000070929 590__ $$a4.259$$b2018
000070929 591__ $$aMICROBIOLOGY$$b32 / 133 = 0.241$$c2018$$dQ1$$eT1
000070929 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/publishedVersion
000070929 700__ $$aOchoa-Sánchez, L.E.
000070929 700__ $$0(orcid)0000-0002-5462-907X$$aContreras-Moreira, B.$$uUniversidad de Zaragoza
000070929 7102_ $$11002$$2060$$aUniversidad de Zaragoza$$bDpto. Bioq.Biolog.Mol. Celular$$cÁrea Bioquímica y Biolog.Mole.
000070929 773__ $$g9, MAY (2018), 771 [22 pp]$$pFront. microbiol.$$tFRONTIERS IN MICROBIOLOGY$$x1664-302X
000070929 8564_ $$s536454$$uhttp://zaguan.unizar.es/record/70929/files/texto_completo.pdf$$yVersión publicada
000070929 8564_ $$s11447$$uhttp://zaguan.unizar.es/record/70929/files/texto_completo.jpg?subformat=icon$$xicon$$yVersión publicada
000070929 909CO $$ooai:zaguan.unizar.es:70929$$particulos$$pdriver
000070929 951__ $$a2019-07-09-12:11:19
000070929 980__ $$aARTICLE