Text mining for biology

Talk abstracts



Genie: literature-based gene prioritization at multi genomic scale
Jean-Fred Fontaine,

The Medline database contains millions of freely available biomedical citations. Its PubMed interface, connected to genomic resources such as Entrez Gene, allows biologists and biomedical scientists to ‘manually’ evaluate the relevance of a gene to a research topic. However, many genes, especially those from poorly studied organisms, are not discussed in the literature. Moreover, the sheer number of genes and abstracts prevents a scientist from comprehensively summarizing the literature attached to the genes of an organism. With the ‘Génie’ data and text mining algorithms, it is possible to systematically analyze all the literature attached to all the genes of an organism, including data from orthologs. The algorithm is easy to use because it requires no expert knowledge, produces results in a short time, and is very precise. It applies a Bayesian classifier to abstract words and automatically identifies the most discriminative words for a given topic. Its implementation relies on extensive pre-computations to return results in a few seconds. Its classification performance exceeds that of existing tools, especially when high sensitivity is required, and can reach 100% precision. Finally, the power of this approach is demonstrated by ranking zebrafish genes, for which few scientific abstracts are available, and comparing the ranking to a high-throughput molecular biology dataset. The Génie web server is accessible from: http://cbdm.mdc-berlin.de/tools/genie/.
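The idea of scoring abstracts with a Bayesian classifier over words, then ranking genes by the scores of their abstracts, can be illustrated with a minimal sketch. This is not the Génie implementation: the function names, the Laplace smoothing, the mean-score aggregation, and the toy data are all assumptions made for illustration, and orthologs and pre-computation are left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def train_word_log_odds(topic_abstracts, background_abstracts, smoothing=1.0):
    """Return log P(word | topic) - log P(word | background) for each word.

    Strongly positive values mark words that discriminate in favour of the topic.
    Laplace smoothing is an assumption of this sketch.
    """
    topic = Counter(w for a in topic_abstracts for w in tokenize(a))
    background = Counter(w for a in background_abstracts for w in tokenize(a))
    vocab = set(topic) | set(background)
    topic_total = sum(topic.values()) + smoothing * len(vocab)
    background_total = sum(background.values()) + smoothing * len(vocab)
    return {
        w: math.log((topic[w] + smoothing) / topic_total)
        - math.log((background[w] + smoothing) / background_total)
        for w in vocab
    }


def score_abstract(abstract, log_odds):
    """Sum the log-odds of the words in an abstract; unseen words contribute 0."""
    return sum(log_odds.get(w, 0.0) for w in tokenize(abstract))


def rank_genes(gene_to_abstracts, log_odds):
    """Rank genes by the mean score of their abstracts (hypothetical aggregation).

    Genes with no abstracts are skipped.
    """
    scores = {
        gene: sum(score_abstract(a, log_odds) for a in abstracts) / len(abstracts)
        for gene, abstracts in gene_to_abstracts.items()
        if abstracts
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy data, invented for illustration only.
topic_abstracts = ["heart development cardiac muscle", "cardiac valve heart defect"]
background_abstracts = ["kidney filtration renal tubule", "liver enzyme metabolism"]
gene_to_abstracts = {
    "geneA": ["cardiac muscle contraction in heart"],
    "geneB": ["renal tubule transport"],
    "geneC": [],
}

log_odds = train_word_log_odds(topic_abstracts, background_abstracts)
ranking = rank_genes(gene_to_abstracts, log_odds)
print(ranking)
```

In this sketch a gene without abstracts simply drops out of the ranking, which is the gap the abstract says Génie addresses by bringing in literature from orthologs.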



Automated vocabulary discovery for geo-parsing online epidemic intelligence
Mikaela Keller, INRIA Mostrare and Université Lille 3, Lille

Automated surveillance of the Internet provides a timely and sensitive method for alerting on global emerging infectious disease threats. HealthMap is part of a new generation of online systems designed to monitor and visualize, in real time, disease outbreak alerts as reported by online news media and public health sources. HealthMap is of specific interest to national and international public health organizations and to international travelers. A particular task that makes such surveillance useful is the automated discovery of the geographic references contained in the retrieved outbreak alerts. This task is sometimes referred to as "geo-parsing". A typical approach to geo-parsing would demand an expensive training corpus of alerts manually tagged by a human. Given that human readers perform this kind of task by using both their lexical and contextual knowledge, we developed an approach that relies on a relatively small expert-built gazetteer, thus limiting the need for human input, and that focuses on learning the context in which geographic references appear. We show in a set of experiments that this approach exhibits a substantial capacity to discover geographic locations outside of its initial lexicon.
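The principle of starting from a small gazetteer and learning the contexts in which place names occur can be sketched as follows. The abstract does not specify the learning model, so this is a deliberately simplified stand-in: it uses only the word immediately before a token as context, and every function name, the threshold, and the toy alerts are assumptions made for illustration.

```python
import re
from collections import Counter


def tokens(text):
    """Split a text into alphabetic tokens, keeping their capitalization."""
    return re.findall(r"[A-Za-z]+", text)


def learn_context_scores(alerts, gazetteer):
    """Estimate, for each preceding word, how often the next token is a known place.

    The score is count(word followed by a gazetteer entry) / count(word followed
    by any token). Using only the preceding word is an assumption of this sketch.
    """
    before_place = Counter()
    before_any = Counter()
    for alert in alerts:
        toks = tokens(alert)
        for prev, cur in zip(toks, toks[1:]):
            before_any[prev.lower()] += 1
            if cur in gazetteer:
                before_place[prev.lower()] += 1
    return {w: before_place[w] / before_any[w] for w in before_place}


def discover_locations(alert, gazetteer, context_scores, threshold=0.5):
    """Return capitalized tokens absent from the gazetteer that follow a
    word whose learned context score reaches the (hypothetical) threshold."""
    toks = tokens(alert)
    return [
        cur
        for prev, cur in zip(toks, toks[1:])
        if cur[0].isupper()
        and cur not in gazetteer
        and context_scores.get(prev.lower(), 0.0) >= threshold
    ]


# Toy gazetteer and alerts, invented for illustration only.
gazetteer = {"Kenya", "Brazil", "Vietnam"}
training_alerts = [
    "Cholera outbreak reported in Kenya",
    "Dengue cases rise in Brazil",
    "Avian flu confirmed in Vietnam near the border",
]

context_scores = learn_context_scores(training_alerts, gazetteer)
new_places = discover_locations(
    "Measles outbreak reported in Samoa", gazetteer, context_scores
)
print(new_places)
```

Here "Samoa" is not in the gazetteer but appears after "in", a context that the training alerts associate with place names, so it is proposed as a new location. A realistic system would use richer context features and a trained classifier.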



Introduction to IPA: a tool for exploring biological mechanisms
Myriam Boutet, Ingenuity

Biologists need sophisticated tools to analyze their data, test hypotheses, and make decisions throughout the experimental cycle. IPA is software widely adopted by the life-science research community for analyzing and understanding complex biological and chemical systems. IPA helps researchers understand biology at multiple levels by integrating data from a variety of experimental platforms and by providing insight into molecular and chemical interactions, cellular phenotypes, and disease processes. The Ingenuity knowledge base contains information on genes, proteins, chemicals, drugs, and molecular relationships, which is used to build biological models. IPA provides the appropriate biological context to support decision-making and project design and to advance research.
