METAFUNCTIONS

Environmental and metagenomics – a bioinformatics system to detect and assign functions to habitat-specific gene patterns

A research project funded by the European Commission within Framework Programme 6 under the NEST – Newly Emerging Science and Technology Adventure initiative.

The METAFUNCTIONS project, which started on 1 October 2005, is pooling expertise in bioinformatics, computer science, geographical information systems and marine sciences to develop a data-mining system that correlates genetic patterns in genomes and metagenomes with contextual environmental data. This innovative tool may enable scientists to infer functions and activities for sequenced hypothetical genes, thus providing a wealth of information about niche adaptations as well as new enzymes and proteins for medical and industrial use.

The development of the METAFUNCTIONS approach is only possible through the integration of the diverse expertise of four European institutions in Germany, Switzerland and Poland. This combination has the potential to produce a technology with broad application and high potential pay-off.

In the last seven years, more than 300 microbial genomes have been successfully sequenced, while over 600 are currently in progress. So far, researchers have largely focused on bacteria that are medically important; environmentally important organisms, e.g. those involved in methane production and consumption, have not received the same attention. As it is difficult to culture ecologically relevant bacteria for genomic sequencing under laboratory conditions, scientists often take DNA samples directly from the environment instead. Sequences of these samples are known as metagenomes: not the genome of a single organism, but the genetic composition of a particular environment.

Exploring new territory

A wealth of metagenome information is emerging, but the tools to analyse it are seriously lacking. METAFUNCTIONS will therefore develop a novel data-mining system that can identify relationships between sequenced genes and their environmental and ecological context. The ultimate aim is to determine the function of as yet uncharacterised genes, known as hypothetical genes. For this purpose, a ‘Genomes MapServer’ is under construction, which will soon allow scientists around the world to access integrated genomic and ecological data and clearly visualise the results of their analyses.

 

Objectives

METAFUNCTIONS is a “high risk/high impact” proposal that creates a new scientific basis cutting across the life and environmental sciences in a highly innovative approach.

The overall objective of this proposal is to assign possible functions to genes that lack functional assignments (hypothetical genes) obtained from genome and metagenome projects. Given high genome plasticity and lateral gene transfer, combined with the fact that organisms can only survive in a given environment if they have the appropriate genetic equipment, there should be natural selection for habitat-specific genes. This means that if these gene patterns can be identified, their function can be predicted by correlating them with habitat-specific parameters.

Further objectives are: establishing a new kind of multidisciplinarity in which microbiologists, molecular ecologists, biogeochemists, bioinformaticians and GIS specialists collaborate; and developing a new basic technology to systematically store, analyse and evaluate ecological and genomic data in conjunction. The result will be the new ‘Genomes MapServer’, which integrates genomic and metagenomic data into a consistent, curated database that includes geographic and ecological context. It is intended as a valuable resource for the scientific community for data mining, analysis and targeted lab experiments within diverse sectors of European relevance.

Thus, METAFUNCTIONS will stabilise and extend the strategic position of Europe in environmental genomics by creating new scientific knowledge and technical capability in the emerging field of metagenomics.
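The idea of predicting function by correlation with habitat-specific parameters can be illustrated with a small sketch. This is not the METAFUNCTIONS system itself. It assumes, purely for illustration, that each gene is recorded as present (1) or absent (0) in a set of environmental samples, that one habitat parameter has been measured per sample, and that a plain Pearson correlation is an adequate score. The gene names, the methane values and the function names are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # No variation: the coefficient is undefined, so report "no signal".
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_genes_by_habitat_parameter(presence, parameter):
    """Rank genes by how strongly their presence tracks a habitat parameter.

    presence:  dict mapping gene id -> list of 0/1 flags, one per sample
    parameter: list of measurements (one per sample, same order)
    Returns (gene id, correlation) pairs, strongest absolute correlation first.
    """
    scores = {gene: pearson(flags, parameter) for gene, flags in presence.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy data: five samples with invented methane concentrations.
methane = [0.1, 0.2, 5.0, 6.5, 7.0]
presence = {
    "hypothetical_gene_A": [0, 0, 1, 1, 1],  # found only in methane-rich samples
    "hypothetical_gene_B": [1, 0, 1, 0, 1],  # no obvious pattern
    "hypothetical_gene_C": [1, 1, 1, 1, 1],  # present everywhere
}

ranking = rank_genes_by_habitat_parameter(presence, methane)
for gene, r in ranking:
    print(f"{gene}: r = {r:+.2f}")
```

In this toy data, the gene found only in the methane-rich samples ranks first, which would make it a candidate for a role in methane metabolism. A gene present in every sample carries no habitat signal under this score. A real analysis would need many more samples, correction for multiple testing and experimental follow-up.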

Background

In recent years, more than 300 microbial genomes have been successfully sequenced, while over 600 are currently in progress. So far, researchers have largely focused on bacteria that are medically important; ‘environmentally important’ organisms (e.g. those involved in methane production and consumption) have not received the same attention.

As it is difficult to culture ecologically relevant bacteria for genomic sequencing under laboratory conditions, scientists often take DNA samples directly from the environment instead. Sequences of these samples are known as metagenomes: not the genome of a single organism, but the genetic make-up of a particular environment.