For several years now, there has been exponential growth in the amount of life science data (e.g., sequenced complete genomes, 3D structures, DNA chips, mass spectroscopy data), most of which are generated by high-throughput experiments. This exponentially growing corpus of data is stored and made available through a large number of databases and resources over the Web, but unfortunately still with a high degree of semantic heterogeneity and varying levels of quality. These data must be combined and processed by bioinformatics tools deployed on powerful and efficient platforms to permit the uncovering of patterns and similarities, and in general to help in the process of discovery. Analyzing complex, voluminous, and heterogeneous data and guiding the analysis of data are thus of paramount importance and necessitate the involvement of data integration techniques. DILS 2008 was the fifth in a workshop series that aims at fostering discussion, exchange, and innovation in research and development in the area of data integration for the life sciences. Each previous DILS workshop attracted around 100 researchers from all over the world and saw an increase in submitted papers over the preceding one. This year was no exception, and the number of submitted papers increased to 54. The Program Committee selected 18 of them. The selected papers cover a wide spectrum of theoretical and practical issues including data annotation, Semantic Web for the life sciences, and data mining on integrated biological data.
This book constitutes the refereed proceedings of the 5th International Workshop on Data Integration in the Life Sciences, DILS 2008, held in Evry, France in June 2008.
The 18 revised full papers presented together with 3 keynote talks and a tutorial paper were carefully reviewed and selected from 54 submissions. The papers address all current issues in data integration and data management from the life science point of view and are organized in topical sections on Semantic Web for the life sciences, designing and evaluating architectures to integrate biological data, new architectures and experience on using systems, systems using technologies from the Semantic Web for the life sciences, mining integrated biological data, and new features of major resources for biomolecular data.