NASH Data Bank
Experiments at Genfit generate large and/or complex data sets from different technologies (microarray, proteomics, etc.). Our bioinformatics team helps scientists manage and analyze these data sets.
In the NASH field, we have developed the NASH databank: a scientific information system for the long-term storage of our biomedical data and associated results, designed to unite the diverse data sets and to facilitate their extraction and transformation into relevant information.
Capitalizing on our skills and prior experience in biodatabase development (EU Biobridge project, OSEO ITDiab project, …), we have identified the key NASH-related data sources and formats, and determined how to consolidate the data streams.
Based on an in-depth understanding of our scientists' needs for querying and retrieving biomedical data, we have developed a data governance plan (data quality, migration, integration, …) and designed the NASH databank as a system with a three-level architecture:
- multiple databases form the foundations of the system (each one dedicated to the storage of a well-controlled and specialized data source)
- a data integration platform (ETL) that unites all available data sets
- a web portal offering many data views and tools to analyze the data through pre-packaged queries (also known as workflows).
As a result, our scientists and clinicians can analyze data faster and more efficiently, which helps advance NASH studies.
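The three-level architecture described above can be illustrated with a minimal sketch. It is not the actual NASH databank implementation: the table names, column names, sample identifiers, and values below are all hypothetical, and in-memory SQLite databases stand in for the real storage, ETL platform, and web portal.

```python
import sqlite3

# Level 1: one specialized database per data source.
# Schemas and rows are hypothetical toy values, for illustration only.
def make_source(schema, insert_sql, rows):
    db = sqlite3.connect(":memory:")
    db.execute(schema)
    db.executemany(insert_sql, rows)
    return db

microarray_db = make_source(
    "CREATE TABLE expression (sample_id TEXT, gene TEXT, level REAL)",
    "INSERT INTO expression VALUES (?, ?, ?)",
    [("S1", "GENE_A", 2.5), ("S2", "GENE_A", 0.8)],
)
proteomics_db = make_source(
    "CREATE TABLE abundance (sample_id TEXT, protein TEXT, amount REAL)",
    "INSERT INTO abundance VALUES (?, ?, ?)",
    [("S1", "PROT_X", 10.0), ("S2", "PROT_X", 4.0)],
)

# Level 2: ETL step that extracts rows from every source, maps them onto
# one common shape, and loads them into a single integrated store.
def run_etl(sources):
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE measurement "
        "(sample_id TEXT, source TEXT, analyte TEXT, value REAL)"
    )
    for source_name, db, extract_query in sources:
        for sample_id, analyte, value in db.execute(extract_query):
            warehouse.execute(
                "INSERT INTO measurement VALUES (?, ?, ?, ?)",
                (sample_id, source_name, analyte, value),
            )
    warehouse.commit()
    return warehouse

# Level 3: a pre-packaged query ("workflow") that a portal could expose,
# so users do not need to write SQL themselves.
def measurements_for_sample(warehouse, sample_id):
    cursor = warehouse.execute(
        "SELECT source, analyte, value FROM measurement "
        "WHERE sample_id = ? ORDER BY source, analyte",
        (sample_id,),
    )
    return cursor.fetchall()

warehouse = run_etl([
    ("microarray", microarray_db,
     "SELECT sample_id, gene, level FROM expression"),
    ("proteomics", proteomics_db,
     "SELECT sample_id, protein, amount FROM abundance"),
])
```

Calling `measurements_for_sample(warehouse, "S1")` returns that sample's rows from both sources in one result, which is the kind of cross-source view the integration layer is meant to enable.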