Exposome Data Analysis Toolbox

A new, user-friendly online toolbox to collate and add value to data analysis tools and visualisation methodologies.

LongITools is studying how life-course exposure to environmental factors (e.g. air and noise pollution and the built environment) and lifestyle factors interacts with genetic factors to contribute to the risk of developing chronic cardiovascular and metabolic diseases. To undertake this holistic, exposome-based research, the LongITools researchers need to use and develop a number of tools and methodologies to analyse multiple exposome variables held within the project’s data sets*. These data analysis tools and visualisation methodologies include causal inference, machine learning, hierarchical modelling, data visualisation and summary measures, and are developed within environments such as R/RStudio and DataSHIELD.

The LongITools team is developing a new, user-friendly Exposome Data Analysis Toolbox to collate and add value to these data analysis tools and visualisation methodologies. Within this online toolbox, users will be able to search for and use the tools and methodologies. Each tool or methodology will be supported by comprehensive user documentation (e.g. guidelines, videos, example data). The toolbox will be open access, although some functions will require registration (e.g. to run a tool within a private session).

Once developed, this novel toolbox will enable researchers to access and use multiple exposome data analysis tools and methodologies via a single platform. Users will be able to interact with the tools based on their needs and level of expertise. For example, users could view and download the specific code for a particular tool or use the tools directly on the platform to run their analysis. The toolbox will enhance research and support open science by increasing usability and interoperability, defining data analysis standards and reference methods, and facilitating the execution of complex computational pipelines.

*The LongITools data sets include register-based cohorts, prospective birth cohort studies, biobanks, longitudinal studies in adults and randomised controlled trials (RCTs). Data variables include, for example, BMI, diet, physical activity, education and occupation.

Timescale for completion: Winter 2024


Description of Technology Readiness Levels (TRLs)

  • TRL 1 – Basic principles observed
  • TRL 2 – Technology concept formulated
  • TRL 3 – Experimental proof of concept
  • TRL 4 – Technology validated in lab
  • TRL 5 – Technology validated in relevant environment
  • TRL 6 – Technology demonstrated in relevant environment
  • TRL 7 – System prototype demonstration in operational environment
  • TRL 8 – System complete and qualified
  • TRL 9 – Actual system proven in operational environment


Developing and validating prototype

The first prototype version of the Exposome Data Analysis Toolbox has now been developed and includes a couple of data analysis tools and methodologies. The LongITools partners are continuing to add tools and methodologies whilst testing and developing the toolbox’s functionality. 

Toolbox platform in development

A survey was undertaken with LongITools and other European Human Exposome Network (EHEN) project partners to determine the content and functionality of the Exposome Data Analysis Toolbox, and the toolbox platform is now being developed.