University of Limerick

A data reusability assessment in the nanosafety domain based on the NSDRA framework followed by an exploratory quantitative structure activity relationships (QSAR) modeling targeting cellular viability

journal contribution
posted on 2023-07-26, 08:58 authored by Irini Furxhi, Egon Willighagen, Chris Evelo, Anna Costa, Davide Gardini, Ammar Ammar

Introduction: The current effort towards digital transformation across multiple scientific domains requires data that are Findable, Accessible, Interoperable and Reusable (FAIR). In addition to FAIR data, the application of computational tools, such as Quantitative Structure Activity Relationships (QSARs), requires a sufficient data volume and the ability to merge sources into homogeneous digital assets. In the nanosafety domain there is a lack of available FAIR metadata.

Methodology: To address this challenge, we assessed 34 datasets from the nanosafety domain using the NanoSafety Data Reusability Assessment (NSDRA) framework, which allowed the annotation and assessment of each dataset's reusability. From the results of the framework's application, eight datasets targeting the same endpoint (i.e. numerical cellular viability) were selected, processed and merged to test several hypotheses, including universal versus nanogroup-specific QSAR models (metal oxides and nanotubes) and regression versus classification Machine Learning (ML) algorithms.

Results: Universal regression and classification QSARs reached an R² of 0.86 and an accuracy of 0.92, respectively, on the test set. Nanogroup-specific regression models reached an R² of 0.88 on the nanotubes test set, followed by metal oxides (0.78). Nanogroup-specific classification models reached an accuracy of 0.99 on the nanotubes test set, followed by metal oxides (0.91). Feature importance revealed different patterns depending on the dataset, with common influential features including core size, exposure conditions and toxicological assay. Even when the available experimental knowledge was merged, the models still failed to correctly predict the outputs of an unseen dataset, revealing the cumbersome conundrum of scientific reproducibility in realistic applications of QSAR for nanosafety. To harness the full potential of computational tools and ensure their long-term application, embracing FAIR data practices is imperative in driving the development of responsible QSAR models.

Conclusions: This study reveals that the digitalization of nanosafety knowledge in a reproducible manner still has a long way to go before its successful pragmatic implementation. The workflow carried out in the study shows a promising approach to increasing FAIRness across all the elements of computational studies, from dataset annotation, selection and merging to FAIR model reporting. This has significant implications for future research, as it provides an example of how to utilize and report the different tools available in the nanosafety knowledge system while increasing the transparency of the results. One of the main benefits of this workflow is that it promotes data sharing and reuse, which is essential for advancing scientific knowledge, by making data and metadata FAIR compliant. In addition, the increased transparency and reproducibility of the results can enhance the trustworthiness of the computational findings.
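For readers less familiar with the two metrics quoted in the Results, the sketch below shows how R² (regression) and accuracy (classification) are commonly computed on a held-out test set. It is an illustration only and not the authors' code: the viability values are made up, and the 50% cut-off used to turn numerical viability into classes, the class labels and all function names are assumptions introduced here.

```python
from typing import List, Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


def accuracy(labels_true: Sequence[str], labels_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted class matches the true class."""
    hits = sum(t == p for t, p in zip(labels_true, labels_pred))
    return hits / len(labels_true)


def to_class(viability: Sequence[float], threshold: float = 50.0) -> List[str]:
    """Binarize numerical viability (%); the threshold is a hypothetical choice."""
    return ["toxic" if v < threshold else "non-toxic" for v in viability]


# Made-up test-set values (cell viability in %), for illustration only.
viability_true = [90.0, 70.0, 40.0, 20.0]
viability_pred = [85.0, 45.0, 45.0, 30.0]

test_r2 = r2_score(viability_true, viability_pred)
test_accuracy = accuracy(to_class(viability_true), to_class(viability_pred))
print(f"R2 = {test_r2:.3f}, accuracy = {test_accuracy:.2f}")
```

The same predictions can score differently on the two metrics, because accuracy only registers whether a prediction falls on the right side of the chosen threshold.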

Funding

Anticipating Safety Issues at the Design Stage of NAno Product Development

European Commission


Innovative Nanoinformatics models and tools: towards a Solid, verified and Integrated Approach to Predictive (eco)Toxicology (NanoSolveIT)

European Commission


Risk Governance of Nanotechnology

European Commission


History

Publication

NanoImpact 31, 100475

Publisher

Elsevier

Department or School

  • Accounting & Finance
