MolCompass: multi-tool for the navigation in chemical space and visual validation of QSAR/QSPR models

Journal of Cheminformatics
2024
Sosnin Sergey
Sosnin, S.
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-024-00888-z
DOI: 10.1186/s13321-024-00888-z

Abstract

The exponential growth of data outpaces the human ability to analyze it. In chemistry in particular, there is a demand for tools that visualize molecular datasets in a convenient graphical way. We propose a new, ready-to-use, open-source multi-tool framework for visualizing and navigating chemical space. The framework adheres to the low-code/no-code (LCNC) paradigm and provides a KNIME node, a web-based tool, and a Python package, making it accessible to a broad cheminformatics community. At its core, MolCompass uses a pre-trained parametric t-SNE model. We demonstrate how the framework can be adapted for visualizing chemical space and for visual validation of binary classification QSAR/QSPR models, revealing their weaknesses and identifying model cliffs. All parts of the framework are publicly available on GitHub.
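
The idea of visually validating a binary classifier can be illustrated with a minimal sketch. It assumes each compound already has 2D map coordinates, an experimental class label, and a model-predicted probability; the `Compound` record, the `prediction_error` and `flag_suspect` helpers, the 0.5 threshold, and the toy values are all hypothetical and are not part of the MolCompass API. The per-compound error computed here is the kind of quantity one might map to point color on a chemical space plot.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    smiles: str
    x: float      # map coordinate (assumed to come from a 2D projection)
    y: float      # map coordinate
    label: int    # experimental class, 0 or 1
    prob: float   # model-predicted probability of class 1


def prediction_error(c: Compound) -> float:
    """Absolute gap between the true class and the predicted probability."""
    return abs(c.label - c.prob)


def flag_suspect(compounds, threshold=0.5):
    """Return compounds whose error exceeds a hypothetical threshold."""
    return [c for c in compounds if prediction_error(c) > threshold]


# Toy data: the first two compounds sit close together on the map,
# but the model is badly wrong for only one of them.
compounds = [
    Compound("CCO", 0.10, 0.20, 0, 0.05),
    Compound("CCN", 0.12, 0.22, 1, 0.10),
    Compound("c1ccccc1", 0.80, 0.75, 1, 0.95),
]

suspects = flag_suspect(compounds)
for c in suspects:
    print(c.smiles, round(prediction_error(c), 2))
```

Compounds flagged this way are only candidates for closer inspection; whether they correspond to genuine model weaknesses depends on the surrounding region of the map and on the quality of the experimental labels.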