Electrical and Computer Engineering Faculty Publication Series

Benchmarking Small-Dataset Structure-Activity-Relationship Models for Prediction of Wnt Signaling Inhibition

Mahtab Kokabi, University of Massachusetts Amherst
Matthew Donnelly, University of Massachusetts Amherst
Guangyu Xu, University of Massachusetts Amherst

Publication Date

2020

Journal or Book Title

IEEE Access

Abstract

Quantitative structure-activity relationship (QSAR) models based on machine learning algorithms are powerful tools for expediting drug discovery and therapeutics development. Given the cost of acquiring large training datasets, it is useful to examine whether QSAR analysis can reasonably predict drug activity with only a small dataset (size < 100) and to benchmark such small-dataset QSAR models in application-specific studies. To this end, we present a systematic benchmarking study of small-dataset QSAR models built to predict effective Wnt signaling inhibitors, which are essential to therapeutics development for prevalent human diseases (e.g., cancer). Specifically, we examined a total of 72 two-dimensional (2D) QSAR models based on 4 best-performing algorithms, 6 commonly used molecular fingerprints, and 3 typical fingerprint lengths. We trained these models on a training dataset (56 compounds), benchmarked their performance on 4 figures-of-merit (FOMs), and examined their prediction accuracy on an external validation dataset (14 compounds). Our data show that model performance is maximized when: 1) molecular fingerprints are selected to provide sufficient, unique, and not overly detailed representations of the chemical structures of drug compounds; 2) algorithms are selected to reduce the number of false predictions caused by class imbalance in the dataset; and 3) models are selected to reach balanced performance on all 4 FOMs. These results may provide general guidelines for developing high-performance small-dataset QSAR models for drug activity prediction.
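The benchmarking design summarized in the abstract can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' code. The abstract gives counts (4 algorithms, 6 fingerprints, 3 lengths, 56 training and 14 validation compounds) but does not name the algorithms, fingerprints, lengths, or the 4 FOMs, so every label below is a placeholder. The four metrics (accuracy, sensitivity, specificity, Matthews correlation coefficient) are assumed stand-ins, and the 3-active / 11-inactive split of the validation set is hypothetical, not data from the paper.

```python
from itertools import product
import math

# Placeholder labels: the abstract states only how many of each were used.
ALGORITHMS = [f"algorithm_{i}" for i in range(1, 5)]       # 4 algorithms
FINGERPRINTS = [f"fingerprint_{i}" for i in range(1, 7)]   # 6 fingerprint types
LENGTHS = [f"length_{i}" for i in range(1, 4)]             # 3 fingerprint lengths


def model_grid():
    """Enumerate every (algorithm, fingerprint, length) combination."""
    return list(product(ALGORITHMS, FINGERPRINTS, LENGTHS))


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = active)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def summary_metrics(y_true, y_pred):
    """Four assumed metrics; the paper's actual FOMs may differ."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# 4 x 6 x 3 combinations reproduce the 72 models mentioned in the abstract.
print(len(model_grid()))

# Hypothetical imbalanced validation set of 14 compounds (3 active, 11 inactive)
# scored against a trivial model that predicts "inactive" for everything.
y_true = [1] * 3 + [0] * 11
y_pred = [0] * 14
print(summary_metrics(y_true, y_pred))
```

In this made-up case the trivial model scores reasonably on accuracy while its sensitivity and MCC are zero, which is one way to see why the abstract recommends judging models on balanced performance across several FOMs rather than on a single number.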

DOI

https://doi.org/10.1109/ACCESS.2020.3046190

Pages

228831-228840

Volume

License

UMass Amherst Open Access Policy

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Kokabi, Mahtab; Donnelly, Matthew; and Xu, Guangyu, "Benchmarking Small-Dataset Structure-Activity-Relationship Models for Prediction of Wnt Signaling Inhibition" (2020). IEEE Access. 1200.
https://doi.org/10.1109/ACCESS.2020.3046190

ScholarWorks@UMass Amherst