National Formosa University

Data-Driven and Task-Specific Scoring Functions for Predicting Ligand Binding Poses and Affinity and for Screening Enrichment.

Record type:	Bibliographic - language material, manuscript : Monograph/item
Title proper/Author:	Data-Driven and Task-Specific Scoring Functions for Predicting Ligand Binding Poses and Affinity and for Screening Enrichment.
Author:	Ashtawy, Hossam M.
Physical description:	1 online resource (205 pages)
Note:	Source: Dissertation Abstracts International, Volume: 78-12(E), Section: B.
Subject:	Electrical engineering.
Electronic resource:	click for full text (PQDT)
ISBN:	9780355086249

Thesis (Ph.D.)--Michigan State University, 2017.

Includes bibliographical references

Molecular modeling has become an essential tool to assist in early stages of drug discovery and development. Molecular docking, scoring, and virtual screening are three such modeling tasks of particular importance in computer-aided drug discovery. They are used to computationally simulate the interaction between small drug-like molecules, known as ligands, and a target protein whose activity is to be altered. Scoring functions (SF) are typically employed to predict the binding conformation (docking task), binary activity label (screening task), and binding affinity (scoring task) of ligands against a critical protein in the disease's pathway. In most molecular docking software packages available today, a generic binding affinity-based (BA-based) SF is invoked for the three tasks to solve three different, but related, prediction problems. The vast majority of these predictive models are knowledge-based, empirical, or force-field scoring functions. The fourth family of SFs that has gained popularity recently and showed potential of improved accuracy is based on machine-learning (ML) approaches. Despite intense efforts in developing conventional and current ML SFs, their limited predictive accuracies in these three tasks have been a major roadblock toward cost-effective drug discovery. Therefore, in this work we present (i) novel task-specific and multi-task SFs employing large ensembles of deep neural networks (NN) and other state-of-the-art ML algorithms in conjunction with (ii) data-driven multi-perspective descriptors (features) for accurate characterization of protein-ligand complexes (PLCs) extracted using our Descriptor Data Bank (DDB) platform.

Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2018

Mode of access: World Wide Web

ISBN: 9780355086249

Subjects--Topical Terms:
596380 Electrical engineering.

Index Terms--Genre/Form:
554714 Electronic books.

LDR:05649ntm a2200325K 4500
001 914369
005 20180703084808.5
006 m o u
007 cr mn||||a|a||
008 190606s2017 xx obm 000 0 eng d
020 $a 9780355086249
035 $a (MiAaPQ)AAI10602187
035 $a (MiAaPQ)grad.msu:15404
035 $a AAI10602187
040 $a MiAaPQ $b eng $c MiAaPQ
100 1 $a Ashtawy, Hossam M. $3 1187602
245 1 0 $a Data-Driven and Task-Specific Scoring Functions for Predicting Ligand Binding Poses and Affinity and for Screening Enrichment.
264 0 $c 2017
300 $a 1 online resource (205 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertation Abstracts International, Volume: 78-12(E), Section: B.
500 $a Adviser: Nihar Mahapatra.
502 $a Thesis (Ph.D.)--Michigan State University, 2017.
504 $a Includes bibliographical references
520 $a Molecular modeling has become an essential tool to assist in early stages of drug discovery and development. Molecular docking, scoring, and virtual screening are three such modeling tasks of particular importance in computer-aided drug discovery. They are used to computationally simulate the interaction between small drug-like molecules, known as ligands, and a target protein whose activity is to be altered. Scoring functions (SF) are typically employed to predict the binding conformation (docking task), binary activity label (screening task), and binding affinity (scoring task) of ligands against a critical protein in the disease's pathway. In most molecular docking software packages available today, a generic binding affinity-based (BA-based) SF is invoked for the three tasks to solve three different, but related, prediction problems. The vast majority of these predictive models are knowledge-based, empirical, or force-field scoring functions. The fourth family of SFs that has gained popularity recently and showed potential of improved accuracy is based on machine-learning (ML) approaches. Despite intense efforts in developing conventional and current ML SFs, their limited predictive accuracies in these three tasks have been a major roadblock toward cost-effective drug discovery. Therefore, in this work we present (i) novel task-specific and multi-task SFs employing large ensembles of deep neural networks (NN) and other state-of-the-art ML algorithms in conjunction with (ii) data-driven multi-perspective descriptors (features) for accurate characterization of protein-ligand complexes (PLCs) extracted using our Descriptor Data Bank (DDB) platform.
520 $a We assess the docking, screening, scoring, and ranking accuracies of the proposed task-specific SFs with DDB descriptors as well as several conventional approaches in the context of the 2007 and 2014 PDBbind benchmark that encompasses a diverse set of high-quality PLCs. Our approaches substantially outperform conventional SFs based on BA and single-perspective descriptors in all tests. In terms of scoring accuracy, we find that the ensemble NN SFs, BsN-Score and BgN-Score, have more than 34% better correlation (0.844 and 0.840 vs. 0.627) between predicted and measured BAs compared to that achieved by X-Score, a top performing conventional SF. We further find that ensemble NN models surpass SFs based on other state-of-the-art ML algorithms. Similar results have been obtained for the ranking task. Within clusters of PLCs with different ligands bound to the same target protein, we find that the best ensemble NN SF is able to rank the ligands correctly 64.6% of the time compared to 57.8% obtained by X-Score. A substantial improvement in the docking task has also been achieved by our proposed docking-specific SFs. We find that the docking NN SF, BsN-Dock, has a success rate of 95% in identifying poses that are within 2 Å RMSD from the native poses of 65 different protein families. This is in comparison to a success rate of only 82% achieved by the best conventional SF, ChemPLP, employed in the commercial docking software GOLD. As for the ability to distinguish active molecules from inactives, our screening-specific SFs showed excellent improvements over the conventional approaches. The proposed SF BsN-Screen achieved a screening enrichment factor of 33.90 as opposed to 19.54 obtained from the best conventional SF, GlideScore, employed in the docking software Glide. For all tasks, we observed that the proposed task-specific SFs benefit more than their conventional counterparts from increases in the number of descriptors and training PLCs. They also perform better on novel proteins that they were never trained on before. In addition to the three task-specific SFs, we propose a novel multi-task deep neural network (MT-Net) that is trained on data from three tasks to simultaneously predict binding poses, affinities, and activity labels. MT-Net is composed of shared hidden layers for the three tasks to learn common features, task-specific hidden layers for higher feature representation, and three outputs for the three tasks. We show that the performance of MT-Net is superior to conventional SFs and competitive with other ML approaches. Based on current results and potential improvements, we believe our proposed ideas will have a transformative impact on the accuracy and outcomes of molecular docking and virtual screening.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538 $a Mode of access: World Wide Web
650 4 $a Electrical engineering. $3 596380
655 7 $a Electronic books. $2 local $3 554714
690 $a 0544
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a Michigan State University. $b Electrical Engineering - Doctor of Philosophy. $3 1187603
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10602187 $z click for full text (PQDT)
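
Illustrative note (not part of the catalog record): the abstract describes MT-Net as shared hidden layers, followed by task-specific hidden layers, followed by one output per task. The sketch below shows only that layer layout as an untrained forward pass in plain Python. It is not the dissertation's implementation: the class name `MultiTaskNet`, the layer sizes, the ReLU and sigmoid activations, the random initial weights, and the reading of each output (a pose-quality score, a binding-affinity value, and an activity probability) are all assumptions made for illustration.

```python
import math
import random


def relu(z):
    return max(0.0, z)


def identity(z):
    return z


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(rng, n_in, n_out):
    """Return (weights, biases) for a fully connected layer with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, weights, biases, activation):
    """Apply one fully connected layer: activation(W x + b)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


class MultiTaskNet:
    """Hypothetical layout: one shared layer, then a hidden layer and an output per task."""

    # Output activation per task (assumed): real-valued scores for docking and
    # scoring, a probability-like value for the binary screening label.
    TASKS = {"docking": identity, "scoring": identity, "screening": sigmoid}

    def __init__(self, n_features, n_shared=8, n_task=4, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(rng, n_features, n_shared)
        self.heads = {
            task: (init_layer(rng, n_shared, n_task), init_layer(rng, n_task, 1))
            for task in self.TASKS
        }

    def forward(self, descriptors):
        # The shared layer sees the same descriptors for all three tasks.
        common = dense(descriptors, *self.shared, relu)
        outputs = {}
        for task, (hidden, out) in self.heads.items():
            specific = dense(common, *hidden, relu)  # task-specific features
            outputs[task] = dense(specific, *out, self.TASKS[task])[0]
        return outputs


# Five made-up descriptor values standing in for one protein-ligand complex.
example_descriptors = [0.1, 0.2, 0.3, 0.4, 0.5]
predictions = MultiTaskNet(n_features=5).forward(example_descriptors)
```

Here `predictions` is a dictionary with one value per task. Because the weights are random and untrained, the numbers carry no meaning. The point is only that the three heads branch from a single shared representation, which is what would let training data from one task inform the others.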