Development and application of computer-aided strategies for virtual screening and hit-to-lead

Author

Perez Lopez, Carles

Director

Guallar i Tasies, Victor

Tutor

Gelpi Buchaca, Josep Lluís

Date of defense

2022-07-13

Pages

193 p.



Department/Institute

Universitat de Barcelona. Facultat de Biologia

Abstract

The growing use of supercomputers to solve complex problems has motivated the adoption of computational modeling tools in drug design pipelines. In the hit-to-lead phase in particular, small low-affinity compounds are modified with multiple decorators to obtain more potent compounds. Techniques such as docking quickly classify millions of candidates as active or inactive, but their accuracy tends to drop when ranking ligand potencies. More expensive methods such as free energy perturbation (FEP) are more precise; however, their time consumption limits their application in hit-to-lead campaigns. This thesis aims to implement and test novel methodologies that are intermediate in both computational cost and accuracy, targeting the hit-to-lead stage. We developed FragPELE, a novel ligand-growing method integrated with PELE, an unconventional Monte Carlo sampling algorithm. FragPELE introduces a new concept: progressively expanding a small, atom-sized moiety (fragment) within PELE simulations, adapting the protein binding site to the newly grown R-group. Structural and scoring benchmarks showed accurate geometrical predictions and correlation with relative free energies, at a reasonable cost in time and resources. In addition, we combined FragPELE with the recently developed aquaPELE algorithm to expand fragments in hydrated binding sites. The results showed improved accuracy when introducing the mixed implicit/explicit solvent models integrated within aquaPELE. We also took part in a collaborative project with Almirall, in which we assessed FragPELE in two prospective hit-to-lead studies. One of them ended with the synthesis of an improved version of the initial hit. In the other, the method showed good predictive power in classifying non-terminal R-groups on 27 new compounds (not reported in the literature).
Finally, we optimized virtual screening pipelines by integrating machine learning analysis with simulated data, training, testing, and validating the designed classification models with external experimental sets. From 785 compounds, Almirall purchased 23 based on our results. Two of them showed inhibition of the target, one in the nM range of activity.

Keywords

Disseny de medicaments; Diseño de medicamentos; Drug design; Bioinformàtica; Bioinformática; Bioinformatics; Aprenentatge automàtic; Aprendizaje automático; Machine learning

Subjects

577 - Material bases of life. Biochemistry. Molecular biology. Biophysics

Knowledge Area

Experimental Sciences and Mathematics

Note

Doctoral Programme in Biomedicine / Thesis carried out at the Barcelona Supercomputing Center (BSC)

Documents

CPL_PhD_THESIS.pdf

27.06Mb

 

Rights

Access to the contents of this thesis is subject to acceptance of the terms of use established by the following Creative Commons license: http://creativecommons.org/licenses/by-nd/4.0/
