A PROTAC (PROteolysis TArgeting Chimera) is a heterobifunctional molecule composed of a ligand for the protein of interest (POI), a linker, and a ligand that recruits an E3 ubiquitin ligase. It promotes the formation of a ternary complex (POI-PROTAC-E3) by bringing the ubiquitination machinery into proximity with the POI, driving the transfer of ubiquitin from the E2 enzyme to an exposed lysine on the target protein. Polyubiquitination follows, and the ubiquitinated POI is recognized by the 26S proteasome and degraded into small peptide fragments or even amino acids (Figure 1).
Figure 1. Degradation mechanism of PROTACs on target proteins.
As a novel and promising technique, PROTACs offer several advantages over current treatment methods. First, PROTACs can modulate undruggable targets that lack a classical hydrophobic drug-binding pocket or that bind strongly to endogenous molecules. They can also tackle proteins that function through protein-protein interactions. Because a PROTAC does not need to block catalytic activity or a protein-protein interface, it can use a ligand that binds anywhere on the target protein, even with relatively low affinity. Second, PROTACs act catalytically because they are released from the ternary complex once ubiquitination is complete (Figure 1). Owing to this catalytic nature, PROTACs can be effective at low exposures, reducing the potential for off-target and other undesirable effects. Third, accumulation of the target protein is frequently observed with inhibitor-based methods, because drug binding stabilizes the protein and its transcription is upregulated. This accumulation reduces the efficacy of the inhibitor. PROTACs avoid it because they eliminate the whole protein through the proteasome. For the same reason, PROTACs can modulate nonenzymatic and scaffolding functions and address drug resistance arising from mutations around the binding pocket. Finally, PROTACs can provide improved selectivity among closely related proteins. The active sites of homologous proteins are highly conserved, whereas the sequence and conformation outside the catalytic core may differ greatly. PROTACs can exploit this difference to degrade specific targets, because the ubiquitin transfer step depends on the relative positions of the exposed lysine and ubiquitin. The conformation of the ternary complex is therefore of great significance for the development of potent PROTACs.
The conformation of the ternary complex depends largely on the PROTAC linker, which makes linker design one of the central tasks in PROTAC design. Because target proteins and E3 ligases differ in structure, it is difficult to design a universal linker that suits all cases.
However, the structure-activity relationship of PROTACs remains unclear because experimental ternary structures are scarce and accurate computational structures are difficult to obtain. Moreover, no computational methods have so far been available for the rational design and efficacy evaluation of PROTACs. In practice, medicinal chemists design linkers of various lengths and structures by experience and attach them to known target ligands and E3 ligands to produce candidate PROTACs. These molecules are then synthesized and screened by protein immunoblot analysis or other experimental approaches to identify PROTACs with degradation potency.
To guide the rational design of PROTACs, especially the design of the linker, we proposed a deep learning-based model, DeepPROTACs. Given the structures of the target protein, the E3 ubiquitin ligase, and a designed PROTAC, DeepPROTACs predicts the degradation efficacy of that PROTAC. The data were mainly derived from the PROTAC-DB database, supplemented with additional data collected from the published literature. Based on the half-maximal degradation concentration (DC50) and the maximal degradation (Dmax), we simplified the prediction task to a binary classification problem. To circumvent the intricate modeling of PROTAC ternary complexes, we extracted pockets and ligands from known protein-ligand structures. Together with the linker, these five parts (POI pocket, warhead, E3 pocket, E3 ligand, and linker) were fed into a neural network, which extracted the features of the linker with a bidirectional LSTM and the features of each pocket and ligand with a graph convolutional network. The output feature vectors of the modules were then combined and passed to a multilayer perceptron to predict the degradation efficacy of the PROTAC (Figure 2).
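The reduction of DC50 and Dmax to a binary label can be sketched as follows. The text above does not state the cutoffs, so the 100 nM and 80% thresholds, the requirement that both criteria hold, and the name `label_degrader` are illustrative assumptions rather than the published rule.

```python
# Hypothetical cutoffs chosen for illustration only; the text above
# does not specify the values used to binarize the labels.
DC50_CUTOFF_NM = 100.0   # assumed potency cutoff (nanomolar)
DMAX_CUTOFF_PCT = 80.0   # assumed maximal-degradation cutoff (percent)


def label_degrader(dc50_nm: float, dmax_pct: float,
                   dc50_cutoff_nm: float = DC50_CUTOFF_NM,
                   dmax_cutoff_pct: float = DMAX_CUTOFF_PCT) -> int:
    """Return 1 for a 'good degrader' and 0 otherwise.

    Assumption: a PROTAC counts as a good degrader only if it is both
    potent (low DC50) and efficacious (high Dmax).
    """
    potent = dc50_nm <= dc50_cutoff_nm
    efficacious = dmax_pct >= dmax_cutoff_pct
    return int(potent and efficacious)


# Toy measurements (invented values, not taken from PROTAC-DB).
examples = [(12.0, 95.0), (850.0, 90.0), (30.0, 45.0)]
labels = [label_degrader(dc50, dmax) for dc50, dmax in examples]
```

Keeping the cutoffs as parameters makes it easy to check how sensitive the class balance is to the chosen thresholds.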
Figure 2. Network architecture of DeepPROTACs model.
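To make the feature-extraction idea concrete, here is a minimal pure-Python sketch of one graph-convolution step on a toy molecular graph, followed by pooling into a fixed-length vector and concatenation with a linker vector. It is not the DeepPROTACs implementation: the symmetric normalization, the mean pooling, the toy graph, the weight matrix and all function names are assumptions, and the linker vector is a hard-coded placeholder standing in for the output of the bidirectional LSTM.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adj)
    # Add self-loops so every node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    # Symmetric degree normalization of the adjacency matrix.
    norm = [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j]
             for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


def readout(node_feats):
    """Mean-pool node vectors into one graph-level vector (assumed pooling)."""
    n = len(node_feats)
    return [sum(col) / n for col in zip(*node_feats)]


def fuse(*vectors):
    """Concatenate per-module vectors into the input of a final classifier."""
    return [x for vec in vectors for x in vec]


# Toy two-atom graph with one-hot node features and an identity weight matrix.
adj = [[0.0, 1.0], [1.0, 0.0]]
feats = [[1.0, 0.0], [0.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]

graph_vec = readout(gcn_layer(adj, feats, weight))
linker_vec = [0.1, 0.2]  # placeholder for the linker branch output
fused = fuse(graph_vec, linker_vec)
```

In the full model, each of the four graph inputs (POI pocket, warhead, E3 pocket, E3 ligand) would contribute its own pooled vector to the concatenation before the multilayer perceptron.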
On the test set, the DeepPROTACs model achieved an average prediction accuracy of about 78% and an area under the ROC curve (AUROC) of around 0.85. In addition, its prediction accuracy on an external experimental set (PROTACs targeting the ER protein) and on data sets not included in the training set (PROTACs targeting EZH2, STAT3, eIF4E, and FLT-3) ranged from 65% to 80%, illustrating good generalization capability.
The web server (https://bailab.siais.shanghaitech.edu.cn/services/deepprotacs/) and source code (https://github.com/fenglei104/DeepPROTACs) have been released for reference. In brief, we developed DeepPROTACs, a deep learning-based predictor of PROTAC degradation that can be used for high-throughput virtual screening of potential PROTACs. It can provide useful guidance for the rational design of PROTAC molecules and help reduce the time and cost of PROTAC drug R&D.
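For readers unfamiliar with the two metrics, the following sketch computes accuracy and AUROC with the standard library only. The labels and scores are invented toy values, not DeepPROTACs outputs, and the 0.5 decision threshold and the function names are assumptions. AUROC is computed here as the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as one half.

```python
def accuracy(y_true, y_score, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def auroc(y_true, y_score):
    """Area under the ROC curve via pairwise ranking of positives vs negatives."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predicted probabilities (invented for illustration).
y_true = [1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]

acc = accuracy(y_true, y_score)
auc = auroc(y_true, y_score)
```

Accuracy depends on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is why the two are usually reported together.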
F. Li et al., “DeepPROTACs is a deep learning-based targeted degradation predictor for PROTACs,” Nat. Commun., vol. 13, no. 1, Art. no. 1, Nov. 2022, doi: 10.1038/s41467-022-34807-3.
G. Weng et al., “PROTAC-DB: an online database of PROTACs,” Nucleic Acids Res., vol. 49, no. D1, pp. D1381–D1387, Jan. 2021, doi: 10.1093/nar/gkaa807.