Homology Modelling in Bioinformatics: Current Trends and Perspectives

August 8, 2024, by admin

INTRODUCTION

The structure of a protein is determined by its amino acid sequence. Current experimental structure determination methods, such as X-ray crystallography and nuclear magnetic resonance (NMR), are slow, expensive, and often difficult to perform. Beyond the structure determination itself, problems can arise during the cloning, expression, purification, and crystallization of a protein. When experimental techniques fail, computer modelling is the only way to obtain structural information (1). Computational prediction methods have therefore attracted great interest (2).

Among these modelling methods, homology modelling, also known as template-based modelling, provides the most reliable results. It is based on the observation that two proteins belonging to the same family have similar primary and tertiary structures (3). Homology modelling rests on two fundamental observations: the structure of a protein is determined by its amino acid sequence, and structure is more conserved than sequence during evolution. Thus, homology modelling involves first finding homologous proteins of known structure and then mapping the query (target) sequence onto these template structures (2).

The number of newly discovered sequences is growing much faster than the number of experimentally resolved structures, and experimental structure determination alone cannot close this gap (4). Homology modelling is currently the method of choice for generating a reliable 3D model of a protein from its amino acid sequence. This has been shown in several rounds of the biennial Critical Assessment of protein Structure Prediction (CASP) (5), a community experiment to determine the state of the art in protein structure modelling. Participants receive amino acid sequences of target proteins and build models of the corresponding three-dimensional structures. The submitted models are compared with the experimental structures by independent assessors. The experiment is double-blind: the participants do not have access to the experimental structures, and the assessors do not know the identity of the submitters. In addition to structural models, other aspects of protein modelling are also evaluated (6).

The quality of a model is directly related to the sequence identity between the template and the target. As a rule of thumb, models built on more than 50% sequence identity are accurate enough for drug discovery applications, and those built on 25 to 50% identity can be useful for designing mutagenesis experiments (7). Prediction accuracy drops at low identity because over 95% of protein chains that share little sequence identity have diverse structures (8). As sequence identity diminishes, the likelihood of selecting a wrong template increases, resulting in less accurate models. Furthermore, in the so-called “twilight zone”, where sequence identity is between 10% and 30%, finding homologous proteins is challenging, and sequence identity alone is no longer a statistically reliable indicator of model accuracy. In such cases, threading and ab initio techniques provide alternative strategies for predicting protein structure (9).
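The rule of thumb above can be made concrete with a small Python sketch. It assumes a pairwise alignment is already available as two equal-length gapped strings, and it counts identity over aligned (non-gap) columns only; other definitions of percent identity use a different denominator and give different numbers. The function names and the wording of the bands are illustrative, not taken from any tool.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over columns where neither sequence has a gap ('-')."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # gap columns are ignored in this definition
        aligned += 1
        if a == b:
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0


def suggested_use(identity: float) -> str:
    """Map identity to the rule-of-thumb bands quoted in the text."""
    if identity > 50.0:
        return "drug discovery"
    if identity >= 25.0:
        return "mutagenesis design"
    return "low identity: consider threading or ab initio methods"


# Toy alignment (hypothetical sequences): 6 identical residues in 8 aligned columns.
identity = percent_identity("MKT-AYIAK", "MKTQAYLAR")
print(identity, suggested_use(identity))
```

The bands are heuristics; as noted above, identity becomes an unreliable guide to model accuracy in the twilight zone.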

Homology modelling is widely used in structure-based drug design. Its importance increases with the number of available crystal structures. Other common uses for homology models include: (a) studying the effect of mutations, (b) identifying active and binding sites in proteins (useful for ligand design), (c) searching for ligands of a particular binding site (database mining), (d) designing new ligands for a given binding site, (e) modelling substrate specificity, (f) predicting antigenic epitopes, (g) protein-protein docking simulations, (h) molecular replacement in X-ray structure refinement, (i) rationalizing known experimental observations, and (j) planning new experiments with the provided models (4).

Artificial intelligence (AI), in turn, refers to a computer that “mimics characteristic intellectual processes, such as the ability to think, discover meaning, generalize, or learn from past experiences” to achieve goals without being explicitly programmed for a specific action (10).

Since the mid-1990s, various computational approaches have been applied to predicting the secondary and tertiary structure of proteins, including genetic algorithms, graph theory, machine learning, and neural networks (11). While many traditional machine learning methods can be replaced by other statistical methods, the main trigger for the use of deep learning neural networks in homology modelling was the beginning of the “big data” era (12).

The principles of homology modelling are best illustrated by looking at the algorithms on which the prediction process is based. When searching for homologous template sequences, dynamic programming can calculate an optimal alignment. Global alignment algorithms (13) are used primarily for sequences of similar length where strong sequence homology is expected. Local alignment algorithms (14) are used to identify motifs in protein sequences. BLAST is a heuristic algorithm for comparing protein or DNA sequences. It approximates the Smith-Waterman local alignment algorithm, comparing a query sequence with sequences in databases and retrieving those that are similar to the query (2). These steps are often followed by multiple alignment, generation of the backbone, generation of loop structures, and placement of side chains.
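The difference between global and local dynamic programming can be seen in a minimal Python sketch. It computes only the optimal score (no traceback) with a simple match/mismatch/linear-gap scheme; real tools use substitution matrices such as BLOSUM and affine gap penalties, so the scoring parameters here are assumptions chosen for readability.

```python
def alignment_score(a: str, b: str, match: int = 1, mismatch: int = -1,
                    gap: int = -2, local: bool = False) -> int:
    """Optimal alignment score by dynamic programming.

    local=False: global alignment in the style of Needleman-Wunsch (13).
    local=True:  local alignment in the style of Smith-Waterman (14).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment charges for leading gaps; local alignment does not.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            cell = max(diag, up, left)
            if local:
                cell = max(cell, 0)   # a local alignment may restart anywhere
                best = max(best, cell)
            score[i][j] = cell
    # Global: score of aligning both full sequences. Local: best cell anywhere.
    return best if local else score[-1][-1]


print(alignment_score("GATTACA", "GCATGCG", gap=-1))               # global
print(alignment_score("XXGATTACAXX", "YYGATTACAYY", local=True))   # local
```

The only differences between the two modes are the boundary initialisation, the floor at zero, and where the final score is read from. BLAST avoids filling the full table by seeding on short word matches and extending them, which is why it is a heuristic rather than an exact method.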

This review discusses the steps involved in homology modelling, reasons to adopt such a technique, commonly available tools and resources to carry out the predictions of models, and the prospects of this field.

Computational Drug Design

Drug Discovery

The steady rise in the number of pathogenic microorganisms that are highly resistant to numerous medications makes controlling infectious diseases extremely difficult. Drug-resistant infections are likely to increase morbidity and length of hospital stay. As a result, better drug candidates must be designed and developed to meet this challenge. The standard approach to drug discovery and development is a complex and time-consuming process (15). Increasingly sophisticated in-silico techniques are being employed in the discovery of novel medications to address some of these issues. For consistent results, most in-silico approaches are used in conjunction with in-vitro and in-vivo data (16). Through a diverse range of software and tools, these computational techniques play a significant role across the pharmacology sector and in drug development worldwide.

In the past, the drug discovery process has relied on experimental high-throughput screening (HTS) to identify biologically active compounds. Despite advances in automation techniques for HTS, this approach remains extremely tedious and expensive and has often failed to identify potent series of derivatives (17). Computational approaches are used to predict 3D protein models in the absence of experimental structures, providing insight into the structure and function of these proteins. Biologists and experimentalists can employ these models in structural genomics and biomedical research initiatives. In the context of drug discovery, the creation and improvement of homology model refinement tools is an active area of research (18). Applications include studies of protein function and mechanism, assessment of target druggability, high-throughput docking, and lead identification and optimization.

The process of creating a homology model is divided into several steps, which can be repeated until a satisfactory model is obtained (4). These steps are summarized in the following subsection.

Steps in Homology Modelling

In general, a homology modelling pipeline consists of the steps below, which can be repeated until an acceptable model is obtained: (i) template selection to find the best experimentally determined structures; (ii) target–template sequence alignment; (iii) 3D model structure construction; (iv) model refinement; and (v) model quality estimation. Model refinement usually entails further structural changes to the initial model (19).

Step 1: The first step is to compare the sequence of the protein of unknown structure with the sequences of known structures in the Protein Data Bank. BLAST (Basic Local Alignment Search Tool) (http://www.ncbi.nlm.nih.gov/blast/) is the most widely used server for this. A BLAST database search for optimal local alignments with the query returns a list of known protein structures that match the sequence (4). Once template candidates have been identified, the best structures must be chosen. The level of sequence similarity between the template and the target is critical for generating accurate 3D structures. Aside from high sequence similarity, a variety of other criteria are taken into account when selecting an appropriate template. The phylogenetic similarity between the template and target sequences is one of these criteria (20).
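As a rough sketch of how template candidates might be filtered after such a search, the Python example below parses the standard 12-column tabular BLAST output (the BLAST+ `-outfmt 6` layout: query id, subject id, percent identity, alignment length, mismatches, gap openings, query start/end, subject start/end, E-value, bit score) and ranks the hits. The hit identifiers, the sample values, and the identity and coverage thresholds are hypothetical, and approximating coverage as alignment length divided by query length is a simplifying assumption.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    template_id: str
    identity: float   # percent identity reported by BLAST
    align_len: int    # alignment length (includes gap columns)
    evalue: float
    bitscore: float


def parse_blast_tabular(text: str) -> List[Hit]:
    """Parse tab-separated BLAST output in the default 12-column layout."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.split("\t")
        hits.append(Hit(f[1], float(f[2]), int(f[3]), float(f[10]), float(f[11])))
    return hits


def rank_templates(hits: List[Hit], query_len: int,
                   min_identity: float = 30.0,
                   min_coverage: float = 0.7) -> List[Hit]:
    """Keep hits above illustrative thresholds; best E-value first, then identity."""
    kept = [h for h in hits
            if h.identity >= min_identity and h.align_len / query_len >= min_coverage]
    return sorted(kept, key=lambda h: (h.evalue, -h.identity))


# Hypothetical search results for a 210-residue query.
SAMPLE = "\n".join([
    "query\ttmplA\t62.5\t200\t75\t0\t1\t200\t5\t204\t1e-80\t250",
    "query\ttmplB\t35.0\t180\t117\t2\t10\t189\t3\t182\t1e-30\t120",
    "query\ttmplC\t90.0\t40\t4\t0\t1\t40\t1\t40\t1e-10\t60",
])

for hit in rank_templates(parse_blast_tabular(SAMPLE), query_len=210):
    print(hit.template_id, hit.identity, hit.evalue)
```

In this toy data the third hit has the highest identity but covers only a short fragment of the query, so it is discarded. In practice, criteria such as structure resolution, bound ligands, and phylogenetic similarity would also be weighed.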

Step 2: Once the most suitable templates have been selected, the target–template alignment is generated and, if necessary, corrected. When more than one template is used, both target-template and template-template alignments are required. Clustal W (http://www.genome.jp/tools-bin/clustalw), T-Coffee (http://tcoffee.crg.cat/), and MUSCLE (https://www.ebi.ac.uk/Tools/msa/muscle/) are the most extensively used alignment tools (20).

Step 3: The next phase in homology modelling is model creation, which comes after the target–template alignment. To create a protein model for the target, a variety of strategies might be used. Models are typically built using rigid-body assembly, segment matching (21), spatial restraint, and artificial evolution.

In rigid-body assembly, the protein structure is broken down into conserved core regions, loops, and side chains. This natural dissection allows a protein 3D structure to be constructed by assembling rigid bodies taken from the aligned template structures. In segment matching, a cluster of atomic positions taken from the template structures is used as guiding positions. Segments that match these positions are selected from a database of known structures on the basis of sequence identity, geometry, and energy. The all-atom model is then built by fitting the segments onto the guiding positions, which serve as a scaffold (20).

The spatial restraint approach constructs the model by satisfying restraints derived from the template structure. Following the alignment, the restraints are transferred onto the target structure. They are complemented by stereochemical restraints on bond lengths, bond angles, dihedral angles, and non-bonded (van der Waals) contact distances. In the artificial evolution method, rigid-body assembly is combined with stepwise evolutionary mutations of the template until the template sequence is identical to the target sequence (20).
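The idea behind restraint satisfaction can be illustrated with a toy Python sketch: a few pseudo-atoms are moved until their pairwise distances match target values. This is plain gradient descent on a sum of squared distance violations, chosen for brevity; it is not the objective function or the optimizer of any particular modelling program, and the coordinates, target distances, step size, and iteration count are illustrative assumptions.

```python
import math


def restraint_violation(coords, restraints):
    """Sum of squared deviations between actual and target distances."""
    total = 0.0
    for i, j, target in restraints:
        total += (math.dist(coords[i], coords[j]) - target) ** 2
    return total


def minimise(coords, restraints, step=0.05, iterations=500):
    """Plain gradient descent on the violation (a stand-in for real optimizers)."""
    coords = [list(p) for p in coords]
    for _ in range(iterations):
        grads = [[0.0, 0.0, 0.0] for _ in coords]
        for i, j, target in restraints:
            d = math.dist(coords[i], coords[j])
            if d == 0.0:
                continue  # direction undefined for coincident points
            factor = 2.0 * (d - target) / d
            for k in range(3):
                g = factor * (coords[i][k] - coords[j][k])
                grads[i][k] += g
                grads[j][k] -= g
        for point, grad in zip(coords, grads):
            for k in range(3):
                point[k] -= step * grad[k]
    return coords


# Three pseudo-atoms with hypothetical target distances in angstroms.
START = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
RESTRAINTS = [(0, 1, 3.8), (1, 2, 3.8), (0, 2, 6.0)]

model = minimise(START, RESTRAINTS)
print(restraint_violation(START, RESTRAINTS), restraint_violation(model, RESTRAINTS))
```

A real restraint-based method handles thousands of restraints of different types (distances, angles, dihedrals, non-bonded contacts) and typically expresses them as probability density functions rather than simple harmonic terms.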

Gaps and insertions in the alignment of homologous proteins usually correspond to loops, whose structures are not conserved during evolution. Loops are the most variable sections of a protein, with frequent insertions and deletions. The functional specificity of a protein structure is frequently determined by loops (4). There are two main approaches to loop modelling. The database search method scans all known protein structures for segments that fit the flanking core regions. The conformational search method relies on the optimization of a scoring function. The next stage in model creation is side-chain construction. Most side-chain modelling places side chains onto backbone coordinates taken from a parent structure and/or from ab initio modelling simulations. Protein side chains adopt a small number of low-energy conformations known as rotamers (20).

Steps 4 and 5: Homology models can contain many flaws and inaccuracies. The quality required of a model depends largely on its intended purpose. Although the accuracy of a protein modelling method can be assessed against experimental structures, the quality of an individual model can vary greatly, which makes a priori model quality evaluation critical. WHATCHECK (https://swift.cmbi.umcn.nl/gv/whatcheck/) and PROCHECK (https://www.ebi.ac.uk/thorntonsrv/software/PROCHECK/) are two prominent programs for checking a model's stereochemistry. The Ramachandran plot, available for example through the RAMPAGE server (http://mordred.bioc.cam.ac.uk/rapper/rampage.php), is another useful tool for assessing protein structure quality. PROSAII (https://www.came.sbg.ac.at/prosa.php) and VERIFY3D (http://servicesn.mbi.ucla.edu/Verify3d/) focus on the spatial properties of the model (20).
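A Ramachandran plot is built from the backbone dihedral angles phi and psi of each residue. The Python sketch below shows the underlying geometry only: a signed dihedral angle computed from four points with the usual atan2 formulation. For phi the four atoms are C(i-1), N(i), CA(i), C(i), and for psi they are N(i), CA(i), C(i), N(i+1). The coordinates in the example are made-up test points, not real atoms, and reading coordinates from a PDB file is left out.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3) -> float:
    """Signed dihedral angle in degrees defined by four points (cis = 0)."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)   # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)   # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


# Made-up points: the last point is rotated a quarter turn about the central bond.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Validation tools then compare each residue's (phi, psi) pair with regions that are commonly populated in high-resolution structures and flag the outliers.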

SWISS-MODEL

Protein three-dimensional structures provide vital insights into their molecular activity and inform a wide range of applications in life science research. Protein complexes are frequently at the heart of numerous biological processes (22). To address the computational prediction of protein-protein interactions, several techniques have been proposed. With the increasing availability of experimentally determined protein complex structures, it has been discovered that interacting surfaces are frequently conserved among homologous complexes and that templates exist for the majority of known protein-protein interactions (23). These findings paved the way for comparative modelling of protein complexes, often known as homology modelling.

SWISS-MODEL (https://swissmodel.expasy.org) (24) was the first fully automated protein homology modelling server. Its modelling capabilities have recently been expanded to enable the modelling of homomeric and heteromeric complexes using the amino acid sequences of the interaction partners as a starting point. SWISS-MODEL now generates about 3000 models every day (roughly two per minute), making it one of the most extensively used structure modelling servers in the world. Its performance is continuously assessed and compared to that of other state-of-the-art servers in the field (22).

SWISS-MODEL’s default modelling workflow consists of the following steps:

Input information;
Template search;
Template selection;
Model construction; and
Determination of model quality.

The method may be applied to both binary and higher-order protein assemblies and can be scaled to complete genomes. Homology modelling of protein complexes is gaining popularity, and it is predicted to play an important role in elucidating the quaternary structure space of proteins (22).

ProMod3

Other new features of SWISS-MODEL include a new modelling engine, ProMod3, which improves the accuracy of the generated models, as well as an improved local model quality estimation approach derived from QMEAN (22).

The structure of a protein provides crucial insights into its molecular activity and aids scientists in designing targeted and effective tests. The number of entries in the Protein Data Bank (PDB) is orders of magnitude smaller than the number of known protein sequences because experimental structure determination is a limiting factor. The extensively used SWISS-MODEL web server is served by ProMod3, which has already supplied millions of protein structure models to the scientific community (25).

The underlying software architecture of a modelling prediction server must be adaptable and easily extendible for evolving demands to support the development of novel algorithms and the implementation of state-of-the-art algorithms. ProMod3 was created to meet these objectives (25). ProMod3 offers ‘actions’ that can be launched from the command line to do common modelling activities such as model construction, sidechain modelling, and so on.

The homology modelling workflow in ProMod3 is built to strike a balance between speed and precision. Guided by the target–template alignment, conserved structural information is transferred from the template structure to create an initial model with the desired target sequence. Small deletions are resolved by relaxing neighbouring residues if a stereochemically valid conformation can be produced. Unresolved deletions are handled in the same way as insertions, by the loop modelling pipeline that follows. Stereochemical inconsistencies and clashes introduced during the modelling procedure are resolved through energy minimization (25).

The entire homology modelling procedure has been evaluated on real-world homology modelling problems and directly compared to the MODELLER modelling tool.

MODELLER

Comparative modelling can be automated using a variety of computer programs and web servers. MODELLER (26) is another example of such a program.

Once an initial target–template alignment is created, a variety of methods can be used to build a 3D model for the target protein, as previously mentioned. Modelling by satisfaction of spatial restraints is one group of methods, which uses either distance geometry or optimization techniques to satisfy spatial restraints derived from the alignment of the target sequence with the template structures. MODELLER, which belongs to this category, extracts spatial restraints from two sources: homology-derived restraints obtained from the alignment with the template structures, and stereochemical restraints and statistical preferences obtained from a molecular mechanics force field and from known protein structures. The model is then constructed by an optimization that uses conjugate gradients and molecular dynamics to minimize violations of the spatial restraints. The method is conceptually similar to that employed in NMR-based protein structure determination (27).

The loop-modelling module in MODELLER implements the optimization-based approach. The main reasons are the generality and conceptual simplicity of energy minimization, together with the limits that the finite number of known protein structures imposes on the database approach. The loop optimization relies on conjugate gradients and molecular dynamics with simulated annealing (27).

Homology Modelling and Artificial Intelligence (AI)

Protein structure prediction can be used to forecast a protein’s three-dimensional shape based on its amino acid sequence. This is a critical issue because the structure of a protein determines its function; however, protein structures can be difficult to determine experimentally. Utilizing genetic information has recently resulted in significant progress. By evaluating covariation in homologous sequences, it is possible to infer which amino acid residues are in contact, which aids in the prediction of protein structures (28).

Understanding protein structures has been a major challenge in biology for decades because the function of a protein is reliant on its structure. Several experimental structure determination techniques have been developed and improved in accuracy, but they are still difficult and time-consuming (28). Recent advances show the incorporation of artificial intelligence into every aspect of human life. Homology modelling is no exception.

In traditional computational approaches, predictions are made using physical equations and modelling. Machine learning proposes a different framework, in which algorithms automatically learn a relationship between inputs and outputs from data, based on a set of hypotheses. The property to be predicted can be either protein-level or residue-level (29). The most difficult aspect of these modelling techniques is determining how to represent the protein. Representation refers to the encoding of a protein that serves as an input for prediction tasks or as an output for generation tasks (30). Although a deep neural network is capable of extracting complex features, a well-chosen representation can improve learning effectiveness and efficiency (29).

Protein representations commonly used in deep-learning models include sequence-based (polypeptide chain) representations, structure-based representations, and coarse-grained models, a type of representation specific to the computational modelling of proteins (29). In brief, AI has driven progress on one of the most complex problems in protein structure prediction, the protein folding problem, resulting in many prediction tools, one of them being AlphaFold.

AlphaFold has demonstrated the power of machine learning in identifying patterns in primary sequences that determine three-dimensional folds with high precision. It builds on years of intensive experimental work, the 175,000 3D structures in the Protein Data Bank, and the great advances in structure prediction fueled by the CASP competition. The core of AlphaFold is a neural network trained on a large number of structures in the Protein Data Bank to predict distributions of distances between the Cβ atoms of pairs of residues in a protein (28). These predictions are used to build a synthetic force field that directs folding without using a single template, relying instead on patterns derived from many proteins (31).
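The kind of target such a network is trained to predict can be sketched in a few lines of Python: inter-residue distances computed from coordinates and discretised into bins, so that the network can output a probability distribution over bins for each residue pair. The sketch shows only this data representation, not the neural network itself. The binning used here, 64 equal-width bins between 2 and 22 angstroms, follows my reading of the description in (28) and should be treated as an assumption, as should the toy coordinates.

```python
import math


def distance_matrix(coords):
    """All pairwise Euclidean distances between residue positions."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def distance_bin(d: float, d_min: float = 2.0, d_max: float = 22.0,
                 n_bins: int = 64) -> int:
    """Index of the equal-width bin containing d; out-of-range values are clamped."""
    if d <= d_min:
        return 0
    if d >= d_max:
        return n_bins - 1
    width = (d_max - d_min) / n_bins
    return min(int((d - d_min) / width), n_bins - 1)


# Toy positions (hypothetical), one point per residue.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
binned = [[distance_bin(d) for d in row] for row in distance_matrix(coords)]
for row in binned:
    print(row)
```

Predicting a full distribution per residue pair, rather than a single distance or a binary contact, lets the downstream folding step weigh confident predictions more heavily than uncertain ones.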

AlphaFold could be a particularly valuable tool for drug discovery, because structural knowledge of the target protein is frequently the starting point, and its power can be leveraged by both experimentalists and theoreticians (31). It is an example of how intelligent programs can be created to solve problems in novel ways: the approach is primarily concerned with accuracy and pattern recognition, and it automates the creation of analytical models.

Deep learning (DL), like many other technologies, has the potential to revolutionize the field of protein modelling. While DL originated in computer science, its rapid development, in conjunction with ideas from game theory and other fields, has resulted in many new and powerful methodologies for solving increasingly complex problems. The use of DL for biomolecular structure has only recently begun, and more efforts on methodology development and applications in protein modelling and design are expected (29).

Challenges and Prospects

The adoption of innovative experimental methods such as cryo-electron microscopy (cryo-EM) over the last few decades is expected to further increase the number of experimentally determined 3D structures. As the number of protein families with an experimentally determined structure increases, so does the role of homology modelling in determining the 3D structures of the remaining sequences in these families (20).

In classical homology modelling, the model is built primarily on sequence similarity. Ligands are often absent from experimentally determined structures because they are frequently lost during purification. As a result, models created without taking into account ligand information in the template represent an unliganded state. The introduction of ligand-sensitive approaches has addressed this shortcoming. Such approaches, however, require expertise and time-consuming manual intervention, so the development of fully automated homology modelling tools capable of dealing with such problems is critical. Homology modelling may still leave some questions computationally unanswered. This can be mitigated as more experimentally determined structures become available, providing more suitable templates for a given target (20).

Another constraint of homology modelling is the presence of loops and inserts, which are difficult to model without template data. It is critical to optimize the loop region and side chains to have a model with high accuracy. Optimization entails using molecular dynamics simulations to refine the generated models. When there is a low level of sequence similarity between the target and the template, using multiple templates is favourable (20).

In general, many models of a target are built by the end of the homology modelling process. Having a large number of models is a benefit, but determining the best one requires additional analysis. Many model-building methods exist, and new algorithms continue to be developed. Several studies have shown that no single modelling programme or server is superior to the others in every respect. As a result, it is critical to choose the method(s) to be used based on the protein at hand and the specific goal of future applications of the model (20).

CONCLUSION

With the help of homology modelling, it is possible to produce refined structural models for a large number of protein sequences. This strategy is extremely useful today, especially as a range of diseases is on the rise. Homology modelling is also used extensively in research labs to investigate the function and properties of proteins, and it has been applied to improve crop yield and quality.

The recent lessons from deep learning emphasize the importance of adding more layers of information and processing in future homology modelling. Using the analogy of onion layers, the accuracy of homology modelling predictions has increased with each decade as new layers of information and processing were added. It is hoped that further specialization in modelling tool development will enable future customization and testing of module combinations.

BIBLIOGRAPHY

1. Chou K-C. Structural bioinformatics and its impact to biomedical science. Curr Med Chem [Internet]. 2004 Nov 6 [cited 2021 Oct 17];11(16):2105–34. Available from: https://www.meta.org/papers/structural-bioinformatics-and-its-impact-to/15279552

2. Wiltgen M. Algorithms for Structure Comparison and Analysis: Homology Modelling of Proteins. In: Ranganathan S, Nakai K, Schonbach C, editors. Encyclopedia of Bioinformatics and Computational Biology: ABC of Bioinformatics [Internet]. Elsevier; 2018 [cited 2021 Oct 8]. p. 38–61. Available from: https://books.google.com.my/books?hl=en&lr=&id=rs51DwAAQBAJ&oi=fnd&pg=PA38&dq=%22homology+modelling%22+and+%22general%22&ots=q_XX7dTqLW&sig=VAUw_uFRAjFwyC64uxU3nXHMm3g&redir_esc=y#v=onepage&q=%22homology modelling%22 and %22general%22&f=false

3. Sánchez R, Šali A. Advances in comparative protein-structure modelling. Curr Opin Struct Biol [Internet]. 1997 [cited 2021 Oct 17];7(2):206–14. Available from: https://pubmed.ncbi.nlm.nih.gov/9094331/

4. Vyas VK, Ukawala RD, Ghate M, Chintha C. Homology Modeling a Fast Tool for Drug Discovery: Current Perspectives. Indian J Pharm Sci [Internet]. 2012 Jan [cited 2021 Oct 14];74(1):1. Available from: /pmc/articles/PMC3507339/

5. Tramontano A, Morea V. Assessment of homology-based predictions in CASP5. Proteins [Internet]. 2003 [cited 2021 Oct 17];53 Suppl 6:352–68. Available from: https://pubmed.ncbi.nlm.nih.gov/14579324/

6. Kryshtafovych A, Schwede T, Topf M, Fidelis K, Moult J. Critical assessment of methods of protein structure prediction (CASP)—Round XIII. Proteins Struct Funct Bioinforma [Internet]. 2019 Dec 1 [cited 2021 Oct 20];87(12):1011–20. Available from: https://onlinelibrary.wiley.com/doi/full/10.1002/prot.25823

7. Francoijs CJ, Klomp JP, Knegtel RM. Sequence annotation of nuclear receptor ligand-binding domains by automated homology modeling. Protein Eng [Internet]. 2000 [cited 2021 Oct 17];13(6):391–4. Available from: https://pubmed.ncbi.nlm.nih.gov/10877848/

8. Mizianty MJ, Kurgan L. Modular prediction of protein structural classes from sequences of twilight-zone identity with predicting sequences. BMC Bioinformatics [Internet]. 2009 Dec 13 [cited 2021 Oct 21];10(1):1–24. Available from: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-10-414

9. Khor BY, Tye GJ, Lim TS, Choong YS. General overview on structure prediction of twilight-zone proteins. Theor Biol Med Model [Internet]. 2015 Sep 4 [cited 2021 Oct 20];12(1):1–11. Available from: https://tbiomed.biomedcentral.com/articles/10.1186/s12976-015-0014-1

10. Bali J, Garg R, Bali RT. Artificial intelligence (AI) in healthcare and biomedical research: Why a strong computational/AI bioethics framework is required? Indian J Ophthalmol [Internet]. 2019 Jan 1 [cited 2021 Oct 15];67(1):3. Available from: /pmc/articles/PMC6324122/

11. Böhm G. New approaches in molecular structure prediction. Biophys Chem [Internet]. 1996 Mar 7 [cited 2021 Oct 17];59(1–2):1–32. Available from: https://pubmed.ncbi.nlm.nih.gov/8867324/

12. Hameduh T, Haddad Y, Adam V, Heger Z. Homology modeling in the time of collective and artificial intelligence. Comput Struct Biotechnol J [Internet]. 2020 Jan 1 [cited 2021 Oct 8];18:3494. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7695898/

13. Needleman SB, Wunsch CD. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol. 1970 Mar 28;48(3):443–53.

14. Smith TF, Waterman MS. Identification of common molecular subsequences. J Mol Biol. 1981 Mar 25;147(1):195–7.

15. Malathi K, Ramaiah S. Bioinformatics approaches for new drug discovery: a review. Biotechnol Genet Eng Rev [Internet]. 2018 Jul 3 [cited 2021 Oct 13];34(2):243–60. Available from: https://www.tandfonline.com/doi/abs/10.1080/02648725.2018.1502984

16. Noori HR, Spanagel R. In silico pharmacology: drug design and discovery’s gate to the future. Silico Pharmacol [Internet]. 2013 Dec [cited 2021 Oct 19];1(1). Available from: /pmc/articles/PMC4230818/

17. Liu B, Li S, Hu J. Technological advances in high-throughput screening. Am J Pharmacogenomics [Internet]. 2004 [cited 2021 Oct 19];4(4):263–76. Available from: https://pubmed.ncbi.nlm.nih.gov/15287820/

18. Cavasotto CN, Phatak SS. Homology modeling in drug discovery: current trends and applications. Drug Discov Today. 2009 Jul 1;14(13–14):676–83.

19. Schmidt T, Bergner A, Schwede T. Modelling three-dimensional protein structures for applications in drug design. Drug Discov Today. 2014 Jul 1;19(7):890–7.

20. Muhammed MT, Aki-Yalcin E. Homology modeling in drug discovery: Overview, current applications, and future perspectives. Chem Biol Drug Des [Internet]. 2019 Jan 1 [cited 2021 Oct 8];93(1):12–20. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1111/cbdd.13388

21. Levitt M. Accurate modeling of protein conformation by automatic segment matching. J Mol Biol [Internet]. 1992 Jul 20 [cited 2021 Oct 22];226(2):507–33. Available from: https://pubmed.ncbi.nlm.nih.gov/1640463/

22. Waterhouse A, Bertoni M, Bienert S, Studer G, Tauriello G, Gumienny R, et al. SWISS-MODEL: Homology modelling of protein structures and complexes. Nucleic Acids Res. 2018 Jul 2;46(W1):W296–303.

23. Kundrotas PJ, Zhu Z, Janin J, Vakser IA. Templates are available to model nearly all complexes of structurally characterized proteins. Proc Natl Acad Sci U S A [Internet]. 2012 Jun 12 [cited 2021 Oct 27];109(24):9438–41. Available from: https://pubmed.ncbi.nlm.nih.gov/22645367/

24. SWISS-MODEL [Internet]. [cited 2020 Mar 20]. Available from: https://swissmodel.expasy.org/

25. Studer G, Tauriello G, Bienert S, Biasini M, Johner N, Schwede T. ProMod3—A versatile homology modelling toolbox. PLOS Comput Biol [Internet]. 2021 Jan 28 [cited 2021 Oct 8];17(1):e1008667. Available from: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008667

26. About MODELLER [Internet]. [cited 2020 Nov 7]. Available from: https://salilab.org/modeller/

27. Fiser A, Šali A. Modeller: Generation and Refinement of Homology-Based Protein Structure Models. Methods Enzymol. 2003 Jan 1;374:461–91.

28. Senior AW, Evans R, Jumper J, Kirkpatrick J, Sifre L, Green T, et al. Improved protein structure prediction using potentials from deep learning. Nature [Internet]. 2020 Jan 30 [cited 2021 Oct 29];577(7792):706–10. Available from: https://pubmed.ncbi.nlm.nih.gov/31942072/

29. Gao W, Mahajan SP, Sulam J, Gray JJ. Deep Learning in Protein Structural Modeling and Design. 2020 [cited 2021 Oct 29]; Available from: https://doi.org/10.1016/j.patter.2020.100142

30. Bengio Y, Courville A, Vincent P. Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell [Internet]. 2013 [cited 2021 Nov 6];35(8):1798–828. Available from: https://pubmed.ncbi.nlm.nih.gov/23787338/

31. Fersht AR. AlphaFold – A Personal Perspective on the Impact of Machine Learning. J Mol Biol. 2021 Oct 1;433(20):167088.