Independent Modeling Project Paper

Testing Molecular Simulations with Minichaperones

Michael Pacold and Marty Pagel

May 3rd, 1999


Abstract

To test the ability of computational methods to duplicate bench data, free energy perturbation methods, or "thermodynamic cycles," were used to calculate the differences in folding and solvation energies of the wild type GroEL "minichaperone" and an aggregation-prone mutant, leucine 237 to glutamate (L237E). The results indicate that additional computational power and modeling capabilities will be needed for quantitatively accurate simulation of this problem. Qualitatively consistent results, though, can be obtained if certain requirements are met. These conditions, such as the inclusion of water in energy minimizations and careful assessment of the initial and final folding states of the protein, are essential to the acquisition of accurate data.


Introduction

The GroEL family of molecular chaperones is responsible for the in vivo refolding of misfolded polypeptides. GroEL is an elegant molecular machine, consisting of fourteen identical subunits arranged in two rings of seven subunits stacked back to back. The hollow cavity in the center of the chaperone serves as a chamber in which misfolded polypeptides can be unfolded and allowed to refold in isolation, and is essential to in vivo function. The individual apical domains, though, are capable of chaperone activity in vitro, presumably through a cycle of polypeptide binding and release analogous to the chaperone cycle of holo-GroEL. These isolated apical domains are termed "minichaperones."

 

Figure 1. Overall architecture of GroEL. The central cavity of one of the GroEL rings is surrounded by seven apical domains, which are colored red through violet. The remainder of the ring is colored gray. The polypeptide binding regions are located on the faces of the apical domains that line the ring.

GroEL minichaperones usually express well, but are prone to aggregation in vitro. The extent of this aggregation can be observed by gel permeation chromatography. During the construction of a series of mutant minichaperones consisting of residues 191 to 376 of the GroEL subunit, it was noticed that several mutants containing hydrophobic to hydrophilic mutations in hydrophobic regions of the minichaperone expressed poorly and were more prone to aggregation than the wild type protein. The most aggregation-prone mutant, which was also obtained in the lowest yield, carried a hydrophobic to hydrophilic mutation, leucine 237 to glutamate, in a hydrophobic patch that is the presumed "active site" of the minichaperone.

Figure 2. GroEL minichaperone showing the L237E mutation in the hydrophobic pocket. The residues are color coded from red to blue according to decreasing hydrophobicity. The glutamate is rendered in ball and stick and colored by atom. The polypeptide binding site is presumably located in the hydrophobic cleft surrounding the mutation.

To test the consistency of computer simulation with bench experimental results, a series of free energy perturbation or "thermodynamic cycle" calculations was carried out on wild type and mutant minichaperones. The cycles, shown in figures 3 through 6, started from a variety of initial states and ended with the folded, solvated or unsolvated wild type minichaperone.

Figure 3. Thermodynamic cycle 1 (Unsolvated -> Solvated)

Figure 4. Thermodynamic cycle 2 (Unsolvated extended -> Solvated)

Figure 5. Thermodynamic cycle 3 (Unsolvated vacuum -> Solvated)

Figure 6. Thermodynamic cycle 4 (Vacuum extended -> Vacuum folded)
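The quantity extracted from each cycle can be written compactly. The notation below is an assumption introduced for illustration, not the authors' own: E denotes a minimized total energy (used here as a stand-in for a free energy), "wt" is the wild type, and the native-minus-mutant sign convention is inferred from the values reported in the Discussion rather than stated in the figures.

```latex
\begin{align}
\Delta G_{\mathrm{wt}} &= E_{\mathrm{wt}}^{\mathrm{final}} - E_{\mathrm{wt}}^{\mathrm{initial}} \\
\Delta G_{\mathrm{L237E}} &= E_{\mathrm{L237E}}^{\mathrm{final}} - E_{\mathrm{L237E}}^{\mathrm{initial}} \\
\Delta\Delta G &= \Delta G_{\mathrm{wt}} - \Delta G_{\mathrm{L237E}}
  = \left(E_{\mathrm{wt}}^{\mathrm{final}} - E_{\mathrm{L237E}}^{\mathrm{final}}\right)
  - \left(E_{\mathrm{wt}}^{\mathrm{initial}} - E_{\mathrm{L237E}}^{\mathrm{initial}}\right)
\end{align}
```

The last equality is the closure of the cycle: the difference between the two physical steps (folding or solvation of each protein) equals the difference between the two "mutation" steps taken in the final and initial states.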


Experimental Methods

Coordinates for the minichaperone were obtained by truncating the available GroEL structure (ref. 4) to residues 191 through 376. The resulting 185 residue polypeptide was imported into Insight II (Biosym/MSI, California) and soaked with the assembly/soak command in the viewer module. The thickness of the water layer was limited by the computing power available. Initially, a 10 Å thick solvent layer was used, but none of the available computers had sufficient memory or processor power to carry out the subsequent simulations. The solvent layer was decreased in increments of 1 Å to a final thickness of 4 Å, at which energy minimizations could be run.

For calculations involving the unsolvated minichaperone, the protein and the solvent shell were separated by a distance of 20 Å. Several calculations required an extended polypeptide with the minichaperone’s sequence. These extended minichaperones were prepared by using the Biopolymer module’s protein/secondary command to change the conformation of the backbone to "extended." For in vacuo calculations, the solvent shell was deleted.

The molecules were prepared for energy minimization with the Biopolymer module of Insight II, which was used to cap the model with hydrogens and fix potentials under the CFF91 forcefield. Energy minimizations were carried out with the Discover module of Insight II. For all minimizations, the CFF91 cross terms, a harmonic bond potential, and a dielectric of 1.0 were used. A series of minimizations was carried out on each model to lower the energy to its minimum. The iteration parameters and derivatives of these minimizations are presented in the Results section.

The energies for the system were obtained from the *.out files produced by Discover, and analyzed with Microsoft Excel.


Results

Data from the energy minimizations of the folded native and mutant minichaperones in solvent and separated from the solvent are shown in tables 1 and 2. Data from the extended native and mutant minichaperones without solvent are shown in table 3. As demonstrated by plots of the data (figure 7), the energy of the system drops greatly after the first 5000 iterations of conjugate gradient minimization, and decreases slightly in the succeeding minimizations.

Iterations and Minimization Type  Derivative  Native Energy (kcal)  Convergence  L237E Energy (kcal)  Convergence
500 SD                            1.0         -15295                N            -14669.3             N
5000 CG                           1.0         -18843.2              Y            -18119.9             Y
10000 CG                          0.5         -18844.4              Y            -18181.3             Y
15000 CG                          0.25        -18914.6              Y            -18133.2             Y
20000 CG                          0.1         -18962.7              Y            -18280.2             Y

Table 1. Energies of folded native and folded L237E minichaperones in 4 Å solvent. CG = Conjugate Gradient. SD = Steepest Descent.

Iterations and Minimization Type  Derivative  Native Energy (kcal)  Convergence  L237E Energy (kcal)  Convergence
500 SD                            1.0         -13832.1              N            -13223.8             N
5000 CG                           1.0         -17631.2              Y            -16804.4             Y
10000 CG                          0.5         -17837.5              Y            -17047.3             Y
15000 CG                          0.25        -17891.2              N            -17104.5             Y
20000 CG                          0.1         -17993.3              N            -17199.3             N

Table 2. Energies of folded native and folded L237E minichaperones separated from the 4 Å solvent shell. CG = Conjugate Gradient. SD = Steepest Descent.

 

Iterations and Minimization Type  Derivative  Native Energy (kcal)  Convergence  L237E Energy (kcal)  Convergence
500 SD                            1.0         -12429.3              N            -11822.2             N
5000 CG                           1.0         -16206.8              Y            -15504.5             Y
10000 CG                          0.5         -16368.4              Y            -15588.4             Y
15000 CG                          0.25        -16339.1              Y            -15676.6             N
20000 CG                          0.1         -16320.7              Y

Table 3. Energies of extended native and extended L237E minichaperones separated from the 4 Å solvent shell. CG = Conjugate Gradient. SD = Steepest Descent.

Figure 7. Energies of the wild type and mutated minichaperones, minimized in solvent and separated from solvent, as a function of minimization iterations.

The energies obtained from minimization of the native and L237E folded minichaperones in a vacuum are shown in table 4. The corresponding energies for the native and L237E extended minichaperones are shown in table 5. The data are plotted in figures 8 and 9. The energies of the folded minichaperones minimized in a vacuum do not change significantly, but the energies of the extended minichaperones drop and then plateau.

Iterations and Minimization Type  Derivative  Native Energy (kcal)  Convergence  L237E Energy (kcal)  Convergence
5000 CG                           0.1         -4460.6               Y            -4472.83             Y
10000 CG                          0.01        -4460.69              Y            -4472.9              Y
15000 CG                          0.001       -4465.4               Y            -4477.66             Y

Table 4. Energies of folded native and folded L237E minichaperones minimized in a vacuum. CG = Conjugate Gradient.

Iterations and Minimization Type  Derivative  Native Energy (kcal)  Convergence  L237E Energy (kcal)  Convergence
500 SD                            1           -2650.31              N            -2664.16             N
5000 CG                           0.25        -2997.81              Y            -3016.48             Y
10000 CG                          0.1         -3044.16              Y            -3060.6              Y
15000 CG                          0.01        -3092.33              N

Table 5. Energies of extended native and extended L237E minichaperones minimized in a vacuum. CG = Conjugate Gradient. SD = Steepest Descent.

Figure 8. Energy of the folded wild type and L237E minichaperones as a function of minimization steps in vacuo.

 

Figure 9. Energy of the extended wild type and L237E minichaperones as a function of minimization steps in vacuo.

The results from the energy minimizations were used in thermodynamic cycle calculations, as shown in figures 3 through 6. The energies used and the results of the thermodynamic cycle calculations are summarized in list 1.

List 1. Thermodynamic Cycle Results

Cycle 1 (figure 3). ΔΔG of solvating the folded wild type and mutant minichaperones

Cycle 2 (figure 4). ΔΔG of folding and solvating the folded wild type and mutant minichaperones

N.B. For this calculation, data from 15000 iterations of Conjugate Gradient minimization were used, as the final 5000 steps were not completed.

Cycle 3 (figure 5). ΔΔG of solvating (from a vacuum) the folded wild type and mutant minichaperones

For this calculation, the solvent data came from 20000 iterations of conjugate gradient minimization, and the vacuum data came from 15000 iterations of conjugate gradient minimization.

Cycle 4 (figure 6). ΔΔG of folding the wild type and mutant minichaperones in a vacuum.

For this calculation, the folded vacuum data came from 15000 iterations of conjugate gradient minimization and the extended vacuum data came from 10000 iterations of conjugate gradient minimization.
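The arithmetic behind a cycle can be sketched in a few lines of Python. This is an illustration only, not the authors' spreadsheet: the dictionary keys and the function name are hypothetical, the energies are copied from tables 1, 2 and 4, and the native-minus-mutant sign convention is an assumption inferred from the reported values. Only cycles 1 and 3 are shown.

```python
# Total energies (kcal) copied from tables 1, 2 and 4 of this paper.
# Key: (state, protein). State labels are hypothetical names chosen here.
ENERGY = {
    ("folded_in_solvent_20000", "native"): -18962.7,   # table 1
    ("folded_in_solvent_20000", "L237E"): -18280.2,    # table 1
    ("folded_separated_20000", "native"): -17993.3,    # table 2
    ("folded_separated_20000", "L237E"): -17199.3,     # table 2
    ("folded_vacuum_15000", "native"): -4465.4,        # table 4
    ("folded_vacuum_15000", "L237E"): -4477.66,        # table 4
}


def delta_delta_g(initial, final):
    """Energy change of the native protein minus that of L237E
    for the step initial -> final (sign convention is an assumption)."""
    dg_native = ENERGY[(final, "native")] - ENERGY[(initial, "native")]
    dg_mutant = ENERGY[(final, "L237E")] - ENERGY[(initial, "L237E")]
    return dg_native - dg_mutant


# Cycle 1: folded protein separated from its solvent shell -> solvated.
cycle1 = delta_delta_g("folded_separated_20000", "folded_in_solvent_20000")
# Cycle 3: folded protein in a vacuum -> solvated.
cycle3 = delta_delta_g("folded_vacuum_15000", "folded_in_solvent_20000")

print(f"Cycle 1: {cycle1:+.1f} kcal")
print(f"Cycle 3: {cycle3:+.1f} kcal")
```

Under these assumptions the two printed values have the same magnitudes as two of the ΔΔG values quoted in the Discussion (111.5 kcal and 694.8 kcal).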


Discussion

The convergence of a single energy minimization does not by itself indicate that the structure has reached a local minimum, and certainly does not indicate that the structure has reached a global minimum (which is better sought with molecular dynamics calculations). The convergence of the energies from successive minimizations carried out with increasingly stringent (smaller) derivative criteria is a better indicator that a minimum has been reached.

The minimization data from these experiments appear to meet this criterion. As shown in figure 7, the energy of water-containing systems dropped significantly after the first set of conjugate gradient energy minimizations, and did not decrease significantly from that point. The same pattern was observed in energy minimization data from the extended minichaperones. The energies of the folded minichaperones minimized in vacuo decreased slightly after 15000 conjugate gradient iterations, but the decrease (-5 kcal) was not significant relative to the overall energy of the system (-4000 kcal). Energies from the maximum number of iterations were used for the thermodynamic cycles.
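The stage-to-stage comparison described above can be made explicit with a short Python sketch. It is an illustration under stated assumptions: the function name is hypothetical, the energies are the native-protein columns of tables 1 and 4, and expressing each change as a fraction of the total energy is one possible way to judge significance, not a criterion stated by the authors.

```python
# Native-protein energies (kcal) in the order the minimization stages ran.
SOLVATED_NATIVE = [-15295.0, -18843.2, -18844.4, -18914.6, -18962.7]  # table 1
VACUUM_NATIVE = [-4460.6, -4460.69, -4465.4]                          # table 4


def stage_changes(energies):
    """Return (energy change, change as a fraction of |energy|) per stage."""
    changes = []
    for previous, current in zip(energies, energies[1:]):
        delta = current - previous
        changes.append((delta, abs(delta) / abs(current)))
    return changes


for label, series in (("solvated", SOLVATED_NATIVE), ("vacuum", VACUUM_NATIVE)):
    for delta, fraction in stage_changes(series):
        print(f"{label}: change {delta:9.2f} kcal ({fraction:.2%} of total)")
```

For the solvated system the first conjugate gradient stage accounts for a change of several thousand kcal, while every later stage, and every stage of the folded in vacuo series, changes the energy by well under one percent of the total.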

In all of the thermodynamic cycles, the minichaperone proceeded from a mutated, unsolvated state to a folded, native state. To be in accordance with the experimental evidence supporting instability of the mutant minichaperone, the ΔΔG values for the thermodynamic cycles would have to be negative. This would indicate that the energy of solvating and folding the native chaperone is more negative (i.e. more favorable) than the energy of solvating and folding the mutant L237E minichaperone. In systems without solvent, a negative ΔΔG value would indicate that the free energy released by folding the native minichaperone is greater than the free energy released by folding the mutant, and that folding of the native minichaperone in vacuo is more favorable than folding of the mutant minichaperone in vacuo.

The cycles which simulated the solvation of the folded wild type and mutant minichaperones (figure 3) and the folding of the native and mutant minichaperones in a vacuum (figure 6) produced positive ΔΔG values. The cycles which studied the solvation of the folded minichaperones from a vacuum (figure 5) and the folding and solvation of the wild type and mutant minichaperones (figure 4) produced negative ΔΔG values. Three of the four ΔΔG values were quite large in magnitude. With the exception of the cycle which examined folding of the minichaperones in a vacuum, which returned a ΔΔG value of +4.1 kcal, the cycles produced ΔΔG values of -118.8 kcal, +111.5 kcal, and -694.8 kcal. These values support qualitative, rather than quantitative, conclusions.

Proteins normally exist in aqueous environments. While the folding of a protein in a vacuum may allow examination of the folding pathway and the study of the interactions responsible for the formation of the molten globule and the final, folded state, the energies obtained from such calculations may not accurately reflect the protein’s actual energies in its aqueous environment. Therefore, the results from the thermodynamic cycle in which water was excluded from all calculations (figure 6) may be treated as suspect.

The other thermodynamic cycle which produced a positive ΔΔG value examined the solvation of the folded mutant and wild type minichaperones (figure 3). Examination of the structures of the L237E minichaperone and its solvent shell revealed a new hydrogen bond between the glutamic acid in the minichaperone’s active site and a water molecule (figure 10). This additional hydrogen bond, while certainly not worth 111.5 kcal, could well make the solvation of the mutant minichaperone more favorable.

Figure 10. Comparison of the wild type and mutant minichaperones after energy minimization in a 4 Å solvent layer. The minichaperones and solvent shell are colored from red to blue in order of decreasing hydrophobicity. The mutated residues are colored by atom. The polar backbone of glutamate 237 in the mutated minichaperone is clearly visible and available for hydrogen bonding with the solvent.

An analogous cycle, in which the solvation of the wild type and mutant minichaperones from a vacuum was studied (figure 5), produced a negative ΔΔG value as expected. This cycle assumed that the solvent shells of the wild type and mutant minichaperones would minimize to the same energy. This energy would not need to be taken into consideration during calculation of the ΔG for making the mutation in the folded, non-solvated protein, and would therefore not enter into the calculation of the ΔΔG. The large value of the ΔΔG for the cycle, though, prevents quantitative conclusions from being drawn. In addition, both this cycle and its analog in which the solvent shell is incorporated into energy minimizations of the starting minichaperones assume that the mutant minichaperone will fold correctly prior to solvation. Since proteins are not produced in their folded states, this assumption is not applicable to the experimental results. It does suggest that the folded mutant minichaperone might be more stable in the gas phase, but there are currently no data to confirm or deny this prediction.

The final thermodynamic cycle (figure 4) examined the energy of folding and solvating the wild type and mutant minichaperones. The ΔΔG obtained from this cycle was -118.8 kcal. While this number is quite large, it does indicate that the energy of folding and solvating the native minichaperone is more negative (and therefore more favorable) than the energy of folding and solvating the L237E minichaperone. This qualitative result is probably the most consistent with the actual data, which indicate that the folded mutant minichaperone is poorly expressed and recovered.


Conclusions

The accuracy of the energy minimizations in this project was hampered by the lack of the computing power required to model the solvent more accurately. Simulation of the solvent with periodic boundary conditions (PBC) would probably have yielded better results. PBC calculations on this protein, though, would not be possible without substantial upgrades to the computing power available for this course. Given the 4 Å solvent layer used for the calculations, and the possible inadequacies of the Insight II solvent model, it is not surprising that only qualitative conclusions could be drawn from the thermodynamic cycles, and that results from two of the cycles did not agree with the actual data obtained during expression of the mutant minichaperone.

Even under these conditions, it was possible to obtain some data which agreed with the results of in vitro experiments. The results obtained from the thermodynamic cycle describing the folding and solvation of the minichaperone are in accordance with the data indicating that this mutant is difficult to express in a system where the wild type minichaperone and other mutants express well. The failure of an analogous thermodynamic cycle lacking water to concur with actual results indicates the importance of including water in energy minimization calculations involving proteins or other biological macromolecules whose natural environment is aqueous.


References

1 Bukau, B., and Horwich, A. L. (1998). Cell 92, 351-366.

2 Chatellier, J., Hill, F., Lund, P. A., and Fersht, A. R. (1998). Proc. Natl. Acad. Sci. U.S.A. 95, 9861-9866.

3 Zahn, R., Buckle, A. M., Perrett, S., Johnson, C. M., Corrales, F. J., Golbik, R., and Fersht, A. R. (1996). Proc. Natl. Acad. Sci. U.S.A. 93, 15024-15029.

4 Braig, K., Otwinowski, Z., Hegde, R., Boisvert, D. C., Joachimiak, A., Horwich, A. L., and Sigler, P. B. (1994). Nature (London) 371, 578-586.

5 Buckle, A. M., Zahn, R., and Fersht, A. R. (1997). Proc. Natl. Acad. Sci. U.S.A. 94, 3576-3578.

6 Discover Manual, Part 1, September 1996. San Diego: Molecular Simulations, Inc., 1996.