Structural biology is central to understanding how macromolecules function. By revealing their roles in disease and guiding the development of therapeutic molecules, it has contributed substantially to scientific progress. The Protein Data Bank has long served as the repository for protein structures, most of them determined by X-ray crystallography. In recent years, two breakthroughs have transformed the field: the cryo-electron microscopy (cryo-EM) resolution revolution and machine-learning-based protein structure prediction, exemplified by AlphaFold and RoseTTAFold. Despite these advances, experimental phasing remains indispensable, especially when predictions fall short or when proteins are unsuitable for cryo-EM because of their size.
The Role of Experimental Phasing:
Experimental phasing, particularly single-wavelength anomalous diffraction (SAD), plays a pivotal role in determining protein structures. SAD involves collecting data at a wavelength close to the absorption edge of an anomalous scatterer. Traditionally, selenium has been the preferred scatterer. With long-wavelength X-rays, however, the method can be extended to lighter elements such as sulphur, calcium, potassium, chlorine, and phosphorus, which commonly occur in biological structures or crystallization conditions. When such intrinsic scatterers are used, the method is called native-SAD, since no extra anomalous scatterer is added during protein production or by soaking crystals with heavy atoms. For instance, sulphur-SAD (S-SAD) uses the sulphur atoms of cysteine and methionine residues, which eliminates the need for protein labelling and saves time and money.
Challenges and Solutions:
Long-wavelength X-rays nevertheless present their own challenges. They are absorbed more strongly in air than shorter wavelengths and are diffracted to larger angles, so specialized instrumentation is needed. Beamline I23 at Diamond Light Source was developed to address these challenges. It operates in a vacuum to eliminate air scattering and absorption, and it has a large detector (Pilatus 12M, Dectris) that records reflections at large diffraction angles. Sample cooling, a further challenge, was solved by using conductive cooling instead of traditional LN2 cryo-streams.
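The effect of wavelength on diffraction angle follows directly from Bragg's law, λ = 2d sin θ. The short Python sketch below is illustrative only: the function name, the comparison wavelength of 1.0 Å, and the 1.8 Å d-spacing are assumptions chosen for the example and do not come from the beamline documentation.

```python
import math


def two_theta_deg(wavelength_A, d_spacing_A):
    """Scattering angle 2-theta (degrees) from Bragg's law: lambda = 2 d sin(theta)."""
    s = wavelength_A / (2.0 * d_spacing_A)
    if s > 1.0:
        # d-spacings below lambda/2 cannot be measured at this wavelength.
        raise ValueError("d-spacing is not reachable at this wavelength")
    return 2.0 * math.degrees(math.asin(s))


# Hypothetical comparison: the same 1.8 A reflection at a short and a long wavelength.
for wavelength in (1.0, 2.75):
    angle = two_theta_deg(wavelength, 1.8)
    print(f"wavelength {wavelength:.2f} A: 2-theta = {angle:.1f} deg at d = 1.8 A")
```

Under these assumptions the same reflection moves from a 2θ of roughly 32° to nearly 100°, which illustrates why a detector covering large diffraction angles is required. The geometric limit at any wavelength is d = λ/2, reached at 2θ = 180°.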
Advantages of Long-Wavelength Native-SAD:
Long-wavelength native-SAD offers the prospect of solving protein structures from data with lower multiplicity, often from a single crystal, thanks to the stronger anomalous signal at long wavelengths and the reduced background noise in a vacuum. This is particularly valuable for projects with limited crystal availability or with crystals sensitive to radiation damage. Furthermore, long-wavelength native-SAD widens the choice of anomalous scatterers for phasing to include elements such as calcium, potassium, chlorine, vanadium, cadmium, iodine, and even phosphorus. Identifying and localizing these elements in protein structures provides invaluable functional insights.
Applications and Success:
Native-SAD phasing has been applied to a diverse array of protein structures, including soluble and membrane proteins with molecular weights ranging from 14 to 114 kDa. Even proteins with low sulphur content can be phased effectively using native-SAD at longer wavelengths. Notably, the technique has succeeded with integral membrane proteins, which are notoriously challenging to study. Proteins with modest diffraction resolution have also been phased, which underscores the robustness of the method.
Optimizing Data Collection for Long Wavelengths:
While solving these structures, it became evident that the choice of wavelength is critical in S-SAD experiments. The optimal wavelength for S-SAD experiments on beamline I23 has been determined to be λ = 2.75 Å, which for most crystals offers an effective compromise between anomalous signal and absorption effects.
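Because beamline settings are often quoted as photon energies rather than wavelengths, it can help to convert between the two using E = hc/λ. The sketch below is a minimal illustration: the function names are hypothetical, and the sulphur K-edge energy of about 2.472 keV is an approximate tabulated value included only for comparison, not a figure taken from the text above.

```python
# h*c expressed in keV * Angstrom (from the CODATA values of h and c).
HC_KEV_ANGSTROM = 12.398419843


def wavelength_to_energy_kev(wavelength_A):
    """Photon energy in keV for a wavelength given in Angstrom."""
    return HC_KEV_ANGSTROM / wavelength_A


def energy_to_wavelength_A(energy_kev):
    """Wavelength in Angstrom for a photon energy given in keV."""
    return HC_KEV_ANGSTROM / energy_kev


# Wavelength quoted in the text for S-SAD on I23.
print(f"2.75 A corresponds to {wavelength_to_energy_kev(2.75):.3f} keV")

# Assumption: approximate tabulated sulphur K-edge energy, for comparison only.
SULPHUR_K_EDGE_KEV = 2.472
print(f"Sulphur K edge: about {energy_to_wavelength_A(SULPHUR_K_EDGE_KEV):.2f} A")
```

Under these assumptions, λ = 2.75 Å corresponds to a photon energy of about 4.51 keV, which lies on the high-energy side of the sulphur K edge, where sulphur still scatters anomalously.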
Sulphur content is also a vital factor for S-SAD success, but it is not the sole determinant. Key parameters that contribute to the effectiveness of SAD analyses have been identified, including the ratio between the number of unique reflections and the number of anomalous scatterers. For successful S-SAD phasing at λ = 2.75 Å on beamline I23, this ratio typically needs to exceed 1000, a threshold associated with a success rate of approximately 89% among the structures deposited in the PDB.
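The ratio criterion is simple to check before attempting phasing. The following Python sketch assumes the ratio is computed as the number of unique reflections divided by the number of anomalous scatterers, as described above; the function names and the example numbers are hypothetical and serve only to illustrate the arithmetic.

```python
def reflections_per_scatterer(n_unique_reflections, n_anomalous_scatterers):
    """Ratio of unique reflections to anomalous scatterers."""
    if n_anomalous_scatterers <= 0:
        raise ValueError("at least one anomalous scatterer is required")
    return n_unique_reflections / n_anomalous_scatterers


def meets_ratio_guideline(n_unique_reflections, n_anomalous_scatterers, threshold=1000.0):
    """True if the ratio exceeds the guideline value (1000 in the text)."""
    ratio = reflections_per_scatterer(n_unique_reflections, n_anomalous_scatterers)
    return ratio > threshold


# Hypothetical datasets, each with 12 sulphur atoms in the asymmetric unit.
for n_refl in (25000, 8000):
    ratio = reflections_per_scatterer(n_refl, 12)
    verdict = "above" if meets_ratio_guideline(n_refl, 12) else "below"
    print(f"{n_refl} reflections / 12 scatterers = {ratio:.0f} ({verdict} the guideline)")
```

A ratio above the threshold is a guideline rather than a guarantee: the text notes that it typically needs to be exceeded, and other factors such as data quality also influence the outcome.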
Effective native-SAD phasing relies on a specific data collection strategy. Typically, three 360° datasets are collected from a single crystal, with the datasets recorded at different crystal orientations using a multi-axis goniometer to limit systematic errors and increase the true multiplicity of the measured data. This strategy makes it possible to assess radiation damage during data collection and to minimize its effects, both of which contribute to the success of S-SAD phasing experiments.
These advances in structural biology, from cryo-EM and machine-learning-driven prediction to refined experimental phasing, are driving the field toward greater understanding and discovery. They give researchers powerful tools for studying macromolecular function and its implications in health and disease, and ultimately for developing innovative therapeutics. The unique setup of the long-wavelength beamline I23 at Diamond Light Source enables experiments at wavelengths not attainable elsewhere, which opens unprecedented opportunities for experimental phasing. Versatile and adaptable, experimental phasing will remain a cornerstone in the pursuit of a deeper understanding of the biological world.