X-ray crystallography

What will 2020 bring? I don’t have a crystal ball, but I do have protein crystals! And I predict that, if your New Year’s Resolution is to learn more about x-ray crystallography, you’re in luck! New Year’s is probably the only time of the year when the general public talks about resolutions as much as x-ray crystallographers do, so I thought I’d “ride the hashtag” and tell you about this cool technique for looking at molecules and what “resolution” means when structural biologists talk about it.

We’ve been “looking” a lot at proteins and at the amino acid “letters” that they’re made up of – and of the atoms those letters themselves are made of. But how do we actually *look* at proteins and their parts? One way is with a technique called x-ray crystallography. Basically you get a protein really really pure and then you take that dissolved protein and (through a lot of screening for good conditions) convince the individual protein molecules to ditch their watery coats (come out of solution) and bind to one another in an orderly lattice we call a crystal. And then you beam x-rays at the crystal.

When the x-ray waves run into the electrons of the atoms making up the proteins inside the crystal, they get scattered. Kinda like dropping a ball in a pool, the electric field of X-rays perturbs the electron clouds surrounding the nuclei of atoms, causing them to vibrate and give off their own waves. And this happens in each of the lots and lots of individual protein molecules inside the crystal. Since a crystal has a defined spacing, for each wave scattered, there’s almost always another wave exactly out of phase (e.g. one’s at a peak when another’s at a trough), so most of the scattered rays will cancel each other out (destructive interference), but some will add together to give you a stronger wave (constructive interference).
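To make the canceling-vs-adding idea concrete, here’s a toy sketch (definitely not how real diffraction software works). It adds up waves from a row of evenly spaced scatterers, where each wave is delayed by the same phase step relative to the one before it. The number of scatterers and the two phase steps are made-up values chosen just for illustration.

```python
import cmath
import math

def summed_amplitude(n_scatterers, phase_step):
    """Amplitude of the sum of n unit waves, each shifted by phase_step radians from the last."""
    total = sum(cmath.exp(1j * k * phase_step) for k in range(n_scatterers))
    return abs(total)

N = 100  # hypothetical number of scatterers in a row

# Phase step of one full cycle: every wave lines up peak-to-peak (constructive).
in_step = summed_amplitude(N, 2 * math.pi)

# Arbitrary fraction of a cycle: the waves drift out of step and mostly cancel (destructive).
out_of_step = summed_amplitude(N, 2 * math.pi * 0.37)

print(f"in step: {in_step:.2f}, out of step: {out_of_step:.2f}")
```

The in-step sum grows with the number of scatterers, while the out-of-step sum stays tiny no matter how many you add. That’s why a crystal (with its huge number of copies) gives sharp, strong spots only in certain directions.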

We collect a “diffraction pattern” consisting of a pattern of spots showing us where those strong “diffracted” waves hit a detector – and then we (our computers) work backwards mathematically from those spots to figure out where the electrons are that scattered them. This gives us a meshy thing called an “electron density map” and then – since we know that electrons orbit around the dense central part of atoms (the atomic nuclei) – we can build an atomic model (those sticky or ribbony things) into that mesh indicating the location of the center of each of the atoms that make it up.

That is, if you have high enough resolution data…

Resolution refers to how close together 2 things can be before you stop being able to tell that they’re 2 separate things (so “higher resolution” has lower numbers). It’s like if you have 2 asterisks * & *. If they’re really far apart, it’s easy to see there are 2 of them: *_____________*. But as they get closer together, it becomes harder to tell them apart: *_*. And at some point, they’ll be so close together you’ll see it as a single thing **. The cutoff point will be slightly different for different people depending on how good their eyesight is.
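Here’s a rough sketch of the asterisk idea in code. It’s a toy model under my own assumptions: each “thing” is a Gaussian blob, the `blur` parameter stands in for how good your eyesight (or data) is, and I call the pair “resolved” if the signal halfway between them is lower than the signal at a blob’s center. Real resolution criteria are more careful than this; the function names are just for illustration.

```python
import math

def blurred_pair(x, separation, blur):
    """Signal at position x from two equal Gaussian blobs centered at +/- separation/2."""
    half = separation / 2
    left = math.exp(-((x + half) ** 2) / (2 * blur ** 2))
    right = math.exp(-((x - half) ** 2) / (2 * blur ** 2))
    return left + right

def looks_like_two(separation, blur):
    """True if the signal dips in the middle, so the blobs read as 2 separate things."""
    midpoint = blurred_pair(0.0, separation, blur)
    at_blob = blurred_pair(separation / 2, separation, blur)
    return midpoint < at_blob

for separation in (5.0, 3.0, 1.0):
    print(separation, looks_like_two(separation, blur=1.0))
```

Shrinking `blur` (better “eyesight”) lets you tell apart blobs that are closer together, which is the same thing as having a smaller resolution number.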

Similarly, at low resolutions, we can make out things like the protein backbone, but it’s harder to be confident about the position of things like side chains (the unique parts that stick off the backbone). Once you get to higher resolutions, you start to be able to make out the side chains & their orientations. We call structures w/resolution at or better than 1.2 Å (an angstrom is 10⁻¹⁰ meters, or 100 picometers (100pm)) “atomic resolution” because we can confidently make out the location of all the atoms. BUT this is rare in protein crystallography. The average published structure is ~2 Å, but resolution isn’t the only thing that matters.

But we’re doing all this “seeing” of proteins without actually seeing them – the reason we can see things is that when light (made up of packets of energy called photons traveling in waves) hits a solid object, like the 2 things we’re trying to tell apart, some light waves get reflected at us and it’s the lenses’ job to take those light waves and bend them to focus them onto our eyes’ retinas. And photoreceptors in the retina send signals to the brain to interpret.

But the waves we use for x-ray crystallography are way too energetic for such focusing – they don’t get slowed down when they travel through glass so they wouldn’t get “bent” by a lens like visible light does – and they’d basically fry any lens we tried to use to focus them (they’d even fry our detector so we have a “beam stop” in the direct path of the beam to absorb any of the concentrated rays that go through undeflected) so we have to work “directly” from (with evidence from) the diffracted waves.

X-rays and visible light might sound like 2 really different things, but X-rays are really just a much more energetic form of light – or visible light is just a much less energetic relative of X-rays. They’re both forms of ElectroMagnetic Radiation (EMR), a spectrum which includes everything from radio waves, microwaves, and infrared through visible light, ultraviolet (UV) light, and X-rays. All of these are made up of little packets of energy called photons that travel in waves, but they differ in how much energy they have in those packets.

All forms of EMR travel at the same linear speed (the speed of light, c, is ~3*10⁸ m/s in a vacuum and very nearly the same through air). So if you were to beam a laser and an x-ray from the same place at the same time, the wavefronts would arrive at a detector at the same time (though you would only be able to see the laser). BUT the photons in the x-ray beam will have traveled up and down many more times on the way there because they have more energy. It’s kinda like if an energetic little kid goes on a pogo-stick walk with his less-energetic grandpa. The kid might hop up and down more on the way to use up some of that energy without getting ahead of grandpa. Similarly, the more energy a photon has, the higher frequency the waves are (the pogo stick hits the ground more times in any given period) and, since it’s traveling the same overall distance, the peaks have to come closer together (shorter hops).

In more formal, less pogo-sticky terms, we can describe waves in terms of their wavelength (λ) (the peak-to-peak distance), their frequency (f) (# of peaks that will pass through a fixed point in a certain amount of time), or the energy of their photons (E). The energy of a photon is directly proportional to the frequency, with Planck’s constant (h = 6.626*10⁻³⁴ J·s) as the conversion factor: E=hf. And the frequency is speed of light over wavelength (c/λ) – put that into the equation and you get

E=hf=hc/λ.

In words,

higher frequency light requires higher energy photons and corresponds to shorter wavelengths

lower frequency light -> lower energy photons -> longer wavelengths
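If you want to plug numbers into E=hf=hc/λ yourself, here’s a small sketch. The values for h and c are the rounded ones from the text. The electronvolt conversion factor and the ~400-700 nm range for visible light are standard values I’m adding, and the function names are just for illustration.

```python
H = 6.626e-34    # Planck's constant, J*s
C = 3.0e8        # speed of light, m/s (rounded)
EV = 1.602e-19   # joules per electronvolt (added here for friendlier units)

def frequency_hz(wavelength_m):
    """f = c / wavelength"""
    return C / wavelength_m

def photon_energy_ev(wavelength_m):
    """E = h * f = h * c / wavelength, converted from joules to electronvolts."""
    return H * frequency_hz(wavelength_m) / EV

xray_ev = photon_energy_ev(1e-10)      # ~1 angstrom x-ray
violet_ev = photon_energy_ev(400e-9)   # violet end of visible light
red_ev = photon_energy_ev(700e-9)      # red end of visible light

print(f"x-ray: {xray_ev:.0f} eV, violet: {violet_ev:.2f} eV, red: {red_ev:.2f} eV")
```

A ~1 Å x-ray photon comes out at roughly 12,400 eV, thousands of times more energetic than any visible photon (which carry only a few eV each).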

In practice, our resolution is usually limited by our samples (any tiny defects in the crystal or slight variations between individual protein molecules can get them to deflect light slightly differently and thus “fuzzy up” the signals a little, making them harder to interpret). But, assuming you have a perfect sample, you are still limited in the resolution – this time from your light source.

The reason we have to use x-rays and can’t just use visible light which is so much easier to work with is that the wavelength of visible light is way bigger than the distances between the things we’re wanting to resolve. Kinda like how you wouldn’t use a yardstick to measure a hair, you can’t use radio waves to look at proteins. Optical physics says you cannot achieve better resolution than ~1/2 the wavelength of the light you’re using to tell things apart. And in crystallography we want to tell apart things that are REALLY close together – we usually measure resolution in ANGSTROMS (Å). An angstrom is 10⁻¹⁰ meters, or 100 picometers (100pm). The length of an average carbon-carbon single bond is ~1.54Å and the length of an average carbon-hydrogen bond is ~1.09Å. Even the most energetic visible light (violet) has a wavelength of ~400 nanometers, so ~4000 Å (and the least energetic, red, is ~700 nanometers, so ~7000 Å). We need something on the same order of magnitude as what we’re looking at, so we turn to X-rays, which range from 0.01-10 nanometers (so 0.1-100Å). For protein crystallography, we usually use x-rays with wavelengths of ~1 Å.
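As a quick sanity check on that “~1/2 the wavelength” rule of thumb, here’s a tiny sketch comparing light sources against the carbon-carbon bond length. It treats the limit as exactly wavelength/2, which is a simplification on my part, and the function names are hypothetical.

```python
ANGSTROMS_PER_NM = 10.0
C_C_BOND = 1.54  # angstroms, average carbon-carbon single bond

def best_resolution_angstrom(wavelength_nm):
    """Rule-of-thumb limit: about half the wavelength, converted to angstroms."""
    return wavelength_nm * ANGSTROMS_PER_NM / 2

def can_resolve(wavelength_nm, distance_angstrom):
    """True if the half-wavelength limit is at least as fine as the distance."""
    return best_resolution_angstrom(wavelength_nm) <= distance_angstrom

for label, wavelength_nm in [("violet light", 400), ("red light", 700), ("~1 angstrom x-ray", 0.1)]:
    limit = best_resolution_angstrom(wavelength_nm)
    print(f"{label}: limit ~{limit:g} angstroms, resolves C-C bond? {can_resolve(wavelength_nm, C_C_BOND)}")
```

Visible light is off by a factor of more than a thousand, while a ~1 Å x-ray has a theoretical limit of ~0.5 Å, comfortably smaller than a bond length.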

Getting that high energy light requires putting a lot of energy in. A LOT of energy – and you need the x-rays to be traveling in a really straight path and all have the exact same energy, so you can’t just go to Home Depot and buy yourself an x-ray flashlight. Instead, we usually go to synchrotrons (though we also have a less powerful “home source” at our lab). I’ve gone to Brookhaven National Laboratory (BNL)’s NSLSII synchrotron a couple of times and have collected data remotely from Lawrence Berkeley National Laboratory (LBNL)’s ALS synchrotron once – in such “remote collection” situations you ship your protein crystals to them and then you control the robot from any computer.

A quick overview of how a synchrotron works (cuz it’s cool) – electrons (e⁻) are produced by an electron gun (similar to those in cathode ray TVs but on a MUCH bigger scale). It generates e⁻ through thermionic emission – basically if you get a metal super hot it starts to lose e⁻. And since electrons are negatively-charged, if you put a positively-charged electric field nearby it will yank them away. In this case, it yanks them into a linear accelerator (LINAC) which has chambers of positive charge that attract the e⁻ & cause them to accelerate (to near the speed of light) towards a booster ring where a series of magnets direct them to travel in circles. Some of the e⁻ are then fed into the storage ring, which uses a series of magnets to get the electrons to change course and give off photons of different wavelengths that get separated and sent to “beam lines” & work stations where we can stick protein crystals in their path. More here: http://bit.ly/2z0JjwR

Once you collect a diffraction pattern you have to work backwards from that data to figure out where the molecules are that created the pattern. First you have to process the data, which involves combining information from images of the diffraction pattern from different angles. Then, in order to reconstruct the molecule based on the waves it gives off, you need to “determine phases” which can pose a real problem… We can measure the intensity of each spot to get that wave’s amplitude (peak height) but not its PHASE (position relative to all the other waves the crystal’s giving off) & we need both! There are different ways to solve this phase problem including using heavy atoms or using a similar protein as a reference. Then you build the model, refine & validate it using your knowledge of biochemistry to make sure it makes sense biochemically & fits the experimental data. Then you deposit it into the Protein Data Bank (PDB) so anyone can view & analyze it.
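To get a feel for why losing the phases is such a big deal, here’s a toy 1D sketch. It’s a made-up example, not what real crystallography software does: two point “atoms” sit at hypothetical fractional positions in a 1D repeating unit, we compute the waves they’d scatter (amplitude AND phase), and then we add those waves back up to get a density. `ATOMS`, `H_MAX`, and `GRID` are illustrative names and values I picked.

```python
import cmath
import math

ATOMS = [0.20, 0.55]  # hypothetical atom positions (fractions of the repeating unit)
H_MAX = 10            # how many waves we "measure" on each side of zero
GRID = 100            # number of points where we sample the density

def scattered_wave(h):
    """Complex number holding both the amplitude and the phase of wave number h."""
    return sum(cmath.exp(2j * math.pi * h * x) for x in ATOMS)

def density(waves):
    """Add the waves back up (a Fourier sum) to get density at each grid point."""
    rho = []
    for i in range(GRID):
        x = i / GRID
        total = sum(w * cmath.exp(-2j * math.pi * h * x) for h, w in waves.items())
        rho.append(total.real)
    return rho

def top_two(rho):
    """Grid indices of the two highest density values."""
    return sorted(range(GRID), key=lambda i: rho[i], reverse=True)[:2]

true_waves = {h: scattered_wave(h) for h in range(-H_MAX, H_MAX + 1)}
amplitudes_only = {h: abs(w) for h, w in true_waves.items()}  # phases thrown away

with_phases = density(true_waves)
without_phases = density(amplitudes_only)

print("peaks with phases:", sorted(top_two(with_phases)))
print("highest point without phases:", max(range(GRID), key=lambda i: without_phases[i]))
```

With the phases, the two biggest peaks land right on the atoms (grid points 20 and 55). With only the amplitudes, the biggest peak piles up at position 0 instead, which tells you nothing about where the atoms actually are.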

As I mentioned briefly before, when we’re dealing with proteins or other “macromolecules” – maybe a protein/DNA complex or something – we’re almost always limited by the quality of our crystals instead of by the wavelength of our radiation. So how do we get good crystals?

A crystal is “just” an orderly 3D lattice of repeating units – kinda like floor tiles but in 3 dimensions – if you know where one spot is on one thing you know exactly where in space the corresponding spot is in every other copy of the thing because there’s a “recipe” to follow – like stick one copy down, take 2 steps left and 3 up, stick another copy down, etc. The “tiles” in a protein crystal can correspond to one or multiple copies of the protein.
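That “recipe” idea translates pretty directly into code. Here’s a minimal sketch, assuming a hypothetical box-shaped repeating unit with made-up edge lengths: given where one atom sits in one “tile,” it lists where the matching atom sits in the neighboring tiles.

```python
def lattice_copies(point, cell_vectors, repeats):
    """Positions of the same spot in a repeats x repeats x repeats block of tiles."""
    a, b, c = cell_vectors
    copies = []
    for i in range(repeats):
        for j in range(repeats):
            for k in range(repeats):
                # Step i times along a, j times along b, k times along c.
                copies.append(tuple(
                    point[d] + i * a[d] + j * b[d] + k * c[d] for d in range(3)
                ))
    return copies

# Hypothetical tile: a box 50 x 60 x 70 angstroms (values are made up).
cell = ((50.0, 0.0, 0.0), (0.0, 60.0, 0.0), (0.0, 0.0, 70.0))
atom = (12.5, 8.0, 30.0)  # made-up position of one atom in the first tile

copies = lattice_copies(atom, cell, repeats=2)
for position in copies:
    print(position)
```

Knowing one position plus the three “step” vectors is enough to place every copy, which is exactly the regularity that makes the scattered waves add up into clean spots.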

When something is dissolved, each molecule has a full coat of water, but to crystallize, something (like our protein) needs to come out of solution – “prioritize” contacts to things other than water – like other protein molecules. So, for instance, it swaps some of the water molecules it was coated in for interactions with other protein molecules. But the tricky part about crystals is that, while they represent optimal packing layouts, they require coordination because all the molecules have to arrange themselves the same way. And if they do it in different ways you just get clumpy protein “aggregate.”

When we “flash freeze” protein after purifying to protect it for storage until we’re ready to use it, we dunk tiny tubes of our protein into liquid nitrogen to rapidly cool them. And we do this to prevent the formation of ice crystals. This works because crystallizing takes coordination. And coordination takes time (think putting together a jigsaw puzzle “properly” vs just tossing all the pieces into a box). So if you don’t give molecules time to coordinate, they can’t crystallize. And we don’t want the water to crystallize because it could damage our protein. But with x-ray crystallography we *want* crystallization (but of our protein and not of water) so we want to *slowly* promote grouping together.

There are a lot of different techniques for doing this. The one that I’ve used the most is “vapor diffusion,” mostly “hanging drop” crystallization. Basically you stick a drop of liquid containing your protein on a glass slide and then you flip the slide over and use it as the “roof” for a well of protein-less liquid (reservoir). Since this reservoir liquid is more concentrated than the drop liquid, water evaporates from the drop to help “dilute out” the reservoir (not really its goal – it’s trying to escape the well altogether but there’s a lid, so it gets pulled in by the reservoir). And this leaves less water available to surround the protein molecules. So the protein molecules start binding each other instead – hopefully in the coordinated fashion that leads to nice crystals.
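Here’s some back-of-the-envelope arithmetic for what happens to the drop, as a sketch under some idealized assumptions that I’m adding: the drop is made by mixing protein solution with some reservoir liquid, the protein’s own buffer contributes nothing that matters, only water moves, and the reservoir is so much bigger than the drop that its concentration doesn’t change. The function name and the example volumes and concentration are hypothetical.

```python
def equilibrate_hanging_drop(protein_mg_per_ml, protein_ul, reservoir_ul_in_drop):
    """Idealized start and end state of a drop that loses water to the reservoir."""
    start_ul = protein_ul + reservoir_ul_in_drop
    # Mixing dilutes the reservoir ingredients in the drop by this factor.
    dilution = reservoir_ul_in_drop / start_ul
    # Water leaves until those ingredients are back up to reservoir strength.
    final_ul = start_ul * dilution
    # The protein stays in the drop, so it gets concentrated as the drop shrinks.
    start_conc = protein_mg_per_ml * protein_ul / start_ul
    final_conc = protein_mg_per_ml * protein_ul / final_ul
    return start_ul, final_ul, start_conc, final_conc

start_ul, final_ul, start_conc, final_conc = equilibrate_hanging_drop(
    protein_mg_per_ml=10.0, protein_ul=1.0, reservoir_ul_in_drop=1.0
)
print(f"drop: {start_ul} uL -> {final_ul} uL, protein: {start_conc} -> {final_conc} mg/mL")
```

In this idealized 1:1 mix, the drop shrinks to half its volume and the protein ends up twice as concentrated as when the drop was set, and it gets there gradually, which is the whole point.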

In addition to just removing water, we promote crystallization by optimizing the pH, salt types & concentrations, protein concentrations, etc. When you see “optimize” think “troubleshooting” and LOTS and LOTS of “trial and error” – since each protein is different and has different binding opportunities to offer up to other protein molecules and different types of interactions are favored in different conditions, the ideal “crystallization cocktail” varies from protein to protein – so we usually carry out extensive screening – we even have liquid dispensing robots to help us do this with tiny tiny volumes so that we can test hundreds of combinations without needing a ton of protein (you still need a lot of protein though because it needs to be at a high concentration so the molecules can find one another okay – and this can be a major limitation of crystallography).
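To show how quickly the combinations pile up, here’s a sketch of building a screening grid. Every value below is made up for illustration (they are not conditions from any real screen, and “precipitant” is just my stand-in for whatever ingredient you’re varying to push the protein out of solution).

```python
from itertools import product

# Hypothetical values to try for each variable.
ph_values = [5.5, 6.5, 7.5, 8.5]
salt_mm = [0, 100, 200]
precipitant_percent = [10, 15, 20, 25]
protein_mg_per_ml = [5, 10]

# Every combination of the values above, one dict per well.
conditions = [
    {"pH": ph, "salt_mM": salt, "precipitant_%": precip, "protein_mg/mL": protein}
    for ph, salt, precip, protein in product(
        ph_values, salt_mm, precipitant_percent, protein_mg_per_ml
    )
]

print(len(conditions), "conditions")
print(conditions[0])
```

Just 4 variables with a few options each already gives 96 combinations, so you can see why the robots (and tiny volumes) come in handy.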

We also have a microscope robot that takes pictures of our crystal trays for us over time so we can see if crystals are forming in any of the wells. If we get any hits, we can then optimize more around the conditions that were in those wells to try to get even better crystals. Once our crystals have grown and stopped growing (could be days to weeks to months depending on the crystal) we have to fish them out with little loops, freeze them with cryoprotectants, and store them in liquid nitrogen dewars to keep them super cold until we’re ready to collect diffraction data from them.

I cover a lot of these topics in more detail in past posts. You can find them http://bit.ly/2OllAB

more on topics mentioned (& others) #366DaysOfScience All (with topics listed) 👉 http://bit.ly/2OllAB0