Experimental design: variables, controls, replicates, etc.

Yesterday I told you about some general principles of clinical trials, and how it’s really important that they’re controlled, meaning that if you have 2 groups of things you’re comparing (e.g. treatment vs. no treatment) you want to make sure (as much as possible) that the only thing different between the groups is the treatment. I’m not involved in clinical trials, but the experiments I carry out in the lab (that is the experiments I carried out in the lab and hope to be able to return to soon…) are each like mini (more easily controlled) trials – a lot of the same principles about dependent variables, independent variables, replicates, etc. still hold. So today I thought I’d tell you some more about experimental design.

In this post, I hope to take you inside the mind of a scientist as they plan out an experiment. With any experiment, you have to make compromises to get the type of information that is most important for you. Many of these “compromises” involve controlling variables, a crucial aspect of the scientific process.

A Thought Experiment: To illustrate some points, let’s start with a “simple” experiment, one you might have done for a science fair. Say you want to test the effects of water and light exposure on plant growth. Seems pretty straightforward, right? Water and light are the “independent variables” you want to test and growth is the “dependent variable” that is “dependent” on your independent variables. So you take some seeds, give them different amounts of water and light, and measure their growth. But wait, what do you mean by growth? Increases in height, leaf size, circumference, total mass? In this simple experiment, you could collect them all, but this is often not practical. Once you choose your experimental “readout” you need to determine when to take the measurements – this depends on what you want to know. Are you interested in changes in the rate of growth? If so, you’d want to take a series of measurements at fixed timepoints. If you don’t care about rate, just overall growth, you could just take a single measurement at a single timepoint.

For simplicity’s sake for this thought exercise, let’s say you decide to measure plant height after two weeks. Now you have to decide how you want to change your variables. If you change the amounts of water and light at the same time, any changes in growth you see are a combination of effects of changing water and changing light and you won’t know how much of the change you see is due to which factor. If you want to get information about the individual contribution of one variable, you need to hold the other variable constant – so you run two parallel sets of experiments. In one, you give each plant the same amount of water but different amounts of light and vice versa for the other set.
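To make the “two parallel sets” idea concrete, here is a minimal sketch of how the treatment groups could be laid out. The specific water and light levels, the baseline values, and the function name are all made up for illustration; they aren’t from a real protocol.

```python
# Hypothetical treatment levels (made-up values, for illustration only).
WATER_LEVELS_ML = [50, 100, 200]      # mL of water per day
LIGHT_LEVELS_HOURS = [4, 8, 12]       # hours of light per day
BASELINE_WATER_ML = 100               # held constant while light varies
BASELINE_LIGHT_HOURS = 8              # held constant while water varies


def one_factor_at_a_time():
    """Build two parallel sets of groups, each varying only one variable."""
    vary_light = [
        {"water_ml": BASELINE_WATER_ML, "light_hours": hours}
        for hours in LIGHT_LEVELS_HOURS
    ]
    vary_water = [
        {"water_ml": ml, "light_hours": BASELINE_LIGHT_HOURS}
        for ml in WATER_LEVELS_ML
    ]
    return vary_light, vary_water


vary_light, vary_water = one_factor_at_a_time()
for group in vary_light:
    print("light set:", group)
for group in vary_water:
    print("water set:", group)
```

Within each set only one value changes from group to group, so a difference in height between groups in the same set can be attributed (as far as the other controls allow) to that one variable.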

If all variables but the one you’re interested in are kept constant, then any differences in the dependent variable (growth in this case) are taken to be due to the independent variable you changed. If every other variable really were kept constant, this would be true, but this theoretical perfectly controlled system doesn’t exist. There is always some variability in your variables! For example, there could be genetic differences in the seeds, slight differences in soil composition, differences in distance to the light, etc. The difficulty of controlling experimental variables is especially pronounced in biology because living organisms are incredibly complex.

Replicates: It is impossible to control for every variable – to account for this, scientists include replicates. With replicates, you hope that although each replicate will differ slightly, these differences will “buffer” each other, similar to how all the colors in the rainbow “cancel each other out” to make white. There are two main types of replicates that are both important:

Technical replicates are when you test the same sample multiple times to buffer out inconsistencies in measuring. In our plant case, this would mean measuring each plant several times – Were you measuring from exactly the same starting point? Did you correctly count the number of lines on the ruler?

Biological replicates are when you test different samples that are “identical” in all aspects but their source. In our plant case, this would mean including multiple seeds in each treatment group. Since it’s just a theoretical experiment, we could include as many seeds as we want, but in real experiments there are practical limitations (e.g. availability and cost of samples, amount of time and energy needed to collect the data).

More practical labby advice on replicates later…

In order to detect effects of your treatment, you need to make sure that differences between treatment groups are bigger than differences between individual samples within those treatment groups, and there are statistical tests scientists use to estimate how likely it is that the effects are due to the treatment.
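As a rough illustration of “differences between groups vs. differences within groups,” here is a sketch that computes Welch’s t statistic, one common way of scaling a difference in group means by the spread within the groups. The plant heights are invented, and this only computes the statistic itself; a real analysis would pick a test suited to the design and get a p-value from a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Difference in means divided by its standard error (Welch's t)."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


# Made-up plant heights in cm. Both "high light" groups average 12 cm,
# but one has far more plant-to-plant variation than the other.
low_light = [10.1, 9.8, 10.4, 10.0, 9.7]
high_light_tight = [12.0, 11.6, 12.3, 11.9, 12.2]
high_light_noisy = [8.0, 16.0, 10.5, 14.5, 11.0]

t_tight = welch_t(high_light_tight, low_light)
t_noisy = welch_t(high_light_noisy, low_light)
print(f"same 2 cm difference, tight groups: t = {t_tight:.2f}")
print(f"same 2 cm difference, noisy group:  t = {t_noisy:.2f}")
```

The difference in average height is the same in both comparisons, but the statistic is much smaller when the plants within a group vary a lot, which is why within-group variability matters so much for detecting an effect.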

Over-control? Controlling variables is crucial, but even if you could perfectly control every variable but the one you are interested in, you would lose important information in doing so. In science we talk of “non-additive effects” – where the combined effect of several variables differs from the sum of their individual effects (it can be larger or smaller) because the variables themselves are interdependent. Say you wanted to determine the optimal amount of light and water for plant growth – you change these variables independently as we outlined above, and determine that the optimal amount of light is some value, A, and the optimal amount of water is B. This doesn’t necessarily mean that the optimal growth conditions are light A combined with water B. It could be that light has a bigger effect at a certain water level, but you wouldn’t see that effect if you only tested at a different water level. It also could be that one of the “controlled variables” such as temperature has a similar effect, with the effects of light or water being more pronounced at certain temperatures. It’s rarely practical to test every combination of variables, so scientists must make compromises when designing their experiments.
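Here is a small worked example of a non-additive effect, using a hypothetical 2×2 experiment (low or high water, low or high light). The heights are invented purely to show the arithmetic.

```python
# Made-up average plant heights in cm, keyed by (water, light) level.
heights_cm = {
    ("low", "low"): 10.0,
    ("low", "high"): 12.0,
    ("high", "low"): 13.0,
    ("high", "high"): 20.0,
}

baseline = heights_cm[("low", "low")]

# Effect of each variable on its own, with the other held at "low".
light_effect_alone = heights_cm[("low", "high")] - baseline
water_effect_alone = heights_cm[("high", "low")] - baseline

# What we would expect for high water + high light if effects simply added.
additive_prediction = baseline + light_effect_alone + water_effect_alone

# The gap between observed and predicted is the interaction.
interaction = heights_cm[("high", "high")] - additive_prediction

# Equivalent view: the effect of light depends on the water level.
light_effect_at_high_water = heights_cm[("high", "high")] - heights_cm[("high", "low")]

print("light effect at low water: ", light_effect_alone)
print("light effect at high water:", light_effect_at_high_water)
print("additive prediction:", additive_prediction)
print("interaction:", interaction)
```

In these invented numbers, light adds only 2 cm when water is low but 7 cm when water is high, so testing light at a single water level would miss most of its effect.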

A more “real-world” example. To show how these concepts play out in a more realistic scenario, let’s consider pharmaceutical drug development. Many early experiments are performed on cells in a dish (cell culture), which allows for moderate control over variables while still working in a cellular context. If a scientist wants to test the effects of a drug on human cells, they could take cells and plate them in 2 dishes – add the drug to one dish and only the delivery vehicle (the liquid the drug is dissolved in) to the second dish as a negative control. As we saw above, technical and biological variability could affect the results so the scientist would actually want to set up a number of dishes, not just one of each.

Say the scientist sees that the drug has a desired effect – it’s not quite time to celebrate. To make sure that the observed results weren’t specific to that cell preparation, they would also want to repeat the experiment on a different date with “new” cells. Next, they will likely test the drug on a different cell line (the initial source of the immortalized cells is different, not just the “batch” of those cells) to make sure that the effects aren’t cell-line specific.

If the drug has the same effect on multiple cell lines, it is more likely to have that effect in the body (in vivo), but this is far from guaranteed because the life of a cell in a dish is much different from the life of a cell in the body, where there are complex dynamics between cells and their surrounding environment, not to mention potential “off-target” effects that could cause dangerous complications. This is why further testing of the drug is required to determine 1) is it safe and 2) does it work?

When it comes to testing drugs in people, controlling (and over-controlling) variables is often a point of contention. If you thought cells in a dish were inherently variable, complete human beings are all the more so! In order to control for some of this variability, there are often strict requirements for participation in drug trials. As we saw above, there are legitimate reasons for such control – for example, if you test a drug in a patient who has an additional medical problem and that patient has a complication, you don’t know if it’s because of the drug alone or the preexisting condition, or the combination of the two. However, a problem often arises with regards to over-controlling variables. Tight control can lead a drug to be tested and approved on a population that isn’t representative of the true patient population. The drug therefore might not be effective in most patients (and can even have adverse effects). As you can see, scientists must make difficult and careful decisions when designing their experiments.

Some more about replicates: To review, TECHNICAL REPLICATES test the same sample multiple times to account for variation in measuring whereas BIOLOGICAL REPLICATES test different samples. And both of these are different from independent experiments, where you test different samples on different days with fresh setups, etc. Independent experiments can account for things like “there was something in the water” or, more commonly when it comes to biochemistry, there was something left out of the water! It’s really easy to accidentally forget to add things, so you want to develop systems like moving tubes to a different rack after you add to them, or checking off things you’ve added on a piece of paper. A word of caution, though: it takes a while to develop habits, so in the beginning you can confuse yourself more, because you might add something but forget to cross it off or move the tube, and then you think you haven’t added it when you have!

Each of these types of “double-checking” has value, but in different ways. To help illustrate the difference, let’s look at an example. Types of replicates are often explained in terms of patients or lab rats – e.g. say you treat 10 people with a drug that’s supposed to lower blood pressure and then you measure the treated people’s blood pressure. If you measure the same person’s blood pressure 10 times (maybe you thought the machine was acting weird or something – or it took the person a while to relax) those would be technical replicates (they tell you about how reliable the measuring is and variation within the sample). If you measure each person’s blood pressure those would be biological replicates (these will tell you about differences between how people respond). And if you repeated the experiment with a different group of people, that would be an independent experiment (this tells you about how representative of the wider population that first group was).

Say 1 of the people responded really well to the treatment but the other 9 didn’t. If you measured that 1 person’s blood pressure 5 times but everyone else’s just once and then you took the average of all 14 measurements, it’d seem like the drug worked a lot better than it actually did. So instead, when you average, you average the averages of the technical replicates (you’d first average that strong responder’s 5 measurements into a single value so that person counts once and doesn’t skew the results). So, if you have 10 people, your “n” is 10 regardless of how many measurements you make.
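To see how much the skew can matter, here is a sketch of that arithmetic with invented numbers: one strong responder measured 5 times and 9 weak responders measured once each. The values are hypothetical changes in blood pressure (in mmHg, negative meaning a drop), and the names are just labels for this example.

```python
from statistics import mean

# Made-up change in blood pressure (mmHg) per measurement, per person.
# person_01 is the strong responder, measured 5 times (technical replicates).
changes_mmhg = {
    "person_01": [-20.0, -21.0, -19.0, -20.0, -20.0],
    "person_02": [-1.0],
    "person_03": [0.0],
    "person_04": [-2.0],
    "person_05": [-1.0],
    "person_06": [0.0],
    "person_07": [-1.0],
    "person_08": [-2.0],
    "person_09": [0.0],
    "person_10": [-2.0],
}

# Misleading: pool every measurement, so the strong responder counts 5 times.
all_measurements = [x for readings in changes_mmhg.values() for x in readings]
pooled_mean = mean(all_measurements)

# Better: average each person's technical replicates first, then average people.
per_person_means = [mean(readings) for readings in changes_mmhg.values()]
mean_of_means = mean(per_person_means)

n = len(per_person_means)  # sample size is the number of people

print(f"pooled mean of {len(all_measurements)} measurements: {pooled_mean:.2f} mmHg")
print(f"mean of {n} per-person means: {mean_of_means:.2f} mmHg")
```

With these numbers the pooled average makes the drop look more than twice as large as the per-person average does, even though nothing about the people changed.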

I don’t work with people (well, I do work with people but I don’t do research on them) or animals – but I do work with a lot of replicates, whether they’re different protein preps (biological replicates) or tests of the same protein prep repeated multiple times to account for things like pipetting differences, etc. (technical replicates). And if I repeat the experiment on a different day with different preps, that would be an independent experiment.

technical replicates

e.g. same protein prep, repeat experiment in parallel or take multiple samples from the same reaction
variation in measurement – consistent pipetting (was there an air bubble, did you forget to add something)
variation in sample – was the sample mixed well? did you happen to take a pipetful that was super full of stuff?
more technical replicates -> better estimation of the mean but does not change sample size

biological replicates

how representative is your sample?
different people or animal subjects, different cell lines, different protein preps
are differences you saw in one sample really real? Or are they just background variation?

independent experiments

human error?
equipment error?

Many Approaches: Another thing to take into account is that in biochemistry there are often several approaches to a question. Say I want to figure out if 2 proteins interact – should I use an EMSA, an IP, analytical chromatography? (not gonna explain these techniques here, but you can find more info about them on my blog). One isn’t inherently “better” or “worse”; they just give you different information. Each experiment, in all areas of science, has its strengths and weaknesses. It is important that scientists explore their options, think critically, and choose the experiment that will answer the question they’re asking. Ideally, scientific conclusions should be drawn from multiple lines of evidence and multiple experiment types. Similarly to how a large number of biological replicates helps “buffer” variation, using multiple experiment types allows the strengths of one technique to compensate for the weaknesses of another.

In addition to careful experimental planning, it is important that scientists recognize the weaknesses and limitations of the methods they choose to use and convey these caveats to their audience. If you are the audience, some things to look for are: replication (technical but especially biological) and multiple lines of evidence (different types of experiments used).

This post isn’t meant to dampen your enthusiasm for science, but rather to help you think like a scientist and understand why we do the things we do. There are many ways to answer similar scientific questions and the particular experiment you choose depends on many factors (both practical and theoretical). Like in everything, there is variability among scientists and the techniques we choose, but this variability doesn’t make the pursuit of science less valid.

more on clinical trials: https://bit.ly/clinicaltrialterms

more #365DaysOfScience All (with topics listed) 👉 http://bit.ly/2OllAB0
