Balancing Act
If you really want to do something, you’ll find a way. If you don’t, you’ll find an excuse. Jim Rohn
I've said this before, so I'll say it again. In science, half the battle is to specify the dimensionality of the problem. Once we've done that we're half way there. We can begin to think about collecting data from that multidimensional space, maximizing the information needed to permit the generation of multidimensional OS maps of our products and processes.
The trouble is that while most of the problems we work with are multidimensional, our exploratory strategies have evolved from foraging strategies designed around low dimensional 2-D and 3-D spaces. Whether you are a blackbird searching for earthworms on a lawn, a colobus monkey searching for fruits in a Ugandan rainforest, or a hunter-gatherer foraging for mushrooms, you will likely follow a search strategy exploiting small movements in regions of high reward punctuated by longer excursions as the returns diminish.
Previously we've seen that such scientific foraging is a recipe for disaster when exploring multidimensional spaces. Scientists just do not perform well in hyperspace - see Hyperspace: The Final Frontier. Even randomly selected points in multidimensional space perform better than scientists in the Virtual PCR Simulator.
Robots, on the other hand, can be programmed to systematically explore such spaces, maximizing the information needed to generate OS maps that let scientists navigate those multidimensional spaces. And while the early robots in the 80s were pretty basic, the modern robots are pretty amazing.
Perversely though, scientists will use the success of robots as a reason for inaction. They might accept that, in principle, designed experiments are a bit of a wheeze, a potentially useful dodge, possibly even the way forward. But not right now. Right now, they don't have the robots to make it happen.
I'll let you into a secret.
You don't need robots.
You can do this now.
When I first started there was little in the way of design software. I was cribbing designs from the back of a battered copy of the experimental design classic Cochran WG, Cox G 1957. Experimental Designs. Wiley, New York. My first ever designed experiment in the pharmaceutical industry was taken from this book. It was a bog standard 8-run fractional factorial simultaneously testing the effects of five factors.
But let me back up a bit.
I was at a meeting in Koningsberg with the Head of Pharmaceutical Sciences. On the way back, our flight was delayed and we were stuck at Schiphol Airport. Over a beer, on the back of a brochure, I showed him how fractional designs worked. This would allow him, I explained, to screen, say, seven factors at a time with just eight runs.
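For anyone who wasn't at Schiphol that evening, the trick can be sketched in a few lines. This is a minimal illustration, not the design from Cochran and Cox: it builds a full two-level factorial in three base columns (eight runs) and derives four more columns as products of those, giving seven factor columns in total. The generator choice (D = A*B, E = A*C, F = B*C, G = A*B*C) and the function name are assumptions made for the example.

```python
from itertools import product


def fractional_factorial_8_runs():
    """Return an 8-run, 7-column two-level design coded -1 (Lo) / +1 (Hi).

    Assumed generators (a common textbook choice, not taken from the article):
    D = A*B, E = A*C, F = B*C, G = A*B*C.
    """
    runs = []
    for a, b, c in product((-1, 1), repeat=3):  # full 2^3 factorial: 8 runs
        runs.append((a, b, c, a * b, a * c, b * c, a * b * c))
    return runs


design = fractional_factorial_8_runs()

# Print the design as a Hi/Lo table, one run per row.
for run in design:
    print(" ".join("Hi" if level == 1 else "Lo" for level in run))
```

Screening fewer than seven factors simply means using some of these columns and leaving the rest unassigned.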
He made the right noises, but I wasn't totally sure he was sold.
The following Monday though, my first client pitched up. We’ll call him ‘John’. [His real name is John, but to protect his anonymity I thought it important to pretend this was a pseudonym. GDPR and all that.]
Anyways, John is a proper Yorkshireman with a bluff and direct approach – he makes Sean Bean sound like an Oxford Professor of Fine Arts. He told me he'd been told to speak to me - he clearly had a gun to his head. He outlined the problem. He had a liquid formulation that should be nice and clear, but occasionally had a mysterious milky white precipitate – some kind of impurity. They'd been going around the houses trying to work out what the hell the problem was. The boss had told him he needed seven factors, but he apologised that he only had five.
I told him that was OK – it would still work and we'd get some additional information on variability.
We ran through the eight runs he needed to nail down the culprit.
That week I poked my head round the door to the laboratory to see how he was getting on.
He scowled and complained bitterly.
“This better work.”
I was bricking it but thought it prudent at the time to ooze unshakeable self-confidence.
“It’ll be fine,” I assured him. "Just let me know when you're done."
I then beat a hasty retreat.
The following Monday, he burst into the office. He was triumphant.
"OK. We've got it."
"That was quick. You've done the assays?"
"Don't need to."
"What?"
"Don't need to."
“Well, how are we going to do the analysis if you don’t have any data?”
“Don’t need to - it’s bloody obvious.”
With these words, John invented the Bloody-Obvious-Test.
Anyways, John proudly showed me a hand-written version of the Excel Table below. [Microsoft Excel was still a gleam in Bill Gates' eye - PCs did not become infected with the Excel virus until rather later – 1987.]
I’ve changed the run order and sorted on the factors, but it looks pretty much like the data that John shared that day.
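[Table: John's eight runs, showing the Hi/Lo settings of factors X1-X5 and the observed appearance (CLEAR, MILKY or WHITE) for each run.]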
John had checked for the precipitate and noted the colour at high (Hi) and low (Lo) levels of the five factors of interest (X1-X5). Note that when X1 is set to Lo, the bottom four rows, the solution is always CLEAR. And when X1 is Hi, the top four rows, the solution is either MILKY indicating a fine suspension of the unknown precipitate or WHITE indicating the precipitate is actually coming out of suspension to form a white solid at the bottom of the tube.
So what does this mean?
Well, providing we set X1 at the Lo level we don’t have a problem. If X1 is Hi then we get the precipitate forming. And if both X1 and X3 are Hi then we get so much of the impurity forming that the precipitate comes out of suspension and settles as that white deposit at the bottom of the tube.
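As a minimal sketch of the Bloody-Obvious-Test, the code below uses an illustrative stand-in for the table, not John's actual data. The 8-run layout is hypothetical (X4 and X5 are assumed to be product columns), and the outcomes are filled in from the rule just described: CLEAR when X1 is Lo, MILKY when X1 is Hi and X3 is Lo, WHITE when both are Hi. It then lists which appearances turn up at each level of each factor.

```python
from collections import defaultdict
from itertools import product

# Hypothetical 8-run layout for five factors (an assumption, not John's table):
# X1, X2, X3 form a full factorial; X4 = X1*X2 and X5 = X2*X3.
design = [
    {"X1": a, "X2": b, "X3": c, "X4": a * b, "X5": b * c}
    for a, b, c in product((1, -1), repeat=3)
]


def appearance(run):
    """Outcome rule as described in the text, not measured data."""
    if run["X1"] == -1:
        return "CLEAR"
    return "WHITE" if run["X3"] == 1 else "MILKY"


def outcomes_by_level(design, factor):
    """Collect the set of appearances seen at each level of one factor."""
    seen = defaultdict(set)
    for run in design:
        seen["Hi" if run[factor] == 1 else "Lo"].add(appearance(run))
    return dict(seen)


for factor in ("X1", "X2", "X3", "X4", "X5"):
    print(factor, outcomes_by_level(design, factor))
```

In this stand-in, only X1 separates the CLEAR runs from the rest, and only X3 separates MILKY from WHITE. The other three factors show every outcome at both levels, which is what makes the answer obvious by eye.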
Simples.
Of course, once we run the assays, and the numbers are in, we can analyze the amounts of the precipitate formed and use the full armoury of statistical tools available. Things have moved on since the 80s - packages such as JMP generate useful graphics to help interpret the effects. When we do that, we pick up on subtleties that John’s Bloody-Obvious-Test might have missed. But essentially the main conclusions are the same - both factor X1 and factor X3 are important and there is evidence of an interaction between the two - when both are Hi the result is a "chuffing" disaster, as they say in Yorkshire, with tons of our impurity forming.
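The assay numbers aren't given here, so as a rough sketch the code below scores the appearance on an assumed ordinal scale (CLEAR = 0, MILKY = 1, WHITE = 2) and computes the classic effect estimate for a two-level design: mean response at Hi minus mean response at Lo. Both the run layout and the scoring are assumptions for illustration, and a package such as JMP does far more than this.

```python
from itertools import product

# Hypothetical layout (assumption): X1, X2, X3 full factorial,
# X4 = X1*X2, X5 = X2*X3. Levels are coded -1 (Lo) / +1 (Hi).
runs = [
    {"X1": a, "X2": b, "X3": c, "X4": a * b, "X5": b * c}
    for a, b, c in product((1, -1), repeat=3)
]


def score(run):
    """Assumed ordinal stand-in for the assay: CLEAR=0, MILKY=1, WHITE=2."""
    if run["X1"] == -1:
        return 0
    return 2 if run["X3"] == 1 else 1


def effect(contrast, response):
    """Mean response where the contrast is +1 minus mean where it is -1."""
    hi = [y for s, y in zip(contrast, response) if s == 1]
    lo = [y for s, y in zip(contrast, response) if s == -1]
    return sum(hi) / len(hi) - sum(lo) / len(lo)


y = [score(run) for run in runs]
effects = {
    f: effect([run[f] for run in runs], y)
    for f in ("X1", "X2", "X3", "X4", "X5")
}
# The interaction contrast is the elementwise product of the two columns.
effects["X1*X3"] = effect([run["X1"] * run["X3"] for run in runs], y)

for name, value in effects.items():
    print(f"{name:>5}: {value:+.2f}")
```

With these stand-in scores the X1, X3 and X1*X3 estimates are non-zero and the other three are zero, which matches the conclusion John reached by eye.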
Bonus point: it won't have escaped the statisticians in the room that the interaction is aliased with what would have been the seventh factor X7. John didn't have a seventh factor - making it a dummy variable - so this gives a reasonable estimate of that interaction.
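The aliasing can be checked directly. In the sketch below the column assignment is hypothetical, chosen so that the unused seventh column is the product of the X1 and X3 columns, as the text describes. The generators of the original design are not given, so treat the labels as assumptions.

```python
from itertools import product

# Hypothetical column assignment (assumption): three base columns plus products.
# X7 is generated as X1*X3, so it stays free when only X1..X5 are in use.
columns = {"X1": [], "X2": [], "X3": [], "X4": [], "X5": [], "X6": [], "X7": []}
for a, b, c in product((-1, 1), repeat=3):
    levels = {"X1": a, "X2": b, "X3": c, "X4": a * b,
              "X5": b * c, "X6": a * b * c, "X7": a * c}
    for name, level in levels.items():
        columns[name].append(level)


def interaction(name_1, name_2):
    """Contrast column for a two-factor interaction: the elementwise product."""
    return [u * v for u, v in zip(columns[name_1], columns[name_2])]


def aliases_of(contrast):
    """Names of the design columns identical to the given contrast column."""
    return [name for name, col in columns.items() if col == contrast]


print(aliases_of(interaction("X1", "X3")))
print(interaction("X4", "X5") == interaction("X1", "X3"))
```

In this layout the X1*X3 contrast is exactly the X7 column, so with no real factor assigned to X7 that column estimates the interaction. Here it also coincides with the X4*X5 product, which is one reason the estimate is "reasonable" rather than clean.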
"So, what were the factors?" you ask.
X1 was ethanol and X3 was citric acid, but I can’t remember the details. Give me a break - it was forty years ago. But it made an impression. It reminded me that much of the power of Design of Experiments comes not so much from the fancy analysis tools we can bring to bear, but simply from the balanced layout of the design. The experimental runs, or design points, are deliberately and beautifully balanced. There are four runs at the Lo and four runs at the Hi level of each of the factors. And each of those sets of four runs is evenly spread across the levels of the remaining factors. The results are Bloody Obvious but only because the design is beautifully balanced.
John's study is also a good reminder that in early development even preliminary data can be useful. Even before the impurity was identified and the assays performed, we’d solved the problem and identified the two critical variables. And we’d established the formulation was robust to three other potential variables. We were able to minimize the impurity and move forward with further development of the drug.
And that, after all, is the goal. I might not get to run my fancy analyses and show people how clever I am. But we have our solution, a go/no go decision, and a working formulation. And that’s all we need. Job done.
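That balance is easy to verify. As a minimal sketch, assuming a hypothetical 8-run layout for five factors (the column assignment of the original table isn't given), the code below counts the Hi and Lo runs for each factor and confirms that every pair of factors sees each of the four level combinations equally often.

```python
from collections import Counter
from itertools import combinations, product

# Hypothetical layout (assumption): X1, X2, X3 full factorial,
# X4 = X1*X2, X5 = X2*X3. Levels are coded -1 (Lo) / +1 (Hi).
factors = ("X1", "X2", "X3", "X4", "X5")
runs = [
    dict(zip(factors, (a, b, c, a * b, b * c)))
    for a, b, c in product((-1, 1), repeat=3)
]


def level_counts(factor):
    """How many runs sit at Lo (-1) and at Hi (+1) for one factor."""
    return Counter(run[factor] for run in runs)


def pair_counts(f1, f2):
    """How many runs fall in each (level of f1, level of f2) combination."""
    return Counter((run[f1], run[f2]) for run in runs)


for f in factors:
    print(f, dict(level_counts(f)))

# Balanced means each of the four Lo/Hi combinations appears exactly twice.
balanced_pairs = all(
    sorted(pair_counts(f1, f2).values()) == [2, 2, 2, 2]
    for f1, f2 in combinations(factors, 2)
)
print("every pair of factors balanced:", balanced_pairs)
```

Because every factor's Hi and Lo runs are spread evenly over the other factors, comparing Hi against Lo for one factor averages out the rest. That is why a pattern can be read straight off the table.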
Epilogue
Measure that which can be measured. Make measurable that which cannot. Galileo Galilei.
We got to use this same trick in other projects. Except now we recorded the number of hours (or days) before the precipitate appeared. This meant that even for new discovery projects where the precipitate was unknown, we now had numeric data to run these more powerful methods.
If you’re not part of the solution, then you’re the precipitate.
Scientific Strategist | Storyteller | Our Future is Multi-Omic
1 year ago: Really great story, Dennis!
Design of Experiments (DoE) Expert @L'Oréal | Empowering R&I Formulation labs with Data Science & Smart Experimentation | Green Belt LSS
1 year ago: Great story Dennis, and nice illustration of the beauty behind DoE. This example reminds me of one of the first DoEs we ran with other students for a project during our years at my chemical engineering school: a fractional factorial design with 8 runs and 5 factors, to investigate the influence of these factors on the taste, appearance and texture of marshmallows. We were also able to investigate two two-factor interactions, and it was quite amazing to create the DoE, run the experiments and analyze them fully autonomously; it really showed how powerful and flexible this approach was for any topic: https://www.dhirubhai.net/posts/victorguiller_my-first-doe-the-marshmallow-experiment-activity-7081881854819069952-NGlf
DOE & Data Analytics Evangelist | Nervously excited about Digital Future of Science, Engineering, R&D, Manufacturing | Medium-pace runner and road cyclist
1 year ago: I like the sound of this John guy. Reminds me of a very similar project I was involved in. Also about understanding formulation factors and precipitates. I was obsessed with trying to find a good model to fit the data but my collaborators were very happy with a ternary plot with dots for each formulation coloured red (precipitated, bad) or green (clear, good).
Scientifique II en Sciences Chimiques à Paraza Pharma Inc.
1 year ago: This entry is probably the one I was looking forward to the most. A real-life example to showcase how powerful DoE could be if used properly. 5 variables - 8 entries is something I would have never dreamed of... I guess it is time for me to get my hands dirty and try DoE for real. (And fun fact: JMP are my father's initials... :) )