Intellegens Blog – Stephen Warde, June 2023
Discussing applied machine learning for chemicals, materials and manufacturing – see all blog posts.
It was interesting at our recent Customer Advisory Board meeting to hear participants talk about their use of machine learning (ML) alongside or as a replacement for the statistical software traditionally used in Design of Experiments (DOE) campaigns. In many of these organisations, this DOE software continues to be very widely used and is valued for its familiarity, data visualisation tools, and ability to easily plan experiments. But these research teams are recognising the limitations of traditional DOE and benefiting from applying ML in many experimental scenarios. The mantra at the meeting was “use the right tool for the right job”. With this in mind, it is worth recording some of the reasons why they use ML.
ML is adaptive and responsive. One meeting attendee explained that traditional DOE was unsuited to their formulation development application, where experiments had to run over long time periods during which results were collected periodically from each test. Instead of working systematically through a grid of experiments proposed by DOE, which could take many years, their preferred approach was just to try something, learn from the early test results, and use their scientific knowledge to propose the next experiment. ML has helped them, because it can continually learn from the latest data and use that information to propose the next one or two experiments most likely to improve overall understanding of the system.
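This kind of adaptive loop can be sketched with a standard Gaussian-process surrogate. The example below is purely illustrative (the toy response function, the candidate grid, and the "pick the most uncertain candidate" rule are all assumptions, not the specific algorithm any DOE product uses): fit a model on the experiments run so far, then propose the next experiment where the model is least certain.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

def run_experiment(x):
    # Stand-in for a real (slow, expensive) lab measurement.
    return np.sin(3 * x) + 0.1 * rng.normal()

# Start from a handful of completed experiments.
X = rng.uniform(0, 2, size=(4, 1))
y = np.array([run_experiment(x[0]) for x in X])

# Grid of experiments we could run next (an assumption for this sketch).
candidates = np.linspace(0, 2, 200).reshape(-1, 1)

for round_ in range(5):
    model = GaussianProcessRegressor().fit(X, y)
    # Propose the candidate the model is least certain about.
    _, std = model.predict(candidates, return_std=True)
    x_next = candidates[np.argmax(std)]
    # Run it, then fold the result back in before the next round.
    y_next = run_experiment(x_next[0])
    X = np.vstack([X, [x_next]])
    y = np.append(y, y_next)

print(len(X))  # 4 initial + 5 adaptively chosen experiments = 9
```

Each pass through the loop retrains on everything measured so far, so the model "continually learns from the latest data" rather than committing to a fixed grid up front.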
ML is intrinsically non-linear. Typical statistical DOE software assumes that the response of experimental outputs to inputs is linear, or at best quadratic. ML makes no such assumption. Its models learn from the data provided even when that data contains complex, non-linear relationships. So ML can model difficult multi-component systems where cross-correlations would not be accounted for by other DOE approaches.
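The difference is easy to demonstrate on synthetic data. In this sketch (the response function and choice of models are illustrative assumptions), the output depends on a pure cross-term and a curvature term, so a linear fit explains almost none of the variance while a non-linear learner recovers the structure from the data alone:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(500, 3))
# Response with a cross-correlation (x0 * x1) and a non-linearity (x2 squared):
y = X[:, 0] * X[:, 1] + X[:, 2] ** 2

linear = LinearRegression().fit(X, y)
forest = RandomForestRegressor(random_state=0).fit(X, y)

# In-sample R^2: the linear model scores near 0 because neither term
# has any linear relationship with the individual inputs; the forest
# scores near 1 because it learns the interactions directly.
print(round(linear.score(X, y), 2))
print(round(forest.score(X, y), 2))
```

The point is not that random forests are the right model for every system, but that an ML model makes no linearity assumption: whatever relationships are in the data are what it learns.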
ML can be target-driven. While traditional DOE aims to cover the design space defined by the user as efficiently as possible, ML can incorporate the idea of target outputs from the system and identify which missing data would best enable it to understand the design space around these targets. Where you have an idea of desired outputs, ML can thus design campaigns that achieve their objectives with dramatically fewer experiments – typically, we expect 50-80% reductions.
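Target-driven selection can be sketched with a simple acquisition score that prefers candidates predicted to land near a desired output, with a tie-break toward regions the model is unsure about. Everything here is an assumption for illustration (the toy response, the target value, and the 0.5 uncertainty weighting are not from any particular product):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(2)

# Completed experiments: input -> measured output.
X = rng.uniform(0, 1, size=(8, 1))
y = 4 * X[:, 0] * (1 - X[:, 0])  # toy response, peaks at x = 0.5

target = 0.9  # the desired output we want the campaign to hit
candidates = np.linspace(0, 1, 101).reshape(-1, 1)

model = GaussianProcessRegressor().fit(X, y)
mean, std = model.predict(candidates, return_std=True)

# Score each candidate: close to the target is good, and among
# similarly close candidates, prefer where the model is uncertain.
score = -np.abs(mean - target) + 0.5 * std
x_next = candidates[np.argmax(score)]
print(x_next)
```

Because effort concentrates on the design space around the target rather than covering the whole space uniformly, far fewer experiments are needed to reach the objective.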
ML can handle as many inputs as you need. Standard DOE methods usually require you to vary only a limited number of inputs at any one time in your experimental design. With ML, you don’t have to identify which inputs are most important (thus potentially building bias into your design). You can ask the ML to explore all of the inputs simultaneously and it will find those that are most significant.
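Letting the model rank all inputs at once can be sketched with permutation importance over a synthetic problem where only two of ten inputs actually matter (the response function and model choice are illustrative assumptions):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(3)
# Ten candidate inputs, but the response only depends on two of them.
X = rng.uniform(-1, 1, size=(400, 10))
y = 2 * X[:, 0] + 2 * X[:, 3] ** 2 + 0.05 * rng.normal(size=400)

model = RandomForestRegressor(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)

# Rank all ten inputs by how much shuffling each one hurts the model.
ranked = np.argsort(result.importances_mean)[::-1]
print(ranked[:2])  # inputs 0 and 3 should surface as the significant ones
```

No one had to decide in advance which inputs to vary: the model is given all of them and the significant ones emerge from the data, avoiding the bias that comes from pre-selecting "important" variables.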
Traditional DOE software remains a reliable workhorse in many labs, and it is still a great solution in some experimental scenarios. But organisations that are taking the often challenging step of questioning long-established processes and trying out ML for adaptive DOE are reaping the benefits. One attendee at the meeting described the main benefit as the ability to focus much more of their chemists’ time on doing creative, and better-informed, chemistry rather than on planning and executing experiments. It’s worth considering what is “the right tool for the right job”. Often, in DOE campaigns, ML will be the answer.