An artificial intelligence running experiments on a synthesis robot has discovered what could be the most general conditions for cross-coupling reactions from a pool of thousands of possible combinations. The AI-devised conditions more than doubled the average yield in 20 difficult cross-couplings compared with baseline conditions.
Reaction conditions that work for compounds of different shapes, sizes and functional groups are “essential for automating the synthesis of small molecules, which in turn is essential for democratizing molecular innovation”, says study leader Martin Burke from the University of Illinois at Urbana-Champaign, USA. In 2009, Burke’s team created a version of the Suzuki-Miyaura cross-coupling that combines chloroarenes and N-methyliminodiacetic acid (MIDA) boronates.
The team then turned to AI to adapt the reaction to work for the widest possible range of substrates. They asked it to scour the literature and find the most general reaction conditions among an almost infinite number of catalyst, ligand, temperature and base combinations. Yet, rather than finding new general conditions, the algorithm would only “discover” conditions that are already the most popular. One reason, Burke explains, is that “the literature is remarkably deficient in negative data. Nobody publishes what does not work. But algorithms need both good and bad examples to learn.”
Now the teams of Bartosz Grzybowski at the Polish Academy of Sciences and Alán Aspuru-Guzik at the University of Toronto, Canada, along with Burke and his colleagues, have combined AI and robotic experimentation to solve the generality problem without relying on biased data from the literature. First, clustering analysis selected 22 coupling partners that would represent the chemical space of 5400 commercial aryl halides and 54 MIDA boronates. Then the team let the AI loose on the problem. “We didn’t make any decisions once we had it in place,” Burke says.
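The idea of picking a few representative substrates by clustering can be sketched as follows. This is only an illustration: the article does not say which clustering method or molecular descriptors the team used, so plain k-means is swapped in here, and the two-dimensional descriptor values and the function name `select_representatives` are invented for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_representatives(points, k, n_iter=20):
    """Cluster points with k-means; return the index of the real point nearest each centroid."""
    # Deterministic farthest-point initialisation (an assumption, chosen for reproducibility).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    # A centroid is not a real molecule, so report the closest actual point instead.
    return [
        min(range(len(points)), key=lambda i: sq_dist(points[i], c))
        for c in centroids
    ]


# Invented 2D "descriptors" for nine hypothetical substrates in three loose groups.
substrates = [
    (0.0, 0.1), (0.1, 0.0), (0.0, 0.0),
    (5.0, 5.1), (5.1, 5.0), (5.0, 5.0),
    (10.0, 0.0), (10.1, 0.1), (10.0, 0.1),
]
representatives = select_representatives(substrates, k=3)
```

Each returned index points at one real substrate per cluster, which is the sense in which a small set can stand in for a much larger chemical space.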
Over several months, the AI went through five rounds of designing experiments, running them on the synthesis robot and evaluating the results. “The algorithm wasn’t set up necessarily to try to find the best conditions, it was trying to minimize uncertainty,” says Burke. To do so, the AI ran a surprisingly high number of low-yielding reactions, around 100.
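A closed loop that chases uncertainty rather than high yield might look roughly like the sketch below. It is not the team's algorithm: the article does not say how uncertainty was quantified, so the standard error of the mean yield is used as a stand-in, and the condition labels, the yield numbers and the `simulated_robot` function are all hypothetical.

```python
import random
import statistics


def uncertainty(yields):
    """Standard error of the mean; conditions with fewer than two runs count as maximally uncertain."""
    if len(yields) < 2:
        return float("inf")
    return statistics.stdev(yields) / len(yields) ** 0.5


def pick_next(observations):
    """Choose the condition whose average yield is currently least well known."""
    return max(observations, key=lambda cond: uncertainty(observations[cond]))


def run_closed_loop(conditions, run_experiment, n_experiments):
    """Repeatedly design (pick), run and record experiments."""
    observations = {cond: [] for cond in conditions}
    for _ in range(n_experiments):
        cond = pick_next(observations)
        observations[cond].append(run_experiment(cond))
    return observations


# Hypothetical stand-in for the synthesis robot: noisy yields around invented means.
rng = random.Random(0)
TRUE_MEAN_YIELD = {"A": 20.0, "B": 45.0, "C": 5.0}


def simulated_robot(cond):
    noisy = rng.gauss(TRUE_MEAN_YIELD[cond], 10.0)
    return max(0.0, min(100.0, noisy))  # clamp to a valid percentage


results = run_closed_loop(list(TRUE_MEAN_YIELD), simulated_robot, n_experiments=30)
```

Because the selection rule ignores how good a condition looks, poor conditions keep being sampled until their average is pinned down, which is consistent with the many low-yielding runs Burke describes.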
The conditions proposed by the algorithm do not seem very different from those in Burke’s 2009 article: 100°C instead of 60°C, sodium carbonate instead of potassium carbonate and a slightly different phosphine ligand for the palladium catalyst. However, in 20 cross-couplings of heteroarenes, the average yield more than doubled, from 21% to 46%. The speed and precision of the AI “was very humbling and inspiring, but also shocking and disturbing at the same time,” says Burke.
“It’s ideal for parallel medicinal chemistry, for example in high-throughput labs where they use automation to create compound libraries,” says Cayetana Zárate, who works in process development at Janssen Pharmaceuticals. However, she points out that the chemical space examined by the team lacked molecules with pharmaceutically relevant substituents such as halogens and aliphatic amines.
If similar AI could optimize conditions for a particular reaction, it could dramatically speed up the development of labor-intensive processes, Zárate says. After all, the Suzuki-Miyaura cross-coupling is “probably one of the reactions where we invest the most time in screening.”
Burke says he now wants their AI setup to search for molecules that have specific functions, such as materials for solar cells. “In chemistry, we tend to focus on structure. To me, what’s so exciting about this closed-loop engine, [is] it’s an opportunity for us to pivot to a function-driven approach.”