Imagine a plastic bag that can carry home your groceries, then quickly degrade, without harming the environment. Or a super-strong, lightweight plastic for airplanes, rockets, and satellites that can replace traditional structural metals in aerospace technologies.
Machine learning and artificial intelligence have accelerated the ability to design materials with specific properties like these. But while scientists have had success designing new metallic alloys, polymers—like the plastic used for bags—have been much more difficult to design.
Researchers at the Pritzker School of Molecular Engineering (PME) at the University of Chicago have found a way forward in designing polymers by combining modeling and machine learning.
By computationally constructing nearly 2,000 hypothetical polymers, they were able to create a large enough database to train a neural network—a type of machine learning—to understand which polymer properties arise from different molecular sequences.
“We show that the problem is tractable,” said Juan de Pablo, Liew Family Professor of Molecular Engineering, who led the research. “Now that we have established this foundation and have shown that it can be done, we can really move forward in using this framework to design polymers with specific properties.”
The results were published Oct. 21 in Science Advances.
Designing polymers difficult due to long strings of atoms
Polymers exhibit amorphous, disordered structures that cannot easily be defined using the techniques that scientists have developed to study metals or other crystalline materials.
Polymer molecules consist of large arrays of atoms arranged in a very long string, sometimes comprising millions of “monomers.” Every polymer molecule is different. Not only is the length different, but the sequence in which the atoms are arranged can vary considerably.
Length and sequence have a great influence on the properties of a polymeric molecule, and the extraordinarily large number of possible length and sequence combinations is a central challenge in designing molecules with specific properties. Trial-and-error approaches are of limited use, and generating the experimental data needed to inform a rational design strategy would be very demanding.
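To give a rough sense of that scale, here is a small illustrative calculation. It assumes, purely for illustration, a linear chain built from just two monomer types, which the article does not specify. The function name `count_sequences` is hypothetical, and the count is an upper bound that treats a sequence and its mirror image as distinct.

```python
def count_sequences(length: int, monomer_types: int = 2) -> int:
    """Count linear sequences of `length` monomers drawn from `monomer_types` kinds.

    Assumption: every position can hold any monomer type independently,
    and reversed sequences are counted separately (an upper bound).
    """
    return monomer_types ** length


if __name__ == "__main__":
    # Even short chains with only two monomer types outgrow any
    # realistic trial-and-error campaign very quickly.
    for n in (10, 20, 50, 100):
        print(f"length {n:3d}: {count_sequences(n):.3e} possible sequences")
```

Even under these simplified assumptions, the number of candidates doubles with every added monomer, which is why exhaustive testing is not an option.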
That is where machine learning comes in. The researchers set out to answer the question: “Can machine learning algorithms ‘learn’ how to predict the properties of polymers according to their sequence, and, if so, how large a dataset would be needed to train the underlying algorithms?”
Creating a database to learn polymer sequences
To create the database, the researchers used nearly 2,000 computationally constructed polymers, all having different sequences, and ran molecular simulations to predict their properties and behavior. When they first used a neural network to figure out which properties were based on which molecular sequences, they were unsure if they would find a reasonable answer.
“We didn’t know how many different polymer sequences were necessary to learn the behavior of materials,” de Pablo said. “The answer could have been millions.”
Luckily, the network needed fewer than a few hundred different sequences to learn the properties and predict the behavior of completely new molecular sequences. That meant that experimentalists could now follow a similar strategy, building a database of experimental data to train a machine-learning network to predict the properties of polymers.
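The question of how many sequences a model needs is usually explored with a learning curve: train on progressively larger subsets and measure the error on sequences the model has never seen. The sketch below is a toy version of that idea, not the researchers' method. Everything in it is an assumption made for illustration: the chains have 12 monomers of two types, the "property" is a synthetic weighted sum with noise rather than a simulation result, and a simple linear model fitted by gradient descent stands in for the neural network. The numbers it prints say nothing about real polymers.

```python
import random

SEQ_LEN = 12  # monomers per toy chain (assumption, not from the study)


def random_sequence(rng):
    # Encode two hypothetical monomer types as -1.0 and +1.0.
    return [rng.choice((-1.0, 1.0)) for _ in range(SEQ_LEN)]


def make_dataset(rng, hidden_weights, n):
    # Synthetic "property": a hidden weighted sum over positions plus noise.
    data = []
    for _ in range(n):
        seq = random_sequence(rng)
        prop = sum(w * s for w, s in zip(hidden_weights, seq))
        data.append((seq, prop + rng.gauss(0.0, 0.1)))
    return data


def predict(weights, bias, seq):
    return bias + sum(w * s for w, s in zip(weights, seq))


def fit(train, steps=300, lr=0.1):
    # Full-batch gradient descent on mean squared error.
    # A linear model stands in for the neural network here.
    weights, bias = [0.0] * SEQ_LEN, 0.0
    n = len(train)
    for _ in range(steps):
        grad_w, grad_b = [0.0] * SEQ_LEN, 0.0
        for seq, prop in train:
            err = predict(weights, bias, seq) - prop
            grad_b += err
            for i, s in enumerate(seq):
                grad_w[i] += err * s
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


def rmse(model, data):
    weights, bias = model
    total = sum((predict(weights, bias, s) - p) ** 2 for s, p in data)
    return (total / len(data)) ** 0.5


def learning_curve(sizes=(10, 25, 50, 100, 200), seed=0):
    # Train on growing subsets; score each model on the same held-out set.
    rng = random.Random(seed)
    hidden = [rng.gauss(0.0, 1.0) for _ in range(SEQ_LEN)]
    pool = make_dataset(rng, hidden, max(sizes))
    held_out = make_dataset(rng, hidden, 200)
    return [(n, rmse(fit(pool[:n]), held_out)) for n in sizes]


if __name__ == "__main__":
    for n, err in learning_curve():
        print(f"{n:4d} training sequences -> held-out RMSE {err:.3f}")
```

If the held-out error levels off after a few hundred training sequences, adding more data buys little. That plateau is the kind of signal that tells researchers a dataset is large enough.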