Machine learning is speeding up catalyst discovery and making an impact on the world.
Imagine a world where scientific breakthroughs happen at lightning speed. The discovery and development of catalysts have been pivotal in advancing society and enabling more efficient and sustainable chemical processes. Traditionally, catalyst discovery has relied heavily on trial-and-error experimentation, making it slow and costly. Machine learning (ML) is reshaping the way scientists discover catalysts. By leveraging large datasets and advanced algorithms, ML offers a more systematic approach to catalyst discovery, one that supports innovation and benefits society.
Catalysts are substances that accelerate a chemical reaction without being consumed in the process (1). They are essential in the energy sector, the pharmaceutical industry, and environmental technology. The demand for efficient and readily available catalysts continues to grow, particularly in the context of addressing climate change and advancing green chemistry. However, finding catalysts that are both efficient and readily available is difficult. Traditional catalyst discovery relies on lengthy experimental procedures and incurs very high costs for material synthesis and testing (2). This approach is also constrained by the limits of human intuition, as chemists can explore only the narrow portion of chemical space that they already understand. The complexity of catalytic reactions, which are influenced by factors such as material composition, structure, and reaction conditions, makes finding an optimal catalyst a formidable challenge (2). As a result, the discovery process can take years or even decades.
Diagram of how a catalyst is involved in a reaction
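To make "accelerate a chemical reaction" concrete, the short sketch below uses the Arrhenius equation, k = A·exp(-Ea/(RT)), to estimate how much faster a reaction could run when a catalyst lowers its activation energy. The activation energies are hypothetical values chosen only for illustration, and the calculation assumes the pre-exponential factor A is the same with and without the catalyst, which is a simplification.

```python
import math

R = 8.314  # molar gas constant in J/(mol*K)

def rate_enhancement(ea_uncatalyzed, ea_catalyzed, temperature):
    """Ratio k_catalyzed / k_uncatalyzed from the Arrhenius equation.

    Assumes both pathways share the same pre-exponential factor,
    so it cancels out of the ratio.
    """
    return math.exp((ea_uncatalyzed - ea_catalyzed) / (R * temperature))

# Hypothetical activation energies in J/mol (illustrative, not measured data).
speedup = rate_enhancement(ea_uncatalyzed=75_000, ea_catalyzed=55_000, temperature=298.15)
print(f"Estimated rate enhancement: about {speedup:.0f}x")
```

Because the activation energy sits inside an exponential, even a modest reduction can change the rate by orders of magnitude, which is part of why small differences between candidate catalysts matter so much.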
Machine learning, a subset of artificial intelligence (AI), uses algorithms to identify patterns in data and make predictions based on those patterns (3). In catalyst discovery, ML can analyze extensive datasets of chemical compositions, reaction conditions, and performance metrics to predict which candidate materials are likely to perform well. This data-driven approach can significantly reduce the need for exhaustive manual trial-and-error experiments, enabling faster and more economical discovery. One of the key advantages of ML in catalyst discovery is its ability to explore high-dimensional chemical spaces that are beyond human capacity to analyze (4). ML models can also identify hidden and complex correlations between material properties and catalytic performance, providing insights that guide experimentation.
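The data-driven screening described above can be sketched in a few lines. This is a minimal illustration, not a method taken from the cited studies: it uses a simple nearest-neighbour regression, and every descriptor value, activity value, and candidate name is invented for the example.

```python
import math

# Hypothetical training set: (descriptor vector, measured activity).
# The two descriptors could stand for, say, an adsorption energy and a
# normalized surface area; all numbers here are made up for illustration.
training_data = [
    ((-0.10, 0.80), 0.90),
    ((-0.15, 0.70), 0.85),
    ((-0.60, 0.40), 0.30),
    ((-0.55, 0.50), 0.35),
    ((0.40, 0.60), 0.20),
    ((0.35, 0.30), 0.15),
]

def predict_activity(descriptors, data, k=2):
    """Predict activity as the mean activity of the k nearest known catalysts."""
    nearest = sorted(data, key=lambda row: math.dist(row[0], descriptors))[:k]
    return sum(activity for _, activity in nearest) / k

# Hypothetical untested candidates, described by the same two descriptors.
candidates = {
    "candidate_A": (-0.12, 0.75),
    "candidate_B": (-0.58, 0.45),
    "candidate_C": (0.38, 0.50),
}

# Rank candidates so the most promising ones can be tested in the lab first.
ranked = sorted(
    candidates,
    key=lambda name: predict_activity(candidates[name], training_data),
    reverse=True,
)
print(ranked)
```

In practice the model would be trained on far larger datasets and validated against held-out experiments, but the workflow is the same: learn from known catalysts, predict for unknown ones, and send only the top-ranked candidates to the laboratory.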
The success of ML in catalyst discovery hinges on the size and availability of high-quality datasets. These datasets are built from experimental results, computational simulations, and the scientific literature. However, assembling them has become a key obstacle for scientists: data quality and availability are critical issues (3). Currently, the best datasets are available only to top scientists, and most datasets are incomplete, which leads to inaccurate predictions (5). To combat this issue, scientists are turning to methods that generate data at scale, such as high-throughput experimentation (HTE) and density functional theory (DFT) calculations (5). These experiments and simulations play a crucial role in generating data for ML models (5). HTE enables the rapid synthesis of catalysts, while DFT provides theoretical insights into reactions and material properties. Integrating these data sources makes ML models more powerful and improves the accuracy of their predictions.
Several ML techniques are used in catalyst discovery. In supervised learning, models are trained on labeled datasets containing known catalysts and their performance metrics. These models can then predict the efficacy of new catalyst candidates based on learned patterns (4). Unsupervised learning identifies hidden patterns and structures in unlabeled data, which enables the discovery of novel catalyst classes and reaction pathways (4). In reinforcement learning, algorithms learn strategies for optimizing reaction conditions and catalyst design through feedback on the choices they make. Finally, generative models such as generative adversarial networks (GANs) and variational autoencoders (VAEs) can design entirely new catalyst structures by learning from existing data (4).
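As an illustration of the unsupervised case, the sketch below groups unlabeled catalyst descriptors into clusters with k-means. The source does not name a specific algorithm, so k-means is an assumption picked for its simplicity, and the descriptor values and starting centroids are invented for the example.

```python
import math

def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then update centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical unlabeled descriptors for six materials (no performance labels).
descriptors = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.15),
    (0.90, 0.80), (0.80, 0.90), (0.85, 0.85),
]

# Fixed starting centroids keep the example deterministic.
centroids, clusters = k_means(descriptors, centroids=[descriptors[0], descriptors[3]])
for label, cluster in enumerate(clusters):
    print(f"Cluster {label}: {cluster}")
```

No performance labels are used here: the algorithm only sees how similar the materials are to one another. Groups that emerge this way can hint at families of related catalysts that are worth investigating together.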
Types of machine learning
Many notable studies have already demonstrated the potential of ML in catalyst discovery. For instance, researchers at Carnegie Mellon University have used ML models to identify efficient catalysts for the hydrogen evolution reaction, a key process in splitting water for hydrogen production. ML accelerated the discovery of non-precious metal catalysts, offering cost-effective alternatives to platinum-based materials (6). ML algorithms have also been employed to discover catalysts for carbon dioxide reduction, a crucial step in carbon capture and utilization technologies. By predicting optimal material compositions, ML has facilitated the development of catalysts that convert CO2 into valuable chemicals and fuels (7).
ML is not only effective at discovering catalysts; ML and AI more broadly also serve as a catalyst for innovation and productivity in society. The productivity gains from AI are expected to raise global GDP by 7% (or almost $7 trillion) over a decade and by up to 15% in the long run, according to Goldman Sachs (2023). AI could contribute an estimated $2.6 to $4.4 trillion annually, according to McKinsey’s analysis of 63 specific use cases. This estimate would approximately double if the integration of generative AI (GAI) into existing software for tasks beyond those use cases were also considered.
Machine learning is transforming the landscape of catalyst discovery, offering a powerful tool to accelerate the development of efficient, sustainable and innovative materials. By harnessing the power of data and advanced algorithms, researchers can explore vast chemical spaces, uncover hidden insights, and design catalysts with unprecedented performance. As ML continues to evolve, it holds the potential to drive significant breakthroughs in chemistry, energy, and environmental technology, paving the way for a more sustainable and technologically advanced future.
References
- Rates against the machine. (2023). Nature Catalysis, 6(2), 103–104. Retrieved from https://doi.org/10.1038/s41929-023-00933-4
- Department of Energy. (2019). DOE Explains…Catalysts. Energy.gov; U.S. Department of Energy. Retrieved from https://www.energy.gov/science/doe-explainscatalysts
- Mok, D. H., Li, H., Zhang, G., Lee, C., Jiang, K., & Back, S. (2023). Data-driven discovery of electrocatalysts for CO2 reduction using active motifs-based machine learning. Nature Communications, 14(1). Retrieved from https://doi.org/10.1038/s41467-023-43118-0
- Guo, W., Shafizadeh, A., Shahbeik, H., Rafiee, S., Motamedi, S., Ghafarian Nia, S. A., Nadian, M. H., Li, F., Pan, J., Tabatabaei, M., & Aghbashlo, M. (2024). Machine learning for predicting catalytic ammonia decomposition: An approach for catalyst design and performance prediction. Journal of Energy Storage, 89, 111688. Retrieved from https://doi.org/10.1016/j.est.2024.111688
- IBM. (2021, September 22). Machine learning. Ibm.com. Retrieved from https://www.ibm.com/think/topics/machine-learning
- Margraf, J. T., Jung, H.-W., Scheurer, C., & Reuter, K. (2023). Exploring catalytic reaction networks with machine learning. Nature Catalysis, 6(2), 112–121. Retrieved from https://doi.org/10.1038/s41929-022-00896-y
- Mennen, S. M., Alhambra, C., Allen, C. L., Barberis, M., Berritt, S., Brandt, T. A., Campbell, A. D., Castañón, J., Cherney, A. H., Christensen, M., Damon, D. B., Eugenio de Diego, J., García-Cerrada, S., García-Losada, P., Haro, R., Janey, J., Leitch, D. C., Li, L., Liu, F., & Lobben, P. C. (2019). The Evolution of High-Throughput Experimentation in Pharmaceutical Development and Perspectives on the Future. Organic Process Research & Development, 23(6), 1213–1242. Retrieved from https://doi.org/10.1021/acs.oprd.9b00140
- Smith, L. (2024). Machine learning framework finds catalysts for lower-cost hydrogen production. Cmu.edu. Retrieved from https://www.cheme.engineering.cmu.edu/news/2024/08/22-kitchin-ml-oer-catalyst.html
Images:
- University of Texas. (2022). Catalyst. Utexas.edu. Retrieved from https://ch302.cm.utexas.edu/kinetics/catalysts/catalysts-all.php
- Concannon, M. (2024). 4 Key Ways AI is Boosting Mobile Cybersecurity in 2024. Ntiva.com. https://doi.org/1071736/1730224971878