CEJ: Using an iterative machine learning approach to screen high-performance catalysts for H2O2 production

Mondo Science Updated on 2024-02-08

Background:

As an important raw material, hydrogen peroxide (H2O2) is widely used in textiles, food processing, wastewater treatment, and papermaking. Although H2O2 is a green oxidant, its traditional production process is far from green. Industrial H2O2 production currently relies heavily on the anthraquinone oxidation process, which is energy-intensive and environmentally polluting. Electrochemical H2O2 production is a highly desirable alternative to the anthraquinone cycle. With the help of a catalyst, the two-electron oxygen reduction reaction (ORR) converts clean reactants (O2, H+) into H2O2 as the main product.

However, the design and development of green, low-cost, and efficient electrocatalysts remains a challenge. To address this, Wei Zengxi and Zhao Shuangliang of Guangxi University proposed an iterative machine learning (IML) approach that greatly reduces the required training set size, as shown in Figure 1. By introducing features that encode spatial coordinate information, the candidates with optimal catalytic activity can be screened quickly from hundreds of single-atom catalysts (SACs). RhN2O2(A) was found to be an ideal catalyst for H2O2 production, with an ultra-low overpotential of 0.013 V. This work opens the way to accelerating the data-driven design and discovery of high-performance catalysts.

Figure 1. Scheme of the IML method

Computational and machine learning methods

DFT Calculation Method:

All calculations were performed with spin-polarized density functional theory (DFT) using the Vienna Ab initio Simulation Package (VASP). The electronic structure of the catalysts was calculated at the generalized gradient approximation (GGA) level with the Perdew-Burke-Ernzerhof (PBE) functional, and electron-ion interactions were described by the projector augmented wave (PAW) method. The plane-wave cutoff was set to 500 eV. The electronic convergence criterion was 10^-6 eV, and all atomic positions were relaxed until the forces were below 0.02 eV/Å. The Brillouin zone was sampled with a 2 × 2 × 1 Monkhorst-Pack k-point grid. To avoid interactions between adjacent periodic images, a vacuum layer of 15 Å was used, and the DFT-D3 scheme was applied to correct for van der Waals interactions.
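
As a rough illustration, the settings quoted above could be mapped onto VASP input tags as in the sketch below. The choice IVDW = 11 (zero-damping D3) is an assumption, since the text only says "DFT-D3". The helper name render_incar is hypothetical, and relaxation tags, the structure file, and the PAW potentials are left out.

```python
# Sketch only: maps the settings quoted in the text onto VASP input tags.
# IVDW = 11 (zero-damping D3) is an ASSUMPTION; the text only says "DFT-D3".
incar_tags = {
    "ISPIN": 2,        # spin-polarized calculation
    "GGA": "PE",       # PBE exchange-correlation functional
    "ENCUT": 500,      # plane-wave cutoff, eV
    "EDIFF": 1e-6,     # electronic convergence criterion, eV
    "EDIFFG": -0.02,   # negative value: stop when forces < 0.02 eV/Angstrom
    "IVDW": 11,        # DFT-D3 dispersion correction (variant assumed)
}


def render_incar(tags):
    """Format a tag dictionary as INCAR text, one 'TAG = value' per line."""
    return "\n".join(f"{key} = {value}" for key, value in tags.items())


# Automatic 2 x 2 x 1 Monkhorst-Pack mesh with no shift.
kpoints_text = "\n".join([
    "Automatic mesh",
    "0",
    "Monkhorst-Pack",
    "2 2 1",
    "0 0 0",
])

incar_text = render_incar(incar_tags)
print(incar_text)
print(kpoints_text)
```

The 15 Å vacuum layer does not appear here because it is part of the cell geometry rather than a calculation tag.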

Machine Learning Methods:

Both linear and nonlinear models were considered. For the regression task, five-fold cross-validation was applied to the linear and nonlinear models. Grid search was used to tune the hyperparameters of each model. All algorithms were evaluated and optimized using the Python package scikit-learn.
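
A minimal sketch of how five-fold cross-validation and grid search fit together in scikit-learn is given below. The data are synthetic stand-ins, the random forest is just one of the model families mentioned in this article, and the hyperparameter grid is hypothetical because the values actually searched are not reported here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic stand-in data: 120 candidates x 17 descriptors (NOT the paper's data).
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 17))
y = 4.2 + 0.3 * X[:, 0] - 0.2 * X[:, 1] + 0.05 * rng.normal(size=120)

# Five-fold cross-validation: four parts train, one part tests, rotated five times.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Hypothetical hyperparameter grid; the values searched in the study are not given.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=cv,
    scoring="neg_mean_absolute_error",  # scikit-learn maximizes, so MAE is negated
)
search.fit(X, y)

best_params = search.best_params_
best_mae = -search.best_score_  # flip the sign back to a positive MAE
print(best_params, round(best_mae, 3))
```

Swapping in another regressor only requires changing the estimator and its grid; the cross-validation object stays the same.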

Results & Discussion

In this paper, 17 physical and chemical properties were selected as features: (1) the number of electrons in the d and p orbitals (EDP); (2) enthalpy of oxide formation (HFOX); (3) electronegativity (NM); (4) electron affinity (AM); (5) first ionization energy (IM); (6) atomic number (AN); (7) atomic radius (R); (8) number of valence electrons (VE); (9) screening percentage (SCPE); (10) covalent radius (CCR); (11) the sum of the atomic radii of the coordination environment (SR); (12) the sum of the atomic masses of the coordination environment (SRAM); (13) the sum of the atomic numbers of the coordination environment (SAN); (14) the sum of the electron affinities of the coordination environment (SAM); (15) the sum of the electronegativities of the coordination environment (SNM); (16) the sum of the valence electrons of the coordination environment (SVE); and (17) the sum of the proton affinities of the coordination environment (SEPA). These values were taken from databases, the periodic table, and published literature. The wrapper method of feature selection trains a base machine learning model on candidate feature subsets and uses its performance metric as the basis for selecting features.

Accordingly, a random forest (a nonlinear model) and a support vector machine (a linear model) were employed. As shown in Figure 2a, the random forest model stabilizes when the dimension reaches 17. The SVM model retains its performance when the dimension exceeds 5, but its performance drops rapidly when the dimension exceeds 20. To further determine the importance of the features, the embedded method LASSO was used, which combines feature selection with model training. As shown in Figure 2b, six electronic structure features dominate among the candidates: NM, R, HFOX, IM, EDP, and AM. Of these, NM and R are two particularly important features for H2O2 production. Feature correlation was studied with Pearson coefficients, which assess the correlation between each feature and the outcome. As shown by the correlation coefficients between the features and G*OOH in Figure 2c, the electronegativity and atomic radius of the TM are the main descriptors of catalytic activity for the two-electron ORR.
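
To make the Pearson step concrete, the sketch below computes the coefficient between each feature column and the target. The numbers are made-up toy values chosen only to show the mechanics; they are not data from the paper, and the column names are used purely as labels.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up toy columns, NOT values from the paper.
features = {
    "NM": [1.5, 1.8, 2.0, 2.2, 2.4],
    "R": [1.6, 1.5, 1.4, 1.3, 1.2],
}
g_ooh = [3.6, 3.9, 4.1, 4.3, 4.5]  # toy target values in eV

# One coefficient per feature; values near +1 or -1 indicate a strong linear link.
correlations = {name: pearson(col, g_ooh) for name, col in features.items()}
for name, r in correlations.items():
    print(name, round(r, 3))
```

A coefficient close to zero would suggest that the feature carries little linear information about the target, although it may still matter to a nonlinear model.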

Figure 2. (a) Model performance as a function of the number of features, tested with the random forest and linear regression algorithms; (b) feature importance of the 14 features; (c) Pearson correlation plot of the 10 input features of the transition metals and the 7 input features of the coordination atoms, where E represents G*OOH

Twenty transition metal atoms and six coordination configurations (TM-N4, TM-N3O, TM-N2O2(A), TM-N2O2(B), TM-NO3, and TM-O4) give 120 possible combinations. The electronegativities of the coordination atoms are usually summed and used as an activity descriptor, but this ignores the geometric differences between coordination environments. To capture these differences, spatial (three-dimensional coordinate) features of SACs are proposed for the first time. As shown in Figure 3, the coordinates of the central metal and the coordination atoms are taken into account, and a mapping function reduces them to a lower dimension.
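
The size of the candidate space follows from a simple Cartesian product, sketched below. The metal labels are placeholders because the 20 transition metals are not listed here, and the coordination labels are written as read from the text.

```python
from itertools import product

# Placeholder metal labels: the text does not list the 20 transition metals.
metals = [f"TM{i:02d}" for i in range(1, 21)]

# Six four-coordinate environments, labels as read from the text.
coordinations = ["N4", "N3O", "N2O2(A)", "N2O2(B)", "NO3", "O4"]

# Every metal paired with every coordination environment.
candidates = [f"{metal}-{coord}" for metal, coord in product(metals, coordinations)]

print(len(candidates))  # 120
print(candidates[:3])
```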

Figure 3. The spatial coordinate information of single-atom catalysts is reduced to numerical features that distinguish different structures. Features with spatial information are defined from the spatial information weights and intrinsic features of the coordination atoms and the spatial information weight (PTM) of the transition metal

After feature engineering, an appropriate model is selected for training on the dataset. During the IML process, each batch Bi is trained with cross-validation: one part of the dataset is used for testing and the other four parts for training (Figure 4a). As shown in Figure 4b, a total of 13 models were tested. The results of four tests in the fourth iteration alone showed that BayesianRidge, RandomForest, MLP, and XGBoost, with lower MAE and MSE, performed well. Both the MLP and XGBoost models are suitable for training on this dataset because of their scalability and robustness.
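
The general shape of an iterative screening loop can be sketched as follows. This is a toy under stated assumptions, not the authors' procedure: the rule for choosing each batch (the candidates predicted closest to the 4.22 eV target quoted in this article) is assumed, a straight-line fit stands in for the real models, and expensive_label is a hypothetical stand-in for a DFT calculation.

```python
import random

TARGET = 4.22  # eV, ideal *OOH adsorption free energy quoted in this article


def expensive_label(x):
    """Hypothetical stand-in for a DFT calculation on a 1-D toy descriptor."""
    return 3.5 + 1.5 * x


def fit_line(points):
    """Least-squares line through (x, y) pairs; the toy surrogate model."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def iterative_screen(candidates, n_init=5, batch=5, n_iter=3, seed=0):
    """Label a small seed set, then repeatedly fit, predict, and label a batch."""
    rng = random.Random(seed)
    labelled = {x: expensive_label(x) for x in rng.sample(candidates, n_init)}
    for _ in range(n_iter):
        slope, intercept = fit_line(list(labelled.items()))
        pool = [x for x in candidates if x not in labelled]
        # Assumed selection rule: label the candidates predicted closest to TARGET.
        pool.sort(key=lambda x: abs(slope * x + intercept - TARGET))
        for x in pool[:batch]:
            labelled[x] = expensive_label(x)
    best = min(labelled, key=lambda x: abs(labelled[x] - TARGET))
    return best, labelled[best], len(labelled)


candidates = [i / 119 for i in range(120)]  # 120 toy candidates
best_x, best_g, n_labelled = iterative_screen(candidates)
print(round(best_g, 3), n_labelled)
```

In this toy run only 20 of the 120 candidates are ever labelled, which is the sense in which an iterative scheme can reduce the training set size.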

However, without spatial coordinate information, the XGBoost and MLP models perform poorly in predicting H2O2 activity. With spatial coordinate information, the XGBoost(S) model achieves MAE and MSE scores of 0.494 eV and 0.434 eV, respectively. The MLP(S) model also showed excellent performance for H2O2 activity. To improve the generalization of the MLP model, dropout tuning was considered. The activation function and optimizer were set to tanh and SGD, respectively. Finally, 70 neurons are obtained in the output.
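
For reference, the two error metrics can be computed as in the short sketch below. The predictions are invented toy numbers, not model outputs from the study.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |true - predicted|; same unit as the target (eV here)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average of (true - predicted)^2; penalizes large errors more strongly."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented toy values in eV, NOT results from the study.
y_true = [4.2, 4.1, 4.4]
y_pred = [4.0, 4.2, 4.7]

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
print(round(mae, 4), round(mse, 4))
```

When individual errors are below 1 eV, squaring shrinks them, so an MSE smaller than the MAE is not a contradiction.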

Figure 4. (a) Five-fold cross-validation scheme; (b) cross-validation tests of the mean absolute error and mean squared error of 13 machine learning models; (c) R2 scores, mean absolute errors, and mean squared errors of the machine learning models (XGBoost, XGBoost(S), MLP, and MLP(S)) after the third iteration of the machine learning process. XGBoost(S) and MLP(S) denote XGBoost and MLP with spatial coordinate features added. Input data with an adsorption free energy of about 4.22 eV can be identified by the third iteration of the machine learning process

The *OOH adsorption free energies are obtained through the IML process. The results show that the optimal catalyst is found during the third iteration, as shown in Figure 5a. RhN2O2(A) exhibits the best *OOH adsorption free energy, 4.233 eV, which corresponds to an ultra-low overpotential of 0.013 V for H2O2 production. It is followed by NiN2O2(B), with a *OOH adsorption free energy of 4.116 eV. The NiN3O configuration, in which the central Ni atom is coordinated by one oxygen atom and three nitrogen atoms, exhibits a weaker *OOH adsorption free energy of 4.380 eV. In addition, the CoN3O configuration exhibits a *OOH adsorption free energy of 4.115 eV.

Figure 5. (a) Adsorption free energies around 4.22 eV can be identified by the third iteration of the machine learning process. SACs with suitable *OOH adsorption free energies for H2O2 production include RhN2O2(A) (4.233 eV), NiN2O2(B) (4.116 eV), CoN3O (4.115 eV), and NiN3O (4.422 eV); (b) adsorption free energy of the *OOH intermediate on various SACs

In addition, the adsorption free energies of the *OOH intermediate on various SACs were determined by DFT calculations, as shown in Figure 5b. The dotted line represents the ideal value for H2O2 production, G*OOH = 4.22 eV. For the past few decades, the two-electron ORR pathway has been regarded as a side reaction of the four-electron ORR process in fuel cell applications. Therefore, the two-electron ORR selectivity of SACs can serve as a criterion for screening highly efficient catalysts for H2O2 production. As shown in Figure 6, the two-electron and four-electron ORR activities are both plotted as a function of the *OOH adsorption free energy. The overpotential of the two-electron ORR on the SACs is smaller than that of the four-electron ORR, which indicates that the ORR process favors the generation of H2O2.
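
The link between the *OOH adsorption free energy and the overpotential can be illustrated with a small calculation. The sketch assumes the two-electron overpotential (in V) equals the distance of G*OOH (in eV) from the ideal 4.22 eV; that assumption matches the pairing of 4.233 eV with 0.013 V reported in this article, but the formula itself is not spelled out here. The function name is hypothetical.

```python
IDEAL_G_OOH = 4.22  # eV, ideal *OOH adsorption free energy quoted in this article


def two_electron_overpotential(g_ooh):
    """Assumed relation: overpotential (V) = |G*OOH - 4.22 eV| per electron."""
    return abs(g_ooh - IDEAL_G_OOH)


# *OOH adsorption free energies (eV) as reported in this article.
g_ooh_values = {"RhN2O2(A)": 4.233, "NiN2O2(B)": 4.116, "CoN3O": 4.115}

# Rank candidates from smallest to largest overpotential.
ranked = sorted(
    g_ooh_values,
    key=lambda name: two_electron_overpotential(g_ooh_values[name]),
)
for name in ranked:
    print(name, round(two_electron_overpotential(g_ooh_values[name]), 3))
```

Under this assumption RhN2O2(A) ranks first, and NiN2O2(B) and CoN3O sit roughly 0.1 V away from the ideal value.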

Figure 6. Theoretical volcano plot of the two-electron and four-electron oxygen reduction reactions as a function of *OOH adsorption free energy

Conclusions and prospects

In this paper, an IML method is proposed for rapidly screening target catalysts for H2O2 production; during machine learning, the complex coordination environments in SACs are distinguished by spatial coordinate features. A multilayer neural network was adopted in the new iterative machine learning method, and the ideal catalyst RhN2O2(A) was identified during the third iteration. RhN2O2(A) exhibits the best activity for H2O2 production, with an ultra-low overpotential of 0.013 V. This work will help in screening novel catalysts for H2O2 production in industrial systems.

Bibliographic information

Deng B, Chen P, Xie P, et al. Iterative machine learning method for screening high-performance catalysts for H2O2 production[J]. Chemical Engineering Science, 2023, 267: 118368.
