A new deep learning framework that encodes physics to learn the process of reaction diffusion

Mondo Education · Updated on 2024-02-06

Spatiotemporal dynamics are ubiquitous in nature. For example, reaction–diffusion processes exhibit interesting phenomena common to many disciplines, such as chemistry, biology, geology, physics, and ecology. Modeling complex spatiotemporal dynamical systems relies heavily on finding the underlying partial differential equations (PDEs).

However, owing to insufficient prior knowledge and the lack of explicit PDE formulations for the nonlinear processes governing the system variables, predicting the evolution of these systems remains a challenging task in many cases. Here, a research team from the University of Chinese Academy of Sciences, Renmin University of China, Northeastern University (USA) and MIT proposes a new deep learning framework, PeRCNN, which hard-encodes a given physical structure into a recurrent convolutional neural network to facilitate learning of spatiotemporal dynamics from sparse data. Extensive numerical experiments demonstrate how the proposed method can be applied to a variety of problems involving reaction–diffusion processes and other PDE systems, including forward and inverse analysis, data-driven modeling, and PDE discovery.

The physics-encoded machine learning method is found to exhibit high accuracy, robustness, interpretability and generalizability. The study, titled "Encoding physics to learn reaction–diffusion processes", was published in Nature Machine Intelligence on July 17, 2023.

Through the interplay of diffusion and reaction, the mechanism behind the autonomous formation of Turing patterns can be revealed. As with many other systems, understanding the complex spatiotemporal dynamics governed by the underlying PDEs is a central task. However, the first-principles laws, in the form of closed-form governing equations, of many underexplored systems remain uncertain or only partially known. Machine learning opens up new avenues for the scientific discovery of these systems in a data-driven way.
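For concreteness, such a two-species reaction–diffusion system can be stated in the generic form below (added here for illustration; the specific reaction terms vary from system to system):

```latex
\frac{\partial u}{\partial t} = \mu_u \Delta u + f(u, v), \qquad
\frac{\partial v}{\partial t} = \mu_v \Delta v + g(u, v)
```

Here u and v are the concentration fields, μ_u and μ_v are diffusion coefficients, Δ is the Laplacian operator, and f and g are reaction terms that are often polynomial in u and v.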

More recently, machine learning methods have driven a resurgence in data-driven scientific computing, due in large part to deep learning models' ability to automatically learn nonlinear mappings between variables from rich labeled data. However, purely data-driven deep learning approaches learn their representations from, and rely heavily on, big data, which is rarely available in most scientific problems. The resulting models usually fail to satisfy physical constraints, and their generalizability cannot be guaranteed.

To address this problem, physics-informed neural networks (PINNs) have become a major research paradigm that uses prior knowledge of the underlying physics to enable learning in small-data regimes. PINNs have proven effective across a wide range of scientific applications, in particular for simulating a variety of physical systems.
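As a rough illustration of this soft-penalty paradigm (a minimal sketch, not the paper's method, assuming a 1D Fisher–KPP equation u_t = μ u_xx + u(1 − u) as the governing physics), a PINN trains a fully connected network and penalizes the PDE residual evaluated by automatic differentiation:

```python
import torch

# Fully connected network u_theta(x, t): inputs (x, t), output u.
net = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)

def pde_residual_loss(xt, mu=0.01):
    xt = xt.requires_grad_(True)                 # collocation points, shape (N, 2)
    u = net(xt)
    grads = torch.autograd.grad(u.sum(), xt, create_graph=True)[0]
    u_x, u_t = grads[:, 0:1], grads[:, 1:2]
    u_xx = torch.autograd.grad(u_x.sum(), xt, create_graph=True)[0][:, 0:1]
    residual = u_t - mu * u_xx - u * (1 - u)     # Fisher-KPP residual (illustrative)
    return (residual ** 2).mean()                # soft physics penalty in the loss

loss = pde_residual_loss(torch.rand(128, 2))
```

The physics enters only through this residual penalty; the physics-encoded approach described below instead builds such knowledge into the architecture itself.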

However, the dominant physics-informed learning model, the PINN, represents a continuous learning paradigm, since it employs a fully connected neural network (FCNN) to continuously approximate the solution of the physical system. The resulting continuous representation of the system introduces some limitations. Compared with continuous learning models, discrete learning methods have a distinct advantage: initial conditions (ICs), boundary conditions (BCs), and even incomplete PDE structures can be hard-coded into the learning model. Even without any labeled data, this practice avoids ill-posedness of the optimization problem.

Establishing an efficient, interpretable, and generalizable discrete learning paradigm for nonlinear physical systems therefore remains a major challenge in scientific machine learning. To this end, the researchers propose a physics-encoded model that encodes prior physical knowledge directly into the network architecture, in contrast with physics-informed learning, which commonly teaches the model physics through penalty terms in the loss function. Specifically, the model has the following main features:

1) In contrast to the mainstream PINN approach, which uses an FCNN as a continuous approximator of the solution, the physics-encoded model is discrete (i.e., the solution is defined on a spatial grid and at discrete time steps) and hard-codes the given physical structure into the network architecture.

2) The model employs a unique convolutional block (the Π-block) to capture the spatial patterns of the system, while time marching is performed by a recurrent unit; a minimal sketch follows this list. This design has been shown, through both mathematical proof and numerical experiments, to improve the model's expressiveness for nonlinear spatiotemporal dynamics.

3) Thanks to the discretization in time, the network can incorporate well-known numerical time-integration methods (e.g., the forward Euler method, Runge–Kutta methods) so that incomplete PDEs can be encoded into the network architecture. In this study, the researchers demonstrate the capability of the proposed network architecture by applying it to various tasks in the scientific modeling of spatiotemporal dynamics, such as reaction–diffusion processes.
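Below is a minimal PyTorch sketch of this recurrent design; the channel width, kernel size, branch count, grid size, and time step are illustrative assumptions rather than the paper's exact settings. Parallel convolutional branches are combined by an elementwise product (hence Π), and time marching uses a forward Euler step:

```python
import torch
import torch.nn as nn

class PiBlock(nn.Module):
    """Sketch of a product (Pi) block: parallel convolutional branches
    whose feature maps are multiplied elementwise, which promotes
    polynomial-like nonlinear terms in the learned dynamics."""
    def __init__(self, channels=2, hidden=16, n_branches=3):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, hidden, kernel_size=5, padding=2)
            for _ in range(n_branches)
        ])
        self.aggregate = nn.Conv2d(hidden, channels, kernel_size=1)

    def forward(self, u):
        prod = self.branches[0](u)
        for branch in self.branches[1:]:
            prod = prod * branch(u)        # elementwise product of branches
        return self.aggregate(prod)        # 1x1 conv combines channels

# Recurrent rollout with a forward Euler step:
#   u_{k+1} = u_k + dt * rhs(u_k),
# where the Pi-block supplies the learned part of the right-hand side.
pi_block = PiBlock()
dt = 0.01                                  # illustrative time-step size
u = torch.rand(1, 2, 100, 100)             # two-species state on a 100x100 grid
for _ in range(10):                        # march 10 steps forward in time
    u = u + dt * pi_block(u)
```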

The proposed network, PeRCNN, consists of two main components: a fully convolutional network acting as an initial state generator (ISG), and a novel convolutional block, the Π-block (product block), for the recurrent computation.
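A rough sketch of what such an ISG could look like, assuming (for illustration) that it upsamples a coarse, noisy initial measurement to the full-resolution initial state of the rollout; the layer sizes and upsampling factor are hypothetical:

```python
import torch
import torch.nn as nn

# Hypothetical initial state generator (ISG): a small fully convolutional
# network mapping a low-resolution initial measurement to the fine-grid
# initial state consumed by the recurrent Pi-block rollout.
isg = nn.Sequential(
    nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
    nn.Conv2d(2, 16, kernel_size=5, padding=2), nn.ReLU(),
    nn.Conv2d(16, 2, kernel_size=5, padding=2),
)

coarse_u0 = torch.rand(1, 2, 25, 25)   # coarse measured initial state
u0 = isg(coarse_u0)                    # fine-grid state, shape (1, 2, 100, 100)
```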

Figure 1: Schematic diagram of the PeRCNN architecture.

Thanks to the discrete nature of the learning model, prior physical knowledge of the system can be encoded into the network architecture, which helps to pose a well-conditioned optimization problem. Given known terms of the PDE, they can be encoded into the network through a shortcut connection, namely a physics-based finite-difference (FD) convolutional connection. The convolutional kernels in this physics-based convolutional layer are fixed to the appropriate FD stencils so that the known terms are evaluated exactly. The main advantage of this encoding mechanism is the ability to exploit an incomplete PDE during learning. A numerical example demonstrates that this shortcut connection can speed up training and significantly improve the model's inference accuracy. In short, the physics-based convolutional connections account for the known physics, while the Π-blocks are designed to learn the complementary unknown dynamics. In addition to incomplete PDEs, boundary conditions can also be encoded into the learning model.
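Here is a minimal sketch of such a physics-based FD convolutional connection, assuming (for illustration) a known diffusion term μΔu, a 5-point Laplacian stencil, and periodic boundary conditions enforced by the physics-based padding discussed next; μ, dx, and the grid size are illustrative values:

```python
import torch
import torch.nn.functional as F

mu, dx = 0.1, 1.0   # illustrative diffusion coefficient and grid spacing

# Fixed (non-trainable) 5-point Laplacian stencil as a convolution kernel,
# so the known term mu * Laplacian(u) is computed exactly, not learned.
laplace = torch.tensor([[0., 1., 0.],
                        [1., -4., 1.],
                        [0., 1., 0.]]).view(1, 1, 3, 3) / dx**2

def known_physics(u):
    # Physics-based padding: circular padding enforces periodic BCs on the
    # state before the fixed-kernel convolution, at every time step.
    u_pad = F.pad(u, (1, 1, 1, 1), mode="circular")
    return mu * F.conv2d(u_pad, laplace)

u = torch.rand(1, 1, 100, 100)
assert known_physics(u).shape == u.shape   # shortcut output matches the state
```

In the rollout, this fixed-kernel output is added to the Π-block output, so the trainable part only needs to fit the unknown remainder of the dynamics.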

Inspired by ideas from the FD method, the researchers apply physics-based padding to the model at each time step to enforce the boundary conditions; this scheme can be further refined in the future.

In summary, the researchers propose a novel deep learning architecture, PeRCNN, for the modeling and discovery of nonlinear spatiotemporal dynamical systems from sparse and noisy data. Although PeRCNN shows good promise for the data-driven modeling of complex systems, it suffers from a computational bottleneck due to the high dimensionality of the discretized system, especially for systems in large 3D spatial domains that evolve over long time horizons.

However, this bottleneck can be mitigated by time batching and multi-GPU training. In addition, the current model is rooted in standard convolution operations, which limits its applicability to irregular meshes over arbitrary computational geometries; this limitation could be addressed by introducing graph convolutions into the network architecture. Finally, because the PeRCNN network is designed under the assumption that the underlying governing PDEs have a polynomial form, it may be less capable, or overly redundant, when modeling spatiotemporal dynamics whose governing PDEs are parsimonious but involve other symbolic operators, such as division, sin, cos, exp, tan, sinh, and log. Although PeRCNN has shown success in the data-driven modeling of PDE systems with non-polynomial terms, how to design a network that properly uses a finite set of mathematical operators as symbolic activation functions to improve its representational capability remains an open question. The researchers plan to address these issues systematically in future studies.
