Variational Formulation (FEM) vs Physics-Informed Neural Networks (PINNs)
How do we solve partial differential equations numerically? For decades, the answer has been the variational formulation leading to finite element methods—a mathematically rigorous approach with proven convergence guarantees. Recently, physics-informed neural networks have emerged, bypassing variational principles entirely and working directly with the strong form. This comparison explores both approaches through the lens of Mandel's problem in poromechanics.
Classical Approach • Weak Form • FEM
Begin with the governing equations in their original form. For Mandel's problem (coupled poromechanics):
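The equations themselves are not reproduced here; as a reference point, a standard statement of the Biot poroelasticity system (displacement \(\mathbf{u}\), pore pressure \(p\), Biot coefficient \(\alpha\), storage coefficient \(S\), mobility \(k/\mu\)) reads as follows. The specific form used for Mandel's problem may differ in detail:

\[
\nabla \cdot \boldsymbol{\sigma} = \mathbf{0}, \qquad \boldsymbol{\sigma} = \mathbb{C} : \boldsymbol{\varepsilon}(\mathbf{u}) - \alpha\, p\, \mathbf{I},
\]
\[
\alpha\, \frac{\partial (\nabla \cdot \mathbf{u})}{\partial t} + S\, \frac{\partial p}{\partial t} - \nabla \cdot \left( \frac{k}{\mu}\, \nabla p \right) = 0.
\]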
Multiply by test functions and integrate by parts. This reduces smoothness requirements and naturally incorporates boundary conditions.
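As a sketch, with test functions \(\mathbf{v}\) (mechanics) and \(q\) (flow), and assuming standard Biot poroelasticity notation with traction \(\bar{\mathbf{t}}\) on \(\Gamma_t\) and prescribed flux \(\bar{w}\) on \(\Gamma_q\), the weak statements take the form:

\[
\int_\Omega \boldsymbol{\sigma}(\mathbf{u}, p) : \nabla \mathbf{v}\, d\Omega = \int_{\Gamma_t} \bar{\mathbf{t}} \cdot \mathbf{v}\, d\Gamma,
\]
\[
\int_\Omega \left( \alpha\, \frac{\partial (\nabla \cdot \mathbf{u})}{\partial t} + S\, \frac{\partial p}{\partial t} \right) q\, d\Omega + \int_\Omega \frac{k}{\mu}\, \nabla p \cdot \nabla q\, d\Omega = \int_{\Gamma_q} \bar{w}\, q\, d\Gamma.
\]

Note that only first derivatives of \(\mathbf{u}\) and \(p\) appear, which is precisely the reduced smoothness requirement, and the natural boundary conditions enter through the surface terms.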
Select finite-dimensional subspaces with prescribed basis functions (shape functions). Common choices: linear, quadratic, or higher-order polynomials.
\[
\mathbf{u}_h(\mathbf{x}) = \sum_i u_i\, \phi_i(\mathbf{x}), \qquad p_h(\mathbf{x}) = \sum_j p_j\, \psi_j(\mathbf{x}),
\]
where \(\phi_i\) and \(\psi_j\) are pre-defined polynomial basis functions with compact support.
Substitute finite element approximations into weak form. This converts the infinite-dimensional problem into a finite system.
This yields the algebraic system \(\mathbf{K}\mathbf{u} = \mathbf{f}\), with stiffness matrix \(K_{ij} = \int_\Omega \nabla \phi_j : \mathbb{C} : \nabla \phi_i \, d\Omega\).
Compute integrals element-by-element using numerical quadrature (Gaussian integration), then assemble global system.
Apply boundary conditions and solve using direct methods (LU decomposition) or iterative solvers (CG, GMRES, multigrid).
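The assembly and solve steps above can be sketched on a 1-D model problem rather than Mandel's coupled system: \(-u'' = f\) on \((0,1)\) with \(u(0)=u(1)=0\), linear elements, 2-point Gauss quadrature for the load, and a direct solve. All names here are illustrative, not from the source:

```python
import numpy as np

def fem_poisson_1d(n_el, f):
    """Assemble and solve -u'' = f on (0,1), u(0)=u(1)=0, linear elements."""
    nodes = np.linspace(0.0, 1.0, n_el + 1)
    h = np.diff(nodes)
    n = n_el + 1
    K = np.zeros((n, n))
    F = np.zeros(n)
    gp = np.array([-1.0, 1.0]) / np.sqrt(3.0)   # 2-point Gauss on [-1, 1]
    for e in range(n_el):                        # element-by-element assembly
        ke = (1.0 / h[e]) * np.array([[1.0, -1.0], [-1.0, 1.0]])
        K[e:e + 2, e:e + 2] += ke
        for xi in gp:
            x = nodes[e] + 0.5 * (xi + 1.0) * h[e]
            N = np.array([0.5 * (1 - xi), 0.5 * (1 + xi)])  # shape functions
            F[e:e + 2] += f(x) * N * 0.5 * h[e]             # weight 1, Jacobian h/2
    free = np.arange(1, n - 1)                   # eliminate Dirichlet dofs
    u = np.zeros(n)
    u[free] = np.linalg.solve(K[np.ix_(free, free)], F[free])
    return nodes, u

nodes, u = fem_poisson_1d(32, lambda x: np.pi**2 * np.sin(np.pi * x))
err = np.max(np.abs(u - np.sin(np.pi * nodes)))  # exact solution is sin(pi x)
```

For a problem this small a dense direct solve is fine; production FEM codes use sparse storage and the iterative solvers listed above.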
Compute a posteriori error estimates and adaptively refine mesh where error is large.
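A minimal sketch of the refinement loop, assuming a simple gradient-jump indicator on a 1-D mesh (a stand-in for a real a posteriori estimator): elements whose indicator exceeds the mean are bisected, so the mesh clusters near a kink. Names and the marking rule are illustrative:

```python
import numpy as np

def slope_jump_indicator(nodes, u):
    """Per-element indicator: slope jumps of the interpolant at element ends."""
    slopes = np.diff(u) / np.diff(nodes)
    jumps = np.abs(np.diff(slopes))          # one jump per interior node
    eta = np.zeros(len(nodes) - 1)
    eta[:-1] += jumps                        # right-end jump of each element
    eta[1:] += jumps                         # left-end jump
    return eta

def adapt(nodes, f, n_cycles=4):
    """Bisect elements whose indicator exceeds the mean, repeat."""
    for _ in range(n_cycles):
        eta = slope_jump_indicator(nodes, f(nodes))
        marked = np.where(eta > eta.mean())[0]
        mids = 0.5 * (nodes[marked] + nodes[marked + 1])
        nodes = np.sort(np.concatenate([nodes, mids]))
    return nodes

f = lambda x: np.sqrt(np.abs(x - 0.3))       # derivative blows up at x = 0.3
nodes = adapt(np.linspace(0.0, 1.0, 11), f)  # mesh ends up dense near 0.3
```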
Modern Approach • Strong Form • Deep Learning
Same starting point! Begin directly with the differential equations. No need to derive a weak form.
PINNs work directly with the strong form. No integration by parts, no test functions, no variational principles needed.
Instead, we'll enforce the PDEs at specific collocation points using automatic differentiation.
Represent the solution as a neural network. The network parameters (weights and biases) are the unknowns to optimize.
Each layer \(l\) applies an affine map followed by an activation \(\varphi\): \(\mathbf{o}^{(l)} = \varphi(\mathbf{W}^{(l)}\mathbf{o}^{(l-1)} + \mathbf{b}^{(l)})\)
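The layer recurrence can be sketched in a few lines of numpy. The network shape here (inputs \((x, t)\), outputs \((u_x, u_y, p)\), two hidden layers of 20) is a hypothetical choice, not the architecture used in the source:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(x, weights, biases, act=np.tanh):
    """Apply o = act(W o + b) layer by layer; the last layer is linear."""
    o = x
    for W, b in zip(weights[:-1], biases[:-1]):
        o = act(W @ o + b)
    return weights[-1] @ o + biases[-1]

sizes = [2, 20, 20, 3]  # hypothetical: (x, t) -> (u_x, u_y, p)
weights = [rng.normal(0.0, np.sqrt(2.0 / m), (n, m))
           for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]
out = mlp_forward(np.array([0.5, 0.1]), weights, biases)
```

The weights and biases in these lists are exactly the unknowns the optimizer adjusts, in place of the nodal coefficients of the FEM column.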
No mesh needed! Randomly sample points in the domain using Latin hypercube or uniform sampling.
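Latin hypercube sampling is easy to sketch without any library: stratify each coordinate into \(n\) bins, draw one point per bin, then shuffle the bins independently per dimension. The point count and domain here are illustrative:

```python
import numpy as np

def latin_hypercube(n, d, rng):
    """n points in [0, 1)^d with exactly one point per stratum per dimension."""
    # one draw per stratum [i/n, (i+1)/n), in row order
    u = (rng.random((n, d)) + np.arange(n)[:, None]) / n
    for j in range(d):                       # decouple the dimensions
        u[:, j] = u[rng.permutation(n), j]
    return u

rng = np.random.default_rng(1)
pts = latin_hypercube(100, 2, rng)           # 100 collocation points, unit square
```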
The objective directly enforces PDE residuals at collocation points, plus boundary/initial conditions.
A typical form is \(\mathcal{L}(\mathbf{W}) = \frac{1}{N_r}\sum_{i=1}^{N_r} \big\| \Pi[\mathbf{u}_{\mathbf{W}}](\mathbf{x}_i, t_i) \big\|^2 + \mathcal{L}_{BC} + \mathcal{L}_{IC}\), where \(\Pi\) represents the PDE operators (computed via automatic differentiation).
Compute derivatives of neural network outputs with respect to inputs automatically using the chain rule through the computational graph.
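A minimal way to see the mechanism is forward-mode AD with dual numbers: each value carries its derivative, and every operation propagates both via the chain rule. This is a toy illustration, not what PINN frameworks actually use (they rely on reverse-mode AD), and `w`, `b` are hypothetical:

```python
import math

class Dual:
    """Forward-mode AD: (value, derivative) propagated through each op."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)
    __rmul__ = __mul__

def tanh(z):
    t = math.tanh(z.val)
    return Dual(t, (1.0 - t * t) * z.dot)    # chain rule through tanh

w, b = 1.3, -0.2                             # hypothetical scalar layer
x = Dual(0.5, 1.0)                           # seed dx/dx = 1
y = tanh(w * x + b)                          # y.dot = d/dx tanh(w x + b)
```

The derivative in `y.dot` is exact to machine precision, unlike a finite-difference approximation, and nesting such passes gives the second derivatives that PDE residuals need.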
Minimize loss function using Adam, L-BFGS, or other optimizers. Update network weights iteratively.
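The update rule can be sketched with a plain numpy Adam applied to a toy quadratic loss standing in for the PINN loss; the hyperparameters and the loss itself are illustrative, not from the source:

```python
import numpy as np

def adam_minimize(grad, w, steps=2000, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    """Plain Adam: bias-corrected first/second moment estimates of the gradient."""
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g            # first-moment estimate
        v = b2 * v + (1 - b2) * g * g        # second-moment estimate
        m_hat = m / (1 - b1 ** t)            # bias corrections
        v_hat = v / (1 - b2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

# toy loss L(w) = ||w - w*||^2, gradient 2 (w - w*)
target = np.array([1.0, -2.0, 0.5])
w = adam_minimize(lambda w: 2.0 * (w - target), np.zeros(3))
```

In PINN practice Adam is often followed by L-BFGS for a final polish; on a real, non-convex loss none of this comes with the convergence guarantees of the linear-solver column.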
| Aspect | Variational Formulation (FEM) | Physics-Informed Neural Networks |
|---|---|---|
| Starting Point | Strong form → Weak form (variational) | Strong form directly (no weak form) |
| Mathematical Tool | Integration by parts, test functions | Automatic differentiation |
| Basis Functions | Pre-defined (polynomials, FE shape functions) | Learned (neural network discovers representation) |
| Discretization | Mesh required (elements, nodes, connectivity) | Meshless (random collocation points) |
| Integration | Numerical quadrature (Gaussian integration) | Point evaluation (no integration needed) |
| System Type | Linear/nonlinear algebraic system: \(\mathbf{K}\mathbf{u} = \mathbf{f}\) | Non-convex optimization problem: \(\min \mathcal{L}(\mathbf{W})\) |
| Solver | Direct (LU) or iterative (CG, GMRES) | Gradient descent (Adam, L-BFGS) |
| Convergence Theory | ✓ Well-established (Céa's lemma, error estimates) | ✗ Largely unknown (empirical, no guarantees) |
| Error Estimates | ✓ A priori & a posteriori available | ✗ No computable bounds |
| Convergence Rate | ✓ Known: \(\mathcal{O}(h^k)\) for degree \(k\) | ✗ Unknown (depends on architecture, optimizer) |
| Hyperparameters | Mesh size \(h\), polynomial degree \(k\) | Layers, neurons, learning rate, activation, batch size... |
| Complexity for User | Moderate (requires mesh generation expertise) | High (requires ML expertise, hyperparameter tuning) |
| Computational Cost | Predictable (depends on DOF count) | Variable (depends on convergence, architecture) |
| Maturity | ✓ 50+ years of development and theory | ~5 years of active research |
| Trust for Engineering | ✓ High (certified, proven) | ✗ Low (research stage, no guarantees) |
FEM provides a mathematically rigorous path: variational formulation → discretization → guaranteed convergence. You know exactly what to expect and can prove your solution is correct.
PINNs offer conceptual elegance: skip the weak form, work directly with PDEs, and let the network learn the solution. But you're navigating in the dark—no convergence theory, no error bounds, just empirical observation and hope that gradient descent finds something reasonable.
For Mandel's problem specifically, the non-monotonic pressure response and disparate magnitude scales challenge PINNs significantly, requiring careful choice of activation functions (truncated sine) and optimizers (L-BFGS) with no theoretical guidance on why these choices work.