|
PICurv 0.1.0
A Parallel Particle-In-Cell Solver for Curvilinear LES
|
This page documents PICurv's matrix-free Newton–Krylov momentum solver: what it solves, how it is configured, how to read its convergence output, and the residual-purity invariant that makes it work. It is one of two momentum-solution approaches in PICurv; see Momentum Solver Implementations for how it compares to the dual-time Picard–Jameson solver and how to select between them.
The Newton–Krylov solver advances the implicit momentum update by solving the nonlinear momentum residual directly with PETSc's SNES, using a matrix-free (Jacobian-free) Krylov linearization. It is implemented in MomentumSolver_NewtonKrylov (src/momentum_newton_krylov.c) and is selected with strategy.momentum_solver: "Newton Krylov".
Version one is deliberately narrow and validates its inputs up front (MomentumSolver_NewtonKrylov rejects anything outside this set):
StartStep == 0);Nvert must be fluid everywhere);Within that scope it is a drop-in alternative to the dual-time Picard–Jameson solver and shares the same fractional-step projection, BDF time discretization, boundary system, and pressure solve.
Each physical timestep advances the contravariant velocity Ucont by solving the discrete momentum residual to zero:
\[ F(\mathbf{U}) \;=\; -\,\mathrm{RHS}_{\text{spatial}}(\mathbf{U}) \;+\; \frac{a_0}{\Delta t}\,\mathbf{U} \;-\; (\text{BDF history terms}) \;=\; 0, \]
where RHS_spatial is the convective + viscous + source assembly (ComputeRHS) and the time term is the BDF discretization added by ComputeTotalResidual —
Order selection is centralized in MomentumUsesBDF2()/MomentumBDFCoefficient(), shared with the momentum stability estimate. The Newton solver does not change the residual arithmetic, BDF coefficients, conservation-outlet formulas, or the number of boundary passes; it only changes how the resulting nonlinear system is solved.
The solver builds a per-step SNES (MomentumSolver_NewtonKrylov):
SNESNEWTONLS** with a backtracking (bt) line search;MatCreateSNESMF, whose action is the finite-difference directional derivative \(J\mathbf{v} \approx [F(\mathbf{X}+h\mathbf{v}) - F(\mathbf{X})]/h\) (MatMFFDComputeJacobian). No Jacobian is assembled;KSPGMRES** linear solve with PCNONE (unpreconditioned). PCNONE is required in version one and is enforced: option processing that selects any other preconditioner is rejected.Because the Jacobian action is a finite difference of the residual, the residual must be a deterministic function of the trial vector X (see 5. Deterministic Cartesian Seeding (Why It Is Required)). GMRES uses PETSc's default classical Gram–Schmidt orthogonalization; modified Gram–Schmidt is not required and is not enabled.
The Newton loop is: evaluate F(X) → form Krylov solve of J dX = -F → line search along dX → repeat until an SNES convergence test fires. On convergence the solution is committed into Ucont; on failure the entry state is restored (rollback) and the physical step is reported as not converged (simCtx->mom_last_converged).
One residual evaluation (MomentumNewtonKrylov_FormResidual) performs, in order:
X into the global Ucont;Ucont and refresh local lUcont ghosts;Ucat from the current lUcont, finalize periodic Ucat, and refresh lUcat ghosts;F = -Rhs, then a constrained-row pass that replaces every non-independent row (fixed boundary-normal, homogeneous dummy/tangential, and periodic-duplicate rows) with an explicit algebraic equation so the matrix-free operator has no zero Jacobian rows.The three internal boundary passes are unchanged and remain necessary: each pass refreshes the Cartesian state after a boundary correction so the next pass sees a consistent field.
This is the invariant that makes the matrix-free solve correct, and it is easy to break by "simplifying" the residual, so it is documented explicitly.
Every evaluation of F(X) must start from velocity fields derived from that same X. The conservation-outlet handler reads the cell-centered Cartesian velocity lUcat during its first boundary sweep (it measures the uncorrected outlet flux and builds the outlet profile from it). If lUcat were left over from a previous residual or matrix-free evaluation, F(X) would depend on that hidden state — two evaluations at the same X could differ, and the finite-difference Jacobian action would be inconsistent.
The velocity-state relationship is:
So the residual seeds them in exactly this dependency order before the first boundary pass:
Important subtleties, all captured in the source comment above the seed:
Contra2Cart() alone is not sufficient: it rebuilds the interior of the global Ucat but does not refresh lUcat (nor lUcont), and the outlet reads the local ghosted lUcat.ApplyBoundaryConditions() runs after each handler sweep, so it prepares passes two and three — it cannot prepare the very first outlet read of pass one.SynchronizePeriodicCellFields("Ucat") must run before the ghost scatter so periodic duplicate planes are finalized consistently (it is a no-op when no direction is periodic).Removing or shortening this sequence reintroduces a history-dependent residual and invalidates the Newton directions. A permanent regression guards it (Section 10).
Select the solver and (optionally) tune its PETSc controls. Omitted fields keep the defaults established in src/momentum_newton_krylov.c and PETSc.
Field-by-field mappings and validation rules (nonnegative tolerances, positive iteration/restart counts, preconditioner.type: none only) are the authoritative configuration reference in Solver Reference, section 4. The complete annotated template is examples/master_template/master_solver.yml.
Three configuration layers interact, in increasing precedence:
momentum_solver.newton_krylov.*) — the supported surface.petsc_passthrough_options** — raw PETSc options applied last; they can override structured values and are an advanced escape hatch.The tolerances above are a reasonable starting point. Interpretation:
nonlinear_solver.absolute_tolerance stops Newton when the nonlinear residual norm falls below it — the primary physical convergence gate.nonlinear_solver.relative_tolerance stops Newton relative to the initial residual norm.linear_solver.relative_tolerance controls how tightly each inner GMRES solve is converged; a loose 1e-6 inexact-Newton setting is typical and cheap.Newton–Krylov monitors are enabled under solver_monitoring.momentum (see Configuration Reference: Monitor YAML):
newton_krylov_history -> -mom_nk_pic_monitor: PICurv's own per-iteration nonlinear-norm history;snes_monitor -> -mom_nk_snes_monitor, snes_converged_reason -> -mom_nk_snes_converged_reason;ksp_monitor -> -mom_nk_ksp_monitor, ksp_converged_reason -> -mom_nk_ksp_converged_reason.Independently of PETSc monitors, the solver writes structured rank-zero logs into log_dir:
Momentum_Solver_Newton_Krylov_History_Block_<b>.log: one row per Newton iteration (step | block | newton | nonlinear_norm);Momentum_Solver_Newton_Krylov_Summary_Block_<b>.log: one row per physical step (SNES reason, Newton iterations, residual evaluations, Krylov iterations, initial/final norm, and whether the result was committed or rolled back).A healthy solve on the validated duct case shows the nonlinear norm dropping by several orders of magnitude in about two Newton iterations with accepted line search lambda = 1.
Do not treat all non-convergence the same — the SNES/KSP reason identifies the failure class:
CONVERGED_FNORM_ABS / CONVERGED_FNORM_RELATIVE**: success (absolute or relative nonlinear tolerance met).DIVERGED_MAX_IT**: hit snes_max_it without meeting a tolerance — usually under-resolved inner solves or too tight a nonlinear tolerance for the timestep; loosen nonlinear_solver.relative_tolerance or reduce dt.DIVERGED_LINEAR_SOLVE**: an inner GMRES solve failed to converge — inspect ksp_converged_reason; raise linear_solver.max_iterations or gmres.restart, or loosen linear_solver.relative_tolerance.DIVERGED_LINE_SEARCH**: the backtracking line search could not find a sufficient decrease — typically a poor Newton direction. In this solver that most often means the residual was not deterministic (a broken Cartesian seed, Section 5); it should not occur with the shipped residual.dt.Troubleshooting workflow: enable snes_monitor + snes_converged_reason + ksp_converged_reason, reproduce on a short run, and classify by reason before changing tolerances. If you observe DIVERGED_LINE_SEARCH or non-repeatable nonlinear norms, suspect residual determinism (Section 5) rather than the Krylov settings.
Version one supports and validates only the unpreconditioned path (preconditioner.type: none, PCNONE), enforced at runtime. There is no assembled operator to precondition against, so a matrix-free preconditioner would be required; this is future work and is intentionally not exposed. Other current limitations are the version-one scope restrictions in Section 1.
The Newton–Krylov path is covered at two levels:
unit-newton-krylov, part of make check): constraint-row Jacobian structure, matrix-free vs direct differencing, small solve/rollback, and residual repeatability. The conservation-outlet conditioned-row derivative test doubles as a seed-removal detector: removing the deterministic Cartesian seed makes that row's self-derivative revert to the decoupled artifact and the test fails.make unit-momentum-newton-boundary-fixedpoint, one and four ranks): on the production-sized straight duct it advances a real physical step 1, then verifies (i) residual purity at the step-2 state (immediate and after real MFFD products), (ii) a complete step-2 solve with the default classical Gram–Schmidt, (iii) that the converged three-pass solution also zeros the 24-pass outlet residual, and (iv) clean pressure projection.Validated behavior on that case: convergence in about two Newton iterations from the true projected step-1 state, identical results with classical and modified Gram–Schmidt, and divergence-free projection, on both one and four ranks.
| Aspect | Newton–Krylov | Dual-Time Picard–Jameson |
|---|---|---|
| Linearization | true Newton (matrix-free Jv) | Picard fixed-point / pseudo-time smoothing |
| Inner solve | PETSc SNES + GMRES | staged Jameson RK pseudo-time |
| Main controls | SNES/KSP tolerances, GMRES restart | pseudo-CFL, pseudo-iterations |
| Maturity | newer, narrow validated scope (Section 1) | established, broadly exercised |
| Failure surface | SNES/KSP convergence reasons | pseudo-CFL rollback / rejection |
See Dual-Time Picard Jameson RK Momentum Solver for the Picard–Jameson solver and Momentum Solver Implementations for selection guidance.