# PICurv 0.1.0

A Parallel Particle-In-Cell Solver for Curvilinear LES
This page explains how a PICurv run moves from a fresh solve through restart, post-processing reuse, and cluster job generation. It is the operational view of the run-directory lifecycle.

For PICurv, a "run" is not just a solver launch. It is the full set of generated artifacts under `runs/<run_id>/`, including:
- `config/`
- `logs/`
- `scheduler/`

Key rules:

- `picurv run --solve ...` creates a fresh run directory,
- picurv does not mutate an old run directory in place when you start a new solve,
- restarts (`--restart-from`) read from an existing run but still create a new run directory for the restarted continuation,
- continuations (`--continue --run-dir`) resume inside the same run directory.

`run_id` is generated automatically as `<case_basename>_<timestamp>`.
Typical local solve + post:
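The exact flag spellings below are assumptions for illustration (`--case` and `--post` are hypothetical; only `picurv run --solve` and `--run-dir` appear verbatim elsewhere on this page), but the shape of the local sequence is:

```shell
# Solve locally; PICurv creates runs/<case_basename>_<timestamp>/ automatically.
picurv run --solve --case case.yml

# Post-process the finished run by pointing at its run directory.
picurv run --post --run-dir runs/my_case_20240101-120000
```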
Typical cluster solve + post:
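In cluster mode the same commands gain a cluster profile and generate Slurm scripts instead of launching directly. A sketch, again with assumed flag spellings:

```shell
# Generate scheduler/solver.sbatch and submit it via the cluster profile.
picurv run --solve --case case.yml --cluster cluster.yml

# Queue the post stage against the solver's run directory.
picurv run --post --run-dir runs/my_case_20240101-120000 --cluster cluster.yml
```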
Recommended preflight:

1. `picurv validate ...`
2. `picurv run ... --dry-run`
3. `picurv run ... --cluster ... --no-submit`

This sequence verifies contract correctness before consuming runtime or queue time.
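Spelled out as commands (case paths and the `--case`/`--cluster` spellings are placeholders standing in for the elided `...` arguments above):

```shell
picurv validate --case case.yml                                       # contract check only
picurv run --solve --case case.yml --dry-run                          # print launch plan, touch nothing
picurv run --solve --case case.yml --cluster cluster.yml --no-submit  # write sbatch scripts, skip submission
```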
A typical run directory contains:

- `runs/<run_id>/config/`: generated `.control`, BC files, copied YAML inputs, and `post.run`
- `runs/<run_id>/logs/`: solver/postprocessor runtime logs and metrics written by PICurv itself
- `runs/<run_id>/output/`: solver outputs when monitor paths use the default layout
- `runs/<run_id>/scheduler/`: generated Slurm scripts, `submission.json`, and cluster stdout/stderr in cluster mode
- `runs/<run_id>/manifest.json`: top-level run metadata

Practical interpretation:
- when debugging, read `config/` first,
- in cluster mode, inspect `scheduler/solver.sbatch` or `scheduler/post.sbatch`,
- `scheduler/submission.json` is the source of truth for delayed submit and run-directory-based cancel.

PICurv now separates case physics from site execution policy.
Local multi-rank precedence:

`PICURV_MPI_LAUNCHER` -> `MPI_LAUNCHER` -> `.picurv-execution.yml` -> `.picurv-local.yml` -> `mpiexec` (default)

Cluster batch precedence:

`cluster.yml` `execution` -> `.picurv-execution.yml` `cluster_execution` -> `.picurv-execution.yml` `default_execution` -> `srun` (default)

This gives three clean cases:
- `picurv init` creates `.picurv-execution.yml` in each new case with inert defaults,
- edit `.picurv-execution.yml` when needed,
- use `cluster.yml` when a submission needs a batch-specific override.

Restart and continuation use the normal solve workflow. There is no separate restart command; the restart source is specified entirely through CLI flags rather than YAML keys.
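For a one-off launcher change, the environment variable at the top of the local precedence chain avoids editing any file. A sketch, assuming `PICURV_MPI_LAUNCHER` accepts a full launcher prefix and a hypothetical `--case` flag:

```shell
# Highest-precedence override for this invocation only; resolution falls back to
# MPI_LAUNCHER, then the YAML files, then mpiexec when nothing is set.
PICURV_MPI_LAUNCHER="mpirun -np 8" picurv run --solve --case case.yml
```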
Three scenarios are supported:
Use `--restart-from` to create a new run that continues solving from another run's checkpoint data.
Relevant YAML settings:
- `case.yml`: set `start_step` to the checkpoint step (e.g. 500) and `total_steps` to the desired additional count.
- `solver.yml`: set `eulerian_field_source: "solve"` so the solver advances the Eulerian fields from the restart state.

Operational meaning:

- the checkpoint files from `runs/old_run` are copied into the new run's `restart/` directory,
- all new outputs land in a fresh `runs/<new_run_id>/` directory.

Use `--restart-from` when the Eulerian flow is already computed and you only need to track particles through it.
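The first scenario might look like the following sketch (the `--case` spelling and run names are illustrative; `--restart-from` is the flag named above):

```shell
# case.yml:   start_step: 500      # the saved checkpoint step
#             total_steps: 500     # additional steps to advance
# solver.yml: eulerian_field_source: "solve"
picurv run --solve --case case.yml --restart-from runs/old_run
# checkpoint data is copied into runs/<new_run_id>/restart/
```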
Relevant YAML settings:
- `solver.yml`: set `eulerian_field_source: "load"` so the solver reads pre-computed Eulerian fields instead of advancing them.

Operational meaning:

- `restart_dir` is pointed directly at the source run's output (no file copy),
- all new outputs land in a fresh `runs/<new_run_id>/` directory.

Use `--continue --run-dir` to resume a run that was interrupted or stopped early, writing into the same run directory.
Operational meaning:
- the existing `output/` is staged as `restart/` inside `runs/my_run/`,
- the continuation keeps writing into the same `runs/my_run/` directory.

When `start_step > 0`, the initial Eulerian state is always loaded from the restart source regardless of the `eulerian_field_source` setting. The `eulerian_field_source` value only controls what happens on subsequent steps: `"solve"` advances the fields, `"load"` reads pre-computed fields.
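An in-place continuation might look like this sketch (run name illustrative; `--continue --run-dir` is the flag pair named above):

```shell
# Resume the interrupted run inside its own directory; no new run_id is created.
picurv run --solve --continue --run-dir runs/my_run
```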
Before launching any restart or continuation:
- set `start_step` equal to the saved checkpoint step, not the next desired step,
- confirm that the `monitor.yml` directory names match the source run layout.

When solver outputs already exist, reuse the run directory directly:
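A hypothetical post-only invocation against an existing run directory (the `--post` spelling is an assumption; `--run-dir` appears elsewhere on this page):

```shell
# No new run directory is created; PICurv reads the artifacts under runs/<run_id>/config/.
picurv run --post --run-dir runs/my_case_20240101-120000
```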
Use this when:
PICurv will auto-identify the required case/monitor/control artifacts from runs/<run_id>/config/.
Operational patterns for post-only reuse:
- Define `post.yml` as the full analysis window you want, then use `--continue` to skip steps that were already completed for the same recipe. You do not need to keep editing `start_step` during batch catch-up.
- If `post.yml` requests 0..1000 every 10, but solver source files currently exist only through step 420, PICurv launches only 0..420 on the first pass. A later `--continue` run resumes at 430 after those source files appear.
- If `Field_00070.vts` exists but the required MSD CSV still stops at 60, step 70 is treated as incomplete and the next `--continue` run restarts from 70.
- Without `--continue`, PICurv honors the requested window exactly, rewrites any overlapping VTK files for those steps, and rewrites repeated statistics rows so each step still appears once in the final CSV.
- If you point the same `run_dir` at a different `post.yml` recipe, such as adding Qcrit or changing the statistics prefix, PICurv starts from that recipe's configured `start_step` instead of inheriting completion from the previous recipe.
- A second concurrent launch into the same `runs/<run_id>` is refused immediately so two writers cannot race on the same output tree.

In cluster mode, picurv writes scheduler artifacts into the new run directory:
- `scheduler/solver.sbatch`
- `scheduler/solver_<jobid>.out` / `scheduler/solver_<jobid>.err`
- `scheduler/post.sbatch`
- `scheduler/post_<jobid>.out` / `scheduler/post_<jobid>.err`
- `scheduler/submission.json`

Recommended operational pattern:
- use `--dry-run` to confirm launch commands and artifact paths,
- use `--no-submit` to inspect generated batch scripts,
- run `picurv submit --run-dir runs/<run_id>` only after the scripts look correct,
- run `picurv cancel --run-dir runs/<run_id>` when you need to stop a submitted stage without separate job-id bookkeeping.

This is especially useful when changing:
Operational examples:
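A plausible inspect-then-submit sequence, with assumed `--case`/`--cluster` spellings and an illustrative run name:

```shell
# Generate scripts without queueing anything.
picurv run --solve --case case.yml --cluster cluster.yml --no-submit
less runs/my_case_20240101-120000/scheduler/solver.sbatch

# Submit once the script looks right; cancel by run directory, not job id.
picurv submit --run-dir runs/my_case_20240101-120000
picurv cancel --run-dir runs/my_case_20240101-120000
```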
Generated Slurm solver jobs also export runtime walltime metadata into `solver.sbatch`, so the solver can estimate per-step cost from completed steps and request a graceful final write before the remaining walltime becomes too short. If the cluster profile also requests an early signal, PICurv traps `SIGUSR1`, `SIGTERM`, and `SIGINT`, then uses the same safe-checkpoint final-write path. Use `signal: "USR1@300"` for `srun`, or `signal: "B:USR1@300"` plus `exec mpirun ...` for direct mpirun batch launches.
For manual cancellation, plain `picurv cancel` is a hard Slurm cancel. Add `--graceful` when you want the solver to receive `SIGUSR1`, stop at the next safe checkpoint, and write the latest safe off-cadence output first. Fall back to a plain cancel if the job is wedged or not reaching checkpoints.
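The two cancel modes side by side (run name illustrative):

```shell
picurv cancel --run-dir runs/my_run --graceful   # SIGUSR1 -> next safe checkpoint, final write
picurv cancel --run-dir runs/my_run              # hard scancel; use if the job is wedged
```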
Best practices:

- treat `runs/<run_id>/config/` as the ground truth for what the binaries actually consumed,
- keep launcher policy in `.picurv-execution.yml`; keep scheduler policy in `cluster.yml`,
- set `execution.walltime_guard` in the cluster profile rather than editing generated scripts.

This page describes the run lifecycle within the PICurv workflow. For CFD users, the most reliable reading strategy is to map each section to a concrete run decision: what is configured, what runtime stage it influences, and which diagnostics should confirm the expected behavior.

Treat this page as both a conceptual reference and a runbook. When debugging, pair the procedure described here with the monitor output, the generated runtime artifacts under `runs/<run_id>/config`, and the associated solver/post logs so that numerical intent and implementation behavior stay aligned. Check the `config/` and `scheduler/` artifacts before blaming the solver.