PICurv 0.1.0
A Parallel Particle-In-Cell Solver for Curvilinear LES
picurv sweep orchestrates parameter studies with generated run variants, scheduler arrays, and aggregate metrics.
A sweep/study commonly uses:
- a study.yml defining parameter combinations and metrics

Starter templates are available under examples/*/*study*.yml and examples/master_template/.
Optional generation-only mode: pass --no-submit to stage all study artifacts without submitting any jobs.
Delayed submit from existing staged study artifacts: run picurv submit --study-dir ... against the study directory.
There is no dedicated --dry-run flag on sweep; use --no-submit for non-submitting artifact generation.
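Put together, the two deferred modes above might look like this (a sketch: --no-submit and picurv submit --study-dir are documented on this page, while the study-file argument shape is an assumption):

```shell
# Stage all study artifacts without submitting anything.
picurv sweep my_study.yml --no-submit

# Later, submit the staged artifacts from the study directory.
picurv submit --study-dir studies/<study_id>
```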
A study definition usually specifies:
- base_configs: case, solver, monitor, and post paths (all required)
- study_type: one of grid_independence, timestep_independence, or sensitivity
- parameters: <target>.<yaml.path> -> non-empty list of values, where <target> must be one of case, solver, monitor, or post
- parameter_sets: like parameters, but for coupled overrides that should move together; each entry maps <target>.<yaml.path> -> scalar bundles. At least one of parameters or parameter_sets is required.
- metrics (optional)
- plotting (optional): enabled, output_format
- execution (optional): max_concurrent_array_tasks for Slurm array throttling

Each combination yields a generated run with a fully materialized config set.
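Assembled from the field names above, a hedged study.yml sketch might look like this (the exact nesting, file paths, and values are assumptions, not the authoritative schema):

```yaml
study_type: sensitivity
base_configs:                      # all four paths are required
  case: configs/case.yml
  solver: configs/solver.yml
  monitor: configs/monitor.yml
  post: configs/post.yml
parameters:                        # <target>.<yaml.path> -> list of values
  case.models.physics.particles.count: [10000, 50000, 100000]
  case.run_control.dt_physical: [1.0e-4, 5.0e-5]
plotting:
  enabled: true
  output_format: png               # hypothetical value
execution:
  max_concurrent_array_tasks: 8    # throttle the Slurm array
```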
Parameter keys can target nested case/solver/monitor/post values such as:
- case.models.physics.particles.count
- case.run_control.dt_physical

Not every study should use the default msd_final metric shorthand. Cases that write other scalar diagnostics, such as logs/interpolation_error.csv, should define explicit CSV metrics instead. Search and migration characterization studies can aggregate logs/search_metrics.csv columns such as search_failure_fraction, search_work_index, re_search_fraction, or normalized run-level signals derived from lost_cumulative.
CSV metric specs also support reduction: p95, per-row ratios via numerator_column plus denominator_column, and scalar normalization through normalize_by_parameter for observables such as run loss fraction.
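For example, a run-loss-fraction metric built from logs/search_metrics.csv might be specified as follows (a sketch: the field names reduction, numerator_column, denominator_column, and normalize_by_parameter are documented above, but the layout, metric name, and denominator column are assumptions):

```yaml
metrics:
  - name: run_loss_fraction_p95            # hypothetical metric name
    file: logs/search_metrics.csv
    numerator_column: lost_cumulative
    denominator_column: particles_total    # hypothetical column
    reduction: p95
    normalize_by_parameter: case.models.physics.particles.count
```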
Expected study outputs include:
- studies/<study_id>/cases/case_####/ per-combination run directories
- studies/<study_id>/scheduler/case_index.tsv
- studies/<study_id>/scheduler/solver_array.sbatch
- studies/<study_id>/scheduler/post_array.sbatch
- studies/<study_id>/scheduler/metrics_aggregate.sbatch
- studies/<study_id>/scheduler/solver_<array_jobid>_<taskid>.out/.err (after submission)
- studies/<study_id>/scheduler/post_<array_jobid>_<taskid>.out/.err (after submission)
- studies/<study_id>/scheduler/submission.json (when jobs are submitted)
- studies/<study_id>/results/metrics_table.csv
- studies/<study_id>/results/plots/* (when plotting is enabled and matplotlib is available)
- studies/<study_id>/study_manifest.json

This keeps raw run data and comparative study diagnostics in one reproducible structure.
Metrics aggregation runs automatically as a Slurm job chained after the post-processing array (afterany dependency). If the automatic metrics job fails (e.g. Python unavailable on compute nodes), rerun aggregation manually with --reaggregate.
Recommended workflow:
1. Generate and inspect study artifacts without submitting (--no-submit).
2. Submit, either directly with picurv sweep ... or later with picurv submit --study-dir ....
3. Re-collect metrics if needed (--reaggregate).

picurv sweep is the scheduler-backed study path. For local parameter studies, repeat picurv run manually across a small set of edited case variants and compare the resulting run directories.
For fragile metrics, add smoke tests or fixture-based validation before large queue submissions.
Implementation details worth knowing:
- Combinations are expanded from parameters.* lists, or from explicit paired bundles in parameter_sets when coupled overrides must move together.
- Each generated case is a fully materialized config set compatible with picurv run.
- The Slurm dependency chain is solver array → post array (afterok) → metrics job (afterany).
- scheduler/submission.json is the study-directory contract consumed by picurv submit --study-dir ....
- Generated runs live under studies/<study_id>/cases/....
- solver_array.sbatch exports walltime metadata for the runtime walltime guard, while post_array.sbatch remains a plain post-processing launcher.
- post_array.sbatch is rendered with nodes=1, ntasks_per_node=1, and a single-rank launcher command even if the solver array uses more tasks or the cluster launcher args include -n/-np.

If any solver case is killed (e.g. by the walltime guard or Slurm time limit), the entire post array is cancelled (afterok dependency). Use --continue to resume the study.
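A resume invocation might then look like this (a sketch: --continue is documented on this page, but the way the study directory is passed is an assumption):

```shell
picurv sweep --study-dir studies/<study_id> --continue
```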
To override cluster resources (e.g. increase walltime):
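One way to perform such an override is to edit the staged batch script before delayed submission (a sketch using a standard Slurm directive, not a documented picurv option; it assumes the generated script carries a "#SBATCH --time=" line):

```shell
# Raise the solver array walltime in the staged script, then submit.
sed -i 's/^#SBATCH --time=.*/#SBATCH --time=48:00:00/' \
    studies/<study_id>/scheduler/solver_array.sbatch
picurv submit --study-dir studies/<study_id>
```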
What --continue does:
- Reads study.yml and case_index.tsv from the study directory.
- Updates each case's case.yml (start_step, total_steps), sets particle restart_mode to load when a checkpoint exists, and delegates to resolve_restart_source for the full restart scenario matrix.

Repeated continuation is safe: the target step count is always computed from the original study.yml, not from the (potentially modified) per-case case.yml.
If the automatic metrics Slurm job fails or you want to re-collect metrics after manual intervention:
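A manual re-aggregation might look like this (a sketch: --reaggregate is documented on this page, but how the target study is named here is an assumption):

```shell
picurv sweep --study-dir studies/<study_id> --reaggregate
```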
This reads all case outputs, writes results/metrics_table.csv, and generates plots (if enabled in study.yml).
This page is the sweep and study guide within the PICurv workflow. For CFD users, the most reliable reading strategy is to map the page content to a concrete run decision: what is configured, which runtime stage it influences, and which diagnostics should confirm expected behavior.
Treat this page as both a conceptual reference and a runbook. If you are debugging, pair the method/procedure described here with monitor output, generated runtime artifacts under runs/<run_id>/config, and the associated solver/post logs so numerical intent and implementation behavior stay aligned.