Data Output & Restart Files

GeoDynamo.jl provides a parallel NetCDF-based I/O system using MPI-IO. All ranks write concurrently to a single shared file per timestep via NCDatasets.jl with parallel HDF5. The system handles simulation snapshots, restart files, diagnostics, and boundary condition data.

Quick Start
config = default_config(precision=Float64)
tracker = create_time_tracker(config, 0.0)
write_fields!(fields, tracker, metadata, config, shtns_config, pencils)

Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                    GeoDynamo I/O System                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────┐     ┌─────────────┐     ┌─────────────┐        │
│  │ Simulation  │────▶│ TimeTracker │────▶│  write_     │        │
│  │    Loop     │     │             │     │  fields!    │        │
│  └─────────────┘     └─────────────┘     └───────┬─────┘        │
│                                                  │              │
│         ┌────────────────────────────────────────┼──────┐       │
│         │                                        ▼      │       │
│         │  ┌─────────────┐  ┌─────────────┐  ┌───────┐  │       │
│         │  │ Grid File   │  │ History     │  │Restart│  │       │
│         │  │ (rank 0)    │  │ (shared)    │  │ Files │  │       │
│         │  └─────────────┘  └─────────────┘  └───────┘  │       │
│         │      Parallel NetCDF (MPI-IO) Output Layer     │       │
│         └───────────────────────────────────────────────┘       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Design Principles

| Principle | Description |
|---|---|
| Parallel I/O | All ranks write concurrently to a single shared NetCDF file via MPI-IO |
| Mixed-space output | Spectral data for velocity/magnetic fields, physical data for temperature/composition |
| Memory safe | Each rank holds only its local pencil slice; no gathering |
| Time-based scheduling | Automatic output at specified intervals |

OutputConfig

The OutputConfig struct controls all aspects of data output.

Creating Configurations

# Method 1: From defaults
config = default_config(precision=Float64)

# Method 2: From simulation parameters
config = output_config_from_parameters()

# Method 3: Modify existing config
config = with_output_precision(config, Float32)

Configuration Options

| Field | Type | Default | Description |
|---|---|---|---|
| output_space | OutputSpace | MIXED_FIELDS | Data representation mode |
| output_dir | String | "./output" | Output directory |
| filename_prefix | String | "geodynamo" | File prefix |
| include_metadata | Bool | true | Include simulation metadata |
| include_grid | Bool | true | Include coordinate arrays |
| include_diagnostics | Bool | true | Include diagnostic scalars |
| output_precision | DataType | Float64 | Precision for field data |
| spectral_lmax_output | Int | -1 | Max ℓ to output (-1 = all) |
| overwrite_files | Bool | true | Overwrite existing files |
| output_interval | Float64 | 0.1 | Time between snapshots |
| restart_interval | Float64 | 1.0 | Time between restarts |
| max_output_time | Float64 | Inf | Stop output after this time |
| time_tolerance | Float64 | 1e-10 | Tolerance for time comparisons |
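The interval settings above directly determine how many files a run produces. A back-of-envelope sketch (not part of the GeoDynamo.jl API; grid-independent arithmetic only):

```julia
# Estimate how many snapshot and restart writes a run of length t_end
# will trigger, using the default intervals from the table above.
output_interval  = 0.1
restart_interval = 1.0
t_end            = 10.0

# round (not floor) absorbs floating-point error in the division
n_snapshots = round(Int, t_end / output_interval)
n_restarts  = round(Int, t_end / restart_interval)

println("≈ $n_snapshots history files, $n_restarts restart files")
```

This kind of estimate is useful before long runs: history files dominate disk usage when output_interval is much shorter than restart_interval.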

Output Space Modes

@enum OutputSpace begin
    MIXED_FIELDS      # Spectral for velocity/magnetic, physical for T/C
    PHYSICAL_ONLY     # All fields in physical (θ, φ, r) space
    SPECTRAL_ONLY     # All fields in spectral (l, m, r) space
end

File Layout

Naming Convention

All ranks write to a single shared file per timestep:

{prefix}_{geometry}_{type}_{N}.nc

Examples:

geodynamo_shell_hist_1.nc       # History snapshot 1 (all ranks)
geodynamo_shell_hist_42.nc      # History snapshot 42 (all ranks)
geodynamo_shell_restart_3.nc    # Restart file 3 (all ranks)
geodynamo_shell_grid.nc         # Grid file (rank 0 only)
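The convention above can be expressed as a small helper. This is a hypothetical illustration, not a function exported by the package (which constructs these names internally):

```julia
# Hypothetical helper mirroring the {prefix}_{geometry}_{type}_{N}.nc
# naming convention documented above.
function output_filename(prefix::AbstractString, geometry::AbstractString,
                         kind::AbstractString, n::Integer)
    return "$(prefix)_$(geometry)_$(kind)_$(n).nc"
end

output_filename("geodynamo", "shell", "hist", 42)   # "geodynamo_shell_hist_42.nc"
```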

NetCDF Structure

NetCDF File Structure
├── Dimensions
│   ├── theta         (nlat)
│   ├── phi           (nlon)
│   ├── r             (nr)
│   ├── spectral_mode (nlm)
│   ├── time          (1)
│   ├── scalar        (1)
│   └── meta          (1)
│
├── Coordinate Variables
│   ├── theta[theta]        : Colatitude (radians, 0 to π)
│   ├── phi[phi]            : Longitude (radians, 0 to 2π)
│   ├── r[r]                : Radial coordinate (dimensionless)
│   ├── l_values[spectral_mode] : Spherical harmonic degree
│   ├── m_values[spectral_mode] : Spherical harmonic order
│   ├── time[time]          : Simulation time
│   └── step[time]          : Timestep number
│
├── Physical Fields (MIXED_FIELDS or PHYSICAL_ONLY)
│   ├── temperature[theta,phi,r]
│   └── composition[theta,phi,r]
│
├── Spectral Fields (MIXED_FIELDS or SPECTRAL_ONLY)
│   ├── velocity_toroidal_real[spectral_mode,r]
│   ├── velocity_toroidal_imag[spectral_mode,r]
│   ├── velocity_poloidal_real[spectral_mode,r]
│   ├── velocity_poloidal_imag[spectral_mode,r]
│   ├── magnetic_toroidal_real[spectral_mode,r]
│   ├── magnetic_toroidal_imag[spectral_mode,r]
│   ├── magnetic_poloidal_real[spectral_mode,r]
│   ├── magnetic_poloidal_imag[spectral_mode,r]
│   └── (optional) temperature_spectral_*, composition_spectral_*
│
├── Diagnostics
│   ├── diag_temp_mean, diag_temp_std, diag_temp_min, diag_temp_max
│   ├── diag_velocity_toroidal_energy, diag_velocity_toroidal_rms
│   ├── diag_magnetic_toroidal_energy, diag_magnetic_poloidal_energy
│   └── diag_*_peak_l, diag_*_spectral_centroid
│
└── Global Attributes
    ├── title, source, history, Conventions
    ├── mpi_total_ranks
    ├── geometry (shell/ball)
    └── simulation parameters...

Grid File

The grid file ({prefix}_{geometry}_grid.nc) is written once by rank 0 and contains:

  • Full coordinate arrays at highest precision
  • Gauss-Legendre quadrature weights
  • SHTnsKit configuration metadata
  • Grid type descriptors
Note

This allows post-processing tools to reconstruct the full grid without duplicating coordinates in every history file.
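The quadrature weights stored in the grid file can also be recomputed directly. The sketch below uses the standard Newton iteration on the Legendre polynomial P_n; it illustrates what the stored data represents, though post-processing tools should normally read the weights from the grid file rather than recompute them:

```julia
# Gauss-Legendre nodes x and weights w on [-1, 1] via Newton iteration.
function gauss_legendre(n::Int)
    x = zeros(n); w = zeros(n)
    for i in 1:n
        # Chebyshev-like initial guess for the i-th root of P_n
        z = cos(pi * (i - 0.25) / (n + 0.5))
        dp = 0.0
        for _ in 1:100
            p0, p1 = 1.0, 0.0
            for j in 1:n                 # three-term recurrence for P_j(z)
                p0, p1 = ((2j - 1) * z * p0 - (j - 1) * p1) / j, p0
            end
            dp = n * (z * p0 - p1) / (z^2 - 1)   # P_n'(z)
            dz = p0 / dp
            z -= dz
            abs(dz) < 1e-14 && break
        end
        x[i] = z
        w[i] = 2 / ((1 - z^2) * dp^2)
    end
    return x, w
end

x, w = gauss_legendre(16)
sum(w)   # ≈ 2.0: the weights integrate the constant 1 exactly over [-1, 1]
```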


Time Tracking

The TimeTracker manages output scheduling:

# Create tracker starting at t=0
tracker = create_time_tracker(config, 0.0)

# In simulation loop
if should_output_now(tracker, current_time, config)
    write_fields!(fields, tracker, metadata, config)
end

# Query next output time for adaptive timestep
dt_to_output = time_to_next_output(tracker, current_time, config)

TimeTracker Fields

| Field | Type | Description |
|---|---|---|
| last_output_time | Float64 | Time of last snapshot |
| last_restart_time | Float64 | Time of last restart |
| output_count | Int | Total snapshots written |
| restart_count | Int | Total restarts written |
| next_output_time | Float64 | Scheduled next snapshot |
| next_restart_time | Float64 | Scheduled next restart |
| grid_file_written | Bool | Whether grid file exists |

Adaptive Timestep Integration

# Adjust timestep to hit exact output times
time_to_next = time_to_next_output(tracker, t, config)
if 0 < time_to_next < dt
    dt = time_to_next  # Shorten step to hit output exactly
end
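The role of time_tolerance can be seen in a stripped-down model of the scheduler. This is assumed semantics for illustration, not the package's actual implementation: the tolerance absorbs the floating-point drift that accumulates in `t += dt` loops, so outputs are not skipped when `t` lands a few ulps short of the scheduled time.

```julia
# Minimal model of tolerance-based output scheduling (illustrative only).
mutable struct Tracker
    next_output_time::Float64
    output_interval::Float64
    time_tolerance::Float64
end

# An output is due once t reaches the scheduled time, up to the tolerance.
due(tr::Tracker, t) = t >= tr.next_output_time - tr.time_tolerance

mark_written!(tr::Tracker) = (tr.next_output_time += tr.output_interval)

function simulate(nsteps, dt)
    tr = Tracker(0.1, 0.1, 1e-10)
    t = 0.0
    writes = 0
    for _ in 1:nsteps
        t += dt                  # accumulates rounding error
        if due(tr, t)
            writes += 1
            mark_written!(tr)
        end
    end
    return writes
end

simulate(1000, 0.001)   # 10 snapshots over t ∈ (0, 1]
```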

Writing Simulation Data

Main Output Function

function write_fields!(
    fields::Dict{String,Any},
    tracker::TimeTracker,
    metadata::Dict{String,Any},
    config::OutputConfig = output_config_from_parameters(),
    shtns_config::Union{SHTnsKitConfig,Nothing} = nothing,
    pencils::Union{NamedTuple,Nothing} = nothing
) -> Bool

Returns true if output was written.

Field Data Format

fields = Dict(
    # Physical space fields (θ × φ × r arrays)
    "temperature" => Array{Float64,3}(undef, nlat, nlon, nr),
    "composition" => Array{Float64,3}(undef, nlat, nlon, nr),

    # Spectral fields (real/imag pairs)
    "velocity_toroidal" => Dict(
        "real" => Array{Float64,3}(undef, nlm, 1, nr),
        "imag" => Array{Float64,3}(undef, nlm, 1, nr)
    ),
    "velocity_poloidal" => Dict("real" => ..., "imag" => ...),
    "magnetic_toroidal" => Dict("real" => ..., "imag" => ...),
    "magnetic_poloidal" => Dict("real" => ..., "imag" => ...)
)

Metadata Dictionary

metadata = Dict{String,Any}(
    "current_time" => t,
    "current_step" => step,
    "current_dt" => dt,
    "Rayleigh_number" => Ra,
    "Ekman_number" => E,
    "Prandtl_number" => Pr,
    "magnetic_Prandtl" => Pm,
    "geometry" => "shell"  # or "ball"
)

Restart Files

Writing Restarts

Restart files are automatically written at restart_interval using parallel I/O:

write_restart!(fields, tracker, metadata, config, pencils)

Restart files include:

  • All field data (same as history files)
  • Tracker state (last_output_time, output_count, etc.)
  • grid_file_written flag for consistency

Reading Restarts

# From directory (finds most recent restart)
restart_data, metadata = read_restart!(tracker, "output", target_time, config, pencils)

# From specific file
restart_data, metadata = _load_restart_file(filepath, tracker, config; pencils=pencils)

# Access fields (each rank gets its local slice)
temperature = restart_data["temperature"]       # local θ×φ×r slice
velocity_tor = restart_data["velocity_toroidal"] # local lm×r slice

# Access simulation state
t = metadata["current_time"]
step = metadata["current_step"]
Restart Tips
  • Set restart_interval longer than output_interval unless frequent checkpointing is needed
  • All ranks open the restart file collectively; each reads only its local pencil slice
  • To change precision: load restart, apply with_output_precision, then continue

Boundary Condition I/O

Reading Boundary Data

using GeoDynamo.bcs

# Read from NetCDF
bc_data = read_netcdf_boundary_data("boundary_temperature.nc"; precision=Float64)

# Access fields
bc_data.values       # The boundary values
bc_data.theta        # Colatitude coordinates
bc_data.phi          # Longitude coordinates
bc_data.time         # Time values (if time-dependent)

Writing Boundary Data

# Create boundary data structure
bc = create_boundary_data(
    values_array, "temperature";
    theta=theta_grid, phi=phi_grid, time=time_series,
    units="K", description="Surface temperature"
)

# Write to file
write_netcdf_boundary_data("boundary_out.nc", bc)

Validating Boundary Files

# Check file structure
validate_netcdf_boundary_file("boundary.nc")

# Get file information
info = get_netcdf_file_info("boundary.nc")
println("Grid: $(info["nlat"]) × $(info["nlon"])")
println("Time-dependent: $(info["is_time_dependent"])")

Boundary File Format

boundary_temperature.nc
├── Dimensions
│   ├── theta (nlat)
│   ├── phi (nlon)
│   └── time (ntime)  # optional
│
├── Coordinates
│   ├── theta[theta]   : 0 to π
│   ├── phi[phi]       : 0 to 2π
│   └── time[time]     : simulation times
│
└── Field
    └── temperature[theta,phi] or temperature[theta,phi,time]
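A synthetic boundary field matching this layout can be built in a few lines. The grid sizes and the cos(θ) pattern below are illustrative, not values the package requires:

```julia
# Build an nlat × nlon boundary field with a smooth pole-to-pole pattern.
nlat, nlon = 64, 128
theta = [pi * (i - 0.5) / nlat for i in 1:nlat]   # colatitude midpoints in (0, π)
phi   = [2pi * (j - 1) / nlon for j in 1:nlon]    # longitude in [0, 2π)

values = [cos(θ) for θ in theta, φ in phi]        # nlat × nlon array

size(values)   # (64, 128)
```

Arrays like these are what `create_boundary_data` and `write_netcdf_boundary_data` from the earlier examples would consume.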

Diagnostics

Automatic Diagnostics

Each output file includes computed diagnostics:

Scalar Field Statistics:

| Diagnostic | Description |
|---|---|
| diag_temp_mean | Volume-averaged temperature |
| diag_temp_std | Temperature standard deviation |
| diag_temp_min, diag_temp_max | Temperature extrema |
| diag_temp_radial_variation | Radial temperature variation |

Spectral Field Diagnostics:

| Diagnostic | Description |
|---|---|
| diag_{field}_energy | Total spectral energy |
| diag_{field}_rms | RMS amplitude |
| diag_{field}_max | Maximum coefficient magnitude |
| diag_{field}_peak_l | Degree with maximum energy |
| diag_{field}_spectral_centroid | Energy-weighted mean degree |
| diag_{field}_low_mode_fraction | Energy fraction in ℓ ≤ 10 |
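The spectral diagnostics above follow from the straightforward definitions. A sketch of computing them from an (nlm, nr) coefficient block and the l_values array (the package's exact normalization may differ):

```julia
# Per-field spectral diagnostics from real/imag coefficient matrices.
function spectral_diagnostics(re::Matrix{Float64}, im::Matrix{Float64},
                              l_values::Vector{Int})
    e_mode = vec(sum(re .^ 2 .+ im .^ 2, dims=2))   # energy per (l, m) mode
    energy = sum(e_mode)
    lmax   = maximum(l_values)
    e_l    = zeros(lmax + 1)                        # energy binned by degree l
    for (k, l) in enumerate(l_values)
        e_l[l + 1] += e_mode[k]
    end
    peak_l   = argmax(e_l) - 1                      # degree with max energy
    centroid = sum((0:lmax) .* e_l) / energy        # energy-weighted mean l
    low_frac = sum(e_l[1:min(11, end)]) / energy    # fraction in l ≤ 10
    return (; energy, peak_l, centroid, low_frac)
end

# Toy input: modes l = 0, 1, 2 at two radial levels, all energy in l = 1
re = [0.0 0.0; 1.0 1.0; 0.0 0.0]
im = zeros(3, 2)
d  = spectral_diagnostics(re, im, [0, 1, 2])
(d.peak_l, d.centroid)   # (1, 1.0)
```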

Custom Diagnostics

function compute_diagnostics(fields::Dict{String,Any}, field_info::FieldInfo)
    diagnostics = Dict{String, Float64}()

    if haskey(fields, "temperature")
        T = fields["temperature"]
        diagnostics["temp_nusselt"] = compute_nusselt(T, field_info)
    end

    return diagnostics
end

MPI Parallel I/O

How It Works

All MPI ranks collectively open a single NetCDF file using parallel HDF5 (MPI-IO). Each rank writes only its local pencil slice at the correct global offset. No data gathering is required.

# File is opened collectively by all ranks
ds = NCDataset(filename, "c"; comm=comm, info=MPI.Info())

# Each rank writes its portion using pencil ranges
θ_range = range_local(pencils.r, 1)   # local theta indices in global array
φ_range = range_local(pencils.r, 2)   # local phi indices in global array
ds["temperature"][θ_range, φ_range, :] = local_temperature

lm_range = range_local(pencils.spec, 1) # local spectral mode indices
r_range = range_local(pencils.spec, 3)  # local radial indices
ds["velocity_toroidal_real"][lm_range, r_range] = local_vel_tor_real

Pencil-to-Global Index Mapping

| Field Type | Pencil | Local Dims | Write Pattern |
|---|---|---|---|
| Temperature/Composition | pencils.r | r local, (θ,φ) distributed | ds["temperature"][θ_range, φ_range, :] |
| Spectral fields | pencils.spec | (lm, r) distributed | ds["vel_tor_real"][lm_range, r_range] |

Runtime Check

At initialization, the system verifies that parallel HDF5 is available:

check_parallel_netcdf_support(comm)

This will error immediately if parallel NetCDF is not supported, with instructions on how to install a parallel-enabled HDF5.

Verification

# Verify the shared output file exists and has correct dimensions
success, missing, info = verify_all_ranks_wrote(
    "output", hist_number;
    geometry="shell",
    expected_dims=Dict("theta" => 64, "phi" => 128, "r" => 20)
)

Post-Processing Tools

Spectral to Physical Conversion

julia --project extras/spectral_to_physical.jl \
    --input output \
    --output physical \
    --precision Float64

File Analysis Utilities

# Get available time series from output directory
times = get_time_series("output")

# Get file information
info = get_file_info("geodynamo_shell_hist_1.nc")
println("Time: $(info["time"]), Step: $(info["step"])")
println("Dimensions: $(info["dimensions"])")
println("Variables: $(info["variables"])")

Cleanup Old Files

# Keep only last 10 time snapshots
cleanup_old_files("output", 10)

Reading Output Files

Since all data is in a single file per timestep, reading is straightforward:

using NCDatasets

NCDataset("output/geodynamo_shell_hist_5.nc", "r") do ds
    temperature = Array(ds["temperature"][:, :, :])   # Full global array
    vel_tor_real = Array(ds["velocity_toroidal_real"][:, :])  # (nlm, nr)
    println("Dimensions: theta=$(ds.dim["theta"]), phi=$(ds.dim["phi"]), r=$(ds.dim["r"])")
end

Performance

Precision vs Size

| Precision | Bytes/Element | Relative Size |
|---|---|---|
| Float64 | 8 | 100% |
| Float32 | 4 | 50% |
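A rough per-snapshot estimate for one 3-D physical field follows directly from the table (illustrative grid sizes; coordinates, spectral fields, and metadata add overhead on top):

```julia
# Bytes for one field of element type T on an nlat × nlon × nr grid.
field_bytes(T, nlat, nlon, nr) = sizeof(T) * nlat * nlon * nr

nlat, nlon, nr = 64, 128, 20
mb64 = field_bytes(Float64, nlat, nlon, nr) / 1024^2
mb32 = field_bytes(Float32, nlat, nlon, nr) / 1024^2
(mb64, mb32)   # (1.25, 0.625) MiB per 3-D field
```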
Recommendation

Use Float32 for history files, Float64 for restart files:

history_config = with_output_precision(default_config(), Float32)
restart_config = with_output_precision(default_config(), Float64)

Memory-Efficient Output

For large arrays, the system automatically uses optimized copying:

if length(data) > 10000
    data_out = similar(data, output_precision)
    copyto!(data_out, data)  # Efficient in-place conversion
else
    data_out = output_precision.(data)  # Direct broadcast
end
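The logic above can be packaged as a runnable function. This is a sketch following the snippet's structure and 10 000-element threshold, not necessarily the package source:

```julia
# Convert an array to the requested output precision, avoiding a broadcast
# temporary for large arrays.
function convert_precision(data::AbstractArray, T::DataType)
    if length(data) > 10_000
        out = similar(data, T)   # allocate once at the target precision
        copyto!(out, data)       # element-wise converting copy
        return out
    else
        return T.(data)          # small arrays: direct broadcast
    end
end

a = rand(Float64, 4, 4)
eltype(convert_precision(a, Float32))   # Float32
```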

Troubleshooting

Files Not Created

# Check directory exists and is writable
mkpath(config.output_dir)
@assert isdir(config.output_dir) && iswritable(config.output_dir)

# Verify parallel HDF5 is available
check_parallel_netcdf_support(MPI.COMM_WORLD)

Output File Verification

success, missing, info = verify_all_ranks_wrote("output", 1;
    expected_dims=Dict("theta" => nlat, "phi" => nlon, "r" => nr))
if !success
    @warn "Output verification failed: $missing"
end

Dimension Mismatches

validate_output_compatibility(field_info, shtns_config)

NaN in Output

if any(isnan, temperature)
    @warn "NaN detected in temperature field!"
end

Debug Mode

# Enable verbose output
ENV["GEODYNAMO_IO_DEBUG"] = "true"

# Check each rank's status
rank = MPI.Comm_rank(MPI.COMM_WORLD)
println("Rank $rank: Writing to $(config.output_dir)")

Complete Example

using GeoDynamo
using MPI

MPI.Init()

# Configuration
config = OutputConfig(
    MIXED_FIELDS,       # output space mode
    "./output",         # output directory
    "dynamo",           # filename prefix
    true, true, true,   # metadata, grid, diagnostics
    Float32,            # output precision
    -1,                 # all spectral modes
    true,               # overwrite files
    0.05, 0.5,          # output, restart intervals
    10.0, 1e-12         # max time, tolerance
)

# Verify parallel HDF5 support (call once at startup)
check_parallel_netcdf_support(MPI.COMM_WORLD)

# Initialize tracker
tracker = create_time_tracker(config, 0.0)
t, dt, step = 0.0, 0.001, 0

# Simulation loop
while t < 2.0
    t += dt
    step += 1

    # ... physics update ...

    # Prepare output data
    fields = Dict(
        "temperature" => T_physical,
        "velocity_toroidal" => Dict("real" => vT_real, "imag" => vT_imag),
        "velocity_poloidal" => Dict("real" => vP_real, "imag" => vP_imag),
        "magnetic_toroidal" => Dict("real" => bT_real, "imag" => bT_imag),
        "magnetic_poloidal" => Dict("real" => bP_real, "imag" => bP_imag)
    )

    metadata = Dict{String,Any}(
        "current_time" => t,
        "current_step" => step,
        "current_dt" => dt
    )

    # Time-based output
    if write_fields!(fields, tracker, metadata, config, shtns_config, pencils)
        rank = MPI.Comm_rank(MPI.COMM_WORLD)
        rank == 0 && println("Output at t=$t")
    end

    # Adaptive timestep for exact output times
    dt_next = time_to_next_output(tracker, t, config)
    if 0 < dt_next < dt
        dt = dt_next
    end
end

MPI.Finalize()

Next Steps

| Goal | Resource |
|---|---|
| Configure simulation parameters | Configuration & Parameters |
| Understand time integration | Time Integration |
| Set up boundary conditions | Boundary Topography |
| Contribute to development | Developer Guide |