--- title: "Minimal example: Wien, ET-diagnostics scenario grid" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Minimal example: Wien, ET-diagnostics scenario grid} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE, eval = TRUE} knitr::opts_chunk$set(collapse = TRUE, comment = "#>") is_ghactions <- tolower(Sys.getenv("GITHUB_ACTIONS")) == "true" || tolower(Sys.getenv("CI")) %in% c("true", "1", "yes") path_base <- system.file("extdata/models/wien/base.h5", package = "kwb.raindrop") data_available <- nzchar(path_base) && file.exists(path_base) is_windows <- Sys.info()[["sysname"]] == "Windows" ``` This vignette is a small, self-contained reduction of the full Wien workflow used for end-to-end checks (CI and review) and — since 2026-05-07 — for diagnosing the very high modelled ET share (~47-58 %) versus the SWIMM Urban Eva reference of ~7 % for a comparable parametrisation. Geometry is fixed at Daniel's reference design (`mulde_area = 75`, `mulde_height = 200`, `filter_height = 300`, `storage_height = 500`, `filter_kf = 36`, `bottom_kf = 12`). **`base.h5` defaults are used as-is on every row** (`Dach/Evapotranspiration_aktiv = 1`, `EvapPond = 1`, `LAI = 8.5`). Daniel's three review corrections (`Dach/Evapotranspiration_aktiv = 0`, `EvapPond = 0`, `LAI = 3.9`) were applied between PR #11 and the current branch but made the Tandler engine return Status 1 for **every** scenario, leaving the rendered datatable with no usable result columns. They are reverted on this branch so the engine produces output again. The corrections will be re-introduced one-by-one as sweep dimensions in a follow-up diagnostic vignette so the failing combination can be isolated. The 12-scenario grid varies three engine-side switches that are candidate ET-discrepancy drivers: * `//Berechnungsparameter/keineVerdunstungBeiRegen` — `0` (default, ET active during rain — physically suspect) vs `1` (ET suppressed while raining). * `//Berechnungsparameter/Hoernschemeyer_aktiv` — `0` (default, generic ET model) vs `1` (Hoernschemeyer et al. 2023 scheme with seasonal Growth/Shading modulation; with the current flat `Growth_1 = Shading_1 = 1` curves the legacy path effectively unmodulates ET). * `//Massnahmenelemente/Mulde_Rigole/Parameter_Evapotranspiration/ET0ref_GrasReferenzverdunstung` — default `0.01` looks like a scaling factor, not a reference ET₀ value. Sweep `0`, `1`, `100` to check whether the parameter acts as a multiplier on the ET₀ curve (in which case `1` is "normal" behaviour and the `0.01` default is what is currently throttling / unbalancing ET). After the run, the per-scenario `*.h5` inputs are dumped to a single XLSX file (`raindrop_wien_minimal_params.xlsx`) with one sheet per scenario and a `base` sheet for the un-modified template, so the exact engine input for any row can be diffed against the others. ### Input data Precipitation (10-minute resolution) and evapotranspiration (daily ET0) for 2011-2025 are provided by [GeoSphere Austria](https://www.geosphere.at/) (Österreichischer Wetterdienst, formerly ZAMG) and ship with the package under `inst/extdata/models/wien/`. The HDF5 model template (`base.h5`) is produced with the Tandler "Regenwasserbewirtschaftung" calculation engine; the engine itself is downloaded from the `KWB-R/kwb.raindrop.binaries` GitHub Release on demand via `kwb.raindrop::download_engine()`. ### Static `base.h5` parameter overview All values present in the shipped `base.h5` template. The scenarios below override geometry, three ET-related engine switches, plus the three corrections listed above (roof ET off, pond ET off, LAI = 3.9). Everything else listed here is the static default that drives the simulation. ```{r show_static_params, eval = data_available} library(kwb.raindrop) h5 <- hdf5r::H5File$new(path_base, mode = "r") all_vals <- kwb.raindrop::h5_read_values(h5) h5$close_all() format_value <- function(v) { if (is.data.frame(v)) { sprintf("[time series: %d rows, columns: %s]", nrow(v), paste(names(v), collapse = ", ")) } else if (length(v) > 1L) { paste(format(v, trim = TRUE), collapse = ", ") } else { format(v, trim = TRUE) } } base_params <- tibble::tibble( parameter = names(all_vals), value = vapply(all_vals, format_value, character(1L)) ) |> dplyr::arrange(parameter) DT::datatable( base_params, filter = "top", options = list(pageLength = 25, autoWidth = TRUE), caption = sprintf("All %d parameters in base.h5 (Wien)", nrow(base_params)) ) ``` ### Twelve-scenario parameter grid Daniel's reference geometry at every row; the cross product of `keineVerdunstungBeiRegen × Hoernschemeyer_aktiv × ET0ref` gives 2 × 2 × 3 = 12 scenarios. ```{r preparation, eval = data_available} path_list <- list( modelname = "Wien", root_path = file.path(tempdir(), "raindrop_wien_minimal"), dir_input = "/models//input", dir_output = "/models//output", dir_target_output = "/", file_errors_hdf5 = "Fehlerprotokoll.h5", file_results_hdf5_element = "Mulde_Rigole.h5", file_results_hdf5_flaeche = "Dach.h5", file_results_hdf5_verschaltungen = "_Verschaltungen.h5", file_results_txt = "Mulde_Rigole_RAINDROP.txt", file_results_txt_multilayer = "Mulde_Rigole_RAINDROP_multi_layer.txt", file_target = ".h5", path_base = system.file("extdata/models/wien/base.h5", package = "kwb.raindrop"), path_exe = if (is_windows) kwb.raindrop::download_engine() else NA_character_, path_et = system.file("extdata/models/wien/et.csv", package = "kwb.raindrop"), path_rain = system.file("extdata/models/wien/rain.csv.gz", package = "kwb.raindrop"), path_errors_hdf5 = "/", path_results_hdf5_element = "/", path_results_hdf5_flaeche = "/", path_results_hdf5_verschaltungen = "/", path_results_txt = "/", path_results_txt_multilayer = "/", path_target_input = "/" ) param_grid <- expand.grid( keineVerdunstungBeiRegen = c(0L, 1L), Hoernschemeyer_aktiv = c(0L, 1L), ET0ref_factor = c(0, 1, 100), KEEP.OUT.ATTRS = FALSE, stringsAsFactors = FALSE ) param_grid <- tibble::as_tibble(param_grid) |> dplyr::mutate( scenario_name = sprintf("s%05d", seq_len(dplyr::n())), connected_area = 1000, mulde_area = 75, mulde_height = 200, filter_hydraulicconductivity = 36, filter_height = 300, storage_height = 500, bottom_hydraulicconductivity = 12, rain_factor = 1 ) |> dplyr::relocate(scenario_name, .before = dplyr::everything()) DT::datatable( param_grid, filter = "top", options = list(pageLength = 12, autoWidth = TRUE), caption = "Twelve scenarios — fixed Daniel-reference geometry, sweep of three ET-related engine switches." ) psi_s_mm <- function(kf_mmh) (3.237 * (kf_mmh / 25.4)^(-0.328)) * 25.4 paths <- kwb.utils::resolve(path_list) timeseries_et <- readr::read_delim(paths$path_et, delim = ";", col_types = "cd") |> dplyr::mutate( date = lubridate::dmy(date), time = as.integer(difftime(date, min(date), units = "hours")) ) |> dplyr::select(-date) |> dplyr::filter(!is.na(value)) |> dplyr::relocate(time, .before = value) timeseries_et$time[nrow(timeseries_et)] <- ceiling(timeseries_et$time[nrow(timeseries_et)]) timeseries_rain <- readr::read_csv(paths$path_rain, show_col_types = FALSE) |> dplyr::rename(datetime = time, value = rr) |> dplyr::mutate(time = as.double(difftime(datetime, min(datetime), units = "secs")) / 3600) |> dplyr::filter(!is.na(value)) |> dplyr::select(-datetime, -station) |> dplyr::relocate(time, .before = value) timeseries_rain$time[nrow(timeseries_rain)] <- ceiling(timeseries_rain$time[nrow(timeseries_rain)]) # Align ET / rain end-times to the longer of the two if (max(timeseries_rain$time) > max(timeseries_et$time)) { timeseries_et <- dplyr::bind_rows( timeseries_et, tibble::tibble(time = max(timeseries_rain$time), value = timeseries_et$value[nrow(timeseries_et)]) ) } if (max(timeseries_et$time) > max(timeseries_rain$time)) { timeseries_rain <- dplyr::bind_rows( timeseries_rain, tibble::tibble(time = max(timeseries_et$time), value = timeseries_rain$value[nrow(timeseries_rain)]) ) } # Convert rain from mm to mm/h (per 10-min interval) period <- c(diff(timeseries_rain$time), mean(diff(timeseries_rain$time))) timeseries_rain$value <- timeseries_rain$value / period # Convert ET0 from mm/day to mm/h. The engine reads //Kurven/ET0 as a rate in # mm/h (same convention as //Kurven/Regen on the hour-based time axis), so the # daily ET0 values must be divided by their interval in hours (= 24 for daily # data) -- exactly the symmetric step rain already gets. Without it the kernel # integrates ET0 24x too high, which is the cause of the implausibly large # modelled ET share. period_et <- c(diff(timeseries_et$time), mean(diff(timeseries_et$time))) timeseries_et$value <- timeseries_et$value / period_et ``` ### Run the twelve scenarios ```{r run_model, eval = data_available && is_windows} run_one <- function(i, timestep_hours, debug = FALSE, ...) { param_grid_tmp <- param_grid[i, ] paths <- kwb.utils::resolve(path_list, dir_target = param_grid_tmp$scenario_name) fs::dir_create(paths$dir_input, recurse = TRUE) fs::dir_create(paths$dir_output, recurse = TRUE) fs::dir_create(paths$dir_target_output, recurse = TRUE) fs::file_copy(path = paths$path_base, new_path = paths$path_target_input, overwrite = TRUE) h5 <- hdf5r::H5File$new(paths$path_target_input, mode = "a") new_path <- stringr::str_c(normalizePath(fs::path_abs(paths$dir_target_output)), "\\") vals <- kwb.raindrop::h5_read_values(h5) vals$`//Berechnungsparameter/Ergebnispfad` <- new_path vals$`//Berechnungsparameter/Zeitschritt_Infiltration` <- timestep_hours vals$`//Berechnungsparameter/Zeitschritt_ET` <- timestep_hours vals$`//Berechnungsparameter/Zeitschritt_Verschaltungen` <- timestep_hours vals$`//Berechnungsparameter/R-Plots` <- 0 vals$`//Berechnungsparameter/Ausgabemodus` <- "Optimierung" vals$`//Berechnungsparameter/Evapotranspiration_aktiv` <- 1 # ET-diagnostics sweep dimensions vals$`//Berechnungsparameter/keineVerdunstungBeiRegen` <- param_grid_tmp$keineVerdunstungBeiRegen vals$`//Berechnungsparameter/Hoernschemeyer_aktiv` <- param_grid_tmp$Hoernschemeyer_aktiv vals$`//Massnahmenelemente/Mulde_Rigole/Parameter_Evapotranspiration/ET0ref_GrasReferenzverdunstung` <- param_grid_tmp$ET0ref_factor # NOTE: Daniel's three review corrections (Dach/ET = 0, EvapPond = 0, # LAI = 3.9) were applied here from PR #11 onward but made the Tandler # engine return Status 1 for every scenario, which left every result # column NA in the rendered datatable. Reverted on this branch so the # engine produces output; corrections will be re-introduced one-by-one # as sweep dimensions in a follow-up so the failing combination can be # isolated. vals$`//Massnahmenelemente/Dach/Berechnungsparameter/Evapotranspiration_aktiv` <- 1 vals$`//Massnahmenelemente/Dach/Allgemein/Flaeche` <- param_grid_tmp$connected_area vals$`//Massnahmenelemente/Mulde_Rigole/Berechnungsparameter/Evapotranspiration_aktiv` <- 1 vals$`//Massnahmenelemente/Mulde_Rigole/Allgemein/Regen-Skalierungsfaktor` <- param_grid_tmp$rain_factor vals$`//Massnahmenelemente/Mulde_Rigole/Allgemein/Flaeche` <- param_grid_tmp$mulde_area vals$`//Massnahmenelemente/Mulde_Rigole/Eigenschaften_Oberflaeche/Ueberlaufhoehe` <- param_grid_tmp$mulde_height vals$`//Massnahmenelemente/Mulde_Rigole/Bodenschichtung/Startwerte_theta_ActualSoilMoisture` <- c(0.3, 0) vals$`//Massnahmenelemente/Mulde_Rigole/Bodenschichtung/Schichtdicken` <- c( param_grid_tmp$filter_height, param_grid_tmp$storage_height ) vals$`//Massnahmenelemente/Mulde_Rigole/Allgemein/Endversickerungsrate` <- param_grid_tmp$bottom_hydraulicconductivity vals$`//Bodenarten/Bodenfilter/Ks_HydraulicConductivity` <- param_grid_tmp$filter_hydraulicconductivity vals$`//Bodenarten/Bodenfilter/Psi_Saugspannung_CapillarySuction` <- psi_s_mm(param_grid_tmp$filter_hydraulicconductivity) vals$`//Kurven/ET0` <- timeseries_et vals$`//Kurven/Regen` <- timeseries_rain vals$`//Kurven/Growth_1`$time[2] <- max(timeseries_rain$time) vals$`//Kurven/Shading_1`$time[2] <- max(timeseries_rain$time) # h5_write_values(strict = FALSE) downgrades errors (e.g. a scalar dataset # given a length-2 override) to warnings so the scenario still produces an # input H5 the engine can attempt to run. Combined with the per-scenario # tryCatch below, this means one broken scenario no longer aborts # run_scenarios for everyone else. The diagnostic goes through message() # (not cat()) so it survives the future.apply worker boundary should this # vignette ever switch to parallel = TRUE -- the call below passes # parallel = FALSE today so the choice is forward-looking. tryCatch({ kwb.raindrop::h5_write_values(h5, vals, resize = TRUE, strict = FALSE, scalar_strategy = "first", verbose = FALSE) if (h5$is_valid) h5$close_all() kwb.raindrop::run_model(path_exe = paths$path_exe, path_input = paths$path_target_input, debug = debug) }, error = function(e) { if (h5$is_valid) try(h5$close_all(), silent = TRUE) message(sprintf("[%s] run failed: %s", param_grid_tmp$scenario_name, conditionMessage(e))) }) invisible(NULL) } kwb.raindrop::run_scenarios( indices = seq_len(nrow(param_grid)), run_one_scenario = run_one, timestep_hours = 0.1, debug = FALSE, parallel = FALSE, show_progress = FALSE ) ``` ### Persist per-scenario engine inputs to XLSX The XLSX dump bundles, in one file: * `base` — full flattened list of `base.h5` values (un-modified template). * `timeseries_info` — period and water-balance totals for the rain and ET0 series fed to every scenario (these are identical across runs). * `applied_settings` — long-format diff of every key the package writes on top of `base.h5`, per scenario (`scenario × parameter × base_value × scenario_value`). Use this to see at a glance exactly what RAINDROP is being told to change. * `s00001`, `s00002`, ..., `s00012` — full per-scenario H5 dump (the same view as `base`, but for each scenario's modified input file). ```{r xlsx_dump, eval = data_available && is_windows} format_h5_value <- function(v) { if (is.null(v)) { "" } else if (is.data.frame(v)) { sprintf("[time series: %d rows; cols: %s]", nrow(v), paste(names(v), collapse = ", ")) } else if (length(v) > 1L) { paste(format(v, trim = TRUE), collapse = ", ") } else { format(v, trim = TRUE) } } vals_to_tibble <- function(vals) { tibble::tibble( parameter = names(vals), value = vapply(vals, format_h5_value, character(1L)) ) |> dplyr::arrange(parameter) } vals_equal <- function(a, b) { if (is.null(a) || is.null(b)) return(identical(a, b)) isTRUE(all.equal(a, b)) } # --- Base sheet ----------------------------------------------------------- h5_base <- hdf5r::H5File$new(path_base, mode = "r") base_vals <- kwb.raindrop::h5_read_values(h5_base) h5_base$close_all() sheets <- list(base = vals_to_tibble(base_vals)) # --- Timeseries info (identical across all scenarios) --------------------- # `timeseries_rain$value` and `timeseries_et$value` are both mm/h after the # per-interval scaling earlier; multiply by the original interval (in h) to # recover total mm. rain_period_h <- c(diff(timeseries_rain$time), mean(diff(timeseries_rain$time))) rain_total_mm <- sum(timeseries_rain$value * rain_period_h) et0_period_h <- c(diff(timeseries_et$time), mean(diff(timeseries_et$time))) et0_total_mm <- sum(timeseries_et$value * et0_period_h) sim_period_h <- max(timeseries_rain$time) - min(timeseries_rain$time) sim_period_yr <- sim_period_h / (24 * 365.25) sheets$timeseries_info <- tibble::tibble( series = c("Rain (//Kurven/Regen)", "ET0 (//Kurven/ET0)"), n_rows = c(nrow(timeseries_rain), nrow(timeseries_et)), unit = rep("mm/h (engine input; convert via *period_h to mm)", 2), period_hours = round(c(sim_period_h, max(timeseries_et$time) - min(timeseries_et$time)), 1), period_years = round(rep(sim_period_yr, 2), 2), total_mm = round(c(rain_total_mm, et0_total_mm), 1), mean_per_year = round(c(rain_total_mm, et0_total_mm) / sim_period_yr, 1), min_value = c(min(timeseries_rain$value), min(timeseries_et$value)), max_value = c(max(timeseries_rain$value), max(timeseries_et$value)) ) # --- Applied settings: per-scenario diff vs. base.h5 ---------------------- applied_rows <- list() for (sname in param_grid$scenario_name) { p <- kwb.utils::resolve(path_list, dir_target = sname) h5_s <- hdf5r::H5File$new(p$path_target_input, mode = "r") scenario_vals <- kwb.raindrop::h5_read_values(h5_s) h5_s$close_all() sheets[[sname]] <- vals_to_tibble(scenario_vals) for (k in union(names(base_vals), names(scenario_vals))) { if (!vals_equal(base_vals[[k]], scenario_vals[[k]])) { applied_rows[[length(applied_rows) + 1L]] <- tibble::tibble( scenario = sname, parameter = k, base_value = format_h5_value(base_vals[[k]]), scenario_value = format_h5_value(scenario_vals[[k]]) ) } } } sheets$applied_settings <- if (length(applied_rows) > 0L) { dplyr::bind_rows(applied_rows) |> dplyr::arrange(parameter, scenario) } else { tibble::tibble(scenario = character(), parameter = character(), base_value = character(), scenario_value = character()) } # Re-order: base + timeseries_info + applied_settings + per-scenario sheets. sheets <- c(sheets[c("base", "timeseries_info", "applied_settings")], sheets[param_grid$scenario_name]) xlsx_path <- file.path(tempdir(), "raindrop_wien_minimal_params.xlsx") writexl::write_xlsx(sheets, path = xlsx_path) message("Wrote XLSX (", length(sheets), " sheets: base + timeseries_info + applied_settings + ", nrow(param_grid), " scenarios) to:\n ", xlsx_path) ``` ### Results ```{r analyse_results, eval = data_available && is_windows} simulation_results <- kwb.raindrop::get_simulation_results_optim_parallel( paths = paths, path_list = path_list, simulation_names = param_grid$scenario_name, debug = FALSE ) # get_simulation_results_optim_parallel() now returns NULL only when the # element H5 (Mulde_Rigole.h5) is missing; a missing connected-area H5 # (Dach.h5) yields a partial result with connected_area = NULL. Both cases # are absorbed by add_overflow_events_and_waterbalance(), which emits a # warning and fills the corresponding columns with NA. # canonical_variables provides a fallback column template so the rendered # datatable still exposes element.WB_*_ and connectedarea.WB_*_ columns # even when every scenario fails (e.g. the engine crashes for all 12 # scenarios with a given override combination, leaving simulation_results # as a list of NULLs). simulation_results_optimisation <- kwb.raindrop::add_overflow_events_and_waterbalance( simulation_results = simulation_results, event_separation_hours = 4, canonical_variables = kwb.raindrop::default_canonical_wb_variables() ) results <- param_grid |> dplyr::left_join(simulation_results_optimisation, by = c("scenario_name" = "s_name")) |> dplyr::relocate(scenario_name, .before = connected_area) # Attach construction-cost columns (storage layer assumed infiltration box; # switch to gravel trench via storage_type = "gravel_trench" or a per-row # storage_type column in param_grid). results <- kwb.raindrop::compute_costs(results) DT::datatable( results, filter = "top", options = list(pageLength = 12, autoWidth = TRUE), caption = "Twelve-scenario simulation results — water balance + overflow events + construction costs (EUR). Sort by the ET share column to see which combination of `keineVerdunstungBeiRegen` / `Hoernschemeyer_aktiv` / `ET0ref_factor` brings the modelled ET share closest to the SWIMM-Urban-Eva reference of ~7%. Scenarios whose water-balance step errored (see chunk log) are present in the table but have NA in the result columns." ) ``` The `cost_*` columns enable cost-aware optimisation: filter or sort the table by `cost_total` to find the cheapest scenario that still meets a hydraulic target (low overflow count, high evapotranspiration share, ...). Switching the storage layer between **infiltration box** (Austrian "Sickerbox", ≈ 95 % porosity, 350 EUR/m³) and **gravel trench** (Austrian "Schotterrigol", ≈ 30 % porosity, 50 EUR/m³) trades storage volume against cost; pass `storage_type = "gravel_trench"` to `compute_costs()` (or add a `storage_type` column to `param_grid`) to explore that trade-off.