Title: | Functions used within FLUSSHYGIENE project (BMBF) |
---|---|
Description: | Easy and transferable functions for creating and managing models for hygiene data in rivers. This package is developed within the FLUSSHYGIENE project. See https://bmbf.nawam-rewam.de/en/projekt/flusshygiene/ for details. |
Authors: | Carsten Vick [aut], Hauke Sonnenberg [cre] , Wolfgang Seis [aut, ths] |
Maintainer: | Hauke Sonnenberg <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.3.0 |
Built: | 2025-01-21 04:11:57 UTC |
Source: | https://github.com/KWB-R/kwb.flusshygiene |
Computes the quality assessment according to european bathing directive 2006/7/EC from E.Coli values. The four possible quality levels are: excellent, good, sufficient and poor
assess_bathing_quality_eu(e.coli, log = TRUE)
assess_bathing_quality_eu(e.coli, log = TRUE)
e.coli |
A numeric vector with e.coli values |
log |
logical. Are the values log-values? |
Returns a single factor with all quality levels
Functions for modelbuilding \n
build_model
: takes the riverdata, handles the other functions and
invokes stan_lm
build_model(riverdata, variables = ask_for_variables(riverdata), with_interaction = TRUE) ask_for_variables(riverdata) process_model_riverdata(riverdata, variables) create_formula(variables, with_interaction = FALSE)
build_model(riverdata, variables = ask_for_variables(riverdata), with_interaction = TRUE) ask_for_variables(riverdata) process_model_riverdata(riverdata, variables) create_formula(variables, with_interaction = FALSE)
riverdata |
a list with riverdata (hygiene + physical data) |
variables |
character. Selected variables for the model |
with_interaction |
logical. Formula with interactions? Default set to TRUE |
Build the model from hygiene data and physical data like flow, rain,
wwtp. Asks for user input to select variables. Computes the data.frame with
data for hygiene and chosen variables and creates a formula of the form:
Q*(K + R)
while multiple Qs will be multiplied, multiple Ks and Rs
will be added.
Returns a model of the riverdata.
Returns a character-vector with the chosen model variables
Returns a data.frame with data for hygiene and chosen variables
Returns parsed model-formula. (Like model$formula)
ask_for_variables
: Internal function. Quite time consuming
process_model_riverdata
: Internal usage
create_formula
: Internal usage
## Not run: variables <- c("e.coli","q_havel",...) lm(formula = eval(create_formula(variables)), data = process_model_riverdata(riverdata, variables)) ## End(Not run) create_formula(c("log_e.coli","q_havel","ka_ruhleben","r_berlin")) create_formula(c("e.coli","r_mitte","r_charlottenburg","r_spandau"))
## Not run: variables <- c("e.coli","q_havel",...) lm(formula = eval(create_formula(variables)), data = process_model_riverdata(riverdata, variables)) ## End(Not run) create_formula(c("log_e.coli","q_havel","ka_ruhleben","r_berlin")) create_formula(c("e.coli","r_mitte","r_charlottenburg","r_spandau"))
Main function for creation, object handling and saving of new models. The models will be saved in a own subdirectory as r-objects as a side-effect. The models will not be returned. They have to be loaded by other functions.
build_new_model(river)
build_new_model(river)
river |
character. river you want to build a model on |
This function returns merely a message what happend.
Choose model asks the user inside the console for input. Options are Exit, New Model, or one of a list of existing models. If no integer number was presented by the user an ERROR message will be created but no ERROR will be thrown. This way this can be inserted inside a loop.
choose_model(rivermodels)
choose_model(rivermodels)
rivermodels |
A list of named models |
Returns the user input as character vector, or an ERROR message.
choose_model(list()) choose_model(list(fake_river_model = 1))
choose_model(list()) choose_model(list(fake_river_model = 1))
Takes similar named variables and produces a matrix of scatterplots and their correlation coefficients to E.Coli.
correlation_scatterplot(df, ...) correlation_values(df, ...)
correlation_scatterplot(df, ...) correlation_values(df, ...)
df |
data.frame with data for e.coli and chosen variables in lagdays |
... |
Arguments passed to |
Plotting function. Returns a plot.
Returns correlation values.
correlation_values
: Internal function
correlation_values(data.frame(datum = rep("egal",10), e.coli = 1:10, var = 1:10), variable = "var")
correlation_values(data.frame(datum = rep("egal",10), e.coli = 1:10, var = 1:10), variable = "var")
Lookup laboratory tables for MPN values for E.Coli to get upper and lower 0.95 confidence interval for the given values. If value is not directly found in table it will be generated by interpolating nearest neighbors.
get_mpn_ci(e.coli)
get_mpn_ci(e.coli)
e.coli |
numeric. A vector for e.coli values |
A data.frame with 3 columns: e.coli, lo, up
## Not run: print(get_mpn_ci(c(15,30,35,60,61,71,120,1959,25000,369990))) ## End(Not run)
## Not run: print(get_mpn_ci(c(15,30,35,60,61,71,120,1959,25000,369990))) ## End(Not run)
Get List of Paths used in the Flusshygiene Project
get_paths(resolve = TRUE, ...)
get_paths(resolve = TRUE, ...)
resolve |
if |
... |
arguments passed to |
## Not run: paths <- get_paths() # Paths to the different work package folders paths$ap2 paths$ap3 paths$ap4 # What tables are contained in the ODM database? kwb.db::hsTables(paths$odm) # Get all Flusshygiene data into one data frame data <- kwb.ogre.model::get_lab_values(paths$odm) ## End(Not run)
## Not run: paths <- get_paths() # Paths to the different work package folders paths$ap2 paths$ap3 paths$ap4 # What tables are contained in the ODM database? kwb.db::hsTables(paths$odm) # Get all Flusshygiene data into one data frame data <- kwb.ogre.model::get_lab_values(paths$odm) ## End(Not run)
Read existing, preprocessed csv files with first column datetime and other columns variable information.
import_riverdata(path)
import_riverdata(path)
path |
character-string to a DATA_preprocessed_csv directory |
Returns a list of data.frames containing the river data
The kwb.flusshygiene package provides functions in three major categories: model handling, model creation and model prediction.
river_model_prediction
is the main function in this package. It uses all of the following functions from within.
get_paths
reads a serverpath library for easy directory accessing
search_existing_models
searches saved R-objects in the river directories.
build_new_model
again a overhead function for model creation. Asks the user whether or not the new model shall be saved as R-object.
import_riverdata
reads all river data from a directory.
build_model
is a small function
handling model creation and invokes stan_lm
for
model building.
ask_for_variables
asks the user which variables shall be included in the model. Creates plots as a side effect.
process_model_riverdata
processes a data.frame with
the necessary data for the data argument in
stan_lm
create_formula
creates a hygiene formula out of the
variables with the form e.coli ~ Q * (R + Ka)
predict_quality
is
the overhead function for the prediction. It also invokes
posterior_predict.stanreg
get_newdata
gathers the latest data for prediction.
print_latest
prints the prediction of the latest day.
plot_predicted_quality
plots a whole season with quality assessment.
unroll_physical_data
unrolls a list with data with lagday combinations to 5 days (default)
correlation_scatterplot
plots a scatterplot matrix of the unrolled physical data together with correlation values.
plot_stan_model
plots a stan_lm posterior predction with quality assessment.
plot_data_overview
plot data overview
plot_hygiene_overview
a statistical hygiene data overview
plot_q_overview
plot a overview of all q values
plot_rain_overview
plot a monthly overview of all rain gauges
Creates a plot with segments or points of the data availability.
plot_data_overview(riverdata, type = "segment")
plot_data_overview(riverdata, type = "segment")
riverdata |
A list of hygiene and physical data of the river |
type |
Either "segment" or "point" for more precise information |
Returns a plot
Creates a plot with three graphs: Histogramm of all e.coli values, a density curve of the last 16 values, and a boxplot of all values again
plot_hygiene_overview(hygiene_df)
plot_hygiene_overview(hygiene_df)
hygiene_df |
A data.frame with the hygiene data of a given river |
Returns a plot
Window function for plot_stan_model
plot_predicted_quality(model, prediction, ...)
plot_predicted_quality(model, prediction, ...)
model |
stan.lm model for the river |
prediction |
list of season, ppd of predcit and ppd of means |
... |
Further parameter passed to plot.default |
Plotting function. Returns a plot.
Creates a plot with the standard flowing conditions over the year. The data of all years will be taken into account.
plot_q_overview(q_df)
plot_q_overview(q_df)
q_df |
The data.frame with 2 columns: datum and q |
Returns a plot
Creates a plot with a monthly summary overview over the different rain sites
plot_rain_overview(df)
plot_rain_overview(df)
df |
A data frame with different rain gauges. |
Returns a plot
Plots a sample of posterior predictions and means. Furthermore colours an hygiene quality assessment as background (see EU Bathing Water Directive) Dark blue means excellent quality. Steelblue means good quality. Yellow means sufficient quality. Red means insufficient quality.
plot_stan_model(timestamp, predict, linpred, log = FALSE, q90, q95, nlines = 250, nlinesCenter = 100, ...)
plot_stan_model(timestamp, predict, linpred, log = FALSE, q90, q95, nlines = 250, nlinesCenter = 100, ...)
timestamp |
POSIX. The x-axis timestamp |
predict |
ppd. The posterior prediction of the model |
linpred |
ppd. The linpred (predicted means) of the model |
log |
logical. Is E.Coli log01-transformed? |
q90 |
numeric. The 90. percentile of predict. |
q95 |
numeric. The 95. precentile of predict. |
nlines |
numeric. How many lines for posterion predictions? |
nlinesCenter |
numeric. How many lines for predicted means? |
... |
Further parameters for plot.default |
Plotting function. Returns a plot.
Main function for invoking and object handling. E.Coli hygiene models will be used to predict hygiene quality on differnt scopes.
predict_quality(model, river_dir, output = "season") get_newdata(variables, river_dir) print_latest(model, newdata) get_latest_season(newdata)
predict_quality(model, river_dir, output = "season") get_newdata(variables, river_dir) print_latest(model, newdata) get_latest_season(newdata)
model |
stan_lm. A model of e.coli concentration in given river |
river_dir |
character. Path to river-data for up-to-date predictions. |
output |
character. |
variables |
character. A vector with all variables used in the model |
newdata |
data.frame with physical data used in the model |
Returns a list of physical data and prediction and linpred from model
Returns a data.frame with the merged data found
get_newdata
: Internal Usage
print_latest
: Internal Usage
get_latest_season
: Internal Usage
Read data for ODM tables from CSV files stored in the package
readTableData(sourcedir = system.file("extdata", "ODM", package = "kwb.flusshygiene"))
readTableData(sourcedir = system.file("extdata", "ODM", package = "kwb.flusshygiene"))
sourcedir |
path to input directory |
list of data frames
This function is the front-end for model search on the server, model building with existing data, or prediction with new data. It invokes all other functions and handles their objects. It is a main function.
river_model_prediction(river) search_existing_models(river_dir)
river_model_prediction(river) search_existing_models(river_dir)
river |
character. The desired river, like "isar". |
river_dir |
character. Path to server and river directory |
(invisible) The data.frame returned by the prediction plus a date column for easy plotting.
Returns a list with the existing models for that river (empty if no model was found).
search_existing_models
: directory searching
river_model_prediction(river = "isar") serverpath <- "//poseidon/projekte$/SUW_Department/Projects/FLUSSHYGIENE/Data-Work packages/Daten" river_dir <- search_existing_river_dir(river = "isar", server = serverpath) search_existing_models(river_dir = river_dir)
river_model_prediction(river = "isar") serverpath <- "//poseidon/projekte$/SUW_Department/Projects/FLUSSHYGIENE/Data-Work packages/Daten" river_dir <- search_existing_river_dir(river = "isar", server = serverpath) search_existing_models(river_dir = river_dir)
Unrolls the lagdays of data.frames.
unroll_physical_data(physical_data) unroll_lagdays(df, n = 5)
unroll_physical_data(physical_data) unroll_lagdays(df, n = 5)
physical_data |
list of river data (without hygiene) |
df |
data.frame of 2 columns: datum and var |
n |
numeric. unto to which day shall be lagged behind? |
Returns a list of data.frames for each variable. The data.frames contain the unrolled lagdays (with maxday = 5, length(df) == 17)
unroll_lagdays
: Internal usage mostly
df1 <- data.frame(datum = rep("egal", 25), var = 1:25) df2 <- data.frame(datum = rep("egal", 25), var2 = 51:75, var3 = 101:125) unroll_lagdays(df1) summary(unroll_physical_data(list(var1 = df1, var2 = df2)))
df1 <- data.frame(datum = rep("egal", 25), var = 1:25) df2 <- data.frame(datum = rep("egal", 25), var2 = 51:75, var3 = 101:125) unroll_lagdays(df1) summary(unroll_physical_data(list(var1 = df1, var2 = df2)))