Package 'wasserportal'

Title: R Package with Functions for Scraping Data of Wasserportal Berlin
Description: R Package with Functions for Scraping Data of Wasserportal Berlin (https://wasserportal.berlin.de), which contains real-time data of surface water and groundwater monitoring stations.
Authors: Hauke Sonnenberg [aut] (ORCID: <https://orcid.org/0000-0001-9134-2871>), Michael Rustler [aut, cre] (ORCID: <https://orcid.org/0000-0003-0647-7726>), AD4GD [fnd], DWC [fnd], IMPETUS [fnd], PROMISCES [fnd], Kompetenzzentrum Wasser Berlin gGmbH (KWB) [cph]
Maintainer: Michael Rustler <[email protected]>
License: MIT + file LICENSE
Version: 0.7.0
Built: 2026-06-19 14:11:56 UTC
Source: https://github.com/KWB-R/wasserportal

Help Index


Helper function: base url for download

Description

Helper function: base url for download

Usage

base_url_download()

Value

base url for download of csv/zip files prepared by R package


Create Text Labels from Data Frame Columns

Description

Create Text Labels from Data Frame Columns

Usage

columns_to_labels(data, columns, fmt = "%s: %s", sep = ", ")

Arguments

data

data frame

columns

names of columns from which to create labels

fmt

format string passed to sprintf

sep

separator (default: ", ")

Value

character vector with as many elements as there are rows in data

Examples

data <- data.frame(number = 1:2, name = c("adam", "eva"), value = 3:4)
columns <- c("name", "value")
columns_to_labels(data, columns)
columns_to_labels(data, columns, fmt = "<p>%s: %s</p>", sep = "")
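
The labelling idea can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the helper name labels_sketch is hypothetical, and it assumes that fmt receives the column name first and the cell value second.

```r
# Hypothetical re-implementation sketch (NOT the package code):
# build one "name: value" part per column, then paste the parts row-wise.
labels_sketch <- function(data, columns, fmt = "%s: %s", sep = ", ") {
  parts <- lapply(columns, function(column) {
    # sprintf() recycles the column name over all rows
    sprintf(fmt, column, as.character(data[[column]]))
  })
  # paste() combines the per-column parts element-wise, i.e. per row
  do.call(paste, c(parts, sep = sep))
}

data <- data.frame(number = 1:2, name = c("adam", "eva"), value = 3:4)
labels <- labels_sketch(data, c("name", "value"))
print(labels)
```

The result has one element per row of data, matching the Value description above.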

Provide Tables of Wasserportal API Documentation

Description

The tables that appear in the API documentation of the wasserportal (https://wasserportal.berlin.de/download/wasserportal_berlin_getting_data.pdf) have been added to the wasserportal package. This function returns a list of data frames with each element representing one of these tables.

Usage

get_api_tables(name = NULL)

Arguments

name

name of the element to be selected from the list of data frames. If this argument is left as NULL (the default), the full list of data frames is returned.

Value

list of data frames or data frame specified by the name argument

Examples

get_api_tables()

Get Daily Surfacewater Data: wrapper to scrape daily surface water data

Description

Get Daily Surfacewater Data: wrapper to scrape daily surface water data

Usage

get_daily_surfacewater_data(
  stations,
  variables = get_surfacewater_variables(),
  list2df = FALSE
)

Arguments

stations

stations as retrieved by get_stations

variables

variables as retrieved by get_surfacewater_variables

list2df

convert result list to data frame (default: FALSE)

Value

list or data frame with all available data from Wasserportal

Examples

## Not run: 
stations <- wasserportal::get_stations()
variables <- wasserportal::get_surfacewater_variables()
variables
sw_data_daily <- wasserportal::get_daily_surfacewater_data(stations, variables)

## End(Not run)

Get Groundwater Data

Description

Wrapper function that scrapes all available raw data, i.e. groundwater level and quality data, and returns them in a list

Usage

get_groundwater_data(
  stations,
  groundwater_options = get_groundwater_options(),
  debug = TRUE,
  stations_list = NULL
)

Arguments

stations

list as retrieved by get_stations. Deprecated. Please use stations_list instead

groundwater_options

as retrieved by get_groundwater_options

debug

print debug messages (default: TRUE)

stations_list

list of station metadata as returned by get_stations(type = "list")

Value

list with elements "groundwater.level" and "groundwater.quality" data frames

Examples

## Not run: 
stations <- wasserportal::get_stations()
gw_data_list <- get_groundwater_data(stations)
str(gw_data_list)

## End(Not run)

Helper function: get groundwater options

Description

Helper function: get groundwater options

Usage

get_groundwater_options()

Value

available groundwater data options, prepared for use as input to get_groundwater_data

Examples

get_groundwater_options()

Wasserportal Berlin: get overview options for stations

Description

Wasserportal Berlin: get overview options for stations

Usage

get_overview_options()

Value

list with shortcuts to station overview tables (wasserportal.berlin.de/messwerte.php?anzeige=tabelle&thema=<shortcut>)

Examples

get_overview_options()

Helper function: get available station variables

Description

Helper function: get available station variables

Usage

get_station_variables(station_df)

Arguments

station_df

data frame with one row per station and columns "Messstellennummer", "Messstellenname" and additional columns each of which represents a variable that is measured at that station. If the variable columns contain the value "x" it means that the corresponding variable is measured and the name of the column is contained in the returned vector of variable names.

Value

returns names of available variables for station
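
The "x" convention described for station_df can be sketched in base R as shown below. This is an illustration under assumptions, not the package's code: the helper name station_variables_sketch, the station number and the variable column names are made up for the example.

```r
# Hypothetical sketch: return the names of all variable columns that
# contain "x" for the given station row(s).
station_variables_sketch <- function(station_df) {
  id_columns <- c("Messstellennummer", "Messstellenname")
  variable_columns <- setdiff(names(station_df), id_columns)
  is_measured <- vapply(
    variable_columns,
    function(column) any(station_df[[column]] %in% "x"),  # %in% treats NA as FALSE
    logical(1L)
  )
  variable_columns[is_measured]
}

# Made-up one-row crosstable (station number and variables are assumptions)
station_df <- data.frame(
  Messstellennummer = "0000000",
  Messstellenname = "Beispielstation",
  Wasserstand = "x",
  Durchfluss = NA_character_,
  Wassertemperatur = "x",
  stringsAsFactors = FALSE
)

available <- station_variables_sketch(station_df)
print(available)
```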


Get Stations

Description

Get Stations

Usage

get_stations(
  type = c("list", "data.frame", "crosstable"),
  run_parallel = TRUE,
  n_cores = parallel::detectCores() - 1L,
  debug = TRUE
)

Arguments

type

vector of character describing the type(s) of output(s) to be returned. Expected values (and default): c("list", "data.frame", "crosstable"). If only one value is given, the data is returned in the expected type. If more than one value is given, a list is returned with one list element per type.

run_parallel

default: TRUE

n_cores

number of cores to use if run_parallel = TRUE. Default: one less than the detected number of cores.

debug

logical indicating whether or not to show debug messages

Value

list with general station "overview" (either as list "overview_list" or as data.frame "overview_df") and a crosstable indicating which parameters are available per station ("x" if available, NA if not)

Examples

stations <- wasserportal::get_stations(n_cores = 2L)
str(stations)

Get Surface Water Quality for Multiple Monitoring Stations

Description

Get Surface Water Quality for Multiple Monitoring Stations

Usage

get_surfacewater_qualities(station_ids, dbg = TRUE)

Arguments

station_ids

vector with ids of multiple (or one) monitoring stations

dbg

print debug messages (default: TRUE)

Value

data frame with water quality data for multiple monitoring stations

Examples

## Not run: 
stations <- wasserportal::get_stations()
station_ids <- stations$overview_list$surface_water.quality$Messstellennummer
swq <- wasserportal::get_surfacewater_qualities(station_ids)
str(swq)

## End(Not run)

Get Surface Water Quality for One Monitoring Station

Description

Get Surface Water Quality for One Monitoring Station

Usage

get_surfacewater_quality(station_id)

Arguments

station_id

id of surface water measurement station

Value

data frame with water quality data for one monitoring station

Examples

## Not run: 
stations <- wasserportal::get_stations()
station_id <- stations$overview_list$surface_water.quality$Messstellennummer[1]
swq <- wasserportal::get_surfacewater_quality(station_id)
str(swq)

## End(Not run)

Helper function: get surface water variables

Description

Helper function: get surface water variables

Usage

get_surfacewater_variables()

Value

vector with surface water variables


Wasserportal Berlin: get master data for a single station

Description

Wasserportal Berlin: get master data for a single station

Usage

get_wasserportal_master_data(master_url)

Arguments

master_url

url with master data for single station as retrieved by get_wasserportal_stations_table

Value

data frame with metadata for selected station

Examples

## Not run: 
stations_list <- wasserportal::get_stations(type = "list")

# GW Station
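# Select the first master data link (the subsetting is done in its own
# pipe step because `f(...)[1L]` is not a valid right-hand side of `%>%`)
master_url <- stations_list %>%
  kwb.utils::selectElements("groundwater.level") %>%
  kwb.utils::selectColumns("stammdaten_link") %>%
  utils::head(1L)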

get_wasserportal_master_data(master_url)

# SW Station

# Reduce to monitoring stations maintained by Berlin
master_urls <- stations_list %>%
  kwb.utils::selectElements("surface_water.water_level") %>%
  dplyr::filter(.data$Betreiber == "Land Berlin") %>%
  dplyr::pull(.data$stammdaten_link)

get_wasserportal_master_data(master_urls[1L])

## End(Not run)

Wasserportal Berlin: get master data for multiple stations

Description

Wasserportal Berlin: get master data for multiple stations

Usage

get_wasserportal_masters_data(master_urls, run_parallel = TRUE)

Arguments

master_urls

URLs to master data as found in column "stammdaten_link" of the data frames in the list returned by get_stations(type = "list")

run_parallel

default: TRUE

Value

data frame with metadata for selected master urls

Examples

## Not run: 
stations_list <- wasserportal::get_stations(type = "list")

# Reduce to monitoring stations maintained by Berlin
master_urls <- stations_list$surface_water.water_level %>%
  dplyr::filter(.data$Betreiber == "Land Berlin") %>%
  dplyr::pull(.data$stammdaten_link)

system.time(master_parallel <- get_wasserportal_masters_data(
  master_urls
))

system.time(master_sequential <- get_wasserportal_masters_data(
  master_urls,
  run_parallel = FALSE
))

## End(Not run)

Get Names and IDs of the Stations of wasserportal.berlin.de

Description

Get Names and IDs of the Stations of wasserportal.berlin.de

Usage

get_wasserportal_stations(type = "quality")

Arguments

type

one of "quality", "level", "flow"


Wasserportal Berlin: get stations overview table

Description

Wasserportal Berlin: get stations overview table

Usage

get_wasserportal_stations_table(
  type = get_overview_options()$groundwater$level,
  url_wasserportal = wasserportal_base_url()
)

Arguments

type

type of stations table to retrieve. Valid options defined in get_overview_options, default: get_overview_options()$groundwater$level

url_wasserportal

base url to Wasserportal Berlin (default: wasserportal_base_url())

Value

data frame with master data of selected monitoring stations

Examples

types <- wasserportal::get_overview_options()
str(types)
sw_l <- wasserportal::get_wasserportal_stations_table(type = types$surface_water$water_level)
str(sw_l)

Get Names and IDs of the Variables of wasserportal.berlin.de

Description

Get Names and IDs of the Variables of wasserportal.berlin.de

Usage

get_wasserportal_variables(station = NULL)

Arguments

station

station id. If given, only variables that are available for the given station are returned.


Download and Inspect Wasserportal ZIP Files Hosted on gh-pages

Description

Convenience helper for local debugging of the daily ZIP artefacts published at https://kwb-r.github.io/wasserportal. Downloads each ZIP, extracts the CSV, reads it with readr::read_csv() and prints a short summary (columns, row count, unique Messstellennummer count, head of the data). The intersection of Messstellennummer values across all loaded files is reported at the end so you can quickly see how many stations have measurements in every file.

Usage

inspect_gh_pages_zips(
  files = c("groundwater_level.zip", "groundwater_quality.zip"),
  base_url = "https://kwb-r.github.io/wasserportal",
  destdir = tempfile("wasserportal-inspect-"),
  head_rows = 5L
)

Arguments

files

character vector of ZIP file names hosted under base_url. Defaults to the two groundwater ZIPs.

base_url

base URL where the ZIPs are hosted, without trailing slash. Default: https://kwb-r.github.io/wasserportal.

destdir

directory used to download and extract the ZIPs. Default is a fresh tempdir; pass an explicit path to keep the unpacked CSVs around for further inspection.

head_rows

number of rows to print from the top of every loaded data frame. Default 5.

Details

Returns the loaded data frames invisibly so the caller can further inspect them in R, e.g. dat$groundwater_level$Parameter |> table().

Value

invisibly a named list of tibbles, one per input file. Names are derived from the ZIP basename without the extension.
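
The station intersection reported at the end of the summary can be reproduced on the returned list roughly as follows. This is a sketch with made-up data standing in for the result of inspect_gh_pages_zips(); only the column name Messstellennummer is taken from the description above.

```r
# Made-up stand-in for the list returned by inspect_gh_pages_zips()
dat <- list(
  groundwater_level = data.frame(
    Messstellennummer = c("1", "1", "2", "3"),
    stringsAsFactors = FALSE
  ),
  groundwater_quality = data.frame(
    Messstellennummer = c("2", "3", "4"),
    stringsAsFactors = FALSE
  )
)

# Unique station numbers per file, then the intersection across all files
ids_per_file <- lapply(dat, function(x) unique(x$Messstellennummer))
common_ids <- Reduce(intersect, ids_per_file)

print(lengths(ids_per_file))
print(common_ids)
```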

Examples

## Not run: 
# default: groundwater level + groundwater quality
dat <- inspect_gh_pages_zips()

# any ZIPs you want to inspect:
dat <- inspect_gh_pages_zips(files = c(
  "daily_surface-water_water-level.zip",
  "daily_surface-water_temperature.zip"
))

# keep the extracted CSVs:
dat <- inspect_gh_pages_zips(destdir = "~/tmp/wasserportal-inspect")

## End(Not run)

Helper function: list data to csv or zip

Description

Helper function: list data to csv or zip

Usage

list_data_to_csv_or_zip(data_list, file_prefix, to_zip)

Arguments

data_list

data in list form

file_prefix

file prefix

to_zip

whether or not to convert to zip file

Value

loops through list of data frames and uses list names as filenames


Helper function: list masters data to csv

Description

Helper function: list masters data to csv

Usage

list_masters_data_to_csv(masters_data_list)

Arguments

masters_data_list

masters data in list form as retrieved by get_stations(type = "list")

Value

loops through list of data frames and uses list names as filenames

Examples

## Not run: 
stations_list <- get_stations(type = "list")
masters_data_csv_files <- list_masters_data_to_csv(stations_list)
masters_data_csv_files

## End(Not run)

Helper function: list timeseries data to zip

Description

Helper function: list timeseries data to zip

Usage

list_timeseries_data_to_zip(timeseries_data_list)

Arguments

timeseries_data_list

time series data in list form as retrieved by get_groundwater_data

Value

loops through list of data frames and uses list names as filenames

Examples

## Not run: 
stations <- wasserportal::get_stations()

# Groundwater Time Series
gw_tsdata_list <- wasserportal::get_groundwater_data(stations)
gw_tsdata_files <- wasserportal::list_timeseries_data_to_zip(gw_tsdata_list)

# Surface Water Time Series
sw_tsdata_list <- wasserportal::get_daily_surfacewater_data(stations)
sw_tsdata_files <- wasserportal::list_timeseries_data_to_zip(sw_tsdata_list)

## End(Not run)

Helper function to read CSV

Description

Helper function to read CSV

Usage

read(text, ...)

Arguments

text

text

...

additional arguments passed to read.table

Value

data frame with values


Download and Read Data from wasserportal.berlin.de

Description

This function downloads and reads CSV files from wasserportal.berlin.de.

Usage

read_wasserportal(
  station,
  variables = NULL,
  from_date = as.character(Sys.Date() - 90L),
  type = "single",
  include_raw_time = FALSE,
  stations_crosstable
)

Arguments

station

station number, as found in column "Messstellennummer" of the data frame returned by get_stations(type = "crosstable")

variables

vector of variable identifiers, as returned by get_station_variables

from_date

Date object (or string in format "yyyy-mm-dd" that can be converted to a Date object) representing the first day for which to request data. Default: as.character(Sys.Date() - 90L)

type

one of "single" (the default), "daily", "monthly"

include_raw_time

if TRUE the original time column and the column with the corrected winter time are included in the output. The default is FALSE.

stations_crosstable

data frame as returned by get_stations(type = "crosstable")

Details

The original timestamps (column timestamp_raw in the example below) are not all plausible, e.g. "31.03.2019 03:00" appears twice. They are corrected (column timestamp_corr) to represent a plausible sequence of timestamps in Berlin Normal Time (UTC+01). Finally, a valid POSIXct timestamp in timezone "Europe/Berlin" (UTC+01 in winter, UTC+02 in summer) is created, together with the additional information on the UTC offset (column UTCOffset, 1 in winter, 2 in summer).
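
The last step, going from timestamps in Berlin Normal Time (UTC+01) to POSIXct values with a UTC offset, can be sketched in base R as below. This is an illustration only and not necessarily how the package does it; the input strings are made up and the variable names are assumptions.

```r
# Made-up timestamps, already corrected to Berlin Normal Time (fixed UTC+01)
timestamp_corr <- c("30.03.2019 23:00", "31.03.2019 02:00", "31.03.2019 03:00")

# Parse as if UTC, then subtract one hour to get the true UTC instants
utc <- as.POSIXct(timestamp_corr, format = "%d.%m.%Y %H:%M", tz = "UTC") - 3600

# Same instants, displayed in Berlin local time (winter or summer time)
local <- utc
attr(local, "tzone") <- "Europe/Berlin"

# "%z" gives e.g. "+0100" or "+0200"; convert to whole hours
utc_offset <- as.integer(format(local, "%z")) / 100

print(data.frame(timestamp_corr, local = format(local, "%Y-%m-%d %H:%M"), utc_offset))
```

Around the switch to summer time on 31 March 2019, the normal-time stamps 02:00 and 03:00 become 03:00 and 04:00 local time, each with an offset of 2.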

Value

data frame read from the CSV file that the download provides. IMPORTANT: It is not yet clear how to interpret the timestamp, see example

Examples

## Not run: 
# Get a list of available water quality stations and variables
stations_crosstable <- wasserportal::get_stations(type = "crosstable")

# Set the start date
from_date <- "2021-03-01"

# Read the timeseries (multiple variables for one station)
water_quality <- wasserportal::read_wasserportal(
  station = stations_crosstable$Messstellennummer[1L],
  from_date = from_date,
  include_raw_time = TRUE,
  stations_crosstable = stations_crosstable
)

# Look at the first few records
head(water_quality)

# Check the metadata
#kwb.utils::getAttribute(water_quality, "metadata")

# Set missing values to NA
water_quality[water_quality == -777] <- NA

# Look at the first few records again
head(water_quality)

### How was the original timestamp interpreted?

# Determine the days at which summer time starts and ends, respectively
from_year <- as.integer(substr(from_date, 1L, 4L))
switches <- kwb.datetime::date_range_CEST(from_year)

# Reformat to dd.mm.yyyy
switches <- kwb.datetime::reformatTimestamp(switches, "%Y-%m-%d", "%d.%m.%Y")

# Define a pattern to look for timestamps "around" the switches
pattern <- paste(switches, "0[1-4]", collapse = "|")

# Look at the data for these timestamps
water_quality[grepl(pattern, water_quality$timestamp_raw), ]

# The original timestamps (timestamp_raw) were not all plausible, e.g.
# for March 2019. This seems to have been fixed by the "wasserportal"!
sum(water_quality$timestamp_raw != water_quality$timestamp_corr)

## End(Not run)

Read Wasserportal Raw

Description

Read Wasserportal Raw

Usage

read_wasserportal_raw(
  variable,
  station,
  from_date,
  type = "single",
  include_raw_time = FALSE,
  handle = NULL,
  stations_crosstable,
  api_version = 2L
)

Arguments

variable

variable

station

station id

from_date

start date

type

one of "single", "daily", "monthly" (default: "single")

include_raw_time

TRUE or FALSE (default: FALSE)

handle

handle (default: NULL)

stations_crosstable

data frame as returned by get_stations(type = "crosstable")

api_version

integer representing the version of the Wasserportal API. 1L: before 2023, 2L: since 2023. Default: 2L

Value

????


read_wasserportal_raw_gw

Description

read_wasserportal_raw_gw

Usage

read_wasserportal_raw_gw(
  station = 149,
  stype = "gws",
  type = "single_all",
  from_date = "",
  include_raw_time = FALSE,
  handle = NULL,
  as_text = FALSE,
  dbg = FALSE
)

Arguments

station

station id

stype

"gws" or "gwq"

type

"single" or "single_all" (if stype = "gwq")

from_date

(default: "")

include_raw_time

default: FALSE

handle

default: NULL

as_text

if TRUE, this function returns the raw text returned by the HTTP request to the Wasserportal. Otherwise (the default) the function tries to interpret the raw text as comma-separated values and returns a corresponding data frame. Use as_text = TRUE to analyse the raw text if an error occurs when converting the text to a data frame.

dbg

logical indicating whether or not to show debug messages. The default is FALSE

Value

data.frame with values

Examples

## Not run: 
read_wasserportal_raw_gw(station = 149, stype = "gws")
read_wasserportal_raw_gw(station = 149, stype = "gwq")

## End(Not run)

Read CSV File from Package's "extdata" Folder

Description

Read CSV File from Package's "extdata" Folder

Usage

readPackageFile(file, ...)

Arguments

file

file name (without path)

...

additional arguments passed to read.csv

Value

data frame representing the content of file


Delete All Time-Series Data for Selected Keys on a ThingsBoard Device

Description

Wipes historical telemetry rows from ThingsBoard for the given device and keys via DELETE /api/plugins/telemetry/DEVICE/{id}/timeseries/delete. Pass keys = NULL (the default) to clear every key the device currently knows – the function then calls tb_list_device_telemetry_keys first to discover them.

Usage

tb_delete_device_telemetry(
  device_id,
  keys = NULL,
  api_key = Sys.getenv("TB_API_KEY"),
  host = tb_default_host(),
  delete_latest = TRUE,
  username = Sys.getenv("TB_USERNAME"),
  password = Sys.getenv("TB_PASSWORD")
)

Arguments

device_id

device UUID.

keys

character vector of telemetry keys to delete, or NULL to clear every key the device currently knows.

api_key

account-level API key. Defaults to env var TB_API_KEY.

host

base URL of the ThingsBoard instance. Defaults to env var TB_HOST if set, otherwise "https://thingsboard.cloud".

delete_latest

if TRUE (default) ThingsBoard also drops the cached "latest telemetry" entry so the device-detail tab in the UI immediately reflects the deletion. Set to FALSE to keep the latest value visible for keys that get repopulated by the next push anyway.

username

ThingsBoard user for the username/password (JWT) login (self-hosted / Community Edition). Defaults to env var TB_USERNAME.

password

ThingsBoard password. Defaults to env var TB_PASSWORD.

Details

Server-side attributes (latitude, longitude, Bezirk, ...) and the device itself are NOT touched, only the time-series telemetry store. Re-running the demo push afterwards re-fills the cleared keys with the real Wasserportal timestamps.

Value

invisibly the number of keys submitted for deletion.

Examples

## Not run: 
id <- tb_get_device_id("wasserportal-gw-6038")
# wipe everything currently stored:
tb_delete_device_telemetry(id)
# wipe just the smoke-test GW-Stand value:
tb_delete_device_telemetry(id, keys = "GW-Stand")

## End(Not run)

Look Up a ThingsBoard Device's UUID by Name

Description

Lightweight read-only companion to tb_setup_devices: when you only need a device's internal UUID (e.g. to call the telemetry-delete endpoint), this returns it directly without touching the access token or creating the device on the side. Returns NA_character_ when the device does not exist.

Usage

tb_get_device_id(
  device_name,
  api_key = Sys.getenv("TB_API_KEY"),
  host = tb_default_host(),
  username = Sys.getenv("TB_USERNAME"),
  password = Sys.getenv("TB_PASSWORD")
)

Arguments

device_name

device name as shown in the ThingsBoard UI.

api_key

account-level API key. Defaults to env var TB_API_KEY.

host

base URL of the ThingsBoard instance. Defaults to env var TB_HOST if set, otherwise "https://thingsboard.cloud".

username

ThingsBoard user for the username/password (JWT) login (self-hosted / Community Edition). Defaults to env var TB_USERNAME.

password

ThingsBoard password. Defaults to env var TB_PASSWORD.

Value

device UUID (character) or NA_character_ if the lookup did not resolve.

Examples

## Not run: 
tb_get_device_id("wasserportal-gw-6038")

## End(Not run)

List the Telemetry Keys Currently Stored for a ThingsBoard Device

Description

Wraps GET /api/plugins/telemetry/DEVICE/{id}/keys/timeseries. Useful to discover what's actually in the device-side time-series store before a wipe, or to compare against the Parameter column of the gh-pages source data.

Usage

tb_list_device_telemetry_keys(
  device_id,
  api_key = Sys.getenv("TB_API_KEY"),
  host = tb_default_host(),
  username = Sys.getenv("TB_USERNAME"),
  password = Sys.getenv("TB_PASSWORD")
)

Arguments

device_id

device UUID. Use tb_get_device_id to resolve a name.

api_key

account-level API key. Defaults to env var TB_API_KEY.

host

base URL of the ThingsBoard instance. Defaults to env var TB_HOST if set, otherwise "https://thingsboard.cloud".

username

ThingsBoard user for the username/password (JWT) login (self-hosted / Community Edition). Defaults to env var TB_USERNAME.

password

ThingsBoard password. Defaults to env var TB_PASSWORD.

Value

character vector of telemetry key names. May be of length 0.

Examples

## Not run: 
id <- tb_get_device_id("wasserportal-gw-6038")
tb_list_device_telemetry_keys(id)

## End(Not run)

Obtain a JWT Bearer Token from ThingsBoard (Username / Password Login)

Description

Calls POST /api/auth/login with a username/password pair and returns the JWT access token. This is the standard ThingsBoard REST API authentication and the only one available on self-hosted ThingsBoard Community Edition: unlike the account-level API key (a ThingsBoard Cloud convenience, generated under Account > Security > API keys), every edition – CE, PE and Cloud – accepts a username/password login.

Usage

tb_login(
  username = Sys.getenv("TB_USERNAME"),
  password = Sys.getenv("TB_PASSWORD"),
  host = tb_default_host()
)

Arguments

username

ThingsBoard user (usually the account e-mail). Defaults to env var TB_USERNAME.

password

ThingsBoard password. Defaults to env var TB_PASSWORD.

host

base URL of the ThingsBoard instance, without trailing slash. Defaults to env var TB_HOST if set, otherwise "https://thingsboard.cloud". Use e.g. "https://dashboards.inowas.org" for a self-hosted instance.

Details

The token is short-lived (ThingsBoard's default JWT expiration is 2.5 h), which is ample for the one-off device setup done by tb_setup_devices: the subsequent telemetry push uses the per-device access token, not this JWT, so no token refresh is implemented.

Transient failures (HTTP 408 / 429 / 500 / 502 / 503 / 504 and transport dropouts) are retried with exponential backoff, matching the predicate used for the telemetry POSTs so a flaky upstream does not abort the device-setup run on the first 5xx.
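
The retry behaviour described above can be illustrated with a small base R sketch. It is not the package's implementation (which, per the section below, relies on httr2::req_retry()): all function names, the base delay and the cap are assumptions, and transport dropouts are not modelled.

```r
# Status codes treated as transient, taken from the paragraph above
transient_statuses <- c(408L, 429L, 500L, 502L, 503L, 504L)

is_transient_status <- function(status) status %in% transient_statuses

# Exponential backoff: 1 s, 2 s, 4 s, ... capped at `cap` seconds (assumed values)
backoff_seconds <- function(attempt, base = 1, cap = 30) {
  min(cap, base * 2^(attempt - 1L))
}

# Hypothetical retry loop; `request_fun` returns an HTTP status code
retry_sketch <- function(request_fun, max_tries = 4L, sleep = Sys.sleep) {
  for (attempt in seq_len(max_tries)) {
    status <- request_fun(attempt)
    if (!is_transient_status(status)) {
      return(list(status = status, attempts = attempt))
    }
    if (attempt < max_tries) sleep(backoff_seconds(attempt))
  }
  list(status = status, attempts = max_tries)
}

# Fake request that fails twice with 503 before succeeding
fake_request <- function(attempt) if (attempt < 3L) 503L else 200L

# Record the waits instead of actually sleeping
waits <- numeric(0)
result <- retry_sketch(fake_request, sleep = function(s) waits <<- c(waits, s))
print(result)
print(waits)
```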

Value

the JWT access token as a character scalar, ready to be sent in an X-Authorization: Bearer <token> request header.

Credentials in error output

On a non-2xx response this helper surfaces an excerpt of the server's response body (via tb_error_body(), up to ~800 chars) in the R error message, and httr2::req_retry() prints retry messages to stderr. Stock ThingsBoard only echoes back the error description, not the request payload, so the password does not leak. If a self-hosted instance or reverse proxy is configured to echo request fields back in the error body, that excerpt would surface in R errors and – when captured with 2>&1 – in CI logs. Mask the relevant secrets in such environments.

Examples

## Not run: 
Sys.setenv(
  TB_HOST = "https://dashboards.inowas.org",
  TB_USERNAME = "[email protected]",
  TB_PASSWORD = "secret"
)
token <- tb_login()

## End(Not run)

Recommended Push Defaults per ThingsBoard Subscription Plan

Description

Wraps the per-device transport rate limits documented at https://thingsboard.io/docs/paas/eu/subscriptions/ into the parameters this package's push functions take. Pass the result into tb_push_station_telemetry() via mode, chunk_size and throttle_seconds so you stay below your plan's "Telemetry Transport messages/data points (Device)" thresholds.

Usage

tb_plan_defaults(plan = "free")

Arguments

plan

one of

  • "free" – proven Single-record mode (mode = "single", chunk_size = 1, throttle_seconds = 0.05).

  • "free-bulk" – bulk preset tuned to stay under Free's per-device 100 dp/s and 2,000 dp/min caps (chunk_size = 10, throttle_seconds = 1.0). Confirmed not to work on the public ThingsBoard Cloud Maker free tier as of 2026-05: the gateway returns an opaque empty-body HTTP 500 to the array form regardless of how small the chunk is. Kept as a reproducible baseline; on Free use "free".

  • "prototype", "pilot", "startup", "business" – the paid PaaS tiers. All use mode = "bulk", chunk_size = 30, throttle_seconds = 1.0 (~30 dp/s, well under the 2,000 dp/min cap that all paid tiers share).

  • "ce" – self-hosted Community Edition: mode = "bulk", chunk_size = 1000, throttle_seconds = 0.

Case-insensitive. Unknown values raise an error.

Details

Across all paid PaaS tiers the per-device sustained limits are identical (2,000 telemetry data points per minute, 15,000 per hour); the only thing that changes is how aggressive a burst the platform tolerates before it drops a request. Free additionally rejects the bulk array form on the device telemetry endpoint, so its default is mode = "single".

Self-hosted ThingsBoard CE has no per-tenant rate limit by default, hence the much larger chunk size and zero throttle.

Value

named list with mode, chunk_size and throttle_seconds, ready to be spread into tb_push_station_telemetry().
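
"Spreading" a preset into a function call can be done with do.call(), as sketched below. The preset is written out by hand with the values listed for the paid tiers above, and push_sketch is a hypothetical stand-in that only counts requests, so the sketch runs without a ThingsBoard instance.

```r
# Hand-written stand-in for a paid-tier preset (values from the list above)
preset <- list(mode = "bulk", chunk_size = 30L, throttle_seconds = 1)

# Hypothetical stand-in for the push function: no HTTP, only bookkeeping
push_sketch <- function(n_records, mode, chunk_size, throttle_seconds) {
  n_requests <- if (mode == "bulk") ceiling(n_records / chunk_size) else n_records
  list(
    n_requests = n_requests,
    # sleep happens between consecutive requests only
    min_duration_seconds = max(n_requests - 1, 0) * throttle_seconds
  )
}

# Combine the call-specific arguments with the preset and spread them
plan <- do.call(push_sketch, c(list(n_records = 95L), preset))
print(plan)
```

With the real functions the same pattern would presumably read do.call(tb_push_station_telemetry, c(list(data = ..., device_token = ...), tb_plan_defaults("pilot"))).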

Examples

tb_plan_defaults("free")
tb_plan_defaults("free-bulk")
tb_plan_defaults("ce")

Push a Single "Latest" Telemetry Record (no Timestamp)

Description

Sends a flat {"key": value, ...} JSON object to the ThingsBoard telemetry endpoint, letting the server stamp it with the current time. This is the simplest possible telemetry POST and is useful both as a smoke test for the device-token auth path and as a fallback when the bulk-with-ts format is rejected (some ThingsBoard Cloud Maker tier configurations return an opaque HTTP 500 to the array-of-records form even though the same device accepts attributes and latest-style single records).

Usage

tb_push_latest_telemetry(
  values,
  device_token,
  host = tb_default_host()
)

Arguments

values

named list (or named numeric vector) of telemetry key/value pairs.

device_token

ThingsBoard device access token.

host

base URL of the ThingsBoard instance. Defaults to env var TB_HOST if set, otherwise "https://thingsboard.cloud".

Value

invisibly the number of keys that were sent.

Examples

## Not run: 
tb_push_latest_telemetry(
  values = list(`GW-Stand` = 35.6, `Wassertemperatur` = 11.2),
  device_token = Sys.getenv("TB_DEVICE_TOKEN")
)

## End(Not run)

Push Static Attributes of one Wasserportal Station to ThingsBoard

Description

Sends station metadata (coordinates, level reference, operator, ...) as client-side attributes to the ThingsBoard device attributes endpoint (/api/v1/{token}/attributes). Attributes are key/value pairs without a timestamp; ThingsBoard overwrites the previous value on every push.

Usage

tb_push_station_attributes(
  attributes,
  device_token,
  host = tb_default_host()
)

Arguments

attributes

named list (or a one-row data frame, which is converted to a list). All entries must be JSON-serialisable scalars.

device_token

ThingsBoard device access token.

host

base URL of the ThingsBoard instance. Defaults to env var TB_HOST if set, otherwise "https://thingsboard.cloud".

Value

invisibly the number of attributes that were sent.
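
When the attributes come from a one-row data frame, the conversion to a list of scalars can be checked up front, for example as follows. This is a sketch in base R, not the package's internal validation; the values are taken from the example below and the variable names are assumptions.

```r
# One-row data frame with station metadata
station_row <- data.frame(
  name = "Pegel Mueggelheim",
  latitude = 52.4291,
  longitude = 13.6450,
  stringsAsFactors = FALSE
)

# Each column becomes one named list entry
attributes_list <- as.list(station_row[1L, , drop = FALSE])

# Every entry should be a single, non-missing atomic value
is_scalar <- vapply(
  attributes_list,
  function(x) is.atomic(x) && length(x) == 1L && !is.na(x),
  logical(1L)
)

print(is_scalar)
```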

Examples

## Not run: 
tb_push_station_attributes(
  attributes = list(
    name = "Pegel Mueggelheim",
    latitude = 52.4291,
    longitude = 13.6450,
    pegelnullpunkt_m_NHN = 32.18
  ),
  device_token = Sys.getenv("TB_DEVICE_TOKEN_5867000")
)

## End(Not run)

Push Time Series of one Wasserportal Station to ThingsBoard

Description

Sends long-format measurements of a single Wasserportal monitoring station to the device telemetry endpoint of a ThingsBoard instance (/api/v1/{token}/telemetry). Works against ThingsBoard Cloud (e.g. the free Maker tier on https://thingsboard.cloud), self-hosted ThingsBoard Community Edition and https://demo.thingsboard.io since the device-token API is identical on all of them.

Usage

tb_push_station_telemetry(
  data,
  device_token,
  ts_col = "Datum",
  value_col = "Messwert",
  key_col = "Parameter",
  single_key = "value",
  host = tb_default_host(),
  chunk_size = 100L,
  mode = c("single", "bulk"),
  throttle_seconds = NULL,
  max_active = 10L,
  verbose = TRUE
)

Arguments

data

data frame for one station, in the long format produced by get_groundwater_data (columns Messstellennummer, Datum, Parameter, Einheit, Messwert) or get_daily_surfacewater_data (Messstellennummer, Datum, Tagesmittelwert, Parameter, Einheit).

device_token

ThingsBoard device access token (taken from the device detail view in the ThingsBoard UI).

ts_col

name of the timestamp column. Default "Datum".

value_col

name of the numeric value column. Default "Messwert" (set to "Tagesmittelwert" for surface water data).

key_col

name of the column whose values become telemetry keys. Default "Parameter". Set to NULL to push value_col itself under a single fixed key (see single_key).

single_key

telemetry key used when key_col is NULL. Default "value".

host

base URL of the ThingsBoard instance, without trailing slash. Defaults to env var TB_HOST if set, otherwise "https://thingsboard.cloud".

chunk_size

maximum number of timestamps per HTTP POST when mode = "bulk". Default 100. Ignored in single mode.

mode

one of "single" (default) or "bulk". The Maker free tier on ThingsBoard Cloud rejects the bulk array form with an opaque HTTP 500 even though the same device accepts the per-record {"ts": ms, "values": {...}} object; single mode therefore POSTs each record on its own. Use "bulk" against self-hosted CE for the much faster array-of-records form. See tb_plan_defaults for plan-aware presets that pick mode, chunk_size and throttle_seconds together.

throttle_seconds

inter-request sleep, in seconds, between consecutive HTTP POSTs. NULL (default) picks 0.05 for mode = "single" and 0.1 for mode = "bulk". Increase to stay safely below the per-second / per-minute transport rate limits of the target ThingsBoard plan; set to 0 to push as fast as the server permits (e.g. self-hosted CE).

max_active

number of concurrent HTTP POSTs in single mode (passed to httr2::req_perform_parallel()). Default 10, which stays below the ThingsBoard Cloud Free tier's 50 messages/second per-device transport rate limit. Ignored in bulk mode.

verbose

print one line per chunk in bulk mode and one line per parallel batch in single mode (default TRUE).

Details

Long-format input is pivoted on the fly so that every distinct value of key_col becomes one telemetry key inside ThingsBoard, sharing the same timestamp.
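The pivot described above can be pictured with a small base R sketch. It is not the package's internal code: the helper to_records() is hypothetical, the parameter names and values in the toy data are made up, and the conversion of Datum to epoch milliseconds in UTC is an assumption based on the {"ts": ms, "values": {...}} record form.

```r
# Toy long-format data using the documented column names; the parameter
# names and values are invented for illustration
long <- data.frame(
  Datum = as.Date(c("2024-01-01", "2024-01-01", "2024-01-02")),
  Parameter = c("GW-Stand", "Temperatur", "GW-Stand"),
  Messwert = c(31.2, 10.5, 31.3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one record per timestamp, one key per key_col value
to_records <- function(data, ts_col = "Datum", value_col = "Messwert",
                       key_col = "Parameter") {
  # Assumption: timestamps are sent as milliseconds since the epoch (UTC)
  ts_ms <- as.numeric(as.POSIXct(data[[ts_col]], tz = "UTC")) * 1000
  lapply(unique(ts_ms), function(ts) {
    rows <- which(ts_ms == ts)
    values <- as.list(data[[value_col]][rows])
    names(values) <- data[[key_col]][rows]
    list(ts = ts, values = values)
  })
}

records <- to_records(long)
str(records)
```

Rows that share a timestamp end up in the same record, so the first record here carries two keys and the second carries one.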

Value

number of telemetry timestamps that were sent, returned invisibly.
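To make the interplay of chunk_size and throttle_seconds more concrete, here is a minimal sketch under stated assumptions. Both helpers, chunk_indices() and resolve_throttle(), are hypothetical names that do not come from the package; they only mirror the documented behaviour (at most chunk_size timestamps per POST in bulk mode, and a default pause of 0.05 s in single mode or 0.1 s in bulk mode when throttle_seconds is NULL).

```r
# Hypothetical helper: split n timestamps into index chunks of at most
# chunk_size elements each
chunk_indices <- function(n, chunk_size = 100L) {
  if (n == 0L) return(list())
  idx <- seq_len(n)
  unname(split(idx, ceiling(idx / chunk_size)))
}

# Hypothetical helper: resolve the documented defaults for throttle_seconds
resolve_throttle <- function(mode = c("single", "bulk"),
                             throttle_seconds = NULL) {
  mode <- match.arg(mode)
  if (!is.null(throttle_seconds)) return(throttle_seconds)
  if (mode == "single") 0.05 else 0.1
}

chunks <- chunk_indices(250L, chunk_size = 100L)
print(lengths(chunks))
print(resolve_throttle("bulk"))
```

With 250 timestamps and the default chunk_size, a bulk push would therefore need three POSTs, the last one carrying the remaining 50 timestamps.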

Examples

## Not run: 
stations <- wasserportal::get_stations()
gw <- wasserportal::get_groundwater_data(stations)
one_station <- dplyr::filter(
  gw$groundwater.level,
  .data$Messstellennummer == "149"
)
tb_push_station_telemetry(
  data = one_station,
  device_token = Sys.getenv("TB_DEVICE_TOKEN_149")
)

## End(Not run)

Create ThingsBoard Devices and Return their Access Tokens

Description

Convenience wrapper for the initial setup against a fresh ThingsBoard tenant. Authenticates with either a username/password login (JWT; works on every ThingsBoard edition and is required for self-hosted Community Edition) or an account-level API key (ThingsBoard Cloud), then:

  1. creates one device per name (or fetches the device if it already exists),

  2. reads each device's access token via the credentials endpoint.

The returned named character vector can be fed directly into tb_push_station_telemetry as device_token.

Usage

tb_setup_devices(
  station_ids,
  api_key = Sys.getenv("TB_API_KEY"),
  host = tb_default_host(),
  name_prefix = "wasserportal-",
  device_type = "wasserportal",
  username = Sys.getenv("TB_USERNAME"),
  password = Sys.getenv("TB_PASSWORD")
)

Arguments

station_ids

character vector of Wasserportal Messstellennummer values. Each becomes a ThingsBoard device named paste0(name_prefix, station_id).

api_key

account-level API key (ThingsBoard Cloud only), generated under Account > Security > API keys > Generate. Sent in the X-Authorization: ApiKey <key> request header. Defaults to env var TB_API_KEY. Ignored when username and password are supplied.

host

base URL of the ThingsBoard instance, without trailing slash. Defaults to env var TB_HOST if set, otherwise "https://thingsboard.cloud". Use "https://eu.thingsboard.cloud" for the EU cloud or e.g. "https://dashboards.inowas.org" for a self-hosted instance.

name_prefix

prefix added in front of every station id when forming the ThingsBoard device name. Default "wasserportal-".

device_type

ThingsBoard device profile / type. Default "wasserportal". The profile is auto-created on first use.

username

ThingsBoard user for the username/password (JWT) login. Defaults to env var TB_USERNAME. When set together with password it takes precedence over api_key – this is the route to use for self-hosted Community Edition.

password

ThingsBoard password. Defaults to env var TB_PASSWORD.

Details

Set TB_USERNAME + TB_PASSWORD for the login route, or generate an API key in the ThingsBoard Cloud UI under Account > Security > API keys > Generate and set TB_API_KEY.
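The precedence between the two authentication routes can be sketched as follows. The function choose_auth_route() is a hypothetical illustration, not the package's implementation; it assumes that unset environment variables arrive as empty strings, which is what Sys.getenv() returns by default.

```r
# Hypothetical helper: decide which authentication route applies.
# The username/password login takes precedence over the API key.
choose_auth_route <- function(username = "", password = "", api_key = "") {
  if (nzchar(username) && nzchar(password)) {
    "login"
  } else if (nzchar(api_key)) {
    "api_key"
  } else {
    stop("Set TB_USERNAME + TB_PASSWORD or TB_API_KEY")
  }
}

# Placeholder credentials for illustration only
print(choose_auth_route(username = "user", password = "secret", api_key = "key"))
print(choose_auth_route(api_key = "key"))
```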

Value

named character vector. Names are the input station_ids, values are device access tokens.
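The shape of the return value can be illustrated without contacting a server. In the sketch below the token strings are placeholders standing in for what tb_setup_devices() would return; only the naming scheme paste0(name_prefix, station_id) and the "named by station id" structure are taken from this help page.

```r
station_ids <- c("149", "5867000", "5803900")

# Device names as formed from the default name_prefix
device_names <- paste0("wasserportal-", station_ids)

# Placeholder tokens (not real): a named character vector keyed by station id
tokens <- stats::setNames(c("tokenA", "tokenB", "tokenC"), station_ids)

# Look up the token of one station, e.g. to pass it on as device_token
token_149 <- tokens[["149"]]
print(device_names)
print(token_149)
```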

Examples

## Not run: 
# Self-hosted ThingsBoard Community Edition (username/password login):
Sys.setenv(
  TB_HOST = "https://dashboards.inowas.org",
  TB_USERNAME = "[email protected]",
  TB_PASSWORD = "secret"
)
tokens <- tb_setup_devices(c("149", "5867000", "5803900"))

# ThingsBoard Cloud (account API key):
Sys.setenv(
  TB_HOST = "https://eu.thingsboard.cloud",
  TB_API_KEY = "<paste-your-API-key-here>"
)
tokens <- tb_setup_devices(c("149", "5867000", "5803900"))

## End(Not run)

Helper function: Base URL of Berlin Wasserportal

Description

Helper function: Base URL of Berlin Wasserportal

Usage

wasserportal_base_url()

Value

string with base url of Berlin Wasserportal


Wasserportal Master Data: Download and Import into R List

Description

Wasserportal Master Data: Download and Import into R List

Usage

wp_masters_data_to_list(
  overview_list_names,
  target_dir = tempdir(),
  file_prefix = "stations_",
  is_zipped = FALSE
)

Arguments

overview_list_names

names of elements in the list returned by get_stations(type = "list")

target_dir

target directory for downloading data (default: tempdir())

file_prefix

prefix given to file names

is_zipped

whether the data to be downloaded are zipped (default: FALSE)

Value

downloads CSV master data from Wasserportal and imports them into an R list

Examples

## Not run: 
overview_list_names <- names(wasserportal::get_stations(type = "list"))
wp_masters_data_list <- wp_masters_data_to_list(overview_list_names)

## End(Not run)

Wasserportal Time Series Data: Download and Import into R List

Description

Wasserportal Time Series Data: Download and Import into R List

Usage

wp_timeseries_data_to_list(
  overview_list_names,
  target_dir = tempdir(),
  is_zipped = TRUE
)

Arguments

overview_list_names

names of elements in the list returned by get_stations(type = "list")

target_dir

target directory for downloading data (default: tempdir())

is_zipped

whether the data to be downloaded are zipped (default: TRUE)

Value

downloads (zipped) time series data from Wasserportal and imports them into an R list

Examples

## Not run: 
overview_list_names <- names(wasserportal::get_stations(type = "list"))
wp_timeseries_data_list <- wp_timeseries_data_to_list(overview_list_names)

## End(Not run)