| Title: | Analyse Your R Code! |
|---|---|
| Description: | This package allows you to parse your R scripts and to calculate some statistics on your code. |
| Authors: | Hauke Sonnenberg [aut, cre] (ORCID: <https://orcid.org/0000-0001-9134-2871>), Michael Rustler [ctb] (0000-0003-0647-7726), FAKIN [fnd], Kompetenzzentrum Wasser Berlin gGmbH (KWB) [cph] |
| Maintainer: | Hauke Sonnenberg <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.3.0 |
| Built: | 2026-05-28 10:11:54 UTC |
| Source: | https://github.com/KWB-R/kwb.code |
Analyse the Parse Tree of an R Script
analyse(x, path = "")
x |
parse tree as returned by |
path |
for internal use only (when this function is called recursively) |
list representing type information on the nodes in the parse tree
# Parse an R script file (here, a file from kwb.utils)
x <- parse("https://raw.githubusercontent.com/KWB-R/kwb.utils/master/R/log.R")

# Analyse the parse tree (This may take some time!)
result <- kwb.code::analyse(x)

# Show the structure of the result list (only 3 levels!)
str(result, 3)
Get Argument Names of a Function
arg_names(x)
x |
function name or function |
vector of character
arg_names("sum")
arg_names(mean)
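The following base R sketch shows one way to obtain such a vector of argument names. The helper name `arg_names_sketch` is hypothetical, and this is not necessarily how `arg_names` itself is implemented.

```r
# Hypothetical helper (not the package's implementation): return the
# argument names of a function given by name or as a function object.
arg_names_sketch <- function(x) {
  f <- if (is.character(x)) match.fun(x) else x
  # args() also covers primitives such as sum(), for which formals() is NULL
  names(formals(args(f)))
}

arg_names_sketch("sum")
arg_names_sketch(mean)
```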
The idea of this function is to collect objects of interest from the parse
tree, e.g. the names of functions that are called by a script. To do so,
set the function matches so that it returns TRUE for the nodes
in the tree that are of interest.
extract_from_parse_tree(
  x,
  matches = matches_function,
  dbg = FALSE,
  path = integer(),
  parent = NULL,
  index = -1
)
x |
parse tree as returned by |
matches |
function that is called for each node of the tree. Give a
function here that returns |
dbg |
if |
path |
for internal use |
parent |
for internal use |
index |
for internal use |
vector of character or NULL
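The general idea of collecting nodes with a predicate function can be sketched in base R as follows. The names `is_call_node` and `collect_matches` are hypothetical. The sketch does not use `extract_from_parse_tree` and makes no claim about the arguments that the package passes to `matches`.

```r
# Hypothetical predicate: TRUE for nodes that are function calls
is_call_node <- function(node) is.call(node)

# Hypothetical collector (a sketch, not extract_from_parse_tree itself):
# visit every node and keep the deparsed function name of matching nodes.
# Limitation: empty arguments (as in x[, 1]) are not handled.
collect_matches <- function(x, matches = is_call_node) {
  found <- if (matches(x)) deparse(x[[1L]])[1L] else NULL
  if (is.call(x) || is.expression(x)) {
    for (child in as.list(x)) {
      found <- c(found, collect_matches(child, matches))
    }
  }
  found
}

tree <- parse(text = "y <- sqrt(abs(x)); print(y)")
collect_matches(tree)
```

Replacing the predicate changes what is collected, which is the role that `matches` plays in the description above.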
Show String Constants Used in R Scripts
find_string_constants(root = "./R")
root |
path from which to look recursively for R scripts |
Find weaknesses in R scripts
find_weaknesses_in_scripts(
  x = parse_scripts(root),
  root = NULL,
  min_duplicate_string_length = 6L,
  min_duplicate_frequency = 3L
)
x |
list of named parse trees as returned by
|
root |
path to folder containing R scripts |
min_duplicate_string_length |
minimum number of characters that a string constant must have to be considered as a duplicate |
min_duplicate_frequency |
minimum frequency of a string constant to be considered as a duplicate |
data frame with columns file, expression,
frequency, recommendation
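The two thresholds can be illustrated with a small base R sketch that reports string constants of at least a given length that occur at least a given number of times. The helpers `collect_strings` and `find_duplicate_strings` are hypothetical and only approximate the duplicate-string check. The other weaknesses that the function may report are not covered.

```r
# Hypothetical helper: collect all string constants from a parse tree.
# Sketch only: empty arguments (as in x[, 1]) and default values in
# function definitions are not handled.
collect_strings <- function(x) {
  if (is.character(x)) {
    return(x)
  }
  if (is.call(x) || is.expression(x)) {
    return(unlist(lapply(as.list(x), collect_strings)))
  }
  NULL
}

# Hypothetical helper: report strings that are long and frequent enough
find_duplicate_strings <- function(tree, min_length = 6L, min_frequency = 3L) {
  strings <- as.character(collect_strings(tree))
  counts <- table(strings[nchar(strings) >= min_length])
  counts <- counts[counts >= min_frequency]
  data.frame(
    string = as.character(names(counts)),
    frequency = as.integer(counts),
    stringsAsFactors = FALSE
  )
}

code <- '
a <- read("input.csv")
b <- check("input.csv")
c <- write("input.csv", "out")
'
find_duplicate_strings(parse(text = code))
```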
Extract Sections of Same "Type" from Parse Tree
get_elements_by_type(x, result = NULL, dbg = TRUE)
x |
parse tree as returned by |
result |
optional. Result as returned by |
dbg |
if |
# Parse an R script file (here, a file from kwb.utils)
x <- parse("https://raw.githubusercontent.com/KWB-R/kwb.utils/master/R/log.R")

# For each "type" of code segment, extract all occurrences
elements <- get_elements_by_type(x)

# Show all for-loops
elements$`language|call|for|4|`

# Show all if-statements
elements$`language|call|if|3|`

# Show all if-else-statements
elements$`language|call|if|4|`
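The element names used above look like a combination of basic properties of each expression: type, mode, class and length. The sketch below rebuilds keys of this shape in base R. That the package composes its keys in exactly this way is an assumption, and `type_key` is a hypothetical helper.

```r
# Hypothetical helper: build a key from basic properties of an expression
type_key <- function(x) {
  paste(typeof(x), mode(x), class(x)[1L], length(x), "", sep = "|")
}

# A for-loop has four elements: `for`, the variable, the sequence, the body
type_key(quote(for (i in 1:3) print(i)))

# An if-statement has three elements, an if-else-statement has four
type_key(quote(if (a) b))
type_key(quote(if (a) b else c))
```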
Get information on function definitions in parsed R scripts
get_full_function_info(trees)
trees |
list of R script parse trees as provided by
|
Which Function is Called How Often?
get_function_call_frequency(tree, simple = FALSE, dbg = TRUE)
tree |
parse tree as returned by |
simple |
if |
dbg |
if |
data frame with columns name (name of function), count
(number of times the function is called)
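A result of the same shape can be approximated with the token table that base R provides via `utils::getParseData()`. The helper `count_function_calls` is hypothetical. It works on code given as text and is not the method used by `get_function_call_frequency`.

```r
# Hypothetical helper: count function calls via the parser's token table
count_function_calls <- function(code) {
  parsed <- parse(text = code, keep.source = TRUE)
  tokens <- utils::getParseData(parsed)
  calls <- tokens$text[tokens$token == "SYMBOL_FUNCTION_CALL"]
  counts <- sort(table(calls), decreasing = TRUE)
  data.frame(
    name = as.character(names(counts)),
    count = as.integer(counts),
    stringsAsFactors = FALSE
  )
}

count_function_calls("x <- c(1, 2); y <- c(3, 4); print(sum(x, y))")
```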
Get Names of Packages Used in R-Scripts
get_names_of_used_packages(root_dir, pattern = "[.][rR](md)?$")
root_dir |
directory in which to look recursively for R-scripts |
pattern |
regular expression matching the names of the files to be considered |
How Often Are the Functions of a Package Used?
get_package_function_usage(tree, package, simple = FALSE, by_script = FALSE)
tree |
parse tree as returned by |
package |
name of the package (must be installed) |
simple |
if |
by_script |
if |
data frame with columns name (name of the function),
prefixed (number of function calls prefixed with <package>::
or <package>:::), non_prefixed (number of function calls
that are not prefixed with the package name) and total (total
number of function calls)
# Read all scripts that are provided in the kwb.fakin package
tree <- kwb.code::parse_scripts(root = system.file(package = "kwb.fakin"))

# Check which functions from kwb.utils are used and how often
get_package_function_usage(tree, package = "kwb.utils")

# Hm, this does not seem to be the whole truth...
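The difference between prefixed and non-prefixed calls can be sketched with `utils::getParseData()`. The helper `count_package_usage` is hypothetical and is not the package's implementation. It treats a non-prefixed call as belonging to the package whenever the package exports a function of that name, so functions masked by other packages or by user code are miscounted.

```r
# Hypothetical helper (a sketch, not the package's implementation)
count_package_usage <- function(code, package) {
  parsed <- parse(text = code, keep.source = TRUE)
  tokens <- utils::getParseData(parsed)
  # Keep only terminal tokens, in the order in which they appear in the code
  tokens <- tokens[tokens$terminal, ]
  tokens <- tokens[order(tokens$line1, tokens$col1), ]
  is_call <- which(tokens$token == "SYMBOL_FUNCTION_CALL")
  called <- tokens$text[is_call]
  # TRUE if the call directly follows "::" or ":::"
  has_prefix <- vapply(is_call, function(i) {
    i > 1L && tokens$token[i - 1L] %in% c("NS_GET", "NS_GET_INT")
  }, logical(1))
  # TRUE if that prefix is the package of interest
  own_prefix <- has_prefix & vapply(is_call, function(i) {
    i > 2L && tokens$text[i - 2L] == package
  }, logical(1))
  # Unprefixed calls count only if the package exports that name
  unprefixed <- !has_prefix & called %in% getNamespaceExports(package)
  keep <- own_prefix | unprefixed
  fun <- factor(called[keep])
  prefixed <- as.integer(tapply(own_prefix[keep], fun, sum))
  non_prefixed <- as.integer(tapply(unprefixed[keep], fun, sum))
  data.frame(
    name = levels(fun),
    prefixed = prefixed,
    non_prefixed = non_prefixed,
    total = prefixed + non_prefixed,
    stringsAsFactors = FALSE
  )
}

code <- "stats::median(x); median(y); stats:::sd(z); print(1)"
count_package_usage(code, package = "stats")
```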
Get Package Usage per Script
get_package_usage_per_script(root, packages, pattern = "\\.R$", ...)
root |
root directory with R scripts |
packages |
vector with package names to be checked |
pattern |
default: "\\.R$" |
... |
additional arguments passed to |
tibble with information on used packages
Get Frequency of String Constant Usage in R Scripts
get_string_constants_in_scripts(
  root,
  scripts = dir(root, "\\.[Rr]$", recursive = TRUE),
  two_version_check = TRUE,
  FUN = NULL
)
root |
path to folder in which to look for R scripts |
scripts |
optional. Paths to R scripts in which to search for string
constants, relative to |
two_version_check |
if |
FUN |
optional. Function used to browse the code tree for string
constants. If |
data frame with columns file_id (file identifier),
string (string constant found in the file) and count (number
of occurrences of the string counted in the file). The file identifier can
be resolved to a full file name using the "file database" that is stored in
the attribute "file_db".
root <- system.file(package = "kwb.code")
constants <- get_string_constants_in_scripts(root)

# Get paths to files from "file database" stored in attribute "file_db"
kwb.utils::getAttribute(constants, "file_db")
Parse all given R scripts into a tree structure
parse_scripts(
  root,
  scripts = dir(root, "\\.R$", ignore.case = TRUE, recursive = TRUE),
  dbg = TRUE
)
root |
root directory to which the relative paths given in
|
scripts |
relative file paths to R scripts. By default all files ending
with ".R" or ".r" below the |
dbg |
if |
## Not run: 
# Download some example code files from github...
url.base <- "https://raw.githubusercontent.com/hsonne/blockrand2/master/R/"
urls <- paste0(url.base, c("blockrand2_create.R", "blockrand2_main.R"))

targetdir <- file.path(tempdir(), "blockrand2")
targetdir <- kwb.utils::createDirectory(targetdir)

for (url in urls) {
  download.file(url, file.path(targetdir, basename(url)))
}

# By default, all R scripts below the root are parsed
trees <- parse_scripts(root = targetdir)

# All elements of trees are expressions
sapply(trees, is.expression)

# Analyse the scripts on the script level
scriptInfo <- to_full_script_info(trees)

scriptInfo

# Analyse the scripts on the function level
functionInfo <- get_full_function_info(trees)

functionInfo

## End(Not run)
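What parsing all scripts below a root directory amounts to can be sketched in base R without network access. The helper `parse_all_scripts` and the folder name `parse_sketch` are hypothetical. The sketch mirrors the default file pattern shown in the usage above but is not the implementation of `parse_scripts`.

```r
# Hypothetical helper: parse all R scripts below a root directory
parse_all_scripts <- function(root) {
  scripts <- dir(root, "\\.R$", ignore.case = TRUE, recursive = TRUE)
  trees <- lapply(file.path(root, scripts), parse, keep.source = FALSE)
  # Name each parse tree by the relative path of its script
  stats::setNames(trees, scripts)
}

# Write two small scripts into a temporary folder and parse them
root <- file.path(tempdir(), "parse_sketch")
dir.create(root, showWarnings = FALSE)
writeLines("f <- function(x) x + 1", file.path(root, "a.R"))
writeLines(c("y <- 1", "print(y)"), file.path(root, "b.r"))

trees <- parse_all_scripts(root)

# Each element is an expression with one entry per top-level statement
sapply(trees, is.expression)
sapply(trees, length)
```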
Get script statistics from a list of R script trees
to_full_script_info(trees)
trees |
list of R script parse trees as provided by
|
Walk Along a Parse Tree
walk_tree(
  x,
  path = "",
  depth = 0L,
  max_depth = 20L,
  dbg = TRUE,
  config = list(),
  context = NULL
)
x |
parse tree as returned by |
path |
for internal use only. Path to the element in the parse tree. |
depth |
for internal use only. Recursion depth. |
max_depth |
maximum recursion level. Default: 20L |
dbg |
whether or not to show debug messages |
config |
list defining modifications of nodes in the node tree. TODO: describe further |
context |
if not |
walk_tree(parse(text = "x <- 1:n"))