--- title: "rolog: Prolog queries from R" author: "Matthias Gondan (Department of Psychology, Universität Innsbruck, Austria) and Jan Wielemaker (Vrije Universiteit Amsterdam/SWI-Prolog Solutions b.v.)" date: "2023-01-27" bibliography: bibliography.bibtex output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{rolog: Prolog queries from R} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include=FALSE} knitr::opts_chunk$set(collapse=TRUE, comment="#>") library(htmltools) library(DiagrammeR) library(DiagrammeRsvg) ``` Matthias Gondan\ Universität Innsbruck\ Department of Psychology\ Innrain 9\ A-6020 Innsbruck\ Matthias.Gondan-Rochon@uibk.ac.at # Abstract Prolog is a classical logic programming language with many applications in expert systems, computer linguistics and traditional, that is, symbolic artificial intelligence. The main strength of Prolog is its concise representation of facts and rules for the representation of knowledge and grammar, as well as its efficient built-in search engine for closed world domains. R is a statistical programming language for data analysis and statistical modeling which is widely used in academia and industry. Besides the core library, a lot of packages have been developed for all kinds of statistical problems, including statistics-based artificial intelligence tools such as neural networks for machine learning and deep learning. Whereas Prolog is weak in statistical computation, but strong in symbolic manipulation, the converse may be said for the R language. SWI-Prolog is a widely used Prolog system that offers a wide range of extensions for real world applications, and there already exist two Prolog "packs" to invoke R (`rserve-client`, `real`) from SWI-Prolog. Given the large user community of R, there may also be a need for a connection in the reverse direction that allows invoking Prolog queries in R computations. The R\ package `rolog` connects to the SWI-Prolog system, thus enabling deterministic and non-deterministic queries to the Prolog interpreter. Usage of `rolog` is illustrated by a few examples. ## Keywords Statistics; Logic Programming; Artificial Intelligence; R; Prolog # 1. rolog: Prolog queries from R The R [@R] programming language and environment is a widely used open source software for statistical data analysis. The basic R is a functional language with lots of support for storage and manipulation of different data types, and a strong emphasis on operations involving vectors and arrays. Moreover, a huge number of packages (e.g., CRAN, https://cran.r-project.org/) have been contributed that cover problems from areas as diverse as bioinformatics, machine learning, specialized statistical methods, web programming and connections to other programming languages. An interface to Prolog is lacking so far. Based on earlier work by Kowalski, the logic programming language Prolog was invented in the 1970ies by Colmerauer and Roussel [@Kowalski1988], mostly for the purpose of natural language processing. Since then, logic programming has become an important driving force in research on artificial intelligence, natural language processing, program analysis, knowledge representation and theorem proving [@Shoham1994;@Lally2011;@Carro2004;@Hsiang1987]. SWI-Prolog [@Wielemaker2012] is an open-source implementation of Prolog that mainly targets developers of applications, with many users in academia, research and industry. SWI-Prolog includes a large number of libraries for "the real world", for example, a web server, encryption, interfaces to C/C++ and other programming languages, as well as a development environment and debugger. In addition, pluggable extensions (so-called packs) are available for specific tasks to enhance its capabilities. Unlike R, Prolog is a declarative programming language consisting of facts and rules that define relations, for example, in a problem space [@Newell1972]. Prolog's major strength is its built-in query-driven search engine that efficiently deals with complex structured data, with the data not necessarily being numerical. In fact, Prolog only provides a basic collection of arithmetic calculations via a purely functional interface (`is/2`). More complex calculations such as matrix algebra, statistical models or machine learning need help from other systems, for example, from R. Angelopoulos et al. [-@Angelopoulos2013] summarize work at the intersection of symbolic knowledge representation and statistical inference, especially in the area of model fits [EM algorithms, MCMC, @Sato2001;@Angelopoulos2008] and stochastic logic programs [@Cussens2000;@Kimmig2011]. One of the major strengths of logic programming is handling constraints; and a number of systems for constraint satisfaction tools have been developed (constraint logic programming on booleans, finite domains, reals, and intervals) for that purpose [e.g., @Fruehwirth1998;@Triska2018]. Some constraint handlers exist in R (see the CRAN task view for optimization problems), but more of them would be available via a bridge between R and Prolog. Earlier approaches to connect Prolog and R have been published as SWI-Prolog packs [real, rserve_client, @Angelopoulos2013;@Rserve] and as a YAP module [YapR, @YapR]. Whereas `real` establishes a direct link to an embedded instance of R, `rserve-client` communicates with a local or remote R service [@Urbanek2021]. The former approach emphasizes speed, the latter might be preferred from a security perspective, especially in systems such as SWISH [@SWISH] that accept only a set of sandboxed commands for Prolog, but do not impose restrictions on R. A common feature of the two packages is that they provide an interface for R calls from Prolog, but not the other way round, that is, querying Prolog from R is not possible, so far. The present package fills this gap through Prolog queries in R scripts, for example, to perform efficient symbolic computations, searches in complex graphs, parsing natural language and definite clause grammars. In addition, two Prolog predicates are provided that enable Prolog to ring back to the R system for bidirectional communication. Similar to `real`, tight communication between the two systems is established by linking to a shared library that embeds the current SWI-Prolog runtime. The exchange of data is facilitated by the C++ interfaces of the two languages [@Edelbuettel2018;@Wielemaker2021]. A less tight connection might be established using the recently developed machine query interface [@Zinda2021] that allows socket-based communication between foreign languages and SWI-Prolog (and, in fact, the `MQI` documentation includes an example in which R is called). A bidirectional bridge between R and Prolog might overcome the limitations of both languages, thereby combining the extensive numerical and statistical power of the R system with Prolog's skills in the representation of knowledge and reasoning. In addition to the useful little tools shown in the examples below, `rolog` can therefore contribute to progress at the intersection of traditional artificial intelligence and contemporary statistical programming. The next section presents the interface of `rolog` in detail. Section\ 3 presents possible extensions of the package at both ends, in R and Prolog. Section\ 4 is a list of illustrative examples that offer useful extensions to the R system. Conclusions and further perspectives are summarized in Section\ 5. # 2. Basic syntax `rolog` has a rather minimalistic syntax, providing only the basic ingredients to establish communication with the SWI-Prolog runtime. Ways to extend the interface are described in Section\ 3. After installation with `install.packages("rolog")`, the package is loaded in the standard way. ```{r} library(rolog) ``` We can see a short message telling the user which SWI-Prolog was found. The package searches for SWI-Prolog based on the environment variable `SWI_HOME_DIR`, the registry (Windows only), an executable `swipl` in the `PATH`, and if everything fails, R package `rswipl` [@rswipl]. The message can be silenced by the usual option `quietly=TRUE` of the `library` command. ## R interface Most of the work is done using the three R\ functions `query`, `submit`, and `clear`. The R program in Listing\ 1 illustrates a query to Prolog's `member/2` using `rolog`'s syntax rules. ```{r} # member(1, [1, 2.0, a, "b", X, true]) query(call("member", 1L, list(1L, 2.0, quote(a), "b", expression(X), TRUE))) # returns an empty list, stating that member(1, [1 | _]) is satisfied submit() # returns a list with constraints, stating that the query is also satisfied # if the fifth element of the list, X, is 1 submit() # close the query clear() ``` Listing\ 1. : A query to Prolog's `member/2` predicate. `query`. The function `query(call, options)` is used to create a Prolog query (without invoking it yet). The first argument is a regular R call that is created using R's function `call(name, ...)`. This call represents the Prolog query that will be submitted in the later course. The creation of such predicates and Prolog terms is described below and can become quite contrived (see the examples in Section\ 4). The second argument, `options`, may be used for ad hoc modifications of the translation between R and Prolog, see the section below. The function returns `TRUE` on success. Note that `query` does not check if a Prolog predicate corresponding to `call` actually exists (see `submit()` below). Only a single query can be opened at a given time. If a new query _Q_ is created while another query _R_ is still open, a warning is shown and _R_ is closed. `submit`. Once a query has been created, it can be submitted using `submit()`. If the query fails, the return value is `FALSE`. If the query succeeds, a list of constraints is returned, with bindings for the variables that satisfy the query. Repeated calls to submit are possible, returning the different solutions of a query (until it eventually fails). The distinction between the different types of return values for success and failure (list vs. `FALSE`) is facilitated by the R function `isFALSE(x)`. `clear`. Closes the query. The name of the function was chosen to avoid name clashes with R's own built-in function `close`. The function returns an invisible `TRUE`, even if there is no open query. Three more functions `consult`, `once`, and `findall` are provided for convenience. `consult`. In most applications, a number of Prolog facts and rules will be loaded into the system. To facilitate this recurrent task, the Prolog directive `consult/1` has been mirrored into R, `consult(filename)`, with `filename` being a string or a vector of strings if multiple files are to be consulted. The function returns `TRUE` on success; in case of problems, it returns `FALSE` and an error message is shown. `once` and `findall`. The function `once(call, options)` is a convenience function that acts as a shortcut for `query(call, options)`, `submit()`, and `clear()`. Similarly, `findall(call, options)` abbreviates the commands `query(call, options)`, repetition of `submit()` until failure, and `clear()`, returning a list collecting the return values of the individual submissions. ## Creating Prolog terms in R Table\ 1 summarizes the rules for the translation from R objects to Prolog. Most rules work in both directions, but a few exceptions exist. Table\ 1 : Creating Prolog terms from R |R |Prolog |Note/Alternatives | |:-----------------------|:-------------------------|:------------------------| |`expression(X)` |Variable X |not necessarily uppercase| |`as.symbol(abc)` |Atom abc |`as.name`, `quote` | |`TRUE`, `FALSE`, `NULL` |Atoms true, false, null | | |`"abc"` |String \"abc\" | | |`3L` |Integer 3 | | |`3` |Float 3.0 | | |`call("term", 1L, 2L)` |term(1, 2) | | |`list(1L, 2L, 3L)` |List [1, 2, 3] | | |`list(a=1, b=2, c=3)` |List [a-1, b-2, c-3] | | |`c(1, 2, 3, Inf)` |##(1.0, 2.0, 3.0, 1.0Inf) |vectors of length > 1 | |`c(1L, 2L, 3L)` or `1:3`|\'%%\'(1, 2, 3) | | |`c("a", "b", "c")` |\$\$(\"a\", \"b\", \"c\") | | |`c(TRUE, FALSE, NA)` |!!(true, false, na) | | |`sin` |function(x) :- sin(x) |primitive function | |`function(x) sin(x)` |function(x) :- sin(x) |self-written function | |`matrix(1:4, nrow=2)` |\'%%%\'(\'%%\'(1, 3), ...)|see also ###, \$\$\$, !!!| In R, the basic elements such as integers, floating point numbers, character strings, and logicals are vectorized, and scalar entities are treated like vectors with one element. Conversely, Prolog does not natively support vectors or matrices. The problem is solved in the following way: * R vectors of length 0 are translated to Prolog's empty list. * R vectors of length 1 are translated to Prolog scalars. * R vectors of length $N > 1$ are translated to Prolog terms `##/N`, `%%/N`, `$$/N`, and `!!/N` for floating point numbers, integers, strings and logicals, respectively. * R matrices are translated to Prolog terms `###/R`, `%%%/R`, `$$$/R`, and `!!!/R` with the respective row vectors as arguments. In the reverse direction, Prolog terms like `##/N` are translated back to R vectors of length _N_, including the terms `##/0` and `##/1` that map to R vectors of length 0 and 1, respectively. Translation of a polymorphic Prolog term such as `##(a, 1.5)` to R will fail, since `rolog` expects the arguments to be numeric. If a Prolog object cannot be translated to R (e.g., a cyclic term), an error is raised. If an R object that lacks a suitable representation in Prolog (e.g., S4 class), a warning is printed and the result is unified with `na`. To summarize, the rules for translation are not fully symmetrical. A quick check for symmetry of the representation is obtained by a query to `=/2` or even `r_eval/2` (see also below, subsection Prolog interface): ```{r} Q <- call("=", expression(X), c(1, 2, NA, NaN, Inf)) once(Q, options=list(portray=TRUE)) Q <- call("r_eval", c(1, 2, NA, NaN, Inf), expression(X)) once(Q) ``` The optional argument `env` to query, once and findall allows to raise the query (and, as a consequence, r_eval/1,2 in a specific environment. ## Package options A few package-specific options have been defined to allow some fine-tuning of the rules for translation between R and Prolog. * *realvec* (string): Name of the Prolog term for vectors of floats (default is `##`) * *realmat*: Name of the Prolog term for matrices of floats (default is `###`) * *intvec*/*intmat*: same for vectors/matrices of integers (defaults are `%%`/`%%%`) * *boolvec*/*boolmat*: same for vectors/matrices of logicals (defaults are `!!`/`!!!`) * *charvec*/*charmat* (string): same for vectors/matrices of character strings (defaults are `$$`/`$$$`). The single dollar cannot be used because it is the list operator in R. * *scalar* (logical): if `TRUE` (default), R vectors of length\ 1 are translated to scalars in Prolog. If `FALSE` (rarely used), R\ vectors are always translated to `##/N`, or `%%/N`, `!!/N`, `$$/N`, even if they have only one element. * *portray* (logical): if `TRUE` (default in `query`), the result of `query`, `once` and `findall` includes an attribute with a text representation of the query in Prolog. * *preproc* (function with one argument): R hook that can be used to preprocess R terms before translation. The default is `rolog`'s own `preproc` function that maps R's `x <= y` to Prolog's `x =< y` and `!=` to `\=`. Preprocessing can be turned off by assigning the R function `dontCheck` to the preproc option. * *postproc* (function with one argument): R hook that can be used to postprocess R terms after a query. The default is `rolog`'s own `postproc` function that reverses the mapping from `preproc`. The command `rolog_options()` returns a list with all the options. The options can be globally modified with `options()` or in the optional argument of `query`, `once`, and `findall`. ```{r} options(rolog.intvec="iv") Q <- call("member", expression(X), list(c(1L, 2L), c(3.5, 4.5))) query(Q, options=list(realvec="rv")) submit() clear() ``` ## Prolog interface to R `rolog` offers some basic support to call R from Prolog, that is, connecting the two systems in the reverse direction. Two predicates can be used for this purpose, `r_eval(Call)` and `r_eval(Function, Result)`. The former just invokes R with the command `Call` (ignoring the result); the latter evaluates `Function` and unifies the result with `Result`. Note that proper quoting of R functions is needed at the Prolog end, especially with R functions that start with uppercase letters and/or contain a dot in their name (see Section\ 4). ## Exceptions Package `rolog` has limited support for exception handling. If Prolog raises an exception, the error string is forwarded to R using the `stop` function. The examples below illustrate this by querying an undefined Prolog predicate. ```{r} Q <- call("membr", expression(X), list(1, 2, 3)) query(Q) try(submit()) clear() ``` See Section\ 4 for another example with an error resulting from a malformed query to `r_eval/2`. # 3. Extending the package R is a functional language, whereas Prolog is declarative. Obviously, there cannot be a perfect one-to-one correspondence between the syntactic components of two programming languages that follow completely different paradigms. Whereas symbols, functions, numbers and character strings are easily mapped between R and Prolog, there are loose ends at both sides. The package is intentionally kept minimalistic, but can easily be extended by convenience functions at both ends, Prolog and R, to facilitate recurrent tasks and/or avoid cumbersome syntax. In particular, Prolog variables are translated from and to R *expressions* (not to be confused with R symbols), and R vectors of length greater than 1 are translated to the Prolog terms `#/N`, `%/N`, `!/N`, and `$$/N`, as mentioned above. These rules are, in principle, arbitrary and can be intercepted at several stages. * R functions that may be used to pre-process specific R elements before translation to Prolog (see, e.g., the R function `as.rolog`) * Prolog wrappers that manipulate the term before it is called and afterwards (see the example with dicts below) * R functions that post-process the result of a query The process is illustrated in Figure\ 1. ```{r, echo=FALSE, fig.width=6, fig.height=2} HTML(export_svg(grViz( 'digraph G { rankdir=LR Query Result subgraph cluster_0 { style=filled color=lightgrey node [style=filled,color=white] r2rolog -> forth -> rolog_pl } subgraph cluster_1 { style=filled color=lightgrey node [style=filled,color=white] rolog2r -> back [dir=back] back -> pl_rolog [dir=back] } Query -> r2rolog rolog_pl:e -> Prolog pl_rolog:e -> Prolog [dir=back] Result -> rolog2r [dir=back] Query [shape=Mdiamond;width=0.7;height=0.7] r2rolog [shape=rect,label="preproc"] forth [label="(rolog)"] rolog_pl [shape=rect,label="preproc/2"] Prolog [shape=Mcircle] pl_rolog [shape=rect,label="postproc/2"] rolog2r [shape=rect,label="postproc"] back [label="(rolog)"] Result [shape=Msquare] }'))) ``` Figure\ 1 : Workflow in rolog ## Preprocessing in R `rolog` uses a default preprocessing function `preproc(query)` to map the R operators `<=` and `!=` to their Prolog counterparts `= np(NP, C), blank, vp(VP, C). np(NP, C) --> pn(NP, C). np(np(Det, N), C) --> det(Det, C), blank, n(N, C). np(np(Det, N, PP), C) --> det(Det, C), blank, n(N, C), blank, pp(PP). vp(vp(V, NP), C) --> v(V, C), blank, np(NP, _). vp(vp(V, NP, PP), C) --> v(V, C), blank, np(NP, _), blank, pp(PP). pp(pp(P, NP)) --> p(P), blank, np(NP, _). det(det(a), sg) --> `a`. det(det(the), _) --> `the`. pn(pn(john), sg) --> `john`. n(n(man), sg) --> `man`. n(n(men), pl) --> `men`. n(n(telescope), sg) --> `telescope`. v(v(sees), sg) --> `sees`. v(v(see), pl) --> `see`. p(p(with)) --> `with`. % Translate R string to code points and invoke phrase/2 sentence(Tree, Sentence) :- string_codes(Sentence, Codes), phrase(s(Tree), Codes). ``` Listing\ 3 : Simple grammar and lexicon. `sentence/2` preprocesses the R call. As in the first example, we first consult a little Prolog program with a minimalistic grammar and lexicon (Listing\ 3, see also `pl/telescope.pl`), and then raise a query asking for the syntactic structure of "john sees a man with a telescope". Closer inspection of the two results reveals the two possible meanings, "john sees a man *who carries* a telescope" versus "john sees a man *through* a telescope". Further Prolog examples of natural language processing are found in \citet{Blackburn2005}, including the resolution of anaphoric references and the extraction of semantic meaning. ```{r} consult(system.file(file.path("pl", "telescope.pl"), package="rolog")) Q <- quote(sentence(.Tree, "john sees a man with a telescope")) unlist(findall(Q, options=list(preproc=as.rolog))) ``` ## Installation of add-ons for Prolog In description of the previous example, we noted in passing that `rolog` can access the built-in libraries of SWI-Prolog (e.g., by calls to `use_module/1`). It is also possible to extend the installation by add-ons, including add-ons that require compilation, if the build tools (essentially, RTools under Windows, and xcode under macOS) are properly configured. This is illustrated below by the demo add-on `environ` [@Environ] that collects the current environment variables. ```r once(call("pack_install", quote(environ), list(quote(interactive(FALSE))))) once(quote(use_module(library(environ)))) once(call("environ", expression(X))) ``` The query then unifies *X* with a list with `Key=Value` terms. The purpose of this example is obviously not to mimic the built-in function `Sys.getenv()` from R, but to illustrate the installation and usage of Prolog extensions from within R. In most situations, the user would install the pack from within Prolog with `pack_install(environ).`. ## Term manipulation Prolog is homoiconic, that is, code is data. In this example, we make use of Prolog's ability to match expressions against given patterns and modify these expressions according to a few predefined "buggy rules" [@Brown1978], inspired by recurrent mistakes in the statistics exams of our students. Consider the $t$-statistic for comparing an observed group average to a population mean: $$ T = \frac{\overline{X} - \mu}{s / \sqrt{N}} $$ Some mistakes may occur in this calculation, for example, omission of the implicit parentheses around the numerator and the denominator when typing the numbers into a calculator, resulting in $\overline{X} - \frac{\mu}{s} \div \sqrt{N}$, or forgetting the square root around $N$, or both. Prolog code for the two buggy rules is given in Listing\ 4. ```prolog % Correct steps and mistakes expert(tratio(X, Mu, S, N), frac(X - Mu, S / sqrt(N))). buggy(frac(X - Mu, S / SQRTN), X - frac(Mu, S) / SQRTN). buggy(sqrt(N), N). % Apply expert and buggy rules, or enter expressions step(X, Y) :- expert(X, Y) ; buggy(X, Y). step(X, Y) :- compound(X), mapargs(search, X, Y), dif(X, Y). % Search through problem space search(X, X). search(X, Z) :- step(X, Y), search(Y, Z). ``` Listing 4 : Manipulating terms in Prolog The little e-learning system shown in Listing\ 4 produces six response alternatives. The fourth and the sixth result are combinations of the same two buggy rules (parenthesis, then square root, and the other way round). Some additional filters would be needed to eliminate trivial and redundant solutions \citep[see, e.g., the chapter on generate-and-test in][]{Sterling1994}. ```{r} consult(system.file(file.path("pl", "buggy.pl"), package="rolog")) Q <- quote(search(tratio(x, mu, s, n), .S)) unlist(findall(Q, options=list(preproc=as.rolog))) ``` An important feature of such a term manipulation is that the evaluation of the term can be postponed; for example, there is no need to instantiate the variables *x*, *mu*, *s*, and *n* with given values before raising a query. This is especially helpful for variables that may represent larger sets of data in later steps. It should be mentioned that R is homoiconic, too, and the Prolog code above can, in principle, be rewritten in R using non-standard evaluation techniques [@Wickham2019]. Prolog's inbuilt pattern matching algorithm simplifies things a lot, though. ## Rendering mathematical expressions The R extension of the markdown language [@Xie2020] enables reproducible statistical reports with nice typesetting in HTML, Microsoft Word, and Latex. However, so far, R expressions such as `pbinom(k, N, p)` are typeset as-is; prettier mathematical expressions such as $P_\mathrm{Bi}(X \le k; N, p)$ require Latex commands like `P_\mathrm{Bi}\left(X \le k; N, p\right)`, which are cumbersome to type in and hard to read even if the expressions are simple. Since recently, manual pages include support for mathematical expressions [@Sarkar2022], which already is a big improvement. Below Prolog's grammar rules are used for an _automatic_ translation of R calls to MathML. The result can then be used for calculations or it can be rendered on a web page. A limited set of rules for translation from R to MathML is found in `pl/mathml.pl` of package `rolog`. A more comprehensive translator is provided by the R package `mathml` [@mathml]. The relevant code snippets are shown in the listings below, along with their output. ```{r} library(rolog) consult(system.file(file.path("pl", "mathml.pl"), package="rolog")) # R interface to Prolog predicate r2mathml/2 mathml <- function(term) { t <- once(call("r2mathml", term, expression(X))) cat(paste(t$X, collapse="")) } ``` Listing 4 : Generate MathML from R expressions The first example is easy. At the Prolog end, there is a handler for `pbinom/3` that translates the term into a pretty MathML syntax like P_bi(X <= k; N, pi). ```{r, results="asis"} term <- quote(pbinom(k, N, p)) # Pretty print mathml(term) # Do some calculations with the same term k <- 10 N <- 22 p <- 0.4 eval(term) ``` The next example is interesting because Prolog needs to find out the name of the integration variable for `sin`. For that purpose, rolog provides a predicate `r_eval/2` that calls R from Prolog (i.e., the reverse direction, see also next example). Here, the predicate is used for the R\ function `formalArgs(args(sin))`, which returns the name of the function argument of `sin`, that is, `x`. ```{r, results="asis"} term <- quote(integrate(sin, 0L, 2L*pi)) mathml(term) eval(term) ``` Note that the Prolog end, the handler for `integrate/3` is rather rigid; it accepts only these three arguments in that particular order, and without names, that is, `integrate(sin, lower=0L, upper=2L * pi)` would not print the desired result. The extra R function `canonical()` applies `match.call()` to non-primitive R calls, basically cleaning up the arguments and bringing them into the correct order. Moreover, an extra handler maps the extractor function `$(Fn, "value")` to `Fn`. ```{r, results='asis'} canonical <- function(term) { if(is.call(term)) { f <- match.fun(term[[1]]) if(!is.primitive(f)) term <- match.call(f, term) # Recurse into arguments term[-1] <- lapply(term[-1], canonical) } return(term) } g <- function(u) sin(u) # Mixture of (partially) named and positional arguments in unusual order term <- quote(2L * integrate(low=-Inf, up=Inf, g)$value) mathml(canonical(term)) # It is a bit of a mystery that R knows the result of this integral. eval(term) ``` Note that both `sin` nor `g` in the above terms are R symbols, not R functions. In order to render something like `call("integrate", low=-Inf, up=Inf, g)`, or `call("integrate", low=-Inf, up=Inf, sin)`, with `g` and `sin` referring to the respective functions, one would need to determine its name, which is not possible in general. ```{r} print(g) ``` ## Calling R from Prolog The basic workflow of the bridge from R to Prolog is to (A)\ translate an R\ expression into a Prolog term (i.e., a predicate), (B)\ query the predicate, and then, (C)\ translate the result (i.e., the bindings of the variables) back to R (see also Figure\ 1). The reverse direction is straightforward, we start by translating a Prolog term to an R\ expression (i.e. Step\ C), evaluate the R\ expression, and then translate the result back to a Prolog\ term (Step\ A). Package `rolog` provides two predicates for that purpose, `r_eval(Expr)` and `r_eval(Expr, Res)`. The former is used to invoke an R\ expression `Expr` for its side effects (e.g., initializing a random number generator); it does not return a result. The latter is used to evaluate the R\ expression and return the result `Res`. The code snippet in Listing\ 6 (`r_eval.pl`) illustrates this behavior. ```prolog r_seed(Seed) :- r_eval('set.seed'(Seed)). r_norm(N, L) :- r_eval(rnorm(N), L). ``` Listing\ 6 : Calling R from Prolog using `r_eval/1` and `r_eval/2`. The R\ call `set.seed` is quoted because the dot is an operator in Prolog. ```{r} consult(system.file(file.path("pl", "r_eval.pl"), package="rolog")) invisible(once(call("r_seed", 123L))) once(call("r_norm", 3L, expression(X))) ``` The example in Listing\ 6 is a bit trivial, basically illustrating the syntax and the workflow. More serious applications of are shown in the next two sections where `r_eval/2` is used to evaluate monotonically behaving R\ functions and to obtain the names of function arguments in R. As show below, the default environment of `rolog`'s `r_eval/2` is `.GlobalEnv`, this can be changed in an optional argument to `once()`, `findall()`, and `query()`. ```{r} # Set variable in R, read in Prolog env <- new.env() with(env, a <- 1) once(call("r_eval", quote(a), expression(X)), env=env) # Set R variable in Prolog, read in R invisible(once(call("r_eval", call("<-", quote(b), 2)))) cat("b =", b) ``` If the R call raises an exception, an error is propagated to Prolog and finally to the `rolog` package: ```{r} #try(once(quote(r_eval(rnorm(-1))))) # return "-1" random normals ``` ## Interval arithmetic Let $\langle\ell, u\rangle$ denote a number between $\ell$ and $u$, $\ell\le u$. It is easily verified that the result of the difference $\langle\ell_1, u_1\rangle - \langle\ell_2, u_2\rangle$ is somewhere in the interval $\langle \ell_1 - u_2, u_1 - \ell_2\rangle$, and a number of rules exist for basic arithmetic operations and (piecewise) monotonically behaving functions [@Hickey2001]. For ratios, denominators with mixed sign yield two possible intervals, for example, $\langle 1, 2\rangle / \langle -3, 3\rangle = \langle -\infty, 3\rangle \cup \langle 3, \infty\rangle$, as shown in Figure 4 in Hickey et al.'s article. The number of possible candidates increases if more complicated functions are involved, as unions of intervals themselves appear as arguments (e.g., if $I_1 \cup I_2$ is added to $I_3 \cup I_4$, the result is $I_1 + I_3 \cup I_1 + I_4 \cup I_2 + I_3 \cup I_2 + I_4$). As a consequence, calculations in interval arithmetic are non-deterministic in nature, and the number of possible results is not foreseeable and cannot, in general, be vectorized as is often done in R. Use cases for interval arithmetic are the limitations of floating-point representations in computer hardware, but intervals can also be used to represent the result of measurements with limited precision, or truncated intermediate results of students doing hand calculations. A few rules for basic interval arithmetic are found in `pl/interval.pl`; a few examples are shown below. Again, Prolog rings back to R via `r_eval/2` to determine the result of `dbinom(X, Size, Prob, Log)`. ```{r} #consult(system.file(file.path("pl", "interval.pl"), package="rolog")) #Q <- quote(int(`...`(1, 2) / `...`(-3, 3), .Res)) #unlist(findall(Q, options=list(preproc=as.rolog))) #D <- quote(`...`(5.7, 5.8)) #mu <- 4 #s <- quote(`...`(3.8, 3.9)) #N <- 24L #tratio <- call("/", call("-", D, mu), call("/", s, call("sqrt", N))) #once(call("int", tratio, expression(Res))) # Binomial density #prob = quote(`...`(0.2, 0.3)) #once(call("int", call("dbinom", 4L, 10L, prob, FALSE), expression(Res))) ``` The slightly cumbersome syntax for entering an interval $\langle \ell, u\rangle$ is due to the fact that the ellipsis is a reserved symbol in R and cannot be used as an infix operator. A powerful and comprehensive system for constraint logic programming over intervals is available as a Prolog pack [@Workman2021] and can easily be connected to R using, for example, the present package. # 5. Conclusions R has become the primary language for statistical programming and data science, but is currently lacking support for traditional, symbolic artificial intelligence. There are already two add-ons for SWI-Prolog that allow to run R calculations from Prolog [@Angelopoulos2013;@Rserve], but a connection in the other direction was missing, so far. `rolog` bridges this gap by providing an interface to a SWI-Prolog distribution in an R package. The communication between the two systems is mainly in the form of queries from R to Prolog, but two predicates allow Prolog to ring back and evaluate terms in R. The design of the package is minimalistic, providing three main functions `query()`, `submit()`, and `clear()`, and a very limited set of convenience tools (`consult()`, `once()`, and `findall()`) to facilitate recurrent everyday actions. As both systems are homoiconic in nature, it was easy to establish a one-to-one correspondence between many of the elements of the two languages. Most exceptions (e.g., lack of R support for empty symbols) can be avoided and/or circumvented by wrapper functions at both ends. Simple ways to extend the package have been described in Section\ 2; such extensions could, for example, include R objects and structures like those returned by `lm()`, or S4 classes. In many use cases, this may be realized by transforming the R object to a list with named elements, and rebuild the object on the Prolog end on an as-needed basis. After a query, the process is reversed. If speed is an issue, more of these steps can, in principle, be moved into the package and implemented in `Rcpp`. `rolog`, thus, opens up a wide of applications in logic programming for statisticians and researchers at the intersection of symbolic and connectionist artificial intelligence, where concise knowledge representation is combined with statistical power. Moreover, `rolog` provides starting points for useful small-scale solutions for everyday issues in data science (term transformations, pretty mathematical output, interval arithmetic, see Section\ 3). At its present stage, a major limitation of `rolog` is its relatively slow speed. For example, translation of R lists or vectors to the respective elements of the Prolog language (also lists, `#/N`) is done element-wise, in both directions. The translation is optimized by using `Rcpp` [@Edelbuettel2018], but there remains an upper bound in the efficiency, because Prolog does not support vectors or matrices. Since Prolog's primary purpose is not vector or matrix calculation, this limitation may not show up in real-world applications. Another issue, maybe a bit annoying, is the rather cumbersome syntax of the interface, with the need for quoted calls and R expressions for representing Prolog variables. `rolog` was deliberately chosen to be minimalistic and, so far, only depends on base R. A more concise representation might be obtained by tools from the "Tidyverse" ecosystem, as described in Chapter\ 19 of Advanced\ R [@Wickham2019]. Finally, at this stage, `rolog` is unable to deal with cyclic terms (e.g., `once(call("=", expression(A), call("f", expression(A))))`, i.e., `A = f(A)` raises an error message). `rolog` is available for R Version 4.2 and later, and can easily be installed using the usual `install.packages("rolog")`. The source code of the package is found at https://github.com/mgondan/rolog/, including installation instructions for Unix, Windows and macOS. # Acknowledgement Development of the package profited substantially from the Prolog packs `rserve_client` [@Rserve] and `real` [@Angelopoulos2013]. # Note The results in this paper were obtained using R\ `r paste(R.Version()[7:8], collapse = ".")` with the `rolog`\ `r packageVersion("rolog")` package. R\ itself and all packages used are available from the Comprehensive R Archive Network (CRAN) at https://CRAN.R-project.org/. # References