Environment basics

Experiment: detach base package, global environment

tryCatch(detach("package:base"), error=identity)
## <simpleError in detach("package:base"): detaching "package:base" is not allowed>
tryCatch(detach(".GlobalEnv"), error=identity)
## <simpleError in detach(".GlobalEnv"): invalid 'pos' argument>

Neither works. Note the different error messages; both seem to be raised by the .Internal call that detach() makes once it has resolved pos.

ls.str: did not know this was a thing

$ and [[ vs get: get("ls") works at the prompt because it searches along the search path; the operators must be applied to a specific environment and look only in that environment.
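
A minimal sketch of that difference, using a throwaway environment e (a name introduced here only for illustration). base::get is spelled out so the example does not depend on any get defined in the session:

```r
e <- new.env()
e$x <- 1

# `$` and `[[` only see bindings stored in e itself
via_dollar   <- e$x
via_brackets <- e[["x"]]
missing_ls   <- e[["ls"]]   # NULL: the operators never look in the parents of e

# get() walks up the parents of e by default and finds ls() in base
found_ls <- base::get("ls", envir = e)

# with inherits = FALSE, get() is as strict as the operators,
# but signals an error instead of returning NULL
strict <- tryCatch(base::get("ls", envir = e, inherits = FALSE),
                   error = function(err) "not found in e")
```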

identical: note that generic function all.equal has a method for environments:

methods(class=`environment`)
## [1] all.equal  as.list    coerce     initialize
## see '?methods' for accessing help and source code

Works as usual: use identical() for logical conditions

identical(globalenv(), environment())
## [1] TRUE
all.equal(globalenv(), environment())
## [1] TRUE
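
The two functions answer different questions, which the following sketch tries to show. It assumes that the all.equal() method for environments compares the contents of the two environments, as its help page suggests; e1 and e2 are names made up for this example:

```r
e1 <- new.env()
e2 <- new.env()
e1$x <- 1
e2$x <- 1

# identical() asks: is this the very same environment object?
same_object <- identical(e1, e2)

# all.equal() asks: do the two environments hold equal bindings?
# It returns TRUE or a character description, hence the isTRUE() wrapper.
same_contents <- isTRUE(all.equal(e1, e2))
```

So for a logical condition on environments, identical() is the stricter and safer test.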

Apparently, there is no print method for environments:

class(environment())
## [1] "environment"
unclass(environment())
## <environment: R_GlobalEnv>
print(environment())
## <environment: R_GlobalEnv>

Exercises

  1. List three ways in which an environment differs from a list.

    Unordered, names must be unique, every environment has a parent (except emptyenv()), and environments have reference semantics

  2. If you don’t supply an explicit environment, where do ls() and rm() look? Where does <- make bindings?

    In the current environment (i.e. the global environment at the command line, the execution environment inside a function)
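
A small sketch of this answer; demo() and only_in_demo are hypothetical names chosen for the example:

```r
demo <- function() {
  only_in_demo <- 1     # `<-` binds in the execution environment of demo()
  before <- ls()        # ls() lists that same environment
  rm(only_in_demo)      # rm() removes from it as well
  after <- ls()
  list(before = before, after = after)
}

res <- demo()
# res$before holds the local name; res$after shows it is gone again.
# Nothing leaked into the environment that called demo().
leaked <- exists("only_in_demo", inherits = FALSE)
```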

  3. Using parent.env() and a loop (or a recursive function), verify that the ancestors of globalenv() include baseenv() and emptyenv(). Use the same basic idea to implement your own version of search().

Recursive function:

chainofbeing = function(e = parent.frame())
{
   if (identical(e, emptyenv())) {
       "Empty environment (cause of all being)"
   } else {
        paste(environmentName(e), chainofbeing( parent.env(e) ), sep = " -> ")
   }
}

By default, start at calling frame:

chainofbeing()
## [1] "R_GlobalEnv -> package:stats -> package:graphics -> package:grDevices -> package:utils -> package:datasets -> package:methods -> Autoloads -> base -> Empty environment (cause of all being)"
chainofbeing(new.env())
## [1] " -> R_GlobalEnv -> package:stats -> package:graphics -> package:grDevices -> package:utils -> package:datasets -> package:methods -> Autoloads -> base -> Empty environment (cause of all being)"
chainofbeing(new.env(parent=new.env()))
## [1] " ->  -> R_GlobalEnv -> package:stats -> package:graphics -> package:grDevices -> package:utils -> package:datasets -> package:methods -> Autoloads -> base -> Empty environment (cause of all being)"
chainofbeing(new.env(parent=baseenv()))
## [1] " -> base -> Empty environment (cause of all being)"
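
The exercise also asks for a loop and for a home-made search(). A possible sketch follows; ancestors() and my_search() are hypothetical helpers, and the renaming of the two special environments is an assumption about how search() labels them:

```r
# Collect an environment and all of its ancestors, ending with emptyenv()
ancestors <- function(env = globalenv()) {
  out <- list()
  repeat {
    out[[length(out) + 1]] <- env
    if (identical(env, emptyenv())) break
    env <- parent.env(env)
  }
  out
}

# Rebuild the search path from the ancestors of the global environment
my_search <- function() {
  envs <- ancestors(globalenv())
  envs <- envs[-length(envs)]            # search() does not list the empty environment
  nms <- vapply(envs, environmentName, character(1))
  nms[nms == "R_GlobalEnv"] <- ".GlobalEnv"
  nms[nms == "base"] <- "package:base"
  nms
}

# Verification asked for in the exercise
has_base  <- any(vapply(ancestors(), identical, logical(1), baseenv()))
has_empty <- any(vapply(ancestors(), identical, logical(1), emptyenv()))
```

In an ordinary session, identical(my_search(), search()) should hold.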

Note that new.env() creates an unnamed environment, which is presumably intentional. Could I mess up evaluation if I were allowed to name environments?
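
As far as I can tell, the name reported by environmentName() is just a "name" attribute on the environment, so setting it should be harmless for evaluation. A quick sketch under that assumption (e and "my_env" are made up for the example):

```r
e <- new.env()
unnamed <- environmentName(e)    # empty string for a fresh environment

attr(e, "name") <- "my_env"      # label the environment
named <- environmentName(e)

# The label does not take part in lookup: bindings work as before
e$x <- 1
value <- base::get("x", envir = e)
```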

Recursing over environments

Recursion, always fun

Exercises

  1. Modify where() to find all environments that contain a binding for name.

This is a bit tricky - what do I return, a list of environments?

where_all <- function(name, env = parent.frame()) {
  if (identical(env, emptyenv())) {
    # Base case
    NULL
  } else if (exists(name, envir = env, inherits = FALSE)) {
    # Success case
    c(env, where_all(name, parent.env(env)))
  } else {
    where_all(name, parent.env(env))
  }
}

Works for one environment:

where_all("ls")
## [[1]]
## <environment: base>
where_all("smurf")
## NULL

Create another ls:

ee = new.env()
ee$ls = 1
where_all("ls", env=ee)
## [[1]]
## <environment: 0x000000000aa95a58>
## 
## [[2]]
## <environment: base>

  2. Write your own version of get() using a function written in the style of where().

get <- function(name, env = parent.frame()) {
  if (identical(env, emptyenv())) {
    # Base case
    stop("Can't find ", name, call. = FALSE)
    
  } else if (exists(name, envir = env, inherits = FALSE)) {
    # Success case
    env[[name]]
    
  } else {
    # Recursive case
    get(name, parent.env(env))
    
  }
}

A bit of self-reference:

get("get")
## function(name, env = parent.frame()) {
##   if (identical(env, emptyenv())) {
##     # Base case
##     stop("Can't find ", name, call. = FALSE)
##     
##   } else if (exists(name, envir = env, inherits = FALSE)) {
##     # Success case
##     env[[name]]
##     
##   } else {
##     # Recursive case
##     get(name, parent.env(env))
##     
##   }
## }
get("get", env=parent.env(globalenv()))
## function (x, pos = -1L, envir = as.environment(pos), mode = "any", 
##     inherits = TRUE) 
## .Internal(get(x, envir, mode, inherits))
## <bytecode: 0x0000000007157a78>
## <environment: namespace:base>

  3. Write a function called fget() that finds only function objects. It should have two arguments, name and env, and should obey the regular scoping rules for functions: if there’s an object with a matching name that’s not a function, look in the parent. For an added challenge, also add an inherits argument which controls whether the function recurses up the parents or only looks in one environment.

fget <- function(name, env = parent.frame()) {
  if (identical(env, emptyenv())) {
    # Base case
    stop("Can't find function ", name, call. = FALSE)
  # The hard way would be to test the mode by hand:
  #  } else if (exists(name, envir = env, inherits = FALSE) && mode(get(name, env)) == "function") {
  # The easy way: let exists() filter on the mode
  } else if (exists(name, envir = env, inherits = FALSE, mode = "function")) {
    # Success case
    env[[name]]
  } else {
    # Recursive case
    fget(name, parent.env(env))
  }
}

Testing:

get("pi")
## [1] 3.141593
tryCatch(fget("pi"), error=identity)
## <simpleError: Can't find function pi>
fget("get")
## function(name, env = parent.frame()) {
##   if (identical(env, emptyenv())) {
##     # Base case
##     stop("Can't find ", name, call. = FALSE)
##     
##   } else if (exists(name, envir = env, inherits = FALSE)) {
##     # Success case
##     env[[name]]
##     
##   } else {
##     # Recursive case
##     get(name, parent.env(env))
##     
##   }
## }
fget("is.numeric") ## Note: no environment!!
## function (x)  .Primitive("is.numeric")
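
For the added challenge, one possible sketch with an inherits argument. fget2 is a hypothetical name, and the behaviour for inherits = FALSE (fail immediately if env itself holds no such function) is my reading of the exercise:

```r
fget2 <- function(name, env = parent.frame(), inherits = TRUE) {
  if (identical(env, emptyenv())) {
    stop("Can't find function ", name, call. = FALSE)
  }
  if (exists(name, envir = env, inherits = FALSE, mode = "function")) {
    # Success case: a function with this name lives in env itself
    return(env[[name]])
  }
  if (!inherits) {
    # Caller asked for this environment only, so do not recurse
    stop("Can't find function ", name, " in the given environment", call. = FALSE)
  }
  # Recursive case: skip non-functions and keep looking in the parent
  fget2(name, parent.env(env), inherits = TRUE)
}

e <- new.env()
e$mean <- 1                                  # a non-function called "mean"
found  <- fget2("mean", e)                   # skips e$mean, finds the function further up
failed <- tryCatch(fget2("mean", e, inherits = FALSE),
                   error = function(err) conditionMessage(err))
```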

  4. Write your own version of exists(inherits = FALSE) (Hint: use ls().) Write a recursive version that behaves like exists(inherits = TRUE).

Well, without inheritance it is easy:

exists1 = function(name, env = parent.frame()) 
{
    # all.names = TRUE so that names starting with a dot are found too
    name %in% ls(env, all.names = TRUE)
}
exists1("ls")
## [1] FALSE
exists1("exists1")
## [1] TRUE

With inheritance:

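exists2 <- function(name, env = parent.frame()) {
  if (identical(env, emptyenv())) {
    # Base case
    FALSE
  } else if (name %in% ls(env, all.names = TRUE)) {
    # Success case (all.names = TRUE also catches names starting with a dot)
    TRUE
  } else {
    # Recursive case
    exists2(name, parent.env(env))
  }
}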
exists2("ls")
## [1] TRUE
exists2("exists1")
## [1] TRUE
exists2("pi")
## [1] TRUE
exists2("smurf")
## [1] FALSE

Function environments

“Functions don't have names”: are functions global objects that are stored somewhere centrally? If so, how does garbage collection work? As stated: are all objects without a name binding deleted?

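This seems to support it: f has the same enclosing environment as ls, but clearly a different binding environment. In other words, f = ls apparently binds a second name to the same function object instead of copying its definition, so functions behave like references here, not like values. And where does this leave the original function definition?!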

environment(ls)
## <environment: namespace:base>
f = ls 
environment(f)
## <environment: namespace:base>
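
The binding side can be checked as well. The sketch below defines where() in the style of the chapter, since these notes use it without defining it; here is a hypothetical name for the environment the code runs in:

```r
# where(): first environment, starting at env, that has a binding for name
where <- function(name, env = parent.frame()) {
  if (identical(env, emptyenv())) {
    stop("Can't find ", name, call. = FALSE)
  } else if (exists(name, envir = env, inherits = FALSE)) {
    env
  } else {
    where(name, parent.env(env))
  }
}

here <- environment()   # the environment this code runs in
f <- ls

f_bound_here   <- identical(where("f", here), here)               # f is bound where it was assigned
ls_bound_base  <- identical(where("ls", here), baseenv())         # ls is bound in package:base
same_enclosing <- identical(environment(f), environment(ls))      # both enclose namespace:base
in_namespace   <- identical(environment(f), asNamespace("base"))
same_function  <- identical(f, ls)                                # two names, one function
```

Note that ls is found in the base package environment but encloses the base namespace, which may be part of the answer to the diagram question.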

In the diagram, why is namespace:base inserted where it is, with the global environment as its parent? Is that what happens when a package is loaded? So we get a kind of tree structure: with each library(), the package environment is inserted just after the global environment, but the namespace / imports environments are set up below the global environment, pointing upwards? Why insert namespace:base here?

This is always fun: parent.env != parent.frame
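
A small sketch of the difference; f and g are throwaway functions made up for this example. parent.env() follows the enclosing environment (where f was defined), parent.frame() follows the call stack (where f was called from):

```r
f <- function() {
  enclosing <- parent.env(environment())   # environment in which f was defined
  calling   <- parent.frame()              # environment from which f was called
  list(enclosing = enclosing, calling = calling)
}

g <- function() {
  res <- f()
  res$g_env <- environment()               # execution environment of g, for comparison
  res
}

res <- g()
enclosing_is_definition <- identical(res$enclosing, environment(f))
calling_is_g            <- identical(res$calling, res$g_env)
they_differ             <- !identical(res$enclosing, res$calling)
```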

Execution environment has two parents?! This is not the definition of an environment - is this metaphorical or what?

Exercises

  1. List the four environments associated with a function. What does each one do? Why is the distinction between enclosing and binding environments particularly important?
  • Enclosing: the environment in which the function was defined and in which it looks up free variables; access via environment(f)
  • Binding: an environment in which a name is bound to the function; access via where("f")
  • Execution: a fresh environment created on each call to hold local variables; access via environment() inside the function
  • Calling: the environment from which the function is called; access via parent.frame() inside the function

  2. Draw a diagram that shows the enclosing environments of this function:
    f1 <- function(x1) {
      f2 <- function(x2) {
        f3 <- function(x3) {
          x1 + x2 + x3
        }
        f3(3)
      }
      f2(2)
    }
    f1(1)
  3. Expand your previous diagram to show function bindings.

  4. Expand it again to show the execution and calling environments.

  5. Write an enhanced version of str() that provides more information about functions. Show where the function was found and what environment it was defined in.

Tricky. Let’s assume we do not want to recurse all the way down to deal properly with e.g. a list of functions.

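str2 = function(x, ...)
{
  ## The default
  str(x, ...)
  ## If this is a function, report its binding and enclosing environments.
  ## where() is the helper from the chapter (not defined in these notes);
  ## start the lookup in the caller, not in the execution environment of str2.
  if (is.function(x)) {
     found_in = environmentName(where(deparse(substitute(x)), env = parent.frame()))
     def_in   = environmentName(environment(x))
     cat("\tFound in:", found_in, "\n\tDefined in:", def_in, "\n")
  }
  invisible()
}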
## As usual
str2(pi)
##  num 3.14
str2("pi")
##  chr "pi"
str2(list(a=1, b=f, c=LETTERS))
## List of 3
##  $ a: num 1
##  $ b:function (name, pos = -1L, envir = as.environment(pos), all.names = FALSE, 
##     pattern, sorted = TRUE)  
##  $ c: chr [1:26] "A" "B" "C" "D" ...
## New 
str2(ls)
## function (name, pos = -1L, envir = as.environment(pos), all.names = FALSE, 
##     pattern, sorted = TRUE)  
##  Found in: base 
##  Defined in: base
str2(f)
## function (name, pos = -1L, envir = as.environment(pos), all.names = FALSE, 
##     pattern, sorted = TRUE)  
##  Found in: R_GlobalEnv 
##  Defined in: base

Binding names to values

Delayed binding: sounds like a syntactic binding (for parsing purposes)?
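
As far as I understand it, a delayed binding is not about parsing: delayedAssign() stores an unevaluated expression (a promise) that is only evaluated the first time the name is read. A minimal sketch, where log_env and lazy_x are names made up for the example:

```r
log_env <- new.env()
log_env$evaluated <- FALSE

# The expression is stored, not run
delayedAssign("lazy_x", {
  log_env$evaluated <- TRUE
  42
})

before <- log_env$evaluated   # still FALSE: nothing has been evaluated yet
value  <- lazy_x              # first access forces the promise
after  <- log_env$evaluated   # now TRUE
```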

Exercises

  1. What does this function do? How does it differ from <<- and why might you prefer it?
    rebind <- function(name, value, env = parent.frame()) {
      if (identical(env, emptyenv())) {
        stop("Can't find ", name, call. = FALSE)
      } else if (exists(name, envir = env, inherits = FALSE)) {
        assign(name, value, envir = env)
      } else {
        rebind(name, value, parent.env(env))
      }
    }
    tryCatch(rebind("a", 10), error=identity)
## <simpleError: Can't find a>
    a <- 5
    rebind("a", 10)
    a
## [1] 10

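This is a replacement for <<- that throws an error if the target name is not found in the calling environment or any of its ancestors, instead of silently creating a new binding in the global environment. Unlike <<-, it also starts the search in the calling environment itself rather than in its parent. It protects against unintentional side effects, e.g. when the target name is misspelled.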


  2. Create a version of assign() that will only bind new names, never re-bind old names. Some programming languages only do this, and are known as single assignment languages.

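new_assign = function(name, value, env = parent.frame())
{
    # exists() also sees names starting with a dot, which ls() skips by default
    if (exists(name, envir = env, inherits = FALSE)) stop("Can't rebind ", name)
    assign(name, value, envir = env)
}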
new_assign("new", 3)
tryCatch(new_assign("f", 3), error=identity)
## <simpleError in new_assign("f", 3): Can't rebind f>

  3. Write an assignment function that can do active, delayed, and locked bindings. What might you call it? What arguments should it take? Can you guess which sort of assignment it should do based on the input?

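Note that by default assign() binds in the environment from which assign() is called; the "current environment" wording in ?assign seems to refer to the caller, not to the execution environment of assign() itself.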

OK, this is a prototype that DOES NOT WORK:

bind = function(name, value, envir = parent.frame(), type)
{
  if (missing(type)) {
    type =  if (is.function(value)) "active"
    else if (is.expression(value)) "delayed"
    else "normal"
  }
  type = match.arg(type, c("normal", "active", "delayed", "locked"))
  
  if (type %in% c("normal", "locked")) {
    assign(name, value, envir = envir)
    if (type == "locked") lockBinding(name, env = envir)
  } else if (type == "active") {
    makeActiveBinding(name, value, env = envir)
  } else if (type == "delayed") {
    delayedAssign(name, value, eval.env = envir, assign.env = envir)
  } else stop("How the fuck did you end up here?!")
  
  invisible()
}
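
One likely reason the prototype fails for delayed bindings: delayedAssign() captures its value argument unevaluated, so it stores the symbol value instead of the caller's expression, and the type guessing forces the argument anyway. A possible sketch that avoids both problems by requiring an explicit type; bind2 is a hypothetical name, and dropping the guessing is my design choice, not part of the exercise:

```r
bind2 <- function(name, value, envir = parent.frame(),
                  type = c("normal", "locked", "active", "delayed")) {
  type <- match.arg(type)
  switch(type,
    normal = assign(name, value, envir = envir),
    locked = {
      assign(name, value, envir = envir)
      lockBinding(name, envir)
    },
    active = makeActiveBinding(name, value, envir),
    delayed = {
      # Capture the caller's expression without evaluating it, then splice it
      # into the delayedAssign() call so the promise holds that expression
      expr <- substitute(value)
      eval(bquote(delayedAssign(.(name), .(expr),
                                eval.env = envir, assign.env = envir)))
    }
  )
  invisible(NULL)
}

# Usage sketch with made-up names
e <- new.env()
flag <- new.env()
flag$hit <- FALSE

bind2("plain", 5, envir = e)
bind2("const", 1, envir = e, type = "locked")
bind2("now", function() "fresh", envir = e, type = "active")
bind2("lazy", {flag$hit <- TRUE; 1 + 1}, envir = e, type = "delayed")

hit_before <- flag$hit   # FALSE: the delayed expression has not run yet
lazy_value <- e$lazy     # forces the promise
hit_after  <- flag$hit   # TRUE
```

Guessing the type from the input would require evaluating value, which defeats the purpose of a delayed binding; that is why the sketch asks for the type explicitly.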

Explicit environments