Draft: Refactor internal workflows #52
plan_rev_dep_dev_check <- function(path, revdep, repos) {
  origin <- pkg_origin_local(path = path)

  tasks <- list(
    make_unique_task(seed = "dev", check_task(
      origin = pkg_origin_repo(package = revdep, repos = repos),
      env = DEFAULT_R_CMD_CHECK_ENVVARS,
      args = DEFAULT_R_CMD_CHECK_ARGS,
      build_args = DEFAULT_R_CMD_BUILD_ARGS
    )),
    install_task(origin = origin)
  )

  planned(sequence_graph(
    name = hashes(tasks),
    task = tasks,
    task_type = lapply(tasks, function(task) class(task)[[1]]),
    package = c(revdep, origin$package)
  ))
}
Planning a reverse-dependency check requires just two declarative nodes: running the `R CMD check` and installing the appropriate version of the dependency (CRAN or local dev version).
When constructing a full set of rev dep checks, we create pairs (release + dev) of these types of sub-graphs and then merge them into a single task graph.
note: I'm trying to simplify this further, for now just focus on the `tasks <- list(...)` part
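To make the "two declarative nodes" idea concrete, here is a minimal sketch of how a check node and an install node could be declared and typed. The `*_sketch` constructors and the string origins are hypothetical stand-ins for illustration, not the constructors from this PR, which take more arguments.

```r
# Hypothetical stand-ins; names ending in `_sketch` are not part of the PR.
task_sketch <- function(origin, ...) {
  structure(list(origin = origin, ...), class = "task")
}

check_task_sketch <- function(origin, ...) {
  x <- task_sketch(origin = origin, ...)
  class(x) <- c("check_task", class(x))
  x
}

install_task_sketch <- function(origin, ...) {
  x <- task_sketch(origin = origin, ...)
  class(x) <- c("install_task", class(x))
  x
}

# Origins are plain strings here purely for illustration.
tasks <- list(
  check_task_sketch(origin = "repo::pkgA"),
  install_task_sketch(origin = "local::pkgB")
)

# The most specific class doubles as the task type, mirroring
# `class(task)[[1]]` in the planning function.
task_types <- vapply(tasks, function(task) class(task)[[1]], character(1))
task_types
```

Because each node only declares what should happen, nothing is installed or checked at planning time.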
install_task <- function(
  origin,
  type = getOption("pkgType"),
  INSTALL_opts = NULL,
  lib = lib_loc_default(),
  ...
) {
  task <- task(origin = origin, ...)
  task$type <- type
  task$INSTALL_opts <- INSTALL_opts
  task$lib <- lib
  class(task) <- c("install_task", class(task))
  task
}
An `install_task` already contains information about the library it is intended to install into. This could be a file path when one is already known, but is more likely a method of selecting a lib path such as `lib_loc_default()`.
`lib_loc_default` is simply a `structure(list(), class = c("lib_loc_default", "lib_loc"))` used for dispatch when deriving libraries.
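As a rough illustration of how such a marker object can drive dispatch, the sketch below resolves a library declaration into a concrete path with an S3 generic. The names `lib_loc_default_sketch` and `resolve_lib`, and the `root` argument, are assumptions made up for this example; the PR's own resolution logic may look quite different.

```r
# Hypothetical marker constructor, mirroring the structure described above.
lib_loc_default_sketch <- function() {
  structure(list(), class = c("lib_loc_default", "lib_loc"))
}

# Hypothetical generic that turns a library declaration into a path.
resolve_lib <- function(x, ...) UseMethod("resolve_lib")

# A known file path is used as-is.
resolve_lib.character <- function(x, ...) x

# The default declaration is resolved only when asked, relative to `root`.
resolve_lib.lib_loc_default <- function(x, ..., root = tempdir()) {
  file.path(root, "lib")
}

resolve_lib("/path/to/lib")
resolve_lib(lib_loc_default_sketch(), root = "/tmp/checks")
```

Declaring the method of selection rather than the path itself lets the task node be created before the final library location is known.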
# for each check task in the plan, build a dependency tree and merge it
# into the existing check task subtree
plan_neighborhoods <- lapply(plan_neighborhoods, function(nh) {
  subtree <- igraph::induced_subgraph(plan, nh)
  deps <- dep_tree(nh[[1]]$task)
  igraph::reverse_edges(deps)

  subtree <- graph_project(
    x = deps,
    onto = subtree,
    where = c("name" = "package")
  )

  # set missing dependencies to be installed from repo
  missing_task <- is.na(igraph::V(subtree)$task)
  igraph::V(subtree)$task[missing_task] <- lapply(
    igraph::V(subtree)$package[missing_task],
    function(package) {
      origin <- try_pkg_origin_repo(package = package, repos = repos)
      install_task(origin = origin)
    }
  )

  # re-hash tasks as vertex names (populate missing vertex names)
  igraph::V(subtree)$name <- vcapply(igraph::V(subtree)$task, hash)
  igraph::V(subtree)$task_type <- vcapply(
    igraph::V(subtree)$task,
    function(task) class(task)[[1]]
  )

  subtree
})
This is a big component of the new strategy. When we "plan" a set of checks, we don't really care about the details of where packages come from or how dependencies are managed. This function completes the graph with all the task dependencies that need to be performed before the plan can be executed.
To accomplish this, we look at each `check` task, derive its dependency tree and then project that dependency tree back onto the existing planned tasks.
Let's imagine a situation where we are planning to run a rev dep check for `pkgA` with a dev version of our package, `pkgB`. In this case, we plan to run `R CMD check` where our dependency is installed using the local dev source code. When we derive the dependency tree, we want to make sure that this dev dependency continues to use the existing task that has planned its installation.
`graph_project` projects a set of vertices and edges onto another graph based on one or more attributes. If a vertex already exists, that vertex is retained and all the edges from both graphs are merged.
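The projection idea can be sketched with plain data frames instead of igraph objects. This is only an illustration under simplifying assumptions: `project_sketch` is a hypothetical helper, not the PR's `graph_project`, vertices are matched on a single `package` column, and edges are expressed in terms of package names.

```r
# Hypothetical sketch: project graph `x` onto graph `onto`, matching
# vertices on the `package` column. Each graph is a list with a `vertices`
# data frame and an `edges` data frame (`from`, `to`).
project_sketch <- function(x, onto) {
  # vertices already planned in `onto` are retained untouched
  new_pkgs <- setdiff(x$vertices$package, onto$vertices$package)
  vertices <- rbind(
    onto$vertices,
    data.frame(
      package = new_pkgs,
      task = rep(NA_character_, length(new_pkgs)),
      stringsAsFactors = FALSE
    )
  )
  # edges from both graphs are merged, dropping exact duplicates
  edges <- unique(rbind(onto$edges, x$edges))
  list(vertices = vertices, edges = edges)
}

plan <- list(
  vertices = data.frame(
    package = c("pkgA", "pkgB"),
    task = c("revdep check pkgA", "install pkgB from source"),
    stringsAsFactors = FALSE
  ),
  edges = data.frame(from = "pkgA", to = "pkgB", stringsAsFactors = FALSE)
)

deps <- list(
  vertices = data.frame(
    package = c("pkgA", "pkgB", "pkgX", "pkgY", "pkgZ"),
    stringsAsFactors = FALSE
  ),
  edges = data.frame(
    from = c("pkgA", "pkgA", "pkgB", "pkgB"),
    to = c("pkgB", "pkgX", "pkgY", "pkgZ"),
    stringsAsFactors = FALSE
  )
)

merged <- project_sketch(deps, plan)

# vertices without a planned task are the ones still to be filled in,
# for example with an install from a repository
merged$vertices$package[is.na(merged$vertices$task)]
```

In this sketch `pkgB` keeps its planned install from source, while `pkgX`, `pkgY` and `pkgZ` arrive without a task and are left for a later step to populate.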
flowchart TB
rdcA["revdep check {pkgA}"]
instB["install dep {pkgB}<br>from source"]
pkgA["{pkgA}"]
pkgB["{pkgB}"]
pkgX["{pkgX}"]
pkgY["{pkgY}"]
pkgZ["{pkgZ}"]
subgraph plan
direction TB
rdcA --> instB
end
subgraph deps["{pkgA} dependencies"]
direction TB
pkgA --> pkgB
pkgA --> pkgX
pkgB --> pkgY
pkgB --> pkgZ
end
becomes
flowchart TB
rdcA["revdep check {pkgA}"]
instB["install dep {pkgB}<br>from source"]
instX["install dep {pkgX}<br> from CRAN"]
instY["install dep {pkgY}<br> from CRAN"]
instZ["install dep {pkgZ}<br> from CRAN"]
subgraph task graph
direction TB
rdcA --> instB
rdcA --> instX
instB --> instY
instB --> instZ
end
}

#' @export
plot.task_graph <- function(x, ...) {
Added some visualization tools to help with debugging. I have aspirations to use the viewer pane to have an actively updating graph as the execution is happening, but nowhere near that feature yet.
This is a pretty big redesign of the internals. The design objectives were to:
- [...] (`data.frame`) and "design" (`task_graph`) so that they both leverage a graph structure.
- [...] (`task_get_install_lib()`), the lib path method is declared when the install task node is created

Use Case
The code that has been refactored is really only for a central, primary use case so far:
Naming convention changes:
- `check_design` => `checker`
- `checks_df`, `tasks_df` => `task_graph` (`plan`, but want to consolidate these still)
- `package_spec` => `pkg_origin`
- `task_spec` => `task`
Non-goals:
I'll highlight important bits of code in comments