| Title: | Branch-Level Inference Framework for Recognizing Optimal Shifts in Traits |
|---|---|
| Description: | Methods for detecting and visualizing cladogenic shifts in multivariate trait data on phylogenies. Implements penalized-likelihood multivariate generalized least squares models, enabling analyses of high-dimensional trait datasets and large trees via searchOptimalConfiguration(). Includes a greedy step-wise shift-search algorithm following approaches developed in Smith et al. (2023) <doi:10.1111/nph.19099> and Berv et al. (2024) <doi:10.1126/sciadv.adp0114>. Methods build on multivariate GLS approaches described in Clavel et al. (2019) <doi:10.1093/sysbio/syy045> and implemented in the mvgls() function from the 'mvMORPH' package. Documentation and vignettes are available at <https://jakeberv.com/bifrost/>, including worked examples for the jaw-shape dataset. |
| Authors: | Jacob S. Berv [aut, cre, cph, fnd] (ORCID: <https://orcid.org/0000-0002-5962-0621>), Nathan Fox [aut] (ORCID: <https://orcid.org/0000-0002-2816-9751>), Matt J. Thorstensen [aut] (ORCID: <https://orcid.org/0000-0002-7870-3369>), Henry Lloyd-Laney [aut] (ORCID: <https://orcid.org/0000-0003-4650-8937>), Emily M. Troyer [aut] (ORCID: <https://orcid.org/0000-0001-7478-2306>), Rafael A. Rivero-Vega [aut] (ORCID: <https://orcid.org/0000-0001-5937-6377>), Stephen A. Smith [aut, fnd] (ORCID: <https://orcid.org/0000-0003-2035-9531>), Matt Friedman [aut, fnd] (ORCID: <https://orcid.org/0000-0002-0114-7384>), David F. Fouhey [aut, fnd] (ORCID: <https://orcid.org/0000-0001-5028-5161>), Brian C. Weeks [aut, fnd] (ORCID: <https://orcid.org/0000-0003-2967-2970>) |
| Maintainer: | Jacob S. Berv <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 0.1.4 |
| Built: | 2026-06-04 21:50:44 UTC |
| Source: | https://github.com/jakeberv/bifrost |
Creates a named color mapping for a set of numeric parameters (e.g., evolutionary rates) using the viridis color palette. Parameters are first sorted in ascending order and normalized to the range [0, 1], then mapped to evenly spaced viridis colors for intuitive visualization.
generateViridisColorScale(params)generateViridisColorScale(params)
params |
A named numeric vector of parameter values (e.g., rates). The names will be preserved and used to label the resulting color mapping. |
This function is useful for plotting results where parameters should be visually distinguished based on their magnitude (e.g., rate shifts across a phylogeny). By using the perceptually uniform viridis palette, it avoids misleading color interpretations common with rainbow scales.
A named list with two elements:
NamedColorsA named character vector of hex color codes, with names corresponding to the input parameter names, ordered by increasing parameter value.
ParamColorMappingA named numeric vector of the sorted parameter values,
maintaining the same order and names as NamedColors.
viridis::viridis() for details on
the color palette.
if (requireNamespace("viridis", quietly = TRUE)) { library(viridis) set.seed(1) rates <- c(A = 0.1, B = 0.5, C = 0.9) color_scale <- generateViridisColorScale(rates) # View the color assignments color_scale$NamedColors # Plot with colors barplot(color_scale$ParamColorMapping, col = color_scale$NamedColors, main = "Rates with Viridis Colors") }if (requireNamespace("viridis", quietly = TRUE)) { library(viridis) set.seed(1) rates <- c(A = 0.1, B = 0.5, C = 0.9) color_scale <- generateViridisColorScale(rates) # View the color assignments color_scale$NamedColors # Plot with colors barplot(color_scale$ParamColorMapping, col = color_scale$NamedColors, main = "Rates with Viridis Colors") }
Create a two-layer base R plot that visualizes information criterion (IC) scores across a sequence of sub-model evaluations, highlighting which steps were accepted vs rejected. Optionally, a secondary y-axis overlays the rate of improvement (first difference of IC scores) as a line with markers.
plot_ic_acceptance_matrix( matrix_data, plot_title = "IC Acceptance Matrix Scatter Plot", plot_rate_of_improvement = TRUE, rate_limits = c(-400, 150), baseline_ic = NULL )plot_ic_acceptance_matrix( matrix_data, plot_title = "IC Acceptance Matrix Scatter Plot", plot_rate_of_improvement = TRUE, rate_limits = c(-400, 150), baseline_ic = NULL )
matrix_data |
A two-column |
plot_title |
|
plot_rate_of_improvement |
|
rate_limits |
|
baseline_ic |
Optional |
The function expects a two-column object where:
Column 1 contains the IC score at each step (numeric; lower is better).
Column 2 contains an indicator for acceptance (0 = rejected, 1 = accepted).
The first IC value is treated as the baseline and is plotted as a larger
black point with a numeric label. If baseline_ic is supplied, it is used as
the baseline IC score (step 1) in place of matrix_data[1, 1] for both the
baseline annotation and the rate-of-improvement series (diff(IC)). This is
useful because matrix_data begins with the first evaluated shift model (rather
than the true no-shift baseline). To achieve this behavior, pass the true baseline via
baseline_ic to avoid labeling the first evaluated model as the baseline.
Accepted steps are drawn as blue filled points connected by a thin line; rejected
steps are drawn as small red crosses. When plot_rate_of_improvement = TRUE,
the function overlays a secondary y-axis on the right that shows diff(IC) values
(the per-step change in IC; more negative implies improvement).
The function uses only base graphics. It sets plot margins and mgp via
par(), and (when overlaying) uses par(new = TRUE) to layer the IC plot over the
rate-of-improvement axes. Initial user par is reset on exit.
Axes and scaling. Tick marks for the primary (IC) x/y axes are computed with
pretty() to give clean bounds. The secondary axis for the rate of improvement
uses rate_limits (default c(-400, 150)); adjust via the argument if your
expected diff(IC) range differs substantially.
Invisibly returns NULL. Called for its plotting side effects.
par, plot, axis,
lines, points, legend,
mtext, title
ic <- c(-1000, -1012, -1008, -1025, -1020, -1030) accepted <- c(1, 0, 1, 0, 1) # steps 2..6 relative to baseline mat <- cbind(ic, c(1, accepted)) # mark baseline as accepted for plotting plot_ic_acceptance_matrix(mat, plot_title = "IC Path") # Avoid non-ASCII glyphs in titles on CRAN/CI: plot_ic_acceptance_matrix(mat, plot_rate_of_improvement = TRUE) # Override baseline IC: plot_ic_acceptance_matrix(mat, baseline_ic = -995)ic <- c(-1000, -1012, -1008, -1025, -1020, -1030) accepted <- c(1, 0, 1, 0, 1) # steps 2..6 relative to baseline mat <- cbind(ic, c(1, accepted)) # mark baseline as accepted for plotting plot_ic_acceptance_matrix(mat, plot_title = "IC Path") # Avoid non-ASCII glyphs in titles on CRAN/CI: plot_ic_acceptance_matrix(mat, plot_rate_of_improvement = TRUE) # Override baseline IC: plot_ic_acceptance_matrix(mat, baseline_ic = -995)
rateMap ObjectsRender a computed "rateMap" object. The usual workflow is
x <- rateMap(...) followed by plot(x, ...).
## S3 method for class 'rateMap' plot( x, value = "value", palette = NULL, reverse_palette = NULL, color_mode = NULL, n_categories = NULL, category_bin_method = NULL, category_breaks = NULL, category_labels = NULL, ncolors = NULL, legend_title = NULL, legend = NULL, fsize = NULL, tip_fsize = NULL, legend_fsize = NULL, ftype = NULL, show_tip_labels = TRUE, outline = FALSE, lwd = 3, type = c("phylogram", "fan", "arc"), mar = rep(0.3, 4), direction = "rightwards", offset = NULL, xlim = NULL, ylim = NULL, hold = TRUE, underscore = FALSE, arc_height = 2, legend_digits = NULL, ... )## S3 method for class 'rateMap' plot( x, value = "value", palette = NULL, reverse_palette = NULL, color_mode = NULL, n_categories = NULL, category_bin_method = NULL, category_breaks = NULL, category_labels = NULL, ncolors = NULL, legend_title = NULL, legend = NULL, fsize = NULL, tip_fsize = NULL, legend_fsize = NULL, ftype = NULL, show_tip_labels = TRUE, outline = FALSE, lwd = 3, type = c("phylogram", "fan", "arc"), mar = rep(0.3, 4), direction = "rightwards", offset = NULL, xlim = NULL, ylim = NULL, hold = TRUE, underscore = FALSE, arc_height = 2, legend_digits = NULL, ... )
x |
An object of class |
value |
Character column in the plotted summary table ( |
palette |
Optional palette override used for this plot. This can be an
|
reverse_palette |
Optional logical override for palette reversal used for this plot. |
color_mode |
Optional color-mode override. Use |
n_categories |
Optional category-count override for
|
category_bin_method |
Optional category-binning override for
|
category_breaks |
Optional category-break override for
|
category_labels |
Optional category-label override for
|
ncolors |
Optional number of colors to use when recoloring this plot
with |
legend_title |
Optional legend title override for this plot. |
legend |
Legend length. If |
fsize |
Numeric font-size vector. The first element is passed as the tip
label |
tip_fsize |
Optional override for the first |
legend_fsize |
Optional override for the legend font size. |
ftype |
Font type. The first element is passed as |
show_tip_labels |
Logical; if |
outline |
Logical; if |
lwd |
Branch and legend line widths. The first element is passed to
|
type |
Plot type: |
mar |
Plot margins passed to |
direction |
Plotting direction for |
offset |
Tip-label offset passed to |
xlim, ylim
|
Optional plot limits passed to |
hold |
Logical controlling |
underscore |
Logical; if |
arc_height |
Arc height passed through to |
legend_digits |
Optional number of digits for legend endpoint labels. If omitted, small-magnitude values use enough digits to avoid zero-valued legend endpoints. |
... |
Additional arguments are rejected. Include all display choices as
named |
The plot method uses phytools::plotSimmap() and draws either a continuous
color-bar legend or a segmented rate-category color bar.
The plotting controls intentionally mirror the phytools density-map
plotting style for phylogram, fan, and arc layouts.
In branch-summary category mode, near-zero or high-outlier rate diagnostics
are drawn as special categories when rate-valued columns are plotted; these
special categories do not consume positions in the ordered palette used for
regular rate bins. When special categories are present, the category legend
spans the full plotted value range and marks the diagnostic cutoff separating
special and regular bins. When plotting non-rate columns such as "sd",
diagnostic columns are preserved only as metadata with rate_flag_source
provenance; special rate categories are not drawn for those non-rate values.
The selected value column must contain finite values for every plotted
interval; uncertainty columns such as "sd" may be all NA for single-fit
objects and are rejected with a clear error.
The plot() method handles "phylogram", "fan", and "arc" layouts.
legend controls the length of the color bar; set legend = FALSE to
suppress it. legend_digits controls numeric endpoint labels and defaults
to enough precision to avoid rounding small values to zero. Single fitted
objects should be converted explicitly with rateMap() before plotting, for
example plot(rateMap(search_a), ...).
Relationship to phytools plotting arguments. plot.rateMap() keeps the
tree-layout argument names close to phytools: type, fsize, ftype,
lwd, mar, direction, offset, xlim, ylim, underscore, and
arc_height are forwarded to phytools::plotSimmap() or
phytools::plotTree() as described above. rateMap fixes
phytools::plotSimmap() options such as colors, pts, node.numbers,
add, and hold internally, because those are determined by the computed
"rateMap" object and by the optional outline pass. palette,
color_mode, n_categories, category_breaks, category_labels,
legend_title, legend_digits, and the rate-flag display behavior are
rateMap controls, not phytools arguments.
Invisibly returns the plotted "rateMap" object.
Revell, L. J. (2013). Two new graphical methods for mapping trait evolution on phylogenies. Methods in Ecology and Evolution, 4, 754-759.
Revell, L. J. (2024). phytools 2.0: an updated R ecosystem for phylogenetic comparative methods (and other things). PeerJ, 12, e16505. doi:10.7717/peerj.16505
rateMap(), phytools::densityMap(), phytools::plotSimmap(),
phytools::plotTree()
## Not run: rm_obj <- rateMap(fits, progress = FALSE) plot( rm_obj, type = "arc", show_tip_labels = FALSE, legend_fsize = 0.8 ) # If rm_obj was built with uncertainty = TRUE, plot uncertainty directly: plot(rm_obj, value = "sd", palette = "Inferno") # Use a continuous ramp instead of the default ordered rate categories: plot(rm_obj, color_mode = "continuous") # Or keep category colors but change the binning: plot(rm_obj, n_categories = 5, category_bin_method = "equal") plot(rm_obj, category_breaks = c(-4, -2, 0, 2)) ## End(Not run)## Not run: rm_obj <- rateMap(fits, progress = FALSE) plot( rm_obj, type = "arc", show_tip_labels = FALSE, legend_fsize = 0.8 ) # If rm_obj was built with uncertainty = TRUE, plot uncertainty directly: plot(rm_obj, value = "sd", palette = "Inferno") # Use a continuous ramp instead of the default ordered rate categories: plot(rm_obj, color_mode = "continuous") # Or keep category colors but change the binning: plot(rm_obj, n_categories = 5, category_bin_method = "equal") plot(rm_obj, category_breaks = c(-4, -2, 0, 2)) ## End(Not run)
Prints a compact summary of a completed Bifrost search, including the baseline and optimal information criterion (IC) values, the inferred shift node set, key search settings, and (when present) optional diagnostics such as IC-history and IC-weight support.
## S3 method for class 'bifrost_search' print(x, ...)## S3 method for class 'bifrost_search' print(x, ...)
x |
A |
... |
Unused (S3 compatibility). |
Invisibly returns x. Called for its printing side effects.
rateMap ObjectPrint a concise summary of a "rateMap" object, including the number of fits,
summary mode, target/check mode, weighting mode, uncertainty status, color
mode, plotted value, value range, rate-flag diagnostics when active, and
uncertainty ranges when available.
## S3 method for class 'rateMap' print(x, ...)## S3 method for class 'rateMap' print(x, ...)
x |
An object of class |
... |
Ignored. |
Invisibly returns x.
Summarize fitted regime-specific rates across a list of stochastic-map-aware
model fits or completed bifrost_search results. By default, rateMap()
follows bifrost's branch-level framing: the first retained tree is used as
the plotting scaffold, all retained trees must match in topology and branch
lengths, each branch receives one summarized log-rate, and runs are averaged
with equal weight.
rateMap( fits, weights = c("equal", "ic"), uncertainty = FALSE, summary = c("branch", "interval"), log = TRUE, value_summary = c("mean", "median"), target_tree = NULL, workers = NULL, progress = TRUE, control = rateMapControl() )rateMap( fits, weights = c("equal", "ic"), uncertainty = FALSE, summary = c("branch", "interval"), log = TRUE, value_summary = c("mean", "median"), target_tree = NULL, workers = NULL, progress = TRUE, control = rateMapControl() )
fits |
A completed run, fitted model object, or non-empty list of these
objects. Supported shapes are |
weights |
Fit-level weighting mode. |
uncertainty |
Logical; if |
summary |
Character; |
log |
Logical; if |
value_summary |
Character; central estimate stored in the summary table
column |
target_tree |
Optional explicit target tree used as the summary and
plotting scaffold. This may be any target or summary tree with the same
topology and tip labels as the inputs. It does not need to contain
stochastic maps because |
workers |
Optional number of |
progress |
Logical; if |
control |
A |
Everyday workflow. rateMap() is a compute function: call it to build a
reusable "rateMap" object, then call plot(x, ...) to choose what to
display. Common compute controls are fits, weights, uncertainty,
summary, log, and value_summary. Common display controls belong in
plot() or rateMapView(), including value, type, palette,
color_mode, n_categories, show_tip_labels, and legend sizing. Controls
such as target, check, res, tree_fun, param_fun, and the future_*
arguments live in rateMapControl() because they are mainly advanced tools
for same-topology tree samples, interval summaries, custom fit objects, and
larger run sets.
Algorithmic provenance. rateMap() is inspired by
phytools::densityMap(), which summarizes a set of stochastic maps by
slicing mapped branches on a shared depth grid and coloring a SIMMAP-style
tree by an averaged branchwise quantity. Here the averaged quantity is not a
posterior probability of a mapped state; instead, each mapped state is
translated through the corresponding fitted regime-rate parameter from each
run. The plotting interface likewise follows the phytools density-map
family by returning a colored SIMMAP tree and drawing it with
phytools::plotSimmap() plus either a segmented rate-category color bar or
a continuous color-bar legend.
Although rateMap() is designed for summarizing multiple fitted maps, it also
accepts a single completed bifrost search or supported fit. Single-fit input
is a convenience for inspecting one fitted model before scaling up to
multi-run summaries.
rateMap() can also summarize same-topology posterior or sensitivity samples
where branch lengths differ. In that case, usually supply an explicit
target_tree and use control = list(check = "topology"). The target can
be any tree with the same topology and tip labels as the inputs; an MCC tree
is only one possible choice. The target tree supplies the plotted topology
and branch lengths, while rates are matched from each input tree by
descendant-tip clade keys rather than by edge order.
Summary modes. With the default summary = "branch" and log = TRUE,
each edge receives one length-weighted average log-rate from each run before
the across-run summary is computed. This is the natural default for bifrost
searches because shifts are placed at nodes, so a bifrost branch is not
expected to change regimes internally. With summary = "interval", each
target-tree branch is subdivided by the global depth grid controlled by
control = list(res = ...). If source branch lengths differ from the target
branch length, source stochastic-map segments are projected onto target
intervals by relative position along the matched branch. Interval mode is
useful for general stochastic maps that can genuinely change state along a
branch.
Log-rate averaging. When log = TRUE, rate parameters are transformed
before branch-level, interval-level, and across-fit summaries are computed.
With weights w_i, the default plotted mean is therefore
sum(w_i * log(rate_i)), which is the log of a weighted geometric mean. It
is not log(sum(w_i * rate_i)), the log of a weighted arithmetic mean. Use
log = FALSE when downstream interpretation requires arithmetic summaries on
the original rate scale. Raw fitted rate parameters must be strictly positive
in either mode. Negative plotted values are therefore valid in log-rate maps
whenever the positive raw fitted rates are less than one.
Tree checks and targets. In rateMapControl(), check = TRUE is
equivalent to check = "full" and requires topology and branch lengths to
match the target tree. Use check = "topology" when all inputs have the same
topology and tip labels but may have different branch lengths. check = FALSE
or check = "none" skips the upfront ape::all.equal.phylo() check, but
every target branch must still be recoverable in every input tree by
descendant-tip set. This function is not a mixed-topology posterior summarizer;
clades absent from an input tree are treated as an error rather than being
marginalized over topology.
When target_tree is supplied, it is used directly as the plotting scaffold
and need not contain SIMMAP maps. When target_tree = NULL,
rateMapControl(target = "first") uses the first retained input tree, and
rateMapControl(target = "mcc") chooses the retained input tree with the
highest sum of log clade credibilities. For truly same-topology inputs, the
MCC score is usually tied, so "mcc" commonly resolves to the first retained
tree. MCC target selection does not make rateMap() a mixed-topology
summarizer: every target branch must still be present in every retained input.
If you already have a preferred consensus, chronogram, MCC, maximum-likelihood,
or otherwise curated target tree, pass it with target_tree.
Fit weights. weights = "equal" assigns the same weight to each retained
fit. weights = "ic" computes standard information-criterion weights from
each retained fit's optimal_ic, requiring all retained fits to share the
same non-missing IC_used. A numeric weights vector can be supplied for
custom weighting. Weights are subset to retained fits after
rateMapControl(na_action = "omit") and are normalized to sum to one. IC
weights are descriptive fit-level weights for comparable retained searches;
they do not choose a formal threshold or make incomparable searches
comparable.
Uncertainty summaries. The returned intervals data frame is the plotted
summary table. With summary = "branch", it has one row per branch. With
summary = "interval", it has one row per plotted depth-grid interval. When
uncertainty = TRUE, this table includes across-fit summaries for each
plotted row: weighted mean, weighted median, weighted standard deviation,
quantiles, highest-density interval bounds, quantile and highest-density
interval widths, coefficient of variation, and the number of finite run-level
values. The run-level values are also returned in run_values as one matrix
per edge.
These are weighted empirical summaries of run-level values, not posterior
uncertainty or model-internal uncertainty unless the retained inputs
themselves have that interpretation. For posterior tree samples, use
weights = "equal" if each retained run represents one posterior draw.
value_summary controls whether the plotted value column uses the weighted
mean or weighted median. These summaries are computed on the log-rate scale
when log = TRUE. Highest-density intervals use the same shortest empirical
interval calculation for both equal and unequal fit weights. For
weights = "equal", sd is the ordinary sample standard deviation of
retained run-level values. For unequal weights, sd is the square root of the
normalized weighted variance, sum(w * (x - mu)^2), with weights normalized
to sum to one.
Display views. Returned objects include a default category-style display
mapping so they can be plotted immediately. Palette, category-bin, legend
title, and alternative-value choices are intentionally controlled by
plot(x, ...) or rateMapView(), not by rateMap(). For a single bifrost
search with log = FALSE, category display is the closest formal analogue to
the illustrative generateViridisColorScale() plot. For multi-run summaries,
displayed categories are bins for summarized branch values; they should not be
read as newly inferred bifrost regimes. When summary = "branch" and
color_mode = "category", the rate_categories table also reports
branch-level summaries for the plotted values assigned to each bin. These
summaries are recomputed by rateMapView() or plot() whenever the plotted
value or category breaks change. They are not computed for summary = "interval" because interval rows depend on the plotting grid rather than on
discrete biological branch units.
Rate diagnostics. Branch-summary maps compute rate diagnostics through
rateMapControl(rate_flags = rateMapRateFlags()). The default
rateMapRateFlags() setting records the fitted-rate range and fold range but
applies no special near-zero or high-outlier rule. Supplying zero_floor, equivalently
rateMapRateFlags(method = "floor", zero_floor = ...), flags finite rates at
or below an explicit manual floor. Use rateMapRateFlags(method = "tail_cluster") to apply a deterministic, Otsu-style guarded two-class split
on sorted log rates when the near-zero values form a broader separated
lower-tail cluster rather than one extreme adjacent gap. The split is kept
only when gap, fold-reduction, tail-fraction, and minimum-count guardrails are
satisfied. These flags never remove branches and never alter
intervals$value; they add rate_flag metadata, rate_flag_source
provenance, object-level rate_diagnostics, and, in category display mode,
special display categories such as "near-zero". Special categories are
colored outside the ordered palette, so the remaining regular categories
still use the full low-to-high palette range. In category legends, diagnostic
cutoffs mark where special categories end and regular bins begin. Set
rate_flags = NULL to turn off rate-flag metadata and diagnostics entirely.
An object of class "rateMap" with components:
treeA SIMMAP-style tree whose mapped segments encode color-bin indices in continuous mode, or named rate categories in category mode.
colsThe resolved color palette. In continuous mode this has
length ncolors; in category mode it has one color per exact value or
category bin.
limsNumeric length-2 vector giving the plotted value range.
breaksNumeric vector of palette bin boundaries in continuous mode, or category boundaries/values in category mode.
valuesList of plotted-row central values by edge before color binning.
intervalsPlotted summary table. With summary = "branch", this
has one row per branch. With summary = "interval", this has one row per
plotted depth-grid interval. When branch-level rate diagnostics are
enabled, this table also includes rate_for_flagging, rate_flag,
rate_flag_source, is_near_zero, and is_high_outlier.
rate_categoriesData frame describing discrete rate categories
when color_mode = "category"; otherwise NULL. With summary = "branch", this table also includes bin-level summaries of the plotted
branch values, including n_branches, value_mean, value_median,
value_min, value_max, value_sd, and total_branch_length.
run_valuesWhen uncertainty = TRUE, list of numeric matrices
containing run-level values for each edge. Matrix rows match the plotted
rows for that edge and columns are retained fits. Otherwise NULL.
clade_keyCharacter descendant-tip key for each target-tree edge.
edge_matchesInteger matrix mapping target-tree edge rows to matched source-tree edge rows for each retained fit.
summaryThe summary mode used, "interval" or "branch".
uncertaintyLogical indicating whether uncertainty summaries were computed.
value_summaryCentral estimate used for the plotted summary table
column intervals$value.
quantile_probsQuantile probabilities used for uncertainty summaries.
highest_density_interval_probHighest-density interval mass used for uncertainty summaries.
rate_diagnosticsList summarizing rate-flag settings, counts, detected tail cutoffs, and fold-rate ranges with and without flagged branches.
rate_flagsThe normalized "rateMap_rate_flags" control object
used for rate diagnostics.
rate_flag_sourceCharacter name of the rate-valued column used to
compute or preserve rate_flag metadata, or NA when diagnostics are
disabled.
plot_valueCurrent interval column mapped to branch colors.
targetTarget-tree selection mode used.
checkTree compatibility check mode used.
weightsNormalized fit weights used for aggregation.
weight_modeWeighting mode used: "equal", "ic", or
"custom".
weight_tableData frame linking retained input indices, weights, and IC values when available.
paletteOriginal palette specification.
reverse_paletteLogical indicating whether the palette was reversed.
ncolorsStored continuous-ramp resolution used when recoloring
with color_mode = "continuous" and no explicit ncolors.
color_modeColoring mode used for the current tree.
n_categoriesCategory count target used when
color_mode = "category".
category_breaksCategory breaks or exact category values used
when color_mode = "category".
category_labelsCategory labels used when
color_mode = "category".
category_bin_methodAutomatic category-binning method used when
color_mode = "category" and category_breaks = NULL.
titleLegend title used for plotting.
n_fitsNumber of fits used after validation or omission.
omittedInteger indices of omitted fits when na_action = "omit".
Paradis, E., and Schliep, K. (2019). ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics, 35, 526-528. doi:10.1093/bioinformatics/bty633
Clavel, J., Aristide, L., and Morlon, H. (2019). A penalized likelihood framework for high-dimensional phylogenetic comparative methods and an application to new-world monkeys brain evolution. Systematic Biology, 68, 93-116. doi:10.1093/sysbio/syy045
Revell, L. J. (2013). Two new graphical methods for mapping trait evolution on phylogenies. Methods in Ecology and Evolution, 4, 754-759.
Revell, L. J. (2024). phytools 2.0: an updated R ecosystem for phylogenetic comparative methods (and other things). PeerJ, 12, e16505. doi:10.7717/peerj.16505
plot.rateMap(), rateMapView(), rateMapControl(),
phytools::densityMap(), phytools::plotSimmap()
## Not run: # A list of completed bifrost searches can be summarized directly: rm_obj <- rateMap(list(search_a, search_b, search_c)) plot(rm_obj, type = "arc", show_tip_labels = FALSE) # A single completed bifrost search can be inspected with the same API: one_search_map <- rateMap(search_a) plot(one_search_map, type = "arc", show_tip_labels = FALSE) # Scratch-style lists remain supported: scratch_fit <- list( variables = list(tree = mapped_tree), param = c("0" = 0.12, "1" = 0.45) ) rateMap(list(scratch_fit)) # Use bifrost model-level IC weights across comparable sensitivity runs: rm_ic <- rateMap(list(search_a, search_b), weights = "ic") # Branch-level, discrete-category maps are the bifrost default: branch_rates <- rateMap(list(search_a, search_b)) branch_rates$rate_categories # To mimic the old jaw-shape preview style for a single search, use raw # fitted rates and category colors: jaw_view <- rateMapView(rateMap(search_a, log = FALSE), palette = viridis::viridis) # Category bins use pretty breaks by default. Use equal-width bins or custom # boundaries when those are easier to compare across figures: equal_bin_rates <- rateMapView( branch_rates, n_categories = 5, category_bin_method = "equal" ) custom_bin_rates <- rateMapView( branch_rates, category_breaks = c(-4, -2, 0, 2), category_labels = c("slow", "middle", "fast") ) # Continuous interval maps remain available for stochastic maps with # along-branch changes: interval_rates <- rateMap( list(search_a, search_b), summary = "interval", control = list(res = 200) ) plot(interval_rates, color_mode = "continuous") # Custom fit weights are normalized internally: custom_weighted <- rateMap( list(search_a, search_b, search_c), weights = c(2, 1, 1), summary = "branch" ) # Posterior trees with the same topology but different branch lengths can be # summarized on any explicit target or summary tree: posterior_target_rates <- rateMap( posterior_fit_list, target_tree = summary_tree, summary = "branch", weights = "equal", uncertainty = TRUE, control = list(check = "topology") ) plot(posterior_target_rates, value = "sd", type = "arc") # Or choose the retained input tree with the highest summed log clade # credibility as the plotting scaffold: posterior_mcc_rates <- rateMap( posterior_fit_list, summary = "branch", control = list(check = "topology", target = "mcc") ) # Large sensitivity sets can be explicitly subsampled before plotting. # By default, rates are mapped on the log scale: set.seed(1) idx <- sample(seq_along(fit_list), size = 1000) rm_sub <- rateMap( fit_list[idx], workers = 8, control = list(res = 100, future_strategy = "multisession") ) # Use log = FALSE only when a raw-rate scale is preferred: rm_raw <- rateMap(fit_list[idx], log = FALSE) plot( rm_sub, type = "arc", show_tip_labels = FALSE, lwd = 1, palette = c("lightblue", "blue", "pink", "red") ) ## End(Not run)## Not run: # A list of completed bifrost searches can be summarized directly: rm_obj <- rateMap(list(search_a, search_b, search_c)) plot(rm_obj, type = "arc", show_tip_labels = FALSE) # A single completed bifrost search can be inspected with the same API: one_search_map <- rateMap(search_a) plot(one_search_map, type = "arc", show_tip_labels = FALSE) # Scratch-style lists remain supported: scratch_fit <- list( variables = list(tree = mapped_tree), param = c("0" = 0.12, "1" = 0.45) ) rateMap(list(scratch_fit)) # Use bifrost model-level IC weights across comparable sensitivity runs: rm_ic <- rateMap(list(search_a, search_b), weights = "ic") # Branch-level, discrete-category maps are the bifrost default: branch_rates <- rateMap(list(search_a, search_b)) branch_rates$rate_categories # To mimic the old jaw-shape preview style for a single search, use raw # fitted rates and category colors: jaw_view <- rateMapView(rateMap(search_a, log = FALSE), palette = viridis::viridis) # Category bins use pretty breaks by default. Use equal-width bins or custom # boundaries when those are easier to compare across figures: equal_bin_rates <- rateMapView( branch_rates, n_categories = 5, category_bin_method = "equal" ) custom_bin_rates <- rateMapView( branch_rates, category_breaks = c(-4, -2, 0, 2), category_labels = c("slow", "middle", "fast") ) # Continuous interval maps remain available for stochastic maps with # along-branch changes: interval_rates <- rateMap( list(search_a, search_b), summary = "interval", control = list(res = 200) ) plot(interval_rates, color_mode = "continuous") # Custom fit weights are normalized internally: custom_weighted <- rateMap( list(search_a, search_b, search_c), weights = c(2, 1, 1), summary = "branch" ) # Posterior trees with the same topology but different branch lengths can be # summarized on any explicit target or summary tree: posterior_target_rates <- rateMap( posterior_fit_list, target_tree = summary_tree, summary = "branch", weights = "equal", uncertainty = TRUE, control = list(check = "topology") ) plot(posterior_target_rates, value = "sd", type = "arc") # Or choose the retained input tree with the highest summed log clade # credibility as the plotting scaffold: posterior_mcc_rates <- rateMap( posterior_fit_list, summary = "branch", control = list(check = "topology", target = "mcc") ) # Large sensitivity sets can be explicitly subsampled before plotting. # By default, rates are mapped on the log scale: set.seed(1) idx <- sample(seq_along(fit_list), size = 1000) rm_sub <- rateMap( fit_list[idx], workers = 8, control = list(res = 100, future_strategy = "multisession") ) # Use log = FALSE only when a raw-rate scale is preferred: rm_raw <- rateMap(fit_list[idx], log = FALSE) plot( rm_sub, type = "arc", show_tip_labels = FALSE, lwd = 1, palette = c("lightblue", "blue", "pink", "red") ) ## End(Not run)
rateMap() OptionsBuild a control object for less common rateMap() settings. These options are
useful for interval maps, same-topology tree samples, custom fit object
shapes, invalid-fit handling, and future-based parallel scheduling.
rateMapControl( res = 100, check = TRUE, target = c("first", "mcc"), tree_fun = NULL, param_fun = NULL, na_action = c("error", "omit"), rate_flags = rateMapRateFlags(), quantile_probs = c(0.025, 0.975), highest_density_interval_prob = 0.95, future_strategy = c("multisession", "multicore"), future_seed = FALSE, future_scheduling = 1, future_chunk_size = NULL )rateMapControl( res = 100, check = TRUE, target = c("first", "mcc"), tree_fun = NULL, param_fun = NULL, na_action = c("error", "omit"), rate_flags = rateMapRateFlags(), quantile_probs = c(0.025, 0.975), highest_density_interval_prob = 0.95, future_strategy = c("multisession", "multicore"), future_seed = FALSE, future_scheduling = 1, future_chunk_size = NULL )
res |
Integer resolution of the global depth grid used to subdivide
target-tree branches when |
check |
Logical or character check mode. |
target |
Character target-tree selection when |
tree_fun |
Optional function used to extract a mapped tree from each
element of |
param_fun |
Optional function used to extract a named numeric vector of
state-specific fitted rates from each element of |
na_action |
What to do when a run has invalid parameters. |
rate_flags |
A |
quantile_probs |
Numeric length-2 vector of quantile probabilities to
report when |
highest_density_interval_prob |
Numeric scalar giving the
highest-density interval mass to report when |
future_strategy |
Future backend used only when |
future_seed |
Seed control passed to |
future_scheduling |
Scheduling control passed to
|
future_chunk_size |
Chunk size passed to
|
A "rateMap_control" object for the control argument of
rateMap().
## Not run: # Same-topology tree samples with differing branch lengths: ctrl <- rateMapControl(check = "topology") posterior_rates <- rateMap( posterior_fit_list, target_tree = summary_tree, uncertainty = TRUE, control = ctrl ) # Interval maps with a finer depth grid: interval_rates <- rateMap( fits, summary = "interval", control = rateMapControl(res = 200) ) ## End(Not run)## Not run: # Same-topology tree samples with differing branch lengths: ctrl <- rateMapControl(check = "topology") posterior_rates <- rateMap( posterior_fit_list, target_tree = summary_tree, uncertainty = TRUE, control = ctrl ) # Interval maps with a finer depth grid: interval_rates <- rateMap( fits, summary = "interval", control = rateMapControl(res = 200) ) ## End(Not run)
rateMap()
Build a rate-flagging control object for rateMapControl(). These options
identify branch-rate summaries that are effectively zero, or optionally in a
separated high-rate tail, without deleting or changing the fitted rates. Flags
are reported in the returned intervals table and summarized in
rate_diagnostics.
rateMapRateFlags( near_zero = NULL, high_outlier = FALSE, method = NULL, zero_floor = NULL, cluster_min_log_gap = log(10), cluster_min_fold_reduction = 10, cluster_max_tail_fraction = 0.4, cluster_min_flagged = 3, cluster_min_regular = 10, zero_label = "near-zero", high_label = "high-outlier", zero_color = "grey70", high_color = "black" )rateMapRateFlags( near_zero = NULL, high_outlier = FALSE, method = NULL, zero_floor = NULL, cluster_min_log_gap = log(10), cluster_min_fold_reduction = 10, cluster_max_tail_fraction = 0.4, cluster_min_flagged = 3, cluster_min_regular = 10, zero_label = "near-zero", high_label = "high-outlier", zero_color = "grey70", high_color = "black" )
near_zero |
Logical or |
high_outlier |
Logical; if |
method |
Optional detection method. |
zero_floor |
Optional non-negative rate floor. When supplied with
|
cluster_min_log_gap |
Minimum log-rate gap required between a cluster-defined tail and the remaining regular rates. |
cluster_min_fold_reduction |
Minimum fold-range reduction required after separating a cluster-defined tail from the regular rates. |
cluster_max_tail_fraction |
Maximum fraction of positive finite rates that can be assigned to a cluster-defined tail. |
cluster_min_flagged |
Minimum number of branches required for a cluster-defined tail. |
cluster_min_regular |
Minimum number of branches that must remain in the regular rate set after separating a cluster-defined tail. |
zero_label, high_label
|
Labels used for flagged display categories. |
zero_color, high_color
|
Colors used for flagged display categories in category mode. |
The "tail_cluster" method is an Otsu-style diagnostic adapted to sorted
branch log rates. It evaluates two-class splits and keeps a tail only when
guardrails for log-rate gap size, fold-range reduction, tail fraction, and
minimum counts are all satisfied. It is display metadata, not data deletion,
model correction, or formal threshold selection.
A normalized "rateMap_rate_flags" object for
rateMapControl(rate_flags = ).
Otsu, N. (1979). A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man, and Cybernetics, 9(1), 62-66. doi:10.1109/TSMC.1979.4310076
## Not run: ctrl <- rateMapControl(rate_flags = rateMapRateFlags(zero_floor = 1e-8)) rm_obj <- rateMap(fits, control = ctrl) rm_obj$rate_diagnostics cluster_ctrl <- rateMapControl( rate_flags = rateMapRateFlags(method = "tail_cluster") ) rm_obj <- rateMap(fits, control = cluster_ctrl) rm_obj$rate_diagnostics ## End(Not run)## Not run: ctrl <- rateMapControl(rate_flags = rateMapRateFlags(zero_floor = 1e-8)) rm_obj <- rateMap(fits, control = ctrl) rm_obj$rate_diagnostics cluster_ctrl <- rateMapControl( rate_flags = rateMapRateFlags(method = "tail_cluster") ) rm_obj <- rateMap(fits, control = cluster_ctrl) rm_obj$rate_diagnostics ## End(Not run)
rateMap ObjectRecompute the display mapping for a computed "rateMap" object without
drawing it. This is optional in the usual rateMap() then plot() workflow;
use it when you want to save or inspect category tables, color bins, or an
uncertainty-valued map object and then reuse the same display mapping in one
or more plots. The numeric branch or interval summaries computed by
rateMap() are not recomputed.
rateMapView( x, value = "value", palette = NULL, reverse_palette = NULL, color_mode = NULL, n_categories = NULL, category_bin_method = NULL, category_breaks = NULL, category_labels = NULL, ncolors = NULL, legend_title = NULL )rateMapView( x, value = "value", palette = NULL, reverse_palette = NULL, color_mode = NULL, n_categories = NULL, category_bin_method = NULL, category_breaks = NULL, category_labels = NULL, ncolors = NULL, legend_title = NULL )
x |
An object of class |
value |
Name of the numeric column in |
palette |
Optional palette override. This can be an |
reverse_palette |
Optional logical override for palette reversal. |
color_mode |
Optional color-mode override. Use |
n_categories |
Optional target category count for
|
category_bin_method |
Optional automatic category-binning method:
|
category_breaks |
Optional strictly increasing category boundaries. |
category_labels |
Optional labels for displayed categories. |
ncolors |
Optional number of colors for continuous ramps. If omitted,
the stored |
legend_title |
Optional legend title stored on the returned object. |
A "rateMap" object with updated tree maps, color palette,
intervals$value, rate_categories, and legend title. For branch-summary
category views, rate_categories includes bin-level summaries of the
plotted branch values and is recomputed whenever the view changes. For
branch-summary maps with active rate diagnostics, diagnostic flags are
recomputed for rate-valued views ("value", "mean", or "median") and
preserved as metadata for non-rate views such as "sd". When preserved for
a non-rate view, rate_flag_source identifies the
rate-valued column that the flags classify; the flags do not classify the
displayed uncertainty value. The selected value column must contain
finite values for every plotted interval; uncertainty columns such as
"sd" may be all NA for single-fit objects and are rejected with a
clear error.
## Not run: rm_obj <- rateMap(fits, uncertainty = TRUE) display_obj <- rateMapView( rm_obj, palette = "Viridis", n_categories = 5, legend_title = "Mean log fitted rate" ) plot(display_obj, type = "arc", show_tip_labels = FALSE) sd_obj <- rateMapView( rm_obj, value = "sd", palette = "Inferno", color_mode = "continuous" ) ## End(Not run)## Not run: rm_obj <- rateMap(fits, uncertainty = TRUE) display_obj <- rateMapView( rm_obj, palette = "Viridis", n_categories = 5, legend_title = "Mean log fitted rate" ) plot(display_obj, type = "arc", show_tip_labels = FALSE) sd_obj <- rateMapView( rm_obj, value = "sd", palette = "Inferno", color_mode = "continuous" ) ## End(Not run)
Greedy, stepwise search for evolutionary regime shifts on a phylogeny
using multivariate mvgls fits from mvMORPH. The routine:
builds one-shift candidate trees for all internal nodes meeting a tip-size threshold
(via generatePaintedTrees),
fits each candidate in parallel and ranks them by improvement in the chosen
information criterion (IC; GIC or BIC),
iteratively adds shifts that pass a user-defined acceptance threshold,
optionally revisits accepted shifts to prune overfitting using a small IC tolerance window,
optionally computes per-shift IC weights by refitting the model with each shift removed.
Models are fitted directly in multivariate trait space (no PCA), assuming a multi-rate
Brownian Motion with proportional VCV scaling across regimes. Extra arguments in ...
are forwarded to mvgls. In practice, method and
error are often the most important of these: the package vignettes use
method = "H&L" for intercept-only, high-dimensional response matrices and
method = "LL" for formula-based searches with predictors, while
error = TRUE asks mvgls() to estimate a nuisance measurement-error
(intraspecific-variance) term from the data.
searchOptimalConfiguration( baseline_tree, trait_data, formula = "trait_data ~ 1", min_descendant_tips, num_cores = 2, ic_uncertainty_threshold = 1, shift_acceptance_threshold = 1, uncertaintyweights = FALSE, uncertaintyweights_par = FALSE, plot = FALSE, IC = "GIC", store_model_fit_history = TRUE, verbose = FALSE, ... )searchOptimalConfiguration( baseline_tree, trait_data, formula = "trait_data ~ 1", min_descendant_tips, num_cores = 2, ic_uncertainty_threshold = 1, shift_acceptance_threshold = 1, uncertaintyweights = FALSE, uncertaintyweights_par = FALSE, plot = FALSE, IC = "GIC", store_model_fit_history = TRUE, verbose = FALSE, ... )
baseline_tree |
A rooted |
trait_data |
A |
formula |
Character string or formula object passed to |
min_descendant_tips |
Integer ( |
num_cores |
Integer. Number of workers for candidate scoring. Uses plain
serial evaluation when |
ic_uncertainty_threshold |
Numeric ( |
shift_acceptance_threshold |
Numeric ( |
uncertaintyweights |
Logical. If |
uncertaintyweights_par |
Logical. As above, but compute per-shift IC weights in parallel
using future.apply. Exactly one of |
plot |
Logical. If |
IC |
Character. Which information criterion to use, one of |
store_model_fit_history |
Logical. If |
verbose |
Logical. If |
... |
Additional arguments passed to |
Input requirements.
Tree: baseline_tree should be a rooted phylo tree
with branch lengths interpreted in units of time. An ultrametric tree is not required.
The starting tree does not need to already be painted; searchOptimalConfiguration()
paints a single baseline regime internally before building shifted candidates.
Trait data alignment: rownames(trait_data) must match
baseline_tree$tip.label in both names and order; any tips without data should be
pruned beforehand.
Data type: trait_data is typically a numeric matrix of continuous traits;
numeric response-only data.frames are also supported for intercept-only
searches, and named mixed-type data.frames are supported for richer
formulas. High-dimensional settings (p n) are supported via
penalized-likelihood mvgls() fits.
Search outline.
Baseline: Fit mvgls on the baseline tree (single regime) to obtain the baseline IC.
Candidates: Build one-shift trees for eligible internal nodes
(generatePaintedTrees); fit each with
fitMvglsAndExtractGIC.formula or fitMvglsAndExtractBIC.formula
(internal helpers; not exported) and rank by IC.
Greedy add: Add the top candidate, refit, and accept if
IC shift_acceptance_threshold; continue down the ranked list.
Optional IC weights: If uncertaintyweights (or uncertaintyweights_par)
is TRUE, compute an IC weight for each accepted shift by refitting the final model with that
shift removed and comparing the two ICs via aicw.
Parallelization. When num_cores = 1, candidates are scored serially. For
larger values, candidate sub-model fits are distributed with future +
future.apply. On Unix outside RStudio, multicore is used; otherwise
multisession is used. A sequential plan is restored afterward.
Plotting. If plot = TRUE, trees are rendered with
plotSimmap(); shift IDs are labeled with nodelabels().
Regime VCVs. The returned $VCVs are extracted from the fitted multi-regime model via
extractRegimeVCVs and reflect regime-specific covariance
estimates (when mvgls is fitted under a PL/ML method).
For high-dimensional trait datasets (p n), penalized-likelihood settings in
mvgls() are often required for stable estimation. The package vignettes
distinguish two common workflows. For intercept-only searches on high-dimensional
response matrices (for example, GPA-aligned landmark data), the jaw-shape vignette
uses method = "H&L" with the default "RidgeArch" penalty; in
mvMORPH, this is a fast approximation to penalized LOOCV and is only available
for intercept-only models. For formula-based searches with predictors, the avian
skeleton vignette uses method = "LL" instead. When IC = "BIC",
method = "LL" should be used. Across empirical workflows, error = TRUE
is often a sensible default because it asks mvgls() to estimate a nuisance
measurement-error (intraspecific-variance) term from the data. Users should consult
the mvMORPH documentation for details on available methods and penalties and
tune these choices to the structure of their data.
A named list with (at minimum):
user_input: captured call (as a list) for reproducibility.
tree_no_uncertainty_transformed: SIMMAP tree from the optimal (no-uncertainty) model
on the transformed scale used internally by mvgls.
tree_no_uncertainty_untransformed: same topology with original edge lengths restored.
model_no_uncertainty: the final mvgls model object.
shift_nodes_no_uncertainty: integer vector of accepted shift nodes.
optimal_ic: final IC value; baseline_ic: baseline IC.
IC_used: "GIC" or "BIC"; num_candidates: count of candidate one-shift models evaluated.
model_fit_history: if store_model_fit_history = TRUE, a list of per-iteration fits
(loaded from temporary files written during the run) and an ic_acceptance_matrix
(IC value and acceptance flag per step).
VCVs: named list of regime-specific VCV matrices extracted from the final model
(penalized-likelihood estimates if PL was used).
Additional components appear conditionally:
ic_weights: a data.frame of per-shift IC weights and evidence ratios when
uncertaintyweights or uncertaintyweights_par is TRUE.
warnings: character vector of warnings/errors encountered during fitting (if any).
The search is greedy and may converge to a local optimum. Use a stricter
shift_acceptance_threshold to reduce overfitting, and re-run the search
with different min_descendant_tips and IC choices ("GIC" vs "BIC")
to assess stability of the inferred shifts. For a given run, the optional IC-weight
calculations (uncertaintyweights or uncertaintyweights_par) can be used
to quantify support for individual shifts. It is often helpful to repeat the analysis
under slightly different settings (e.g., thresholds or candidate-size constraints) and
compare the resulting sets of inferred shifts.
Internally, this routine coordinates multiple unexported helper functions:
generatePaintedTrees, fitMvglsAndExtractGIC.formula,
fitMvglsAndExtractBIC.formula, addShiftToModel,
removeShiftFromTree, and extractRegimeVCVs. Through these,
it may also invoke lower-level utilities such as paintSubTree_mod
and paintSubTree_removeShift. These helpers are internal
implementation details and are not part of the public API.
mvgls, GIC, BIC,
plot_ic_acceptance_matrix for visualizing IC trajectories and shift
acceptance decisions, and generateViridisColorScale for mapping
regime-specific rates or parameters to a viridis color scale when plotting trees;
packages: mvMORPH, future, future.apply, phytools, ape.
library(ape) library(phytools) library(mvMORPH) set.seed(1) # Simulate a tree tr <- pbtree(n = 50, scale = 1) # Define two regimes: "0" (baseline) and "1" (high-rate) on a subset of tips states <- setNames(rep("0", Ntip(tr)), tr$tip.label) high_clade_tips <- tr$tip.label[1:20] states[high_clade_tips] <- "1" # Make a SIMMAP tree for the BMM simulation simmap <- phytools::make.simmap(tr, states, model = "ER", nsim = 1) # Simulate traits under a BMM model with ~10x higher rate in regime "1" sigma <- list( "0" = diag(0.1, 2), "1" = diag(1.0, 2) ) theta <- c(0, 0) sim <- mvMORPH::mvSIM( tree = simmap, nsim = 1, model = "BMM", param = list( ntraits = 2, sigma = sigma, theta = theta ) ) # mvSIM returns either a matrix or a list of matrices depending on mvMORPH version X <- if (is.list(sim)) sim[[1]] else sim rownames(X) <- simmap$tip.label # Run the search on the unpainted tree (single baseline regime) res <- searchOptimalConfiguration( baseline_tree = as.phylo(simmap), trait_data = X, formula = "trait_data ~ 1", min_descendant_tips = 10, num_cores = 1, # keep it simple / CRAN-safe shift_acceptance_threshold = 20, # conservative GIC threshold IC = "GIC", plot = FALSE, store_model_fit_history = FALSE, verbose = FALSE ) res$shift_nodes_no_uncertainty res$optimal_ic - res$baseline_ic str(res$VCVs) ## Not run: # Intercept-only empirical-style search: # high-dimensional response matrix with H&L + measurement error res_hl <- searchOptimalConfiguration( baseline_tree = as.phylo(simmap), trait_data = X, formula = "trait_data ~ 1", min_descendant_tips = 10, num_cores = 2, shift_acceptance_threshold = 20, uncertaintyweights_par = TRUE, IC = "GIC", plot = FALSE, method = "H&L", error = TRUE, store_model_fit_history = TRUE, verbose = TRUE ) # Formula-based search with a predictor: # use LL when the model includes predictors dat <- data.frame( trait1 = X[, 1], trait2 = X[, 2], predictor = rnorm(nrow(X)) ) rownames(dat) <- simmap$tip.label res_ll <- searchOptimalConfiguration( baseline_tree = as.phylo(simmap), trait_data = dat, formula = "trait_data[, 1:2] ~ trait_data[, 3]", min_descendant_tips = 10, num_cores = 2, shift_acceptance_threshold = 20, IC = "GIC", plot = FALSE, method = "LL", error = TRUE, store_model_fit_history = TRUE, verbose = TRUE ) ## End(Not run)library(ape) library(phytools) library(mvMORPH) set.seed(1) # Simulate a tree tr <- pbtree(n = 50, scale = 1) # Define two regimes: "0" (baseline) and "1" (high-rate) on a subset of tips states <- setNames(rep("0", Ntip(tr)), tr$tip.label) high_clade_tips <- tr$tip.label[1:20] states[high_clade_tips] <- "1" # Make a SIMMAP tree for the BMM simulation simmap <- phytools::make.simmap(tr, states, model = "ER", nsim = 1) # Simulate traits under a BMM model with ~10x higher rate in regime "1" sigma <- list( "0" = diag(0.1, 2), "1" = diag(1.0, 2) ) theta <- c(0, 0) sim <- mvMORPH::mvSIM( tree = simmap, nsim = 1, model = "BMM", param = list( ntraits = 2, sigma = sigma, theta = theta ) ) # mvSIM returns either a matrix or a list of matrices depending on mvMORPH version X <- if (is.list(sim)) sim[[1]] else sim rownames(X) <- simmap$tip.label # Run the search on the unpainted tree (single baseline regime) res <- searchOptimalConfiguration( baseline_tree = as.phylo(simmap), trait_data = X, formula = "trait_data ~ 1", min_descendant_tips = 10, num_cores = 1, # keep it simple / CRAN-safe shift_acceptance_threshold = 20, # conservative GIC threshold IC = "GIC", plot = FALSE, store_model_fit_history = FALSE, verbose = FALSE ) res$shift_nodes_no_uncertainty res$optimal_ic - res$baseline_ic str(res$VCVs) ## Not run: # Intercept-only empirical-style search: # high-dimensional response matrix with H&L + measurement error res_hl <- searchOptimalConfiguration( baseline_tree = as.phylo(simmap), trait_data = X, formula = "trait_data ~ 1", min_descendant_tips = 10, num_cores = 2, shift_acceptance_threshold = 20, uncertaintyweights_par = TRUE, IC = "GIC", plot = FALSE, method = "H&L", error = TRUE, store_model_fit_history = TRUE, verbose = TRUE ) # Formula-based search with a predictor: # use LL when the model includes predictors dat <- data.frame( trait1 = X[, 1], trait2 = X[, 2], predictor = rnorm(nrow(X)) ) rownames(dat) <- simmap$tip.label res_ll <- searchOptimalConfiguration( baseline_tree = as.phylo(simmap), trait_data = dat, formula = "trait_data[, 1:2] ~ trait_data[, 3]", min_descendant_tips = 10, num_cores = 2, shift_acceptance_threshold = 20, IC = "GIC", plot = FALSE, method = "LL", error = TRUE, store_model_fit_history = TRUE, verbose = TRUE ) ## End(Not run)