% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/data_frame_methods.R, R/ggaverage.R,
%   R/ggeffect.R, R/ggemmeans.R, R/ggpredict.R
\name{as.data.frame.ggeffects}
\alias{as.data.frame.ggeffects}
\alias{ggaverage}
\alias{ggeffect}
\alias{ggemmeans}
\alias{ggpredict}
\title{Adjusted predictions from regression models}
\usage{
\method{as.data.frame}{ggeffects}(
  x,
  row.names = NULL,
  optional = FALSE,
  ...,
  stringsAsFactors = FALSE,
  terms_to_colnames = FALSE
)

ggaverage(
  model,
  terms,
  ci_level = 0.95,
  type = "fixed",
  typical = "mean",
  condition = NULL,
  back_transform = TRUE,
  vcov_fun = NULL,
  vcov_type = NULL,
  vcov_args = NULL,
  weights = NULL,
  verbose = TRUE,
  ...
)

ggeffect(model, terms, ci_level = 0.95, verbose = TRUE, ci.lvl = ci_level, ...)

ggemmeans(
  model,
  terms,
  ci_level = 0.95,
  type = "fixed",
  typical = "mean",
  condition = NULL,
  back_transform = TRUE,
  vcov_fun = NULL,
  vcov_type = NULL,
  vcov_args = NULL,
  interval = "confidence",
  verbose = TRUE,
  ci.lvl = ci_level,
  back.transform = back_transform,
  ...
)

ggpredict(
  model,
  terms,
  ci_level = 0.95,
  type = "fixed",
  typical = "mean",
  condition = NULL,
  back_transform = TRUE,
  ppd = FALSE,
  vcov_fun = NULL,
  vcov_type = NULL,
  vcov_args = NULL,
  interval,
  verbose = TRUE,
  ci.lvl = ci_level,
  back.transform = back_transform,
  vcov.fun = vcov_fun,
  vcov.type = vcov_type,
  vcov.args = vcov_args,
  ...
)
}
\arguments{
\item{x}{An object of class \code{ggeffects}, as returned by \code{predict_response()},
\code{ggpredict()}, \code{ggeffect()}, \code{ggaverage()} or \code{ggemmeans()}.}

\item{row.names}{\code{NULL} or a character vector giving the row
    names for the data frame.  Missing values are not allowed.}

\item{optional}{logical. If \code{TRUE}, setting row names and
    converting column names (to syntactic names: see
    \code{\link[base]{make.names}}) is optional.  Note that all of \R's
    \pkg{base} package \code{as.data.frame()} methods use
    \code{optional} only for column names treatment, basically with the
    meaning of \code{\link[base]{data.frame}(*, check.names = !optional)}.
    See also the \code{make.names} argument of the \code{matrix} method.}

\item{...}{Arguments are passed down to \code{ggpredict()} (further down to \code{predict()})
or \code{ggemmeans()} (and thereby to \code{emmeans::emmeans()}). If \code{type = "simulate"},
\code{...} may also be used to set the number of simulations, e.g. \code{nsim = 500}.
When calling \code{ggeffect()}, further arguments are passed down to \code{effects::Effect()}.}

\item{stringsAsFactors}{logical: should the character vector be converted
    to a factor?}

\item{terms_to_colnames}{Logical, if \code{TRUE}, standardized column names (like
\code{"x"}, \code{"group"} or \code{"facet"}) are replaced by the variable names of the focal
predictors specified in \code{terms}.}

\item{model}{A model object, or a list of model objects.}

\item{terms}{Names of those terms from \code{model} for which predictions should
be displayed (so-called \emph{focal terms}). Can be:
\itemize{
\item A character vector, specifying the names of the focal terms. This is the
preferred and probably most flexible way to specify focal terms, e.g.
\code{terms = "x [40:60]"}, to calculate predictions for the values 40 to 60.
\item A list, where each element is a named vector, specifying the focal terms
and their values. This is the "classical" R way to specify focal terms,
e.g. \code{list(x = 40:60)}.
\item A formula, e.g. \code{terms = ~ x + z}, which is internally converted to a
character vector. This is probably the least flexible way, as you cannot
specify representative values for the focal terms.
\item A data frame representing a "data grid" or "reference grid". Predictions
are then made for all combinations of the variables in the data frame.
}
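
The four ways of specifying focal terms listed above can be sketched as follows.
This is a minimal illustration only: the model \code{m} and the chosen values are
assumptions based on the built-in \code{mtcars} data, not part of the package
documentation.

\preformatted{library(ggeffects)

# illustrative model (assumption): built-in mtcars data
m <- lm(mpg ~ hp + wt, data = mtcars)

# character vector, with values for the focal term given in brackets
ggpredict(m, terms = "hp [100:150]")

# list of named vectors
ggpredict(m, terms = list(hp = 100:150))

# formula; no representative values can be specified here
ggpredict(m, terms = ~ hp + wt)

# data frame used as data grid: all combinations of hp and wt
ggpredict(m, terms = expand.grid(hp = c(100, 150), wt = c(2, 3)))
}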

\code{terms} requires at least one variable name. The maximum length is four terms,
where the second to fourth terms indicate the groups, i.e. predictions of the first
term are grouped at meaningful values or levels of the remaining terms (see
\code{\link[=values_at]{values_at()}}). It is also possible to define specific values for focal
terms, at which adjusted predictions should be calculated (see details below).
All remaining covariates that are not specified in \code{terms} are "marginalized",
see the \code{margin} argument in \code{?predict_response}. See also argument \code{condition}
to fix non-focal terms to specific values, and argument \code{typical} for
\code{ggpredict()} or \code{ggemmeans()}.}

\item{ci_level}{Numeric, the level of the confidence intervals. Use
\code{ci_level = NA} if confidence intervals should not be calculated
(for instance, due to computation time). Typically, confidence intervals are
based on the returned standard errors for the predictions, assuming a t- or
normal distribution (based on the model and the available degrees of freedom,
i.e. roughly \verb{+/- 1.96 * SE}). See introduction of
\href{https://strengejacke.github.io/ggeffects/articles/ggeffects.html}{this vignette}
for more details.}

\item{type}{Character, indicating whether predictions should be conditioned
on specific model components or not. Consequently, most options only apply
for survival models, mixed effects models and/or models with zero-inflation
(and their Bayesian counterparts); the only exception is \code{type = "simulate"},
which is available for some other model classes as well (which respond to
\code{simulate()}).

\strong{Note 1:} For \code{brmsfit}-models with zero-inflation component, there is no
\code{type = "zero_inflated"} nor \code{type = "zi_random"}; predicted values for these
models \emph{always} condition on the zero-inflation part of the model. The same
is true for \code{MixMod}-models from \strong{GLMMadaptive} with zero-inflation
component (see 'Details').

\strong{Note 2:} If \code{margin = "empirical"} (or when calling \code{ggaverage()}),
i.e. for counterfactual predictions, the \code{type} argument is handled differently.
It is set to \code{"response"} by default, but usually accepts all possible options
from the \code{type}-argument of the model's respective \code{predict()} method. E.g.,
passing a \code{glm} object would allow the options \code{"response"}, \code{"link"}, and
\code{"terms"}. For models with zero-inflation component, the below mentioned
options \code{"fixed"}, \code{"zero_inflated"} and \code{"zi_prob"} can also be used and will
be "translated" into the corresponding \code{type} option of the model's respective
\code{predict()}-method.
\itemize{
\item \code{"fixed"} (or \code{"fe"} or \code{"count"})

Predicted values are conditioned on the fixed effects or conditional
model only (for mixed models: predicted values are on the population-level
and \emph{confidence intervals} are returned, i.e. \code{re.form = NA} when calling
\code{predict()}). For instance, for models fitted with \code{zeroinfl} from \strong{pscl},
this would return the predicted mean from the count component (without
zero-inflation). For models with zero-inflation component, this type calls
\code{predict(..., type = "link")} (however, predicted values are
back-transformed to the response scale, i.e. the conditional mean of the
response).
\item \code{"random"} (or \code{"re"})

This only applies to mixed models, and \code{type = "random"} does not condition
on the zero-inflation component of the model. \code{type = "random"} still
returns population-level predictions, however, conditioned on random effects
and considering individual level predictions, i.e. \code{re.form = NULL} when
calling \code{predict()}. This may affect the returned predicted values, depending
on whether \code{REML = TRUE} or \code{REML = FALSE} was used for model fitting.
Furthermore, unlike \code{type = "fixed"}, intervals also consider the uncertainty
in the variance parameters (the mean random effect variance, see \emph{Johnson
et al. 2014} for details) and hence can be considered as \emph{prediction intervals}.
For models with zero-inflation component, this type calls
\code{predict(..., type = "link")} (however, predicted values are back-transformed
to the response scale).

To get predicted values for each level of the random effects groups, add the
name of the related random effect term to the \code{terms}-argument
(for more details, see
\href{https://strengejacke.github.io/ggeffects/articles/introduction_effectsatvalues.html}{this vignette}).
\item \code{"zero_inflated"} (or \code{"fe.zi"} or \code{"zi"})

Predicted values are conditioned on the fixed effects and the zero-inflation
component. For instance, for models fitted with \code{zeroinfl} from \strong{pscl},
this would return the predicted (or expected) response (\code{mu*(1-p)}),
and for \strong{glmmTMB}, this would return the expected response \code{mu*(1-p)}
\emph{without} conditioning on random effects (i.e. random effect variances
are not taken into account for the confidence intervals). For models with
zero-inflation component, this type calls \code{predict(..., type = "response")}.
See 'Details'.
\item \code{"zi_random"} (or \code{"re.zi"} or \code{"zero_inflated_random"})

Predicted values are conditioned on the zero-inflation component and
take the random effects uncertainty into account. For models fitted with
\code{glmmTMB()}, \code{hurdle()} or \code{zeroinfl()}, this would return the
expected value \code{mu*(1-p)}. For \strong{glmmTMB}, prediction intervals
also consider the uncertainty in the random effects variances. This
type calls \code{predict(..., type = "response")}. See 'Details'.
\item \code{"zi_prob"} (or \code{"zi.prob"})

Predicted zero-inflation probability. For \strong{glmmTMB} models with
zero-inflation component, this type calls \code{predict(..., type = "zlink")};
models from \strong{pscl} call \code{predict(..., type = "zero")} and for
\strong{GLMMadaptive}, \code{predict(..., type = "zero_part")} is called.
\item \code{"simulate"} (or \code{"sim"})

Predicted values and confidence or prediction intervals are
based on simulations, i.e. calls to \code{simulate()}. This type
of prediction takes all model uncertainty into account, including
random effects variances. Currently supported models are objects of
class \code{lm}, \code{glm}, \code{glmmTMB}, \code{wbm}, \code{MixMod} and \code{merMod}.
See \code{...} for details on number of simulations.
\item \code{"survival"} and \code{"cumulative_hazard"} (or \code{"surv"} and \code{"cumhaz"})

Applies only to \code{coxph}-objects from the \strong{survival}-package and
calculates the survival probability or the cumulative hazard of an event.
}

When \code{margin = "empirical"} (or when calling \code{ggaverage()}), the \code{type}
argument accepts all values from the \code{type}-argument of the model's respective
\code{predict()}-method.}

\item{typical}{Character vector, naming the function to be applied to the
covariates (non-focal terms) over which the effect is "averaged". The
default is \code{"mean"}. Can be \code{"mean"}, \code{"weighted.mean"}, \code{"median"}, \code{"mode"}
or \code{"zero"}, which call the corresponding R functions (except \code{"mode"},
which calls an internal function to compute the most common value); \code{"zero"}
simply returns 0. By default, if the covariate is a factor, only \code{"mode"} is
applicable; for all other values (including the default, \code{"mean"}) the
reference level is returned. For character vectors, only the mode is returned.
You can use a named vector to apply different functions to integer, numeric and
categorical covariates, e.g. \code{typical = c(numeric = "median", factor = "mode")}.
If \code{typical} is \code{"weighted.mean"}, weights from the model are used. If no
weights are available, the function falls back to \code{"mean"}. \strong{Note} that this
argument is ignored for \code{predict_response()}, because the \code{margin} argument
takes care of this.}

\item{condition}{Named character vector, which indicates covariates that
should be held constant at specific values. Unlike \code{typical}, which
applies a function to the covariates to determine the value that is used
to hold these covariates constant, \code{condition} can be used to define
exact values, for instance \code{condition = c(covariate1 = 20, covariate2 = 5)}.
See 'Examples'.}

\item{back_transform}{Logical, if \code{TRUE} (the default), predicted values
for log- or log-log transformed responses will be back-transformed to
the original response scale.}

\item{vcov_fun}{Variance-covariance matrix used to compute uncertainty
estimates (e.g., for confidence intervals based on robust standard errors).
This argument accepts a covariance matrix, a function which returns a
covariance matrix, or a string which identifies the function to be used to
compute the covariance matrix.
\itemize{
\item A (variance-covariance) matrix
\item A function which returns a covariance matrix (e.g., \code{stats::vcov()})
\item A string which indicates the estimation type for the heteroscedasticity-consistent
variance-covariance matrix, e.g. \code{vcov_fun = "HC0"}. Possible values are
\code{"HC0"}, \code{"HC1"}, \code{"HC2"}, \code{"HC3"}, \code{"HC4"}, \code{"HC4m"}, and \code{"HC5"}, which
will then call the \code{vcovHC()}-function from the \strong{sandwich} package, using
the specified type. Further possible values are \code{"CR0"}, \code{"CR1"}, \code{"CR1p"},
\code{"CR1S"}, \code{"CR2"}, and \code{"CR3"}, which will call the \code{vcovCR()}-function from
the \strong{clubSandwich} package.
\item A string which indicates the name of the \verb{vcov*()}-function from the
\strong{sandwich} or \strong{clubSandwich} packages, e.g. \code{vcov_fun = "vcovCL"},
which is used to compute (cluster) robust standard errors for predictions.
}
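
The options above might be used as in the following sketch. The model \code{m}
is an illustrative assumption (built-in \code{mtcars} data), and the
\strong{sandwich} package is assumed to be installed.

\preformatted{library(ggeffects)

# illustrative model (assumption): built-in mtcars data
m <- lm(mpg ~ hp + wt, data = mtcars)

# string naming the heteroscedasticity-consistent estimation type
ggpredict(m, terms = "hp", vcov_fun = "HC3")

# string naming a vcov*() function, estimation type set via vcov_type
ggpredict(m, terms = "hp", vcov_fun = "vcovHC", vcov_type = "HC1")

# function that returns a covariance matrix
ggpredict(m, terms = "hp", vcov_fun = sandwich::vcovHC)

# pre-computed variance-covariance matrix
ggpredict(m, terms = "hp", vcov_fun = sandwich::vcovHC(m, type = "HC0"))
}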

If \code{NULL}, standard errors (and confidence intervals) for predictions are
based on the standard errors as returned by the \code{predict()}-function.
\strong{Note} that probably not all model objects that work with \code{predict_response()}
are also supported by the \strong{sandwich} or \strong{clubSandwich} packages.

See details in \href{https://strengejacke.github.io/ggeffects/articles/practical_robustestimation.html}{this vignette}.}

\item{vcov_type}{Character vector, specifying the estimation type for the
robust covariance matrix estimation (see \code{?sandwich::vcovHC}
or \code{?clubSandwich::vcovCR} for details). Only used when \code{vcov_fun} is a
character string indicating one of the functions from those packages.
When \code{vcov_fun} is a function, a possible \code{type} argument \emph{must} be provided
via the \code{vcov_args} argument.}

\item{vcov_args}{List of named vectors, used as additional arguments that
are passed down to \code{vcov_fun}.}

\item{weights}{Character vector, naming the weighting variable in the data,
or a vector of weights (of same length as the number of observations in the
data). Only applies to \code{margin = "empirical"}.}

\item{verbose}{Toggle messages or warnings.}

\item{ci.lvl, vcov.fun, vcov.type, vcov.args, back.transform}{Deprecated arguments.
Please use \code{ci_level}, \code{vcov_fun}, \code{vcov_type}, \code{vcov_args} and \code{back_transform}
instead.}

\item{interval}{Type of interval calculation, can either be \code{"confidence"}
(default) or \code{"prediction"}. May be abbreviated. Unlike \emph{confidence intervals},
\emph{prediction intervals} include the residual variance (sigma^2) to account for
the uncertainty of predicted values. For mixed models, \code{interval = "prediction"}
is the default for \code{type = "random"}. When \code{type = "fixed"}, the default is
\code{interval = "confidence"}. Note that prediction intervals are not available
for all models, but only for models that work with \code{\link[insight:get_sigma]{insight::get_sigma()}}.
For Bayesian models, when \code{interval = "confidence"}, predictions are based on
posterior draws of the linear predictor \code{\link[rstantools:posterior_epred]{rstantools::posterior_epred()}}.
If \code{interval = "prediction"}, \code{\link[rstantools:posterior_predict]{rstantools::posterior_predict()}} is called.}

\item{ppd}{Logical, if \code{TRUE}, predictions for Stan-models are based on the
posterior predictive distribution \code{\link[rstantools:posterior_predict]{rstantools::posterior_predict()}}. If
\code{FALSE} (the default), predictions are based on posterior draws of the linear
predictor \code{\link[rstantools:posterior_epred]{rstantools::posterior_epred()}}. This is roughly comparable to
the distinction between \emph{confidence} and \emph{prediction} intervals. \code{ppd = TRUE}
incorporates the residual variance and hence returned intervals are similar to
prediction intervals. Consequently, if \code{interval = "prediction"}, \code{ppd} is
automatically set to \code{TRUE}. The \code{ppd} argument will be deprecated in a
future version. Please use \code{interval = "prediction"} instead.}
}
\value{
A data frame (with \code{ggeffects} class attribute) with consistent data columns:
\itemize{
\item \code{"x"}: the values of the first term in \code{terms}, used as x-position in plots.
\item \code{"predicted"}: the predicted values of the response, used as y-position in plots.
\item \code{"std.error"}: the standard error of the predictions. \emph{Note that the standard
errors are always on the link-scale, and not back-transformed for non-Gaussian
models!}
\item \code{"conf.low"}: the lower bound of the confidence interval for the predicted values.
\item \code{"conf.high"}: the upper bound of the confidence interval for the predicted values.
\item \code{"group"}: the grouping level from the second term in \code{terms}, used as
grouping-aesthetics in plots.
\item \code{"facet"}: the grouping level from the third term in \code{terms}, used to indicate
facets in plots.
}

The estimated marginal means (or predicted values) are always on the
response scale!

For proportional odds logistic regression (see \code{?MASS::polr})
or cumulative link models (e.g., see \code{?ordinal::clm}),
an additional column \code{"response.level"} is returned, which indicates
the grouping of predictions based on the level of the model's response.

Note that for convenience reasons, the columns for the intervals
are always named \code{"conf.low"} and \code{"conf.high"}, even though
for Bayesian models credible or highest posterior density intervals
are returned.

There is an \code{\link[=as.data.frame]{as.data.frame()}} method for objects of class \code{ggeffects},
which has a \code{terms_to_colnames} argument, to use the term names as column
names instead of the standardized names \code{"x"} etc.
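
A short sketch of the difference between the standardized column names and
\code{terms_to_colnames = TRUE}. The model \code{m} and the focal term values
are illustrative assumptions based on the built-in \code{mtcars} data.

\preformatted{library(ggeffects)

# illustrative model (assumption): built-in mtcars data
m <- lm(mpg ~ hp + wt, data = mtcars)
pr <- ggpredict(m, terms = c("hp [100, 150]", "wt [2, 3]"))

# standardized column names, e.g. "x" and "group" for the focal terms
colnames(as.data.frame(pr))

# focal terms keep their variable names ("hp", "wt") instead
colnames(as.data.frame(pr, terms_to_colnames = TRUE))
}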
}
\description{
After fitting a model, it is useful to generate model-based estimates (expected
values, or \emph{adjusted predictions}) of the response variable for different
combinations of predictor values. Such estimates can be used to make
inferences about relationships between variables.

The \strong{ggeffects} package computes marginal means and adjusted predicted
values for the response, at the margin of specific values or levels from
certain model terms. The package is built around three core functions:
\code{predict_response()} (understanding results), \code{test_predictions()} (testing
results for statistically significant differences) and \code{plot()} (communicating
results).

By default, adjusted predictions or marginal means are returned on the
\emph{response} scale, which is the easiest and most intuitive scale to interpret
the results. There are other options for specific models as well, e.g. with
zero-inflation component (see documentation of the \code{type}-argument). The
result is returned as a consistent data frame, which is nicely printed by
default. \code{plot()} can be used to easily create figures.

The main function to calculate marginal means and adjusted predictions is
\code{predict_response()}. In previous versions of \strong{ggeffects}, the functions
\code{ggpredict()}, \code{ggemmeans()}, \code{ggeffect()} and \code{ggaverage()} were used to
calculate marginal means and adjusted predictions. These functions are still
available, but \code{predict_response()} as a "wrapper" around these functions is
the preferred way to do this now.
}
\details{
Please see \code{?predict_response} for details and examples.
}
