lcca.datasim {lcca} | R Documentation |
The generic method lcca.datasim simulates a random dataset of a given
size from a latent-class causal model under user-supplied parameters.
It may be used in simulation studies to evaluate the properties of
inferential procedures over repeated samples.
lcca.datasim(obj, ...)

## Default S3 method:
lcca.datasim(obj, outcome.distribution = "NORMAL", iseeds = NULL,
    nlevs, ncases, x.alpha, x.beta, case.names = NULL,
    item.names = NULL, ...)

## S3 method for class 'lcca'
lcca.datasim(obj, outcome.distribution = "NORMAL", iseeds = NULL,
    nlevs = obj$nlevs, ncases = obj$ncases, x.alpha = obj$x.alpha,
    x.beta = obj$x.beta, case.names = obj$case.names,
    item.names = obj$item.names, ...)
obj |
object used to select a method. Either an object of
class |
outcome.distribution |
The distribution of outcome
variable, could be |
iseeds |
two integers to initialize the random number generator; see DETAILS. |
nlevs |
integer vector of length |
ncases |
number of cases to simulate (i.e., the sample size). |
x.alpha |
matrix of predictors (including a constant term, if present) for the logistic treatment model. |
x.beta |
matrix of predictors (including a constant term, if present) for the linear outcome model. |
case.names |
optional names to assign to the rows of
the resulting data matrix. Should be a character vector of length
|
item.names |
optional names to assign to the columns of the
resulting data matrix. Should be a character vector of length
|
... |
additional arguments to be passed to the methods. |
This generic method may be called in two ways. One way is to supply
the parameters of a latent-class causal model as the first argument.
The parameters are arranged as a list with four named components:
rho, alpha, beta and sigma2. The component rho should be an array of
dimension c(nitems,maxlevs,nclass), where nitems is the number of
items (those on the left-hand side of formula.treatment in a call to
lcca), maxlevs is the maximum number of levels (distinct response
categories) among the items, and nclass is the number of treatment
classes. The element rho[j,k,c] is the probability that an individual
in class c supplies a response of k to item j. The component alpha
should be a matrix of dimension c(ncovs.alpha,nclass), where
ncovs.alpha is the number of predictors in the logistic treatment
model (including a constant term for the intercept, if present). The
elements of alpha[,c] are the coefficients determining the log-odds of
membership in class c versus the reference class. If c is the
reference class, then all elements of alpha[,c] must be zero. The
component beta should be a matrix of dimension c(ncovs.beta,nclass),
where ncovs.beta is the number of predictors in the outcome model
(including a constant term for the intercept, if present). The
elements of beta[,c] are the coefficients for predicting the potential
outcomes for class c. The component sigma2 should be a numeric vector
of length nclass containing residual variances for the potential
outcomes.
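To make the roles of the four components concrete, the following base-R sketch draws data from a model of this form with a normal outcome. It is an illustration only and not the package's implementation: the function name simulate_lcca_sketch and the extra returned component class are hypothetical, every item is assumed to use all maxlevs levels (so nlevs is not needed), and the observed outcome is assumed to be the potential outcome for the class a case actually belongs to.

```r
# Hypothetical helper, NOT part of the lcca package: a base-R sketch of
# sampling from a latent-class causal model with a normal outcome.
simulate_lcca_sketch <- function(param, x.alpha, x.beta) {
  rho <- param$rho
  alpha <- param$alpha
  beta <- param$beta
  sigma2 <- param$sigma2
  nitems <- dim(rho)[1]
  maxlevs <- dim(rho)[2]
  nclass <- dim(rho)[3]
  n <- nrow(x.alpha)

  # Check that the dimensions agree with the description above.
  stopifnot(nrow(alpha) == ncol(x.alpha), ncol(alpha) == nclass,
            nrow(beta) == ncol(x.beta), ncol(beta) == nclass,
            length(sigma2) == nclass, nrow(x.beta) == n)

  # Multinomial-logit class probabilities. The reference class has an
  # all-zero column in alpha, so its linear predictor is zero.
  eta <- x.alpha %*% alpha
  prob <- exp(eta - apply(eta, 1, max))
  prob <- prob / rowSums(prob)

  # Draw one latent treatment class per case.
  cls <- apply(prob, 1, function(p) sample.int(nclass, 1, prob = p))

  # Draw item responses given class membership, using rho[j, , c].
  # Assumption: every item uses all maxlevs response levels.
  u <- matrix(NA_integer_, n, nitems)
  for (j in seq_len(nitems)) {
    for (i in seq_len(n)) {
      u[i, j] <- sample.int(maxlevs, 1, prob = rho[j, , cls[i]])
    }
  }

  # Assumption: the observed outcome is the potential outcome for the
  # case's own class, with mean x.beta %*% beta[, c] and variance sigma2[c].
  mu <- rowSums(x.beta * t(beta[, cls, drop = FALSE]))
  y.obs <- rnorm(n, mean = mu, sd = sqrt(sigma2[cls]))

  list(u = u, y.obs = y.obs, class = cls)
}

# Example parameters: 4 binary items, 2 classes, class 1 as reference.
rho <- array(NA_real_, c(4, 2, 2))
rho[, 1, 1] <- c(.9, .9, .1, .1)
rho[, 2, 1] <- 1 - rho[, 1, 1]
rho[, 1, 2] <- c(.1, .1, .9, .9)
rho[, 2, 2] <- 1 - rho[, 1, 2]
param <- list(rho = rho,
              alpha = cbind(0, c(0, 1, 1)),
              beta = cbind(c(1, 2, 3), c(2, 2, 3)),
              sigma2 = c(1, 1.5))

set.seed(102)
x <- cbind(1, X1 = rnorm(200), X2 = rnorm(200))
sim <- simulate_lcca_sketch(param, x.alpha = x, x.beta = x)
```

The sketch uses R's own random number generator, so it will not reproduce the draws made by lcca.datasim even with identical parameters; it is meant only to show how rho, alpha, beta and sigma2 enter the model.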
The second way to call this method is to supply as the first argument
an object of class "lcca", which is the result of a call to lcca. In
this case, a dataset will be generated with the same dimensions as the
data in the original lcca call, with the same number of classes as in
the original model, the same covariates, and parameters equal to the
final estimates from that model.
This function uses its own internal random number generator, which is
seeded by two integers supplied through the iseeds argument, for
example iseeds=c(123,456). Supplying the same pair of integers allows
results to be reproduced in the future. If iseeds=NULL, the function
seeds itself with two random integers drawn from R. Results can
therefore also be made reproducible by calling set.seed beforehand and
taking iseeds=NULL.
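The second route to reproducibility works because any integers drawn from R's generator are themselves fixed once set.seed has been called. The base-R sketch below illustrates that mechanism only. The helper draw_two_seeds is hypothetical, and the way the package actually draws its two integers is an assumption here, not something taken from its source.

```r
# Hypothetical stand-in for "seed itself with two random integers from R".
# Assumption: the package's actual internal draw may differ from this.
draw_two_seeds <- function() sample.int(.Machine$integer.max, 2)

set.seed(2024)
first <- draw_two_seeds()

set.seed(2024)
second <- draw_two_seeds()

# The same R seed yields the same pair of integers, and therefore the
# same starting state for an internal generator seeded from them.
same_seeds <- identical(first, second)
```

Note that any other call that consumes random numbers between set.seed and the simulation will change the integers drawn, so set.seed should be called immediately beforehand.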
a list with two components:
u |
a matrix of integers of dimension
|
y.obs |
a numeric vector of length |
Joe Schafer
Send questions to mchelpdesk@psu.edu
# Set up rho-parameters for a two-class model with 4 binary
# items and strong measurement. Members of the first class
# have a high probability of endorsing (providing a response
# of 1 to) items 1 and 2. Members of the second class have
# a high probability of endorsing items 3 and 4.
rho <- array(NA, c(4,2,2) )
rho[,1,1] <- c(.9,.9,.1,.1)
rho[,2,1] <- 1 - rho[,1,1]
rho[,1,2] <- c(.1,.1,.9,.9)
rho[,2,2] <- 1 - rho[,1,2]

# create matrix of predictors for the treatment model
# consisting of a constant
# and two normally distributed covariates
N <- 1000
set.seed(102)
X1 <- rnorm(N)
X2 <- rnorm(N)
x.alpha <- cbind(1, X1, X2)

# use the same predictors in the outcome model
x.beta <- x.alpha

# Set up the logistic coefficients, with class 1 as the
# reference class
alpha <- matrix(NA, 3, 2)
alpha[,1] <- 0          # reference class
alpha[,2] <- c(0,1,1)

# Set up the linear coefficients for the outcome model
# Note: average treatment effect in this population is 1
beta <- matrix(NA, 3, 2)
beta[,1] <- c(1,2,3)
beta[,2] <- c(2,2,3)

# set up residual variances
sigma2 <- c(1,1.5)

# generate a sample of N=1000 observations, and
# fit the two-class model to the simulated data
param <- list(rho=rho, alpha=alpha, beta=beta, sigma2=sigma2)
tmp <- lcca.datasim( param, outcome.distribution="NORMAL",
   nlevs=c(2,2,2,2,2,2), ncases=N, x.alpha=x.alpha, x.beta=x.beta)
U <- tmp$u
Y <- tmp$y.obs
fit <- lcca( U ~ X1 + X2, Y ~ X1 + X2 )
summary(fit)