lca.datasim {lcca} | R Documentation |
The generic method lca.datasim simulates a random dataset of a given
size from a latent-class model under user-supplied parameters. It may
be used in simulations to evaluate the properties of inferential
procedures over repeated samples.
lca.datasim(obj, ...)

## Default S3 method:
lca.datasim(obj, iseeds = NULL, nlevs, ncases, groups = NULL,
    case.names = NULL, item.names = NULL, ...)

## S3 method for class 'lca'
lca.datasim(obj, iseeds = NULL, nlevs = obj$nlevs, ncases = obj$ncases,
    groups = obj$ngroups, case.names = obj$case.names,
    item.names = obj$item.names, ...)
obj |
object used to select a method. Either an object of
class |
iseeds |
two integers to initialize the random number generator; see DETAILS. |
nlevs |
integer vector of length |
ncases |
number of cases to simulate (i.e., the sample size). |
groups |
optional grouping variable; see DETAILS. |
case.names |
optional names to assign to the rows of
the resulting data matrix. Should be a character vector of length
|
item.names |
optional names to assign to the columns of the
resulting data matrix. Should be a character vector of length
|
... |
additional arguments to be passed to the methods. |
This generic method may be called in two ways. One way is to supply
the parameters of a latent-class model as the first argument. The
parameters are arranged as a list with two named components, rho and
gamma, containing item-response probabilities and class prevalences,
respectively. The component rho should be an array of dimension
c(nitems,maxlevs,nclass,ngroups), where nitems is the number of
response variables, maxlevs is the maximum number of levels (distinct
response categories) among the response variables, nclass is the number
of latent classes, and ngroups is the number of groups (equal to 1 if
groups is not supplied). The element rho[j,k,c,g] is the probability
that an individual in group g and class c supplies a response of k to
item j. The component gamma should be a matrix with nclass rows and
ngroups columns, with gamma[c,g] containing the prevalence of class c
within group g. If ngroups=1, the last dimension of the parameter
arrays may be dropped, so that rho may have dimension
c(nitems,maxlevs,nclass) and gamma may be a vector of length nclass.
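The layout of the parameter list for more than one group can be sketched in base R as follows. The sizes and probabilities are hypothetical values chosen only for illustration, and every item is given the same number of levels because the text above does not say how unused cells should be filled when an item has fewer than maxlevs levels. The checks simply confirm that each set of probabilities sums to one.

```r
# Hypothetical sizes, chosen only for illustration.
nitems <- 3; maxlevs <- 2; nclass <- 2; ngroups <- 2

# rho[j, k, c, g]: probability that a member of class c in group g
# gives response k to item j.
rho <- array(NA_real_, dim = c(nitems, maxlevs, nclass, ngroups))
rho[, 1, 1, ] <- 0.8               # class 1: response 1 is likely
rho[, 1, 2, ] <- 0.3               # class 2: response 1 is less likely
rho[, 2, , ]  <- 1 - rho[, 1, , ]  # binary items: response 2 takes the rest

# gamma[c, g]: prevalence of class c within group g (nclass x ngroups).
gamma <- cbind(c(0.4, 0.6), c(0.7, 0.3))

# Sanity checks: probabilities sum to one over response levels and classes.
rho_sums <- apply(rho, c(1, 3, 4), sum)
stopifnot(all(abs(rho_sums - 1) < 1e-12))
stopifnot(all(abs(colSums(gamma) - 1) < 1e-12))

param <- list(rho = rho, gamma = gamma)
```

A list built this way has the two named components described above and could serve as the first argument in the first calling style.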
The second way to call this method is to supply as the first argument
an object of class "lca", which is the result of a call to lca. In this
case, a data matrix will be generated with the same dimensions as the
response data (the matrix on the left-hand side of the model formula)
in the original lca call, with the same number of classes as in the
original model, and parameters equal to the final estimates from that
model.
This function uses its own internal random number generator, which is
seeded by the two integers supplied as iseeds, for example
iseeds=c(123,456). Supplying the same pair of integers allows results
to be reproduced in the future. If iseeds=NULL, the function seeds
itself with two random integers drawn from R. Results can therefore
also be made reproducible by calling set.seed beforehand and leaving
iseeds=NULL.
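The second route to reproducibility relies on the fact that anything drawn from R's generator repeats once set.seed is called with the same value. The sketch below illustrates that idea only. The helper draw_two_seeds is hypothetical and is not part of lcca, and how the package actually draws its two integers is not documented here.

```r
# Hypothetical helper, NOT part of lcca: draws two integers from R's
# own random number generator, as a stand-in for self-seeding.
draw_two_seeds <- function() {
  sample.int(.Machine$integer.max, size = 2)
}

set.seed(124)
first <- draw_two_seeds()

set.seed(124)
second <- draw_two_seeds()

# The same R seed yields the same pair of integers, so anything
# seeded from that pair would repeat as well.
stopifnot(identical(first, second))
```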
The groups variable, if present, should be integers coded as
1,2,...,ngroups, where ngroups is the number of distinct groups. The
groups variable may also be a factor. If groups=NULL, then ngroups is
taken to be 1.
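When group membership is recorded as labels, the integer coding described above can be obtained in base R as sketched below. The labels are hypothetical. The sketch only shows the coding itself and makes no claim about how the function treats a factor internally.

```r
# Hypothetical group labels, for illustration only.
site <- c("north", "south", "north", "east", "south")

groups_factor <- factor(site)             # levels are sorted: east, north, south
groups_int <- as.integer(groups_factor)   # integer codes 1, 2, ..., ngroups
ngroups <- nlevels(groups_factor)         # number of distinct groups

# Every code lies in 1..ngroups, and every group is represented.
stopifnot(all(groups_int %in% seq_len(ngroups)))
stopifnot(length(unique(groups_int)) == ngroups)
```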
A matrix of integers of dimension c(ncases,nitems) containing a random
sample of responses from the specified latent-class model.
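To make the shape and content of the returned matrix concrete, here is a minimal base-R sketch of sampling from a single-group latent-class model. The function simulate_lc is hypothetical: it uses R's own random number generator and is not the implementation inside lcca. It draws a class for each case and then draws each item response given that class.

```r
# Hypothetical base-R sketch, NOT the lcca implementation.
# rho:   array c(nitems, maxlevs, nclass) of item-response probabilities
# gamma: vector of length nclass of class prevalences
simulate_lc <- function(rho, gamma, ncases) {
  nitems  <- dim(rho)[1]
  maxlevs <- dim(rho)[2]
  nclass  <- dim(rho)[3]
  # Step 1: draw each case's latent class with probabilities gamma.
  cls <- sample.int(nclass, size = ncases, replace = TRUE, prob = gamma)
  # Step 2: given the class, draw each item response from rho[j, , class].
  y <- matrix(NA_integer_, nrow = ncases, ncol = nitems)
  for (i in seq_len(ncases)) {
    for (j in seq_len(nitems)) {
      y[i, j] <- sample.int(maxlevs, size = 1, prob = rho[j, , cls[i]])
    }
  }
  y
}

# Two classes, four binary items, same values as in the Examples section.
rho <- array(NA_real_, dim = c(4, 2, 2))
rho[, 1, 1] <- c(.9, .9, .1, .1)
rho[, 2, 1] <- 1 - rho[, 1, 1]
rho[, 1, 2] <- c(.1, .1, .9, .9)
rho[, 2, 2] <- 1 - rho[, 1, 2]

set.seed(124)
Y <- simulate_lc(rho, gamma = c(.4, .6), ncases = 200)
dim(Y)   # rows are cases, columns are items
```

Each entry of Y is a response level (here 1 or 2), which matches the integer matrix described above.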
Joe Schafer
Send questions to mchelpdesk@psu.edu
# Set up parameters for a two-class model with four binary
# items and strong measurement. Members of the first class, which
# comprises 40% of the population, have a high probability of
# endorsing (providing a response of 1 to) items 1 and 2.
# Members of the second class, which comprises 60% of the
# population, have a high probability of endorsing items 3 and 4.
rho <- array(NA, c(4,2,2))
rho[,1,1] <- c(.9,.9,.1,.1)
rho[,2,1] <- 1 - rho[,1,1]
rho[,1,2] <- c(.1,.1,.9,.9)
rho[,2,2] <- 1 - rho[,1,2]
param <- list(rho=rho, gamma=c(.4,.6))

# generate a sample of N=1000 observations, and
# fit the two-class model to the simulated data
set.seed(124)
Y <- lca.datasim(param, nlevs=c(2,2,2,2), ncases=1000)
fit <- lca(Y ~ 1)
summary(fit)