lcacov.datasim {lcca}    R Documentation

Simulate random data from a latent-class model with covariates

Description

The generic method lcacov.datasim simulates a random dataset of a given size from a latent-class model with covariates under user-supplied parameters. It may be used in simulations to evaluate the properties of inferential procedures over repeated samples.

Usage

lcacov.datasim(obj, ...)

## Default S3 method:
lcacov.datasim(obj, iseeds = NULL, nlevs, ncases,
   x, groups = NULL, case.names = NULL, item.names = NULL, ...)

## S3 method for class 'lcacov'
lcacov.datasim(obj, iseeds = NULL, 
   nlevs = obj$nlevs, ncases = obj$ncases, x = obj$x,
   groups = obj$groups, case.names = obj$case.names,
   item.names = obj$item.names, ...)

Arguments

obj

object used to select a method. Either an object of class "lcacov" produced by the function lcacov, or a list containing parameters from a latent-class model with covariates; see DETAILS.

iseeds

two integers to initialize the random number generator; see DETAILS.

nlevs

integer vector of length nitems, where nitems is the number of response variables, indicating the number of levels or response categories for each variable.

ncases

number of cases to simulate (i.e., the sample size).

x

matrix of predictors (including a constant term, if present) for the logistic model.

groups

optional grouping variable; see DETAILS.

case.names

optional names to assign to the rows of the resulting data matrix. Should be a character vector of length ncases.

item.names

optional names to assign to the columns of the resulting data matrix. Should be a character vector of length nitems.

...

additional arguments to be passed to the methods.

Details

This generic method may be called in two ways. The first is to supply the parameters of a latent-class model with covariates as the first argument. The parameters are arranged as a list with two named components, rho and alpha, containing item-response probabilities and logistic coefficients, respectively.

The component rho should be an array of dimension c(nitems,maxlevs,nclass,ngroups), where nitems is the number of response variables, maxlevs is the maximum number of levels (distinct response categories) among the response variables, nclass is the number of latent classes, and ngroups is the number of groups (equal to 1 if groups is not supplied). The element rho[j,k,c,g] is the probability that an individual in group g and class c supplies a response of k to item j.

The component alpha should be an array of dimension c(ncovs,nclass,ngroups), where ncovs is the number of predictors in the logistic model (including a constant term for the intercept, if present). The elements of alpha[,c,g] are the coefficients determining the log-odds of membership in class c, versus the reference class, for individuals in group g. If c is the reference class, then all elements of alpha[,c,g] must be zero.

If ngroups=1, the last dimension of the parameter arrays may be dropped, so that rho may have dimension c(nitems,maxlevs,nclass) and alpha may have dimension c(ncovs,nclass).
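As an illustration of the array layout described above, the following base-R sketch builds rho and alpha for a hypothetical two-group, two-class model with four binary items and three predictors, then checks the constraints stated in the text. The sizes and parameter values are assumptions chosen for the example and are not defaults of the package.

```r
# Hypothetical sizes, chosen only for illustration
nitems <- 4; maxlevs <- 2; nclass <- 2; ngroups <- 2; ncovs <- 3

# rho[j,k,c,g]: probability of response k to item j in class c, group g
rho <- array(NA_real_, c(nitems, maxlevs, nclass, ngroups))
for (g in seq_len(ngroups)) {
  rho[, 1, 1, g] <- c(.9, .9, .1, .1)   # class 1 endorses items 1 and 2
  rho[, 1, 2, g] <- c(.1, .1, .9, .9)   # class 2 endorses items 3 and 4
  rho[, 2, , g]  <- 1 - rho[, 1, , g]   # second level is the complement
}

# alpha[,c,g]: logistic coefficients for class c in group g;
# class 1 is the reference class, so its coefficients stay at zero
alpha <- array(0, c(ncovs, nclass, ngroups))
alpha[, 2, 1] <- c(0, 1, 1)
alpha[, 2, 2] <- c(0.5, 1, 1)

# Check: probabilities sum to 1 over levels for every item, class and group
level.sums <- apply(rho, c(1, 3, 4), sum)
stopifnot(all(abs(level.sums - 1) < 1e-12))

# Check: the reference class has all-zero coefficients in every group
stopifnot(all(alpha[, 1, ] == 0))

param <- list(rho = rho, alpha = alpha)
```

The resulting list param has the two named components expected for the first argument; with a single group, the same construction applies with the last dimension omitted.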

The second way to call this method is to supply as the first argument an object of class "lcacov", which is the result of a call to lcacov. In this case, a data matrix will be generated with the same dimensions as the response data (the matrix on the left-hand side of the model formula) in the original lcacov call, with the same number of classes as in the original model, the same covariates, and parameters equal to the final estimates from that model.

This function uses its own internal random number generator, which is seeded by two integers, for example, iseeds=c(123,456); supplying the same pair allows results to be reproduced in the future. If iseeds=NULL, the function seeds itself with two random integers drawn from R. Results can therefore also be made reproducible by calling set.seed beforehand and leaving iseeds=NULL.
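The reason set.seed also gives reproducible results can be seen in a small base-R sketch. The helper draw.iseeds below is hypothetical: it only mimics the idea of drawing two random integers from R's generator, and the way the package actually draws them is an assumption here.

```r
# Hypothetical stand-in for "two random integers from R"
draw.iseeds <- function() sample.int(.Machine$integer.max, 2)

# The same R seed leads to the same pair of integers ...
set.seed(2024); s1 <- draw.iseeds()
set.seed(2024); s2 <- draw.iseeds()
identical(s1, s2)

# ... so either of these patterns should give repeatable simulations
# (calls shown as comments because they require the lcca package):
#   Y <- lcacov.datasim(param, iseeds = c(123, 456), nlevs = nlevs, ncases = N, x = x)
#   set.seed(2024); Y <- lcacov.datasim(param, nlevs = nlevs, ncases = N, x = x)
```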

The groups variable, if present, should be either an integer vector coded as 1,2,...,ngroups, where ngroups is the number of distinct groups, or a factor. If groups=NULL, then ngroups is taken to be 1.
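A grouping variable stored as character labels can be brought into either accepted form with base R. The sketch below uses a made-up variable site; the labels are assumptions for illustration only.

```r
# Hypothetical group labels for five cases
site <- c("north", "south", "north", "east", "south")

# Either pass the factor itself ...
groups.factor <- factor(site)

# ... or the integer codes 1,2,...,ngroups derived from it
# (levels sort alphabetically: east = 1, north = 2, south = 3)
groups.int <- as.integer(groups.factor)
ngroups <- nlevels(groups.factor)

# Every code lies in 1..ngroups
stopifnot(all(groups.int %in% seq_len(ngroups)))
```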

Value

a matrix of integers of dimension c(ncases,nitems) containing a random sample of responses from the specified latent-class model with covariates.
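To make the shape and content of the returned matrix concrete, here is a minimal base-R sketch of the data-generating process for a single group: class membership follows a baseline-category logistic model in x, and each response is then drawn from the rho probabilities of the sampled class. This is not the package's implementation (which uses its own random number generator); all names and values are assumptions for illustration.

```r
set.seed(1)
N <- 5                                   # ncases
x <- cbind(1, rnorm(N), rnorm(N))        # ncases x ncovs, with a constant
alpha <- cbind(0, c(0, 1, 1))            # ncovs x nclass, class 1 = reference

rho <- array(NA_real_, c(4, 2, 2))       # nitems x maxlevs x nclass
rho[, 1, 1] <- c(.9, .9, .1, .1)
rho[, 1, 2] <- c(.1, .1, .9, .9)
rho[, 2, ]  <- 1 - rho[, 1, ]

# Class-membership probabilities implied by the logistic coefficients
eta   <- x %*% alpha                     # ncases x nclass linear predictors
gamma <- exp(eta) / rowSums(exp(eta))

# Draw a latent class for each case, then one response per item
cls <- apply(gamma, 1, function(p) sample.int(length(p), 1, prob = p))
Y <- t(sapply(cls, function(k) {
  sapply(seq_len(dim(rho)[1]),
         function(j) sample.int(dim(rho)[2], 1, prob = rho[j, , k]))
}))
storage.mode(Y) <- "integer"

dim(Y)   # ncases rows, nitems columns
```

Each entry of Y is a response category between 1 and the number of levels of the corresponding item, matching the layout described for the return value.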

Author(s)

Joe Schafer

Send questions to mchelpdesk@psu.edu

See Also

lcacov

Examples

# Set up rho-parameters for a two-class model with 4 binary
# items and strong measurement.  Members of the first class
# have a high probability of endorsing (providing a response
# of 1 to) items 1 and 2. Members of the second class have
# a high probability of endorsing items 3 and 4.
rho <- array(NA, c(4,2,2) )
rho[,1,1] <- c(.9,.9,.1,.1)
rho[,2,1] <- 1 - rho[,1,1]
rho[,1,2] <- c(.1,.1,.9,.9)
rho[,2,2] <- 1 - rho[,1,2]

# create matrix of predictors consisting of a constant
# and two normally distributed covariates
N <- 1000
set.seed(102)
X1 <- rnorm(N)
X2 <- rnorm(N)
x <- cbind(1, X1, X2)

# Set up the logistic coefficients, with class 1 as the
# reference class
alpha <- matrix(NA, 3, 2)
alpha[,1] <- 0  # reference class
alpha[,2] <- c(0,1,1)

# generate a sample of N=1000 observations, and
# fit the two-class model to the simulated data
param <- list(rho=rho, alpha=alpha)
Y <- lcacov.datasim( param, nlevs=c(2,2,2,2), ncases=N, x=x)
fit <- lcacov( Y ~ X1 + X2 )
summary(fit)

[Package lcca version 2.0.0 Index]