mdMARsimulate generates missing values under a MAR logistic mechanism.
This function introduces missing values in a data matrix using a Missing At Random (MAR) mechanism. The probability that an entry is missing is modeled through a logistic regression whose covariates are observed columns of X.
The logistic link is evaluated using the Statistics and Machine Learning Toolbox probability distribution object created by pdLogistic = makedist('Logistic','mu',0,'sigma',1);
Therefore, if $\eta_{ij} = \alpha_j + x_i' \beta_j$, the missingness probability is computed as $\Pr(M_{ij}=1 \mid x_i)$ = cdf(pdLogistic,eta_ij), which is equal to
\[ \frac{1}{1+\exp(-\eta_{ij})}. \]
If obsCols=1 and missCols=2:p, the mechanism is
\[ \Pr(M_{ij}=1 \mid X_{i1}) = \mathrm{cdf}(\mathrm{pdLogistic},\, \alpha_j + \beta_j X_{i1}), \qquad j = 2, \ldots, p, \]
where $M_{ij}=1$ denotes a missing entry. The intercept $\alpha_j$ is chosen so that the expected missingness proportion in the corresponding missing column is equal to missRate.
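The mechanism described above can be sketched in a few lines of MATLAB. This is only an illustration under stated assumptions, not the source of mdMARsimulate: the data X, the slopes in beta, and the use of fzero to calibrate each intercept are hypothetical choices made for the example. It assumes the Statistics and Machine Learning Toolbox is available.

```matlab
% Illustrative MAR logistic mechanism (not the mdMARsimulate implementation).
rng(1);                                   % reproducible random draws
n = 500; p = 3;
X = randn(n, p);                          % hypothetical complete data matrix
missRate = 0.2;                           % target expected missingness per column
beta = [1 -0.5];                          % hypothetical slopes for columns 2 and 3

pdLogistic = makedist('Logistic', 'mu', 0, 'sigma', 1);

% The standard logistic cdf coincides with 1/(1+exp(-eta)).
eta = linspace(-4, 4, 9);
maxGap = max(abs(cdf(pdLogistic, eta) - 1 ./ (1 + exp(-eta))));

Xmar = X;
alpha = zeros(1, p - 1);
avgProb = zeros(1, p - 1);
for j = 2:p
    b = beta(j - 1);
    % Assumed calibration: solve mean_i Pr(M_ij = 1 | X_i1) = missRate for alpha_j.
    gap = @(a) mean(cdf(pdLogistic, a + b * X(:, 1))) - missRate;
    alpha(j - 1) = fzero(gap, [-20 20]);
    prob = cdf(pdLogistic, alpha(j - 1) + b * X(:, 1));
    avgProb(j - 1) = mean(prob);          % equals missRate up to solver tolerance
    Xmar(rand(n, 1) < prob, j) = NaN;     % M_ij = 1 marks a missing entry
end
```

Column 1 drives the missingness and stays fully observed, which is what makes the mechanism MAR rather than MNAR. The realized proportion of NaN entries in each column matches missRate only in expectation, so it will fluctuate around the target in any single draw.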
Different logistic slopes for different missing columns.
Ymar = mdMARsimulate(Y, Name, Value)
Little, R.J.A. and Rubin, D.B. (2019), "Statistical Analysis with Missing Data", 3rd edition, Wiley.
Rubin, D.B. (1976), Inference and missing data, "Biometrika", Vol. 63, pp. 581-592.
mdpattern | mdPartialMD | mdEM | mdTEM | mdLittleTest