Title: | Sample Size Calculation in Hierarchical 2x2 Factorial Trials |
---|---|
Description: | Implements the sample size methods for hierarchical 2x2 factorial trials under two choices of effect estimands and a series of hypothesis tests proposed in "Sample size calculation in hierarchical 2x2 factorial trials with unequal cluster sizes" (under review), and provides the table and plot generators for the sample size estimations. |
Authors: | Zizhong Tian [aut, cre], Denise Esserman [aut], Guangyu Tong [aut], Fan Li [aut] |
Maintainer: | Zizhong Tian <[email protected]> |
License: | LGPL (>= 2.1) |
Version: | 2.0.0 |
Built: | 2025-02-04 02:53:15 UTC |
Source: | https://github.com/billytian/h2x2factorial |
The function calc.H2x2Factorial
estimates the required number of clusters or the achieved power level under different types of
hypothesis tests of either the controlled (main) effect (by default) or the natural (marginal) effect of the two treatments in a hierarchical 2x2 factorial trial
with unequal cluster sizes and a continuous outcome. Users can choose between two types of treatment effect estimands and five types of hypothesis tests, each with a corresponding
finite-sample correction. Users may supply an optional number of clusters through the n_input
argument. When this
number is provided, the function calculates the power under the chosen hypothesis test, with a finite-sample correction if specified, and
ignores any input for the power parameter. When the number of clusters is not provided, the function calculates the required number of
clusters based on a given power threshold, which is set to 0.8 by default.
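The two calculation modes described above can be sketched in base R. This is only an illustration with a plain two-sided Wald z-test: the helper `clusters_or_power` and its argument `var_unit` (an assumed value of the variance of the effect estimator multiplied by the number of clusters) are hypothetical and are not part of the package.

```r
# Hypothetical helper (not part of the package): mirrors the two calculation
# modes with a plain two-sided Wald z-test.
# var_unit is an assumed value of n * Var(estimator).
clusters_or_power <- function(delta, var_unit, n_input = NULL,
                              power = 0.8, alpha = 0.05) {
  z_alpha <- qnorm(1 - alpha / 2)
  if (is.null(n_input)) {
    # No cluster number supplied: smallest n that reaches the target power
    n <- (z_alpha + qnorm(power))^2 * var_unit / delta^2
    return(ceiling(n))
  }
  # Cluster number supplied: `power` is ignored, the achieved power is returned
  ncp <- abs(delta) * sqrt(n_input / var_unit)
  round(pnorm(ncp - z_alpha) + pnorm(-ncp - z_alpha), 4)
}

# Mode 1: required number of clusters for 80% power
n_required <- clusters_or_power(delta = 0.25, var_unit = 0.5)
# Mode 2: power achieved with that number of clusters
power_achieved <- clusters_or_power(delta = 0.25, var_unit = 0.5,
                                    n_input = n_required)
```

Because the required number of clusters is rounded up, the achieved power in the second call is at least the requested threshold.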
calc.H2x2Factorial(power=0.8, n_input=NULL, alpha=0.05, pi_x=0.5, pi_z=0.5, delta_x=0.25, delta_z=0.33, delta_xz=0.3, sigma2_y=1, m_bar=50, CV=0, rho=0, estimand="controlled", test="cluster", correction=FALSE, max_n=1e8, seed_mix=NULL, size_mix=1e4, verbose=TRUE)
power |
a numeric value between 0 and 1 as the desired power level for sample size estimation. Default is 0.8. |
n_input |
the number of clusters provided by the user, for which the achievable power is estimated. Default is NULL. |
alpha |
a numeric value between 0 and 1 as the type I error rate. Default is 0.05. |
pi_x |
a numeric value between 0 and 1 as the proportion of clusters randomized to the cluster-level treatment. Default is 0.5. |
pi_z |
a numeric value between 0 and 1 as the proportion of individuals randomized to the individual-level treatment within each cluster. Default is 0.5. |
delta_x |
a nonzero numeric value for the (unstandardized) effect size of the marginal cluster-level treatment effect. Default is 0.25. |
delta_z |
a nonzero numeric value for the (unstandardized) effect size of the marginal individual-level treatment effect. Default is 0.33. |
delta_xz |
a nonzero numeric value for the (unstandardized) effect size of the interaction effect of the two treatments. Default is 0.3. |
sigma2_y |
a positive numeric value for the total variance of the continuous outcome. Default is 1. |
m_bar |
a numeric value larger than 2 for the mean cluster size. Default is 50. |
CV |
a non-negative numeric value as the coefficient of variation of the cluster sizes. Default is 0. |
rho |
a numeric value between 0 and 1 as the intraclass correlation coefficient characterizing the between-cluster variability. Default is 0. |
estimand |
a character argument indicating the type of treatment effect estimand. Supported values include |
test |
a character argument indicating the type of hypothesis test of interest. Supported values include
|
correction |
a logical argument indicating whether a finite-sample correction should be used. Default is FALSE. |
max_n |
an optional setting of a maximum number of clusters, which is only functional under |
seed_mix |
an optional setting of a seed for conducting the simulation-based testing under a mixed distribution, which is only functional under |
size_mix |
a pre-specified size for the mixed distribution in the simulation-based procedure, which is only needed under |
verbose |
a logical argument indicating whether the parameter reiterations and supplementary messages should be presented or suppressed. Default is TRUE. |
Given the input parameters, the method first computes the variances of the effects of interest based on generalized least squares estimators and large-sample approximations.
The variances are then used to build either the classic sample size formulas (for the separate tests for controlled or natural treatment effects and the interaction test) or
the power formulas (for the simultaneous tests and the corrected tests), which deliver both the sample size and power calculations.
Without finite-sample considerations, the separate tests of the two controlled effects and the two natural effects as well as the interaction test use the two-sided Wald z-test,
the joint test uses the chi-square test, and the intersection-union (I-U) test also uses a two-sided z-based test.
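As a rough illustration of how the three uncorrected test families turn variances into power, the base-R sketch below may help. It is not the package's implementation: the function names, the per-cluster variances `var_x` and `var_z`, and the assumption that the two effect estimators are independent are all made up for this example.

```r
# Illustrative only. var_x and var_z are assumed values of n * Var(estimator),
# and the two estimators are assumed to be independent.
z_power <- function(delta, var_unit, n, alpha = 0.05) {
  ncp <- abs(delta) * sqrt(n / var_unit)
  z <- qnorm(1 - alpha / 2)
  pnorm(ncp - z) + pnorm(-ncp - z)            # two-sided Wald z-test
}
joint_power <- function(delta_x, delta_z, var_x, var_z, n, alpha = 0.05) {
  ncp <- n * (delta_x^2 / var_x + delta_z^2 / var_z)
  # chi-square test with 2 degrees of freedom
  pchisq(qchisq(1 - alpha, df = 2), df = 2, ncp = ncp, lower.tail = FALSE)
}
iu_power <- function(delta_x, delta_z, var_x, var_z, n, alpha = 0.05) {
  # I-U test: reject only when both separate z-tests reject
  z_power(delta_x, var_x, n, alpha) * z_power(delta_z, var_z, n, alpha)
}

p_sep   <- z_power(0.25, var_unit = 0.5, n = 60)
p_joint <- joint_power(0.25, 0.33, var_x = 0.5, var_z = 0.4, n = 60)
p_iu    <- iu_power(0.25, 0.33, var_x = 0.5, var_z = 0.4, n = 60)
```

Under these assumptions the I-U power can never exceed the power of either separate test, since it is the product of the two.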
With correction=TRUE, finite-sample corrections are customized for the three types of tests involving either the controlled effect or the natural effect of the cluster-level
treatment. For the tests for the controlled effect and the natural effect of the cluster-level treatment, a two-sided t-test is used.
For the joint test of the two controlled effects, an F-test is used as a naive correction, which might lead to slight overpower.
For the joint test of the two natural effects, a simulation-based mixed F-chi-square test is used.
For the I-U test of the two controlled effects, a two-sided t-based test is used as a naive correction, which might lead to slight overpower.
For the I-U test of the two natural effects, a two-sided mixed t- and z-based test is used.
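To give a feel for what a t-based or F-based correction does with few clusters, the sketch below compares z-based and t-based power. The degrees of freedom (`n - 2`) are an assumption chosen for this illustration, since the text does not state them, and the function names and variance inputs are hypothetical.

```r
# Illustrative only: df = n - 2 is an assumption made for this sketch.
# var_unit, var_x and var_z are assumed values of n * Var(estimator).
z_power <- function(delta, var_unit, n, alpha = 0.05) {
  ncp <- abs(delta) * sqrt(n / var_unit)
  z <- qnorm(1 - alpha / 2)
  pnorm(ncp - z) + pnorm(-ncp - z)
}
t_power <- function(delta, var_unit, n, alpha = 0.05, df = n - 2) {
  ncp <- abs(delta) * sqrt(n / var_unit)
  crit <- qt(1 - alpha / 2, df)
  # two-sided test with a noncentral t distribution under the alternative
  pt(-crit, df, ncp = ncp) + pt(crit, df, ncp = ncp, lower.tail = FALSE)
}
f_power <- function(delta_x, delta_z, var_x, var_z, n, alpha = 0.05,
                    df2 = n - 2) {
  ncp <- n * (delta_x^2 / var_x + delta_z^2 / var_z)
  # joint test of two effects referred to an F(2, df2) distribution
  pf(qf(1 - alpha, 2, df2), 2, df2, ncp = ncp, lower.tail = FALSE)
}

# With only 10 clusters the t-based power is visibly below the z-based power
p_z <- z_power(0.5, var_unit = 0.4, n = 10)
p_t <- t_power(0.5, var_unit = 0.4, n = 10)
p_f <- f_power(0.5, 0.4, var_x = 0.4, var_z = 0.3, n = 10)
```

The gap between `p_z` and `p_t` shrinks as the number of clusters grows, which is why the correction matters mainly for small trials.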
For the finite-sample corrected joint test of the two natural effects, no suitable parametric reference distribution exists, so we offer a simulation-based method to
generate the null and alternative distributions, and we use the simulated distributions to compute the power and required sample size.
A seed should be set via seed_mix for this random process to promote reproducibility; this is only needed under the natural effect joint test with finite-sample correction.
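The general idea of a simulation-based test, and the roles of a seed and a simulation size, can be sketched as follows. The statistic used here (a squared t-type term plus a squared z-type term) is an assumed stand-in for the mixed distribution, not the construction from the paper, and `sim_joint_power` with its `ncp_x`, `ncp_z`, and `df` arguments is hypothetical.

```r
# Illustrative only: the statistic is an assumed stand-in for the mixed
# distribution; ncp_x and ncp_z are standardized effects chosen by the user.
sim_joint_power <- function(ncp_x, ncp_z, df, alpha = 0.05,
                            seed_mix = NULL, size_mix = 1e4) {
  if (!is.null(seed_mix)) set.seed(seed_mix)   # fix the random draws
  draw <- function(shift_x, shift_z) {
    t_part <- (rnorm(size_mix) + shift_x) / sqrt(rchisq(size_mix, df) / df)
    z_part <- rnorm(size_mix) + shift_z
    t_part^2 + z_part^2
  }
  null_stat <- draw(0, 0)             # simulated null distribution
  alt_stat  <- draw(ncp_x, ncp_z)     # simulated alternative distribution
  crit <- quantile(null_stat, 1 - alpha, names = FALSE)
  mean(alt_stat > crit)               # share of alternative draws rejected
}

p1 <- sim_joint_power(2, 2, df = 8, seed_mix = 123456)
p2 <- sim_joint_power(2, 2, df = 8, seed_mix = 123456)
```

With the same seed the two calls give identical results; without a seed, repeated calls differ slightly because of Monte Carlo error, and a larger `size_mix` reduces that error at the cost of run time.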
The two types of estimand, the five types of test, and the development of correction are defined in Tian et al. (under review).
calc.H2x2Factorial returns an integer representing the required number of clusters or a decimal representing the power that can be achieved by the provided
sample size, with some useful and suppressible messages elaborating vital parameter choices and results (the power is displayed to 4 decimal places; the messages can be suppressed via verbose=FALSE).
# Predict the actual power of a natural effect joint test when the number of clusters is 10
joint.power <- calc.H2x2Factorial(n_input=10, delta_x=0.2, delta_z=0.1, rho=0.1, CV=0.38,
                                  estimand="natural", test="joint", correction=TRUE,
                                  seed_mix=123456, verbose=FALSE)
print(joint.power)
The function graph.H2x2Factorial plots the sample size estimates, that is, combinations of mean cluster sizes and numbers of clusters,
under varying CV for a chosen test. Based on the desired test and power, the function produces a plot with mean cluster size on the x-axis and number of clusters on
the y-axis, with one line per CV value when a vector of CV is specified. The limits of the y-axis
are automatically adjusted based on the extreme values calculated. A color-blind-friendly palette is set by default, but users can change it.
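The kind of plot described above can be approximated in base R. The sketch below does not use the package's variance formulas; it relies on a textbook two-arm cluster randomized trial formula with the design effect 1 + ((1 + CV^2) * m_bar - 1) * rho for unequal cluster sizes, and `approx_clusters` is a hypothetical helper.

```r
# Illustrative only: a textbook two-arm cluster randomized trial formula,
# not the hierarchical 2x2 factorial formulas used by the package.
approx_clusters <- function(m_bar, CV, rho, delta = 0.25, sigma2_y = 1,
                            power = 0.9, alpha = 0.05) {
  deff <- 1 + ((1 + CV^2) * m_bar - 1) * rho   # unequal-cluster-size design effect
  ceiling(4 * sigma2_y * (qnorm(1 - alpha / 2) + qnorm(power))^2 * deff /
            (m_bar * delta^2))
}

m_grid  <- seq(10, 100, by = 2)                # mean cluster sizes (x-axis)
cv_grid <- c(0, 0.3, 0.6, 0.9)                 # one line per CV value
cols    <- c("#0F2080", "#85C0F9", "#DDCC77", "#F5793A")
n_mat   <- sapply(cv_grid, function(cv) approx_clusters(m_grid, cv, rho = 0.1))

if (interactive()) {                           # draw only in an interactive session
  matplot(m_grid, n_mat, type = "l", lty = 1:4, lwd = 3, col = cols,
          ylim = range(n_mat),                 # y-limits follow the extreme values
          xlab = "Mean cluster size", ylab = "Number of clusters")
  legend("topright", legend = paste("CV =", cv_grid),
         lty = 1:4, lwd = 3, col = cols)
}
```

Each column of `n_mat` is one line in the plot, so a longer CV vector needs correspondingly longer color, line-width, and line-type vectors.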
graph.H2x2Factorial(m_lower=10, m_upper=100, m_step=2, CV=c(0,0.3,0.6,0.9), palette=c("#0F2080","#85C0F9","#DDCC77","#F5793A","#A95AA1"), line_width=rep(3,5), line_type=seq(1,5,1), title=NULL, power=0.8, alpha=0.05, pi_x=0.5, pi_z=0.5, delta_x=0.25, delta_z=0.33, delta_xz=0.3, sigma2_y=1, rho=0, estimand="controlled", test="cluster", correction=FALSE, max_n=1e8, seed_mix=NULL, size_mix=1e4, verbose=TRUE)
m_lower |
a numeric value larger than 2 for the lower bound of the mean cluster sizes on the horizontal axis. Default is 10. |
m_upper |
a numeric value larger than |
m_step |
a positive numeric value for the step size on the horizontal axis for plotting the sample size combinations. Default is 2. |
CV |
a vector of non-negative numeric values for a series of coefficients of variation of the cluster sizes. The length of the CV vector equals the number
of lines in the plot, so a CV vector of length at most 5 is suggested for a clear graph. A reasonable magnitude of CV is also recommended to produce effective plots.
Default is c(0, 0.3, 0.6, 0.9). |
palette |
a vector of character values to specify the color choices corresponding to the lines in the plot.
Default is c("#0F2080", "#85C0F9", "#DDCC77", "#F5793A", "#A95AA1"). |
line_width |
a vector of numeric values to specify the widths of the lines in the plot. Default is rep(3, 5). |
line_type |
a vector of numeric values to specify the line types of the lines in the plot. Default is seq(1, 5, 1). |
title |
a user-defined title or caption for the plot. Default is NULL. |
power |
a numeric value between 0 and 1 as the desired power level for sample size estimation. Default is 0.8. |
alpha |
a numeric value between 0 and 1 as the type I error rate. Default is 0.05. |
pi_x |
a numeric value between 0 and 1 as the proportion of clusters randomized to the cluster-level treatment. Default is 0.5. |
pi_z |
a numeric value between 0 and 1 as the proportion of individuals randomized to the individual-level treatment within each cluster. Default is 0.5. |
delta_x |
a nonzero numeric value for the (unstandardized) effect size of the marginal cluster-level treatment effect. Default is 0.25. |
delta_z |
a nonzero numeric value for the (unstandardized) effect size of the marginal individual-level treatment effect. Default is 0.33. |
delta_xz |
a nonzero numeric value for the (unstandardized) effect size of the interaction effect of the two treatments. Default is 0.3. |
sigma2_y |
a positive numeric value for the total variance of the continuous outcome. Default is 1. |
rho |
a numeric value between 0 and 1 as the intraclass correlation coefficient characterizing the between-cluster variability. Default is 0. |
estimand |
a character argument indicating the type of treatment effect estimand. Supported values include |
test |
a character argument indicating the type of hypothesis test of interest. Supported values include
|
correction |
a logical argument indicating whether a finite-sample correction should be used. Default is FALSE. |
max_n |
an optional setting of a maximum number of clusters, which is only functional under |
seed_mix |
an optional setting of a seed for conducting the simulation-based testing under a mixed distribution, which is only functional under |
size_mix |
a pre-specified size for the mixed distribution in the simulation-based procedure, which is only needed under |
verbose |
a logical argument indicating whether the parameter reiterations and supplementary messages should be presented or suppressed. Default is TRUE. |
graph.H2x2Factorial returns a plot comparing the sample size requirements under different CV, with some suppressible messages.
# Make a plot under the test for marginal cluster-level treatment effect
graph.H2x2Factorial(power=0.9, estimand="controlled", test="cluster", rho=0.1, verbose=FALSE)
The function table.H2x2Factorial
outputs a data frame that summarizes the required number of clusters and the predicted
power based on a constellation of design parameters. This function is useful when the user wants a series of table-format predictions
based on varying design parameters including mean cluster size (m_bar), intraclass correlation coefficient (rho), and coefficient of variation of the cluster sizes (CV).
table.H2x2Factorial(power=0.8, alpha=0.05, pi_x=0.5, pi_z=0.5, delta_x, delta_z, delta_xz, sigma2_y=1, m_bar, CV, rho, estimand="controlled", test="cluster", correction=FALSE, max_n=1e8, seed_mix=NULL, size_mix=1e4, verbose=TRUE)
power |
a numeric value between 0 and 1 as the desired power level for sample size estimation. Default is 0.8. |
alpha |
a numeric value between 0 and 1 as the type I error rate. Default is 0.05. |
pi_x |
a numeric value between 0 and 1 as the proportion of clusters randomized to the cluster-level treatment. Default is 0.5. |
pi_z |
a numeric value between 0 and 1 as the proportion of individuals randomized to the individual-level treatment within each cluster. Default is 0.5. |
delta_x |
a nonzero numeric value for the (unstandardized) effect size of the marginal cluster-level treatment effect. Default is |
delta_z |
a nonzero numeric value for the (unstandardized) effect size of the marginal individual-level treatment effect. Default is |
delta_xz |
a nonzero numeric value for the (unstandardized) effect size of the interaction effect of the two treatments. Default is |
sigma2_y |
a positive numeric value for the total variance of the continuous outcome. Default is 1. |
m_bar |
a vector of numeric values larger than 2 for a series of mean cluster sizes. |
CV |
a vector of non-negative numeric values for a series of coefficients of variation of the cluster sizes. |
rho |
a vector of numeric values between 0 and 1 for a series of intraclass correlation coefficients. |
estimand |
a character argument indicating the type of treatment effect estimand. Supported values include |
test |
a character argument indicating the type of hypothesis test of interest. Supported values include
|
correction |
a logical argument indicating whether a finite-sample correction should be used. Default is FALSE. |
max_n |
an optional setting of a maximum number of clusters, which is only functional under |
seed_mix |
an optional setting of a seed for conducting the simulation-based testing under a mixed distribution, which is only functional under |
size_mix |
a pre-specified size for the mixed distribution in the simulation-based procedure, which is only needed under |
verbose |
a logical argument indicating whether the parameter reiterations and supplementary messages should be presented or suppressed. Default is TRUE. |
If the user further requires a vector of power or of other parameters such as pi_x, which calls for multiple tables,
an external loop can easily be written around this function to produce multiple data frames.
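One possible shape for such an external loop is sketched below. The wrapper `make_tables` is hypothetical, and `toy_table` is only a stand-in so that the sketch runs without the package installed; in practice `table.H2x2Factorial` would be passed as `table_fun` together with its own arguments.

```r
# Hypothetical wrapper: calls a table-producing function once per power level
# and collects the resulting data frames in a named list.
make_tables <- function(table_fun, powers, ...) {
  tables <- lapply(powers, function(p) table_fun(power = p, ...))
  names(tables) <- paste0("power_", powers)
  tables
}

# Stand-in for table.H2x2Factorial so that the sketch is self-contained
toy_table <- function(power, m_bar, rho) {
  cbind(expand.grid(m_bar = m_bar, rho = rho), power = power)
}

tables <- make_tables(toy_table, powers = c(0.8, 0.9),
                      m_bar = c(10, 50), rho = c(0.01, 0.1))
# With the package loaded, the call would look like:
# make_tables(table.H2x2Factorial, powers = c(0.8, 0.9), delta_x = 0.2,
#             delta_z = 0.1, m_bar = c(10, 50), CV = 0, rho = 0.1, verbose = FALSE)
```

The result is a list with one data frame per power level, which can be inspected individually or combined with `do.call(rbind, tables)`.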
table.H2x2Factorial returns a data frame with the inputs m_bar, rho, and CV varied in a factorial setting, the predicted number of clusters n under the power requirement,
and the actual power predicted.power that the estimated sample size achieves, with some suppressible messages.
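The layout of such a table, with the design parameters crossed in a factorial setting, can be mimicked in base R with `expand.grid`. The sketch below fills the `n` and `predicted.power` columns from a textbook two-arm cluster randomized trial formula, which is an assumption of this example and not the package's method.

```r
# Illustrative only: a textbook two-arm cluster randomized trial formula
# stands in for the package's calculations.
delta <- 0.2; sigma2_y <- 1; alpha <- 0.05; power <- 0.8
z_alpha <- qnorm(1 - alpha / 2)

# Every combination of the three design parameters (3 x 2 x 3 = 18 rows)
design <- expand.grid(m_bar = c(10, 50, 100), rho = c(0.01, 0.1),
                      CV = c(0, 0.3, 0.5))

# Design effect for unequal cluster sizes
deff <- 1 + ((1 + design$CV^2) * design$m_bar - 1) * design$rho
# Variance of the effect estimator multiplied by the number of clusters
var_unit <- 4 * sigma2_y * deff / design$m_bar

design$n <- ceiling((z_alpha + qnorm(power))^2 * var_unit / delta^2)
# Power actually achieved once n has been rounded up
design$predicted.power <- round(
  pnorm(abs(delta) * sqrt(design$n / var_unit) - z_alpha), 4)
```

Because `n` is rounded up to an integer, `predicted.power` is at or slightly above the requested power in every row.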
# Make a result table by providing three mean cluster sizes, three CVs, and two ICCs
table.cluster <- table.H2x2Factorial(delta_x=0.2, delta_z=0.1, m_bar=c(10,50,100), CV=c(0, 0.3, 0.5),
                                     rho=c(0.01, 0.1), estimand="controlled", test="cluster",
                                     verbose=FALSE)
table.cluster