Package 'H2x2Factorial' reference manual

Title:	Sample Size Calculation in Hierarchical 2x2 Factorial Trials
Description:	Implements the sample size methods for hierarchical 2x2 factorial trials under two choices of effect estimands and a series of hypothesis tests proposed in "Sample size calculation in hierarchical 2x2 factorial trials with unequal cluster sizes" (under review), and provides the table and plot generators for the sample size estimations.
Authors:	Zizhong Tian [aut, cre], Denise Esserman [aut], Guangyu Tong [aut], Fan Li [aut]
Maintainer:	Zizhong Tian <[email protected]>
License:	LGPL (>= 2.1)
Version:	2.0.0
Built:	2025-03-06 03:02:03 UTC
Source:	https://github.com/billytian/h2x2factorial

H2x2Factorial Sample Size and Power Calculation

Description

The function calc.H2x2Factorial estimates the required number of clusters or the achieved power level under different types of hypothesis tests of either the controlled (main) effect (by default) or the natural (marginal) effect of the two treatments in a hierarchical 2x2 factorial trial with unequal cluster sizes and a continuous outcome. Two types of treatment effect estimands and five types of hypothesis tests, along with their corresponding finite-sample corrections, can be chosen for the predictions. Users may supply an optional number of clusters through the n_input argument. When this number is provided, the function calculates the power under the chosen hypothesis test, with a finite-sample correction if specified, and ignores any value supplied for the power argument. When the number of clusters is not provided, the function calculates the required number of clusters for a given power threshold, which is set to 0.8 by default.

Usage

calc.H2x2Factorial(power=0.8, n_input=NULL, alpha=0.05,
                   pi_x=0.5, pi_z=0.5,
                   delta_x=0.25, delta_z=0.33, delta_xz=0.3, sigma2_y=1,
                   m_bar=50, CV=0, rho=0,
                   estimand="controlled", test="cluster", correction=FALSE,
                   max_n=1e8, seed_mix=NULL, size_mix=1e4,
                   verbose=TRUE)

Arguments

`power`	a numeric value between 0 and 1 as the desired power level for sample size estimation. Default is `0.8`.
`n_input`	the number of clusters provided by the user, for which the achievable power is estimated. Default is `NULL`.
`alpha`	a numeric value between 0 and 1 as the type I error rate. Default is `0.05`.
`pi_x`	a numeric value between 0 and 1 as the proportion of clusters randomized to the cluster-level treatment. Default is `0.5`, representing a balanced allocation.
`pi_z`	a numeric value between 0 and 1 as the proportion of individuals randomized to the individual-level treatment within each cluster. Default is `0.5`, representing a balanced allocation.
`delta_x`	a nonzero numeric value for the (unstandardized) effect size of the marginal cluster-level treatment effect. Default is `0.25`, which is the hypothetical value for the example in the referenced paper.
`delta_z`	a nonzero numeric value for the (unstandardized) effect size of the marginal individual-level treatment effect. Default is `0.33`, which is the hypothetical value for the example in the referenced paper.
`delta_xz`	a nonzero numeric value for the (unstandardized) effect size of the interaction effect of the two treatments. Default is `0.3`, which is the hypothetical value for the example in the referenced paper.
`sigma2_y`	a positive numeric value for the total variance of the continuous outcome. Default is `1`.
`m_bar`	a numeric value larger than 2 for the mean cluster size. Default is `50`.
`CV`	a nonnegative numeric value as the coefficient of variation of the cluster sizes. Default is `0`, representing equal cluster sizes.
`rho`	a numeric value between 0 and 1 as the intraclass correlation coefficient characterizing the between-cluster variability. Default is `0`.
`estimand`	a character argument indicating the type of treatment effect estimand. Supported values include `"controlled"` (controlled or main effect estimand) and `"natural"` (natural or marginal effect estimand). Default is `"controlled"`.
`test`	a character argument indicating the type of hypothesis test of interest. Supported values include `"cluster"` (test for marginal cluster-level treatment effect), `"individual"` (test for marginal individual-level treatment effect), `"interaction"` (interaction test for the two treatments), `"joint"` (joint test for the two marginal treatment effects), `"I-U"` (intersection-union test for the two marginal effects). Default is `"cluster"`.
`correction`	a logical argument indicating whether a finite sample correction should be used. Default is `FALSE`.
`max_n`	an optional setting of a maximum number of clusters, which is only functional under `test="cluster"`, `"joint"`, or `"I-U"`. Default is `1e8`.
`seed_mix`	an optional setting of a seed for conducting the simulation-based testing under a mixed distribution, which is only functional under `test="joint"`. Default is `NULL`.
`size_mix`	a pre-specified size for the mixed distribution in the simulation-based procedure, which is only needed under `test="joint"`. Default is `1e4`.
`verbose`	a logical argument indicating whether the parameter reiterations and supplementary messages should be presented or suppressed. Default is `TRUE`.

Details

Given the input parameters, the method first computes the variances of the effects of interest based on generalized least squares estimators and large-sample approximations. The variances are then used to build either the classic sample size formulas (for the separate tests of the controlled or natural treatment effects and the interaction test) or the power formulas (for the simultaneous tests and the corrected tests), which deliver both the sample size and power calculations.

Without finite-sample considerations, the separate tests of the two controlled effects and the two natural effects, as well as the interaction test, use the two-sided Wald z-test; the joint test uses the chi-square test; and the intersection-union (I-U) test also uses a two-sided z-based test.

With correction=TRUE, finite-sample corrections are customized for the three types of tests involving either the controlled effect or the natural effect of the cluster-level treatment. For the tests of the controlled effect and the natural effect of the cluster-level treatment, a two-sided t-test is used. For the joint test of the two controlled effects, an F-test is used as a naive correction, which might lead to slight overpowering. For the joint test of the two natural effects, a simulation-based mixed F-chi-square test is used. For the I-U test of the two controlled effects, a two-sided t-based test is used as a naive correction, which might lead to slight overpowering. For the I-U test of the two natural effects, a two-sided mixed t- and z-based test is used.

For the finite-sample corrected joint test of the two natural effects, the required parametric distribution does not exist, so a simulation-based method generates the null and alternative distributions, and the simulated distributions are used to compute the power and the required sample size. A seed should be set via seed_mix for this random process to promote reproducibility; it is only needed under the natural effect joint test with finite-sample correction. The two types of estimand, the five types of test, and the development of the corrections are defined in Tian et al. (under review).
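
As a rough illustration of the classic formula behind an uncorrected two-sided Wald z-test, the base R sketch below converts an effect size and a variance term into a number of clusters, and back into a power. The helper names n_wald_z and power_wald_z and the argument var_unit (the estimator variance multiplied by the number of clusters) are hypothetical; the package derives its own variance expressions, which are not reproduced here.

```r
# Generic two-sided Wald z-test sample size (illustrative sketch only).
# `var_unit` is a hypothetical stand-in for n * Var(effect estimator);
# it is NOT the variance expression used inside calc.H2x2Factorial.
n_wald_z <- function(delta, var_unit, alpha = 0.05, power = 0.8) {
  stopifnot(delta != 0, var_unit > 0)
  z_alpha <- qnorm(1 - alpha / 2)  # critical value of the two-sided test
  z_beta  <- qnorm(power)          # quantile matching the target power
  ceiling((z_alpha + z_beta)^2 * var_unit / delta^2)
}

# Power of the same test for a given number of clusters n
# (ignoring the negligible rejection probability in the opposite tail).
power_wald_z <- function(n, delta, var_unit, alpha = 0.05) {
  pnorm(abs(delta) / sqrt(var_unit / n) - qnorm(1 - alpha / 2))
}

n_needed <- n_wald_z(delta = 0.25, var_unit = 1)
print(n_needed)
print(power_wald_z(n_needed, delta = 0.25, var_unit = 1))
```

Rounding the cluster count up is why the power at n_needed is at least the target rather than exactly equal to it.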

Value

calc.H2x2Factorial returns an integer representing the required number of clusters or a decimal representing the power that can be achieved by the provided sample size, with some useful and suppressible messages elaborating vital parameter choices and results (the power will be displayed in 4 decimal places; the messages can be suppressed via verbose=FALSE).
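
The two return modes can be sketched as follows, assuming the H2x2Factorial package is installed. The design values (power of 0.9, rho of 0.1, CV of 0.38) are illustrative choices, and the object names n_required and achieved_power are hypothetical.

```r
# Sketch assuming the H2x2Factorial package is installed; the design values
# below (rho, CV, power) are illustrative choices, not recommendations.
if (requireNamespace("H2x2Factorial", quietly = TRUE)) {
  # Sample size mode: n_input is left as NULL, so `power` is the target.
  n_required <- H2x2Factorial::calc.H2x2Factorial(
    power = 0.9, rho = 0.1, CV = 0.38,
    estimand = "controlled", test = "cluster", verbose = FALSE
  )

  # Power mode: n_input is supplied, so any `power` argument is ignored.
  achieved_power <- H2x2Factorial::calc.H2x2Factorial(
    n_input = n_required, rho = 0.1, CV = 0.38,
    estimand = "controlled", test = "cluster", verbose = FALSE
  )

  print(n_required)
  print(achieved_power)
}
```

Feeding the returned number of clusters back through n_input is a simple way to check what power that sample size is predicted to achieve.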

Examples

#Predict the actual power of a natural effect joint test when the number of clusters is 10
joint.power <- calc.H2x2Factorial(n_input=10,
                                  delta_x=0.2, delta_z=0.1,
                                  rho=0.1, CV=0.38,
                                  estimand="natural",
                                  test="joint",
                                  correction=TRUE, seed_mix=123456, verbose=FALSE)
print(joint.power)

H2x2Factorial Plot

Description

The function graph.H2x2Factorial plots the sample size estimations or combinations of mean cluster sizes and cluster numbers under variable CV for a chosen test. Based on the desired test and power, the function produces a plot with mean cluster size on the x-axis and number of clusters on the y-axis, with multiple lines representing the dynamic sample size constraints if a vector of CV is specified. The limits of the y-axis will be automatically adjusted based on the extreme values calculated. A color-blind-friendly palette is set by default but it can be updated by users.

Usage

graph.H2x2Factorial(m_lower=10, m_upper=100, m_step=2,
                    CV=c(0,0.3,0.6,0.9),
                    palette=c("#0F2080","#85C0F9","#DDCC77","#F5793A","#A95AA1"),
                    line_width=rep(3,5), line_type=seq(1,5,1), title=NULL,
                    power=0.8, alpha=0.05,
                    pi_x=0.5, pi_z=0.5,
                    delta_x=0.25, delta_z=0.33, delta_xz=0.3, sigma2_y=1, rho=0,
                    estimand="controlled", test="cluster", correction=FALSE,
                    max_n=1e8, seed_mix=NULL, size_mix=1e4,
                    verbose=TRUE)

Arguments

`m_lower`	a numeric value larger than 2 for the lower bound of the mean cluster sizes on the horizontal axis. Default is `10`.
`m_upper`	a numeric value larger than `m_lower` for the upper bound of the mean cluster sizes on the horizontal axis. Default is `100`.
`m_step`	a positive numeric value for the step size on the horizontal axis for plotting the sample size combinations. Default is `2`.
`CV`	a vector of nonnegative numeric values for a series of coefficients of variation of the cluster sizes. The length of the CV vector equals the number of lines presented in the plot, so a CV vector of length less than or equal to 5 is suggested for a clear-looking graph. A reasonable magnitude of CV is also highly recommended to produce effective plots. Default is `c(0, 0.3, 0.6, 0.9)`.
`palette`	a vector of character values to specify the color choices corresponding to the lines in the plot. Default is `c("#0F2080", "#85C0F9", "#DDCC77", "#F5793A", "#A95AA1")`. The order should match the specification of CV, and the number of elements should be no less than the length of the CV vector.
`line_width`	a vector of numeric values to specify the widths of the lines in the plot. Default is `rep(3, 5)`. The order should match the specification of CV, and the number of elements should be no less than the length of the CV vector.
`line_type`	a vector of numeric values to specify the line types of the lines in the plot. Default is `seq(1, 5, 1)`. The order should match the specification of CV, and the number of elements should be no less than the length of the CV vector.
`title`	a user-defined title or caption for the plot. Default is `NULL`. By default, a formal test name will be automatically given.
`power`	a numeric value between 0 and 1 as the desired power level for sample size estimation. Default is `0.8`.
`alpha`	a numeric value between 0 and 1 as the type I error rate. Default is `0.05`.
`pi_x`	a numeric value between 0 and 1 as the proportion of clusters randomized to the cluster-level treatment. Default is `0.5`, representing a balanced allocation.
`pi_z`	a numeric value between 0 and 1 as the proportion of individuals randomized to the individual-level treatment within each cluster. Default is `0.5`, representing a balanced allocation.
`delta_x`	a nonzero numeric value for the (unstandardized) effect size of the marginal cluster-level treatment effect. Default is `0.25`, which is the hypothetical value for the example in the referenced paper.
`delta_z`	a nonzero numeric value for the (unstandardized) effect size of the marginal individual-level treatment effect. Default is `0.33`, which is the hypothetical value for the example in the referenced paper.
`delta_xz`	a nonzero numeric value for the (unstandardized) effect size of the interaction effect of the two treatments. Default is `0.3`, which is the hypothetical value for the example in the referenced paper.
`sigma2_y`	a positive numeric value for the total variance of the continuous outcome. Default is `1`.
`rho`	a numeric value between 0 and 1 as the intraclass correlation coefficient characterizing the between-cluster variability. Default is `0`.
`estimand`	a character argument indicating the type of treatment effect estimand. Supported values include `"controlled"` (controlled or main effect estimand) and `"natural"` (natural or marginal effect estimand). Default is `"controlled"`.
`test`	a character argument indicating the type of hypothesis test of interest. Supported values include `"cluster"` (test for marginal cluster-level treatment effect), `"individual"` (test for marginal individual-level treatment effect), `"interaction"` (interaction test for the two treatments), `"joint"` (joint test for the two marginal treatment effects), `"I-U"` (intersection-union test for the two marginal effects). Default is `"cluster"`.
`correction`	a logical argument indicating whether a finite sample correction should be used. Default is `FALSE`.
`max_n`	an optional setting of a maximum number of clusters, which is only functional under `test="cluster"`, `"joint"`, or `"I-U"`. Default is `1e8`.
`seed_mix`	an optional setting of a seed for conducting the simulation-based testing under a mixed distribution, which is only functional under `test="joint"`. Default is `NULL`.
`size_mix`	a pre-specified size for the mixed distribution in the simulation-based procedure, which is only needed under `test="joint"`. Default is `1e4`.
`verbose`	a logical argument indicating whether the parameter reiterations and supplementary messages should be presented or suppressed. Default is `TRUE`.

Value

graph.H2x2Factorial returns a plot comparing the sample size requirements under different CV, with some suppressible messages.
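
A customized plot might be set up as in the sketch below, which assumes the H2x2Factorial package is installed. The CV values, colors, title text, and the variable names cv_values, line_colors, line_widths, and line_types are illustrative assumptions; only the argument names come from the Usage section.

```r
# Styling vectors are matched to CV by position, so each needs at least
# as many elements as there are CV values.
cv_values   <- c(0, 0.5, 1)
line_colors <- c("#0F2080", "#DDCC77", "#F5793A")
line_widths <- rep(2, length(cv_values))
line_types  <- seq_along(cv_values)
stopifnot(length(line_colors) >= length(cv_values))

# Sketch assuming the H2x2Factorial package is installed.
if (requireNamespace("H2x2Factorial", quietly = TRUE)) {
  H2x2Factorial::graph.H2x2Factorial(
    m_lower = 10, m_upper = 60, m_step = 5,
    CV = cv_values, palette = line_colors,
    line_width = line_widths, line_type = line_types,
    title = "Cluster-level test, ICC = 0.05",
    power = 0.8, rho = 0.05,
    estimand = "controlled", test = "cluster", verbose = FALSE
  )
}
```

Building the styling vectors from length(cv_values) keeps them in step with CV when lines are added or removed.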

Examples

#Make a plot under the test for marginal cluster-level treatment effect
graph.H2x2Factorial(power=0.9, estimand="controlled", test="cluster", rho=0.1, verbose=FALSE)

H2x2Factorial Table

Description

The function table.H2x2Factorial outputs a data frame that summarizes the required number of clusters and the predicted power based on a constellation of design parameters. This function is useful when the user wants a series of table-format predictions based on varying design parameters including mean cluster size (m_bar), intraclass correlation coefficient (rho), and coefficient of variation of the cluster sizes (CV).

Usage

table.H2x2Factorial(power=0.8, alpha=0.05,
                    pi_x=0.5, pi_z=0.5,
                    delta_x, delta_z, delta_xz, sigma2_y=1,
                    m_bar, CV, rho,
                    estimand="controlled", test="cluster", correction=FALSE,
                    max_n=1e8, seed_mix=NULL, size_mix=1e4,
                    verbose=TRUE)

Arguments

`power`	a numeric value between 0 and 1 as the desired power level for sample size estimation. Default is `0.8`.
`alpha`	a numeric value between 0 and 1 as the type I error rate. Default is `0.05`.
`pi_x`	a numeric value between 0 and 1 as the proportion of clusters randomized to the cluster-level treatment. Default is `0.5`, representing a balanced allocation.
`pi_z`	a numeric value between 0 and 1 as the proportion of individuals randomized to the individual-level treatment within each cluster. Default is `0.5`, representing a balanced allocation.
`delta_x`	a nonzero numeric value for the (unstandardized) effect size of the marginal cluster-level treatment effect. Default is `0.25`, which is the hypothetical value for the example in the referenced paper.
`delta_z`	a nonzero numeric value for the (unstandardized) effect size of the marginal individual-level treatment effect. Default is `0.33`, which is the hypothetical value for the example in the referenced paper.
`delta_xz`	a nonzero numeric value for the (unstandardized) effect size of the interaction effect of the two treatments. Default is `0.3`, which is the hypothetical value for the example in the referenced paper.
`sigma2_y`	a positive numeric value for the total variance of the continuous outcome. Default is `1`.
`m_bar`	a vector of numeric values larger than 2 for a series of mean cluster sizes.
`CV`	a vector of nonnegative numeric values for a series of coefficients of variation of the cluster sizes.
`rho`	a vector of numeric values between 0 and 1 for a series of intraclass correlation coefficients.
`estimand`	a character argument indicating the type of treatment effect estimand. Supported values include `"controlled"` (controlled or main effect estimand) and `"natural"` (natural or marginal effect estimand). Default is `"controlled"`.
`test`	a character argument indicating the type of hypothesis test of interest. Supported values include `"cluster"` (test for marginal cluster-level treatment effect), `"individual"` (test for marginal individual-level treatment effect), `"interaction"` (interaction test for the two treatments), `"joint"` (joint test for the two marginal treatment effects), `"I-U"` (intersection-union test for the two marginal effects). Default is `"cluster"`.
`correction`	a logical argument indicating whether a finite sample correction should be used. Default is `FALSE`.
`max_n`	an optional setting of a maximum number of clusters, which is only functional under `test="cluster"`, `"joint"`, or `"I-U"`. Default is `1e8`.
`seed_mix`	an optional setting of a seed for conducting the simulation-based testing under a mixed distribution, which is only functional under `test="joint"`. Default is `NULL`.
`size_mix`	a pre-specified size for the mixed distribution in the simulation-based procedure, which is only needed under `test="joint"`. Default is `1e4`.
`verbose`	a logical argument indicating whether the parameter reiterations and supplementary messages should be presented or suppressed. Default is `TRUE`.

Details

If the user also needs to vary the power or other parameters such as pi_x, which requires multiple tables, an external loop can be written around this function to produce multiple data frames.
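
One possible form of such an external loop is sketched below, assuming the H2x2Factorial package is installed. The power levels, design values, and the names power_levels and tables_by_power are illustrative assumptions; delta_xz is passed explicitly because the Usage section lists it without a default.

```r
# Power levels for which separate tables are wanted (illustrative values).
power_levels <- c(0.8, 0.9)

# Sketch assuming the H2x2Factorial package is installed.
if (requireNamespace("H2x2Factorial", quietly = TRUE)) {
  # One data frame per power level, collected in a list.
  tables_by_power <- lapply(power_levels, function(p) {
    H2x2Factorial::table.H2x2Factorial(
      power = p,
      delta_x = 0.2, delta_z = 0.1, delta_xz = 0.3,
      m_bar = c(10, 50), CV = c(0, 0.3), rho = c(0.01, 0.1),
      estimand = "controlled", test = "cluster", verbose = FALSE
    )
  })
  # Label each table by the power level that produced it.
  names(tables_by_power) <- paste0("power_", power_levels)
  print(names(tables_by_power))
}
```

The same pattern applies to pi_x or any other scalar argument: loop over its values and keep the resulting data frames in a named list.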

Value

table.H2x2Factorial returns a data frame containing the inputs m_bar, rho, and CV varied in a factorial setting, the predicted number of clusters n under the power requirement, and the actual power predicted.power that the estimated sample size achieves, with some suppressible messages.
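
To see how many parameter combinations a factorial setting implies, the base R sketch below builds the grid directly with expand.grid. The name design_grid is hypothetical, and the row and column order of the package's own output may differ; the sketch only counts the combinations.

```r
# Base R sketch of a full factorial grid over the three varied inputs.
# The row order produced by the package may differ; this only shows how
# many parameter combinations such a table covers.
design_grid <- expand.grid(
  m_bar = c(10, 50, 100),
  rho   = c(0.01, 0.1),
  CV    = c(0, 0.3, 0.5)
)
print(nrow(design_grid))  # 3 * 2 * 3 = 18 combinations
print(head(design_grid))
```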

Examples

#Make a result table by providing three mean cluster sizes, three CV values, and two ICC values
table.cluster <- table.H2x2Factorial(delta_x=0.2, delta_z=0.1,
                                     m_bar=c(10,50,100), CV=c(0, 0.3, 0.5), rho=c(0.01, 0.1),
                                     estimand="controlled", test="cluster", verbose=FALSE)
table.cluster
