Rfuncs Project
Bayesians are always shrinking things, but they are growing.
- Guernsey McPearson, alter ego of Stephen Senn
BetaBinomial(), PlotBetaDensities(), BB.WhatIf(), CI.P.score()
Sourcing the file BetaBinomial.RfuncYYMMDD.r (where "YYMMDD" is the version/release date) contains code, loads four functions that relate to analyzing a single proportion, P:
-
BetaBinomial(). Performs a Bayesian beta-binomial analysis of P.
-
PlotBetaDensities(). Plots two beta density functions, such as that shown here.
-
BB.WhatIf(). Performs statistical planning analyses for studies that will depend on beta-binomial analyses.
-
CI.P.score(). Computes the frequentist estimate and the confidence interval for P based on the score method, sometimes called Wilson's method.
​
The examples are not run when the file is sourced, but see

For now, email obrien.ralph@gmail.com for beta version.
The common beta-binomial model is a relatively simple, excellent, and flexible tool for analyzing a single proportion, P, within the classical Bayesian framework. It takes P to be distributed as a beta(a, b) distribution, denoted P ~ beta(a, b), where a and b are the shape1 and shape2 arguments in R's family of beta distribution functions. Beta(1, 1) equates to the standard uniform distribution in which all values of P are equally likely. Beta(a > 1, b > 1) is unimodal. The data are m1 "successes" and m2 "failures," and m1 is taken to be distributed as a binomial random variable with sample size m1+m2 and probability of "success" P, i.e., m1 ~ binomial(m1+m2, P). Given the prior P ~ beta(a0, b0), the posterior is P ~ beta(a0+m1, b0+m2).
​
Example 1 covers the basics using a fictional study based in reality. Examples 2, 3, and 4 use data from three trials that tested vaccines for COVID-19. These show how to adapt the method to assess the incidence rate ratio (IRR, like a fixed hazards ratio) in two-group designs with a time-to-event outcome measure and right censoring. Here, vaccine efficacy is quantified as VE = 1 - IRR.
​
Setting a0 and b0. a0 and b0 may be set directly with the A0 and B0 arguments. However, it is often very helpful to compute values for a0 > 1 and b0 > 1 based on specifying
-
a prior median of P (P_0.50) and a Q*100% quantile (P.Q), i.e. Prob[P < P.Q] = Q, or
-
a prior mode of P and a Q*100% quantile (P.Q)
In addition, specifying
-
a mean and standard deviation
will fit beta(a0 > 0, b0 > 0).
​
Transformations of P. The Trans and TransName arguments transform the results for P to those of t(P), where t() is monotonic. Examples 2, 3, and 4 use t(P) = P/(1 - P)*OSTR, where OSTR, the overall surveillance time ratio, a constant defined below.
​
Arguments
​
A0, B0
​
​
​
Defines the prior distribution, P ~ beta(a0, b0) directly.
Instead of using A0 and B0, P ~ beta(a0, b0) can be fit by specifying Median and P.Q, Mode and P.Q, or Mean and SD. Not all values for these arguments lead to a fitable beta distribution by this methodology. If the fit fails, BetaBinomial() returns $solved=FALSE and prints an appropriate message.
Median
​
​
Mode
​
Mean
Median of P ~ beta(a0>1, b0>1); requires P.Q and Q. The fit is not exact, but almost always sufficient for typical problems.
Mode of P ~ beta(a0>1, b0>1); requires P.Q and Q. Yields exact solution.
Mean of P ~ beta(a>0, b>0); requires SD. Yields exact solution.
P.Q and Q are required when using Median or Mode. SD is required when using Mean.
P.Q
​
​
​
​
Q
​
SD
Q*100% quantile of P, i.e. Prob[P < P.Q] = Q. Fitting is aided when P.Q is well separated from Median or Mode:
-
For Median or Mode <= 0.50, best to use Q >= 0.80.
-
For Median or Mode > 0.50, best to use Q < 0.20.
​
Defines P.Q.
​
Standard deviation of P ~ beta(a0, b0)
To obtain posterior results, you must specify data. The default, M=c(0,0), limits the output to the prior distribution.
M
M = c(m1, m2), where m1 is the number of "successes" and m2 is the number "failures." Thus, the sample proportion of successes is m1/(m1+m2).
The default transformation is t(P) = P, giving a classical beta-binomial problem as per Example 1. Examples 2, 3, and 4 use IRR(P) = P/((1-P)*OSTR.
TransName
​
Trans
Name of transform, such as "IRR" in Examples 2, 3, and 4.
Transformation of P, such as Trans="P/((1-P)*OSTR".
Requests for quantiles and cumulative probabilities for P ~ beta(a0, b0), which may be transformed using TransName and Trans. If using a transformationof P, the quantiles reported are for t(P) and the cumulative probabilities are for Prob[T(P) < t(p)].
PI.level=0.95
​
​
Q.points
​
​
​
CP.points
​
​
​
Level for probability (credible) interval(s), PI = [LPL, UPL], and its associated spread(s),
UPL - LPL.
​
Sets points to compute prior and posterior quantiles of P. The default is Q.points=c(0.005, 0.025,0.05,0.25,0.50,0.75,0.95,0.975,0.995), which gives limits for 90%, 95%, and 99% probability (credible) intervals.
Sets points for computing Prob[P < p], the cumulative probabilities for both prior and posterior beta distributions. For example, CP.points = c(0.20, 0.50) computes Prob[P < 0.20] and Prob[P < 0.50].
​
Print=FALSE suppresses printing of results.
Objects Returned
A list with elements:
$solved
$a0, $b0
$a1, $b1
$qntl
​
$cprob
TRUE/FALSE whether fitting of prior beta(a0, b0) succeeded.
Parameters for prior, P ~ beta(a0, b0).
Parameters for posterior, P ~ beta(a1, b1).
100*q% quantiles of P or t(P), where P ~ beta(a0,b0) and, if M supplied, P ~ beta(a1, b1)); q set by Q.points.
Cumulative probabilities, Prob[P < p)] or Prob[t(P) < t(p)] as set by CP.points. Names may be modified using CProbNames.
​
Methodology
​
The methodology is summarized in the code, which implements well-known theory about the standard beta(a, b) distribution and the beta-binomial model for analyzing a single proportion. The additional methodology underlying the COVID vaccine trials is summarized before Example 2.
Examples
​
-
Example 1 is based on my recollections of an actual study that Jason Connor and I helped design at the Cleveland Clinic in the early 2000s. (The investigator died suddenly before the proposal was funded.) This focuses directly on a single proportion, P, which was known to be small.
​
-
Example 2 deals with the actual analysis and data from the 2020 Pfizer-BioNTech clinical trial that successfully tested their vaccine for COVID. This uses IRR(P) = P/((1- P)*OSTR. where OSTR is the ratio of the group's total at-risk (surveillance) times. Three prior distributions are assessed and compared, noting how they relate to the Scientific Method.
​
-
Example 3 uses the same method as Example 2 to analyze the data from the Janssen (Johnson & Johnson) clinical trial of their COVID vaccine. These analyses demonstrate how statements made widely in the mass media lacked statistical credibility and even common sense.
​
-
Example 4 is much like Example 3, but is based on AstaZenaca's trial of their COVID vaccine. However, it randomized 2:1 vaccine to placebo. (To be completed whenever the necessary data and AstaZenaca's analyses are submitted to the US FDA and publicly released.)
​