top of page

Bayesians are always shrinking things, but they are growing.

- Guernsey McPearson, alter ego of Stephen Senn

BetaBinomial()

For now, email obrien.ralph@gmail.com for beta version.

The common beta-binomial model is a relatively simple, excellent, and flexible tool for analyzing a single proportion, P, within the classical Bayesian framework. It takes P to be distributed as a beta(a, b) distribution, denoted P ~ beta(a, b), where a and b are the shape1 and shape2 arguments in R's family of beta distribution functions. Beta(1, 1) equates to the standard uniform distribution in which all values of P are equally likely. Beta(a > 1, b > 1) is unimodal. The data are m1 "successes" and m2 "failures," and m1 is taken to be distributed as a binomial random variable with sample size m1+m2 and probability of "success" P, i.e., m1 ~ binomial(m1+m2, P). Given the prior P0 ~ beta(a0, b0), the posterior is P1 ~ beta(a1 = a0+m1, a2 = b0+m2).

​

Example 1 covers, in depth, how to use the functions in the downloaded BetaBinomial file to handle a one-sample clinical trial to test a treatment for a fictitious serious disease (like COVID), including creating effective graphics for the prior and posterior beta distributions and performing study planning analyses. Example 2 discusses a study that seeks to infer what P is, given that know it must be quite low.

 

Examples 3 and 4 use data from two real trials that tested vaccines for COVID-19.  These show how Pfizer-BioNTech adapted the method to assess the incidence rate ratio (IRR, like a fixed hazards ratio) in two-group designs with a time-to-event outcome measure and right censoring. Here, vaccine efficacy is quantified as VE = 1 - IRR. However, the method is "lazy" in how it handles time under surveillance. The Rfunc IRR.Bayes() does this more precisely.

​

Setting P0 ~ beta(a0, b0). a0 and b0 may be set directly with the A0 and B0 arguments. However, it is often very helpful to compute values for a0 > 1 and b0 > 1 based on specifying

  •  a prior median of P (P0.50) and a Q*100% quantile (P0.Q), i.e. Prob[P0 < P.Q] = Q, or

  •  a prior mode of P and a Q*100% quantile (P.Q)

  •  a mean and standard deviation for P0.

​

Transformations of P. The Trans and TransName arguments transform the results for P to those of t(P), where t() is monotonic. Examples 3 and 4 use t(P) = P/(1 - P)*TTUSR, where TTUSR, the total time under surveillance ratio, a constant defined below.

​

Arguments

​

A0, B0

​

​

​

Defines the prior distribution, P0 ~ beta(a0, b0) directly.

Instead of using A0 and B0, P0 ~ beta(a0 ≥ 1, b0 ≥ 1) can be fit by specifying P0.50 and P.Q, Mode and P.Q, or Mean and SD. Not all values for these arguments lead to a fitable beta distribution by this methodology. If the fit fails, BetaBinomial() returns $solved=FALSE and prints an appropriate message.

P0.50

​

​

Mode

​

Mean

Median of P0 ~ beta(a0 ≥ 1, b0 ≥ 1); requires P.Q and Q. The fit is not exact, but almost always sufficient for typical problems.


Mode of P0 ~ beta(a0 ≥ 1, b0 ≥ 1); requires P.Q and Q. Yields exact solution.


Mean of P0  ~ beta(a0 ≥ 1, b0 ≥ 1); requires SD. Yields exact solution.

P0.Q and Q are required when using P0.50 or Mode. SD is required when using Mean.

P0.Q

​

​

​

​

Q

​

SD

Q*100% quantile of P, i.e. Prob[P0 < P0.Q] = Q. Fitting is aided when P0.Q is well separated from Median or Mode:

  •  For Median or Mode <= 0.50, best to use Q >= 0.80.

  •  For Median or Mode > 0.50, best to use Q < 0.20.

​

Defines P.Q.

​

Standard deviation of P0 ~ beta(a0, b0)

To obtain posterior results, you must specify data. The default, M=c(0,0), limits the output to describing the prior distribution.

M

M = c(m1, m2), where m1 is the number of "successes" and m2 is the number "failures." Thus, the sample proportion of successes is m1/(m1+m2).

The default transformation is t(P) = P, giving a classical beta-binomial problem as per Example 1. Examples 3 and 4 use IRR(P) = P/((1-P)*TTUSR.

TransName

​

Trans

Name of transform, such as "IRR" in Examples 3 and 4.

 

Transformation of P, such as Trans="P/((1-P)*TTUSR".

Requests for quantiles and cumulative probabilities for P0 ~ beta(a0 ≥ 1, b0 ≥ 1), which may be transformed using TransName and Trans. If using a transformation of P, the quantiles reported are for t(P) and the cumulative probabilities are for Prob[T(P) < t(p)].

PI.level=0.95

​

​

Q.points

​

​

​

CP.points

​

​

​

Print

Level for probability (credible) interval(s), PI = [LPL, UPL], and its associated spread(s),
                UPL - LPL.

​

Sets points to compute prior and posterior quantiles of P. The default is Q.points=c(0.005, 0.025,0.05,0.25,0.50,0.75,0.95,0.975,0.995), which gives limits for 90%, 95%, and 99% probability (credible) intervals.

 

Sets points for computing Prob[P < p], the cumulative probabilities for both prior and posterior beta distributions. For example, CP.points = c(0.20, 0.50) computes Prob[P < 0.20] and Prob[P < 0.50].

​

Print=FALSE suppresses printing of results.

Object Returned

A list with elements:

$solved

$a0, $b0

$a1, $b1

P.50

​

PI

​

​

PI.spread

$qntl

​

$cprob

TRUE/FALSE whether fitting of prior beta(a0, b0) succeeded.

Parameters for prior, P ~ beta(a0, b0).

Parameters for posterior, P ~ beta(a1, b1).

Matrix with medians of P0 or t(P0), where P0 ~ beta(a0,b0) and, if M supplied, P1 ~ beta(a1, b1)).q set by Q.points.

Matrix of lower and upper limits of 100*PI.level% probability intervals, LPL and UPL, for P0 ~ beta(a0, b0) and, if M supplied, P1 ~ beta(a1, b1)), or their like values transformed by t(P0) and t(P1).

UPL - LPL

100*q% quantiles of P0 ~ beta(a0,b0) and P1 ~  beta(a1, b1)), or their t(P) transforms; q set by Q.points.

Cumulative probabilities, Prob[P0 < p)] and Prob[P1 <p] or their t(P) transforms as set by CP.points. Names may be modified using CProbNames.

​

Methodology

​

The methodology is summarized in the code, which implements well-known theory about the standard beta(a, b) distribution and the beta-binomial model for analyzing a single proportion. The additional methodology underlying the COVID vaccine trials is summarized before Example 2.

Examples

​

  • Example 1, which is fiction, stems from my collaborating in 2021 with Bruce Spiess and Cynthia Garvin at the University of Florida to create a Bayesian design for (what I would term) a Phase I/II clinical trial to begin testing a proposed treatment for COVID-19 patients who were hospitalized and had just been intubated. Two prior distributions, each legitimate, are assessed and compared.

 

  • Example 2 is based on my recollections of an actual study that Jason Connor and I helped design at the Cleveland Clinic in the early 2000s. (The investigator died suddenly before the proposal was funded.) This focuses directly on a single proportion, P, which was known to be small.

​

  • Example 3 deals with the actual analysis and data from the 2020 Pfizer-BioNTech clinical trial that successfully tested their vaccine for COVID. This uses IRR(P) = P/((1- P)*TTUSR. where TTUSR is the ratio of the groups' total time under surveillance. Three prior distributions are assessed and compared, noting how they relate to the Scientific Method.

​

  • Example 4 uses the same method as Example 3 to analyze the data from the Janssen (Johnson & Johnson) clinical trial of their COVID vaccine. These analyses demonstrate how statements made widely in the mass media lacked statistical credibility and even common sense.

​

bottom of page