Bayesians are always shrinking things, but they are growing.

- Guernsey McPearson, alter ego of Stephen Senn

IRR.Bayes()

For now, email obrien.ralph@gmail.com for beta version.

An incidence rate, IR, is the number of subjects who experience a specific incidence event (IE) per some unit of subject-time, such as 2.4 IEs per 1000 subject-days (IR = 0.0024). IRR.Bayes() compares two groups' incidence rates, IR1 and IR2, by focusing on the incidence rate ratio, IRR = IR1/IR2, which is assumed constant over the time the data were being collected. We quantify our uncertainty about IRR by modeling it as a random variable that arises from a distribution we call betaIRR(a > 1, b > 1 | NUSR), where NUSR is the number under surveillance ratio. Importantly, this distribution is readily amenable to applying classical Bayesianism.
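The arithmetic behind IR and IRR can be illustrated with a few lines of base R. The counts below are hypothetical, chosen only so that Group 1 reproduces the 2.4 IEs per 1000 subject-days quoted above; none of this is part of IRR.Bayes() itself.

```r
# Hypothetical counts, invented for illustration only.
ies_1 <- 12; subject_days_1 <- 5000   # Group 1: IEs and subject-days under surveillance
ies_2 <- 30; subject_days_2 <- 4000   # Group 2

ir1 <- ies_1 / subject_days_1         # incidence rate per subject-day, Group 1
ir2 <- ies_2 / subject_days_2         # incidence rate per subject-day, Group 2
ir1_per_1000 <- 1000 * ir1            # same rate expressed per 1000 subject-days

irr <- ir1 / ir2                      # incidence rate ratio, IRR = IR1/IR2

print(c(IR1 = ir1, IR2 = ir2, IR1.per.1000 = ir1_per_1000, IRR = irr))
```

This is only the ordinary point calculation; IRR.Bayes() goes further by describing the uncertainty about IRR with a full distribution.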

Diagrammed to the right is our schema for making inferences about IRR. We first specify an initial prior distribution for IRR that is tailored to address the research goal/question. This prior factors into shaping the study design. Then, the explicit methodology delineated in the Methods section combines the initial prior distribution with the initial dataset to produce a posterior distribution for IRR. If more data are needed to obtain a sharper answer about IRR, the posterior belief becomes the revised prior belief, which may cause the design to be modified. Using the identical methodology, but now with the revised prior and the additional data, produces a new posterior. There is no fixed sample size, and there are no protocol-set points for formal interim analyses requiring "alpha spending" strategies to control the overall Type I error rate. The cycle simply stops when the understanding about IRR is sufficiently sharp to draw a safe inference and defend it.

The boxes for Data and Explicit Methodology are more pronounced in order to emphasize that the process is driven by data and the computations are exact. In fact, as more data are acquired, the initial prior distribution for IRR becomes progressively inconsequential. Bayesianism gives us a formal way to do statistical science in a manner consistent with how we ordinarily learn about everything throughout our lives.

Learning how to use IRR.Bayes() effectively is best begun by studying the four in-depth examples given here. All of this code is given within the IRR.Bayes() file, so you can execute the examples yourself and even modify them to explore on your own—the best way to learn. The examples:

1. An experimental vaccine is compared to placebo, Stage A. (Stage B is planned in Example 4, but not run.)
2. An experimental treatment is compared to placebo, Stages A and B.
3. An equivalence study, which assesses whether IRR is sufficiently close to 1.0.
4. A Monte Carlo study to assess whether the vaccine trial of Example 1 should have a Stage B to collect more data. The design is modified based on Stage A's results.

Important. Each example deals with how to form initial prior distributions, IRR ~ betaIRR(a0 > 1, b0 > 1 | NUSR), so that the resulting posterior distributions address the research questions directly and convincingly.

Arguments

— Dataset —

The required dataset conforms to what is commonly used for a two-group time-to-event analysis.

DFrame
Data.frame with N rows (all subjects, including those with right-censored times) and containing columns (variables), with any names, that give values for

subject's group. (See Group argument.)
time subject started the study (came under surveillance). (See StartTime argument.)
time subject ended the study, either upon first experiencing the IE or being deemed right censored. (See EventTime argument.)
type of event. (See Event argument.)

For example, the data.frame could have:

$treatment with values such as "MJK17" and "placebo".
$start, using either "2021-12-31" format or a numeric value such as 12, if the subject started the study 12 days (or hours) after the study began.
$end, which would be in the same format as $start.
$outcome, with values such as "ZOVID+" and "censored".

DFrame = NULL will only fit and describe a prior as specified by the analysis arguments below. In that case, NUSR must be supplied.

Group
Name of DFrame column, in quotes, that has the group values, e.g.,
Group = "treatment"
Values in Group must match GroupNames. All those not matching are removed.

GroupNames
Values in Group, ordered to polarize IRR as GroupNames[1]:GroupNames[2]. For example, comparing the experimental ZVM17W19 vaccine to placebo in Example 1 uses GroupNames = c("ZVM17W19", "Placebo") so that IRR = 0.108 translates to a vaccine efficacy of VE = 1 - IRR = 0.892 or 89.2%.

StartTime
Name of DFrame column, in quotes, that has the time the subject began the study and was being counted as under surveillance for the IE. Numeric value or "2019-06-17" string format for dates. Example:
StartTime = "start"

EventTime
Name of DFrame column, in quotes, that has the time the subject experienced the IE or was deemed to be censored. Same format as StartTime. Example:
EventTime = "end"

Event
Name of DFrame column, in quotes, that gives what type of event occurred at EventTime. Example:
Event = "outcome"

IEvalue
The value for Event that denotes the incidence event. All other values are deemed "censored", except that NA values are removed from the analysis. Example:
IEvalue = "ZOVID+"

Important: In order to properly adjust for the number under surveillance in each group, all censored observations should be kept in the dataset.
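As a rough illustration of a dataset in this layout, the sketch below builds a toy data.frame using the example column names from the DFrame description ($treatment, $start, $end, $outcome) and mimics, in base R, the documented filtering (non-matching Group values and NA Event values are removed). The data are invented, and the variables kept, is_IE, and days_on_study are hypothetical names used only here. The final call is guarded because IRR.Bayes() is distributed separately; it uses only argument names documented on this page, and a dataset this small is meant to show the layout, not to give a meaningful analysis.

```r
# Toy data, invented for illustration; one row per subject, censored rows kept.
trial <- data.frame(
  treatment = c("MJK17", "MJK17", "placebo", "placebo", "placebo", "other"),
  start     = c("2021-12-31", "2022-01-03", "2021-12-31", "2022-01-05",
                "2022-01-02", "2022-01-02"),
  end       = c("2022-03-01", "2022-02-10", "2022-01-20", "2022-03-01",
                "2022-02-01", "2022-02-01"),
  outcome   = c("censored", "ZOVID+", "ZOVID+", "censored", NA, "ZOVID+"),
  stringsAsFactors = FALSE
)
group_names <- c("MJK17", "placebo")   # IRR is polarized as MJK17:placebo

# Mimic the documented filtering: drop groups not in GroupNames and NA events.
kept <- trial[trial$treatment %in% group_names & !is.na(trial$outcome), ]
kept$is_IE <- kept$outcome == "ZOVID+"   # every other value counts as censored
kept$days_on_study <- as.numeric(as.Date(kept$end) - as.Date(kept$start))
print(kept)

# Run the analysis only if IRR.Bayes() has been loaded into the session.
if (exists("IRR.Bayes")) {
  fit <- IRR.Bayes(DFrame = trial, Group = "treatment", GroupNames = group_names,
                   StartTime = "start", EventTime = "end",
                   Event = "outcome", IEvalue = "ZOVID+",
                   TimeType = "EventTime", IRR0.50 = 1, FitDiffuse = TRUE)
}
```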

— Analysis —

The initial prior distribution, IRR ~ betaIRR(a0 > 1, b0 > 1 | NUSR), is set by specifying its median, IRR0.50, and a Q*100% quantile, IRR0.Q. Alternatively, FitDiffuse = TRUE tells the function to set IRR0.Q automatically in order to give the prior near-maximal spread.

Time can be of two types, the actual calendar date/time when the IE occurred ("event time") or the duration of time the subject was on study before the event occurred ("time-to-event"). This sets how to count the number of subjects in each group who were under surveillance at each unique IE time.
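The two ways of counting the number under surveillance can be sketched in base R as follows. This is not the function's internal code: the helper names nus_calendar and nus_duration and the toy data are hypothetical, and treating a subject as under surveillance on both the start and end times (the inclusive comparisons in the calendar rule) is an assumption.

```r
# Toy data (invented): times in days since the study opened.
d <- data.frame(
  group = c(1, 1, 1, 2, 2, 2),
  start = c(0, 5, 10, 0, 5, 20),
  end   = c(30, 25, 40, 15, 45, 50),
  ie    = c(TRUE, FALSE, FALSE, TRUE, FALSE, TRUE)
)

# Calendar ("event time") rule: count subjects in group g who were in the
# study at calendar time t, regardless of when they started.
nus_calendar <- function(d, t, g) {
  sum(d$group == g & d$start <= t & d$end >= t)
}

# Time-to-event rule: count subjects in group g whose time on study
# (end - start) is equal to or greater than the duration dur.
nus_duration <- function(d, dur, g) {
  sum(d$group == g & (d$end - d$start) >= dur)
}

# The IE of the fourth subject: calendar time 15, time-to-event 15 - 0 = 15.
nusr_calendar <- nus_calendar(d, 15, 1) / nus_calendar(d, 15, 2)
nusr_duration <- nus_duration(d, 15, 1) / nus_duration(d, 15, 2)
print(c(calendar = nusr_calendar, duration = nusr_duration))
```

In this toy dataset the same IE yields different NUSR values under the two rules, because the sixth subject had not yet started at calendar time 15 but was eventually on study for more than 15 days.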

TimeType
TimeType = "EventTime" defines time as the calendar date/time when the IE occurred. The number of subjects under surveillance in each group counts those who were in the study at the time of the given IE, regardless of when they started the study. Thus, if time is measured in days, two subjects who both became IE+ on 17 June 2019 would have the same number of subjects under surveillance and thus the same ratio of those numbers (NUSR). See Example 1.

TimeType = "TimeToEvent" defines time as the duration of time from when the subject started the study to when the subject had the IE or was censored, i.e. time = EventTime - StartTime. The number of subjects under surveillance in each group counts those who had times-to-events equal to or greater than that of the subject who had the IE. Thus, if time is measured in days, two subjects who both became IE+ on their 77th day on study (regardless of when they started the study) would have the same numbers of subjects under surveillance, i.e., all those who were under surveillance for at least 77 days. See Example 2.

IRR0.50
Median of the prior distribution for IRR.

IRR0.Q
Q*100% quantile of IRR, i.e. Prob[IRR < IRR0.Q] = Q. Fitting is aided when IRR0.Q is well separated from IRR0.50.

For IRR0.50 ≤ 1.00, best to use Q ≤ 0.95.
For IRR0.50 > 1.00, best to use Q < 0.05.

Defines IRR0.Q.

FitDiffuse = FALSE
FitDiffuse = TRUE fits a prior for IRR that has median near IRR0.50 and nearly maximum spread. IRR0.50 = 1 and FitDiffuse = TRUE define what many would call a "non-informative" prior, a topic handled in Example 1.

NUSR
Number under surveillance ratio, used for forming prior distributions when planning studies and writing formal protocols. Relevant if and only if DFrame = NULL. To guesstimate NUSR, consider the earliest IE (TimeType = "EventTime") or the IE with shortest time-to-event (TimeType = "TimeToEvent"). If about 60% of the at-risk subjects are expected to be Group 1 (such as with a 3:2 randomization), then NUSR = 0.60/0.40 = 1.5. See Example 1.

PI.level = 0.95
Level(s) for probability (credible) interval(s), PI = [LPL, UPL]. May have multiple values, e.g. PI.level = c(0.90, 0.95, 0.99).

Q.points
Sets points to compute initial prior and final posterior quantiles of IRR. The default is Q.points = c(0.005, 0.025, 0.05, 0.25, 0.50, 0.75, 0.95, 0.975, 0.995), which gives limits for the 90%, 95%, and 99% probability (credible) intervals.

IRR.points
Sets points for computing initial prior and final posterior cumulative probabilities, Prob[IRR < irr]. For example, IRR.points = c(0.20, 0.50) computes initial prior and final posterior Prob[IRR < 0.20] and Prob[IRR < 0.50].
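Two of the planning-stage quantities described for these arguments are simple arithmetic, sketched here in base R. The variable names are hypothetical, and the mapping from interval level to quantile points assumes equal-tailed intervals, which is consistent with the default Q.points but is not stated explicitly on this page.

```r
# Guess NUSR from the expected share of at-risk subjects in Group 1.
share_group1 <- 0.60                               # e.g. a 3:2 randomization
nusr_guess <- share_group1 / (1 - share_group1)    # 0.60/0.40

# Quantile points that bound equal-tailed intervals at each level (assumption).
pi_level <- c(0.90, 0.95, 0.99)
limits <- data.frame(
  level = pi_level,
  lower = (1 - pi_level) / 2,
  upper = 1 - (1 - pi_level) / 2
)
print(nusr_guess)
print(limits)

# Check, within floating-point tolerance, that the default Q.points
# listed above contain all of these interval limits.
default_q <- c(0.005, 0.025, 0.05, 0.25, 0.50, 0.75, 0.95, 0.975, 0.995)
covered <- sapply(c(limits$lower, limits$upper),
                  function(p) any(abs(default_q - p) < 1e-12))
print(all(covered))
```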

— Printing —

Print = TRUE
Print = FALSE suppresses printing of results. Useful when IRR.Bayes() is incorporated into Monte Carlo simulations for statistical planning, as per Example 4.

TimeUnit = "Day"
For example, if time is expressed in years, use TimeUnit = "Year". Only affects printed output.

PrintChronology = TRUE
Prints the data.frame $chronology, which lists the main results computed at each unique IE time.

PrintSteps = FALSE
Prints results for each step of computations for the first unique IE time. Used for checking, debugging, and documenting, and for teaching/learning about the algorithm.
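To show where Print = FALSE fits in a simulation loop, here is a minimal sketch. It is not the code of Example 4: the helper sim_trial, its exponential event-time model with administrative censoring, and all parameter values are assumptions made for illustration. The IRR.Bayes() call is guarded and uses only argument names documented on this page.

```r
set.seed(1)

# Hypothetical helper: simulate one two-group trial in the required layout.
# Event times are exponential; subjects without an IE by follow_up are censored.
sim_trial <- function(n1, n2, ir2, irr, follow_up) {
  group <- rep(c("Vaccine", "Placebo"), times = c(n1, n2))
  rate  <- ifelse(group == "Vaccine", irr * ir2, ir2)   # IR1 = IRR * IR2
  t_ie  <- rexp(n1 + n2, rate = rate)
  data.frame(
    treatment = group,
    start     = 0,
    end       = pmin(t_ie, follow_up),
    outcome   = ifelse(t_ie <= follow_up, "IE", "censored"),
    stringsAsFactors = FALSE
  )
}

# A handful of replicates; a real planning study would use many more.
sims <- replicate(3, sim_trial(200, 200, ir2 = 0.01, irr = 0.3, follow_up = 90),
                  simplify = FALSE)

# Analyze each replicate quietly if IRR.Bayes() has been loaded.
if (exists("IRR.Bayes")) {
  fits <- lapply(sims, function(d) {
    IRR.Bayes(DFrame = d, Group = "treatment",
              GroupNames = c("Vaccine", "Placebo"),
              StartTime = "start", EventTime = "end",
              Event = "outcome", IEvalue = "IE",
              TimeType = "TimeToEvent",
              IRR0.50 = 1, FitDiffuse = TRUE, Print = FALSE)
  })
}
```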

Objects Returned

A list with elements:

$StudyDur
Study duration, the time from the earliest StartTime to the last EventTime.

$group
As specified for GroupNames.

Sample sizes for the two groups.

$total.IEs
Total number of incidence events in the two groups.

$totalTUS
Total time subjects were under surveillance in the two groups.

$IRR.est
Ordinary frequentist estimate of IRR: each group's rate is $total.IEs/$totalTUS, and the estimate is the Group 1 rate divided by the Group 2 rate.

$chronology
Data.frame of the sequence of the times when at least one IE occurred. The columns are:

$Time ... as set by TimeType argument.
$NUS1, $NUS2 ... number under surveillance in each group at that specific time.
$NUSR ... NUS1/NUS2
$Prior.50 ... prior median of IRR distribution
$Prior.PI95 ... prior 95% probability (credible) interval for IRR distribution.
$r1, $r2 ... number of IEs in each group at that specific time.
$Prior.50 ... posterior median of IRR distribution
$Prior.PI95 ... posterior 95% probability (credible) interval for IRR distribution.

$a0, $b0
Fitted values for parameters of the initial prior distribution, IRR ~ betaIRR(a = a0, b = b0 | NUSR = NUSR.1).

$NUSR.1
First number under surveillance ratio, i.e. the one associated with the IE with the earliest time (TimeType = "EventTime") or the shortest time-to-event (TimeType = "TimeToEvent").

$aF, $bF
Values for parameters of the final posterior distribution, IRR ~ betaIRR(a = aF, b = bF | NUSR = NUSR.F).

$NUSR.F
Final number under surveillance ratio, i.e., the one associated with the IE with the latest time (TimeType = "EventTime") or the longest time-to-event (TimeType = "TimeToEvent").

$Md.PI
Medians and limits of probability intervals, [LPL, UPL], for initial prior and final posterior distributions as per PI.level argument.

$qntl
Quantiles of initial prior and final posterior distributions as per Q.points argument.

$cprob
Initial prior and final posterior cumulative probabilities, Prob[IRR < irr], as per IRR.points argument.
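For orientation, the group totals and the ordinary frequentist estimate described above can be reproduced by hand in base R. The sketch below uses invented data and hypothetical variable names; it illustrates the arithmetic only and does not show how IRR.Bayes() computes or stores these elements.

```r
# Toy data (invented): one row per subject, time under surveillance in days.
d <- data.frame(
  group = c("A", "A", "A", "B", "B", "B"),
  days  = c(60, 38, 45, 20, 55, 30),
  ie    = c(FALSE, TRUE, FALSE, TRUE, FALSE, TRUE)
)

total_ies <- tapply(d$ie, d$group, sum)     # IEs per group
total_tus <- tapply(d$days, d$group, sum)   # total time under surveillance per group

ir      <- total_ies / total_tus            # per-group incidence rates
irr_est <- unname(ir["A"] / ir["B"])        # Group 1 rate over Group 2 rate
ve_est  <- 1 - irr_est                      # efficacy scale, VE = 1 - IRR

print(list(total.IEs = total_ies, totalTUS = total_tus,
           IRR.est = irr_est, VE = ve_est))
```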