Methods

To practice safely and effectively, physicians do not need to fully understand the biological mechanisms and the manufacturing processes involved in the drug therapies they prescribe. We ordinary car drivers do not need to understand the engineering of our car's mechanical and electrical systems. Competent use of a statistical method does not require mastering the mathematical theory undergirding that method. But creators of statistical methodologies are obligated to provide such detail.

Setting the Stage

IR1, IR2

IRR

event time,

time-to-event

T, t

NUS1[t], NUS2[t],

NUSR[t]

r1[t], r2[t]

P[t],

a0[t], b0[t],

a[t], b[t]

P0.50[t], P.50[t],

P0.q[t], P.q[t]

IRR[t] ~

beta.IRR(a[t], b[t] | NUSR[t])

IRR0.50[t], IRR.50[t],

IRR0.q[t], IRR.q[t]

Incidence events are those defining a research end-point. A subject in a "ZOVID" vaccine trial is diagnosed with the disease (ZOVID+). A ZOVID+ subject in a treatment trial dies. Subjects who never experience the IE are deemed right censored at the time they leave the study or when the study ends.

Incidence rates—the average number of incidence events (IEs) per unit of time—for the two groups.

Incidence rate ratio, IRR = IR1/IR2, the parameter of inferential interest. It's "true" value is unknowable, so we model our current uncertainty about it by making it a random variable.

Two types of time:

An event time is when a subject experienced the IE or was deemed to be censored. Examples: Subject #231 tested positive for ZOVID on 22 March 2017 ("2017-03-22"), which was Day 56 after the first subject started on study.
A time-to-event is the duration from when a subject started on study to when they experienced the IE or were censored. Example: Subject #4397 died 23 days after being being on study. The time-to-event is also the amount of time the subject was under surveillance (at risk) to have the event.

Number of unique times (either type) when IEs occurred and at least one subject was under surveillance in the other group. A study could recruit and follow tens of thousands of subjects over time, but have relatively few IEs and more than one IE can occur at the same time (day). These unique times are indexed by t, from t = 1, the first/shortest unique IE time, to t = T, the last/longest.

Let NUS1[t] and NUS2[t] be the number of subjects in each group who were under surveillance for the IE at the unique IE time t. For an event time analysis, these are the number of subjects in each group who were on study when that IE occurred. For a time-to-event analysis, these are the number of subjects in each group who had times-to-event equal to or longer than that unique time. Only NUSR[t] = NUS1[t]/NUS2[t] is used, the number under surveillance ratio at time t.

Number of subjects in each group who experienced the IE at time t.

P[t] is probability that a subject who experienced an IE at time t was from Group 1, given NUSR[t]. If IR1 and IR2 are known. then

P[t] = NUS1[t]*IR1/(NUS1[t]*IR1 + NUS2[t]*IR2).

Using IRR = IR1/IR2 and NUSR[t] = NUS1[t]/NUS2[t], a little algebra gives,

P[t] = NUSR[t]*IRR/(NUSR[t]*IRR + 1)

Of course, IRR is not known, so we address our uncertainty by modeling P[t] as a beta random variable with a prior distribution of
P0[t] ~ beta(a0[t], b0[t])
and a posterior distribution of
P[t] ~ beta(a[t], b[t]).

The medians and q[t]•100% quantiles of P0[t] ~ beta(a0[t], b0[t]) and P[t] ~beta(a[t], b[t]).

The reverse transformation is

IRR[t] = P[t]/{(1 - P[t])*NUSR[t]}.

Given P[t] ~ beta(a[t], b[t]), we denote the distribution of IRR[t] as

IRR[t] ~ betaIRR(a[t], b[t] | NUSR[t]).

Any quantile of P[t] maps directly to the concordant quantile of IRR[t].

Prior and posterior medians and q•100% quantiles of IRR at time t. The initial (t = 1) prior median and q•100% quantile should be formally specified in the study protocol.

The Algorithm

I find it easiest to depict the algorithm as a series of cycles, one for each unique event day. These numbers are from Example 1, the ZOVID vaccine trial. The Phase I/II trial had five unique event times, Days 12, 26, 28, 29, and 30. The subsequent Phase III trial had XX unique event times, Days x, x, ..., x.

Four cycles are shown:

Cycle P12.T1: The first unique event day (t = 1) of the Phase I/II trial. The initial prior for IRR is a null-diffuse prior fitted to work with the first number under surveillance ratio, NUSR[1].
Cycle P12.T5: The last unique event day (t = 5) of the Phase I/II trial. This fitted prior is based on the posterior distribution of IRR obtained from the previous cycle (P12.T4) and NUSR[5].
Cycle P3.T1: The first unique event day (t = 1) from the Phase III trial. This fitted prior is based on the final posterior distribution of IRR obtained from the Phase I/II study (Cycle P12.T5) and NUSR[1] (of this trial).
Cycle P3.Txx: The last unique event day (t = xx) from the Phase III trial. This fitted prior is based on the posterior distribution of IRR obtained from the previous cycle (P3.Txx) and NUSR[xx].

by studying a flowchart that gives the results at each step, here Example 1 involves first diagram covers The numbers here come from the first IRR.Bayes() run in Example 1, which analyzes the Phase I/II study of the ZOVID vaccine example. There were five unique event times (at days 12, 26, 28, 29, 30) requiring five cycles through the algorithm. Step A is the transition point between between cycles, here between the unique event times t = 4 and t = 5. The final cycle (t = 5) produces the final posterior distribution (Step ?).

Step A. Any quantile of P[t] ~ beta[a[t], b[t]) converts to the same quantile point of the IRR[t] ~ IRR.beta(a[t], b[t] | NUSR[t]) distribution using

IRR.q[t] = P.q[t]/{(1 - P.q[t])*NUSR[t]}.

For t = 4, the posterior median and 95% quantile of IRR[4] ~ IRR.beta(1.66, 5.23 | NUSR[4] = 1.80) are IRR.50[4] = 0.152 and IRR.95[4] = 0.810.

These posterior quantile values now set the prior distribution for the next (5th) cycle: IRR.50[4] ⇒ IRR0.50[5] = 0.152; IRR.95[4] ⇒ IRR0.95[5] = 0.810.

Step A. IRR0.50[t] and IRR0.q[t] are the median and a q*100% quantile (percentile) of the prior distribution of IRR for unique IE time t.

The initial prior values, IRR0.50[1] and IRR0.q[1], are custom set by the investigators to best meet the goals of the study. This example uses a "null diffuse" prior, so the median is 1.0 and IRR0.q[1] wqs set automatically to make the spread nearly maximal; see Step C.

For t = 2, 3, ..., T, IRR0.50[t] and IRR0.q[t] are the median and a q*100% quantile of the posterior distribution for IRR resulting from the previous unique IE time. Here, the first posterior median IRR.50[1] = 0.428 (computed in Step J) becomes the new prior median, IRR0.50[2]. Likewise, IRR0.95[1] = 3.52 becomes the new other quantile (here, q = 0.95), IRR0.95[2].

Step B. NUSR[t] is the number under surveillance ratio at time t. At this first unique IE time, there were 803 and 834 subjects under surveillance in Groups 1 and 2, respectively, so NUSR[1] = 803/834 = 0.963.

Step C. At unique time t with NUS1[t] and NUS2[t] in the two groups, let P[t] be the probability that an IE comes from a subject in Group 1. If the true incidence rates , IR1 and IR2, are known, then

P[t] = NUS1[t]*IR1/(NUS1[t]*IR1 + NUS2[t]*IR2)

With IRR = IR1/IR2,

P[t] = NUSR[t]*IRR/(NUSR[t]*IRR + 1).

We address our uncertainty about IRR by modeling P[t] as a unimodal beta random variable with a prior distribution of

P0[t] ~ beta(a0[t] ≥ 1, b0[t] ≥ 1).

Its median and a q*100% quantile of the prior distribution for P[t] are

P0.50[t] = NUSR[t]*IRR0.50[t]/(NUSR[t]*IRR0.50[t] + 1)

and

P0.q[t] = NUSR[t]*IRR0.q[t]/(NUSR[t]*IRR0.q[t] + 1).

Here, P0.50[1] = 0.491. Although IRR0.50 = 1, Group 1 has 49.1% of the subjects.

When the initial prior is specified to be "diffuse" (FitDiffuse = TRUE), P0.q[1] is set automatically to be either a 1% or 99% quantile of a beta(a0[t] ≥ 1, b0[t] ≥ 1) distribution that has median P0.50[1] and min(a0[1], b0[1]) = 1.00. Then IRR0.q[1] = P0.q[1]/{(1 - P0.q[1])*NUSR[1]}. In this case, P0.01[1] = 0.010, so (as shown in Step A), IRR0.01[1] = 0.011.

Step D. Unfortunately, values for IRR0.50[t], IRR0.q[t], and NUSR[t], do not directly give values for a0[t] and b0[t] that define a beta distribution that has quantiles of P0.50[t] and P0.q[t] Thus, a fitting algorithm in used, as follows.

In general, for P ~ beta(a > 1, b > 1), the approximate median is
M = (a - 1/3)/(a + b - 2/3).
Then,

b = (a - 1/3)/M - a + 2/3.
For a given q•100% quantile, P.q, and value for a, the difference between the requested quantile point, q, and the actual quantile point, pbeta(P.q, a, (a - 1/3)/M - a + 2/3), is

qdiff(a) = q - pbeta(P.q, a, (a - 1/3)/M - a + 2/3).

Numerically, qdiff(a) = 0.0 is iteratively solved for a when

abs(qdiff(a)) < 0.00001.

This is accomplished using R's uniroot() function with starting limits for a of 1.0001 and 1,000,000. Importantly, qdiff(1.0001) and qdiff(1000000) must be opposite in sign, qdiff(1.0001)*qdiff(1000000) < 0. However, some specifications for IRR0.50[1] and IRR0.q[1] lead to values of P0.50[1] and P0.q[1] such that qdiff(1.0001)*qdiff(1000000) > 0. If so, IRR.Bayes() issues an appropriate error message and stops executing. When developing a protocol, the function can be used without supplying a dataset to explore whether a given pair of choices for IRR0.50[1] and IRR0.q[1]) is admissible.

The same fitting algorithm is used for subsequent unique times (t > 1), but is far less problematic. In fact, in all the testing I've done, every t > 1 fitting has been successful.

Step E. In this example, the "diffuse" algorithm set the prior distribution for P as P0[1] ~ beta(1.01, 1.04). If NUSR[1] had been exactly 1.000, then P0[1] ~ beta(1.01, 1.01), a distribution that is unimodal but almost indistinguishable from the perfectly flat beta(1, 1) distribution, which is the standard uniform(0, 1) distribution. Because NUSR[1] = 0.963, there was slightly less chance that this first IE would come from Group 1, so the prior for P is P0[1] ~ beta(1.01, 1.036123), which has a median of 0.4912118. From above, with IRR0.50[1] = 1.0 and NUSR[1] = 0.963, the requested P0.50[1] is NUSR[1]/(NUSR[1] + 1) = 0.4905315.

Step F. At unique time t, Groups 1 and 2 have r1[t] and r2[t] IEs, respectively. Here, for t = 1, the first and only IE was from Group 2.

Step G. As per the ordinary beta-binomial Bayesian method to analyze a single proportion, r1[t] is taken to be a binomial random variable with N = r1[t] + r2[t] and p = P[t]). The posterior distribution is P[t] ~ beta(a[t] = a0[t] + r1[t], b[t] = b0[t] + r2[t]). At t = 1, P[1] ~ beta(1.01 + 0, 1.04 + 1) = beta(1.01, 2.04).

Step H. For unique times t < T, the algorithm prepares to process unique time t + 1. Otherwise it terminates.

Step I. If the posterior median, P.50[t] = qbeta(0.50, a[t], b[t]), is less than 0.50, use q = 0.95 to define IRR0.q for the next cycle; otherwise use q = 0.05.

Step J. Compute P.50[t] and either P.05[t] or P.95[t]. Here, P.50[1] = 0.291 and P.95[1] = 0.771.

Step K. Any quantile of P[t] ~ beta[a[t], b[t]) converts to the same quantile point of the IRR[t] ~ IRR.beta(a[t], b[t] | NUSR[t]) distribution using

IRR.q[t] = P.q[t]/{(1 - P.q[t])*NUSR[t]}.

Here, the median and 95% quantile of IRR[1] ~ IRR.beta(1.01, 2.04 | NUSR[1] = 0.958) are IRR.50[1] = 0.428 and IRR.95[1] = 3.72.

These values now set the prior distribution for the next cycle: IRR.50[t] ⇒ IRR0.50[t+1] = 0.428; IRR.95[t] ⇒ IRR0.95[t+1] = 3.72.

Step L. The final posterior distribution for IRR is defined by P[T] ~ beta(a[T], b[T]) with the transformation IRR[T] = P[T]/{(1 - P[T])*NUSR[T]}. Computing quantiles is straightforward; see Step K. To compute the posterior cumulative probability Prob[IRR < irr], simply compute Prob[P < p], where P ~ beta(a[T], b[T]) and p = NUSR[T]*irr/(NUSR[T]*irr + 1).

Density Functions for IRR and log(IRR)

Plotting the density function of IRR is highly informative. However because IRR is a ratio measure that can be polarized in two functionally identical ways, such plots are best structured by using log scaling for IRR. If so, IRR and 1/IRR appear equally distant from IRR = 1.0 and, the visual difference between IRRs of IRR.1 and IRR.2 is the same as between 1/IRR.2 and 1/IRR.1.

Let f(P) be the common beta density function, P ~ beta(a, b). In R, this is dbeta(P, a, b). Then the density function for logIRR is
h(logIRR) = f(P)*NUSR*IRR/((1 + NUSR*IRR)^2),

where

P = NUSR*IRR/(1 + NUSR*IRR).

Proof. The Jacobian of P = NUSR*IRR/(1 + NUSR*IRR) is the derivative d.P/d.IRR = NUSR/((1 + NUSR*IRR)^2). So the density function for IRR is
g(IRR) = f(P)*NUSR/((1 + NUSR*IRR)^2).
For log(IRR), the the Jacobian of IRR = exp(logIRR) is d.IRR/d.logIRR = d.exp(logIRR)/d.logIRR = exp(logIRR) = IRR. So,
h(logIRR) = g(IRR)*IRR = f(P)*NUSR*IRR/((1 + NUSR*IRR)^2).