Introduction to Medical Statistics 2026
Exercise class III
Statistical Analysis: Main Concepts and Principles; Binomial Distribution

Author

Ronald Geskus

Published

March 24, 2026

I: Calculation of binomial probabilities

The US CDC estimates that 90% of Americans have had chickenpox by the time they reach adulthood.

Suppose we take a random sample of 100 American adults. Is the use of the binomial distribution appropriate for calculating the probability that exactly 97 out of 100 randomly sampled American adults had chickenpox during childhood? Explain your answer.

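Answer: It is. The adults are sampled at random, so each of them independently has the same probability of having had chickenpox. The binomial distribution with “success” probability \(\pi=0.9\) and \(N=100\) therefore describes the variation in the number of adults that had chickenpox during childhood in a sample of 100.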

How many American adults that had chickenpox would you expect to observe among the 100? What is the variance of the number of observed cases among 100 sampled individuals? Use the dbinom function to calculate the probability that exactly 97 out of 100 randomly sampled American adults had chickenpox during childhood, i.e. compute P(X=97) if X \(\sim\) B(0.9,100).

Answer: I would expect \(N \times \pi=100 \times 0.9=90\) to have had chickenpox; the variance of the observed number is \(N \times \pi \times (1-\pi)=100 \times 0.9 \times 0.1=9\).

\(P(X=97)\) is:
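A minimal sketch of this calculation in base R. The variable names are ours, and the second computation only re-derives the value from the binomial formula as a check.

```r
# P(X = 97) for X ~ Binomial(size = 100, prob = 0.9)
p97 <- dbinom(97, size = 100, prob = 0.9)
p97

# Check against the formula choose(N, k) * pi^k * (1 - pi)^(N - k)
p97_formula <- choose(100, 97) * 0.9^97 * 0.1^3
all.equal(p97, p97_formula)
```

The probability is about 0.0059.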

What is the probability that exactly 3 out of a new sample of 100 American adults have not had chickenpox in their childhood?

Answer: The same probability as in b. Three not having had chickenpox is the same as 97 having had chickenpox.
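The equivalence can be checked numerically. The sketch below, with variable names of our own choosing, counts the adults who have not had chickenpox as the "successes", so the success probability becomes 0.1.

```r
# 3 out of 100 without chickenpox, success probability 1 - 0.9 = 0.1
p_not3 <- dbinom(3, size = 100, prob = 0.1)

# 97 out of 100 with chickenpox, success probability 0.9
p_had97 <- dbinom(97, size = 100, prob = 0.9)

all.equal(p_not3, p_had97)
```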

Plot the probability function of this binomial distribution. See whether the answer from b. corresponds with the value in the plot. Which number has the highest probability and how large is that probability?

Answer: In the plot, the answer from b. is the height of the bar at x=97. The most likely number is 90, which is element number \(90-70=20\) of the generated vector 71:100. Its probability is P(X=90)=0.132.
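One possible way to produce such a plot and to find the most likely value, using base R graphics. The restriction to the range 71:100 follows the answer above; the axis labels and variable names are our own choices.

```r
# Probability function of Binomial(100, 0.9), shown for 71..100 only
x <- 71:100
probs <- dbinom(x, size = 100, prob = 0.9)
barplot(probs, names.arg = x,
        xlab = "Number of adults that had chickenpox", ylab = "Probability")

# Most likely outcome and its probability
mode_x <- x[which.max(probs)]
mode_p <- max(probs)
c(mode = mode_x, probability = round(mode_p, 3))
```

Note that which.max returns the position within the vector (20), not the value itself (90).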

What is the probability that we observe a number of adults that is at least 5 lower or higher than 90? Hint: use the pbinom function to compute \(P(X \leq 85)\) and \(P(X \geq 95)\). How do you read this number from the plot of the probability function that you made above?

Answer: In the plot, this is the sum of the bar heights \(\leq 85\) and \(\geq 95\).
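A short sketch of the two tail probabilities with pbinom. Because pbinom returns \(P(X \leq x)\), the upper tail is obtained as \(P(X \geq 95) = 1 - P(X \leq 94)\). Variable names are ours.

```r
lower <- pbinom(85, size = 100, prob = 0.9)      # P(X <= 85)
upper <- 1 - pbinom(94, size = 100, prob = 0.9)  # P(X >= 95)

# Probability of deviating 5 or more from the expected number 90
p_tails <- lower + upper
p_tails

# Same number as the sum of the corresponding bar heights
sum(dbinom(c(0:85, 95:100), size = 100, prob = 0.9))
```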

In the previous question we actually calculated the p-value of a test of the null hypothesis that \(\pi=0.9\) against the two-sided alternative hypothesis that \(\pi \neq 0.9\), for the case that the observed number of adults that had chickenpox is 95. Why?

Answer: We calculate the tail probability, in both directions, that the outcome deviates by 5 or more from the expected number under the null hypothesis that \(\pi=0.9\).

II: Inference for a single proportion

A new chemotherapy has entered phase 2 in drug development. A tumour response is defined as a decrease in tumour size by at least 50% within 6 months after therapy initiation. According to experts, the tumour response probability of the drug should be more than 20% in order to proceed to a phase 3 study.

The statistical test of interest is the test for the tumour response probability \(\pi\) with the null hypothesis \(H_0: \pi=0.2\) (or \(H_0: \pi \leq 0.2\)) versus the alternative \(H_A: \pi >0.2\). The phase 2 trial consists of 50 patients. We observe the tumour response for each patient and then perform the statistical test.

Assume that 16/50 patients show a tumour response. Calculate the corresponding one-sided p-value and describe the result and your conclusion in words. Hint: You can either calculate the p-value yourself using the binomial distribution (as in Exercise I.b or I.e) or you can use the prop.test or binom.test function. (Note that these two functions use different methods; only binom.test gives the same answer as the direct calculation.)

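Answer: The one-sided p-value is \(P(X \geq 16)\) under \(\pi=0.2\), which is about 0.031. We did not specify in advance how much evidence is needed to reject the null hypothesis. Often 0.05 is chosen as the cutoff; this cutoff is called the significance level. At that level we would reject the null hypothesis, which suggests proceeding to a phase 3 trial. Note, however, that the alternative hypothesis is one-sided, and for one-sided tests a significance level of 0.025 is often suggested. At that level we would not reject the null hypothesis, so the result would not support a phase 3 trial.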
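The one-sided p-value can be obtained in two ways, sketched below: directly from the binomial distribution, and via binom.test with alternative = "greater". Variable names are ours.

```r
# Direct calculation: P(X >= 16) = 1 - P(X <= 15) under H0: pi = 0.2, N = 50
p_direct <- 1 - pbinom(15, size = 50, prob = 0.2)

# Exact binomial test with a one-sided alternative
p_exact <- binom.test(16, 50, p = 0.2, alternative = "greater")$p.value

c(direct = p_direct, binom.test = p_exact)
```

Both give a p-value of about 0.031, which lies between the two significance levels discussed above.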

Note that the binomial distribution under \(H_0: \pi=0.2\) is not symmetric around its expected value. Under the null hypothesis the expected number is 10, and \(P(X \geq 16)\) is not equal to \(P(X \leq 4)\); their sum \(P(X \geq 16)+P(X \leq 4)\) is equal to 0.0492994. This two-sided p-value would be just small enough to suggest a phase 3 trial at the \(\alpha=0.05\) significance level. The binom.test function takes this asymmetry into account.
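A sketch of the two-sided calculation. By default binom.test computes a two-sided p-value by summing the probabilities of all outcomes that are no more likely than the observed one; for 16 out of 50 this set consists of \(X \leq 4\) and \(X \geq 16\), so it agrees with the sum of the two tails. Variable names are ours.

```r
p_upper <- pbinom(15, size = 50, prob = 0.2, lower.tail = FALSE)  # P(X >= 16)
p_lower <- pbinom(4, size = 50, prob = 0.2)                       # P(X <= 4)

# The two tails differ because the distribution is skewed
c(upper = p_upper, lower = p_lower)

p_two <- p_upper + p_lower
p_binom_test <- binom.test(16, 50, p = 0.2)$p.value
c(sum_of_tails = p_two, binom.test = p_binom_test)
```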

The experts expected a tumour response of 40%. Suppose that the true tumour response probability is indeed 0.4. We reject the null hypothesis if we get a one-sided p-value smaller than 0.025. What is the probability that we reject the null hypothesis if we have a sample of 50 patients (not the current sample, but an arbitrary new sample)? This is the basis of the so-called power or sample size calculation. Hint: First show that for an observed number of tumour responses of 16 the test does not reject the null hypothesis, while for 17 it does. Then calculate \(P(X\geq 17)\) if the alternative is true, i.e. if p=0.4.

Answer: Bonus: a graph of the distribution of the number of tumour responses, assuming that the null hypothesis p=0.2 is true (black bars) and that the alternative p=0.4 is true (red bars).
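A possible sketch of the power calculation and the graph, following the hint: first find the smallest number of responses for which the one-sided p-value drops below 0.025, then compute the probability of reaching that number when the response probability is 0.4. Variable names, the small horizontal offset of the red bars and the axis labels are our own choices.

```r
n <- 50
k <- 0:n

# One-sided p-value P(X >= k) under H0: pi = 0.2, for every possible count k
p_vals <- pbinom(k - 1, size = n, prob = 0.2, lower.tail = FALSE)

# Smallest observed count that leads to rejection at the 0.025 level
k_crit <- min(k[p_vals < 0.025])

# Power: probability of observing at least k_crit responses if pi = 0.4
power <- pbinom(k_crit - 1, size = n, prob = 0.4, lower.tail = FALSE)
c(critical_value = k_crit, power = power)

# Distribution under H0 (black) and under the alternative (red)
plot(k, dbinom(k, n, 0.2), type = "h", lwd = 2, col = "black",
     xlab = "Number of tumour responses", ylab = "Probability")
lines(k + 0.3, dbinom(k, n, 0.4), type = "h", lwd = 2, col = "red")
abline(v = k_crit - 0.5, lty = 2)  # rejection region lies to the right
```

The power is the total height of the red bars to the right of the dashed line.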

III: Analysis of a diagnostic test

A diagnostic study evaluated the Platelia NS1 ELISA assay for diagnosis of dengue. The study participants consisted of 853 children admitted to Children’s Hospital #1, Children’s Hospital #2, or HTD from August 2006 to March 2007. Participants were eligible for entry to the study if they had a history of fever of less than seven days and there was a clinical suspicion of dengue. A patient was classified as having dengue by the reference test (gold standard) if there was RT-PCR detection of DENV RNA in plasma, and/or viral culture and/or serological changes in DENV reactive IgM or IgG levels in paired plasma specimens. The data are stored in dengueNS1.csv; the file dengueNS1_description.txt describes the variables in the data set.

Import the dataset and use the summary function for a data summary.

Use the table function to create a cross-table of NS1 result and “true” dengue status; you can add column and row totals via the addmargins function. Calculate the prevalence (proportion) of children with dengue.

Answer: Using the prop.test function also provides the 95% confidence interval.
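The sketch below illustrates the steps on a small toy data set. The data frame `toy`, its column names `ns1` and `dengue`, and all counts are hypothetical: they are not the contents of dengueNS1.csv, whose variable names are given in the description file.

```r
# Hypothetical toy data (NOT the study data): 100 children, 60 with dengue
toy <- data.frame(
  dengue = rep(c("yes", "no"), times = c(60, 40)),
  ns1    = c(rep(c("pos", "neg"), times = c(42, 18)),   # children with dengue
             rep(c("pos", "neg"), times = c(2, 38)))    # children without dengue
)

# Cross-table of NS1 result by dengue status, with row and column totals
tab <- table(NS1 = toy$ns1, Dengue = toy$dengue)
addmargins(tab)

# Prevalence and its 95% confidence interval
n_dengue <- sum(toy$dengue == "yes")
prevalence <- n_dengue / nrow(toy)
ci <- prop.test(n_dengue, nrow(toy))$conf.int
prevalence
ci
```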

Calculate the proportion of children with dengue that test positive in the NS1 assay. This quantity is called the sensitivity “SENS” of the NS1 test. Also calculate the proportion of children without dengue that test negative (the specificity “SPEC” of the test). Write the calculation in the blank below. What do you think of the performance of the test?

Answer: Using the prop.test function also provides the 95% confidence interval. SPEC is high: there are no false positive results. However, about 32% of children with dengue are missed by this test.
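As an illustration of the calculation, the sketch below uses a 2x2 table of hypothetical counts (not the study results). Sensitivity and specificity are column proportions: they condition on the true dengue status.

```r
# Hypothetical counts (NOT the study data); rows = NS1 result, columns = dengue status
tab <- matrix(c(42, 18, 2, 38), nrow = 2,
              dimnames = list(NS1 = c("pos", "neg"), Dengue = c("yes", "no")))

# Sensitivity: proportion testing positive among children with dengue
sens <- tab["pos", "yes"] / sum(tab[, "yes"])
# Specificity: proportion testing negative among children without dengue
spec <- tab["neg", "no"] / sum(tab[, "no"])
c(SENS = sens, SPEC = spec)

# 95% confidence intervals via prop.test
sens_ci <- prop.test(tab["pos", "yes"], sum(tab[, "yes"]))$conf.int
spec_ci <- prop.test(tab["neg", "no"], sum(tab[, "no"]))$conf.int
```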

Calculate the proportion of children that have dengue amongst the ones that test positive in the NS1 assay. This is called the positive predictive value (PPV). Also calculate the proportion of children that do not have dengue amongst the ones that test negative in the NS1 assay. This is called the negative predictive value (NPV). Which characteristic of the test is more clinically relevant, SENS and SPEC or PPV and NPV? Write the function to answer that in the blank below.

Answer: From a clinical perspective PPV and NPV are more relevant. They quantify the probability of having (or not having) dengue given the test result. SENS and SPEC quantify the probability of the test result given the true dengue status, and in practice we do not know the true status.
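PPV and NPV are row proportions of the cross-table: they condition on the test result. One way to obtain them is prop.table with margin = 1, sketched here on hypothetical counts (not the study results).

```r
# Hypothetical counts (NOT the study data); rows = NS1 result, columns = dengue status
tab <- matrix(c(42, 18, 2, 38), nrow = 2,
              dimnames = list(NS1 = c("pos", "neg"), Dengue = c("yes", "no")))

# Proportions within each row, i.e. within each test result
row_props <- prop.table(tab, margin = 1)
row_props

ppv <- row_props["pos", "yes"]  # dengue among test positives
npv <- row_props["neg", "no"]   # no dengue among test negatives
c(PPV = ppv, NPV = npv)
```

Unlike SENS and SPEC, these values depend on the prevalence of dengue in the study population.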

Is there evidence for a difference in sensitivity of the NS1 ELISA assay between DENV serotypes? Use the tbl_summary function from the gtsummary package. (No statistical test is required, just an exploratory data analysis.)

Answer: Sensitivity seems to be higher for DENV-1 and DENV-3. There are many children with unknown serotype.

IV: Difference in proportions

The dataset bmData.csv contains selected variables from 300 patients with confirmed bacterial meningitis. They were randomized to either adjunctive dexamethasone therapy or placebo. Import the dataset and use the summary function for a data summary.

Create a 2x2 table that compares the number of deaths at 6 months between the two treatment arms, using the function xtabs or table. Is there a difference in the survival status at 6 months of follow-up between the two randomized groups?

Answer: The percentage that died is larger in the placebo group. However, we need to find out whether this could be due to chance. For this, we need to perform a formal test.
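A sketch of how such a table and the percentages per arm could be obtained with xtabs and prop.table. The data frame `toy`, its column names `arm` and `died`, and the counts are hypothetical: they are not the contents of bmData.csv.

```r
# Hypothetical toy data (NOT the trial data): 150 patients per arm
toy <- data.frame(
  arm  = rep(c("dexamethasone", "placebo"), each = 150),
  died = c(rep(c("yes", "no"), times = c(20, 130)),   # dexamethasone arm
           rep(c("yes", "no"), times = c(35, 115)))   # placebo arm
)

# 2x2 table of treatment arm by death within 6 months
tab <- xtabs(~ arm + died, data = toy)
tab

# Proportion that died within each arm (row proportions)
risk <- prop.table(tab, margin = 1)[, "yes"]
risk
```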

Perform a formal test for a difference in survival status. Formulate the null hypothesis and the alternative hypothesis. Use the chisq.test function.

Answer: The null hypothesis is that dexamethasone does not change the probability of dying within 6 months. One way to formulate this is to say that treatment arm and survival status are independent. If we define \(\pi_D\) as the death risk for those that receive dexamethasone and \(\pi_P\) for those that receive standard of care (placebo arm in this trial), then we can formulate the null hypothesis as \(H_0: \pi_D=\pi_P\) and the two-sided alternative hypothesis as \(H_A: \pi_D \neq \pi_P\).

Use the prop.test function as alternative. Look at the difference in the way the results are shown compared to the chi-squared test.

Answer: The prop.test function gives more detailed information. It adds the 95% confidence interval for the difference and provides the fraction of individuals who died within 6 months in each of the groups. There is a moderately strong suggestion that there is a difference in survival status at 6 months. Note that the p-value differs somewhat depending on whether Yates’ continuity correction is used.
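The comparison can be sketched on a 2x2 table of hypothetical counts (not the trial results). For a 2x2 table, chisq.test and prop.test give the same p-value; both apply Yates' continuity correction by default, which can be switched off with correct = FALSE.

```r
# Hypothetical counts (NOT the trial data); rows = arms, columns = died / survived
tab <- matrix(c(20, 35, 130, 115), nrow = 2,
              dimnames = list(arm = c("dexamethasone", "placebo"),
                              outcome = c("died", "survived")))

chi_yates <- chisq.test(tab)                   # with Yates' correction (default)
chi_plain <- chisq.test(tab, correct = FALSE)  # without correction
pt_yates  <- prop.test(tab)                    # first column counts as "success"
pt_plain  <- prop.test(tab, correct = FALSE)

pt_yates$estimate  # proportion that died in each arm
pt_yates$conf.int  # 95% CI for the difference in proportions

# Same p-values from both functions; the correction makes the p-value larger
c(chisq = chi_yates$p.value, prop = pt_yates$p.value)
c(chisq = chi_plain$p.value, prop = pt_plain$p.value)
```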

We can also use the gtsummary package to obtain all results in tabular format, using the tbl_summary function.