Homework 03

Tanzy Love, based on the course by Sally Thurston

2026-03-29 (updated: 2026-04-06)

Getting started

Warm up

Before we introduce the data, let’s warm up with some simple exercises. Keep an eye out in the instructions for where you are instructed to: 🧶 knit ✅ commit ⬆️ push

Packages

It’s usually a good idea to install packages in the command line and to load them in a header so you can see what they are. You can load them by running the following in your Console:

library(ggplot2)

1. Bioassay

I included the program bioassay.r in your assignment repository. It is an R program to carry out most of the analysis of the bioassay data discussed at the end of chapter 3 (I did not write this program). The program is not well-documented.

  1. Paste the program into the next r chunk and document what it does in sufficient detail for an outside reader to understand what it does. I want you to edit the code itself, and add comments so that someone else reading your comments can understand the code. Also: fix the program so that the warning message does not occur, and fix any other mistakes you notice, and/or make improvements. If you fix or improve something, note what changes you have made and why.

  2. Write down the model (likelihood, prior) from the course notes or the R code.

  3. Write a stan model (plain text or a set of text lines in R code) to fit the model for the bioassay analysis. Note: Instead of a flat prior, you can use the default priors for \(\alpha\) and \(\beta\).

  4. Run either the program in (a) or in (c) for 1000 iterations. You will have samples from the joint posterior of α and β. Make a contour plot from these samples.

  5. What percent of the posterior draws have β > 0? What does β > 0 mean in the context of this problem?

  6. Using only the draws of β > 0, print the posterior median and a 95% posterior interval for the LD50 (defined in the notes) and make a histogram of the LD50.

  7. Suppose instead of an uninformative prior for α and β, we wanted to use the priors α ∼ N (0, 4) and β ∼ N (0, 100), with α and β independent a priori. How would the code be modified (in (a) and in (c))? Ideally you would answer this by showing the code snippet that would be different. At the least, try to explain what change is needed. Note: You do not have to actually rerun the code with these different priors. Getting new model code to work can be too hard for homework.

🧶 ✅ ⬆️ Knit, commit, and push your changes to GitHub with an appropriate commit message. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards.

2. Hierarchical binomial model

The data comes from a table in the textbook, it is included in the R code file. We will be working with the number of bicycles and the number of total vehicles. The first set of numbers is 16/58, where 16 = number bicycles, and 58=number other vehicles, making (16+58) total vehicles in that location. We have J=10 locations.

The file bicycles-snippet.R is included in the assignment repository. We will be using log(α/β) and log(α+β) as the transformed parameters of interest, as we did in class. The marginal posterior for log(α + β) and log(α/β) could be found by transformation (look in the notes) and it is computed in the code already.

  1. Write out a model in which the number of bicycles at location j (for j = 1, · · · , J) is \(y_j\) ∼ Binomial(\(n_j\) , \(θ_j\) ), and \(θ_j\) ∼ Beta(α, β). Use the noninformative prior for α and β that we discussed in class for the rat tumor example. That is, \(p(α, β) = (α + β)^{−5/2}\).

Write down three posterior distributions: i. p(θ, α, β | y) ii. p(θ | α, β, y) iii. p(α, β | y)

  1. Compute the marginal posterior density of log(α + β) and log(α/β) on a grid, and make a contour plot of this joint posterior. You should figure out reasonable values to use for the boundary of the grid. Show your process of finding decent values by including contour plots using your initial grid. You can make use of the code snippet, bicycles-snippet.R, which calculates the probabilities on the right scale for you and includes the contour plot. You just have to come up with reasonable values.

  2. Draw 5000 samples from the joint posterior distribution, by drawing from p(log(α/β), log(α + β) | y), then from p(θ | α, β, y). (You have done something close to this already and there is example code in the notes and posted in Blackboard.) Summarize the 5000 draws of log(α/β), log(α + β), and \(θ_j\) for each location j + 2 in plots.

  3. How do the draws of \(θ_j\) , for each j, compare with the raw proportions? Ideally you would summarize this by a plot (example code in the notes and posted in Blackboard) and please comment on the comparison.

🧶 ✅ ⬆️ Knit, commit, and push your changes to GitHub with an appropriate commit message. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards and review the md document on GitHub to make sure you’re happy with the final state of your work.

Rubric: 80 points total