2 Introduction: Credibility, Models, and Parameters

2.1 Bayesian inference is reallocation of credibility across possibilities

To make Figure 2.1 we need data. We will create some synthetic data and store them in a DataFrame using DataFrames.jl.

Here we define a function that creates every combination of the supplied values (an analogue of R's expand.grid()), and then we bind those combinations to the Credibility values we will use for plotting.

using DataFrames

function expand_grid(; iters...)
    var_names = collect(keys(iters))
    var_itr = [1:length(x) for x in iters.data]
    var_ix = vcat([collect(x)' for x in Iterators.product(var_itr...)]...)
    out = DataFrame()
    for i in 1:length(var_names)
        out[:, var_names[i]] = collect(iters[i])[var_ix[:, i]]
    end
    return out
end

d = expand_grid(iteration = 1:3, Possibilities = ["A", "B", "C", "D"], stage = ["Prior", "Posterior"])
d2 = DataFrame(Credibility = [fill(.25, 4); 0; fill(1/3, 3); 0; fill(1/3, 3); 0; .5; 0; 0.5; 0; .5; 0; 0.5; fill(0, 3); 1])
sort!(d, [:iteration])
d.Credibility = d2.Credibility

We can take a look at the top few rows of the data with the first() function.

first(d, 5)
5 rows × 4 columns

|   | iteration | Possibilities | stage     | Credibility |
|---|-----------|---------------|-----------|-------------|
|   | Int64     | String        | String    | Float64     |
| 1 | 1         | A             | Prior     | 0.25        |
| 2 | 1         | B             | Prior     | 0.25        |
| 3 | 1         | C             | Prior     | 0.25        |
| 4 | 1         | D             | Prior     | 0.25        |
| 5 | 1         | A             | Posterior | 0.0         |
Now let's plot our version of Figure 2.1. To do so, we will use the Gadfly.jl package. It is my preferred plotting package because its syntax is similar to that of the ggplot2 package in R.

TODO: maybe add the annotation using something like this: https://stackoverflow.com/questions/50925450/gadfly-julia-how-to-add-text-annotation-to-a-plot
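A minimal sketch of how such a plot could be drawn from d with Gadfly's Geom.subplot_grid(); the grouping and theming choices here are assumptions, not a fixed recipe.

using Gadfly

# Sketch of Figure 2.1: bars of Credibility over Possibilities,
# faceted into columns by stage and rows by iteration.
plot(d,
    xgroup = :stage, ygroup = :iteration,
    x = :Possibilities, y = :Credibility,
    Geom.subplot_grid(Geom.bar),
    Guide.xlabel("Possibilities by stage"),
    Guide.ylabel("Credibility by iteration"))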
2.1.1 Data are noisy and inferences are probabilistic.
The next plot involves creating some Gaussian distributions. To do so, we will use the Distributions.jl package. Distributions.jl lets us create, and extract information from, a wide range of probability distributions. We will use this package a lot in subsequent chapters to draw the samples that facilitate Bayesian inference. Note that I am ignoring the scaling issue apparent from the four distributions' densities not summing to one. Also, to create the illusion of a discrete x axis, I will show only the 4 values.

TODO: rescale the curves such that the density on the y axis for each curve sums to one.
using Gadfly
using Distributions
using DataFrames
set_default_plot_size(16cm, 18cm)
ticks = [1, 2, 3, 4] # specify the tick marks

# create some data for the bars
x5 = DataFrame(Possibilities = 1, Credibility = pdf(Normal(1, 1.2), 1))
x6 = DataFrame(Possibilities = 2, Credibility = pdf(Normal(2, 1.2), 2))
x7 = DataFrame(Possibilities = 3, Credibility = pdf(Normal(3, 1.2), 3))
x8 = DataFrame(Possibilities = 4, Credibility = pdf(Normal(4, 1.2), 4))
plot(layer(x5, x = :Possibilities, y = :Credibility, Geom.bar(orientation = :vertical)),
layer(x6, x = :Possibilities, y = :Credibility, Geom.bar(orientation = :vertical)),
layer(x7, x = :Possibilities, y = :Credibility, Geom.bar(orientation = :vertical)),
layer(x8, x = :Possibilities, y = :Credibility, Geom.bar(orientation = :vertical)),
layer(x->pdf(Normal(1, 1.2), x), -6, 10, color=[colorant"black"], order=1),
layer(x->pdf(Normal(2, 1.2), x), -6, 10, color=[colorant"black"],order=1),
layer(x->pdf(Normal(3, 1.2), x), -6, 10, color=[colorant"black"],order=1),
layer(x->pdf(Normal(4, 1.2), x), -6, 10, color=[colorant"black"],order=1),
    Guide.xticks(ticks = ticks),
    Guide.xlabel("Possibilities"),
    Guide.ylabel("Credibility"),
    Guide.title("Prior"),
    Theme(
        background_color = "white",
        grid_color = "white",
        bar_spacing = 1mm,
        key_position = :none))
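One possible way to tackle the rescaling TODO above is to normalize the bar heights so the four credibilities sum to one; this is just a sketch of the idea, not code from the original.

# Sketch: rescale so the four Credibility values sum to one
# (reuses the x5 through x8 DataFrames defined above).
total = x5.Credibility[1] + x6.Credibility[1] + x7.Credibility[1] + x8.Credibility[1]
for df in (x5, x6, x7, x8)
    df.Credibility ./= total
end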
Now we can use the same method to create the bottom panel of Figure 2.3. Here I will just hide the y axis. What is important to note is how the allocation of credibility changes based on our three observations (shown in red).
ticks = [1, 2, 3, 4] # specify the tick marks

# create some data for the bars
x5 = DataFrame(Possibilities = 1, Credibility = pdf(Normal(1, 8), 1))
x6 = DataFrame(Possibilities = 2, Credibility = pdf(Normal(2, 3), 2))
x7 = DataFrame(Possibilities = 3, Credibility = pdf(Normal(3, 5), 3))
x8 = DataFrame(Possibilities = 4, Credibility = pdf(Normal(4, 20), 4))
x9 = DataFrame(Possibilities = [1.75, 2.25, 2.75], Credibility = zeros(3))
plot(layer(x5, x = :Possibilities, y = :Credibility, Geom.bar(orientation = :vertical)),
layer(x6, x = :Possibilities, y = :Credibility, Geom.bar(orientation = :vertical)),
layer(x7, x = :Possibilities, y = :Credibility, Geom.bar(orientation = :vertical)),
layer(x8, x = :Possibilities, y = :Credibility, Geom.bar(orientation = :vertical)),
layer(x->pdf(Normal(1, 8), x), -6, 10, color=[colorant"black"], order=1),
layer(x->pdf(Normal(2, 3), x), -6, 10, color=[colorant"black"],order=1),
layer(x->pdf(Normal(3, 5), x), -6, 10, color=[colorant"black"],order=1),
layer(x->pdf(Normal(4, 20), x), -6, 10, color=[colorant"black"],order=1),
layer(x9, x = :Possibilities, y = :Credibility, Geom.point, color=[colorant"red"],size =[.2], order=2), #add data
    Guide.xticks(ticks = ticks),
    Guide.yticks(ticks = []),
    Guide.xlabel("Possibilities"),
    Guide.ylabel("Credibility"),
    Guide.title("Posterior"),
    Coord.Cartesian(ymin = 0, ymax = 0.3),
    Theme(
        background_color = "white",
        grid_color = "white",
        bar_spacing = 1mm,
        key_position = :none))
2.2 Possibilities are parameter values in descriptive models
In the last section, we used the Normal() function to make curves that follow the Gaussian distribution. Now we will showcase the power and flexibility of the Distributions.jl package by assigning a distribution to a name and drawing random samples from that object. By doing so we can simulate data from the specified normal distribution.
using Random
# set the seed to make the simulation reproducible
Random.seed!(2022)
# assign `d` a `Normal()` with μ = 10 and σ = 5
d = Normal(10, 5)
# simulate the data with `rand()` from the specified distribution
simulated_data = DataFrame(x = rand(d, 2000))
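Since we know the parameters we simulated from, a quick sanity check is to compare the sample moments against μ = 10 and σ = 5. This check is my addition; the exact values will vary with the seed.

# sample mean and standard deviation should be close to 10 and 5
mean(simulated_data.x), std(simulated_data.x)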
# plot it
# Figure 2.4.a.
plot(layer(simulated_data, x = :x, Geom.histogram(density = true)),
    layer(x -> pdf(Normal(10, 5), x), -8, 27, color = [colorant"black"], order = 1),
    Guide.xlabel("Data Values"),
    Guide.ylabel("Data Probability"),
    Guide.title("The candidate normal distribution\nhas a μ of 10 and σ of 5."),
    Theme(
        background_color = "white",
        grid_color = "white",
        line_width = 1mm))
# Figure 2.4.b.
plot(layer(simulated_data, x = :x, Geom.histogram(density = true)),
    layer(x -> pdf(Normal(8, 6), x), -8, 27, color = [colorant"black"], order = 1),
    Guide.xlabel("Data Values"),
    Guide.ylabel("Data Probability"),
    Guide.title("The candidate normal distribution\nhas a μ of 8 and σ of 6."),
    Theme(
        background_color = "white",
        grid_color = "white",
        line_style = [:dash],
        line_width = 1mm))
2.3 The steps of Bayesian data analysis
In general, Bayesian analysis of data follows these steps:
- Identify the data relevant to the research questions. What are the measurement scales of the data? Which data variables are to be predicted, and which data variables are supposed to act as predictors?
- Define a descriptive model for the relevant data. The mathematical form and its parameters should be meaningful and appropriate to the theoretical purposes of the analysis.
- Specify a prior distribution on the parameters. The prior must pass muster with the audience of the analysis, such as skeptical scientists.
- Use Bayesian inference to re-allocate credibility across parameter values. Interpret the posterior distribution with respect to theoretically meaningful issues (assuming that the model is a reasonable description of the data; see next step).
- Check that the posterior predictions mimic the data with reasonable accuracy (i.e., conduct a “posterior predictive check”). If not, then consider a different descriptive model.
Perhaps the best way to explain these steps is with a realistic example of Bayesian data analysis. The discussion that follows is abbreviated for purposes of this introductory chapter, with many technical details suppressed. (p. 25)
I will showcase the entire workflow here. In later chapters we’ll cover this workflow in much more detail.
First we need to generate the data and fit a model to those data. In the HtWtDataGenerator.R script, Kruschke provided the code for a function that will simulate height/weight data. Let's rewrite that function in Julia:
using Distributions, Random, DataFrames, StatsBase

function HtWtDataGenerator(nSubj, rndsd = nothing, maleProb = 0.50)
    # Random height, weight generator for males and females. Uses parameters from
    # Brainard, J. & Burmaster, D. E. (1992). Bivariate distributions for height and
    # weight of men and women in the United States. Risk Analysis, 12(2), 267-275.
    # Kruschke, J. K. (2011). Doing Bayesian data analysis:
    # A Tutorial with R and BUGS. Academic Press / Elsevier.
    # Kruschke, J. K. (2014). Doing Bayesian data analysis, 2nd Edition:
    # A Tutorial with R, JAGS and Stan. Academic Press / Elsevier.

    # Specify parameters of multivariate normal (MVN) distributions.
    # Men:
    HtMmu = 69.18
    HtMsd = 2.87
    lnWtMmu = 5.14
    lnWtMsd = 0.17
    Mrho = 0.42
    Mmean = [HtMmu, lnWtMmu]
    Msigma = [HtMsd^2 Mrho * HtMsd * lnWtMsd; Mrho * HtMsd * lnWtMsd lnWtMsd^2]

    # Women cluster 1:
    HtFmu1 = 63.11
    HtFsd1 = 2.76
    lnWtFmu1 = 5.06
    lnWtFsd1 = 0.24
    Frho1 = 0.41
    prop1 = 0.46
    Fmean1 = [HtFmu1, lnWtFmu1]
    Fsigma1 = [HtFsd1^2 Frho1 * HtFsd1 * lnWtFsd1; Frho1 * HtFsd1 * lnWtFsd1 lnWtFsd1^2]

    # Women cluster 2:
    HtFmu2 = 64.36
    HtFsd2 = 2.49
    lnWtFmu2 = 4.86
    lnWtFsd2 = 0.14
    Frho2 = 0.44
    prop2 = 1 - prop1
    Fmean2 = [HtFmu2, lnWtFmu2]
    Fsigma2 = [HtFsd2^2 Frho2 * HtFsd2 * lnWtFsd2; Frho2 * HtFsd2 * lnWtFsd2 lnWtFsd2^2]

    # Randomly generate data values from those MVN distributions.
    if rndsd !== nothing
        Random.seed!(rndsd)
    end

    datamatrix = DataFrame(zeros(nSubj, 3), ["male", "height", "weight"])
    # arbitrary coding values
    maleval = 1
    femaleval = 0

    for i in 1:nSubj
        # Flip coin to decide sex
        sex = wsample([maleval, femaleval], [maleProb, 1 - maleProb], 1)

        if sex[1] == maleval
            datum = rand(MvNormal(Mmean, Msigma))
        elseif sex[1] == femaleval
            Fclust = wsample([1, 2], [prop1, prop2], 1)
            if Fclust[1] == 1
                datum = rand(MvNormal(Fmean1, Fsigma1))
            else
                datum = rand(MvNormal(Fmean2, Fsigma2))
            end
        end

        datum[2] = exp(datum[2])
        datamatrix[i, :] = [sex; round.(datum, digits = 1)]
    end

    return datamatrix
end
HtWtDataGenerator (generic function with 3 methods)
The HtWtDataGenerator() function has three arguments: nSubj determines how many values to generate, rndsd optionally sets the random seed, and maleProb determines how probable we want those values to be from men.
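One way to produce the preview below is a call like this; the subject count and seed here are assumptions chosen just for illustration.

# assumed call: 57 subjects, arbitrary seed for reproducibility
d = HtWtDataGenerator(57, 57)
first(d, 5)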
5 rows × 3 columns

|   | male    | height  | weight  |
|---|---------|---------|---------|
|   | Float64 | Float64 | Float64 |
| 1 | 1.0     | 66.0    | 141.2   |
| 2 | 0.0     | 60.4    | 104.3   |
| 3 | 0.0     | 62.7    | 139.4   |
| 4 | 0.0     | 61.7    | 109.9   |
| 5 | 1.0     | 64.4    | 162.5   |
We’re about ready for the model, which we will fit with the Hamiltonian Monte Carlo (HMC) method via the Turing.jl package. We’ll introduce Turing.jl more fully in Chapter 8. I also recommend you check out the following resources:
- https://turing.ml/v0.21/tutorials/00-introduction/
- http://hakank.org/julia/turing/
- https://storopoli.github.io/Bayesian-Julia/
Following Solomon Kurz’s example, we’ll use weakly-regularizing priors for the intercept and slope, and a half-Cauchy with a fairly large scale parameter for \(\sigma\).
using Turing
using Optim
using MCMCChains, Plots, StatsPlots, Gadfly
using AbstractMCMC
# Define the model: linear regression.
@model function linear_regression(x, y)
    n = length(y)

    # Set the variance prior.
    σ ~ Truncated(Cauchy(0, 10), 0, Inf)
    # Set the intercept prior.
    α ~ Normal(0, 100)
    # Set the prior for beta.
    β ~ Normal(0, 100)

    for i in 1:n
        y[i] ~ Normal(α + β * x[i], σ)
    end
    return y
end
# to be concrete, we assign the values to x and y
x = d.height
y = d.weight

model = linear_regression(x, y)

# Run sampler, collect results
chain_lin_reg = sample(model, NUTS(500, 0.65), MCMCDistributed(), 2_000, 4);
If you wanted a model summary, you could execute display(chain_lin_reg). For more detail, see Chapter 8. Here’s Figure 2.5.a.
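If you prefer numeric tables to the full display, MCMCChains also offers more targeted summaries; both of these helpers appear later in this chapter.

summarystats(chain_lin_reg)  # posterior means, standard deviations, ESS
quantile(chain_lin_reg)      # posterior quantiles for σ, α, and β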
# store in a DataFrame
plot_df = DataFrame(sample(chain_lin_reg[:, :, :], 57))
plot_df.x = x
plot_df.y = y

# plot
abline = Geom.abline(color = "red", style = :dash)
Gadfly.plot(plot_df,
    x = :x, y = :y, Geom.point, color = [colorant"black"], intercept = :α, slope = :β, abline,
    Guide.xlabel("Height in inches"),
    Guide.ylabel("Weight in pounds"),
    Guide.title("Data with 57 credible regression lines"),
    Theme(
        background_color = "white",
        grid_color = "white",
        line_style = [:dash],
        line_width = .1mm))
For Figure 2.5.b., we’ll mark off the mode and 95% highest density interval (HDI).
df = DataFrame(chain_lin_reg)

plt_HDI = DataFrame(quantile(group(chain_lin_reg, :β)))
plt_HDI.β .= mode(chain_lin_reg[:, :β, :])

plot(layer(df, x = :β, Geom.histogram(bincount = 50)),
    layer(plt_HDI, x = :β, xmin = "2.5%", xmax = "97.5%", Geom.point, Geom.errorbar, color = [colorant"red"], size = [.1], order = 1),
    Guide.xlabel("β (slope)"),
    Guide.title("The posterior distribution\nThe mode and 95% HDI intervals are\nthe dot and horizontal line at the bottom."),
    Guide.yticks(ticks = []),
    Theme(
        background_color = "white",
        grid_color = "white",
        bar_spacing = 0.25mm,
        line_width = 1mm))
To make Figure 2.6, we use the predict()
function. First we create a model object with missing values for our dependent variable, then we will predict those values using our posterior samples.
# extract the posterior draws
model_pred = linear_regression(x, Vector{Missing}(missing, length(y)))
posterior_check = predict(model_pred, chain_lin_reg)

# get the summary stats for plotting
posterior_res = summarystats(posterior_check)
# create a DataFrame
plot_df = DataFrame(quantile(posterior_check))
plot_df.x = x
plot_df.y = y
plot_df.predictions = posterior_res[:, :mean]
# plot
Gadfly.plot(
    layer(plot_df, x = :x, y = :y, Geom.point, color = [colorant"black"]),
    layer(plot_df, x = :x, y = :predictions, ymin = "2.5%", ymax = "97.5%", Geom.point, Geom.errorbar, color = [colorant"grey"]),
    Guide.xlabel("Height in inches"),
    Guide.ylabel("Weight in pounds"),
    Guide.title("Data with the percentile-based 95% intervals and\nthe means of the posterior predictions"),
    Theme(
        background_color = "white",
        grid_color = "white",
        line_width = .1mm))
The posterior predictions might be easier to depict with a ribbon and line, instead. Luckily Gadfly.jl
makes it really easy to change.
Gadfly.plot(
    layer(plot_df, x = :x, y = :y, Geom.point, color = [colorant"black"]),
    layer(plot_df, x = :x, y = :predictions, ymin = "2.5%", ymax = "97.5%", Geom.line, Geom.ribbon, color = [colorant"red"]),
    Guide.xlabel("Height in inches"),
    Guide.ylabel("Weight in pounds"),
    Guide.title("Data with the percentile-based 95% intervals and\nthe means of the posterior predictions"),
    Theme(
        background_color = "white",
        grid_color = "white",
        line_width = .2mm))
“We have seen the five steps of Bayesian analysis in a fairly realistic example. This book explains how to do this sort of analysis for many different applications and types of descriptive models” (p. 30).
That’s it! We can now start digging into the details that make up each of the steps outlined in this introductory chapter. Learning Bayesian inference is a rewarding experience. I wish you the best of luck on this exciting journey!
2.4 Session info
using Pkg
Pkg.status()
Status `~/.julia/environments/v1.7/Project.toml`
[80f14c24] AbstractMCMC v4.1.3
[c7e460c6] ArgParse v1.1.4
[c52e3926] Atom v0.12.38
[76274a88] Bijectors v0.10.6
[e28b5b4c] Bootstrap v2.3.3
[336ed68f] CSV v0.10.4
[a81c6b42] Compose v0.9.4
[a93c6f00] DataFrames v1.3.4
[1313f7d8] DataFramesMeta v0.12.0
[864edb3b] DataStructures v0.18.13
[31c24e10] Distributions v0.25.70
[c91e804a] Gadfly v1.3.4
[c27321d9] Glob v1.3.0
[7073ff75] IJulia v1.23.3
[e5e0dc1b] Juno v0.8.4
[c7f686f2] MCMCChains v5.3.1
[c03570c3] Memoize v0.4.4
[a15396b6] OnlineStats v1.5.14
[429524aa] Optim v1.7.2
[d96e819e] Parameters v0.12.3
[91a5bcdd] Plots v1.31.7
[1fd47b50] QuadGK v2.5.0
[295af30f] Revise v3.4.0
[ed01d8cd] Sobol v1.5.0
[03a91e81] SplitApplyCombine v1.2.2
[90137ffa] StaticArrays v1.5.6
[2913bbd2] StatsBase v0.33.21
[4c63d2b9] StatsFuns v1.0.1
[f3b207a7] StatsPlots v0.14.34
[fd094767] Suppressor v0.2.1
[bd369af6] Tables v1.7.0
[fce5fe82] Turing v0.21.12
[9d95f2ec] TypedTables v1.4.1
[9a3f8284] Random