In this document we demonstrate the prior distributions and how to we chose the default priors, based on the example of four species in the North Sea.
The truth follows a multivariate random walk with covariance
parameter Λy, In
EcoEnsemble
, the prior distribution for this parameter is
the inverse Wishart distribution. The inverse Wishart distribution is
descried by two parameters, the scale matrix Ψ and the degrees of freedom ν. we were unsure of correlations of
the four species and therefore set the off diagonal elements of Ψ to zero.
To get an idea of the diagonal elements were examined the marginal
distribution of the diagonal elements of Λy. To do this
we explored a univariate random walk, If y0, i = 0, then
the distribution after T times
is The example in EcoEnsemble
runs for 33 years without any
observations from single-species models, and we can investigate what we
believe the change in biomass is in 2050 relative to 2017, without any
other information. To do this we will look sample from an inverse
Wishart distribution, using rstan
, with varying degrees of
freedom.
inv_wish_st_mod <- stan_model(model_code="data {
int<lower=0> N;
real nu;
matrix[N,N] mat_cov;
}
parameters {
cov_matrix[N] Sigma;
}
model {
Sigma ~ inv_wishart(nu, mat_cov);
}
")
The lower the degrees of freedom, the less informative the prior. However, this comes at the expense of computational efficiency and ν > d − 1 is required. We therefore want ν to be as large as possible without excluding any plausible values. We look at 8 degrees of freedom:
We look at the distribution of the first diagonal element
ex.fit.iw <- extract(fit.iw)
plot(density(ex.fit.iw$Sigma[,1,1]),main="",xlab=expression(lambda),xlim=c(0,2))
which in 2050 leads to
sd.2050 <- sqrt(33*ex.fit.iw$Sigma[,1,1])
plot(density(sqrt(33*ex.fit.iw$Sigma[,1,1])),main="",
xlab="standard deviation in 2050",xlim=c(0,5))
For example, this means that the SSB of cod in 2050, assuming in 2017 it is exactly the value from the assessment (ICES (2020)), will be
x <- seq(5,20,length.out=1000)
plot(x,colMeans(sapply(x, function(xs){dnorm(xs,tail(SSB_obs$Cod,n=1),sd.2050)})),
type="l",axes=F,ylab="density",xlab="Cod in 2050 (1000 tonnes)")
labsx <- log(signif(exp(seq(5,20,length.out=7)),1))
axis(1,at=labsx,labels=c(exp(labsx)/1000))
axis(2)
abline(v=tail(SSB_obs$Cod,n=1))
abline(v=range(SSB_obs$Cod),lty=2)
box()
The solid vertical line is the assessment’s SSB in 2017, with the dashed lines being the maximum and minimum SSB values from 1984 until 2017. Therefore we used the fairly uninformative prior
We also require a prior on the distribution of SSB in 1983, that is
The long-term individual discrepancy for the kth model is with where Pγ is a correlation matrix and Πγ = diag(πγ, 1, …, πγ, 4). We put a uniform distribution on all correlation matrices, and This prior gives more weight around zero but has an expectation of 1.
The difference between two simulators, for k ≠ l. The standard deviation of the difference of two models is
pis <- rgamma(1e5,1,1)
plot(density(sqrt(2 * pis)),xlab="Standard deviation",ylab="Density",main="")
and the distribution of the absolute difference is
xs <- seq(0,5,length.out=1000)
plot(xs,2*colMeans(sapply(xs, function(x){dnorm(x,0,sqrt(2*pis))})),
type="l",axes=T,ylab="density",xlab="Difference of two models")
We do not expect the models to differ by much than one order of magnitude and therefore we used this prior.
The short-term individual discrepancy for the kth model follows an auto-regressive process, with Rk = diag(rk, 1, …, rk, 4) where Pk is a correlation matrix and Πk = diag(πk, 1, …, πk, 4).
The distribution was with and hyperpriors and Sampling from this prior
cor_pri_st <- stan_model(model_code = " functions{
real priors_cor_hierarchical_beta(matrix ind_st_cor, int N, matrix M){
real log_prior = 0;
for (i in 1:(N-1)){
for (j in (i+1):N){
log_prior += beta_lpdf(0.5*(ind_st_cor[i, j] + 1)| M[i, j], M[j, i] );
}
}
return log_prior;
}
}
data {
vector[4] cor_p;
}
parameters {
matrix <lower=0>[5,5] beta_pars;
corr_matrix[5] rho[4];
}
model {
for (i in 1:4){
for (j in (i+1):5){
beta_pars[i,j] ~ gamma(cor_p[1],cor_p[2]);
beta_pars[j,i] ~ gamma(cor_p[3],cor_p[4]);
}
}
for(i in 1:4){
target += priors_cor_hierarchical_beta(rho[i],5,beta_pars);
diagonal(beta_pars) ~ std_normal();
}
}
")
fit_cor <- sampling(cor_pri_st,data=list(cor_p=0.1 * c(1,1,1,1)),chains=4)
ex.fit <- extract(fit_cor)
def.par <- par(no.readonly=TRUE) #Old pars to reset afterwards
par(mfrow=c(2,2))
plot(density(ex.fit$beta_pars[,1,2]),xlab=expression(alpha[rho]),main="")
plot(density(log(ex.fit$beta_pars[,1,2]/ex.fit$beta_pars[,2,1])),main="",
xlab="Expected log odds")
plot(density(ex.fit$rho[,1,1,2]),xlab=expression(rho),main="")
xs <- seq(-1,1,length.out=100)
lines(xs,dbeta((xs+1)/2,2,2)/2,col="red")
plot(density(apply(ex.fit$rho[,,1,2],1,var)),xlim=c(0,0.35),main="",
xlab=expression(var(rho)))
par(def.par)
The above plot shows the different aspects of the emergent distribution of this prior. The top left plot shows the marginal distribution of parameter of the beta distribution with the top right shows the log odds of the expectation of the beta distribution. The bottom left plot shows the marginal distribution of ρk, i, i, with the red curve being the distribution that gives a uniform distribution over all correlation matrices. We gave slightly more weight to correlation matrices that are less correlated. The bottom right plot shows the variance of ρk, i, i over k. The variance of a uniform distribution between -1 and 1 is 1/3 and the variance of the marginal distribution of uniform correlation parameters is 1/5. We allowed the prior to be fairly uninformative over this space, as we were unsure how ρk, i, i and ρl, i, i related to one another, however, we do believe they may be similar and therefore we increased our beliefs around 0.
The ith diagonal element of Πk, πk, i, was with zeros on the off diagonals. The hyperpriors were and Sampling from this leads to
st_mod1 <- stan_model(model_code = "data {
vector[4] gam_p;
}
parameters {
vector [5] log_sha_st_var[4];
vector[5] gamma_mean;
vector<lower=0> [5] gamma_var;
}
model {
gamma_mean ~ normal(gam_p[1],gam_p[2]);
gamma_var ~ inv_gamma(gam_p[3],gam_p[4]);
for (i in 1:4){
log_sha_st_var[i] ~ normal(gamma_mean,sqrt(gamma_var));
}
}
generated quantities{
vector[5] sha_st_var[4];
for (i in 1:4){
sha_st_var[i] = exp(log_sha_st_var[i]);
}
}
")
test.fit_norm <- sampling(st_mod1,data=list(gam_p=c(-3,1,8,4)),chains=4)
ex.fit <- extract(test.fit_norm)
def.par <- par(no.readonly=TRUE) #old pars
par(mfrow=c(2,2))
plot(density(ex.fit$gamma_mean[,1]),main="",xlab=expression(mu[pi]))
plot(density(ex.fit$gamma_var[,1]),xlim=c(0,1),main="",xlab=expression(sigma[pi]^2))
plot(density(ex.fit$sha_st_var[,1,1]),xlim=c(0,1),main="",xlab=expression(pi["k,i"]^2))
plot(density(apply(ex.fit$sha_st_var[,,1],1,var)),xlim=c(0,0.2),
xlab=expression(var(pi["k,i"]^2)) ,main="")
par(def.par)
The top left plot shows the marginal distribution of μπ, i and the top right plot being σπ, i2. The bottom left plot shows the prior distribution of pik, i2. For the same reason that describes the variance of the random walk on the truth, we felt that σπ, i2 should not be larger than 0.2. For a similar reason to the correlation parameters, the variance of σπ, i2 will also be small.
For the auto regressive parameter, rk, i, we set a boundary avoiding prior
The prior predictive distribution is then
priors <- EnsemblePrior(4,
ind_st_params =IndSTPrior("hierarchical",list(-3,1,8,4),
list(0.1,0.1,0.1,0.1),AR_params=c(2,2)),
ind_lt_params = IndLTPrior("lkj",list(1,1),1),
sha_st_params = ShaSTPrior("lkj",list(1,10),1,AR_params=c(2,2)),
sha_lt_params = 5
)
prior_density <- prior_ensemble_model(priors, M = 4)
samples <- sample_prior(observations = list(SSB_obs, Sigma_obs),
simulators = list(list(SSB_ewe, Sigma_ewe,"EwE"),
list(SSB_fs , Sigma_fs,"FishSums"),
list(SSB_lm , Sigma_lm,"LeMans"),
list(SSB_miz, Sigma_miz,"mizer")),
priors=priors,
sam_priors = prior_density)
plot(samples)