Blog Post 1

7 minute read

Published:

Short Tutorial on Spatially Misaligned Data with INLA and inlabru

TL;DR

  • This is short tutorial for spatial misaligned data with a simplified dataset (inlabru gorillas dataset), see details Coherent Disaggregation and Uncertainty Quantification for Spatially Misaligned Data.

  • Here we go through a two-stage modelling, i.e. first, fit a continuous covariate field from point locations(e.g. temperature measurement from weather stations); second, fit into a point process model using predicted covariate with uncertainty quantification.

=============================================================

Libraries

Load some libraries.

libs_name <- c("INLA", "inlabru", "sf", "terra", "tidyterra", "ggplot2", "stars",
               "dplyr", "patchwork")

lapply(libs_name, require, character.only = TRUE)
## Loading required package: INLA
## Loading required package: Matrix
## Loading required package: inlabru
## Loading required package: fmesher
## Loading required package: sf
## Linking to GEOS 3.13.1, GDAL 3.11.4, PROJ 9.7.0; sf_use_s2() is TRUE
## Loading required package: terra
## terra 1.8.93
## Loading required package: tidyterra
## 
## Attaching package: 'tidyterra'
## 
## The following object is masked from 'package:stats':
## 
##     filter
## 
## Loading required package: ggplot2
## Loading required package: stars
## Loading required package: abind
## Loading required package: dplyr
## 
## Attaching package: 'dplyr'
## 
## The following objects are masked from 'package:terra':
## 
##     intersect, union
## 
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## 
## Loading required package: patchwork
## 
## Attaching package: 'patchwork'
## 
## The following object is masked from 'package:terra':
## 
##     area
## [[1]]
## [1] TRUE
## 
## [[2]]
## [1] TRUE
## 
## [[3]]
## [1] TRUE
## 
## [[4]]
## [1] TRUE
## 
## [[5]]
## [1] TRUE
## 
## [[6]]
## [1] TRUE
## 
## [[7]]
## [1] TRUE
## 
## [[8]]
## [1] TRUE
## 
## [[9]]
## [1] TRUE

Dataset

We extract the gorillas dataset from the inlabru package, which contains spatially misaligned data. The dataset includes the point locations of gorilla nests, Spatraster covariates, a hexagonal mesh for modelling, and a boundary for the study area.

nests <- gorillas_sf$nests
mesh <- gorillas_sf$mesh # fm_subdivide is your friend if you want to further refine the mesh.
boundary <- gorillas_sf$boundary
gcov <- gorillas_sf_gcov()

Let’s check the size of the mesh elements and the resolution of the covariates.

summary(st_area(fm_as_sfc(mesh)))
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## 0.004531 0.007888 0.009378 0.009816 0.011365 0.020003
summary(res(gcov$elevation))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.02760 0.02787 0.02814 0.02814 0.02842 0.02869

Modelling

Assume we only have elevation data at certain point locations by sampling 100 out of 39,600 rasters.

elev <- gcov$elevation
elev_df <- as.data.frame(elev, xy = TRUE, na.rm = TRUE)
elev_sf <- st_as_sf(elev_df, coords = c("x", "y"), crs = fm_crs(elev)) %>% st_intersection(boundary)
## Warning: attribute variables are assumed to be spatially constant throughout
## all geometries
elev_sample <- elev_sf[sample(seq_len(nrow(elev_sf)), size = 100), ]

Let’s check the point locations of the sampled elevation data.

We then reconstruct a continuous field for elevation.

# some prior settings
matern <- inla.spde2.pcmatern(
  mesh = mesh,
  prior.sigma = c(1, 0.05),
  prior.range = c(1, 0.05)
)

cmp_cov <- ~ cov(main = geometry, model = matern) + Intercept(1)

z_fml <- elevation ~ cov + Intercept

z_lik <- bru_obs("Gaussian",
                 formula = z_fml,
                 data = elev_sample
)

fit_cov <- bru(components = cmp_cov, 
               z_lik
)

pred_df <- fm_pixels(mesh, mask = boundary)

pred_cov <- predict(fit_cov,
                    newdata = pred_df,
                    formula = ~ cov + Intercept,
                    n.samples = 100
)

pred_df$elev <- pred_cov$mean

pred_terra <- rast(st_rasterize(pred_df["elev"]))

The predicted elevation (left) may seem overly smooth. However, we should all agree that it is not too far from the truth given we only have 100 points. Most importantly, we are still able to do reasonable inference on modelling nests.

One more thing, since we are reconstructing the elevation from point locations, we should include uncertainty quantification for the predicted elevation in the model for the point pattern of gorilla nests. We can extract the Q matrix from the INLA model for the covariate and make it symmetric to be plugged in the uncertainty component

# Make a symmetric matrix for the Q matrix from the INLA model and match the number of vertices in the mesh. 
q_matrix <- forceSymmetric(fit_cov$misc$configs$config[[1]]$Q)[1:1479, 1:1479] 

Then we throw everything into the model fitting the point pattern of gorilla nests. The fitting should not take too long, and one may compare it without uncertainty plugin.

\[ \text{Nests} \sim \text{Poisson}(\lambda) \\ \log(\lambda) = \beta_0 + \beta_x (\text{predicted elevation} + \text{uncertainty}) + \text{Matern field} \]

cmp <- ~ Intercept(1) +
  beta_x(
    1,
    mean.linear = 0,
    prec.linear = 1,
    marginal = bru_mapper_marginal(qexp, rate = 1)
  ) + cov_uncertainty(
    geometry,
    mapper = bru_get_mapper(matern),
    model = "generic0",
    Cmatrix = q_matrix
  ) +
  field(main = geometry, model = matern) +
  cov_pt_est(eval(pred_terra, geometry), model = "const")

fml <- geometry ~ Intercept + field - beta_x * (cov_pt_est + cov_uncertainty)

lik <- bru_obs(
  formula = fml,
  family = "cp",
  data = nests,
  domain = list(geometry = mesh),
  samplers = boundary
)

fit <- bru(
  components = cmp, lik,
  options = list(
    bru_initial = list(beta_x = 0) # exp(0) = 1
    # control.inla = list(int.strategy = "eb"),
    # bru_verbose = 3, bru_max_iter = 10
  )
)
## 
##  *** inla.core.safe:  The inla program failed, but will rerun in case better initial values may help. try=1/1 
## 
##  *** inla.core.safe:  rerun with improved initial values

The summary of the model looks alright.

summary(fit)
## inlabru version: 2.13.0
## INLA version: 26.01.26-1
## Components:
## Intercept: main = linear(1), group = exchangeable(1L), replicate = iid(1L), NULL
## beta_x: main = linear(1), group = exchangeable(1L), replicate = iid(1L), NULL
## cov_uncertainty: main = generic0(geometry), group = exchangeable(1L), replicate = iid(1L), NULL
## field: main = spde(geometry), group = exchangeable(1L), replicate = iid(1L), NULL
## cov_pt_est: main = const(eval(pred_terra, geometry)), group = exchangeable(1L), replicate = iid(1L), NULL
## Observation models:
##   Family: 'cp'
##     Tag: <No tag>
##     Data class: 'sf', 'tbl_df', 'tbl', 'data.frame'
##     Response class: 'numeric'
##     Predictor: geometry ~ Intercept + field - beta_x * (cov_pt_est + cov_uncertainty)
##     Additive/Linear: FALSE/FALSE
##     Used components: effects[Intercept, beta_x, cov_uncertainty, field, cov_pt_est], latent[]
## Time used:
##     Pre = 0.637, Running = 28.2, Post = 0.233, Total = 29.1 
## Fixed effects:
##             mean    sd 0.025quant 0.5quant 0.975quant   mode kld
## Intercept -3.256 2.864     -8.870   -3.256      2.358 -3.256   0
## beta_x    -0.872 0.002     -0.875   -0.872     -0.868 -0.872   0
## 
## Random effects:
##   Name     Model
##     cov_uncertainty Generic0 model
##    field SPDE2 model
## 
## Model hyperparameters:
##                                   mean       sd 0.025quant 0.5quant 0.975quant
## Precision for cov_uncertainty 28550.53 2.66e+04    3492.38 20840.06   99054.86
## Range for field                   3.29 7.46e-01       2.11     3.19       5.03
## Stdev for field                   2.18 4.24e-01       1.50     2.14       3.16
##                                  mode
## Precision for cov_uncertainty 9672.78
## Range for field                  2.98
## Stdev for field                  2.03
## 
## Marginal log-Likelihood:  2577.60 
##  is computed 
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')