Provides functions to estimate fixed and adaptive kernel-smoothed spatial relative risk surfaces via the density-ratio method and perform subsequent inference. Fixed-bandwidth spatiotemporal density and relative risk estimation is also supported.
Package: | sparr |
Date: | 2023-03-08 |
Version: | 2.3-10 |
License: | GPL (>= 2) |
Kernel smoothing, and the flexibility afforded by this methodology, provides an attractive approach to estimating complex probability density functions.
The spatial relative risk function, constructed as a ratio of estimated case to control densities (Bithell, 1990; 1991; Kelsall and Diggle, 1995a,b), describes the variation in the 'risk' of the disease, given the underlying at-risk population. The technique has been applied successfully, mainly for exploratory purposes, in a number of different analyses (see for example Sabel et al., 2000; Prince et al., 2001; Wheeler, 2007; Elson et al., 2021). It has also grown in popularity in very different fields that pose similarly styled research questions, such as ecology (e.g. Campos and Fedigan, 2014), physiology (Davies et al., 2013), and archaeology (e.g. Bevan, 2012; Smith et al., 2015).
This package provides functions for spatial (i.e. bivariate/planar/2D) kernel density estimation (KDE), implementing both fixed and 'variable' or 'adaptive' (Abramson, 1982) smoothing parameter options. A selection of bandwidth calculators for bivariate KDE and the relative risk function is provided, including one based on the maximal smoothing principle (Terrell, 1990) and others based on leave-one-out cross-validation (see below). Also provided are functions to construct both Monte-Carlo and asymptotic p-value surfaces, whose 'tolerance' contours signal statistically significant sub-regions of extremity in a risk surface (Hazelton and Davies, 2009; Davies and Hazelton, 2010), as well as some visualisation tools.
Spatiotemporal estimation is also supported, largely following developments in Fernando and Hazelton (2014). This includes their fixed-bandwidth kernel estimator of spatiotemporal densities, relative risk, and asymptotic tolerance contours.
Key content of sparr can be broken up as follows:
DATASETS/DATA GENERATION
pbc
a case/control planar point pattern (ppp.object) concerning liver disease in northern England.
fmd
an anonymised (jittered) case/control spatiotemporal point pattern of the 2001 outbreak of veterinary foot-and-mouth disease in Cumbria (courtesy of the Animal and Plant Health Agency, UK).
burk
a spatiotemporal point pattern of Burkitt's lymphoma in Uganda; artificially simulated control data are also provided for experimentation.
Also available are a number of relevant additional spatial datasets built into the spatstat package (Baddeley and Turner, 2005; Baddeley et al., 2015) through spatstat.data, such as chorley, which concerns the distribution of laryngeal cancer in an area of Lancashire, UK.
rimpoly
a wrapper function of rpoint that allows spatial point patterns generated from a pixel image (im) to be returned with a polygonal owin.
SPATIAL
Bandwidth calculators
OS
estimation of an isotropic smoothing parameter for fixed-bandwidth bivariate KDE, based on the oversmoothing principle introduced by Terrell (1990).
NS
estimation of an isotropic smoothing parameter for fixed-bandwidth bivariate KDE, based on the asymptotically optimal value for a normal density (bivariate normal scale rule - see e.g. Wand and Jones, 1995).
LSCV.density
a least-squares cross-validated (LSCV) estimate of an isotropic fixed bandwidth for bivariate, edge-corrected KDE (see e.g. Bowman and Azzalini, 1997).
LIK.density
a likelihood cross-validated (LIK) estimate of an isotropic fixed bandwidth for bivariate, edge-corrected KDE (see e.g. Silverman, 1986).
SLIK.adapt
an experimental likelihood cross-validation function for simultaneous global/pilot bandwidth selection for adaptive density estimates.
BOOT.density
a bootstrap approach to optimisation of an isotropic fixed bandwidth for bivariate, edge-corrected KDE (see e.g. Taylor, 1989).
LSCV.risk
estimation of a jointly optimal, common isotropic case-control fixed bandwidth for the kernel-smoothed risk function, based on the mean integrated squared error (MISE), a weighted MISE, or the asymptotic MISE (see respectively Kelsall and Diggle, 1995a; Hazelton, 2008; Davies, 2013).
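As a brief illustration of the rule-of-thumb selectors listed above, the following sketch computes oversmoothing and normal scale bandwidths for the cases in pbc. It assumes that sparr is installed and that the marks of pbc carry the levels case and control; the object names (cases, h_os, h_ns) are illustrative only.

```r
library(sparr)

data(pbc)                    # case/control point pattern shipped with sparr
cases <- split(pbc)$case     # assumption: marks have the levels "case" and "control"

h_os <- OS(cases)            # oversmoothing principle (Terrell, 1990)
h_ns <- NS(cases)            # bivariate normal scale rule

# Both are single isotropic bandwidths, in the units of the pattern's coordinates
c(OS = h_os, NS = h_ns)
```

The cross-validation and bootstrap selectors are considerably more expensive to evaluate, so rule-of-thumb values such as these may be a reasonable starting point for exploratory work.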
Density and relative risk estimation
bivariate.density
kernel density estimate of bivariate data; fixed or adaptive smoothing.
multiscale.density
multi-scale adaptive kernel density estimates for multiple global bandwidths as per Davies and Baddeley (2018).
multiscale.slice
a single adaptive kernel estimate based on taking a slice from a multi-scale estimate.
risk
estimation of a (log) spatial relative risk function, either from data or from pre-existing bivariate density estimates. Fixed (Kelsall and Diggle, 1995a), fixed with shrinkage (Hazelton, 2023), and both asymmetric (Davies and Hazelton, 2010) and symmetric (Davies et al., 2016) adaptive estimates are possible.
tolerance
calculation of asymptotic or Monte-Carlo p-value surfaces.
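A minimal sketch of a fixed-bandwidth analysis using the functions above is given below: case and control densities are estimated with a common bandwidth, then combined into a log relative risk surface with asymptotic p-values. It assumes sparr is installed; the choice of OS with nstar = "geometric" as the common bandwidth and all object names are illustrative, not a recommendation.

```r
library(sparr)

data(pbc)
cases    <- split(pbc)$case      # assumption: marks have the levels "case" and "control"
controls <- split(pbc)$control

# One common bandwidth for both samples (illustrative choice)
h0 <- OS(pbc, nstar = "geometric")

f <- bivariate.density(cases, h0 = h0)      # case density estimate
g <- bivariate.density(controls, h0 = h0)   # control density estimate

# Log relative risk surface from the two pre-computed densities;
# tolerate = TRUE additionally computes an asymptotic p-value surface
rho <- risk(f, g, tolerate = TRUE)
summary(rho)
```

Supplying the marked pattern and a bandwidth to risk directly, rather than pre-computed densities, should give the 'from data' route mentioned in the description of risk.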
Visualisation
S3 methods of the plot function; see plot.bivden for visualising a single bivariate density estimate from bivariate.density, plot.rrs for visualisation of a spatial relative risk function from risk, or plot.msden for viewing animations of multi-scale density estimates from multiscale.density.
tol.contour
provides more flexibility for plotting and superimposing tolerance contours upon an existing plot of spatial relative risk (i.e. given output from tolerance).
Printing and summarising
S3 methods (print.bivden, print.rrs, print.msden, summary.bivden, summary.rrs, and summary.msden) are available for the bivariate density, spatial relative risk, and multi-scale adaptive density objects.
SPATIOTEMPORAL
Bandwidth calculators
OS.spattemp
estimation of an isotropic smoothing parameter for the spatial margin and another for the temporal margin for spatiotemporal densities, based on the 2D and 1D versions, respectively, of the oversmoothing principle introduced by Terrell (1990).
NS.spattemp
as above, based on the 2D and 1D versions of the normal scale rule (Silverman, 1986).
LSCV.spattemp
least-squares cross-validated (LSCV) estimates of scalar spatial and temporal bandwidths for edge-corrected spatiotemporal KDE.
LIK.spattemp
as above, based on likelihood cross-validation.
BOOT.spattemp
bootstrap selection of the spatial and temporal bandwidths for spatiotemporal, edge-corrected KDE (Taylor, 1989).
Density and relative risk estimation
spattemp.density
fixed-bandwidth kernel density estimate of spatiotemporal data.
spattemp.risk
fixed-bandwidth kernel density estimate of spatiotemporal relative risk, either with a time-static or time-varying control density (Fernando and Hazelton, 2014).
spattemp.slice
extraction function of the spatial density/relative risk at prespecified time(s).
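The spatiotemporal functions above can be combined along the following lines. This is a sketch only, assuming sparr is installed and that fmd holds the components cases and controls, with observation times stored as the marks of the case pattern. The bandwidths (h = 6, lambda = 8) and the time point (tt = 50) are arbitrary illustrative values; in practice they would come from one of the spatiotemporal bandwidth calculators and from the time range of the data.

```r
library(sparr)

data(fmd)

# Spatiotemporal case density: h is the spatial bandwidth, lambda the temporal one
f <- spattemp.density(fmd$cases, h = 6, lambda = 8)

# Time-static control density, using the same spatial bandwidth
g <- bivariate.density(fmd$controls, h0 = 6)

# Spatiotemporal (log) relative risk with asymptotic p-value surfaces
rho <- spattemp.risk(f, g, tolerate = TRUE)

# Spatial surface(s) at a single time point within the observed time range
rho50 <- spattemp.slice(rho, tt = 50)
```

Passing a time-static control density, as here, corresponds to the first of the two options described for spattemp.risk; a spatiotemporal control density would be supplied in the same argument position for the time-varying case.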
Visualisation
S3 methods of the plot function; see plot.stden for various options (including animation) for visualisation of a spatiotemporal density, and plot.rrst for viewing spatiotemporal relative risk surfaces (including animation and tolerance contour superimposition).
Printing and summarising objects
S3 methods (print.stden, print.rrst, summary.stden, and summary.rrst) are available for the spatiotemporal density and spatiotemporal relative risk objects respectively.
The sparr package depends upon spatstat. In particular, the user should familiarise themselves with ppp objects and im objects, which are used throughout. For spatiotemporal density estimation, sparr is assisted by importing from the misc3d package, and for the experimental capabilities involving parallel processing, sparr also currently imports doParallel, parallel, and foreach.
To cite use of current versions of sparr in publications or research projects please use:
Davies, T.M., Marshall, J.C. and Hazelton, M.L. (2018) Tutorial on kernel estimation of continuous spatial and spatiotemporal relative risk, Statistics in Medicine, 37(7), 1191-1221. <DOI:10.1002/sim.7577>
Old versions of sparr (<= 2.1-09) can be referenced by Davies et al. (2011) (see reference list).
Abramson, I. (1982), On bandwidth variation in kernel estimates --- a square root law, Annals of Statistics, 10(4), 1217-1223.
Baddeley, A. and Turner, R. (2005), spatstat: an R package for analyzing spatial point patterns, Journal of Statistical Software, 12(6), 1-42.
Baddeley, A., Rubak, E. and Turner, R. (2015), Spatial Point Patterns: Methodology and Applications with R, Chapman and Hall/CRC Press, UK.
Bevan, A. (2012), Spatial methods for analysing large-scale artefact inventories, Antiquity, 86, 492-506.
Bithell, J.F. (1990), An application of density estimation to geographical epidemiology, Statistics in Medicine, 9, 691-701.
Bithell, J.F. (1991), Estimation of relative risk function, Statistics in Medicine, 10, 1745-1751.
Bowman, A.W. and Azzalini, A. (1997), Applied Smoothing Techniques for Data Analysis: The Kernel Approach with S-Plus Illustrations. Oxford University Press Inc., New York. ISBN 0-19-852396-3.
Campos, F.A. and Fedigan, L.M. (2014), Spatial ecology of perceived predation risk and vigilance behavior in white-faced capuchins, Behavioral Ecology, 25, 477-486.
Davies, T.M. (2013), Jointly optimal bandwidth selection for the planar kernel-smoothed density-ratio, Spatial and Spatio-temporal Epidemiology, 5, 51-65.
Davies, T.M. and Baddeley, A. (2018), Fast computation of spatially adaptive kernel estimates, Statistics and Computing, 28(4), 937-956.
Davies, T.M., Cornwall, J. and Sheard, P.W. (2013), Modelling dichotomously marked muscle fibre configurations, Statistics in Medicine, 32, 4240-4258.
Davies, T.M. and Hazelton, M.L. (2010), Adaptive kernel estimation of spatial relative risk, Statistics in Medicine, 29(23), 2423-2437.
Davies, T.M., Hazelton, M.L. and Marshall, J.C. (2011), sparr: Analyzing spatial relative risk using fixed and adaptive kernel density estimation in R, Journal of Statistical Software, 39(1), 1-14.
Davies, T.M., Jones, K. and Hazelton, M.L. (2016), Symmetric adaptive smoothing regimens for estimation of the spatial relative risk function, Computational Statistics & Data Analysis, 101, 12-28.
Elson, R., Davies, T. M., Lake, I. R., Vivancos, R., Blomquist, P. B., Charlett, A. and Dabrera, G. (2021), The spatio-temporal distribution of COVID-19 infection in England between January and June 2020, Epidemiology and Infection, 149, e73.
Fernando, W.T.P.S. and Hazelton, M.L. (2014), Generalizing the spatial relative risk function, Spatial and Spatio-temporal Epidemiology, 8, 1-10.
Hazelton, M.L. (2008), Letter to the editor: Kernel estimation of risk surfaces without the need for edge correction, Statistics in Medicine, 27, 2269-2272.
Hazelton, M.L. (2023), Shrinkage estimators of the spatial relative risk function, Submitted for publication.
Hazelton, M.L. and Davies, T.M. (2009), Inference based on kernel estimates of the relative risk function in geographical epidemiology, Biometrical Journal, 51(1), 98-109.
Kelsall, J.E. and Diggle, P.J. (1995a), Kernel estimation of relative risk, Bernoulli, 1, 3-16.
Kelsall, J.E. and Diggle, P.J. (1995b), Non-parametric estimation of spatial variation in relative risk, Statistics in Medicine, 14, 2335-2342.
Prince, M. I., Chetwynd, A., Diggle, P. J., Jarner, M., Metcalf, J. V. and James, O. F. W. (2001), The geographical distribution of primary biliary cirrhosis in a well-defined cohort, Hepatology, 34, 1083-1088.
Sabel, C. E., Gatrell, A. C., Loytonen, M., Maasilta, P. and Jokelainen, M. (2000), Modelling exposure opportunities: estimating relative risk for motor disease in Finland, Social Science & Medicine, 50, 1121-1137.
Smith, B.A., Davies, T.M. and Higham, C.F.W. (2015), Spatial and social variables in the Bronze Age phase 4 cemetery of Ban Non Wat, Northeast Thailand, Journal of Archaeological Science: Reports, 4, 362-370.
Taylor, C.C. (1989), Bootstrap choice of the smoothing parameter in kernel density estimation, Biometrika, 76, 705-712.
Terrell, G.R. (1990), The maximal smoothing principle in density estimation, Journal of the American Statistical Association, 85, 470-477.
Venables, W. N. and Ripley, B. D. (2002), Modern Applied Statistics with S, Fourth Edition, Springer, New York.
Wand, M.P. and Jones, M.C. (1995), Kernel Smoothing, Chapman & Hall, London.
Wheeler, D. C. (2007), A comparison of spatial clustering and cluster detection techniques for childhood leukemia incidence in Ohio, 1996-2003, International Journal of Health Geographics, 6(13).