Provides functions to estimate fixed and adaptive kernel-smoothed spatial relative risk surfaces via the density-ratio method and perform subsequent inference. Fixed-bandwidth spatiotemporal density and relative risk estimation is also supported.

Details

Package: sparr
Date: 2023-03-08
Version: 2.3-10
License: GPL (>= 2)

Kernel smoothing, and the flexibility afforded by this methodology, provides an attractive approach to estimating complex probability density functions.

The spatial relative risk function, constructed as a ratio of estimated case to control densities (Bithell, 1990; 1991; Kelsall and Diggle, 1995a,b), describes the variation in the 'risk' of the disease, given the underlying at-risk population. This technique has been applied successfully, mainly for exploratory purposes, in a number of different analyses (see for example Sabel et al., 2000; Prince et al., 2001; Wheeler, 2007; Elson et al., 2021). It has also grown in popularity in very different fields that pose similarly styled research questions, such as ecology (e.g. Campos and Fedigan, 2014), physiology (Davies et al., 2013), and archaeology (e.g. Bevan, 2012; Smith et al., 2015).
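For orientation, the density-ratio construction can be written out as below. The symbols are notation assumed here rather than taken from this page: f and g denote the case and control densities on the study region W, hats denote kernel estimates, and rho is the log relative risk.

```latex
\hat{r}(x) = \frac{\hat{f}(x)}{\hat{g}(x)},
\qquad
\hat{\rho}(x) = \log \hat{f}(x) - \log \hat{g}(x),
\qquad x \in W
```

Values of the log-scale surface above zero indicate locations where the estimated case density exceeds the estimated control density.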

This package provides functions for spatial (i.e. bivariate/planar/2D) kernel density estimation (KDE), implementing both fixed and 'variable' or 'adaptive' (Abramson, 1982) smoothing parameter options. A selection of bandwidth calculators for bivariate KDE and the relative risk function is provided, including one based on the maximal smoothing principle (Terrell, 1990), and others based on leave-one-out cross-validation (see below). In addition, the package can construct both Monte-Carlo and asymptotic p-value surfaces ('tolerance' contours of which signal statistically significant sub-regions of extremity in a risk surface; see Hazelton and Davies, 2009; Davies and Hazelton, 2010), and it provides some visualisation tools.

Spatiotemporal estimation is also supported, largely following developments in Fernando and Hazelton (2014). This includes their fixed-bandwidth kernel estimator of spatiotemporal densities, relative risk, and asymptotic tolerance contours.

Key content of sparr can be broken up as follows:

DATASETS/DATA GENERATION

pbc a case/control planar point pattern (ppp.object) concerning liver disease in northern England.

fmd an anonymised (jittered) case/control spatiotemporal point pattern of the 2001 outbreak of veterinary foot-and-mouth disease in Cumbria (courtesy of the Animal and Plant Health Agency, UK).

burk a spatiotemporal point pattern of Burkitt's lymphoma in Uganda; artificially simulated control data are also provided for experimentation.

Also available are a number of relevant additional spatial datasets built into the spatstat package (Baddeley and Turner, 2005; Baddeley et al., 2015) through spatstat.data, such as chorley, which concerns the distribution of laryngeal cancer in an area of Lancashire, UK.

rimpoly a wrapper function of rpoint to allow generated spatial point patterns based on a pixel image to be returned with a polygonal owin.

SPATIAL

Bandwidth calculators

OS estimation of an isotropic smoothing parameter for fixed-bandwidth bivariate KDE, based on the oversmoothing principle introduced by Terrell (1990).

NS estimation of an isotropic smoothing parameter for fixed-bandwidth bivariate KDE, based on the asymptotically optimal value for a normal density (bivariate normal scale rule - see e.g. Wand and Jones, 1995).

LSCV.density a least-squares cross-validated (LSCV) estimate of an isotropic fixed bandwidth for bivariate, edge-corrected KDE (see e.g. Bowman and Azzalini, 1997).

LIK.density a likelihood cross-validated (LIK) estimate of an isotropic fixed bandwidth for bivariate, edge-corrected KDE (see e.g. Silverman, 1986).

SLIK.adapt an experimental likelihood cross-validation function for simultaneous global/pilot bandwidth selection for adaptive density estimates.

BOOT.density a bootstrap approach to optimisation of an isotropic fixed bandwidth for bivariate, edge-corrected KDE (see e.g. Taylor, 1989).

LSCV.risk estimation of a jointly optimal, common isotropic case-control fixed bandwidth for the kernel-smoothed risk function based on the mean integrated squared error (MISE), a weighted MISE, or the asymptotic MISE (see respectively Kelsall and Diggle, 1995a; Hazelton, 2008; Davies, 2013).
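As a rough illustration of how the selectors listed above might be compared, the following sketch computes several fixed bandwidths for the pbc data. It assumes sparr is installed and that the marks of pbc have levels named "case" and "control"; the variable names are illustrative only, and the cross-validation selectors optimise numerically, so they can take noticeably longer than OS or NS.

```r
library(sparr)

data(pbc)                          # case/control planar point pattern
pbc_cases    <- split(pbc)$case    # assumes mark levels "case"/"control"
pbc_controls <- split(pbc)$control

# Rule-of-thumb selectors for the case density
h_os <- OS(pbc_cases)              # oversmoothing principle (Terrell, 1990)
h_ns <- NS(pbc_cases)              # bivariate normal scale rule

# Least-squares cross-validation for the case density
h_lscv <- LSCV.density(pbc_cases, verbose = FALSE)

# Jointly optimal common bandwidth for the case-control risk function
h_joint <- LSCV.risk(pbc_cases, pbc_controls, verbose = FALSE)

h_all <- c(OS = h_os, NS = h_ns, LSCV = h_lscv, joint = h_joint)
print(h_all)
```

Any of these values could then be supplied as the fixed bandwidth when estimating a density or risk surface.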

Density and relative risk estimation

bivariate.density kernel density estimate of bivariate data; fixed or adaptive smoothing.

multiscale.density multi-scale adaptive kernel density estimates for multiple global bandwidths as per Davies and Baddeley (2018).

multiscale.slice a single adaptive kernel estimate based on taking a slice from a multi-scale estimate.

risk estimation of a (log) spatial relative risk function, either from data or pre-existing bivariate density estimates. Fixed (Kelsall and Diggle, 1995a), fixed with shrinkage (Hazelton, 2023), and both asymmetric (Davies and Hazelton, 2010) and symmetric (Davies et al., 2016) adaptive estimates are possible.

tolerance calculation of asymptotic or Monte-Carlo p-value surfaces.
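A minimal sketch of the fixed-bandwidth workflow described above, assuming sparr is installed. It estimates a case density, then a log relative risk surface with an asymptotic p-value surface requested in the same call. The choice of a pooled oversmoothing bandwidth is an assumption made for illustration, not a recommendation from this page.

```r
library(sparr)

data(pbc)
pbc_cases    <- split(pbc)$case    # assumes mark levels "case"/"control"
pbc_controls <- split(pbc)$control

# Common bandwidth computed from the pooled case/control data
h0 <- OS(pbc, nstar = "geometric")

# Fixed-bandwidth kernel density estimate of the case locations
f_hat <- bivariate.density(pbc_cases, h0 = h0, verbose = FALSE)

# Log relative risk surface estimated directly from the data;
# tolerate = TRUE also stores an asymptotic p-value surface
rho_hat <- risk(pbc_cases, pbc_controls, h0 = h0,
                tolerate = TRUE, verbose = FALSE)

summary(rho_hat)
```

Here rho_hat$rr holds the risk surface and rho_hat$P the p-value surface, both as pixel images; plot(rho_hat) dispatches to plot.rrs.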

Visualisation

S3 methods of the plot function; see plot.bivden for visualising a single bivariate density estimate from bivariate.density, plot.rrs for visualisation of a spatial relative risk function from risk, or plot.msden for viewing animations of multi-scale density estimates from multiscale.density.

tol.contour provides more flexibility for plotting and superimposing tolerance contours upon an existing plot of spatial relative risk (i.e. given output from tolerance).

Printing and summarising

S3 methods (print.bivden, print.rrs, print.msden, summary.bivden, summary.rrs, and summary.msden) are available for the bivariate density, spatial relative risk, and multi-scale adaptive density objects.

SPATIOTEMPORAL

Bandwidth calculators

OS.spattemp estimation of an isotropic smoothing parameter for the spatial margin and another for the temporal margin for spatiotemporal densities, based on the 2D and 1D versions, respectively, of the oversmoothing principle introduced by Terrell (1990).

NS.spattemp as above, based on the 2D and 1D versions of the normal scale rule (Silverman, 1986).

LSCV.spattemp least-squares cross-validated (LSCV) estimates of scalar spatial and temporal bandwidths for edge-corrected spatiotemporal KDE.

LIK.spattemp as above, based on likelihood cross-validation.

BOOT.spattemp bootstrap bandwidth selection for the spatial and temporal margins of edge-corrected spatiotemporal KDE (Taylor, 1989).

Density and relative risk estimation

spattemp.density fixed-bandwidth kernel density estimate of spatiotemporal data.

spattemp.risk fixed-bandwidth kernel estimate of spatiotemporal relative risk, either with a time-static or time-varying control density (Fernando and Hazelton, 2014).

spattemp.slice extraction function of the spatial density/relative risk at prespecified time(s).
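The spatiotemporal functions above can be chained in much the same way. The sketch below assumes sparr is installed and that fmd is a list with elements cases (a ppp whose marks are observation times) and controls (an unmarked ppp). Using oversmoothing bandwidths and a time-static control density are illustrative choices only.

```r
library(sparr)

data(fmd)
fmd_cases    <- fmd$cases      # marks hold the observation times
fmd_controls <- fmd$controls   # spatial-only control locations

# Spatial (first) and temporal (second) oversmoothing bandwidths
hlam <- OS.spattemp(fmd_cases)

# Spatiotemporal case density and time-static spatial control density
f_st <- spattemp.density(fmd_cases, h = hlam[1], lambda = hlam[2],
                         verbose = FALSE)
g_sp <- bivariate.density(fmd_controls, h0 = OS(fmd_controls),
                          verbose = FALSE)

# Spatiotemporal (log) relative risk with a time-static control density
rho_st <- spattemp.risk(f = f_st, g = g_sp, verbose = FALSE)

# Extract the spatial risk surface at the midpoint of the time window
t_mid <- mean(f_st$tlim)
sl <- spattemp.slice(rho_st, tt = t_mid)
```

The slice is returned as a list of pixel images per requested time, so sl$rr[[1]] can be plotted like any other im object.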

Visualisation

S3 methods of the plot function; see plot.stden for various options (including animation) for visualisation of a spatiotemporal density, and plot.rrst for viewing spatiotemporal relative risk surfaces (including animation and tolerance contour superimposition).

Printing and summarising objects

S3 methods (print.stden, print.rrst, summary.stden, and summary.rrst) are available for the spatiotemporal density and spatiotemporal relative risk objects respectively.

Dependencies

The sparr package depends upon spatstat. In particular, the user should familiarise themselves with ppp objects and im objects, which are used throughout. For spatiotemporal density estimation, sparr is assisted by importing from the misc3d package, and for the experimental capabilities involving parallel processing, sparr also currently imports doParallel, parallel, and foreach.

Citation

To cite use of current versions of sparr in publications or research projects please use:

Davies, T.M., Marshall, J.C. and Hazelton, M.L. (2018) Tutorial on kernel estimation of continuous spatial and spatiotemporal relative risk, Statistics in Medicine, 37(7), 1191-1221. <DOI:10.1002/sim.7577>

Old versions of sparr (<= 2.1-09) can be referenced by Davies et al. (2011) (see reference list).

References

Abramson, I. (1982), On bandwidth variation in kernel estimates --- a square root law, Annals of Statistics, 10(4), 1217-1223.

Baddeley, A. and Turner, R. (2005), spatstat: an R package for analyzing spatial point patterns, Journal of Statistical Software, 12(6), 1-42.

Baddeley, A., Rubak, E. and Turner, R. (2015), Spatial Point Patterns: Methodology and Applications with R, Chapman and Hall/CRC Press, UK.

Bevan, A. (2012), Spatial methods for analysing large-scale artefact inventories, Antiquity, 86, 492-506.

Bithell, J.F. (1990), An application of density estimation to geographical epidemiology, Statistics in Medicine, 9, 691-701.

Bithell, J.F. (1991), Estimation of relative risk functions, Statistics in Medicine, 10, 1745-1751.

Bowman, A.W. and Azzalini, A. (1997), Applied Smoothing Techniques for Data Analysis: The Kernel Approach with S-Plus Illustrations. Oxford University Press Inc., New York. ISBN 0-19-852396-3.

Campos, F.A. and Fedigan, L.M. (2014), Spatial ecology of perceived predation risk and vigilance behavior in white-faced capuchins, Behavioral Ecology, 25, 477-486.

Davies, T.M. (2013), Jointly optimal bandwidth selection for the planar kernel-smoothed density-ratio, Spatial and Spatio-temporal Epidemiology, 5, 51-65.

Davies, T.M. and Baddeley, A. (2018), Fast computation of spatially adaptive kernel estimates, Statistics and Computing, 28(4), 937-956.

Davies, T.M., Cornwall, J. and Sheard, P.W. (2013), Modelling dichotomously marked muscle fibre configurations, Statistics in Medicine, 32, 4240-4258.

Davies, T.M. and Hazelton, M.L. (2010), Adaptive kernel estimation of spatial relative risk, Statistics in Medicine, 29(23), 2423-2437.

Davies, T.M., Hazelton, M.L. and Marshall, J.C. (2011), sparr: Analyzing spatial relative risk using fixed and adaptive kernel density estimation in R, Journal of Statistical Software, 39(1), 1-14.

Davies, T.M., Jones, K. and Hazelton, M.L. (2016), Symmetric adaptive smoothing regimens for estimation of the spatial relative risk function, Computational Statistics & Data Analysis, 101, 12-28.

Elson, R., Davies, T. M., Lake, I. R., Vivancos, R., Blomquist, P. B., Charlett, A. and Dabrera, G. (2021), The spatio-temporal distribution of COVID-19 infection in England between January and June 2020, Epidemiology and Infection, 149, e73.

Fernando, W.T.P.S. and Hazelton, M.L. (2014), Generalizing the spatial relative risk function, Spatial and Spatio-temporal Epidemiology, 8, 1-10.

Hazelton, M.L. (2008), Letter to the editor: Kernel estimation of risk surfaces without the need for edge correction, Statistics in Medicine, 27, 2269-2272.

Hazelton, M.L. (2023), Shrinkage estimators of the spatial relative risk function, Submitted for publication.

Hazelton, M.L. and Davies, T.M. (2009), Inference based on kernel estimates of the relative risk function in geographical epidemiology, Biometrical Journal, 51(1), 98-109.

Kelsall, J.E. and Diggle, P.J. (1995a), Kernel estimation of relative risk, Bernoulli, 1, 3-16.

Kelsall, J.E. and Diggle, P.J. (1995b), Non-parametric estimation of spatial variation in relative risk, Statistics in Medicine, 14, 2335-2342.

Prince, M. I., Chetwynd, A., Diggle, P. J., Jarner, M., Metcalf, J. V. and James, O. F. W. (2001), The geographical distribution of primary biliary cirrhosis in a well-defined cohort, Hepatology, 34, 1083-1088.

Sabel, C. E., Gatrell, A. C., Loytonen, M., Maasilta, P. and Jokelainen, M. (2000), Modelling exposure opportunities: estimating relative risk for motor neurone disease in Finland, Social Science & Medicine, 50, 1121-1137.

Smith, B.A., Davies, T.M. and Higham, C.F.W. (2015), Spatial and social variables in the Bronze Age phase 4 cemetery of Ban Non Wat, Northeast Thailand, Journal of Archaeological Science: Reports, 4, 362-370.

Taylor, C.C. (1989), Bootstrap choice of the smoothing parameter in kernel density estimation, Biometrika, 76, 705-712.

Terrell, G.R. (1990), The maximal smoothing principle in density estimation, Journal of the American Statistical Association, 85, 470-477.

Venables, W. N. and Ripley, B. D. (2002), Modern Applied Statistics with S, Fourth Edition, Springer, New York.

Wand, M.P. and Jones, M.C. (1995), Kernel Smoothing, Chapman & Hall, London.

Wheeler, D. C. (2007), A comparison of spatial clustering and cluster detection techniques for childhood leukemia incidence in Ohio, 1996-2003, International Journal of Health Geographics, 6(13).

Author

T.M. Davies
Dept. of Mathematics & Statistics, University of Otago, Dunedin, New Zealand.
J.C. Marshall
Institute of Fundamental Sciences, Massey University, Palmerston North, New Zealand.

Maintainer: T.M.D. tilman.davies@otago.ac.nz