R/LSCV.risk.R
LSCV.risk.Rd
Methods to find a jointly optimal, common case-control isotropic bandwidth for use in estimation of the fixed or adaptive kernel-smoothed relative risk function.
Either a pre-calculated object of class bivden
representing the `case' (numerator) density estimate, or an object of class
ppp
giving the observed case data. Alternatively, if
f
is ppp
object with dichotomous
factor-valued marks
, the function treats the first
level as the case data, and the second as the control data, obviating the
need to supply g
.
As for f
, for the `control' (denominator) density; this
object must be of the same class as f
. Ignored if, as stated above,
f
contains both case and control observations.
An optional vector of length 2 giving the limits of the optimisation routine with respect to the bandwidth. If unspecified, the function attempts to choose this automatically.
An optional increasing sequence of bandwidth values at which to
manually evaluate the optimisation criterion. Used only in the case
(!auto.optim && is.null(hlim))
.
A character string; "fixed"
(default) performs classical leave-one-out
cross-validation for a jointly optimal fixed bandwidth. Alternatively, "adaptive"
utilises
multiscale adaptive kernel estimation (Davies & Baddeley, 2018) to run the cross-validation
in an effort to find a suitable jointly optimal, common global bandwidth for the adaptive relative risk function. See `Details'.
A character string controlling the selector to use. There are
three types, based on either the mean integrated squared error (MISE)
(Kelsall and Diggle, 1995; default -- method = "kelsall-diggle"
); a
weighted MISE (Hazelton, 2008 -- method = "hazelton"
); or an
approximation to the asymptotic MISE (Davies, 2013 -- method =
"davies"
). See `Details'.
Spatial grid size; the optimisation will be based on a
[resolution
\(\times\) resolution
] density estimate.
Logical value indicating whether to edge-correct the density estimates used.
A single numeric value or a vector of length 2 giving the pilot
bandwidth(s) to be used for estimation of the pilot
densities for adaptive risk surfaces. Ignored if type = "fixed"
.
A character string used to control the type of
symmetry, if any, to use for the bandwidth factors when computing an
adaptive relative risk surface. See `Details'. Ignored if type = "fixed"
.
Logical value indicating whether to automate the numerical
optimisation using optimise
. If FALSE
, the optimisation
criterion is evaluated over hseq
(if supplied), or over a seqence of
values controlled by hlim
and seqres
.
Optional resolution of an increasing sequence of bandwidth
values. Only used if (!auto.optim && is.null(hseq))
.
Numeric argument to invoke parallel processing, giving
the number of CPU cores to use when !auto.optim
. Experimental. Test
your system first using parallel::detectCores()
to identify the
number of cores available to you.
Logical value indicating whether to provide function progress commentary.
Additional arguments such as dimz
and trim
to be passed to
the internal calls to multiscale.density
.
A single numeric value of the estimated bandwidth (if
auto.optim = TRUE
). Otherwise, a list of two numeric vectors of equal
length giving the bandwidth sequence (as hs
) and corresponding CV
function value (as CV
).
Given the established preference of using a common bandwidth for both case
and control density estimates when constructing a relative
risk surface, This function calculates a `jointly optimal', common isotropic
LSCV bandwidth for the (Gaussian) kernel-smoothed relative risk function
(case-control density-ratio). It can be shown that choosing a bandwidth that
is equal for both case and control density estimates is preferable to
computing `separately optimal' bandwidths (Kelsall and Diggle, 1995). The user
can choose to either calculate a common smoothing parameter for a fixed-bandwidth
relative risk surface (type = "fixed"
; default), or a common global bandwidth for
an adaptive risk surface (type = "adaptive"
). See further comments below.
method = "kelsall-diggle"
: the function computes the
common bandwidth which minimises the approximate mean integrated squared
error (MISE) of the log-transformed risk surface (Kelsall and Diggle, 1995).
method = "hazelton"
: the function minimises a
weighted-by-control MISE of the (raw) relative risk function
(Hazelton, 2008).
method = "davies"
: the optimal bandwidth is
one that minimises a crude plug-in approximation to the asymptotic
MISE (Davies, 2013). Only possible for type = "fixed"
.
For jointly optimal, common global bandwidth selection when type = "adaptive"
, the
optimisation routine utilises multiscale.density
. Like LSCV.density
,
the leave-one-out procedure does not affect the pilot density, for which additional
control is offered via the hp
and pilot.symmetry
arguments. The user has the option of
obtaining a so-called symmetric estimate (Davies et al. 2016) via
pilot.symmetry
. This amounts to choosing the same pilot density for
both case and control densities. By choosing "none"
(default), the
result uses the case and control data separately for the fixed-bandwidth
pilots, providing the original asymmetric density-ratio of Davies and
Hazelton (2010). By selecting either of "f"
, "g"
, or
"pooled"
, the pilot density is calculated based on the case, control,
or pooled case/control data respectively (using hp[1]
as the fixed
bandwidth). Davies et al. (2016) noted some beneficial practical behaviour
of the symmetric adaptive surface over the asymmetric. (The pilot bandwidth(s), if not supplied in hp
, are calculated
internally via default use of LSCV.density
, using the requested symmetric-based data set, or separately with respect to the case and control datasets f
and g
if
pilot.symmetry = "none"
.)
The jointly optimal bandwidth selector can be
computationally expensive for large data sets and fine evaluation grid
resolutions. The user may need to experiment with adjusting hlim
to
find a suitable minimum.
Davies, T. M. (2013), Jointly optimal bandwidth selection for the planar kernel-smoothed density-ratio, Spatial and Spatio-temporal Epidemiology, 5, 51-65.
Davies, T.M. and Baddeley A. (2018), Fast computation of spatially adaptive kernel estimates, Statistics and Computing, 28(4), 937-956.
Davies, T.M. and Hazelton, M.L. (2010), Adaptive kernel estimation of spatial relative risk, Statistics in Medicine, 29(23) 2423-2437.
Davies, T.M., Jones, K. and Hazelton, M.L. (2016), Symmetric adaptive smoothing regimens for estimation of the spatial relative risk function, Computational Statistics & Data Analysis, 101, 12-28.
Hazelton, M. L. (2008), Letter to the editor: Kernel estimation of risk surfaces without the need for edge correction, Statistics in Medicine, 27, 2269-2272.
Kelsall, J.E. and Diggle, P.J. (1995), Kernel estimation of relative risk, Bernoulli, 1, 3-16.
Silverman, B.W. (1986), Density Estimation for Statistics and Data Analysis, Chapman & Hall, New York.
Wand, M.P. and Jones, C.M., 1995. Kernel Smoothing, Chapman & Hall, London.
# \donttest{
data(pbc)
pbccas <- split(pbc)$case
pbccon <- split(pbc)$control
# FIXED (for common h)
LSCV.risk(pbccas,pbccon)
#> Searching for optimal Kelsall-Diggle h in [0.1,15.278]...Done.
#> [1] 8.029713
LSCV.risk(pbccas,pbccon,method="hazelton")
#> Searching for optimal Hazelton h in [0.1,15.278]...Done.
#> [1] 3.04405
hcv <- LSCV.risk(pbccas,pbccon,method="davies",auto.optim=FALSE)
#> ================================================================================
plot(hcv[,1],log(hcv[,2]));abline(v=hcv[which.min(hcv[,2]),1],col=2,lty=2)
# ADAPTIVE (for common h0)
LSCV.risk(pbccas,pbccon,type="adaptive")
#> Selecting pilot bandwidth(s)...
#> --f--
#> --g--
#> Done.
#> [ Using hp(f) = 0.427712498752032 ; hp(g) = 0.287502277115396 ]
#> Computing multi-scale estimates...
#> --f--
#> --g--
#> Done.
#> Searching for optimal h0 in [0.105866014203302, 13.7203855365642]...Done.
#> [1] 5.809375
# change pilot bandwidths used
LSCV.risk(pbccas,pbccon,type="adaptive",hp=c(OS(pbccas)/2,OS(pbccon)/2))
#> Computing multi-scale estimates...
#> --f--
#> --g--
#> Done.
#> Searching for optimal h0 in [0.108673683863955, 13.6719393206487]...Done.
#> [1] 3.754076
# specify pooled-data symmetric relative risk estimator
LSCV.risk(pbccas,pbccon,type="adaptive",hp=OS(pbc),pilot.symmetry="pooled")
#> Computing multi-scale estimates...
#> --f--
#> --g--
#> Done.
#> Searching for optimal h0 in [0.11175854841505, 13.620321859812]...Done.
#> [1] 3.039816
# as above, for Hazelton selector
LSCV.risk(pbccas,pbccon,type="adaptive",method="hazelton")
#> Selecting pilot bandwidth(s)...
#> --f--
#> --g--
#> Done.
#> [ Using hp(f) = 0.427712498752032 ; hp(g) = 0.287502277115396 ]
#> Computing multi-scale estimates...
#> --f--
#> --g--
#> Done.
#> Searching for optimal h0 in [0.105866014203302, 13.7203855365642]...Done.
#> [1] 1.905045
LSCV.risk(pbccas,pbccon,type="adaptive",method="hazelton",hp=c(OS(pbccas)/2,OS(pbccon)/2))
#> Computing multi-scale estimates...
#> --f--
#> --g--
#> Done.
#> Searching for optimal h0 in [0.108673683863955, 13.6719393206487]...Done.
#> [1] 2.391366
LSCV.risk(pbccas,pbccon,type="adaptive",method="hazelton",hp=OS(pbc),pilot.symmetry="pooled")
#> Computing multi-scale estimates...
#> --f--
#> --g--
#> Done.
#> Searching for optimal h0 in [0.11175854841505, 13.620321859812]...Done.
#> [1] 2.444436
# }