Uses a user-supplied function to find the k nearest neighbours of
specified points in a dataset, adding the option to wrap certain variables
on a torus.
An \(M\) by \(d\) numeric matrix or data frame. Each of the \(M\) rows contains a \(d\)-dimensional observation.
An \(N\) by \(d\) numeric matrix or data frame. Each row
contains an \(d\)-dimensional point that will be queried against
data.
An integer scalar. The number of nearest neighbours, of the
points in the rows of query, to find.
The function with which to calculate the nearest neighbours.
The syntax of this function must be fn(data, query, k, ...).
The default is RANN::nn2. Another possibility is
nabor::knn.
An integer vector with element(s) in
{1, ..., ncol(data)}. The corresponding variables are wrapped
on the corresponding range gives in ranges.
A length(torus) by 2 numeric matrix.
Row i gives the range of variation of the variable indexed by
torus[i]. ranges[i, 1] and ranges[i, 2]
are equivalent values of the variable, such as 0 degrees and 360 degrees.
If length(torus) = 1 then ranges may be a vector of length
2.
An integer scalar, equal to 1 or 2. See Details.
Further arguments to be passed to fn.
An object (a list) of class c("nnt", "donut") containing the
following components.
An \(N\) by \(d\) integer matrix of the k
nearest neighbour indices, i.e. the rows of data.
An \(N\) by \(d\) numeric matrix of the k
nearest neighbour distances.
The arguments data, query,
k and fn (in fact substitute(fn)).
If torus is supplied, the
arguments torus, ranges and method.
The call to spm.
If method = 1 then the data are partially replicated, arranged
around the original data in a way that wraps the variables in torus on their respective
ranges in ranges. Then fn is called using this replicated
dataset as the argument data. If k is large and/or
data is a sparse dataset then it is possible that a single
observation contributes more than once to a set of nearest neighbours,
which is incorrect. If this occurs then nnt uses method 2 to
correct the offending rows in nn.idx and nn.dists in the
returned list object.
If method = 2 then the
following approach is used for the point in each row in query.
The data indexed by torus are shifted (and wrapped) so that the
point is located at the respective midpoints of ranges.
Method 2 is efficient only if the number of points in query is
small.
If torus is missing then fn is called using
fn(data = data, query = query, k = k, ...), so that a call to
nnt is equivalent to a call to the function chosen by fn.
Arya, S., Mount, D., Kemp, S. E. and Jefferis, G. (2019) RANN: Fast Nearest Neighbour Search (Wraps ANN Library) Using L2 Metric. R package version 2.6.1. https://CRAN.R-project.org/package=RANN
Elseberg J., Magnenat S., Siegwart R., Nuchter, A. (2012) Comparison of nearest-neighbor-search strategies and implementations for efficient shape registration. Journal of Software Engineering for Robotics (JOSER), 3(1), 2-12 https://CRAN.R-project.org/package=nabor
RANN::nn2,
nabor::knn: nearest neighbour searches.
plot.nnt plot method for objects returned from
nnt (1 and 2 dimensional data only).
got_RANN <- requireNamespace("RANN", quietly = TRUE)
got_nabor <- requireNamespace("nabor", quietly = TRUE)
set.seed(20092019)
# 2D example from the RANN:nn2 documentation (L2 metric)
x1 <- runif(100, 0, 2 * pi)
x2 <- runif(100, 0, 3)
DATA <- data.frame(x1, x2)
if (got_RANN) {
nearest <- nnt(DATA, DATA)
}
# Suppose that x1 should be wrapped
ranges1 <- c(0, 2 * pi)
query1 <- rbind(c(6, 1.3), c(2 * pi, 3), c(3, 1.5), c(4, 0))
if (got_RANN) {
res1 <- nnt(DATA, query1, k = 8, torus = 1, ranges = ranges1)
plot(res1, ylim = c(0, 3))
}
# Suppose that x1 and x2 should be wrapped
ranges2 <- rbind(c(0, 2 * pi), c(0, 3))
query2 <- rbind(c(6, 1.3), c(2 * pi, 3), c(3, 1.5), c(4, 0))
if (got_RANN) {
res2 <- nnt(DATA, query2, k = 8, torus = 1:2, ranges = ranges2)
plot(res2)
}
# Use nabor::knn (L2 metric) instead of RANN::nn2
if (got_nabor) {
res3 <- nnt(DATA, query2, k = 8, fn = nabor::knn, torus = 1:2,
ranges = ranges2)
plot(res3)
}
# 1D example
ranges <- c(0, 2 * pi)
query <- c(4, 0.1)
if (got_RANN) {
res <- nnt(x1, query, torus = 1, ranges = ranges, method = 1)
plot(res)
}