Package 'kedd' reference manual

Title:	Kernel Estimator and Bandwidth Selection for Density and Its Derivatives
Description:	Smoothing techniques and computing bandwidth selectors of the nth derivative of a probability density for one-dimensional data (described in Arsalane Chouaib Guidoum (2020) <arXiv:2012.06102> [stat.CO]).
Authors:	Iago Giné-Vázquez [cre] , Arsalane Chouaib Guidoum [aut]
Maintainer:	Iago Giné-Vázquez <[email protected]>
License:	GPL (>= 2)
Version:	1.0.4
Built:	2024-10-31 21:13:39 UTC
Source:	https://gitlab.com/iagogv/kedd

Kernel Estimator and Bandwidth Selection for Density and Its Derivatives

Description

Smoothing techniques and computing bandwidth selectors of the r'th derivative of a probability density for one-dimensional data.

Details

Package:	kedd
Type:	Package
Version:	1.0.4
Date:	2024-01-27
License:	GPL (>= 2)

There are four main types of functions in this package:

Compute the derivatives and convolutions of a kernel function (1-d).
Compute the kernel estimators for density and its derivatives (1-d).
Computing the bandwidth selectors (1-d).
Displaying kernel estimators.

Main Features

Convolutions and derivatives in kernel function:

In non-parametric statistics, a kernel is a weighting function used in non-parametric estimation techniques. The kernels functions $K(x)$ are used in derivatives of kernel density estimator to estimate $\hat{f}^{(r)}_{h}(x)$ , satisfying the following three requirements:

$\int_{R} K(x) dx = 1$
$\int_{R} xK(x) dx = 0$
$\mu_{2}(K) = \int_{R}x^{2} K(x) dx < \infty$

Several types of kernel functions $K(x)$ are commonly used in this package: Gaussian, Epanechnikov, Uniform (rectangular), Triangular, Triweight, Tricube, Biweight (quartic), Cosine.

The function kernel.fun for kernel derivative $K^{(r)}(x)$ and kernel.conv for kernel convolution $K^{(r)}\ast K^{(r)} (x)$ , where the write formally:

$K^{(r)}(x) = \frac{d^{r}}{d x^{r}} K(x)$

$K^{(r)} \ast K^{(r)} (x) = \int_{-\infty}^{+\infty} K^{(r)}(y)K^{(r)}(x-y)dy$

for $r = 0, 1, 2, \dots$

Estimators of r'th derivative of a density function:

A natural estimator of the r'th derivative of a density function $f(x)$ is:

$\hat{f}^{(r)}_{h}(x)= \frac{d^{r}}{d x^{r}} \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x-X_{i}}{h}\right) = \frac{1}{nh^{r+1}}\sum_{i=1}^{n} K^{(r)}\left(\frac{x-X_{i}}{h}\right)$

Here, $X_{1}, X_{2}, \dots,X_{n}$ is an i.i.d, sample of size $n$ from the distribution with density $f(x)$ , $K(x)$ is the kernel function which we take to be a symmetric probability density with at least $r$ non zero derivatives when estimating $f^{(r)}(x)$ , and $h$ is the bandwidth, this parameter is very important that controls the degree of smoothing applied to the data.

The case $(r=0)$ is the standard kernel density estimator (e.g. Silverman 1986, Wolfgang 1991, Scott 1992, Wand and Jones 1995, Jeffrey 1996, Bowman and Azzalini 1997, Alexandre 2009), properties of such derivative estimators are well known e.g. Sheather and Jones (1991), Jones and Kappenman (1991), Wolfgang (1991). For the case $(r > 0)$ , is derivative of kernel density estimator (e.g. Bhattacharya 1967, Schuster 1969, Alekseev 1972, Wolfgang et all 1990, Jones 1992, Stoker 1993) and for applications which require the estimation of density derivatives can be found in Singh (1977).

For r'th derivatives of kernel density estimator one-dimensional, the main function is dkde. For display, its plot method calls plot.dkde, and if to add a plot using lines.dkde.

  R> data(trimodal)
  R> dkde(x = trimodal, deriv.order = 0, kernel = "gaussian")
   
    Data: trimodal (200 obs.);      Kernel: gaussian
    Derivative order: 0;    Bandwidth 'h' = 0.1007
          eval.points           est.fx         
    Min.   :-2.91274   Min.   :0.0000066  
    1st Qu.:-1.46519   1st Qu.:0.0669750  
    Median :-0.01765   Median :0.1682045  
    Mean   :-0.01765   Mean   :0.1723692  
    3rd Qu.: 1.42989   3rd Qu.:0.2484626  
    Max.   : 2.87743   Max.   :0.4157340 
   
  R> dkde(x = trimodal, deriv.order = 1, kernel = "gaussian")
  
    Data: trimodal (200 obs.);      Kernel: gaussian
    Derivative order: 1;    Bandwidth 'h' = 0.09094
          eval.points           est.fx         
    Min.   :-2.87358   Min.   :-1.740447  
    1st Qu.:-1.44562   1st Qu.:-0.343952  
    Median :-0.01765   Median : 0.009057  
    Mean   :-0.01765   Mean   : 0.000000  
    3rd Qu.: 1.41031   3rd Qu.: 0.415343  
    Max.   : 2.83828   Max.   : 1.256891

Bandwidth selectors:

The most important factor in the r'th derivative kernel density estimate is a choice of the bandwidth $h$ for one-dimensional observations. Because of its role in controlling both the amount and the direction of smoothing, this choice is particularly important. We present the popular bandwidth selection (for more details see references) methods in this package:

Optimal Bandwidth (AMISE); with deriv.order >= 0, name of this function is h.amise.
For display, its plot method calls plot.h.amise, and to add a plot used lines.h.amise.
Maximum-likelihood cross-validation (MLCV); with deriv.order = 0, name of this function is h.mlcv.
For display, its plot method calls plot.h.mlcv, and to add a plot used lines.h.mlcv.
Unbiased cross validation (UCV); with deriv.order >= 0, name of this function is h.ucv.
For display, its plot method calls plot.h.ucv, and to add a plot used lines.h.ucv.
Biased cross validation (BCV); with deriv.order >= 0, name of this function is h.bcv.
For display, its plot method calls plot.h.bcv, and to add a plot used lines.h.bcv.
Complete cross-validation (CCV); with deriv.order >= 0, name of this function is h.ccv.
For display, its plot method calls plot.h.ccv, and to add a plot used lines.h.ccv.
Modified cross-validation (MCV); with deriv.order >= 0, name of this function is h.mcv.
For display, its plot method calls plot.h.mcv, and to add a plot used lines.h.mcv.
Trimmed cross-validation (TCV); with deriv.order >= 0, name of this function is h.tcv.
For display, its plot method calls plot.h.tcv, and to add a plot used lines.h.tcv.

  R> data(trimodal)
  R> h.bcv(x = trimodal, whichbcv = 1, deriv.order = 0, kernel = "gaussian")
  
    Call:           Biased Cross-Validation 1
    Derivative order = 0
    Data: trimodal (200 obs.);      Kernel: gaussian
    Min BCV = 0.004511636;  Bandwidth 'h' = 0.4357812 
	
  R> h.ccv(x = trimodal, deriv.order = 1, kernel = "gaussian")	
  
    Call:           Complete Cross-Validation
    Derivative order = 1 
    Data: trimodal (200 obs.);      Kernel: gaussian
    Min CCV = 0.01985078;   Bandwidth 'h' = 0.5828336
	
  R> h.tcv(x = trimodal, deriv.order = 2, kernel = "gaussian")
  
    Call:           Trimmed Cross-Validation
    Derivative order = 2
    Data: trimodal (200 obs.);      Kernel: gaussian
    Min TCV = -295.563;     Bandwidth 'h' = 0.08908582
	
  R> h.ucv(x = trimodal, deriv.order = 3, kernel = "gaussian")

    Call:           Unbiased Cross-Validation
    Derivative order = 3
    Data: trimodal (200 obs.);      Kernel: gaussian
    Min UCV = -63165.18;    Bandwidth 'h' = 0.1067236

For an overview of this package, see vignette("kedd").

Requirements

R version >= 2.15.0

Licence

This package and its documentation are usable under the terms of the "GNU General Public License", a copy of which is distributed with the package.

References

Alekseev, V. G. (1972). Estimation of a probability density function and its derivatives. Mathematical notes of the Academy of Sciences of the USSR. 12(5), 808–811.

Alexandre, B. T. (2009). Introduction to Nonparametric Estimation. Springer-Verlag, New York.

Bowman, A. W. (1984). An alternative method of cross-validation for the smoothing of kernel density estimates. Biometrika, 71, 353–360.

Bowman, A. W. and Azzalini, A. (1997). Applied Smoothing Techniques for Data Analysis: the Kernel Approach with S-Plus Illustrations. Oxford University Press, Oxford.

Bowman, A.W. and Azzalini, A. (2003). Computational aspects of nonparametric smoothing with illustrations from the sm library. Computational Statistics and Data Analysis, 42, 545–560.

Bowman, A.W. and Azzalini, A. (2013). sm: Smoothing methods for nonparametric regression and density estimation. R package version 2.2-5.3. Ported to R by B. D. Ripley.

Bhattacharya, P. K. (1967). Estimation of a probability density function and Its derivatives. Sankhya: The Indian Journal of Statistics, Series A, 29, 373–382.

Duin, R. P. W. (1976). On the choice of smoothing parameters of Parzen estimators of probability density functions. IEEE Transactions on Computers, C-25, 1175–1179.

Feluch, W. and Koronacki, J. (1992). A note on modified cross-validation in density estimation. Computational Statistics and Data Analysis, 13, 143–151.

George, R. T. (1990). The maximal smoothing principle in density estimation. Journal of the American Statistical Association, 85, 470–477.

George, R. T. and Scott, D. W. (1985). Oversmoothed nonparametric density estimates. Journal of the American Statistical Association, 80, 209–214.

Habbema, J. D. F., Hermans, J., and Van den Broek, K. (1974) A stepwise discrimination analysis program using density estimation. Compstat 1974: Proceedings in Computational Statistics. Physica Verlag, Vienna.

Heidenreich, N. B., Schindler, A. and Sperlich, S. (2013). Bandwidth selection for kernel density estimation: a review of fully automatic selectors. Advances in Statistical Analysis.

Jeffrey, S. S. (1996). Smoothing Methods in Statistics. Springer-Verlag, New York.

Jones, M. C. (1992). Differences and derivatives in kernel estimation. Metrika, 39, 335–340.

Jones, M. C., Marron, J. S. and Sheather,S. J. (1996). A brief survey of bandwidth selection for density estimation. Journal of the American Statistical Association, 91, 401–407.

Jones, M. C. and Kappenman, R. F. (1991). On a class of kernel density estimate bandwidth selectors. Scandinavian Journal of Statistics, 19, 337–349.

Loader, C. (1999). Local Regression and Likelihood. Springer, New York.

Olver, F. W., Lozier, D. W., Boisvert, R. F. and Clark, C. W. (2010). NIST Handbook of Mathematical Functions. Cambridge University Press, New York, USA.

Peter, H. and Marron, J.S. (1987). Estimation of integrated squared density derivatives. Statistics and Probability Letters, 6, 109–115.

Peter, H. and Marron, J.S. (1991). Local minima in cross-validation functions. Journal of the Royal Statistical Society, Series B, 53, 245–252.

Radhey, S. S. (1987). MISE of kernel estimates of a density and its derivatives. Statistics and Probability Letters, 5, 153–159.

Rudemo, M. (1982). Empirical choice of histograms and kernel density estimators. Scandinavian Journal of Statistics, 9, 65–78.

Scott, D. W. (1992). Multivariate Density Estimation. Theory, Practice and Visualization. New York: Wiley.

Scott, D.W. and George, R. T. (1987). Biased and unbiased cross-validation in density estimation. Journal of the American Statistical Association, 82, 1131–1146.

Schuster, E. F. (1969) Estimation of a probability density function and its derivatives. The Annals of Mathematical Statistics, 40 (4), 1187–1195.

Sheather, S. J. (2004). Density estimation. Statistical Science, 19, 588–597.

Sheather, S. J. and Jones, M. C. (1991). A reliable data-based bandwidth selection method for kernel density estimation. Journal of the Royal Statistical Society, Series B, 53, 683–690.

Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. Chapman & Hall/CRC. London.

Singh, R. S. (1977). Applications of estimators of a density and its derivatives to certain statistical problems. Journal of the Royal Statistical Society, Series B, 39(3), 357–363.

Stoker, T. M. (1993). Smoothing bias in density derivative estimation. Journal of the American Statistical Association, 88, 855–863.

Stute, W. (1992). Modified cross validation in density estimation. Journal of Statistical Planning and Inference, 30, 293–305.

Tarn, D. (2007). ks: Kernel density estimation and kernel discriminant analysis for multivariate data in R. Journal of Statistical Software, 21(7), 1–16.

Tristen, H. and Jeffrey, S. R. (2008). Nonparametric Econometrics: The np Package. Journal of Statistical Software,27(5).

Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S. New York: Springer.

Wand, M. P. and Jones, M. C. (1995). Kernel Smoothing. Chapman and Hall, London.

Wand, M.P. and Ripley, B. D. (2013). KernSmooth: Functions for Kernel Smoothing for Wand and Jones (1995). R package version 2.23-10.

Wolfgang, H. (1991). Smoothing Techniques, With Implementation in S. Springer-Verlag, New York.

Wolfgang, H., Marlene, M., Stefan, S. and Axel, W. (2004). Nonparametric and Semiparametric Models. Springer-Verlag, Berlin Heidelberg.

Wolfgang, H., Marron, J. S. and Wand, M. P. (1990). Bandwidth choice for density derivatives. Journal of the Royal Statistical Society, Series B, 223–232.

Datasets

Description

A random sample of size 200 from the claw, bimodal, kurtotic, outlier and trimodal Gaussian density.

Usage

data(claw)
data(bimodal)
data(kurtotic)
data(outlier)
data(trimodal)
data(claw)
data(bimodal)
data(kurtotic)
data(outlier)
data(trimodal)

Format

Numeric vector with length 200.

Details

Generate 200 random numbers, distributed according to a normal mixture, using rnorMix in package nor1mix.

  ## Claw density
  claw <- rnorMix(n=200, MW.nm10)
  plot(MW.nm10)
  
  ## Bimodal density
  bimodal <- rnorMix(n=200, MW.nm7)
  plot( MW.nm7)
  
  ## Kurtotic density
  kurtotic <- rnorMix(n=200, MW.nm4)
  plot(MW.nm4)
  
  ## Outlier density
  outlier <- rnorMix(n=200, MW.nm5)
  plot( MW.nm5)
  
  ## Trimodal density
  trimodal <- rnorMix(n=200, MW.nm9)
  plot(MW.nm9)

Source

Randomly generated a normal mixture with the function rnorMix in package nor1mix.

References

Martin, M. (2013). nor1mix: Normal (1-d) mixture models (S3 classes and methods). R package version 1.1-4.

Derivatives of Kernel Density Estimator

Description

The (S3) generic function dkde computes the r'th derivative of kernel density estimator for one-dimensional data. Its default method does so with the given kernel and bandwidth $h$ for one-dimensional observations.

Usage

dkde(x, ...)
## Default S3 method:
dkde(x, y = NULL, deriv.order = 0, h, kernel = c("gaussian", 
         "epanechnikov", "uniform", "triangular", "triweight", 
         "tricube", "biweight", "cosine"), ...)
dkde(x, ...)
## Default S3 method:
dkde(x, y = NULL, deriv.order = 0, h, kernel = c("gaussian", 
         "epanechnikov", "uniform", "triangular", "triweight", 
         "tricube", "biweight", "cosine"), ...)

Arguments

`x`	the data from which the estimate is to be computed.
`y`	the points of the grid at which the density derivative is to be estimated; the defaults are $\tau * h$ outside of range( $x$ ), where $\tau = 4$ .
`deriv.order`	derivative order (scalar).
`h`	the smoothing bandwidth to be used, can also be a character string giving a rule to choose the bandwidth, see `h.bcv`. The default `h.ucv`.
`kernel`	a character string giving the smoothing kernel to be used, with default `"gaussian"`.
`...`	further arguments for (non-default) methods.

Details

A simple estimator for the density derivative can be obtained by taking the derivative of the kernel density estimate. If the kernel $K(x)$ is differentiable $r$ times then the r'th density derivative estimate can be written as:

$\hat{f}^{(r)}_{h}(x)=\frac{1}{nh^{r+1}}\sum_{i=1}^{n} K^{(r)}\left(\frac{x-X_{i}}{h}\right)$

where,

$K^{(r)}(x) = \frac{d^{r}}{d x^{r}} K(x)$

for $r = 0, 1, 2, \dots$

The following assumptions on the density $f^{(r)}(x)$ , the bandwidth $h$ , and the kernel $K(x)$ :

The $(r+2)$ derivative $f^{(r+2)}(x)$ is continuous, square integrable and ultimately monotone.
$\lim_{n \to \infty} h = 0$ and $\lim_{n \to \infty}n h^{2r+1} = \infty$ i.e., as the number of samples $n$ is increased $h$ approaches zero at a rate slower than $1/n^{2r+1}$ .
$K(x) \geq 0$ and $\int_{R} K(x) dx = 1$ . The kernel function is assumed to be symmetric about the origin i.e., $\int_{R} xK^{(r)}(x) dx = 0$ for even $r$ and has finite second moment i.e., $\mu_{2}(K)=\int_{R}x^{2} K(x) dx < \infty$ .

Some theoretical properties of the estimator $\hat{f}^{(r)}_{h}$ have been investigated, among others, by Bhattacharya (1967), Schuster (1969). Let us now turn to the statistical properties of estimator. We are interested in the mean squared error since it combines squared bias and variance.

The bias can be written as:

$E\left[\hat{f}^{(r)}_{h}(x)\right]- f^{(r)}(x) = \frac{1}{2}h^{2}\mu_{2}(K) f^{(r+2)}(x)+o(h^{2})$

The variance of the estimator can be written as:

$VAR\left[\hat{f}^{(r)}_{h}(x)\right]=\frac{f(x) R\left(K^{(r)}\right)}{nh^{2r+1}} + o(1/nh^{2r+1})$

with, $R\left(K^{(r)}\right) = \int_{R} \left(K^{(r)}(x)\right)^{2}dx.$

The MSE (Mean Squared Error) for kernel density derivative estimators can be written as:

$MSE\left(\hat{f}^{(r)}_{h}(x),f^{(r)}(x)\right)=\frac{f(x)R\left(K^{(r)}\right)}{nh^{2r+1}}+\frac{1}{4}h^{4}\mu_{2}^{2}(K) f^{(r+1)}(x)^{2}+o(h^{4}+1/nh^{2r+1})$

It follows that the MSE-optimal bandwidth for estimating $\hat{f}^{(r)}_{h}S(x)$ , is of order $n^{-1/(2r+5)}$ . Therefore, the estimation of $\hat{f}^{(1)}_{h}(x)$ requires a bandwidth of order $n^{-1/7}$ compared to the optimal $n^{-1/5}$ for estimating $f(x)$ itself. It reveals the increasing difficulty in problems of estimating higher derivatives.

The MISE (Mean Integrated Squared Error) can be written as:

$MISE\left(\hat{f}^{(r)}_{h}(x),f^{(r)}(x)\right)=AMISE\left(\hat{f}^{(r)}_{h}(x),f^{(r)}(x)\right)+o(h^{4}+1/nh^{2r+1})$

where,

$AMISE\left(\hat{f}^{(r)}_{h}(x),f^{(r)}(x)\right)=\frac{1}{nh^{2r+1}}R\left(K^{(r)}\right)+\frac{1}{4}h^{4}\mu_{2}^{2}(K)R\left(f^{(r+2)}\right)$

with: $R\left(f^{(r)}(x)\right) = \int_{R} \left(f^{(r)}(x)\right)^{2}dx.$
The performance of kernel is measured by MISE or AMISE (Asymptotic MISE).

If the bandwidth h is missing from dkde, then the default bandwidth is h.ucv(x,deriv.order,kernel) (Unbiased cross-validation, see h.ucv).
For more details see references.

Value

`x`	data points - same as input.
`data.name`	the deparsed name of the `x` argument.
`n`	the sample size after elimination of missing values.
`kernel`	name of kernel to use.
`deriv.order`	the derivative order to use.
`h`	the bandwidth value to use.
`eval.points`	the coordinates of the points where the density derivative is estimated.
`est.fx`	the estimated density derivative values.

Note

This function are available in other packages such as KernSmooth, sm, np, GenKern and locfit if deriv.order=0, and in ks package for Gaussian kernel only if 0 <= deriv.order <= 10.

Author(s)

Arsalane Chouaib Guidoum [email protected]

References

Alekseev, V. G. (1972). Estimation of a probability density function and its derivatives. Mathematical notes of the Academy of Sciences of the USSR. 12 (5), 808–811.