Abstract:
One of the main objectives of this dissertation is to derive efficient nonparametric estimators
for an unknown density f. It is well known that the ordinary kernel density
estimator has, despite several good properties, some drawbacks. For example, it suffers
from boundary bias and it also exhibits spurious bumps in the tails. Various solutions
to overcome these defects are presented in this study, which include the application of a
transformation kernel density estimator. The latter estimator (if implemented correctly)
is pursued as a simultaneous solution for both boundary bias and spurious bumps in the
tails. Among other advantages, this estimator can also detect and estimate density
modes more effectively.
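As an illustration of the general idea only, a transformation kernel density estimator can be sketched as follows: the data are transformed by a function T, an ordinary kernel estimate g_hat is computed on the transformed scale, and the estimate on the original scale is g_hat(T(x)) T'(x). The log transformation, the Gaussian kernel, the normal-reference bandwidth and all function names below are assumptions chosen for the sketch; they are not the transformations or the bandwidth selector developed in this study.

```python
import math
import random
import statistics


def gaussian_kde(x, data, h):
    """Ordinary Gaussian kernel density estimate at a single point x."""
    norm = len(data) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) / norm


def normal_reference_bandwidth(data):
    """Rule-of-thumb bandwidth (an assumption, not the selector of this study)."""
    return 1.06 * statistics.stdev(data) * len(data) ** (-0.2)


def make_transformation_kde(data, transform=math.log, derivative=lambda t: 1.0 / t):
    """Return x -> g_hat(T(x)) * T'(x), where g_hat is a KDE of the transformed data.

    The default T(x) = log(x) is a hypothetical choice suited to positive data.
    """
    y = [transform(xi) for xi in data]
    h = normal_reference_bandwidth(y)

    def estimate(x):
        return gaussian_kde(transform(x), y, h) * derivative(x)

    return estimate


# Simulated positive, right-skewed data (illustrative only).
random.seed(0)
sample = [random.lognormvariate(0.0, 1.0) for _ in range(200)]

f_hat = make_transformation_kde(sample)
grid = [0.01 + 0.02 * i for i in range(1000)]  # evaluation points in (0, 20)
density = [f_hat(x) for x in grid]
```

Because the kernel estimate is formed on the log scale, the back-transformed estimate places no mass below zero, which is the sense in which a suitable transformation can address boundary bias.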
To apply the transformation kernel density estimator, an effective transformation of the
data is required. To this end, parametric transformations introduced and studied in the
literature are first discussed extensively, with emphasis on the practical feasibility
of these transformations. Secondly, known methods of estimating the
parameters associated with these transformations are discussed (e.g. profile maximum
likelihood), and two new estimation techniques, referred to as the minimum residual and
minimum distance methods, are introduced. Furthermore, new procedures are developed
to select a parametric transformation that is suitable for application to a given set of
data. Finally, utilizing the above techniques, an optimal transformation to any
target distribution (e.g. the normal distribution) is introduced; this transformation
can also be applied iteratively. A polynomial approximation of the optimal transformation
function is presented. It is shown that the performance of this transformation exceeds
that of any transformation available in the literature.
In the context of transformation kernel density estimation, we present a comprehensive
literature study of the methods currently available and then introduce a new semi-parametric
transformation estimation procedure based on the optimal transformation of data to normality.
However, application of the optimal transformation in this context requires special
attention. In order to create a density estimator that addresses both boundary bias and
spurious bumps in the tails simultaneously in an automatic way, a generalized bandwidth
adaptation procedure is developed, which is applied in conjunction with a newly developed
constant shift procedure.
Furthermore, the optimal transformation function is based on a kernel distribution function
estimator. A new data-based selector for the smoothing parameter (bandwidth) is proposed,
and it is shown that this selector performs better than a well-established bandwidth
selector proposed in the literature.
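For orientation, a kernel distribution function estimator with a Gaussian kernel takes the form F_hat(x) = (1/n) sum_i Phi((x - X_i)/h), where Phi is the standard normal distribution function. The sketch below evaluates this estimator and then maps the data towards normality through Phi^{-1}(F_hat(x)). That mapping is one common construction and is shown here as an assumption; it is not claimed to be the exact optimal transformation of this study. The data values and the bandwidth h = 0.5 are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used as kernel CDF and as target


def kernel_cdf(x, data, h):
    """Kernel distribution function estimate: the average of Phi((x - X_i) / h)."""
    return sum(STD_NORMAL.cdf((x - xi) / h) for xi in data) / len(data)


def to_normal_scores(data, h):
    """Map each observation to Phi^{-1}(F_hat(x)) (an assumed construction)."""
    return [STD_NORMAL.inv_cdf(kernel_cdf(xi, data, h)) for xi in data]


data = [-1.3, -0.4, 0.1, 0.7, 2.2]  # hypothetical observations, sorted
h = 0.5                             # hypothetical bandwidth, fixed by hand

cdf_values = [kernel_cdf(x, data, h) for x in (-10.0, 0.0, 10.0)]
scores = to_normal_scores(data, h)
```

Since F_hat is a smooth, strictly increasing function, the resulting transformation preserves the order of the observations, and the quality of the transformation depends directly on the choice of the bandwidth h.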
To evaluate the performance of the newly proposed semi-parametric transformation estimation
procedure, a Monte Carlo simulation study is presented, based on densities that cover a
wide range of shapes. The main results derived in the simulation study include
the following:
- the proposed optimal transformation function can take on all the possible shapes of a parametric transformation as well as any combination of these shapes, which results in high p-values when the transformed data are tested for normality.
- the new minimum residual and minimum distance techniques contribute to better transformations to normality when a parametric transformation is applicable.
- the newly proposed semi-parametric transformation kernel density estimator performs well for unimodal, low-kurtosis and high-kurtosis densities. Moreover, it estimates densities with pronounced curvature (e.g. modes and valleys) more effectively than existing procedures in the literature.
- the new transformation density estimator does not exhibit spurious bumps in the tail regions.
- boundary bias is addressed automatically.
Finally, practical examples based on real-life data are presented.