Search Results

  • Item
    Pattern recognition via projection-based k-NN rules
    (2008-06) Fraiman, Ricardo; Justel, Ana; Svarc, Marcela
    We introduce a new procedure for pattern recognition based on the concepts of random projections and nearest neighbors. It can be thought of as an improvement of the classical nearest-neighbor classification rules. Besides the concept of neighbors, we introduce the notion of a district, a larger set which will be projected. We then apply one-dimensional k-NN methods to the data projected onto randomly selected directions. In this way we obtain a method with some robustness properties that is better able to handle high-dimensional data. The procedure is also universally consistent. We challenge the method with the Isolet data, where we obtain a very high classification score.
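
The core idea of the item above can be illustrated with a short sketch: project the data onto random unit directions, apply a 1-D k-NN vote on each projection, and pool the votes. This is a minimal illustration under assumed conventions only; the district construction described in the abstract is omitted, and all function and parameter names are illustrative, not the authors'.

```python
import numpy as np
from collections import Counter

def random_projection_knn(X_train, y_train, X_test, k=5, n_directions=25, seed=0):
    """Majority vote of 1-D k-NN rules applied to random projections (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    d = X_train.shape[1]
    pooled_labels = []
    for _ in range(n_directions):
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)                              # random unit direction
        z_train, z_test = X_train @ u, X_test @ u           # one-dimensional projections
        dist = np.abs(z_test[:, None] - z_train[None, :])   # 1-D distances, test x train
        nn_idx = np.argsort(dist, axis=1)[:, :k]            # k nearest projected neighbors
        pooled_labels.append(y_train[nn_idx])
    pooled = np.concatenate(pooled_labels, axis=1)          # neighbor labels from all directions
    return np.array([Counter(row).most_common(1)[0][0] for row in pooled])
```

Calling `random_projection_knn(X_train, y_train, X_test)` returns one predicted label per row of `X_test`, obtained by majority vote over the pooled projected neighbors.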
  • Item
    Selection of variables for cluster analysis and classification rules
    (2006-11) Fraiman, Ricardo; Justel, Ana; Svarc, Marcela
  • Item
    The choice of initial estimate for computing MM-estimates
    (2008-04) Svarc, Marcela
    We show, using a Monte Carlo study, that MM-estimates that use projection estimates as the starting point of an iterative weighted least squares algorithm behave more robustly than MM-estimates starting at an S-estimate of similar Gaussian efficiency. Moreover, the former have a robustness behavior close to that of the P-estimates, with an additional advantage: they are asymptotically normal, making statistical inference possible.
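
For context, the M-step of an MM-regression is typically computed by iteratively reweighted least squares started from an initial robust fit, which is exactly the ingredient the study above varies. Below is a minimal sketch of that iteration with a Tukey bisquare weight; it assumes the initial coefficients and a robust residual scale are supplied, and it is not the authors' exact simulation setup.

```python
import numpy as np

def tukey_weight(r, c=4.685):
    """Tukey bisquare weights psi(r)/r; zero beyond the tuning constant c."""
    w = (1.0 - (r / c) ** 2) ** 2
    w[np.abs(r) >= c] = 0.0
    return w

def mm_irls(X, y, beta0, sigma, max_iter=50, tol=1e-8):
    """Iteratively reweighted least squares for the M-step of an MM-regression (sketch).

    beta0 : initial coefficients (e.g. an S-estimate or a projection estimate)
    sigma : robust residual scale, held fixed during the iteration
    """
    beta = np.asarray(beta0, dtype=float).copy()
    for _ in range(max_iter):
        r = (y - X @ beta) / sigma                 # standardized residuals
        w = tukey_weight(r)                        # large residuals get weight zero
        Xw = X * w[:, None]                        # row-weighted design matrix
        beta_new = np.linalg.solve(Xw.T @ X, Xw.T @ y)   # solve X'WX beta = X'Wy
        if np.linalg.norm(beta_new - beta) < tol:
            return beta_new
        beta = beta_new
    return beta
```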
  • Item
    Resistant estimates for high dimensional and functional data based on random projections
    (2011) Fraiman, Ricardo; Svarc, Marcela
    We herein propose a new robust estimation method based on random projections that is adaptive and automatically produces a robust estimate, while enabling easy computation for high- or infinite-dimensional data. Under some restricted contamination models, the procedure is robust and attains full efficiency. We tested the method using both simulated and real data.
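
Although the abstract does not spell out the construction, one common way random projections yield resistant estimates is to measure each observation's outlyingness over many one-dimensional projections and downweight accordingly (a Stahel-Donoho-type device). The sketch below illustrates that generic idea for a location estimate; it is an assumption-laden illustration, not the paper's procedure.

```python
import numpy as np

def projection_outlyingness_mean(X, n_directions=200, seed=0):
    """Resistant location estimate via random-projection outlyingness (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    out = np.zeros(n)
    for _ in range(n_directions):
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)
        z = X @ u
        med = np.median(z)
        mad = np.median(np.abs(z - med)) + 1e-12
        out = np.maximum(out, np.abs(z - med) / mad)   # worst-case outlyingness over directions
    w = 1.0 / (1.0 + out ** 2)                         # downweight outlying observations
    return (w[:, None] * X).sum(axis=0) / w.sum()      # weighted mean
```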
  • Item
    Principal components for multivariate functional data
    (2011) Barrendero, J.R.; Justel, Ana; Svarc, Marcela
    A principal component method for multivariate functional data is proposed. The data can be arranged in a matrix whose elements are functions, so that for each individual a vector of p functions is observed. This set of p curves is reduced to a small number of transformed functions, retaining as much information as possible. The criterion used to measure the information loss is the integrated variance. Under mild regularity conditions, it is proved that if the original functions are smooth, this property is inherited by the principal components. A numerical procedure to obtain the smooth principal components is proposed, and the goodness of the dimension reduction is assessed by two new measures of the proportion of explained variability. The method performs as expected on various controlled simulated data sets and provides interesting conclusions when applied to real data sets.
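
A rough way to see the construction is to discretize each of the p curves on a common grid, approximate the integrated variance by a quadrature-weighted Euclidean norm, and run ordinary PCA on the stacked vectors. The sketch below does exactly that; the smoothing of the principal components discussed in the abstract is omitted, and the array layout is an assumption of this illustration.

```python
import numpy as np

def multivariate_fpca(X, grid, n_components=2):
    """Principal components for p curves per individual, via discretization (sketch).

    X    : array (n, p, T); X[i, j] is curve j of individual i evaluated on `grid`
    grid : common evaluation grid of length T
    """
    n, p, T = X.shape
    w = np.gradient(grid)                                  # quadrature weights for the integral
    Xw = X * np.sqrt(w)                                    # Euclidean norm approximates integrated norm
    Xc = Xw.reshape(n, p * T)                              # one long vector per individual
    Xc = Xc - Xc.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigvals = s ** 2 / (n - 1)
    explained = eigvals[:n_components] / eigvals.sum()     # proportion of integrated variance
    scores = Xc @ Vt[:n_components].T                      # component scores per individual
    components = Vt[:n_components].reshape(n_components, p, T) / np.sqrt(w)
    return scores, components, explained
```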
  • Item
    Interpretable Clustering using Unsupervised Binary Trees
    (2011) Fraiman, Ricardo; Ghattas, Badih; Svarc, Marcela
    We herein introduce a new method of interpretable clustering that uses unsupervised binary trees. It is a three-stage procedure, the first stage of which entails a series of recursive binary splits to reduce the heterogeneity of the data within the new subsamples. During the second stage (pruning), consideration is given to whether adjacent nodes can be aggregated. Finally, during the third stage (joining), similar clusters are joined together, even if they did not originally share the same parent. Consistency results are obtained, and the procedure is applied to simulated and real data sets.
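
The first (splitting) stage of such a tree can be sketched as recursive axis-aligned cuts chosen to maximize the drop in within-node heterogeneity, here measured by the within-node sum of squares. The pruning and joining stages, and the paper's exact splitting criterion, are not reproduced; the code is only an illustration under those assumptions.

```python
import numpy as np

def within_ss(X):
    """Within-node heterogeneity: sum of squared deviations from the node mean."""
    return ((X - X.mean(axis=0)) ** 2).sum()

def grow_tree(X, idx=None, min_size=10, depth=0, max_depth=3):
    """Stage one of an interpretable clustering tree: recursive binary splits (sketch).

    Each split is an axis-aligned cut chosen to maximize the drop in within-node
    sum of squares. Returns a list of leaves, each a (row-index array, depth) pair.
    """
    if idx is None:
        idx = np.arange(len(X))
    node = X[idx]
    if depth >= max_depth or len(idx) <= 2 * min_size:
        return [(idx, depth)]                              # stop: node becomes a leaf
    best = None
    for j in range(X.shape[1]):                            # candidate splitting variable
        order = np.argsort(node[:, j])
        for cut in range(min_size, len(idx) - min_size):   # candidate cut points
            left, right = order[:cut], order[cut:]
            gain = within_ss(node) - within_ss(node[left]) - within_ss(node[right])
            if best is None or gain > best[0]:
                best = (gain, j, left, right)
    _, j, left, right = best
    return (grow_tree(X, idx[left], min_size, depth + 1, max_depth)
            + grow_tree(X, idx[right], min_size, depth + 1, max_depth))
```

`grow_tree(X)` returns the leaf index sets, which play the role of the initial clusters before any pruning or joining is applied.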