smile.feature.imputation.KNNImputer

All Implemented Interfaces: Serializable, Function<Tuple,Tuple>, Transform

public class KNNImputer extends Object implements Transform

Missing value imputation with k-nearest neighbors (KNN). The method imputes a missing value from the instances most similar to the instance of interest. Suppose instance A is missing its value on attribute i. The method finds the K instances that have a value on attribute i and are closest to A, under some distance such as Euclidean distance, on the attributes for which A has values. The average of those K neighbors' values on attribute i is then used as the estimate for the missing value in A. In the weighted average, the contribution of each neighbor is weighted by its similarity to A.
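The procedure described above can be illustrated with a small standalone sketch. It does not use Smile and is not the library's implementation: the class `KnnImputeSketch`, its methods, the use of `NaN` to mark missing values, and the inverse-distance weighting are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative sketch only; hypothetical names, not part of Smile. */
public class KnnImputeSketch {

    /**
     * Euclidean distance over the attributes present (non-NaN) in both rows,
     * ignoring the attribute being imputed. Returns infinity if the rows
     * share no usable attribute.
     */
    static double distance(double[] a, double[] b, int skip) {
        double sum = 0.0;
        int shared = 0;
        for (int j = 0; j < a.length; j++) {
            if (j == skip || Double.isNaN(a[j]) || Double.isNaN(b[j])) continue;
            double d = a[j] - b[j];
            sum += d * d;
            shared++;
        }
        return shared == 0 ? Double.POSITIVE_INFINITY : Math.sqrt(sum);
    }

    /**
     * Estimates x[i] from the k rows of data that have a value on attribute i
     * and are closest to x. With weighted = true, each neighbor is weighted by
     * inverse distance (an assumed similarity measure).
     */
    static double impute(double[][] data, double[] x, int i, int k, boolean weighted) {
        // Only rows with a value on attribute i can serve as neighbors.
        List<double[]> candidates = new ArrayList<>();
        for (double[] row : data) {
            if (row != x && !Double.isNaN(row[i])) candidates.add(row);
        }
        candidates.sort(Comparator.comparingDouble(row -> distance(x, row, i)));

        double num = 0.0;
        double den = 0.0;
        for (double[] row : candidates.subList(0, Math.min(k, candidates.size()))) {
            // The small constant avoids division by zero for identical rows.
            double w = weighted ? 1.0 / (1e-9 + distance(x, row, i)) : 1.0;
            num += w * row[i];
            den += w;
        }
        return den == 0.0 ? Double.NaN : num / den;
    }

    public static void main(String[] args) {
        double[][] data = {
            {1.0, 1.0, 10.0},
            {1.1, 0.9, 12.0},
            {5.0, 5.0, 50.0},
            {1.0, 1.0, Double.NaN}  // instance A, missing attribute 2
        };
        // The two nearest neighbors are rows 0 and 1, so the plain average is 11.0.
        System.out.println(impute(data, data[3], 2, 2, false));
        // The weighted estimate is pulled toward row 0, which matches A exactly.
        System.out.println(impute(data, data[3], 2, 2, true));
    }
}
```

In this toy data set the distant row `{5.0, 5.0, 50.0}` never contributes when k = 2, which is the point of restricting the average to the nearest neighbors.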

Constructor Summary

Constructors

- KNNImputer(DataFrame data, int k, String... columns)
  
  Constructor with Euclidean distance on selected columns.
- KNNImputer(DataFrame data, int k, Distance<Tuple> distance)
  
  Constructor with a user-supplied distance measure.

Method Summary

- Tuple apply(Tuple x)

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface java.util.function.Function
andThen, compose

Methods inherited from interface smile.data.transform.Transform
andThen, apply, compose

Constructor Details
- KNNImputer
  
  public KNNImputer(DataFrame data, int k, Distance<Tuple> distance)
  
  Constructor with a user-supplied distance measure.
  
  Parameters:
  
  data - the data frame searched for nearest neighbors.
  
  k - the number of nearest neighbors used for imputation.
  
  distance - the distance measure.
- KNNImputer
  
  public KNNImputer(DataFrame data, int k, String... columns)
  
  Constructor with Euclidean distance on selected columns.
  
  Parameters:
  
  data - the data frame searched for nearest neighbors.
  
  k - the number of nearest neighbors used for imputation.
  
  columns - the columns used in Euclidean distance computation. If empty, all columns will be used.

Method Details
- apply
  
  public Tuple apply(Tuple x)
  
  Specified by:
  
  apply in interface Function<Tuple,Tuple>
