Package smile.feature.imputation
Class KNNImputer
java.lang.Object
smile.feature.imputation.KNNImputer
- All Implemented Interfaces:
Serializable, Function<Tuple,Tuple>, Transform
Missing value imputation with k-nearest neighbors. The KNN-based method
selects instances similar to the instance of interest to impute
missing values. If we consider instance A that has one missing value on
attribute i, this method would find K other instances, which have a value
present on attribute i, with values most similar (in terms of some distance,
e.g. Euclidean distance) to A on other attributes without missing values.
The average of the values on attribute i from the K nearest
neighbors is then used as the estimate for the missing value in instance A.
If a weighted average is used instead, the contribution of each neighbor is
weighted by its similarity to instance A.
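The procedure described above can be sketched in plain Java. This is only an illustration of the idea, not Smile's implementation: the class `KnnImputeSketch` and its methods are hypothetical names, missing values are assumed to be encoded as `NaN` in a `double[][]`, and the unweighted average is used.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration class; not part of Smile.
class KnnImputeSketch {
    /** Euclidean distance over the attributes present in both rows; +Infinity if none are shared. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        int shared = 0;
        for (int j = 0; j < a.length; j++) {
            if (!Double.isNaN(a[j]) && !Double.isNaN(b[j])) {
                double d = a[j] - b[j];
                sum += d * d;
                shared++;
            }
        }
        return shared == 0 ? Double.POSITIVE_INFINITY : Math.sqrt(sum);
    }

    /** Estimates data[row][col] as the plain average over the k nearest rows that have a value at col. */
    static double impute(double[][] data, int row, int col, int k) {
        final double[] target = data[row];

        // Candidate neighbors: every other row in which attribute col is present.
        List<Integer> candidates = new ArrayList<>();
        for (int i = 0; i < data.length; i++) {
            if (i != row && !Double.isNaN(data[i][col])) {
                candidates.add(i);
            }
        }

        // Closest candidates first, compared on the attributes shared with the target row.
        candidates.sort(Comparator.comparingDouble((Integer i) -> distance(target, data[i])));

        int n = Math.min(k, candidates.size());
        if (n <= 0) {
            return Double.NaN; // nothing to average
        }
        double sum = 0.0;
        for (int j = 0; j < n; j++) {
            sum += data[candidates.get(j)][col];
        }
        return sum / n;
    }

    public static void main(String[] args) {
        double[][] data = {
            {1.0, 2.0, Double.NaN}, // instance A: attribute 2 is missing
            {1.1, 2.1, 10.0},
            {0.9, 1.9, 12.0},
            {8.0, 9.0, 100.0},
        };
        // The two nearest rows are rows 1 and 2, so the estimate is (10.0 + 12.0) / 2.
        System.out.println(impute(data, 0, 2, 2)); // prints 11.0
    }
}
```

A weighted variant would replace the plain sum with one in which each neighbor's value is scaled by a similarity weight, for example the inverse of its distance; that choice of weight is an assumption here, not something the class description specifies.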
-
Constructor Summary
Constructor - Description
KNNImputer(DataFrame data, int k, String... columns) - Constructor with Euclidean distance on selected columns.
KNNImputer(DataFrame data, int k, Distance<Tuple> distance) - Constructor.
-
Method Summary
-
Constructor Details
-
KNNImputer
KNNImputer(DataFrame data, int k, Distance<Tuple> distance)
Constructor.
- Parameters:
data - the data frame in which nearest neighbors are searched.
k - the number of nearest neighbors used for imputation.
distance - the distance measure.
-
KNNImputer
KNNImputer(DataFrame data, int k, String... columns)
Constructor with Euclidean distance on selected columns.
- Parameters:
data - the data frame in which nearest neighbors are searched.
k - the number of nearest neighbors used for imputation.
columns - the columns used in Euclidean distance computation. If empty, all columns will be used.
-
-
Method Details