Record Class SumSquaresRatio

java.lang.Object
java.lang.Record
smile.feature.selection.SumSquaresRatio
Record Components:
feature - The feature name.
ratio - The sum squares ratio (BSS/WSS) of the feature.
All Implemented Interfaces:
Comparable<SumSquaresRatio>

public record SumSquaresRatio(String feature, double ratio) extends Record implements Comparable<SumSquaresRatio>
The ratio of between-groups to within-groups sum of squares is a univariate feature ranking metric that can be used as a feature selection criterion for multi-class classification problems. For each variable j, the ratio is

$$\frac{\mathrm{BSS}(j)}{\mathrm{WSS}(j)} = \frac{\sum_i \sum_k I(y_i = k)\,(\bar{x}_{kj} - \bar{x}_{\cdot j})^2}{\sum_i \sum_k I(y_i = k)\,(x_{ij} - \bar{x}_{kj})^2}$$

where I(·) is the indicator function, $\bar{x}_{\cdot j}$ denotes the average of variable j across all samples, $\bar{x}_{kj}$ denotes the average of variable j across samples belonging to class k, and $x_{ij}$ is the value of variable j of sample i. Features with larger ratios have class means that are more widely separated relative to the within-class spread, so they are generally more useful for classification.
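As an illustration of the ratio described above, the following is a minimal sketch in plain Java that computes BSS/WSS for a single variable. It is not the library's implementation: the class name SumSquaresRatioSketch, the method ratio, and the assumption that class labels are integers in the range [0, k) are all hypothetical choices made for this example.

```java
import java.util.Arrays;

public class SumSquaresRatioSketch {
    /**
     * Computes BSS/WSS for one variable (hypothetical helper, not part of Smile).
     *
     * @param x values of the variable, one per sample
     * @param y class labels, assumed to be integers in [0, k)
     * @param k number of classes
     */
    public static double ratio(double[] x, int[] y, int k) {
        int n = x.length;

        // Overall mean of the variable across all samples.
        double overallMean = Arrays.stream(x).average().orElse(Double.NaN);

        // Per-class means of the variable.
        double[] classSum = new double[k];
        int[] classCount = new int[k];
        for (int i = 0; i < n; i++) {
            classSum[y[i]] += x[i];
            classCount[y[i]]++;
        }
        double[] classMean = new double[k];
        for (int c = 0; c < k; c++) {
            classMean[c] = classSum[c] / classCount[c];
        }

        // Each sample contributes only to the term of its own class,
        // which is what the indicator I(y_i = k) expresses.
        double bss = 0.0;
        double wss = 0.0;
        for (int i = 0; i < n; i++) {
            double between = classMean[y[i]] - overallMean;
            double within = x[i] - classMean[y[i]];
            bss += between * between;
            wss += within * within;
        }
        return bss / wss;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 7, 8, 9};
        int[] y = {0, 0, 0, 1, 1, 1};
        // Class means are 2 and 8, overall mean is 5: BSS = 54, WSS = 4.
        System.out.println(ratio(x, y, 2)); // prints 13.5
    }
}
```

In this toy input the two classes are well separated relative to their spread, which is why the ratio is large. A variable whose values do not vary within classes would give a WSS of zero, a case this sketch does not guard against.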

References

  1. S. Dudoit, J. Fridlyand and T. Speed. Comparison of discrimination methods for the classification of tumors using gene expression data. J Am Stat Assoc, 97:77-87, 2002.
  • Constructor Details

    • SumSquaresRatio

      public SumSquaresRatio(String feature, double ratio)
      Creates an instance of a SumSquaresRatio record class.
      Parameters:
      feature - the value for the feature record component
      ratio - the value for the ratio record component
  • Method Details

    • compareTo

      public int compareTo(SumSquaresRatio other)
      Specified by:
      compareTo in interface Comparable<SumSquaresRatio>
    • toString

      public String toString()
      Returns a string representation of this record class. The representation contains the name of the class, followed by the name and value of each of the record components.
      Specified by:
      toString in class Record
      Returns:
      a string representation of this object
    • fit

      public static SumSquaresRatio[] fit(DataFrame data, String clazz)
      Calculates the sum squares ratio of numeric variables.
      Parameters:
      data - the data frame of the explanatory and response variables.
      clazz - the column name of class labels.
      Returns:
      the sum squares ratio of each numeric variable.
    • hashCode

      public final int hashCode()
      Returns a hash code value for this object. The value is derived from the hash code of each of the record components.
      Specified by:
      hashCode in class Record
      Returns:
      a hash code value for this object
    • equals

      public final boolean equals(Object o)
      Indicates whether some other object is "equal to" this one. The objects are equal if the other object is of the same class and if all the record components are equal. Reference components are compared with Objects::equals(Object,Object); primitive components are compared with '=='.
      Specified by:
      equals in class Record
      Parameters:
      o - the object with which to compare
      Returns:
      true if this object is the same as the o argument; false otherwise.
    • feature

      public String feature()
      Returns the value of the feature record component.
      Returns:
      the value of the feature record component
    • ratio

      public double ratio()
      Returns the value of the ratio record component.
      Returns:
      the value of the ratio record component
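
A typical use of the array returned by fit is to rank features by ratio and keep the highest-scoring ones. The sketch below shows that step under stated assumptions: to stay self-contained it uses a hypothetical stand-in record, Score, with the same two components (feature and ratio) instead of depending on the library, and it sorts with an explicit comparator because the ordering defined by compareTo is not described on this page. The names RankFeaturesSketch and topFeatures are likewise illustrative.

```java
import java.util.Arrays;
import java.util.Comparator;

public class RankFeaturesSketch {
    /** Hypothetical stand-in with the same components as SumSquaresRatio. */
    public record Score(String feature, double ratio) {}

    /** Returns the names of the k features with the largest ratios. */
    public static String[] topFeatures(Score[] scores, int k) {
        return Arrays.stream(scores)
                // Sort by ratio, largest first, using the record accessor.
                .sorted(Comparator.comparingDouble(Score::ratio).reversed())
                .limit(k)
                .map(Score::feature)
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        Score[] scores = {
            new Score("a", 0.5),
            new Score("b", 3.2),
            new Score("c", 1.1)
        };
        System.out.println(Arrays.toString(topFeatures(scores, 2))); // prints [b, c]
    }
}
```

With the real record, the same pipeline would presumably apply to the result of fit by swapping Score::ratio and Score::feature for the corresponding SumSquaresRatio accessors.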