Package smile.nlp.normalizer
Class SimpleNormalizer
java.lang.Object
smile.nlp.normalizer.SimpleNormalizer
- All Implemented Interfaces:
Normalizer
A baseline normalizer for processing Unicode text.
- Apply Unicode normalization form NFKC.
- Strip, trim, normalize, and compress whitespace.
- Remove control and formatting characters.
- Normalize dashes and double and single quotes.
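The steps listed above can be sketched with the JDK alone. This is a minimal illustration under stated assumptions, not Smile's actual implementation: the class name `NormalizerSketch`, the order of the steps, the exact character sets, and the choice of replacement characters (a plain `-`, `"`, and `'`) are all hypothetical.

```java
import java.text.Normalizer;

// Hypothetical sketch of the listed steps; NOT the source of SimpleNormalizer.
final class NormalizerSketch {
    private NormalizerSketch() {}

    static String normalize(String text) {
        // 1. Apply Unicode normalization form NFKC.
        String s = Normalizer.normalize(text, Normalizer.Form.NFKC);
        // 2a. Map every whitespace or separator character to a plain space.
        s = s.replaceAll("[\\s\\p{Zs}\\p{Zl}\\p{Zp}]", " ");
        // 3. Remove remaining control (Cc) and formatting (Cf) characters.
        s = s.replaceAll("[\\p{Cc}\\p{Cf}]", "");
        // 2b. Compress runs of spaces and trim both ends.
        s = s.replaceAll(" +", " ").trim();
        // 4. Map dash punctuation to '-' and curly quotes to ASCII quotes
        //    (assumed targets; the character sets here are illustrative).
        s = s.replaceAll("\\p{Pd}", "-");
        s = s.replaceAll("[\u201C\u201D\u201E\u201F]", "\"");
        s = s.replaceAll("[\u2018\u2019\u201A\u201B]", "'");
        return s;
    }

    public static void main(String[] args) {
        String raw = "  \u201CHello\u201D\u00A0\u2014  world\u200B!\t";
        System.out.println(normalize(raw)); // prints: "Hello" - world!
    }
}
```

Whitespace is mapped to spaces before control characters are removed so that tabs and newlines, which are themselves control characters, still act as word separators.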
Method Summary
Modifier and Type | Method | Description
static SimpleNormalizer | getInstance() | Returns the singleton instance.
String | normalize(String text) | Normalize the given string.
Method Details
getInstance
Returns the singleton instance.
- Returns:
- the singleton instance.
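For readers unfamiliar with the pattern, a singleton accessor of this kind is commonly written as below. This is a generic sketch, not Smile's source: the class `LowercaseNormalizer` and its behavior are hypothetical and stand in for any stateless normalizer.

```java
import java.util.Locale;

// Hypothetical class for illustration only; not part of Smile.
final class LowercaseNormalizer {
    // The single shared instance, created once when the class is loaded.
    private static final LowercaseNormalizer INSTANCE = new LowercaseNormalizer();

    // A private constructor prevents callers from creating further instances.
    private LowercaseNormalizer() {}

    // Always returns the same object.
    static LowercaseNormalizer getInstance() {
        return INSTANCE;
    }

    // Stateless, so sharing one instance across callers is safe.
    String normalize(String text) {
        return text.toLowerCase(Locale.ROOT);
    }
}
```

Because the object holds no mutable state, every caller can reuse the instance returned by `getInstance()` instead of allocating a new normalizer.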
normalize
Description copied from interface: Normalizer
Normalize the given string.
- Specified by:
normalize in interface Normalizer
- Parameters:
text - the text.
- Returns:
- the normalized text.