TextNormalizer (Acciente OACC 2.0.0-rc.8 API)

java.lang.Object
- com.acciente.oacc.normalizer.TextNormalizer

Direct Known Subclasses:

ICU4Jv26TextNormalizer, ICU4Jv46TextNormalizer, JDKTextNormalizer
```
public abstract class TextNormalizer
extends Object
```
Normalizes Unicode text to handle characters that have more than one canonically equivalent representation.
This is important when comparing hashed passwords because plaintext that looks identical on screen may be encoded as different character sequences, and would therefore hash differently, without the user being aware. For example, `é` (the letter `e` with an acute accent) may be represented as a single Unicode character (U+00E9) or as a sequence of two characters (U+0065 + U+0301); both representations are canonically equivalent.
This class first tries to use the ICU4J library for normalization because it normalizes character arrays without converting to String. If ICU4J is not available, then it falls back to the text normalizer provided by the JDK, which produces an **intermediate String representation** of the text.
In other words, if you need to prevent a cleanable char[] password from being turned into a temporary String during Unicode character normalization, you need to include a dependency on ICU4J.
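
The canonical-equivalence problem described above can be illustrated with a minimal sketch. It uses only the JDK's `java.text.Normalizer`, not OACC itself; the class name `NfcEquivalenceDemo` and the helper `toNfc` are hypothetical names chosen for this example.

```java
import java.text.Normalizer;

final class NfcEquivalenceDemo {
    /** Normalizes text to NFC using the JDK's java.text.Normalizer. */
    static String toNfc(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String precomposed = "\u00E9";  // é as the single character U+00E9
        String decomposed = "e\u0301";  // U+0065 followed by combining acute U+0301

        // The raw sequences differ, so hashing them would give different results.
        System.out.println(precomposed.equals(decomposed));               // false
        // After NFC normalization both are the same sequence.
        System.out.println(toNfc(precomposed).equals(toNfc(decomposed))); // true
    }
}
```

Because both inputs normalize to the same sequence, normalizing before hashing makes the two spellings of the password compare as equal.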

Constructor Summary

Constructors
Constructor and Description

TextNormalizer()

Method Summary

Methods
Modifier and Type	Method and Description
`static TextNormalizer`	`getInstance()` Get an instance of a text normalizer.
`abstract char[]`	`normalizeToNfc(char[] source)` Returns the canonically equivalent normalized (NFC) version of a Unicode character array.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - TextNormalizer
```
public TextNormalizer()
```
- Method Detail
  - getInstance
```
public static TextNormalizer getInstance()
```
    Get an instance of a text normalizer.
    If the ICU4J library is available, the returned instance uses an ICU4J normalizer, which handles character arrays without converting to String. Otherwise, the returned fallback instance uses the normalizer provided by the JDK, which produces an **intermediate String representation** of the normalized text.
    
    Returns:
    a text normalizer instance
  - normalizeToNfc
```
public abstract char[] normalizeToNfc(char[] source)
```
    Returns the canonically equivalent normalized (NFC) version of a Unicode character array.
    Note: If the ICU4J library for normalization is not available, the fallback Normalizer provided by the JDK will produce an intermediate String representation of the normalized text!
    
    Parameters:
    source - any Unicode text
    
    Returns:
    a character array containing the normalized representation of the source text
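
The fallback behavior described for `getInstance()` and `normalizeToNfc(char[])` might look roughly like the sketch below. This is not OACC's implementation: the class `JdkFallbackSketch` and its methods are hypothetical, and probing for `com.ibm.icu.text.Normalizer2` is only an assumption about how ICU4J's presence could be detected. The sketch shows why the JDK path cannot avoid a String: `java.text.Normalizer.normalize` returns one.

```java
import java.nio.CharBuffer;
import java.text.Normalizer;
import java.util.Arrays;

final class JdkFallbackSketch {
    /** Returns true if the named class can be loaded (hypothetical probe). */
    static boolean isClassPresent(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** NFC-normalizes a char[] via the JDK; the result passes through a String. */
    static char[] normalizeToNfc(char[] source) {
        // The JDK normalizer returns an immutable String, which the caller
        // cannot wipe from memory afterwards.
        String intermediate = Normalizer.normalize(CharBuffer.wrap(source), Normalizer.Form.NFC);
        return intermediate.toCharArray();
    }

    public static void main(String[] args) {
        // Assumption: this ICU4J class name serves only as a presence probe.
        System.out.println("ICU4J present: " + isClassPresent("com.ibm.icu.text.Normalizer2"));

        char[] password = {'e', '\u0301'};
        char[] normalized = normalizeToNfc(password);
        System.out.println(normalized.length); // 1, the two characters compose to U+00E9

        // The arrays can be cleared after use; the intermediate String cannot.
        Arrays.fill(password, '\0');
        Arrays.fill(normalized, '\0');
    }
}
```

With OACC on the classpath, the corresponding call would be `TextNormalizer.getInstance().normalizeToNfc(password)`, and the caller can clear both the source and the returned array once they are no longer needed.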

OACC is a Java Application Security Framework developed by Acciente, LLC., released under Apache License 2.0.
Copyright 2009-2017, Acciente, LLC.