net.sf.saxon.codenorm
Class Normalizer
java.lang.Object
net.sf.saxon.codenorm.Normalizer
public class Normalizer
extends java.lang.Object
Implements Unicode Normalization Forms C, D, KC, KD.
Copyright (c) 1991-2005 Unicode, Inc.
For terms of use, see http://www.unicode.org/terms_of_use.html
For documentation, see UAX#15.
The Unicode Consortium makes no expressed or implied warranty of any
kind, and assumes no liability for errors or omissions.
No liability is assumed for incidental and consequential damages
in connection with or arising out of the use of the information here.
- Mark Davis
Updates for supplementary code points: Vladimir Weinstein & Markus Scherer
Modified to remove dependency on ICU code: Michael Kay
static byte | C - Normalization Form Selector
static byte | D - Normalization Form Selector
static byte | KC - Normalization Form Selector
static byte | KD - Normalization Form Selector
static byte | NO_ACTION - Normalization Form Selector
Normalizer(CharSequence formCS) - Create a normalizer for a given form, expressed as a character string
Normalizer(byte form) - Create a normalizer for a given form.
CharSequence | normalize(CharSequence source) - Normalizes text according to the chosen form
C
public static final byte C
Normalization Form Selector: Form C (canonical decomposition followed by canonical composition)
D
public static final byte D
Normalization Form Selector: Form D (canonical decomposition)
KC
public static final byte KC
Normalization Form Selector: Form KC (compatibility decomposition followed by canonical composition)
KD
public static final byte KD
Normalization Form Selector: Form KD (compatibility decomposition)
NO_ACTION
public static final byte NO_ACTION
Normalization Form Selector
Normalizer
public Normalizer(CharSequence formCS)
throws XPathException
Create a normalizer for a given form, expressed as a character string
formCS
- the normalization form required: for example "NFC" or "NFD"
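Selecting a form by name can be sketched as follows. This is a minimal illustration that uses the JDK's own java.text.Normalizer rather than this class, so that it runs without Saxon on the classpath; the class name FormByName and its normalize helper are hypothetical. One difference to keep in mind: an unrecognized name here raises IllegalArgumentException, whereas the constructor above is declared to throw XPathException.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper, not part of Saxon: resolves a form name such as "NFC"
// to the JDK's java.text.Normalizer.Form and applies it to the text.
final class FormByName {

    static String normalize(CharSequence source, String formName) {
        // Form.valueOf accepts NFC, NFD, NFKC and NFKD and throws
        // IllegalArgumentException for any other name.
        Normalizer.Form form =
                Normalizer.Form.valueOf(formName.trim().toUpperCase(Locale.ROOT));
        return Normalizer.normalize(source, form);
    }

    public static void main(String[] args) {
        // "e" followed by a combining acute accent (two chars).
        String decomposed = "e\u0301";
        String composed = normalize(decomposed, "NFC");
        System.out.println(composed.length()); // prints 1
    }
}
```

Resolving the name once and reusing the result mirrors the design of this class, where the form is fixed at construction and then applied by each call to normalize.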
Normalizer
public Normalizer(byte form)
Create a normalizer for a given form.
form
- the normalization form required: for example C, D
normalize
public CharSequence normalize(CharSequence source)
Normalizes text according to the form chosen when this normalizer was created
source
- the original text, unnormalized
Returns the resulting normalized text
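The effect of the four forms on the same input can be sketched with the JDK's java.text.Normalizer, used here as a stand-in so the example runs without Saxon. The assumption is that both implementations follow UAX#15 and so agree on this input; the class NormalizationForms and its helpers are hypothetical names introduced for illustration.

```java
import java.text.Normalizer;

// Illustration only: uses the JDK normalizer, not net.sf.saxon.codenorm.Normalizer.
final class NormalizationForms {

    static String apply(String text, Normalizer.Form form) {
        return Normalizer.normalize(text, form);
    }

    // Renders a string as space-separated code points, e.g. "U+0065 U+0301".
    static String codePoints(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Precomposed e-acute (U+00E9) followed by the "fi" ligature (U+FB01).
        String input = "\u00E9\uFB01";
        for (Normalizer.Form form : Normalizer.Form.values()) {
            System.out.println(form + ": " + codePoints(apply(input, form)));
        }
        // NFD:  U+0065 U+0301 U+FB01
        // NFC:  U+00E9 U+FB01
        // NFKD: U+0065 U+0301 U+0066 U+0069
        // NFKC: U+00E9 U+0066 U+0069
    }
}
```

The canonical forms (C, D) leave the ligature alone and differ only in whether the accent is composed; the compatibility forms (KC, KD) additionally replace the ligature with the plain letters "f" and "i".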