net.sf.saxon.charcode

Class UTF16


public class UTF16
extends java.lang.Object

A class holding static constants and methods for processing UTF-16 and surrogate pairs.

Field Summary

static int
NONBMP_MAX
static int
NONBMP_MIN
static char
SURROGATE1_MAX
static char
SURROGATE1_MIN
static char
SURROGATE2_MAX
static char
SURROGATE2_MIN

Method Summary

static int
combinePair(char high, char low)
Return the non-BMP character corresponding to a given surrogate pair.
static char
highSurrogate(int ch)
Return the high surrogate of a non-BMP character
static boolean
isHighSurrogate(int ch)
Test whether the given character is a high surrogate
static boolean
isLowSurrogate(int ch)
Test whether the given character is a low surrogate
static boolean
isSurrogate(int c)
Test whether a given character is a surrogate (high or low)
static char
lowSurrogate(int ch)
Return the low surrogate of a non-BMP character

Field Details

NONBMP_MAX

public static final int NONBMP_MAX
Field Value:
1114111 (0x10FFFF)

NONBMP_MIN

public static final int NONBMP_MIN
Field Value:
65536 (0x10000)

SURROGATE1_MAX

public static final char SURROGATE1_MAX
Field Value:
'\udbff'

SURROGATE1_MIN

public static final char SURROGATE1_MIN
Field Value:
'\ud800'

SURROGATE2_MAX

public static final char SURROGATE2_MAX
Field Value:
'\udfff'

SURROGATE2_MIN

public static final char SURROGATE2_MIN
Field Value:
'\udc00'
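
The six field values above line up with the standard UTF-16 boundaries. The following is a minimal sketch, not Saxon source: the class name `Utf16ConstantsSketch` and the method `matchesJdkConstants` are hypothetical, and the constants are copied from the field values listed on this page. It checks them against the corresponding `java.lang.Character` constants and confirms that the two surrogate ranges together address exactly the non-BMP range.

```java
// Illustrative only: values copied from the field listing above, not from Saxon source.
final class Utf16ConstantsSketch {
    static final int NONBMP_MIN = 65536;      // 0x10000
    static final int NONBMP_MAX = 1114111;    // 0x10FFFF
    static final char SURROGATE1_MIN = '\ud800';
    static final char SURROGATE1_MAX = '\udbff';
    static final char SURROGATE2_MIN = '\udc00';
    static final char SURROGATE2_MAX = '\udfff';

    // True if every value agrees with the JDK's own constants.
    static boolean matchesJdkConstants() {
        return NONBMP_MIN == Character.MIN_SUPPLEMENTARY_CODE_POINT
            && NONBMP_MAX == Character.MAX_CODE_POINT
            && SURROGATE1_MIN == Character.MIN_HIGH_SURROGATE
            && SURROGATE1_MAX == Character.MAX_HIGH_SURROGATE
            && SURROGATE2_MIN == Character.MIN_LOW_SURROGATE
            && SURROGATE2_MAX == Character.MAX_LOW_SURROGATE;
    }

    // Number of distinct (high, low) pairs: 1024 * 1024.
    static int pairCount() {
        int highs = SURROGATE1_MAX - SURROGATE1_MIN + 1;
        int lows = SURROGATE2_MAX - SURROGATE2_MIN + 1;
        return highs * lows;
    }

    public static void main(String[] args) {
        System.out.println(matchesJdkConstants());
        // Every non-BMP codepoint has exactly one surrogate pair.
        System.out.println(pairCount() == NONBMP_MAX - NONBMP_MIN + 1);
    }
}
```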

Method Details

combinePair

public static int combinePair(char high,
                              char low)
Return the non-BMP character corresponding to a given surrogate pair.
Parameters:
high - The high surrogate.
low - The low surrogate.
Returns:
the Unicode codepoint represented by the surrogate pair
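
As a rough illustration of what combining a pair involves, the sketch below uses the standard UTF-16 decoding arithmetic. It is an assumption about how such a method can be written, not the Saxon implementation; the class name `CombinePairSketch` is hypothetical. The result is cross-checked against the JDK's `Character.toCodePoint`.

```java
// Sketch of the standard UTF-16 pair decoding; not taken from Saxon source.
final class CombinePairSketch {
    // Each surrogate carries 10 bits; the pair encodes (codepoint - 0x10000).
    static int combinePair(char high, char low) {
        return (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;
    }

    public static void main(String[] args) {
        char high = '\ud83d';
        char low = '\ude00';
        int cp = combinePair(high, low);
        System.out.println(Integer.toHexString(cp));                // prints 1f600
        System.out.println(cp == Character.toCodePoint(high, low)); // prints true
    }
}
```

The sketch assumes the caller passes a valid high and low surrogate; it does no validation of its own.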

highSurrogate

public static char highSurrogate(int ch)
Return the high surrogate of a non-BMP character
Parameters:
ch - The Unicode codepoint of the non-BMP character to be split into a surrogate pair.
Returns:
the first character in the surrogate pair

isHighSurrogate

public static boolean isHighSurrogate(int ch)
Test whether the given character is a high surrogate
Parameters:
ch - The character to test.
Returns:
true if the character is the first character in a surrogate pair

isLowSurrogate

public static boolean isLowSurrogate(int ch)
Test whether the given character is a low surrogate
Parameters:
ch - The character to test.
Returns:
true if the character is the second character in a surrogate pair

isSurrogate

public static boolean isSurrogate(int c)
Test whether a given character is a surrogate (high or low)
Parameters:
c - the character to test
Returns:
true if the character is the high or low half of a surrogate pair
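
The three tests above amount to range checks against the surrogate constants. The following sketch assumes simple inclusive comparisons, which may differ from how Saxon actually implements them; `SurrogateTestSketch` and `countCodePoints` are hypothetical names introduced for illustration. It shows a typical use: walking a string and treating each well-formed pair as one character.

```java
// Sketch of surrogate range tests; the real Saxon code may be written differently.
final class SurrogateTestSketch {
    static boolean isHighSurrogate(int ch) {
        return ch >= 0xD800 && ch <= 0xDBFF;
    }

    static boolean isLowSurrogate(int ch) {
        return ch >= 0xDC00 && ch <= 0xDFFF;
    }

    static boolean isSurrogate(int c) {
        return c >= 0xD800 && c <= 0xDFFF;
    }

    // Counts Unicode codepoints, treating a high+low pair as a single character.
    // An unpaired surrogate is counted as one character on its own.
    static int countCodePoints(CharSequence s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (isHighSurrogate(s.charAt(i))
                    && i + 1 < s.length()
                    && isLowSurrogate(s.charAt(i + 1))) {
                i++; // skip the low half of the pair
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "a\ud83d\ude00b"; // 'a', U+1F600, 'b'
        System.out.println(text.length());         // prints 4 (UTF-16 code units)
        System.out.println(countCodePoints(text)); // prints 3
    }
}
```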

lowSurrogate

public static char lowSurrogate(int ch)
Return the low surrogate of a non-BMP character
Parameters:
ch - The Unicode codepoint of the non-BMP character to be split into a surrogate pair.
Returns:
the second character in the surrogate pair
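
Splitting a non-BMP codepoint with highSurrogate and lowSurrogate is the inverse of combinePair. A minimal sketch of the standard UTF-16 encoding arithmetic follows. It is an assumption about the approach rather than the Saxon source, and `SplitSketch` and `toUtf16` are hypothetical names.

```java
// Sketch of standard UTF-16 encoding of a non-BMP codepoint; not Saxon source.
final class SplitSketch {
    // Top 10 bits of (ch - 0x10000), offset into the high surrogate range.
    static char highSurrogate(int ch) {
        return (char) (((ch - 0x10000) >> 10) + 0xD800);
    }

    // Bottom 10 bits of (ch - 0x10000), offset into the low surrogate range.
    static char lowSurrogate(int ch) {
        return (char) (((ch - 0x10000) & 0x3FF) + 0xDC00);
    }

    // Encodes any codepoint as a Java string, using a pair only when needed.
    static String toUtf16(int ch) {
        if (ch < 0x10000) {
            return String.valueOf((char) ch);
        }
        return new String(new char[] { highSurrogate(ch), lowSurrogate(ch) });
    }

    public static void main(String[] args) {
        int cp = 0x1F600;
        System.out.println(Integer.toHexString(highSurrogate(cp))); // prints d83d
        System.out.println(Integer.toHexString(lowSurrogate(cp)));  // prints de00
        System.out.println(toUtf16(cp).equals(new String(Character.toChars(cp)))); // prints true
    }
}
```

Both methods assume the argument lies between 65536 and 1114111; passing a BMP character to them directly would produce a meaningless result, which is why `toUtf16` checks the range first.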