|
168 | 168 | * <tr><th scope="row" style="vertical-align:top">{@code UTF-16}</th>
|
169 | 169 | * <td>Sixteen-bit UCS Transformation Format,
|
170 | 170 | * byte order identified by an optional byte-order mark</td></tr>
|
| 171 | + * <tr><th scope="row" style="vertical-align:top">{@code UTF-32BE}</th> |
| 172 | + * <td>Thirty-two-bit UCS Transformation Format, |
| 173 | + * big-endian byte order</td></tr> |
| 174 | + * <tr><th scope="row" style="vertical-align:top">{@code UTF-32LE}</th> |
| 175 | + * <td>Thirty-two-bit UCS Transformation Format, |
| 176 | + * little-endian byte order</td></tr> |
| 177 | + * <tr><th scope="row" style="vertical-align:top">{@code UTF-32}</th> |
| 178 | + * <td>Thirty-two-bit UCS Transformation Format, |
| 179 | + * byte order identified by an optional byte-order mark</td></tr> |
171 | 180 | * </tbody>
|
172 | 181 | * </table></blockquote>
|
173 | 182 | *
|
174 | 183 | * <p> The {@code UTF-8} charset is specified by <a
|
175 | 184 | * href="http://www.ietf.org/rfc/rfc2279.txt"><i>RFC 2279</i></a>; the
|
176 | 185 | * transformation format upon which it is based is specified in
|
177 |
| - * Amendment 2 of ISO 10646-1 and is also described in the <a |
| 186 | + * ISO 10646-1 and is also described in the <a |
178 | 187 | * href="http://www.unicode.org/standard/standard.html"><i>Unicode
|
179 | 188 | * Standard</i></a>.
|
180 | 189 | *
|
181 | 190 | * <p> The {@code UTF-16} charsets are specified by <a
|
182 | 191 | * href="http://www.ietf.org/rfc/rfc2781.txt"><i>RFC 2781</i></a>; the
|
183 | 192 | * transformation formats upon which they are based are specified in
|
184 |
| - * Amendment 1 of ISO 10646-1 and are also described in the <a |
| 193 | + * ISO 10646-1 and are also described in the <a |
| 194 | + * href="http://www.unicode.org/standard/standard.html"><i>Unicode |
| 195 | + * Standard</i></a>. |
| 196 | + * |
| 197 | + * <p> The {@code UTF-32} charsets are based upon transformation formats |
| 198 | + * which are specified in |
| 199 | + * ISO 10646-1 and are also described in the <a |
185 | 200 | * href="http://www.unicode.org/standard/standard.html"><i>Unicode
|
186 | 201 | * Standard</i></a>.
|
187 | 202 | *
|
188 |
| - * <p> The {@code UTF-16} charsets use sixteen-bit quantities and are |
| 203 | + * <p> The {@code UTF-16} and {@code UTF-32} charsets use sixteen-bit and thirty-two-bit |
| 204 | + * quantities respectively, and are |
189 | 205 | * therefore sensitive to byte order. In these encodings the byte order of a
|
190 | 206 | * stream may be indicated by an initial <i>byte-order mark</i> represented by
|
191 |
| - * the Unicode character <code>'\uFEFF'</code>. Byte-order marks are handled |
| 207 | + * the Unicode character {@code U+FEFF}. Byte-order marks are handled |
192 | 208 | * as follows:
|
193 | 209 | *
|
194 | 210 | * <ul>
|
195 | 211 | *
|
196 |
| - * <li><p> When decoding, the {@code UTF-16BE} and {@code UTF-16LE} |
| 212 | + * <li><p> When decoding, the {@code UTF-16BE}, {@code UTF-16LE}, |
| 213 | + * {@code UTF-32BE}, and {@code UTF-32LE} |
197 | 214 | * charsets interpret the initial byte-order marks as a <small>ZERO-WIDTH
|
198 | 215 | * NON-BREAKING SPACE</small>; when encoding, they do not write
|
199 | 216 | * byte-order marks. </p></li>
|
200 | 217 | *
|
201 |
| - * <li><p> When decoding, the {@code UTF-16} charset interprets the |
| 218 | + * <li><p> When decoding, the {@code UTF-16} and {@code UTF-32} charsets interpret the |
202 | 219 | * byte-order mark at the beginning of the input stream to indicate the
|
203 | 220 | * byte-order of the stream but defaults to big-endian if there is no
|
204 | 221 | * byte-order mark; when encoding, it uses big-endian byte order and writes
|
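Reviewer note: the byte-order-mark rules described in the Javadoc list above can be checked with a small program. This is a minimal sketch and not part of the change itself. The class name `BomDemo` and its helper methods are hypothetical, and it exercises only the UTF-16 family through `java.nio.charset.StandardCharsets`, on the assumption that the UTF-32 charsets follow the rules as the diff words them.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BomDemo {
    // U+FEFF (byte-order mark) followed by 'A', both in big-endian UTF-16.
    static final byte[] INPUT = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x41};

    // UTF-16 uses the leading byte-order mark to pick the byte order
    // and does not return it as part of the decoded text.
    static String decodeUtf16() {
        return new String(INPUT, StandardCharsets.UTF_16);
    }

    // UTF-16BE has a fixed byte order, so the leading U+FEFF is decoded
    // as an ordinary character (ZERO-WIDTH NON-BREAKING SPACE).
    static String decodeUtf16BE() {
        return new String(INPUT, StandardCharsets.UTF_16BE);
    }

    // UTF-16 encodes in big-endian order and writes a byte-order mark first.
    static byte[] encodeUtf16(String s) {
        return s.getBytes(StandardCharsets.UTF_16);
    }

    // UTF-16BE encodes in big-endian order and writes no byte-order mark.
    static byte[] encodeUtf16BE(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        System.out.println("UTF-16 decoded length:   " + decodeUtf16().length());
        System.out.println("UTF-16BE decoded length: " + decodeUtf16BE().length());
        System.out.println("UTF-16 encoded bytes:    " + Arrays.toString(encodeUtf16("A")));
        System.out.println("UTF-16BE encoded bytes:  " + Arrays.toString(encodeUtf16BE("A")));
    }
}
```

Decoding the same four bytes yields one character with `UTF-16` and two with `UTF-16BE`. Encoding `"A"` yields four bytes with `UTF-16` (mark plus character) and two with `UTF-16BE`.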
|