Wireshark · Wireshark-dev: [Wireshark-dev] UCS-2 vs. UTF-16?

From: Guy Harris <guy@xxxxxxxxxxxx>

Date: Fri, 13 Dec 2013 23:49:31 -0800

Currently, we have separate encoding values for UCS-2 (Unicode code points between 0 and 65535, represented as 2-byte sequences) and UTF-16 (all of Unicode, with code points > 65535 represented as "surrogate pairs").
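To make the distinction concrete, here is a minimal Python sketch of how a single code point maps to 16-bit code units in UTF-16. It is an illustration only, not Wireshark code; the helper name `utf16_code_units` is hypothetical. For code points up to 65535 the result is the same single unit that UCS-2 would use; above that, UTF-16 emits a surrogate pair, which UCS-2 has no way to express.

```python
def utf16_code_units(code_point):
    """Return the 16-bit code units that encode one code point in UTF-16.

    Hypothetical helper for illustration; not part of any Wireshark API.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if 0xD800 <= code_point <= 0xDFFF:
        # Surrogate values are reserved and never stand for a character.
        raise ValueError("surrogate code points cannot be encoded")
    if code_point <= 0xFFFF:
        # Basic Multilingual Plane: one unit, identical to UCS-2.
        return [code_point]
    # Supplementary planes: split the 20-bit offset into two 10-bit halves.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # high (leading) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # low (trailing) surrogate
    return [high, low]
```

For example, `utf16_code_units(0x41)` yields one unit, while `utf16_code_units(0x1F600)` yields two units, both inside the reserved surrogate range.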

Is there any reason to support UCS-2 (in which, presumably, code points in the range 0xD800-0xDFFF would be treated as errors, as those code points are reserved as surrogates), or should we just support UTF-16?
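The practical difference between the two choices can be sketched as follows. This is a rough Python illustration under the assumption stated in the question, namely that a strict UCS-2 decoder rejects any 16-bit unit in 0xD800-0xDFFF; the function names `decode_ucs2_strict` and `decode_utf16` are hypothetical and do not correspond to Wireshark's encoding values or routines.

```python
import struct

def decode_ucs2_strict(data, little_endian=True):
    """Decode bytes as UCS-2, treating surrogate values as errors (assumed behavior)."""
    if len(data) % 2:
        raise ValueError("odd number of bytes")
    fmt = ("<" if little_endian else ">") + "%dH" % (len(data) // 2)
    units = struct.unpack(fmt, data)
    for unit in units:
        if 0xD800 <= unit <= 0xDFFF:
            raise ValueError("surrogate 0x%04X is not valid in UCS-2" % unit)
    # Every remaining unit is a BMP code point in its own right.
    return "".join(chr(unit) for unit in units)

def decode_utf16(data, little_endian=True):
    """Decode bytes as UTF-16, combining surrogate pairs via Python's built-in codec."""
    return data.decode("utf-16-le" if little_endian else "utf-16-be")
```

For a string confined to the Basic Multilingual Plane both functions return the same result. They differ only on input containing surrogates: `decode_utf16` combines a valid pair into one character, while `decode_ucs2_strict` raises an error.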

The Microsoft [MS-RPCE] (Remote Procedure Call Protocol Extensions) specification talks about "Unicode" strings, without indicating whether that's full Unicode, encoded as UTF-16, or only the Unicode Basic Multilingual Plane, encoded as UCS-2.  I haven't checked what, for example, the SMB or SMB2 specification says.
