Ethereal-dev: Re: [Ethereal-dev] While we're on the subject of new frametypes...

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: Guy Harris <guy@xxxxxxxxxx>
Date: Fri, 13 Dec 2002 11:50:21 -0800
On Fri, Dec 13, 2002 at 10:33:17AM -0500, Devin Heitmueller wrote:
> Since there are numerous variants of extended
> ASCII, it should be left to the dissector to decide which character set
> they are using.  This means that we should add types like
> FT_STRING_ISO8859-1 and FT_STRING_UCS2_LE.

That's precisely the idea that I said might not work.

What if a string's character set can only be determined at run time?  As
I wrote earlier:

	   Making the character set a property of the field might not
	   work - for example, that wouldn't work for OEM character sets
	   in SMB, as that'd have to be something set by an SMB
	   preference item at run time.  It might work for the Mac
	   character set in Appletalk, however.

The alternative I suggested, which also lets the dissector decide what
character set is being used, was

		perhaps have the byte-order argument to
		"proto_tree_add_item()" specify, for FT_STRING types,
		the character set and, in cases where a multi-byte
		character type can come in either byte order, the byte
		order;

		add a character set+byte order argument to
		"proto_tree_add_string()"?

Then "proto_tree_add_item()" would perform conversion as necessary:

	If the type (supplied in the field that's currently used only
	for byte order) is CHARSET_ISO8859-1, it would convert to UTF-8.

	If it was CHARSET_UCS2|LITTLE_ENDIAN, it would convert to UTF-8.

	If it was CHARSET_UCS2|BIG_ENDIAN, it would convert to UTF-8.

	If it was CHARSET_ASCII, it would convert to UTF-8.

and so on.
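
To make the cases above concrete, here is a rough sketch of what the conversion step might look like. It is not Ethereal code: the flag names and values (ENC_LITTLE_ENDIAN, ENC_BIG_ENDIAN, CHARSET_MASK, CHARSET_ASCII, CHARSET_ISO8859_1, CHARSET_UCS2) and the helper functions put_utf8() and string_to_utf8() are all made up for illustration. The byte-order macros carry an ENC_ prefix only to avoid clashing with names some system headers define. The sketch also assumes that non-ASCII bytes in a CHARSET_ASCII string, and strings with an unknown character set, are shown as U+FFFD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag layout: one byte-order bit plus a character set field.
   These are illustrative values, not Ethereal's definitions. */
#define ENC_BIG_ENDIAN      0x0000u
#define ENC_LITTLE_ENDIAN   0x0001u
#define CHARSET_MASK        0x00F0u
#define CHARSET_ASCII       0x0010u
#define CHARSET_ISO8859_1   0x0020u
#define CHARSET_UCS2        0x0030u

/* Write code point cp (at most 0xFFFF) as UTF-8; return bytes written. */
static size_t put_utf8(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}

/* Convert len bytes at src to NUL-terminated UTF-8 in out, using the
   character set and byte order given in encoding.  out must hold at
   least 3*len + 1 bytes.  Returns the UTF-8 length, excluding the NUL. */
size_t string_to_utf8(const unsigned char *src, size_t len,
                      unsigned encoding, unsigned char *out)
{
    size_t i, n = 0;

    assert(src != NULL && out != NULL);

    switch (encoding & CHARSET_MASK) {
    case CHARSET_UCS2:
        /* Two bytes per character; a trailing odd byte is ignored. */
        for (i = 0; i + 1 < len; i += 2) {
            uint32_t cp = (encoding & ENC_LITTLE_ENDIAN)
                ? ((uint32_t)src[i] | ((uint32_t)src[i + 1] << 8))
                : (((uint32_t)src[i] << 8) | (uint32_t)src[i + 1]);
            n += put_utf8(cp, out + n);
        }
        break;
    case CHARSET_ISO8859_1:
        /* ISO 8859-1 bytes map directly to U+0000..U+00FF. */
        for (i = 0; i < len; i++)
            n += put_utf8(src[i], out + n);
        break;
    case CHARSET_ASCII:
    default:
        /* Assumption: anything outside 7-bit ASCII becomes U+FFFD. */
        for (i = 0; i < len; i++)
            n += put_utf8(src[i] < 0x80 ? src[i] : 0xFFFDu, out + n);
        break;
    }
    out[n] = '\0';
    return n;
}
```

With something like this, a dissector that only learns the character set at run time (say, from an SMB preference) would just pass a different encoding value in the call, and the field definition itself would not need to change.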