Ethereal-dev: Re: [Ethereal-dev] Ethereal and internationalization

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: Guy Harris <gharris@xxxxxxxxx>
Date: Sun, 21 Mar 2004 18:22:48 -0800
On Sun, Mar 21, 2004 at 06:12:33PM -0800, Richard Sharpe wrote:
> It seems that there are going to be a number of cases where Ethereal does 
> not handle internationalization. For example, if someone has an SMB 
> capture that contains file names that are in a non-ASCII character set, 
> they might have difficulty entering text strings to perform functions 
> like:
> 
>     smb.file contains "some non-ASCII string"

There are two issues here.

The first issue is character sets in fields - not all string fields are
in the same character set, and even if they are, they might use
different encodings (UTF-8 vs.  some 2-byte encoding, for example).

The second issue is user input - is the value of a text entry field ISO
8859-x, or UTF-8, or....?

The first issue causes problems even when you *aren't* filtering fields
- we need to somehow handle it.  I suspect the right answer is to have
part of the value of a string field be the character set and encoding of
the field; we could canonicalize into some standard encoding, e.g.
UTF-8, but if we're just building a protocol tree to do filtering, and
the filter doesn't involve a particular field, canonicalizing the
field's value is a waste of time (then again, so is storing the value at
all...).
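
As a rough illustration of that idea (not Ethereal's actual field
representation), here is a minimal Python sketch. The class name
StringField, its attributes, and the conversions counter are all
hypothetical; the point is only that the raw bytes travel together with
their encoding, and that conversion to UTF-8 happens the first time a
filter actually looks at the field.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StringField:
    """Hypothetical string field value: raw bytes plus their encoding."""

    raw: bytes
    encoding: str  # a Python codec name, e.g. "utf-8", "utf-16-le", "latin-1"
    _utf8: Optional[bytes] = field(default=None, repr=False, compare=False)
    conversions: int = field(default=0, repr=False, compare=False)

    def as_utf8(self) -> bytes:
        """Canonicalize to UTF-8 on first use and cache the result."""
        if self._utf8 is None:
            text = self.raw.decode(self.encoding, errors="replace")
            self._utf8 = text.encode("utf-8")
            self.conversions += 1
        return self._utf8

    def contains(self, needle_utf8: bytes) -> bool:
        """Substring test in the canonical encoding.

        A byte-wise search is safe here because both sides are valid
        UTF-8, and UTF-8 is self-synchronizing.
        """
        return needle_utf8 in self.as_utf8()


# A file name as it might appear on the wire in a 2-byte encoding.
name = StringField("caf\u00e9.txt".encode("utf-16-le"), "utf-16-le")
```

Nothing is converted when the field is created; the cost is paid only
if a filter calls contains() on it, and only once per field.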

The second issue might be soluble fairly straightforwardly if the text
entry field is UTF-8 (which it might be in GTK2) *and* if the GUI
includes input methods to let you enter arbitrary characters (e.g., the
Character Palette in Mac OS X).
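
To make the user-input side concrete, here is a small Python sketch,
assuming the field value has already been put into UTF-8 and that we
know what encoding the text entry widget hands back. The function names
and the widget_encoding parameter are hypothetical, not part of any GUI
toolkit's API.

```python
def filter_text_to_utf8(entered: bytes, widget_encoding: str) -> bytes:
    """Convert text from the entry field into UTF-8.

    widget_encoding is an assumed, known codec name for the widget,
    e.g. "latin-1" for ISO 8859-1 or "utf-8".
    """
    return entered.decode(widget_encoding).encode("utf-8")


def field_contains(field_utf8: bytes, entered: bytes, widget_encoding: str) -> bool:
    """'contains' test once both sides are in the same encoding."""
    return filter_text_to_utf8(entered, widget_encoding) in field_utf8


# The same visible text, as a UTF-8 field value and as ISO 8859-1 input.
field_value = "r\u00e9sum\u00e9.doc".encode("utf-8")
typed_latin1 = "r\u00e9sum\u00e9".encode("latin-1")
```

Comparing the ISO 8859-1 bytes directly against the UTF-8 field would
not match, even though the user typed the right characters; converting
the input first makes the comparison behave as expected.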