Wireshark-dev: Re: [Wireshark-dev] Replacing g_iconv and different codesets

From: Guy Harris <guy@xxxxxxxxxxxx>
Date: Fri, 20 Dec 2013 11:15:25 -0800
On Dec 20, 2013, at 10:46 AM, Michael Lum <michael.lum@xxxxxxxxxxxxxxxxx> wrote:

> Is there a goal to remove g_iconv calls from Wireshark?

I'd certainly like to do so to the maximum extent possible.  I would like to have *all* code set handling done by using ENC_ arguments to proto_tree_add_item() or tvb_get_string_enc().

The code in dissectors would be much simpler, it wouldn't depend on particular g_iconv() implementations handling particular character sets, and it would allow us to handle invalid strings as we choose.

> I checked charsets.c/.h and there are two encodings that are not available that are used in the ANSI SMS dissector.
>  
> iso-8859-8 (Latin/Hebrew) is pretty easy to add; I believe I can follow the pattern of the code that is there now.

Yes, the ISO 8859-x character sets and encodings are fairly straightforward.
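To illustrate why the single-byte case is straightforward, here is a minimal sketch of an ISO 8859-8 to UTF-8 conversion. It is not the code in charsets.c; the function names (iso8859_8_to_ucs, ucs_to_utf8, iso8859_8_to_utf8) are hypothetical, and a real implementation would more likely use a 128-entry lookup table for the upper half than a chain of comparisons. Undefined bytes are mapped to U+FFFD here, which is one possible way of handling invalid input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UNREPL 0xFFFDu  /* Unicode REPLACEMENT CHARACTER, used for undefined bytes */

/* Hypothetical helper: map one ISO 8859-8 byte to a Unicode code point. */
static uint32_t iso8859_8_to_ucs(uint8_t b)
{
    if (b < 0xA0) return b;        /* ASCII and C1 controls map to themselves */
    if (b == 0xA1) return UNREPL;  /* unassigned in ISO 8859-8 */
    if (b == 0xAA) return 0x00D7;  /* MULTIPLICATION SIGN */
    if (b == 0xBA) return 0x00F7;  /* DIVISION SIGN */
    if (b <= 0xBE) return b;       /* 0xA0..0xBE otherwise match ISO 8859-1 */
    if (b == 0xDF) return 0x2017;  /* DOUBLE LOW LINE */
    if (b >= 0xE0 && b <= 0xFA)    /* Hebrew letters alef..tav */
        return 0x05D0u + (uint32_t)(b - 0xE0);
    if (b == 0xFD) return 0x200E;  /* LEFT-TO-RIGHT MARK */
    if (b == 0xFE) return 0x200F;  /* RIGHT-TO-LEFT MARK */
    return UNREPL;                 /* everything else is unassigned */
}

/* Encode one BMP code point as UTF-8; returns the number of bytes written. */
static size_t ucs_to_utf8(uint32_t cp, uint8_t *out)
{
    if (cp < 0x80) {
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (uint8_t)(0xE0 | (cp >> 12));
    out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (uint8_t)(0x80 | (cp & 0x3F));
    return 3;
}

/* Convert len bytes of ISO 8859-8 into a NUL-terminated UTF-8 string.
 * The caller must supply at least 3 * len + 1 bytes of output space.
 * Returns the length of the UTF-8 string, not counting the NUL. */
static size_t iso8859_8_to_utf8(const uint8_t *in, size_t len,
                                uint8_t *out, size_t out_size)
{
    size_t i, n = 0;

    assert(in != NULL && out != NULL);
    assert(out_size >= 3 * len + 1);

    for (i = 0; i < len; i++)
        n += ucs_to_utf8(iso8859_8_to_ucs(in[i]), out + n);
    out[n] = '\0';
    return n;
}
```

Because every input byte is one character, the whole conversion is a per-byte mapping followed by UTF-8 encoding, and the converter itself decides what an undefined byte turns into.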

> The other is EUC-KR (Korean).  I tried to find a code page that looks like the ISO ones, but I'm not sure how these
> conversions are supposed to work.

DBCS encodings, such as the EUC encodings, will be more work, but we should do them eventually as well.
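As a rough sketch of where the extra work comes from: in EUC-KR, a byte below 0x80 is ASCII, while a byte in the range 0xA1..0xFE is the first of a two-byte sequence whose second byte is also in 0xA1..0xFE. The two bytes select a row and a cell in the 94x94 KS X 1001 grid, so the decoder needs both a framing step and a mapping table of up to 8836 entries. The code below shows only the framing step; euc_kr_next and euc_kind are hypothetical names, and the KS X 1001 to Unicode table is assumed to exist elsewhere and is not reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { EUC_ASCII, EUC_DOUBLE, EUC_INVALID } euc_kind;

/* Hypothetical helper: examine the next character in an EUC-KR byte stream.
 *
 * Returns the number of bytes consumed (0 only when len is 0).
 *   EUC_ASCII:   *value is the ASCII code.
 *   EUC_DOUBLE:  *value is an index 0..8835 into a 94x94 KS X 1001 table,
 *                computed as row * 94 + cell.
 *   EUC_INVALID: *value is the offending byte; one byte is skipped so the
 *                caller can emit a replacement character and resynchronize. */
static size_t euc_kr_next(const uint8_t *p, size_t len,
                          euc_kind *kind, uint32_t *value)
{
    assert(p != NULL && kind != NULL && value != NULL);

    if (len == 0) {
        *kind = EUC_INVALID;
        *value = 0;
        return 0;
    }
    if (p[0] < 0x80) {
        *kind = EUC_ASCII;
        *value = p[0];
        return 1;
    }
    if (p[0] >= 0xA1 && p[0] <= 0xFE &&
        len >= 2 && p[1] >= 0xA1 && p[1] <= 0xFE) {
        *kind = EUC_DOUBLE;
        *value = (uint32_t)(p[0] - 0xA1) * 94 + (uint32_t)(p[1] - 0xA1);
        return 2;
    }
    /* Lead byte out of range, bad trail byte, or truncated sequence. */
    *kind = EUC_INVALID;
    *value = p[0];
    return 1;
}
```

A caller would loop over the buffer, look up each EUC_DOUBLE index in the mapping table, and substitute a replacement character for EUC_INVALID results; truncated and malformed sequences are cases a single-byte character set never has to deal with.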