Huge thanks to our Platinum Members Endace and LiveAction,
and our Silver Member Veeam, for supporting the Wireshark Foundation and project.

Wireshark-dev: Re: [Wireshark-dev] RFD: The Future of Memory Management in Wireshark

From: Evan Huus <eapache@xxxxxxxxx>
Date: Sat, 3 Nov 2012 11:55:21 -0400
On Sun, Oct 28, 2012 at 10:12 AM, Evan Huus <eapache@xxxxxxxxx> wrote:
> On Sat, Oct 27, 2012 at 3:09 PM, Evan Huus <eapache@xxxxxxxxx> wrote:
>> On Sat, Oct 27, 2012 at 2:17 PM, Dirk Jagdmann <doj@xxxxxxxxx> wrote:
>>>> General thoughts from the list on whether or not this would be a good idea?
>>>
>>> some general comments on the whole wmem idea:
>>> memory allocation is done almost everywhere in Wireshark, also in somewhat
>>> hidden places. For example all the proto_tree_add_* functions will need to
>>> allocate memory. It will be a real pain to pass down the current allocator
>>> object into each of those function invocations, you're looking at changing a
>>> considerable amount of code.
>>
>> Yes, it wouldn't be a small job.
>>
>>> Also when thinking about writing dissector code,
>>> it's probably not useful to mix different allocators while dissecting a protocol.
>>
>> Not sure what you mean here. Mixing ep_ and se_ memory? Or mixing
>> g_alloc and mmap memory? The separation of the allocators (glib, mmap,
>> etc.) from the front-end isn't driven by a desire to mix and match
>> allocators during a specific dissection (I agree that would be odd).
>> It was more driven by the fact that the current allocators are all
>> jumbled together and thus hard to maintain. It also makes it easy to
>> add new scopes where necessary without writing hundreds of new wrapper
>> functions.
>>
>>> An alternative idea (although a little more ugly) is to use a thread local
>>> global variable which contains pointers to the current ep and se allocators. The
>>> se_* and ep_* functions retrieve the allocator object from that thread local
>>> global variable and do their allocations. If a function wants to change the
>>> allocators for its own function calls, it can make a copy of the global variable
>>> pointers to local variables, set it's desired new allocators and restore the
>>> global pointers at the end, like:
>>>
>>> __thread void* se_allocator;
>>> __thread void* ep_allocator;
>>>
>>> void my_special_func()
>>> {
>>>   void* old_se = se_allocator;
>>>   void* old_ep = ep_allocator;
>>>
>>>   se_allocator = create_special_se();
>>>   ep_allocator = create_special_ep();
>>>   dissect_packet();
>>>   garbage_collect(se_allocator);
>>>   garbage_collect(ep_allocator);
>>>
>>>   se_allocator = old_se;
>>>   pe_allocator = old_ep;
>>> }
>>
>> This would certainly require fewer changes to existing code. It is,
>> however, complicated by the lack of nice cross-platform API for
>> thread-local storage. I just took a quick look at glib's gestures in
>> this direction and they seem rather limited (a max of 128 thread-local
>> objects, and they can't be destroyed).
>>
>> One of the things I was hoping to do by removing the allocators from
>> the global scope was make it very obvious what the exact scopes of the
>> various allocators are. Right now ep_ and se_ memory is used in a lot
>> of places that aren't correct because the code has simply been able to
>> grab the global. We really don't want anyone using ep_ memory unless
>> there is actually a packet being dissected, and scoping the ep_ pool
>> would enforce that nicely.
>
> We might be able to fake the proper scoping using thread-local globals
> if we wrap everything in functions that assert the state of a
> dissection. Something like:
>
> __thread wmem_allocator_t *packet_scope;
> __thread gboolean packet_in_scope = FALSE;
>
> wmem_allocator_t *
> wmem_packet_scope(void)
> {
>   g_assert(packet_in_scope);
>   return packet_scope;
> }
>
> void
> wmem_start_packet_scope(void)
> {
>   g_assert(!packet_in_scope);
>   packet_in_scope = TRUE;
> }
>
> void
> wmem_stop_packet_scope(void)
> {
>   g_assert(packet_in_scope);
>   packet_in_scope = FALSE;
>   wmem_free_all(packet_scope);
> }
>
> Invalid accesses would still compile, but at least they would throw an
> assertion as soon as the code path was hit. Thoughts on something like
> this?

I've committed the base for this in revision 45879. Setting up the
packet-level scopes seems to be fine, but as I mentioned in the commit
message, I can't figure out where to put the calls to
wmem_enter_file_scope() and wmem_leave_file_scope(). My first attempt
put them in init_dissection() and cleanup_dissection() in packet.c but
that caused a lot of assertion errors.

Does anyone know the correct place for these calls?

Thanks,
Evan