Wireshark-dev: Re: [Wireshark-dev] Wireshark memory handling
From: Guy Harris <[email protected]>
Date: Wed, 14 Oct 2009 12:32:24 -0700
On Oct 13, 2009, at 11:00 AM, Erlend Hamberg wrote:

On Saturday 10. October 2009 03.48.29 Guy Harris wrote:
The data Wireshark currently keeps in its address space that could
grow in size as the capture file grows are:

the frame_data structure (epan/frame_data.h) - one structure instance
per packet;
Ok, so – if my understanding is correct – for every packet that is read, a
frame_data structure is created

	the text for some or all of the columns in all of the rows of the
packet list (all, in current releases of Wireshark; some, in the
development branch);
Ok, not much to save here after the introduction of the new packet list, I
There might be more we can save if we have efficient random access to  
packets (even in compressed files), as we can just re-dissect the  
packet whenever we need the columns for it.
That could make sorting painful, however.
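
To make the trade-off concrete, here is a minimal sketch of "re-dissect whenever the columns are needed", assuming the capture can be read by (offset, length) at random. Everything in it is hypothetical and is not Wireshark code: the Capture class, the toy "dissector", and the small LRU cache standing in for whatever the packet list would keep around. The dissect_count counter is only there to show the cost.

```python
import io
from functools import lru_cache


class Capture:
    """Hypothetical capture with random access; not Wireshark's real API."""

    def __init__(self, raw, index, cache_size=2):
        self.file = io.BytesIO(raw)   # stands in for the capture file on disk
        self.index = index            # per-frame (offset, length), kept in memory
        self.dissect_count = 0        # how many times we had to (re-)dissect
        # Only a bounded number of column tuples are kept; the rest are rebuilt.
        self._cached = lru_cache(maxsize=cache_size)(self._dissect_columns)

    def read_frame(self, n):
        offset, length = self.index[n]
        self.file.seek(offset)
        return self.file.read(length)

    def _dissect_columns(self, n):
        # Toy "dissection": frame number, length, first bytes in hex.
        self.dissect_count += 1
        data = self.read_frame(n)
        return (str(n), str(len(data)), data[:4].hex())

    def columns(self, n):
        """Column text for frame n, re-dissecting if it is not cached."""
        return self._cached(n)


def sort_by_length(cap):
    # Sorting on a column needs that column for *every* frame, so each sort
    # re-dissects whatever the cache could not hold.
    return sorted(range(len(cap.index)), key=lambda n: int(cap.columns(n)[1]))
```

With a cache smaller than the number of frames, every sort ends up re-dissecting most of the file, which is the "painful" part.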

The data from the frames in the capture file are not kept in
Wireshark's address space - they are read in as necessary, into a
small number of buffers (one for the main window, and one for each
packet window opened). *HOWEVER*, if data from a frame is reassembled
into a higher-level multiple-frame packet, the result of the
reassembly is, as noted, kept in Wireshark's address space.
So, when Wireshark reads the capture file, if it finds a single-frame packet,
it will only create a frame_data structure in memory and possibly data from
the dissector for that type of packet. But if the packet is made up of several
frames, the packet is reassembled and kept in memory?

If so, do you think this could be changed?
We probably need to keep the packet data in memory while it's being  
reassembled and when it's dissected.
Again, with efficient random access, we could free it when we're done  
with it, and leave behind an array of frame numbers, starting offsets,  
and lengths, so that on the next reference the frames can be read, the  
data reassembled, and the data kept around, again, only while it's  
Would it be worth it?
Probably.  It would also mean that TShark would accumulate a lot less  
memory, and perhaps be able to run much longer, when dissecting  
packets (rather than just writing them to a file).
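
The "array of frame numbers, starting offsets, and lengths" idea above could be sketched roughly as follows. This is an illustration under assumptions, not how Wireshark's reassembly code works: ReassemblyRef, reassemble, and the read_frame callback are made-up names, and the starting offset is assumed to be the offset of the fragment's payload within its frame.

```python
class ReassemblyRef:
    """What would stay in memory after reassembly: no payload bytes,
    only (frame number, starting offset, length) for each fragment."""

    def __init__(self, fragments):
        self.fragments = list(fragments)


def reassemble(ref, read_frame):
    """Rebuild the reassembled data on demand.

    read_frame(frame_number) is assumed to return that frame's raw bytes
    using random access into the capture file.
    """
    parts = []
    for frame_number, start, length in ref.fragments:
        frame = read_frame(frame_number)
        parts.append(frame[start:start + length])
    # The caller holds the result only while dissecting, then drops it.
    return b"".join(parts)


# Tiny stand-in for the capture file: two frames with 4-byte headers.
frames = {1: b"HDR1hello ", 2: b"HDR2world"}
ref = ReassemblyRef([(1, 4, 6), (2, 4, 5)])
data = reassemble(ref, frames.__getitem__)
```

The per-packet cost that remains is three integers per fragment instead of a copy of the payload; the price is re-reading the frames on every reference.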