Wireshark-users: Re: [Wireshark-users] Filtering a very large capture file
From: Jeff Morriss <[email protected]>
Date: Thu, 01 Feb 2007 20:37:09 +0800

Stuart MacDonald wrote:
From: On Behalf Of Guy Harris
On Jan 25, 2007, at 8:23 PM, Stuart MacDonald wrote:
That can't do arbitrary display filtering, but truly *arbitrary* display filtering has problems with reassembly (i.e., a filter that matches something in the reassembled portion of the packet can't match anything but the last packet). It also can't handle non-libpcap
Fair enough. What exactly constitutes the reassembled portion? I'm
thinking it's things like the TCP analysis; "Zero Window" status etc.
<mulls> I guess it's anything that can't be expressed as a capture
filter.

Interesting. I've located
http://wiki.wireshark.org/TCP_Reassembly
and those options are off (by default) for my Wireshark. Are they not
also off (by default) for tshark?
The defaults should be the same in both programs.

tshark is almost the right thing, except that tshark also tries to
read in the whole capture first instead of processing it
like editcap.

No, actually, it *does* process it like editcap; neither it nor Wireshark reads the entire capture file into memory. They *do* keep reassembled data in memory, but that's another matter.
Let me rephrase that then. tshark also bails with the out of memory
crash, just like Wireshark. editcap does not. I assumed that was due
to the method of processing the file, but I see now that it's due to
reassembly, and this is perhaps why editcap doesn't filter on anything
but frame numbers and time; it avoids reassembly by doing so.

Hm, the research on TCP Reassembly from above makes me think the
crashes are not due to reassembly after all. Is that a new bug in
Wireshark/tshark then?
Well, if TCP Reassembly were the only memory eater then yes. But other
things Wireshark does also eat memory--for example TCP sequence analysis.
Knowing exactly what is using the memory in your case would probably
take some deeper investigation with your capture files, and would
probably (I guess) show that it's a Feature that is eating all the
memory.
[Though I do sometimes worry that some real memory leaks or just memory 
inefficiency may be lurking in Wireshark but they're hidden by the fact 
we all know it should be using lots of memory in big capture files.]
Ah yes. tshark refuses to apply a capture filter when reading from a
file, thereby enforcing a display filter and the subsequent crash. I
suppose that it can't apply a capture filter because it's not using
libpcap to get the packets in the first place? Perhaps libpcap needs
to be taught how to use a file instead of an interface.
In fact libpcap does support reading from a file and, I would imagine,
applying a capture filter to it--I'd bet that's exactly what 'tcpdump'
is doing. The issue is, I think [haven't looked], supporting that in
'tshark'.
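To illustrate why a stateless, per-packet filter stays within memory on a very large file, here is a minimal Python sketch. It is not how tcpdump, editcap or libpcap are implemented; it assumes a classic libpcap file with Ethernet link type and unfragmented IPv4, and the names filter_pcap and is_tcp_port are made up for this example.

```python
import struct

PCAP_GLOBAL_HDR = 24   # size of the classic libpcap file header
PCAP_RECORD_HDR = 16   # size of each per-packet record header


def filter_pcap(src, dst, keep):
    """Copy the packets for which keep(data) is true from src to dst.

    src and dst are binary file objects. Only one packet is held in
    memory at a time, so memory use does not grow with the file size.
    Returns the number of packets written.
    """
    header = src.read(PCAP_GLOBAL_HDR)
    if len(header) != PCAP_GLOBAL_HDR:
        raise ValueError("truncated pcap file header")
    magic = header[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"          # file written on a little-endian host
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"          # file written on a big-endian host
    else:
        raise ValueError("not a classic libpcap file")
    dst.write(header)

    written = 0
    while True:
        rec = src.read(PCAP_RECORD_HDR)
        if len(rec) < PCAP_RECORD_HDR:
            break             # end of file
        _sec, _usec, incl_len, _orig_len = struct.unpack(endian + "IIII", rec)
        data = src.read(incl_len)
        if len(data) < incl_len:
            break             # truncated final record
        if keep(data):
            dst.write(rec)
            dst.write(data)
            written += 1
    return written


def is_tcp_port(port):
    """Build a predicate matching Ethernet/IPv4/TCP packets on a port.

    Assumption: no VLAN tags and no IP fragments; this is far cruder
    than a real capture filter.
    """
    def keep(data):
        if len(data) < 34 or data[12:14] != b"\x08\x00":
            return False      # too short, or not IPv4
        if data[23] != 6:
            return False      # IP protocol is not TCP
        tcp = 14 + (data[14] & 0x0F) * 4   # skip Ethernet + IP header
        if len(data) < tcp + 4:
            return False
        sport, dport = struct.unpack("!HH", data[tcp:tcp + 4])
        return port in (sport, dport)
    return keep
```

Used as filter_pcap(open("big.pcap", "rb"), open("small.pcap", "wb"), is_tcp_port(80)), it makes a single pass over the file. Anything that needs state across packets (reassembly, sequence analysis) is exactly what such a loop cannot do.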
Is there a way to turn off reassembly so that tshark would work the
same as tcpdump in the above example? Although now it looks like it
should be off (by default).
Well, look in the 'tshark' output and see if it's telling you about 
Reassembled packets.  If so, it's not off.  But you've also gotta look 
for other things like TCP sequence analysis and who knows what else...
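
One way to experiment with this is to switch off the stateful TCP features on the tshark command line. The sketch below only builds the argument list; it assumes the preference names tcp.desegment_tcp_streams and tcp.analyze_seq and the -Y display-filter option (older releases used -R for this), so check them against your own tshark version. The helper name tshark_args is made up for this example.

```python
def tshark_args(capture_path, display_filter=None, filter_flag="-Y"):
    """Return a tshark argument list with TCP reassembly and TCP
    sequence analysis disabled via -o preference overrides.

    Assumption: the preference names below match your tshark build;
    other dissectors may keep state of their own.
    """
    args = [
        "tshark",
        "-r", capture_path,                       # read from a file
        "-o", "tcp.desegment_tcp_streams:FALSE",  # no TCP reassembly
        "-o", "tcp.analyze_seq:FALSE",            # no sequence analysis
    ]
    if display_filter:
        # Pass filter_flag="-R" for old releases that predate -Y.
        args += [filter_flag, display_filter]
    return args
```

The resulting list can be passed to subprocess.run(). Whether this is enough to avoid the out-of-memory crash depends on what is actually consuming the memory, which this thread has not established.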