Wireshark-dev: Re: [Wireshark-dev] Wireshark memory handling
From: Guy Harris <[email protected]>
Date: Mon, 5 Oct 2009 11:23:42 -0700
On Oct 5, 2009, at 8:01 AM, Håvar Aambø Fosstveit wrote:

	We are a student group from the University of Science and Engineering in Norway, doing a project on handling large data sets and, specifically, Wireshark's issues with them. I have included part of our prestudy into the problem as an attachment, and we are wondering if anybody has some immediate thoughts regarding our plans for a solution.
The paper says

	Since exhausting the available primary memory is the problem ...

What does "primary memory" refer to here?

It later says

	An alternative for getting more memory than the machine's RAM is to use memory-mapped files.

so presumably "primary memory" is referring to main memory, not to the sum of main memory and available backing store ("swap space"/paging files/swap files/whatever the OS calls it, plus the files that are mapped into the address space).
Presumably by "more memory than the machine's RAM" you mean "more memory than the machine's RAM plus the machine's swap space" - all the OSes on which Wireshark runs do demand paging, so Wireshark can use more memory than the machine has (unless the OS requires every page in RAM to have a swap-space page assigned to it, in which case it can use max(available main memory, available swap space)).
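
To make the demand-paging point concrete, here's a minimal sketch (mine, not from the paper; it assumes a 64-bit POSIX system, and the sizes are arbitrary) - reserving a large anonymous mapping costs essentially nothing until pages are actually touched:

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    /* Reserve 8 GB of address space.  With demand paging, no
     * physical page (or swap slot, on most OSes) is committed
     * until the page is first written. */
    size_t len = (size_t)8 * 1024 * 1024 * 1024;
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    /* Touch one byte per 4 KB page across the first 16 MB;
     * only those pages end up backed by RAM or swap. */
    for (size_t off = 0; off < (size_t)16 * 1024 * 1024; off += 4096)
        p[off] = 1;
    munmap(p, len);
    return 0;
}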
In effect, using memory-mapped files allows the application to extend the available backing store beyond what's pre-allocated (note that OS X and Windows NT - "NT" as generic for all NT-based versions of Windows - both use files, rather than a fixed set of separate partitions, as backing store, and I think both will grow existing swap files or add new swap files as necessary; I know OS X does that), making more virtual memory available.
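
To sketch what that could look like in practice (POSIX-style; the function name and mode bits here are made up, and a real allocator would carve chunks out of such a mapping rather than map per allocation):

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map a scratch file as read/write memory.  Because the mapping
 * is MAP_SHARED, the file itself, not the swap space, backs
 * these pages; dirty pages get written back to the file under
 * memory pressure. */
void *map_file_backed(const char *path, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd == -1)
        return NULL;
    if (ftruncate(fd, (off_t)len) == -1) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);  /* the mapping keeps the file referenced */
    return (p == MAP_FAILED) ? NULL : p;
}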
The right long-term fix for a lot of this problem is to figure out how to make Wireshark use less memory; we have some projects we're working on to do that, and there are some additional things that can be done if we support fast random access to all capture files (including gzipped capture files, so that involves some work).  However, your scheme would provide a quicker solution for large captures that exhaust the available main memory and swap space, as long as you can intercept all the main allocators of main memory (the allocators in epan/emem.c can be intercepted fairly easily; the allocator used by GLib might be harder, but it still might be possible).
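
For the GLib side, g_mem_set_vtable() is the hook I'd look at; a rough sketch (the my_* functions are placeholders for whatever file-backed allocator you end up with - here they just forward to the C library):

#include <glib.h>
#include <stdlib.h>

static gpointer my_malloc(gsize n)                { return malloc(n); }
static gpointer my_realloc(gpointer mem, gsize n) { return realloc(mem, n); }
static void     my_free(gpointer mem)             { free(mem); }

static GMemVTable my_vtable = {
    my_malloc,
    my_realloc,
    my_free,
    NULL,  /* calloc: GLib emulates it via malloc if NULL */
    NULL,  /* try_malloc */
    NULL,  /* try_realloc */
};

int main(int argc, char **argv)
{
    /* Must be the very first GLib call in the process; after
     * this, all g_malloc()/g_realloc()/g_free() traffic goes
     * through the vtable above. */
    g_mem_set_vtable(&my_vtable);
    /* ... rest of the program ... */
    return 0;
}

One caveat: GLib's slice allocator (g_slice_*) bypasses the vtable unless G_SLICE=always-malloc is set in the environment, so you'd want that set as well if you need to catch everything.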