Wireshark-dev: Re: [Wireshark-dev] one possible way to speed up filtering
From: ronnie sahlberg <[email protected]>
Date: Sun, 22 Mar 2009 19:07:13 +1100
Another way to greatly speed up filtering would be to pick up and complete the work to make it possible to use ep_* memory
for all field types when dissecting a packet.


When Wireshark dissects a packet it performs a massive number of malloc()/free() calls.
This was partially addressed when I added the SLAB_ allocator in slab.h eons ago.
That improved performance very significantly at the time, but there are still speedups to be gained, though the remaining ones are no longer low-hanging fruit.

One thing that could make dissection faster is to get rid of all usage of SLAB for the field_info type and make it use cheap and fast
ep_ allocated memory instead.
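
To illustrate why packet-scoped allocation can be cheaper than malloc()/free(), here is a minimal sketch of a bump-pointer arena. It is not Wireshark's actual ep_ implementation; the names packet_arena, arena_alloc and arena_reset and the fixed ARENA_SIZE are assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-packet arena; NOT the real ep_ allocator. */
#define ARENA_SIZE 4096

typedef struct {
    /* The union forces the buffer to be suitably aligned for any type. */
    union { max_align_t align; unsigned char bytes[ARENA_SIZE]; } buf;
    size_t used;
} packet_arena;

/* Allocation is just a pointer bump: no free list, no per-object header. */
void *arena_alloc(packet_arena *a, size_t size)
{
    const size_t align = _Alignof(max_align_t);
    size_t start;

    assert(a != NULL);
    start = (a->used + align - 1) & ~(align - 1);  /* round up to alignment */
    if (size > ARENA_SIZE - start)
        return NULL;                               /* arena exhausted */
    a->used = start + size;
    return a->buf.bytes + start;
}

/* Everything allocated for one packet is released in a single step. */
void arena_reset(packet_arena *a)
{
    assert(a != NULL);
    a->used = 0;
}
```

In this sketch a dissector would call arena_alloc() freely while dissecting, and the caller would invoke arena_reset() once the packet has been processed, so no individual free() calls are needed.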


This would mainly be stuff down in the epan/ftypes directory.
I remember that I started preparing some of the functions there to allow a migration to ep_ allocation but did not do all of them.
One that I remember being problematic was the pcre one.

The first step would be to analyze all codepaths where data is created by calling into the epan/ftypes helpers and make sure the scope
of usage for that data is still valid if it is allocated with an ep_ scope.

Second would be to get rid of all the use of SLAB for field_info and just rely on ep_ allocations.

This would be more problematic, though, since there are cases where a tree of field_info is created but the tree remains valid and is referenced across several packets or the entire capture. I never had time to analyze this properly, but if the cause is identified, maybe se_ allocations could be used for this second style of field_info allocation, with a parameter to the functions selecting which scope to use?
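
A rough sketch of what such a scope parameter might look like follows. Everything in it is hypothetical: the alloc_scope enum, the two pools, and demo_field_info (a stand-in, not the real field_info layout) only illustrate the idea of choosing the lifetime at the call site.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scopes: one freed per packet (ep_-like), one kept for the
 * whole capture (se_-like). */
typedef enum { SCOPE_PACKET, SCOPE_CAPTURE } alloc_scope;

typedef struct {
    union { max_align_t align; unsigned char bytes[4096]; } buf;
    size_t used;
} pool;

static pool packet_pool;   /* reset after every packet */
static pool capture_pool;  /* reset only when the capture is closed */

/* Stand-in for field_info; the real structure has more members. */
typedef struct {
    int hf_id;
    int start;
    int length;
} demo_field_info;

void *scoped_alloc(alloc_scope scope, size_t size)
{
    pool *p = (scope == SCOPE_PACKET) ? &packet_pool : &capture_pool;
    const size_t align = _Alignof(max_align_t);

    if (size > sizeof p->buf.bytes - p->used)
        return NULL;                          /* pool exhausted */
    void *mem = p->buf.bytes + p->used;
    p->used += (size + align - 1) & ~(align - 1);  /* keep offsets aligned */
    return mem;
}

/* The caller decides how long the field_info must live. */
demo_field_info *demo_field_info_new(alloc_scope scope, int hf_id,
                                     int start, int length)
{
    demo_field_info *fi = scoped_alloc(scope, sizeof *fi);
    if (fi != NULL) {
        fi->hf_id = hf_id;
        fi->start = start;
        fi->length = length;
    }
    return fi;
}

void end_of_packet(void)
{
    packet_pool.used = 0;  /* capture_pool is deliberately left untouched */
}
```

With this shape, trees that must outlive the current packet would be created with SCOPE_CAPTURE and everything else with SCOPE_PACKET; whether that split matches the real call sites is exactly the analysis that still needs to be done.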





2009/3/22 yami <[email protected]>
Hi Didier,

Thank you for trying the patch :) and all the good comments given.

I've attached a new patch to the wiki. Please see my detailed reply below.


On Fri, Mar 20, 2009 at 6:30 AM, didier <[email protected]> wrote:
Hi,

On Wednesday, 18 March 2009 at 23:05 +0800, yami wrote:
> Thanks, I've written a page in Wiki:
>   http://wiki.wireshark.org/Development/FastFiltering
Nice work.

- If compiled without NDEBUG defined I get a failed assert:
epan/dfilter/wslimmat.c :1680 : fix_variables:  "v->assignment == v"
This looks like a 'bug' in gcc's optimization. We can simply remove this assertion; see the explanation below.

The assertion fails, but the 'real' values of v->assignment and v are equal.
I came to this conclusion from the following experiments (is there a better way?):

experiment 1. without NDEBUG defined, but using '-O0' to compile, no assertion failure occurs.
experiment 2. no change to Makefile, simply add printf(v->assignment, v) to fix_pointer(), no assertion failure occurs.
experiment 3. no change to Makefile, add the following to fix_variables() after the assert line:
      if (v->assignment != v)
          printf("not hold v->assignment=%p, v=%p\n", v->assignment, v);
      else
          printf("    hold v->assignment=%p, v=%p\n", v->assignment, v);

The assertion still fails (it only prints, without aborting), yet the output is:
             hold v->assignment=0x8070508, v=0x8070508
 
which says just the opposite.


valgrind doesn't complain, and it seems to work with NDEBUG, but only for
simple stuff, e.g. udp && dns; something like !(tcp.stream eq 1)
&& !(tcp.stream eq 2) doesn't return the right result.
I.e.:
follow a TCP stream, filter out that stream, follow the next TCP stream, and
so on.
Does it work for you?

This is a bug. I've fixed at least the '!' (TEST_OP_NOT) part. I'll test the patch more.


- stupid but Limmat uses the original BSD license which is incompatible
with the GPL.

Really? I thought the BSD license was looser than the GPL (can you give me more details?).
On the other hand, I find that the Wireshark code already contains similar licenses (am I right?); for example,
Menu -> Help -> About Wireshark -> License -> Part III mentions some.

- On the other hand, if expressions are built incrementally via popup
menus, is a full SAT solver needed?
 
Perhaps you are right, but I'm not sure.

However, using a SAT solver (even a simple one) is the most general approach, since it requires no special handling. It is also a good example of how math is applied in real life :)

The cons are:
1. SAT solvers are complicated. (But we may use a simple algorithm.)
2. Publicly available SAT solvers are mainly written by researchers, who may not have time to maintain the software.

eg:
Something like
tcp.stream eq 1 --> H1

!(tcp.stream eq 1) --> !H1 --> H2

tcp.stream eq 2 --> H3

!(tcp.stream eq 1) && !(tcp.stream eq 2) --> H2 && !H3

may be good enough.

Didier



___________________________________________________________________________
Sent via:    Wireshark-dev mailing list <[email protected]>
Archives:    http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
            mailto:[email protected]?subject=unsubscribe

