Wireshark-dev: Re: [Wireshark-dev] slow sip voip flow for large captures
From: Anders Broman <[email protected]>
Date: Fri, 3 Feb 2012 15:02:35 +0100
Hi,
Forget about my comment about the sip hash table.

Please add the patch(es) to bugzilla, if possible break it up to something we can commit straight away and stuff that needs to be worked on to work
For all taps as that can't be committed until it's done for all tap. Two bug reports?

Regards
Anders
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Cristian Constantin
Sent: den 3 februari 2012 14:54
To: Developer support list for Wireshark
Subject: Re: [Wireshark-dev] slow sip voip flow for large captures

On Fri, Feb 3, 2012 at 1:44 PM, Anders Broman <[email protected]> wrote:
> Hi,
> Please put the patches in bugzilla, if possible split them in two 
> parts one that's ok for all types(Step 2,3,4?) of calls and one which Implements the stuff for SIP but not for the other call types.

cristian: step 1 is dealing only with sip packets.

step 2's improvement could be used all voip taps.

step 3 is tricky since the lists have to be reversed (trivial operation for double linked lists; glib offers an api for this). however currently the prepend and reversing is performed only for sip calls. extra-work to be done here for other voip taps to use more efficient linked lists operations (prepend and eventl. reverse). also, this step modifies the callback for the button "Flow" of the "VoIP" window  - on_graph_bt_clicked() - to reverse (once) the lists:

* voip_calls_get_info()->callsinfo_list
* voip_calls_get_info()->graph_analysis->list

so, in order to work correctly with the other taps, they have to use the "prepend" operation/convention as well.

step 4 could be used for all taps

let me see what I can do about the patches.

> I think the SIP dissector keeps a table over the calls as well could that table be shared?

cristian: I do not know which table is that.

> Regards
> Anders
>
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Cristian 
> Constantin
> Sent: den 3 februari 2012 13:01
> To: Developer support list for Wireshark
> Subject: [Wireshark-dev] slow sip voip flow for large captures
>
> hi!
>
> wireshark can draw call flows for sip voip calls (accessible through the menu Telephony/VoIP Calls).
>
> however, when the capture is large, containing tens of thousands of sip voip calls, wireshark becomes very slow at producing the list of calls and the call flows.
>
> here are my experiences with a capture file of 500 MB containing approx. 40K calls on a debian machine having:
>
> 4xAMD Phenom(tm) II X4 940 Processor
> 4GB RAM
>
> with the svn stock it takes about 10 hours to do it; the report of the profiling done with oprofile looks like this:
>
> CPU: AMD64 family10, speed 3e+06 MHz (estimated) Counted 
> CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit 
> mask of 0x00 (No unit mask) count 750000 samples  %        image name               
> app name                 symbol name 8146520  55.2011  lt-wireshark             
> lt-wireshark
> append_to_frame_graph.isra.7
> 3698478  25.0610  libglib-2.0.so.0.2800.6  lt-wireshark             
> g_list_last
> 1566933  10.6176  lt-wireshark             lt-wireshark 
> SIPcalls_packet
> 1162377   7.8763  libc-2.13.so             lt-wireshark
> __strcmp_sse2
> 8734      0.0592  libwireshark.so.0.0.1    lt-wireshark 
> serv_name_lookup 8130      0.0551  libwireshark.so.0.0.1    
> lt-wireshark dissect_sip_common
> 6504      0.0441  libwireshark.so.0.0.1    lt-wireshark             
> format_text
> 5749      0.0390  libwireshark.so.0.0.1    lt-wireshark 
> tap_push_tapped_queue
> 4753      0.0322  libglib-2.0.so.0.2800.6  lt-wireshark 
> g_unichar_isprint
> 4627      0.0314  libglib-2.0.so.0.2800.6  lt-wireshark 
> g_hash_table_lookup
> 3699      0.0251  libwireshark.so.0.0.1    lt-wireshark             
> guint8_pbrk
>
> step 1
> ------
>
> by changing SIPcalls_packet() to use a hash table for the lookup of the calls (on sip call-id) the runtime decreases to an (estimated) 3-4 hours.
> SIPcalls_packet()'s cpu usage has decreased from 10.61% to 0.0115% 
> (1000 times faster)
>
> CPU: AMD64 family10, speed 3e+06 MHz (estimated) Counted 
> CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit 
> mask of 0x00 (No unit mask) count 750000 samples  %        image name               
> symbol name
> 411305   69.4523  lt-wireshark             
> append_to_frame_graph.isra.7
> 171549   28.9675  libglib-2.0.so.0.2800.6  g_list_last
> 475       0.0802  libwireshark.so.0.0.1    serv_name_lookup
> 439       0.0741  libwireshark.so.0.0.1    dissect_sip_common
> 375       0.0633  libwireshark.so.0.0.1    format_text 360       
> 0.0608  libwireshark.so.0.0.1    tap_push_tapped_queue
> 303       0.0512  libglib-2.0.so.0.2800.6  g_hash_table_lookup
> 283       0.0478  libglib-2.0.so.0.2800.6  g_unichar_isprint
> 202       0.0341  libwireshark.so.0.0.1    guint8_pbrk
> 172       0.0290  libwireshark.so.0.0.1    proto_tree_add_item
> 149       0.0252  libwireshark.so.0.0.1    compute_offset_length
> 143       0.0241  libc-2.13.so             __memcpy_sse2
> 141       0.0238  libwireshark.so.0.0.1    call_dissector_work
> 141       0.0238  libwireshark.so.0.0.1    emem_free_all
> 115       0.0194  libc-2.13.so             __strncmp_sse2
> 112       0.0189  libc-2.13.so             vfprintf
> 105       0.0177  libwireshark.so.0.0.1    proto_item_add_subtree
> 104       0.0176  libglib-2.0.so.0.2800.6  
> g_slice_free_chain_with_offset
> 99        0.0167  libwireshark.so.0.0.1    dissect_sdp
> 91        0.0154  libc-2.13.so             free
> 91        0.0154  libwireshark.so.0.0.1    dissect_ip 90        0.0152  
> libwireshark.so.0.0.1    proto_field_is_referenced 90        0.0152  
> libwireshark.so.0.0.1    tvb_length_remaining 90        0.0152  
> lt-wireshark process_specified_packets.constprop.10
> 89        0.0150  libglib-2.0.so.0.2800.6  g_slice_free1
> 89        0.0150  libwiretap.so.0.0.1      file_seek
> 88        0.0149  libwireshark.so.0.0.1    dissect
> 86        0.0145  libwireshark.so.0.0.1    dissect_frame
> 82        0.0138  libwireshark.so.0.0.1    proto_is_protocol_enabled
> 79        0.0133  libc-2.13.so             __strcmp_sse2
> 79        0.0133  libc-2.13.so             _int_malloc
> 78        0.0132  libglib-2.0.so.0.2800.6  g_slice_alloc
> 75        0.0127  libpthread-2.13.so       pthread_getspecific
> 73        0.0123  libwireshark.so.0.0.1    tvb_find_guint8
> 72        0.0122  libc-2.13.so             malloc
> 71        0.0120  libwireshark.so.0.0.1    fast_ensure_contiguous
> 68        0.0115  lt-wireshark             SIPcalls_packet
>
> step 2
> ------
>
> the next improvment was to change append_to_frame_graph() to use a hash table for graph analysis (still slow, over 1 hour):
>
> CPU: AMD64 family10, speed 3e+06 MHz (estimated) Counted 
> CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit 
> mask of 0x00 (No unit mask) count 750000 samples  %        image name               
> symbol name
> 565268   94.6185  libglib-2.0.so.0.2800.6  g_list_last
> 1863      0.3118  libwireshark.so.0.0.1    serv_name_lookup 1620      
> 0.2712  libwireshark.so.0.0.1    dissect_sip_common 1340      0.2243  
> libwireshark.so.0.0.1    format_text
> 987       0.1652  libglib-2.0.so.0.2800.6  g_unichar_isprint
> 982       0.1644  libglib-2.0.so.0.2800.6  g_hash_table_lookup
> 819       0.1371  libwireshark.so.0.0.1    tap_push_tapped_queue
> 731       0.1224  libwireshark.so.0.0.1    guint8_pbrk
> 628       0.1051  libwireshark.so.0.0.1    proto_tree_add_item
> 587       0.0983  libglib-2.0.so.0.2800.6  
> g_hash_table_insert_internal
> 574       0.0961  libwireshark.so.0.0.1    compute_offset_length
> 405       0.0678  libwireshark.so.0.0.1    proto_is_protocol_enabled
> 401       0.0671  libwireshark.so.0.0.1    emem_free_all
> 395       0.0661  libwireshark.so.0.0.1    dissect_ip
> 391       0.0654  libc-2.13.so             __strncmp_sse2
> 367       0.0614  libwireshark.so.0.0.1    dissect ...
> 20        0.0017  lt-wireshark             
> append_to_frame_graph.isra.8
>
> however, append_to_frame_graph() got from 69.45% to 0.0017% (40852 
> times better)
>
> step 3
> ------
>
> I have changed the graph analysis list and the callsinfo list to use prepend instead of append:
>
> CPU: AMD64 family10, speed 3e+06 MHz (estimated) Counted 
> CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit 
> mask of 0x00 (No unit mask) count 750000 samples  %        image name               
> symbol name
> 130285   18.6875  libc-2.13.so             msort_with_tmp
> 66464     9.5333  libglib-2.0.so.0.2800.6  node_get_pos 53790     
> 7.7154  libglib-2.0.so.0.2800.6  g_hash_table_insert_internal
> 51146     7.3361  libc-2.13.so             __memcpy_sse2
> 44419     6.3712  libglib-2.0.so.0.2800.6  node_get_next
> 42461     6.0904  libgtk-x11-2.0.so.0.2400.4 gtk_rbtree_reorder_fixup
> 41265     5.9189  libglib-2.0.so.0.2800.6  is_end
> 40819     5.8549  libglib-2.0.so.0.2800.6  g_hash_table_resize
> 38905     5.5803  libgtk-x11-2.0.so.0.2400.4 _gtk_rbtree_next
> 36069     5.1736  libgtk-x11-2.0.so.0.2400.4 _gtk_rbtree_reorder
> 35828     5.1390  libglib-2.0.so.0.2800.6  g_hash_table_lookup
> 29469     4.2269  libgtk-x11-2.0.so.0.2400.4 save_positions
> 26858     3.8524  libgtk-x11-2.0.so.0.2400.4 generate_order
> 7892      1.1320  libc-2.13.so             __memset_sse2
> 7752      1.1119  libgtk-x11-2.0.so.0.2400.4 
> gtk_rbtree_reorder_sort_func 6860      0.9840  
> libgtk-x11-2.0.so.0.2400.4 gtk_rbtree_reorder_invert_func
> 6835      0.9804  libglib-2.0.so.0.2800.6  
> g_hash_table_remove_all_nodes
> 4085      0.5859  libglib-2.0.so.0.2800.6  g_array_append_vals
> 3864      0.5542  libglib-2.0.so.0.2800.6  g_sequence_iter_is_end
> 3591      0.5151  libc-2.13.so             __memset_x86_64
> 2057      0.2950  libglib-2.0.so.0.2800.6  g_array_maybe_expand
>
> step 4
> ------
>
> reordering is performed _continously_ while adding elements to the voip calls list.
> (by _gtk_rbtree_reorder which is calling qsort from libc which is 
> calling msort_with_tmp)
>
> after disabling continous reordering, by using in voip_calls_dlg_update():
>
>        /* disable the re-ordering */
>        gtk_tree_sortable_set_sort_column_id(sortable,
> GTK_TREE_SORTABLE_UNSORTED_SORT_COLUMN_ID, GTK_SORT_ASCENDING);
>
>        listx = g_list_first(listx);
>        while (listx) {
>            add_to_list_store((voip_calls_info_t*)(listx->data));
>            listx = g_list_next(listx);
>        }
>
>        /* enable the re-ordering */
>        gtk_tree_sortable_set_sort_column_id(sortable,
> CALL_COL_START_TIME, GTK_SORT_ASCENDING);
>
> CPU: AMD64 family10, speed 3e+06 MHz (estimated) Counted 
> CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit 
> mask of 0x00 (No unit mask) count 750000 samples  %        image name               
> symbol name
> 13495     8.8003  libwireshark.so.0.0.1    format_text
> 10698     6.9763  libglib-2.0.so.0.2800.6  g_unichar_isprint
> 7421      4.8394  libwireshark.so.0.0.1    guint8_pbrk
> 4973      3.2430  libwireshark.so.0.0.1    compute_offset_length
> 4776      3.1145  libglib-2.0.so.0.2800.6  g_hash_table_lookup
> 4029      2.6274  libwireshark.so.0.0.1    dissect_sip_common
> 3851      2.5113  libglib-2.0.so.0.2800.6  g_atomic_pointer_get
> 3186      2.0776  libc-2.13.so             __memcpy_sse2
> 3064      1.9981  libglib-2.0.so.0.2800.6  
> g_hash_table_insert_internal
> 3008      1.9616  libwireshark.so.0.0.1    emem_free_all
> 2964      1.9329  libwireshark.so.0.0.1    fast_ensure_contiguous
> 2803      1.8279  libwireshark.so.0.0.1    serv_name_lookup
> 2793      1.8214  libwireshark.so.0.0.1    
> check_offset_length_no_exception
> 2114      1.3786  libc-2.13.so             _int_malloc
> 2063      1.3453  libglib-2.0.so.0.2800.6  g_slice_free1
> 2046      1.3342  libwireshark.so.0.0.1    tap_push_tapped_queue
> 1841      1.2005  libwireshark.so.0.0.1    tvb_get_guint8
> 1713      1.1171  libc-2.13.so             memchr
> 1668      1.0877  libwireshark.so.0.0.1    tvb_length_remaining
> 1625      1.0597  libglib-2.0.so.0.2800.6  g_atomic_int_get 1550      
> 1.0108  libwireshark.so.0.0.1    proto_tree_add_item
> 1524      0.9938  libc-2.13.so             vfprintf
> 1475      0.9619  libwireshark.so.0.0.1    emem_alloc_chunk
> 1439      0.9384  libc-2.13.so             __strncmp_sse2
> 1395      0.9097  libglib-2.0.so.0.2800.6  g_slice_alloc
> 1312      0.8556  libwireshark.so.0.0.1    tvb_find_guint8
> 1279      0.8341  libwireshark.so.0.0.1    
> ensure_contiguous_no_exception
> 1052      0.6860  libc-2.13.so             malloc
> 1028      0.6704  libwireshark.so.0.0.1    dissect_sdp
> 1022      0.6665  libwireshark.so.0.0.1    tvb_pbrk_guint8
>
> the runtime decreased now to approx. 1 minute (pretty reasonable when 
> compared to 10 hours)
>
> I can publish the patches on bugzilla; however, the changes/tests have been explicitly targeted at sip voip calls.
> some of the data structures are global and used by _all_ voip protocol taps; it won't work correctly with dumps having also non-sip voip calls.
>
> pls. let me know your opinion.
>
> thanks.
> cristian
> ______________________________________________________________________
> _____ Sent via:    Wireshark-dev mailing list 
> <[email protected]>
> Archives:    http://www.wireshark.org/lists/wireshark-dev
> Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
>             
> mailto:[email protected]?subject=unsubscribe
> ______________________________________________________________________
> _____ Sent via:    Wireshark-dev mailing list 
> <[email protected]>
> Archives:    http://www.wireshark.org/lists/wireshark-dev
> Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
>             
> mailto:[email protected]?subject=unsubscribe
___________________________________________________________________________
Sent via:    Wireshark-dev mailing list <[email protected]>
Archives:    http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
             mailto:[email protected]?subject=unsubscribe