Wireshark-dev: Re: [Wireshark-dev] Detecting Protocol Headers
From: Guy Harris <[email protected]>
Date: Mon, 9 Mar 2009 18:58:26 -0700
On Mar 9, 2009, at 6:09 PM, Rayne wrote:

I took a look at packet-udp.c and packet-ip.c, and am wondering where I can find the definitions of the following functions:
call_dissector()
dissector_add()
dissector_try_heuristic()
dissector_try_port()
register_dissector_table()
register_heur_dissector_list()
They're all in epan/packet.c.

and the following structures:
dissector_table_t
heur_dissector_list_t
dissector_handle_t
Those are in epan/packet.c and epan/packet.h.

Also, where are the UDP ports and list of heuristic dissectors tried by the UDP dissector defined?
The tables of ports and of heuristic dissectors are defined in the UDP dissector. Those tables are filled in by other dissectors.

From what I can understand from packet-udp.c, the structures udp_dissector_table and heur_subdissector_list are first defined and registered in the file packet-udp.c itself.
Yes.

So how would the UDP dissector know which sub-dissector and UDP ports to try next in order to call the next dissector?
It looks in those tables.

The process of registering dissector modules is a two-step process.

In the first phase, the proto_register_ routines for all dissectors are called. They register the protocol, the fields for the protocol, and other things, including any dissector tables or heuristic dissector lists for that protocol.
In the second phase, the proto_reg_handoff_ routines for all dissectors are called. They register the dissectors in the dissector tables and heuristic dissector lists created in the first phase.
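
The two phases can be illustrated with a toy model. This is only a sketch under stated assumptions, not Wireshark's code: every name starting with toy_ is hypothetical, the table is a fixed-size array, and the key 42 is a made-up "next protocol" value. It shows why the split matters: protocol A's handoff routine can rely on protocol B's table existing, even though A is listed before B.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are Wireshark's real types or functions. */
typedef int (*toy_dissector_fn)(const unsigned char *data, size_t len);

#define TOY_MAX_ENTRIES 8

struct toy_table {
    unsigned key[TOY_MAX_ENTRIES];
    toy_dissector_fn fn[TOY_MAX_ENTRIES];
    size_t count;
};

static struct toy_table toy_b_storage;
static struct toy_table *toy_b_table = NULL; /* NULL until protocol B registers it */

/* Hypothetical dissector for protocol A: claims the whole payload. */
static int toy_dissect_a(const unsigned char *data, size_t len)
{
    (void)data;
    return (int)len;
}

/* Phase 1 routines: create tables, never touch another protocol's table. */
static void toy_register_a(void) { /* A owns no tables in this toy */ }
static void toy_register_b(void)
{
    toy_b_storage.count = 0;
    toy_b_table = &toy_b_storage;
}

/* Phase 2 routines: add entries to tables that phase 1 created. */
static void toy_handoff_a(void)
{
    assert(toy_b_table != NULL); /* holds only because phase 1 has finished */
    assert(toy_b_table->count < TOY_MAX_ENTRIES);
    toy_b_table->key[toy_b_table->count] = 42; /* made-up "next protocol" value */
    toy_b_table->fn[toy_b_table->count] = toy_dissect_a;
    toy_b_table->count++;
}
static void toy_handoff_b(void) { /* B joins no other table in this toy */ }

/* Run every phase 1 routine, then every phase 2 routine. */
static void toy_init_all(void)
{
    toy_register_a(); /* A is listed before B on purpose */
    toy_register_b();
    toy_handoff_a();
    toy_handoff_b();
}

/* What protocol B does per packet: look the key up in its own table. */
static toy_dissector_fn toy_lookup_b(unsigned key)
{
    size_t i;
    if (toy_b_table == NULL)
        return NULL;
    for (i = 0; i < toy_b_table->count; i++)
        if (toy_b_table->key[i] == key)
            return toy_b_table->fn[i];
    return NULL;
}
```

After toy_init_all() has run, toy_lookup_b(42) returns toy_dissect_a and any other key returns NULL. If the register and handoff calls were interleaved per dissector instead, toy_handoff_a() would run before B's table existed.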
Also, are the dissectors in the heuristics list determined by statistics?
No.

For example, if say Protocol A follows Protocol B 80% of the time from traffic observed,
Which traffic?

At one site, protocol A might follow protocol B 80% of the time. At another site, it might follow it 0% of the time, because, at that site, protocol A might not be used at all.
then Protocol A is included in the list of heuristic dissectors to try by Protocol B?
If

	1) there's a dissector for protocol A;

	2) protocol A can follow (be encapsulated in) protocol B;

	3) you can't tell by looking at some "next protocol" field in protocol B whether it's followed by protocol A or not, you have to guess by looking at the payload of protocol B;

	4) protocol B supports heuristic dissectors for protocols that follow it;

	5) whoever wrote the dissector for protocol B knew all that and knew that they should therefore make the dissector for protocol A a heuristic dissector for protocol B;

then the proto_reg_handoff_ routine for the dissector module for protocol A would register the dissector for protocol A in the list of heuristic dissectors for protocol B.
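
As a rough illustration of what "guessing by looking at the payload" means, here is a toy heuristic list. It is a sketch only: the toy_ names are hypothetical, and the assumption that protocol A's payload starts with the two bytes "PA" is invented for the example. Each heuristic dissector inspects the payload and reports whether it recognized it; the parent tries them in turn and stops at the first one that accepts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A toy heuristic dissector returns 1 if it recognized the payload, else 0. */
typedef int (*toy_heur_fn)(const unsigned char *payload, size_t len);

#define TOY_MAX_HEUR 8

/* Protocol B's list of heuristic dissectors (hypothetical, fixed size). */
static toy_heur_fn toy_heur_list[TOY_MAX_HEUR];
static size_t toy_heur_count = 0;

/* Called from a protocol's handoff routine to join B's list. */
static void toy_heur_add(toy_heur_fn fn)
{
    assert(toy_heur_count < TOY_MAX_HEUR);
    toy_heur_list[toy_heur_count++] = fn;
}

/* Hypothetical protocol A: assume its payload starts with the bytes "PA". */
static int toy_heur_dissect_a(const unsigned char *payload, size_t len)
{
    if (len < 2 || memcmp(payload, "PA", 2) != 0)
        return 0; /* does not look like protocol A; let others try */
    /* a real dissector would build the protocol tree here */
    return 1;
}

/* What protocol B does with its payload: ask each heuristic in turn. */
static int toy_try_heuristic(const unsigned char *payload, size_t len)
{
    size_t i;
    for (i = 0; i < toy_heur_count; i++)
        if (toy_heur_list[i](payload, len))
            return 1; /* first dissector that accepts wins */
    return 0;         /* nobody recognized the payload */
}
```

Nothing in this model depends on traffic statistics: a dissector is tried because it was registered in the list, and it is used for a given packet only if its own check on that packet's payload succeeds.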
And am I right to say that the protocol tree is built before the first packet is captured,
No, because there's no such thing as "*the* protocol tree" in general; a packet has *a* protocol tree that shows the dissection of all the protocols in that packet, so one can speak only of "the protocol tree" for a given packet, which obviously can't be created until that packet has been read. (Note that the packet might be captured minutes, or hours, or days, or weeks, or months, or years... before the packet is read; it might have been written to a capture file when it was captured, and Wireshark or TShark might be reading the file much later.)
Where can I find an example where dissect-protocol() is called?
What do you mean by "dissect-protocol()"?

I also noticed that in packet-ip.c, the function dissector_try_port() is called. However, it appears that the "port" used here is the protocol field.
Correct. The name dissector_try_port() is historical; it should really be called dissector_try_uint(), or something like that, as its argument is an unsigned integer value. There's also a dissector_try_string() routine, for dissector tables where the key is a string rather than an unsigned integer.
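
The difference between the two kinds of table can be sketched as follows. This is a toy model, not the real dissector_try_port() or dissector_try_string(): the toy_ names and the string keys are hypothetical, and the functions return a dissector name instead of calling a dissector. The unsigned-integer table is keyed the way the IP dissector's table is, by the IP protocol field (6 is TCP, 17 is UDP).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_uint_entry   { unsigned key;    const char *dissector; };
struct toy_string_entry { const char *key; const char *dissector; };

/* Keyed by an unsigned integer: here, the IP protocol field. */
static const struct toy_uint_entry toy_ip_proto_table[] = {
    { 6,  "tcp" },
    { 17, "udp" },
};

/* Keyed by a string; these keys are purely illustrative. */
static const struct toy_string_entry toy_media_table[] = {
    { "text/html",        "html" },
    { "application/json", "json" },
};

/* Toy counterpart of a lookup by unsigned integer key. */
static const char *toy_try_uint(unsigned key)
{
    size_t i;
    size_t n = sizeof toy_ip_proto_table / sizeof toy_ip_proto_table[0];
    for (i = 0; i < n; i++)
        if (toy_ip_proto_table[i].key == key)
            return toy_ip_proto_table[i].dissector;
    return NULL; /* no dissector registered for this value */
}

/* Toy counterpart of a lookup by string key. */
static const char *toy_try_string(const char *key)
{
    size_t i;
    size_t n = sizeof toy_media_table / sizeof toy_media_table[0];
    assert(key != NULL);
    for (i = 0; i < n; i++)
        if (strcmp(toy_media_table[i].key, key) == 0)
            return toy_media_table[i].dissector;
    return NULL; /* no dissector registered for this string */
}
```

The lookup logic is the same in both cases; only the type of the key, and hence the comparison, differs.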
Without seeing the definition for dissector_try_port(), I'm guessing that the second argument of this function is the search criteria,
Correct.

and for UDP (and presumably TCP), it's the source/destination ports,
Yes - in one call to dissector_try_port() in the UDP and TCP dissectors, it's the lower-valued of the source and destination port numbers, and in the other call to dissector_try_port(), it's the higher-valued of the source and destination port numbers.
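
The two lookups can be sketched like this. It is a toy model with hypothetical toy_ names, and it assumes the lower-valued port is the one tried first; the function returns the port whose dissector would be used, or -1 if neither port has one registered.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PORTS 8

/* Ports that some dissector has registered for (hypothetical table). */
static unsigned toy_ports[TOY_MAX_PORTS];
static size_t toy_port_count = 0;

static void toy_port_add(unsigned port)
{
    assert(toy_port_count < TOY_MAX_PORTS);
    toy_ports[toy_port_count++] = port;
}

static int toy_port_known(unsigned port)
{
    size_t i;
    for (i = 0; i < toy_port_count; i++)
        if (toy_ports[i] == port)
            return 1;
    return 0;
}

/* Try the lower-valued port, then the higher-valued one.
 * Returns the matching port, or -1 if neither is registered. */
static int toy_pick_port(unsigned sport, unsigned dport)
{
    unsigned low  = sport < dport ? sport : dport;
    unsigned high = sport < dport ? dport : sport;

    if (toy_port_known(low))
        return (int)low;
    if (high != low && toy_port_known(high))
        return (int)high;
    return -1;
}
```

With ports 53 and 5060 registered, a packet between ports 40000 and 53 resolves to 53 regardless of direction, and a packet between 5060 and 53 also resolves to 53 because the lower port is checked first.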
whereas for IP, it's the protocol field. Is this correct?
Yes (and in the Ethernet dissector, it's the Ethernet type field, for example).