Wireshark-dev: [Wireshark-dev] Protocols vs dissectors, take 23

From: Michael Mann <mmann78@xxxxxxxxxxxx>
Date: Sun, 1 Jan 2017 21:13:01 -0500
I really like the flexibility and power that the current dissector table API provides. The one small shortcoming is that a "protocol" needs to be provided when adding a dissection function to a dissector table, and not every dissection function is a protocol. The "protocol" API is intentionally separate from the dissector function API because there isn't always a 1-1 relationship. This can lead to the creation of "dummy" protocols, or "pinos" (Protocols In Name Only) as I've decided to call them (modeled after [1]). To me, pinos don't have the same capabilities as real protocols: they don't have hf_ fields or heuristic dissection functions associated with them, and they cannot be enabled/disabled. They are there strictly to satisfy the dissector table API and, by extension, Decode As functionality (names for dissection functions).
 
A good example of where pinos should be used is a dissector table for TCP options. Each option could be its own dissection function, and the dissector table architecture also allows options to be implemented outside of the TCP dissector file (for things like Lua scripts for experimental or proprietary options). However, it's all under the umbrella of the TCP protocol. So if each TCP option is treated as a real protocol, the Enable/Disable dialog (or anything else that needs to list all protocols) gets needlessly populated with "protocols" that aren't real.
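To make the TCP option case concrete, here is a minimal, self-contained toy model in C. It is not Wireshark code: every name in it (toy_register_protocol, toy_register_pino, toy_add_tcp_option, toy_enable_list and so on) is hypothetical and only illustrates the idea that a pino gives a table entry a name without adding a row to an Enable/Disable style list.

```c
#include <stddef.h>

/* Toy model only: every name below is hypothetical, not a Wireshark API. */

#define TOY_MAX_PROTOS 16

typedef struct {
    const char *name;
    int name_only; /* 1 = pino: exists only to name a dissection function */
    int parent;    /* index of the owning real protocol, or -1 */
} toy_proto;

/* A dissection function returns a decoded value, or -1 if malformed. */
typedef int (*toy_dissector)(const unsigned char *opt, size_t len);

static toy_proto toy_protos[TOY_MAX_PROTOS];
static int toy_proto_count;

/* Dissector table keyed by the 8-bit TCP option kind. */
static struct {
    toy_dissector fn;
    int proto; /* index into toy_protos; only meaningful when fn != NULL */
} toy_tcp_options[256];

static int toy_add(const char *name, int name_only, int parent)
{
    if (toy_proto_count >= TOY_MAX_PROTOS)
        return -1;
    toy_protos[toy_proto_count].name = name;
    toy_protos[toy_proto_count].name_only = name_only;
    toy_protos[toy_proto_count].parent = parent;
    return toy_proto_count++;
}

/* A real protocol: shows up in the Enable/Disable style list. */
int toy_register_protocol(const char *name)
{
    return toy_add(name, 0, -1);
}

/* A pino: a name under a parent protocol, nothing more. */
int toy_register_pino(const char *name, int parent)
{
    return toy_add(name, 1, parent);
}

/* Add a dissection function to the table; invalid protocol ids are ignored. */
void toy_add_tcp_option(unsigned char kind, toy_dissector fn, int proto)
{
    if (proto < 0 || proto >= toy_proto_count)
        return;
    toy_tcp_options[kind].fn = fn;
    toy_tcp_options[kind].proto = proto;
}

/* Name shown for a table entry (what a Decode As list would use), or NULL. */
const char *toy_option_name(unsigned char kind)
{
    if (toy_tcp_options[kind].fn == NULL)
        return NULL;
    return toy_protos[toy_tcp_options[kind].proto].name;
}

/* Fill 'out' with names for an Enable/Disable style list; pinos are skipped. */
int toy_enable_list(const char **out, int max)
{
    int n = 0;
    for (int i = 0; i < toy_proto_count && n < max; i++)
        if (!toy_protos[i].name_only)
            out[n++] = toy_protos[i].name;
    return n;
}

/* Example option handler: Maximum Segment Size (kind 2, length 4). */
int toy_dissect_mss(const unsigned char *opt, size_t len)
{
    if (len != 4 || opt[0] != 2 || opt[1] != 4)
        return -1;
    return (opt[2] << 8) | opt[3];
}
```

In this sketch, registering "TCP" as a real protocol and the MSS option as a pino under it leaves toy_enable_list() with a single entry, while toy_option_name(2) still returns a usable name for the table entry.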
 
An example of where pinos are questionable is a protocol with different versions, especially when the versions live in separate dissector files (e.g. OpenFlow). Would a user want to enable/disable certain versions of OpenFlow? I don't know, but to be on the safe side I would say "yes", which then leads to multiple entries for a single "protocol". I also like "self contained" dissector files; sharing a protocol ID across files so that hf_ fields can be registered under the same protocol ID can be dangerous because of how register-cache.pkl works.
 
Another example of where pinos are questionable is "object oriented protocols". Dissector tables are set up to key off of an "object ID", and that turns into a protocol ID (that typically has its own hf_ fields for the data within that object). It varies per protocol, but I would guess most of these protocols don't need enable/disable, and the protocol IDs created for the objects are just for the dissector table itself and to modularize the hf_ fields and/or dissection functionality across files.
 
The last example I have is one of the reasons for writing this email in the first place - Bluetooth. I know nothing of how the Bluetooth protocol works or is architected, I just know it has over 400 entries in the protocol list, which is over 20% of the total list. That's a lot of real estate to take up for the many users that don't use Wireshark for Bluetooth. It looked simple enough to remove most of those entries from the Enabled Protocols dialog by turning them into pinos with a few lines of code (https://code.wireshark.org/review/19482). However, it was met with resistance from people more familiar with Bluetooth (they felt enable/disable is warranted).
 
So I'm back to asking for ways to better categorize these situations. How much of it is "GUI presentation" and how much of it is "underlying architecture"? As developers we can be lazy and have the GUI just mirror the underlying architecture, which is not always advantageous to the user. I don't think we want to expose every dissection function with its own name to the user, at least on "main" dialogs. If we create more "Advanced" dialogs that seems more acceptable, but I also don't want GUI code to be that knowledgeable about dissectors (nothing like "if Bluetooth - create separate dialog", or even "if protocol has more than 10 children, create separate dialog"). However, what I'm interested in working on is the underlying architecture. I can work on writing/implementing rules and categories (so the GUI could follow), but I'm still at a loss as to what those categories would be. So far I have:
 
1. Real protocols
2. 'tweeners. Things like protocols with multiple versions or object oriented protocols. Should there be a parent/child relationship with a "real" protocol?
3. Heuristic functions. Should they really have a parent/child relationship with real protocols, as is presented in the GUI? Historically they have had their own category, which I think ranks them above pinos.
4. Pinos
 
I'm open to discussing what the rules of pinos should be, but I intentionally started with the most strict (name only). If pinos support hf_ fields or heuristic functions, should they be called something else that would be considered "better" hierarchically than a pino? I would also like APIs to be more implicit, with relationships handled "under the covers", rather than having something like a proto_set_type(enum proto_category) function.
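As a rough illustration of what "implicit" could mean, the sketch below derives a category from what has been registered for an entry instead of having the dissector set one explicitly. It is only one possible rule set, chosen for illustration: the type and function names (toy_entry, toy_category, toy_infer_category) are hypothetical, the rules themselves are assumptions rather than anything decided, and heuristic functions are left out because this toy model treats them as functions attached to an entry rather than as entries of their own.

```c
/* Hypothetical sketch: none of these names exist in Wireshark. */

typedef enum {
    TOY_CAT_REAL,    /* stands on its own, can be enabled/disabled */
    TOY_CAT_TWEENER, /* has a parent but carries its own fields or heuristics */
    TOY_CAT_PINO     /* has a parent and is a name only */
} toy_category;

typedef struct {
    const char *name;
    int parent;          /* index of a parent entry, or -1 for none */
    int field_count;     /* number of hf_ style fields registered */
    int heuristic_count; /* number of heuristic functions registered */
} toy_entry;

/*
 * Assumed rules, applied in order:
 *   1. no parent                      -> real protocol
 *   2. parent, no fields, no heuristics -> pino (strict "name only")
 *   3. anything else with a parent    -> 'tweener
 */
toy_category toy_infer_category(const toy_entry *e)
{
    if (e->parent < 0)
        return TOY_CAT_REAL;
    if (e->field_count == 0 && e->heuristic_count == 0)
        return TOY_CAT_PINO;
    return TOY_CAT_TWEENER;
}
```

With rules like these a dissector never states its category; registering a first hf_ field under a child entry would quietly move it from pino to 'tweener, which is the kind of "under the covers" behavior described above.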
 
Parent/child relationships in the underlying architecture lend themselves well to subtrees in the GUI. This is a case where I don't have better suggestions, but it does feel like that's the lazy developer option. It's also where sheer volume (for things like Bluetooth) really causes it to break down. The search functionality does mitigate it some; I'm just not sure it's enough. Does having "real protocols" as the root node of a tree for all other categories make sense?
 
Ideas/opinions welcome.
 
Michael
 
P.S. I'm not trying to pick on OpenFlow or any other protocol example I give in the email. They are just examples of problems to be solved or questions that need to be asked when new dissectors come in for review.