On Feb 22, 2011, at 12:15 PM, David Aggeler wrote:
> Welcome to the club. Even though my opinion is not shared widely, there's no good TCP reassembly API in place to handle unknown length reassembly at TCP level reliably.
There isn't, because there's no single mechanism for delimiting messages over a byte-stream protocol. A number of protocols are structured similarly to HTTP: the message headers are text, with the first line being a request or reply code, followed by zero or more optional headers, followed by a blank line, followed by the body. The body either has its length specified by one of the optional headers or by some other mechanism (I forget how chunked encoding works - it might be different), or is terminated by a connection half-close. There is code in Wireshark to support that model. Other protocols might use other mechanisms; there's not necessarily code for all of those mechanisms.
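To make the HTTP-style model concrete, here is a minimal standalone sketch in C of the two steps it implies: finding the blank line that ends the headers, then reading a length header to learn how long the body is. This is not Wireshark's code; the function names `find_header_end()` and `find_content_length()` are made up for illustration, and the sketch assumes CRLF line endings and a `Content-Length`-style header, ignoring chunked encoding and half-close termination.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the offset just past the blank line
 * ("\r\n\r\n") that ends the headers, or 0 if it has not arrived yet. */
size_t
find_header_end(const char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return i + 4;
    }
    return 0;
}

/* Hypothetical helper: scans the header block line by line for a
 * "Content-Length:" header (name matched case-insensitively) and returns
 * its value, or -1 if the header is absent or has no digits.
 * Overflow of very large values is not handled in this sketch. */
long
find_content_length(const char *hdrs, size_t hdr_len)
{
    static const char name[] = "content-length:";
    const size_t name_len = sizeof name - 1;
    size_t line = 0;

    assert(hdrs != NULL);
    while (line < hdr_len) {
        size_t eol = line;
        size_t i = 0;

        /* Find the end of the current line. */
        while (eol < hdr_len && hdrs[eol] != '\n')
            eol++;

        /* Compare the start of the line against the header name. */
        if (eol - line >= name_len) {
            while (i < name_len &&
                   tolower((unsigned char)hdrs[line + i]) == name[i])
                i++;
        }
        if (i == name_len) {
            size_t p = line + name_len;
            long value = 0;
            int digits = 0;

            while (p < eol && (hdrs[p] == ' ' || hdrs[p] == '\t'))
                p++;
            while (p < eol && isdigit((unsigned char)hdrs[p])) {
                value = value * 10 + (hdrs[p] - '0');
                digits++;
                p++;
            }
            return digits ? value : -1;
        }
        line = eol + 1;
    }
    return -1;
}
```

A caller would treat a message as complete once `find_header_end()` returns non-zero and at least that many bytes plus the returned body length are available; until then it has to wait for more stream data.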
> If yo know the length, tcp_
Presumably you meant to say "if you know the length, tcp_dissect_pdus() can be used", where "if you know the length" means:
1) every message is at least N bytes long, for some fixed value of N;
2) by looking at the first N bytes of the message, you can determine the message length (either it contains a field giving the message length, or it contains a message type field from which you can infer the length, or...).
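The two conditions above can be sketched as a small standalone C example. It is not the Wireshark API and does not show how tcp_dissect_pdus() is called; the wire format (a 4-byte header holding a 2-byte type and a 2-byte big-endian body length) and the names `get_pdu_len()` and `split_pdus()` are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire format, for illustration only: a fixed 4-byte header
 * (so N = 4) made of a 2-byte message type followed by a 2-byte
 * big-endian length of the body that follows the header. */
#define HDR_LEN 4u

/* Returns the total length (header + body) of the message starting at
 * buf, or 0 if fewer than HDR_LEN bytes are available, in which case the
 * length cannot be determined yet. */
size_t
get_pdu_len(const uint8_t *buf, size_t avail)
{
    if (avail < HDR_LEN)
        return 0;
    return HDR_LEN + (((size_t)buf[2] << 8) | buf[3]);
}

/* Walks a contiguous, in-order chunk of the byte stream, counts the
 * complete messages in it, and stores in *consumed how many bytes those
 * messages occupy. Bytes past *consumed belong to a partial message; the
 * caller keeps them and retries once more data has arrived. */
size_t
split_pdus(const uint8_t *buf, size_t len, size_t *consumed)
{
    size_t off = 0;
    size_t count = 0;

    assert(buf != NULL && consumed != NULL);
    for (;;) {
        size_t pdu_len = get_pdu_len(buf + off, len - off);

        /* Stop on a partial header or a partial body. */
        if (pdu_len == 0 || pdu_len > len - off)
            break;
        off += pdu_len;
        count++;
    }
    *consumed = off;
    return count;
}
```

The point of the split is that the length function only ever looks at the first N bytes, which is what makes this style of framing workable when a message is spread across several TCP segments.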
Protocols running on top of TCP can assume that data is delivered in order (if it's not, your TCP implementation is broken), so they don't have to provide their own sequence numbers for reassembling data split between TCP segments. Unfortunately, Wireshark doesn't currently guarantee in-order delivery of TCP segment data to dissectors. Arguably, if you have TCP's "allow subdissectors to reassemble" preference set, it should attempt, as best it can, to do so, with out-of-order segments saved for reassembly if the missing segments appear later in the capture. Even then, there's no guarantee that the missing segments, even if they were successfully transmitted, will be in the capture, for various reasons (the capture was started after they were on the wire, packets were dropped when capturing, etc.).
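As a rough illustration of what "saving out-of-order segments for later reassembly" could look like, here is a toy standalone C sketch. It is not how Wireshark implements anything; `struct reasm`, `reasm_init()`, `reasm_add()` and the tiny fixed `STREAM_MAX` buffer are all hypothetical, and sequence numbers are assumed to be already converted to offsets relative to the start of the stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STREAM_MAX 64u  /* deliberately tiny; illustration only */

/* Hypothetical per-direction reassembly state. */
struct reasm {
    uint8_t data[STREAM_MAX];  /* stream bytes, indexed by relative offset */
    uint8_t have[STREAM_MAX];  /* 1 if the byte at that offset was seen */
    size_t  contiguous;        /* bytes deliverable in order from offset 0 */
};

void
reasm_init(struct reasm *r)
{
    assert(r != NULL);
    memset(r, 0, sizeof *r);
}

/* Stores a segment at its relative sequence offset and returns how many
 * bytes are now available in order from the start of the stream.
 * Segments that do not fit in the toy buffer are ignored. A segment that
 * arrives ahead of a gap is kept, but does not advance the in-order count
 * until the gap is filled, which may never happen if the missing segment
 * is not in the capture. */
size_t
reasm_add(struct reasm *r, size_t rel_seq, const uint8_t *payload, size_t len)
{
    assert(r != NULL && payload != NULL);
    if (rel_seq <= STREAM_MAX && len <= STREAM_MAX - rel_seq) {
        memcpy(r->data + rel_seq, payload, len);
        memset(r->have + rel_seq, 1, len);
    }
    while (r->contiguous < STREAM_MAX && r->have[r->contiguous])
        r->contiguous++;
    return r->contiguous;
}
```

Only the first `contiguous` bytes would be handed to a dissector; anything stored beyond a gap stays parked until the gap closes.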