Wireshark-dev: Re: [Wireshark-dev] How to skip unrecognizable packets in saved pcap files

From: Ye Deng <yedeng0@xxxxxxxxx>
Date: Mon, 19 Sep 2011 10:38:42 -0400
Hello Guy,

Thanks A LOT for the very informative reply.


On Mon, Sep 19, 2011 at 3:17 AM, Guy Harris <guy@xxxxxxxxxxxx> wrote:

On Sep 18, 2011, at 9:22 PM, Ye Deng wrote:

> I have a serious issue when using libpcap functions to process pcap files.
> The error happens when I use the pcap_next_ex() function to read packets from saved pcap files one by one: pcap_next_ex() stops processing and returns the error "bogus savefile header".
>
> Therefore I would like to know how to skip the unrecognizable packets and let the libpcap functions process the remaining valid packets. I would strongly prefer to use some *existing* modules/tools to do the job.
> I tried "mergecap" and "editcap" and found that they cannot skip the unrecognizable packets. Is there an "improved mergecap/editcap" that can do the job and produce pcap files without any unrecognizable packets?

None that I know of.  The program would have to use something other than either libpcap or Wireshark's Wiretap code to read the capture file, because (as you've discovered) both of them regard packets with a size bigger than 65535 as invalid.

> After doing some research online, I found that the "unrecognizable packets" may be produced by transferring files over HTTP/FTP in text mode.
> Please search for "corrupt" on the webpage below:
> http://www.winpcap.org/ntar/draft/PCAP-DumpFileFormat.html
> Therefore, I think the pcap next generation dump file format (pcap-ng) can deal with this issue.

Yes, it can deal with this issue.

It deals with it by having a field in the file that will be changed if you transfer a file in text mode between systems with different line ending conventions (for example, between Windows and UN*X) and by treating a file with a wrong value in that field as being damaged.

It does *NOT* deal with it by doing anything more than that.  In particular, it does *NOT*, and cannot, magically undo the damage done to the file by transferring it in text mode.
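
To illustrate the detection described above, here is a minimal sketch; it is not libpcap or Wiretap code. It assumes the field in question is the 4-byte block type of the pcap-ng Section Header Block, 0x0A0D0D0A, which contains both CR and LF bytes. The function names are hypothetical, and the two conversion helpers only simulate a text-mode transfer.

```python
# Block type of a pcap-ng Section Header Block (assumption stated above).
# The bytes 0x0A 0x0D 0x0D 0x0A include CR and LF, so a text-mode
# transfer in either direction changes them.
PCAPNG_SHB_BLOCK_TYPE = b"\x0a\x0d\x0d\x0a"


def starts_with_intact_shb(first_bytes: bytes) -> bool:
    """Return True if the data begins with an unmodified SHB block type."""
    return first_bytes[:4] == PCAPNG_SHB_BLOCK_TYPE


def crlf_to_lf(data: bytes) -> bytes:
    """Simulate a text-mode transfer from Windows to UN*X."""
    return data.replace(b"\r\n", b"\n")


def lf_to_crlf(data: bytes) -> bytes:
    """Simulate a text-mode transfer from UN*X to Windows."""
    return data.replace(b"\n", b"\r\n")
```

A reader can use a check like this only to reject a damaged file early; it says nothing about where else in the file bytes were changed.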

> But I tried "pcap-ng" in Wireshark and got an assertion failure during every capture test, which suggests that the "pcap-ng" related functions are still unfinished...

No, that shows that there's a bug somewhere.  What was the assertion failure?  We'd like to fix the bug, but we'd need to know the assertion failure.
My experiment is simple.
On my iMac, I start Wireshark (Version 1.4.1: SVN Rev 34476 from /trunk-1.4) as root from a terminal;
Then I turn on this option in the Wireshark GUI: Capture->Options->Capture packets in pcap-ng format (experimental);
I visit some website and wait for output to appear in the Wireshark GUI window;
The terminal window shows: "ERROR:new_packet_list.c:1126:show_cell_data_func: assertion failed: (cell_text)
[1]+  Done                    /Applications/Wireshark.app/Contents/MacOS/Wireshark"
The Wireshark GUI window closes when the error message above appears.

 
However, even if we fix that bug, and any other bugs you run into:

       1) it will not magically be able to read pcap-ng files that have been damaged by being transferred in text mode;

       2) even if it could (which it can't, as there's no way for it to figure out where, in the file, the pair of bytes 0x0d 0x0a was turned into the single byte 0x0a, or the single byte 0x0a was turned into the pair of bytes 0x0d 0x0a - the first of those would happen if a file were transferred in text mode from Windows to UN*X, the second of those would happen if a file were transferred in text mode from UN*X to Windows), it wouldn't help you, because your file is in pcap format, not pcap-ng format.
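
The reason the damage cannot be undone can be shown with a small sketch (the helper name and sample bytes are made up for illustration): two different original byte sequences become identical after a Windows-to-UN*X text-mode conversion, so nothing in the result tells you which one you started with.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Simulate a text-mode transfer from Windows to UN*X."""
    return data.replace(b"\r\n", b"\n")


# Two distinct pieces of binary packet data (hypothetical values):
original_a = b"\x01\x0d\x0a\x02"   # really contained 0x0d 0x0a
original_b = b"\x01\x0a\x02"       # contained only 0x0a

damaged_a = crlf_to_lf(original_a)
damaged_b = crlf_to_lf(original_b)

# Both now hold the same bytes, so the conversion cannot be inverted.
same_after_transfer = damaged_a == damaged_b
```

Every 0x0a in a damaged file is a candidate for a lost 0x0d, and the record lengths in the file no longer say which candidates are real.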
I understand. To deal with this issue properly, I had better capture packets in pcap-ng format, which means I should use a newer Wireshark/TShark/tcpdump that understands pcap-ng to do the capturing.

When will the new TShark/tcpdump be stable enough to use? I would like to write some bash scripts that use TShark/tcpdump to capture packets in pcap-ng format automatically for my project.

 
> Also, I read the libpcap source code; that error occurs when the captured length of a packet is considered too large.
> In "/libpcap-1.1.1/sf-pcap.c", in the function below:
> static int pcap_next_packet(pcap_t *p, struct pcap_pkthdr *hdr, u_char **data)
> {
> ... ...
>     if (hdr->caplen > 65535) {
>         snprintf(p->errbuf, PCAP_ERRBUF_SIZE, "bogus savefile header");
>         return (-1);
>     }
> ... ...
> }
>
> Based on the pcap file format:  http://wiki.wireshark.org/Development/LibpcapFileFormat
> I think it is possible to do a "magic number search" when the if() above is true. The bytes holding that "magic number" can be treated as the beginning of the next valid packet.
> Notice that every valid packet has a timestamp in its packet header.
> typedef struct pcaprec_hdr_s {
> guint32 ts_sec; /* timestamp seconds */
> guint32 ts_usec; /* timestamp microseconds */
> guint32 incl_len; /* number of octets of packet saved in file */
> guint32 orig_len; /* actual length of packet */
> } pcaprec_hdr_t;
> If we know the time range of the capture, we can use some bytes of "pcaprec_hdr_s.ts_sec" as the "magic number".

There is no guarantee that

       1) the packets in the file before the first packet with a too-large captured length have not had their data damaged by transferring the file in text mode, so just because they're "valid" in that the captured length isn't > 65535 that doesn't mean they're "valid" in the sense that the data actually reflects what was captured;
I agree. What I proposed is not a perfect solution.
The other components of my packet-processing program examine the remaining packet data to extract information for my project, which means those components will also check whether that data is "valid".

Regards,
Deng
 

       2) the same applies to packets after the ones you've skipped.
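
For reference, the "magic number search" idea discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing tool: it reads the file itself rather than going through libpcap or Wiretap, it assumes a little-endian pcap file with the 24-byte file header and the 16-byte record header quoted above, and the names `plausible_header` and `salvage_records` are hypothetical. The heuristic accepts a record header only if its timestamp lies in a known capture time range and its lengths look sane.

```python
import struct

MAX_CAPLEN = 65535     # limit from the sf-pcap.c excerpt quoted above
FILE_HDR_LEN = 24      # pcap file header size
REC_HDR_LEN = 16       # ts_sec, ts_usec, incl_len, orig_len


def plausible_header(buf, off, ts_min, ts_max, endian="<"):
    """Heuristic: does a sane-looking record header start at buf[off]?"""
    if off + REC_HDR_LEN > len(buf):
        return False
    ts_sec, ts_usec, incl_len, orig_len = struct.unpack_from(
        endian + "IIII", buf, off)
    return (ts_min <= ts_sec <= ts_max
            and ts_usec < 1_000_000
            and incl_len <= MAX_CAPLEN
            and incl_len <= orig_len
            and off + REC_HDR_LEN + incl_len <= len(buf))


def salvage_records(buf, ts_min, ts_max, endian="<"):
    """Yield (ts_sec, ts_usec, data) for records that pass the heuristic,
    sliding forward one byte at a time past anything that does not."""
    off = FILE_HDR_LEN
    while off + REC_HDR_LEN <= len(buf):
        if plausible_header(buf, off, ts_min, ts_max, endian):
            ts_sec, ts_usec, incl_len, _ = struct.unpack_from(
                endian + "IIII", buf, off)
            start = off + REC_HDR_LEN
            yield ts_sec, ts_usec, buf[start:start + incl_len]
            off = start + incl_len
        else:
            off += 1  # resynchronize on the next byte
```

As pointed out above, a record that passes this check is not guaranteed to be undamaged: bytes inside the packet data may still have been altered by the text-mode transfer, both before and after the skipped region, so any consumer has to validate the packet contents itself.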
___________________________________________________________________________
Sent via:    Wireshark-dev mailing list <wireshark-dev@xxxxxxxxxxxxx>
Archives:    http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
            mailto:wireshark-dev-request@xxxxxxxxxxxxx?subject=unsubscribe