Wireshark-users: [Wireshark-users] Parsing out HTTP requests from Wireshark/Ethereal saved packet

From: "David Luu" <DLuu@xxxxxxxxxxxx>
Date: Mon, 7 Aug 2006 17:56:26 -0700
I don't know whether anyone has done something similar before, but I'd rather not reinvent the wheel if I don't have to. Either way, I could use some advice.
I'd like to automate some web configuration that is currently done by Java applets making HTTP POST requests to ASP server-side script pages (I prefer this over GUI automation of the applets). Rather than analyzing the HTTP requests by hand, I'd like a script (in Perl, for example) to parse the request information and extract the URL and POST data. My plan is to parse capture files saved in libpcap format from Wireshark/Ethereal, rather than capturing and parsing the requests in real time with Perl and some libpcap library. With that, I could (nearly) automate building a Perl user-agent script that performs the same web configuration.
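To make the goal concrete, here is a minimal sketch of the extraction step, written in Python only for illustration (the same logic should port to Perl). It assumes the raw bytes of a single, complete HTTP request are already in hand, and that the POST body is form-encoded, which may not be true for what the applets send. The function name `parse_http_request` and the sample URL and field names in the usage note are made up for the example.

```python
from urllib.parse import parse_qs


def parse_http_request(raw):
    """Split one raw HTTP request (bytes) into method, URL, headers and POST data."""
    # The header block ends at the first blank line; the rest is the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Request line, e.g. "POST /path HTTP/1.1"
    method, url, version = lines[0].split(" ", 2)

    # Header names are case-insensitive, so store them lowercased.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    # Trust Content-Length when present so trailing bytes are not mistaken for body.
    length = int(headers.get("content-length", len(body)))
    body = body[:length]

    # Decode form fields only when the body is declared as form-encoded (assumption).
    form = {}
    if headers.get("content-type", "").startswith("application/x-www-form-urlencoded"):
        form = parse_qs(body.decode("iso-8859-1"), keep_blank_values=True)

    return {"method": method, "url": url, "version": version,
            "headers": headers, "body": body, "form": form}
```

Given a request for a hypothetical `/config/save.asp` page with the body `name=router&port=8080`, the returned dictionary would carry the URL under `"url"` and the decoded fields under `"form"`, which is the information a user-agent script would need to replay the request.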
This looks feasible, except that the saved capture file contains non-ASCII binary data between the end of one HTTP request/response body and the header fields of the next HTTP request/response. That binary data also has some ASCII characters mixed in, which makes it more troublesome to parse out.
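As far as I understand the classic libpcap file format, that binary data is probably not padding: the file starts with a 24-byte global header, and every captured packet is stored as a 16-byte record header followed by the raw frame, so the Ethernet, IP and TCP headers of each packet sit in front of the HTTP text. Below is a rough sketch, again in Python for illustration, of walking the records and keeping only the TCP payload. It assumes Ethernet link type, IPv4, no VLAN tags and no IP fragmentation, and it does no TCP stream reassembly, so a request split across several packets comes out in pieces. The name `iter_tcp_payloads` is my own.

```python
import struct


def iter_tcp_payloads(pcap_bytes):
    """Yield (src_port, dst_port, payload) for each IPv4/TCP packet carrying data."""
    # The magic number tells us the byte order used for the file's own headers.
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic libpcap file")

    # The link type is the last 4 bytes of the 24-byte global header; 1 = Ethernet.
    linktype = struct.unpack_from(endian + "I", pcap_bytes, 20)[0]
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled in this sketch")

    offset = 24
    while offset + 16 <= len(pcap_bytes):
        # Per-packet record header: ts_sec, ts_usec, captured length, original length.
        _, _, incl_len, _ = struct.unpack_from(endian + "IIII", pcap_bytes, offset)
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len

        # Ethernet header is 14 bytes; EtherType 0x0800 means IPv4.
        if len(frame) < 14 + 20 or frame[12:14] != b"\x08\x00":
            continue
        ip = frame[14:]
        if ip[0] >> 4 != 4 or ip[9] != 6:  # IP version 4, protocol 6 = TCP
            continue
        ip_header_len = (ip[0] & 0x0F) * 4
        total_len = struct.unpack_from("!H", ip, 2)[0]

        # Slicing to total_len drops any Ethernet padding after the IP packet.
        tcp = ip[ip_header_len:total_len]
        if len(tcp) < 20:
            continue
        src_port, dst_port = struct.unpack_from("!HH", tcp, 0)
        tcp_header_len = (tcp[12] >> 4) * 4
        payload = tcp[tcp_header_len:]
        if payload:
            yield src_port, dst_port, payload
```

Reading the capture with `open(path, "rb").read()` and keeping only the payloads whose destination port is the web server's port should give the HTTP request text without the per-packet headers in between.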
So I was wondering whether someone has already written a similar HTTP request parsing framework that I could use. Alternatively, is there a way (in Wireshark/Ethereal, etc.) to save or extract the HTTP requests as pure ASCII messages as defined in the RFCs, without the binary data in between, so that they would be easier to parse?
David Luu