Wireshark · Ethereal-dev: Re: Patch in reassemble.c (was Re: [Ethereal-dev] Crash in ethereal 0.10.8, somewhat reproducible)

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: Peter Johansson <Peter.Johansson@xxxxxxxxxxxx>

Date: Mon, 25 Jul 2005 22:43:58 +0200

Guy Harris wrote:

Peter Johansson wrote:
I have not forgotten about this, I have just been a little bit busier than usual. I finally tracked down the problem though.
My conclusion is this:
Ethereal crashes in reassemble.c because reassemble.c copies data to a memory area that is not yet allocated (fd_i->flags has the FD_NOT_MALLOCED bit set). I have a solution to this (ensuring that a crash does not occur) which I will post once I have done some cleaning up.
The crash in reassemble.c occurs only as the result of a faulty protocol dissector. In this case it is packet-bittorrent.c that is the reason for the crash.
The Bittorrent dissector registers only a heur-dissector (which should be fine). But once the heur test function detects that this TCP stream is in fact Bittorrent data, it creates a conversation, making sure that all future data in the same TCP stream is decoded by the Bittorrent protocol dissector without the use of the heur test function (this too should be fine I guess).
The heur test function is capable of telling the calling framework whether the PDU was in fact decoded by this dissector or not by returning TRUE or FALSE. In packet-bittorrent.c, the function that dissects Bittorrent data based on the fact that it belongs to a conversation does not have the opportunity of telling the calling framework that it in fact cannot decode the supplied PDU if necessary. And this is necessary in the rare event that packet-tcp has marked the current PDU with "[TCP previous segment lost]". In this case some data is missing but the Bittorrent dissector still assumes that the first 4 bytes of the PDU denote the length of the PDU to be dissected. The problem now is that since data was lost, the length is read using a random offset into the original Bittorrent packet.
My guess is that this could happen for any dissector that is called because the data belongs to a conversation created by that dissector, whenever data has been lost.
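The misread length described above can be illustrated with a small, self-contained sketch. It is not taken from packet-bittorrent.c; the names read_be32() and apparent_length() are hypothetical, and the only assumption about the protocol is the 4-byte big-endian length prefix mentioned in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 4-byte big-endian integer, as a length-prefixed protocol would. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Hypothetical helper: the length a dissector would believe in if the
 * stream resumed `skipped` bytes past the true start of a message,
 * e.g. because an earlier TCP segment was lost.
 */
static uint32_t apparent_length(const uint8_t *stream, size_t stream_len,
                                size_t skipped)
{
    assert(skipped + 4 <= stream_len);
    return read_be32(stream + skipped);
}
```

With a stream holding the message 00 00 00 05 "hello", reading at offset 0 gives the real length 5, while reading at offset 4 interprets the payload bytes "hell" as a length of 0x68656c6c. A conversation dissector that cannot say "not mine" has no way to reject that value.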
Should packet-tcp perhaps not call higher level dissectors when the PDU is marked with "[TCP previous segment lost]", or at least not perform the try_conversation_dissector(...) call?
What would be the better way of ensuring that this does not happen with any of the already existing dissectors?
Should perhaps the API at hand for dissectors be changed so that when decoding PDU data, the dissector would be able to return TRUE or FALSE in a similar way to the heur functions? This way, any dissector would be able to tell the lower layer dissector that although it should have handled this PDU, it could not.
What is your opinion?

/ Peter

Unfortunately:
and
2) I don't understand Swedish so I can't easily tell what the technical discussion above says.
That feature is used by several dissectors, such as the HTTP dissector which, when it's reassembling the entity headers of an HTTP request or response, keeps requesting more data until it sees the blank line at the end of the entity headers, at which point it says the reassembly is complete.
That feature is now broken because:
The "If it was already defragmented and this new fragment goes beyond data limits" loop at the top of "fragment_add_work()" "undoes" the reassembly by pointing fragments that no longer have data, because it was copied to the reassembled chunk and then freed, at the target of the copy in the reassembled chunk, and sets the FD_NOT_MALLOCED flag on those fragments.
The "we have received an entire packet, defragment it and free all fragments" code in "fragment_add_work()" saves the pointer to the old reassembled chunk, allocates a new chunk to hold the reassembled data, and then falls into the "add all data fragments" loop.
The "add all data fragments" loop in "fragment_add_work()" then used to copy *all* the fragments, regardless of whether FD_NOT_MALLOCED was set on the fragment or not, into the newly-allocated chunk. It now copies only the chunks with FD_NOT_MALLOCED set, and reports the others as being "Reassemble error"s.
This means that, in the reassemblies after the first reassembly, some of the data in the reassembled chunk is whatever just happened to be there at the time of the allocation.
The old code *did* work correctly for some captures I have with HTTP traffic in them - FD_NOT_MALLOCED doesn't mean "fd_i->data isn't valid", it means "it's not the address of a mallocated chunk, it's an address *in* a mallocated chunk".
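The meaning of the flag can be sketched in a few lines of C. This is only an illustration of the ownership rule described above, not the code in reassemble.c; struct frag, FRAG_NOT_MALLOCED, frag_absorb() and frag_release() are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FRAG_NOT_MALLOCED 0x01u  /* hypothetical stand-in for FD_NOT_MALLOCED */

struct frag {
    unsigned char *data;   /* own chunk, or an address inside the whole */
    size_t         offset; /* position of this fragment in the whole */
    size_t         len;
    unsigned       flags;
};

/*
 * Copy the fragment into the reassembled buffer, free its private copy,
 * and leave it pointing *into* the reassembled buffer. The data stays
 * valid; only the ownership changes.
 */
static void frag_absorb(struct frag *f, unsigned char *whole)
{
    if (f->flags & FRAG_NOT_MALLOCED)
        return;                       /* already lives inside a whole */
    memcpy(whole + f->offset, f->data, f->len);
    free(f->data);
    f->data   = whole + f->offset;
    f->flags |= FRAG_NOT_MALLOCED;
}

/* Free the fragment's data only if the fragment owns it. */
static void frag_release(struct frag *f)
{
    if (!(f->flags & FRAG_NOT_MALLOCED))
        free(f->data);
    f->data = NULL;
}
```

In this sketch a flagged fragment is still readable through f->data for as long as the reassembled buffer is alive, which is why skipping flagged fragments during a later copy would leave holes in the new buffer.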
What are the details of the cases where the old code *didn't* work?
It might be that the correct fix is to, in the "we have received an entire packet..." code, set "fd_head->data" to "g_realloc(fd_head->data, max)", which means that the data that was already copied there during previous reassemblies will still be there. However, we *still* need to get rid of the printout of the "Reassemble error" message, because it's bogus to print that message every time we, for example, reassemble HTTP entity headers - which means we should really figure out why we're doing that in cases where it *is* an error, and figure out where to fix that.
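A rough sketch of the "grow in place and keep what was already copied" idea follows. It uses the standard realloc() rather than g_realloc() so that it builds without GLib, and struct reassembly and reassembly_grow() are hypothetical names, not part of reassemble.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct reassembly {
    unsigned char *data;  /* reassembled bytes so far */
    size_t         len;   /* current size of data */
};

/*
 * Grow the reassembled buffer to new_len bytes. The bytes copied during
 * earlier reassemblies are preserved; the new tail is zeroed so that it
 * never holds leftover heap contents.
 * Returns 0 on success, -1 on failure (the old buffer is left untouched).
 */
static int reassembly_grow(struct reassembly *r, size_t new_len)
{
    unsigned char *p;

    assert(new_len > 0 && new_len >= r->len);
    p = realloc(r->data, new_len);
    if (p == NULL)
        return -1;
    memset(p + r->len, 0, new_len - r->len);
    r->data = p;
    r->len  = new_len;
    return 0;
}
```

One thing to keep in mind with this approach: realloc() may move the block, so any fragment that points into the old buffer would have to be re-pointed at the new address (for instance by recomputing it from a stored offset) after the call.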
"tcp_dissect_pdus()" uses the "continue reassembly" feature - it first tries reassembling the fixed-length portion of the PDU, so that the "get the length" routine has enough of that portion to find out how large the packet is, and then tries reassembling the entire packet, so if the 4-byte header of a presumed BitTorrent packet is split across TCP segments, that code path would be used.
One place where there's *definitely* a risk of problems is a presumed BitTorrent packet where the presumed length field is greater than 2^32-5, so that when 4 is added to it we overflow and get a value *less* than 4. However, going back to at least 0.10.8, if the get_pdu_len routine called by "tcp_dissect_pdus()" returns a value less than the length of the fixed-length portion of the PDU, that's assumed to be an overflow, so it just shows a "Malformed packet" error and quits.
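The wraparound and the check against it can be shown with plain 32-bit arithmetic. This is a minimal sketch of the idea only; FIXED_LEN, get_pdu_len() and pdu_len_is_sane() here are illustrative names and do not reproduce the code in packet-tcp.c.

```c
#include <assert.h>
#include <stdint.h>

#define FIXED_LEN 4u  /* size of the length prefix itself */

/*
 * Total PDU length as a "get the length" routine might compute it:
 * the length field plus the 4 bytes of the field itself. Unsigned
 * 32-bit addition wraps modulo 2^32.
 */
static uint32_t get_pdu_len(uint32_t length_field)
{
    return length_field + FIXED_LEN;
}

/*
 * A total shorter than the fixed-length header cannot be real, so it
 * is treated as the result of an overflow.
 */
static int pdu_len_is_sane(uint32_t pdu_len)
{
    return pdu_len >= FIXED_LEN;
}
```

A length field of 2^32-5 (0xFFFFFFFB) still yields a representable total, while 0xFFFFFFFC and above wrap to 0 through 3, all of which fail the check.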
This is a repost since my first attempt (a week ago) did not seem to reach the list.
1) My apologies, my intention was of course not to break the reassembly code...although I don't quite understand what you imply. I thought I had verified that packet reassembly still works after the patch. Am I just not looking at enough layers of dissectors on top of packet-tcp?
Are you looking at HTTP packets with the HTTP entity headers split across TCP segment boundaries and with reassembly of HTTP entity headers enabled?

No, I haven't looked at that. Is there perhaps a sample capture file that can be used for this? I looked (now) at the wiki but could not find one.

If not, you're less likely to see this problem.  As per my message:

>>     1) this completely breaks the feature wherein a TCP dissector,
>> handed a reassembled chunk of data, can indicate that it needs at
>> least N more bytes of data to be added to the reassembled chunk, so
>> that the reassembly has to be continued
reassembly isn't *completely* broken, it's just broken in the case where reassembly is done but the dissector says, when handed the reassembled chunk of data, "sorry, I need even more data" causing reassembly to be restarted.

OK.

2) My part of the conversation with Ronnie Sahlberg, translated from Swedish, is attached in the Swedish2English.txt file.
Do you also have a translation of what Ronnie said?

He sent me a private mail (hence the Swedish conversation) and gave me some ideas about what to look for as he wrote the reassembly routines from the beginning. At the same time he requested some sample capture files but he did not see the need to get them after I had presented what I had found so far (the translation that you have already read).

Since the reassembly function is now broken, I guess that the best thing to do right now is to back out my earlier proposed change to reassemble.c. This will however make ethereal crash again in memcpy called from reassemble.c, most certainly when decoding Bittorrent data.
*All* Bittorrent data, or just some?

Just some. It seems to occur especially when the capture is missing frames.
Anyhow, it only occurs when the Bittorrent dissector is enabled.

3. The crash only occurs when ethereal decodes gathered data, that is
only if performing a capture with "Update list of packets in realtime"
enabled.
"Update list of packets in realtime" doesn't affect the way dissection is done (it dissects as it reads the file - i.e., as new packets are put in the file - and puts up the dissection as that happens), but it *might* increase the chances of packets being dropped (as Ethereal's doing a lot more work as packets arrive).
Do you have a capture file that shows this problem? (If Ethereal crashes, the capture file will probably be in /tmp or /var/tmp or C:\temp or...; with the bug fix, you can save it.)

Currently, with the capture files that I have, it requires two sets of capture files (I captured bittorrent data to rotating sets of files). Each file is about 20MB.

I will fiddle with this a bit using your changes to reassemble.c.

Should perhaps the API at hand for dissectors be changed so that when decoding PDU data, the dissector would be able to return TRUE or FALSE in a similar way to the heur functions?
It should be changed to allow it to return some "mine" or "not mine" indication. Currently, we have such a mechanism, but it's not really clean. It returns an indication of how many bytes of the packet were dissected; unfortunately, there are places where returning 0 for "not mine" causes problems (yes, really - I could find out where it was, but it'd take some work). I've eliminated one place where that number is used; if I can eliminate all of them (or provide some other way to get that information), we could just have the dissectors return a TRUE/FALSE indication.
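As a rough illustration of what a "mine" / "not mine" return could buy, here is a self-contained sketch. None of these names exist in Ethereal: dissect_fn, dissect_length_prefixed(), dissect_as_data(), dispatch() and the MAX_PLAUSIBLE_BODY limit are all hypothetical, and the plausibility check is just one example of how a dissector might decide to refuse a PDU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PLAUSIBLE_BODY 0x20000u  /* hypothetical sanity limit */

/* A dissector that reports whether the data was its own. */
typedef bool (*dissect_fn)(const uint8_t *buf, size_t len);

/* Length-prefixed protocol: refuse PDUs whose prefix is implausible. */
static bool dissect_length_prefixed(const uint8_t *buf, size_t len)
{
    uint32_t body;

    if (len < 4)
        return false;
    body = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
    return body <= MAX_PLAUSIBLE_BODY;
}

/* Fallback that accepts anything and would show it as raw data. */
static bool dissect_as_data(const uint8_t *buf, size_t len)
{
    (void)buf;
    (void)len;
    return true;
}

/*
 * Try the conversation's dissector first; if it says "not mine",
 * fall back instead of trusting a bogus length.
 */
static const char *dispatch(dissect_fn conv, const uint8_t *buf, size_t len)
{
    if (conv(buf, len))
        return "conversation";
    return dissect_as_data(buf, len) ? "data" : "none";
}
```

With this shape, a PDU that starts in the middle of a message (as after a lost segment) is handed to the fallback rather than being reassembled to a garbage length.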


That would be neat.

I honestly do not know what the best solution would be for this problem as it is reassemble.c that is vulnerable to dissectors that cannot handle their data correctly.
I understand (now) that my interpretation of FD_NOT_MALLOCED was not correct, the fact still remains though that reassemble.c passed invalid (unallocated) data to memcpy. I only noticed this when the FD_NOT_MALLOCED bit was also set, hence my misinterpretation. I guess that Ronnie Sahlberg's proposal for a new memory allocation API could be used to detect that a memory area is not valid and should not be passed on to, for instance, memcpy.
The best solution to this problem is to figure out what code path leads there and, based on that, to figure out at what point a check should be inserted to detect it.


I do agree!

A too-large length, at least for a protocol running atop TCP, *shouldn't* cause a problem (other than filling up memory with reassembled data), as tcp_dissect_pdus() checks for overflows.


/ Peter

References:
- Re: Patch in reassemble.c (was Re: [Ethereal-dev] Crash in ethereal 0.10.8, somewhat reproducible)
  - From: Guy Harris
- Re: Patch in reassemble.c (was Re: [Ethereal-dev] Crash in ethereal 0.10.8, somewhat reproducible)
  - From: Peter Johansson
- Re: Patch in reassemble.c (was Re: [Ethereal-dev] Crash in ethereal 0.10.8, somewhat reproducible)
  - From: Guy Harris
