Wireshark-dev: [Wireshark-dev] Storing Generated Code in Git [Was: master 9079e3a: Cheat and tr

From: Evan Huus <eapache@xxxxxxxxx>
Date: Mon, 23 Jun 2014 17:06:21 -0400
Perhaps this is a discussion we should have had at Sharkfest, but it's come up now. Oh well.

My objections to generated code in git are two-fold: practical and philosophical.

Practically, it's painful to have to run make twice to test ASN.1 changes. It's painful to review diffs full of #line changes in generated code. It's painful to be debugging something and realize that the last time somebody changed the template file they forgot to regenerate the dissector. Generated code also complicates the use of tools like ctags because all the ASN.1 functions end up with two definitions - one in the original template, and one in the generated code.
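The "forgot to regenerate" case is the sort of thing a script could catch. Below is a minimal sketch, not anything that exists in the tree: the function name `stale_outputs` and the file names in the usage note are hypothetical. It flags a generated file that is missing or older than any of its inputs, which is the same timestamp comparison make performs.

```python
from pathlib import Path


def stale_outputs(pairs):
    """Return generated files that are missing or older than any input.

    `pairs` is an iterable of (generated_file, [input_files]) tuples.
    The pairing itself is an assumption; a real check would have to
    derive it from the build system.
    """
    stale = []
    for generated, inputs in pairs:
        generated = Path(generated)
        if not generated.exists():
            # Never generated at all.
            stale.append(generated)
            continue
        out_mtime = generated.stat().st_mtime
        # Stale if any input was modified after the generated file.
        if any(Path(src).stat().st_mtime > out_mtime for src in inputs):
            stale.append(generated)
    return stale
```

It would be called with something like `stale_outputs([("packet-foo.c", ["packet-foo-template.c", "foo.cnf"])])`. One caveat: git does not preserve modification times, so in a fresh checkout this comparison is only indicative. Comparing against a freshly regenerated copy would be more reliable, at the cost of needing the generator on every machine that runs the check.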

Philosophically, generated C code is not "source". Ideally we store only "source" in git, and the build process turns that source into usable binaries, libraries, etc. Most of our source already goes through an intermediate step (.o file, on *nix at least) before reaching its final form. Whether it goes through a second intermediate step as a .c file is irrelevant to me, because that step is also intermediate. I don't want to have to care about it unless I'm actually working on the build system.

---

As far as I can see, the main arguments for storing generated code in git are:
- not all platforms have the tools necessary to generate the code
- generating it can take lots of time

Both of these are valid points, in a general sense. I don't think either of them is particularly strong with respect to ASN.1 specifically, but there is a very good argument to be made for keeping the X11 dissector in git (for example).

---

There are also cases like lemon, where we don't store the generated files but we do store a fork of the entire generating program itself. Ideally we wouldn't do this either, but there are far fewer practical pain points with this approach.

---

Are there any general arguments I've missed? Other opinions, or perspectives?

Evan