Dr. Brian Robert Callahan

academic, developer, with an eye towards a brighter techno-social life




2021-02-14
I donated a patch and got a free compiler!

Remember when I used to author blog posts on this site? Me neither. Let's fix that.

I recently noticed that the Tiny C Compiler (tcc) developers gained an interest in widening their platform support. Chief among those new platforms were the BSDs. A developer had committed most of the work necessary to almost allow tcc to compile and link working binaries on OpenBSD. I say almost, because it was missing a key final feature to make things just work(TM).

Security warning

It should be stated in no uncertain terms that you use tcc at your own risk. You will lose all the mitigations you expect from in-base compilers, including but not limited to RETGUARD and ASLR. This blog post is about the process of a compiler gaining support for a new platform.

Linkers are tricky

See, tcc might be a little different from the other C compilers we are used to on Unix. More like clang, and less like gcc, tcc puts all the stages of compilation into a single binary. This means that the tcc binary includes a C preprocessor, C compiler, assembler, and linker. Compare that to clang, which (at least on OpenBSD) includes all those things except for the linker. And compare that to gcc, which requires a separate assembler and linker on the system, and keeps the preprocessor and compiler as separate binaries. And while clang has provisions to use any system linker, tcc does not. You only have the option of using the built-in linker with tcc. The compilation strategy of tcc is also similar to that of clang: directly output object code and skip machine-specific assembly altogether. The assembler is there to support inline assembly. By comparison, gcc compiles to assembly, which is then handed to the separate assembler for further processing.

With LLVM lld and GNU ld, you can expect all modern amenities. This is not the case with the tcc linker. The last piece of the tcc puzzle was that the tcc linker did not know how to select shared libraries on OpenBSD, and as such, could only select static libraries. This meant many linker failures.

My first thought was to bypass the tcc linker entirely and pass the linking step to cc. That ended up not being ideal. If you chose to execute a full compile, tcc -o code code.c, this would pass the entire compilation to cc. You needed to break it up into two stages, tcc -c code.c && tcc -o code code.o, in order to actually use tcc to compile the C code. A new strategy was needed.

Fortunately, I knew how to teach the tcc linker to select shared libraries on OpenBSD: use the same strategy I used to teach GNU gold the very same in my eternally unofficial port. I emailed the diff to the tcc developer spearheading the OpenBSD support, and it was quickly committed. The diff used some OpenBSD extensions like strtonum and was eventually replaced, but with it tcc was able to do the right thing on OpenBSD/amd64. You could use tcc as a drop-in replacement if you wanted.
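To make the shared library problem a little more concrete, here is a minimal sketch of the general idea. This is not the diff that went into tcc. It assumes the OpenBSD convention of naming shared libraries libNAME.so.MAJOR.MINOR and assumes the linker should prefer the highest version it can find. The function names parse_so_version and pick_shared_lib are hypothetical, and the sketch uses the portable strtol rather than strtonum so that it builds outside OpenBSD.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not tcc's code: match "lib<name>.so.<major>.<minor>".
 * Returns 1 and fills in major/minor on an exact match, 0 otherwise.
 */
static int
parse_so_version(const char *file, const char *name, long *major, long *minor)
{
	size_t len = strlen(name);
	char *end;

	if (strncmp(file, "lib", 3) != 0)
		return 0;
	file += 3;
	if (strncmp(file, name, len) != 0)
		return 0;
	file += len;
	if (strncmp(file, ".so.", 4) != 0)
		return 0;
	file += 4;

	/* strtol accepts leading whitespace and signs, so insist on a digit. */
	if (*file < '0' || *file > '9')
		return 0;
	errno = 0;
	*major = strtol(file, &end, 10);
	if (errno != 0 || *end != '.')
		return 0;

	file = end + 1;
	if (*file < '0' || *file > '9')
		return 0;
	errno = 0;
	*minor = strtol(file, &end, 10);
	if (errno != 0 || *end != '\0')
		return 0;

	return 1;
}

/*
 * Hypothetical helper: given the file names found in a library directory,
 * return the index of the highest-versioned shared library for `name`,
 * or -1 if there is none (the caller could then fall back to lib<name>.a).
 */
static int
pick_shared_lib(const char *const *files, size_t n, const char *name)
{
	long best_major = -1, best_minor = -1, major, minor;
	int best = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!parse_so_version(files[i], name, &major, &minor))
			continue;
		if (major > best_major ||
		    (major == best_major && minor > best_minor)) {
			best_major = major;
			best_minor = minor;
			best = (int)i;
		}
	}
	return best;
}
```

A real linker would build the list of file names by reading each directory on its library search path. The sketch takes a plain array instead, to keep the version comparison front and center.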

CPUs are tricky

Now we had OpenBSD/amd64 support only. While tcc supports aarch64, amd64, armv7, i386, and riscv, more work was required for other platforms. This is undoubtedly due to the differences in CPUs and their ABIs. You cannot simply expect things like register usage to be consistent across all CPUs. The tcc developer I was in communication with was able to fire up a virtual machine to bring up OpenBSD/i386 support. The developer expressed a desire to bring up OpenBSD/arm64 and OpenBSD/armv7 support as well.

Using the virtual machine, OpenBSD/i386 support was quickly added. But that was as far as the developer could go with currently available resources. I offered the developer shell access to my BeagleBone Black and my Raspberry Pi 3B+. This quickly became two tcc developers with shell access to my machines. Development from my perspective was a matter of rebooting the machines when things crashed and little more. And quite soon, I saw a commit for OpenBSD/arm64 and OpenBSD/armv7 support. Support for OpenBSD/arm64 was completed first. The OpenBSD/armv7 support required perhaps the most significant CPU-specific changes, since the tcc linker had to be taught about more parts of ARM ELF than were needed by other platforms.

Packages are not tricky

Despite not doing much more work than submitting a single diff, at this point I could go ahead and create a package that I subsequently committed. Writing the port was extremely straightforward, which was even nicer.

And that's the story of how I donated a patch and got a free compiler in return. Sometimes donating hardware is better than donating code.
