Dr. Brian Robert Callahan

academic, developer, with an eye towards a brighter techno-social life




2022-07-04
Your next C compiler is a D compiler: Introducing DMD's ImportC

In my never-ending quest to have oksh support every C compiler in existence, I sometimes find C compilers in places you wouldn't expect them. Today, I'd like to demonstrate the C compiler built into the Digital Mars D compiler, or DMD for short. Recent versions of DMD have a complete C11 compiler, named ImportC, built in. It is mature enough to almost fully build oksh. Let's take a look at it.

About ImportC

ImportC was written by Walter Bright, the creator of the D programming language. Previously, he single-handedly developed one of the first C++ compilers, Zortech C++. He also created the Empire video game.

Full disclosure: I have met Walter (virtually, at least). And as many of you know, I completed the port of all three D compilers, DMD, GDC, and LDC, to OpenBSD. I maintain the DMD, LDC, and Dub OpenBSD packages and help out with the GDC OpenBSD package. I have also spoken and done a Q&A session at DConf, the annual D conference. All this to say that yes, I am biased towards D succeeding, but I will do my best to provide a clear picture of what ImportC can do today.

ImportC, as mentioned, is a C11 compiler. As I understand it, ImportC focuses on standards compliance rather than implementing lots of different extensions. Because D modules and libraries can be freely linked with C modules and libraries, a common activity in the D community is to port C headers to D modules. These translations make it easier to write D code that interacts with C code and libraries. ImportC aims to eliminate this tedious work by allowing DMD to ingest C code directly, bypassing the need to write these C header to D module translations, at least in some cases.

As Phobos, the D Standard Library, incorporates a copy of zlib, ImportC eliminates the need for an external C compiler during the DMD build process, meaning that all of DMD can be built by DMD. Indeed, that was one of the motivations behind the development of ImportC.

However, as a side effect, this means that the DMD compiler is also a C compiler and can be used directly on C codebases that don't include any D code.

Building oksh with DMD

oksh is a C codebase without any D code in it. Let's see if DMD can compile oksh. I started by configuring oksh with OpenBSD cc, aka clang.

DMD uses different flags than the C compiler, and this needs to be taken into account. One of the first things to notice is that DMD does not have an -o flag. If you want to set the name of the output file, you need to use the -of flag, with no space between -of and the name of the output file. The next thing to know is that D itself does not use a preprocessor, so there is a different syntax for specifying flags to send to the C preprocessor when using ImportC. That flag is -P=. As oksh uses the -DEMACS and -DVI flags, those need to be converted to -P=-DEMACS and -P=-DVI for DMD.

Because of the -of flag difference, I had to write a new rule in the Makefile:

.c.o:
	${CC} ${CFLAGS} -of$@ -c $<

Combined with the converted CFLAGS, that yields invocations that look like this:

dmd -g -O -P=-DEMACS -P=-DVI -ofalloc.o -c alloc.c

Running make

One last thing before we get started: as we learned in the previous post, OpenBSD uses GNU extended assembly in some of its headers. While DMD does understand inline assembly, it does not understand GNU extended assembly. I simply #ifdef'd out the offending code in that header file so that I didn't have to do anything more complicated. It would be great if DMD learned how to handle GNU extended assembly, but for now this workaround is fine.
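As a rough illustration of that kind of workaround, the sketch below guards a piece of GNU extended assembly behind a macro. It is not the OpenBSD header in question: compiler_barrier and HAVE_GNU_EXT_ASM are made-up names, and __DMD__ is assumed to be a macro passed by hand (for example as -P=-D__DMD__), not something DMD is known to define on its own.

```c
/*
 * Sketch only. __DMD__ is an assumption: it is expected to be supplied
 * on the command line (e.g. -P=-D__DMD__), not predefined by the compiler.
 */
#if defined(__GNUC__) && !defined(__DMD__)
/* GNU extended asm: an empty template with a "memory" clobber. */
#define HAVE_GNU_EXT_ASM 1
static inline void
compiler_barrier(void)
{
	__asm__ __volatile__("" ::: "memory");
}
#else
/* Fallback for compilers that do not parse GNU extended asm. */
#define HAVE_GNU_EXT_ASM 0
static inline void
compiler_barrier(void)
{
}
#endif
```

Callers use compiler_barrier() either way; only the guarded branch contains syntax that a compiler without GNU extended assembly support would reject.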

Now I could run make. And it began compiling. Unfortunately, oksh triggers an error for an as-of-now unimplemented item:

dmd -g -O -P=-DEMACS -P=-DVI -ofc_ksh.o -c c_ksh.c
c_ksh.c(1210): Error: C designator-list not supported yet
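For readers unfamiliar with the term: in the C11 grammar, a designator-list is the run of .member and [index] designators in front of the = in a designated initializer. The sketch below shows single and chained designators. The struct and variable names are invented for illustration and nothing here is taken from c_ksh.c, so it says nothing about which form tripped DMD.

```c
struct point { int x, y; };
struct segment { struct point from, to; };

/* Single designators: one .member before each '='. */
static const struct point shifted = { .x = 1, .y = 2 };

/* Chained designators: .to.x names a member of a nested struct. */
static const struct segment seg = { .to.x = 5, .to.y = 7 };

/* Array index followed by a member: [2].y targets pts[2].y. */
static const struct point pts[3] = { [2].y = 9 };
```

Any subobject not named by a designator is zero-initialized, so seg.from.x and pts[0].y are both 0 here.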

OK, so I built this file with clang and kept going. There was one more file that DMD couldn't compile:

dmd -g -O -P=-DEMACS -P=-DVI -ofexpr.o -c expr.c
expr.c(204): Error: cannot modify `const` expression `(*es).tok`
expr.c(205): Error: cannot modify `const` expression `(*es).val`

I wonder if this is a bug in ImportC. No other C compiler we've tried fails on this code.
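One C construct worth ruling out, and this is only a guess since the lines from expr.c are not reproduced here, is a const pointer to a non-const struct. In C, the const in such a declaration applies to the pointer alone, so writing through it is valid. D's const, by contrast, is transitive, so a D-style reading of the same declaration would make the pointed-to struct const as well. The names below are hypothetical.

```c
struct state { int tok; int val; };

/* es is a const pointer; the struct it points to is still mutable. */
static int
fill(struct state *const es)
{
	es->tok = 1;	/* valid C: *es is not const */
	es->val = 2;
	/* es = 0; would be rejected: es itself is const */
	return es->tok + es->val;
}
```

If expr.c uses a pattern like this, the error would point at a compiler issue rather than a problem in the C code, but that would need checking against the actual source.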

Other than those two files, everything else builds with DMD. We need to change the library invocation in the link stage from -lcurses to -L=-lcurses as that's the format DMD expects, but that's not too dissimilar to the C preprocessor and the -P= modifications.
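To summarize the flag differences seen so far (-of, -P=, and -L=), here is a small sketch of the rewriting expressed as C helpers. The function names are hypothetical, and it covers only the cases that came up in this build: -D defines, -l libraries, and the output file name. Everything else is passed through unchanged.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: rewrite one cc-style flag into DMD's spelling. */
static int
cc_flag_to_dmd(const char *flag, char *out, size_t outsz)
{
	if (strncmp(flag, "-D", 2) == 0)	/* preprocessor define */
		return snprintf(out, outsz, "-P=%s", flag);
	if (strncmp(flag, "-l", 2) == 0)	/* library for the linker */
		return snprintf(out, outsz, "-L=%s", flag);
	return snprintf(out, outsz, "%s", flag);	/* e.g. -g, -O, -c */
}

/* Hypothetical helper: DMD's output flag, no space before the name. */
static int
dmd_output_flag(const char *path, char *out, size_t outsz)
{
	return snprintf(out, outsz, "-of%s", path);
}
```

In practice I just edited the Makefile by hand, but the mapping is mechanical enough that a wrapper script could do the same thing.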

However, we get a lot of linker errors for multiple definitions of signal-related symbols. The solution is to add the _ANSI_LIBRARY definition to the C preprocessor. So we just add -P=-D_ANSI_LIBRARY to CFLAGS. That solved the problem.

And with that, we have a working oksh built (mostly) with DMD.

Refining the process

If we run size on the newly built oksh, we get surprisingly large numbers:

/home/brian/oksh $ size oksh
text    data    bss     dec     hex
1082444 315776  55020   1453240 162cb8

This is surprising because DMD is an optimizing compiler, so we should expect numbers much better than this. It turns out a lot of this is wasted space: DMD is linking the DRuntime and Phobos libraries into the executable, but we're not using the D standard library. The solution is to add the -betterC flag to CFLAGS. This flag prevents the Phobos standard library from being linked in. It also means you cannot use any features from Phobos, but since we're only compiling C code, there's nothing from Phobos that we need.

And indeed, that does the trick:

/home/brian/oksh $ size oksh
text    data    bss     dec     hex
799716  67872   55020   922608  e13f0

But that still seems a little large. We can add the -release and -inline flags to CFLAGS to tell DMD not to emit contracts and asserts, and to try to inline functions as it can. This gets us to a size comparable to clang and gcc:

/home/brian/oksh $ size oksh
text    data    bss     dec     hex
284573  8488    50480   343541  53df5

Neat. The resulting binary works fine.

The only interesting bit left to report is that DMD links in many libraries that the other C compilers do not, such as libc++ and libpthread. The solution here is to add -L=--as-needed to the linking step to tell lld not to link in shared libraries that are not used.

Building some other software

DMD was able to build oed without issue and was able to build mg with the exception of interpreter.c:

dmd -g -O -release -inline -betterC -P=-DREGEX -P=-D_ANSI_LIBRARY -P=-D__DMD__ -ofinterpreter.o -c interpreter.c
interpreter.c(108): Error: cannot implicitly convert expression `"define"` of type `char[7]` to `char`
interpreter.c(109): Error: cannot implicitly convert expression `"list"` of type `char[5]` to `char`
interpreter.c(110): Error: cannot implicitly convert expression `"if"` of type `char[3]` to `char`
interpreter.c(111): Error: cannot implicitly convert expression `"lambda"` of type `char[7]` to `char`
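The shape of these messages (a string literal of type char[N] being converted to char) suggests an array of char arrays initialized from string literals, though that is an assumption since the code from interpreter.c is not shown here. C allows a string literal to initialize a char array, including each row of a two-dimensional one. A minimal sketch, with made-up names and sizes:

```c
#include <string.h>

/* Each row is a char[8]; a string literal may initialize a char array. */
static char keywords[4][8] = { "define", "list", "if", "lambda" };

/* Length of the i-th keyword, not counting the terminating NUL. */
static size_t
keyword_len(int i)
{
	return strlen(keywords[i]);
}
```

Unused bytes in each row are zero-filled, so every row remains a valid NUL-terminated string.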

Conclusion

ImportC has come a long way since it was imported into DMD a little over a year ago. It helps DMD be fully self-contained, and can help in certain situations that combine C and D code. While it can also be used to compile C-only codebases, that really is a side effect, I think. With that said, definitely feel free to try ImportC on your own C codebases. It might just be the soft introduction to D you've been looking for.

While ImportC would be difficult to use in a configure script due to incompatible flags, you can, as we did with oksh, use it after configuring with another C compiler and altering command-line flags as necessary.

This kind of experimentation has made being a part of the D community a lot of fun and I'm looking forward to seeing what D has in store in the future.
