heptadecagram.net

Rules of Writing C Code

While C may not be a very ergonomic programming language in many ways, it does fill a key niche. It is the lingua franca of programming: nearly every architecture that sports something fancier than assembly has a C compiler. As such, C is still an important language.

While important, C is a language notorious for potential exploits and pitfalls. Few languages have shipped an unsafe-by-design function in the standard library (gets, which has no way to know the size of the buffer it writes into) and then taken nearly 40 years to remove it! When writing C code, keeping a few rules in mind can make the experience much less bug-prone.

Rule 3: WAVE (Warnings: All; Valgrind: Every time)

C is a very loose language in its specification. A number of practices or patterns are now known to be risky or likely sources of bugs. Well, those code snippets might be risky, but compilers can't warn about them by default, because that would break legacy code! So compilers instead put those warnings behind flags that a user must explicitly turn on. At bare minimum, -Wall -Wextra -Wpedantic should be set. My personal set is usually -Wall -Wextra -Wpedantic -Wvla -Wfloat-equal. Depending on the program, -Waggregate-return -Wconversion can also be a good choice. Turn on as many warnings as practical, and then fix all the warnings.
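
As a rough illustration of what those flags catch, the sketch below shows two patterns in their warning-free form, with the flagged form described in the comments. The function names nearly_equal and count_spaces, the file name example.c, and the cc driver name are made up for this example; the flags are the ones listed above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Build with, for example:
 *   cc -Wall -Wextra -Wpedantic -Wvla -Wfloat-equal -Wconversion -c example.c
 */

/* -Wfloat-equal warns about `a == b` on floating-point operands,
 * since rounding makes exact equality unreliable. Comparing within
 * a caller-chosen tolerance states the intent explicitly. */
bool nearly_equal(double a, double b, double tolerance)
{
    double difference = a - b;

    if (difference < 0.0) {
        difference = -difference;
    }
    return difference <= tolerance;
}

/* -Wconversion warns about `int length = strlen(text);` because a
 * size_t value may not fit in an int. Keeping the count in size_t
 * avoids the narrowing altogether. */
size_t count_spaces(const char *text)
{
    size_t spaces = 0;
    size_t length;

    if (text == NULL) {
        return 0;
    }
    length = strlen(text);
    for (size_t i = 0; i < length; i++) {
        if (text[i] == ' ') {
            spaces++;
        }
    }
    return spaces;
}
```

Neither original form is a compile error, which is exactly why the warnings have to be requested.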

Memory leaks and improper memory access are highly likely in C. Thankfully, many dynamic analysis tools exist that can detect and report on these errors. Valgrind is one such tool. Running a program under Valgrind will detect memory leaks, as well as a number of heap usage errors. Just like with warning flags, try to fix every error or warning that Valgrind reports. Sometimes, the problem will be with a third-party library. In that case, at least the programmer is aware of the problem and might even be able to push a fix upstream or work around the issue. Valgrind should be used every time to detect and correct the errors it reports.
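
To make that concrete, here is a minimal sketch of a helper whose two classic mistakes are the kind of thing Valgrind tends to report. The name copy_string is hypothetical, and the comments describe the mistakes rather than committing them.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated copy of `text`, or NULL on failure.
 * The caller owns the result and must free() it.
 *
 * Two easy mistakes around a function like this:
 *   - allocating `length` bytes instead of `length + 1`, so the
 *     terminator lands one byte past the block (an invalid write);
 *   - a caller that never frees the result (a leak).
 * Both are typically reported when the program is run as:
 *   valgrind --leak-check=full ./program
 */
char *copy_string(const char *text)
{
    size_t length;
    char *copy;

    if (text == NULL) {
        return NULL;
    }

    length = strlen(text);
    copy = malloc(length + 1);      /* +1 for the terminating '\0' */
    if (copy == NULL) {
        return NULL;
    }

    memcpy(copy, text, length + 1); /* copies the terminator too */
    return copy;
}
```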

Rule 2: Kindergarten (If you ask for it, you must put it back)

Object-oriented languages tend to support destructor or tear-down functions that run automatically when a resource is no longer used. C does not have such a feature. As such, every resource requested must be returned to the system. Open a FILE *? Ensure that every code path from that point includes an fclose call. Create a network socket? Make sure it gets passed to close at some point. Allocated memory? Use Valgrind to confirm that it gets freed (see Rule 3).

As a matter of fact, Valgrind can also report file descriptors that are still open when the program exits (run it with --track-fds=yes). Leaked objects usually show up as heap leaks; file descriptor tracking will catch pretty much anything else.
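
One common way to make sure everything gets put back is a single cleanup label that every exit path goes through. The sketch below is one possible shape, not the only one; sum_file_bytes and CHUNK_SIZE are names invented for the example.

```c
#include <stdio.h>
#include <stdlib.h>

enum { CHUNK_SIZE = 4096 };

/* Adds up the bytes of the file at `path` and stores the sum in *total.
 * Returns 0 on success, -1 on any failure. Whatever was acquired
 * (the FILE * and the heap buffer) is released on every path. */
int sum_file_bytes(const char *path, unsigned long *total)
{
    int status = -1;
    FILE *input = NULL;
    unsigned char *chunk = NULL;
    unsigned long sum = 0;
    size_t got;

    if (path == NULL || total == NULL) {
        return -1;                  /* nothing acquired yet */
    }

    input = fopen(path, "rb");
    if (input == NULL) {
        goto cleanup;
    }

    chunk = malloc(CHUNK_SIZE);
    if (chunk == NULL) {
        goto cleanup;
    }

    while ((got = fread(chunk, 1, CHUNK_SIZE, input)) > 0) {
        for (size_t i = 0; i < got; i++) {
            sum += chunk[i];
        }
    }
    if (ferror(input)) {
        goto cleanup;               /* short read caused by an error */
    }

    *total = sum;
    status = 0;

cleanup:
    free(chunk);                    /* free(NULL) is a no-op */
    if (input != NULL && fclose(input) != 0) {
        status = -1;
    }
    return status;
}
```

Because both pointers start out as NULL, the cleanup code is safe to run no matter how far the function got before failing.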

Rule 1: ABC (Always Be Checking)

The number-one rule of C is remembering the ABCs of checking return values from functions: Always Be Checking! Check those return values! C does not have exceptions, so all code flow is "local". As long as the return value from a given function call is checked for any erroneous result, the code flow is predictable. The program's not going to run off into some exception handler in the middle of a for-loop.
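
Number parsing is a small place where this matters, because strtol reports failure in three different ways. The following sketch checks all of them; parse_int is a hypothetical helper name, and the 0 / -1 return convention is just one possible choice.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parses a base-10 integer from `text` into *result.
 * Returns 0 on success, -1 if the text is not a valid int. */
int parse_int(const char *text, int *result)
{
    char *end = NULL;
    long value;

    if (text == NULL || result == NULL) {
        return -1;
    }

    errno = 0;                      /* strtol only sets errno on failure */
    value = strtol(text, &end, 10);

    if (end == text) {
        return -1;                  /* no digits were consumed */
    }
    if (*end != '\0') {
        return -1;                  /* trailing characters after the number */
    }
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
        return -1;                  /* out of range for long or for int */
    }

    *result = (int)value;
    return 0;
}
```

A caller then checks parse_int the same way, so the error travels up one return value at a time instead of being silently dropped.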

This rule is often extended to function parameters as well. Always Be Checking those parameters, especially if they are pointers! The standard library may be able to get away with segfaulting on NULL parameters, but new libraries shouldn't. Return an error value instead.
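
As a sketch of what that can look like in a small library, here is a fixed-size stack whose functions reject bad parameters with an error value. Every name in it (int_stack, stack_status, stack_push, stack_pop, STACK_CAPACITY) is invented for the illustration.

```c
#include <stddef.h>

enum { STACK_CAPACITY = 8 };

enum stack_status {
    STACK_OK = 0,
    STACK_BAD_ARGUMENT,
    STACK_FULL,
    STACK_EMPTY
};

struct int_stack {
    int items[STACK_CAPACITY];
    size_t count;
};

/* Pushes `value`. A NULL stack is reported, not dereferenced. */
enum stack_status stack_push(struct int_stack *stack, int value)
{
    if (stack == NULL) {
        return STACK_BAD_ARGUMENT;
    }
    if (stack->count >= STACK_CAPACITY) {
        return STACK_FULL;
    }
    stack->items[stack->count++] = value;
    return STACK_OK;
}

/* Pops the top item into *value. Both pointers are checked, and a
 * count beyond the capacity is treated as a corrupted argument. */
enum stack_status stack_pop(struct int_stack *stack, int *value)
{
    if (stack == NULL || value == NULL) {
        return STACK_BAD_ARGUMENT;
    }
    if (stack->count > STACK_CAPACITY) {
        return STACK_BAD_ARGUMENT;
    }
    if (stack->count == 0) {
        return STACK_EMPTY;
    }
    *value = stack->items[--stack->count];
    return STACK_OK;
}
```

Returning a distinct status for each failure lets the caller apply Rule 1 and tell a programming mistake apart from an ordinary full or empty stack.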

Rule 0: TN1 (Trust No One)

The final rule, rule zero. Trust No One. Don't trust those assert calls to always be there. Don't trust the user to type fewer characters than your buffer size. Be properly mistrustful of your code, your libraries, and your tools. You are never such a good programmer that you can't be the source of the bug you're investigating.
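
For the buffer-size point, a minimal sketch of a line reader that assumes the user will type too much might look like this. The name read_line and its 0 / 1 / -1 return convention are assumptions made for the example. It uses explicit checks rather than assert, since assert calls compile to nothing when NDEBUG is defined.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Reads one line from `input` into `buffer` (capacity `size`),
 * dropping the trailing newline.
 * Returns 0 if the whole line fit, 1 if it was too long (the rest
 * of the line is discarded), and -1 on end of input, a read error,
 * or bad arguments. */
int read_line(FILE *input, char *buffer, size_t size)
{
    size_t length;
    int truncated = 0;
    int c;

    if (input == NULL || buffer == NULL || size < 2) {
        return -1;
    }
    if (size > INT_MAX) {
        size = INT_MAX;             /* fgets takes an int count */
    }

    if (fgets(buffer, (int)size, input) == NULL) {
        return -1;
    }

    length = strlen(buffer);
    if (length > 0 && buffer[length - 1] == '\n') {
        buffer[length - 1] = '\0';
        return 0;
    }

    /* No newline was stored: either the line was longer than the
     * buffer, or the input ended without one. Consume what is left
     * of the line so the next call starts on a fresh one. */
    while ((c = fgetc(input)) != '\n' && c != EOF) {
        truncated = 1;
    }
    return truncated;
}
```

The caller still has to decide what a too-long line means for the program; the function only guarantees that it never writes past the buffer and never hides the truncation.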

Least of all should you trust arbitrary rules on how to write code.