When you read discussions between proponents and opponents of automatic memory management, you might get the impression that it is a single, unified technology, implemented the same way in every programming language. In this article, we'll talk about garbage collection, the most common automatic memory management mechanism.
Debates on this topic often sound like "pure C versus other languages." In fact, automatic memory management in C is entirely possible and is used in practice, while the "other languages" and their various implementations differ greatly from one another.
Memory management errors
To begin with, let's remember what automatic memory management saves us from.
The first and best known, though not the most dangerous, is the memory leak. A leak occurs when you request memory from the OS kernel and forget to return it: in C terms, you call malloc() and forget to call free(). A program with this problem takes up more and more memory until the user or the OS itself stops it. Meanwhile, the program's behavior remains correct, and a leak by itself does not usually cause security problems.
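To make the leak concrete, here is a minimal sketch in C. The function names make_array, sum_leaky and sum_fixed are hypothetical and chosen only for illustration. The two summing functions differ in a single line, the call to free().

```c
#include <stdlib.h>

/* Hypothetical helper: allocates an array of n ints and fills it with
   0, 1, ..., n-1. The caller owns the block and must free() it. */
static int *make_array(size_t n)
{
    int *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        a[i] = (int)i;
    return a;
}

/* Leaky version: the only pointer to the block disappears when the
   function returns, so the memory can never be given back. */
long sum_leaky(size_t n)
{
    int *a = make_array(n);
    if (a == NULL)
        return -1;
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;               /* BUG (intentional): missing free(a) */
}

/* Fixed version: same computation, but the block is released. */
long sum_fixed(size_t n)
{
    int *a = make_array(n);
    if (a == NULL)
        return -1;
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    free(a);
    return sum;
}
```

Both functions return the same result, which is exactly why leaks are easy to miss: calling sum_leaky() in a long-running loop changes nothing except the memory footprint of the process.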
The second problem is dangling pointers: the program still holds a pointer to a memory location that has already been freed. Accessing such memory again has a name of its own, use after free. These errors are much more dangerous, and the consequences vary widely, from hard-to-debug glitches to arbitrary code execution, as the CVE database will attest.
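A rarer variant of the dangling pointer problem is the double free: releasing the same block twice, which can corrupt the allocator's bookkeeping or destroy data that has since been placed in that block.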
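The sketch below shows where a dangling pointer comes from and one common manual defence. The helper free_and_clear and the function alias_demo are hypothetical names used only for this illustration. The lines that would trigger undefined behavior are kept in a comment, so the code itself is safe to run.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: frees *pp and resets it to NULL, so the caller's
   pointer cannot dangle and a repeated call is harmless. */
void free_and_clear(char **pp)
{
    free(*pp);        /* free(NULL) is defined to do nothing */
    *pp = NULL;
}

/* Returns 1 if both pointers end up as NULL, -1 on allocation failure. */
int alias_demo(void)
{
    char *name = malloc(16);
    if (name == NULL)
        return -1;
    strcpy(name, "garbage");

    char *alias = name;       /* second pointer to the same block */

    free_and_clear(&name);    /* name is now NULL */

    /* alias still holds the old address: it is dangling.
       Either of these would be undefined behavior:
           strlen(alias);     use after free
           free(alias);       double free                      */
    alias = NULL;             /* every copy must be cleared by hand */

    free_and_clear(&name);    /* second call is safe: frees NULL */
    return name == NULL && alias == NULL;
}
```

The helper protects only the pointer it is given. Any other copy of the address, such as alias here, remains the programmer's responsibility, which is the gap automatic memory management is meant to close.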
Thus, an automatic memory management solution must have two properties: it must never remove from memory an object to which live pointers still exist, and, where possible, it should not keep in memory objects to which no live pointers remain.
What does the garbage collector do?
Put simply, at startup a program has a contiguous range of addresses where it can place its data. A program with automatic memory management requests a memory area for the "heap" from the OS as soon as it starts. The initial heap size can often (but not always) be adjusted at compile time or at run time, and the heap can grow as the program runs.
After that, the garbage collector periodically checks which parts of memory still hold needed data and which can be freed and reused for new data. Exactly how it does this depends on the implementation, but more on that later. First, let's dispel some simpler myths.
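As a rough illustration of "checking which parts of memory are still needed", here is a toy mark-and-sweep sketch over a fixed pool of objects. It is only one possible approach, heavily simplified, and is not meant to describe how Boehm GC or any real collector is implemented. All names (Obj, toy_alloc, toy_collect) and the explicit list of roots passed in by the caller are assumptions made for this example.

```c
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 8
#define MAX_REFS  2

/* Toy object: may point to up to MAX_REFS other objects. */
typedef struct Obj {
    bool in_use;                  /* slot currently holds an object */
    bool marked;                  /* reached during the mark phase  */
    struct Obj *refs[MAX_REFS];
} Obj;

static Obj pool[POOL_SIZE];       /* a fixed pool stands in for the heap */

/* Take a free slot from the pool, or return NULL if the pool is full. */
Obj *toy_alloc(void)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i] = (Obj){ .in_use = true };
            return &pool[i];
        }
    }
    return NULL;
}

/* Mark phase: flag every object reachable from o. The marked flag also
   stops the recursion on cyclic references. */
static void mark(Obj *o)
{
    if (o == NULL || o->marked)
        return;
    o->marked = true;
    for (size_t i = 0; i < MAX_REFS; i++)
        mark(o->refs[i]);
}

/* Mark from the given roots, then sweep: release every slot that was
   not reached and reset the marks. Returns the number of slots freed. */
size_t toy_collect(Obj **roots, size_t nroots)
{
    for (size_t i = 0; i < nroots; i++)
        mark(roots[i]);

    size_t freed = 0;
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (pool[i].in_use && !pool[i].marked) {
            pool[i].in_use = false;
            freed++;
        }
        pool[i].marked = false;
    }
    return freed;
}
```

An object reachable from a root is never released, and an object that nothing points to is reclaimed on the next collection, which corresponds to the two properties an automatic solution is expected to have.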
Is the garbage collector part of the language?
You will often hear statements like "Ruby is a garbage-collected language" or "C is a language with manual memory management." The first statement is true in the sense that no Ruby implementation provides manual memory management.
The second statement is more complicated. Garbage collection is not part of the C language specification, but the specification does not prohibit it either. The Ada language specification likewise does not impose any particular memory management model on compiler authors, and some compilers do provide an optional garbage collector.
I am not aware of any such C compilers, but in practice it is entirely possible to manage memory automatically in C programs using third-party libraries.
Take Boehm GC as an example. It is a very mature and full-featured product that has been or still is used by many projects, both applications (for example, the vector graphics editor Inkscape) and programming language implementations.
Using Boehm GC
Many Linux distributions provide a package with Boehm GC in their repositories, most often under the name libgc. On Fedora, you can install it with the command sudo …, and on Debian with …
For demonstration, we will write a program that continuously requests memory for an array of thousands of integers but never frees it. If we used the classic malloc(), it would be a textbook memory leak. Instead of going to the system allocator directly, we will ask the Boehm GC memory manager through the function GC_MALLOC() and see what happens.