Native POSIX Thread Library

The Native POSIX Thread Library (NPTL) is a modern implementation of a threading library for Linux. It is used in conjunction with the GNU C Library (glibc) and allows Linux programs to use POSIX threads (pthreads).


Since kernel version 2.0, Linux had the LinuxThreads threading library, whose basic design was laid down in 1996 under the restrictions that the Linux kernel and libc5 imposed at the time. Linux had no real support for threads in the kernel, but it did provide the clone() system call, which creates a copy of the calling process that shares the same address space. LinuxThreads used this system call as best it could to simulate thread support in user space. The library was improved continuously, but it was conceptually outdated and could no longer be repaired. The remedy of choice was therefore a reimplementation of the threading library.
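
As a rough illustration of the mechanism described above, the following C sketch uses the glibc clone() wrapper with the CLONE_VM flag so that the child shares the parent's address space. The names run_clone_demo, child_fn and shared_value are hypothetical and chosen only for this example; it is not how LinuxThreads itself was written.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical variable, used only to show that memory is shared. */
static int shared_value = 0;

/* Runs in the child; writes through a pointer into the shared address space. */
static int child_fn(void *arg) {
    *(int *)arg = 42;
    return 0;
}

/* Returns the value the parent sees after the child has exited, or -1 on error. */
int run_clone_demo(void) {
    const size_t stack_size = 1024 * 1024;
    char *stack = malloc(stack_size);
    if (stack == NULL) {
        return -1;
    }

    /* The stack grows downwards on common architectures, so pass its top.
       CLONE_VM shares the address space; SIGCHLD lets waitpid() reap the child. */
    pid_t pid = clone(child_fn, stack + stack_size, CLONE_VM | SIGCHLD,
                      &shared_value);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    if (waitpid(pid, NULL, 0) == -1) {
        free(stack);
        return -1;
    }

    free(stack);
    return shared_value;
}
```

Because the child writes into memory that the parent also sees, run_clone_demo() should return 42 on a Linux system where clone() is permitted. Without CLONE_VM the child would receive its own copy of the memory and the parent would still read 0.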

The following problems with the existing LinuxThreads implementation were identified:

  • It needs a manager thread in the process, which creates further threads, cleans them up again, and channels signal handling.
  • It is not POSIX compliant: for example, it is not possible to send a signal to the whole process, and the kernel must ensure that signals such as SIGSTOP and SIGCONT are passed to all threads in the process.
  • Under load, performance is poor because the signal system is misused for synchronization. This also harms stability.
  • Each thread incorrectly has its own process ID.
  • On the important IA-32 architecture, a maximum of only 8192 threads was possible.
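
The process ID problem in the list above can be checked with a small C sketch. POSIX expects all threads of a process to report the same process ID. The function name thread_shares_pid is hypothetical; on a system using NPTL it is expected to return 1, whereas under LinuxThreads each thread reported a different value.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* Thread body: stores the PID as seen from inside the new thread. */
static void *record_pid(void *arg) {
    *(pid_t *)arg = getpid();
    return NULL;
}

/* Returns 1 if a new thread sees the same PID as its creator,
   0 if the PIDs differ, and -1 on error. */
int thread_shares_pid(void) {
    pthread_t thread;
    pid_t seen_in_thread = 0;

    if (pthread_create(&thread, NULL, record_pid, &seen_in_thread) != 0) {
        return -1;
    }
    if (pthread_join(thread, NULL) != 0) {
        return -1;
    }
    return seen_in_thread == getpid() ? 1 : 0;
}
```

On older glibc versions the program may need to be linked with -pthread.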

To solve the existing problems, additional infrastructure in the kernel and a rewritten threading library were needed. Two competing projects were launched: Next Generation POSIX Threads (NGPT), led by IBM, and NPTL, led by the Red Hat kernel and glibc programmers Ingo Molnar and Ulrich Drepper. Because it became clear that NPTL would prevail in practice, the NGPT project was discontinued in mid-2003.

The NPTL team set itself the following objectives for its new library:

  • POSIX conformance, to make user-space code more portable again
  • Effective use of SMP and good scalability, to increase performance on multiprocessor systems; building on this, NUMA support

Under these conditions, work on the new Native POSIX Thread Library began in mid-2002. In August and September 2002, the Linux 2.5 kernel was prepared for NPTL. This required introducing some new system calls and optimizing existing ones. In the first benchmarks, 100,000 concurrent threads could be created on an IA-32 system within 2 seconds; without NPTL, creating the threads alone took almost 15 minutes. Despite this considerable load, the system remained usable at almost normal speed during these tests.

Red Hat Linux 9 was the first Linux distribution to use NPTL, in a patched 2.4 kernel (which sometimes made its users involuntary beta testers). Today, virtually all modern distributions use NPTL when running kernel version 2.6 or higher.


NPTL works similarly to LinuxThreads. The kernel still manages processes rather than threads, and new threads are created with a clone() call issued by NPTL. However, NPTL requires special kernel support and implements synchronization mechanisms in which threads are put to sleep and woken up again. Futexes are used for this purpose.
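
The sleep-and-wake pattern that futexes provide can be sketched in C as follows. This is a minimal illustration that calls the futex system call directly through syscall(); it is not NPTL's actual code, and the names futex_handoff, waiter, ready and the small futex() helper are hypothetical.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* The futex word: 0 means "not ready", 1 means "ready". */
static _Atomic uint32_t ready = 0;

/* Thin helper around the raw system call (glibc offers no futex() wrapper). */
static long futex(_Atomic uint32_t *addr, int op, uint32_t val) {
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

/* Sleeps in the kernel until the futex word becomes non-zero. */
static void *waiter(void *arg) {
    (void)arg;
    while (atomic_load(&ready) == 0) {
        /* The kernel blocks the caller only if the word still equals 0,
           so a wake-up that arrives before this call is not lost. */
        futex(&ready, FUTEX_WAIT, 0);
    }
    return (void *)(uintptr_t)atomic_load(&ready);
}

/* Returns the value the waiter observed after being woken, or -1 on error. */
int futex_handoff(void) {
    pthread_t thread;
    void *result = NULL;

    atomic_store(&ready, 0);
    if (pthread_create(&thread, NULL, waiter, NULL) != 0) {
        return -1;
    }

    atomic_store(&ready, 1);       /* publish the new state in user space */
    futex(&ready, FUTEX_WAKE, 1);  /* wake at most one sleeping thread */

    if (pthread_join(thread, &result) != 0) {
        return -1;
    }
    return (int)(uintptr_t)result;
}
```

The state check happens in user space, and the kernel is entered only when a thread actually has to sleep or be woken. This is the property that makes futexes attractive as a building block for mutexes and condition variables.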

NPTL is a so-called 1:1 threading library. Threads created by the user with the pthread_create() function have a one-to-one relationship with processes in the kernel's scheduler queues. This is the simplest conceivable threading implementation. The alternative would be m:n, in which there are typically more threads in user space than processes in the kernel. The threading library would then be responsible for distributing processor time among the individual threads in the process. This concept would allow very fast context switches, since the number of necessary system calls is minimized; on the other hand, complexity would increase, and priority inversion could easily occur.
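
The 1:1 relationship can be observed from user space. In the following C sketch, each thread created with pthread_create() asks the kernel for its own thread ID through the gettid system call. The names count_distinct_kernel_tids, record_tid and NUM_THREADS are hypothetical, and the barrier serves only to keep all threads alive at the same time so that no ID can be reused.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#define NUM_THREADS 4

static pthread_barrier_t barrier;
static pid_t tids[NUM_THREADS];

/* Thread body: records the kernel-level thread ID of the calling thread. */
static void *record_tid(void *arg) {
    *(pid_t *)arg = (pid_t)syscall(SYS_gettid);
    /* Wait until every thread has recorded its ID, so all are alive at once. */
    pthread_barrier_wait(&barrier);
    return NULL;
}

/* Returns how many of the created threads have a kernel thread ID that
   differs from the main thread's and from all earlier ones, or -1 on error. */
int count_distinct_kernel_tids(void) {
    pthread_t threads[NUM_THREADS];
    pid_t main_tid = (pid_t)syscall(SYS_gettid);
    int distinct = 0;

    if (pthread_barrier_init(&barrier, NULL, NUM_THREADS) != 0) {
        return -1;
    }
    for (int i = 0; i < NUM_THREADS; i++) {
        if (pthread_create(&threads[i], NULL, record_tid, &tids[i]) != 0) {
            return -1;
        }
    }
    for (int i = 0; i < NUM_THREADS; i++) {
        if (pthread_join(threads[i], NULL) != 0) {
            return -1;
        }
    }
    pthread_barrier_destroy(&barrier);

    for (int i = 0; i < NUM_THREADS; i++) {
        int is_new = (tids[i] != main_tid);
        for (int j = 0; j < i && is_new; j++) {
            if (tids[j] == tids[i]) {
                is_new = 0;
            }
        }
        distinct += is_new;
    }
    return distinct;
}
```

With a 1:1 library, every user-level thread corresponds to its own schedulable entity in the kernel, so the function is expected to return NUM_THREADS. In an m:n design, several user-level threads could report the same kernel-level ID.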