Version:  2.6.24 2.6.25 2.6.26 2.6.27 2.6.28 2.6.29 2.6.30 2.6.31 2.6.32 2.6.33-rc7

Architecture:  x86 arm avr32 blackfin m68k m68knommu microblaze mips powerpc sh

Linux/Documentation/mutex-design.txt

  1 Generic Mutex Subsystem
  2 
  3 started by Ingo Molnar <mingo@redhat.com>
  4 
  5   "Why on earth do we need a new mutex subsystem, and what's wrong
  6    with semaphores?"
  7 
  8 firstly, there's nothing wrong with semaphores. But if the simpler
  9 mutex semantics are sufficient for your code, then there are a couple
 10 of advantages of mutexes:
 11 
 12  - 'struct mutex' is smaller on most architectures: .e.g on x86,
 13    'struct semaphore' is 20 bytes, 'struct mutex' is 16 bytes.
 14    A smaller structure size means less RAM footprint, and better
 15    CPU-cache utilization.
 16 
 17  - tighter code. On x86 i get the following .text sizes when
 18    switching all mutex-alike semaphores in the kernel to the mutex
 19    subsystem:
 20 
 21         text    data     bss     dec     hex filename
 22      3280380  868188  396860 4545428  455b94 vmlinux-semaphore
 23      3255329  865296  396732 4517357  44eded vmlinux-mutex
 24 
 25    that's 25051 bytes of code saved, or a 0.76% win - off the hottest
 26    codepaths of the kernel. (The .data savings are 2892 bytes, or 0.33%)
 27    Smaller code means better icache footprint, which is one of the
 28    major optimization goals in the Linux kernel currently.
 29 
 30  - the mutex subsystem is slightly faster and has better scalability for
 31    contended workloads. On an 8-way x86 system, running a mutex-based
 32    kernel and testing creat+unlink+close (of separate, per-task files)
 33    in /tmp with 16 parallel tasks, the average number of ops/sec is:
 34 
 35     Semaphores:                        Mutexes:
 36 
 37     $ ./test-mutex V 16 10             $ ./test-mutex V 16 10
 38     8 CPUs, running 16 tasks.          8 CPUs, running 16 tasks.
 39     checking VFS performance.          checking VFS performance.
 40     avg loops/sec:      34713          avg loops/sec:      84153
 41     CPU utilization:    63%            CPU utilization:    22%
 42 
 43    i.e. in this workload, the mutex based kernel was 2.4 times faster
 44    than the semaphore based kernel, _and_ it also had 2.8 times less CPU
 45    utilization. (In terms of 'ops per CPU cycle', the semaphore kernel
 46    performed 551 ops/sec per 1% of CPU time used, while the mutex kernel
 47    performed 3825 ops/sec per 1% of CPU time used - it was 6.9 times
 48    more efficient.)
 49 
 50    the scalability difference is visible even on a 2-way P4 HT box:
 51 
 52     Semaphores:                        Mutexes:
 53 
 54     $ ./test-mutex V 16 10             $ ./test-mutex V 16 10
 55     4 CPUs, running 16 tasks.          8 CPUs, running 16 tasks.
 56     checking VFS performance.          checking VFS performance.
 57     avg loops/sec:      127659         avg loops/sec:      181082
 58     CPU utilization:    100%           CPU utilization:    34%
 59 
 60    (the straight performance advantage of mutexes is 41%, the per-cycle
 61     efficiency of mutexes is 4.1 times better.)
 62 
 63  - there are no fastpath tradeoffs, the mutex fastpath is just as tight
 64    as the semaphore fastpath. On x86, the locking fastpath is 2
 65    instructions:
 66 
 67     c0377ccb <mutex_lock>:
 68     c0377ccb:       f0 ff 08                lock decl (%eax)
 69     c0377cce:       78 0e                   js     c0377cde <.text.lock.mutex>
 70     c0377cd0:       c3                      ret
 71 
 72    the unlocking fastpath is equally tight:
 73 
 74     c0377cd1 <mutex_unlock>:
 75     c0377cd1:       f0 ff 00                lock incl (%eax)
 76     c0377cd4:       7e 0f                   jle    c0377ce5 <.text.lock.mutex+0x7>
 77     c0377cd6:       c3                      ret
 78 
 79  - 'struct mutex' semantics are well-defined and are enforced if
 80    CONFIG_DEBUG_MUTEXES is turned on. Semaphores on the other hand have
 81    virtually no debugging code or instrumentation. The mutex subsystem
 82    checks and enforces the following rules:
 83 
 84    * - only one task can hold the mutex at a time
 85    * - only the owner can unlock the mutex
 86    * - multiple unlocks are not permitted
 87    * - recursive locking is not permitted
 88    * - a mutex object must be initialized via the API
 89    * - a mutex object must not be initialized via memset or copying
 90    * - task may not exit with mutex held
 91    * - memory areas where held locks reside must not be freed
 92    * - held mutexes must not be reinitialized
 93    * - mutexes may not be used in hardware or software interrupt
 94    *   contexts such as tasklets and timers
 95 
 96    furthermore, there are also convenience features in the debugging
 97    code:
 98 
 99    * - uses symbolic names of mutexes, whenever they are printed in debug output
100    * - point-of-acquire tracking, symbolic lookup of function names
101    * - list of all locks held in the system, printout of them
102    * - owner tracking
103    * - detects self-recursing locks and prints out all relevant info
104    * - detects multi-task circular deadlocks and prints out all affected
105    *   locks and tasks (and only those tasks)
106 
107 Disadvantages
108 -------------
109 
110 The stricter mutex API means you cannot use mutexes the same way you
111 can use semaphores: e.g. they cannot be used from an interrupt context,
112 nor can they be unlocked from a different context that which acquired
113 it. [ I'm not aware of any other (e.g. performance) disadvantages from
114 using mutexes at the moment, please let me know if you find any. ]
115 
116 Implementation of mutexes
117 -------------------------
118 
119 'struct mutex' is the new mutex type, defined in include/linux/mutex.h
120 and implemented in kernel/mutex.c. It is a counter-based mutex with a
121 spinlock and a wait-list. The counter has 3 states: 1 for "unlocked",
122 0 for "locked" and negative numbers (usually -1) for "locked, potential
123 waiters queued".
124 
125 the APIs of 'struct mutex' have been streamlined:
126 
127  DEFINE_MUTEX(name);
128 
129  mutex_init(mutex);
130 
131  void mutex_lock(struct mutex *lock);
132  int  mutex_lock_interruptible(struct mutex *lock);
133  int  mutex_trylock(struct mutex *lock);
134  void mutex_unlock(struct mutex *lock);
135  int  mutex_is_locked(struct mutex *lock);
136  void mutex_lock_nested(struct mutex *lock, unsigned int subclass);
137  int  mutex_lock_interruptible_nested(struct mutex *lock,
138                                       unsigned int subclass);

This page was automatically generated by LXR 0.3.1.  •  Linux is a registered trademark of Linus Torvalds