[COLUG] Apache 1.3.29 vs Apache 2.0.48
Josh Glover
colug at jmglov.net
Tue Mar 16 17:07:32 EST 2004
Quoth Rob Funk:
> tom hanlon wrote:
>
>> The negative is that all modules need to be
>> thread safe. As a non programmer I do not have a grasp on how big of a
>> headache this is.
>
> Huge headache.
In my experience, code that is hard to make thread-safe is not so well
designed. For those who are not familiar with the difference between a
multi-threaded program and a multi-process one, it boils down to this:
Threads share the same address space, and child processes do not. [1]
What this means is that multi-process apps must communicate via IPC (Inter
Process Communication) methods such as pipes, shared memory, the filesystem,
etc, whereas multi-threaded apps can communicate via data structures. However,
this ability to share data between threads means race conditions emerge. A
race condition is a point in your program where two (or more) separate threads
access the same data structure without coordination, at least one of them
writes to it, and the result depends on timing. Consider this simple
example: [2]
state = machines[i]->state;
state++;
machines[i]->state = state;
Suppose thread A and thread B both attempt to execute these lines of code. Say
machines[i]->state starts out as 3. One possible outcome is that thread A
fetches the state (state == 3), adds one to it (state == 4), and stores it
(state == 4). Then thread B does the same thing, so machines[i]->state ends up
as 5. But in multi-threaded programming, you must consider *all possible
interleavings* of non-atomic instructions. So another possible outcome is
this:
A: state = machines[i]->state; // 3
B: state = machines[i]->state; // 3
A: state++; // 4
A: machines[i]->state = state; // 4
B: state++; // 4
B: machines[i]->state = state; // 4
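Both threads read 3 and both stored 4, so machines[i]->state ends up as 4
instead of 5 and one of the increments is lost. This is a very simple example
of a race, but one that illustrates the basic concept.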
To prevent races, you must have some way of locking a variable (or data
structure) so that one thread can execute a series of instructions that must
be atomic. Luckily, modern operating systems give us synchronisation
constructs that allow us to write thread-safe code.
Writing thread-safe code is a huge pain when you have a lot of global data
structures. If you write modular code, you will find thread safety not so
difficult to implement. In fact, Java makes it downright simple: there is a
keyword ("synchronized", IIRC) that you add to a method and then the mutex is
done for you.
Multi-threaded code *is* faster, if done right, because it tends to keep the
data structures "cache-hot", and IPC has more overhead than direct
communication.
-Josh
[1] Of course, Linux (and probably other Unices) has a performance hack called
"copy on write", where child processes start out sharing the parent's
memory pages until either the parent or the child writes to one, at which
time that page is copied so each process has its own. However, this does
not change the way multi-process programs work.
[2] Which could also be written as machines[i]->state++, but that does nothing
to alleviate the race. Three lines of code just make the race easier to
see.
--
Josh Glover
GPG keyID 0xDE8A3103 (C3E4 FA9E 1E07 BBDB 6D8B 07AB 2BF1 67A1 DE8A 3103)
gpg --keyserver pgp.mit.edu --recv-keys DE8A3103