Multi-threaded programming * error-prone
* easy to make synch mistakes := data race
* hard to locate synch mistakes in debug
In few Word: Eraser is ?
* dynamically detects data races in lock-based multi-threaded programs
* uses binary rewriting techniques
* monitor every shared memory reference
* verify "consistent" locking behavior/discipline
Introduction
* multi-threaded programs are the common programming techniques
* Dynamic race detection based on "happens-before" relation
* checks conflicting mem accesses from different threads
* Based on locking discipline: programming policy, ensures absence of data races E.g. require every variable shared b/n threads to be protected by a mutual exclusion lock
Definitions
* Lock : synch object; used for mutex; either available or owned by a thread
= ops on lock L: lock(L); unlock(L)
= is essentially a binary semaphore ,but ONLY lock's owner can release that lock
* Data Race : when two (or more) concurrent threads access a shared var
= at least one access is a write AND
= the threads use no explicit mech to prevent the accesses from being simultaneous
= If a prog has a potential data race, then the impact of conflicting accesses to the shared var *will depend on the interleaving* (of the thread executions)
Related Work ? Monitors
* group of shared vars + procedures that are allowed to access those vars
* single "anonymous" lock (by which we mean implicit)
* whenever one of those procedures is entered, that lock is acquired & whenever one of those procedures is exited, the lock is released
* the shared vars in the monitor are invisible outside of the monitor
* provide a static, compile-time guarantee that shared vars are
* serialized and thus free from data races
* Don't protect against data races in progs with dynamically allocated shared vars
Happens-Before Relation
* The HB order is a partial order on all events of all threads in a concurrent execution.
* W/in any single thread: events ordered in order in which they occur.
* B/n threads, events ordered according to the properties of the synch objects they access;
= if thread T1 accesses synch object L1 and the next access to L1 is by T2 then the T1's access of L1 "happens before" T2's
= if two (or more) threads access a shared var and the accesses are NOT ordered by the happens-before relation, then in another exec of the prog, the two accesses could have happened simultaneously == a data race could have occurred
# If there were no happens-before relation b/n T1's unlock(L) and T2's lock(L) then T1 could have read v = 3; T2 could have read v = 3; then T1 would write v = 4; so would T2 (which is not what SHOULD have happened had those two variable assignments been serialized)
Drawbacks to tools based on HB
* inefficient; require per-thread info a/b concurrent accesses to all shared mem locs
* effectiveness of tools based on HB is highly dependent upon the interleaving produced by the scheduler
Interleaving produced by the Scheduler
* Thread 1
y++ //y1
A
* Thread 2
B
y++ //y2
* So possible orderings (y_1 comes before A; y_2 comes after B):
= y_1, A, B, y_2
= y_1, B, A, y_2
= y_1, B, y_2, A
= B, y_1, A, y_2
= B, y_1, y_2, A <=== potential data race
= B, y_2, y_1, A <=== potential data race
The Lockset algo
* enforces simple locking discipline: "every shared var must be protected by some lock?
= the lock is held by the thread accessing the var *when* that thread accesses the shared var
* Eraser monitors all reads and writes as the program executes; infers protection relation (b/n particular locks & specific vars) from execution history (since it does not know which lock protects which shared var)
* V => shared var
* C(V) => candidate locks for V
* as would expect, when new var V is initialized, its candidate set, C(V), includes all possible locks
* when V is accessed, it updates C(V) with the intersection of C(V) and the set of locks held by the current thread => "lockset refinement?
* if C(V) becomes empty, then this indicates that there is no lock that consistently protects V. (error)
The Lockset algo V.1
* Let locks held(t) be the set of locks held by thread t.
* For each v, initialize C(v) to the set of all locks.
* On each access to v by thread t,
= set C(v) := C(v) ? locks held(t);
= if C(v) := { }, then issue a warning.
Improving the Locking Discipline
* Three cases violate the discipline but don't have data races
= Initialization: shared vars are frequently initialized w/o holding a lock
= Read-shared data: some shared vars are written during init only and are read-only thereafter; is OK to have multiple readers simultaneously reading
= Read-write locks: read-write locks allow multiple readers to access a shared var but only ONE writer to do so
Accommodating Initialization
* by delaying refinement of a variable's candidate after that var has been initialized
* how to know when initialization is complete?
= Eraser says a shared var has been initialized at the instant that var is first accessed by a second thread (Virgin)
= As long as a var has been touched by a single thread only, all reads & writes of that var are considered part of initialization (Exclusive). C(V) is not refined here
Accommodating Multiple-Readers
* A read access from another state changes the state of the variable to shared
* C(V) is refined, but errors are not reported even if C(V) is null
* Accommodating Read-Write
* A write by another thread on a shared var, takes the var from Exclusive or shared state to shared-modified
* C(V) is refined and error is reported if C(V) is null
* Require that for every variable V, some lock M protects V
= M is held in write-mode for every write of V
= M is held in *some* mode (read or write) for every read of V
The Lockset algo V.2
* keep track of all locks held by thread T; locks_held(T)
* keep track of locks held in write mode by T; write_locks_held(T)
* Upon entering the Shared-Modified, C(V) = set of all locks
= If we are READING V, C(V) = C(V) intersection locks_held(T)
= If we are writing V, C(V) = C(V) intersection write_locks_held(T)
= if C(V) == {}, report error
Implementation
* Eraser takes an unmodified program binary as input and adds instrumentation to produce a new binary that is functionally equivalent but that includes calls to the Eraser runtime to implement the Lockset algo
* To maintain C(V), Eraser instruments each load & store in the program
* To maintain locks_held(T) for each thread T, Eraser instruments each call to acquire or release a lock as well as the stubs used to manage thread initi & finalization.
* To initialize C(V) for dynamically allocated data, Eraser instruments each call to the storage allocator
* Eraser treats each 32-bit word in the heap or global data as a possible shared var
* Eraser does NOT instrument loads & stores whose address mode is indirect off the stack pointer -- since these are presumed to be stack refs which are presumed to be local to the thread data, not global data (which will be in a global location or in the heap).
* Eraser maintains candidate sets for stack locations that are accessed via registers *besides* the stack pointer
Representing Candidate Lock Sets
* naive implem: store a list of candidate locks for each mem loc
* exploit: distinct sets of locks observed in practice is small
= represent each set of locks by an int (lockset index)
= lockset index indexes into a table whose entries represent the set of locks as sorted vectors of lock addresses
= each entry represents some set of locks stored as a vector where that vector is sorted by the addresses of the locks comprising the set
= use hashing to eliminate dupes in the table & to find the key for a given set of locks
= the entries in the table are never deallocated nor modified; lockset index valid for program's lifetime
= Eraser caches result of several set intersections so fast case for set intersection is just a table lookup
= each lock set is -- again -- stored as a sorted vector so the worst case, intersection can be performed by comparing two sorted vectors
* For every 32-bit word in the data segment or in the heap, there is a shadow word that contains
= the 30-bit lockset index for that word
= a 2-bit state condition (Virgin, Exclusive, Shared, Shared-Modified).
= In the Exclusive state the 30 bits are used to store the ID of the thread with exclusive access.
* All the std. mem alloc routines are instrumented to allocate and initialize a shadow word for each word allocated by the prog.
* When a thread accesses a mem loc, Eraser finds the shadow word for that mem loc by adding some fixed displacement to the memory location's
Annotations
* How to suppress false positives without eliminating true positives?
* If false alarms suppressed with accurate & specific "annotations,? then when a prog is modified and modified prog is tested, only "fresh and relevant" warnings will be produced
Kinds of False Alarms
* mem reuse: mem is reused w/o resetting shadow var
= many progs implement free lists or private allocators so now way for Eraser to know that some piece of memory has been "privately recycled" (and thus is protected by a *new* set of locks) must be accommodated
= EraserReuse(address,size)
= instructs eraser to reset the shadow mem corresponding to the mem range to the Virgin state
* Private locks:
= locks taken w/o communicating this seizure to Eraser
= caused by private implems of multiple-reader, single-writer locks (not part of pthreads i/f that Eraser implements)
= EraserReadLock(lock)
= EraserReadUnlock(lock)
= EraserWriteLock(lock)
= EraserWriteUnlock(lock)
= communicate private lock implementations
* Benign races:
= true data races; didn't affect program's correctness
= some intentional, others no
= EraserIgnoreOn()
= EraserIgnoreOff()
= surround code which the race detector should not worry about w.r.t. finding races
Additional Thoughts/Works
* Protection by multiple locks
* Deadlock
Protection by multiple locks
* Some programs protect a shared variable by multiple locks instead of single
* Working :
= any writer must hold all locks for a shared var
= any reader must hold at least one lock for a shared var
= aimed at avoiding dead lock
* modified Lockset in following manner:
= On each read of V by T,
= if C(V) == {}, issue warning
= On each write of V by T,
= set C(V) == C(V) intersect locks_held(T)
= if C(V) == {}, issue warning
* prevented false alarms, could also cause false negatives
= if T1 reads V while holding M1 and T2 writes V while holding M2
= violation of locking discipline will only be reported if the write preceeds the read
* requires complex DSs: sets of sets of locks
Deadlocks
* Discipline to avoid deadlock:
= choose partial ordering among all locks
= and program each thread so that whenver it holds more than one lock, it acquires them in increasing order (prevents Circularity requirement of deadlock).
* deadlock checking could be useful addition to Eraser