Note 1: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: #7KfXN=Mrp
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
A spinlock built on TAS:
Acquire: while(!TAS(lock));
Release: lock = 0;
A spinlock built on CAS:
Acquire: while(CAS(lock, 0, 1) != 0);
Release: CAS(lock, 1, 0); // result ignored
Both use
\(O(1)\) shared state regardless of \(n\), in contrast to Filter / Bakery which need \(\Theta(n)\).
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
A spinlock built on TAS:
Acquire: while(!TAS(lock));
Release: lock = 0;
A spinlock built on CAS:
Acquire: while(CAS(lock, 0, 1) != 0);
Release: CAS(lock, 1, 0); // result ignored
Both use
\(O(1)\) shared state regardless of \(n\), in contrast to Filter / Bakery which need \(\Theta(n)\).
The release on CAS could just write 0 directly (faster); using CAS(lock,1,0) just illustrates that CAS can serve both phases. The lock is correct because the entry is a single atomic RMW that wins for at most one thread.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
A spinlock built on TAS: <pre>Acquire: while(!TAS(lock));
Release: lock = 0;</pre> A spinlock built on CAS: <pre>Acquire: while(CAS(lock, 0, 1) != 0);
Release: CAS(lock, 1, 0); // result ignored</pre> Both use {{c1::\(O(1)\) shared state regardless of \(n\)}}, in contrast to Filter / Bakery which need \(\Theta(n)\). |
| Extra |
|
The release on CAS could just write 0 directly (faster); using <code>CAS(lock,1,0)</code> just illustrates that CAS can serve both phases. The lock is correct because the entry is a single atomic RMW that wins for at most one thread. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Note 2: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: #P|j?^In$#
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Dekker's algorithm fixes the previous mutex tries by combining the flag-based approach (try 2) with a turn variable used only for conflict resolution (try 3): when both flags are set, the process whose turn it is not temporarily lowers its flag and waits, then tries again.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Dekker's algorithm fixes the previous mutex tries by combining the flag-based approach (try 2) with a turn variable used only for conflict resolution (try 3): when both flags are set, the process whose turn it is not temporarily lowers its flag and waits, then tries again.
Properties: mutual exclusion, deadlock freedom, starvation freedom. But it is verbose: the code needs an inner conditional and a second wait loop. Peterson's algorithm achieves the same with a much shorter pre-protocol.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
{{c1::Dekker's algorithm}} fixes the previous mutex tries by combining the {{c2::flag-based approach (try 2)}} with a {{c2::<code>turn</code> variable used only for conflict resolution (try 3)}}: when both flags are set, the process whose turn it is <i>not</i> temporarily lowers its flag and waits, then tries again. |
| Extra |
|
Properties: mutual exclusion, deadlock freedom, starvation freedom. But it is verbose: the code needs an inner conditional and a second wait loop. Peterson's algorithm achieves the same with a much shorter pre-protocol. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 3: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: #vrtD+3u4N
modified
Before
Front
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Fehlerreduktion Las-Vegas
Sei \(A\) ein randomisierter Algorithmus, der nie eine falsche Antwort gibt, aber zuweilen '???' ausgibt, mit \(\Pr[A(I) \text{ korrekt}] \geq \varepsilon\).
Sei \(A_\delta\) der Algorithmus, der \(A\) solange aufruft, bis entweder ein Wert verschieden von '???' ausgegeben wird (und diesen Wert dann ebenfalls ausgibt) oder bis
\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\] mal '???' ausgegeben wurde (und dann '???' ausgibt). Dann gilt:
\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]
Back
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Fehlerreduktion Las-Vegas
Sei \(A\) ein randomisierter Algorithmus, der nie eine falsche Antwort gibt, aber zuweilen '???' ausgibt, mit \(\Pr[A(I) \text{ korrekt}] \geq \varepsilon\).
Sei \(A_\delta\) der Algorithmus, der \(A\) solange aufruft, bis entweder ein Wert verschieden von '???' ausgegeben wird (und diesen Wert dann ebenfalls ausgibt) oder bis
\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\] mal '???' ausgegeben wurde (und dann '???' ausgibt). Dann gilt:
\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]
Beweisskizze: Pr[\(A_\delta\) gibt '???' aus] \(\leq (1-\varepsilon)^N \leq e^{-\varepsilon N} \leq \delta\) für \(N \geq \varepsilon^{-1} \ln \delta^{-1}\), via \(1 - x \leq e^{-x}\).
After
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Modern compilers give no guarantee of a global ordering of memory accesses; individual accesses may even be optimised away entirely. Typical optimisations that cause reordering are dead code elimination, register hoisting, and locality optimisations.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Modern compilers give no guarantee of a global ordering of memory accesses; individual accesses may even be optimised away entirely. Typical optimisations that cause reordering are dead code elimination, register hoisting, and locality optimisations.
Background: such optimisations are essential for performance, but they create huge potential for bugs whenever assumptions about a global memory order are made.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
<b>Fehlerreduktion Las-Vegas</b> <br>Sei \(A\) ein randomisierter Algorithmus, der nie eine falsche Antwort gibt, aber zuweilen '???' ausgibt, mit \(\Pr[A(I) \text{ korrekt}] \geq \varepsilon\).<br><br>Sei \(A_\delta\) der Algorithmus, der \(A\) solange aufruft, bis entweder ein Wert verschieden von '???' ausgegeben wird (und diesen Wert dann ebenfalls ausgibt) oder bis <br>\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\] mal '???' ausgegeben wurde (und dann '???' ausgibt). Dann gilt:<br>\[\Pr[A_\delta(I) \text{ korrekt}] \geq {{c2::1 - \delta}}.\] |
Modern compilers give <b>no</b> guarantee of a global ordering of memory accesses; individual accesses may even be {{c1::optimised away entirely}}. Typical optimisations that cause reordering are {{c2::dead code elimination, register hoisting, and locality optimisations}}. |
| Extra |
Beweisskizze: Pr[\(A_\delta\) gibt '???' aus] \(\leq (1-\varepsilon)^N \leq e^{-\varepsilon N} \leq \delta\) für \(N \geq \varepsilon^{-1} \ln \delta^{-1}\), via \(1 - x \leq e^{-x}\). |
Background: such optimisations are essential for performance, but they create huge potential for bugs whenever assumptions about a global memory order are made. |
Tags:
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Note 4: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: )Wj64y)bWf
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Peterson's lock (for two processes \(P\) and \(Q\)) uses two shared variables:
flag[1..2] ("I am interested") and
victim ("who yields if both want in"). The pre-protocol for thread
id is
flag[id] = true;
victim = id;
while(flag[1-id] && victim == id);
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Peterson's lock (for two processes \(P\) and \(Q\)) uses two shared variables:
flag[1..2] ("I am interested") and
victim ("who yields if both want in"). The pre-protocol for thread
id is
flag[id] = true;
victim = id;
while(flag[1-id] && victim == id);
Intuition: a thread can enter as soon as either the other thread has dropped its flag (no contention) or the other thread has overwritten victim (so it became the new victim). Far simpler than Dekker, and provably ME + starvation-free.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
{{c1::Peterson's lock}} (for two processes \(P\) and \(Q\)) uses two shared variables: {{c2::<code>flag[1..2]</code> ("I am interested")}} and {{c3::<code>victim</code> ("who yields if both want in")}}. The pre-protocol for thread <code>id</code> is <pre>flag[id] = true;
victim = id;
while(flag[1-id] && victim == id);</pre> |
| Extra |
|
Intuition: a thread can enter as soon as either the other thread has dropped its flag (no contention) or the other thread has overwritten <code>victim</code> (so it became the new victim). Far simpler than Dekker, and provably ME + starvation-free. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 5: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: */_gG7JffH
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
Synchronizes-With (SW) pairs the specific actions that "see" each other: a volatile-write to x SW a subsequent read of x (subsequent in SO). Happens-Before (HB) is the transitive closure of PO and SW.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
Synchronizes-With (SW) pairs the specific actions that "see" each other: a volatile-write to x SW a subsequent read of x (subsequent in SO). Happens-Before (HB) is the transitive closure of PO and SW.
Unlike PO or SO, SW is not a generic ordering; it specifically pairs the actions between which a synchronisation effect takes place.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
<i>Synchronizes-With (SW)</i> pairs the specific actions that "see" each other: a <code>volatile</code>-write to <code>x</code> SW {{c1::a subsequent read of <code>x</code> (subsequent in SO)}}. <i>Happens-Before (HB)</i> is {{c2::the transitive closure of PO and SW}}. |
| Extra |
|
Unlike PO or SO, SW is not a generic ordering; it specifically pairs the actions between which a synchronisation effect takes place. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
Note 6: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: +YN}l=~9iw
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Mutual exclusion built only from atomic registers (Filter, Bakery, Peterson) is not used in practice for four reasons: space lower bound is linear in the maximum number of threads; correctness relies on no memory reordering, which requires expensive memory barriers in real hardware; these algorithms are not wait-free; the "primitive" assumption is unrealistic - hardware provides faster combined RMW instructions instead.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Mutual exclusion built only from atomic registers (Filter, Bakery, Peterson) is not used in practice for four reasons: space lower bound is linear in the maximum number of threads; correctness relies on no memory reordering, which requires expensive memory barriers in real hardware; these algorithms are not wait-free; the "primitive" assumption is unrealistic - hardware provides faster combined RMW instructions instead.
The way out: extend the model. Modern multiprocessor architectures provide special instructions for atomically reading and writing at once (TAS, CAS, LL/SC), enabling \(O(1)\) space mutexes with practical performance.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
Mutual exclusion built only from atomic registers (Filter, Bakery, Peterson) is not used in practice for four reasons: {{c1::space lower bound is linear in the maximum number of threads}}; {{c1::correctness relies on no memory reordering, which requires expensive memory barriers in real hardware}}; {{c1::these algorithms are not wait-free}}; {{c1::the "primitive" assumption is unrealistic - hardware provides faster combined RMW instructions instead}}. |
| Extra |
|
The way out: extend the model. Modern multiprocessor architectures provide special instructions for atomically reading and writing at once (TAS, CAS, LL/SC), enabling \(O(1)\) space mutexes with practical performance. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 7: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: -T60qL&2-|
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The three conditions for a correct critical-section solution according to Ben-Ari are mutual exclusion (statements from CSes of different processes must not interleave), freedom from deadlock (if some processes are trying to enter a CS, at least one must eventually succeed), and freedom from starvation (any process trying to enter its CS must eventually succeed).
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The three conditions for a correct critical-section solution according to Ben-Ari are mutual exclusion (statements from CSes of different processes must not interleave), freedom from deadlock (if some processes are trying to enter a CS, at least one must eventually succeed), and freedom from starvation (any process trying to enter its CS must eventually succeed).
Hierarchy: starvation freedom \(\Rightarrow\) deadlock freedom (if every individual gets through, then certainly some individual gets through), but not the converse.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
The three conditions for a correct critical-section solution according to Ben-Ari are {{c1::mutual exclusion (statements from CSes of different processes must not interleave)}}, {{c2::freedom from deadlock (if some processes are trying to enter a CS, at least one must eventually succeed)}}, and {{c3::freedom from starvation (any process trying to enter its CS must eventually succeed)}}. |
| Extra |
|
Hierarchy: starvation freedom \(\Rightarrow\) deadlock freedom (if every individual gets through, then certainly <i>some</i> individual gets through), but not the converse. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 8: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: /,nT@jP#V6
modified
Before
Front
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Randomisierter Algorithmus für Optimierungsprobleme
Sei \(\varepsilon > 0\) und \(A\) ein randomisierter Algorithmus, der immer eine zulässige Lösung liefert, mit\[\Pr[A(I) \geq f(I)] \geq \varepsilon\]für eine Zielschwelle \(f(I)\). Sei \(A_\delta\) der Algorithmus, der\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\]Aufrufe von \(A\) macht und die beste der erhaltenen Antworten ausgibt. Dann gilt:\[\Pr[A_\delta(I) \geq f(I)] \geq 1 - \delta.\]
Back
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Randomisierter Algorithmus für Optimierungsprobleme
Sei \(\varepsilon > 0\) und \(A\) ein randomisierter Algorithmus, der immer eine zulässige Lösung liefert, mit\[\Pr[A(I) \geq f(I)] \geq \varepsilon\]für eine Zielschwelle \(f(I)\). Sei \(A_\delta\) der Algorithmus, der\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\]Aufrufe von \(A\) macht und die beste der erhaltenen Antworten ausgibt. Dann gilt:\[\Pr[A_\delta(I) \geq f(I)] \geq 1 - \delta.\]
Selbe Iterationsschranke wie bei Las-Vegas und MC mit einseitigem Fehler. Die Idee ist analog: ein einziger Aufruf, der die Schwelle überschreitet, genügt, da das Maximum mehrerer Aufrufe genommen wird.
Typischerweise ist \(f(I) = \mathbb{E}[A(I)]\), d.h. man möchte mit hoher Wahrscheinlichkeit mindestens den Erwartungswert erreichen.
After
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
A memory model (e.g. of a programming language like Java) is a contract that gives minimal guarantees on the effects of memory operations.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
A memory model (e.g. of a programming language like Java) is a contract that gives minimal guarantees on the effects of memory operations.
Goals:
- leaves room for hardware and compiler optimisations
- but provides guidelines that allow correct multi-threaded programs to be written.
Without such a model, the actual behaviour of threads communicating via shared memory depends on hardware, runtime, and language and is undefined.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
<b>Randomisierter Algorithmus für Optimierungsprobleme</b> <br>Sei \(\varepsilon > 0\) und \(A\) ein randomisierter Algorithmus, der immer eine zulässige Lösung liefert, mit\[\Pr[A(I) \geq f(I)] \geq \varepsilon\]für eine Zielschwelle \(f(I)\). Sei \(A_\delta\) der Algorithmus, der\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\]Aufrufe von \(A\) macht und {{c2::die beste der erhaltenen Antworten}} ausgibt. Dann gilt:\[\Pr[A_\delta(I) \geq f(I)] \geq 1 - \delta.\] |
A {{c1::memory model}} (e.g. of a programming language like Java) is a {{c2::contract that gives minimal guarantees on the effects of memory operations}}. |
| Extra |
Selbe Iterationsschranke wie bei Las-Vegas und MC mit einseitigem Fehler. Die Idee ist analog: ein einziger Aufruf, der die Schwelle überschreitet, genügt, da das Maximum mehrerer Aufrufe genommen wird.<br><br>Typischerweise ist \(f(I) = \mathbb{E}[A(I)]\), d.h. man möchte mit hoher Wahrscheinlichkeit mindestens den Erwartungswert erreichen. |
Goals:<br>- leaves room for hardware and compiler optimisations<br>- but provides guidelines that allow correct multi-threaded programs to be written.<br><br>Without such a model, the actual behaviour of threads communicating via shared memory depends on hardware, runtime, and language and is undefined. |
Tags:
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
Note 9: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: /_rk8FS{Ft
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
When a state-space diagram is too large, only state transitions of the protocol matter; states with the same effect can be collapsed (e.g. \(p1\equiv p2\) in the pre-CS region, \(p4\equiv p5\) in the post-CS region). The forbidden situation then becomes simply both processes in the collapsed CS state.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
When a state-space diagram is too large, only state transitions of the protocol matter; states with the same effect can be collapsed (e.g. \(p1\equiv p2\) in the pre-CS region, \(p4\equiv p5\) in the post-CS region). The forbidden situation then becomes simply both processes in the collapsed CS state.
Collapsing keeps the proof structure intact (mutual exclusion / deadlock / starvation are checked on the reduced graph) while making it tractable to draw and reason about by hand.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
When a state-space diagram is too large, only state transitions of the protocol matter; states with the same effect can be {{c1::collapsed (e.g. \(p1\equiv p2\) in the pre-CS region, \(p4\equiv p5\) in the post-CS region)}}. The forbidden situation then becomes simply {{c2::both processes in the collapsed CS state}}. |
| Extra |
|
Collapsing keeps the proof structure intact (mutual exclusion / deadlock / starvation are checked on the reduced graph) while making it tractable to draw and reason about by hand. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 10: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 09qe2{A(
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Canonical deadlock example:
synchronized void transferTo(int amt, BankAccount a) {
this.withdraw(amt);
a.deposit(amt); // acquires a's lock
} Two concurrent calls
x.transferTo(1, y) and
y.transferTo(1, x) deadlock because
Thread 1 holds x's lock and waits for y's, while Thread 2 holds y's and waits for x's.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Canonical deadlock example:
synchronized void transferTo(int amt, BankAccount a) {
this.withdraw(amt);
a.deposit(amt); // acquires a's lock
} Two concurrent calls
x.transferTo(1, y) and
y.transferTo(1, x) deadlock because
Thread 1 holds x's lock and waits for y's, while Thread 2 holds y's and waits for x's.
The bug is intrinsic to the design: transferTo acquires two locks - the receiver's via this on entry, the target's via the synchronised a.deposit(). With no global ordering on the second acquisition, a cycle can form. Three classical fixes follow: shrink the CS, single global lock, or impose a global lock order.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
Canonical deadlock example: <pre>synchronized void transferTo(int amt, BankAccount a) {
this.withdraw(amt);
a.deposit(amt); // acquires a's lock
}</pre> Two concurrent calls <code>x.transferTo(1, y)</code> and <code>y.transferTo(1, x)</code> deadlock because {{c1::Thread 1 holds <code>x</code>'s lock and waits for <code>y</code>'s, while Thread 2 holds <code>y</code>'s and waits for <code>x</code>'s}}. |
| Extra |
|
The bug is intrinsic to the design: <code>transferTo</code> acquires two locks - the receiver's via <code>this</code> on entry, the target's via the synchronised <code>a.deposit()</code>. With no global ordering on the second acquisition, a cycle can form. Three classical fixes follow: shrink the CS, single global lock, or impose a global lock order. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Note 11: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 0~]{jI&/:2
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The
Filter Lock extends Peterson's lock from 2 to \(n\) processes by introducing
\(n-1\) levels: each thread climbs from level 1 up to level \(n-1\), and at each level \(\ell\) it sets
level[me] = \ell and
victim[\ell] = me, then waits while
\exists k \neq me: level[k] >= \ell && victim[\ell] == me
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The
Filter Lock extends Peterson's lock from 2 to \(n\) processes by introducing
\(n-1\) levels: each thread climbs from level 1 up to level \(n-1\), and at each level \(\ell\) it sets
level[me] = \ell and
victim[\ell] = me, then waits while
\exists k \neq me: level[k] >= \ell && victim[\ell] == me
Intuition: at each level, Peterson's mechanism filters out at most one thread (the current victim, if anyone else is still around). After traversing all \(n-1\) levels, at most one thread can be left, which then enters the CS.
unlock just resets level[me] = 0.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
The {{c1::Filter Lock}} extends Peterson's lock from 2 to \(n\) processes by introducing {{c2::\(n-1\) levels}}: each thread climbs from level 1 up to level \(n-1\), and at each level \(\ell\) it sets <code>level[me] = \ell</code> and <code>victim[\ell] = me</code>, then waits while <pre>\exists k \neq me: level[k] >= \ell && victim[\ell] == me</pre> |
| Extra |
|
Intuition: at each level, Peterson's mechanism filters out at most one thread (the current victim, if anyone else is still around). After traversing all \(n-1\) levels, at most one thread can be left, which then enters the CS.<br><br><code>unlock</code> just resets <code>level[me] = 0</code>. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 12: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 1*Jyz+ZJLF
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
When constructing mutex algorithms from atomic registers, three assumptions are made: atomic reads and writes of primitive-type variables; no reorderings of read/write sequences (not true in practice; assumed here for simplicity); threads that enter a critical section eventually leave it. About progress outside the critical section, no assumption is made.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
When constructing mutex algorithms from atomic registers, three assumptions are made: atomic reads and writes of primitive-type variables; no reorderings of read/write sequences (not true in practice; assumed here for simplicity); threads that enter a critical section eventually leave it. About progress outside the critical section, no assumption is made.
Threads may therefore stall, die, or pause arbitrarily long outside a CS. This matters for arguments like starvation freedom: an algorithm must not rely on the "other" thread continuing to make progress.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
When constructing mutex algorithms from atomic registers, three assumptions are made: {{c1::atomic reads and writes of primitive-type variables}}; {{c1::no reorderings of read/write sequences (not true in practice; assumed here for simplicity)}}; {{c1::threads that enter a critical section eventually leave it}}. About progress {{c2::outside the critical section, no assumption is made}}. |
| Extra |
|
Threads may therefore stall, die, or pause arbitrarily long outside a CS. This matters for arguments like starvation freedom: an algorithm must not rely on the "other" thread continuing to make progress. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 13: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 1eIQehPP;^
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
A state-space diagram is a proof technique for mutex algorithms: nodes are {{c2::global states \([p, q, \text{shared vars}]\) consisting of program counters and all shared variables}}, edges are single atomic steps of one of the threads.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
A state-space diagram is a proof technique for mutex algorithms: nodes are {{c2::global states \([p, q, \text{shared vars}]\) consisting of program counters and all shared variables}}, edges are single atomic steps of one of the threads.
A violation of mutual exclusion shows up whenever a state with both program counters in the CS is reachable (e.g. \((p4, q4, \dots)\)). Advantage over enumerating interleavings: it terminates even with loops, and it spots deadlocks/livelocks as inescapable subgraphs.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
A {{c1::state-space diagram}} is a proof technique for mutex algorithms: nodes are {{c2::global states \([p, q, \text{shared vars}]\) consisting of program counters and all shared variables}}, edges are {{c3::single atomic steps of one of the threads}}. |
| Extra |
|
A violation of mutual exclusion shows up whenever a state with <i>both</i> program counters in the CS is reachable (e.g. \((p4, q4, \dots)\)). Advantage over enumerating interleavings: it terminates even with loops, and it spots deadlocks/livelocks as inescapable subgraphs. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 14: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 4%UUWqRVem
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
When applying resource ordering and the objects do not have a natural total order, generate one: each instance is assigned a unique final long index from a class-level AtomicLong counter via counter.incrementAndGet() in the constructor. Locks are then always acquired in ascending order of index.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
When applying resource ordering and the objects do not have a natural total order, generate one: each instance is assigned a unique final long index from a class-level AtomicLong counter via counter.incrementAndGet() in the constructor. Locks are then always acquired in ascending order of index.
This is the standard pattern used inside java.util.concurrent. AtomicLong guarantees uniqueness and monotonicity even under concurrent construction, and a 64-bit counter never realistically overflows.
Using System.identityHashCode is tempting but unsafe - hashes can collide.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
When applying resource ordering and the objects do not have a natural total order, generate one: each instance is assigned {{c1::a unique <code>final long</code> index from a class-level <code>AtomicLong counter</code>}} via <code>counter.incrementAndGet()</code> in the constructor. Locks are then always acquired in {{c2::ascending order of <code>index</code>}}. |
| Extra |
|
This is the standard pattern used inside <code>java.util.concurrent</code>. <code>AtomicLong</code> guarantees uniqueness and monotonicity even under concurrent construction, and a 64-bit counter never realistically overflows.<br><br>Using <code>System.identityHashCode</code> is tempting but unsafe - hashes can collide. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Note 15: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 7/+$V!A5-^
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
java.util.concurrent.atomic.AtomicBoolean exposes two key RMW operations on top of plain get / set: compareAndSet(expect, update), which atomically writes update iff the current value equals expect and returns true on success; and getAndSet(newValue), which atomically writes newValue and returns the previous value.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
java.util.concurrent.atomic.AtomicBoolean exposes two key RMW operations on top of plain get / set: compareAndSet(expect, update), which atomically writes update iff the current value equals expect and returns true on success; and getAndSet(newValue), which atomically writes newValue and returns the previous value.
compareAndSet ≅ CAS, getAndSet ≅ TAS. Note: even though they are called "atomic", the JLS does not guarantee these map to a single hardware instruction; they may be implemented via internal locks. They are guaranteed atomic, not guaranteed lock-free.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
<code>java.util.concurrent.atomic.AtomicBoolean</code> exposes two key RMW operations on top of plain <code>get</code> / <code>set</code>: {{c1::<code>compareAndSet(expect, update)</code>}}, which atomically writes <code>update</code> iff the current value equals <code>expect</code> and {{c2::returns <code>true</code> on success}}; and {{c3::<code>getAndSet(newValue)</code>}}, which {{c4::atomically writes <code>newValue</code> and returns the previous value}}. |
| Extra |
|
<code>compareAndSet</code> ≅ CAS, <code>getAndSet</code> ≅ TAS. Note: even though they are called "atomic", the JLS does not guarantee these map to a single hardware instruction; they may be implemented via internal locks. They are guaranteed atomic, not guaranteed lock-free. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Note 16: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 7c[p[vx$A1
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
An atomic register \(r\) (a shared memory object, not a CPU register) supports r.read() and r.write(v) with four properties: each invocation \(J\) takes effect at a single point in time \(\tau(J)\); \(\tau(J)\) lies between the start and end of \(J\); two operations on the same register have distinct effect times \(\tau(J) \neq \tau(K)\); an invocation of r.read() returns the value written by the r.write(v) invocation with the closest preceding effect time.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
An atomic register \(r\) (a shared memory object, not a CPU register) supports r.read() and r.write(v) with four properties: each invocation \(J\) takes effect at a single point in time \(\tau(J)\); \(\tau(J)\) lies between the start and end of \(J\); two operations on the same register have distinct effect times \(\tau(J) \neq \tau(K)\); an invocation of r.read() returns the value written by the r.write(v) invocation with the closest preceding effect time.
These four conditions justify treating each operation as if it happened instantaneously at \(\tau(J)\), even though physically it spans an interval. This is the foundation for proofs about Peterson, Filter, Bakery, etc.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
An <i>atomic register</i> \(r\) (a shared memory object, not a CPU register) supports <code>r.read()</code> and <code>r.write(v)</code> with four properties: each invocation \(J\) {{c1::takes effect at a single point in time \(\tau(J)\)}}; \(\tau(J)\) {{c1::lies between the start and end of \(J\)}}; two operations on the same register {{c1::have distinct effect times \(\tau(J) \neq \tau(K)\)}}; an invocation of <code>r.read()</code> returns {{c1::the value written by the <code>r.write(v)</code> invocation with the closest preceding effect time}}. |
| Extra |
|
These four conditions justify treating each operation as if it happened instantaneously at \(\tau(J)\), even though physically it spans an interval. This is the foundation for proofs about Peterson, Filter, Bakery, etc. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 17: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 7gI{,sQ;#o
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The 3rd-try mutex (strict alternation)
p2: while(turn != 1);
p3: critical section
p4: turn = 2
satisfies mutual exclusion but allows
starvation, because
a process stuck in its non-CS never resets turn, so the other process is forced to wait forever.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The 3rd-try mutex (strict alternation)
p2: while(turn != 1);
p3: critical section
p4: turn = 2
satisfies mutual exclusion but allows
starvation, because
a process stuck in its non-CS never resets turn, so the other process is forced to wait forever.
This is exactly why we make no assumption about progress outside the CS: a correct mutex must not rely on the other thread continuing to take turns.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
The 3rd-try mutex (strict alternation) <pre>p2: while(turn != 1);
p3: critical section
p4: turn = 2</pre> satisfies mutual exclusion but allows {{c1::starvation}}, because {{c2::a process stuck in its non-CS never resets <code>turn</code>, so the other process is forced to wait forever}}. |
| Extra |
|
This is exactly why we make <i>no</i> assumption about progress outside the CS: a correct mutex must not rely on the other thread continuing to take turns. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 18: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 7js:~
modified
Before
Front
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Bei einem Las-Vegas-Algorithmus ist die Laufzeit eine Zufallsvariable, die Korrektheit hingegen nicht vom Zufall abhängig.
Ziel: immer korrekt/gut, meistens schnell.
Alternative Sichtweise: Algorithmus, der manchmal '???' statt eines Ergebnisses ausgibt. Man kann ihn dann entweder wiederholen, bis ein Ergebnis kommt, oder nach fester Laufzeit abbrechen.
Back
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Bei einem Las-Vegas-Algorithmus ist die Laufzeit eine Zufallsvariable, die Korrektheit hingegen nicht vom Zufall abhängig.
Ziel: immer korrekt/gut, meistens schnell.
Alternative Sichtweise: Algorithmus, der manchmal '???' statt eines Ergebnisses ausgibt. Man kann ihn dann entweder wiederholen, bis ein Ergebnis kommt, oder nach fester Laufzeit abbrechen.
Die alternative Sichtweise ist die Brücke zum Monte-Carlo-Algorithmus: Lässt man den Las-Vegas-Algorithmus nach fester Laufzeit abbrechen, wird daraus ein Monte-Carlo-Algorithmus mit einseitigem Fehler.
After
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Rule of thumb for memory reordering: the compiler and the hardware are allowed to make any change that does not affect the semantics of a sequential execution.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Rule of thumb for memory reordering: the compiler and the hardware are allowed to make any change that does not affect the semantics of a sequential execution.
Example: x=1; y=x+1; z=x+1; is sequentially equivalent to x=1; z=x+1; y=x+1; and even to x=1; z=2; y=2;.
In a parallel context these reorderings can become observable and break correctness.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
Bei einem <b>Las-Vegas-Algorithmus</b> ist {{c1::die Laufzeit}} eine Zufallsvariable, die {{c1::Korrektheit}} hingegen nicht vom Zufall abhängig.<br><br>Ziel: {{c2::immer korrekt/gut, meistens schnell}}.<br><br><b>Alternative Sichtweise:</b> Algorithmus, der manchmal {{c3::'???' statt eines Ergebnisses}} ausgibt. Man kann ihn dann entweder {{c4::wiederholen, bis ein Ergebnis kommt}}, oder {{c4::nach fester Laufzeit abbrechen}}. |
Rule of thumb for memory reordering: the compiler and the hardware are allowed to make any change that does {{c1::not affect the semantics of a <i>sequential</i> execution}}. |
| Extra |
Die alternative Sichtweise ist die Brücke zum Monte-Carlo-Algorithmus: Lässt man den Las-Vegas-Algorithmus nach fester Laufzeit abbrechen, wird daraus ein Monte-Carlo-Algorithmus mit einseitigem Fehler. |
Example: <code>x=1; y=x+1; z=x+1;</code> is sequentially equivalent to <code>x=1; z=x+1; y=x+1;</code> and even to <code>x=1; z=2; y=2;</code>.<br><br>In a parallel context these reorderings can become observable and break correctness. |
Tags:
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Note 19: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 7kNVF}Z|oG
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
Which memory operations actually get reordered depends on the hardware architecture; e.g. AMD64 (x86) differs significantly from ARM: x86 essentially only allows "stores reordered after loads", while ARM, POWER, and Alpha permit almost every kind of reordering.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
Which memory operations actually get reordered depends on the hardware architecture; e.g. AMD64 (x86) differs significantly from ARM: x86 essentially only allows "stores reordered after loads", while ARM, POWER, and Alpha permit almost every kind of reordering.
Consequence: correct multi-threaded code that "works" on x86 can break on ARM without any change to the source. This is exactly why a memory model is needed as a language-level guarantee.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
Which memory operations actually get reordered depends on the {{c1::hardware architecture}}; e.g. {{c2::AMD64 (x86) differs significantly from ARM}}: x86 essentially only allows "stores reordered after loads", while ARM, POWER, and Alpha permit almost every kind of reordering. |
| Extra |
|
Consequence: correct multi-threaded code that "works" on x86 can break on ARM without any change to the source. This is exactly why a memory model is needed as a language-level guarantee. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
Note 20: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: :UEQ*@nnq1
modified
Before
Front
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Üppige-Auswahl-Problem (gegeben \(x_1, \ldots, x_n \in \{0, 1\}\) mit \(n/2\) Einsen, finde \(i\) mit \(x_i = 1\)):
Was ist die Laufzeit des schnellsten deterministischen Algorithmus, was die erwartete Laufzeit des schnellsten Las-Vegas-Algorithmus?
Back
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Üppige-Auswahl-Problem (gegeben \(x_1, \ldots, x_n \in \{0, 1\}\) mit \(n/2\) Einsen, finde \(i\) mit \(x_i = 1\)):
Was ist die Laufzeit des schnellsten deterministischen Algorithmus, was die erwartete Laufzeit des schnellsten Las-Vegas-Algorithmus?
Deterministisch: \(\Theta(n)\) (Worst Case).
Las-Vegas: \(O(1)\) erwartet, unabhängig von \(n\).
Das ist ein dramatischer Unterschied und ein Paradebeispiel für den Vorteil von Randomisierung: der Las-Vegas-Algorithmus probiert wiederholt zufällige Indizes und stoppt, sobald er eine \(1\) findet. Da die Hälfte der Bits \(1\) ist, ist die erwartete Anzahl Versuche \(\mathbb{E}[T] = 2\).
Die deterministische untere Schranke folgt aus einem Adversary-Argument: gegen jeden festen Algorithmus kann man eine Instanz konstruieren, auf der er \(n/2 + 1\) Bits lesen muss.
After
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
The Synchronization Order (SO) in the JMM is a total order over all synchronization actions; all threads see the SA in the same order; the SA inside a thread occur in PO; SO is consistent: every read in SO sees the latest write in SO.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
The Synchronization Order (SO) in the JMM is a total order over all synchronization actions; all threads see the SA in the same order; the SA inside a thread occur in PO; SO is consistent: every read in SO sees the latest write in SO.
The globality of SO is the reason that volatile and locks can synchronise threads at all.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
Üppige-Auswahl-Problem (gegeben \(x_1, \ldots, x_n \in \{0, 1\}\) mit \(n/2\) Einsen, finde \(i\) mit \(x_i = 1\)):<br><br>Was ist die Laufzeit des schnellsten <b>deterministischen</b> Algorithmus, was die <b>erwartete</b> Laufzeit des schnellsten <b>Las-Vegas</b>-Algorithmus? |
The <i>Synchronization Order (SO)</i> in the JMM is {{c1::a total order over all synchronization actions}}; {{c1::all threads see the SA in the same order}}; {{c1::the SA inside a thread occur in PO}}; {{c1::SO is consistent: every read in SO sees the latest write in SO}}. |
| Extra |
Deterministisch: \(\Theta(n)\) (Worst Case).<br>Las-Vegas: \(O(1)\) erwartet, unabhängig von \(n\).<br><br>Das ist ein dramatischer Unterschied und ein Paradebeispiel für den Vorteil von Randomisierung: der Las-Vegas-Algorithmus probiert wiederholt zufällige Indizes und stoppt, sobald er eine \(1\) findet. Da die Hälfte der Bits \(1\) ist, ist die erwartete Anzahl Versuche \(\mathbb{E}[T] = 2\).<br><br>Die deterministische untere Schranke folgt aus einem Adversary-Argument: gegen jeden festen Algorithmus kann man eine Instanz konstruieren, auf der er \(n/2 + 1\) Bits lesen muss. |
The globality of SO is the reason that <code>volatile</code> and locks can synchronise threads at all. |
Tags:
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
Note 21: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID:
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Burns-Lynch theorem (1993): if a system of \(n\) processes solves mutual exclusion with deadlock freedom using only atomic read/write registers, then it must use at least \(n\) shared variables.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Burns-Lynch theorem (1993): if a system of \(n\) processes solves mutual exclusion with deadlock freedom using only atomic read/write registers, then it must use at least \(n\) shared variables.
Intuition: a single read/write only ever exposes the last write, so the system cannot distinguish the order in which \(n\) threads wrote unless it has a separate slot per thread. This is exactly why Filter Lock and Bakery Lock both need \(O(n)\) shared state. Read-modify-write primitives sidestep the bound by combining the read and the write atomically.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
Burns-Lynch theorem (1993): if a system of \(n\) processes solves mutual exclusion with deadlock freedom using only {{c1::atomic read/write registers}}, then it must use at least {{c2::\(n\) shared variables}}. |
| Extra |
|
Intuition: a single read/write only ever exposes the last write, so the system cannot distinguish the order in which \(n\) threads wrote unless it has a separate slot per thread. This is exactly why Filter Lock and Bakery Lock both need \(O(n)\) shared state. Read-modify-write primitives sidestep the bound by combining the read and the write atomically. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 22: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: >aq{oB;vc7
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The structure of the critical-section problem per process inside an infinite loop is: non-critical section \(\to\) preprotocol \(\to\) critical section \(\to\) postprotocol.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The structure of the critical-section problem per process inside an infinite loop is: non-critical section \(\to\) preprotocol \(\to\) critical section \(\to\) postprotocol.
Preprotocol = entry code (lock acquire). Postprotocol = exit code (lock release). Only global variables are shared between processes; local variables are private per process.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
The structure of the critical-section problem per process inside an infinite loop is: {{c1::non-critical section}} \(\to\) {{c2::preprotocol}} \(\to\) {{c3::critical section}} \(\to\) {{c4::postprotocol}}. |
| Extra |
|
Preprotocol = entry code (lock acquire). Postprotocol = exit code (lock release). Only global variables are shared between processes; local variables are private per process. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 23: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: ?7[F1KIShh
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Threads produce a sequence of events; the \(j\)-th occurrence of event \(i\) in thread \(P\) is denoted \(p_i^j\). The precedence relation \(a \to b\) ("\(a\) occurs before \(b\)") is a total order over events.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Threads produce a sequence of events; the \(j\)-th occurrence of event \(i\) in thread \(P\) is denoted \(p_i^j\). The precedence relation \(a \to b\) ("\(a\) occurs before \(b\)") is a total order over events.
Indexed events are needed because programs typically loop, so the same line of code can produce many events; \(p_5^3\) means "event \(p_5\) in the third iteration".
Total ordering models a single global timeline of when events actually happen (we are not in a relativistic setting).
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
Threads produce a sequence of <i>events</i>; the \(j\)-th occurrence of event \(i\) in thread \(P\) is denoted {{c1::\(p_i^j\)}}. The <i>precedence relation</i> \(a \to b\) ("\(a\) occurs before \(b\)") is {{c2::a total order over events}}. |
| Extra |
|
Indexed events are needed because programs typically loop, so the same line of code can produce many events; \(p_5^3\) means "event \(p_5\) in the third iteration".<br><br>Total ordering models a single global timeline of when events actually happen (we are not in a relativistic setting). |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 24: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: @Id@]fUyPJ
modified
Before
Front
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Monte-Carlo, Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\)
Sei \(\varepsilon > 0\) und \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\Pr[A(I) \text{ korrekt}] \geq \tfrac{1}{2} + \varepsilon.\]Sei \(A_\delta\) der Algorithmus, der\[N = {{c1::\left\lceil 4\varepsilon^{-2} \ln \delta^{-1} \right\rceil}}\]Aufrufe von \(A\) macht und die Mehrheit der erhaltenen Antworten ausgibt. Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]
Back
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Monte-Carlo, Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\)
Sei \(\varepsilon > 0\) und \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\Pr[A(I) \text{ korrekt}] \geq \tfrac{1}{2} + \varepsilon.\]Sei \(A_\delta\) der Algorithmus, der\[N = {{c1::\left\lceil 4\varepsilon^{-2} \ln \delta^{-1} \right\rceil}}\]Aufrufe von \(A\) macht und die Mehrheit der erhaltenen Antworten ausgibt. Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]
Achtung: \(N\) skaliert hier quadratisch in \(\varepsilon^{-1}\) (statt linear wie bei einseitigem Fehler oder Las-Vegas) und nutzt nicht die erste richtige Antwort, sondern den Mehrheitsentscheid.
Beweis nutzt die Chernoff-Schranke \(\Pr[X \leq (1-\eta)\mathbb{E}[X]] \leq e^{-\eta^2 \mathbb{E}[X]/2}\) auf die Anzahl korrekter Aufrufe.
After
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::2._Memory_Hierarchy
In a multi-core memory hierarchy each core has its own registers and its own L1 cache (private), while the L2 cache and system memory are shared between cores.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::2._Memory_Hierarchy
In a multi-core memory hierarchy each core has its own registers and its own L1 cache (private), while the L2 cache and system memory are shared between cores.
Consequence: when core 1 writes to its L1, core 2 sees the change only after it has propagated downwards (cache coherence protocol). This is a primary source of observable memory reordering at the hardware level.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
<b>Monte-Carlo, Fehlerwahrscheinlichkeit </b>\(< \tfrac{1}{2}\) <br>Sei \(\varepsilon > 0\) und \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\Pr[A(I) \text{ korrekt}] \geq \tfrac{1}{2} + \varepsilon.\]Sei \(A_\delta\) der Algorithmus, der\[N = {{c1::\left\lceil 4\varepsilon^{-2} \ln \delta^{-1} \right\rceil}}\]Aufrufe von \(A\) macht und {{c2::die Mehrheit der erhaltenen Antworten}} ausgibt. Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\] |
In a multi-core memory hierarchy each core has {{c1::its own registers and its own L1 cache (private)}}, while {{c2::the L2 cache and system memory are shared between cores}}. |
| Extra |
Achtung: \(N\) skaliert hier <b>quadratisch</b> in \(\varepsilon^{-1}\) (statt linear wie bei einseitigem Fehler oder Las-Vegas) und nutzt nicht die erste richtige Antwort, sondern den Mehrheitsentscheid.<br><br>Beweis nutzt die Chernoff-Schranke \(\Pr[X \leq (1-\eta)\mathbb{E}[X]] \leq e^{-\eta^2 \mathbb{E}[X]/2}\) auf die Anzahl korrekter Aufrufe. |
Consequence: when core 1 writes to its L1, core 2 sees the change only after it has propagated downwards (cache coherence protocol). This is a primary source of observable memory reordering at the hardware level. |
Tags:
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::2._Memory_Hierarchy
Note 25: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: @rzMyS*OnN
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
Accesses to volatile fields in Java do not count as a data race. In terms of performance, volatile is slower than ordinary fields, but faster than locks.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
Accesses to volatile fields in Java do not count as a data race. In terms of performance, volatile is slower than ordinary fields, but faster than locks.
Recommendation: only for experts; otherwise use the standard library (java.util.concurrent, AtomicInteger, …).
Caveat: volatile guarantees visibility, but not atomicity of compound operations like i++.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
Accesses to <code>volatile</code> fields in Java do {{c1::not count as a data race}}. In terms of performance, <code>volatile</code> is {{c2::slower than ordinary fields, but faster than locks}}. |
| Extra |
|
Recommendation: only for experts; otherwise use the standard library (<code>java.util.concurrent</code>, <code>AtomicInteger</code>, …).<br><br>Caveat: <code>volatile</code> guarantees visibility, but <b>not atomicity of compound operations</b> like <code>i++</code>. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
Note 26: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: Aajp);{.(r
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
ARM (and MIPS, POWER, RISC-V) implement atomic RMW via a Load-Linked / Store-Conditional pair: LDREX loads from memory and marks the address as exclusive for this core; the subsequent STREX performs a conditional store that only succeeds if no other core has written to the address in the meantime.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
ARM (and MIPS, POWER, RISC-V) implement atomic RMW via a Load-Linked / Store-Conditional pair: LDREX loads from memory and marks the address as exclusive for this core; the subsequent STREX performs a conditional store that only succeeds if no other core has written to the address in the meantime.
If STREX fails, software retries the LL/SC pair. Compared with x86's monolithic CMPXCHG, LL/SC is more flexible (you can compute the new value between LL and SC) but requires explicit retry logic in software.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
ARM (and MIPS, POWER, RISC-V) implement atomic RMW via a {{c1::Load-Linked / Store-Conditional}} pair: <code>LDREX</code> loads from memory and {{c2::marks the address as exclusive for this core}}; the subsequent <code>STREX</code> performs {{c3::a conditional store that only succeeds if no other core has written to the address in the meantime}}. |
| Extra |
|
If <code>STREX</code> fails, software retries the LL/SC pair. Compared with x86's monolithic <code>CMPXCHG</code>, LL/SC is more flexible (you can compute the new value between LL and SC) but requires explicit retry logic in software. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Note 27: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: ApSiaX&)Ab
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Lamport's Bakery algorithm solves \(n\)-process mutex by assigning each thread a
ticket (label) larger than all outstanding tickets:
flag[me] = true;
label[me] = max(label[0..n-1]) + 1;
while(\exists k != me: flag[k] && (k, label[k]) <_l (me, label[me])) {}; where ties are broken by
lexicographic order on (thread id, label), so two equal labels are still totally ordered.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Lamport's Bakery algorithm solves \(n\)-process mutex by assigning each thread a
ticket (label) larger than all outstanding tickets:
flag[me] = true;
label[me] = max(label[0..n-1]) + 1;
while(\exists k != me: flag[k] && (k, label[k]) <_l (me, label[me])) {}; where ties are broken by
lexicographic order on (thread id, label), so two equal labels are still totally ordered.
Properties: mutual exclusion, deadlock freedom, starvation freedom, and FCFS (the doorway is the label assignment).
The bug-resistant trick is the lexicographic tie-break: two threads can read the same max simultaneously and end up with equal labels, but the smaller-id thread still goes first - no live or dead lock.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
{{c1::Lamport's Bakery algorithm}} solves \(n\)-process mutex by assigning each thread a {{c2::ticket (label) larger than all outstanding tickets}}: <pre>flag[me] = true;
label[me] = max(label[0..n-1]) + 1;
while(\exists k != me: flag[k] && (k, label[k]) <_l (me, label[me])) {};</pre> where ties are broken by {{c3::lexicographic order on (thread id, label), so two equal labels are still totally ordered}}. |
| Extra |
|
Properties: mutual exclusion, deadlock freedom, starvation freedom, and FCFS (the doorway is the label assignment).<br><br>The bug-resistant trick is the lexicographic tie-break: two threads can read the same max simultaneously and end up with equal labels, but the smaller-id thread still goes first - no live or dead lock. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 28: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: C9O/B$skj#
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Peterson's starvation-freedom proof is by exhaustive contradiction: assume \(P\) is stuck in its while loop forever. Then \(Q\) is in one of three cases - stuck in non-CS (then flag[Q]=false, \(P\) proceeds, contradiction); repeatedly entering and leaving its CS (each entry sets victim=Q, freezing victim away from \(P\), so \(P\) proceeds, contradiction); also stuck in its lock loop (then \(Q\) needs victim != Q, but victim can equal at most one of \(P,Q\), contradiction).
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Peterson's starvation-freedom proof is by exhaustive contradiction: assume \(P\) is stuck in its while loop forever. Then \(Q\) is in one of three cases - stuck in non-CS (then flag[Q]=false, \(P\) proceeds, contradiction); repeatedly entering and leaving its CS (each entry sets victim=Q, freezing victim away from \(P\), so \(P\) proceeds, contradiction); also stuck in its lock loop (then \(Q\) needs victim != Q, but victim can equal at most one of \(P,Q\), contradiction).
The case split is exhaustive because these are the only three possible long-term behaviours of \(Q\) under our assumptions. Each leads to a contradiction, so \(P\) cannot be stuck.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
Peterson's starvation-freedom proof is by exhaustive contradiction: assume \(P\) is stuck in its <code>while</code> loop forever. Then \(Q\) is in one of three cases - {{c1::stuck in non-CS (then <code>flag[Q]=false</code>, \(P\) proceeds, contradiction)}}; {{c2::repeatedly entering and leaving its CS (each entry sets <code>victim=Q</code>, freezing <code>victim</code> away from \(P\), so \(P\) proceeds, contradiction)}}; {{c3::also stuck in its lock loop (then \(Q\) needs <code>victim != Q</code>, but <code>victim</code> can equal at most one of \(P,Q\), contradiction)}}. |
| Extra |
|
The case split is exhaustive because these are the only three possible long-term behaviours of \(Q\) under our assumptions. Each leads to a contradiction, so \(P\) cannot be stuck. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 29: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: CR2my^EB=I
modified
Before
Front
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Ein
klassischer Algorithmus \(A\) erhält Eingabe \(I\) und liefert Ausgabe \(A(I)\). Es gilt:
- \(A(I)\) ist immer genau und korrekt,
- bei gleicher Eingabe immer dieselbe Ausgabe (\(A\) ist eine Funktion),
- dieselbe Laufzeit.
Ein
randomisierter Algorithmus erhält zusätzlich eine Zufallsquelle \(R\) und liefert Ausgabe \(A(I, R)\). Die Ausgabe ist
manchmal richtig, manchmal fast richtig, manchmal schnell, das Verhalten ist
nicht reproduzierbar. \(A\) ist nach wie vor eine Funktion auf den Paaren \((I, R)\).
Back
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Ein
klassischer Algorithmus \(A\) erhält Eingabe \(I\) und liefert Ausgabe \(A(I)\). Es gilt:
- \(A(I)\) ist immer genau und korrekt,
- bei gleicher Eingabe immer dieselbe Ausgabe (\(A\) ist eine Funktion),
- dieselbe Laufzeit.
Ein
randomisierter Algorithmus erhält zusätzlich eine Zufallsquelle \(R\) und liefert Ausgabe \(A(I, R)\). Die Ausgabe ist
manchmal richtig, manchmal fast richtig, manchmal schnell, das Verhalten ist
nicht reproduzierbar. \(A\) ist nach wie vor eine Funktion auf den Paaren \((I, R)\).
Reproduzierbarkeit ist die zentrale Eigenschaft, die beim Übergang zum randomisierten Algorithmus verloren geht.
After
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
In the classic example with \(x{=}1;\,y{=}1\) (Thread 1) and \(a{=}y;\,b{=}x\) (Thread 2), every possible interleaving satisfies: \(b \geq a\) is always true. This proof technique is called proof by exhaustion (full enumeration of interleavings).
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
In the classic example with \(x{=}1;\,y{=}1\) (Thread 1) and \(a{=}y;\,b{=}x\) (Thread 2), every possible interleaving satisfies: \(b \geq a\) is always true. This proof technique is called proof by exhaustion (full enumeration of interleavings).
Yet the assertion assert(b >= a) can still fail in practice: pure interleaving reasoning ignores memory reordering by the compiler and the hardware.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
Ein <b>klassischer</b> Algorithmus \(A\) erhält Eingabe \(I\) und liefert Ausgabe \(A(I)\). Es gilt:<ul><li>\(A(I)\) ist immer {{c1::genau und korrekt}},</li><li>bei gleicher Eingabe immer {{c2::dieselbe Ausgabe (\(A\) ist eine Funktion)}},</li><li>{{c3::dieselbe Laufzeit}}.</li></ul>Ein <b>randomisierter</b> Algorithmus erhält zusätzlich eine Zufallsquelle \(R\) und liefert Ausgabe \(A(I, R)\). Die Ausgabe ist {{c4::manchmal richtig, manchmal fast richtig, manchmal schnell}}, das Verhalten ist {{c5::nicht reproduzierbar}}. \(A\) ist nach wie vor eine Funktion auf den Paaren \((I, R)\). |
In the classic example with \(x{=}1;\,y{=}1\) (Thread 1) and \(a{=}y;\,b{=}x\) (Thread 2), every possible interleaving satisfies: {{c1::\(b \geq a\) is always true}}. This proof technique is called {{c2::proof by exhaustion (full enumeration of interleavings)}}. |
| Extra |
Reproduzierbarkeit ist die zentrale Eigenschaft, die beim Übergang zum randomisierten Algorithmus verloren geht. |
Yet the assertion <code>assert(b >= a)</code> can still fail in practice: pure interleaving reasoning ignores <b>memory reordering</b> by the compiler and the hardware. |
Tags:
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Note 30: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: D|&w=l8gFV
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
The
Test-And-Test-And-Set (TATAS) lock improves on TAS by spinning on a plain read first:
do {
while(state.get()) {}
} while(!state.compareAndSet(false, true)); The inner loop only
reads the cached value, generating no coherence traffic while the lock stays held; the costly RMW is attempted only
after the lock looks free.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
The
Test-And-Test-And-Set (TATAS) lock improves on TAS by spinning on a plain read first:
do {
while(state.get()) {}
} while(!state.compareAndSet(false, true)); The inner loop only
reads the cached value, generating no coherence traffic while the lock stays held; the costly RMW is attempted only
after the lock looks free.
Caveat: TATAS does not generalise. The classical "double-checked locking" pattern - a TATAS-style optimisation around lazy initialisation - is famously broken without explicit memory barriers, because compilers and hardware may reorder the publish of the new object with respect to the flag write.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
The <i>Test-And-Test-And-Set (TATAS)</i> lock improves on TAS by spinning on a plain read first: <pre>do {
while(state.get()) {}
} while(!state.compareAndSet(false, true));</pre> The inner loop only {{c1::reads the cached value, generating no coherence traffic while the lock stays held}}; the costly RMW is attempted only {{c2::after the lock <i>looks</i> free}}. |
| Extra |
|
Caveat: TATAS does not generalise. The classical "double-checked locking" pattern - a TATAS-style optimisation around lazy initialisation - is famously broken without explicit memory barriers, because compilers and hardware may reorder the publish of the new object with respect to the flag write. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Note 31: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: F?scKnbfC[
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
On x86 the CMPXCHG mem, reg instruction implements compare-and-swap: it compares register A with a memory location and, if equal, copies the second operand into the location and sets ZF. To make it atomic across cores, it must be combined with the LOCK prefix.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
On x86 the CMPXCHG mem, reg instruction implements compare-and-swap: it compares register A with a memory location and, if equal, copies the second operand into the location and sets ZF. To make it atomic across cores, it must be combined with the LOCK prefix.
Plain CMPXCHG is atomic only with respect to other observers on the same core. The LOCK prefix locks the bus / cache line, which is what gives the instruction its inter-processor atomicity guarantee - and also what makes RMW operations 10-100x slower than ordinary loads.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
On x86 the {{c1::<code>CMPXCHG</code> mem, reg}} instruction implements compare-and-swap: it compares register A with a memory location and, if equal, copies the second operand into the location and sets ZF. To make it atomic across cores, it must be combined with {{c2::the <code>LOCK</code> prefix}}. |
| Extra |
|
Plain <code>CMPXCHG</code> is atomic only with respect to other observers on the same core. The <code>LOCK</code> prefix locks the bus / cache line, which is what gives the instruction its inter-processor atomicity guarantee - and also what makes RMW operations 10-100x slower than ordinary loads. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Note 32: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: FGboL4isC9
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Another deadlock fix is to protect every transferTo with a single static global lock, which trivially avoids cycles because there is only one lock to acquire. Cost: no two transfers can run in parallel, even on completely disjoint pairs of accounts.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Another deadlock fix is to protect every transferTo with a single static global lock, which trivially avoids cycles because there is only one lock to acquire. Cost: no two transfers can run in parallel, even on completely disjoint pairs of accounts.
This sacrifices nearly all the available concurrency for safety. It is a sensible default when contention is rare or when correctness clearly trumps throughput, but rarely the right answer at scale.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
Another deadlock fix is to protect every <code>transferTo</code> with {{c1::a single static global lock}}, which trivially avoids cycles because there is only one lock to acquire. Cost: {{c2::no two transfers can run in parallel, even on completely disjoint pairs of accounts}}. |
| Extra |
|
This sacrifices nearly all the available concurrency for safety. It is a sensible default when contention is rare or when correctness clearly trumps throughput, but rarely the right answer at scale. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Note 33: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: Fc~lyNYx7K
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
A naive TAS-based spinlock
while(state.getAndSet(true)) {} scales poorly because
every spinning thread issues writes to the lock variable, which
generates bus contention and forces the cache-coherence protocol to invalidate all other cached copies on every retry.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
A naive TAS-based spinlock
while(state.getAndSet(true)) {} scales poorly because
every spinning thread issues writes to the lock variable, which
generates bus contention and forces the cache-coherence protocol to invalidate all other cached copies on every retry.
Result: the runtime per thread grows roughly linearly (or worse) with the number of contending threads. Each spinning core hammers the bus, so even threads not currently trying to acquire the lock pay a cost. The fix is to spin on a read until the lock looks free (TTAS), then attempt the RMW only optimistically.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
A naive TAS-based spinlock <pre>while(state.getAndSet(true)) {}</pre> scales poorly because {{c1::every spinning thread issues writes to the lock variable}}, which {{c2::generates bus contention and forces the cache-coherence protocol to invalidate all other cached copies on every retry}}. |
| Extra |
|
Result: the runtime per thread grows roughly linearly (or worse) with the number of contending threads. Each spinning core hammers the bus, so even threads not currently trying to acquire the lock pay a cost. The fix is to spin on a read until the lock looks free (TTAS), then attempt the RMW only optimistically. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Note 34: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: Jj5bulRE#&
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
In the 1st-try mutex
p2: while(wantq);
p3: wantp = true;
p4: critical section
the order "first wait for the neighbour, then set my own flag" violates
mutual exclusion, because {{c2::both threads can leave their
while with
wantp{=}wantq{=}false before either of them sets its flag}}.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
In the 1st-try mutex
p2: while(wantq);
p3: wantp = true;
p4: critical section
the order "first wait for the neighbour, then set my own flag" violates
mutual exclusion, because {{c2::both threads can leave their
while with
wantp{=}wantq{=}false before either of them sets its flag}}.
Both threads enter the CS together; visible as state \((p4, q4, true, true)\) in the state-space diagram.
Naive fix: set the flag before the check. That solves ME but creates new problems (deadlock/livelock); see Dekker / Peterson.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
In the 1st-try mutex <pre>p2: while(wantq);
p3: wantp = true;
p4: critical section</pre> the order "first wait for the neighbour, then set my own flag" violates {{c1::mutual exclusion}}, because {{c2::both threads can leave their <code>while</code> with <code>wantp{=}wantq{=}false</code> before either of them sets its flag}}. |
| Extra |
|
Both threads enter the CS together; visible as state \((p4, q4, true, true)\) in the state-space diagram.<br><br>Naive fix: set the flag <i>before</i> the check. That solves ME but creates new problems (deadlock/livelock); see Dekker / Peterson. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 35: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: MH1KRE1DzH
modified
Before
Front
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Las-Vegas-Algorithmus für das Üppige-Auswahl-Problemrepeat:
i := Uniform({1, ..., n})
if x_i = 1: return iSei \(T\) die Anzahl an Wiederholungen. Da \(\Pr[x_i = 1] = \Pr[x_i = 0] = \tfrac{1}{2}\) für einen zufälligen Index \(i\), gilt:\[\Pr[T > k] = {{c1::\left(\tfrac{1}{2}\right)^k}}.\]Folglich:\[\Pr[T \leq k] = {{c2::1 - \left(\tfrac{1}{2}\right)^k}}.\]
Back
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Las-Vegas-Algorithmus für das Üppige-Auswahl-Problemrepeat:
i := Uniform({1, ..., n})
if x_i = 1: return iSei \(T\) die Anzahl an Wiederholungen. Da \(\Pr[x_i = 1] = \Pr[x_i = 0] = \tfrac{1}{2}\) für einen zufälligen Index \(i\), gilt:\[\Pr[T > k] = {{c1::\left(\tfrac{1}{2}\right)^k}}.\]Folglich:\[\Pr[T \leq k] = {{c2::1 - \left(\tfrac{1}{2}\right)^k}}.\]
\(T\) ist geometrisch verteilt mit Parameter \(p = \tfrac{1}{2}\).
Konkrete Werte: \(\Pr[T > 10] = 1/1024 < 0{,}001\), also \(\Pr[T \leq 10] > 0{,}999\). Mit \(\leq 20\) Wiederholungen erreicht man Wahrscheinlichkeit \(> 0{,}999999\).
Wichtig: \(T\) hängt nicht von \(n\) ab, weil die Erfolgswahrscheinlichkeit pro Versuch konstant \(\tfrac{1}{2}\) ist (genau die Hälfte der Bits sind \(1\)).
After
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
Program Order (PO) in the JMM is a total order over the actions of a single thread, defined on an execution trace. PO gives no ordering guarantee for memory accesses and is not total across threads.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
Program Order (PO) in the JMM is a total order over the actions of a single thread, defined on an execution trace. PO gives no ordering guarantee for memory accesses and is not total across threads.
Its only purpose is to link an execution back to the original program.
Intra-thread consistency: per thread, PO is consistent with the thread's isolated execution. E.g. taking the else-branch of an if whose condition was true makes the execution invalid.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
<b>Las-Vegas-Algorithmus für das Üppige-Auswahl-Problem</b><pre>repeat:
i := Uniform({1, ..., n})
if x_i = 1: return i</pre>Sei \(T\) die Anzahl an Wiederholungen. Da \(\Pr[x_i = 1] = \Pr[x_i = 0] = \tfrac{1}{2}\) für einen zufälligen Index \(i\), gilt:\[\Pr[T > k] = {{c1::\left(\tfrac{1}{2}\right)^k}}.\]Folglich:\[\Pr[T \leq k] = {{c2::1 - \left(\tfrac{1}{2}\right)^k}}.\] |
<i>Program Order (PO)</i> in the JMM is a {{c1::total order over the actions of a single thread, defined on an execution trace}}. PO gives {{c2::no ordering guarantee for memory accesses}} and is {{c2::not total across threads}}. |
| Extra |
\(T\) ist geometrisch verteilt mit Parameter \(p = \tfrac{1}{2}\).<br><br>Konkrete Werte: \(\Pr[T > 10] = 1/1024 < 0{,}001\), also \(\Pr[T \leq 10] > 0{,}999\). Mit \(\leq 20\) Wiederholungen erreicht man Wahrscheinlichkeit \(> 0{,}999999\).<br><br>Wichtig: \(T\) hängt <b>nicht von \(n\)</b> ab, weil die Erfolgswahrscheinlichkeit pro Versuch konstant \(\tfrac{1}{2}\) ist (genau die Hälfte der Bits sind \(1\)). |
Its only purpose is to link an execution back to the original program.<br><br><b>Intra-thread consistency:</b> per thread, PO is consistent with the thread's isolated execution. E.g. taking the <code>else</code>-branch of an <code>if</code> whose condition was true makes the execution invalid. |
Tags:
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
Note 36: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: O96Epv!-$o
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Exponential-backoff lock: when a TATAS acquire attempt fails, the thread {{c1::sleeps for a random duration drawn from \([0, \text{limit}]\), then doubles limit up to maxDelay}} before retrying. This removes the synchronised stampede of all waiters trying to acquire simultaneously the moment the lock is released.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Exponential-backoff lock: when a TATAS acquire attempt fails, the thread {{c1::sleeps for a random duration drawn from \([0, \text{limit}]\), then doubles limit up to maxDelay}} before retrying. This removes the synchronised stampede of all waiters trying to acquire simultaneously the moment the lock is released.
Empirically this gives near-flat scaling under high contention (vs. linear-or-worse for TAS / TTAS). Trade-offs: a thread may be sleeping when the lock becomes free, so latency under low contention is worse; tuning minDelay / maxDelay is workload-specific.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
<i>Exponential-backoff</i> lock: when a TATAS acquire attempt fails, the thread {{c1::sleeps for a random duration drawn from \([0, \text{limit}]\), then doubles <code>limit</code> up to <code>maxDelay</code>}} before retrying. This removes the {{c2::synchronised stampede of all waiters trying to acquire simultaneously the moment the lock is released}}. |
| Extra |
|
Empirically this gives near-flat scaling under high contention (vs. linear-or-worse for TAS / TTAS). Trade-offs: a thread may be sleeping when the lock becomes free, so latency under low contention is worse; tuning <code>minDelay</code> / <code>maxDelay</code> is workload-specific. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Note 37: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: Pa^@bE6s8n
modified
Before
Front
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Üppige-Auswahl-Problem
Gegeben sind \(x_1, \ldots, x_n \in \{0, 1\}\), von denen
genau die Hälfte gleich \(1\) sind (\(n\) gerade).
Berechne
einen Index \(i\) mit \(x_i = 1\).
Trivialer deterministischer Algorithmus (in Zeit \(O(
n)\)):
for i = 1, ..., n:
if x_i = 1: return i
Back
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Üppige-Auswahl-Problem
Gegeben sind \(x_1, \ldots, x_n \in \{0, 1\}\), von denen
genau die Hälfte gleich \(1\) sind (\(n\) gerade).
Berechne
einen Index \(i\) mit \(x_i = 1\).
Trivialer deterministischer Algorithmus (in Zeit \(O(
n)\)):
for i = 1, ..., n:
if x_i = 1: return i
Das Problem heisst üppig, weil es sehr viele gültige Antworten gibt (genau \(n/2\)). Trotzdem braucht jeder deterministische Algorithmus im Worst Case \(\Omega(n)\) Zeit, während ein einfacher Las-Vegas-Algorithmus mit erwartet \(O(1)\) auskommt.
After
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
In the classic x86 example with \(x{=}y{=}0\), thread 1: x=1; j=y; and thread 2: y=1; i=x;, besides \((1,1), (0,1), (1,0)\), the outcome \((i,j) = (0,0)\) is also possible, because under x86's "stores reordered after loads" rule the load (j=y / i=x) may be executed before the store (x=1 / y=1).
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
In the classic x86 example with \(x{=}y{=}0\), thread 1: x=1; j=y; and thread 2: y=1; i=x;, besides \((1,1), (0,1), (1,0)\), the outcome \((i,j) = (0,0)\) is also possible, because under x86's "stores reordered after loads" rule the load (j=y / i=x) may be executed before the store (x=1 / y=1).
Both threads thus read \(y\) and \(x\) while still 0, before either store becomes visible. This is a simple, actually-reproducible reordering effect on x86 (StoreLoad).
Field-by-field Comparison
| Field |
Before |
After |
| Text |
<b>Üppige-Auswahl-Problem<br></b>Gegeben sind \(x_1, \ldots, x_n \in \{0, 1\}\), von denen {{c1::genau die Hälfte}} gleich \(1\) sind (\(n\) gerade).<br>Berechne {{c2::einen Index \(i\) mit \(x_i = 1\)}}.<br><br>Trivialer deterministischer Algorithmus (in Zeit \(O({{c3::n}})\)):<pre>for i = 1, ..., n:
if x_i = 1: return i</pre> |
In the classic x86 example with \(x{=}y{=}0\), thread 1: <code>x=1; j=y;</code> and thread 2: <code>y=1; i=x;</code>, besides \((1,1), (0,1), (1,0)\), the outcome {{c1::\((i,j) = (0,0)\) is also possible}}, because under x86's "stores reordered after loads" rule {{c2::the load (<code>j=y</code> / <code>i=x</code>) may be executed before the store (<code>x=1</code> / <code>y=1</code>)}}. |
| Extra |
Das Problem heisst <i>üppig</i>, weil es sehr viele gültige Antworten gibt (genau \(n/2\)). Trotzdem braucht jeder deterministische Algorithmus im Worst Case \(\Omega(n)\) Zeit, während ein einfacher Las-Vegas-Algorithmus mit erwartet \(O(1)\) auskommt. |
Both threads thus read \(y\) and \(x\) while still 0, before either store becomes visible. This is a simple, actually-reproducible reordering effect on x86 (StoreLoad). |
Tags:
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
Note 38: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: RDjwIm/b}G
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Draw a directed graph with threads \(T_1, \dots, T_n\) and resources \(R_1, \dots, R_m\) as nodes. Edge \(T_i \to R_j\) means thread \(T_i\) is attempting to acquire resource \(R_j\); edge \(R_j \to T_i\) means resource \(R_j\) is currently held by thread \(T_i\). A deadlock is present iff this graph contains a cycle.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Draw a directed graph with threads \(T_1, \dots, T_n\) and resources \(R_1, \dots, R_m\) as nodes. Edge \(T_i \to R_j\) means thread \(T_i\) is attempting to acquire resource \(R_j\); edge \(R_j \to T_i\) means resource \(R_j\) is currently held by thread \(T_i\). A deadlock is present iff this graph contains a cycle.
Equivalent to the Coffman conditions: the cycle witnesses circular wait, while the edge semantics encode mutual exclusion (R held by exactly one T), hold-and-wait (T owns one R while requesting another), and no-preemption (edges are not removed by force).
Deadlock detection is a cycle search in this graph.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
Draw a directed graph with threads \(T_1, \dots, T_n\) and resources \(R_1, \dots, R_m\) as nodes. Edge \(T_i \to R_j\) means {{c1::thread \(T_i\) is attempting to acquire resource \(R_j\)}}; edge \(R_j \to T_i\) means {{c2::resource \(R_j\) is currently held by thread \(T_i\)}}. A deadlock is present iff {{c3::this graph contains a cycle}}. |
| Extra |
|
Equivalent to the Coffman conditions: the cycle witnesses circular wait, while the edge semantics encode mutual exclusion (R held by exactly one T), hold-and-wait (T owns one R while requesting another), and no-preemption (edges are not removed by force).<br><br>Deadlock <i>detection</i> is a cycle search in this graph. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Note 39: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: T+YN#N>,$x
modified
Before
Front
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Im Allgemeinen ist die Fehlerreduktion bei Monte-Carlo-Algorithmen
nicht möglich.
(Gegenbeispiel: Algorithmus, der mit Wahrscheinlichkeit \(\tfrac{1}{2}\) JA bzw. NEIN antwortet, unabhängig von der Eingabe.)
Möglich ist sie aber bei:
- einseitigem Fehler, oder
- {{c2::Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\)}}.
Back
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Im Allgemeinen ist die Fehlerreduktion bei Monte-Carlo-Algorithmen
nicht möglich.
(Gegenbeispiel: Algorithmus, der mit Wahrscheinlichkeit \(\tfrac{1}{2}\) JA bzw. NEIN antwortet, unabhängig von der Eingabe.)
Möglich ist sie aber bei:
- einseitigem Fehler, oder
- {{c2::Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\)}}.
Bei einseitigem Fehler genügt eine einzige korrekte Antwort, um die Wahrheit zu kennen (Wiederholung + erste eindeutige Antwort).
Bei Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\) hilft Mehrheitsentscheid (Chernoff-Argument).
After
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Modern multiprocessors do not enforce a global ordering of all instructions for two main reasons: pipelining (processors execute parts of multiple instructions simultaneously and may reorder them internally) and per-processor local caches (loads and stores become visible to other processors only after some delay).
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Modern multiprocessors do not enforce a global ordering of all instructions for two main reasons: pipelining (processors execute parts of multiple instructions simultaneously and may reorder them internally) and per-processor local caches (loads and stores become visible to other processors only after some delay).
What exactly is guaranteed varies a lot across architectures (x86 vs ARM vs POWER vs Alpha …).
Field-by-field Comparison
| Field |
Before |
After |
| Text |
Im Allgemeinen ist die Fehlerreduktion bei Monte-Carlo-Algorithmen <b>nicht möglich</b>.<br>(Gegenbeispiel: Algorithmus, der mit Wahrscheinlichkeit \(\tfrac{1}{2}\) JA bzw. NEIN antwortet, unabhängig von der Eingabe.)<br><br>Möglich ist sie aber bei:<ul><li>{{c1::einseitigem Fehler}}, oder</li><li>{{c2::Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\)}}.</li></ul> |
Modern multiprocessors do not enforce a global ordering of all instructions for two main reasons: {{c1::pipelining (processors execute parts of multiple instructions simultaneously and may reorder them internally)}} and {{c2::per-processor local caches (loads and stores become visible to other processors only after some delay)}}. |
| Extra |
Bei einseitigem Fehler genügt eine einzige korrekte Antwort, um die Wahrheit zu kennen (Wiederholung + erste eindeutige Antwort).<br>Bei Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\) hilft Mehrheitsentscheid (Chernoff-Argument). |
What exactly is guaranteed varies a lot across architectures (x86 vs ARM vs POWER vs Alpha …). |
Tags:
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Note 40: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: TezZ0;|FPI
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Test-And-Set on a memory cell s, atomically: if mem[s] == 0, set it to 1 and return true; else return false. Compare-And-Swap on cell a with arguments old, new, atomically: read oldval = mem[a]; if oldval == old, set mem[a] = new; in either case return oldval.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Test-And-Set on a memory cell s, atomically: if mem[s] == 0, set it to 1 and return true; else return false. Compare-And-Swap on cell a with arguments old, new, atomically: read oldval = mem[a]; if oldval == old, set mem[a] = new; in either case return oldval.
TAS is a strict subset of CAS (CAS with old=0, new=1 behaves like TAS, modulo return convention). CAS is strictly more expressive: it can be used to implement queues, stacks, and other lock-free structures, while TAS alone cannot.
Both make the read and the conditional write a single indivisible step - that is precisely what beats the atomic-register lower bound.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
<i>Test-And-Set</i> on a memory cell <code>s</code>, atomically: {{c1::if <code>mem[s] == 0</code>, set it to 1 and return <code>true</code>; else return <code>false</code>}}. <i>Compare-And-Swap</i> on cell <code>a</code> with arguments <code>old, new</code>, atomically: {{c2::read <code>oldval = mem[a]</code>; if <code>oldval == old</code>, set <code>mem[a] = new</code>; in either case return <code>oldval</code>}}. |
| Extra |
|
TAS is a strict subset of CAS (CAS with <code>old=0, new=1</code> behaves like TAS, modulo return convention). CAS is strictly more expressive: it can be used to implement queues, stacks, and other lock-free structures, while TAS alone cannot.<br><br>Both make the read and the conditional write a single indivisible step - that is precisely what beats the atomic-register lower bound. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Note 41: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: VKeMFH&qvc
modified
Before
Front
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Bei einem Monte-Carlo-Algorithmus ist die Korrektheit (bzw. Qualität) eine Zufallsvariable, die Laufzeit hingegen nicht vom Zufall abhängig.
Ziel: immer schnell, meistens korrekt/gut.
Back
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Bei einem Monte-Carlo-Algorithmus ist die Korrektheit (bzw. Qualität) eine Zufallsvariable, die Laufzeit hingegen nicht vom Zufall abhängig.
Ziel: immer schnell, meistens korrekt/gut.
Merksatz: Monte-Carlo ist schnell und vielleicht falsch, Las-Vegas ist korrekt und vielleicht langsam.
After
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
A proof by contradiction that assert(b >= a) cannot fail (assuming sequential program order) uses the relation happened before together with its property of transitivity.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
A proof by contradiction that assert(b >= a) cannot fail (assuming sequential program order) uses the relation happened before together with its property of transitivity.
Assume \(b{<}a\); this forces \(a{=}1\) and \(b{=}0\), which produces a cyclic happened before chain - contradiction.
The proof relies on programs executing in program order; precisely this assumption breaks in practice due to reordering.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
Bei einem <b>Monte-Carlo-Algorithmus</b> ist {{c1::die Korrektheit (bzw. Qualität)}} eine Zufallsvariable, die {{c1::Laufzeit}} hingegen nicht vom Zufall abhängig.<br><br>Ziel: {{c2::immer schnell, meistens korrekt/gut}}. |
A proof by contradiction that <code>assert(b >= a)</code> cannot fail (assuming sequential program order) uses the relation {{c1::<i>happened before</i>}} together with its property of {{c2::transitivity}}. |
| Extra |
Merksatz: Monte-Carlo ist <i>schnell und vielleicht falsch</i>, Las-Vegas ist <i>korrekt und vielleicht langsam</i>. |
Assume \(b{<}a\); this forces \(a{=}1\) and \(b{=}0\), which produces a cyclic <i>happened before</i> chain - contradiction.<br><br>The proof relies on programs executing in program order; precisely this assumption breaks in practice due to reordering. |
Tags:
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Note 42: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: VQe2x::6Vm
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
One way to break the transferTo deadlock is to remove the outer synchronized on transferTo, leaving only the per-method locks on withdraw and deposit. The trade-off is a transient inconsistency: between the withdraw and the deposit, the money is briefly missing from both accounts.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
One way to break the transferTo deadlock is to remove the outer synchronized on transferTo, leaving only the per-method locks on withdraw and deposit. The trade-off is a transient inconsistency: between the withdraw and the deposit, the money is briefly missing from both accounts.
For financial systems this is usually unacceptable: an external observer (audit, balance read) could see the missing amount. Smaller critical sections are a good move whenever the application can tolerate seeing intermediate states.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
One way to break the <code>transferTo</code> deadlock is to {{c1::remove the outer <code>synchronized</code> on <code>transferTo</code>}}, leaving only the per-method locks on <code>withdraw</code> and <code>deposit</code>. The trade-off is a {{c2::transient inconsistency: between the withdraw and the deposit, the money is briefly missing from both accounts}}. |
| Extra |
|
For financial systems this is usually unacceptable: an external observer (audit, balance read) could see the missing amount. Smaller critical sections are a good move whenever the application can tolerate seeing intermediate states. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Note 43: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: Xhsn{vOeua
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
On a single-core system, mutual exclusion is achieved trivially by using as preprotocol and postprotocol switching interrupts off and on respectively ("switch off/on IRQs").
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
On a single-core system, mutual exclusion is achieved trivially by using as preprotocol and postprotocol switching interrupts off and on respectively ("switch off/on IRQs").
Reasoning: without interrupts no context switch occurs, so no other thread on the same core can interleave.
This does not work on multi-core, because other cores keep running in parallel.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
On a single-core system, mutual exclusion is achieved trivially by using as preprotocol and postprotocol {{c1::switching interrupts off and on respectively ("switch off/on IRQs")}}. |
| Extra |
|
Reasoning: without interrupts no context switch occurs, so no other thread on the same core can interleave.<br><br>This does <b>not</b> work on multi-core, because other cores keep running in parallel. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 44: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: YD,MtGo!Tz
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Even with atomic registers, programs can still be non-deterministic, because the model says nothing about the order of effect times for concurrent operations.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Even with atomic registers, programs can still be non-deterministic, because the model says nothing about the order of effect times for concurrent operations.
Atomicity gives us "each operation has a single point in time", but does not pin down which point. So two concurrent writes can serialise either way, and a concurrent read may legitimately observe either of the two values.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
Even with atomic registers, programs can still be non-deterministic, because {{c1::the model says nothing about the order of effect times for <i>concurrent</i> operations}}. |
| Extra |
|
Atomicity gives us "each operation has a single point in time", but does not pin down <i>which</i> point. So two concurrent writes can serialise either way, and a concurrent read may legitimately observe either of the two values. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 45: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: ]SzQ}B-b0#
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
A common Java bug when implementing Peterson's lock is writing volatile boolean[] flag = new boolean[2];, which makes only the array reference volatile, not its elements. Element accesses then have no volatile semantics. The correct fix is to use AtomicIntegerArray (or AtomicIntegerArray for victim).
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
A common Java bug when implementing Peterson's lock is writing volatile boolean[] flag = new boolean[2];, which makes only the array reference volatile, not its elements. Element accesses then have no volatile semantics. The correct fix is to use AtomicIntegerArray (or AtomicIntegerArray for victim).
The buggy version may appear to work in practice (especially on x86), but provides no JMM guarantees about visibility and ordering of writes to individual array slots. AtomicIntegerArray exposes per-element atomic and volatile-like operations.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
A common Java bug when implementing Peterson's lock is writing <code>volatile boolean[] flag = new boolean[2];</code>, which makes {{c1::only the array <i>reference</i> volatile, not its <i>elements</i>}}. Element accesses then have no volatile semantics. The correct fix is to use {{c2::<code>AtomicIntegerArray</code> (or <code>AtomicIntegerArray</code> for <code>victim</code>)}}. |
| Extra |
|
The buggy version may appear to work in practice (especially on x86), but provides no JMM guarantees about visibility and ordering of writes to individual array slots. <code>AtomicIntegerArray</code> exposes per-element atomic and volatile-like operations. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 46: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: _9&=
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Two strategies for handling deadlocks: detection (find cycles in the resource graph at runtime; cannot in general be healed because releasing locks usually leaves inconsistent state) and avoidance (statically ensure a cycle can never form). Two standard avoidance techniques are two-phase locking with retry (release everything and restart on conflict; common in databases where transactions can be aborted) and resource ordering (impose a global total order on locks; common in parallel programming where state mutations are not transactional).
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Two strategies for handling deadlocks: detection (find cycles in the resource graph at runtime; cannot in general be healed because releasing locks usually leaves inconsistent state) and avoidance (statically ensure a cycle can never form). Two standard avoidance techniques are two-phase locking with retry (release everything and restart on conflict; common in databases where transactions can be aborted) and resource ordering (impose a global total order on locks; common in parallel programming where state mutations are not transactional).
Detection is reactive; avoidance is proactive. Avoidance is preferred when consistency requirements rule out the rollback-and-retry approach.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
Two strategies for handling deadlocks: {{c1::detection (find cycles in the resource graph at runtime; cannot in general be healed because releasing locks usually leaves inconsistent state)}} and {{c2::avoidance (statically ensure a cycle can never form)}}. Two standard avoidance techniques are {{c3::two-phase locking with retry (release everything and restart on conflict; common in databases where transactions can be aborted)}} and {{c3::resource ordering (impose a global total order on locks; common in parallel programming where state mutations are not transactional)}}. |
| Extra |
|
Detection is reactive; avoidance is proactive. Avoidance is preferred when consistency requirements rule out the rollback-and-retry approach. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Note 47: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: a+AQXsL|IC
modified
Before
Front
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Monte-Carlo, einseitiger Fehler
Sei \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\begin{gathered}\Pr[A(I) = \text{JA}] = 1, \text{ falls } I \text{ JA-Instanz},\\ \Pr[A(I) = \text{NEIN}] \geq \varepsilon, \text{ falls } I \text{ NEIN-Instanz}.\end{gathered}\]Sei \(A_\delta\) der Algorithmus, der \(A\) solange aufruft, bis entweder NEIN ausgegeben wird (dann selbst NEIN) oder bis\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\]mal JA ausgegeben wurde (dann selbst JA). Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]
Back
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Monte-Carlo, einseitiger Fehler
Sei \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\begin{gathered}\Pr[A(I) = \text{JA}] = 1, \text{ falls } I \text{ JA-Instanz},\\ \Pr[A(I) = \text{NEIN}] \geq \varepsilon, \text{ falls } I \text{ NEIN-Instanz}.\end{gathered}\]Sei \(A_\delta\) der Algorithmus, der \(A\) solange aufruft, bis entweder NEIN ausgegeben wird (dann selbst NEIN) oder bis\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\]mal JA ausgegeben wurde (dann selbst JA). Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]
Selbe Iterationsschranke wie bei Las-Vegas: \(N = \lceil \varepsilon^{-1} \ln \delta^{-1} \rceil\). Tatsächlich ist das hier äquivalent zur Las-Vegas-Sicht: man kann den einseitig-fehlerhaften MC-Algorithmus als Las-Vegas-Algorithmus auffassen, der '???' (alias 'JA') ausgibt, wenn er nicht sicher ist.
After
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::2._Memory_Hierarchy
Typical latencies in the memory hierarchy of one core (order of magnitude): registers ~0.5 ns, L1 cache ~1 ns, L2 cache ~7 ns, system memory ~100 ns.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::2._Memory_Hierarchy
Typical latencies in the memory hierarchy of one core (order of magnitude): registers ~0.5 ns, L1 cache ~1 ns, L2 cache ~7 ns, system memory ~100 ns.
Trade-off across the hierarchy: top is fast, low latency, high cost, low capacity; bottom is slow, high latency, low cost, high capacity.
Factor between register and RAM: roughly \(200{\times}\).
Field-by-field Comparison
| Field |
Before |
After |
| Text |
<b>Monte-Carlo, einseitiger Fehler<br></b>Sei \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\begin{gathered}\Pr[A(I) = \text{JA}] = 1, \text{ falls } I \text{ JA-Instanz},\\ \Pr[A(I) = \text{NEIN}] \geq \varepsilon, \text{ falls } I \text{ NEIN-Instanz}.\end{gathered}\]Sei \(A_\delta\) der Algorithmus, der \(A\) solange aufruft, bis entweder NEIN ausgegeben wird (dann selbst NEIN) oder bis\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\]mal JA ausgegeben wurde (dann selbst JA). Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq {{c2::1 - \delta}}.\] |
Typical latencies in the memory hierarchy of one core (order of magnitude): registers {{c1::~0.5 ns}}, L1 cache {{c1::~1 ns}}, L2 cache {{c1::~7 ns}}, system memory {{c1::~100 ns}}. |
| Extra |
Selbe Iterationsschranke wie bei Las-Vegas: \(N = \lceil \varepsilon^{-1} \ln \delta^{-1} \rceil\). Tatsächlich ist das hier äquivalent zur Las-Vegas-Sicht: man kann den einseitig-fehlerhaften MC-Algorithmus als Las-Vegas-Algorithmus auffassen, der '???' (alias 'JA') ausgibt, wenn er nicht sicher ist. |
Trade-off across the hierarchy: top is fast, low latency, high cost, low capacity; bottom is slow, high latency, low cost, high capacity.<br><br>Factor between register and RAM: roughly \(200{\times}\). |
Tags:
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::2._Memory_Hierarchy
Note 48: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: g?{Lm[l$i^
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
HB consistency: a read sees either the latest write in HB or any other write that is unordered with respect to it in HB. Consequence: the JMM explicitly allows data races rather than forbidding them.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
HB consistency: a read sees either the latest write in HB or any other write that is unordered with respect to it in HB. Consequence: the JMM explicitly allows data races rather than forbidding them.
Example: in a race a read may observe an "old" value even though a newer write exists, as long as that newer write is not HB-before the read. This is how the typical "impossible" outcomes arise.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
<i>HB consistency</i>: a read sees either {{c1::the latest write in HB}} or {{c1::any other write that is unordered with respect to it in HB}}. Consequence: the JMM {{c2::explicitly allows data races}} rather than forbidding them. |
| Extra |
|
Example: in a race a read may observe an "old" value even though a newer write exists, as long as that newer write is not HB-before the read. This is how the typical "impossible" outcomes arise. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
Note 49: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: hd
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Any lock pre-protocol can be split into a doorway interval \(D\) (finite number of steps) and a waiting interval \(W\) (unbounded number of steps). A lock is first-come-first-served (FCFS) iff for any two threads \(A, B\): \(D_A^j \to D_B^k \implies CS_A^j \to CS_B^k\).
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Any lock pre-protocol can be split into a doorway interval \(D\) (finite number of steps) and a waiting interval \(W\) (unbounded number of steps). A lock is first-come-first-served (FCFS) iff for any two threads \(A, B\): \(D_A^j \to D_B^k \implies CS_A^j \to CS_B^k\).
The split formalises "once you finished the bounded entry handshake (\(D\)), you are committed; the only remaining wait is the unbounded \(W\)". FCFS demands that the order in which threads finish their doorways is the same as the order in which they enter their critical sections.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
Any lock pre-protocol can be split into a {{c1::doorway interval \(D\) (finite number of steps)}} and a {{c2::waiting interval \(W\) (unbounded number of steps)}}. A lock is <i>first-come-first-served</i> (FCFS) iff for any two threads \(A, B\): {{c3::\(D_A^j \to D_B^k \implies CS_A^j \to CS_B^k\)}}. |
| Extra |
|
The split formalises "once you finished the bounded entry handshake (\(D\)), you are committed; the only remaining wait is the unbounded \(W\)". FCFS demands that the order in which threads finish their doorways is the same as the order in which they enter their critical sections. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 50: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: m;emJ<(X_K
modified
Before
Front
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Jeder deterministische Algorithmus für das Üppige-Auswahl-Problem benötigt Laufzeit \(\Omega(n)\).
Beweisidee (Adversary-Argument):
Für fixen Algorithmus konstruieren wir die Instanz schrittweise. Auf jede Anfrage \(x_i\) des Algorithmus antworten wir mit \(0\), solange wir noch nicht \(n/2\) Nullen platziert haben. Sobald \(n/2\) Nullen platziert sind, füllen wir die restlichen Positionen mit \(1\) auf. Damit liest der Algorithmus mindestens \(n/2 + 1\) Eingabezahlen, bevor er eine \(1\) sieht.
Back
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Jeder deterministische Algorithmus für das Üppige-Auswahl-Problem benötigt Laufzeit \(\Omega(n)\).
Beweisidee (Adversary-Argument):
Für fixen Algorithmus konstruieren wir die Instanz schrittweise. Auf jede Anfrage \(x_i\) des Algorithmus antworten wir mit \(0\), solange wir noch nicht \(n/2\) Nullen platziert haben. Sobald \(n/2\) Nullen platziert sind, füllen wir die restlichen Positionen mit \(1\) auf. Damit liest der Algorithmus mindestens \(n/2 + 1\) Eingabezahlen, bevor er eine \(1\) sieht.
Das ist ein typisches Adversary-Argument: Der "Gegenspieler" entscheidet die Eingabe erst während der Ausführung, immer so, dass der Algorithmus möglichst lange braucht. Da der Algorithmus deterministisch ist, weiss der Adversary genau, welche Position als nächstes gelesen wird, und kann sie auf \(0\) setzen.
After
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
The Java Memory Model (JMM) defines actions such as read(x):1 ("read variable x; the value read is 1"). Executions combine actions with four orderings: Program Order (PO), Synchronizes-With (SW), Synchronization Order (SO), Happens-Before (HB).
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
The Java Memory Model (JMM) defines actions such as read(x):1 ("read variable x; the value read is 1"). Executions combine actions with four orderings: Program Order (PO), Synchronizes-With (SW), Synchronization Order (SO), Happens-Before (HB).
The JMM restricts the allowed outcomes of programs. Without volatile / synchronized the result is essentially "arbitrary".
Field-by-field Comparison
| Field |
Before |
After |
| Text |
Jeder deterministische Algorithmus für das Üppige-Auswahl-Problem benötigt Laufzeit \(\Omega({{c1::n}})\).<br><br><b>Beweisidee (Adversary-Argument):</b> <br>Für fixen Algorithmus konstruieren wir die Instanz schrittweise. Auf jede Anfrage \(x_i\) des Algorithmus antworten wir mit {{c2::\(0\)}}, solange wir noch nicht \(n/2\) Nullen platziert haben. Sobald \(n/2\) Nullen platziert sind, füllen wir die restlichen Positionen mit \(1\) auf. Damit liest der Algorithmus mindestens {{c3::\(n/2 + 1\)}} Eingabezahlen, bevor er eine \(1\) sieht. |
The Java Memory Model (JMM) defines <i>actions</i> such as {{c1::<code>read(x):1</code> ("read variable x; the value read is 1")}}. <i>Executions</i> combine actions with four orderings: {{c2::Program Order (PO), Synchronizes-With (SW), Synchronization Order (SO), Happens-Before (HB)}}. |
| Extra |
Das ist ein typisches <b>Adversary-Argument</b>: Der "Gegenspieler" entscheidet die Eingabe erst während der Ausführung, immer so, dass der Algorithmus möglichst lange braucht. Da der Algorithmus deterministisch ist, weiss der Adversary genau, welche Position als nächstes gelesen wird, und kann sie auf \(0\) setzen. |
The JMM restricts the allowed outcomes of programs. Without <code>volatile</code> / <code>synchronized</code> the result is essentially "arbitrary". |
Tags:
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
Note 51: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: p)%lBpQ>Q<
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Three common families of hardware atomic instructions: Test-And-Set (TAS) - e.g. m68k TSL; Compare-And-Swap (CAS) - e.g. x86 LOCK CMPXCHG, SPARC CASA; Load-Linked / Store-Conditional - e.g. ARM LDREX/STREX, MIPS / POWER / RISC-V LL/SC. They are typically 10-100x slower than plain reads / writes.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Three common families of hardware atomic instructions: Test-And-Set (TAS) - e.g. m68k TSL; Compare-And-Swap (CAS) - e.g. x86 LOCK CMPXCHG, SPARC CASA; Load-Linked / Store-Conditional - e.g. ARM LDREX/STREX, MIPS / POWER / RISC-V LL/SC. They are typically 10-100x slower than plain reads / writes.
These primitives are needed because plain atomic registers cannot break the \(\Omega(n)\) space lower bound for \(n\)-thread mutex (Burns-Lynch). They also enable lock-free data structures.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
Three common families of hardware atomic instructions: {{c1::Test-And-Set (TAS)}} - e.g. m68k <code>TSL</code>; {{c2::Compare-And-Swap (CAS)}} - e.g. x86 <code>LOCK CMPXCHG</code>, SPARC <code>CASA</code>; {{c3::Load-Linked / Store-Conditional}} - e.g. ARM <code>LDREX/STREX</code>, MIPS / POWER / RISC-V <code>LL/SC</code>. They are typically {{c4::10-100x slower than plain reads / writes}}. |
| Extra |
|
These primitives are needed because plain atomic registers cannot break the \(\Omega(n)\) space lower bound for \(n\)-thread mutex (Burns-Lynch). They also enable lock-free data structures. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Note 52: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: qzA@de.%OP
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The Filter Lock satisfies mutual exclusion, deadlock freedom, and starvation freedom, but it is not first-come-first-served: a thread that finished its doorway later may overtake one that finished earlier.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The Filter Lock satisfies mutual exclusion, deadlock freedom, and starvation freedom, but it is not first-come-first-served: a thread that finished its doorway later may overtake one that finished earlier.
Reason: there is no global ticket; progress depends on the level/victim arrays, which are updated continuously. A thread that arrives later can become "victim" at a high level after another thread already passed that level. Bakery is the canonical lock that fixes this.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
The Filter Lock satisfies {{c1::mutual exclusion, deadlock freedom, and starvation freedom}}, but it is {{c2::not first-come-first-served}}: a thread that finished its doorway later may overtake one that finished earlier. |
| Extra |
|
Reason: there is no global ticket; progress depends on the level/<code>victim</code> arrays, which are updated continuously. A thread that arrives later can become "victim" at a high level after another thread already passed that level. Bakery is the canonical lock that fixes this. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 53: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: t&2cm[5&m<
modified
Before
Front
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Beim Las-Vegas-Algorithmus für das Üppige-Auswahl-Problem ist \(T\) die Anzahl an Wiederholungen mit \(\Pr[T > k] = (1/2)^k\).
Was ist \(\mathbb{E}[T]\)? Gib zwei Herleitungen an.
Back
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Beim Las-Vegas-Algorithmus für das Üppige-Auswahl-Problem ist \(T\) die Anzahl an Wiederholungen mit \(\Pr[T > k] = (1/2)^k\).
Was ist \(\mathbb{E}[T]\)? Gib zwei Herleitungen an.
\[\mathbb{E}[T] = 2.\]Erwartete Laufzeit des Algorithmus: \(O(1)\), unabhängig von \(n\).
Direkter Ansatz (über \(\mathbb{E}[T] = \sum_{k \geq 1} \Pr[T \geq k]\)):
\[\begin{gathered}\mathbb{E}[T] = \sum_{k=1}^{\infty} \Pr[T \geq k] = \sum_{k=1}^{\infty} \Pr[T > k - 1] = \sum_{k=1}^{\infty} \left(\tfrac{1}{2}\right)^{k-1}\\= 1 + \tfrac{1}{2} + \tfrac{1}{4} + \tfrac{1}{8} + \cdots = 2.\end{gathered}\]Indirekter Ansatz (Rekursion via Fallunterscheidung im ersten Versuch):
\[\mathbb{E}[T] = \Pr[x_i = 1] \cdot 1 + \Pr[x_i = 0] \cdot (1 + \mathbb{E}[T']),\]wobei \(T'\) die Anzahl an weiteren Wiederholungen ist, falls die erste fehlschlägt. Da \(T'\) und \(T\) gleichverteilt sind:
\[\mathbb{E}[T] = \tfrac{1}{2} \cdot 1 + \tfrac{1}{2}(1 + \mathbb{E}[T]) = 1 + \tfrac{1}{2}\mathbb{E}[T] \;\Longrightarrow\; \mathbb{E}[T] = 2.\]
After
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
Synchronization actions (SA) in the JMM are: read/write of a volatile variable; lock and unlock of a monitor; the first and last action of a thread (synthetic); actions that start a thread; actions that determine whether a thread has terminated.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
Synchronization actions (SA) in the JMM are: read/write of a volatile variable; lock and unlock of a monitor; the first and last action of a thread (synthetic); actions that start a thread; actions that determine whether a thread has terminated.
SA are the building blocks of the synchronization order (SO). Anything else is an "ordinary" action.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
Beim Las-Vegas-Algorithmus für das Üppige-Auswahl-Problem ist \(T\) die Anzahl an Wiederholungen mit \(\Pr[T > k] = (1/2)^k\).<br><br>Was ist \(\mathbb{E}[T]\)? Gib zwei Herleitungen an. |
<i>Synchronization actions (SA)</i> in the JMM are: {{c1::read/write of a <code>volatile</code> variable}}; {{c1::lock and unlock of a monitor}}; {{c1::the first and last action of a thread (synthetic)}}; {{c1::actions that start a thread}}; {{c1::actions that determine whether a thread has terminated}}. |
| Extra |
\[\mathbb{E}[T] = 2.\]Erwartete Laufzeit des Algorithmus: \(O(1)\), unabhängig von \(n\).<br><br><b>Direkter Ansatz</b> (über \(\mathbb{E}[T] = \sum_{k \geq 1} \Pr[T \geq k]\)):<br>\[\begin{gathered}\mathbb{E}[T] = \sum_{k=1}^{\infty} \Pr[T \geq k] = \sum_{k=1}^{\infty} \Pr[T > k - 1] = \sum_{k=1}^{\infty} \left(\tfrac{1}{2}\right)^{k-1}\\= 1 + \tfrac{1}{2} + \tfrac{1}{4} + \tfrac{1}{8} + \cdots = 2.\end{gathered}\]<b>Indirekter Ansatz</b> (Rekursion via Fallunterscheidung im ersten Versuch):<br>\[\mathbb{E}[T] = \Pr[x_i = 1] \cdot 1 + \Pr[x_i = 0] \cdot (1 + \mathbb{E}[T']),\]wobei \(T'\) die Anzahl an weiteren Wiederholungen ist, falls die erste fehlschlägt. Da \(T'\) und \(T\) gleichverteilt sind:<br>\[\mathbb{E}[T] = \tfrac{1}{2} \cdot 1 + \tfrac{1}{2}(1 + \mathbb{E}[T]) = 1 + \tfrac{1}{2}\mathbb{E}[T] \;\Longrightarrow\; \mathbb{E}[T] = 2.\] |
SA are the building blocks of the synchronization order (SO). Anything else is an "ordinary" action. |
Tags:
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
Note 54: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: u{Ck&o_a<_
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The mutual-exclusion proof for Peterson's lock proceeds by contradiction: assume \(\mathrm{CS}_P\) and \(\mathrm{CS}_Q\) overlap and WLOG {{c1::\(W_Q(\text{victim}{=}Q) \to W_P(\text{victim}{=}P)\)}}. Then by transitivity along each thread's code, \(P\)'s read of flag[Q] happens after \(Q\) wrote flag[Q]=true, so \(P\) must have read flag[Q] = true, and reading victim must yield \(P\); both conjuncts of the while-condition hold, so \(P\) cannot have entered. Contradiction.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The mutual-exclusion proof for Peterson's lock proceeds by contradiction: assume \(\mathrm{CS}_P\) and \(\mathrm{CS}_Q\) overlap and WLOG {{c1::\(W_Q(\text{victim}{=}Q) \to W_P(\text{victim}{=}P)\)}}. Then by transitivity along each thread's code, \(P\)'s read of flag[Q] happens after \(Q\) wrote flag[Q]=true, so \(P\) must have read flag[Q] = true, and reading victim must yield \(P\); both conjuncts of the while-condition hold, so \(P\) cannot have entered. Contradiction.
Key levers used: the writes inside one thread are totally ordered by program order (so \(W(\text{flag}) \to W(\text{victim}) \to R(\text{flag}) \to R(\text{victim}) \to CS\)), and atomic registers let us chain the orderings via \(\tau\).
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
The mutual-exclusion proof for Peterson's lock proceeds by contradiction: assume \(\mathrm{CS}_P\) and \(\mathrm{CS}_Q\) overlap and WLOG {{c1::\(W_Q(\text{victim}{=}Q) \to W_P(\text{victim}{=}P)\)}}. Then by transitivity along each thread's code, \(P\)'s read of <code>flag[Q]</code> happens after \(Q\) wrote <code>flag[Q]=true</code>, so \(P\) {{c2::must have read <code>flag[Q] = true</code>}}, and reading <code>victim</code> must yield \(P\); both conjuncts of the while-condition hold, so \(P\) cannot have entered. Contradiction. |
| Extra |
|
Key levers used: the writes inside one thread are totally ordered by program order (so \(W(\text{flag}) \to W(\text{victim}) \to R(\text{flag}) \to R(\text{victim}) \to CS\)), and atomic registers let us chain the orderings via \(\tau\). |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 55: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: w(*5>B[{%<
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
In the 2nd-try mutex
p2: wantp = true;
p3: while(wantq);
p4: critical section
the order "first set my flag, then wait" satisfies mutual exclusion but
deadlocks, because {{c2::both threads can set their flags before either reads the other's, leaving both stuck spinning at state \((p3, q3, \text{true}, \text{true})\)}}.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
In the 2nd-try mutex
p2: wantp = true;
p3: while(wantq);
p4: critical section
the order "first set my flag, then wait" satisfies mutual exclusion but
deadlocks, because {{c2::both threads can set their flags before either reads the other's, leaving both stuck spinning at state \((p3, q3, \text{true}, \text{true})\)}}.
Trade-off compared with the 1st try: ME is now ensured (the flag is set before the check), but the symmetric race produces a stable bad state. No process is making progress, yet none is starving in the technical sense - it is a true deadlock.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
In the 2nd-try mutex <pre>p2: wantp = true;
p3: while(wantq);
p4: critical section</pre> the order "first set my flag, then wait" satisfies mutual exclusion but {{c1::deadlocks}}, because {{c2::both threads can set their flags before either reads the other's, leaving both stuck spinning at state \((p3, q3, \text{true}, \text{true})\)}}. |
| Extra |
|
Trade-off compared with the 1st try: ME is now ensured (the flag is set before the check), but the symmetric race produces a stable bad state. No process is making progress, yet none is starving in the technical sense - it is a true deadlock. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 56: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: z|ZzS?_M6b
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
An interval \((a_0, a_1)\) is a pair of events with \(a_0 \to a_1\). For two intervals \(I_A = (a_0, a_1)\) and \(I_B = (b_0, b_1)\), we write \(I_A \to I_B\) iff \(a_1 \to b_0\). If neither \(I_A \to I_B\) nor \(I_B \to I_A\) holds, they are concurrent.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
An interval \((a_0, a_1)\) is a pair of events with \(a_0 \to a_1\). For two intervals \(I_A = (a_0, a_1)\) and \(I_B = (b_0, b_1)\), we write \(I_A \to I_B\) iff \(a_1 \to b_0\). If neither \(I_A \to I_B\) nor \(I_B \to I_A\) holds, they are concurrent.
So intervals only "precede" each other when one ends entirely before the other begins. Overlapping intervals are concurrent. Unlike the precedence on events, the relation on intervals is not a total order.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
An <i>interval</i> \((a_0, a_1)\) is a pair of events with \(a_0 \to a_1\). For two intervals \(I_A = (a_0, a_1)\) and \(I_B = (b_0, b_1)\), we write \(I_A \to I_B\) iff {{c1::\(a_1 \to b_0\)}}. If neither \(I_A \to I_B\) nor \(I_B \to I_A\) holds, they are {{c2::concurrent}}. |
| Extra |
|
So intervals only "precede" each other when one ends entirely before the other begins. Overlapping intervals are concurrent. Unlike the precedence on events, the relation on intervals is <i>not</i> a total order. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Note 57: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: |2|f7e7DL}
modified
Before
Front
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Eine
Zufallsquelle liefert nach vorgegebener Verteilung zufällige unabhängige Zufallsbits oder -zahlen. Zwei Typen:
- Physikalische Zufallsgeneratoren: Lottozahlen, Geigerzähler, Rauschen, Quantenexperimente.
- Deterministische (Pseudo-)Zufallsgeneratoren: liefern, basierend auf einem Startwert (seed) \(s\), eine Folge \(s \to r_0(s), r_1(s), r_2(s), \ldots\) Das ermöglicht Reproduzierbarkeit, wenn man sich den Startwert merkt.
Back
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Eine
Zufallsquelle liefert nach vorgegebener Verteilung zufällige unabhängige Zufallsbits oder -zahlen. Zwei Typen:
- Physikalische Zufallsgeneratoren: Lottozahlen, Geigerzähler, Rauschen, Quantenexperimente.
- Deterministische (Pseudo-)Zufallsgeneratoren: liefern, basierend auf einem Startwert (seed) \(s\), eine Folge \(s \to r_0(s), r_1(s), r_2(s), \ldots\) Das ermöglicht Reproduzierbarkeit, wenn man sich den Startwert merkt.
In der Analyse geht man von idealen Zufallsgeneratoren aus. Bei der Anwendung muss man aber berücksichtigen, dass Pseudo-Zufallsgeneratoren das nur näherungsweise leisten.
After
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Number of interleavings for 2 threads with \(k\) statements each: {{c1::\(\binom{2k}{k} = O\!\left(\dfrac{4^k}{\sqrt{2k} }\right) \)}}.
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Number of interleavings for 2 threads with \(k\) statements each: {{c1::\(\binom{2k}{k} = O\!\left(\dfrac{4^k}{\sqrt{2k} }\right) \)}}.
Derivation: the merged trace has length \(2k\). Once we fix which \(k\) of the \(2k\) positions belong to thread 1, the interleaving is determined (sampling without replacement).
Field-by-field Comparison
| Field |
Before |
After |
| Text |
Eine <b>Zufallsquelle</b> liefert nach vorgegebener Verteilung zufällige unabhängige Zufallsbits oder -zahlen. Zwei Typen:<ul><li><b>Physikalische Zufallsgeneratoren</b>: {{c1::Lottozahlen, Geigerzähler, Rauschen, Quantenexperimente::Beispiele?}}.</li><li><b>Deterministische (Pseudo-)Zufallsgeneratoren</b>: liefern, basierend auf einem {{c2::Startwert (seed)}} \(s\), eine Folge \(s \to r_0(s), r_1(s), r_2(s), \ldots\) Das ermöglicht {{c3::Reproduzierbarkeit, wenn man sich den Startwert merkt}}.</li></ul> |
Number of interleavings for 2 threads with \(k\) statements each: {{c1::\(\binom{2k}{k} = O\!\left(\dfrac{4^k}{\sqrt{2k} }\right) \)}}. |
| Extra |
In der Analyse geht man von idealen Zufallsgeneratoren aus. Bei der Anwendung muss man aber berücksichtigen, dass Pseudo-Zufallsgeneratoren das nur näherungsweise leisten. |
Derivation: the merged trace has length \(2k\). Once we fix which \(k\) of the \(2k\) positions belong to thread 1, the interleaving is determined (sampling without replacement). |
Tags:
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Note 58: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: ~d57MCW=Q9
modified
Before
Front
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Bei der Fehlerreduktion eines Las-Vegas-Algorithmus mit Erfolgswahrscheinlichkeit \(\geq \varepsilon\) genügen \(N = \lceil \varepsilon^{-1} \ln \delta^{-1} \rceil\) Wiederholungen, um \(\Pr[\text{korrekt}] \geq 1 - \delta\) zu erreichen.
Wie skaliert \(N\) in \(\delta\)? Was bedeutet das praktisch?
Back
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Bei der Fehlerreduktion eines Las-Vegas-Algorithmus mit Erfolgswahrscheinlichkeit \(\geq \varepsilon\) genügen \(N = \lceil \varepsilon^{-1} \ln \delta^{-1} \rceil\) Wiederholungen, um \(\Pr[\text{korrekt}] \geq 1 - \delta\) zu erreichen.
Wie skaliert \(N\) in \(\delta\)? Was bedeutet das praktisch?
\(N\) skaliert logarithmisch in \(\delta^{-1}\).
Praktisch: Um die Fehlerwahrscheinlichkeit um einen konstanten Faktor (z.B. von \(10^{-2}\) auf \(10^{-3}\)) zu reduzieren, sind nur konstant viele zusätzliche Iterationen nötig.
Beispiel \(\varepsilon = 0{,}25\): \(\delta = 10^{-1} \Rightarrow N = 10\), \(\delta = 10^{-2} \Rightarrow N = 19\), \(\delta = 10^{-3} \Rightarrow N = 28\), \(\ldots\) (jeweils \(+9\)).
After
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
For the home-made rendezvous
void wait() { x = 1; while(x==1); }
void arrive(){ x = 2; } GCC with optimisations compiles the loop into
an infinite loop (jmp test without ever reloading x).
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
For the home-made rendezvous
void wait() { x = 1; while(x==1); }
void arrive(){ x = 2; } GCC with optimisations compiles the loop into
an infinite loop (jmp test without ever reloading x).
Reason: the compiler hoists the read of x out of the loop (register hoisting) because, from its sequential view, x is not modified inside the loop. Writes from other threads are invisible to it.
Fix: volatile or a lock.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
Bei der Fehlerreduktion eines Las-Vegas-Algorithmus mit Erfolgswahrscheinlichkeit \(\geq \varepsilon\) genügen \(N = \lceil \varepsilon^{-1} \ln \delta^{-1} \rceil\) Wiederholungen, um \(\Pr[\text{korrekt}] \geq 1 - \delta\) zu erreichen.<br><br>Wie skaliert \(N\) in \(\delta\)? Was bedeutet das praktisch? |
For the home-made rendezvous <pre>void wait() { x = 1; while(x==1); }
void arrive(){ x = 2; }</pre> GCC with optimisations compiles the loop into {{c1::an infinite loop (<code>jmp test</code> without ever reloading <code>x</code>)}}. |
| Extra |
\(N\) skaliert <b>logarithmisch</b> in \(\delta^{-1}\).<br><br>Praktisch: Um die Fehlerwahrscheinlichkeit um einen <b>konstanten Faktor</b> (z.B. von \(10^{-2}\) auf \(10^{-3}\)) zu reduzieren, sind nur <b>konstant viele zusätzliche Iterationen</b> nötig.<br><br>Beispiel \(\varepsilon = 0{,}25\): \(\delta = 10^{-1} \Rightarrow N = 10\), \(\delta = 10^{-2} \Rightarrow N = 19\), \(\delta = 10^{-3} \Rightarrow N = 28\), \(\ldots\) (jeweils \(+9\)). |
Reason: the compiler hoists the read of <code>x</code> out of the loop (register hoisting) because, from its sequential view, <code>x</code> is not modified inside the loop. Writes from other threads are invisible to it.<br><br>Fix: <code>volatile</code> or a lock. |
Tags:
ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Note 59: ETH::2. Semester::PProg
Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: ~i|&N8&P3M
added
Previous
Note did not exist
New Note
Front
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Realistic example of incorrect code:
boolean stop = false;
void f() { while(!stop) { /* draw */ } }
void g() { stop = didUserQuit(); } There is
no guarantee that thread 1 will ever stop - although it
will "likely work in practice".
Back
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Realistic example of incorrect code:
boolean stop = false;
void f() { while(!stop) { /* draw */ } }
void g() { stop = didUserQuit(); } There is
no guarantee that thread 1 will ever stop - although it
will "likely work in practice".
Reason: the compiler may hoist stop into a register and never reload it inside the loop. Additionally, thread 2's write may sit in its local cache.
Fix: volatile boolean stop or AtomicBoolean.
Field-by-field Comparison
| Field |
Before |
After |
| Text |
|
Realistic example of incorrect code: <pre>boolean stop = false;
void f() { while(!stop) { /* draw */ } }
void g() { stop = didUserQuit(); }</pre> There is {{c1::no guarantee that thread 1 will ever stop}} - although it {{c2::will "likely work in practice"}}. |
| Extra |
|
Reason: the compiler may hoist <code>stop</code> into a register and never reload it inside the loop. Additionally, thread 2's write may sit in its local cache.<br><br>Fix: <code>volatile boolean stop</code> or <code>AtomicBoolean</code>. |
Tags:
ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering