Anki Deck Changes

Commit: f9be7172 - warum liegt hier slop

Author: lhorva <lhorva@student.ethz.ch>

Date: 2026-04-30T17:52:39+02:00

Changes: 60 note(s) changed (44 added, 16 modified, 0 deleted)

ℹ️ Cosmetic Changes Hidden: 1 note(s) had formatting-only changes and are not shown below • 1 HTML formatting changes

Note 1: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: #7KfXN=Mrp
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
A spinlock built on TAS:
Acquire: while(!TAS(lock));
Release: lock = 0;
A spinlock built on CAS:
Acquire: while(CAS(lock, 0, 1) != 0);
Release: CAS(lock, 1, 0);  // result ignored
Both use \(O(1)\) shared state regardless of \(n\), in contrast to Filter / Bakery which need \(\Theta(n)\).

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
A spinlock built on TAS:
Acquire: while(!TAS(lock));
Release: lock = 0;
A spinlock built on CAS:
Acquire: while(CAS(lock, 0, 1) != 0);
Release: CAS(lock, 1, 0);  // result ignored
Both use \(O(1)\) shared state regardless of \(n\), in contrast to Filter / Bakery which need \(\Theta(n)\).

The release on CAS could just write 0 directly (faster); using CAS(lock,1,0) just illustrates that CAS can serve both phases. The lock is correct because the entry is a single atomic RMW that wins for at most one thread.
Field-by-field Comparison
Field Before After
Text A spinlock built on TAS: <pre>Acquire: while(!TAS(lock)); Release: lock = 0;</pre> A spinlock built on CAS: <pre>Acquire: while(CAS(lock, 0, 1) != 0); Release: CAS(lock, 1, 0); // result ignored</pre> Both use {{c1::\(O(1)\) shared state regardless of \(n\)}}, in contrast to Filter / Bakery which need \(\Theta(n)\).
Extra The release on CAS could just write 0 directly (faster); using <code>CAS(lock,1,0)</code> just illustrates that CAS can serve both phases. The lock is correct because the entry is a single atomic RMW that wins for at most one thread.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations

Note 2: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: #P|j?^In$#
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Dekker's algorithm fixes the previous mutex tries by combining the flag-based approach (try 2) with a turn variable used only for conflict resolution (try 3): when both flags are set, the process whose turn it is not temporarily lowers its flag and waits, then tries again.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Dekker's algorithm fixes the previous mutex tries by combining the flag-based approach (try 2) with a turn variable used only for conflict resolution (try 3): when both flags are set, the process whose turn it is not temporarily lowers its flag and waits, then tries again.

Properties: mutual exclusion, deadlock freedom, starvation freedom. But it is verbose: the code needs an inner conditional and a second wait loop. Peterson's algorithm achieves the same with a much shorter pre-protocol.
Field-by-field Comparison
Field Before After
Text {{c1::Dekker's algorithm}} fixes the previous mutex tries by combining the {{c2::flag-based approach (try 2)}} with a {{c2::<code>turn</code> variable used only for conflict resolution (try 3)}}: when both flags are set, the process whose turn it is <i>not</i> temporarily lowers its flag and waits, then tries again.
Extra Properties: mutual exclusion, deadlock freedom, starvation freedom. But it is verbose: the code needs an inner conditional and a second wait loop. Peterson's algorithm achieves the same with a much shorter pre-protocol.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 3: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: #vrtD+3u4N
modified

Before

Front

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Fehlerreduktion Las-Vegas 
Sei \(A\) ein randomisierter Algorithmus, der nie eine falsche Antwort gibt, aber zuweilen '???' ausgibt, mit \(\Pr[A(I) \text{ korrekt}] \geq \varepsilon\).

Sei \(A_\delta\) der Algorithmus, der \(A\) solange aufruft, bis entweder ein Wert verschieden von '???' ausgegeben wird (und diesen Wert dann ebenfalls ausgibt) oder bis
\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\] mal '???' ausgegeben wurde (und dann '???' ausgibt). Dann gilt:
\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]

Back

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Fehlerreduktion Las-Vegas 
Sei \(A\) ein randomisierter Algorithmus, der nie eine falsche Antwort gibt, aber zuweilen '???' ausgibt, mit \(\Pr[A(I) \text{ korrekt}] \geq \varepsilon\).

Sei \(A_\delta\) der Algorithmus, der \(A\) solange aufruft, bis entweder ein Wert verschieden von '???' ausgegeben wird (und diesen Wert dann ebenfalls ausgibt) oder bis
\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\] mal '???' ausgegeben wurde (und dann '???' ausgibt). Dann gilt:
\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]

Beweisskizze: Pr[\(A_\delta\) gibt '???' aus] \(\leq (1-\varepsilon)^N \leq e^{-\varepsilon N} \leq \delta\) für \(N \geq \varepsilon^{-1} \ln \delta^{-1}\), via \(1 - x \leq e^{-x}\).

After

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Modern compilers give no guarantee of a global ordering of memory accesses; individual accesses may even be optimised away entirely. Typical optimisations that cause reordering are dead code elimination, register hoisting, and locality optimisations.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Modern compilers give no guarantee of a global ordering of memory accesses; individual accesses may even be optimised away entirely. Typical optimisations that cause reordering are dead code elimination, register hoisting, and locality optimisations.

Background: such optimisations are essential for performance, but they create huge potential for bugs whenever assumptions about a global memory order are made.
Field-by-field Comparison
Field Before After
Text <b>Fehlerreduktion Las-Vegas</b>&nbsp;<br>Sei \(A\) ein randomisierter Algorithmus, der nie eine falsche Antwort gibt, aber zuweilen '???' ausgibt, mit \(\Pr[A(I) \text{ korrekt}] \geq \varepsilon\).<br><br>Sei \(A_\delta\) der Algorithmus, der \(A\) solange aufruft, bis entweder ein Wert verschieden von '???' ausgegeben wird (und diesen Wert dann ebenfalls ausgibt) oder bis <br>\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\] mal '???' ausgegeben wurde (und dann '???' ausgibt). Dann gilt:<br>\[\Pr[A_\delta(I) \text{ korrekt}] \geq {{c2::1 - \delta}}.\] Modern compilers give <b>no</b> guarantee of a global ordering of memory accesses; individual accesses may even be {{c1::optimised away entirely}}. Typical optimisations that cause reordering are {{c2::dead code elimination, register hoisting, and locality optimisations}}.
Extra Beweisskizze: Pr[\(A_\delta\) gibt '???' aus] \(\leq (1-\varepsilon)^N \leq e^{-\varepsilon N} \leq \delta\) für \(N \geq \varepsilon^{-1} \ln \delta^{-1}\), via \(1 - x \leq e^{-x}\). Background: such optimisations are essential for performance, but they create huge potential for bugs whenever assumptions about a global memory order are made.
Tags: ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering

Note 4: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: )Wj64y)bWf
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Peterson's lock (for two processes \(P\) and \(Q\)) uses two shared variables: flag[1..2] ("I am interested") and victim ("who yields if both want in"). The pre-protocol for thread id is
flag[id] = true;
victim = id;
while(flag[1-id] && victim == id);

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Peterson's lock (for two processes \(P\) and \(Q\)) uses two shared variables: flag[1..2] ("I am interested") and victim ("who yields if both want in"). The pre-protocol for thread id is
flag[id] = true;
victim = id;
while(flag[1-id] && victim == id);

Intuition: a thread can enter as soon as either the other thread has dropped its flag (no contention) or the other thread has overwritten victim (so it became the new victim). Far simpler than Dekker, and provably ME + starvation-free.
Field-by-field Comparison
Field Before After
Text {{c1::Peterson's lock}} (for two processes \(P\) and \(Q\)) uses two shared variables: {{c2::<code>flag[1..2]</code> ("I am interested")}} and {{c3::<code>victim</code> ("who yields if both want in")}}. The pre-protocol for thread <code>id</code> is <pre>flag[id] = true; victim = id; while(flag[1-id] &amp;&amp; victim == id);</pre>
Extra Intuition: a thread can enter as soon as either the other thread has dropped its flag (no contention) or the other thread has overwritten <code>victim</code> (so it became the new victim). Far simpler than Dekker, and provably ME + starvation-free.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 5: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: */_gG7JffH
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
Synchronizes-With (SW) pairs the specific actions that "see" each other: a volatile-write to x SW a subsequent read of x (subsequent in SO). Happens-Before (HB) is the transitive closure of PO and SW.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
Synchronizes-With (SW) pairs the specific actions that "see" each other: a volatile-write to x SW a subsequent read of x (subsequent in SO). Happens-Before (HB) is the transitive closure of PO and SW.

Unlike PO or SO, SW is not a generic ordering; it specifically pairs the actions between which a synchronisation effect takes place.
Field-by-field Comparison
Field Before After
Text <i>Synchronizes-With (SW)</i> pairs the specific actions that "see" each other: a <code>volatile</code>-write to <code>x</code> SW {{c1::a subsequent read of <code>x</code> (subsequent in SO)}}. <i>Happens-Before (HB)</i> is {{c2::the transitive closure of PO and SW}}.
Extra Unlike PO or SO, SW is not a generic ordering; it specifically pairs the actions between which a synchronisation effect takes place.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model

Note 6: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: +YN}l=~9iw
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Mutual exclusion built only from atomic registers (Filter, Bakery, Peterson) is not used in practice for four reasons: space lower bound is linear in the maximum number of threads; correctness relies on no memory reordering, which requires expensive memory barriers in real hardware; these algorithms are not wait-free; the "primitive" assumption is unrealistic - hardware provides faster combined RMW instructions instead.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Mutual exclusion built only from atomic registers (Filter, Bakery, Peterson) is not used in practice for four reasons: space lower bound is linear in the maximum number of threads; correctness relies on no memory reordering, which requires expensive memory barriers in real hardware; these algorithms are not wait-free; the "primitive" assumption is unrealistic - hardware provides faster combined RMW instructions instead.

The way out: extend the model. Modern multiprocessor architectures provide special instructions for atomically reading and writing at once (TAS, CAS, LL/SC), enabling \(O(1)\) space mutexes with practical performance.
Field-by-field Comparison
Field Before After
Text Mutual exclusion built only from atomic registers (Filter, Bakery, Peterson) is not used in practice for four reasons: {{c1::space lower bound is linear in the maximum number of threads}}; {{c1::correctness relies on no memory reordering, which requires expensive memory barriers in real hardware}}; {{c1::these algorithms are not wait-free}}; {{c1::the "primitive" assumption is unrealistic - hardware provides faster combined RMW instructions instead}}.
Extra The way out: extend the model. Modern multiprocessor architectures provide special instructions for atomically reading and writing at once (TAS, CAS, LL/SC), enabling \(O(1)\) space mutexes with practical performance.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 7: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: -T60qL&2-|
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The three conditions for a correct critical-section solution according to Ben-Ari are mutual exclusion (statements from CSes of different processes must not interleave), freedom from deadlock (if some processes are trying to enter a CS, at least one must eventually succeed), and freedom from starvation (any process trying to enter its CS must eventually succeed).

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The three conditions for a correct critical-section solution according to Ben-Ari are mutual exclusion (statements from CSes of different processes must not interleave), freedom from deadlock (if some processes are trying to enter a CS, at least one must eventually succeed), and freedom from starvation (any process trying to enter its CS must eventually succeed).

Hierarchy: starvation freedom \(\Rightarrow\) deadlock freedom (if every individual gets through, then certainly some individual gets through), but not the converse.
Field-by-field Comparison
Field Before After
Text The three conditions for a correct critical-section solution according to Ben-Ari are {{c1::mutual exclusion (statements from CSes of different processes must not interleave)}}, {{c2::freedom from deadlock (if some processes are trying to enter a CS, at least one must eventually succeed)}}, and {{c3::freedom from starvation (any process trying to enter its CS must eventually succeed)}}.
Extra Hierarchy: starvation freedom \(\Rightarrow\) deadlock freedom (if every individual gets through, then certainly <i>some</i> individual gets through), but not the converse.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 8: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: /,nT@jP#V6
modified

Before

Front

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Randomisierter Algorithmus für Optimierungsprobleme 
Sei \(\varepsilon > 0\) und \(A\) ein randomisierter Algorithmus, der immer eine zulässige Lösung liefert, mit\[\Pr[A(I) \geq f(I)] \geq \varepsilon\]für eine Zielschwelle \(f(I)\). Sei \(A_\delta\) der Algorithmus, der\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\]Aufrufe von \(A\) macht und die beste der erhaltenen Antworten ausgibt. Dann gilt:\[\Pr[A_\delta(I) \geq f(I)] \geq 1 - \delta.\]

Back

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Randomisierter Algorithmus für Optimierungsprobleme 
Sei \(\varepsilon > 0\) und \(A\) ein randomisierter Algorithmus, der immer eine zulässige Lösung liefert, mit\[\Pr[A(I) \geq f(I)] \geq \varepsilon\]für eine Zielschwelle \(f(I)\). Sei \(A_\delta\) der Algorithmus, der\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\]Aufrufe von \(A\) macht und die beste der erhaltenen Antworten ausgibt. Dann gilt:\[\Pr[A_\delta(I) \geq f(I)] \geq 1 - \delta.\]

Selbe Iterationsschranke wie bei Las-Vegas und MC mit einseitigem Fehler. Die Idee ist analog: ein einziger Aufruf, der die Schwelle überschreitet, genügt, da das Maximum mehrerer Aufrufe genommen wird.

Typischerweise ist \(f(I) = \mathbb{E}[A(I)]\), d.h. man möchte mit hoher Wahrscheinlichkeit mindestens den Erwartungswert erreichen.

After

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
A memory model (e.g. of a programming language like Java) is a contract that gives minimal guarantees on the effects of memory operations.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
A memory model (e.g. of a programming language like Java) is a contract that gives minimal guarantees on the effects of memory operations.

Goals:
- leaves room for hardware and compiler optimisations
- but provides guidelines that allow correct multi-threaded programs to be written.

Without such a model, the actual behaviour of threads communicating via shared memory depends on hardware, runtime, and language and is undefined.
Field-by-field Comparison
Field Before After
Text <b>Randomisierter Algorithmus für Optimierungsprobleme</b>&nbsp;<br>Sei \(\varepsilon &gt; 0\) und \(A\) ein randomisierter Algorithmus, der immer eine zulässige Lösung liefert, mit\[\Pr[A(I) \geq f(I)] \geq \varepsilon\]für eine Zielschwelle \(f(I)\). Sei \(A_\delta\) der Algorithmus, der\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\]Aufrufe von \(A\) macht und {{c2::die beste der erhaltenen Antworten}} ausgibt. Dann gilt:\[\Pr[A_\delta(I) \geq f(I)] \geq 1 - \delta.\] A {{c1::memory model}} (e.g. of a programming language like Java) is a {{c2::contract that gives minimal guarantees on the effects of memory operations}}.
Extra Selbe Iterationsschranke wie bei Las-Vegas und MC mit einseitigem Fehler. Die Idee ist analog: ein einziger Aufruf, der die Schwelle überschreitet, genügt, da das Maximum mehrerer Aufrufe genommen wird.<br><br>Typischerweise ist \(f(I) = \mathbb{E}[A(I)]\), d.h. man möchte mit hoher Wahrscheinlichkeit mindestens den Erwartungswert erreichen. Goals:<br>- leaves room for hardware and compiler optimisations<br>- but provides guidelines that allow correct multi-threaded programs to be written.<br><br>Without such a model, the actual behaviour of threads communicating via shared memory depends on hardware, runtime, and language and is undefined.
Tags: ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model

Note 9: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: /_rk8FS{Ft
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
When a state-space diagram is too large, only state transitions of the protocol matter; states with the same effect can be collapsed (e.g. \(p1\equiv p2\) in the pre-CS region, \(p4\equiv p5\) in the post-CS region). The forbidden situation then becomes simply both processes in the collapsed CS state.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
When a state-space diagram is too large, only state transitions of the protocol matter; states with the same effect can be collapsed (e.g. \(p1\equiv p2\) in the pre-CS region, \(p4\equiv p5\) in the post-CS region). The forbidden situation then becomes simply both processes in the collapsed CS state.

Collapsing keeps the proof structure intact (mutual exclusion / deadlock / starvation are checked on the reduced graph) while making it tractable to draw and reason about by hand.
Field-by-field Comparison
Field Before After
Text When a state-space diagram is too large, only state transitions of the protocol matter; states with the same effect can be {{c1::collapsed (e.g. \(p1\equiv p2\) in the pre-CS region, \(p4\equiv p5\) in the post-CS region)}}. The forbidden situation then becomes simply {{c2::both processes in the collapsed CS state}}.
Extra Collapsing keeps the proof structure intact (mutual exclusion / deadlock / starvation are checked on the reduced graph) while making it tractable to draw and reason about by hand.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 10: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 09qe2{A(
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Canonical deadlock example:
synchronized void transferTo(int amt, BankAccount a) {
  this.withdraw(amt);
  a.deposit(amt);  // acquires a's lock
}
Two concurrent calls x.transferTo(1, y) and y.transferTo(1, x) deadlock because Thread 1 holds x's lock and waits for y's, while Thread 2 holds y's and waits for x's.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Canonical deadlock example:
synchronized void transferTo(int amt, BankAccount a) {
  this.withdraw(amt);
  a.deposit(amt);  // acquires a's lock
}
Two concurrent calls x.transferTo(1, y) and y.transferTo(1, x) deadlock because Thread 1 holds x's lock and waits for y's, while Thread 2 holds y's and waits for x's.

The bug is intrinsic to the design: transferTo acquires two locks - the receiver's via this on entry, the target's via the synchronised a.deposit(). With no global ordering on the second acquisition, a cycle can form. Three classical fixes follow: shrink the CS, single global lock, or impose a global lock order.
Field-by-field Comparison
Field Before After
Text Canonical deadlock example: <pre>synchronized void transferTo(int amt, BankAccount a) { this.withdraw(amt); a.deposit(amt); // acquires a's lock }</pre> Two concurrent calls <code>x.transferTo(1, y)</code> and <code>y.transferTo(1, x)</code> deadlock because {{c1::Thread 1 holds <code>x</code>'s lock and waits for <code>y</code>'s, while Thread 2 holds <code>y</code>'s and waits for <code>x</code>'s}}.
Extra The bug is intrinsic to the design: <code>transferTo</code> acquires two locks - the receiver's via <code>this</code> on entry, the target's via the synchronised <code>a.deposit()</code>. With no global ordering on the second acquisition, a cycle can form. Three classical fixes follow: shrink the CS, single global lock, or impose a global lock order.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance

Note 11: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 0~]{jI&/:2
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The Filter Lock extends Peterson's lock from 2 to \(n\) processes by introducing \(n-1\) levels: each thread climbs from level 1 up to level \(n-1\), and at each level \(\ell\) it sets level[me] = \ell and victim[\ell] = me, then waits while
\exists k \neq me: level[k] >= \ell && victim[\ell] == me

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The Filter Lock extends Peterson's lock from 2 to \(n\) processes by introducing \(n-1\) levels: each thread climbs from level 1 up to level \(n-1\), and at each level \(\ell\) it sets level[me] = \ell and victim[\ell] = me, then waits while
\exists k \neq me: level[k] >= \ell && victim[\ell] == me

Intuition: at each level, Peterson's mechanism filters out at most one thread (the current victim, if anyone else is still around). After traversing all \(n-1\) levels, at most one thread can be left, which then enters the CS.

unlock just resets level[me] = 0.
Field-by-field Comparison
Field Before After
Text The {{c1::Filter Lock}} extends Peterson's lock from 2 to \(n\) processes by introducing {{c2::\(n-1\) levels}}: each thread climbs from level 1 up to level \(n-1\), and at each level \(\ell\) it sets <code>level[me] = \ell</code> and <code>victim[\ell] = me</code>, then waits while <pre>\exists k \neq me: level[k] &gt;= \ell &amp;&amp; victim[\ell] == me</pre>
Extra Intuition: at each level, Peterson's mechanism filters out at most one thread (the current victim, if anyone else is still around). After traversing all \(n-1\) levels, at most one thread can be left, which then enters the CS.<br><br><code>unlock</code> just resets <code>level[me] = 0</code>.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 12: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 1*Jyz+ZJLF
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
When constructing mutex algorithms from atomic registers, three assumptions are made: atomic reads and writes of primitive-type variables; no reorderings of read/write sequences (not true in practice; assumed here for simplicity); threads that enter a critical section eventually leave it. About progress outside the critical section, no assumption is made.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
When constructing mutex algorithms from atomic registers, three assumptions are made: atomic reads and writes of primitive-type variables; no reorderings of read/write sequences (not true in practice; assumed here for simplicity); threads that enter a critical section eventually leave it. About progress outside the critical section, no assumption is made.

Threads may therefore stall, die, or pause arbitrarily long outside a CS. This matters for arguments like starvation freedom: an algorithm must not rely on the "other" thread continuing to make progress.
Field-by-field Comparison
Field Before After
Text When constructing mutex algorithms from atomic registers, three assumptions are made: {{c1::atomic reads and writes of primitive-type variables}}; {{c1::no reorderings of read/write sequences (not true in practice; assumed here for simplicity)}}; {{c1::threads that enter a critical section eventually leave it}}. About progress {{c2::outside the critical section, no assumption is made}}.
Extra Threads may therefore stall, die, or pause arbitrarily long outside a CS. This matters for arguments like starvation freedom: an algorithm must not rely on the "other" thread continuing to make progress.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 13: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 1eIQehPP;^
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
A state-space diagram is a proof technique for mutex algorithms: nodes are {{c2::global states \([p, q, \text{shared vars}]\) consisting of program counters and all shared variables}}, edges are single atomic steps of one of the threads.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
A state-space diagram is a proof technique for mutex algorithms: nodes are {{c2::global states \([p, q, \text{shared vars}]\) consisting of program counters and all shared variables}}, edges are single atomic steps of one of the threads.

A violation of mutual exclusion shows up whenever a state with both program counters in the CS is reachable (e.g. \((p4, q4, \dots)\)). Advantage over enumerating interleavings: it terminates even with loops, and it spots deadlocks/livelocks as inescapable subgraphs.
Field-by-field Comparison
Field Before After
Text A {{c1::state-space diagram}} is a proof technique for mutex algorithms: nodes are {{c2::global states \([p, q, \text{shared vars}]\) consisting of program counters and all shared variables}}, edges are {{c3::single atomic steps of one of the threads}}.
Extra A violation of mutual exclusion shows up whenever a state with <i>both</i> program counters in the CS is reachable (e.g. \((p4, q4, \dots)\)). Advantage over enumerating interleavings: it terminates even with loops, and it spots deadlocks/livelocks as inescapable subgraphs.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 14: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 4%UUWqRVem
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
When applying resource ordering and the objects do not have a natural total order, generate one: each instance is assigned a unique final long index from a class-level AtomicLong counter via counter.incrementAndGet() in the constructor. Locks are then always acquired in ascending order of index.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
When applying resource ordering and the objects do not have a natural total order, generate one: each instance is assigned a unique final long index from a class-level AtomicLong counter via counter.incrementAndGet() in the constructor. Locks are then always acquired in ascending order of index.

This is the standard pattern used inside java.util.concurrent. AtomicLong guarantees uniqueness and monotonicity even under concurrent construction, and a 64-bit counter never realistically overflows.

Using System.identityHashCode is tempting but unsafe - hashes can collide.
Field-by-field Comparison
Field Before After
Text When applying resource ordering and the objects do not have a natural total order, generate one: each instance is assigned {{c1::a unique <code>final long</code> index from a class-level <code>AtomicLong counter</code>}} via <code>counter.incrementAndGet()</code> in the constructor. Locks are then always acquired in {{c2::ascending order of <code>index</code>}}.
Extra This is the standard pattern used inside <code>java.util.concurrent</code>. <code>AtomicLong</code> guarantees uniqueness and monotonicity even under concurrent construction, and a 64-bit counter never realistically overflows.<br><br>Using <code>System.identityHashCode</code> is tempting but unsafe - hashes can collide.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance

Note 15: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 7/+$V!A5-^
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
java.util.concurrent.atomic.AtomicBoolean exposes two key RMW operations on top of plain get / set: compareAndSet(expect, update), which atomically writes update iff the current value equals expect and returns true on success; and getAndSet(newValue), which atomically writes newValue and returns the previous value.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
java.util.concurrent.atomic.AtomicBoolean exposes two key RMW operations on top of plain get / set: compareAndSet(expect, update), which atomically writes update iff the current value equals expect and returns true on success; and getAndSet(newValue), which atomically writes newValue and returns the previous value.

compareAndSet ≅ CAS, getAndSet ≅ TAS. Note: even though they are called "atomic", the JLS does not guarantee these map to a single hardware instruction; they may be implemented via internal locks. They are guaranteed atomic, not guaranteed lock-free.
Field-by-field Comparison
Field Before After
Text <code>java.util.concurrent.atomic.AtomicBoolean</code> exposes two key RMW operations on top of plain <code>get</code> / <code>set</code>: {{c1::<code>compareAndSet(expect, update)</code>}}, which atomically writes <code>update</code> iff the current value equals <code>expect</code> and {{c2::returns <code>true</code> on success}}; and {{c3::<code>getAndSet(newValue)</code>}}, which {{c4::atomically writes <code>newValue</code> and returns the previous value}}.
Extra <code>compareAndSet</code> ≅ CAS, <code>getAndSet</code> ≅ TAS. Note: even though they are called "atomic", the JLS does not guarantee these map to a single hardware instruction; they may be implemented via internal locks. They are guaranteed atomic, not guaranteed lock-free.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations

Note 16: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 7c[p[vx$A1
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
An atomic register \(r\) (a shared memory object, not a CPU register) supports r.read() and r.write(v) with four properties: each invocation \(J\) takes effect at a single point in time \(\tau(J)\); \(\tau(J)\) lies between the start and end of \(J\); two operations on the same register have distinct effect times \(\tau(J) \neq \tau(K)\); an invocation of r.read() returns the value written by the r.write(v) invocation with the closest preceding effect time.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
An atomic register \(r\) (a shared memory object, not a CPU register) supports r.read() and r.write(v) with four properties: each invocation \(J\) takes effect at a single point in time \(\tau(J)\); \(\tau(J)\) lies between the start and end of \(J\); two operations on the same register have distinct effect times \(\tau(J) \neq \tau(K)\); an invocation of r.read() returns the value written by the r.write(v) invocation with the closest preceding effect time.

These four conditions justify treating each operation as if it happened instantaneously at \(\tau(J)\), even though physically it spans an interval. This is the foundation for proofs about Peterson, Filter, Bakery, etc.
Field-by-field Comparison
Field Before After
Text An <i>atomic register</i> \(r\) (a shared memory object, not a CPU register) supports <code>r.read()</code> and <code>r.write(v)</code> with four properties: each invocation \(J\) {{c1::takes effect at a single point in time \(\tau(J)\)}}; \(\tau(J)\) {{c1::lies between the start and end of \(J\)}}; two operations on the same register {{c1::have distinct effect times \(\tau(J) \neq \tau(K)\)}}; an invocation of <code>r.read()</code> returns {{c1::the value written by the <code>r.write(v)</code> invocation with the closest preceding effect time}}.
Extra These four conditions justify treating each operation as if it happened instantaneously at \(\tau(J)\), even though physically it spans an interval. This is the foundation for proofs about Peterson, Filter, Bakery, etc.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 17: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 7gI{,sQ;#o
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The 3rd-try mutex (strict alternation)
p2: while(turn != 1);
p3: critical section
p4: turn = 2
satisfies mutual exclusion but allows starvation, because a process stuck in its non-CS never resets turn, so the other process is forced to wait forever.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The 3rd-try mutex (strict alternation)
p2: while(turn != 1);
p3: critical section
p4: turn = 2
satisfies mutual exclusion but allows starvation, because a process stuck in its non-CS never resets turn, so the other process is forced to wait forever.

This is exactly why we make no assumption about progress outside the CS: a correct mutex must not rely on the other thread continuing to take turns.
Field-by-field Comparison
Field Before After
Text The 3rd-try mutex (strict alternation) <pre>p2: while(turn != 1); p3: critical section p4: turn = 2</pre> satisfies mutual exclusion but allows {{c1::starvation}}, because {{c2::a process stuck in its non-CS never resets <code>turn</code>, so the other process is forced to wait forever}}.
Extra This is exactly why we make <i>no</i> assumption about progress outside the CS: a correct mutex must not rely on the other thread continuing to take turns.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 18: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 7js:~
modified

Before

Front

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Bei einem Las-Vegas-Algorithmus ist die Laufzeit eine Zufallsvariable, die Korrektheit hingegen nicht vom Zufall abhängig.

Ziel: immer korrekt/gut, meistens schnell.

Alternative Sichtweise: Algorithmus, der manchmal '???' statt eines Ergebnisses ausgibt. Man kann ihn dann entweder wiederholen, bis ein Ergebnis kommt, oder nach fester Laufzeit abbrechen.

Back

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Bei einem Las-Vegas-Algorithmus ist die Laufzeit eine Zufallsvariable, die Korrektheit hingegen nicht vom Zufall abhängig.

Ziel: immer korrekt/gut, meistens schnell.

Alternative Sichtweise: Algorithmus, der manchmal '???' statt eines Ergebnisses ausgibt. Man kann ihn dann entweder wiederholen, bis ein Ergebnis kommt, oder nach fester Laufzeit abbrechen.

Die alternative Sichtweise ist die Brücke zum Monte-Carlo-Algorithmus: Lässt man den Las-Vegas-Algorithmus nach fester Laufzeit abbrechen, wird daraus ein Monte-Carlo-Algorithmus mit einseitigem Fehler.

After

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Rule of thumb for memory reordering: the compiler and the hardware are allowed to make any change that does not affect the semantics of a sequential execution.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Rule of thumb for memory reordering: the compiler and the hardware are allowed to make any change that does not affect the semantics of a sequential execution.

Example: x=1; y=x+1; z=x+1; is sequentially equivalent to x=1; z=x+1; y=x+1; and even to x=1; z=2; y=2;.

In a parallel context these reorderings can become observable and break correctness.
Field-by-field Comparison
Field Before After
Text Bei einem <b>Las-Vegas-Algorithmus</b> ist {{c1::die Laufzeit}} eine Zufallsvariable, die {{c1::Korrektheit}} hingegen nicht vom Zufall abhängig.<br><br>Ziel: {{c2::immer korrekt/gut, meistens schnell}}.<br><br><b>Alternative Sichtweise:</b> Algorithmus, der manchmal {{c3::'???' statt eines Ergebnisses}} ausgibt. Man kann ihn dann entweder {{c4::wiederholen, bis ein Ergebnis kommt}}, oder {{c4::nach fester Laufzeit abbrechen}}. Rule of thumb for memory reordering: the compiler and the hardware are allowed to make any change that does {{c1::not affect the semantics of a <i>sequential</i> execution}}.
Extra Die alternative Sichtweise ist die Brücke zum Monte-Carlo-Algorithmus: Lässt man den Las-Vegas-Algorithmus nach fester Laufzeit abbrechen, wird daraus ein Monte-Carlo-Algorithmus mit einseitigem Fehler. Example: <code>x=1; y=x+1; z=x+1;</code> is sequentially equivalent to <code>x=1; z=x+1; y=x+1;</code> and even to <code>x=1; z=2; y=2;</code>.<br><br>In a parallel context these reorderings can become observable and break correctness.
Tags: ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering

Note 19: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 7kNVF}Z|oG
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
Which memory operations actually get reordered depends on the hardware architecture; e.g. AMD64 (x86) differs significantly from ARM: x86 essentially only allows "stores reordered after loads", while ARM, POWER, and Alpha permit almost every kind of reordering.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
Which memory operations actually get reordered depends on the hardware architecture; e.g. AMD64 (x86) differs significantly from ARM: x86 essentially only allows "stores reordered after loads", while ARM, POWER, and Alpha permit almost every kind of reordering.

Consequence: correct multi-threaded code that "works" on x86 can break on ARM without any change to the source. This is exactly why a memory model is needed as a language-level guarantee.
Field-by-field Comparison
Field Before After
Text Which memory operations actually get reordered depends on the {{c1::hardware architecture}}; e.g. {{c2::AMD64 (x86) differs significantly from ARM}}: x86 essentially only allows "stores reordered after loads", while ARM, POWER, and Alpha permit almost every kind of reordering.
Extra Consequence: correct multi-threaded code that "works" on x86 can break on ARM without any change to the source. This is exactly why a memory model is needed as a language-level guarantee.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model

Note 20: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: :UEQ*@nnq1
modified

Before

Front

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Üppige-Auswahl-Problem (gegeben \(x_1, \ldots, x_n \in \{0, 1\}\) mit \(n/2\) Einsen, finde \(i\) mit \(x_i = 1\)):

Was ist die Laufzeit des schnellsten deterministischen Algorithmus, was die erwartete Laufzeit des schnellsten Las-Vegas-Algorithmus?

Back

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Üppige-Auswahl-Problem (gegeben \(x_1, \ldots, x_n \in \{0, 1\}\) mit \(n/2\) Einsen, finde \(i\) mit \(x_i = 1\)):

Was ist die Laufzeit des schnellsten deterministischen Algorithmus, was die erwartete Laufzeit des schnellsten Las-Vegas-Algorithmus?

Deterministisch: \(\Theta(n)\) (Worst Case).
Las-Vegas: \(O(1)\) erwartet, unabhängig von \(n\).

Das ist ein dramatischer Unterschied und ein Paradebeispiel für den Vorteil von Randomisierung: der Las-Vegas-Algorithmus probiert wiederholt zufällige Indizes und stoppt, sobald er eine \(1\) findet. Da die Hälfte der Bits \(1\) ist, ist die erwartete Anzahl Versuche \(\mathbb{E}[T] = 2\).

Die deterministische untere Schranke folgt aus einem Adversary-Argument: gegen jeden festen Algorithmus kann man eine Instanz konstruieren, auf der er \(n/2 + 1\) Bits lesen muss.

After

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
The Synchronization Order (SO) in the JMM is a total order over all synchronization actions; all threads see the SA in the same order; the SA inside a thread occur in PO; SO is consistent: every read in SO sees the latest write in SO.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
The Synchronization Order (SO) in the JMM is a total order over all synchronization actions; all threads see the SA in the same order; the SA inside a thread occur in PO; SO is consistent: every read in SO sees the latest write in SO.

The globality of SO is the reason that volatile and locks can synchronise threads at all.
Field-by-field Comparison
Field Before After
Text Üppige-Auswahl-Problem (gegeben \(x_1, \ldots, x_n \in \{0, 1\}\) mit \(n/2\) Einsen, finde \(i\) mit \(x_i = 1\)):<br><br>Was ist die Laufzeit des schnellsten <b>deterministischen</b> Algorithmus, was die <b>erwartete</b> Laufzeit des schnellsten <b>Las-Vegas</b>-Algorithmus? The <i>Synchronization Order (SO)</i> in the JMM is {{c1::a total order over all synchronization actions}}; {{c1::all threads see the SA in the same order}}; {{c1::the SA inside a thread occur in PO}}; {{c1::SO is consistent: every read in SO sees the latest write in SO}}.
Extra Deterministisch: \(\Theta(n)\) (Worst Case).<br>Las-Vegas: \(O(1)\) erwartet, unabhängig von \(n\).<br><br>Das ist ein dramatischer Unterschied und ein Paradebeispiel für den Vorteil von Randomisierung: der Las-Vegas-Algorithmus probiert wiederholt zufällige Indizes und stoppt, sobald er eine \(1\) findet. Da die Hälfte der Bits \(1\) ist, ist die erwartete Anzahl Versuche \(\mathbb{E}[T] = 2\).<br><br>Die deterministische untere Schranke folgt aus einem Adversary-Argument: gegen jeden festen Algorithmus kann man eine Instanz konstruieren, auf der er \(n/2 + 1\) Bits lesen muss. The globality of SO is the reason that <code>volatile</code> and locks can synchronise threads at all.
Tags: ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model

Note 21: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID:
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Burns-Lynch theorem (1993): if a system of \(n\) processes solves mutual exclusion with deadlock freedom using only atomic read/write registers, then it must use at least \(n\) shared variables.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Burns-Lynch theorem (1993): if a system of \(n\) processes solves mutual exclusion with deadlock freedom using only atomic read/write registers, then it must use at least \(n\) shared variables.

Intuition: a single read/write only ever exposes the last write, so the system cannot distinguish the order in which \(n\) threads wrote unless it has a separate slot per thread. This is exactly why Filter Lock and Bakery Lock both need \(O(n)\) shared state. Read-modify-write primitives sidestep the bound by combining the read and the write atomically.
Field-by-field Comparison
Field Before After
Text Burns-Lynch theorem (1993): if a system of \(n\) processes solves mutual exclusion with deadlock freedom using only {{c1::atomic read/write registers}}, then it must use at least {{c2::\(n\) shared variables}}.
Extra Intuition: a single read/write only ever exposes the last write, so the system cannot distinguish the order in which \(n\) threads wrote unless it has a separate slot per thread. This is exactly why Filter Lock and Bakery Lock both need \(O(n)\) shared state. Read-modify-write primitives sidestep the bound by combining the read and the write atomically.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 22: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: >aq{oB;vc7
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The structure of the critical-section problem per process inside an infinite loop is: non-critical section \(\to\) preprotocol \(\to\) critical section \(\to\) postprotocol.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The structure of the critical-section problem per process inside an infinite loop is: non-critical section \(\to\) preprotocol \(\to\) critical section \(\to\) postprotocol.

Preprotocol = entry code (lock acquire). Postprotocol = exit code (lock release). Only global variables are shared between processes; local variables are private per process.
Field-by-field Comparison
Field Before After
Text The structure of the critical-section problem per process inside an infinite loop is: {{c1::non-critical section}} \(\to\) {{c2::preprotocol}} \(\to\) {{c3::critical section}} \(\to\) {{c4::postprotocol}}.
Extra Preprotocol = entry code (lock acquire). Postprotocol = exit code (lock release). Only global variables are shared between processes; local variables are private per process.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 23: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: ?7[F1KIShh
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Threads produce a sequence of events; the \(j\)-th occurrence of event \(i\) in thread \(P\) is denoted \(p_i^j\). The precedence relation \(a \to b\) ("\(a\) occurs before \(b\)") is a total order over events.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Threads produce a sequence of events; the \(j\)-th occurrence of event \(i\) in thread \(P\) is denoted \(p_i^j\). The precedence relation \(a \to b\) ("\(a\) occurs before \(b\)") is a total order over events.

Indexed events are needed because programs typically loop, so the same line of code can produce many events; \(p_5^3\) means "event \(p_5\) in the third iteration".

Total ordering models a single global timeline of when events actually happen (we are not in a relativistic setting).
Field-by-field Comparison
Field Before After
Text Threads produce a sequence of <i>events</i>; the \(j\)-th occurrence of event \(i\) in thread \(P\) is denoted {{c1::\(p_i^j\)}}. The <i>precedence relation</i> \(a \to b\) ("\(a\) occurs before \(b\)") is {{c2::a total order over events}}.
Extra Indexed events are needed because programs typically loop, so the same line of code can produce many events; \(p_5^3\) means "event \(p_5\) in the third iteration".<br><br>Total ordering models a single global timeline of when events actually happen (we are not in a relativistic setting).
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 24: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: @Id@]fUyPJ
modified

Before

Front

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Monte-Carlo, Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\) 
Sei \(\varepsilon > 0\) und \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\Pr[A(I) \text{ korrekt}] \geq \tfrac{1}{2} + \varepsilon.\]Sei \(A_\delta\) der Algorithmus, der\[N = {{c1::\left\lceil 4\varepsilon^{-2} \ln \delta^{-1} \right\rceil}}\]Aufrufe von \(A\) macht und die Mehrheit der erhaltenen Antworten ausgibt. Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]

Back

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Monte-Carlo, Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\) 
Sei \(\varepsilon > 0\) und \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\Pr[A(I) \text{ korrekt}] \geq \tfrac{1}{2} + \varepsilon.\]Sei \(A_\delta\) der Algorithmus, der\[N = {{c1::\left\lceil 4\varepsilon^{-2} \ln \delta^{-1} \right\rceil}}\]Aufrufe von \(A\) macht und die Mehrheit der erhaltenen Antworten ausgibt. Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]

Achtung: \(N\) skaliert hier quadratisch in \(\varepsilon^{-1}\) (statt linear wie bei einseitigem Fehler oder Las-Vegas) und nutzt nicht die erste richtige Antwort, sondern den Mehrheitsentscheid.

Beweis nutzt die Chernoff-Schranke \(\Pr[X \leq (1-\eta)\mathbb{E}[X]] \leq e^{-\eta^2 \mathbb{E}[X]/2}\) auf die Anzahl korrekter Aufrufe.

After

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::2._Memory_Hierarchy
In a multi-core memory hierarchy each core has its own registers and its own L1 cache (private), while the L2 cache and system memory are shared between cores.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::2._Memory_Hierarchy
In a multi-core memory hierarchy each core has its own registers and its own L1 cache (private), while the L2 cache and system memory are shared between cores.

Consequence: when core 1 writes to its L1, core 2 sees the change only after it has propagated downwards (cache coherence protocol). This is a primary source of observable memory reordering at the hardware level.
Field-by-field Comparison
Field Before After
Text <b>Monte-Carlo, Fehlerwahrscheinlichkeit </b>\(&lt; \tfrac{1}{2}\)&nbsp;<br>Sei \(\varepsilon &gt; 0\) und \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\Pr[A(I) \text{ korrekt}] \geq \tfrac{1}{2} + \varepsilon.\]Sei \(A_\delta\) der Algorithmus, der\[N = {{c1::\left\lceil 4\varepsilon^{-2} \ln \delta^{-1} \right\rceil}}\]Aufrufe von \(A\) macht und {{c2::die Mehrheit der erhaltenen Antworten}} ausgibt. Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\] In a multi-core memory hierarchy each core has {{c1::its own registers and its own L1 cache (private)}}, while {{c2::the L2 cache and system memory are shared between cores}}.
Extra Achtung: \(N\) skaliert hier <b>quadratisch</b> in \(\varepsilon^{-1}\) (statt linear wie bei einseitigem Fehler oder Las-Vegas) und nutzt nicht die erste richtige Antwort, sondern den Mehrheitsentscheid.<br><br>Beweis nutzt die Chernoff-Schranke \(\Pr[X \leq (1-\eta)\mathbb{E}[X]] \leq e^{-\eta^2 \mathbb{E}[X]/2}\) auf die Anzahl korrekter Aufrufe. Consequence: when core 1 writes to its L1, core 2 sees the change only after it has propagated downwards (cache coherence protocol). This is a primary source of observable memory reordering at the hardware level.
Tags: ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::2._Memory_Hierarchy

Note 25: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: @rzMyS*OnN
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
Accesses to volatile fields in Java do not count as a data race. In terms of performance, volatile is slower than ordinary fields, but faster than locks.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
Accesses to volatile fields in Java do not count as a data race. In terms of performance, volatile is slower than ordinary fields, but faster than locks.

Recommendation: only for experts; otherwise use the standard library (java.util.concurrent, AtomicInteger, …).

Caveat: volatile guarantees visibility, but not atomicity of compound operations like i++.
Field-by-field Comparison
Field Before After
Text Accesses to <code>volatile</code> fields in Java do {{c1::not count as a data race}}. In terms of performance, <code>volatile</code> is {{c2::slower than ordinary fields, but faster than locks}}.
Extra Recommendation: only for experts; otherwise use the standard library (<code>java.util.concurrent</code>, <code>AtomicInteger</code>, …).<br><br>Caveat: <code>volatile</code> guarantees visibility, but <b>not atomicity of compound operations</b> like <code>i++</code>.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model

Note 26: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: Aajp);{.(r
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
ARM (and MIPS, POWER, RISC-V) implement atomic RMW via a Load-Linked / Store-Conditional pair: LDREX loads from memory and marks the address as exclusive for this core; the subsequent STREX performs a conditional store that only succeeds if no other core has written to the address in the meantime.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
ARM (and MIPS, POWER, RISC-V) implement atomic RMW via a Load-Linked / Store-Conditional pair: LDREX loads from memory and marks the address as exclusive for this core; the subsequent STREX performs a conditional store that only succeeds if no other core has written to the address in the meantime.

If STREX fails, software retries the LL/SC pair. Compared with x86's monolithic CMPXCHG, LL/SC is more flexible (you can compute the new value between LL and SC) but requires explicit retry logic in software.
Field-by-field Comparison
Field Before After
Text ARM (and MIPS, POWER, RISC-V) implement atomic RMW via a {{c1::Load-Linked / Store-Conditional}} pair: <code>LDREX</code> loads from memory and {{c2::marks the address as exclusive for this core}}; the subsequent <code>STREX</code> performs {{c3::a conditional store that only succeeds if no other core has written to the address in the meantime}}.
Extra If <code>STREX</code> fails, software retries the LL/SC pair. Compared with x86's monolithic <code>CMPXCHG</code>, LL/SC is more flexible (you can compute the new value between LL and SC) but requires explicit retry logic in software.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations

Note 27: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: ApSiaX&)Ab
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Lamport's Bakery algorithm solves \(n\)-process mutex by assigning each thread a ticket (label) larger than all outstanding tickets:
flag[me] = true;
label[me] = max(label[0..n-1]) + 1;
while(\exists k != me: flag[k] && (k, label[k]) <_l (me, label[me])) {};
where ties are broken by lexicographic order on (thread id, label), so two equal labels are still totally ordered.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Lamport's Bakery algorithm solves \(n\)-process mutex by assigning each thread a ticket (label) larger than all outstanding tickets:
flag[me] = true;
label[me] = max(label[0..n-1]) + 1;
while(\exists k != me: flag[k] && (k, label[k]) <_l (me, label[me])) {};
where ties are broken by lexicographic order on (thread id, label), so two equal labels are still totally ordered.

Properties: mutual exclusion, deadlock freedom, starvation freedom, and FCFS (the doorway is the label assignment).

The bug-resistant trick is the lexicographic tie-break: two threads can read the same max simultaneously and end up with equal labels, but the smaller-id thread still goes first - no live or dead lock.
Field-by-field Comparison
Field Before After
Text {{c1::Lamport's Bakery algorithm}} solves \(n\)-process mutex by assigning each thread a {{c2::ticket (label) larger than all outstanding tickets}}: <pre>flag[me] = true; label[me] = max(label[0..n-1]) + 1; while(\exists k != me: flag[k] &amp;&amp; (k, label[k]) &lt;_l (me, label[me])) {};</pre> where ties are broken by {{c3::lexicographic order on (thread id, label), so two equal labels are still totally ordered}}.
Extra Properties: mutual exclusion, deadlock freedom, starvation freedom, and FCFS (the doorway is the label assignment).<br><br>The bug-resistant trick is the lexicographic tie-break: two threads can read the same max simultaneously and end up with equal labels, but the smaller-id thread still goes first - no live or dead lock.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 28: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: C9O/B$skj#
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Peterson's starvation-freedom proof is by exhaustive contradiction: assume \(P\) is stuck in its while loop forever. Then \(Q\) is in one of three cases - stuck in non-CS (then flag[Q]=false, \(P\) proceeds, contradiction); repeatedly entering and leaving its CS (each entry sets victim=Q, freezing victim away from \(P\), so \(P\) proceeds, contradiction); also stuck in its lock loop (then \(Q\) needs victim != Q, but victim can equal at most one of \(P,Q\), contradiction).

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Peterson's starvation-freedom proof is by exhaustive contradiction: assume \(P\) is stuck in its while loop forever. Then \(Q\) is in one of three cases - stuck in non-CS (then flag[Q]=false, \(P\) proceeds, contradiction); repeatedly entering and leaving its CS (each entry sets victim=Q, freezing victim away from \(P\), so \(P\) proceeds, contradiction); also stuck in its lock loop (then \(Q\) needs victim != Q, but victim can equal at most one of \(P,Q\), contradiction).

The case split is exhaustive because these are the only three possible long-term behaviours of \(Q\) under our assumptions. Each leads to a contradiction, so \(P\) cannot be stuck.
Field-by-field Comparison
Field Before After
Text Peterson's starvation-freedom proof is by exhaustive contradiction: assume \(P\) is stuck in its <code>while</code> loop forever. Then \(Q\) is in one of three cases - {{c1::stuck in non-CS (then <code>flag[Q]=false</code>, \(P\) proceeds, contradiction)}}; {{c2::repeatedly entering and leaving its CS (each entry sets <code>victim=Q</code>, freezing <code>victim</code> away from \(P\), so \(P\) proceeds, contradiction)}}; {{c3::also stuck in its lock loop (then \(Q\) needs <code>victim != Q</code>, but <code>victim</code> can equal at most one of \(P,Q\), contradiction)}}.
Extra The case split is exhaustive because these are the only three possible long-term behaviours of \(Q\) under our assumptions. Each leads to a contradiction, so \(P\) cannot be stuck.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 29: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: CR2my^EB=I
modified

Before

Front

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Ein klassischer Algorithmus \(A\) erhält Eingabe \(I\) und liefert Ausgabe \(A(I)\). Es gilt:
  • \(A(I)\) ist immer genau und korrekt,
  • bei gleicher Eingabe immer dieselbe Ausgabe (\(A\) ist eine Funktion),
  • dieselbe Laufzeit.
Ein randomisierter Algorithmus erhält zusätzlich eine Zufallsquelle \(R\) und liefert Ausgabe \(A(I, R)\). Die Ausgabe ist manchmal richtig, manchmal fast richtig, manchmal schnell, das Verhalten ist nicht reproduzierbar. \(A\) ist nach wie vor eine Funktion auf den Paaren \((I, R)\).

Back

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Ein klassischer Algorithmus \(A\) erhält Eingabe \(I\) und liefert Ausgabe \(A(I)\). Es gilt:
  • \(A(I)\) ist immer genau und korrekt,
  • bei gleicher Eingabe immer dieselbe Ausgabe (\(A\) ist eine Funktion),
  • dieselbe Laufzeit.
Ein randomisierter Algorithmus erhält zusätzlich eine Zufallsquelle \(R\) und liefert Ausgabe \(A(I, R)\). Die Ausgabe ist manchmal richtig, manchmal fast richtig, manchmal schnell, das Verhalten ist nicht reproduzierbar. \(A\) ist nach wie vor eine Funktion auf den Paaren \((I, R)\).

Reproduzierbarkeit ist die zentrale Eigenschaft, die beim Übergang zum randomisierten Algorithmus verloren geht.

After

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
In the classic example with \(x{=}1;\,y{=}1\) (Thread 1) and \(a{=}y;\,b{=}x\) (Thread 2), every possible interleaving satisfies: \(b \geq a\) is always true. This proof technique is called proof by exhaustion (full enumeration of interleavings).

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
In the classic example with \(x{=}1;\,y{=}1\) (Thread 1) and \(a{=}y;\,b{=}x\) (Thread 2), every possible interleaving satisfies: \(b \geq a\) is always true. This proof technique is called proof by exhaustion (full enumeration of interleavings).

Yet the assertion assert(b >= a) can still fail in practice: pure interleaving reasoning ignores memory reordering by the compiler and the hardware.
Field-by-field Comparison
Field Before After
Text Ein <b>klassischer</b> Algorithmus \(A\) erhält Eingabe \(I\) und liefert Ausgabe \(A(I)\). Es gilt:<ul><li>\(A(I)\) ist immer {{c1::genau und korrekt}},</li><li>bei gleicher Eingabe immer {{c2::dieselbe Ausgabe (\(A\) ist eine Funktion)}},</li><li>{{c3::dieselbe Laufzeit}}.</li></ul>Ein <b>randomisierter</b> Algorithmus erhält zusätzlich eine Zufallsquelle \(R\) und liefert Ausgabe \(A(I, R)\). Die Ausgabe ist {{c4::manchmal richtig, manchmal fast richtig, manchmal schnell}}, das Verhalten ist {{c5::nicht reproduzierbar}}. \(A\) ist nach wie vor eine Funktion auf den Paaren \((I, R)\). In the classic example with \(x{=}1;\,y{=}1\) (Thread 1) and \(a{=}y;\,b{=}x\) (Thread 2), every possible interleaving satisfies: {{c1::\(b \geq a\) is always true}}. This proof technique is called {{c2::proof by exhaustion (full enumeration of interleavings)}}.
Extra Reproduzierbarkeit ist die zentrale Eigenschaft, die beim Übergang zum randomisierten Algorithmus verloren geht. Yet the assertion <code>assert(b &gt;= a)</code> can still fail in practice: pure interleaving reasoning ignores <b>memory reordering</b> by the compiler and the hardware.
Tags: ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering

Note 30: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: D|&w=l8gFV
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
The Test-And-Test-And-Set (TATAS) lock improves on TAS by spinning on a plain read first:
do {
  while(state.get()) {}
} while(!state.compareAndSet(false, true));
The inner loop only reads the cached value, generating no coherence traffic while the lock stays held; the costly RMW is attempted only after the lock looks free.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
The Test-And-Test-And-Set (TATAS) lock improves on TAS by spinning on a plain read first:
do {
  while(state.get()) {}
} while(!state.compareAndSet(false, true));
The inner loop only reads the cached value, generating no coherence traffic while the lock stays held; the costly RMW is attempted only after the lock looks free.

Caveat: TATAS does not generalise. The classical "double-checked locking" pattern - a TATAS-style optimisation around lazy initialisation - is famously broken without explicit memory barriers, because compilers and hardware may reorder the publish of the new object with respect to the flag write.
Field-by-field Comparison
Field Before After
Text The <i>Test-And-Test-And-Set (TATAS)</i> lock improves on TAS by spinning on a plain read first: <pre>do { while(state.get()) {} } while(!state.compareAndSet(false, true));</pre> The inner loop only {{c1::reads the cached value, generating no coherence traffic while the lock stays held}}; the costly RMW is attempted only {{c2::after the lock <i>looks</i> free}}.
Extra Caveat: TATAS does not generalise. The classical "double-checked locking" pattern - a TATAS-style optimisation around lazy initialisation - is famously broken without explicit memory barriers, because compilers and hardware may reorder the publish of the new object with respect to the flag write.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations

Note 31: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: F?scKnbfC[
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
On x86 the CMPXCHG mem, reg instruction implements compare-and-swap: it compares register A with a memory location and, if equal, copies the second operand into the location and sets ZF. To make it atomic across cores, it must be combined with the LOCK prefix.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
On x86 the CMPXCHG mem, reg instruction implements compare-and-swap: it compares register A with a memory location and, if equal, copies the second operand into the location and sets ZF. To make it atomic across cores, it must be combined with the LOCK prefix.

Plain CMPXCHG is atomic only with respect to other observers on the same core. The LOCK prefix locks the bus / cache line, which is what gives the instruction its inter-processor atomicity guarantee - and also what makes RMW operations 10-100x slower than ordinary loads.
Field-by-field Comparison
Field Before After
Text On x86 the {{c1::<code>CMPXCHG</code> mem, reg}} instruction implements compare-and-swap: it compares register A with a memory location and, if equal, copies the second operand into the location and sets ZF. To make it atomic across cores, it must be combined with {{c2::the <code>LOCK</code> prefix}}.
Extra Plain <code>CMPXCHG</code> is atomic only with respect to other observers on the same core. The <code>LOCK</code> prefix locks the bus / cache line, which is what gives the instruction its inter-processor atomicity guarantee - and also what makes RMW operations 10-100x slower than ordinary loads.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations

Note 32: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: FGboL4isC9
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Another deadlock fix is to protect every transferTo with a single static global lock, which trivially avoids cycles because there is only one lock to acquire. Cost: no two transfers can run in parallel, even on completely disjoint pairs of accounts.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Another deadlock fix is to protect every transferTo with a single static global lock, which trivially avoids cycles because there is only one lock to acquire. Cost: no two transfers can run in parallel, even on completely disjoint pairs of accounts.

This sacrifices nearly all the available concurrency for safety. It is a sensible default when contention is rare or when correctness clearly trumps throughput, but rarely the right answer at scale.
Field-by-field Comparison
Field Before After
Text Another deadlock fix is to protect every <code>transferTo</code> with {{c1::a single static global lock}}, which trivially avoids cycles because there is only one lock to acquire. Cost: {{c2::no two transfers can run in parallel, even on completely disjoint pairs of accounts}}.
Extra This sacrifices nearly all the available concurrency for safety. It is a sensible default when contention is rare or when correctness clearly trumps throughput, but rarely the right answer at scale.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance

Note 33: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: Fc~lyNYx7K
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
A naive TAS-based spinlock
while(state.getAndSet(true)) {}
scales poorly because every spinning thread issues writes to the lock variable, which generates bus contention and forces the cache-coherence protocol to invalidate all other cached copies on every retry.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
A naive TAS-based spinlock
while(state.getAndSet(true)) {}
scales poorly because every spinning thread issues writes to the lock variable, which generates bus contention and forces the cache-coherence protocol to invalidate all other cached copies on every retry.

Result: the runtime per thread grows roughly linearly (or worse) with the number of contending threads. Each spinning core hammers the bus, so even threads not currently trying to acquire the lock pay a cost. The fix is to spin on a read until the lock looks free (TTAS), then attempt the RMW only optimistically.
Field-by-field Comparison
Field Before After
Text A naive TAS-based spinlock <pre>while(state.getAndSet(true)) {}</pre> scales poorly because {{c1::every spinning thread issues writes to the lock variable}}, which {{c2::generates bus contention and forces the cache-coherence protocol to invalidate all other cached copies on every retry}}.
Extra Result: the runtime per thread grows roughly linearly (or worse) with the number of contending threads. Each spinning core hammers the bus, so even threads not currently trying to acquire the lock pay a cost. The fix is to spin on a read until the lock looks free (TTAS), then attempt the RMW only optimistically.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations

Note 34: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: Jj5bulRE#&
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
In the 1st-try mutex
p2: while(wantq);
p3: wantp = true;
p4: critical section
the order "first wait for the neighbour, then set my own flag" violates mutual exclusion, because {{c2::both threads can leave their while with wantp{=}wantq{=}false before either of them sets its flag}}.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
In the 1st-try mutex
p2: while(wantq);
p3: wantp = true;
p4: critical section
the order "first wait for the neighbour, then set my own flag" violates mutual exclusion, because {{c2::both threads can leave their while with wantp{=}wantq{=}false before either of them sets its flag}}.

Both threads enter the CS together; visible as state \((p4, q4, true, true)\) in the state-space diagram.

Naive fix: set the flag before the check. That solves ME but creates new problems (deadlock/livelock); see Dekker / Peterson.
Field-by-field Comparison
Field Before After
Text In the 1st-try mutex <pre>p2: while(wantq); p3: wantp = true; p4: critical section</pre> the order "first wait for the neighbour, then set my own flag" violates {{c1::mutual exclusion}}, because {{c2::both threads can leave their <code>while</code> with <code>wantp{=}wantq{=}false</code> before either of them sets its flag}}.
Extra Both threads enter the CS together; visible as state \((p4, q4, true, true)\) in the state-space diagram.<br><br>Naive fix: set the flag <i>before</i> the check. That solves ME but creates new problems (deadlock/livelock); see Dekker / Peterson.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 35: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: MH1KRE1DzH
modified

Before

Front

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Las-Vegas-Algorithmus für das Üppige-Auswahl-Problem
repeat:
    i := Uniform({1, ..., n})
    if x_i = 1: return i
Sei \(T\) die Anzahl an Wiederholungen. Da \(\Pr[x_i = 1] = \Pr[x_i = 0] = \tfrac{1}{2}\) für einen zufälligen Index \(i\), gilt:\[\Pr[T > k] = {{c1::\left(\tfrac{1}{2}\right)^k}}.\]Folglich:\[\Pr[T \leq k] = {{c2::1 - \left(\tfrac{1}{2}\right)^k}}.\]

Back

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Las-Vegas-Algorithmus für das Üppige-Auswahl-Problem
repeat:
    i := Uniform({1, ..., n})
    if x_i = 1: return i
Sei \(T\) die Anzahl an Wiederholungen. Da \(\Pr[x_i = 1] = \Pr[x_i = 0] = \tfrac{1}{2}\) für einen zufälligen Index \(i\), gilt:\[\Pr[T > k] = {{c1::\left(\tfrac{1}{2}\right)^k}}.\]Folglich:\[\Pr[T \leq k] = {{c2::1 - \left(\tfrac{1}{2}\right)^k}}.\]

\(T\) ist geometrisch verteilt mit Parameter \(p = \tfrac{1}{2}\).

Konkrete Werte: \(\Pr[T > 10] = 1/1024 < 0{,}001\), also \(\Pr[T \leq 10] > 0{,}999\). Mit \(\leq 20\) Wiederholungen erreicht man Wahrscheinlichkeit \(> 0{,}999999\).

Wichtig: \(T\) hängt nicht von \(n\) ab, weil die Erfolgswahrscheinlichkeit pro Versuch konstant \(\tfrac{1}{2}\) ist (genau die Hälfte der Bits sind \(1\)).

After

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
Program Order (PO) in the JMM is a total order over the actions of a single thread, defined on an execution trace. PO gives no ordering guarantee for memory accesses and is not total across threads.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
Program Order (PO) in the JMM is a total order over the actions of a single thread, defined on an execution trace. PO gives no ordering guarantee for memory accesses and is not total across threads.

Its only purpose is to link an execution back to the original program.

Intra-thread consistency: per thread, PO is consistent with the thread's isolated execution. E.g. taking the else-branch of an if whose condition was true makes the execution invalid.
Field-by-field Comparison
Field Before After
Text <b>Las-Vegas-Algorithmus für das Üppige-Auswahl-Problem</b><pre>repeat: i := Uniform({1, ..., n}) if x_i = 1: return i</pre>Sei \(T\) die Anzahl an Wiederholungen. Da \(\Pr[x_i = 1] = \Pr[x_i = 0] = \tfrac{1}{2}\) für einen zufälligen Index \(i\), gilt:\[\Pr[T &gt; k] = {{c1::\left(\tfrac{1}{2}\right)^k}}.\]Folglich:\[\Pr[T \leq k] = {{c2::1 - \left(\tfrac{1}{2}\right)^k}}.\] <i>Program Order (PO)</i> in the JMM is a {{c1::total order over the actions of a single thread, defined on an execution trace}}. PO gives {{c2::no ordering guarantee for memory accesses}} and is {{c2::not total across threads}}.
Extra \(T\) ist geometrisch verteilt mit Parameter \(p = \tfrac{1}{2}\).<br><br>Konkrete Werte: \(\Pr[T &gt; 10] = 1/1024 &lt; 0{,}001\), also \(\Pr[T \leq 10] &gt; 0{,}999\). Mit \(\leq 20\) Wiederholungen erreicht man Wahrscheinlichkeit \(&gt; 0{,}999999\).<br><br>Wichtig: \(T\) hängt <b>nicht von \(n\)</b> ab, weil die Erfolgswahrscheinlichkeit pro Versuch konstant \(\tfrac{1}{2}\) ist (genau die Hälfte der Bits sind \(1\)). Its only purpose is to link an execution back to the original program.<br><br><b>Intra-thread consistency:</b> per thread, PO is consistent with the thread's isolated execution. E.g. taking the <code>else</code>-branch of an <code>if</code> whose condition was true makes the execution invalid.
Tags: ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model

Note 36: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: O96Epv!-$o
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Exponential-backoff lock: when a TATAS acquire attempt fails, the thread {{c1::sleeps for a random duration drawn from \([0, \text{limit}]\), then doubles limit up to maxDelay}} before retrying. This removes the synchronised stampede of all waiters trying to acquire simultaneously the moment the lock is released.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Exponential-backoff lock: when a TATAS acquire attempt fails, the thread {{c1::sleeps for a random duration drawn from \([0, \text{limit}]\), then doubles limit up to maxDelay}} before retrying. This removes the synchronised stampede of all waiters trying to acquire simultaneously the moment the lock is released.

Empirically this gives near-flat scaling under high contention (vs. linear-or-worse for TAS / TTAS). Trade-offs: a thread may be sleeping when the lock becomes free, so latency under low contention is worse; tuning minDelay / maxDelay is workload-specific.
Field-by-field Comparison
Field Before After
Text <i>Exponential-backoff</i> lock: when a TATAS acquire attempt fails, the thread {{c1::sleeps for a random duration drawn from \([0, \text{limit}]\), then doubles <code>limit</code> up to <code>maxDelay</code>}} before retrying. This removes the {{c2::synchronised stampede of all waiters trying to acquire simultaneously the moment the lock is released}}.
Extra Empirically this gives near-flat scaling under high contention (vs. linear-or-worse for TAS / TTAS). Trade-offs: a thread may be sleeping when the lock becomes free, so latency under low contention is worse; tuning <code>minDelay</code> / <code>maxDelay</code> is workload-specific.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations

Note 37: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: Pa^@bE6s8n
modified

Before

Front

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Üppige-Auswahl-Problem
Gegeben sind \(x_1, \ldots, x_n \in \{0, 1\}\), von denen genau die Hälfte gleich \(1\) sind (\(n\) gerade).
Berechne einen Index \(i\) mit \(x_i = 1\).

Trivialer deterministischer Algorithmus (in Zeit \(O(n)\)):
for i = 1, ..., n:
    if x_i = 1: return i

Back

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Üppige-Auswahl-Problem
Gegeben sind \(x_1, \ldots, x_n \in \{0, 1\}\), von denen genau die Hälfte gleich \(1\) sind (\(n\) gerade).
Berechne einen Index \(i\) mit \(x_i = 1\).

Trivialer deterministischer Algorithmus (in Zeit \(O(n)\)):
for i = 1, ..., n:
    if x_i = 1: return i

Das Problem heisst üppig, weil es sehr viele gültige Antworten gibt (genau \(n/2\)). Trotzdem braucht jeder deterministische Algorithmus im Worst Case \(\Omega(n)\) Zeit, während ein einfacher Las-Vegas-Algorithmus mit erwartet \(O(1)\) auskommt.

After

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
In the classic x86 example with \(x{=}y{=}0\), thread 1: x=1; j=y; and thread 2: y=1; i=x;, besides \((1,1), (0,1), (1,0)\), the outcome \((i,j) = (0,0)\) is also possible, because under x86's "stores reordered after loads" rule the load (j=y / i=x) may be executed before the store (x=1 / y=1).

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
In the classic x86 example with \(x{=}y{=}0\), thread 1: x=1; j=y; and thread 2: y=1; i=x;, besides \((1,1), (0,1), (1,0)\), the outcome \((i,j) = (0,0)\) is also possible, because under x86's "stores reordered after loads" rule the load (j=y / i=x) may be executed before the store (x=1 / y=1).

Both threads thus read \(y\) and \(x\) while still 0, before either store becomes visible. This is a simple, actually-reproducible reordering effect on x86 (StoreLoad).
Field-by-field Comparison
Field Before After
Text <b>Üppige-Auswahl-Problem<br></b>Gegeben sind \(x_1, \ldots, x_n \in \{0, 1\}\), von denen {{c1::genau die Hälfte}} gleich \(1\) sind (\(n\) gerade).<br>Berechne {{c2::einen Index \(i\) mit \(x_i = 1\)}}.<br><br>Trivialer deterministischer Algorithmus (in Zeit \(O({{c3::n}})\)):<pre>for i = 1, ..., n: if x_i = 1: return i</pre> In the classic x86 example with \(x{=}y{=}0\), thread 1: <code>x=1; j=y;</code> and thread 2: <code>y=1; i=x;</code>, besides \((1,1), (0,1), (1,0)\), the outcome {{c1::\((i,j) = (0,0)\) is also possible}}, because under x86's "stores reordered after loads" rule {{c2::the load (<code>j=y</code> / <code>i=x</code>) may be executed before the store (<code>x=1</code> / <code>y=1</code>)}}.
Extra Das Problem heisst <i>üppig</i>, weil es sehr viele gültige Antworten gibt (genau \(n/2\)). Trotzdem braucht jeder deterministische Algorithmus im Worst Case \(\Omega(n)\) Zeit, während ein einfacher Las-Vegas-Algorithmus mit erwartet \(O(1)\) auskommt. Both threads thus read \(y\) and \(x\) while still 0, before either store becomes visible. This is a simple, actually-reproducible reordering effect on x86 (StoreLoad).
Tags: ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model

Note 38: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: RDjwIm/b}G
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Draw a directed graph with threads \(T_1, \dots, T_n\) and resources \(R_1, \dots, R_m\) as nodes. Edge \(T_i \to R_j\) means thread \(T_i\) is attempting to acquire resource \(R_j\); edge \(R_j \to T_i\) means resource \(R_j\) is currently held by thread \(T_i\). A deadlock is present iff this graph contains a cycle.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Draw a directed graph with threads \(T_1, \dots, T_n\) and resources \(R_1, \dots, R_m\) as nodes. Edge \(T_i \to R_j\) means thread \(T_i\) is attempting to acquire resource \(R_j\); edge \(R_j \to T_i\) means resource \(R_j\) is currently held by thread \(T_i\). A deadlock is present iff this graph contains a cycle.

Equivalent to the Coffman conditions: the cycle witnesses circular wait, while the edge semantics encode mutual exclusion (R held by exactly one T), hold-and-wait (T owns one R while requesting another), and no-preemption (edges are not removed by force).

Deadlock detection is a cycle search in this graph.
Field-by-field Comparison
Field Before After
Text Draw a directed graph with threads \(T_1, \dots, T_n\) and resources \(R_1, \dots, R_m\) as nodes. Edge \(T_i \to R_j\) means {{c1::thread \(T_i\) is attempting to acquire resource \(R_j\)}}; edge \(R_j \to T_i\) means {{c2::resource \(R_j\) is currently held by thread \(T_i\)}}. A deadlock is present iff {{c3::this graph contains a cycle}}.
Extra Equivalent to the Coffman conditions: the cycle witnesses circular wait, while the edge semantics encode mutual exclusion (R held by exactly one T), hold-and-wait (T owns one R while requesting another), and no-preemption (edges are not removed by force).<br><br>Deadlock <i>detection</i> is a cycle search in this graph.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance

Note 39: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: T+YN#N>,$x
modified

Before

Front

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Im Allgemeinen ist die Fehlerreduktion bei Monte-Carlo-Algorithmen nicht möglich.
(Gegenbeispiel: Algorithmus, der mit Wahrscheinlichkeit \(\tfrac{1}{2}\) JA bzw. NEIN antwortet, unabhängig von der Eingabe.)

Möglich ist sie aber bei:
  • einseitigem Fehler, oder
  • {{c2::Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\)}}.

Back

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Im Allgemeinen ist die Fehlerreduktion bei Monte-Carlo-Algorithmen nicht möglich.
(Gegenbeispiel: Algorithmus, der mit Wahrscheinlichkeit \(\tfrac{1}{2}\) JA bzw. NEIN antwortet, unabhängig von der Eingabe.)

Möglich ist sie aber bei:
  • einseitigem Fehler, oder
  • {{c2::Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\)}}.

Bei einseitigem Fehler genügt eine einzige korrekte Antwort, um die Wahrheit zu kennen (Wiederholung + erste eindeutige Antwort).
Bei Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\) hilft Mehrheitsentscheid (Chernoff-Argument).

After

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Modern multiprocessors do not enforce a global ordering of all instructions for two main reasons: pipelining (processors execute parts of multiple instructions simultaneously and may reorder them internally) and per-processor local caches (loads and stores become visible to other processors only after some delay).

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Modern multiprocessors do not enforce a global ordering of all instructions for two main reasons: pipelining (processors execute parts of multiple instructions simultaneously and may reorder them internally) and per-processor local caches (loads and stores become visible to other processors only after some delay).

What exactly is guaranteed varies a lot across architectures (x86 vs ARM vs POWER vs Alpha …).
Field-by-field Comparison
Field Before After
Text Im Allgemeinen ist die Fehlerreduktion bei Monte-Carlo-Algorithmen <b>nicht möglich</b>.<br>(Gegenbeispiel: Algorithmus, der mit Wahrscheinlichkeit \(\tfrac{1}{2}\) JA bzw. NEIN antwortet, unabhängig von der Eingabe.)<br><br>Möglich ist sie aber bei:<ul><li>{{c1::einseitigem Fehler}}, oder</li><li>{{c2::Fehlerwahrscheinlichkeit \(&lt; \tfrac{1}{2}\)}}.</li></ul> Modern multiprocessors do not enforce a global ordering of all instructions for two main reasons: {{c1::pipelining (processors execute parts of multiple instructions simultaneously and may reorder them internally)}} and {{c2::per-processor local caches (loads and stores become visible to other processors only after some delay)}}.
Extra Bei einseitigem Fehler genügt eine einzige korrekte Antwort, um die Wahrheit zu kennen (Wiederholung + erste eindeutige Antwort).<br>Bei Fehlerwahrscheinlichkeit \(&lt; \tfrac{1}{2}\) hilft Mehrheitsentscheid (Chernoff-Argument). What exactly is guaranteed varies a lot across architectures (x86 vs ARM vs POWER vs Alpha …).
Tags: ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering

Note 40: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: TezZ0;|FPI
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Test-And-Set on a memory cell s, atomically: if mem[s] == 0, set it to 1 and return true; else return false. Compare-And-Swap on cell a with arguments old, new, atomically: read oldval = mem[a]; if oldval == old, set mem[a] = new; in either case return oldval.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Test-And-Set on a memory cell s, atomically: if mem[s] == 0, set it to 1 and return true; else return false. Compare-And-Swap on cell a with arguments old, new, atomically: read oldval = mem[a]; if oldval == old, set mem[a] = new; in either case return oldval.

TAS is a strict subset of CAS (CAS with old=0, new=1 behaves like TAS, modulo return convention). CAS is strictly more expressive: it can be used to implement queues, stacks, and other lock-free structures, while TAS alone cannot.

Both make the read and the conditional write a single indivisible step - that is precisely what beats the atomic-register lower bound.
Field-by-field Comparison
Field Before After
Text <i>Test-And-Set</i> on a memory cell <code>s</code>, atomically: {{c1::if <code>mem[s] == 0</code>, set it to 1 and return <code>true</code>; else return <code>false</code>}}. <i>Compare-And-Swap</i> on cell <code>a</code> with arguments <code>old, new</code>, atomically: {{c2::read <code>oldval = mem[a]</code>; if <code>oldval == old</code>, set <code>mem[a] = new</code>; in either case return <code>oldval</code>}}.
Extra TAS is a strict subset of CAS (CAS with <code>old=0, new=1</code> behaves like TAS, modulo return convention). CAS is strictly more expressive: it can be used to implement queues, stacks, and other lock-free structures, while TAS alone cannot.<br><br>Both make the read and the conditional write a single indivisible step - that is precisely what beats the atomic-register lower bound.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations

Note 41: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: VKeMFH&qvc
modified

Before

Front

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Bei einem Monte-Carlo-Algorithmus ist die Korrektheit (bzw. Qualität) eine Zufallsvariable, die Laufzeit hingegen nicht vom Zufall abhängig.

Ziel: immer schnell, meistens korrekt/gut.

Back

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Bei einem Monte-Carlo-Algorithmus ist die Korrektheit (bzw. Qualität) eine Zufallsvariable, die Laufzeit hingegen nicht vom Zufall abhängig.

Ziel: immer schnell, meistens korrekt/gut.

Merksatz: Monte-Carlo ist schnell und vielleicht falsch, Las-Vegas ist korrekt und vielleicht langsam.

After

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
A proof by contradiction that assert(b >= a) cannot fail (assuming sequential program order) uses the relation happened before together with its property of transitivity.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
A proof by contradiction that assert(b >= a) cannot fail (assuming sequential program order) uses the relation happened before together with its property of transitivity.

Assume \(b{<}a\); this forces \(a{=}1\) and \(b{=}0\), which produces a cyclic happened before chain - contradiction.

The proof relies on programs executing in program order; precisely this assumption breaks in practice due to reordering.
Field-by-field Comparison
Field Before After
Text Bei einem <b>Monte-Carlo-Algorithmus</b> ist {{c1::die Korrektheit (bzw. Qualität)}}&nbsp;eine Zufallsvariable, die {{c1::Laufzeit}}&nbsp;hingegen nicht vom Zufall abhängig.<br><br>Ziel: {{c2::immer schnell, meistens korrekt/gut}}. A proof by contradiction that <code>assert(b &gt;= a)</code> cannot fail (assuming sequential program order) uses the relation {{c1::<i>happened before</i>}} together with its property of {{c2::transitivity}}.
Extra Merksatz: Monte-Carlo ist <i>schnell und vielleicht falsch</i>, Las-Vegas ist <i>korrekt und vielleicht langsam</i>. Assume \(b{&lt;}a\); this forces \(a{=}1\) and \(b{=}0\), which produces a cyclic <i>happened before</i> chain - contradiction.<br><br>The proof relies on programs executing in program order; precisely this assumption breaks in practice due to reordering.
Tags: ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering

Note 42: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: VQe2x::6Vm
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
One way to break the transferTo deadlock is to remove the outer synchronized on transferTo, leaving only the per-method locks on withdraw and deposit. The trade-off is a transient inconsistency: between the withdraw and the deposit, the money is briefly missing from both accounts.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
One way to break the transferTo deadlock is to remove the outer synchronized on transferTo, leaving only the per-method locks on withdraw and deposit. The trade-off is a transient inconsistency: between the withdraw and the deposit, the money is briefly missing from both accounts.

For financial systems this is usually unacceptable: an external observer (audit, balance read) could see the missing amount. Smaller critical sections are a good move whenever the application can tolerate seeing intermediate states.
Field-by-field Comparison
Field Before After
Text One way to break the <code>transferTo</code> deadlock is to {{c1::remove the outer <code>synchronized</code> on <code>transferTo</code>}}, leaving only the per-method locks on <code>withdraw</code> and <code>deposit</code>. The trade-off is a {{c2::transient inconsistency: between the withdraw and the deposit, the money is briefly missing from both accounts}}.
Extra For financial systems this is usually unacceptable: an external observer (audit, balance read) could see the missing amount. Smaller critical sections are a good move whenever the application can tolerate seeing intermediate states.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance

Note 43: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: Xhsn{vOeua
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
On a single-core system, mutual exclusion is achieved trivially by using as preprotocol and postprotocol switching interrupts off and on respectively ("switch off/on IRQs").

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
On a single-core system, mutual exclusion is achieved trivially by using as preprotocol and postprotocol switching interrupts off and on respectively ("switch off/on IRQs").

Reasoning: without interrupts no context switch occurs, so no other thread on the same core can interleave.

This does not work on multi-core, because other cores keep running in parallel.
Field-by-field Comparison
Field Before After
Text On a single-core system, mutual exclusion is achieved trivially by using as preprotocol and postprotocol {{c1::switching interrupts off and on respectively ("switch off/on IRQs")}}.
Extra Reasoning: without interrupts no context switch occurs, so no other thread on the same core can interleave.<br><br>This does <b>not</b> work on multi-core, because other cores keep running in parallel.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 44: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: YD,MtGo!Tz
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Even with atomic registers, programs can still be non-deterministic, because the model says nothing about the order of effect times for concurrent operations.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Even with atomic registers, programs can still be non-deterministic, because the model says nothing about the order of effect times for concurrent operations.

Atomicity gives us "each operation has a single point in time", but does not pin down which point. So two concurrent writes can serialise either way, and a concurrent read may legitimately observe either of the two values.
Field-by-field Comparison
Field Before After
Text Even with atomic registers, programs can still be non-deterministic, because {{c1::the model says nothing about the order of effect times for <i>concurrent</i> operations}}.
Extra Atomicity gives us "each operation has a single point in time", but does not pin down <i>which</i> point. So two concurrent writes can serialise either way, and a concurrent read may legitimately observe either of the two values.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 45: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: ]SzQ}B-b0#
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
A common Java bug when implementing Peterson's lock is writing volatile boolean[] flag = new boolean[2];, which makes only the array reference volatile, not its elements. Element accesses then have no volatile semantics. The correct fix is to use AtomicIntegerArray (or AtomicIntegerArray for victim).

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
A common Java bug when implementing Peterson's lock is writing volatile boolean[] flag = new boolean[2];, which makes only the array reference volatile, not its elements. Element accesses then have no volatile semantics. The correct fix is to use AtomicIntegerArray (or AtomicIntegerArray for victim).

The buggy version may appear to work in practice (especially on x86), but provides no JMM guarantees about visibility and ordering of writes to individual array slots. AtomicIntegerArray exposes per-element atomic and volatile-like operations.
Field-by-field Comparison
Field Before After
Text A common Java bug when implementing Peterson's lock is writing <code>volatile boolean[] flag = new boolean[2];</code>, which makes {{c1::only the array <i>reference</i> volatile, not its <i>elements</i>}}. Element accesses then have no volatile semantics. The correct fix is to use {{c2::<code>AtomicIntegerArray</code> (or <code>AtomicIntegerArray</code> for <code>victim</code>)}}.
Extra The buggy version may appear to work in practice (especially on x86), but provides no JMM guarantees about visibility and ordering of writes to individual array slots. <code>AtomicIntegerArray</code> exposes per-element atomic and volatile-like operations.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 46: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: _9&=
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Two strategies for handling deadlocks: detection (find cycles in the resource graph at runtime; cannot in general be healed because releasing locks usually leaves inconsistent state) and avoidance (statically ensure a cycle can never form). Two standard avoidance techniques are two-phase locking with retry (release everything and restart on conflict; common in databases where transactions can be aborted) and resource ordering (impose a global total order on locks; common in parallel programming where state mutations are not transactional).

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
Two strategies for handling deadlocks: detection (find cycles in the resource graph at runtime; cannot in general be healed because releasing locks usually leaves inconsistent state) and avoidance (statically ensure a cycle can never form). Two standard avoidance techniques are two-phase locking with retry (release everything and restart on conflict; common in databases where transactions can be aborted) and resource ordering (impose a global total order on locks; common in parallel programming where state mutations are not transactional).

Detection is reactive; avoidance is proactive. Avoidance is preferred when consistency requirements rule out the rollback-and-retry approach.
Field-by-field Comparison
Field Before After
Text Two strategies for handling deadlocks: {{c1::detection (find cycles in the resource graph at runtime; cannot in general be healed because releasing locks usually leaves inconsistent state)}} and {{c2::avoidance (statically ensure a cycle can never form)}}. Two standard avoidance techniques are {{c3::two-phase locking with retry (release everything and restart on conflict; common in databases where transactions can be aborted)}} and {{c3::resource ordering (impose a global total order on locks; common in parallel programming where state mutations are not transactional)}}.
Extra Detection is reactive; avoidance is proactive. Avoidance is preferred when consistency requirements rule out the rollback-and-retry approach.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance

Note 47: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: a+AQXsL|IC
modified

Before

Front

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Monte-Carlo, einseitiger Fehler
Sei \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\begin{gathered}\Pr[A(I) = \text{JA}] = 1, \text{ falls } I \text{ JA-Instanz},\\ \Pr[A(I) = \text{NEIN}] \geq \varepsilon, \text{ falls } I \text{ NEIN-Instanz}.\end{gathered}\]Sei \(A_\delta\) der Algorithmus, der \(A\) solange aufruft, bis entweder NEIN ausgegeben wird (dann selbst NEIN) oder bis\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\]mal JA ausgegeben wurde (dann selbst JA). Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]

Back

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Monte-Carlo, einseitiger Fehler
Sei \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\begin{gathered}\Pr[A(I) = \text{JA}] = 1, \text{ falls } I \text{ JA-Instanz},\\ \Pr[A(I) = \text{NEIN}] \geq \varepsilon, \text{ falls } I \text{ NEIN-Instanz}.\end{gathered}\]Sei \(A_\delta\) der Algorithmus, der \(A\) solange aufruft, bis entweder NEIN ausgegeben wird (dann selbst NEIN) oder bis\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\]mal JA ausgegeben wurde (dann selbst JA). Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]

Selbe Iterationsschranke wie bei Las-Vegas: \(N = \lceil \varepsilon^{-1} \ln \delta^{-1} \rceil\). Tatsächlich ist das hier äquivalent zur Las-Vegas-Sicht: man kann den einseitig-fehlerhaften MC-Algorithmus als Las-Vegas-Algorithmus auffassen, der '???' (alias 'JA') ausgibt, wenn er nicht sicher ist.

After

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::2._Memory_Hierarchy
Typical latencies in the memory hierarchy of one core (order of magnitude): registers ~0.5 ns, L1 cache ~1 ns, L2 cache ~7 ns, system memory ~100 ns.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::2._Memory_Hierarchy
Typical latencies in the memory hierarchy of one core (order of magnitude): registers ~0.5 ns, L1 cache ~1 ns, L2 cache ~7 ns, system memory ~100 ns.

Trade-off across the hierarchy: top is fast, low latency, high cost, low capacity; bottom is slow, high latency, low cost, high capacity.

Factor between register and RAM: roughly \(200{\times}\).
Field-by-field Comparison
Field Before After
Text <b>Monte-Carlo, einseitiger Fehler<br></b>Sei \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\begin{gathered}\Pr[A(I) = \text{JA}] = 1, \text{ falls } I \text{ JA-Instanz},\\ \Pr[A(I) = \text{NEIN}] \geq \varepsilon, \text{ falls } I \text{ NEIN-Instanz}.\end{gathered}\]Sei \(A_\delta\) der Algorithmus, der \(A\) solange aufruft, bis entweder NEIN ausgegeben wird (dann selbst NEIN) oder bis\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\]mal JA ausgegeben wurde (dann selbst JA). Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq {{c2::1 - \delta}}.\] Typical latencies in the memory hierarchy of one core (order of magnitude): registers {{c1::~0.5 ns}}, L1 cache {{c1::~1 ns}}, L2 cache {{c1::~7 ns}}, system memory {{c1::~100 ns}}.
Extra Selbe Iterationsschranke wie bei Las-Vegas: \(N = \lceil \varepsilon^{-1} \ln \delta^{-1} \rceil\). Tatsächlich ist das hier äquivalent zur Las-Vegas-Sicht: man kann den einseitig-fehlerhaften MC-Algorithmus als Las-Vegas-Algorithmus auffassen, der '???' (alias 'JA') ausgibt, wenn er nicht sicher ist. Trade-off across the hierarchy: top is fast, low latency, high cost, low capacity; bottom is slow, high latency, low cost, high capacity.<br><br>Factor between register and RAM: roughly \(200{\times}\).
Tags: ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::2._Memory_Hierarchy

Note 48: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: g?{Lm[l$i^
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
HB consistency: a read sees either the latest write in HB or any other write that is unordered with respect to it in HB. Consequence: the JMM explicitly allows data races rather than forbidding them.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
HB consistency: a read sees either the latest write in HB or any other write that is unordered with respect to it in HB. Consequence: the JMM explicitly allows data races rather than forbidding them.

Example: in a race a read may observe an "old" value even though a newer write exists, as long as that newer write is not HB-before the read. This is how the typical "impossible" outcomes arise.
Field-by-field Comparison
Field Before After
Text <i>HB consistency</i>: a read sees either {{c1::the latest write in HB}} or {{c1::any other write that is unordered with respect to it in HB}}. Consequence: the JMM {{c2::explicitly allows data races}} rather than forbidding them.
Extra Example: in a race a read may observe an "old" value even though a newer write exists, as long as that newer write is not HB-before the read. This is how the typical "impossible" outcomes arise.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model

Note 49: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: hd
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Any lock pre-protocol can be split into a doorway interval \(D\) (finite number of steps) and a waiting interval \(W\) (unbounded number of steps). A lock is first-come-first-served (FCFS) iff for any two threads \(A, B\): \(D_A^j \to D_B^k \implies CS_A^j \to CS_B^k\).

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
Any lock pre-protocol can be split into a doorway interval \(D\) (finite number of steps) and a waiting interval \(W\) (unbounded number of steps). A lock is first-come-first-served (FCFS) iff for any two threads \(A, B\): \(D_A^j \to D_B^k \implies CS_A^j \to CS_B^k\).

The split formalises "once you finished the bounded entry handshake (\(D\)), you are committed; the only remaining wait is the unbounded \(W\)". FCFS demands that the order in which threads finish their doorways is the same as the order in which they enter their critical sections.
Field-by-field Comparison
Field Before After
Text Any lock pre-protocol can be split into a {{c1::doorway interval \(D\) (finite number of steps)}} and a {{c2::waiting interval \(W\) (unbounded number of steps)}}. A lock is <i>first-come-first-served</i> (FCFS) iff for any two threads \(A, B\): {{c3::\(D_A^j \to D_B^k \implies CS_A^j \to CS_B^k\)}}.
Extra The split formalises "once you finished the bounded entry handshake (\(D\)), you are committed; the only remaining wait is the unbounded \(W\)". FCFS demands that the order in which threads finish their doorways is the same as the order in which they enter their critical sections.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 50: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: m;emJ<(X_K
modified

Before

Front

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Jeder deterministische Algorithmus für das Üppige-Auswahl-Problem benötigt Laufzeit \(\Omega(n)\).

Beweisidee (Adversary-Argument):
Für fixen Algorithmus konstruieren wir die Instanz schrittweise. Auf jede Anfrage \(x_i\) des Algorithmus antworten wir mit \(0\), solange wir noch nicht \(n/2\) Nullen platziert haben. Sobald \(n/2\) Nullen platziert sind, füllen wir die restlichen Positionen mit \(1\) auf. Damit liest der Algorithmus mindestens \(n/2 + 1\) Eingabezahlen, bevor er eine \(1\) sieht.

Back

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Jeder deterministische Algorithmus für das Üppige-Auswahl-Problem benötigt Laufzeit \(\Omega(n)\).

Beweisidee (Adversary-Argument):
Für fixen Algorithmus konstruieren wir die Instanz schrittweise. Auf jede Anfrage \(x_i\) des Algorithmus antworten wir mit \(0\), solange wir noch nicht \(n/2\) Nullen platziert haben. Sobald \(n/2\) Nullen platziert sind, füllen wir die restlichen Positionen mit \(1\) auf. Damit liest der Algorithmus mindestens \(n/2 + 1\) Eingabezahlen, bevor er eine \(1\) sieht.

Das ist ein typisches Adversary-Argument: Der "Gegenspieler" entscheidet die Eingabe erst während der Ausführung, immer so, dass der Algorithmus möglichst lange braucht. Da der Algorithmus deterministisch ist, weiss der Adversary genau, welche Position als nächstes gelesen wird, und kann sie auf \(0\) setzen.

After

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
The Java Memory Model (JMM) defines actions such as read(x):1 ("read variable x; the value read is 1"). Executions combine actions with four orderings: Program Order (PO), Synchronizes-With (SW), Synchronization Order (SO), Happens-Before (HB).

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
The Java Memory Model (JMM) defines actions such as read(x):1 ("read variable x; the value read is 1"). Executions combine actions with four orderings: Program Order (PO), Synchronizes-With (SW), Synchronization Order (SO), Happens-Before (HB).

The JMM restricts the allowed outcomes of programs. Without volatile / synchronized the result is essentially "arbitrary".
Field-by-field Comparison
Field Before After
Text Jeder deterministische Algorithmus für das Üppige-Auswahl-Problem benötigt Laufzeit \(\Omega({{c1::n}})\).<br><br><b>Beweisidee (Adversary-Argument):</b> <br>Für fixen Algorithmus konstruieren wir die Instanz schrittweise. Auf jede Anfrage \(x_i\) des Algorithmus antworten wir mit&nbsp;{{c2::\(0\)}}, solange wir noch nicht \(n/2\) Nullen platziert haben. Sobald \(n/2\) Nullen platziert sind, füllen wir die restlichen Positionen mit \(1\) auf. Damit liest der Algorithmus mindestens {{c3::\(n/2 + 1\)}} Eingabezahlen, bevor er eine \(1\) sieht. The Java Memory Model (JMM) defines <i>actions</i> such as {{c1::<code>read(x):1</code> ("read variable x; the value read is 1")}}. <i>Executions</i> combine actions with four orderings: {{c2::Program Order (PO), Synchronizes-With (SW), Synchronization Order (SO), Happens-Before (HB)}}.
Extra Das ist ein typisches <b>Adversary-Argument</b>: Der "Gegenspieler" entscheidet die Eingabe erst während der Ausführung, immer so, dass der Algorithmus möglichst lange braucht. Da der Algorithmus deterministisch ist, weiss der Adversary genau, welche Position als nächstes gelesen wird, und kann sie auf \(0\) setzen. The JMM restricts the allowed outcomes of programs. Without <code>volatile</code> / <code>synchronized</code> the result is essentially "arbitrary".
Tags: ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model

Note 51: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: p)%lBpQ>Q<
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Three common families of hardware atomic instructions: Test-And-Set (TAS) - e.g. m68k TSL; Compare-And-Swap (CAS) - e.g. x86 LOCK CMPXCHG, SPARC CASA; Load-Linked / Store-Conditional - e.g. ARM LDREX/STREX, MIPS / POWER / RISC-V LL/SC. They are typically 10-100x slower than plain reads / writes.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
Three common families of hardware atomic instructions: Test-And-Set (TAS) - e.g. m68k TSL; Compare-And-Swap (CAS) - e.g. x86 LOCK CMPXCHG, SPARC CASA; Load-Linked / Store-Conditional - e.g. ARM LDREX/STREX, MIPS / POWER / RISC-V LL/SC. They are typically 10-100x slower than plain reads / writes.

These primitives are needed because plain atomic registers cannot break the \(\Omega(n)\) space lower bound for \(n\)-thread mutex (Burns-Lynch). They also enable lock-free data structures.
Field-by-field Comparison
Field Before After
Text Three common families of hardware atomic instructions: {{c1::Test-And-Set (TAS)}} - e.g. m68k <code>TSL</code>; {{c2::Compare-And-Swap (CAS)}} - e.g. x86 <code>LOCK CMPXCHG</code>, SPARC <code>CASA</code>; {{c3::Load-Linked / Store-Conditional}} - e.g. ARM <code>LDREX/STREX</code>, MIPS / POWER / RISC-V <code>LL/SC</code>. They are typically {{c4::10-100x slower than plain reads / writes}}.
Extra These primitives are needed because plain atomic registers cannot break the \(\Omega(n)\) space lower bound for \(n\)-thread mutex (Burns-Lynch). They also enable lock-free data structures.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations

Note 52: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: qzA@de.%OP
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The Filter Lock satisfies mutual exclusion, deadlock freedom, and starvation freedom, but it is not first-come-first-served: a thread that finished its doorway later may overtake one that finished earlier.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The Filter Lock satisfies mutual exclusion, deadlock freedom, and starvation freedom, but it is not first-come-first-served: a thread that finished its doorway later may overtake one that finished earlier.

Reason: there is no global ticket; progress depends on the level/victim arrays, which are updated continuously. A thread that arrives later can become "victim" at a high level after another thread already passed that level. Bakery is the canonical lock that fixes this.
Field-by-field Comparison
Field Before After
Text The Filter Lock satisfies {{c1::mutual exclusion, deadlock freedom, and starvation freedom}}, but it is {{c2::not first-come-first-served}}: a thread that finished its doorway later may overtake one that finished earlier.
Extra Reason: there is no global ticket; progress depends on the level/<code>victim</code> arrays, which are updated continuously. A thread that arrives later can become "victim" at a high level after another thread already passed that level. Bakery is the canonical lock that fixes this.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 53: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: t&2cm[5&m<
modified

Before

Front

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Beim Las-Vegas-Algorithmus für das Üppige-Auswahl-Problem ist \(T\) die Anzahl an Wiederholungen mit \(\Pr[T > k] = (1/2)^k\).

Was ist \(\mathbb{E}[T]\)? Gib zwei Herleitungen an.

Back

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
Beim Las-Vegas-Algorithmus für das Üppige-Auswahl-Problem ist \(T\) die Anzahl an Wiederholungen mit \(\Pr[T > k] = (1/2)^k\).

Was ist \(\mathbb{E}[T]\)? Gib zwei Herleitungen an.

\[\mathbb{E}[T] = 2.\]Erwartete Laufzeit des Algorithmus: \(O(1)\), unabhängig von \(n\).

Direkter Ansatz (über \(\mathbb{E}[T] = \sum_{k \geq 1} \Pr[T \geq k]\)):
\[\begin{gathered}\mathbb{E}[T] = \sum_{k=1}^{\infty} \Pr[T \geq k] = \sum_{k=1}^{\infty} \Pr[T > k - 1] = \sum_{k=1}^{\infty} \left(\tfrac{1}{2}\right)^{k-1}\\= 1 + \tfrac{1}{2} + \tfrac{1}{4} + \tfrac{1}{8} + \cdots = 2.\end{gathered}\]Indirekter Ansatz (Rekursion via Fallunterscheidung im ersten Versuch):
\[\mathbb{E}[T] = \Pr[x_i = 1] \cdot 1 + \Pr[x_i = 0] \cdot (1 + \mathbb{E}[T']),\]wobei \(T'\) die Anzahl an weiteren Wiederholungen ist, falls die erste fehlschlägt. Da \(T'\) und \(T\) gleichverteilt sind:
\[\mathbb{E}[T] = \tfrac{1}{2} \cdot 1 + \tfrac{1}{2}(1 + \mathbb{E}[T]) = 1 + \tfrac{1}{2}\mathbb{E}[T] \;\Longrightarrow\; \mathbb{E}[T] = 2.\]

After

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
Synchronization actions (SA) in the JMM are: read/write of a volatile variable; lock and unlock of a monitor; the first and last action of a thread (synthetic); actions that start a thread; actions that determine whether a thread has terminated.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
Synchronization actions (SA) in the JMM are: read/write of a volatile variable; lock and unlock of a monitor; the first and last action of a thread (synthetic); actions that start a thread; actions that determine whether a thread has terminated.

SA are the building blocks of the synchronization order (SO). Anything else is an "ordinary" action.
Field-by-field Comparison
Field Before After
Text Beim Las-Vegas-Algorithmus für das Üppige-Auswahl-Problem ist \(T\) die Anzahl an Wiederholungen mit \(\Pr[T &gt; k] = (1/2)^k\).<br><br>Was ist \(\mathbb{E}[T]\)? Gib zwei Herleitungen an. <i>Synchronization actions (SA)</i> in the JMM are: {{c1::read/write of a <code>volatile</code> variable}}; {{c1::lock and unlock of a monitor}}; {{c1::the first and last action of a thread (synthetic)}}; {{c1::actions that start a thread}}; {{c1::actions that determine whether a thread has terminated}}.
Extra \[\mathbb{E}[T] = 2.\]Erwartete Laufzeit des Algorithmus: \(O(1)\), unabhängig von \(n\).<br><br><b>Direkter Ansatz</b> (über \(\mathbb{E}[T] = \sum_{k \geq 1} \Pr[T \geq k]\)):<br>\[\begin{gathered}\mathbb{E}[T] = \sum_{k=1}^{\infty} \Pr[T \geq k] = \sum_{k=1}^{\infty} \Pr[T &gt; k - 1] = \sum_{k=1}^{\infty} \left(\tfrac{1}{2}\right)^{k-1}\\= 1 + \tfrac{1}{2} + \tfrac{1}{4} + \tfrac{1}{8} + \cdots = 2.\end{gathered}\]<b>Indirekter Ansatz</b> (Rekursion via Fallunterscheidung im ersten Versuch):<br>\[\mathbb{E}[T] = \Pr[x_i = 1] \cdot 1 + \Pr[x_i = 0] \cdot (1 + \mathbb{E}[T']),\]wobei \(T'\) die Anzahl an weiteren Wiederholungen ist, falls die erste fehlschlägt. Da \(T'\) und \(T\) gleichverteilt sind:<br>\[\mathbb{E}[T] = \tfrac{1}{2} \cdot 1 + \tfrac{1}{2}(1 + \mathbb{E}[T]) = 1 + \tfrac{1}{2}\mathbb{E}[T] \;\Longrightarrow\; \mathbb{E}[T] = 2.\] SA are the building blocks of the synchronization order (SO). Anything else is an "ordinary" action.
Tags: ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model

Note 54: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: u{Ck&o_a<_
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The mutual-exclusion proof for Peterson's lock proceeds by contradiction: assume \(\mathrm{CS}_P\) and \(\mathrm{CS}_Q\) overlap and WLOG {{c1::\(W_Q(\text{victim}{=}Q) \to W_P(\text{victim}{=}P)\)}}. Then by transitivity along each thread's code, \(P\)'s read of flag[Q] happens after \(Q\) wrote flag[Q]=true, so \(P\) must have read flag[Q] = true, and reading victim must yield \(P\); both conjuncts of the while-condition hold, so \(P\) cannot have entered. Contradiction.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
The mutual-exclusion proof for Peterson's lock proceeds by contradiction: assume \(\mathrm{CS}_P\) and \(\mathrm{CS}_Q\) overlap and WLOG {{c1::\(W_Q(\text{victim}{=}Q) \to W_P(\text{victim}{=}P)\)}}. Then by transitivity along each thread's code, \(P\)'s read of flag[Q] happens after \(Q\) wrote flag[Q]=true, so \(P\) must have read flag[Q] = true, and reading victim must yield \(P\); both conjuncts of the while-condition hold, so \(P\) cannot have entered. Contradiction.

Key levers used: the writes inside one thread are totally ordered by program order (so \(W(\text{flag}) \to W(\text{victim}) \to R(\text{flag}) \to R(\text{victim}) \to CS\)), and atomic registers let us chain the orderings via \(\tau\).
Field-by-field Comparison
Field Before After
Text The mutual-exclusion proof for Peterson's lock proceeds by contradiction: assume \(\mathrm{CS}_P\) and \(\mathrm{CS}_Q\) overlap and WLOG {{c1::\(W_Q(\text{victim}{=}Q) \to W_P(\text{victim}{=}P)\)}}. Then by transitivity along each thread's code, \(P\)'s read of <code>flag[Q]</code> happens after \(Q\) wrote <code>flag[Q]=true</code>, so \(P\) {{c2::must have read <code>flag[Q] = true</code>}}, and reading <code>victim</code> must yield \(P\); both conjuncts of the while-condition hold, so \(P\) cannot have entered. Contradiction.
Extra Key levers used: the writes inside one thread are totally ordered by program order (so \(W(\text{flag}) \to W(\text{victim}) \to R(\text{flag}) \to R(\text{victim}) \to CS\)), and atomic registers let us chain the orderings via \(\tau\).
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 55: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: w(*5>B[{%<
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
In the 2nd-try mutex
p2: wantp = true;
p3: while(wantq);
p4: critical section
the order "first set my flag, then wait" satisfies mutual exclusion but deadlocks, because {{c2::both threads can set their flags before either reads the other's, leaving both stuck spinning at state \((p3, q3, \text{true}, \text{true})\)}}.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
In the 2nd-try mutex
p2: wantp = true;
p3: while(wantq);
p4: critical section
the order "first set my flag, then wait" satisfies mutual exclusion but deadlocks, because {{c2::both threads can set their flags before either reads the other's, leaving both stuck spinning at state \((p3, q3, \text{true}, \text{true})\)}}.

Trade-off compared with the 1st try: ME is now ensured (the flag is set before the check), but the symmetric race produces a stable bad state. No process is making progress, yet none is starving in the technical sense - it is a true deadlock.
Field-by-field Comparison
Field Before After
Text In the 2nd-try mutex <pre>p2: wantp = true; p3: while(wantq); p4: critical section</pre> the order "first set my flag, then wait" satisfies mutual exclusion but {{c1::deadlocks}}, because {{c2::both threads can set their flags before either reads the other's, leaving both stuck spinning at state \((p3, q3, \text{true}, \text{true})\)}}.
Extra Trade-off compared with the 1st try: ME is now ensured (the flag is set before the check), but the symmetric race produces a stable bad state. No process is making progress, yet none is starving in the technical sense - it is a true deadlock.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 56: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: z|ZzS?_M6b
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
An interval \((a_0, a_1)\) is a pair of events with \(a_0 \to a_1\). For two intervals \(I_A = (a_0, a_1)\) and \(I_B = (b_0, b_1)\), we write \(I_A \to I_B\) iff \(a_1 \to b_0\). If neither \(I_A \to I_B\) nor \(I_B \to I_A\) holds, they are concurrent.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
An interval \((a_0, a_1)\) is a pair of events with \(a_0 \to a_1\). For two intervals \(I_A = (a_0, a_1)\) and \(I_B = (b_0, b_1)\), we write \(I_A \to I_B\) iff \(a_1 \to b_0\). If neither \(I_A \to I_B\) nor \(I_B \to I_A\) holds, they are concurrent.

So intervals only "precede" each other when one ends entirely before the other begins. Overlapping intervals are concurrent. Unlike the precedence on events, the relation on intervals is not a total order.
Field-by-field Comparison
Field Before After
Text An <i>interval</i> \((a_0, a_1)\) is a pair of events with \(a_0 \to a_1\). For two intervals \(I_A = (a_0, a_1)\) and \(I_B = (b_0, b_1)\), we write \(I_A \to I_B\) iff {{c1::\(a_1 \to b_0\)}}. If neither \(I_A \to I_B\) nor \(I_B \to I_A\) holds, they are {{c2::concurrent}}.
Extra So intervals only "precede" each other when one ends entirely before the other begins. Overlapping intervals are concurrent. Unlike the precedence on events, the relation on intervals is <i>not</i> a total order.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 57: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: |2|f7e7DL}
modified

Before

Front

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Eine Zufallsquelle liefert nach vorgegebener Verteilung zufällige unabhängige Zufallsbits oder -zahlen. Zwei Typen:
  • Physikalische Zufallsgeneratoren: Lottozahlen, Geigerzähler, Rauschen, Quantenexperimente.
  • Deterministische (Pseudo-)Zufallsgeneratoren: liefern, basierend auf einem Startwert (seed) \(s\), eine Folge \(s \to r_0(s), r_1(s), r_2(s), \ldots\) Das ermöglicht Reproduzierbarkeit, wenn man sich den Startwert merkt.

Back

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Eine Zufallsquelle liefert nach vorgegebener Verteilung zufällige unabhängige Zufallsbits oder -zahlen. Zwei Typen:
  • Physikalische Zufallsgeneratoren: Lottozahlen, Geigerzähler, Rauschen, Quantenexperimente.
  • Deterministische (Pseudo-)Zufallsgeneratoren: liefern, basierend auf einem Startwert (seed) \(s\), eine Folge \(s \to r_0(s), r_1(s), r_2(s), \ldots\) Das ermöglicht Reproduzierbarkeit, wenn man sich den Startwert merkt.

In der Analyse geht man von idealen Zufallsgeneratoren aus. Bei der Anwendung muss man aber berücksichtigen, dass Pseudo-Zufallsgeneratoren das nur näherungsweise leisten.

After

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Number of interleavings for 2 threads with \(k\) statements each: {{c1::\(\binom{2k}{k} = O\!\left(\dfrac{4^k}{\sqrt{2k} }\right) \)}}.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Number of interleavings for 2 threads with \(k\) statements each: {{c1::\(\binom{2k}{k} = O\!\left(\dfrac{4^k}{\sqrt{2k} }\right) \)}}.

Derivation: the merged trace has length \(2k\). Once we fix which \(k\) of the \(2k\) positions belong to thread 1, the interleaving is determined (sampling without replacement).
Field-by-field Comparison
Field Before After
Text Eine <b>Zufallsquelle</b> liefert nach vorgegebener Verteilung zufällige unabhängige Zufallsbits oder -zahlen. Zwei Typen:<ul><li><b>Physikalische Zufallsgeneratoren</b>: {{c1::Lottozahlen, Geigerzähler, Rauschen, Quantenexperimente::Beispiele?}}.</li><li><b>Deterministische (Pseudo-)Zufallsgeneratoren</b>: liefern, basierend auf einem {{c2::Startwert (seed)}} \(s\), eine Folge \(s \to r_0(s), r_1(s), r_2(s), \ldots\) Das ermöglicht {{c3::Reproduzierbarkeit, wenn man sich den Startwert merkt}}.</li></ul> Number of interleavings for 2 threads with \(k\) statements each: {{c1::\(\binom{2k}{k} = O\!\left(\dfrac{4^k}{\sqrt{2k} }\right) \)}}.
Extra In der Analyse geht man von idealen Zufallsgeneratoren aus. Bei der Anwendung muss man aber berücksichtigen, dass Pseudo-Zufallsgeneratoren das nur näherungsweise leisten. Derivation: the merged trace has length \(2k\). Once we fix which \(k\) of the \(2k\) positions belong to thread 1, the interleaving is determined (sampling without replacement).
Tags: ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering

Note 58: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: ~d57MCW=Q9
modified

Before

Front

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Bei der Fehlerreduktion eines Las-Vegas-Algorithmus mit Erfolgswahrscheinlichkeit \(\geq \varepsilon\) genügen \(N = \lceil \varepsilon^{-1} \ln \delta^{-1} \rceil\) Wiederholungen, um \(\Pr[\text{korrekt}] \geq 1 - \delta\) zu erreichen.

Wie skaliert \(N\) in \(\delta\)? Was bedeutet das praktisch?

Back

ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
Bei der Fehlerreduktion eines Las-Vegas-Algorithmus mit Erfolgswahrscheinlichkeit \(\geq \varepsilon\) genügen \(N = \lceil \varepsilon^{-1} \ln \delta^{-1} \rceil\) Wiederholungen, um \(\Pr[\text{korrekt}] \geq 1 - \delta\) zu erreichen.

Wie skaliert \(N\) in \(\delta\)? Was bedeutet das praktisch?

\(N\) skaliert logarithmisch in \(\delta^{-1}\).

Praktisch: Um die Fehlerwahrscheinlichkeit um einen konstanten Faktor (z.B. von \(10^{-2}\) auf \(10^{-3}\)) zu reduzieren, sind nur konstant viele zusätzliche Iterationen nötig.

Beispiel \(\varepsilon = 0{,}25\): \(\delta = 10^{-1} \Rightarrow N = 10\), \(\delta = 10^{-2} \Rightarrow N = 19\), \(\delta = 10^{-3} \Rightarrow N = 28\), \(\ldots\) (jeweils \(+9\)).

After

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
For the home-made rendezvous
void wait()  { x = 1; while(x==1); }
void arrive(){ x = 2; }
GCC with optimisations compiles the loop into an infinite loop (jmp test without ever reloading x).

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
For the home-made rendezvous
void wait()  { x = 1; while(x==1); }
void arrive(){ x = 2; }
GCC with optimisations compiles the loop into an infinite loop (jmp test without ever reloading x).

Reason: the compiler hoists the read of x out of the loop (register hoisting) because, from its sequential view, x is not modified inside the loop. Writes from other threads are invisible to it.

Fix: volatile or a lock.
Field-by-field Comparison
Field Before After
Text Bei der Fehlerreduktion eines Las-Vegas-Algorithmus mit Erfolgswahrscheinlichkeit \(\geq \varepsilon\) genügen \(N = \lceil \varepsilon^{-1} \ln \delta^{-1} \rceil\) Wiederholungen, um \(\Pr[\text{korrekt}] \geq 1 - \delta\) zu erreichen.<br><br>Wie skaliert \(N\) in \(\delta\)? Was bedeutet das praktisch? For the home-made rendezvous <pre>void wait() { x = 1; while(x==1); } void arrive(){ x = 2; }</pre> GCC with optimisations compiles the loop into {{c1::an infinite loop (<code>jmp test</code> without ever reloading <code>x</code>)}}.
Extra \(N\) skaliert <b>logarithmisch</b> in \(\delta^{-1}\).<br><br>Praktisch: Um die Fehlerwahrscheinlichkeit um einen <b>konstanten Faktor</b> (z.B. von \(10^{-2}\) auf \(10^{-3}\)) zu reduzieren, sind nur <b>konstant viele zusätzliche Iterationen</b> nötig.<br><br>Beispiel \(\varepsilon = 0{,}25\): \(\delta = 10^{-1} \Rightarrow N = 10\), \(\delta = 10^{-2} \Rightarrow N = 19\), \(\delta = 10^{-3} \Rightarrow N = 28\), \(\ldots\) (jeweils \(+9\)). Reason: the compiler hoists the read of <code>x</code> out of the loop (register hoisting) because, from its sequential view, <code>x</code> is not modified inside the loop. Writes from other threads are invisible to it.<br><br>Fix: <code>volatile</code> or a lock.
Tags: ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering

Note 59: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: ~i|&N8&P3M
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Realistic example of incorrect code:
boolean stop = false;
void f() { while(!stop) { /* draw */ } }
void g() { stop = didUserQuit(); }
There is no guarantee that thread 1 will ever stop - although it will "likely work in practice".

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
Realistic example of incorrect code:
boolean stop = false;
void f() { while(!stop) { /* draw */ } }
void g() { stop = didUserQuit(); }
There is no guarantee that thread 1 will ever stop - although it will "likely work in practice".

Reason: the compiler may hoist stop into a register and never reload it inside the loop. Additionally, thread 2's write may sit in its local cache.

Fix: volatile boolean stop or AtomicBoolean.
Field-by-field Comparison
Field Before After
Text Realistic example of incorrect code: <pre>boolean stop = false; void f() { while(!stop) { /* draw */ } } void g() { stop = didUserQuit(); }</pre> There is {{c1::no guarantee that thread 1 will ever stop}} - although it {{c2::will "likely work in practice"}}.
Extra Reason: the compiler may hoist <code>stop</code> into a register and never reload it inside the loop. Additionally, thread 2's write may sit in its local cache.<br><br>Fix: <code>volatile boolean stop</code> or <code>AtomicBoolean</code>.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
↑ Top