Anki Deck Changes

Note 1: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: #7KfXN=Mrp

added

Note did not exist

New Note

Front

A spinlock built on TAS:

Acquire: while(!TAS(lock));
Release: lock = 0;

A spinlock built on CAS:

Acquire: while(CAS(lock, 0, 1) != 0);
Release: CAS(lock, 1, 0);  // result ignored

Both use $O(1)$ shared state regardless of $n$, in contrast to Filter / Bakery which need $\Theta(n)$.

Back

A spinlock built on TAS:

Acquire: while(!TAS(lock));
Release: lock = 0;

A spinlock built on CAS:

Acquire: while(CAS(lock, 0, 1) != 0);
Release: CAS(lock, 1, 0);  // result ignored

Both use $O(1)$ shared state regardless of $n$, in contrast to Filter / Bakery which need $\Theta(n)$.

The release on CAS could just write 0 directly (faster); using CAS(lock,1,0) just illustrates that CAS can serve both phases. The lock is correct because the entry is a single atomic RMW that wins for at most one thread.

Field-by-field Comparison

Field	Before	After
Text		A spinlock built on TAS: <pre>Acquire: while(!TAS(lock)); Release: lock = 0;</pre> A spinlock built on CAS: <pre>Acquire: while(CAS(lock, 0, 1) != 0); Release: CAS(lock, 1, 0); // result ignored</pre> Both use {{c1::$O(1)$ shared state regardless of $n$}}, in contrast to Filter / Bakery which need $\Theta(n)$.
Extra		The release on CAS could just write 0 directly (faster); using <code>CAS(lock,1,0)</code> just illustrates that CAS can serve both phases. The lock is correct because the entry is a single atomic RMW that wins for at most one thread.

Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations

Note 2: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: #P|j?^In$#

added

Note did not exist

New Note

Front

Dekker's algorithm fixes the previous mutex tries by combining the flag-based approach (try 2) with a turn variable used only for conflict resolution (try 3): when both flags are set, the process whose turn it is not temporarily lowers its flag and waits, then tries again.

Back

Properties: mutual exclusion, deadlock freedom, starvation freedom. But it is verbose: the code needs an inner conditional and a second wait loop. Peterson's algorithm achieves the same with a much shorter pre-protocol.

Field-by-field Comparison

Field	Before	After
Text		{{c1::Dekker's algorithm}} fixes the previous mutex tries by combining the {{c2::flag-based approach (try 2)}} with a {{c2::<code>turn</code> variable used only for conflict resolution (try 3)}}: when both flags are set, the process whose turn it is <i>not</i> temporarily lowers its flag and waits, then tries again.
Extra		Properties: mutual exclusion, deadlock freedom, starvation freedom. But it is verbose: the code needs an inner conditional and a second wait loop. Peterson's algorithm achieves the same with a much shorter pre-protocol.

Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 3: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: #vrtD+3u4N

modified

Before

Front

Fehlerreduktion Las-Vegas
Sei $A$ ein randomisierter Algorithmus, der nie eine falsche Antwort gibt, aber zuweilen '???' ausgibt, mit $\Pr[A(I) \text{ korrekt}] \geq \varepsilon$.

Sei $A_\delta$ der Algorithmus, der $A$ solange aufruft, bis entweder ein Wert verschieden von '???' ausgegeben wird (und diesen Wert dann ebenfalls ausgibt) oder bis
\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\] mal '???' ausgegeben wurde (und dann '???' ausgibt). Dann gilt:
\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]

Back

Beweisskizze: Pr[$A_\delta$ gibt '???' aus] $\leq (1-\varepsilon)^N \leq e^{-\varepsilon N} \leq \delta$ für $N \geq \varepsilon^{-1} \ln \delta^{-1}$, via $1 - x \leq e^{-x}$.

After

Front

Modern compilers give no guarantee of a global ordering of memory accesses; individual accesses may even be optimised away entirely. Typical optimisations that cause reordering are dead code elimination, register hoisting, and locality optimisations.

Back

Background: such optimisations are essential for performance, but they create huge potential for bugs whenever assumptions about a global memory order are made.

Field-by-field Comparison

Field	Before	After
Text	<b>Fehlerreduktion Las-Vegas</b> <br>Sei $A$ ein randomisierter Algorithmus, der nie eine falsche Antwort gibt, aber zuweilen '???' ausgibt, mit $\Pr[A(I) \text{ korrekt}] \geq \varepsilon$.<br><br>Sei $A_\delta$ der Algorithmus, der $A$ solange aufruft, bis entweder ein Wert verschieden von '???' ausgegeben wird (und diesen Wert dann ebenfalls ausgibt) oder bis <br>\[N = {{c1::~~\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\] mal '???' ausgegeben wurde (und dann '???' ausgibt). Dann gilt:<br>\[\Pr[A_\delta(I) \text{ korrekt}] \geq {{c2::1 - \delta~~}}.\]	Modern compilers give <b>no</b> guarantee of a global ordering of memory accesses; individual accesses may even be {{c1::optimised away entirely}}. Typical optimisations that cause reordering are {{c2::dead code elimination, register hoisting, and locality optimisations}}.
Extra	B~~eweisskizze: Pr[$A_\delta$ gibt '???' aus] $\leq (1-\varepsilon)^N \leq e^{-\varepsilon N} \leq \delta$~~ für ~~$N \geq \varepsilon^{-1} \ln \delta^{-1}$, via $1 - x \leq e^{-x}$~~.	Background: such optimisations are essential for performance, but they create huge potential for bugs whenever assumptions about a global memory order are made.

Tags: ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering

Note 4: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: )Wj64y)bWf

added

Note did not exist

New Note

Front

Peterson's lock (for two processes $P$ and $Q$) uses two shared variables: flag[1..2] ("I am interested") and victim ("who yields if both want in"). The pre-protocol for thread id is

flag[id] = true;
victim = id;
while(flag[1-id] && victim == id);

Back

Peterson's lock (for two processes $P$ and $Q$) uses two shared variables: flag[1..2] ("I am interested") and victim ("who yields if both want in"). The pre-protocol for thread id is

flag[id] = true;
victim = id;
while(flag[1-id] && victim == id);

Intuition: a thread can enter as soon as either the other thread has dropped its flag (no contention) or the other thread has overwritten victim (so it became the new victim). Far simpler than Dekker, and provably ME + starvation-free.

Field-by-field Comparison

Field	Before	After
Text		{{c1::Peterson's lock}} (for two processes $P$ and $Q$) uses two shared variables: {{c2::<code>flag[1..2]</code> ("I am interested")}} and {{c3::<code>victim</code> ("who yields if both want in")}}. The pre-protocol for thread <code>id</code> is <pre>flag[id] = true; victim = id; while(flag[1-id] && victim == id);</pre>
Extra		Intuition: a thread can enter as soon as either the other thread has dropped its flag (no contention) or the other thread has overwritten <code>victim</code> (so it became the new victim). Far simpler than Dekker, and provably ME + starvation-free.

Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 5: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: */_gG7JffH

added

Note did not exist

New Note

Front

Synchronizes-With (SW) pairs the specific actions that "see" each other: a volatile-write to x SW a subsequent read of x (subsequent in SO). Happens-Before (HB) is the transitive closure of PO and SW.

Back

Unlike PO or SO, SW is not a generic ordering; it specifically pairs the actions between which a synchronisation effect takes place.

Field-by-field Comparison

Field	Before	After
Text		<i>Synchronizes-With (SW)</i> pairs the specific actions that "see" each other: a <code>volatile</code>-write to <code>x</code> SW {{c1::a subsequent read of <code>x</code> (subsequent in SO)}}. <i>Happens-Before (HB)</i> is {{c2::the transitive closure of PO and SW}}.
Extra		Unlike PO or SO, SW is not a generic ordering; it specifically pairs the actions between which a synchronisation effect takes place.

Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model

Note 6: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: +YN}l=~9iw

added

Note did not exist

New Note

Front

Mutual exclusion built only from atomic registers (Filter, Bakery, Peterson) is not used in practice for four reasons: space lower bound is linear in the maximum number of threads; correctness relies on no memory reordering, which requires expensive memory barriers in real hardware; these algorithms are not wait-free; the "primitive" assumption is unrealistic - hardware provides faster combined RMW instructions instead.

Back

The way out: extend the model. Modern multiprocessor architectures provide special instructions for atomically reading and writing at once (TAS, CAS, LL/SC), enabling $O(1)$ space mutexes with practical performance.

Field-by-field Comparison

Field	Before	After
Text		Mutual exclusion built only from atomic registers (Filter, Bakery, Peterson) is not used in practice for four reasons: {{c1::space lower bound is linear in the maximum number of threads}}; {{c1::correctness relies on no memory reordering, which requires expensive memory barriers in real hardware}}; {{c1::these algorithms are not wait-free}}; {{c1::the "primitive" assumption is unrealistic - hardware provides faster combined RMW instructions instead}}.
Extra		The way out: extend the model. Modern multiprocessor architectures provide special instructions for atomically reading and writing at once (TAS, CAS, LL/SC), enabling $O(1)$ space mutexes with practical performance.

Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 7: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: -T60qL&2-|

added

Note did not exist

New Note

Front

The three conditions for a correct critical-section solution according to Ben-Ari are mutual exclusion (statements from CSes of different processes must not interleave), freedom from deadlock (if some processes are trying to enter a CS, at least one must eventually succeed), and freedom from starvation (any process trying to enter its CS must eventually succeed).

Back

Hierarchy: starvation freedom $\Rightarrow$ deadlock freedom (if every individual gets through, then certainly some individual gets through), but not the converse.

Field-by-field Comparison

Field	Before	After
Text		The three conditions for a correct critical-section solution according to Ben-Ari are {{c1::mutual exclusion (statements from CSes of different processes must not interleave)}}, {{c2::freedom from deadlock (if some processes are trying to enter a CS, at least one must eventually succeed)}}, and {{c3::freedom from starvation (any process trying to enter its CS must eventually succeed)}}.
Extra		Hierarchy: starvation freedom $\Rightarrow$ deadlock freedom (if every individual gets through, then certainly <i>some</i> individual gets through), but not the converse.

Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 8: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: /,nT@jP#V6

modified

Before

Front

Randomisierter Algorithmus für Optimierungsprobleme
Sei $\varepsilon > 0$ und $A$ ein randomisierter Algorithmus, der immer eine zulässige Lösung liefert, mit\[\Pr[A(I) \geq f(I)] \geq \varepsilon\]für eine Zielschwelle $f(I)$. Sei $A_\delta$ der Algorithmus, der\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\]Aufrufe von $A$ macht und die beste der erhaltenen Antworten ausgibt. Dann gilt:\[\Pr[A_\delta(I) \geq f(I)] \geq 1 - \delta.\]

Back

Selbe Iterationsschranke wie bei Las-Vegas und MC mit einseitigem Fehler. Die Idee ist analog: ein einziger Aufruf, der die Schwelle überschreitet, genügt, da das Maximum mehrerer Aufrufe genommen wird.

Typischerweise ist $f(I) = \mathbb{E}[A(I)]$, d.h. man möchte mit hoher Wahrscheinlichkeit mindestens den Erwartungswert erreichen.

After

Front

A memory model (e.g. of a programming language like Java) is a contract that gives minimal guarantees on the effects of memory operations.

Back

A memory model (e.g. of a programming language like Java) is a contract that gives minimal guarantees on the effects of memory operations.

Goals:
- leaves room for hardware and compiler optimisations
- but provides guidelines that allow correct multi-threaded programs to be written.

Without such a model, the actual behaviour of threads communicating via shared memory depends on hardware, runtime, and language and is undefined.

Field-by-field Comparison

Field	Before	After
Text	<b>Randomisierter Algorithmus für Optimierungsprobleme</b> <br>Sei $\varepsilon > 0$ und $A$ ein randomisierter Algorithmus, der immer eine zulässige Lösung liefert, mit\[\Pr[A(I) \geq f(I)] \geq \varepsilon\]für eine Zielschwelle $f(I)$. Sei $A_\delta$ der Algorithmus, der\[N = {{c1::~~\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\]Aufrufe von $A$ macht und {{c2::die beste der erhaltenen Antworten}} ausgibt. Dann gilt:\[\Pr[A_\delta(I) \geq f(I)] \geq 1 - \delta.\]~~	A {{c1::memory model}} (e.g. of a programming language like Java) is a {{c2::contract that gives minimal guarantees on the effects of memory operations}}.
Extra	Selbe Iterationsschranke wie bei Las-Vegas und MC mit einseitigem Fehler. Die Idee ist analog: ein einziger Aufruf, der die Schwelle überschreitet, genügt, da das Maximum mehrerer Aufrufe genommen wird.<br><br>Typischerweise ist $f(I) = \mathbb{E}[A(I)]$, d.h. man möchte mit hoher Wahrscheinlichkeit mindestens ~~den Erwartungswert erreichen~~.	Goals:<br>- leaves room for hardware and compiler optimisations<br>- but provides guidelines that allow correct multi-threaded programs to be written.<br><br>Without such a model, the actual behaviour of threads communicating via shared memory depends on hardware, runtime, and language and is undefined.

Note 9: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: /_rk8FS{Ft

added

Note did not exist

New Note

Front

When a state-space diagram is too large, only state transitions of the protocol matter; states with the same effect can be collapsed (e.g. $p1\equiv p2$ in the pre-CS region, $p4\equiv p5$ in the post-CS region). The forbidden situation then becomes simply both processes in the collapsed CS state.

Back

Collapsing keeps the proof structure intact (mutual exclusion / deadlock / starvation are checked on the reduced graph) while making it tractable to draw and reason about by hand.

Field-by-field Comparison

Field	Before	After
Text		When a state-space diagram is too large, only state transitions of the protocol matter; states with the same effect can be {{c1::collapsed (e.g. $p1\equiv p2$ in the pre-CS region, $p4\equiv p5$ in the post-CS region)}}. The forbidden situation then becomes simply {{c2::both processes in the collapsed CS state}}.
Extra		Collapsing keeps the proof structure intact (mutual exclusion / deadlock / starvation are checked on the reduced graph) while making it tractable to draw and reason about by hand.

Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers

Note 10: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID:

09qe2{A(


            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
  
  
    Canonical deadlock example: synchronized void transferTo(int amt, BankAccount a) {
  this.withdraw(amt);
  a.deposit(amt);  // acquires a's lock
}
 Two concurrent calls x.transferTo(1, y) and y.transferTo(1, x) deadlock because Thread 1 holds x's lock and waits for y's, while Thread 2 holds y's and waits for x's.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
  
  
    Canonical deadlock example: synchronized void transferTo(int amt, BankAccount a) {
  this.withdraw(amt);
  a.deposit(amt);  // acquires a's lock
}
 Two concurrent calls x.transferTo(1, y) and y.transferTo(1, x) deadlock because Thread 1 holds x's lock and waits for y's, while Thread 2 holds y's and waits for x's.
    
    
    
    The bug is intrinsic to the design: transferTo acquires two locks - the receiver's via this on entry, the target's via the synchronised a.deposit(). With no global ordering on the second acquisition, a cycle can form. Three classical fixes follow: shrink the CS, single global lock, or impose a global lock order.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            Canonical deadlock example: <pre>synchronized void transferTo(int amt, BankAccount a) {
  this.withdraw(amt);
  a.deposit(amt);  // acquires a's lock
}</pre> Two concurrent calls <code>x.transferTo(1, y)</code> and <code>y.transferTo(1, x)</code> deadlock because {{c1::Thread 1 holds <code>x</code>'s lock and waits for <code>y</code>'s, while Thread 2 holds <code>y</code>'s and waits for <code>x</code>'s}}.
                        
                        
                        
                            Extra
                            
                            The bug is intrinsic to the design: <code>transferTo</code> acquires two locks - the receiver's via <code>this</code> on entry, the target's via the synchronised <code>a.deposit()</code>. With no global ordering on the second acquisition, a cycle can form. Three classical fixes follow: shrink the CS, single global lock, or impose a global lock order.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance

Field	Before	After
Text		Canonical deadlock example: <pre>synchronized void transferTo(int amt, BankAccount a) { this.withdraw(amt); a.deposit(amt); // acquires a's lock }</pre> Two concurrent calls <code>x.transferTo(1, y)</code> and <code>y.transferTo(1, x)</code> deadlock because {{c1::Thread 1 holds <code>x</code>'s lock and waits for <code>y</code>'s, while Thread 2 holds <code>y</code>'s and waits for <code>x</code>'s}}.
Extra		The bug is intrinsic to the design: <code>transferTo</code> acquires two locks - the receiver's via <code>this</code> on entry, the target's via the synchronised <code>a.deposit()</code>. With no global ordering on the second acquisition, a cycle can form. Three classical fixes follow: shrink the CS, single global lock, or impose a global lock order.


        
        
            Note 11: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: 0~]{jI&/:2
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    The Filter Lock extends Peterson's lock from 2 to \(n\) processes by introducing \(n-1\) levels: each thread climbs from level 1 up to level \(n-1\), and at each level \(\ell\) it sets level[me] = \ell and victim[\ell] = me, then waits while \exists k \neq me: level[k] >= \ell && victim[\ell] == me
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    The Filter Lock extends Peterson's lock from 2 to \(n\) processes by introducing \(n-1\) levels: each thread climbs from level 1 up to level \(n-1\), and at each level \(\ell\) it sets level[me] = \ell and victim[\ell] = me, then waits while \exists k \neq me: level[k] >= \ell && victim[\ell] == me
    
    
    
    Intuition: at each level, Peterson's mechanism filters out at most one thread (the current victim, if anyone else is still around). After traversing all \(n-1\) levels, at most one thread can be left, which then enters the CS.

unlock just resets level[me] = 0.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            The {{c1::Filter Lock}} extends Peterson's lock from 2 to \(n\) processes by introducing {{c2::\(n-1\) levels}}: each thread climbs from level 1 up to level \(n-1\), and at each level \(\ell\) it sets <code>level[me] = \ell</code> and <code>victim[\ell] = me</code>, then waits while <pre>\exists k \neq me: level[k] &gt;= \ell &amp;&amp; victim[\ell] == me</pre>
                        
                        
                        
                            Extra
                            
                            Intuition: at each level, Peterson's mechanism filters out at most one thread (the current victim, if anyone else is still around). After traversing all \(n-1\) levels, at most one thread can be left, which then enters the CS.<br><br><code>unlock</code> just resets <code>level[me] = 0</code>.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        
        
        
            Note 12: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: 1*Jyz+ZJLF
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    When constructing mutex algorithms from atomic registers, three assumptions are made: atomic reads and writes of primitive-type variables; no reorderings of read/write sequences (not true in practice; assumed here for simplicity); threads that enter a critical section eventually leave it. About progress outside the critical section, no assumption is made.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    When constructing mutex algorithms from atomic registers, three assumptions are made: atomic reads and writes of primitive-type variables; no reorderings of read/write sequences (not true in practice; assumed here for simplicity); threads that enter a critical section eventually leave it. About progress outside the critical section, no assumption is made.
    
    
    
    Threads may therefore stall, die, or pause arbitrarily long outside a CS. This matters for arguments like starvation freedom: an algorithm must not rely on the "other" thread continuing to make progress.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            When constructing mutex algorithms from atomic registers, three assumptions are made: {{c1::atomic reads and writes of primitive-type variables}}; {{c1::no reorderings of read/write sequences (not true in practice; assumed here for simplicity)}}; {{c1::threads that enter a critical section eventually leave it}}. About progress {{c2::outside the critical section, no assumption is made}}.
                        
                        
                        
                            Extra
                            
                            Threads may therefore stall, die, or pause arbitrarily long outside a CS. This matters for arguments like starvation freedom: an algorithm must not rely on the "other" thread continuing to make progress.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        
        
        
            Note 13: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: 1eIQehPP;^
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    A state-space diagram is a proof technique for mutex algorithms: nodes are {{c2::global states \([p, q, \text{shared vars}]\) consisting of program counters and all shared variables}}, edges are single atomic steps of one of the threads.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    A state-space diagram is a proof technique for mutex algorithms: nodes are {{c2::global states \([p, q, \text{shared vars}]\) consisting of program counters and all shared variables}}, edges are single atomic steps of one of the threads.
    
    
    
    A violation of mutual exclusion shows up whenever a state with both program counters in the CS is reachable (e.g. \((p4, q4, \dots)\)). Advantage over enumerating interleavings: it terminates even with loops, and it spots deadlocks/livelocks as inescapable subgraphs.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            A {{c1::state-space diagram}} is a proof technique for mutex algorithms: nodes are {{c2::global states \([p, q, \text{shared vars}]\) consisting of program counters and all shared variables}}, edges are {{c3::single atomic steps of one of the threads}}.
                        
                        
                        
                            Extra
                            
                            A violation of mutual exclusion shows up whenever a state with <i>both</i> program counters in the CS is reachable (e.g. \((p4, q4, \dots)\)). Advantage over enumerating interleavings: it terminates even with loops, and it spots deadlocks/livelocks as inescapable subgraphs.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        
        
        
            Note 14: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: 4%UUWqRVem
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
  
  
    When applying resource ordering and the objects do not have a natural total order, generate one: each instance is assigned a unique final long index from a class-level AtomicLong counter via counter.incrementAndGet() in the constructor. Locks are then always acquired in ascending order of index.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
  
  
    When applying resource ordering and the objects do not have a natural total order, generate one: each instance is assigned a unique final long index from a class-level AtomicLong counter via counter.incrementAndGet() in the constructor. Locks are then always acquired in ascending order of index.
    
    
    
    This is the standard pattern used inside java.util.concurrent. AtomicLong guarantees uniqueness and monotonicity even under concurrent construction, and a 64-bit counter never realistically overflows.

Using System.identityHashCode is tempting but unsafe - hashes can collide.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            When applying resource ordering and the objects do not have a natural total order, generate one: each instance is assigned {{c1::a unique <code>final long</code> index from a class-level <code>AtomicLong counter</code>}} via <code>counter.incrementAndGet()</code> in the constructor. Locks are then always acquired in {{c2::ascending order of <code>index</code>}}.
                        
                        
                        
                            Extra
                            
                            This is the standard pattern used inside <code>java.util.concurrent</code>. <code>AtomicLong</code> guarantees uniqueness and monotonicity even under concurrent construction, and a 64-bit counter never realistically overflows.<br><br>Using <code>System.identityHashCode</code> is tempting but unsafe - hashes can collide.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
                    
                
            
            
        
        
        
            Note 15: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: 7/+$V!A5-^
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
  
  
    java.util.concurrent.atomic.AtomicBoolean exposes two key RMW operations on top of plain get / set: compareAndSet(expect, update), which atomically writes update iff the current value equals expect and returns true on success; and getAndSet(newValue), which atomically writes newValue and returns the previous value.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
  
  
    java.util.concurrent.atomic.AtomicBoolean exposes two key RMW operations on top of plain get / set: compareAndSet(expect, update), which atomically writes update iff the current value equals expect and returns true on success; and getAndSet(newValue), which atomically writes newValue and returns the previous value.
    
    
    
    compareAndSet ≅ CAS, getAndSet ≅ TAS. Note: even though they are called "atomic", the JLS does not guarantee these map to a single hardware instruction; they may be implemented via internal locks. They are guaranteed atomic, not guaranteed lock-free.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            <code>java.util.concurrent.atomic.AtomicBoolean</code> exposes two key RMW operations on top of plain <code>get</code> / <code>set</code>: {{c1::<code>compareAndSet(expect, update)</code>}}, which atomically writes <code>update</code> iff the current value equals <code>expect</code> and {{c2::returns <code>true</code> on success}}; and {{c3::<code>getAndSet(newValue)</code>}}, which {{c4::atomically writes <code>newValue</code> and returns the previous value}}.
                        
                        
                        
                            Extra
                            
                            <code>compareAndSet</code> ≅ CAS, <code>getAndSet</code> ≅ TAS. Note: even though they are called "atomic", the JLS does not guarantee these map to a single hardware instruction; they may be implemented via internal locks. They are guaranteed atomic, not guaranteed lock-free.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
                    
                
            
            
        
        
        
            Note 16: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: 7c[p[vx$A1
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    An atomic register \(r\) (a shared memory object, not a CPU register) supports r.read() and r.write(v) with four properties: each invocation \(J\) takes effect at a single point in time \(\tau(J)\); \(\tau(J)\) lies between the start and end of \(J\); two operations on the same register have distinct effect times \(\tau(J) \neq \tau(K)\); an invocation of r.read() returns the value written by the r.write(v) invocation with the closest preceding effect time.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    An atomic register \(r\) (a shared memory object, not a CPU register) supports r.read() and r.write(v) with four properties: each invocation \(J\) takes effect at a single point in time \(\tau(J)\); \(\tau(J)\) lies between the start and end of \(J\); two operations on the same register have distinct effect times \(\tau(J) \neq \tau(K)\); an invocation of r.read() returns the value written by the r.write(v) invocation with the closest preceding effect time.
    
    
    
    These four conditions justify treating each operation as if it happened instantaneously at \(\tau(J)\), even though physically it spans an interval. This is the foundation for proofs about Peterson, Filter, Bakery, etc.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            An <i>atomic register</i> \(r\) (a shared memory object, not a CPU register) supports <code>r.read()</code> and <code>r.write(v)</code> with four properties: each invocation \(J\) {{c1::takes effect at a single point in time \(\tau(J)\)}}; \(\tau(J)\) {{c1::lies between the start and end of \(J\)}}; two operations on the same register {{c1::have distinct effect times \(\tau(J) \neq \tau(K)\)}}; an invocation of <code>r.read()</code> returns {{c1::the value written by the <code>r.write(v)</code> invocation with the closest preceding effect time}}.
                        
                        
                        
                            Extra
                            
                            These four conditions justify treating each operation as if it happened instantaneously at \(\tau(J)\), even though physically it spans an interval. This is the foundation for proofs about Peterson, Filter, Bakery, etc.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        
        
        
            Note 17: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: 7gI{,sQ;#o
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    The 3rd-try mutex (strict alternation) p2: while(turn != 1);
p3: critical section
p4: turn = 2
 satisfies mutual exclusion but allows starvation, because a process stuck in its non-CS never resets turn, so the other process is forced to wait forever.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    The 3rd-try mutex (strict alternation) p2: while(turn != 1);
p3: critical section
p4: turn = 2
 satisfies mutual exclusion but allows starvation, because a process stuck in its non-CS never resets turn, so the other process is forced to wait forever.
    
    
    
    This is exactly why we make no assumption about progress outside the CS: a correct mutex must not rely on the other thread continuing to take turns.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            The 3rd-try mutex (strict alternation) <pre>p2: while(turn != 1);
p3: critical section
p4: turn = 2</pre> satisfies mutual exclusion but allows {{c1::starvation}}, because {{c2::a process stuck in its non-CS never resets <code>turn</code>, so the other process is forced to wait forever}}.
                        
                        
                        
                            Extra
                            
                            This is exactly why we make <i>no</i> assumption about progress outside the CS: a correct mutex must not rely on the other thread continuing to take turns.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        
        
        
            Note 18: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: 7js:~
            

            modified
            

            
            
            
                
                    Before
                    
                    
                        Front
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
  
  
    Bei einem Las-Vegas-Algorithmus ist die Laufzeit eine Zufallsvariable, die Korrektheit hingegen nicht vom Zufall abhängig.

Ziel: immer korrekt/gut, meistens schnell.

Alternative Sichtweise: Algorithmus, der manchmal '???' statt eines Ergebnisses ausgibt. Man kann ihn dann entweder wiederholen, bis ein Ergebnis kommt, oder nach fester Laufzeit abbrechen.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
  
  
    Bei einem Las-Vegas-Algorithmus ist die Laufzeit eine Zufallsvariable, die Korrektheit hingegen nicht vom Zufall abhängig.

Ziel: immer korrekt/gut, meistens schnell.

Alternative Sichtweise: Algorithmus, der manchmal '???' statt eines Ergebnisses ausgibt. Man kann ihn dann entweder wiederholen, bis ein Ergebnis kommt, oder nach fester Laufzeit abbrechen.
    
    
    
    Die alternative Sichtweise ist die Brücke zum Monte-Carlo-Algorithmus: Lässt man den Las-Vegas-Algorithmus nach fester Laufzeit abbrechen, wird daraus ein Monte-Carlo-Algorithmus mit einseitigem Fehler.
  



                        
                    
                    
                
                
                    After
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
  
  
    Rule of thumb for memory reordering: the compiler and the hardware are allowed to make any change that does not affect the semantics of a sequential execution.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
  
  
    Rule of thumb for memory reordering: the compiler and the hardware are allowed to make any change that does not affect the semantics of a sequential execution.
    
    
    
    Example: x=1; y=x+1; z=x+1; is sequentially equivalent to x=1; z=x+1; y=x+1; and even to x=1; z=2; y=2;.

In a parallel context these reorderings can become observable and break correctness.
  



                        
                    
                    
                
            
            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            Bei einem <b>Las-Vegas-Algorithmus</b> ist {{c1::die Laufzeit}} eine Zufallsvariable, die {{c1::Korrektheit}} hingegen nicht vom Zufall abhängig.<br><br>Ziel: {{c2::immer korrekt/gut, meistens schnell}}.<br><br><b>Alternative Sichtweise:</b> Algorithmus, der manchmal {{c3::'???' statt eines Ergebnisses}} ausgibt. Man kann ihn dann entweder {{c4::wiederholen, bis ein Ergebnis kommt}}, oder {{c4::nach fester Laufzeit abbrechen}}.
                            Rule of thumb for memory reordering: the compiler and the hardware are allowed to make any change that does {{c1::not affect the semantics of a <i>sequential</i> execution}}.
                        
                        
                        
                            Extra
                            Die alternative Sichtweise ist die Brücke zum Monte-Carlo-Algorithmus: Lässt man den Las-Vegas-Algorithmus nach fester Laufzeit abbrechen, wird daraus ein Monte-Carlo-Algorithmus mit einseitigem Fehler.
                            Example: <code>x=1; y=x+1; z=x+1;</code> is sequentially equivalent to <code>x=1; z=x+1; y=x+1;</code> and even to <code>x=1; z=2; y=2;</code>.<br><br>In a parallel context these reorderings can become observable and break correctness.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
                    
                
            
            
        

        
        
            Note 19: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: 7kNVF}Z|oG
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
  
  
    Which memory operations actually get reordered depends on the hardware architecture; e.g. AMD64 (x86) differs significantly from ARM: x86 essentially only allows "stores reordered after loads", while ARM, POWER, and Alpha permit almost every kind of reordering.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
  
  
    Which memory operations actually get reordered depends on the hardware architecture; e.g. AMD64 (x86) differs significantly from ARM: x86 essentially only allows "stores reordered after loads", while ARM, POWER, and Alpha permit almost every kind of reordering.
    
    
    
    Consequence: correct multi-threaded code that "works" on x86 can break on ARM without any change to the source. This is exactly why a memory model is needed as a language-level guarantee.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            Which memory operations actually get reordered depends on the {{c1::hardware architecture}}; e.g. {{c2::AMD64 (x86) differs significantly from ARM}}: x86 essentially only allows "stores reordered after loads", while ARM, POWER, and Alpha permit almost every kind of reordering.
                        
                        
                        
                            Extra
                            
                            Consequence: correct multi-threaded code that "works" on x86 can break on ARM without any change to the source. This is exactly why a memory model is needed as a language-level guarantee.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
                    
                
            
            
        
        
        
            Note 20: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: :UEQ*@nnq1
            
            modified
            

            
            
            
                
                    Before
                    
                    
                        Front
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
  
  
    Üppige-Auswahl-Problem (gegeben \(x_1, \ldots, x_n \in \{0, 1\}\) mit \(n/2\) Einsen, finde \(i\) mit \(x_i = 1\)):

Was ist die Laufzeit des schnellsten deterministischen Algorithmus, was die erwartete Laufzeit des schnellsten Las-Vegas-Algorithmus?
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
  
  
    Üppige-Auswahl-Problem (gegeben \(x_1, \ldots, x_n \in \{0, 1\}\) mit \(n/2\) Einsen, finde \(i\) mit \(x_i = 1\)):

Was ist die Laufzeit des schnellsten deterministischen Algorithmus, was die erwartete Laufzeit des schnellsten Las-Vegas-Algorithmus?
    
    
    
    Deterministisch: \(\Theta(n)\) (Worst Case).
Las-Vegas: \(O(1)\) erwartet, unabhängig von \(n\).

Das ist ein dramatischer Unterschied und ein Paradebeispiel für den Vorteil von Randomisierung: der Las-Vegas-Algorithmus probiert wiederholt zufällige Indizes und stoppt, sobald er eine \(1\) findet. Da die Hälfte der Bits \(1\) ist, ist die erwartete Anzahl Versuche \(\mathbb{E}[T] = 2\).

Die deterministische untere Schranke folgt aus einem Adversary-Argument: gegen jeden festen Algorithmus kann man eine Instanz konstruieren, auf der er \(n/2 + 1\) Bits lesen muss.
  



                        
                    
                    
                
                
                    After
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
  
  
    The Synchronization Order (SO) in the JMM is a total order over all synchronization actions; all threads see the SA in the same order; the SA inside a thread occur in PO; SO is consistent: every read in SO sees the latest write in SO.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
  
  
    The Synchronization Order (SO) in the JMM is a total order over all synchronization actions; all threads see the SA in the same order; the SA inside a thread occur in PO; SO is consistent: every read in SO sees the latest write in SO.
    
    
    
    The globality of SO is the reason that volatile and locks can synchronise threads at all.
  



                        
                    
                    
                
            
            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            Üppige-Auswahl-Problem (gegeben \(x_1, \ldots, x_n \in \{0, 1\}\) mit \(n/2\) Einsen, finde \(i\) mit \(x_i = 1\)):<br><br>Was ist die Laufzeit des schnellsten <b>deterministischen</b> Algorithmus, was die <b>erwartete</b> Laufzeit des schnellsten <b>Las-Vegas</b>-Algorithmus?
                            The <i>Synchronization Order (SO)</i> in the JMM is {{c1::a total order over all synchronization actions}}; {{c1::all threads see the SA in the same order}}; {{c1::the SA inside a thread occur in PO}}; {{c1::SO is consistent: every read in SO sees the latest write in SO}}.
                        
                        
                        
                            Extra
                            Deterministisch: \(\Theta(n)\) (Worst Case).<br>Las-Vegas: \(O(1)\) erwartet, unabhängig von \(n\).<br><br>Das ist ein dramatischer Unterschied und ein Paradebeispiel für den Vorteil von Randomisierung: der Las-Vegas-Algorithmus probiert wiederholt zufällige Indizes und stoppt, sobald er eine \(1\) findet. Da die Hälfte der Bits \(1\) ist, ist die erwartete Anzahl Versuche \(\mathbb{E}[T] = 2\).<br><br>Die deterministische untere Schranke folgt aus einem Adversary-Argument: gegen jeden festen Algorithmus kann man eine Instanz konstruieren, auf der er \(n/2 + 1\) Bits lesen muss.
                            The globality of SO is the reason that <code>volatile</code> and locks can synchronise threads at all.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
                    
                
            
            
        
        
        
            Note 21: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: 
            

            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    Burns-Lynch theorem (1993): if a system of \(n\) processes solves mutual exclusion with deadlock freedom using only atomic read/write registers, then it must use at least \(n\) shared variables.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    Burns-Lynch theorem (1993): if a system of \(n\) processes solves mutual exclusion with deadlock freedom using only atomic read/write registers, then it must use at least \(n\) shared variables.
    
    
    
    Intuition: a single read/write only ever exposes the last write, so the system cannot distinguish the order in which \(n\) threads wrote unless it has a separate slot per thread. This is exactly why Filter Lock and Bakery Lock both need \(O(n)\) shared state. Read-modify-write primitives sidestep the bound by combining the read and the write atomically.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            Burns-Lynch theorem (1993): if a system of \(n\) processes solves mutual exclusion with deadlock freedom using only {{c1::atomic read/write registers}}, then it must use at least {{c2::\(n\) shared variables}}.
                        
                        
                        
                            Extra
                            
                            Intuition: a single read/write only ever exposes the last write, so the system cannot distinguish the order in which \(n\) threads wrote unless it has a separate slot per thread. This is exactly why Filter Lock and Bakery Lock both need \(O(n)\) shared state. Read-modify-write primitives sidestep the bound by combining the read and the write atomically.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        

        
        
            Note 22: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: >aq{oB;vc7
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    The structure of the critical-section problem per process inside an infinite loop is: non-critical section \(\to\) preprotocol \(\to\) critical section \(\to\) postprotocol.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    The structure of the critical-section problem per process inside an infinite loop is: non-critical section \(\to\) preprotocol \(\to\) critical section \(\to\) postprotocol.
    
    
    
    Preprotocol = entry code (lock acquire). Postprotocol = exit code (lock release). Only global variables are shared between processes; local variables are private per process.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            The structure of the critical-section problem per process inside an infinite loop is: {{c1::non-critical section}} \(\to\) {{c2::preprotocol}} \(\to\) {{c3::critical section}} \(\to\) {{c4::postprotocol}}.
                        
                        
                        
                            Extra
                            
                            Preprotocol = entry code (lock acquire). Postprotocol = exit code (lock release). Only global variables are shared between processes; local variables are private per process.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        
        
        
            Note 23: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: ?7[F1KIShh
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    Threads produce a sequence of events; the \(j\)-th occurrence of event \(i\) in thread \(P\) is denoted \(p_i^j\). The precedence relation \(a \to b\) ("\(a\) occurs before \(b\)") is a total order over events.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    Threads produce a sequence of events; the \(j\)-th occurrence of event \(i\) in thread \(P\) is denoted \(p_i^j\). The precedence relation \(a \to b\) ("\(a\) occurs before \(b\)") is a total order over events.
    
    
    
    Indexed events are needed because programs typically loop, so the same line of code can produce many events; \(p_5^3\) means "event \(p_5\) in the third iteration".

Total ordering models a single global timeline of when events actually happen (we are not in a relativistic setting).
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            Threads produce a sequence of <i>events</i>; the \(j\)-th occurrence of event \(i\) in thread \(P\) is denoted {{c1::\(p_i^j\)}}. The <i>precedence relation</i> \(a \to b\) ("\(a\) occurs before \(b\)") is {{c2::a total order over events}}.
                        
                        
                        
                            Extra
                            
                            Indexed events are needed because programs typically loop, so the same line of code can produce many events; \(p_5^3\) means "event \(p_5\) in the third iteration".<br><br>Total ordering models a single global timeline of when events actually happen (we are not in a relativistic setting).
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        
        
        
            Note 24: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: @Id@]fUyPJ
            
            modified
            

            
            
            
                
                    Before
                    
                    
                        Front
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
  
  
    Monte-Carlo, Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\) 
Sei \(\varepsilon > 0\) und \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\Pr[A(I) \text{ korrekt}] \geq \tfrac{1}{2} + \varepsilon.\]Sei \(A_\delta\) der Algorithmus, der\[N = {{c1::\left\lceil 4\varepsilon^{-2} \ln \delta^{-1} \right\rceil}}\]Aufrufe von \(A\) macht und die Mehrheit der erhaltenen Antworten ausgibt. Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
  
  
    Monte-Carlo, Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\) 
Sei \(\varepsilon > 0\) und \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\Pr[A(I) \text{ korrekt}] \geq \tfrac{1}{2} + \varepsilon.\]Sei \(A_\delta\) der Algorithmus, der\[N = {{c1::\left\lceil 4\varepsilon^{-2} \ln \delta^{-1} \right\rceil}}\]Aufrufe von \(A\) macht und die Mehrheit der erhaltenen Antworten ausgibt. Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]
    
    
    
    Achtung: \(N\) skaliert hier quadratisch in \(\varepsilon^{-1}\) (statt linear wie bei einseitigem Fehler oder Las-Vegas) und nutzt nicht die erste richtige Antwort, sondern den Mehrheitsentscheid.

Beweis nutzt die Chernoff-Schranke \(\Pr[X \leq (1-\eta)\mathbb{E}[X]] \leq e^{-\eta^2 \mathbb{E}[X]/2}\) auf die Anzahl korrekter Aufrufe.
  



                        
                    
                    
                
                
                    After
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::2._Memory_Hierarchy
  
  
    In a multi-core memory hierarchy each core has its own registers and its own L1 cache (private), while the L2 cache and system memory are shared between cores.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::2._Memory_Hierarchy
  
  
    In a multi-core memory hierarchy each core has its own registers and its own L1 cache (private), while the L2 cache and system memory are shared between cores.
    
    
    
    Consequence: when core 1 writes to its L1, core 2 sees the change only after it has propagated downwards (cache coherence protocol). This is a primary source of observable memory reordering at the hardware level.
  



                        
                    
                    
                
            
            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            <b>Monte-Carlo, Fehlerwahrscheinlichkeit </b>\(&lt; \tfrac{1}{2}\)&nbsp;<br>Sei \(\varepsilon &gt; 0\) und \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\Pr[A(I) \text{ korrekt}] \geq \tfrac{1}{2} + \varepsilon.\]Sei \(A_\delta\) der Algorithmus, der\[N = {{c1::\left\lceil 4\varepsilon^{-2} \ln \delta^{-1} \right\rceil}}\]Aufrufe von \(A\) macht und {{c2::die Mehrheit der erhaltenen Antworten}} ausgibt. Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]
                            In a multi-core memory hierarchy each core has {{c1::its own registers and its own L1 cache (private)}}, while {{c2::the L2 cache and system memory are shared between cores}}.
                        
                        
                        
                            Extra
                            Achtung: \(N\) skaliert hier <b>quadratisch</b> in \(\varepsilon^{-1}\) (statt linear wie bei einseitigem Fehler oder Las-Vegas) und nutzt nicht die erste richtige Antwort, sondern den Mehrheitsentscheid.<br><br>Beweis nutzt die Chernoff-Schranke \(\Pr[X \leq (1-\eta)\mathbb{E}[X]] \leq e^{-\eta^2 \mathbb{E}[X]/2}\) auf die Anzahl korrekter Aufrufe.
                            Consequence: when core 1 writes to its L1, core 2 sees the change only after it has propagated downwards (cache coherence protocol). This is a primary source of observable memory reordering at the hardware level.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::2._Memory_Hierarchy
                    
                
            
            
        
        
        
            Note 25: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: @rzMyS*OnN
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
  
  
    Accesses to volatile fields in Java do not count as a data race. In terms of performance, volatile is slower than ordinary fields, but faster than locks.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
  
  
    Accesses to volatile fields in Java do not count as a data race. In terms of performance, volatile is slower than ordinary fields, but faster than locks.
    
    
    
    Recommendation: only for experts; otherwise use the standard library (java.util.concurrent, AtomicInteger, …).

Caveat: volatile guarantees visibility, but not atomicity of compound operations like i++.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            Accesses to <code>volatile</code> fields in Java do {{c1::not count as a data race}}. In terms of performance, <code>volatile</code> is {{c2::slower than ordinary fields, but faster than locks}}.
                        
                        
                        
                            Extra
                            
                            Recommendation: only for experts; otherwise use the standard library (<code>java.util.concurrent</code>, <code>AtomicInteger</code>, …).<br><br>Caveat: <code>volatile</code> guarantees visibility, but <b>not atomicity of compound operations</b> like <code>i++</code>.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
                    
                
            
            
        
        
        
            Note 26: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: Aajp);{.(r
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
  
  
    ARM (and MIPS, POWER, RISC-V) implement atomic RMW via a Load-Linked / Store-Conditional pair: LDREX loads from memory and marks the address as exclusive for this core; the subsequent STREX performs a conditional store that only succeeds if no other core has written to the address in the meantime.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
  
  
    ARM (and MIPS, POWER, RISC-V) implement atomic RMW via a Load-Linked / Store-Conditional pair: LDREX loads from memory and marks the address as exclusive for this core; the subsequent STREX performs a conditional store that only succeeds if no other core has written to the address in the meantime.
    
    
    
    If STREX fails, software retries the LL/SC pair. Compared with x86's monolithic CMPXCHG, LL/SC is more flexible (you can compute the new value between LL and SC) but requires explicit retry logic in software.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            ARM (and MIPS, POWER, RISC-V) implement atomic RMW via a {{c1::Load-Linked / Store-Conditional}} pair: <code>LDREX</code> loads from memory and {{c2::marks the address as exclusive for this core}}; the subsequent <code>STREX</code> performs {{c3::a conditional store that only succeeds if no other core has written to the address in the meantime}}.
                        
                        
                        
                            Extra
                            
                            If <code>STREX</code> fails, software retries the LL/SC pair. Compared with x86's monolithic <code>CMPXCHG</code>, LL/SC is more flexible (you can compute the new value between LL and SC) but requires explicit retry logic in software.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
                    
                
            
            
        
        
        
            Note 27: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: ApSiaX&)Ab
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    Lamport's Bakery algorithm solves \(n\)-process mutex by assigning each thread a ticket (label) larger than all outstanding tickets: flag[me] = true;
label[me] = max(label[0..n-1]) + 1;
while(\exists k != me: flag[k] && (k, label[k]) <_l (me, label[me])) {};
 where ties are broken by lexicographic order on (thread id, label), so two equal labels are still totally ordered.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    Lamport's Bakery algorithm solves \(n\)-process mutex by assigning each thread a ticket (label) larger than all outstanding tickets: flag[me] = true;
label[me] = max(label[0..n-1]) + 1;
while(\exists k != me: flag[k] && (k, label[k]) <_l (me, label[me])) {};
 where ties are broken by lexicographic order on (thread id, label), so two equal labels are still totally ordered.
    
    
    
    Properties: mutual exclusion, deadlock freedom, starvation freedom, and FCFS (the doorway is the label assignment).

The bug-resistant trick is the lexicographic tie-break: two threads can read the same max simultaneously and end up with equal labels, but the smaller-id thread still goes first - no live or dead lock.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            {{c1::Lamport's Bakery algorithm}} solves \(n\)-process mutex by assigning each thread a {{c2::ticket (label) larger than all outstanding tickets}}: <pre>flag[me] = true;
label[me] = max(label[0..n-1]) + 1;
while(\exists k != me: flag[k] &amp;&amp; (k, label[k]) &lt;_l (me, label[me])) {};</pre> where ties are broken by {{c3::lexicographic order on (thread id, label), so two equal labels are still totally ordered}}.
                        
                        
                        
                            Extra
                            
                            Properties: mutual exclusion, deadlock freedom, starvation freedom, and FCFS (the doorway is the label assignment).<br><br>The bug-resistant trick is the lexicographic tie-break: two threads can read the same max simultaneously and end up with equal labels, but the smaller-id thread still goes first - no live or dead lock.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        
        
        
            Note 28: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: C9O/B$skj#
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    Peterson's starvation-freedom proof is by exhaustive contradiction: assume \(P\) is stuck in its while loop forever. Then \(Q\) is in one of three cases - stuck in non-CS (then flag[Q]=false, \(P\) proceeds, contradiction); repeatedly entering and leaving its CS (each entry sets victim=Q, freezing victim away from \(P\), so \(P\) proceeds, contradiction); also stuck in its lock loop (then \(Q\) needs victim != Q, but victim can equal at most one of \(P,Q\), contradiction).
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    Peterson's starvation-freedom proof is by exhaustive contradiction: assume \(P\) is stuck in its while loop forever. Then \(Q\) is in one of three cases - stuck in non-CS (then flag[Q]=false, \(P\) proceeds, contradiction); repeatedly entering and leaving its CS (each entry sets victim=Q, freezing victim away from \(P\), so \(P\) proceeds, contradiction); also stuck in its lock loop (then \(Q\) needs victim != Q, but victim can equal at most one of \(P,Q\), contradiction).
    
    
    
    The case split is exhaustive because these are the only three possible long-term behaviours of \(Q\) under our assumptions. Each leads to a contradiction, so \(P\) cannot be stuck.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            Peterson's starvation-freedom proof is by exhaustive contradiction: assume \(P\) is stuck in its <code>while</code> loop forever. Then \(Q\) is in one of three cases - {{c1::stuck in non-CS (then <code>flag[Q]=false</code>, \(P\) proceeds, contradiction)}}; {{c2::repeatedly entering and leaving its CS (each entry sets <code>victim=Q</code>, freezing <code>victim</code> away from \(P\), so \(P\) proceeds, contradiction)}}; {{c3::also stuck in its lock loop (then \(Q\) needs <code>victim != Q</code>, but <code>victim</code> can equal at most one of \(P,Q\), contradiction)}}.
                        
                        
                        
                            Extra
                            
                            The case split is exhaustive because these are the only three possible long-term behaviours of \(Q\) under our assumptions. Each leads to a contradiction, so \(P\) cannot be stuck.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        
        
        
            Note 29: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: CR2my^EB=I
            
            modified
            

            
            
            
                
                    Before
                    
                    
                        Front
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
  
  
    Ein klassischer Algorithmus \(A\) erhält Eingabe \(I\) und liefert Ausgabe \(A(I)\). Es gilt:\(A(I)\) ist immer genau und korrekt,
bei gleicher Eingabe immer dieselbe Ausgabe (\(A\) ist eine Funktion),
dieselbe Laufzeit.
Ein randomisierter Algorithmus erhält zusätzlich eine Zufallsquelle \(R\) und liefert Ausgabe \(A(I, R)\). Die Ausgabe ist manchmal richtig, manchmal fast richtig, manchmal schnell, das Verhalten ist nicht reproduzierbar. \(A\) ist nach wie vor eine Funktion auf den Paaren \((I, R)\).
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
  
  
    Ein klassischer Algorithmus \(A\) erhält Eingabe \(I\) und liefert Ausgabe \(A(I)\). Es gilt:\(A(I)\) ist immer genau und korrekt,
bei gleicher Eingabe immer dieselbe Ausgabe (\(A\) ist eine Funktion),
dieselbe Laufzeit.
Ein randomisierter Algorithmus erhält zusätzlich eine Zufallsquelle \(R\) und liefert Ausgabe \(A(I, R)\). Die Ausgabe ist manchmal richtig, manchmal fast richtig, manchmal schnell, das Verhalten ist nicht reproduzierbar. \(A\) ist nach wie vor eine Funktion auf den Paaren \((I, R)\).
    
    
    
    Reproduzierbarkeit ist die zentrale Eigenschaft, die beim Übergang zum randomisierten Algorithmus verloren geht.
  



                        
                    
                    
                
                
                    After
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
  
  
    In the classic example with \(x{=}1;\,y{=}1\) (Thread 1) and \(a{=}y;\,b{=}x\) (Thread 2), every possible interleaving satisfies: \(b \geq a\) is always true. This proof technique is called proof by exhaustion (full enumeration of interleavings).
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
  
  
    In the classic example with \(x{=}1;\,y{=}1\) (Thread 1) and \(a{=}y;\,b{=}x\) (Thread 2), every possible interleaving satisfies: \(b \geq a\) is always true. This proof technique is called proof by exhaustion (full enumeration of interleavings).
    
    
    
    Yet the assertion assert(b >= a) can still fail in practice: pure interleaving reasoning ignores memory reordering by the compiler and the hardware.
  



                        
                    
                    
                
            
            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            Ein <b>klassischer</b> Algorithmus \(A\) erhält Eingabe \(I\) und liefert Ausgabe \(A(I)\). Es gilt:<ul><li>\(A(I)\) ist immer {{c1::genau und korrekt}},</li><li>bei gleicher Eingabe immer {{c2::dieselbe Ausgabe (\(A\) ist eine Funktion)}},</li><li>{{c3::dieselbe Laufzeit}}.</li></ul>Ein <b>randomisierter</b> Algorithmus erhält zusätzlich eine Zufallsquelle \(R\) und liefert Ausgabe \(A(I, R)\). Die Ausgabe ist {{c4::manchmal richtig, manchmal fast richtig, manchmal schnell}}, das Verhalten ist {{c5::nicht reproduzierbar}}. \(A\) ist nach wie vor eine Funktion auf den Paaren \((I, R)\).
                            In the classic example with \(x{=}1;\,y{=}1\) (Thread 1) and \(a{=}y;\,b{=}x\) (Thread 2), every possible interleaving satisfies: {{c1::\(b \geq a\) is always true}}. This proof technique is called {{c2::proof by exhaustion (full enumeration of interleavings)}}.
                        
                        
                        
                            Extra
                            Reproduzierbarkeit ist die zentrale Eigenschaft, die beim Übergang zum randomisierten Algorithmus verloren geht.
                            Yet the assertion <code>assert(b &gt;= a)</code> can still fail in practice: pure interleaving reasoning ignores <b>memory reordering</b> by the compiler and the hardware.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
                    
                
            
            
        
        
        
            Note 30: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: D|&w=l8gFV
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
  
  
    The Test-And-Test-And-Set (TATAS) lock improves on TAS by spinning on a plain read first: do {
  while(state.get()) {}
} while(!state.compareAndSet(false, true));
 The inner loop only reads the cached value, generating no coherence traffic while the lock stays held; the costly RMW is attempted only after the lock looks free.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
  
  
    The Test-And-Test-And-Set (TATAS) lock improves on TAS by spinning on a plain read first: do {
  while(state.get()) {}
} while(!state.compareAndSet(false, true));
 The inner loop only reads the cached value, generating no coherence traffic while the lock stays held; the costly RMW is attempted only after the lock looks free.
    
    
    
    Caveat: TATAS does not generalise. The classical "double-checked locking" pattern - a TATAS-style optimisation around lazy initialisation - is famously broken without explicit memory barriers, because compilers and hardware may reorder the publish of the new object with respect to the flag write.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            The <i>Test-And-Test-And-Set (TATAS)</i> lock improves on TAS by spinning on a plain read first: <pre>do {
  while(state.get()) {}
} while(!state.compareAndSet(false, true));</pre> The inner loop only {{c1::reads the cached value, generating no coherence traffic while the lock stays held}}; the costly RMW is attempted only {{c2::after the lock <i>looks</i> free}}.
                        
                        
                        
                            Extra
                            
                            Caveat: TATAS does not generalise. The classical "double-checked locking" pattern - a TATAS-style optimisation around lazy initialisation - is famously broken without explicit memory barriers, because compilers and hardware may reorder the publish of the new object with respect to the flag write.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
                    
                
            
            
        
        
        
            Note 31: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: F?scKnbfC[
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
  
  
    On x86 the CMPXCHG mem, reg instruction implements compare-and-swap: it compares register A with a memory location and, if equal, copies the second operand into the location and sets ZF. To make it atomic across cores, it must be combined with the LOCK prefix.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
  
  
    On x86 the CMPXCHG mem, reg instruction implements compare-and-swap: it compares register A with a memory location and, if equal, copies the second operand into the location and sets ZF. To make it atomic across cores, it must be combined with the LOCK prefix.
    
    
    
    Plain CMPXCHG is atomic only with respect to other observers on the same core. The LOCK prefix locks the bus / cache line, which is what gives the instruction its inter-processor atomicity guarantee - and also what makes RMW operations 10-100x slower than ordinary loads.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            On x86 the {{c1::<code>CMPXCHG</code> mem, reg}} instruction implements compare-and-swap: it compares register A with a memory location and, if equal, copies the second operand into the location and sets ZF. To make it atomic across cores, it must be combined with {{c2::the <code>LOCK</code> prefix}}.
                        
                        
                        
                            Extra
                            
                            Plain <code>CMPXCHG</code> is atomic only with respect to other observers on the same core. The <code>LOCK</code> prefix locks the bus / cache line, which is what gives the instruction its inter-processor atomicity guarantee - and also what makes RMW operations 10-100x slower than ordinary loads.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
                    
                
            
            
        
        
        
            Note 32: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: FGboL4isC9
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
  
  
    Another deadlock fix is to protect every transferTo with a single static global lock, which trivially avoids cycles because there is only one lock to acquire. Cost: no two transfers can run in parallel, even on completely disjoint pairs of accounts.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
  
  
    Another deadlock fix is to protect every transferTo with a single static global lock, which trivially avoids cycles because there is only one lock to acquire. Cost: no two transfers can run in parallel, even on completely disjoint pairs of accounts.
    
    
    
    This sacrifices nearly all the available concurrency for safety. It is a sensible default when contention is rare or when correctness clearly trumps throughput, but rarely the right answer at scale.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            Another deadlock fix is to protect every <code>transferTo</code> with {{c1::a single static global lock}}, which trivially avoids cycles because there is only one lock to acquire. Cost: {{c2::no two transfers can run in parallel, even on completely disjoint pairs of accounts}}.
                        
                        
                        
                            Extra
                            
                            This sacrifices nearly all the available concurrency for safety. It is a sensible default when contention is rare or when correctness clearly trumps throughput, but rarely the right answer at scale.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
                    
                
            
            
        
        
        
            Note 33: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: Fc~lyNYx7K
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
  
  
    A naive TAS-based spinlock while(state.getAndSet(true)) {}
 scales poorly because every spinning thread issues writes to the lock variable, which generates bus contention and forces the cache-coherence protocol to invalidate all other cached copies on every retry.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
  
  
    A naive TAS-based spinlock while(state.getAndSet(true)) {}
 scales poorly because every spinning thread issues writes to the lock variable, which generates bus contention and forces the cache-coherence protocol to invalidate all other cached copies on every retry.
    
    
    
    Result: the runtime per thread grows roughly linearly (or worse) with the number of contending threads. Each spinning core hammers the bus, so even threads not currently trying to acquire the lock pay a cost. The fix is to spin on a read until the lock looks free (TTAS), then attempt the RMW only optimistically.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            A naive TAS-based spinlock <pre>while(state.getAndSet(true)) {}</pre> scales poorly because {{c1::every spinning thread issues writes to the lock variable}}, which {{c2::generates bus contention and forces the cache-coherence protocol to invalidate all other cached copies on every retry}}.
                        
                        
                        
                            Extra
                            
                            Result: the runtime per thread grows roughly linearly (or worse) with the number of contending threads. Each spinning core hammers the bus, so even threads not currently trying to acquire the lock pay a cost. The fix is to spin on a read until the lock looks free (TTAS), then attempt the RMW only optimistically.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
                    
                
            
            
        
        
        
            Note 34: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: Jj5bulRE#&
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    In the 1st-try mutex p2: while(wantq);
p3: wantp = true;
p4: critical section
 the order "first wait for the neighbour, then set my own flag" violates mutual exclusion, because {{c2::both threads can leave their while with wantp{=}wantq{=}false before either of them sets its flag}}.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    In the 1st-try mutex p2: while(wantq);
p3: wantp = true;
p4: critical section
 the order "first wait for the neighbour, then set my own flag" violates mutual exclusion, because {{c2::both threads can leave their while with wantp{=}wantq{=}false before either of them sets its flag}}.
    
    
    
    Both threads enter the CS together; visible as state \((p4, q4, true, true)\) in the state-space diagram.

Naive fix: set the flag before the check. That solves ME but creates new problems (deadlock/livelock); see Dekker / Peterson.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            In the 1st-try mutex <pre>p2: while(wantq);
p3: wantp = true;
p4: critical section</pre> the order "first wait for the neighbour, then set my own flag" violates {{c1::mutual exclusion}}, because {{c2::both threads can leave their <code>while</code> with <code>wantp{=}wantq{=}false</code> before either of them sets its flag}}.
                        
                        
                        
                            Extra
                            
                            Both threads enter the CS together; visible as state \((p4, q4, true, true)\) in the state-space diagram.<br><br>Naive fix: set the flag <i>before</i> the check. That solves ME but creates new problems (deadlock/livelock); see Dekker / Peterson.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        
        
        
            Note 35: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: MH1KRE1DzH
            
            modified
            

            
            
            
                
                    Before
                    
                    
                        Front
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
  
  
    Las-Vegas-Algorithmus für das Üppige-Auswahl-Problemrepeat:
    i := Uniform({1, ..., n})
    if x_i = 1: return i
Sei \(T\) die Anzahl an Wiederholungen. Da \(\Pr[x_i = 1] = \Pr[x_i = 0] = \tfrac{1}{2}\) für einen zufälligen Index \(i\), gilt:\[\Pr[T > k] = {{c1::\left(\tfrac{1}{2}\right)^k}}.\]Folglich:\[\Pr[T \leq k] = {{c2::1 - \left(\tfrac{1}{2}\right)^k}}.\]
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
  
  
    Las-Vegas-Algorithmus für das Üppige-Auswahl-Problemrepeat:
    i := Uniform({1, ..., n})
    if x_i = 1: return i
Sei \(T\) die Anzahl an Wiederholungen. Da \(\Pr[x_i = 1] = \Pr[x_i = 0] = \tfrac{1}{2}\) für einen zufälligen Index \(i\), gilt:\[\Pr[T > k] = {{c1::\left(\tfrac{1}{2}\right)^k}}.\]Folglich:\[\Pr[T \leq k] = {{c2::1 - \left(\tfrac{1}{2}\right)^k}}.\]
    
    
    
    \(T\) ist geometrisch verteilt mit Parameter \(p = \tfrac{1}{2}\).

Konkrete Werte: \(\Pr[T > 10] = 1/1024 < 0{,}001\), also \(\Pr[T \leq 10] > 0{,}999\). Mit \(\leq 20\) Wiederholungen erreicht man Wahrscheinlichkeit \(> 0{,}999999\).

Wichtig: \(T\) hängt nicht von \(n\) ab, weil die Erfolgswahrscheinlichkeit pro Versuch konstant \(\tfrac{1}{2}\) ist (genau die Hälfte der Bits sind \(1\)).
  



                        
                    
                    
                
                
                    After
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
  
  
    Program Order (PO) in the JMM is a total order over the actions of a single thread, defined on an execution trace. PO gives no ordering guarantee for memory accesses and is not total across threads.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
  
  
    Program Order (PO) in the JMM is a total order over the actions of a single thread, defined on an execution trace. PO gives no ordering guarantee for memory accesses and is not total across threads.
    
    
    
    Its only purpose is to link an execution back to the original program.

Intra-thread consistency: per thread, PO is consistent with the thread's isolated execution. E.g. taking the else-branch of an if whose condition was true makes the execution invalid.
  



                        
                    
                    
                
            
            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            <b>Las-Vegas-Algorithmus für das Üppige-Auswahl-Problem</b><pre>repeat:
    i := Uniform({1, ..., n})
    if x_i = 1: return i</pre>Sei \(T\) die Anzahl an Wiederholungen. Da \(\Pr[x_i = 1] = \Pr[x_i = 0] = \tfrac{1}{2}\) für einen zufälligen Index \(i\), gilt:\[\Pr[T &gt; k] = {{c1::\left(\tfrac{1}{2}\right)^k}}.\]Folglich:\[\Pr[T \leq k] = {{c2::1 - \left(\tfrac{1}{2}\right)^k}}.\]
                            <i>Program Order (PO)</i> in the JMM is a {{c1::total order over the actions of a single thread, defined on an execution trace}}. PO gives {{c2::no ordering guarantee for memory accesses}} and is {{c2::not total across threads}}.
                        
                        
                        
                            Extra
                            \(T\) ist geometrisch verteilt mit Parameter \(p = \tfrac{1}{2}\).<br><br>Konkrete Werte: \(\Pr[T &gt; 10] = 1/1024 &lt; 0{,}001\), also \(\Pr[T \leq 10] &gt; 0{,}999\). Mit \(\leq 20\) Wiederholungen erreicht man Wahrscheinlichkeit \(&gt; 0{,}999999\).<br><br>Wichtig: \(T\) hängt <b>nicht von \(n\)</b> ab, weil die Erfolgswahrscheinlichkeit pro Versuch konstant \(\tfrac{1}{2}\) ist (genau die Hälfte der Bits sind \(1\)).
                            Its only purpose is to link an execution back to the original program.<br><br><b>Intra-thread consistency:</b> per thread, PO is consistent with the thread's isolated execution. E.g. taking the <code>else</code>-branch of an <code>if</code> whose condition was true makes the execution invalid.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
                    
                
            
            
        
        
        
            Note 36: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: O96Epv!-$o
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
  
  
    Exponential-backoff lock: when a TATAS acquire attempt fails, the thread {{c1::sleeps for a random duration drawn from \([0, \text{limit}]\), then doubles limit up to maxDelay}} before retrying. This removes the synchronised stampede of all waiters trying to acquire simultaneously the moment the lock is released.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
  
  
    Exponential-backoff lock: when a TATAS acquire attempt fails, the thread {{c1::sleeps for a random duration drawn from \([0, \text{limit}]\), then doubles limit up to maxDelay}} before retrying. This removes the synchronised stampede of all waiters trying to acquire simultaneously the moment the lock is released.
    
    
    
    Empirically this gives near-flat scaling under high contention (vs. linear-or-worse for TAS / TTAS). Trade-offs: a thread may be sleeping when the lock becomes free, so latency under low contention is worse; tuning minDelay / maxDelay is workload-specific.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            <i>Exponential-backoff</i> lock: when a TATAS acquire attempt fails, the thread {{c1::sleeps for a random duration drawn from \([0, \text{limit}]\), then doubles <code>limit</code> up to <code>maxDelay</code>}} before retrying. This removes the {{c2::synchronised stampede of all waiters trying to acquire simultaneously the moment the lock is released}}.
                        
                        
                        
                            Extra
                            
                            Empirically this gives near-flat scaling under high contention (vs. linear-or-worse for TAS / TTAS). Trade-offs: a thread may be sleeping when the lock becomes free, so latency under low contention is worse; tuning <code>minDelay</code> / <code>maxDelay</code> is workload-specific.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
                    
                
            
            
        
        
        
            Note 37: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: Pa^@bE6s8n
            
            modified
            

            
            
            
                
                    Before
                    
                    
                        Front
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
  
  
    Üppige-Auswahl-Problem
Gegeben sind \(x_1, \ldots, x_n \in \{0, 1\}\), von denen genau die Hälfte gleich \(1\) sind (\(n\) gerade).
Berechne einen Index \(i\) mit \(x_i = 1\).

Trivialer deterministischer Algorithmus (in Zeit \(O(n)\)):for i = 1, ..., n:
    if x_i = 1: return i
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
  
  
    Üppige-Auswahl-Problem
Gegeben sind \(x_1, \ldots, x_n \in \{0, 1\}\), von denen genau die Hälfte gleich \(1\) sind (\(n\) gerade).
Berechne einen Index \(i\) mit \(x_i = 1\).

Trivialer deterministischer Algorithmus (in Zeit \(O(n)\)):for i = 1, ..., n:
    if x_i = 1: return i
    
    
    
    Das Problem heisst üppig, weil es sehr viele gültige Antworten gibt (genau \(n/2\)). Trotzdem braucht jeder deterministische Algorithmus im Worst Case \(\Omega(n)\) Zeit, während ein einfacher Las-Vegas-Algorithmus mit erwartet \(O(1)\) auskommt.
  



                        
                    
                    
                
                
                    After
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
  
  
    In the classic x86 example with \(x{=}y{=}0\), thread 1: x=1; j=y; and thread 2: y=1; i=x;, besides \((1,1), (0,1), (1,0)\), the outcome \((i,j) = (0,0)\) is also possible, because under x86's "stores reordered after loads" rule the load (j=y / i=x) may be executed before the store (x=1 / y=1).
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
  
  
    In the classic x86 example with \(x{=}y{=}0\), thread 1: x=1; j=y; and thread 2: y=1; i=x;, besides \((1,1), (0,1), (1,0)\), the outcome \((i,j) = (0,0)\) is also possible, because under x86's "stores reordered after loads" rule the load (j=y / i=x) may be executed before the store (x=1 / y=1).
    
    
    
    Both threads thus read \(y\) and \(x\) while still 0, before either store becomes visible. This is a simple, actually-reproducible reordering effect on x86 (StoreLoad).
  



                        
                    
                    
                
            
            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            <b>Üppige-Auswahl-Problem<br></b>Gegeben sind \(x_1, \ldots, x_n \in \{0, 1\}\), von denen {{c1::genau die Hälfte}} gleich \(1\) sind (\(n\) gerade).<br>Berechne {{c2::einen Index \(i\) mit \(x_i = 1\)}}.<br><br>Trivialer deterministischer Algorithmus (in Zeit \(O({{c3::n}})\)):<pre>for i = 1, ..., n:
    if x_i = 1: return i</pre>
                            In the classic x86 example with \(x{=}y{=}0\), thread 1: <code>x=1; j=y;</code> and thread 2: <code>y=1; i=x;</code>, besides \((1,1), (0,1), (1,0)\), the outcome {{c1::\((i,j) = (0,0)\) is also possible}}, because under x86's "stores reordered after loads" rule {{c2::the load (<code>j=y</code> / <code>i=x</code>) may be executed before the store (<code>x=1</code> / <code>y=1</code>)}}.
                        
                        
                        
                            Extra
                            Das Problem heisst <i>üppig</i>, weil es sehr viele gültige Antworten gibt (genau \(n/2\)). Trotzdem braucht jeder deterministische Algorithmus im Worst Case \(\Omega(n)\) Zeit, während ein einfacher Las-Vegas-Algorithmus mit erwartet \(O(1)\) auskommt.
                            Both threads thus read \(y\) and \(x\) while still 0, before either store becomes visible. This is a simple, actually-reproducible reordering effect on x86 (StoreLoad).
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::3._Memory_Model
                    
                
            
            
        
        
        
            Note 38: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: RDjwIm/b}G
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
  
  
    Draw a directed graph with threads \(T_1, \dots, T_n\) and resources \(R_1, \dots, R_m\) as nodes. Edge \(T_i \to R_j\) means thread \(T_i\) is attempting to acquire resource \(R_j\); edge \(R_j \to T_i\) means resource \(R_j\) is currently held by thread \(T_i\). A deadlock is present iff this graph contains a cycle.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
  
  
    Draw a directed graph with threads \(T_1, \dots, T_n\) and resources \(R_1, \dots, R_m\) as nodes. Edge \(T_i \to R_j\) means thread \(T_i\) is attempting to acquire resource \(R_j\); edge \(R_j \to T_i\) means resource \(R_j\) is currently held by thread \(T_i\). A deadlock is present iff this graph contains a cycle.
    
    
    
    Equivalent to the Coffman conditions: the cycle witnesses circular wait, while the edge semantics encode mutual exclusion (R held by exactly one T), hold-and-wait (T owns one R while requesting another), and no-preemption (edges are not removed by force).

Deadlock detection is a cycle search in this graph.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            Draw a directed graph with threads \(T_1, \dots, T_n\) and resources \(R_1, \dots, R_m\) as nodes. Edge \(T_i \to R_j\) means {{c1::thread \(T_i\) is attempting to acquire resource \(R_j\)}}; edge \(R_j \to T_i\) means {{c2::resource \(R_j\) is currently held by thread \(T_i\)}}. A deadlock is present iff {{c3::this graph contains a cycle}}.
                        
                        
                        
                            Extra
                            
                            Equivalent to the Coffman conditions: the cycle witnesses circular wait, while the edge semantics encode mutual exclusion (R held by exactly one T), hold-and-wait (T owns one R while requesting another), and no-preemption (edges are not removed by force).<br><br>Deadlock <i>detection</i> is a cycle search in this graph.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
                    
                
            
            
        
        
        
            Note 39: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: T+YN#N>,$x
            
            modified
            

            
            
            
                
                    Before
                    
                    
                        Front
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
  
  
    Im Allgemeinen ist die Fehlerreduktion bei Monte-Carlo-Algorithmen nicht möglich.
(Gegenbeispiel: Algorithmus, der mit Wahrscheinlichkeit \(\tfrac{1}{2}\) JA bzw. NEIN antwortet, unabhängig von der Eingabe.)

Möglich ist sie aber bei:einseitigem Fehler, oder
{{c2::Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\)}}.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
  
  
    Im Allgemeinen ist die Fehlerreduktion bei Monte-Carlo-Algorithmen nicht möglich.
(Gegenbeispiel: Algorithmus, der mit Wahrscheinlichkeit \(\tfrac{1}{2}\) JA bzw. NEIN antwortet, unabhängig von der Eingabe.)

Möglich ist sie aber bei:einseitigem Fehler, oder
{{c2::Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\)}}.
    
    
    
    Bei einseitigem Fehler genügt eine einzige korrekte Antwort, um die Wahrheit zu kennen (Wiederholung + erste eindeutige Antwort).
Bei Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\) hilft Mehrheitsentscheid (Chernoff-Argument).
  



                        
                    
                    
                
                
                    After
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
  
  
    Modern multiprocessors do not enforce a global ordering of all instructions for two main reasons: pipelining (processors execute parts of multiple instructions simultaneously and may reorder them internally) and per-processor local caches (loads and stores become visible to other processors only after some delay).
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
  
  
    Modern multiprocessors do not enforce a global ordering of all instructions for two main reasons: pipelining (processors execute parts of multiple instructions simultaneously and may reorder them internally) and per-processor local caches (loads and stores become visible to other processors only after some delay).
    
    
    
    What exactly is guaranteed varies a lot across architectures (x86 vs ARM vs POWER vs Alpha …).
  



                        
                    
                    
                
            
            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            Im Allgemeinen ist die Fehlerreduktion bei Monte-Carlo-Algorithmen <b>nicht möglich</b>.<br>(Gegenbeispiel: Algorithmus, der mit Wahrscheinlichkeit \(\tfrac{1}{2}\) JA bzw. NEIN antwortet, unabhängig von der Eingabe.)<br><br>Möglich ist sie aber bei:<ul><li>{{c1::einseitigem Fehler}}, oder</li><li>{{c2::Fehlerwahrscheinlichkeit \(&lt; \tfrac{1}{2}\)}}.</li></ul>
                            Modern multiprocessors do not enforce a global ordering of all instructions for two main reasons: {{c1::pipelining (processors execute parts of multiple instructions simultaneously and may reorder them internally)}} and {{c2::per-processor local caches (loads and stores become visible to other processors only after some delay)}}.
                        
                        
                        
                            Extra
                            Bei einseitigem Fehler genügt eine einzige korrekte Antwort, um die Wahrheit zu kennen (Wiederholung + erste eindeutige Antwort).<br>Bei Fehlerwahrscheinlichkeit \(&lt; \tfrac{1}{2}\) hilft Mehrheitsentscheid (Chernoff-Argument).
                            What exactly is guaranteed varies a lot across architectures (x86 vs ARM vs POWER vs Alpha …).
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
                    
                
            
            
        
        
        
            Note 40: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: TezZ0;|FPI
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
  
  
    Test-And-Set on a memory cell s, atomically: if mem[s] == 0, set it to 1 and return true; else return false. Compare-And-Swap on cell a with arguments old, new, atomically: read oldval = mem[a]; if oldval == old, set mem[a] = new; in either case return oldval.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
  
  
    Test-And-Set on a memory cell s, atomically: if mem[s] == 0, set it to 1 and return true; else return false. Compare-And-Swap on cell a with arguments old, new, atomically: read oldval = mem[a]; if oldval == old, set mem[a] = new; in either case return oldval.
    
    
    
    TAS is a strict subset of CAS (CAS with old=0, new=1 behaves like TAS, modulo return convention). CAS is strictly more expressive: it can be used to implement queues, stacks, and other lock-free structures, while TAS alone cannot.

Both make the read and the conditional write a single indivisible step - that is precisely what beats the atomic-register lower bound.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            <i>Test-And-Set</i> on a memory cell <code>s</code>, atomically: {{c1::if <code>mem[s] == 0</code>, set it to 1 and return <code>true</code>; else return <code>false</code>}}. <i>Compare-And-Swap</i> on cell <code>a</code> with arguments <code>old, new</code>, atomically: {{c2::read <code>oldval = mem[a]</code>; if <code>oldval == old</code>, set <code>mem[a] = new</code>; in either case return <code>oldval</code>}}.
                        
                        
                        
                            Extra
                            
                            TAS is a strict subset of CAS (CAS with <code>old=0, new=1</code> behaves like TAS, modulo return convention). CAS is strictly more expressive: it can be used to implement queues, stacks, and other lock-free structures, while TAS alone cannot.<br><br>Both make the read and the conditional write a single indivisible step - that is precisely what beats the atomic-register lower bound.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
                    
                
            
            
        
        
        
            Note 41: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: VKeMFH&qvc
            
            modified
            

            
            
            
                
                    Before
                    
                    
                        Front
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
  
  
    Bei einem Monte-Carlo-Algorithmus ist die Korrektheit (bzw. Qualität) eine Zufallsvariable, die Laufzeit hingegen nicht vom Zufall abhängig.

Ziel: immer schnell, meistens korrekt/gut.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
  
  
    Bei einem Monte-Carlo-Algorithmus ist die Korrektheit (bzw. Qualität) eine Zufallsvariable, die Laufzeit hingegen nicht vom Zufall abhängig.

Ziel: immer schnell, meistens korrekt/gut.
    
    
    
    Merksatz: Monte-Carlo ist schnell und vielleicht falsch, Las-Vegas ist korrekt und vielleicht langsam.
  



                        
                    
                    
                
                
                    After
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
  
  
    A proof by contradiction that assert(b >= a) cannot fail (assuming sequential program order) uses the relation happened before together with its property of transitivity.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
  
  
    A proof by contradiction that assert(b >= a) cannot fail (assuming sequential program order) uses the relation happened before together with its property of transitivity.
    
    
    
    Assume \(b{<}a\); this forces \(a{=}1\) and \(b{=}0\), which produces a cyclic happened before chain - contradiction.

The proof relies on programs executing in program order; precisely this assumption breaks in practice due to reordering.
  



                        
                    
                    
                
            
            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            Bei einem <b>Monte-Carlo-Algorithmus</b> ist {{c1::die Korrektheit (bzw. Qualität)}}&nbsp;eine Zufallsvariable, die {{c1::Laufzeit}}&nbsp;hingegen nicht vom Zufall abhängig.<br><br>Ziel: {{c2::immer schnell, meistens korrekt/gut}}.
                            A proof by contradiction that <code>assert(b &gt;= a)</code> cannot fail (assuming sequential program order) uses the relation {{c1::<i>happened before</i>}} together with its property of {{c2::transitivity}}.
                        
                        
                        
                            Extra
                            Merksatz: Monte-Carlo ist <i>schnell und vielleicht falsch</i>, Las-Vegas ist <i>korrekt und vielleicht langsam</i>.
                            Assume \(b{&lt;}a\); this forces \(a{=}1\) and \(b{=}0\), which produces a cyclic <i>happened before</i> chain - contradiction.<br><br>The proof relies on programs executing in program order; precisely this assumption breaks in practice due to reordering.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
                    
                
            
            
        
        
        
            Note 42: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: VQe2x::6Vm
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
  
  
    One way to break the transferTo deadlock is to remove the outer synchronized on transferTo, leaving only the per-method locks on withdraw and deposit. The trade-off is a transient inconsistency: between the withdraw and the deposit, the money is briefly missing from both accounts.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
  
  
    One way to break the transferTo deadlock is to remove the outer synchronized on transferTo, leaving only the per-method locks on withdraw and deposit. The trade-off is a transient inconsistency: between the withdraw and the deposit, the money is briefly missing from both accounts.
    
    
    
    For financial systems this is usually unacceptable: an external observer (audit, balance read) could see the missing amount. Smaller critical sections are a good move whenever the application can tolerate seeing intermediate states.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            One way to break the <code>transferTo</code> deadlock is to {{c1::remove the outer <code>synchronized</code> on <code>transferTo</code>}}, leaving only the per-method locks on <code>withdraw</code> and <code>deposit</code>. The trade-off is a {{c2::transient inconsistency: between the withdraw and the deposit, the money is briefly missing from both accounts}}.
                        
                        
                        
                            Extra
                            
                            For financial systems this is usually unacceptable: an external observer (audit, balance read) could see the missing amount. Smaller critical sections are a good move whenever the application can tolerate seeing intermediate states.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
                    
                
            
            
        
        
        
            Note 43: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: Xhsn{vOeua
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    On a single-core system, mutual exclusion is achieved trivially by using as preprotocol and postprotocol switching interrupts off and on respectively ("switch off/on IRQs").
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    On a single-core system, mutual exclusion is achieved trivially by using as preprotocol and postprotocol switching interrupts off and on respectively ("switch off/on IRQs").
    
    
    
    Reasoning: without interrupts no context switch occurs, so no other thread on the same core can interleave.

This does not work on multi-core, because other cores keep running in parallel.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            On a single-core system, mutual exclusion is achieved trivially by using as preprotocol and postprotocol {{c1::switching interrupts off and on respectively ("switch off/on IRQs")}}.
                        
                        
                        
                            Extra
                            
                            Reasoning: without interrupts no context switch occurs, so no other thread on the same core can interleave.<br><br>This does <b>not</b> work on multi-core, because other cores keep running in parallel.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        
        
        
            Note 44: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: YD,MtGo!Tz
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    Even with atomic registers, programs can still be non-deterministic, because the model says nothing about the order of effect times for concurrent operations.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    Even with atomic registers, programs can still be non-deterministic, because the model says nothing about the order of effect times for concurrent operations.
    
    
    
    Atomicity gives us "each operation has a single point in time", but does not pin down which point. So two concurrent writes can serialise either way, and a concurrent read may legitimately observe either of the two values.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            Even with atomic registers, programs can still be non-deterministic, because {{c1::the model says nothing about the order of effect times for <i>concurrent</i> operations}}.
                        
                        
                        
                            Extra
                            
                            Atomicity gives us "each operation has a single point in time", but does not pin down <i>which</i> point. So two concurrent writes can serialise either way, and a concurrent read may legitimately observe either of the two values.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        
        
        
            Note 45: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: ]SzQ}B-b0#
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    A common Java bug when implementing Peterson's lock is writing volatile boolean[] flag = new boolean[2];, which makes only the array reference volatile, not its elements. Element accesses then have no volatile semantics. The correct fix is to use AtomicIntegerArray (or AtomicIntegerArray for victim).
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    A common Java bug when implementing Peterson's lock is writing volatile boolean[] flag = new boolean[2];, which makes only the array reference volatile, not its elements. Element accesses then have no volatile semantics. The correct fix is to use AtomicIntegerArray (or AtomicIntegerArray for victim).
    
    
    
    The buggy version may appear to work in practice (especially on x86), but provides no JMM guarantees about visibility and ordering of writes to individual array slots. AtomicIntegerArray exposes per-element atomic and volatile-like operations.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            A common Java bug when implementing Peterson's lock is writing <code>volatile boolean[] flag = new boolean[2];</code>, which makes {{c1::only the array <i>reference</i> volatile, not its <i>elements</i>}}. Element accesses then have no volatile semantics. The correct fix is to use {{c2::<code>AtomicIntegerArray</code> (or <code>AtomicIntegerArray</code> for <code>victim</code>)}}.
                        
                        
                        
                            Extra
                            
                            The buggy version may appear to work in practice (especially on x86), but provides no JMM guarantees about visibility and ordering of writes to individual array slots. <code>AtomicIntegerArray</code> exposes per-element atomic and volatile-like operations.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        
        
        
            Note 46: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: _9&=
            

            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
  
  
    Two strategies for handling deadlocks: detection (find cycles in the resource graph at runtime; cannot in general be healed because releasing locks usually leaves inconsistent state) and avoidance (statically ensure a cycle can never form). Two standard avoidance techniques are two-phase locking with retry (release everything and restart on conflict; common in databases where transactions can be aborted) and resource ordering (impose a global total order on locks; common in parallel programming where state mutations are not transactional).
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
  
  
    Two strategies for handling deadlocks: detection (find cycles in the resource graph at runtime; cannot in general be healed because releasing locks usually leaves inconsistent state) and avoidance (statically ensure a cycle can never form). Two standard avoidance techniques are two-phase locking with retry (release everything and restart on conflict; common in databases where transactions can be aborted) and resource ordering (impose a global total order on locks; common in parallel programming where state mutations are not transactional).
    
    
    
    Detection is reactive; avoidance is proactive. Avoidance is preferred when consistency requirements rule out the rollback-and-retry approach.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            Two strategies for handling deadlocks: {{c1::detection (find cycles in the resource graph at runtime; cannot in general be healed because releasing locks usually leaves inconsistent state)}} and {{c2::avoidance (statically ensure a cycle can never form)}}. Two standard avoidance techniques are {{c3::two-phase locking with retry (release everything and restart on conflict; common in databases where transactions can be aborted)}} and {{c3::resource ordering (impose a global total order on locks; common in parallel programming where state mutations are not transactional)}}.
                        
                        
                        
                            Extra
                            
                            Detection is reactive; avoidance is proactive. Avoidance is preferred when consistency requirements rule out the rollback-and-retry approach.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::7._Deadlock_Avoidance
                    
                
            
            
        

        
        
            Note 47: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: a+AQXsL|IC
            
            modified
            

            
            
            
                
                    Before
                    
                    
                        Front
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
  
  
    Monte-Carlo, einseitiger Fehler
Sei \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\begin{gathered}\Pr[A(I) = \text{JA}] = 1, \text{ falls } I \text{ JA-Instanz},\\ \Pr[A(I) = \text{NEIN}] \geq \varepsilon, \text{ falls } I \text{ NEIN-Instanz}.\end{gathered}\]Sei \(A_\delta\) der Algorithmus, der \(A\) solange aufruft, bis entweder NEIN ausgegeben wird (dann selbst NEIN) oder bis\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\]mal JA ausgegeben wurde (dann selbst JA). Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
  
  
    Monte-Carlo, einseitiger Fehler
Sei \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\begin{gathered}\Pr[A(I) = \text{JA}] = 1, \text{ falls } I \text{ JA-Instanz},\\ \Pr[A(I) = \text{NEIN}] \geq \varepsilon, \text{ falls } I \text{ NEIN-Instanz}.\end{gathered}\]Sei \(A_\delta\) der Algorithmus, der \(A\) solange aufruft, bis entweder NEIN ausgegeben wird (dann selbst NEIN) oder bis\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\]mal JA ausgegeben wurde (dann selbst JA). Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]
    
    
    
    Selbe Iterationsschranke wie bei Las-Vegas: \(N = \lceil \varepsilon^{-1} \ln \delta^{-1} \rceil\). Tatsächlich ist das hier äquivalent zur Las-Vegas-Sicht: man kann den einseitig-fehlerhaften MC-Algorithmus als Las-Vegas-Algorithmus auffassen, der '???' (alias 'JA') ausgibt, wenn er nicht sicher ist.
  



                        
                    
                    
                
                
                    After
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::2._Memory_Hierarchy
  
  
    Typical latencies in the memory hierarchy of one core (order of magnitude): registers ~0.5 ns, L1 cache ~1 ns, L2 cache ~7 ns, system memory ~100 ns.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::2._Memory_Hierarchy
  
  
    Typical latencies in the memory hierarchy of one core (order of magnitude): registers ~0.5 ns, L1 cache ~1 ns, L2 cache ~7 ns, system memory ~100 ns.
    
    
    
    Trade-off across the hierarchy: top is fast, low latency, high cost, low capacity; bottom is slow, high latency, low cost, high capacity.

Factor between register and RAM: roughly \(200{\times}\).
  



                        
                    
                    
                
            
            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            <b>Monte-Carlo, einseitiger Fehler<br></b>Sei \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\begin{gathered}\Pr[A(I) = \text{JA}] = 1, \text{ falls } I \text{ JA-Instanz},\\ \Pr[A(I) = \text{NEIN}] \geq \varepsilon, \text{ falls } I \text{ NEIN-Instanz}.\end{gathered}\]Sei \(A_\delta\) der Algorithmus, der \(A\) solange aufruft, bis entweder NEIN ausgegeben wird (dann selbst NEIN) oder bis\[N = {{c1::\left\lceil \varepsilon^{-1} \ln \delta^{-1} \right\rceil}}\]mal JA ausgegeben wurde (dann selbst JA). Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq {{c2::1 - \delta}}.\]
                            Typical latencies in the memory hierarchy of one core (order of magnitude): registers {{c1::~0.5 ns}}, L1 cache {{c1::~1 ns}}, L2 cache {{c1::~7 ns}}, system memory {{c1::~100 ns}}.
                        
                        
                        
                            Extra
                            Selbe Iterationsschranke wie bei Las-Vegas: \(N = \lceil \varepsilon^{-1} \ln \delta^{-1} \rceil\). Tatsächlich ist das hier äquivalent zur Las-Vegas-Sicht: man kann den einseitig-fehlerhaften MC-Algorithmus als Las-Vegas-Algorithmus auffassen, der '???' (alias 'JA') ausgibt, wenn er nicht sicher ist.
                            Trade-off across the hierarchy: top is fast, low latency, high cost, low capacity; bottom is slow, high latency, low cost, high capacity.<br><br>Factor between register and RAM: roughly \(200{\times}\).
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::2._Memory_Hierarchy
                    
                
            
            
        
        
        
            Note 48: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: g?{Lm[l$i^
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
  
  
    HB consistency: a read sees either the latest write in HB or any other write that is unordered with respect to it in HB. Consequence: the JMM explicitly allows data races rather than forbidding them.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
  
  
    HB consistency: a read sees either the latest write in HB or any other write that is unordered with respect to it in HB. Consequence: the JMM explicitly allows data races rather than forbidding them.
    
    
    
    Example: in a race a read may observe an "old" value even though a newer write exists, as long as that newer write is not HB-before the read. This is how the typical "impossible" outcomes arise.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            <i>HB consistency</i>: a read sees either {{c1::the latest write in HB}} or {{c1::any other write that is unordered with respect to it in HB}}. Consequence: the JMM {{c2::explicitly allows data races}} rather than forbidding them.
                        
                        
                        
                            Extra
                            
                            Example: in a race a read may observe an "old" value even though a newer write exists, as long as that newer write is not HB-before the read. This is how the typical "impossible" outcomes arise.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
                    
                
            
            
        
        
        
            Note 49: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: hd
            

            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    Any lock pre-protocol can be split into a doorway interval \(D\) (finite number of steps) and a waiting interval \(W\) (unbounded number of steps). A lock is first-come-first-served (FCFS) iff for any two threads \(A, B\): \(D_A^j \to D_B^k \implies CS_A^j \to CS_B^k\).
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    Any lock pre-protocol can be split into a doorway interval \(D\) (finite number of steps) and a waiting interval \(W\) (unbounded number of steps). A lock is first-come-first-served (FCFS) iff for any two threads \(A, B\): \(D_A^j \to D_B^k \implies CS_A^j \to CS_B^k\).
    
    
    
    The split formalises "once you finished the bounded entry handshake (\(D\)), you are committed; the only remaining wait is the unbounded \(W\)". FCFS demands that the order in which threads finish their doorways is the same as the order in which they enter their critical sections.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            Any lock pre-protocol can be split into a {{c1::doorway interval \(D\) (finite number of steps)}} and a {{c2::waiting interval \(W\) (unbounded number of steps)}}. A lock is <i>first-come-first-served</i> (FCFS) iff for any two threads \(A, B\): {{c3::\(D_A^j \to D_B^k \implies CS_A^j \to CS_B^k\)}}.
                        
                        
                        
                            Extra
                            
                            The split formalises "once you finished the bounded entry handshake (\(D\)), you are committed; the only remaining wait is the unbounded \(W\)". FCFS demands that the order in which threads finish their doorways is the same as the order in which they enter their critical sections.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        

        
        
            Note 50: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: m;emJ<(X_K
            
            modified
            

            
            
            
                
                    Before
                    
                    
                        Front
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
  
  
    Jeder deterministische Algorithmus für das Üppige-Auswahl-Problem benötigt Laufzeit \(\Omega(n)\).

Beweisidee (Adversary-Argument): 
Für fixen Algorithmus konstruieren wir die Instanz schrittweise. Auf jede Anfrage \(x_i\) des Algorithmus antworten wir mit \(0\), solange wir noch nicht \(n/2\) Nullen platziert haben. Sobald \(n/2\) Nullen platziert sind, füllen wir die restlichen Positionen mit \(1\) auf. Damit liest der Algorithmus mindestens \(n/2 + 1\) Eingabezahlen, bevor er eine \(1\) sieht.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
  
  
    Jeder deterministische Algorithmus für das Üppige-Auswahl-Problem benötigt Laufzeit \(\Omega(n)\).

Beweisidee (Adversary-Argument): 
Für fixen Algorithmus konstruieren wir die Instanz schrittweise. Auf jede Anfrage \(x_i\) des Algorithmus antworten wir mit \(0\), solange wir noch nicht \(n/2\) Nullen platziert haben. Sobald \(n/2\) Nullen platziert sind, füllen wir die restlichen Positionen mit \(1\) auf. Damit liest der Algorithmus mindestens \(n/2 + 1\) Eingabezahlen, bevor er eine \(1\) sieht.
    
    
    
    Das ist ein typisches Adversary-Argument: Der "Gegenspieler" entscheidet die Eingabe erst während der Ausführung, immer so, dass der Algorithmus möglichst lange braucht. Da der Algorithmus deterministisch ist, weiss der Adversary genau, welche Position als nächstes gelesen wird, und kann sie auf \(0\) setzen.
  



                        
                    
                    
                
                
                    After
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
  
  
    The Java Memory Model (JMM) defines actions such as read(x):1 ("read variable x; the value read is 1"). Executions combine actions with four orderings: Program Order (PO), Synchronizes-With (SW), Synchronization Order (SO), Happens-Before (HB).
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
  
  
    The Java Memory Model (JMM) defines actions such as read(x):1 ("read variable x; the value read is 1"). Executions combine actions with four orderings: Program Order (PO), Synchronizes-With (SW), Synchronization Order (SO), Happens-Before (HB).
    
    
    
    The JMM restricts the allowed outcomes of programs. Without volatile / synchronized the result is essentially "arbitrary".
  



                        
                    
                    
                
            
            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            Jeder deterministische Algorithmus für das Üppige-Auswahl-Problem benötigt Laufzeit \(\Omega({{c1::n}})\).<br><br><b>Beweisidee (Adversary-Argument):</b> <br>Für fixen Algorithmus konstruieren wir die Instanz schrittweise. Auf jede Anfrage \(x_i\) des Algorithmus antworten wir mit&nbsp;{{c2::\(0\)}}, solange wir noch nicht \(n/2\) Nullen platziert haben. Sobald \(n/2\) Nullen platziert sind, füllen wir die restlichen Positionen mit \(1\) auf. Damit liest der Algorithmus mindestens {{c3::\(n/2 + 1\)}} Eingabezahlen, bevor er eine \(1\) sieht.
                            The Java Memory Model (JMM) defines <i>actions</i> such as {{c1::<code>read(x):1</code> ("read variable x; the value read is 1")}}. <i>Executions</i> combine actions with four orderings: {{c2::Program Order (PO), Synchronizes-With (SW), Synchronization Order (SO), Happens-Before (HB)}}.
                        
                        
                        
                            Extra
                            Das ist ein typisches <b>Adversary-Argument</b>: Der "Gegenspieler" entscheidet die Eingabe erst während der Ausführung, immer so, dass der Algorithmus möglichst lange braucht. Da der Algorithmus deterministisch ist, weiss der Adversary genau, welche Position als nächstes gelesen wird, und kann sie auf \(0\) setzen.
                            The JMM restricts the allowed outcomes of programs. Without <code>volatile</code> / <code>synchronized</code> the result is essentially "arbitrary".
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
                    
                
            
            
        
        
        
            Note 51: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: p)%lBpQ>Q<
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
  
  
    Three common families of hardware atomic instructions: Test-And-Set (TAS) - e.g. m68k TSL; Compare-And-Swap (CAS) - e.g. x86 LOCK CMPXCHG, SPARC CASA; Load-Linked / Store-Conditional - e.g. ARM LDREX/STREX, MIPS / POWER / RISC-V LL/SC. They are typically 10-100x slower than plain reads / writes.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
  
  
    Three common families of hardware atomic instructions: Test-And-Set (TAS) - e.g. m68k TSL; Compare-And-Swap (CAS) - e.g. x86 LOCK CMPXCHG, SPARC CASA; Load-Linked / Store-Conditional - e.g. ARM LDREX/STREX, MIPS / POWER / RISC-V LL/SC. They are typically 10-100x slower than plain reads / writes.
    
    
    
    These primitives are needed because plain atomic registers cannot break the \(\Omega(n)\) space lower bound for \(n\)-thread mutex (Burns-Lynch). They also enable lock-free data structures.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            Three common families of hardware atomic instructions: {{c1::Test-And-Set (TAS)}} - e.g. m68k <code>TSL</code>; {{c2::Compare-And-Swap (CAS)}} - e.g. x86 <code>LOCK CMPXCHG</code>, SPARC <code>CASA</code>; {{c3::Load-Linked / Store-Conditional}} - e.g. ARM <code>LDREX/STREX</code>, MIPS / POWER / RISC-V <code>LL/SC</code>. They are typically {{c4::10-100x slower than plain reads / writes}}.
                        
                        
                        
                            Extra
                            
                            These primitives are needed because plain atomic registers cannot break the \(\Omega(n)\) space lower bound for \(n\)-thread mutex (Burns-Lynch). They also enable lock-free data structures.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::6._Locks_with_Atomic_Operations
                    
                
            
            
        
        
        
            Note 52: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: qzA@de.%OP
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    The Filter Lock satisfies mutual exclusion, deadlock freedom, and starvation freedom, but it is not first-come-first-served: a thread that finished its doorway later may overtake one that finished earlier.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    The Filter Lock satisfies mutual exclusion, deadlock freedom, and starvation freedom, but it is not first-come-first-served: a thread that finished its doorway later may overtake one that finished earlier.
    
    
    
    Reason: there is no global ticket; progress depends on the level/victim arrays, which are updated continuously. A thread that arrives later can become "victim" at a high level after another thread already passed that level. Bakery is the canonical lock that fixes this.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            The Filter Lock satisfies {{c1::mutual exclusion, deadlock freedom, and starvation freedom}}, but it is {{c2::not first-come-first-served}}: a thread that finished its doorway later may overtake one that finished earlier.
                        
                        
                        
                            Extra
                            
                            Reason: there is no global ticket; progress depends on the level/<code>victim</code> arrays, which are updated continuously. A thread that arrives later can become "victim" at a high level after another thread already passed that level. Bakery is the canonical lock that fixes this.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        
        
        
            Note 53: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: t&2cm[5&m<
            
            modified
            

            
            
            
                
                    Before
                    
                    
                        Front
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
  
  
    Beim Las-Vegas-Algorithmus für das Üppige-Auswahl-Problem ist \(T\) die Anzahl an Wiederholungen mit \(\Pr[T > k] = (1/2)^k\).

Was ist \(\mathbb{E}[T]\)? Gib zwei Herleitungen an.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
  
  
    Beim Las-Vegas-Algorithmus für das Üppige-Auswahl-Problem ist \(T\) die Anzahl an Wiederholungen mit \(\Pr[T > k] = (1/2)^k\).

Was ist \(\mathbb{E}[T]\)? Gib zwei Herleitungen an.
    
    
    
    \[\mathbb{E}[T] = 2.\]Erwartete Laufzeit des Algorithmus: \(O(1)\), unabhängig von \(n\).

Direkter Ansatz (über \(\mathbb{E}[T] = \sum_{k \geq 1} \Pr[T \geq k]\)):
\[\begin{gathered}\mathbb{E}[T] = \sum_{k=1}^{\infty} \Pr[T \geq k] = \sum_{k=1}^{\infty} \Pr[T > k - 1] = \sum_{k=1}^{\infty} \left(\tfrac{1}{2}\right)^{k-1}\\= 1 + \tfrac{1}{2} + \tfrac{1}{4} + \tfrac{1}{8} + \cdots = 2.\end{gathered}\]Indirekter Ansatz (Rekursion via Fallunterscheidung im ersten Versuch):
\[\mathbb{E}[T] = \Pr[x_i = 1] \cdot 1 + \Pr[x_i = 0] \cdot (1 + \mathbb{E}[T']),\]wobei \(T'\) die Anzahl an weiteren Wiederholungen ist, falls die erste fehlschlägt. Da \(T'\) und \(T\) gleichverteilt sind:
\[\mathbb{E}[T] = \tfrac{1}{2} \cdot 1 + \tfrac{1}{2}(1 + \mathbb{E}[T]) = 1 + \tfrac{1}{2}\mathbb{E}[T] \;\Longrightarrow\; \mathbb{E}[T] = 2.\]
  



                        
                    
                    
                
                
                    After
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
  
  
    Synchronization actions (SA) in the JMM are: read/write of a volatile variable; lock and unlock of a monitor; the first and last action of a thread (synthetic); actions that start a thread; actions that determine whether a thread has terminated.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
  
  
    Synchronization actions (SA) in the JMM are: read/write of a volatile variable; lock and unlock of a monitor; the first and last action of a thread (synthetic); actions that start a thread; actions that determine whether a thread has terminated.
    
    
    
    SA are the building blocks of the synchronization order (SO). Anything else is an "ordinary" action.
  



                        
                    
                    
                
            
            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            Beim Las-Vegas-Algorithmus für das Üppige-Auswahl-Problem ist \(T\) die Anzahl an Wiederholungen mit \(\Pr[T &gt; k] = (1/2)^k\).<br><br>Was ist \(\mathbb{E}[T]\)? Gib zwei Herleitungen an.
                            <i>Synchronization actions (SA)</i> in the JMM are: {{c1::read/write of a <code>volatile</code> variable}}; {{c1::lock and unlock of a monitor}}; {{c1::the first and last action of a thread (synthetic)}}; {{c1::actions that start a thread}}; {{c1::actions that determine whether a thread has terminated}}.
                        
                        
                        
                            Extra
                            \[\mathbb{E}[T] = 2.\]Erwartete Laufzeit des Algorithmus: \(O(1)\), unabhängig von \(n\).<br><br><b>Direkter Ansatz</b> (über \(\mathbb{E}[T] = \sum_{k \geq 1} \Pr[T \geq k]\)):<br>\[\begin{gathered}\mathbb{E}[T] = \sum_{k=1}^{\infty} \Pr[T \geq k] = \sum_{k=1}^{\infty} \Pr[T &gt; k - 1] = \sum_{k=1}^{\infty} \left(\tfrac{1}{2}\right)^{k-1}\\= 1 + \tfrac{1}{2} + \tfrac{1}{4} + \tfrac{1}{8} + \cdots = 2.\end{gathered}\]<b>Indirekter Ansatz</b> (Rekursion via Fallunterscheidung im ersten Versuch):<br>\[\mathbb{E}[T] = \Pr[x_i = 1] \cdot 1 + \Pr[x_i = 0] \cdot (1 + \mathbb{E}[T']),\]wobei \(T'\) die Anzahl an weiteren Wiederholungen ist, falls die erste fehlschlägt. Da \(T'\) und \(T\) gleichverteilt sind:<br>\[\mathbb{E}[T] = \tfrac{1}{2} \cdot 1 + \tfrac{1}{2}(1 + \mathbb{E}[T]) = 1 + \tfrac{1}{2}\mathbb{E}[T] \;\Longrightarrow\; \mathbb{E}[T] = 2.\]
                            SA are the building blocks of the synchronization order (SO). Anything else is an "ordinary" action.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::5._Üppige-Auswahl-Problem
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::4._Java_Memory_Model
                    
                
            
            
        
        
        
            Note 54: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: u{Ck&o_a<_
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    The mutual-exclusion proof for Peterson's lock proceeds by contradiction: assume \(\mathrm{CS}_P\) and \(\mathrm{CS}_Q\) overlap and WLOG {{c1::\(W_Q(\text{victim}{=}Q) \to W_P(\text{victim}{=}P)\)}}. Then by transitivity along each thread's code, \(P\)'s read of flag[Q] happens after \(Q\) wrote flag[Q]=true, so \(P\) must have read flag[Q] = true, and reading victim must yield \(P\); both conjuncts of the while-condition hold, so \(P\) cannot have entered. Contradiction.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    The mutual-exclusion proof for Peterson's lock proceeds by contradiction: assume \(\mathrm{CS}_P\) and \(\mathrm{CS}_Q\) overlap and WLOG {{c1::\(W_Q(\text{victim}{=}Q) \to W_P(\text{victim}{=}P)\)}}. Then by transitivity along each thread's code, \(P\)'s read of flag[Q] happens after \(Q\) wrote flag[Q]=true, so \(P\) must have read flag[Q] = true, and reading victim must yield \(P\); both conjuncts of the while-condition hold, so \(P\) cannot have entered. Contradiction.
    
    
    
    Key levers used: the writes inside one thread are totally ordered by program order (so \(W(\text{flag}) \to W(\text{victim}) \to R(\text{flag}) \to R(\text{victim}) \to CS\)), and atomic registers let us chain the orderings via \(\tau\).
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            The mutual-exclusion proof for Peterson's lock proceeds by contradiction: assume \(\mathrm{CS}_P\) and \(\mathrm{CS}_Q\) overlap and WLOG {{c1::\(W_Q(\text{victim}{=}Q) \to W_P(\text{victim}{=}P)\)}}. Then by transitivity along each thread's code, \(P\)'s read of <code>flag[Q]</code> happens after \(Q\) wrote <code>flag[Q]=true</code>, so \(P\) {{c2::must have read <code>flag[Q] = true</code>}}, and reading <code>victim</code> must yield \(P\); both conjuncts of the while-condition hold, so \(P\) cannot have entered. Contradiction.
                        
                        
                        
                            Extra
                            
                            Key levers used: the writes inside one thread are totally ordered by program order (so \(W(\text{flag}) \to W(\text{victim}) \to R(\text{flag}) \to R(\text{victim}) \to CS\)), and atomic registers let us chain the orderings via \(\tau\).
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        
        
        
            Note 55: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: w(*5>B[{%<
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    In the 2nd-try mutex p2: wantp = true;
p3: while(wantq);
p4: critical section
 the order "first set my flag, then wait" satisfies mutual exclusion but deadlocks, because {{c2::both threads can set their flags before either reads the other's, leaving both stuck spinning at state \((p3, q3, \text{true}, \text{true})\)}}.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    In the 2nd-try mutex p2: wantp = true;
p3: while(wantq);
p4: critical section
 the order "first set my flag, then wait" satisfies mutual exclusion but deadlocks, because {{c2::both threads can set their flags before either reads the other's, leaving both stuck spinning at state \((p3, q3, \text{true}, \text{true})\)}}.
    
    
    
    Trade-off compared with the 1st try: ME is now ensured (the flag is set before the check), but the symmetric race produces a stable bad state. No process is making progress, yet none is starving in the technical sense - it is a true deadlock.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            In the 2nd-try mutex <pre>p2: wantp = true;
p3: while(wantq);
p4: critical section</pre> the order "first set my flag, then wait" satisfies mutual exclusion but {{c1::deadlocks}}, because {{c2::both threads can set their flags before either reads the other's, leaving both stuck spinning at state \((p3, q3, \text{true}, \text{true})\)}}.
                        
                        
                        
                            Extra
                            
                            Trade-off compared with the 1st try: ME is now ensured (the flag is set before the check), but the symmetric race produces a stable bad state. No process is making progress, yet none is starving in the technical sense - it is a true deadlock.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        
        
        
            Note 56: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: z|ZzS?_M6b
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    An interval \((a_0, a_1)\) is a pair of events with \(a_0 \to a_1\). For two intervals \(I_A = (a_0, a_1)\) and \(I_B = (b_0, b_1)\), we write \(I_A \to I_B\) iff \(a_1 \to b_0\). If neither \(I_A \to I_B\) nor \(I_B \to I_A\) holds, they are concurrent.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
  
  
    An interval \((a_0, a_1)\) is a pair of events with \(a_0 \to a_1\). For two intervals \(I_A = (a_0, a_1)\) and \(I_B = (b_0, b_1)\), we write \(I_A \to I_B\) iff \(a_1 \to b_0\). If neither \(I_A \to I_B\) nor \(I_B \to I_A\) holds, they are concurrent.
    
    
    
    So intervals only "precede" each other when one ends entirely before the other begins. Overlapping intervals are concurrent. Unlike the precedence on events, the relation on intervals is not a total order.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            An <i>interval</i> \((a_0, a_1)\) is a pair of events with \(a_0 \to a_1\). For two intervals \(I_A = (a_0, a_1)\) and \(I_B = (b_0, b_1)\), we write \(I_A \to I_B\) iff {{c1::\(a_1 \to b_0\)}}. If neither \(I_A \to I_B\) nor \(I_B \to I_A\) holds, they are {{c2::concurrent}}.
                        
                        
                        
                            Extra
                            
                            So intervals only "precede" each other when one ends entirely before the other begins. Overlapping intervals are concurrent. Unlike the precedence on events, the relation on intervals is <i>not</i> a total order.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::5._Locks_with_Atomic_Registers
                    
                
            
            
        
        
        
            Note 57: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: |2|f7e7DL}
            
            modified
            

            
            
            
                
                    Before
                    
                    
                        Front
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
  
  
    Eine Zufallsquelle liefert nach vorgegebener Verteilung zufällige unabhängige Zufallsbits oder -zahlen. Zwei Typen:Physikalische Zufallsgeneratoren: Lottozahlen, Geigerzähler, Rauschen, Quantenexperimente.
Deterministische (Pseudo-)Zufallsgeneratoren: liefern, basierend auf einem Startwert (seed) \(s\), eine Folge \(s \to r_0(s), r_1(s), r_2(s), \ldots\) Das ermöglicht Reproduzierbarkeit, wenn man sich den Startwert merkt.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
  
  
    Eine Zufallsquelle liefert nach vorgegebener Verteilung zufällige unabhängige Zufallsbits oder -zahlen. Zwei Typen:Physikalische Zufallsgeneratoren: Lottozahlen, Geigerzähler, Rauschen, Quantenexperimente.
Deterministische (Pseudo-)Zufallsgeneratoren: liefern, basierend auf einem Startwert (seed) \(s\), eine Folge \(s \to r_0(s), r_1(s), r_2(s), \ldots\) Das ermöglicht Reproduzierbarkeit, wenn man sich den Startwert merkt.
    
    
    
    In der Analyse geht man von idealen Zufallsgeneratoren aus. Bei der Anwendung muss man aber berücksichtigen, dass Pseudo-Zufallsgeneratoren das nur näherungsweise leisten.
  



                        
                    
                    
                
                
                    After
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
  
  
    Number of interleavings for 2 threads with \(k\) statements each: {{c1::\(\binom{2k}{k} = O\!\left(\dfrac{4^k}{\sqrt{2k} }\right) \)}}.
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
  
  
    Number of interleavings for 2 threads with \(k\) statements each: {{c1::\(\binom{2k}{k} = O\!\left(\dfrac{4^k}{\sqrt{2k} }\right) \)}}.
    
    
    
    Derivation: the merged trace has length \(2k\). Once we fix which \(k\) of the \(2k\) positions belong to thread 1, the interleaving is determined (sampling without replacement).
  



                        
                    
                    
                
            
            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            Eine <b>Zufallsquelle</b> liefert nach vorgegebener Verteilung zufällige unabhängige Zufallsbits oder -zahlen. Zwei Typen:<ul><li><b>Physikalische Zufallsgeneratoren</b>: {{c1::Lottozahlen, Geigerzähler, Rauschen, Quantenexperimente::Beispiele?}}.</li><li><b>Deterministische (Pseudo-)Zufallsgeneratoren</b>: liefern, basierend auf einem {{c2::Startwert (seed)}} \(s\), eine Folge \(s \to r_0(s), r_1(s), r_2(s), \ldots\) Das ermöglicht {{c3::Reproduzierbarkeit, wenn man sich den Startwert merkt}}.</li></ul>
                            Number of interleavings for 2 threads with \(k\) statements each: {{c1::\(\binom{2k}{k} = O\!\left(\dfrac{4^k}{\sqrt{2k} }\right) \)}}.
                        
                        
                        
                            Extra
                            In der Analyse geht man von idealen Zufallsgeneratoren aus. Bei der Anwendung muss man aber berücksichtigen, dass Pseudo-Zufallsgeneratoren das nur näherungsweise leisten.
                            Derivation: the merged trace has length \(2k\). Once we fix which \(k\) of the \(2k\) positions belong to thread 1, the interleaving is determined (sampling without replacement).
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
                    
                
            
            
        
        
        
            Note 58: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: ~d57MCW=Q9
            
            modified
            

            
            
            
                
                    Before
                    
                    
                        Front
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
  
  
    Bei der Fehlerreduktion eines Las-Vegas-Algorithmus mit Erfolgswahrscheinlichkeit \(\geq \varepsilon\) genügen \(N = \lceil \varepsilon^{-1} \ln \delta^{-1} \rceil\) Wiederholungen, um \(\Pr[\text{korrekt}] \geq 1 - \delta\) zu erreichen.

Wie skaliert \(N\) in \(\delta\)? Was bedeutet das praktisch?
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
  
  
    Bei der Fehlerreduktion eines Las-Vegas-Algorithmus mit Erfolgswahrscheinlichkeit \(\geq \varepsilon\) genügen \(N = \lceil \varepsilon^{-1} \ln \delta^{-1} \rceil\) Wiederholungen, um \(\Pr[\text{korrekt}] \geq 1 - \delta\) zu erreichen.

Wie skaliert \(N\) in \(\delta\)? Was bedeutet das praktisch?
    
    
    
    \(N\) skaliert logarithmisch in \(\delta^{-1}\).

Praktisch: Um die Fehlerwahrscheinlichkeit um einen konstanten Faktor (z.B. von \(10^{-2}\) auf \(10^{-3}\)) zu reduzieren, sind nur konstant viele zusätzliche Iterationen nötig.

Beispiel \(\varepsilon = 0{,}25\): \(\delta = 10^{-1} \Rightarrow N = 10\), \(\delta = 10^{-2} \Rightarrow N = 19\), \(\delta = 10^{-3} \Rightarrow N = 28\), \(\ldots\) (jeweils \(+9\)).
  



                        
                    
                    
                
                
                    After
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
  
  
    For the home-made rendezvous void wait()  { x = 1; while(x==1); }
void arrive(){ x = 2; }
 GCC with optimisations compiles the loop into an infinite loop (jmp test without ever reloading x).
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
  
  
    For the home-made rendezvous void wait()  { x = 1; while(x==1); }
void arrive(){ x = 2; }
 GCC with optimisations compiles the loop into an infinite loop (jmp test without ever reloading x).
    
    
    
    Reason: the compiler hoists the read of x out of the loop (register hoisting) because, from its sequential view, x is not modified inside the loop. Writes from other threads are invisible to it.

Fix: volatile or a lock.
  



                        
                    
                    
                
            
            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            Bei der Fehlerreduktion eines Las-Vegas-Algorithmus mit Erfolgswahrscheinlichkeit \(\geq \varepsilon\) genügen \(N = \lceil \varepsilon^{-1} \ln \delta^{-1} \rceil\) Wiederholungen, um \(\Pr[\text{korrekt}] \geq 1 - \delta\) zu erreichen.<br><br>Wie skaliert \(N\) in \(\delta\)? Was bedeutet das praktisch?
                            For the home-made rendezvous <pre>void wait()  { x = 1; while(x==1); }
void arrive(){ x = 2; }</pre> GCC with optimisations compiles the loop into {{c1::an infinite loop (<code>jmp test</code> without ever reloading <code>x</code>)}}.
                        
                        
                        
                            Extra
                            \(N\) skaliert <b>logarithmisch</b> in \(\delta^{-1}\).<br><br>Praktisch: Um die Fehlerwahrscheinlichkeit um einen <b>konstanten Faktor</b> (z.B. von \(10^{-2}\) auf \(10^{-3}\)) zu reduzieren, sind nur <b>konstant viele zusätzliche Iterationen</b> nötig.<br><br>Beispiel \(\varepsilon = 0{,}25\): \(\delta = 10^{-1} \Rightarrow N = 10\), \(\delta = 10^{-2} \Rightarrow N = 19\), \(\delta = 10^{-3} \Rightarrow N = 28\), \(\ldots\) (jeweils \(+9\)).
                            Reason: the compiler hoists the read of <code>x</code> out of the loop (register hoisting) because, from its sequential view, <code>x</code> is not modified inside the loop. Writes from other threads are invisible to it.<br><br>Fix: <code>volatile</code> or a lock.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    ETH::2._Semester::A&W::2._Wahrscheinlichkeitstheorie_und_randomisierte_Algorithmen::8._Randomisierte_Algorithmen::1._Reduktion_der_Fehlerwahrscheinlichkeit
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
                    
                
            
            
        
        
        
            Note 59: ETH::2. Semester::PProg
            
                Deck: ETH::2. Semester::PProg

                Note Type: Horvath Cloze

                GUID: ~i|&N8&P3M
            
            added
            

            
            
            
                
                    Previous
                    Note did not exist
                
                
                    New Note
                    
                    
                        Front
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
  
  
    Realistic example of incorrect code: boolean stop = false;
void f() { while(!stop) { /* draw */ } }
void g() { stop = didUserQuit(); }
 There is no guarantee that thread 1 will ever stop - although it will "likely work in practice".
  



                        
                    
                    
                        Back
                        
                            
  ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering
  
  
    Realistic example of incorrect code: boolean stop = false;
void f() { while(!stop) { /* draw */ } }
void g() { stop = didUserQuit(); }
 There is no guarantee that thread 1 will ever stop - although it will "likely work in practice".
    
    
    
    Reason: the compiler may hoist stop into a register and never reload it inside the loop. Additionally, thread 2's write may sit in its local cache.

Fix: volatile boolean stop or AtomicBoolean.
  



                        
                    
                    
                
            

            

            
            
            
                Field-by-field Comparison
                
                    
                        
                            Field
                            Before
                            After
                        
                    
                    
                        
                        
                            Text
                            
                            Realistic example of incorrect code: <pre>boolean stop = false;
void f() { while(!stop) { /* draw */ } }
void g() { stop = didUserQuit(); }</pre> There is {{c1::no guarantee that thread 1 will ever stop}} - although it {{c2::will "likely work in practice"}}.
                        
                        
                        
                            Extra
                            
                            Reason: the compiler may hoist <code>stop</code> into a register and never reload it inside the loop. Additionally, thread 2's write may sit in its local cache.<br><br>Fix: <code>volatile boolean stop</code> or <code>AtomicBoolean</code>.
                        
                        
                    
                
            
            

            
            
            
                Tags:
                
                    
                    
                    
                    
                    
                    
                    ETH::2._Semester::PProg::12._Shared_Memory_Concurrency::1._Memory_Reordering

Field	Before	After
Text		The {{c1::Filter Lock}} extends Peterson's lock from 2 to \(n\) processes by introducing {{c2::\(n-1\) levels}}: each thread climbs from level 1 up to level \(n-1\), and at each level \(\ell\) it sets <code>level[me] = \ell</code> and <code>victim[\ell] = me</code>, then waits while <pre>\exists k \neq me: level[k] >= \ell && victim[\ell] == me</pre>
Extra		Intuition: at each level, Peterson's mechanism filters out at most one thread (the current victim, if anyone else is still around). After traversing all \(n-1\) levels, at most one thread can be left, which then enters the CS.<br><br><code>unlock</code> just resets <code>level[me] = 0</code>.

Field	Before	After
Text		When constructing mutex algorithms from atomic registers, three assumptions are made: {{c1::atomic reads and writes of primitive-type variables}}; {{c1::no reorderings of read/write sequences (not true in practice; assumed here for simplicity)}}; {{c1::threads that enter a critical section eventually leave it}}. About progress {{c2::outside the critical section, no assumption is made}}.
Extra		Threads may therefore stall, die, or pause arbitrarily long outside a CS. This matters for arguments like starvation freedom: an algorithm must not rely on the "other" thread continuing to make progress.

Field	Before	After
Text		A {{c1::state-space diagram}} is a proof technique for mutex algorithms: nodes are {{c2::global states \([p, q, \text{shared vars}]\) consisting of program counters and all shared variables}}, edges are {{c3::single atomic steps of one of the threads}}.
Extra		A violation of mutual exclusion shows up whenever a state with <i>both</i> program counters in the CS is reachable (e.g. \((p4, q4, \dots)\)). Advantage over enumerating interleavings: it terminates even with loops, and it spots deadlocks/livelocks as inescapable subgraphs.

Field	Before	After
Text		When applying resource ordering and the objects do not have a natural total order, generate one: each instance is assigned {{c1::a unique <code>final long</code> index from a class-level <code>AtomicLong counter</code>}} via <code>counter.incrementAndGet()</code> in the constructor. Locks are then always acquired in {{c2::ascending order of <code>index</code>}}.
Extra		This is the standard pattern used inside <code>java.util.concurrent</code>. <code>AtomicLong</code> guarantees uniqueness and monotonicity even under concurrent construction, and a 64-bit counter never realistically overflows.<br><br>Using <code>System.identityHashCode</code> is tempting but unsafe - hashes can collide.

Field	Before	After
Text		<code>java.util.concurrent.atomic.AtomicBoolean</code> exposes two key RMW operations on top of plain <code>get</code> / <code>set</code>: {{c1::<code>compareAndSet(expect, update)</code>}}, which atomically writes <code>update</code> iff the current value equals <code>expect</code> and {{c2::returns <code>true</code> on success}}; and {{c3::<code>getAndSet(newValue)</code>}}, which {{c4::atomically writes <code>newValue</code> and returns the previous value}}.
Extra		<code>compareAndSet</code> ≅ CAS, <code>getAndSet</code> ≅ TAS. Note: even though they are called "atomic", the JLS does not guarantee these map to a single hardware instruction; they may be implemented via internal locks. They are guaranteed atomic, not guaranteed lock-free.

Field	Before	After
Text		An <i>atomic register</i> \(r\) (a shared memory object, not a CPU register) supports <code>r.read()</code> and <code>r.write(v)</code> with four properties: each invocation \(J\) {{c1::takes effect at a single point in time \(\tau(J)\)}}; \(\tau(J)\) {{c1::lies between the start and end of \(J\)}}; two operations on the same register {{c1::have distinct effect times \(\tau(J) \neq \tau(K)\)}}; an invocation of <code>r.read()</code> returns {{c1::the value written by the <code>r.write(v)</code> invocation with the closest preceding effect time}}.
Extra		These four conditions justify treating each operation as if it happened instantaneously at \(\tau(J)\), even though physically it spans an interval. This is the foundation for proofs about Peterson, Filter, Bakery, etc.

Field	Before	After
Text		The 3rd-try mutex (strict alternation) <pre>p2: while(turn != 1); p3: critical section p4: turn = 2</pre> satisfies mutual exclusion but allows {{c1::starvation}}, because {{c2::a process stuck in its non-CS never resets <code>turn</code>, so the other process is forced to wait forever}}.
Extra		This is exactly why we make <i>no</i> assumption about progress outside the CS: a correct mutex must not rely on the other thread continuing to take turns.

Field	Before	After
Text	Bei einem <b>Las-Vegas-Algorithmus</b> ist {{c1::die Laufzeit}} eine Zufallsvariable, die {{c1::Korrektheit}} hingegen nicht vom Zufall abhängig.<br><br>Ziel: {{c2::immer korrekt/gut, meistens schnell}}.<br><br><b>Alternative Sichtweise:</b> Algorithmus, der manchmal {{c3::'???' statt eines Ergebnisses}} ausgibt. Man kann ihn dann entweder {{c4::~~wiederholen, bis ein Ergebnis kommt}}, oder {{c4::nach fester Laufzeit abbreche~~n}}.	Rule of thumb for memory reordering: the compiler and the hardware are allowed to make any change that does {{c1::not affect the semantics of a <i>sequential</i> execution}}.
Extra	Die alternative Sichtweise ist die Brücke zum Monte-Carlo-Algorithmus: Lässt man den Las-Vegas-Algorithmus nach fester Laufzeit abbrechen, wird daraus ein Monte-Carlo-Algorithmus mit einseitigem Fehler.	Example: <code>x=1; y=x+1; z=x+1;</code> is sequentially equivalent to <code>x=1; z=x+1; y=x+1;</code> and even to <code>x=1; z=2; y=2;</code>.<br><br>In a parallel context these reorderings can become observable and break correctness.

Field	Before	After
Text		Which memory operations actually get reordered depends on the {{c1::hardware architecture}}; e.g. {{c2::AMD64 (x86) differs significantly from ARM}}: x86 essentially only allows "stores reordered after loads", while ARM, POWER, and Alpha permit almost every kind of reordering.
Extra		Consequence: correct multi-threaded code that "works" on x86 can break on ARM without any change to the source. This is exactly why a memory model is needed as a language-level guarantee.

Field	Before	After
Text	Üppige-Auswahl-Problem (gegeben \(x_1, \ldots, x_n \in \{0, 1\}\) mit \(n/2\) Einsen, finde \(i\) mit \(x_i = 1\)):<br><br>Was ist die Laufzeit des schnellsten <b>deterministischen</b> Algorithmus, was die <b>erwartete</b> Laufzeit des schnellsten <b>Las-Vegas</b>-Algorithmus?	The <i>Synchronization Order (SO)</i> in the JMM is {{c1::a total order over all synchronization actions}}; {{c1::all threads see the SA in the same order}}; {{c1::the SA inside a thread occur in PO}}; {{c1::SO is consistent: every read in SO sees the latest write in SO}}.
Extra	Deterministisch: \(\Theta(n)\) (Worst Case).<br>Las-Vegas: \(O(1)\) erwartet, unabhängig von \(n\).<br><br>Das ist ein dramatischer Unterschied und ein Paradebeispiel für den Vorteil von Randomisierung: der Las-Vegas-Algorithmus probiert wiederholt zufällige Indizes und stoppt, sobald er eine \(1\) findet. Da die Hälfte der Bits \(1\) ist, ist die erwartete Anzahl Versuche \(\mathbb{E}[T] = 2\).<br><br>Die deterministische untere Schranke folgt aus einem Adversary-Argument: gegen jeden festen Algorithmus kann man eine Instanz konstruieren, auf der er \(n/2 + 1\) Bits lesen muss.	The globality of SO is the reason that <code>volatile</code> and locks can synchronise threads at all.

Field	Before	After
Text		Burns-Lynch theorem (1993): if a system of \(n\) processes solves mutual exclusion with deadlock freedom using only {{c1::atomic read/write registers}}, then it must use at least {{c2::\(n\) shared variables}}.
Extra		Intuition: a single read/write only ever exposes the last write, so the system cannot distinguish the order in which \(n\) threads wrote unless it has a separate slot per thread. This is exactly why Filter Lock and Bakery Lock both need \(O(n)\) shared state. Read-modify-write primitives sidestep the bound by combining the read and the write atomically.

Field	Before	After
Text		The structure of the critical-section problem per process inside an infinite loop is: {{c1::non-critical section}} \(\to\) {{c2::preprotocol}} \(\to\) {{c3::critical section}} \(\to\) {{c4::postprotocol}}.
Extra		Preprotocol = entry code (lock acquire). Postprotocol = exit code (lock release). Only global variables are shared between processes; local variables are private per process.

Field	Before	After
Text		Threads produce a sequence of <i>events</i>; the \(j\)-th occurrence of event \(i\) in thread \(P\) is denoted {{c1::\(p_i^j\)}}. The <i>precedence relation</i> \(a \to b\) ("\(a\) occurs before \(b\)") is {{c2::a total order over events}}.
Extra		Indexed events are needed because programs typically loop, so the same line of code can produce many events; \(p_5^3\) means "event \(p_5\) in the third iteration".<br><br>Total ordering models a single global timeline of when events actually happen (we are not in a relativistic setting).

Field	Before	After
Text	<b>Monte-Carlo, Fehlerwahrscheinlichkeit </b>\(< \tfrac{1}{2}\) <br>Sei \(\varepsilon > 0\) und \(A\) ein randomisierter Algorithmus, der immer JA oder NEIN ausgibt, mit\[\Pr[A(I) \text{ korrekt}] \geq \tfrac{1}{2} + \varepsilon.\]Sei \(A_\delta\) der Algorithmus, der\[N = {{c1::\left\lceil 4\varepsilon^{-2} \ln \delta^{-1} \right\rceil}}\]Aufrufe von \(A\) macht und {{c2::die Mehrheit der erhaltenen Antworten}} ausgibt. Dann gilt:\[\Pr[A_\delta(I) \text{ korrekt}] \geq 1 - \delta.\]	In a multi-core memory hierarchy each core has {{c1::its own registers and its own L1 cache (private)}}, while {{c2::the L2 cache and system memory are shared between cores}}.
Extra	Achtung: \(N\) skaliert hier <b>quadratisch</b> in \(\varepsilon^{-1}\) (statt linear wie bei einseitigem Fehler oder Las-Vegas) und nutzt nicht die erste richtige Antwort, sondern den Mehrheitsentscheid.<br><br>Beweis nutzt die Chernoff-Schranke \(\Pr[X \leq (1-\eta)\mathbb{E}[X]] \leq e^{-\eta^2 \mathbb{E}[X]/2}\) auf die Anzahl korrekter Aufrufe.	Consequence: when core 1 writes to its L1, core 2 sees the change only after it has propagated downwards (cache coherence protocol). This is a primary source of observable memory reordering at the hardware level.

Field	Before	After
Text		Accesses to <code>volatile</code> fields in Java do {{c1::not count as a data race}}. In terms of performance, <code>volatile</code> is {{c2::slower than ordinary fields, but faster than locks}}.
Extra		Recommendation: only for experts; otherwise use the standard library (<code>java.util.concurrent</code>, <code>AtomicInteger</code>, …).<br><br>Caveat: <code>volatile</code> guarantees visibility, but <b>not atomicity of compound operations</b> like <code>i++</code>.

Field	Before	After
Text		ARM (and MIPS, POWER, RISC-V) implement atomic RMW via a {{c1::Load-Linked / Store-Conditional}} pair: <code>LDREX</code> loads from memory and {{c2::marks the address as exclusive for this core}}; the subsequent <code>STREX</code> performs {{c3::a conditional store that only succeeds if no other core has written to the address in the meantime}}.
Extra		If <code>STREX</code> fails, software retries the LL/SC pair. Compared with x86's monolithic <code>CMPXCHG</code>, LL/SC is more flexible (you can compute the new value between LL and SC) but requires explicit retry logic in software.

Field	Before	After
Text		{{c1::Lamport's Bakery algorithm}} solves \(n\)-process mutex by assigning each thread a {{c2::ticket (label) larger than all outstanding tickets}}: <pre>flag[me] = true; label[me] = max(label[0..n-1]) + 1; while(\exists k != me: flag[k] && (k, label[k]) <_l (me, label[me])) {};</pre> where ties are broken by {{c3::lexicographic order on (thread id, label), so two equal labels are still totally ordered}}.
Extra		Properties: mutual exclusion, deadlock freedom, starvation freedom, and FCFS (the doorway is the label assignment).<br><br>The bug-resistant trick is the lexicographic tie-break: two threads can read the same max simultaneously and end up with equal labels, but the smaller-id thread still goes first - no live or dead lock.

Field	Before	After
Text		Peterson's starvation-freedom proof is by exhaustive contradiction: assume \(P\) is stuck in its <code>while</code> loop forever. Then \(Q\) is in one of three cases - {{c1::stuck in non-CS (then <code>flag[Q]=false</code>, \(P\) proceeds, contradiction)}}; {{c2::repeatedly entering and leaving its CS (each entry sets <code>victim=Q</code>, freezing <code>victim</code> away from \(P\), so \(P\) proceeds, contradiction)}}; {{c3::also stuck in its lock loop (then \(Q\) needs <code>victim != Q</code>, but <code>victim</code> can equal at most one of \(P,Q\), contradiction)}}.
Extra		The case split is exhaustive because these are the only three possible long-term behaviours of \(Q\) under our assumptions. Each leads to a contradiction, so \(P\) cannot be stuck.

Field	Before	After
Text	~~Ein <b>klassischer</b> Algorithmus \(A\) erhält Eingabe \(I\~~) und ~~liefert Ausgabe \(A(I)\). Es gilt:<ul><li>\(A(I)\) ist immer {{c1::genau und korrekt}},</li><li>bei gleicher Eingabe immer~~ {{c2::dieselbe Ausgabe (\(A\) ist eine Funktion)}},</li><li>{{c3::dieselbe Laufzeit}}.</li></ul>Ein <b>randomisierter</b> Algorithmus erhält zusätzlich eine Zufallsquelle \(R\) und liefert Ausgabe \(A(I, R)\). Die Ausgabe ist {{c4::~~manchmal richtig, manchmal fast richtig, manchmal schnell}}, das Verhalten ist {{c5::nicht reproduzierbar}}. \(A\) ist nach wie vor eine Funktion auf den Paaren \((I, R)\)~~.	In the classic example with \(x{=}1;\,y{=}1\) (Thread 1) and \(a{=}y;\,b{=}x\) (Thread 2), every possible interleaving satisfies: {{c1::\(b \geq a\) is always true}}. This proof technique is called {{c2::proof by exhaustion (full enumeration of interleavings)}}.
Extra	~~Reproduzierbarkeit ist die zentrale Eigenschaft, die beim Übergang zum randomisierten Algorithmus verloren geht~~.	Yet the assertion <code>assert(b >= a)</code> can still fail in practice: pure interleaving reasoning ignores <b>memory reordering</b> by the compiler and the hardware.

Field	Before	After
Text		The <i>Test-And-Test-And-Set (TATAS)</i> lock improves on TAS by spinning on a plain read first: <pre>do { while(state.get()) {} } while(!state.compareAndSet(false, true));</pre> The inner loop only {{c1::reads the cached value, generating no coherence traffic while the lock stays held}}; the costly RMW is attempted only {{c2::after the lock <i>looks</i> free}}.
Extra		Caveat: TATAS does not generalise. The classical "double-checked locking" pattern - a TATAS-style optimisation around lazy initialisation - is famously broken without explicit memory barriers, because compilers and hardware may reorder the publish of the new object with respect to the flag write.

Field	Before	After
Text		On x86 the {{c1::<code>CMPXCHG</code> mem, reg}} instruction implements compare-and-swap: it compares register A with a memory location and, if equal, copies the second operand into the location and sets ZF. To make it atomic across cores, it must be combined with {{c2::the <code>LOCK</code> prefix}}.
Extra		Plain <code>CMPXCHG</code> is atomic only with respect to other observers on the same core. The <code>LOCK</code> prefix locks the bus / cache line, which is what gives the instruction its inter-processor atomicity guarantee - and also what makes RMW operations 10-100x slower than ordinary loads.

Field	Before	After
Text		Another deadlock fix is to protect every <code>transferTo</code> with {{c1::a single static global lock}}, which trivially avoids cycles because there is only one lock to acquire. Cost: {{c2::no two transfers can run in parallel, even on completely disjoint pairs of accounts}}.
Extra		This sacrifices nearly all the available concurrency for safety. It is a sensible default when contention is rare or when correctness clearly trumps throughput, but rarely the right answer at scale.

Field	Before	After
Text		A naive TAS-based spinlock <pre>while(state.getAndSet(true)) {}</pre> scales poorly because {{c1::every spinning thread issues writes to the lock variable}}, which {{c2::generates bus contention and forces the cache-coherence protocol to invalidate all other cached copies on every retry}}.
Extra		Result: the runtime per thread grows roughly linearly (or worse) with the number of contending threads. Each spinning core hammers the bus, so even threads not currently trying to acquire the lock pay a cost. The fix is to spin on a read until the lock looks free (TTAS), then attempt the RMW only optimistically.

Field	Before	After
Text		In the 1st-try mutex <pre>p2: while(wantq); p3: wantp = true; p4: critical section</pre> the order "first wait for the neighbour, then set my own flag" violates {{c1::mutual exclusion}}, because {{c2::both threads can leave their <code>while</code> with <code>wantp{=}wantq{=}false</code> before either of them sets its flag}}.
Extra		Both threads enter the CS together; visible as state \((p4, q4, true, true)\) in the state-space diagram.<br><br>Naive fix: set the flag <i>before</i> the check. That solves ME but creates new problems (deadlock/livelock); see Dekker / Peterson.

Field	Before	After
Text	<b>Las-Vegas-Algorithmus für das Üppige-Auswahl-Problem</b><pre>repeat: i := Uniform({1, ..., n}) if x_i = 1: return i</pre>Sei \(T\) die Anzahl an Wiederholungen. Da \(\Pr[x_i = 1] = \Pr[x_i = 0] = \tfrac{1}{2}\) für einen zufälligen Index \(i\), gilt:\[\Pr[T > k] = {{c1::\left(\tfrac{1}{2}\right)^k}}.\]Folglich:\[\Pr[T \leq k] = {{c2::1 - \left(\tfrac{1}{2}\right)^k}}.\]	<i>Program Order (PO)</i> in the JMM is a {{c1::total order over the actions of a single thread, defined on an execution trace}}. PO gives {{c2::no ordering guarantee for memory accesses}} and is {{c2::not total across threads}}.
Extra	\(T\) ist geometrisch verteilt mit Parameter \(p = \tfrac{1}{2}\).<br><br>Konkrete Werte: \(\Pr[T > 10] = 1/1024 < 0{,}001\), also \(\Pr[T \leq 10] > 0{,}999\). Mit \(\leq 20\) Wiederholungen erreicht man Wahrscheinlichkeit \(> 0{,}999999\).<br><br>Wichtig: \(T\) hängt <b>nicht von \(n\)</b> ab, weil die Erfolgswahrscheinlichkeit pro Versuch konstant \(\tfrac{1}{2}\) ist (genau die Hälfte der Bits sind \(1\)).	Its only purpose is to link an execution back to the original program.<br><br><b>Intra-thread consistency:</b> per thread, PO is consistent with the thread's isolated execution. E.g. taking the <code>else</code>-branch of an <code>if</code> whose condition was true makes the execution invalid.

Field	Before	After
Text		<i>Exponential-backoff</i> lock: when a TATAS acquire attempt fails, the thread {{c1::sleeps for a random duration drawn from \([0, \text{limit}]\), then doubles <code>limit</code> up to <code>maxDelay</code>}} before retrying. This removes the {{c2::synchronised stampede of all waiters trying to acquire simultaneously the moment the lock is released}}.
Extra		Empirically this gives near-flat scaling under high contention (vs. linear-or-worse for TAS / TTAS). Trade-offs: a thread may be sleeping when the lock becomes free, so latency under low contention is worse; tuning <code>minDelay</code> / <code>maxDelay</code> is workload-specific.

Field	Before	After
Text	~~<b>Üppige-Auswahl-Problem<br></b>Gegeben sind \(x_1, \ldots, x_n \in \{0, 1\}\), von denen {{c1::genau die Hälfte}} gleich \(1\) sind (\(n\) gerade).<br>Berechn~~e {{c2::~~einen Index \(i\) mit \(x_i = 1\)}}.<br><br>Trivialer deterministischer Algorithmus (in Zeit \(O(~~{{c3::~~n}})\)):<pre>for i = 1, ..., n: if x_i = 1: return i</pre>~~	In the classic x86 example with \(x{=}y{=}0\), thread 1: <code>x=1; j=y;</code> and thread 2: <code>y=1; i=x;</code>, besides \((1,1), (0,1), (1,0)\), the outcome {{c1::\((i,j) = (0,0)\) is also possible}}, because under x86's "stores reordered after loads" rule {{c2::the load (<code>j=y</code> / <code>i=x</code>) may be executed before the store (<code>x=1</code> / <code>y=1</code>)}}.
Extra	~~Das Problem heisst <i>üppig</i>,~~ weil ~~es sehr viele gültige Antworten gibt (genau \(n/2\)). Trotzdem braucht jeder deterministische Algorithmus im Worst Case \(\Omega(n)\) Zeit, während e~~in e~~infacher Las-Vegas-Algorithmus mit erwartet \(O(1)\) auskommt~~.	Both threads thus read \(y\) and \(x\) while still 0, before either store becomes visible. This is a simple, actually-reproducible reordering effect on x86 (StoreLoad).

Field	Before	After
Text		Draw a directed graph with threads \(T_1, \dots, T_n\) and resources \(R_1, \dots, R_m\) as nodes. Edge \(T_i \to R_j\) means {{c1::thread \(T_i\) is attempting to acquire resource \(R_j\)}}; edge \(R_j \to T_i\) means {{c2::resource \(R_j\) is currently held by thread \(T_i\)}}. A deadlock is present iff {{c3::this graph contains a cycle}}.
Extra		Equivalent to the Coffman conditions: the cycle witnesses circular wait, while the edge semantics encode mutual exclusion (R held by exactly one T), hold-and-wait (T owns one R while requesting another), and no-preemption (edges are not removed by force).<br><br>Deadlock <i>detection</i> is a cycle search in this graph.

Field	Before	After
Text	Im Allgemeinen ist die Fehlerreduktion bei Monte-Carlo-Algorithmen <b>nicht möglich</b>.<br>(Gegenbeispiel: Algorithmus, der mit Wahrscheinlichkeit \(\tfrac{1}{2}\) JA bzw. NEIN antwortet, unabhängig von der Eingabe.)<br><br>Möglich ist sie aber bei:<ul><li>{{c1::einseitigem Fehler}}, oder</li><li>{{c2::Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\)}}.</li></ul>	Modern multiprocessors do not enforce a global ordering of all instructions for two main reasons: {{c1::pipelining (processors execute parts of multiple instructions simultaneously and may reorder them internally)}} and {{c2::per-processor local caches (loads and stores become visible to other processors only after some delay)}}.
Extra	Bei einseitigem Fehler genügt eine einzige korrekte Antwort, um die Wahrheit zu kennen (Wiederholung + erste eindeutige Antwort).<br>Bei Fehlerwahrscheinlichkeit \(< \tfrac{1}{2}\) hilft Mehrheitsentscheid (Chernoff-Argument).	What exactly is guaranteed varies a lot across architectures (x86 vs ARM vs POWER vs Alpha …).

Field	Before	After
Text		<i>Test-And-Set</i> on a memory cell <code>s</code>, atomically: {{c1::if <code>mem[s] == 0</code>, set it to 1 and return <code>true</code>; else return <code>false</code>}}. <i>Compare-And-Swap</i> on cell <code>a</code> with arguments <code>old, new</code>, atomically: {{c2::read <code>oldval = mem[a]</code>; if <code>oldval == old</code>, set <code>mem[a] = new</code>; in either case return <code>oldval</code>}}.
Extra		TAS is a strict subset of CAS (CAS with <code>old=0, new=1</code> behaves like TAS, modulo return convention). CAS is strictly more expressive: it can be used to implement queues, stacks, and other lock-free structures, while TAS alone cannot.<br><br>Both make the read and the conditional write a single indivisible step - that is precisely what beats the atomic-register lower bound.