Anki Deck Changes

Commit: 1a363331 - add pprog detail cards

Author: obrhubr <obrhubr+noreply@noreply.com>

Date: 2026-03-25T12:25:34+01:00

Changes: 69 note(s) changed (69 added, 0 modified, 0 deleted)

Note 1: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: #G4wpQ?X^2
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
In practice, \(T_p\) \(> T_1/p\) because of synchronization overhead, scheduling overhead, load imbalance, and memory contention.

Back

ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
In practice, \(T_p\) \(> T_1/p\) because of synchronization overhead, scheduling overhead, load imbalance, and memory contention.

Perfect efficiency (E = 1, S_p = p) is rarely achievable. Sub-linear speedup is the normal situation; super-linear is exceptional.
Field-by-field Comparison
Field Before After
Text In practice, \(T_p\) {{c1::\(> T_1/p\)}} because of {{c2::synchronization overhead, scheduling overhead, load imbalance, and memory contention}}.
Extra Perfect efficiency (E = 1, S_p = p) is rarely achievable. Sub-linear speedup is the normal situation; super-linear is exceptional.
Tags: ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance

Note 2: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: $lOO8/w@3@
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::05._Java_Threads::2._synchronized
A static synchronized method acquires a lock on the Class object (MyClass.class) — a global (class-level) lock, shared across all instances.

Back

ETH::2._Semester::PProg::05._Java_Threads::2._synchronized
A static synchronized method acquires a lock on the Class object (MyClass.class) — a global (class-level) lock, shared across all instances.

A non-static synchronized method locks this (the instance). The two locks are independent — a static and a non-static synchronized method on the same class can execute concurrently.
Field-by-field Comparison
Field Before After
Text A <code>static synchronized</code> method acquires a lock on {{c1::the <code>Class</code> object (<code>MyClass.class</code>)}} — a {{c2::global (class-level) lock}}, shared across all instances.
Extra A non-static <code>synchronized</code> method locks <code>this</code> (the instance). The two locks are independent — a static and a non-static synchronized method on the same class can execute concurrently.
Tags: ETH::2._Semester::PProg::05._Java_Threads::2._synchronized
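
A minimal sketch of the two independent locks from this note (the class name Counter and its fields are hypothetical): the static method locks Counter.class, the instance method locks this, so a thread in one can run concurrently with a thread in the other.

    public class Counter {
        private static int classCount = 0;
        private int instanceCount = 0;

        // Locks the Counter.class object — one global lock shared by all instances.
        public static synchronized void incrementClass() {
            classCount++;
        }

        // Locks `this` — a separate lock per instance, independent of the class lock.
        public synchronized void incrementInstance() {
            instanceCount++;
        }

        // The static form above is equivalent to:
        public static void incrementClassExplicit() {
            synchronized (Counter.class) { classCount++; }
        }
    }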

Note 3: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: %2]9g85BmT
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
What does Amdahl's Law give for \(f=0\) (fully parallel) and \(f=1\) (fully serial)?

Back

ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
What does Amdahl's Law give for \(f=0\) (fully parallel) and \(f=1\) (fully serial)?

  • \(f = 0\): \(S_P = P\) → linear speedup, efficiency = 1. Adding more cores always helps.
  • \(f = 1\): \(S_P = 1\) → no speedup at all, regardless of P. The program is entirely sequential.
Field-by-field Comparison
Field Before After
Front What does Amdahl's Law give for \(f=0\) (fully parallel) and \(f=1\) (fully serial)?
Back <ul><li>\(f = 0\): \(S_P = P\) → <b>linear speedup</b>, efficiency = 1. Adding more cores always helps.</li><li>\(f = 1\): \(S_P = 1\) → <b>no speedup at all</b>, regardless of P. The program is entirely sequential.</li></ul>
Tags: ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance

Note 4: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: )Ln/^}M1}?
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
Under what conditions does Amdahl's bound \(S_p \leq \frac{1}{f + (1-f)/p}\) become an equality?

Back

ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
Under what conditions does Amdahl's bound \(S_p \leq \frac{1}{f + (1-f)/p}\) become an equality?

The bound is achieved when: (1) the parallel part scales perfectly (no synchronization overhead, no load imbalance), and (2) \(T_p = f \cdot T_1 + \frac{(1-f) \cdot T_1}{p}\) exactly. In practice the bound is never reached — Amdahl gives an idealized upper bound, not a prediction.
Field-by-field Comparison
Field Before After
Front Under what conditions does Amdahl's bound \(S_p \leq \frac{1}{f + (1-f)/p}\) become an <b>equality</b>?
Back The bound is achieved when: (1) the parallel part scales <b>perfectly</b> (no synchronization overhead, no load imbalance), and (2) \(T_p = f \cdot T_1 + \frac{(1-f) \cdot T_1}{p}\) exactly. In practice the bound is never reached — Amdahl gives an idealized <b>upper bound</b>, not a prediction.
Tags: ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
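
A worked instance of the bound, with hypothetical values \(f = 0.1\) and \(p = 8\):

\(S_8 \leq \dfrac{1}{0.1 + 0.9/8} = \dfrac{1}{0.2125} \approx 4.7\)

Even the ideal schedule yields less than 5× on 8 processors, and as \(p \to \infty\) the speedup is capped at \(1/f = 10\).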

Note 5: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: )|$CP#Zb3O
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::05._Java_Threads::3._Wait,_Notify,_NotifyAll
What is a "lost notification" and when does it occur?

Back

ETH::2._Semester::PProg::05._Java_Threads::3._Wait,_Notify,_NotifyAll
What is a "lost notification" and when does it occur?

If notify() is called before any thread has called wait(), the notification is silently discarded — it is not buffered. Any thread that subsequently calls wait() will block indefinitely. Fix: always protect the condition with a flag variable and check it before waiting.
Field-by-field Comparison
Field Before After
Front What is a "lost notification" and when does it occur?
Back If <code>notify()</code> is called <b>before</b> any thread has called <code>wait()</code>, the notification is silently discarded — it is <b>not buffered</b>. Any thread that subsequently calls <code>wait()</code> will block indefinitely. Fix: always protect the condition with a flag variable and check it before waiting.
Tags: ETH::2._Semester::PProg::05._Java_Threads::3._Wait,_Notify,_NotifyAll

Note 6: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: //c-)1n-D%
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::05._Java_Threads::3._Wait,_Notify,_NotifyAll
Why must wait() always be called inside a while loop, not an if?

Back

ETH::2._Semester::PProg::05._Java_Threads::3._Wait,_Notify,_NotifyAll
Why must wait() always be called inside a while loop, not an if?

Because of spurious wakeups — a thread can wake from wait() without notify() ever being called (allowed by the JVM for performance reasons). If the condition is checked only once with if, the thread proceeds even when the condition is still false. A while loop re-checks the condition after every wakeup.
Field-by-field Comparison
Field Before After
Front Why must <code>wait()</code> always be called inside a <code>while</code> loop, not an <code>if</code>?
Back Because of <b>spurious wakeups</b> — a thread can wake from <code>wait()</code> without <code>notify()</code> ever being called (allowed by the JVM for performance reasons). If the condition is checked only once with <code>if</code>, the thread proceeds even when the condition is still false. A <code>while</code> loop re-checks the condition after every wakeup.
Tags: ETH::2._Semester::PProg::05._Java_Threads::3._Wait,_Notify,_NotifyAll
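
A minimal sketch of the guarded-wait pattern behind the last two notes (the class name Mailbox is hypothetical): the flag prevents lost notifications, and the while loop re-checks the condition after spurious wakeups.

    public class Mailbox {
        private boolean messageAvailable = false; // condition flag — survives a notify sent before any wait

        public synchronized void put() {
            messageAvailable = true;
            notifyAll(); // wake all waiters; each one re-checks the flag
        }

        public synchronized void take() throws InterruptedException {
            // while, not if: re-check after spurious wakeups
            // and after losing the race to another consumer
            while (!messageAvailable) {
                wait(); // releases the lock while waiting, reacquires it before returning
            }
            messageAvailable = false;
        }
    }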

Note 7: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 1(+0hH>|3h
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures
A Fork/Join task graph (DAG) is dynamic — it unfolds as execution proceeds, since which tasks are spawned depends on runtime values.

Back

ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures
A Fork/Join task graph (DAG) is dynamic — it unfolds as execution proceeds, since which tasks are spawned depends on runtime values.

Independent nodes in the DAG *can* but do not *have to* execute in parallel — it depends on the scheduler and available threads.
Field-by-field Comparison
Field Before After
Text A Fork/Join task graph (DAG) is {{c1::dynamic}} — it {{c2::unfolds as execution proceeds}}, since which tasks are spawned depends on runtime values.
Extra Independent nodes in the DAG *can* but do not *have to* execute in parallel — it depends on the scheduler and available threads.
Tags: ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures

Note 8: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: 3fV+]~d+w}
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
What are the work and span of parallel merge sort, and what limits its parallelism?

Back

ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
What are the work and span of parallel merge sort, and what limits its parallelism?

  • Work: \(O(n \log n)\) — same as sequential
  • Span: \(O(\log^2 n)\) — recursive calls parallelized (O(log n) depth) but merge step takes O(log n) span via parallel merge
  • Parallelism: \(O(n/\log n)\)
Bottleneck: the merge step. A sequential merge gives O(n) span, limiting parallelism to O(log n).
Field-by-field Comparison
Field Before After
Front What are the work and span of parallel merge sort, and what limits its parallelism?
Back <ul><li><b>Work</b>: \(O(n \log n)\) — same as sequential</li><li><b>Span</b>: \(O(\log^2 n)\) — recursive calls parallelized (O(log n) depth) but merge step takes O(log n) span via parallel merge</li><li><b>Parallelism</b>: \(O(n/\log n)\)</li></ul>Bottleneck: the merge step. A sequential merge gives O(n) span, limiting parallelism to O(log n).
Tags: ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
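
The recurrences behind these numbers, under the card's assumption of an \(O(\log n)\)-span merge:

Work: \(W(n) = 2\,W(n/2) + O(n) = O(n \log n)\)
Span: \(S(n) = S(n/2) + O(\log n) = O(\log^2 n)\) — the two halves are sorted in parallel, so only one recursive term remains.
With a sequential merge instead: \(S(n) = S(n/2) + O(n) = O(n)\), giving parallelism \(O(n \log n)/O(n) = O(\log n)\).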

Note 9: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: 3h>LP|0z)0
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::09._Divide_and_conquer::4._Java's_ForkJoin_framework
In a RecursiveTask, why should you fork() one subtask and compute() the other directly in the current thread (rather than forking both)?

Back

ETH::2._Semester::PProg::09._Divide_and_conquer::4._Java's_ForkJoin_framework
In a RecursiveTask, why should you fork() one subtask and compute() the other directly in the current thread (rather than forking both)?

Forking both sides creates two new tasks and leaves the current thread idle waiting for both. By calling compute() on one side directly, the current thread does useful work instead of idling — halving the number of tasks created and reducing scheduling overhead. Always join() the forked task after compute() finishes.
Field-by-field Comparison
Field Before After
Front In a RecursiveTask, why should you <code>fork()</code> one subtask and <code>compute()</code> the other directly in the current thread (rather than forking both)?
Back Forking both sides creates two new tasks and leaves the current thread idle waiting for both. By calling <code>compute()</code> on one side directly, the current thread does useful work instead of idling — halving the number of tasks created and reducing scheduling overhead. Always <code>join()</code> the forked task after <code>compute()</code> finishes.
Tags: ETH::2._Semester::PProg::09._Divide_and_conquer::4._Java's_ForkJoin_framework
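
A minimal sketch of the pattern (an array-sum task; the class name and cutoff are illustrative): fork the left half, compute the right half in the current thread, then join the forked task last.

    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.RecursiveTask;

    class SumTask extends RecursiveTask<Long> {
        private static final int CUTOFF = 10_000; // sequential below this (cf. the granularity guideline in Note 20)
        private final long[] arr;
        private final int lo, hi;

        SumTask(long[] arr, int lo, int hi) { this.arr = arr; this.lo = lo; this.hi = hi; }

        @Override
        protected Long compute() {
            if (hi - lo <= CUTOFF) { // base case: enough work to do sequentially
                long sum = 0;
                for (int i = lo; i < hi; i++) sum += arr[i];
                return sum;
            }
            int mid = lo + (hi - lo) / 2;
            SumTask left = new SumTask(arr, lo, mid);
            left.fork();                                      // schedule the left half as a new task
            long right = new SumTask(arr, mid, hi).compute(); // current thread works on the right half
            return right + left.join();                       // join the forked task after computing
        }
    }

Usage: long total = ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));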

Note 10: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: 6X!+L
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
Pipeline throughput bound (infinite stream, one execution unit per stage) \(= {{c1::\dfrac{1}{\max_i(\text{stage\_time}_i)}}}\)

Back

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
Pipeline throughput bound (infinite stream, one execution unit per stage) \(= {{c1::\dfrac{1}{\max_i(\text{stage\_time}_i)}}}\)

Throughput is limited by the slowest (bottleneck) stage. For a balanced pipeline all stage times are equal, so throughput = 1/stage_time.
Field-by-field Comparison
Field Before After
Text Pipeline throughput bound (infinite stream, one execution unit per stage) \(= {{c1::\dfrac{1}{\max_i(\text{stage\_time}_i)}}}\)
Extra Throughput is limited by the slowest (bottleneck) stage. For a balanced pipeline all stage times are equal, so throughput = 1/stage_time.
Tags: ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
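
A worked instance with hypothetical stage times of 1, 3 and 2 time units:

\(\text{throughput} \leq \dfrac{1}{\max(1, 3, 2)} = \dfrac{1}{3}\) elements per time unit — the 3-unit stage is the bottleneck, and the 1- and 2-unit stages sit partly idle.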

Note 11: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: 6wa}o?][4W
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
What does it mean for a parallel program to have efficiency E = 1 (100%), and when does efficiency approach 0?

Back

ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
What does it mean for a parallel program to have efficiency E = 1 (100%), and when does efficiency approach 0?

\(E = S_p / p = T_1 / (p \cdot T_p)\). E = 1 means \(T_p = T_1/p\) — every processor is doing useful work 100% of the time, no idle time, no overhead. Efficiency → 0 when: (1) p is very large (Amdahl: speedup saturates while p grows), or (2) overhead (synchronization, scheduling) dominates useful work.
Field-by-field Comparison
Field Before After
Front What does it mean for a parallel program to have efficiency E = 1 (100%), and when does efficiency approach 0?
Back \(E = S_p / p = T_1 / (p \cdot T_p)\). E = 1 means \(T_p = T_1/p\) — every processor is doing useful work 100% of the time, no idle time, no overhead. Efficiency → 0 when: (1) p is very large (Amdahl: speedup saturates while p grows), or (2) overhead (synchronization, scheduling) dominates useful work.
Tags: ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance

Note 12: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID:
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
Pack has \(O(n)\) work and \(O(\log n)\) span.

Back

ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
Pack has \(O(n)\) work and \(O(\log n)\) span.

Each of the 3 steps (flag, prefix sum, scatter) is O(n) work and O(log n) span. The steps can be partially combined without changing asymptotic complexity.
Field-by-field Comparison
Field Before After
Text Pack has {{c1::\(O(n)\) work}} and {{c2::\(O(\log n)\) span}}.
Extra Each of the 3 steps (flag, prefix sum, scatter) is O(n) work and O(log n) span. The steps can be partially combined without changing asymptotic complexity.
Tags: ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms

Note 13: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: ?aiVa43NDU
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::11._Parallel_Algorithms::2._Parallel_patterns
Summing an array of n elements in parallel has work \(O(n)\), span \(O(\log n)\), and parallelism \(O(n / \log n)\).

Back

ETH::2._Semester::PProg::11._Parallel_Algorithms::2._Parallel_patterns
Summing an array of n elements in parallel has work \(O(n)\), span \(O(\log n)\), and parallelism \(O(n / \log n)\).

Work = O(n) (same as sequential). Span = O(log n) (tree depth). Parallelism = work/span = O(n/log n), meaning you benefit from up to O(n/log n) processors before hitting the span bottleneck.
Field-by-field Comparison
Field Before After
Text Summing an array of n elements in parallel has work \({{c1::O(n)}}\), span \({{c2::O(\log n)}}\), and parallelism \({{c3::O(n / \log n)}}\).
Extra Work = O(n) (same as sequential). Span = O(log n) (tree depth). Parallelism = work/span = O(n/log n), meaning you benefit from up to O(n/log n) processors before hitting the span bottleneck.
Tags: ETH::2._Semester::PProg::11._Parallel_Algorithms::2._Parallel_patterns

Note 14: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: ?knS$%
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures
Why is the FJ work-stealing guarantee \(T_p = O(T_1/p + T_\infty)\) considered asymptotically optimal?

Back

ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures
Why is the FJ work-stealing guarantee \(T_p = O(T_1/p + T_\infty)\) considered asymptotically optimal?

The two terms match the fundamental lower bounds: \(T_p \geq T_1/p\) (work law) and \(T_p \geq T_\infty\) (span law). The FJ scheduler achieves \(O(T_1/p + T_\infty)\), which equals \(O(\max(T_1/p, T_\infty))\) up to a constant factor. No scheduler can do asymptotically better — FJ matches the lower bounds given by the work and span laws.
Field-by-field Comparison
Field Before After
Front Why is the FJ work-stealing guarantee \(T_p = O(T_1/p + T_\infty)\) considered asymptotically optimal?
Back The two terms match the fundamental lower bounds: \(T_p \geq T_1/p\) (work law) and \(T_p \geq T_\infty\) (span law). The FJ scheduler achieves \(O(T_1/p + T_\infty)\), which equals \(O(\max(T_1/p, T_\infty))\) up to a constant factor. <b>No scheduler can do asymptotically better</b> — FJ matches the lower bounds given by the work and span laws.
Tags: ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures

Note 15: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: @>I0NXoq2&
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::09._Divide_and_conquer::3._Executor_Service
Callable<V> differs from Runnable in that it returns a result of type V and can throw checked exceptions.

Back

ETH::2._Semester::PProg::09._Divide_and_conquer::3._Executor_Service
Callable<V> differs from Runnable in that it returns a result of type V and can throw checked exceptions.

exec.submit(callable) returns a Future<V>. exec.execute(runnable) is fire-and-forget (no return value).
Field-by-field Comparison
Field Before After
Text {{c1::<code>Callable&lt;V&gt;</code>}} differs from {{c2::<code>Runnable</code>}} in that it {{c3::returns a result of type V and can throw checked exceptions}}.
Extra <code>exec.submit(callable)</code> returns a <code>Future&lt;V&gt;</code>. <code>exec.execute(runnable)</code> is fire-and-forget (no return value).
Tags: ETH::2._Semester::PProg::09._Divide_and_conquer::3._Executor_Service
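
A minimal sketch contrasting the two interfaces (pool size and task bodies are illustrative):

    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class CallableDemo {
        public static void main(String[] args) throws Exception {
            ExecutorService exec = Executors.newFixedThreadPool(2);

            // Runnable: no result, no checked exceptions — fire-and-forget
            exec.execute(() -> System.out.println("side effect only"));

            // Callable<V>: returns a V and may throw checked exceptions
            Callable<Integer> task = () -> 6 * 7;
            Future<Integer> future = exec.submit(task);
            System.out.println(future.get()); // blocks until the result is ready: 42

            exec.shutdown();
        }
    }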

Note 16: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: @G4j5S8izo
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
How can an unbalanced pipeline be fixed, and what is the caveat?

Back

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
How can an unbalanced pipeline be fixed, and what is the caveat?

Split the bottleneck (slowest) stage into two or more sub-stages of equal duration. Caveat: too much splitting is impractical — it introduces enormous switching/management overhead.
Field-by-field Comparison
Field Before After
Front How can an unbalanced pipeline be fixed, and what is the caveat?
Back Split the bottleneck (slowest) stage into two or more sub-stages of equal duration. Caveat: too much splitting is impractical — it introduces enormous switching/management overhead.
Tags: ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining

Note 17: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: @r1TbI1^$~
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures
The span \(T_\infty\) in a DAG corresponds to the serial fraction f in Amdahl's Law: the longest chain of sequential dependencies that no amount of additional parallelism can overcome.

Back

ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures
The span \(T_\infty\) in a DAG corresponds to the serial fraction f in Amdahl's Law: the longest chain of sequential dependencies that no amount of additional parallelism can overcome.

Designing parallel algorithms means decreasing span without increasing work too much — directly equivalent to reducing f in Amdahl's Law.
Field-by-field Comparison
Field Before After
Text The span \(T_\infty\) in a DAG corresponds to {{c1::the serial fraction f in Amdahl's Law}}: the {{c2::longest chain of sequential dependencies}} that no amount of additional parallelism can overcome.
Extra Designing parallel algorithms means <b>decreasing span</b> without increasing work too much — directly equivalent to reducing f in Amdahl's Law.
Tags: ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures

Note 18: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: AYPiAlScH$
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
Latency of the first element through a pipeline \(= {{c1::\sum_{i} \text{time}(\text{stage}_i)}}\) (sum of all stage times).

Back

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
Latency of the first element through a pipeline \(= {{c1::\sum_{i} \text{time}(\text{stage}_i)}}\) (sum of all stage times).

"Latency" by default means the first element. For a balanced pipeline this equals num_stages × max(stage_time).
Field-by-field Comparison
Field Before After
Text Latency of the <b>first</b> element through a pipeline \(= {{c1::\sum_{i} \text{time}(\text{stage}_i)}}\) (sum of all stage times).
Extra "Latency" by default means the first element. For a balanced pipeline this equals num_stages × max(stage_time).
Tags: ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining

Note 19: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: AeyQBG3vzh
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::05._Java_Threads
What is the difference between a race condition and a data race?

Back

ETH::2._Semester::PProg::05._Java_Threads
What is the difference between a race condition and a data race?

A data race is a low-level memory event: a shared variable is accessed by two threads, at least one access is a write, and there is no synchronization. A race condition is a higher-level correctness issue: the program's observable behavior depends on execution order. A data race usually causes a race condition, but a race condition can exist even without a data race (e.g. check-then-act with synchronized operations). In Java, data races are not undefined behavior in the C/C++ sense — the JMM still constrains racy programs, but gives them only very weak guarantees.
Field-by-field Comparison
Field Before After
Front What is the difference between a <b>race condition</b> and a <b>data race</b>?
Back A <b>data race</b> is a low-level memory event: a shared variable is accessed by two threads, at least one access is a write, and there is no synchronization. A <b>race condition</b> is a higher-level correctness issue: the program's observable behavior depends on execution order. A data race usually causes a race condition, but a race condition can exist even without a data race (e.g. check-then-act with synchronized operations). In Java, data races are not undefined behavior in the C/C++ sense — the JMM still constrains racy programs, but gives them only very weak guarantees.
Tags: ETH::2._Semester::PProg::05._Java_Threads
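
A minimal sketch of the check-then-act case mentioned above (class and method names are hypothetical): every access is synchronized, so there is no data race, yet the composite operation still has a race condition.

    public class Account {
        private int balance = 100;

        public synchronized int getBalance() { return balance; }
        public synchronized void withdraw(int amount) { balance -= amount; }

        // No data race (each access is synchronized), but a race condition remains:
        // another thread can withdraw between the check and the act,
        // so the balance can still go negative.
        public void withdrawIfCovered(int amount) {
            if (getBalance() >= amount) { // check
                withdraw(amount);         // act — not atomic together with the check
            }
        }
    }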

Note 20: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: E-#>p+?}Ok
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::08._Divide_and_conquer::2._Divide_and_conquer
In Fork/Join, each task should ideally perform between 1,000 and 100,000 sequential operations to ensure the task has enough work to offset scheduling overhead.

Back

ETH::2._Semester::PProg::08._Divide_and_conquer::2._Divide_and_conquer
In Fork/Join, each task should ideally perform between 1,000 and 100,000 sequential operations to ensure the task has enough work to offset scheduling overhead.

Too small → overhead dominates. Too large → poor load balancing and under-utilization. The exact threshold is empirical.
Field-by-field Comparison
Field Before After
Text In Fork/Join, each task should ideally perform between {{c1::1,000 and 100,000}} sequential operations to ensure {{c2::the task has enough work to offset scheduling overhead}}.
Extra Too small → overhead dominates. Too large → poor load balancing and under-utilization. The exact threshold is empirical.
Tags: ETH::2._Semester::PProg::08._Divide_and_conquer::2._Divide_and_conquer

Note 21: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: EdVyAbz-qv
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency
java.util.concurrent.atomic.AtomicInteger provides atomic compound operations such as getAndIncrement() and compareAndSet(expected, update) without requiring synchronized.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency
java.util.concurrent.atomic.AtomicInteger provides atomic compound operations such as getAndIncrement() and compareAndSet(expected, update) without requiring synchronized.

Implemented via hardware CAS (compare-and-swap). Faster than synchronized for simple counters. Part of java.util.concurrent.atomic.
Field-by-field Comparison
Field Before After
Text <code>java.util.concurrent.atomic.{{c1::AtomicInteger}}</code> provides {{c2::atomic compound operations}} such as <code>{{c3::getAndIncrement()}}</code> and <code>{{c4::compareAndSet(expected, update)}}</code> without requiring <code>synchronized</code>.
Extra Implemented via hardware CAS (compare-and-swap). Faster than <code>synchronized</code> for simple counters. Part of <code>java.util.concurrent.atomic</code>.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency
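
A minimal sketch (the class name is illustrative) of the two operations named in the note, including the CAS retry loop that underlies most compound atomic updates:

    import java.util.concurrent.atomic.AtomicInteger;

    public class AtomicCounter {
        private final AtomicInteger count = new AtomicInteger(0);

        public void increment() {
            count.getAndIncrement(); // atomic read-modify-write, no lock required
        }

        // Bounded add via compareAndSet: retry until no other thread interferes.
        public boolean addIfBelow(int limit, int delta) {
            while (true) {
                int current = count.get();
                if (current >= limit) return false;
                if (count.compareAndSet(current, current + delta)) return true;
            }
        }
    }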

Note 22: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: F7x(&s]1kQ
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures
What does the upper bound \(T_p \leq T_1/p + O(T_\infty)\) tell us about scheduling?

Back

ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures
What does the upper bound \(T_p \leq T_1/p + O(T_\infty)\) tell us about scheduling?

A good scheduler can always achieve time within a constant factor of the ideal. The two terms represent two different sources of time: \(T_1/p\) is the work-limited regime, and \(O(T_\infty)\) is the unavoidable sequential overhead from dependencies (span-limited regime). Their sum is an upper bound — a good scheduler needs no more than roughly their sum.
Field-by-field Comparison
Field Before After
Front What does the upper bound \(T_p \leq T_1/p + O(T_\infty)\) tell us about scheduling?
Back A <b>good scheduler</b> can always achieve time within a constant factor of the ideal. The two terms represent two different sources of time: \(T_1/p\) is the work-limited regime, and \(O(T_\infty)\) is the unavoidable sequential overhead from dependencies (span-limited regime). Their sum is an upper bound — a good scheduler needs no more than roughly their sum.
Tags: ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures

Note 23: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: GH&YPQRsTh
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
Lead-in and lead-out reduce effective throughput below the theoretical bound for a finite number of elements N. Their impact is largest when N is small (short pipelines).

Back

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
Lead-in and lead-out reduce effective throughput below the theoretical bound for a finite number of elements N. Their impact is largest when N is small (short pipelines).

Example: for a finite run, effective throughput was 0.7 while the infinite-stream bound was 1. As N → ∞, effective throughput approaches 1/max(stage_time).
Field-by-field Comparison
Field Before After
Text Lead-in and lead-out {{c1::reduce effective throughput below the theoretical bound}} for {{c2::a finite number of elements N}}. Their impact is largest when {{c3::N is small (short pipelines)}}.
Extra Example: for a finite run, effective throughput was 0.7 while the infinite-stream bound was 1. As N → ∞, effective throughput approaches 1/max(stage_time).
Tags: ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining

Note 24: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: H*>f
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency
What advantages does ReentrantLock offer over synchronized?

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency
What advantages does ReentrantLock offer over synchronized?

  • tryLock() — non-blocking attempt (returns boolean)
  • tryLock(timeout, unit) — timed lock attempt
  • lockInterruptibly() — interruptible while waiting
  • Multiple Condition objects per lock (vs. one monitor per synchronized object)
  • More flexible critical section shapes (lock and unlock in different scopes)
⚠️ Must always call unlock() in a finally block to avoid leaked locks.
Field-by-field Comparison
Field Before After
Front What advantages does <code>ReentrantLock</code> offer over <code>synchronized</code>?
Back <ul><li><code>tryLock()</code> — non-blocking attempt (returns boolean)</li><li><code>tryLock(timeout, unit)</code> — timed lock attempt</li><li><code>lockInterruptibly()</code> — interruptible while waiting</li><li>Multiple <code>Condition</code> objects per lock (vs. one monitor per <code>synchronized</code> object)</li><li>More flexible critical section shapes (lock and unlock in different scopes)</li></ul>⚠️ Must always call <code>unlock()</code> in a <code>finally</code> block to avoid leaked locks.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency
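
A minimal sketch of the unlock-in-finally discipline and of tryLock (names are illustrative):

    import java.util.concurrent.locks.ReentrantLock;

    public class LockedCounter {
        private final ReentrantLock lock = new ReentrantLock();
        private int count = 0;

        public void increment() {
            lock.lock();
            try {
                count++;
            } finally {
                lock.unlock(); // always in finally — an exception must not leak the lock
            }
        }

        // Non-blocking attempt: returns immediately if the lock is busy.
        public boolean tryIncrement() {
            if (!lock.tryLock()) return false;
            try {
                count++;
                return true;
            } finally {
                lock.unlock();
            }
        }
    }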

Note 25: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: KD1&||&6xW
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures
Why is \(T_p \geq T_1/p\) a strict lower bound — why can no scheduler beat it?

Back

ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures
Why is \(T_p \geq T_1/p\) a strict lower bound — why can no scheduler beat it?

\(T_1\) is the total work (sum of all node costs in the DAG). With p processors, at most p units of work can be done per time step. Therefore the minimum time to do \(T_1\) units of work is \(T_1/p\). This is a counting argument — independent of the scheduler's strategy.
Field-by-field Comparison
Field Before After
Front Why is \(T_p \geq T_1/p\) a strict lower bound — why can no scheduler beat it?
Back \(T_1\) is the total work (sum of all node costs in the DAG). With p processors, at most p units of work can be done per time step. Therefore the minimum time to do \(T_1\) units of work is \(T_1/p\). This is a counting argument — independent of the scheduler's strategy.
Tags: ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures

Note 26: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: KqO7gK}Fu/
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::05._Java_Threads::3._Wait,_Notify,_NotifyAll
When should you prefer notifyAll() over notify()?

Back

ETH::2._Semester::PProg::05._Java_Threads::3._Wait,_Notify,_NotifyAll
When should you prefer notifyAll() over notify()?

Prefer notifyAll() when multiple threads may be waiting and they have different conditions to wait for. notify() wakes only one (arbitrary) thread — if it wakes a thread whose condition is still false, it goes back to sleep and no progress is made. notifyAll() wakes all waiters; each re-checks its condition in its while loop. Cost: more context switches. Safe default: always use notifyAll() unless you have a single-condition, single-consumer pattern.
Field-by-field Comparison
Field Before After
Front When should you prefer <code>notifyAll()</code> over <code>notify()</code>?
Back Prefer <code>notifyAll()</code> when <b>multiple threads</b> may be waiting and they have <b>different conditions</b> to wait for. <code>notify()</code> wakes only one (arbitrary) thread — if it wakes a thread whose condition is still false, it goes back to sleep and no progress is made. <code>notifyAll()</code> wakes all waiters; each re-checks its condition in its <code>while</code> loop. Cost: more context switches. Safe default: always use <code>notifyAll()</code> unless you have a single-condition, single-consumer pattern.
Tags: ETH::2._Semester::PProg::05._Java_Threads::3._Wait,_Notify,_NotifyAll

Note 27: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: Lj=X5W/02X
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
What does each node store after the up pass (reduce pass) of the parallel prefix-sum algorithm?

Back

ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
What does each node store after the up pass (reduce pass) of the parallel prefix-sum algorithm?

Each internal node stores the sum of all leaves in its subtree. Leaves store the original values. This pass is a parallel tree reduction — done bottom-up with O(n) work and O(log n) span.
Field-by-field Comparison
Field Before After
Front What does each node store after the <b>up pass</b> (reduce pass) of the parallel prefix-sum algorithm?
Back Each internal node stores the <b>sum of all leaves in its subtree</b>. Leaves store the original values. This pass is a parallel tree reduction — done bottom-up with O(n) work and O(log n) span.
Tags: ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms

Note 28: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: NLVo5v1@Dy
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
Pack (also called filter) takes a collection and a predicate and returns a new array containing only the elements satisfying the predicate, preserving their original relative order.

Back

ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
Pack (also called filter) takes a collection and a predicate and returns a new array containing only the elements satisfying the predicate, preserving their original relative order.

Example: pack([3,1,4,1,5], x > 2) → [3,4,5]. This is the parallel equivalent of a sequential filter loop.
Field-by-field Comparison
Field Before After
Text {{c1::Pack}} (also called filter) takes a collection and a predicate and returns {{c2::a new array containing only the elements satisfying the predicate, preserving their original relative order}}.
Extra Example: pack([3,1,4,1,5], x &gt; 2) → [3,4,5]. This is the parallel equivalent of a sequential filter loop.
Tags: ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms

Note 29: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: Nc)fu4)D&q
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::05._Java_Threads::3._Wait,_Notify,_NotifyAll
Calling wait(), notify(), or notifyAll() on an object without holding its lock throws IllegalMonitorStateException at runtime.

Back

ETH::2._Semester::PProg::05._Java_Threads::3._Wait,_Notify,_NotifyAll
Calling wait(), notify(), or notifyAll() on an object without holding its lock throws IllegalMonitorStateException at runtime.

The check is runtime-only — the compiler does not catch this. Always ensure these calls are inside a synchronized(obj) block or a synchronized method on the same object.
Field-by-field Comparison
Field Before After
Text Calling <code>wait()</code>, <code>notify()</code>, or <code>notifyAll()</code> on an object without holding its lock throws {{c1::<code>IllegalMonitorStateException</code>}} at runtime.
Extra The check is runtime-only — the compiler does not catch this. Always ensure these calls are inside a <code>synchronized(obj)</code> block or a <code>synchronized</code> method on the same object.
Tags: ETH::2._Semester::PProg::05._Java_Threads::3._Wait,_Notify,_NotifyAll

Note 30: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: O>b+4REo=(
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::06._Parallel_Architecture::2._Instruction_Level_Parallelism
Instruction-level parallelism (ILP) uses two techniques: dependency analysis (execute independent instructions in parallel, reorder to keep the CPU busy) and speculative execution (execute a branch before the condition is resolved, roll back if wrong).

Back

ETH::2._Semester::PProg::06._Parallel_Architecture::2._Instruction_Level_Parallelism
Instruction-level parallelism (ILP) uses two techniques: dependency analysis (execute independent instructions in parallel, reorder to keep the CPU busy) and speculative execution (execute a branch before the condition is resolved, roll back if wrong).

ILP happens on a single thread — no concurrency. It is hardware-dependent (requires superscalar execution units).
Field-by-field Comparison
Field Before After
Text Instruction-level parallelism (ILP) uses two techniques: {{c1::dependency analysis}} (execute independent instructions in parallel, reorder to keep the CPU busy) and {{c2::speculative execution}} (execute a branch before the condition is resolved, roll back if wrong).
Extra ILP happens on a <b>single thread</b> — no concurrency. It is hardware-dependent (requires superscalar execution units).
Tags: ETH::2._Semester::PProg::06._Parallel_Architecture::2._Instruction_Level_Parallelism

Note 31: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: OG4szVeI~<
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
Why does parallel prefix sum require two passes (up then down) rather than one?

Back

ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
Why does parallel prefix sum require two passes (up then down) rather than one?

The up pass computes subtree sums — it doesn't know what came "before" each element globally. The down pass distributes accumulated sums from the root downward, so each leaf learns the total sum of all elements to its left. No single pass can do both simultaneously because each node needs both its subtree total (from children) and the prefix from its ancestors.
Field-by-field Comparison
Field Before After
Front Why does parallel prefix sum require two passes (up then down) rather than one?
Back The up pass computes subtree sums — it doesn't know what came "before" each element globally. The down pass distributes accumulated sums from the root downward, so each leaf learns the total sum of all elements to its left. No single pass can do both simultaneously because each node needs both its subtree total (from children) and the prefix from its ancestors.
Tags: ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
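
A sketch of the two passes, written sequentially for brevity (in ForkJoin each pair of recursive calls would be a fork/compute/join; all names are illustrative):

    class PrefixNode {
        int lo, hi, sum;       // the range [lo, hi) and the sum of its subtree
        PrefixNode left, right;
    }

    public class PrefixSum {
        // Up pass: build the tree bottom-up; each node stores the sum of its range.
        static PrefixNode up(int[] a, int lo, int hi) {
            PrefixNode n = new PrefixNode();
            n.lo = lo; n.hi = hi;
            if (hi - lo == 1) {
                n.sum = a[lo];
            } else {
                int mid = (lo + hi) / 2;
                n.left = up(a, lo, mid);   // fork in the parallel version
                n.right = up(a, mid, hi);  // compute in the parallel version
                n.sum = n.left.sum + n.right.sum;
            }
            return n;
        }

        // Down pass: fromLeft = sum of all elements to the left of this subtree.
        static void down(PrefixNode n, int fromLeft, int[] a, int[] out) {
            if (n.hi - n.lo == 1) {
                out[n.lo] = fromLeft + a[n.lo]; // inclusive prefix sum at the leaf
            } else {
                down(n.left, fromLeft, a, out);
                down(n.right, fromLeft + n.left.sum, a, out);
            }
        }

        public static int[] prefixSums(int[] a) {
            int[] out = new int[a.length]; // assumes a.length >= 1
            down(up(a, 0, a.length), 0, a, out);
            return out; // e.g. [1, 2, 3, 4] -> [1, 3, 6, 10]
        }
    }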

Note 32: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: Q$hlo>kuR^
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
Latency of the n-th element in an unbalanced pipeline \(= {{c1::\text{latency}_1 + \bigl(\max(\text{stage\_time}) - \text{time}(\text{stage}_1)\bigr) \cdot (n-1)}}\)

Back

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
Latency of the n-th element in an unbalanced pipeline \(= {{c1::\text{latency}_1 + \bigl(\max(\text{stage\_time}) - \text{time}(\text{stage}_1)\bigr) \cdot (n-1)}}\)

Latency grows linearly with n when the pipeline is unbalanced — elements pile up at the bottleneck. If balanced, this simplifies to latency₁ (constant).
Field-by-field Comparison
Field Before After
Text Latency of the <b>n-th</b> element in an unbalanced pipeline \(= {{c1::\text{latency}_1 + \bigl(\max(\text{stage\_time}) - \text{time}(\text{stage}_1)\bigr) \cdot (n-1)}}\)
Extra Latency grows linearly with n when the pipeline is unbalanced — elements pile up at the bottleneck. If balanced, this simplifies to latency₁ (constant).
Tags: ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining

Note 33: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: Q({*qU*f^i
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
What are the lead-in and lead-out phases of a pipeline?

Back

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
What are the lead-in and lead-out phases of a pipeline?

Lead-in (warm-up): the pipeline is filling — earlier stages have started but later stages are still idle. Not all stages are doing useful work yet.
Lead-out (cool-down): the pipeline is draining — earlier stages have finished but later stages are still processing. Earlier stages become idle.
Field-by-field Comparison
Field Before After
Front What are the <b>lead-in</b> and <b>lead-out</b> phases of a pipeline?
Back <b>Lead-in</b> (warm-up): the pipeline is filling — earlier stages have started but later stages are still idle. Not all stages are doing useful work yet.<br><b>Lead-out</b> (cool-down): the pipeline is draining — earlier stages have finished but later stages are still processing. Earlier stages become idle.
Tags: ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining

Note 34: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: QiCwzwdx4X
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::09._Divide_and_conquer::4._Java's_ForkJoin_framework
In Java ForkJoin, RecursiveTask<V> is used when the task returns a result of type V, while RecursiveAction is used when no result is returned (void).

Back

ETH::2._Semester::PProg::09._Divide_and_conquer::4._Java's_ForkJoin_framework
In Java ForkJoin, RecursiveTask<V> is used when the task returns a result of type V, while RecursiveAction is used when no result is returned (void).

Both extend ForkJoinTask and override compute(). Use RecursiveTask for sum/merge, RecursiveAction for in-place sort.
Field-by-field Comparison
Field Before After
Text In Java ForkJoin, {{c1::<code>RecursiveTask&lt;V&gt;</code>}} is used when the task {{c2::returns a result of type V}}, while {{c3::<code>RecursiveAction</code>}} is used when {{c4::no result is returned (void)}}.
Extra Both extend <code>ForkJoinTask</code> and override <code>compute()</code>. Use <code>RecursiveTask</code> for sum/merge, <code>RecursiveAction</code> for in-place sort.
Tags: ETH::2._Semester::PProg::09._Divide_and_conquer::4._Java's_ForkJoin_framework

Note 35: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: RJC9+1Wgmi
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::09._Divide_and_conquer::3._Executor_Service
exec.shutdown() stops accepting new tasks but lets already-submitted tasks finish. exec.shutdownNow() attempts to interrupt all running tasks and returns the list of unstarted tasks.

Back

ETH::2._Semester::PProg::09._Divide_and_conquer::3._Executor_Service
exec.shutdown() stops accepting new tasks but lets already-submitted tasks finish. exec.shutdownNow() attempts to interrupt all running tasks and returns the list of unstarted tasks.

Use awaitTermination(timeout, unit) after shutdown() to block until all tasks complete or the timeout expires.
Field-by-field Comparison
Field Before After
Text <code>exec.{{c1::shutdown()}}</code> {{c2::stops accepting new tasks but lets already-submitted tasks finish}}. <code>exec.{{c3::shutdownNow()}}</code> {{c4::attempts to interrupt all running tasks and returns the list of unstarted tasks}}.
Extra Use <code>awaitTermination(timeout, unit)</code> after <code>shutdown()</code> to block until all tasks complete or the timeout expires.
Tags: ETH::2._Semester::PProg::09._Divide_and_conquer::3._Executor_Service
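
A minimal sketch of the graceful-shutdown idiom (pool size and timeout are illustrative):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class ShutdownDemo {
        public static void main(String[] args) throws InterruptedException {
            ExecutorService exec = Executors.newFixedThreadPool(4);
            for (int i = 0; i < 10; i++) {
                final int id = i;
                exec.submit(() -> System.out.println("task " + id));
            }
            exec.shutdown(); // stop accepting new tasks; queued tasks still run
            if (!exec.awaitTermination(5, TimeUnit.SECONDS)) {
                exec.shutdownNow(); // timed out: interrupt running tasks, drop the queue
            }
        }
    }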

Note 36: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: TAzr]%>^o%
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::09._Divide_and_conquer::3._Executor_Service
Name the four standard ExecutorService pool types and their key characteristics.

Back

ETH::2._Semester::PProg::09._Divide_and_conquer::3._Executor_Service
Name the four standard ExecutorService pool types and their key characteristics.

  • newFixedThreadPool(n) — fixed n threads; queues excess tasks
  • newSingleThreadExecutor() — exactly 1 thread; tasks execute sequentially
  • newCachedThreadPool() — elastic; creates threads on demand, reuses idle ones
  • newScheduledThreadPool(n) — supports delayed and periodic tasks
Field-by-field Comparison
Field Before After
Front Name the four standard <code>ExecutorService</code> pool types and their key characteristics.
Back <ul><li><code>newFixedThreadPool(n)</code> — fixed n threads; queues excess tasks</li><li><code>newSingleThreadExecutor()</code> — exactly 1 thread; tasks execute sequentially</li><li><code>newCachedThreadPool()</code> — elastic; creates threads on demand, reuses idle ones</li><li><code>newScheduledThreadPool(n)</code> — supports delayed and periodic tasks</li></ul>
Tags: ETH::2._Semester::PProg::09._Divide_and_conquer::3._Executor_Service

Note 37: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: Tc|J+GdHu-
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency
The Java volatile keyword guarantees visibility — every read of a volatile field sees the most recent write by any thread — but does not guarantee atomicity of compound operations (e.g. i++).

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency
The Java volatile keyword guarantees visibility — every read of a volatile field sees the most recent write by any thread — but does not guarantee atomicity of compound operations (e.g. i++).

Use volatile for simple flags (e.g. volatile boolean running). For compound operations, use synchronized or AtomicInteger.
Field-by-field Comparison
Field Before After
Text The Java <code>volatile</code> keyword guarantees {{c1::visibility}} — every read of a volatile field sees the {{c2::most recent write by any thread}} — but does <b>not</b> guarantee {{c3::atomicity of compound operations (e.g. i++)}}.
Extra Use <code>volatile</code> for simple flags (e.g. <code>volatile boolean running</code>). For compound operations, use <code>synchronized</code> or <code>AtomicInteger</code>.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency
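
A minimal sketch of the stop-flag idiom (the class name is illustrative); without volatile, the worker loop may never observe the writer's update:

    public class Worker implements Runnable {
        // volatile: the write in stop() becomes visible to the reading thread.
        // It does NOT make compound updates like count++ atomic.
        private volatile boolean running = true;

        public void stop() { running = false; } // called from another thread

        @Override
        public void run() {
            while (running) {
                // do work
            }
        }
    }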

Note 38: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: U/}jSkoobw
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
Is it possible for \(T_p < T_1/p\)? What does this mean and what causes it?

Back

ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
Is it possible for \(T_p < T_1/p\)? What does this mean and what causes it?

Yes — this is super-linear speedup (\(S_p > p\)). Causes: (1) Cache effects — smaller per-thread working sets fit in L1/L2 cache, making memory access faster than in the sequential run which had cache misses. (2) Algorithmic differences — the parallel version may explore the problem space differently. Not considered 'magic' — it has a real explanation.
Field-by-field Comparison
Field Before After
Front Is it possible for \(T_p < T_1/p\)? What does this mean and what causes it?
Back Yes — this is <b>super-linear speedup</b> (\(S_p > p\)). Causes: (1) <b>Cache effects</b> — smaller per-thread working sets fit in L1/L2 cache, making memory access faster than in the sequential run which had cache misses. (2) <b>Algorithmic differences</b> — the parallel version may explore the problem space differently. Not considered 'magic' — it has a real explanation.
Tags: ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance

Note 39: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: VOg3W+qe2?
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::05._Java_Threads::2._synchronized
If an exception is thrown inside a synchronized block, the lock is automatically released when the exception propagates out of the block.

Back

ETH::2._Semester::PProg::05._Java_Threads::2._synchronized
If an exception is thrown inside a synchronized block, the lock is automatically released when the exception propagates out of the block.

This is because synchronized is structured (unlike ReentrantLock which requires explicit finally). The JVM guarantees lock release on any exit from the block, including exceptions.
Field-by-field Comparison
Field Before After
Text If an exception is thrown inside a <code>synchronized</code> block, the lock is {{c1::automatically released}} when the exception propagates out of the block.
Extra This is because <code>synchronized</code> is structured (unlike <code>ReentrantLock</code> which requires explicit <code>finally</code>). The JVM guarantees lock release on any exit from the block, including exceptions.
Tags: ETH::2._Semester::PProg::05._Java_Threads::2._synchronized

Note 40: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: WSSC!D36#|
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
What is a pipeline (in CPU instruction execution)?

Back

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
What is a pipeline (in CPU instruction execution)?

A pipeline breaks instruction execution into multiple stages (e.g. fetch, decode, execute, write-back) that are overlapped: while stage k processes instruction n+1, stage k+1 processes instruction n. Multiple instructions are in-flight simultaneously — like a factory assembly line.
Field-by-field Comparison
Field Before After
Front What is a pipeline (in CPU instruction execution)?
Back A pipeline breaks instruction execution into multiple stages (e.g. fetch, decode, execute, write-back) that are overlapped: while stage k processes instruction n+1, stage k+1 processes instruction n. Multiple instructions are in-flight simultaneously — like a factory assembly line.
Tags: ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining

Note 41: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: XKz4m$d8-k
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
Under Gustafson's Law, \(\lim_{P \to \infty} S_P = \infty\), because \(S_P = P - f(P-1)\) grows linearly with P (assuming f < 1).

Back

ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
Under Gustafson's Law, \(\lim_{P \to \infty} S_P = \infty\), because \(S_P = P - f(P-1)\) grows linearly with P (assuming f < 1).

This does not contradict Amdahl — Gustafson scales the problem size with P (fixed wall time), so the serial fraction f is a fixed fraction of the larger workload, not a fixed amount of work.
Field-by-field Comparison
Field Before After
Text Under Gustafson's Law, \(\lim_{P \to \infty} S_P = {{c1::\infty}}\), because {{c2::\(S_P = P - f(P-1)\) grows linearly with P}} (assuming f &lt; 1).
Extra This does not contradict Amdahl — Gustafson scales the <b>problem size</b> with P (fixed wall time), so the serial fraction f is a fixed fraction of the larger workload, not a fixed amount of work.
Tags: ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
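
A worked instance with a hypothetical \(f = 0.1\):

\(S_8 = 8 - 0.1 \cdot 7 = 7.3\) and \(S_{100} = 100 - 0.1 \cdot 99 = 90.1\)

The speedup keeps growing linearly with P, whereas Amdahl with the same f caps it at \(1/f = 10\).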

Note 42: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: XZFu@~1-gN
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::05._Java_Threads::3._Wait,_Notify,_NotifyAll
After calling notifyAll(), the calling thread does not release the lock immediately — it only releases it when it exits the synchronized block or calls wait().

Back

ETH::2._Semester::PProg::05._Java_Threads::3._Wait,_Notify,_NotifyAll
After calling notifyAll(), the calling thread does not release the lock immediately — it only releases it when it exits the synchronized block or calls wait().

All woken threads move to RUNNABLE state and compete for the lock, but only one can acquire it at a time. The notifying thread still holds the lock until it finishes its synchronized section.
Field-by-field Comparison
Field Before After
Text After calling <code>notifyAll()</code>, the calling thread {{c1::does not release the lock immediately}} — it only releases it when {{c2::it exits the <code>synchronized</code> block or calls <code>wait()</code>}}.
Extra All woken threads move to RUNNABLE state and compete for the lock, but only one can acquire it at a time. The notifying thread still holds the lock until it finishes its synchronized section.
Tags: ETH::2._Semester::PProg::05._Java_Threads::3._Wait,_Notify,_NotifyAll

Note 43: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: Y!RdYt+6Bf
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures
The Fork/Join work-stealing guarantee \(T_p = O(T_1/p + T_\infty)\) is an expected-time guarantee — it holds in expectation (on average) due to randomized work stealing, not in the worst case.

Back

ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures
The Fork/Join work-stealing guarantee \(T_p = O(T_1/p + T_\infty)\) is an expected-time guarantee — it holds in expectation (on average) due to randomized work stealing, not in the worst case.

The randomness comes from which victim thread is chosen when stealing. In the worst case a bad sequence of steals is possible, but the expected cost is O(T₁/p + T∞).
Field-by-field Comparison
Field Before After
Text The Fork/Join work-stealing guarantee \(T_p = {{c2::O(T_1/p + T_\infty)}}\) is an {{c1::expected-time}} guarantee — it holds in expectation (on average) due to randomized work stealing, not in the worst case.
Extra The randomness comes from which victim thread is chosen when stealing. In the worst case a bad sequence of steals is possible, but the expected cost is O(T₁/p + T∞).
Tags: ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures

Note 44: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: [+?JtoU/P6
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::09._Divide_and_conquer::4._Java's_ForkJoin_framework
How does the Fork/Join work-stealing scheduler work?

Back

ETH::2._Semester::PProg::09._Divide_and_conquer::4._Java's_ForkJoin_framework
How does the Fork/Join work-stealing scheduler work?

Each worker thread has its own deque (double-ended queue) of tasks. It processes its own tasks LIFO from the front. When a thread runs out of work, it steals a task from the back of another thread's deque (the oldest = largest task). This keeps all threads busy and achieves the O(T₁/p + T∞) expected-time guarantee.
Field-by-field Comparison
Field Before After
Front How does the Fork/Join work-stealing scheduler work?
Back Each worker thread has its own <b>deque</b> (double-ended queue) of tasks. It processes its own tasks <b>LIFO</b> from the front. When a thread runs out of work, it <b>steals</b> a task from the <b>back</b> of another thread's deque (the oldest = largest task). This keeps all threads busy and achieves the O(T₁/p + T∞) expected-time guarantee.
Tags: ETH::2._Semester::PProg::09._Divide_and_conquer::4._Java's_ForkJoin_framework

Note 45: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: [SF)]6&aQp
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::09._Divide_and_conquer::3._Executor_Service
exec.invokeAll(tasks) submits all tasks and blocks until all complete, returning a list of Futures. exec.invokeAny(tasks) returns the result of the first task to complete successfully.

Back

ETH::2._Semester::PProg::09._Divide_and_conquer::3._Executor_Service
exec.invokeAll(tasks) submits all tasks and blocks until all complete, returning a list of Futures. exec.invokeAny(tasks) returns the result of the first task to complete successfully.

invokeAll → wait for all. invokeAny → wait for any one. Both submit all tasks to the pool regardless.
Field-by-field Comparison
Field Before After
Text <code>exec.{{c1::invokeAll(tasks)}}</code> submits all tasks and {{c2::blocks until all complete}}, returning a list of Futures. <code>exec.{{c3::invokeAny(tasks)}}</code> returns {{c4::the result of the first task to complete successfully}}.
Extra invokeAll → wait for <i>all</i>. invokeAny → wait for <i>any one</i>. Both submit all tasks to the pool regardless.
Tags: ETH::2._Semester::PProg::09._Divide_and_conquer::3._Executor_Service
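
A short usage sketch (task bodies and pool size are illustrative):

    import java.util.List;
    import java.util.concurrent.*;

    public class InvokeDemo {
        public static void main(String[] args) throws Exception {
            ExecutorService exec = Executors.newFixedThreadPool(4);
            List<Callable<Integer>> tasks = List.of(() -> 1, () -> 2, () -> 3);

            // invokeAll: blocks until every task completes; the returned
            // Futures are already done, in the same order as the task list.
            List<Future<Integer>> all = exec.invokeAll(tasks);
            for (Future<Integer> f : all) System.out.println(f.get());

            // invokeAny: blocks until one task completes successfully and
            // returns its result; the remaining tasks are cancelled.
            System.out.println(exec.invokeAny(tasks));

            exec.shutdown();
        }
    }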

Note 46: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: [e+1vU[=c8
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::09._Divide_and_conquer::3._Executor_Service
The ExecutorService does not guarantee the order in which submitted tasks are executed.

Back

ETH::2._Semester::PProg::09._Divide_and_conquer::3._Executor_Service
The ExecutorService does not guarantee the order in which submitted tasks are executed.

A FixedThreadPool uses a FIFO queue internally, but execution order depends on thread availability and OS scheduling. Do not rely on execution order; invokeAll only guarantees that the returned list of Futures matches submission order, not the order in which the tasks ran. Use explicit dependencies if ordering matters.
Field-by-field Comparison
Field Before After
Text The ExecutorService does <b>not</b> guarantee {{c1::the order in which submitted tasks are executed}}.
Extra A FixedThreadPool uses a FIFO queue internally, but execution order depends on thread availability and OS scheduling. Do not rely on execution order; invokeAll only guarantees that the returned list of Futures matches submission order, not the order in which the tasks ran. Use explicit dependencies if ordering matters.
Tags: ETH::2._Semester::PProg::09._Divide_and_conquer::3._Executor_Service

Note 47: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: [nTh4x4&-6
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
What are the three steps of the parallel Pack algorithm?

Back

ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
What are the three steps of the parallel Pack algorithm?

  1. Flag: Compute a boolean/flag array F where F[i]=1 if element i satisfies the predicate, else 0.
  2. Prefix sum: Compute the exclusive prefix sum of F — gives the output index for each passing element.
  3. Scatter: For each i with F[i]=1, copy A[i] to output[prefix_sum[i]].
Field-by-field Comparison
Field Before After
Front What are the three steps of the parallel Pack algorithm?
Back <ol><li><b>Flag</b>: Compute a boolean/flag array F where F[i]=1 if element i satisfies the predicate, else 0.</li><li><b>Prefix sum</b>: Compute the exclusive prefix sum of F — gives the output index for each passing element.</li><li><b>Scatter</b>: For each i with F[i]=1, copy A[i] to <code>output[prefix_sum[i]]</code>.</li></ol>
Tags: ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
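
A sequential reference sketch of the three steps (the predicate "element > 0" and all names are illustrative; in the parallel version steps 1 and 3 are parallel maps and step 2 is a parallel prefix sum):

    // Pack: keep only the elements of a that satisfy the predicate.
    static int[] pack(int[] a) {
        int n = a.length;

        // 1. Flag: f[i] = 1 iff a[i] passes the predicate.
        int[] f = new int[n];
        for (int i = 0; i < n; i++) f[i] = (a[i] > 0) ? 1 : 0;

        // 2. Exclusive prefix sum of f: the output index of each passing element.
        int[] idx = new int[n];
        int sum = 0;
        for (int i = 0; i < n; i++) { idx[i] = sum; sum += f[i]; }

        // 3. Scatter: copy each passing element to its computed slot.
        int[] out = new int[sum];                 // sum = number of passing elements
        for (int i = 0; i < n; i++)
            if (f[i] == 1) out[idx[i]] = a[i];
        return out;
    }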

Note 48: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: ]axf
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
What is the fundamental difference in assumption between Amdahl's Law and Gustafson's Law?

Back

ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
What is the fundamental difference in assumption between Amdahl's Law and Gustafson's Law?

  • Amdahl's Law: fixed problem size — asks "how much faster can we solve the same problem with more cores?" Pessimistic: speedup is bounded by 1/f.
  • Gustafson's Law: fixed wall-clock time — asks "how much more work can we do in the same time with more cores?" Optimistic: speedup grows as \(P - f(P-1)\), unbounded as \(P \to \infty\).
Key insight: they answer different questions — not contradictory.
Field-by-field Comparison
Field Before After
Front What is the fundamental difference in assumption between Amdahl's Law and Gustafson's Law?
Back <ul><li><b>Amdahl's Law</b>: fixed problem size — asks "how much faster can we solve the same problem with more cores?" Pessimistic: speedup is bounded by 1/f.</li><li><b>Gustafson's Law</b>: fixed wall-clock time — asks "how much more work can we do in the same time with more cores?" Optimistic: speedup grows as \(P - f(P-1)\), unbounded as \(P \to \infty\).</li></ul>Key insight: they answer different questions — not contradictory.
Tags: ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
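
A quick worked comparison (the numbers \(f = 0.1\) and \(P = 10\) are chosen for illustration): Amdahl gives \(S_P = \frac{1}{f + (1-f)/P} = \frac{1}{0.1 + 0.9/10} \approx 5.26\), capped at \(1/f = 10\) no matter how large P gets; Gustafson gives \(S_P = P - f(P-1) = 10 - 0.1 \cdot 9 = 9.1\), which keeps growing with P. Same serial fraction, same core count, very different answers, because the two laws measure different things.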

Note 49: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: ^+Co39jxRN
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
A pipeline is balanced if all stages take the same amount of time.

Back

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
A pipeline is balanced if all stages take the same amount of time.

When balanced: throughput is maximised and latency is constant across all elements. When unbalanced: the slowest stage bottlenecks throughput and latency grows with each element.
Field-by-field Comparison
Field Before After
Text A pipeline is {{c1::balanced}} if {{c2::all stages take the same amount of time}}.
Extra When balanced: throughput is maximised and latency is constant across all elements. When unbalanced: the slowest stage bottlenecks throughput and latency grows with each element.
Tags: ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining

Note 50: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: c9@eiL*Ug=
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
What does an exclusive parallel prefix sum compute?

Back

ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
What does an exclusive parallel prefix sum compute?

For input array A[0..n-1], it produces output B where \(B[i] = \sum_{j=0}^{i-1} A[j]\) (sum of all elements before index i). So B[0] = 0 always.
Example: A = [3,1,4,1,5] → B = [0,3,4,8,9].
Field-by-field Comparison
Field Before After
Front What does an <b>exclusive parallel prefix sum</b> compute?
Back For input array A[0..n-1], it produces output B where \(B[i] = \sum_{j=0}^{i-1} A[j]\) (sum of all elements <i>before</i> index i). So B[0] = 0 always.<br>Example: A = [3,1,4,1,5] → B = [0,3,4,8,9].
Tags: ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms

Note 51: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: eib*6wnIWb
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::11._Parallel_Algorithms::2._Parallel_patterns
Fill in the work, span, and parallelism for standard parallel patterns on n elements:
Map / Reduction / Prefix sum / Pack

Back

ETH::2._Semester::PProg::11._Parallel_Algorithms::2._Parallel_patterns
Fill in the work, span, and parallelism for standard parallel patterns on n elements:
Map / Reduction / Prefix sum / Pack

All four achieve: Work \(O(n)\), Span \(O(\log n)\), Parallelism \(O(n/\log n)\) via divide-and-conquer trees. Note: these assume arrays. Using linked lists instead degrades span to O(n).
Field-by-field Comparison
Field Before After
Front Fill in the work, span, and parallelism for standard parallel patterns on n elements:<br>Map / Reduction / Prefix sum / Pack
Back All four achieve: <b>Work</b> \(O(n)\), <b>Span</b> \(O(\log n)\), <b>Parallelism</b> \(O(n/\log n)\) via divide-and-conquer trees. Note: these assume arrays. Using linked lists instead degrades span to O(n).
Tags: ETH::2._Semester::PProg::11._Parallel_Algorithms::2._Parallel_patterns

Note 52: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: fG2/O&24Z-
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::03._Java_Threads::6._Interrupts
What happens to a thread's interrupt flag after it catches an InterruptedException?

Back

ETH::2._Semester::PProg::03._Java_Threads::6._Interrupts
What happens to a thread's interrupt flag after it catches an InterruptedException?

The flag is automatically cleared (set to false). If you catch the exception and do nothing, the thread behaves as if it had never been interrupted. Best practice: either re-interrupt (Thread.currentThread().interrupt()) or propagate the exception upward.
Field-by-field Comparison
Field Before After
Front What happens to a thread's interrupt flag after it catches an <code>InterruptedException</code>?
Back The flag is <b>automatically cleared</b> (set to false). If you catch the exception and do nothing, the thread behaves as if it had never been interrupted. Best practice: either re-interrupt (<code>Thread.currentThread().interrupt()</code>) or propagate the exception upward.
Tags: ETH::2._Semester::PProg::03._Java_Threads::6._Interrupts
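
A minimal sketch of the re-interrupt idiom (the sleeping body is illustrative):

    public class InterruptDemo {
        public static void main(String[] args) throws InterruptedException {
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(1_000);        // throws InterruptedException and clears the flag
                } catch (InterruptedException e) {
                    // Re-assert the flag so callers further up the stack
                    // can still observe that an interrupt happened.
                    Thread.currentThread().interrupt();
                }
                System.out.println("interrupted? " + Thread.currentThread().isInterrupted());
            });
            t.start();
            t.interrupt();                      // prints: interrupted? true
            t.join();
        }
    }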

Note 53: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: i/wHwua}M}
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::05._Java_Threads
Why is the following code a race condition even if map is a thread-safe ConcurrentHashMap?
if (!map.containsKey(key)) { map.put(key, value); }

Back

ETH::2._Semester::PProg::05._Java_Threads
Why is the following code a race condition even if map is a thread-safe ConcurrentHashMap?
if (!map.containsKey(key)) { map.put(key, value); }

The check (containsKey) and the act (put) are two separate atomic operations — another thread can put the same key between them. This is the check-then-act anti-pattern. Fix: use map.putIfAbsent(key, value) (single atomic operation) or protect the whole block with synchronized.
Field-by-field Comparison
Field Before After
Front Why is the following code a race condition even if <code>map</code> is a thread-safe <code>ConcurrentHashMap</code>?<br><pre>if (!map.containsKey(key)) { map.put(key, value); }</pre>
Back The <b>check</b> (<code>containsKey</code>) and the <b>act</b> (<code>put</code>) are two separate atomic operations — another thread can <code>put</code> the same key between them. This is the <b>check-then-act</b> anti-pattern. Fix: use <code>map.putIfAbsent(key, value)</code> (single atomic operation) or protect the whole block with <code>synchronized</code>.
Tags: ETH::2._Semester::PProg::05._Java_Threads
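
The broken and fixed versions side by side (names are illustrative):

    import java.util.concurrent.ConcurrentHashMap;

    public class CheckThenAct {
        static final ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();

        static void putBroken(String key, int value) {
            // BROKEN: two separate atomic operations; another thread
            // can insert the same key between the check and the put.
            if (!map.containsKey(key)) {
                map.put(key, value);
            }
        }

        static void putFixed(String key, int value) {
            // FIXED: a single atomic operation on the map.
            map.putIfAbsent(key, value);
        }
    }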

Note 54: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: iy@XpW4n&4
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures
When does the work law (\(T_p \geq T_1/p\)) dominate vs. the span law (\(T_p \geq T_\infty\))?

Back

ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures
When does the work law (\(T_p \geq T_1/p\)) dominate vs. the span law (\(T_p \geq T_\infty\))?

Work law dominates when p is small — not enough workers to exploit available parallelism. Span law dominates when p is large — adding more processors no longer helps because \(T_\infty\) is the bottleneck. The crossover is at \(p = T_1/T_\infty\) (the parallelism of the DAG).
Field-by-field Comparison
Field Before After
Front When does the work law (\(T_p \geq T_1/p\)) dominate vs. the span law (\(T_p \geq T_\infty\))?
Back <b>Work law dominates</b> when p is small — not enough workers to exploit available parallelism. <b>Span law dominates</b> when p is large — adding more processors no longer helps because \(T_\infty\) is the bottleneck. The crossover is at \(p = T_1/T_\infty\) (the parallelism of the DAG).
Tags: ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures

Note 55: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: l!G8RlO+dn
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::09._Divide_and_conquer::3._Executor_Service
future.get() blocks the calling thread until the submitted task completes and returns its result.

Back

ETH::2._Semester::PProg::09._Divide_and_conquer::3._Executor_Service
future.get() blocks the calling thread until the submitted task completes and returns its result.

future.get(timeout, unit) throws TimeoutException if the task does not finish in time. future.cancel(true) attempts to interrupt the running task.
Field-by-field Comparison
Field Before After
Text {{c1::<code>future.get()</code>}} {{c2::blocks}} the calling thread until the submitted task completes and returns its result.
Extra <code>future.get(timeout, unit)</code> throws <code>TimeoutException</code> if the task does not finish in time. <code>future.cancel(true)</code> attempts to interrupt the running task.
Tags: ETH::2._Semester::PProg::09._Divide_and_conquer::3._Executor_Service
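
A small sketch of the blocking and timeout variants (values are illustrative):

    import java.util.concurrent.*;

    public class FutureDemo {
        public static void main(String[] args) throws Exception {
            ExecutorService exec = Executors.newSingleThreadExecutor();

            Future<Integer> quick = exec.submit(() -> 6 * 7);
            System.out.println(quick.get());          // blocks until done, prints 42

            Future<Integer> slow = exec.submit(() -> { Thread.sleep(5_000); return 0; });
            try {
                slow.get(100, TimeUnit.MILLISECONDS); // give up after the timeout
            } catch (TimeoutException e) {
                slow.cancel(true);                    // attempt to interrupt the task
            }
            exec.shutdown();
        }
    }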

Note 56: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: m7O|KIKqyC
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
Under Amdahl's Law, as \(P \to \infty\), efficiency \(E = S_P / P\) approaches 0, because speedup is bounded by \(1/f\) while P grows without bound.

Back

ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
Under Amdahl's Law, as \(P \to \infty\), efficiency \(E = S_P / P\) approaches 0, because speedup is bounded by \(1/f\) while P grows without bound.

Even if you get the maximum speedup \(1/f\), dividing by \(P \to \infty\) drives efficiency to zero. More cores are increasingly wasted.
Field-by-field Comparison
Field Before After
Text Under Amdahl's Law, as \(P \to \infty\), efficiency \(E = S_P / P\) {{c1::approaches 0}}, because {{c2::speedup is bounded by \(1/f\) while P grows without bound}}.
Extra Even if you get the maximum speedup \(1/f\), dividing by \(P \to \infty\) drives efficiency to zero. More cores are increasingly wasted.
Tags: ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance

Note 57: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: m]{8XT!DLG
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
Pipeline throughput for a finite number N of elements \(= {{c1::\dfrac{N}{\text{total time for } N \text{ elements}}}}\)

Back

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
Pipeline throughput for a finite number N of elements \(= {{c1::\dfrac{N}{\text{total time for } N \text{ elements}}}}\)

As \(N \to \infty\), this converges to \(1/\max_i(\text{stage\_time}_i)\). For small N, lead-in/lead-out reduce it below the theoretical bound.
Field-by-field Comparison
Field Before After
Text Pipeline throughput for a finite number N of elements \(= {{c1::\dfrac{N}{\text{total time for } N \text{ elements}}}}\)
Extra As \(N \to \infty\), this converges to \(1/\max_i(\text{stage\_time}_i)\). For small N, lead-in/lead-out reduce it below the theoretical bound.
Tags: ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining

Note 58: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: nvfkOuvf3v
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
Compare work/span/parallelism for two parallel quicksort strategies:
1. Parallelize only the recursive calls
2. Also parallelize the partition step (via pack)

Back

ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
Compare work/span/parallelism for two parallel quicksort strategies:
1. Parallelize only the recursive calls
2. Also parallelize the partition step (via pack)

  1. Parallel recursive calls only: Work \(O(n \log n)\), Span \(O(n)\), Parallelism \(O(\log n)\)
  2. + parallel partition (pack): Work \(O(n \log n)\), Span \(O(\log^2 n)\), Parallelism \(O(n/\log n)\)
Parallelizing the partition reduces span from \(O(n)\) to \(O(\log^2 n)\), dramatically increasing parallelism.
Field-by-field Comparison
Field Before After
Front Compare work/span/parallelism for two parallel quicksort strategies:<br>1. Parallelize only the recursive calls<br>2. Also parallelize the partition step (via pack)
Back <ol><li>Parallel recursive calls only: Work \(O(n \log n)\), Span \(O(n)\), Parallelism \(O(\log n)\)</li><li>+ parallel partition (pack): Work \(O(n \log n)\), Span \(O(\log^2 n)\), Parallelism \(O(n/\log n)\)</li></ol>Parallelizing the partition reduces span from \(O(n)\) to \(O(\log^2 n)\), dramatically increasing parallelism.
Tags: ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms

Note 59: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: n~&W/iS)c^
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures
Why is \(T_p \geq T_\infty\) a strict lower bound — why can no scheduler beat it?

Back

ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures
Why is \(T_p \geq T_\infty\) a strict lower bound — why can no scheduler beat it?

\(T_\infty\) is the length of the critical path — a chain of nodes where each depends on the previous. Even with infinite processors, these nodes must execute sequentially. No amount of parallelism can compress a dependency chain.
Field-by-field Comparison
Field Before After
Front Why is \(T_p \geq T_\infty\) a strict lower bound — why can no scheduler beat it?
Back \(T_\infty\) is the length of the <b>critical path</b> — a chain of nodes where each depends on the previous. Even with infinite processors, these nodes must execute <b>sequentially</b>. No amount of parallelism can compress a dependency chain.
Tags: ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures

Note 60: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: o5rCKq0mpv
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
What causes super-linear speedup (S_p > p), and is it "real"?

Back

ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance
What causes super-linear speedup (S_p > p), and is it "real"?

Yes — it can occur in practice. Most common cause: cache effects — with p processors, each processor's working set is smaller and fits in faster cache (L1/L2). This makes per-processor work genuinely faster than the single-processor baseline. Also possible via better branch prediction or algorithmic differences in the parallel version.
Field-by-field Comparison
Field Before After
Front What causes super-linear speedup (S_p &gt; p), and is it "real"?
Back Yes — it can occur in practice. Most common cause: <b>cache effects</b> — with p processors, each processor's working set is smaller and fits in faster cache (L1/L2). This makes per-processor work genuinely faster than the single-processor baseline. Also possible via better branch prediction or algorithmic differences in the parallel version.
Tags: ETH::2._Semester::PProg::07._Concepts::2._Parallel_performance

Note 61: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: pogaT0kdC(
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
What happens during the down pass (prefix pass) of the parallel prefix-sum algorithm?

Back

ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
What happens during the down pass (prefix pass) of the parallel prefix-sum algorithm?

The root receives 0 (for exclusive prefix sum). At each internal node, the value passed down is forwarded to the left child as-is, and to the right child as: value_from_above + left_child's_up-pass_value. Leaf i ends with the exclusive prefix sum B[i]. Done top-down with O(n) work and O(log n) span.
Field-by-field Comparison
Field Before After
Front What happens during the <b>down pass</b> (prefix pass) of the parallel prefix-sum algorithm?
Back The root receives <b>0</b> (for exclusive prefix sum). At each internal node, the value passed down is forwarded to the <b>left child as-is</b>, and to the <b>right child</b> as: <code>value_from_above + left_child's_up-pass_value</code>. Leaf i ends with the exclusive prefix sum B[i]. Done top-down with O(n) work and O(log n) span.
Tags: ETH::2._Semester::PProg::11._Parallel_Algorithms::3._Parallel_algorithms
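
A sequential sketch that fuses the two passes into one recursion (in the parallel version the two recursive calls are forked). Each call returns its subtree's up-pass sum while the fromLeft value travels down exactly as described above; running it on A = [3,1,4,1,5] reproduces B = [0,3,4,8,9] from the earlier card:

    public class ExclusiveScan {
        // Writes exclusive prefix sums of a[lo..hi) into out, given the sum of
        // everything left of lo; returns the up-pass sum of a[lo..hi).
        static long scan(long[] a, long[] out, int lo, int hi, long fromLeft) {
            if (hi - lo == 1) {
                out[lo] = fromLeft;                           // leaf: its exclusive prefix
                return a[lo];
            }
            int mid = (lo + hi) / 2;
            long left = scan(a, out, lo, mid, fromLeft);          // left child: value as-is
            long right = scan(a, out, mid, hi, fromLeft + left);  // right: add left's up-pass sum
            return left + right;
        }

        public static void main(String[] args) {
            long[] a = {3, 1, 4, 1, 5};
            long[] b = new long[a.length];
            scan(a, b, 0, a.length, 0);                       // the root receives 0
            System.out.println(java.util.Arrays.toString(b)); // [0, 3, 4, 8, 9]
        }
    }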

Note 62: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: rP!!UaISrY
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures
Given a DAG with parallelism \(T_1/T_\infty\), what happens when you use \(p > T_1/T_\infty\) cores?

Back

ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures
Given a DAG with parallelism \(T_1/T_\infty\), what happens when you use \(p > T_1/T_\infty\) cores?

Beyond \(p = T_1/T_\infty\), the span law \(T_p \geq T_\infty\) becomes the binding constraint — extra cores are wasted. In practice: never worth deploying more than \(T_1/T_\infty\) processors for a given task graph — diminishing returns become total waste.
Field-by-field Comparison
Field Before After
Front Given a DAG with parallelism \(T_1/T_\infty\), what happens when you use \(p > T_1/T_\infty\) cores?
Back Beyond \(p = T_1/T_\infty\), the span law \(T_p \geq T_\infty\) becomes the binding constraint — extra cores are wasted. In practice: never worth deploying more than \(T_1/T_\infty\) processors for a given task graph — diminishing returns become total waste.
Tags: ETH::2._Semester::PProg::10._Parallel_Algorithms::1._Task_graphs_and_performance_measures

Note 63: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: sRs/ljskpO
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::05._Java_Threads::2._synchronized
Why should you never use synchronized(someInteger) or synchronized(someString) as a lock?

Back

ETH::2._Semester::PProg::05._Java_Threads::2._synchronized
Why should you never use synchronized(someInteger) or synchronized(someString) as a lock?

Boxed types (Integer, Long) and String are immutable — operations like integer++ create a new object. The lock object changes, so two threads may lock on different objects and both enter the critical section simultaneously. Use a dedicated private final Object lock = new Object() instead.
Field-by-field Comparison
Field Before After
Front Why should you never use <code>synchronized(someInteger)</code> or <code>synchronized(someString)</code> as a lock?
Back Boxed types (<code>Integer</code>, <code>Long</code>) and <code>String</code> are <b>immutable</b> — operations like <code>integer++</code> create a new object. The lock object changes, so two threads may lock on different objects and both enter the critical section simultaneously. Use a dedicated <code>private final Object lock = new Object()</code> instead.
Tags: ETH::2._Semester::PProg::05._Java_Threads::2._synchronized
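
The safe pattern next to the broken one (field names are illustrative):

    public class Counter {
        // BAD as a lock: count++ unboxes, increments, and re-boxes,
        // replacing the Integer object the threads synchronized on.
        private Integer count = 0;

        // GOOD: a dedicated lock whose identity never changes.
        private final Object lock = new Object();

        void increment() {
            synchronized (lock) {   // never: synchronized (count)
                count++;
            }
        }
    }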

Note 64: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: uQ3(lLENx0
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::11._Parallel_Algorithms::2._Parallel_patterns
Why does it matter whether you implement a parallel reduction on a linked list vs. a balanced tree/array?

Back

ETH::2._Semester::PProg::11._Parallel_Algorithms::2._Parallel_patterns
Why does it matter whether you implement a parallel reduction on a linked list vs. a balanced tree/array?

On a linked list, you must traverse O(n) nodes — span is O(n), no better than sequential. On a balanced tree or array, divide-and-conquer reaches any element in O(log n) steps, giving O(log n) span. Trees give the same flexibility as lists while preserving O(log n) span — preferred for parallel algorithms.
Field-by-field Comparison
Field Before After
Front Why does it matter whether you implement a parallel reduction on a linked list vs. a balanced tree/array?
Back On a <b>linked list</b>, you must traverse O(n) nodes — span is O(n), no better than sequential. On a <b>balanced tree or array</b>, divide-and-conquer reaches any element in O(log n) steps, giving O(log n) span. Trees give the same flexibility as lists while preserving O(log n) span — preferred for parallel algorithms.
Tags: ETH::2._Semester::PProg::11._Parallel_Algorithms::2._Parallel_patterns

Note 65: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: vIJVWjg2>Y
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency
A SpinLock is a lock where a thread trying to acquire it busy-waits (spins) in a loop until it becomes available, rather than sleeping.

Back

ETH::2._Semester::PProg::12._Shared_Memory_Concurrency
A SpinLock is a lock where a thread trying to acquire it busy-waits (spins) in a loop until it becomes available, rather than sleeping.

Trade-off: no context-switch overhead (good for short critical sections), but wastes CPU cycles (bad for long waits). Contrast with blocking locks (synchronized, ReentrantLock) which suspend the thread.
Field-by-field Comparison
Field Before After
Text A {{c1::SpinLock}} is a lock where a thread trying to acquire it {{c2::busy-waits (spins) in a loop}} until it becomes available, rather than sleeping.
Extra Trade-off: no context-switch overhead (good for short critical sections), but wastes CPU cycles (bad for long waits). Contrast with blocking locks (<code>synchronized</code>, <code>ReentrantLock</code>) which suspend the thread.
Tags: ETH::2._Semester::PProg::12._Shared_Memory_Concurrency
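
A minimal test-and-set sketch built on AtomicBoolean (not necessarily the course's exact implementation):

    import java.util.concurrent.atomic.AtomicBoolean;

    public class SpinLock {
        private final AtomicBoolean held = new AtomicBoolean(false);

        void lock() {
            // Busy-wait: atomically flip false -> true, spinning until it succeeds.
            while (!held.compareAndSet(false, true)) {
                Thread.onSpinWait();   // Java 9+ hint that this is a spin loop
            }
        }

        void unlock() {
            held.set(false);
        }
    }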

Note 66: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: zzYi%6?711
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::Terminology
What are the four necessary conditions for deadlock (Coffman conditions)?

Back

ETH::2._Semester::PProg::Terminology
What are the four necessary conditions for deadlock (Coffman conditions)?

  1. Mutual exclusion — at least one resource is held in non-shareable mode
  2. Hold and wait — a thread holds at least one resource while waiting for another
  3. No preemption — resources cannot be forcibly taken away
  4. Circular wait — a cycle of threads each waiting for a resource held by the next
All four must hold simultaneously. Breaking any one prevents deadlock.
Field-by-field Comparison
Field Before After
Front What are the four necessary conditions for deadlock (Coffman conditions)?
Back <ol><li><b>Mutual exclusion</b> — at least one resource is held in non-shareable mode</li><li><b>Hold and wait</b> — a thread holds at least one resource while waiting for another</li><li><b>No preemption</b> — resources cannot be forcibly taken away</li><li><b>Circular wait</b> — a cycle of threads each waiting for a resource held by the next</li></ol>All four must hold simultaneously. Breaking any one prevents deadlock.
Tags: ETH::2._Semester::PProg::Terminology

Note 67: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Classic
GUID: |xpGo>FdYs
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
A pipeline has 4 stages with times [2, 4, 2, 2]. Is it balanced? What is the throughput? What is the latency of the 1st element? Of the 3rd?

Back

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
A pipeline has 4 stages with times [2, 4, 2, 2]. Is it balanced? What is the throughput? What is the latency of the 1st element? Of the 3rd?

  • Balanced? No — stage 2 takes 4 units; others take 2. Bottleneck is stage 2.
  • Throughput \(= 1/\max(2,4,2,2) = 1/4\) items per time unit.
  • Latency (1st element) \(= 2+4+2+2 = 10\)
  • Latency (3rd element) \(= 10 + (4-2)\cdot(3-1) = 10+4 = 14\)
Field-by-field Comparison
Field Before After
Front A pipeline has 4 stages with times [2, 4, 2, 2]. Is it balanced? What is the throughput? What is the latency of the 1st element? Of the 3rd?
Back <ul><li><b>Balanced?</b> No — stage 2 takes 4 units; others take 2. Bottleneck is stage 2.</li><li><b>Throughput</b> \(= 1/\max(2,4,2,2) = 1/4\) items per time unit.</li><li><b>Latency (1st element)</b> \(= 2+4+2+2 = 10\)</li><li><b>Latency (3rd element)</b> \(= 10 + (4-2)\cdot(3-1) = 10+4 = 14\)</li></ul>
Tags: ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining

Note 68: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: |~C[soZuiK
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
Bandwidth of a pipeline is the amount of work being processed in parallel at any given time.

Back

ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining
Bandwidth of a pipeline is the amount of work being processed in parallel at any given time.

Distinct from throughput (items/time) and latency (time/item). Bandwidth captures how many elements are simultaneously in-flight across all stages.
Field-by-field Comparison
Field Before After
Text {{c1::Bandwidth}} of a pipeline is {{c2::the amount of work being processed in parallel at any given time}}.
Extra Distinct from throughput (items/time) and latency (time/item). Bandwidth captures how many elements are simultaneously in-flight across all stages.
Tags: ETH::2._Semester::PProg::06._Parallel_Architecture::3._Pipelining

Note 69: ETH::2. Semester::PProg

Deck: ETH::2. Semester::PProg
Note Type: Horvath Cloze
GUID: ~^81m|RfVI
added

Previous

Note did not exist

New Note

Front

ETH::2._Semester::PProg::Terminology
A standard way to prevent deadlock is to always acquire multiple locks in a globally consistent (total) order, so that circular wait is impossible.

Back

ETH::2._Semester::PProg::Terminology
A standard way to prevent deadlock is to always acquire multiple locks in a globally consistent (total) order, so that circular wait is impossible.

Example: always acquire lockA before lockB. If every thread follows this order, no cycle can form. This breaks the "circular wait" Coffman condition.
Field-by-field Comparison
Field Before After
Text A standard way to prevent deadlock is to always acquire multiple locks in {{c1::a globally consistent (total) order}}, so that {{c2::circular wait is impossible}}.
Extra Example: always acquire lockA before lockB. If every thread follows this order, no cycle can form. This breaks the "circular wait" Coffman condition.
Tags: ETH::2._Semester::PProg::Terminology
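
A minimal sketch of the ordering discipline (lock names are illustrative):

    public class Ordered {
        private final Object lockA = new Object();
        private final Object lockB = new Object();

        // Every thread that needs both locks takes lockA first, then lockB,
        // so no cycle of waiting threads can ever form.
        void useBoth() {
            synchronized (lockA) {
                synchronized (lockB) {
                    // ... critical section using both resources ...
                }
            }
        }
        // A second method acquiring lockB before lockA would reintroduce
        // the possibility of circular wait.
    }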