Numerical Problems with Pipelining
Q1: If each clock cycle takes 15 milliseconds, and there are 4 instructions that each pass through 5 stages to complete execution in the pipeline,
1. How much time is required to complete the execution of all instructions?
2. Calculate the efficiency of the system.
Solution: A
Total clock cycles = K + (n-1), where K is the number of stages (K = 5) and n is the number of instructions (n = 4)
= 5 + (4-1)
= 8
Time for one clock cycle = 15 ms
Time for 8 clock cycles = 15*8 ms = 120 ms
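The cycle count above can be checked with a few lines of Python. This is only a minimal sketch for an ideal pipeline with no stalls; the function names pipeline_cycles and pipeline_time are made up for this illustration.

```python
def pipeline_cycles(stages: int, instructions: int) -> int:
    """Clock cycles for an ideal (stall-free) pipeline: K + (n - 1)."""
    return stages + (instructions - 1)


def pipeline_time(stages: int, instructions: int, cycle_time: float) -> float:
    """Total time = number of cycles x time per cycle (same unit as cycle_time)."""
    return pipeline_cycles(stages, instructions) * cycle_time


# Q1: K = 5 stages, n = 4 instructions, 15 ms per clock cycle
cycles = pipeline_cycles(5, 4)
total_ms = pipeline_time(5, 4, 15)
print(cycles, total_ms)  # 8 120
```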
Solution: B
Efficiency or utilization = total number of used boxes in the pipeline diagram / total number of boxes
In the diagram for 4 instructions with 5 stages, the total number of boxes is 40 (5 stages x 8 clock cycles).
There are 4 instructions, each using 5 stages, so a total of 20 boxes are used.
Efficiency or utilization = 20/40 = ½
Therefore, CPI is almost one in pipelining, and the pipeline has higher efficiency and throughput compared to non-pipelining.
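The box-counting idea can also be sketched in Python. The helper name pipeline_efficiency is hypothetical, and the sketch assumes an ideal pipeline in which every instruction uses every stage exactly once.

```python
from fractions import Fraction


def pipeline_efficiency(stages: int, instructions: int) -> Fraction:
    """Used boxes / total boxes in the space-time diagram."""
    total_cycles = stages + (instructions - 1)   # columns in the diagram
    used_boxes = instructions * stages           # each instruction visits each stage once
    total_boxes = total_cycles * stages          # rows x columns
    return Fraction(used_boxes, total_boxes)


# Q1: 4 instructions, 5 stages -> 20 used boxes out of 40
print(pipeline_efficiency(5, 4))  # 1/2
```

Under these assumptions the ratio simplifies to n / (K + n - 1), so efficiency moves toward 1 as the number of instructions grows.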
Speedup Formula
Speedup is the ratio of non-pipelined execution time to pipelined execution time. For example, 8 instructions were completed in 12 clock cycles in the pipeline, but 40 cycles were required for non-pipelining.
So, the speedup will be
Speedup = NP/P = 40/12 ≈ 3.33, so the speedup is about 3.33 times.
NP is non-pipelining, and P is pipelining.
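As a quick check of the ratio, here is a small Python sketch. It assumes a 5-stage pipeline (the only stage count consistent with 12 pipelined and 40 non-pipelined cycles for 8 instructions) and that every stage takes one clock cycle; the function name speedup is chosen for this example only.

```python
def speedup(stages: int, instructions: int) -> float:
    """Speedup = non-pipelined cycles / pipelined cycles (ideal, no stalls)."""
    non_pipelined_cycles = instructions * stages      # each instruction runs start to finish
    pipelined_cycles = stages + (instructions - 1)    # K + (n - 1)
    return non_pipelined_cycles / pipelined_cycles


# 8 instructions on 5 stages: 40 cycles vs 12 cycles
print(round(speedup(5, 8), 2))  # 3.33
```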
Stage Delay
Every stage has circuits that are used to process data. So, some time is required at every stage, called the stage delay.
Registers Delay
Registers between stages are used to store intermediate results. Each register holds the output of the previous stage as the input for the very next stage. If the stage delay is uniform, then there is no waiting in the registers, and a result can be passed directly to the next stage.
But if one stage’s processing speed is mismatched with another's (say, stage 1 completes in 5 ns but stage 2 is still processing because its delay is 8 ns), then the intermediate result has to be held in the register for some time until the next stage (stage 2) is ready.
Stage delays and register delays are given below in the diagram.
Q2: A 4-stage pipeline has stage delays of 150, 120, 160, and 140 ns. Registers are used between stages and have a delay of 5 ns each. Assuming a constant clock rate, the total time taken to process 1000 data items on this pipeline will be …...?
Solution:
The clock cycle must be long enough for the slowest stage, so that every stage can finish within one cycle. The maximum delay is found in stage 3, which gives a cycle time of 165 ns (160 ns stage delay + 5 ns register delay).
The first instruction/data item passes through all the stages, and the rest of the items follow it through the pipeline, with one item completing in every cycle after that. So, the formula will be as follows.
First instruction x stages x time + Remaining instructions x 1 x time
= 1 x 4 x 165 + 999 x 1 x 165 ns ≈ 165.5 µs
OR
Total Time Taken = (1 * 4 * 165) + (999 * 1 * 165)
= 165495 ns
= 165.495 µs
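The same calculation can be written as a short Python sketch. It assumes a constant clock set by the slowest stage plus one register delay, and no stalls; the function name pipeline_total_time_ns is hypothetical.

```python
def pipeline_total_time_ns(stage_delays_ns, register_delay_ns, items):
    """Total time = (stages + items - 1) x cycle time, in nanoseconds."""
    # The clock must fit the slowest stage plus its register delay.
    cycle_time = max(stage_delays_ns) + register_delay_ns
    stages = len(stage_delays_ns)
    # First item needs `stages` cycles; each remaining item adds one cycle.
    return (stages + (items - 1)) * cycle_time


total_ns = pipeline_total_time_ns([150, 120, 160, 140], 5, 1000)
print(total_ns, total_ns / 1000)  # 165495 165.495
```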
Q3: Consider a non-pipelined processor with a clock rate of 2.5 GHz and an average cycles per instruction (CPI) of four. The same processor is upgraded to a pipelined processor with five stages. However, the clock speed is reduced to 2 GHz due to internal pipeline delay. Assume that there is no stall (ideal condition) in the pipeline. The speedup achieved in the pipelined processor is?
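Solution:
Speedup = TNP/TP (“NP” is non-pipelining and “P” is pipelining)
As T = 1/F, so
TNP = 4 × 1/(2.5 × 10^9) sec = 1.6 ns
TP = 1 × 1/(2 × 10^9) sec = 0.5 ns
Speedup = (4 × 1/(2.5 × 10^9) sec) / (1 × 1/(2 × 10^9) sec) = 1.6 ns / 0.5 ns = 3.2
Note: Time for one instruction = cycles per instruction x clock period, where clock period = 1 / clock rate. In the ideal pipeline the CPI is 1.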
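A small Python sketch of this calculation follows. It assumes an ideal pipelined CPI of 1, as the question states (no stalls); the helper name time_per_instruction_s is made up for the example.

```python
def time_per_instruction_s(cpi: float, clock_hz: float) -> float:
    """Time per instruction = CPI x clock period = CPI / clock rate (seconds)."""
    return cpi / clock_hz


t_np = time_per_instruction_s(4, 2.5e9)  # non-pipelined: CPI 4 at 2.5 GHz -> 1.6 ns
t_p = time_per_instruction_s(1, 2e9)     # pipelined (ideal): CPI 1 at 2 GHz -> 0.5 ns

print(round(t_np / t_p, 2))  # 3.2
```

Note that the number of stages (five) does not enter the calculation directly; in this ideal case only the CPI and the clock rate matter.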
Q4: Consider a pipeline having 4 phases with durations of 60, 50, 90 and 80 ns. The given latch delay is 10 ns. Calculate-
1. Pipeline cycle time
2. Non-pipeline execution time
3. Speed up ratio
4. Pipeline time for 1000 tasks
5. Sequential time for 1000 tasks
6. Throughput
Solution-
Given-
1) Four stage pipeline is used
2) Delay of stages = 60, 50, 90 and 80 ns
3) Latch delay or delay due to each register = 10 ns
Part-01: Pipeline Cycle Time-
Cycle time
= Maximum delay due to any stage + Delay due to its register
= Max {60, 50, 90, 80} ns + 10 ns
= 90 ns + 10 ns
= 100 ns
Part-02: Non-Pipeline Execution Time-
Non-pipeline execution time for one instruction
= 60 ns + 50 ns + 90 ns + 80 ns
= 280 ns
Part-03: Speed Up Ratio-
Speed up
= Non-pipeline execution time / Pipeline execution time
= 280 ns / Cycle time
= 280 ns / 100 ns
= 2.8
Part-04: Pipeline Time For 1000 Tasks-
Pipeline time for 1000 tasks
= Time taken for 1st task + Time taken for remaining 999 tasks
= 1 x 4 clock cycles + 999 x 1 clock cycle
= 4 x cycle time + 999 x cycle time
= 4 x 100 ns + 999 x 100 ns
= 400 ns + 99900 ns
= 100300 ns
Part-05: Sequential Time For 1000 Tasks-
Non-pipeline time for 1000 tasks
= 1000 x Time taken for one task
= 1000 x 280 ns
= 280000 ns
Part-06: Throughput-
Throughput for pipelined execution
= Number of instructions executed per unit time
= 1000 tasks / 100300 ns
≈ 0.00997 tasks per ns
≈ 9.97 x 10^6 tasks per second
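All six parts of Q4 can be worked through in one short Python sketch. The variable names are chosen for this illustration, and the sketch assumes an ideal pipeline with no stalls and a latch delay that applies only in the pipelined case, as in the solution above.

```python
stage_delays_ns = [60, 50, 90, 80]
latch_delay_ns = 10
tasks = 1000

# Part 1: cycle time = slowest stage + latch delay
cycle_time_ns = max(stage_delays_ns) + latch_delay_ns
# Part 2: non-pipelined time for one task = sum of stage delays
non_pipeline_ns = sum(stage_delays_ns)
# Part 3: speed up ratio for a single task in steady state
speedup_ratio = non_pipeline_ns / cycle_time_ns
# Part 4: first task takes 4 cycles, each remaining task adds 1 cycle
pipeline_total_ns = (len(stage_delays_ns) + tasks - 1) * cycle_time_ns
# Part 5: sequential time = tasks x time for one task
sequential_total_ns = tasks * non_pipeline_ns
# Part 6: throughput in tasks per second (1 ns = 1e-9 s)
throughput_per_s = tasks / (pipeline_total_ns * 1e-9)

print(cycle_time_ns, non_pipeline_ns, speedup_ratio)  # 100 280 2.8
print(pipeline_total_ns, sequential_total_ns)         # 100300 280000
print(f"{throughput_per_s:.3e} tasks per second")     # 9.970e+06 tasks per second
```

The throughput comes out slightly below the steady-state limit of one task per 100 ns (10^7 tasks per second) because the first task needs 4 cycles to fill the pipeline.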
Q5: A four-stage pipeline has the stage delays as 150, 120, 160 and 140 ns respectively.
Registers are used between the stages and have a delay of 5 ns each. Assuming
constant clocking rate, the total time taken to process 1000 data items on the
pipeline will be-
1. 120.4 microseconds
2. 160.5 microseconds
3. 165.5 microseconds
4. 590.0 microseconds
Solution-
Given-
· Four stage pipeline is used
· Delay of stages = 150, 120, 160 and 140 ns
· Delay due to each register = 5 ns
· 1000 data items or instructions are processed
Cycle Time-
Cycle time
= Maximum delay due to any stage + Delay due to its register
= Max {150, 120, 160, 140} ns + 5 ns
= 160 ns + 5 ns
= 165 ns
Pipeline Time To Process 1000 Data Items-
Pipeline time to process 1000 data items
= Time taken for 1st data item + Time taken for remaining 999 data items
= 1 x 4 clock cycles + 999 x 1 clock cycle
= 4 x cycle time + 999 x cycle time
= 4 x 165 ns + 999 x 165 ns
= 660 ns + 164835 ns
= 165495 ns
≈ 165.5 μs
Thus, Option 3 (165.5 microseconds) is correct.