I have a design with a path similar to the following:
Figure 1: Design Path
All buffers have min/max delays of 1ns/2ns, with the exception of U3, which has a max delay of 1.9ns instead of 2ns (represented with a slightly smaller buffer symbol). When I generate a timing report, I can see that the last pin common to the launch and capture paths is U4/Z, which should give a CRP of 2. However, the reported CRP value is less than I expect:
pt_shell> report_timing -from FF1/CP -to FF2/D \
-path_type full_clock_expanded
...
Startpoint: FF1 (rising edge-triggered flip-flop clocked by CLK)
Endpoint: FF2 (rising edge-triggered flip-flop clocked by CLK)
Path Group: CLK
Path Type: max
Point Incr Path
---------------------------------------------------------------
clock CLK (rise edge) 0.00 0.00
clock source latency 0.00 0.00
CLK (in) 0.00 0.00 r
U1/Z (BUFFD1) 2.00 * 2.00 r
U4/Z (BUFFD1) 2.00 * 4.00 r
U5/Z (BUFFD1) 2.00 * 6.00 r
UMUX/Z (OR2D1) 2.00 * 8.00 r
FF1/CP (DFD1) 0.00 * 8.00 r
FF1/Q (DFD1) 2.00 * 10.00 r
FF2/D (DFD1) 0.00 * 10.00 r
data arrival time 10.00
clock CLK (rise edge) 10.00 10.00
clock source latency 0.00 10.00
CLK (in) 0.00 10.00 r
U1/Z (BUFFD1) 1.00 * 11.00 r
U4/Z (BUFFD1) 1.00 * 12.00 r
FF2/CP (DFD1) 0.00 * 12.00 r
clock reconvergence pessimism 1.00 13.00
library setup time 0.00 * 13.00
data required time 13.00
---------------------------------------------------------------
data required time 13.00
data arrival time -10.00
---------------------------------------------------------------
slack (MET) 3.00
When I report the common point with report_crpr, I see that the reported common pin is U1/Z, which explains the lesser CRP value of 1:
pt_shell> report_crpr -from FF1/CP -to FF2/CP -setup
...
Startpoint: FF1 (rising edge-triggered flip-flop clocked by CLK)
Endpoint: FF2 (rising edge-triggered flip-flop clocked by CLK)
Common Point: U1/Z
Common Clock: CLK
Launching edge at common point: RISING
Capturing edge at common point: RISING
CRPR threshold: 0.02
Arrival Times Early Late CRP
---------------------------------------------------------------
Rise 1.000 2.000 1.000
Fall 1.000 2.000 1.000
---------------------------------------------------------------
Selection Details
---------------------------------------------------------------
Edge Match: Match, using rise CRP
---------------------------------------------------------------
clock reconvergence pessimism 1.000
I expected U4/Z to be reported as the common point since it is the last pin that appears in both the launch and capture clock paths. Why is this happening?
Answer:
Despite its initial appearance, this is not a bug. To understand why, let's take a closer look at the launch and capture paths traced in our example above:
Figure 2: Default Launch and Capture Paths
Since this is a setup path, the launch clock path to FF1/CP uses the latest arrival and the capture clock path to FF2/CP uses the earliest arrival. We see that there are actually two launch paths available to FF1/CP, but due to the slightly faster delay at U3 in the upper path, the lower path is chosen.
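The path selection described above can be checked with a few lines of arithmetic. The following is only a minimal sketch in Python, not anything PrimeTime executes: the dictionary names and the "upper"/"lower" labels are illustrative assumptions, and the delay values are the max delays from the example with all net delays taken as 0.

```python
# Max (late) cell delays in ns from the example; net delays are assumed to be 0.
MAX_DELAY = {"U1": 2.0, "U2": 2.0, "U3": 1.9, "U4": 2.0, "U5": 2.0, "UMUX": 2.0}

# The two launch clock paths to FF1/CP, as ordered lists of cells.
# The "upper"/"lower" names are labels chosen for this sketch.
LAUNCH_PATHS = {
    "upper": ["U1", "U2", "U3", "UMUX"],
    "lower": ["U1", "U4", "U5", "UMUX"],
}


def late_arrival(cells):
    """Sum the max delays along one clock path (rounded to avoid float noise)."""
    return round(sum(MAX_DELAY[c] for c in cells), 3)


# Late arrival at FF1/CP through each launch path.
arrivals = {name: late_arrival(cells) for name, cells in LAUNCH_PATHS.items()}

# For a setup check the launch clock uses the latest arrival.
worst = max(arrivals, key=arrivals.get)

print(arrivals)
print("launch path used for setup:", worst)
```

The lower path wins by only 0.1ns, which is why the upper path never appears in the default report even though it is a valid launch path.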
Based only on the timing report, we expected the common pin to be U4/Z resulting in a CRP value of 2. Let's call this the visual common point since it's the common point that we would deduce from visually analyzing the timing report. However, PrimeTime's CRPR algorithm computes a CRP value which comprehensively considers all possible clock paths to the launch and capture flops, rather than just the early/late paths which are shown in the timing report. This is called the topological common point since it's the common point which is valid for all possible paths.
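The difference between the two common points can be illustrated with a small model. This is a simplified sketch in Python under stated assumptions, not PrimeTime's actual CRPR algorithm: the clock paths are hard-coded pin lists read from the example, the helper names are hypothetical, and the early/late arrivals at each pin are computed by hand from the 1ns/2ns buffer delays.

```python
# All clock paths to the launch flop FF1/CP and the capture flop FF2/CP,
# written as ordered lists of pins (assumed from the example topology).
LAUNCH_PATHS = [
    ["U1/Z", "U2/Z", "U3/Z", "UMUX/Z", "FF1/CP"],  # upper path
    ["U1/Z", "U4/Z", "U5/Z", "UMUX/Z", "FF1/CP"],  # lower path (the reported one)
]
CAPTURE_PATHS = [["U1/Z", "U4/Z", "FF2/CP"]]

# (early, late) clock arrival in ns at the two candidate common pins.
ARRIVAL = {"U1/Z": (1.0, 2.0), "U4/Z": (2.0, 4.0)}


def last_common_pin(paths):
    """Return the latest pin that appears in every one of the given paths."""
    common = set(paths[0]).intersection(*paths[1:])
    ordered = [pin for pin in paths[0] if pin in common]
    return ordered[-1] if ordered else None


def crp(pin):
    """CRP at a pin is the late arrival minus the early arrival."""
    early, late = ARRIVAL[pin]
    return late - early


# Visual: only the launch and capture paths shown in the timing report.
visual = last_common_pin([LAUNCH_PATHS[1], CAPTURE_PATHS[0]])
# Topological: every possible launch and capture path.
topological = last_common_pin(LAUNCH_PATHS + CAPTURE_PATHS)

print("visual:", visual, "CRP", crp(visual))
print("topological:", topological, "CRP", crp(topological))
```

In this model the visual common point is U4/Z with a CRP of 2, while the topological common point is U1/Z with a CRP of 1, matching the values in the report_crpr output.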
Why does CRPR need to consider the other paths? To understand this, let's force PrimeTime to use the upper launch path so we can see what happens. We could do this by using set_disable_timing on U5, but this is not the best way to perform our experiment: it requires design knowledge to know that U5 is the right gate to disable, and disabling the connectivity completely may cause other issues. Instead of disabling the connectivity through U5, let's leave it in place and annotate a delay on our expected common pin U4/Z that is so fast that the tool will only use a launch path through U4 as a last resort:
pt_shell> set_annotated_delay -cell -to U4/Z -max -1000
1
Figure 3: Alternative Launch Path Used
The timing report now becomes:
pt_shell> report_timing -from FF1/CP -to FF2/D \
-path_type full_clock_expanded
...
Startpoint: FF1 (rising edge-triggered flip-flop clocked by CLK)
Endpoint: FF2 (rising edge-triggered flip-flop clocked by CLK)
Path Group: CLK
Path Type: max
Point Incr Path
---------------------------------------------------------------
clock CLK (rise edge) 0.00 0.00
clock source latency 0.00 0.00
CLK (in) 0.00 0.00 r
U1/Z (BUFFD1) 2.00 * 2.00 r
U2/Z (BUFFD1) 2.00 * 4.00 r
U3/Z (BUFFD1) 1.90 * 5.90 r
UMUX/Z (OR2D1) 2.00 * 7.90 r
FF1/CP (DFD1) 0.00 * 7.90 r
FF1/Q (DFD1) 2.00 * 9.90 r
FF2/D (DFD1) 0.00 * 9.90 r
data arrival time 9.90
clock CLK (rise edge) 10.00 10.00
clock source latency 0.00 10.00
CLK (in) 0.00 10.00 r
U1/Z (BUFFD1) 1.00 * 11.00 r
U4/Z (BUFFD1) 1.00 * 12.00 r
FF2/CP (DFD1) 0.00 * 12.00 r
clock reconvergence pessimism 1.00 13.00
library setup time 0.00 * 13.00
data required time 13.00
---------------------------------------------------------------
data required time 13.00
data arrival time -9.90
---------------------------------------------------------------
slack (MET) 3.10
When this slightly faster launch path is used, the slack now improves by 0.1ns, from its original slack of +3.0ns to a new slack of +3.1ns. During operation of the device, both paths are equally valid launch paths and either path could be used. Looking back at the original timing path in the question section, if we had used a CRP value of 2 instead of 1, our slack would have been +4ns. This would have been optimistic since we have just proven that another alternative (and equally valid) path exists with a slack of +3.1ns!
In our example, using the visual common point U4/Z would have improved our slack to the point where other potential clock paths (not shown in the report) could have become more critical. To ensure this cannot happen, CRPR uses the topological common point, so that all possible launch/capture paths are considered in the CRP computation rather than just the paths shown in the report. This is not the same thing as using the earliest common pin in the path (which would simply be the root pin!); it is the latest common pin across all possible latency paths which are available to the device in operation.
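The slack argument can be reproduced numerically. The sketch below is plain Python arithmetic using the values from the two timing reports; the function and variable names are illustrative assumptions and do not correspond to any PrimeTime command.

```python
# Early capture clock arrival at FF2/CP: 10.00 clock edge + 1.00 (U1) + 1.00 (U4).
CAPTURE_EARLY = 12.0
SETUP_TIME = 0.0  # library setup time shown in the reports


def setup_slack(data_arrival, crp):
    """Setup slack = (capture arrival + CRP - setup time) - data arrival."""
    return round(CAPTURE_EARLY + crp - SETUP_TIME - data_arrival, 3)


# Reported path (lower launch path) with the topological CRP of 1.
reported = setup_slack(10.0, 1.0)
# Alternative upper launch path, whose only common pin with capture is U1/Z.
alternative = setup_slack(9.9, 1.0)
# Reported path if the visual common point U4/Z (CRP of 2) had been used.
with_visual_crp = setup_slack(10.0, 2.0)

print("reported slack:       ", reported)
print("alternative path:     ", alternative)
print("slack with visual CRP:", with_visual_crp)

# The reported slack bounds the alternative path; the visual CRP would not.
print("reported is safe:", reported <= alternative)
print("visual CRP is optimistic:", with_visual_crp > alternative)
```

With the topological CRP, the reported slack of 3.0 is no better than the 3.1 of the unreported path. With the visual CRP, the report would have claimed 4.0 while a path with 3.1 still existed.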
Useful Analysis Tools
Two Tcl procedures are provided in this article to help explore and understand timing paths which have multiple possible clock paths. The first procedure is the compare_common_point procedure:
compare_common_point [get_timing_paths -path full_clock_expanded ...]
This procedure analyzes a collection of one or more timing_path objects to determine if the visual and topological common points are different. If a path is found where they differ, the timing report is printed along with pin markers indicating the common points:
pt_shell> compare_common_point [get_timing_paths -path full_clock_expanded \
? -to FF2/D]
Visual common point: U4/Z
Topological common point: U1/Z
...
Startpoint: FF1 (rising edge-triggered flip-flop clocked by CLK)
Endpoint: FF2 (rising edge-triggered flip-flop clocked by CLK)
Path Group: CLK
Path Type: max
Point Incr Path
--------------------------------------------------------------
clock CLK (rise edge) 0.00 0.00
clock source latency 0.00 0.00
CLK (in) 0.00 0.00 r
U1/I (BUFFD1) 0.00 * 0.00 r
U1/Z (BUFFD1) 2.00 * 2.00 r <-- TOPOLOGICAL
U4/I (BUFFD1) 0.00 * 2.00 r
U4/Z (BUFFD1) 2.00 * 4.00 r <-- VISUAL
U5/I (BUFFD1) 0.00 * 4.00 r
U5/Z (BUFFD1) 2.00 * 6.00 r
UMUX/A2 (OR2D1) 0.00 * 6.00 r
UMUX/Z (OR2D1) 2.00 * 8.00 r
FF1/CP (DFD1) 0.00 * 8.00 r
FF1/Q (DFD1) 2.00 * 10.00 r
FF2/D (DFD1) 0.00 * 10.00 r
data arrival time 10.00
clock CLK (rise edge) 10.00 10.00
clock source latency 0.00 10.00
CLK (in) 0.00 10.00 r
U1/I (BUFFD1) 0.00 * 10.00 r
U1/Z (BUFFD1) 1.00 * 11.00 r <-- TOPOLOGICAL
U4/I (BUFFD1) 0.00 * 11.00 r
U4/Z (BUFFD1) 1.00 * 12.00 r <-- VISUAL
FF2/CP (DFD1) 0.00 * 12.00 r
clock reconvergence pessimism 1.00 13.00
library setup time 0.00 * 13.00
data required time 13.00
--------------------------------------------------------------
data required time 13.00
data arrival time -10.00
--------------------------------------------------------------
slack (MET) 3.00
Note that the timing paths provided to the procedure must be obtained with -path full_clock_expanded so the clock paths are present for the procedure to analyze.
By default, the compare_common_point procedure prints timing reports only for paths where the visual and topological common points differ. This allows it to be given a large set of timing paths while printing only the interesting cases:
compare_common_point [get_timing_paths -path full_clock_expanded \
-delay min -slack_lesser_than 0 -start_end_pair]
When analyzing a particular path, if you want to have the path report always printed even when the visual and topological common points are the same, specify the -force option:
compare_common_point $this_path -force
Once a path of interest has been identified, use the second Tcl procedure make_clock_path_undesirable. When given a single timing path obtained with -path full_clock_expanded, the procedure will annotate very optimistic cell and net delays along the launch and capture paths contained in the timing_path object:
set_annotated_delay -cell -min +1000 $all_clock_min_arcs
set_annotated_delay -cell -max -1000 $all_clock_max_arcs
By setting such optimistic annotated delays, PrimeTime will prefer any other possible clock path that might exist, making the existence of any side paths easy to see:
pt_shell> make_clock_path_undesirable [get_timing_paths -to FF2/D -path full_clock_expanded]
1
pt_shell> update_timing
1
pt_shell> report_timing -to FF2/D -path full_clock_expanded -input_pins
****************************************
Report : timing
-path_type full_clock_expanded
-delay_type max
-max_paths 1
Design : test
Version: D-2009.12-SP1
Date : Tue Feb 9 05:42:18 2010
****************************************
Point Incr Path
---------------------------------------------------------------
clock CLK (rise edge) 0.00 0.00
clock source latency 0.00 0.00
CLK (in) 0.00 0.00 r
U1/I (BUFFD1) 0.00 * 0.00 r
U1/Z (BUFFD1) -1000.00 * -1000.00 r
U2/I (BUFFD1) 0.00 * -1000.00 r
U2/Z (BUFFD1) 2.00 * -998.00 r
U3/I (BUFFD1) 0.00 * -998.00 r
U3/Z (BUFFD1) 1.90 * -996.10 r
UMUX/A1 (OR2D1) 0.00 * -996.10 r
UMUX/Z (OR2D1) 2.00 * -994.10 r
FF1/CP (DFD1) -1000.00 * -1994.10 r
FF1/Q (DFD1) 2.00 * -1992.10 r
FF2/D (DFD1) 0.00 * -1992.10 r
data arrival time -1992.10
clock CLK (rise edge) 10.00 10.00
clock source latency 0.00 10.00
CLK (in) 0.00 10.00 r
U1/I (BUFFD1) 0.00 * 10.00 r
U1/Z (BUFFD1) 1000.00 * 1010.00 r
U4/I (BUFFD1) 1000.00 * 2010.00 r
U4/Z (BUFFD1) 1000.00 * 3010.00 r
FF2/CP (DFD1) 1000.00 * 4010.00 r
clock reconvergence pessimism 0.00 4010.00
library setup time 0.00 * 4010.00
data required time 4010.00
---------------------------------------------------------------
data required time 4010.00
data arrival time 1992.10
---------------------------------------------------------------
slack (MET) 6002.10
The -input_pins option was used so that both cell and net arcs are individually visible. If a clock path uses only the "impossible" annotated delays (as the capture path does above), we know that no other path was available. If another latency path is used, we know that at least one other alternative clock path was present.
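The reasoning behind this check can be modeled outside the tool. The following is a minimal Python sketch of the idea only, not the make_clock_path_undesirable procedure itself: it covers just the max cell arcs on the launch side, ignores net arcs, and uses hypothetical names throughout. Arcs are keyed by (cell, input pin) so that the two UMUX input arcs are annotated independently.

```python
BIG = 1000.0  # magnitude of the "impossible" annotated delay

# Max cell-arc delays in ns, keyed by (cell, input pin); values from the example.
MAX_DELAY = {
    ("U1", "I"): 2.0, ("U2", "I"): 2.0, ("U3", "I"): 1.9,
    ("U4", "I"): 2.0, ("U5", "I"): 2.0,
    ("UMUX", "A1"): 2.0, ("UMUX", "A2"): 2.0,
}

# Launch clock paths to FF1/CP as ordered lists of cell arcs.
LAUNCH_PATHS = {
    "upper": [("U1", "I"), ("U2", "I"), ("U3", "I"), ("UMUX", "A1")],
    "lower": [("U1", "I"), ("U4", "I"), ("U5", "I"), ("UMUX", "A2")],
}


def annotate(delays, arcs, value):
    """Return a copy of the delay table with the given arcs overridden."""
    out = dict(delays)
    for arc in arcs:
        out[arc] = value
    return out


def latest_path(paths, delays):
    """Name of the path with the latest arrival (what a max analysis picks)."""
    return max(paths, key=lambda name: sum(delays[a] for a in paths[name]))


# Step 1: find the path the tool reports by default.
reported = latest_path(LAUNCH_PATHS, MAX_DELAY)
# Step 2: make every max arc on that path extremely fast.
annotated = annotate(MAX_DELAY, LAUNCH_PATHS[reported], -BIG)
# Step 3: see which path is chosen now.
after = latest_path(LAUNCH_PATHS, annotated)

# If the newly chosen path contains any arc that was not annotated,
# at least one alternative clock path must exist.
side_path_exists = any(annotated[a] != -BIG for a in LAUNCH_PATHS[after])
upper_arrival = round(sum(annotated[a] for a in LAUNCH_PATHS["upper"]), 3)

print("path before annotation:", reported)
print("path after annotation: ", after)
print("side path exists:", side_path_exists)
print("arrival at UMUX/Z via upper path:", upper_arrival)
```

In this model the only annotated arc left on the chosen path is the shared U1 arc, and the arrival at UMUX/Z works out to -994.1, the same value shown in the launch portion of the report above.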
The make_clock_path_undesirable procedure sets annotated delays along the clock path which require a timing update before the next report can be generated. It is recommended that the session be saved with save_session prior to annotating the clock path so the original timing can be restored after the side paths have been explored:
save_session ./temporary_session
make_clock_path_undesirable ...
update_timing
report_timing ...
restore_session ./temporary_session
If it is determined that the limiting side path can never happen, simply disable the connectivity through the limiting path using set_clock_sense -logical_stop_propagation to allow the CRPR algorithm to use a later common pin. By using logical clock stopping, the edge can still propagate through the pin to act as an SI aggressor, but will not be considered through that pin for clocking or CRPR purposes.