PEKU Suite

( Placement Example with Known Upperbound of wirelength )

Director : Prof. Jason Cong

Authors : Michail Romesis and Min Xie

Copyright© 2002-2003 the Regents of University of California


 

Motivation

The study with the PEKO suite suffers one drawback. In the optimal solutions, all the nets are local, i.e., their wirelength is the minimum possible value. This may not be true in real circuits, which may also have global connections that span the entire chip, even when they are optimally placed. It can be seen that although the number of global connections is small, their wirelength contribution is significant. Therefore, the performance of a placer can be quite different due to the presence of these global nets. It is worthwhile to study the effectiveness and stability of different placers in the presence of global nets.

 

Circuit Description

G-PEKU

One way to introduce global connections is to create circuits with global nets only. We constructed examples where each net spans either an entire row or column, as shown in Fig. 1. The number of nets is the total number of rows and columns. Such an example is similar to datapath placement example, where data flows horizontally through the bit-slice along buses, while control signal flows vertically. One solution to such benchmarks has a configuration similar to Fig. 1. An upper bound of the optimal wirelength is the sum of the length of each row and column. Additional input includes the aspect ratio of the chip. In our construction, we extracted the module numbers from ISPD [2] and use aspect ratio of 1. Table 1 gives the characteristics of fives circuits used in [2]. The circuits can be downloaded here.

 

Figure 1. Circuit with global connections only

 

Table 1. Characteristics of G-PEKU

circuit

#cell

#net

#row

UB

GPeku01

12506

224

113

7.93E5

GPeku05

28146

336

169

1.79E6

GPeku10

68685

525

263

4.38E6

GPeku15

161187

803

402

1.03E7

GPeku18

210341

918

460

1.34E7

PEKU

The second approach is to introduce non-local nets into benchmarks which have only local nets in the optimal solution. Compared with local nets, the non-local nets usually have a longer wirelength, and are used to mimic the global connections in our study. We used the module number and net distribution vector extracted from ISPD98 [2]. By varying the percentage of non-local nets in the placement examples, we generated several sets of placement examples with known upperbound of wirelength. These circuits are grouped as PEKU suite. Table 2 gives a description of the circuits. The circuits can be downloaded here.

 

Table 2. Characteristics of PEKU

% non-local nets

circuit

#cell

#net

#row

Row utilization

LB

UB

0

Peku01

12506

14111

113

85%

8.14E5

8.14E5

Peku05

28146

28446

169

85%

1.91E6

1.91E6

Peku10

68685

75196

263

85%

4.73E6

4.73E6

Peku15

161187

186608

402

85%

1.15E7

1.15E7

Peku18

210341

201920

460

85%

1.32E7

1.32E7

0.25%

Peku01

12506

14111

113

85%

8.14E5

9.23E5

Peku05

28146

28446

169

85%

1.91E6

2.24E6

Peku10

68685

75196

263

85%

4.73E6

6.17E6

Peku15

161187

186608

402

85%

1.15E7

1.71E7

Peku18

210341

201920

460

85%

1.32E7

2.01E7

0.50%

Peku01

12506

14111

113

85%

8.14E5

1.02E6

Peku05

28146

28446

169

85%

1.91E6

2.63E6

Peku10

68685

75196

263

85%

4.73E6

7.52E6

Peku15

161187

186608

402

85%

1.15E7

2.30E7

Peku18

210341

201920

460

85%

1.32E7

2.75E7

0.75%

Peku01

12506

14111

113

85%

8.14E5

1.12E6

Peku05

28146

28446

169

85%

1.91E6

2.95E6

Peku10

68685

75196

263

85%

4.73E6

9.09E6

Peku15

161187

186608

402

85%

1.15E7

2.84E7

Peku18

210341

201920

460

85%

1.32E7

3.43E7

1%

Peku01

12506

14111

113

85%

8.14E5

1.25E6

Peku05

28146

28446

169

85%

1.91E6

3.31E6

Peku10

68685

75196

263

85%

4.73E6

1.06E7

Peku15

161187

186608

402

85%

1.15E7

3.39E7

Peku18

210341

201920

460

85%

1.32E7

4.19E7

2%

Peku01

12506

14111

113

85%

8.14E5

1.67E6

Peku05

28146

28446

169

85%

1.91E6

4.70E6

Peku10

68685

75196

263

85%

4.73E6

1.68E7

Peku15

161187

186608

402

85%

1.15E7

5.66E7

Peku18

210341

201920

460

85%

1.32E7

7.01E7

5%

Peku01

12506

14111

113

85%

8.14E5

3.02E6

Peku05

28146

28446

169

85%

1.91E6

9.06E6

Peku10

68685

75196

263

85%

4.73E6

3.44E7

Peku15

161187

186608

402

85%

1.15E7

1.24E8

Peku18

210341

201920

460

85%

1.32E7

1.56E8

10%

Peku01

12506

14111

113

85%

8.14E5

5.28E6

Peku05

28146

28446

169

85%

1.91E6

1.63E7

Peku10

68685

75196

263

85%

4.73E6

6.43E7

Peku15

161187

186608

402

85%

1.15E7

2.37E8

Peku18

210341

201920

460

85%

1.32E7

3.01E8

 

Experimental Result

We experimented with four state-of-the-art placers, including:

        Capo: Capo is built on a multilevel framework with recursive bisection. It aims to enhance routability with such techniques as tolerance computation and block splitting heuristics [4]. The version used in our experiment is Capo v.8.5.

        Dragon: Dragon uses hMetis to derive an initial partition of the netlist. Then it uses simulated annealing with bin-based swapping to further refine the result [5]. The version used in the experiment is Dragon v.2.20.

        mPL: mPL is a multilevel placement algorithm. It uses nonlinear programming to handle the non-overlapping constraints on the coarsest level, then uses Goto based relaxation in subsequent refinement stages. The version used is mPL v.1.2. It has an additional V cycle with distance-based clustering in the second V cycle [6].

        mPG: mPG is built on a multilevel framework. It uses FC clustering and uses hierarchical density control to minimize the overflow of each placement bin during the refinement process. If necessary, it builds incremental A-tree to optimized routability. We used mPG v.1.0 given in [7].

The experiments are performed on the following two machines:

         Sun Blade workstation 750 MHz running SunOs 5.8 with 4GB of memory. Experiments with Dragon, mPL, and mPG are performed on this machine.

         Pentium IV 2.2 GHz running Redhat 8.0 with 1.5GB of memory. Experiment with Capo is performed on this machine.

Table 3. gives the experimental results on G-PEKU. For these examples with all global nets, the gap between their solutions and the upper bound varies between 41% and 102% in the worst case, similar to the results obtained on PEKO, which has local nets only. This is another validation that there is significant room for improvement for the placement problem.

Table 3. Placement results on G-PEKU

circuit

#cells

UB

Dragon v.2.20 [5] QR

Capo v.8.5 [4] QR

mPG v.1.0 [7] QR

mPL v.1.2 [6] QR

GPeku01

12506

7.93E5

1.98

1.56

1.91

1.11

GPeku05

28146

1.79E6

2.01

1.69

1.97

1.16

GPeku10

68685

4.38E6

2.02

1.72

1.98

1.41

GPeku15

161187

1.03E7

1.99

1.79

1.97

Abort

GPeku18

210341

1.34E7

2.02

1.78

1.98

Abort

Fig. 2 gives the experimental results on PEKU. For each a, we picked five of the circuits and fed them into the placers. Each circuit was placed 5 times by the placers. We calculated the QRs as the ratio of the wirelength to the upper bound. Fig. 6 shows the change of the average QRs with the increase of a on PEKU for the four placers. For mPL, its QR increases when a is increased from 0 to 0.75%. For the other three placers, QRs are steadily decreasing. However, this does not necessarily indicate that their solution qualities are improving. We believe it is because the upper bounds of the optimal wirelength are becoming looser as the percentage of non-local nets increases. The detailed spreadsheet can be obtained here.

Figure 2. . QR vs % of non-local nets for four placement algorithms under evaluation

The experiments suggest two things:

        The placers are not stable in the presence of global nets. Without global nets, mPL gives the shortest wirelength. However, its Quality Ratio shows an increase of more than 80% with a small increase of non-local nets. With non-local nets, Dragon¨s wirelength gradually becomes the closest to the upper bound. A placer¨s Quality Ratio can vary significantly for designs of similar sizes but different characteristics.

        The study suggests that new hybrid techniques, which are more scalable and stable, may be needed for future generations.

Our experiment is not intended to compare the four placers. Rather, it is for the optimality and stability study to see how much room remains for improvement in placement algorithms. Please do not interpret the results here as a ranking of the placers.

Achieving good results on PEKU does not guarantee good performance on real circuits. Also, it is not encouraged to tune the placer in order to beat these examples.

Reference

1.    C. C. Chang, J. Cong, M. Xie, "Optimality and Scalability Study of Existing Placement Algorithms," Proc. Asia South-Pacific Design Automation Conference, pp. 621-627, 2003.

2.    C. J. Alpert, "The ISPD98 Circuit Benchmark Suite," Proc. International Symposium on Physical Design, pp. 85-90, 1998.

3.    J. Cong, M. Romesis, M. Xie, Optimality, Scalability and Stability Study of Partitioning & Placement Algorithms, To appear in Proc. International Symposium on Physical Design, 2003.

4.    E. Caldwell, A. B. Kahng and I. L. Markov, "Can Recursive Bisection Produce Routable Placements?" Proc. Design Automation Conference, pp. 477-482, 2000.

5.    M. Wang, X. Yang and M. Sarrafzadeh, "Dragon2000: Standard-cell Placement Tool for Large Industry Circuits," Proc. International Conference on Computer-Aided Design, pp. 260-264, 2000.

6.    T. Chan, J. Cong, T. Kong, and J. Shinnerl, "Multilevel Optimization for Large-Scale Circuit Placement," Proc. International Conference on Computer-Aided Design, pp. 171-176, 2000.

7.    C-C. Chang, J. Cong, Z. Pan, X. Yuan, Physical Hierarchy Generation with Routing Congestion Control,Proc. International Symposium on Physical Design, pp. 36-41, 2002.


 Please direct your questions to xie@cs.ucla.edu. Thanks.