

# High-performance Computation Platform and Particle Track Reconstruction in PANDA

Ming Liu, Justus-Liebig University in Giessen



## **Outline**

- Computation platform and compute node
- MDC tracking computation



# PANDA DAQ System





# Computation Network



#### Feature extraction network

#### Interconnected CNs

- Internal interconnections to partition the algorithms and process data in parallel
- External interconnections to receive data from detectors and send results to PC farm for storage (Optical link & Gigabit Ethernet)



# Computation Network





- ATCA full-mesh backplane for high speed internal interconnections
  - 13 CN boards in a box
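
As a rough illustration of what a full-mesh backplane implies, the sketch below counts the point-to-point channels needed when every board is linked directly to every other board. It assumes one channel per board pair; the function names are hypothetical and the slides do not state the channel count.

```python
def full_mesh_links(n_nodes: int) -> int:
    """Number of point-to-point channels in a full mesh of n_nodes boards.

    Assumption: exactly one channel per pair of boards.
    """
    if n_nodes < 0:
        raise ValueError("n_nodes must be non-negative")
    return n_nodes * (n_nodes - 1) // 2


def channels_per_node(n_nodes: int) -> int:
    """Channels terminating on each board: one to every other board."""
    return max(n_nodes - 1, 0)


# 13 CN boards in one box, as stated on the slide.
links = full_mesh_links(13)
per_board = channels_per_node(13)
```

Under this assumption, 13 boards need 78 channels in total, with 12 terminating on each board.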



# Compute Node



- Prototype board with 5 Xilinx Virtex-4 FX60 FPGAs
- 4 FPGAs as algorithm processors
- 1 FPGA as a switch
- Full-mesh communication on-board
- External links:
  - Optical links
  - Gigabit Ethernet



## **FPGA Node**



#### HW

- PowerPC 405 CPU
- Feature extraction processors
- Peripherals
- ...

#### SW

- Open-source Linux 2.6
- Device drivers for all peripherals
- Applications



## Track Reconstruction



- 4 MDCs
- Particle tracks are bent in the magnetic field region
- Straight line tracks from target to MDC II, and from MDC III to MDC IV
- Currently focusing on the tracks from target to MDC II



# Principle of Track Reconstruction





# Tracking Processing Unit Design





## **LUTs**

- Two LUTs: projection LUT & address LUT
- Projection LUT provides the projection mappings for all the wires.
- Address LUT stores the address information for each wire:
  - Plane starting address
  - LUT starting address
  - Length
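
The two-LUT lookup can be sketched in software as follows. This is a minimal model, not the VHDL design: it assumes each projection-LUT entry is a bin offset relative to the wire's plane starting address, and that fired wires are accumulated into a flat histogram. All names (`AddressEntry`, `project_wire`, `accumulate`) and the toy table contents are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AddressEntry:
    plane_start: int  # first projection-plane address for this wire (assumed meaning)
    lut_start: int    # index of the wire's first entry in the projection LUT
    length: int       # number of projection-LUT entries belonging to the wire


def project_wire(wire_id: int,
                 address_lut: Dict[int, AddressEntry],
                 projection_lut: List[int]) -> List[int]:
    """Return the projection-plane bins covered by one wire."""
    entry = address_lut[wire_id]
    offsets = projection_lut[entry.lut_start:entry.lut_start + entry.length]
    # Assumption: stored values are offsets from the plane starting address.
    return [entry.plane_start + off for off in offsets]


def accumulate(fired_wires: List[int],
               address_lut: Dict[int, AddressEntry],
               projection_lut: List[int],
               n_bins: int) -> List[int]:
    """Increment every bin touched by the projection of each fired wire."""
    hist = [0] * n_bins
    for wire_id in fired_wires:
        for b in project_wire(wire_id, address_lut, projection_lut):
            hist[b] += 1
    return hist


# Toy tables, invented purely for illustration.
PROJECTION_LUT = [0, 1, 2,  0, 1,  0, 1, 2, 3]
ADDRESS_LUT = {
    0: AddressEntry(plane_start=2, lut_start=0, length=3),
    1: AddressEntry(plane_start=3, lut_start=3, length=2),
    2: AddressEntry(plane_start=1, lut_start=5, length=4),
}

hist = accumulate([0, 1, 2], ADDRESS_LUT, PROJECTION_LUT, n_bins=8)
```

In this toy case the three wires overlap in bins 3 and 4, which is where the histogram reaches its maximum.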





## Accumulate Unit





## Peak Finder



Finds the exact peak bin through which a particle probably passed.
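
A software sketch of one possible peak-finding rule is given below. The slides do not specify the criterion, so both conditions here are assumptions: a bin counts as a peak if it reaches a threshold and is not exceeded by any of its eight neighbours. The function name `find_peaks` and the example grid are hypothetical.

```python
from typing import List, Tuple


def find_peaks(hist: List[List[int]], threshold: int) -> List[Tuple[int, int]]:
    """Return (row, col) of bins that are local maxima with count >= threshold.

    Assumed criterion: a bin is a peak if no 8-connected neighbour has a
    larger count. Equal-valued neighbours are both reported as peaks.
    """
    rows, cols = len(hist), len(hist[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = hist[r][c]
            if value < threshold:
                continue
            neighbours = [
                hist[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(value >= n for n in neighbours):
                peaks.append((r, c))
    return peaks


# Invented accumulator contents for illustration.
GRID = [
    [0, 1, 0, 0],
    [1, 4, 1, 0],
    [0, 1, 0, 2],
    [0, 0, 0, 0],
]

strong_peaks = find_peaks(GRID, threshold=3)
all_peaks = find_peaks(GRID, threshold=2)
```

Raising the threshold suppresses the weaker candidate at (2, 3) and keeps only the bin at (1, 1).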



## Peak Finder





# Implementation Results

| Resources        | TPU                      | Compute node platform     | PLB-IPIF                 | System with TPU (sum)      |
|------------------|--------------------------|---------------------------|--------------------------|----------------------------|
| 4-input LUTs     | 4755 out of 50560 (9.4%) | 8531 out of 50560 (16.9%) | 2900 out of 50560 (5.7%) | 21817 out of 50560 (43.2%) |
| Slice flip-flops | 2744 out of 50560 (5.4%) | 5724 out of 50560 (11.3%) | 1640 out of 50560 (3.2%) | 10108 out of 50560 (20%)   |
| Block RAMs       | 24 out of 232 (10.3%)    | 18 out of 232 (7.8%)      | 0                        | 42 out of 232 (18.1%)      |
| DSP slices       | 0                        | 8 out of 128 (6.3%)       | 0                        | 8 out of 128 (6.3%)        |

- Resource utilization is acceptable: the full system uses at most 43.2% of any resource type.
- Timing limit: 125 MHz.
- We choose 100 MHz, matching the speed of the PLB.
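
The headroom implied by these figures can be checked with a few lines of arithmetic. The sketch below takes the system-with-TPU column of the table as given and assumes the 125 MHz figure is the maximum clock frequency the design can meet; the helper names are hypothetical.

```python
# System-with-TPU column of the table: (used, available) per resource type.
SYSTEM_USAGE = {
    "4-input LUTs": (21817, 50560),
    "Slice flip-flops": (10108, 50560),
    "Block RAMs": (42, 232),
    "DSP slices": (8, 128),
}


def utilization(used: int, available: int) -> float:
    """Fraction of a resource type that is occupied."""
    return used / available


def clock_period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


# Resource type with the highest occupancy in the full system.
worst_resource = max(SYSTEM_USAGE, key=lambda name: utilization(*SYSTEM_USAGE[name]))
worst_fraction = utilization(*SYSTEM_USAGE[worst_resource])

# Timing slack per cycle when running at 100 MHz instead of the 125 MHz limit.
slack_ns = clock_period_ns(100) - clock_period_ns(125)
```

The most heavily used resource is the 4-input LUTs at about 43%, and running at 100 MHz leaves 2 ns of slack per 10 ns cycle relative to the 8 ns period at 125 MHz.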



## Conclusion

- A computation platform for feature extraction.
- Compute nodes are interconnected for parallel processing.
- FPGA features are used for both communication and computation.
- Inner track reconstruction is described in VHDL.
- The tracking system fits in a single FPGA.