# THE PSI DSP CARRIER (PDC) BOARD - A DIGITAL BACK-END FOR BUNCH-TO-BUNCH AND GLOBAL ORBIT FEEDBACKS IN LINEAR ACCELERATORS AND STORAGE RINGS

B. Keil, R. Kramert, G. Marinkovic, P. Pollet, M. Roggli, PSI, Villigen, Switzerland\*

## Abstract

PSI has developed a signal processing VXS/VME64x board for accelerator applications such as low-latency bunch-to-bunch feedbacks, global orbit feedbacks, or low-level RF systems. The board is a joint development of PSI/SLS staff and staff working on the PSI contribution to the European X-ray FEL (E-XFEL). Future applications of the board include the Intra-Bunch-Train Feedback (IBFB) [1] of the E-XFEL as well as the upgrade of the SLS Fast Orbit Feedback (FOFB) and Multibunch Feedback (MBFB). The PDC board has four Virtex-4 FPGAs, two TS201 TigerSHARC DSPs, VXS and VME64x 2eSST interfaces, and two front panel SFP multi-gigabit fibre optic links. Two 500-pin LVDS/multi-gigabit mezzanine connectors interface the FPGAs to two application-dependent mezzanine modules, each containing e.g. four 500 MSPS 12-bit ADCs and two 14-bit DACs for the IBFB and MBFB, or four multi-gigabit SFP fibre optic transceivers for the FOFB. This paper reports on hardware and firmware concepts, system topologies and synergies of future applications.

## HARDWARE ARCHITECTURE

## FPGA-Based Low-Latency Feedbacks

Two Virtex-4 SX FPGAs ("Feedback FPGAs") on the PDC can run bunch-to-bunch feedback algorithms by receiving beam position monitor (BPM) data from the ADCs of the above-mentioned mezzanine, calculating suitable correction kicks, and writing the kick amplitudes to the DACs that drive the kicker amplifiers. With suitable low-latency BPM and kicker electronics and optimised cabling, overall feedback loop latencies on the order of 200 ns, the bunch spacing of the 600 µs long E-XFEL bunch trains, are feasible. For the E-XFEL IBFB, the main requirement for the ADCs and DACs is a low latency, ideally far below the 200 ns bunch spacing, while the sample rate is less critical. However, the chosen maximum sample rate of 500 MSPS also allows the ADC/DAC mezzanine and the PDC to be used for storage ring MBFBs with bunch spacings down to 2 ns and required overall loop latencies of e.g. ~1 µs for the SLS. One programmable on-board clock phase shifter with ~10 ps step size and ~10 ns range for each ADC and DAC allows the sample phases to be adjusted in MBFB applications so that crosstalk between adjacent bunches, in both the position measurement and the kick, is minimised. For the IBFB, the clock shifters make it possible e.g. to sample exactly at the peak of the short output pulses of the low-latency BPM RF front-ends (RFFEs), thus minimising the impact of clock jitter.
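The timing figures quoted above can be cross-checked with a few lines of arithmetic. The following Python sketch is purely illustrative: the constant and function names are hypothetical, and the approximate values from the text (~10 ps, ~10 ns, 200 ns, 600 µs, 500 MSPS) are treated as exact round numbers.

```python
# Nominal values quoted in the text, treated as round numbers (assumption).
SAMPLE_RATE_HZ = 500e6     # maximum ADC/DAC sample rate
BUNCH_SPACING_S = 200e-9   # E-XFEL bunch spacing
BUNCH_TRAIN_S = 600e-6     # E-XFEL bunch train length
SHIFTER_STEP_S = 10e-12    # clock phase shifter step size
SHIFTER_RANGE_S = 10e-9    # clock phase shifter range


def sample_period_s(sample_rate_hz):
    """Time between two consecutive ADC/DAC samples."""
    return 1.0 / sample_rate_hz


def bunches_per_train(train_length_s, bunch_spacing_s):
    """Number of bunch slots in one bunch train."""
    return round(train_length_s / bunch_spacing_s)


def samples_per_bunch(bunch_spacing_s, sample_rate_hz):
    """ADC samples available per bunch slot at the given sample rate."""
    return round(bunch_spacing_s * sample_rate_hz)


def shifter_settings(range_s, step_s):
    """Number of distinct phase steps across the shifter range."""
    return round(range_s / step_s)
```

With these numbers, a bunch train holds 3000 bunch slots, each slot spans 100 sample periods at 500 MSPS, and the ~10 ns shifter range covers five 2 ns sample periods in about 1000 steps, so any sampling phase within one period can be reached.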

06 Instrumentation, Controls, Feedback & Operational Aspects



Figure 1: PDC board, equipped with an ADC/DAC mezzanine for the E-XFEL IBFB application.

## DSP-Based Feedbacks and Feed-Forwards

The Feedback FPGAs of the PDC are ideal for ultra-low-latency bunch-by-bunch feedback algorithms of moderate complexity, such as FIR filters for IBFB or MBFB. However, more complex real-time feedback algorithms or data analysis tasks with relaxed latency requirements, such as bunch-train-to-bunch-train adaptive feed-forwards for IBFBs or global storage ring FOFBs with typically 5-20 kHz correction rate, are usually much easier to implement in processor software, e.g. in C/C++, than in VHDL-based FPGA firmware. Therefore the PDC also contains two Analog Devices TS201 DSPs with 3 GFLOPS peak performance.
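As an illustration of the kind of moderate-complexity algorithm mentioned above, the following Python sketch models a bunch-by-bunch FIR filter for a storage ring: each bunch keeps its own history of turn-by-turn positions, and its kick is a weighted sum of that history. This is a behavioural model under stated assumptions, not the PDC firmware; the class name, the coefficients and the absence of any gain or phase tuning are all hypothetical.

```python
from collections import deque


class BunchByBunchFir:
    """Behavioural model of a per-bunch FIR filter (hypothetical, not PDC firmware)."""

    def __init__(self, n_bunches, coefficients):
        self.coefficients = list(coefficients)
        n_taps = len(self.coefficients)
        # One history buffer per bunch, newest sample first, zero-initialised.
        self.history = [deque([0.0] * n_taps, maxlen=n_taps)
                        for _ in range(n_bunches)]

    def process_turn(self, positions):
        """Take one position per bunch for this turn, return one kick per bunch."""
        if len(positions) != len(self.history):
            raise ValueError("expected exactly one position per bunch")
        kicks = []
        for hist, x in zip(self.history, positions):
            hist.appendleft(x)  # the oldest sample drops out automatically
            kicks.append(sum(c * s for c, s in zip(self.coefficients, hist)))
        return kicks
```

With the coefficients `[1.0, -1.0]`, for example, the output is the turn-to-turn position difference of each bunch, so a constant orbit offset produces no kick once the history has filled.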

## General PDC Board Features

The PDC boots the FPGA firmware and PowerPC software for all Virtex-4 FPGAs from a compact flash card that can be updated and modified e.g. via VMEbus and that may also serve as a disk drive for an operating system (e.g. Linux) running on the System FPGA. The actual firmware and software boot process is handled by a smaller Spartan-3 FPGA ("Configuration FPGA") that loads its own firmware independently from an EEPROM, controls the power-up and power-down sequence of numerous on-board voltage regulators, loads the firmware from the compact flash into the other FPGAs, and supervises e.g. FPGA temperatures, power supply voltages and currents. A compact front panel maintenance connector enables JTAG access to all FPGAs and DSPs and also provides an additional serial port for test and maintenance purposes.

External DRAMs for the System and Communication FPGAs provide memory e.g. for PowerPC program code. The DRAMs connected to the Feedback FPGAs allow long-term data logging e.g. of IBFB beam positions calculated by the Feedback FPGA. The high-speed QDR2 SRAMs with their deterministic access timing (no refresh cycles needed) enable real-time storage of ADC and DAC raw data streams for IBFB or MBFB, but may also be used for lookup tables of feedback algorithms that do not fit into the much smaller internal FPGA memory.
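To give a feeling for the raw data volumes involved, the following Python sketch estimates the ADC data rate of one mezzanine and the buffer size needed to record one E-XFEL bunch train. It assumes four ADCs running continuously at the maximum 500 MSPS and samples packed at exactly 12 bits without padding; both are simplifying assumptions, and the function names are hypothetical.

```python
def raw_rate_bytes_per_s(n_channels, sample_rate_hz, bits_per_sample):
    """Aggregate raw data rate of all channels, assuming tightly packed samples."""
    return n_channels * sample_rate_hz * bits_per_sample / 8.0


def buffer_bytes(rate_bytes_per_s, duration_s):
    """Memory needed to record the data stream for the given duration."""
    return rate_bytes_per_s * duration_s


# Four 12-bit ADCs at 500 MSPS, recorded for one 600 us bunch train.
adc_rate = raw_rate_bytes_per_s(4, 500e6, 12)
train_buffer = buffer_bytes(adc_rate, 600e-6)
```

Under these assumptions one mezzanine produces about 3 GByte/s of ADC raw data, which corresponds to roughly 1.8 MByte per 600 µs bunch train.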

## Connectivity and Control System Interfacing

The PDC hardware and firmware concepts allow control system access to all internal Virtex-4 RAMs and registers, external RAMs and DSPs via the VME 2eVME/2eSST interface at up to 250 MByte/s. The PDC also supports the VME successor standard VXS, which provides 8 full-duplex multi-gigabit "Rocket IO" differential pairs on the "P0" VXS backplane connector. These enable control system access to the board either via protocols defined in the VXS (VITA 41.x) standards (PCI Express, Gigabit Ethernet, ...) or via user-defined protocols. Supported Rocket IO baud rates of 2.125, 2.5, 4.25 and 5 Gbaud allow VXS transfer rates of 2-5 GByte/s full duplex. Since VXS is physically based on switched point-to-point connections and not on a shared common bus like VME, a VXS crate with several PDC boards could achieve such high data rates simultaneously for each board in the crate, with aggregate transfer rates of 50-100 GByte/s for a large crate compared to 250 MByte/s for VME. This makes the PDC also suitable for beamline applications where e.g. 2D detector data could be acquired by a suitable mezzanine, streamed at up to 12 GByte/s to the Feedback FPGA, compressed, and then transferred via VXS e.g. to a hard disk storage array.
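The quoted VXS transfer rates follow directly from the lane count and the baud rates, as the Python sketch below shows. The raw figure counts every transmitted bit; the payload figure additionally assumes 8b/10b line coding, which is an assumption made here for illustration (the coding depends on the protocol actually used) and not a statement about the PDC firmware. All names are hypothetical.

```python
LANES = 8                                   # full-duplex pairs on the P0 connector
BAUD_RATES_GBAUD = (2.125, 2.5, 4.25, 5.0)  # supported Rocket IO baud rates


def raw_gbyte_per_s(baud_gbaud, lanes=LANES):
    """Raw line rate per direction in GByte/s (8 bits per byte, no coding overhead)."""
    return baud_gbaud * lanes / 8.0


def payload_gbyte_per_s(baud_gbaud, lanes=LANES, coding_efficiency=0.8):
    """Payload rate per direction.

    ASSUMPTION: 8b/10b line coding (8 payload bits per 10 line bits);
    other protocols may use a different coding efficiency.
    """
    return raw_gbyte_per_s(baud_gbaud, lanes) * coding_efficiency


rates = {baud: (raw_gbyte_per_s(baud), payload_gbyte_per_s(baud))
         for baud in BAUD_RATES_GBAUD}
```

The raw rates span about 2.1 to 5 GByte/s per direction, consistent with the 2-5 GByte/s stated above; with the assumed 8b/10b coding the usable payload would be 1.7 to 4 GByte/s.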

It should be noted that the PDC board does not have to be used in a VME or VXS crate but can also operate autonomously in a single-board standalone housing, with the VME J1/J2 backplane connectors only being used for the 3.3 V and 5 V supply power, and with the SFP front panel connector serving as control system interface.

## APPLICATION TOPOLOGIES

The backplane VXS and front panel SFP fibre optic Rocket IOs can also be used for data exchange with other PDC boards or accelerator devices like RF BPMs, X-ray BPMs, and low-level RF (LLRF) systems for distributed real-time feedback applications like IBFBs or FOFBs. Fig. 2 shows the simplified topology of the E-XFEL IBFB. The low-latency RFFEs of two so-called upstream BPMs ("BA") as well as the kicker amplifiers ("MA") are connected to the PDC via analogue cables for lowest feedback loop latency. Two "downstream BPMs" after the kickers, with slightly slower digital multi-gigabit connections to the PDC, are used to supervise and automatically optimise this feedback loop by checking whether the kicks calculated from the upstream BPM positions have the expected effect. Additional BPMs directly in front of the undulators are attached to the PDC via multi-gigabit fibre optic links e.g. in a ring topology. They make it possible to correct medium-to-low-frequency beam trajectory perturbations between IBFB and undulators caused e.g. by magnet vibrations or imperfections of the beam distribution kicker pulser.
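One simple way to check whether the kicks "have the expected effect" is to compare, over many bunches, the position change predicted at a downstream BPM with the change actually measured there, and to fit a single scale factor between the two. The Python sketch below shows such a least-squares fit. It is a hypothetical illustration of the idea, not the IBFB algorithm; the function names, the single-parameter model and the gain update rule are assumptions.

```python
def estimate_kick_scale(predicted_shifts, measured_shifts):
    """Least-squares scale k minimising sum((measured - k * predicted) ** 2)."""
    if len(predicted_shifts) != len(measured_shifts):
        raise ValueError("need one measured shift per predicted shift")
    num = sum(p * m for p, m in zip(predicted_shifts, measured_shifts))
    den = sum(p * p for p in predicted_shifts)
    if den == 0.0:
        raise ValueError("all predicted shifts are zero; scale is undetermined")
    return num / den


def corrected_gain(current_gain, scale, min_abs_scale=0.1):
    """Rescale the loop gain so that predicted and measured effects agree.

    ASSUMPTION: a very small or zero scale indicates a fault (e.g. a dead
    kicker amplifier) rather than a calibration error, so it is rejected.
    """
    if abs(scale) < min_abs_scale:
        raise ValueError("kick effect too small to correct by rescaling")
    return current_gain / scale
```

A fitted scale of 0.5, for example, would mean that the kicks move the beam only half as far as predicted, and the sketch would respond by doubling the gain.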

Fig. 3a and 3b show example topologies for a PDC-based 3D storage ring MBFB (where the third "BPM" and "kicker" in the figure symbolise longitudinal elements) and a FOFB. The IBFB is basically a combination of two topologies: an MBFB-like topology for the ultrafast upstream BPMs and kickers, and a FOFB-like topology in which larger numbers of slower BPMs and (in the case of global storage ring FOFBs) corrector magnet power supplies are connected to the PDC by multi-gigabit fibre optic links in a ring topology. Due to these similarities, the PDC board hardware and suitably designed generic FPGA firmware and fibre optic data transfer protocols for the IBFB could also be used for MBFBs and FOFBs, e.g. for the SLS, where the existing systems [2,3] are to be upgraded in a few years, mainly for spare part and maintenance efficiency reasons but also because of the added functionality and performance offered by newer hardware components.



Figure 2: Topology of the E-XFEL IBFB.



Figure 3a/3b: Topology of a 3D MBFB (left) and a FOFB with Rocket IO based BPMs and magnet supplies (right).

## DESIGN ASPECTS AND FORM FACTOR

Fig. 4 and 5 show the mechanical form factor and the PCB layout complexity of the 22-layer PDC board. For maximum component density and performance, the board uses six layers with stackable Cu-filled laser-drilled vias on both the upper and the bottom side of the PCB, different types of mechanically drilled buried vias, and 75 micron minimum trace width and spacing. While the PDC requires only one VME/VXS crate slot with suitable mezzanines, it needs two slots when equipped with the IBFB/MBFB ADC/DAC mezzanine, which actually consists of two stacked PCBs. The group of four ADCs and two DACs each have a dedicated clock input. Additional digital I/Os allow data acquisition or feedback algorithms to be started and stopped by external triggers.

T03 Beam Diagnostics and Instrumentation

All PCB traces were routed with a complete set of constraints for delays, skew, trace lengths, impedances etc. in order to guarantee reliable board operation at the target clock and data rates. The delay matching constraints even account for FPGA on-chip signal flight times (from silicon to solder ball) that have non-negligible ball-to-ball variations of >200ps for the ~1150-1500 solder balls of Virtex-4 ball grid array chips.
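The principle of including the on-chip flight times in the delay matching can be sketched as follows: for each net of a matched group, the PCB trace is lengthened by the amount its package path is faster than the slowest one, so that the total delay from silicon to the far end is equal for all nets. The Python sketch below is a simplified illustration with hypothetical names; the propagation delay of 7 ps/mm is an assumed order of magnitude for inner-layer traces and depends on the actual stack-up.

```python
def compensation_delay_ps(package_delays_ps):
    """Extra PCB trace delay per net so that all total delays become equal.

    The net with the slowest package path needs no extra trace delay; every
    other net is padded by its difference to that slowest path.
    """
    target = max(package_delays_ps)
    return [target - d for d in package_delays_ps]


def delay_to_length_mm(delay_ps, ps_per_mm=7.0):
    """Convert a delay into trace length.

    ASSUMPTION: ~7 ps/mm propagation delay; the real value depends on the
    dielectric and the layer type.
    """
    return delay_ps / ps_per_mm


# Example: three nets whose package flight times differ by up to 200 ps.
extra_ps = compensation_delay_ps([100.0, 300.0, 250.0])
extra_mm = [delay_to_length_mm(d) for d in extra_ps]
```

With the assumed propagation delay, a 200 ps ball-to-ball difference corresponds to almost 29 mm of trace length, which illustrates why such variations cannot be neglected at multi-gigabit data rates.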



Figure 4: PDC board with IBFB ADC/DAC mezzanine.



Figure 5: PCB layout: ~11000 signal traces.

## GENERIC FPGA FIRMWARE CONCEPT

Fig. 6 and 7 show a generic FPGA firmware concept for the System and Feedback FPGAs. The Communication FPGA firmware is not shown since it is similar to that of the System FPGA.



Figure 6: System FPGA firmware architecture.

The concept allows nearly the same firmware to be used for IBFB, MBFB and FOFB. The exceptions are e.g. the few functional blocks in Fig. 7 that perform the actual IBFB or MBFB low-latency feedback algorithm, which one might want or need to adapt to the specific accelerator and to desired add-on functionality. An example is storage ring betatron tune measurement and feedback using e.g. a combination of digital phase-locked loop and swept-frequency techniques [4], but exciting only a single bunch rather than all bunches for minimal emittance blow-up. Each block in Fig. 6 and 7 represents a module written in VHDL. The thick green lines symbolise memory-mapped on-chip buses ("OCBs", with circled "M"s denoting OCB bus masters) that can be accessed by a VME or VXS/Rocket IO based control system via suitable inter-FPGA connections. The thick blue line in Fig. 6 is a bus accessible only by the DSPs, giving them prioritised real-time access e.g. to multi-ported DDR2 SDRAM without the risk of being slowed down by pending VME/VXS block transfers on the same bus.
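How the multi-ported memory resolves simultaneous requests is not detailed here. One common choice that would give the DSPs the prioritised access described above is fixed-priority arbitration between the memory ports, which the following Python sketch models cycle by cycle. The port names and the arbitration scheme are assumptions for illustration, not a description of the PDC firmware.

```python
def arbitrate(requests, priority_order=("dsp", "ocb")):
    """Return the port granted in this cycle, or None if nobody requests.

    ASSUMPTION: fixed priority, with the DSP port always served first.
    """
    for port in priority_order:
        if requests.get(port, False):
            return port
    return None


def simulate(dsp_requests, ocb_requests):
    """Grant sequence for two per-cycle request streams of equal length."""
    if len(dsp_requests) != len(ocb_requests):
        raise ValueError("request streams must have the same length")
    return [arbitrate({"dsp": d, "ocb": o})
            for d, o in zip(dsp_requests, ocb_requests)]
```

In this model a long control system block transfer on the OCB port only receives the cycles that the DSP port leaves unused, so the DSP access time stays deterministic.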



Figure 7: Feedback FPGA firmware architecture.

## SUMMARY AND OUTLOOK

The generic hardware and firmware architecture concepts of the PDC board maximise synergies and thus minimise the hardware and firmware development and maintenance effort for different feedback applications such as feedbacks in linear accelerators, bunch-to-bunch multibunch feedbacks in storage rings, or fast orbit feedbacks. While the PDC targets high-performance signal processing and feedback applications where a smaller number of boards with DSPs and high-end FPGAs is needed, a low-cost version of the PDC (called "GPAC" = Generic PSI ADC Carrier) without DSPs and with less expensive Xilinx Virtex-5 LXT/FXT and Spartan-3 FPGAs is currently being developed as a digital back-end for large-volume applications like the European XFEL BPM system or other diagnostics applications and upgrades of existing systems at PSI. PDC and GPAC have compatible 500-pin 10 GSPS high-speed mezzanine connectors, allowing flexible combinations of ADC/DAC or other front-end mezzanines and back-end carrier boards depending on the required back-end performance.

## REFERENCES

- [1] B. Keil et al., "Design of an Intra-Bunch-Train Feedback System for the European X-Ray FEL", Proc. DIPAC'07, Venice, Italy, May 2007
- [2] T. Schilcher et al., "Commissioning of the Fast Orbit Feedback at SLS", Proc. PAC'03, Portland, Oregon, May 2003
- [3] M. Dehler et al., "Status of the SLS Multi Bunch Feedbacks", Proc. APAC'07, India, 2007
- [4] B. Keil et al., "The DSP-Based Betatron Tune Feedback of the Ramped Electron Storage Ring Bodo", Proc. EPAC'04, Lucerne, Switzerland, 2004