Alma Mater Studiorum  $\cdot$  University of Bologna

School of Science

Department of Physics and Astronomy

Master Degree in Physics

## Quality investigation of an ATLAS Phase-II DAQ board via Signal Integrity simulations

Supervisor: Prof. Alessandro Gabrielli

Co-supervisors: Ing. Luca Pelliccioni, Ing. Roberto Carretta

Submitted by: Marco Collesei

Academic Year 2019/2020

"Siate felici, e se qualche volta la felicità si scorda di voi, voi non vi scordate della felicità." - Roberto Benigni

#### Abstract

The desire to discover the unknown features of the Universe has always been the driving force of particle physicists. Through particle colliders of increasing performance, over almost five decades of research in High Energy Physics (HEP), many milestones of the Standard Model (SM), the most accurate and predictive theory of matter to date, have been set. The drive to gain knowledge and to answer fundamental questions has led to the construction of ever larger and more powerful apparatus. The Large Hadron Collider (LHC) is the most significant expression of the efforts of physicists all over the world, and it is continually being developed to push its collision and detection limits.

Indeed, by the end of 2024 the installation work for a renewed collider, capable of reaching a nominal luminosity of $\sim 7.5 \times 10^{34}$ cm<sup>-2</sup>s<sup>-1</sup>, will take place. The new machine, called High Luminosity-LHC (HL-LHC), will be operational for at least another decade and, to meet the challenges posed by the search for new physics, the main detectors such as A Toroidal LHC ApparatuS (ATLAS) and Compact Muon Solenoid (CMS) will be upgraded for Phase-II.

Together with the structural changes in the ATLAS sub-detectors, the entire Trigger and Data Acquisition (TDAQ) strategy will be upgraded through the implementation of new technologies.

Among these technological advancements, the Front-End LInk eXchange (FELIX) DAQ board, developed by Brookhaven National Laboratory (BNL) together with ATLAS, will play a prominent role. The new high-speed Printed Circuit Board (PCB) will handle the communication between the sub-detectors, the first trigger level and the Dataflow, thanks to its multi-Gb/s links and programmable logic, which make the board a versatile and long-lasting element of the TDAQ chain.

My personal work has been testing some of the transmission channels of the FELIX Phase-II board from the standpoint of Signal Integrity (SI), with the goal of supplying simulation results that confirm the reliability of its high data-rate lines. Ultimately, I have shown that high-speed digital design is a key step in the research and development of DAQ boards, and an essential instrument for reaching optimal performance.

# Contents

- Introduction
- 1 ATLAS experiment
  - 1.1 LHC and ATLAS state of art
    - 1.1.1 LHC
    - 1.1.2 ATLAS
  - 1.2 Phase-II upgrade
  - 1.3 FELIX project
- 2 Signal Integrity
  - 2.1 Signal Integrity
    - 2.1.1 Transmission
    - 2.1.2 Main issues
    - 2.1.3 SI methodology
- 3 Simulation with HyperLynx
  - 3.1 Groundwork
    - 3.1.1 Models
  - 3.2 Results
- 4 Increasing performances
  - 4.1 Parameters
    - 4.1.1 Geometrical
    - 4.1.2 Configuration
  - 4.2 Proposed solution
- 5 Conclusions
- A FPGA
- B Supplementary diagnostic diagrams


## Introduction

Particle physics investigates the structure of matter and the mechanisms that govern the Universe. Since the late 1950s, particle accelerators have been used by physicists all over the world to closely observe the nuclear and subnuclear reactions of the fundamental building blocks of Nature. These accelerators come in various shapes and sizes and accelerate particles to different energies. Over the decades, beam energies have increased by orders of magnitude, allowing a better understanding of the phenomena described by the Standard Model (SM).

Near Geneva is the Conseil Européen pour la Recherche Nucléaire (CERN), home to the largest collaboration of physicists, computer scientists and engineers in the world. CERN hosts the Large Hadron Collider (LHC), the most powerful particle accelerator ever built, with a ring 27 km long and a centre-of-mass energy of 13 TeV.

Four main detectors are located at the LHC interaction points: A Toroidal LHC ApparatuS (ATLAS), Compact Muon Solenoid (CMS), A Large Ion Collider Experiment (ALICE) and LHCb. Their task is to reconstruct, starting from the trajectories of the particles, the mechanisms underlying the fundamental interactions.

The current theory, described by the SM, predicts the existence of a defined number of elementary constituents subjected to fundamental forces mediated by other fundamental particles.

Despite the numerous experimental confirmations of the Standard Model's predictions, some problems remain open, such as the unification of fundamental forces (Grand Unified Theory, GUT), the reconciliation between general relativity and quantum theory, the understanding of Dark Matter and Dark Energy, and the determination of many SM parameters with increased accuracy.

To allow researchers to perform increasingly precise and meaningful experiments, the accelerators and all the measuring technologies are constantly upgraded. In particular, the ATLAS experiment, like the other detectors at CERN, will be upgraded in several phases, both in the structure of its sub-detectors and in the data acquisition electronics, to cope with the increasing luminosity and number of interactions per event.

Above all, with the major upgrade from the LHC to the High Luminosity-LHC (HL-LHC) by the end of 2024, the performance of the ATLAS sub-detectors will be enhanced through the partial or complete replacement of existing sub-systems. Together with these structural changes, and since the amount of information produced will rise considerably, the whole Trigger and Data Acquisition (TDAQ) chain will be redesigned to meet the Phase-II requirements. A basic component of the new TDAQ architecture for the Phase-II upgrade of ATLAS is the Front-End LInk eXchange (FELIX) board, the result of a joint development effort by Brookhaven National Laboratory (BNL) and CERN. The idea behind the project is a high-speed, versatile, non-custom Printed Circuit Board (PCB) that can be programmed and upgraded to handle the trigger and the data received from the new Inner Tracker (ITk), the Calorimeters and the Muon System.

The object of study of this thesis is the quality of signal transmission on the FELIX Phase-II acquisition board. The work is organised in four main chapters:

Chapter 1 is a detailed description of the current status of the LHC and the ATLAS experiment, with particular attention to the ATLAS Phase-II upgrade, and introduces the FELIX project together with the main characteristics of the board.

Chapter 2 introduces the world of Signal Integrity (SI), with the aim of supplying the background of high-speed digital design, with related considerations and issues.

SI is the core of the investigation that I have carried out on the FELIX Phase-II board; all the preliminary considerations that I have made on the project, and the choices that followed, are based on high-speed signal theory.

Chapter 3 is where the previous considerations are applied to FELIX. With the technical support of Link Engineering of S. Giovanni in Persiceto (BO), in the persons of Roberto Carretta, Signal and Power Integrity Analysis Specialist, and Luca Pelliccioni, Chief Technical Officer, and in collaboration with the Bologna INFN section, I have led the SI simulations on the board.

The aim of these simulations was to verify the quality of signal transmission over potentially critical nets operating at data rates of the order of 10 Gb/s. All the simulations but one performed well, so I paid particular attention to the only one that was affected by signal degradation.

Chapter 4 brings together all the observations made so far. I have proposed two successful solutions to restore the SI of the poor-quality transmission line identified in the first series of SI simulations, showing that the knowledge and experience acquired in this thesis work can provide essential support to the development of the new FELIX board.

## Chapter 1

## ATLAS experiment

For several decades, particle colliders have been essential tools for particle physics. From the very beginning, such accelerators have been among the most complicated scientific instruments ever built, involving tens of thousands of scientists all over the world. The Large Hadron Collider (LHC) is the most illustrative example of this technology. Over the past two decades, it has opened a new frontier in particle physics thanks to its high collision energy and luminosity.

One of its largest experiments, A Toroidal LHC ApparatuS (ATLAS), has been built guided by the principle of maximising the discovery potential for new physics, such as Higgs bosons and supersymmetric particles, while keeping the capability of high-accuracy measurements of known objects like heavy quarks and gauge bosons. The basic design criteria of the detector include:

- a very good electromagnetic calorimetry for electron and photon identification and measurements, paired with full-coverage hadronic calorimetry for accurate jet and missing transverse energy measurements;
- high-precision muon momentum measurements, capable of measurements at the highest luminosity using the external muon spectrometer;
- efficient tracking at high luminosity for high transverse momentum measurements and full event reconstruction at lower luminosity;
- large acceptance in pseudorapidity $(\eta)$ with almost full azimuthal angle $(\phi)$ coverage everywhere. The angle $\phi$ is measured around the beam axis, whereas $\eta$ relates to the polar angle $\theta$ (the angle from the z-direction);
- triggering and measurements of particles at low transverse momentum thresholds, providing high efficiencies for most physics processes of interest at LHC.
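As a brief illustration of how $\eta$ relates to $\theta$ in the list above, the sketch below uses the standard definition $\eta = -\ln \tan(\theta/2)$, which is not written out in the text. The function names are illustrative only.

```python
import math


def pseudorapidity(theta: float) -> float:
    """Pseudorapidity for a polar angle theta (radians) measured from the z-axis."""
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must lie strictly between 0 and pi")
    return -math.log(math.tan(theta / 2.0))


def polar_angle(eta: float) -> float:
    """Inverse mapping: polar angle (radians) for a given pseudorapidity."""
    return 2.0 * math.atan(math.exp(-eta))


if __name__ == "__main__":
    # Print the polar angle for a few pseudorapidity values used in this chapter.
    for eta in (0.0, 1.0, 2.5, 4.9):
        theta_deg = math.degrees(polar_angle(eta))
        print(f"eta = {eta:3.1f} -> theta = {theta_deg:6.2f} deg")
```

A particle emitted perpendicular to the beam axis has $\eta = 0$, and $|\eta|$ grows without bound as the direction approaches the beam line.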

### 1.1 LHC and ATLAS state of art



Figure 1.1: CERN accelerator complex; the LHC in blue is the last stage of the accelerating chain which comprises smaller rings.

### 1.1.1 LHC

The LHC is a circular collider with 14 TeV centre of mass energy and design luminosity of  $10^{34}$  cm<sup>-2</sup>s<sup>-1</sup>. Beam crossings are 25 ns apart and, at design luminosity, there are 23 interactions per crossing. It is supplied with protons and Pb ions from the existing injector chain comprising:

- the linear accelerator Linac2;
- Proton Synchrotron Booster (PSB);
- Proton Synchrotron (PS);
- Super Proton Synchrotron (SPS).

The project, approved by the CERN Council in December 1994, was installed in the underground infrastructure of the Large Electron Positron (LEP) collider, which consisted of a 26.7 km long ring tunnel. LEP included experimental areas at four points (2, 4, 6 and 8), each incorporating experimental and service caverns; for the LHC project, the existing LEP tunnel was re-used after the complete dismantling of the LEP machine. In addition, new structures were added, including experimental and service caverns destined to hold two new experiments. Of the four LHC experimental areas, two were constructed on almost empty sites with very little existing infrastructure: the two large experimental zones for ATLAS and CMS were built at points 1 and 5 respectively. For the two smaller experiments (ALICE and LHCb), the existing infrastructure required only minor modifications to accommodate the new detectors.



Figure 1.2: Open view of sub-detectors geometry in the ATLAS experiment, including the central solenoid magnet and the barrel and end-cap toroid magnets.

#### 1.1.2 ATLAS

The ATLAS experiment is a general-purpose detector designed to exploit the full discovery potential of the LHC. Many of the interesting physics questions at the LHC require high luminosity, and so the primary goal is to operate at a luminosity of $10^{34}$ cm<sup>-2</sup>s<sup>-1</sup> with a detector that provides as many signatures as possible, using electron, photon, muon, jet, and missing transverse energy measurements, as well as b-quark tagging. The variety of signatures, both at low and high transverse momentum, is important in the high-rate environment of the LHC to achieve robust and redundant physics measurements. The detector was meant to achieve different physics goals such as:

- the search for the Higgs boson, which presents some of the most challenging signatures, involving high-resolution measurements of electrons, photons and muons, excellent secondary vertex detection for $\tau$-leptons and b-quarks, and high-resolution calorimetry for jets and missing transverse energy, essential to explore the full range of possible masses;
- searches for SUSY at the electroweak scale with expected abundant production of squarks and gluinos leading to a variety of signatures involving multi-jets, leptons, photons, heavy flavours and missing energy;
- the precision measurements of the W and top-quark masses, gauge boson couplings, CP violation and the determination of the Cabibbo-Kobayashi-Maskawa unitarity triangle.

The dimensions of the ATLAS detector are remarkable: the outer chambers of the barrel are at a radius of about 11 m, the half-length of the barrel toroid coils is 12.5 m, and the third layer of the forward muon chambers, mounted on the cavern wall, is located about 23 m from the interaction point. Its weight is also significant, about 7000 tons overall. Large superconducting air-core toroid magnets surround a variety of sub-detectors, listed here [1].

**Muon spectrometer** The conceptual layout of the muon spectrometer is based on the magnetic deflection of muon tracks. Over the range  $|\eta| \leq 1.0$ , magnetic bending is provided by the large barrel toroid. For  $1.4 \leq |\eta| \leq 2.7$ , muon tracks are bent by two smaller end-cap magnets inserted into both ends of the barrel toroid. Over  $1.0 \leq |\eta| \leq 1.4$  magnetic deflection is provided by a combination of barrel and end-cap fields.

This magnet configuration provides a field mostly orthogonal to the muon trajectories. Trigger and reconstruction algorithms are optimised to cope with the difficult background conditions resulting from penetrating primary collision products and from radiation background produced from secondary interactions in the calorimeters.
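The pseudorapidity ranges quoted above can be summarised as a small lookup, sketched here under one stated assumption: the quoted ranges share their edges at $|\eta| = 1.0$ and $|\eta| = 1.4$, so this sketch arbitrarily assigns 1.0 to the barrel and 1.4 to the end-caps. The function name and the returned labels are illustrative.

```python
def bending_region(eta: float) -> str:
    """Return which toroid field bends a muon at the given pseudorapidity.

    Range limits are taken from the text; the assignment of the shared
    edges (|eta| = 1.0 and |eta| = 1.4) is an arbitrary choice of this sketch.
    """
    abs_eta = abs(eta)
    if abs_eta <= 1.0:
        return "barrel toroid"
    if abs_eta < 1.4:
        return "barrel and end-cap toroids"
    if abs_eta <= 2.7:
        return "end-cap toroids"
    return "outside acceptance"


if __name__ == "__main__":
    for eta in (0.3, -1.2, 2.0, 3.0):
        print(f"eta = {eta:+.1f}: {bending_region(eta)}")
```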

In the barrel region, tracks are measured in chambers arranged in three cylindrical layers around the beam axis; in the transition and end-cap regions instead, the chambers are placed vertically. Four chamber technologies, arranged such that particles from the interaction point traverse three stations of chambers, are described below.



Figure 1.3: Cross sections of the four technologies employed for muon detection.

**MDT** Monitored Drift Tubes (MDT) are aluminium tubes of 30 mm diameter and 400  $\mu$ m wall thickness, with a 50  $\mu$ m diameter central W–Re wire. The tubes are filled with a non-flammable mixture of 93% Ar and 7% CO<sub>2</sub> at 3 bar of pressure, allowing a maximum drift time of ~700 ns, a small Lorentz angle, and excellent ageing properties with a single-wire resolution of ~80  $\mu$ m.
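As a rough consistency check of the MDT numbers above, the following sketch estimates the average drift velocity implied by the tube geometry and the maximum drift time. It assumes that the longest drift path equals the inner tube radius (the wire radius is neglected) and it only yields an average, since the drift velocity in such a gas mixture is in general not constant along the path.

```python
TUBE_DIAMETER_MM = 30.0     # outer tube diameter, from the text
WALL_THICKNESS_MM = 0.4     # 400 um wall thickness, from the text
MAX_DRIFT_TIME_NS = 700.0   # approximate maximum drift time, from the text


def mean_drift_velocity_um_per_ns(
    diameter_mm: float = TUBE_DIAMETER_MM,
    wall_mm: float = WALL_THICKNESS_MM,
    max_drift_ns: float = MAX_DRIFT_TIME_NS,
) -> float:
    """Average drift velocity, assuming the longest path is the inner radius."""
    inner_radius_um = (diameter_mm / 2.0 - wall_mm) * 1000.0
    return inner_radius_um / max_drift_ns


if __name__ == "__main__":
    print(f"mean drift velocity ~ {mean_drift_velocity_um_per_ns():.1f} um/ns")
```

With the quoted values this gives an average of roughly 21 $\mu$m/ns.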

**CSC** Cathode Strip Chambers (CSC) are multi-wire proportional chambers with cathode strip readout and with a symmetric cell in which the anode-cathode spacing is equal to the anode wire pitch. The precision coordinate is obtained by measuring the charge induced on the segmented cathode by the avalanche formed on the anode wire. A good spatial resolution is obtained by segmenting the readout cathode and by charge interpolation between neighbouring strips. The anode wire pitch is 2.54 mm and the cathode readout pitch is 5.08 mm, providing position resolutions better than 60 $\mu$m, small electron drift times (30 ns), good time resolution (7 ns) and low neutron sensitivity. The gas is a non-flammable mixture of 30% Ar, 50% CO<sub>2</sub> and 20% CF<sub>4</sub>, with a total volume of 1.1 m<sup>3</sup>.
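The charge interpolation between neighbouring strips mentioned above can be illustrated with a simple centre-of-gravity estimate. This is only a minimal sketch: the centroid method is used here as a stand-in for the actual reconstruction, the charges are invented example values, and strip centres are assumed to sit at integer multiples of the 5.08 mm readout pitch.

```python
from typing import Sequence

READOUT_PITCH_MM = 5.08  # cathode readout pitch, from the text


def strip_centroid_mm(
    charges: Sequence[float],
    first_strip: int = 0,
    pitch_mm: float = READOUT_PITCH_MM,
) -> float:
    """Charge-weighted mean position of a cluster of adjacent strips.

    charges[i] is the induced charge on strip number first_strip + i;
    the centre of strip n is assumed to be at n * pitch_mm.
    """
    total = sum(charges)
    if total <= 0:
        raise ValueError("cluster must carry positive total charge")
    weighted = sum((first_strip + i) * q for i, q in enumerate(charges))
    return pitch_mm * weighted / total


if __name__ == "__main__":
    # Hypothetical three-strip cluster, arbitrary charge units.
    print(f"{strip_centroid_mm([0.0, 30.0, 10.0]):.3f} mm")
```

A symmetric cluster returns the centre of the middle strip, while an asymmetric one is pulled towards the strip with the larger charge, which is how a resolution much finer than the strip pitch becomes possible.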

**RPC** Resistive Plate Chambers (RPC) are gaseous detectors providing a typical space-time resolution of $1 \text{ cm} \times 1$ ns with digital readout. The basic RPC unit is a narrow gas gap formed by two parallel resistive Bakelite plates, separated by insulating spacers. The primary ionisation electrons are multiplied into avalanches by a high, uniform electric field of typically 4.5 kV/mm, and amplification in avalanche mode produces pulses of typically 0.5 pC.

The gas mixture is based on tetrafluoroethane  $(C_2H_2F_4)$  with some small admixture of  $SF_6$ , a non-flammable and safe gas. The signal is read out via capacitive coupling by metal strips on both sides of the detector. Each chamber is made from two detector layers and four readout strip panels.

**TGC** Thin Gap Chambers (TGC) are similar in design to multi-wire proportional chambers, with the difference that the anode wire pitch is larger than the cathode–anode distance. Signals from the anode wires, arranged parallel to the MDT wires, provide the trigger information together with readout strips, also used to measure the second coordinate. This type of cell operates with a highly quenching gas mixture of 55% CO<sub>2</sub> and 45% n-pentane (n-C<sub>5</sub>H<sub>12</sub>) working in a saturated mode, reaching small sensitivity to mechanical deformations, small dependence of the pulse height on the incident angle and nearly Gaussian pulse height distribution with small Landau tails.

The main dimensional characteristics of the chambers are a cathode-cathode distance (gas gap) of 2.8 mm, a wire pitch of 1.8 mm and a wire diameter of 50 $\mu$m. Thanks to the electric field configuration and the small wire distance, a short drift time and thus a good time resolution can be reached.

**Calorimeters** The calorimetry consists of an electromagnetic (EM) calorimeter covering the pseudorapidity region  $|\eta| < 3.2$ , a hadronic barrel calorimeter covering  $|\eta| < 1.7$ , hadronic end-cap calorimeters covering  $1.5 < |\eta| < 3.2$  and forward calorimeters covering  $3.1 < |\eta| < 4.9$ .

**EM** The Electromagnetic (EM) calorimeter is a lead/liquid-argon (LAr) detector divided into a barrel part ( $|\eta| < 1.475$ ) and two end-caps (1.375  $< |\eta| < 3.2$ ). The barrel calorimeter consists of two identical half-barrels, separated by a small gap (6 mm) at z=0. Each end-cap calorimeter is mechanically divided into two wheels: an outer wheel covering the region  $1.375 < |\eta| < 2.5$ , and an inner wheel covering the region  $2.5 < |\eta| < 3.2$ .

The lead thickness in the absorber plates is optimised in terms of EM calorimeter performance in energy resolution. The LAr gap has a constant thickness of 2.1 mm in the barrel. The total thickness of the EM calorimeter is > 24 radiation lengths (X<sub>0</sub>) in the barrel and > 26 X<sub>0</sub> in the end-caps.

Over the region devoted to precision physics, the EM calorimeter is segmented into three longitudinal sections. The strip section, which has a constant thickness of ~6 X<sub>0</sub>, acts as a pre-shower detector, enhancing particle identification ($\gamma/\pi^0$, $e/\pi$ separation, etc.) and providing a precise position measurement. The middle section is transversally segmented into square towers. The total calorimeter thickness up to the end of the second section is 24 X<sub>0</sub>, tapered with increasing rapidity (this also includes the upstream material). The signals from the EM calorimeters are extracted at the detector inner and outer faces and sent to pre-amplifiers located outside the cryostats.

**HAD** The Hadronic (HAD) barrel calorimeter is a cylinder divided into three sections: the central barrel and two identical extended barrels based on a sampling technique with plastic scintillator plates (tiles) embedded in an iron absorber.

The ATLAS HAD calorimeters cover the range $|\eta| < 4.9$ using different techniques best suited for the widely varying requirements and radiation environment over the large $\eta$ range. Over the range $|\eta| < 1.7$, the iron scintillating-tile technique is used for the barrel and extended barrel tile calorimeters, while in the range $\sim 1.5 < |\eta| < 4.9$ LAr calorimeters were chosen. The hadronic end-cap calorimeter (HEC) extends to $|\eta| < 3.2$, while the range $3.1 < |\eta| < 4.9$ is covered by the high-density forward calorimeter (FCAL). Both the HEC and the FCAL are integrated into the same cryostat as that housing the EM end-caps.

An important parameter in the design of the hadronic calorimeter is its thickness, which must provide good containment for hadronic showers and reduce punch-through into the muon system. The total thickness is 11 interaction lengths ($\lambda$) at $\eta = 0$; close to 10 $\lambda$ of active calorimeter are adequate to provide good resolution for high-energy jets. Together with the large $\eta$ coverage, this guarantees a good $E_T^{miss}$ measurement, important for many physics signatures and in particular for SUSY particle searches.

In particular, the large hadronic barrel calorimeter is a sampling calorimeter using iron as the absorber and scintillating tiles as the active material. The tiles, 3 mm thick, are radially disposed from an inner radius of 2.28 m to an outer radius of 4.25 m and periodically staggered in depth along z with layers of iron, each 14 mm thick. Two sides of the scintillating tiles are read out by wavelength shifting (WLS) fibres into two separate photomultipliers (PMTs) with low dark current and fast rise time (few ns).



Figure 1.4: Representation of the ATLAS Inner Detector with the arrangement of the SCT and TRT technologies in the barrels and end-caps; the Pixel Detector is located inside the barrels.

**Inner detector** The Inner Detector (ID) is contained within a cylinder of length 7 m and a radius of 1.15 m, surrounded by the solenoidal magnetic field of 2 T [2]. Pattern recognition, momentum and vertex measurements, and electron identification are achieved with a combination of discrete high-resolution semiconductor pixel and strip detectors in the inner part of the tracking volume, and continuous straw-tube tracking detectors with transition radiation capability in its outer part. The momentum and vertex resolution targets require high-precision measurements to be made with fine-granularity sub-detectors, presented below, arranged on concentric cylinders around the beam axis in the region with $|\eta| < 1$, while the end-cap detectors are mounted on disks perpendicular to the beam axis.



Figure 1.5: Detailed illustration of the effective dimensions of the ID tracking system around the beam pipe.

**TRT** The Transition Radiation Tracker (TRT) is based on the use of straw tubes of 4 mm in diameter, giving a fast response and good mechanical properties for a maximum straw length of 150 cm. This sub-detector can operate at very high rates, and electron identification capability is added by employing xenon gas to detect transition-radiation photons created in a radiator between the straws. This technique, which is intrinsically radiation hard, allows a large number of measurements on every track. However, the detector has to cope with a large occupancy and high counting rates. The barrel consists of about 50000 straws, each divided in two at the centre in order to reduce the occupancy and read out at each end. The end-caps contain 320000 radial straws, with the readout at the outer radius, giving a total number of electronic channels of 420000. Each channel provides a drift-time measurement, resulting in a spatial resolution of 170 $\mu$m per straw, and two independent thresholds. These allow the detector to discriminate between tracking hits and transition-radiation hits.

Two end-caps, each consisting of 18 wheels, cover the radial range from 48 to 103 cm. The TRT contributes to the accuracy of the momentum measurement in the Inner Detector by providing a set of measurements roughly equivalent to a single point of 50  $\mu$ m precision. It also helps the pattern recognition by the addition of around 36 hits per track, and allows a simple and fast level-2 track trigger.
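The TRT channel count quoted above follows directly from the straw numbers, as the short sketch below shows. It assumes that every barrel straw, being split at the centre and read out at each end, contributes two channels, and that every end-cap straw contributes one. The names are illustrative.

```python
BARREL_STRAWS = 50_000    # approximate number of barrel straws, from the text
ENDCAP_STRAWS = 320_000   # number of radial end-cap straws, from the text


def trt_channels(
    barrel_straws: int = BARREL_STRAWS,
    endcap_straws: int = ENDCAP_STRAWS,
    readouts_per_barrel_straw: int = 2,
) -> int:
    """Total electronic channels: split barrel straws plus end-cap straws."""
    return barrel_straws * readouts_per_barrel_straw + endcap_straws


if __name__ == "__main__":
    print(f"TRT electronic channels: {trt_channels()}")
```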

**SCT** The Semiconductor Tracker (SCT) barrel, designed to provide four precision measurements per track in the intermediate radial range, uses four layers of silicon microstrip detectors to provide precision points in the  $R\phi$  and z coordinates contributing to the measurement of momentum, impact parameter and vertex position. It also provides good pattern recognition by the use of high granularity.

Each silicon detector is  $6.36 \times 6.40 \text{ cm}^2$  with 768 readout strips each with 80  $\mu$ m pitch. The spatial resolution is 16  $\mu$ m in R $\phi$  and 580  $\mu$ m in z, and tracks can be distinguished if separated by more than ~ 200  $\mu$ m. The system requires a very high dimensional stability, cold operation of the detectors, and the evacuation of the heat generated by the electronics and leakage current.

The readout chain consists of a front-end amplifier and discriminator, followed by a binary pipeline which stores the hits above threshold until the first level trigger decision.
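Two of the SCT figures above can be cross-checked with elementary arithmetic. The sketch below computes the active width spanned by the strips and the resolution expected from a binary (hit or no hit) readout, using the standard pitch$/\sqrt{12}$ estimate for a uniform distribution across one strip. Treating the quoted $R\phi$ resolution as the combination of two independent measurements per layer is an assumption of this sketch, not a statement from the text.

```python
import math

N_STRIPS = 768         # readout strips per detector, from the text
STRIP_PITCH_UM = 80.0  # strip pitch, from the text


def active_width_mm(n_strips: int = N_STRIPS, pitch_um: float = STRIP_PITCH_UM) -> float:
    """Width covered by the strips, in millimetres."""
    return n_strips * pitch_um / 1000.0


def binary_resolution_um(pitch_um: float = STRIP_PITCH_UM) -> float:
    """RMS of a uniform distribution over one strip pitch."""
    return pitch_um / math.sqrt(12.0)


def combined_resolution_um(single_um: float, n_measurements: int) -> float:
    """Resolution of the average of n independent, equal measurements."""
    return single_um / math.sqrt(n_measurements)


if __name__ == "__main__":
    single = binary_resolution_um()
    print(f"active width       : {active_width_mm():.2f} mm")
    print(f"single-plane sigma : {single:.1f} um")
    # Assumption: two independent measurements per layer.
    print(f"two-plane sigma    : {combined_resolution_um(single, 2):.1f} um")
```

The strips span 61.44 mm, compatible with the $6.36 \times 6.40 \text{ cm}^2$ detector size, and the two-measurement estimate comes out close to the quoted 16 $\mu$m.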



Figure 1.6: Composition of the Pixel Detector, the main tracking system inside ATLAS used to reconstruct interaction vertices.

**PD** The Pixel Detector (PD) consists of three barrels at average radii of $\sim 4$ cm, 11 cm and 14 cm, and four disks on each side, between radii of 11 and 20 cm. It is designed to provide a very high-granularity, high-precision set of measurements as close to the interaction point as possible, supplying precision measurements over the full acceptance. It also determines the impact parameter resolution and the ability to find short-lived particles such as b-quarks and $\tau$-leptons.



Figure 1.7: 3D reconstruction of the layers composing the ASIC chips included in the PD.

The readout chips are of large area, with individual circuits for each pixel element, including buffering to store the data while awaiting the level-1 trigger decision. In addition, the chips are radiation hardened to withstand over 300 kGy of ionising radiation and over  $5 \times 10^{14}$  neutrons per cm<sup>2</sup> in years of operation. The pixel modules are very similar in design for the disks and barrels; each barrel module is 62.4 mm long and 22.4 mm wide, with 61440 pixel elements, read out by 16 chips each serving an array of 24 by 160 pixels. The output signals are routed on the sensor surface to a hybrid on top of the chips, and from there to a separate clock and control integrated circuit.
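The pixel count per barrel module quoted above can be reproduced from the chip layout. The sketch below also derives an average area per pixel under the simplifying assumption that the whole $62.4 \times 22.4$ mm$^2$ module surface is shared equally among the pixels, which overestimates the real pixel area since part of the module is inactive.

```python
CHIPS_PER_MODULE = 16   # readout chips per module, from the text
CHIP_COLUMNS = 24       # pixel array served by each chip, from the text
CHIP_ROWS = 160
MODULE_LENGTH_MM = 62.4
MODULE_WIDTH_MM = 22.4


def pixels_per_module(
    chips: int = CHIPS_PER_MODULE,
    columns: int = CHIP_COLUMNS,
    rows: int = CHIP_ROWS,
) -> int:
    """Number of pixel elements read out on one module."""
    return chips * columns * rows


def mean_pixel_area_um2(
    length_mm: float = MODULE_LENGTH_MM,
    width_mm: float = MODULE_WIDTH_MM,
    n_pixels: int = 61_440,
) -> float:
    """Module area divided by pixel count (assumes a fully active surface)."""
    return length_mm * width_mm * 1e6 / n_pixels


if __name__ == "__main__":
    n = pixels_per_module()
    print(f"pixels per module: {n}")
    print(f"mean area per pixel: {mean_pixel_area_um2(n_pixels=n):.0f} um^2")
```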



Figure 1.8: Each readout component of the PD is mounted on staves arranged around the beam pipe at a precise angle to cover the entire geometry.

**IBL** The Insertable Barrel Layer (IBL) is a fourth layer added to the ID between a new beam pipe and the current PD to face the following issues [3]:

- irreparable failures of modules in the PD layers due to radiation, which could be only partially compensated during reconstruction at the cost of an increased fake rate, deteriorate the impact parameter resolution and directly affect b-tagging. The IBL restores the full b-tagging efficiency;
- tracking precision is enhanced with the IBL located close to the interaction point, improving the quality of impact parameter reconstruction for tracks, and thereby vertexing and b-tagging performance. As a result, sensitivity for signals in physics channels involving b jets is improved;
- a luminosity at least twice the current one is expected before the High Luminosity-LHC (HL-LHC) is complete. With high luminosity the event pileup increases, leading to high occupancy that can induce readout inefficiencies, which would in turn limit the b-tagging efficiency. The presence of event pileup requires redundancy in the measurement of tracks in order to control the fake rate arising from pileup background. The addition of the IBL layer helps to preserve tracking performance in the face of luminosity effects.
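The link between luminosity and event pileup mentioned in the last item can be made quantitative with a minimal sketch. The mean number of inelastic interactions per bunch crossing is estimated as the interaction rate divided by the rate of bunch crossings. The inputs are the values quoted in this chapter for HL-LHC operation (2760 bunches, $\sigma_{in} = 81$ mb) and the 26.7 km ring length; the sketch further assumes that the protons travel at the speed of light and that every bunch collides at the interaction point.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0
MB_TO_CM2 = 1e-27  # 1 millibarn expressed in cm^2


def mean_pileup(
    luminosity_cm2_s: float,
    sigma_inel_mb: float = 81.0,
    n_bunches: int = 2760,
    ring_length_m: float = 26_700.0,
) -> float:
    """Average inelastic interactions per bunch crossing.

    Assumes ultra-relativistic beams (v = c) and that all bunches collide.
    """
    revolution_frequency_hz = SPEED_OF_LIGHT_M_PER_S / ring_length_m
    crossing_rate_hz = n_bunches * revolution_frequency_hz
    interaction_rate_hz = luminosity_cm2_s * sigma_inel_mb * MB_TO_CM2
    return interaction_rate_hz / crossing_rate_hz


if __name__ == "__main__":
    for lumi in (1e34, 2e34, 5e34):
        print(f"L = {lumi:.0e} cm^-2 s^-1 -> mean pileup ~ {mean_pileup(lumi):.0f}")
```

The pileup grows linearly with the instantaneous luminosity, which is why occupancy and fake rate become the limiting factors as the luminosity is raised.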

For the IBL pixel sensors two technologies are used side by side: the well-known planar design, already used in the rest of the Pixel Detector, and 3D sensors. The main structural difference between 3D and planar sensors is that the electrodes penetrate the bulk in the form of columns instead of being implanted on the surface. With this configuration the depletion is parallel to the wafer surface and, since the columns can be positioned at a distance smaller than the pixel thickness, the voltage necessary for full depletion is smaller, as is the pulse rise-time. With a small depletion voltage, the power dissipation due to leakage current is also smaller, and with it the cooling requirements.



Figure 1.9: IBL planar (a) and 3D (b) pixels read by Front End chips and connected to the readout chain through flexible circuits.

A new front-end chip, called FE-I4, was fabricated using a 130 nm CMOS architecture and contains readout circuitry for 26880 pixels arranged in 80 columns by 336 rows. The combination of a sensor, an FE-I4 chip and a flex hybrid, a double-sided flexible printed circuit that allows connection to external services, constitutes a module, the basic building block of the IBL detector. According to the sensor used, there are two types of module: planar modules, where a planar sensor is connected to two FE-I4 chips, and 3D modules, where a 3D sensor is connected to one FE-I4. The support structure holding together 20 modules, electrical services and a cooling pipe is called a stave, and the entire IBL detector is composed of 14 staves covering the full azimuthal angle and $|\eta| < 3$.
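The IBL figures above can be tallied in a few lines. The sketch below reproduces the number of pixels per FE-I4 chip and the total number of modules from the values in the text. It deliberately stops short of a total pixel count, since the split between planar and 3D modules on a stave is not given here.

```python
FEI4_COLUMNS = 80        # pixel columns per FE-I4, from the text
FEI4_ROWS = 336          # pixel rows per FE-I4, from the text
MODULES_PER_STAVE = 20   # from the text
STAVES = 14              # from the text


def pixels_per_fei4(columns: int = FEI4_COLUMNS, rows: int = FEI4_ROWS) -> int:
    """Pixels served by a single FE-I4 chip."""
    return columns * rows


def ibl_modules(staves: int = STAVES, modules_per_stave: int = MODULES_PER_STAVE) -> int:
    """Total number of modules in the IBL."""
    return staves * modules_per_stave


if __name__ == "__main__":
    print(f"pixels per FE-I4: {pixels_per_fei4()}")
    print(f"IBL modules     : {ibl_modules()}")
```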

### 1.2 Phase-II upgrade



Figure 1.10: Roadmap of LHC main upgrade stages together with centre of mass energy and integrated luminosity targets to be reached.

To sustain and extend its discovery potential, the LHC will undergo a major upgrade in the 2020s. This will increase its instantaneous luminosity (rate of collisions) by a factor of five beyond the original design value $(5 \times 10^{34} \text{ cm}^{-2}\text{s}^{-1})$ and the integrated luminosity (total number of collisions) by a factor of ten (250 fb<sup>-1</sup> per year). This upgrade will require new infrastructure (underground and on the surface) and over a decade to implement. The new configuration, known as HL-LHC, relies on several key innovations that push accelerator technology beyond its present limits [4]. Among these are cutting-edge 11–12 Tesla superconducting magnets, compact superconducting cavities for beam rotation with ultra-precise phase control, new technology and physical processes for beam collimation and a renewed data acquisition chain.
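To relate the two luminosity figures above, the sketch below converts the yearly integrated luminosity into the equivalent running time at constant instantaneous luminosity. It uses the unit conversion 1 fb$^{-1} = 10^{39}$ cm$^{-2}$ and assumes, for simplicity, that the machine delivers exactly $5 \times 10^{34}$ cm$^{-2}$s$^{-1}$ whenever it is colliding; the real luminosity profile over a fill is not considered.

```python
FB_INV_TO_CM2_INV = 1e39  # 1 fb^-1 expressed in cm^-2
SECONDS_PER_DAY = 86_400.0


def equivalent_seconds(integrated_fb_inv: float, luminosity_cm2_s: float) -> float:
    """Time needed to collect the integrated luminosity at constant luminosity."""
    return integrated_fb_inv * FB_INV_TO_CM2_INV / luminosity_cm2_s


if __name__ == "__main__":
    seconds = equivalent_seconds(250.0, 5e34)
    print(f"{seconds:.2e} s, i.e. about {seconds / SECONDS_PER_DAY:.0f} days per year")
```

Under this assumption, 250 fb<sup>-1</sup> per year corresponds to about $5 \times 10^{6}$ s of collisions at constant luminosity, roughly two months of continuous running.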

The present roadmap of the LHC, as redefined following the coronavirus pandemic, foresees main equipment upgrades and layout modifications by the end of Run 3 for ATLAS and CMS, the two high-luminosity general-purpose detectors. The ATLAS and CMS detectors will be upgraded to handle an average pile-up, the number of events per bunch crossing, of at least 140 (ultimately 200) for operation with 25 ns beams consisting of 2760 bunches at 7 TeV, and for an inelastic cross-section  $\sigma_{in} = 81$  mb. These detectors are also expected to handle a peak line density of pile-up events of at least 1.3 events per mm per bunch crossing, and ultimately larger values, with limited reduction of the detection efficiency.

**HL-LHC** The foreseen upgrade should provide the potential for good performance over a wide range of parameters. The machine and experiments will find the best practical set of parameters in actual operation, but the most relevant ones for optimizing the luminosity performance can be listed here:

- the total beam current will be a hard limit in the LHC since many systems, like RF power system and RF cavities, collimation system and absorbers, are affected by this parameter. Apart from radiation effects, all existing systems have been designed for  $I_{beam} = 0.86$  A; however, the HL-LHC will need to go 30% beyond ultimate beam current with 25 ns bunch spacing;
- the beam brightness, the ratio of the bunch intensity to its transverse emittance, is a beam characteristic that must be maximized at the beginning of beam generation and preserved throughout the entire injector chain and the operation cycle in the LHC itself. The HL-LHC project has as its primary objective increasing the number of protons per bunch above the nominal design value while keeping emittance at the present low value;
- the beta function, which determines the maximum amplitude a single particle trajectory can reach at a given position in the ring, is determined by the focusing properties of the lattice. A classical route for a luminosity upgrade with head-on collisions is to reduce β\* (β at the interaction point) using stronger and larger aperture quadrupoles thus reducing the transverse size of the luminous region resulting in a gain in peak luminosity;
- the luminosity reduction factor R decreases as the crossing angle grows, and a larger crossing angle is required to keep a small β\*; various methods can be employed to at least partially mitigate this effect. The most efficient and elegant solution for compensating the geometric reduction factor is the use of special superconducting RF cavities, capable of generating transverse electric fields that rotate each bunch longitudinally.

The expected installation of new equipment, and the previous de-installation and removal of the LHC equipment over a length of about 1.2 km, covers new technologies and infrastructures such as:

- almost complete renewal of the insertion region IR1 (around ATLAS experiment) and IR5 (around CMS), from the quadrupoles to the cryogenics and vacuum systems with the installation of new collimators in IR1 and IR5; as well as the upgrade of most secondary collimators and the insertion of crab cavities;
- installation of one new large 1.9 K refrigerator unit at both P1 and P5, providing the cooling power needed to absorb the five times larger heat load, together with the adoption of a new cryogenic distribution line;
- modification of the extraction and injection systems, in particular installation of new upgraded absorbers to cope with injection failures.

**ATLAS upgrades** The upgrade of the central tracking system of the ATLAS experiment for operation at the HL-LHC will start in the middle of 2026. At that time the LHC will have been upgraded to reach a peak instantaneous luminosity of  $7.5 \times 10^{34}$  cm<sup>-2</sup>s<sup>-1</sup>, which corresponds to an average of about 200 inelastic proton-proton collisions per beam-crossing. The new tracking detector will be operational for more than ten years; during this time ATLAS aims to accumulate a total data set with an integrated luminosity of  $4000 \text{ fb}^{-1}$ .
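The correspondence between the peak luminosity and the quoted pile-up can be checked with a back-of-the-envelope estimate: the inelastic collision rate (luminosity times cross-section) divided by the bunch-crossing rate. The sketch below is only illustrative. It reuses the 2760 bunches and  $\sigma_{in} = 81$  mb quoted earlier in this section, while the LHC revolution frequency of about 11.245 kHz and the function name are assumptions introduced here.

```python
# Rough estimate of the average pile-up per bunch crossing.
MB_TO_CM2 = 1e-27    # 1 millibarn expressed in cm^2
F_REV_HZ = 11245.0   # LHC revolution frequency, assumed value (~11.245 kHz)


def average_pileup(luminosity_cm2_s, sigma_inel_mb, n_bunches, f_rev_hz=F_REV_HZ):
    """Mean number of inelastic collisions per bunch crossing."""
    interaction_rate = luminosity_cm2_s * sigma_inel_mb * MB_TO_CM2  # collisions/s
    crossing_rate = n_bunches * f_rev_hz                             # crossings/s
    return interaction_rate / crossing_rate


# Peak HL-LHC luminosity with the bunch count and cross-section from the text.
mu_peak = average_pileup(7.5e34, 81.0, 2760)
print(f"average pile-up at 7.5e34 cm^-2 s^-1: {mu_peak:.0f}")
```

Under these assumptions the estimate comes out slightly below 200, in line with the figure of about 200 collisions per crossing.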

Meeting all of the requirements of a charged particle tracking detector, close to the beamline at the HL-LHC, presents a unique challenge for the design of an all-silicon system. In any case, its design can benefit from the enormous amount of experience gained over more than two decades in the construction and operation of the existing inner tracking detector, that has been highly successful for the exploitation of LHC physics up to and well beyond its original design requirements.



Figure 1.11: Display of the ATLAS Phase-II Inner Tracker ITk layout with all its shells.

**ITk** The development of the new Inner Tracker (ITk) detector layout is carried out considering the following set of goals [5]:

- designing a tracking detector that provides the required tracking performance to the ATLAS Phase-II physics programme, in events with an average pile-up of up to 200 simultaneous interactions;
- the detector should provide robust tracking in presence of detector defects, like sensor inefficiencies due to expected radiation effects, as well as dead modules due to eventual component failures;

- the aim to minimise cost by reducing as much as possible the total silicon surface necessary to achieve the required hit coverage and by choosing simple solutions whenever possible;
- try to choose layout options that allow minimising the CPU time needed for reconstruction, which is one of the cost drivers for the computing budget for the ATLAS Phase-II programme.



Figure 1.12: Schematic layout of one quadrant of the ITk Inclined Duals layout for the HL-LHC, the active elements of the barrel and end-cap Strip Detector are shown in blue, for the Pixel Detector the sensors are shown in red for the barrel layers and in dark red for the end-cap rings.

The layout of ITk combines precision central tracking in the presence of pile-up events with the ability to extend the tracking coverage to a pseudorapidity of 4 while maintaining excellent tracking efficiency and performance. The ITk comprises two subsystems: a Strip Detector, which has four barrel layers and six end-cap petal-design disks covering  $|\eta| < 2.7$ , surrounding a five-layer Pixel Detector that extends the coverage to  $|\eta| < 4$ .

Even though ITk is still in the R&D phase, the requirement to be able to replace the inner section of the ITk Pixel Detector during a long LHC shutdown places severe constraints on the design of the pixel package. The mechanical design is conceived so as not to rely on the presence of the inner section. In addition, the pixel package must be able to support the beam pipe without requiring the inner section to be present, in the same way the IBL was implemented in 2014 around a new, smaller-radius beam pipe. In this way, it is possible to guarantee the integrity of the ATLAS detector and some limited data-taking capabilities even in the event of catastrophic failures of the inner section, which could then be repaired in a long shutdown.

The new pixel module will be a hybrid pixel module similar to the one adopted for the present ATLAS Pixel Detector and the IBL. The hybrid pixel module is made of two parts: a bare module, consisting of a passive high-resistivity silicon sensor and a front-end read-out chip fabricated in CMOS technology, and a flexible PCB, called a module flex. The silicon sensor and front-end read-out chip are joined using a high-density connection technique. There will be three types of hybrid pixel modules:

- quad modules consisting of four chips bump-bonded to a single sensor (around  $4 \times 4 \text{ cm}^2$  in area), which are used in the outer flat barrel layers and the outer end-cap rings;
- dual modules consisting of two front-end chips bump-bonded to a single sensor (around  $2 \times 4$  cm<sup>2</sup> in area), which will be used in the innermost barrel layer and the inclined part of the outer barrel;
- single-chip modules consisting of one front-end chip bump-bonded to a sensor (around  $2 \times 2 \text{ cm}^2$  in area) required for the inclined part of the innermost barrel layer.

Compared to the design used for the IBL, the ITk Pixel module required several improvements:

- the pixel size has been reduced to  $50 \times 50 \ \mu m^2$  or  $25 \times 100 \ \mu m^2$  to improve intrinsic resolution and two-track separation;
- the design of the read-out chip has been improved in several ways: the analogue front-end can operate at lower threshold compensating for the loss of collected charge due to radiation damage, the read-out architecture has been improved to comply with the higher hit density and event rate and the radiation tolerance has been increased to  $1.4 \times 10^{16} n_{eq}/\text{cm}^2$ . The power consumption was also reduced and this has a positive effect on the material budget (less massive cables and reduced cooling requirements);
- the output bandwidth has been increased to 5.12 Gb/s per front-end chip, to cope with the hit rates in the innermost section of the tracker;
- the size of the module has been increased; with the largest module the size of four front-end chips, which is about 16 cm<sup>2</sup>, to reduce cost and fabrication time.

**TDAQ** Meeting HL-LHC requirements poses significant challenges to the Trigger and to the Data Acquisition system (TDAQ) to fully exploit the physics potential of the new collider. The overall goal of the TDAQ Phase-II upgrade project is to design, build, and install new trigger and data acquisition hardware with its firmware and needed software during the third long shutdown of the LHC in 2024 [6]. A baseline architecture, based on a single-level hardware trigger with a maximum rate of 1 MHz and 10  $\mu$ s latency has been proposed and, compared to the existing ATLAS Detector, the design of the TDAQ system for the ITk Pixel Detector is much more challenging. This is due to the larger trigger rate (10 times higher than current ATLAS), the larger number of hits associated with 200 proton-proton interactions per crossing and associated volume of data that is generated, in particular in the inner layers.



Figure 1.13: Schematic of the overall baseline design of the TDAQ system in Phase-II with the trigger levels and dataflow indicated in the legend aside. In particular the DAQ system is made up of FELIX, the Data Handlers, and the Dataflow subsystem at 1 MHz.


In detail, in the new set-up the Inner Tracker, the Calorimeters and the Muon System are expected to produce the first-level (L0) trigger data at 40 MHz, which will be handled by a series of event filters and trigger logic from L0Calo and L0Muon. Once a Global Trigger has been processed and has gone through the Central Trigger Processor (CTP), the L0 accept signal will be received by the FELIX board. FELIX will also receive the read-out data from the detectors at 1 MHz and, once all this information has been collected, the Dataflow will proceed with the Event Builder, the Storage Handler and the Event Aggregator, ultimately providing data both to the Permanent Storage and to the final Event Filter, where data are processed in farms.

Therefore, FELIX will play a pivotal role in receiving the trigger and the data as a unique, versatile and programmable board, and this thesis focuses on its development and functional testing.

The older generation of data links that operate at 160 Mb/s will be replaced with a new design that provides multi-Gb/s read-out without increasing the mass inside the tracking volume, taking advantage of the increased performance available in modern electronics. Each front-end chip will have four serial output lines that can each transmit data at 1.28 Gb/s (for a total of 5.12 Gb/s). These serial outputs will be coupled to an electrical data transmission line, connecting the front-end chip with the optical conversion stage.

ATLAS aims to fully explore the mechanism of electroweak symmetry breaking through the properties of the Higgs boson, to search for new physics through the study of rare Standard Model processes, to search for new heavy states, and measure the properties of any newly discovered particles. Both the necessity of a highly efficient selection of events with Higgs bosons in decay modes and of events accessing new, unexplored physics scenarios requires exceptional trigger and data acquisition performance.

The new TDAQ will benefit from the increased granularity provided by the calorimeters and the improved efficiency of muon-based triggers, and will perform hardware-based tracking, profiting from the extended coverage of the ITk. This is why the upgrade of the TDAQ system will require larger bandwidth and a better processing capacity than the current one to efficiently select events at high luminosity.

In detail, the expected ATLAS physics programme for the HL-LHC will cover a wide spectrum of physics goals and a representation of analyses including:

- unveiling the paradigm of electroweak symmetry breaking through precision measurements of the properties of the Higgs boson;
- improved measurements of all relevant Standard Model parameters including the study of rare Standard Model processes;
- searches for Beyond the Standard Model (BSM) signatures and flavour physics;
- specific challenges of the heavy-ion physics.

The new architecture of the TDAQ system will rely on a single-level hardware trigger (Level-0, formed using calorimeter and muon information) with a detector readout rate of 1 MHz and a maximum latency of 10  $\mu$ s. The Phase-I calorimeter trigger processors will be maintained during HL-LHC operations, with their firmware optimised for the pile-up conditions, and complemented by additional processors that will implement more sophisticated algorithms to provide additional background rejection.

The Event Filter (EF) system will select events based on a processor farm and a custom Hardware-based Tracking for the Trigger (HTT) to reduce the overall CPU requirements. Each system and sub-system is designed to be capable of evolving to a dual-level hardware-based trigger architecture as a mitigation strategy in case pile-up conditions at the HL-LHC either challenge the readout capabilities of certain detectors (for example of the innermost layers of the ITk) to the limits of the bandwidth available or in case the rates of hadronic trigger signatures surpass the current predictions.

The result of the Level-0 trigger decision is transmitted to all detectors, upon which the resulting TDAQ data are transmitted at 1 MHz through the Readout subsystem, which contains the Front-End LInk eXchange (FELIX) and Data Handler components, and the Dataflow subsystem, which contains the Event Builder, Storage Handler, and Event Aggregator components. Together these compose the DAQ system.

### 1.3 FELIX project

The Front-End LInk eXchange (FELIX) is a new detector readout component being developed by Brookhaven National Laboratory (BNL) as part of the ATLAS upgrade effort and it is designed to act as a data router, receiving packets from detector front-end electronics and sending them to programmable peers on a commodity high bandwidth network.

FELIX will be the main interface between the detector and all off-detector systems, and its main task is to handle the communication between them.

Whereas previous detector readout implementations relied on diverse custom hardware platforms, the idea behind FELIX is to unify all readout across one well supported and flexible platform, implementing detector data processing in software hosted by commodity server systems subscribed to FELIX data.

A first version of the FELIX board, the FLX-712, has been produced and tested to be implemented in the TDAQ upgrade of Phase-I of ATLAS, in particular the LAr Phase-I upgrade of the Trigger readout (both off- and on-detector) and the New Muon Small Wheel readout. The PCB employs two main I/O technologies: high speed optical fibres (up to 768 Gb/s) and the PCIe connection (up to 120 Gb/s) to control and configure the Front End (FE) chips and acquire data from calibration and detection procedures.



Figure 1.14: FLX-712 hosting a Xilinx Kintex Ultrascale FPGA (circled in red), a PCIe Gen.3 X16 connector (circled in orange), the MiniPODs (circled in light blue) and the connector for a Timing, Trigger and Control (TTC) mezzanine.



Figure 1.15: Block diagram of the FLX-712 main transmission lines. The board hosts 8 electro-optical MiniPOD transceivers (4 for the Transmitter and 4 for the Receiver) handled by the GTH transceivers of the FPGA, a PCIe connection with 16 lanes (8 transmitting and 8 receiving), a microcontroller to manage the JTAG, the SMBus, the trigger FPGA programming and a flash memory, and a jitter cleaner to provide a very stable clock to the board.

Since the board is designed to be easily programmable and reconfigurable, its commercial electrical and optical connections and the versatility of the FPGA are its main strong points.

From a network perspective, FELIX is designed to be flexible enough to support multiple technologies, including Ethernet and Infiniband. Given the general purpose nature of the FELIX effort, the system will also be adopted by other non-ATLAS projects.

Data, as processed by the FELIX system, are distributed via commodity multi-gigabit networks with at least 100 Gb/s network links that will be used throughout the DAQ system. FELIX supports two different link protocols for the transfer of data to and from front-end peers, each supported by the same hardware platform, with separate firmware revisions both based on the same core modules:

- the Gigabit Transceiver (GBT) chipset and associated technologies, developed as part of CERN's Radiation Hard Optical Link Project, whose goal is to achieve a radiation hard bi-directional link for use in LHC upgrade projects. GBT provides as interface an optical connectivity technology known as the Versatile link which provides high bandwidth and radiation hard transport of data up to 5 Gb/s;
- the FULL mode protocol (referring to full bandwidth) is implemented as a single wide data stream with no handshaking or logical substructure, given the requirement for a higher bandwidth data link, from the detector to FELIX, than GBT.

Since radiation hardness is not required, the FULL mode protocol can be implemented in FPGAs on both sides of the link.

The reduced constraints mean that FULL mode links can operate at a line transmission rate of 9.6 Gb/s which, accounting for 8b10b encoding, means a maximum user payload of 7.68 Gb/s.
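The payload figure follows directly from the 8b10b overhead: every 8 data bits travel as a 10-bit symbol, so only 8/10 of the line rate carries user data. The short sketch below reproduces that arithmetic; the function name is hypothetical and introduced only for illustration.

```python
def payload_rate_gbps(line_rate_gbps, data_bits=8, coded_bits=10):
    """User payload rate of a link that encodes data_bits into coded_bits."""
    return line_rate_gbps * data_bits / coded_bits


# FULL mode link: 9.6 Gb/s line rate with 8b10b encoding.
full_mode_payload = payload_rate_gbps(9.6)
print(f"FULL mode payload: {full_mode_payload:.2f} Gb/s")
```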



Figure 1.16: Progressive substitution during Phase-I upgrade of the present readout system, based on custom boards, with the FELIX project board, which will be faster and more versatile.

#### FELIX Phase-II

At the present state of the FELIX project, the 24-layer board hosts a Xilinx FPGA built in 16 nm technology, a PCIe connector to interface with the readout system and an FPGA Mezzanine Connector (FMC) to link to the ATLAS Timing, Trigger and Control (TTC) system. Several electro-optical transceivers are arranged on the board, providing large bandwidth to transfer data at higher rates than the DAQ system currently in use. Here follows a representation of the board with a detailed description of its main components.





Figure 1.17: Top view of the FELIX Phase-II board with its main components highlighted in blue. 1) Xilinx Virtex Ultrascale<sup>+</sup> XCVU9P-FLGC2104AAZ FPGA, 2) DDR4 slot (on top of the FPGA in (b)), 3) 12-channels 25G Duplex Finisar Board-mounted Optical Assembly (BOA), 4) 12-channels 25G Duplex Amphenol ICC On-Board Transceiver (OBT), 5) 12-channels 16 Tx/Rx Samtec FireFly<sup>TM</sup>, 6) 4-channels 25G Duplex Samtec FireFly<sup>TM</sup>. On the back of the board there is also the FMC connector. The PCIe connector (below the FPGA) is a PCIe Gen.3 X16 connector with a nominal data-rate of 15.754 GB/s.

Since the firmware of the FPGA is still under development, it is not possible to produce a block diagram as was done for the FLX-712 board, which is a previous version. A recap of the main interfaces and connections of the FELIX Phase-II board can nevertheless be presented.



Figure 1.18: Summary representation of the variety of the connections managed by the FPGA on the FELIX Phase-II board, these are divided into memories (Flash and DDR4), clock distribution, power management and monitoring components and, above all, a wide spectrum of data lines with different number of channels and data-rate.

The main idea behind the FELIX concept is the development of a modular system which makes it possible to independently upgrade or modify aspects of the system such as computing and buffering resources, network technology or supported serial-link protocols. The ability to evolve through further upgrades is a key feature of the readout system when considering the performance requirements and long development cycle leading to Phase-II, as well as the long lifetime of the ATLAS experiment beyond this period. Part of the process leading to the development of such a board is the validation of its channels and transmission lines, in order to ensure integrity in the data transmission. The following section is devoted to highlighting the importance of simulating the functioning of signal transmission lines when dealing with data-rates of the order of GB/s, as in the case of the FELIX board.

Later on, the study of two of the data transmission channels will be presented: the 4-channel 25 Gb/s Transmitters (Tx) and Receivers (Rx), and the 16-lane 8 Gb/s PCIe connection. I have chosen these two transmission lines because the former is one of the fastest electro-optical connections on the board, and the latter has one of the most complicated topologies on the PCB, which could cause problems at such high data-rates.
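As a cross-check of the PCIe numbers, the nominal data-rate of 15.754 GB/s quoted for the Gen.3 X16 connector can be recovered from the per-lane rate. The sketch below assumes the standard PCIe Gen.3 parameters (8 GT/s per lane and 128b/130b line encoding), which are not stated in the text; the function name is hypothetical.

```python
def pcie_throughput_gbytes(lanes, gt_per_s=8.0, data_bits=128, coded_bits=130):
    """Nominal one-direction throughput in GB/s for a PCIe link.

    Defaults correspond to PCIe Gen.3 (assumed): 8 GT/s per lane
    with 128b/130b encoding.
    """
    usable_gbps = lanes * gt_per_s * data_bits / coded_bits  # Gb/s after encoding
    return usable_gbps / 8.0                                 # bits -> bytes


x16_rate = pcie_throughput_gbytes(16)
print(f"PCIe Gen.3 x16: {x16_rate:.3f} GB/s")
```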

### Chapter 2

### Signal Integrity

Until 30 years ago most people treated Printed Circuit Boards (PCB) as totally passive devices for connecting components together. As long as the traces were connected to the correct pins, boards almost never had a negative impact on circuit performance. Traditionally, digital design was a relatively uncomplicated affair. Designers could develop circuitry operating up to 30 MHz without having to worry about issues associated with transmission line effects because, at lower frequency, the signals remained within data characterization, allowing the system to perform normally.

In recent decades this has become less and less true as frequencies of the order of GHz have been reached. Many board designers now need to worry about the parasitic elements of traces (resistance, capacitance and inductance), the interaction between individual traces, and even the interaction between traces and the outside environment. In the present era, where clock frequencies keep increasing and signal integrity problems are getting more severe, product design teams have one chance to get a product to the market; the product must work successfully the first time. If identifying and eliminating signal integrity problems is not an active priority as early in the product cycle as possible, chances are the product will not work.

This is true also for PCBs developed in High Energy Physics (HEP) experiments like the FELIX board, and this is the reason why this new board deserves to be investigated in the light of Signal Integrity (SI).

High-speed digital design, in contrast to digital design at low speeds, emphasizes the behaviour of passive circuit elements. These passive elements may include the wires, circuit boards, and integrated circuit packages that make up a digital product. At low speeds, passive circuit elements are just part of a product's packaging while at higher speeds they directly affect electrical performance. High-speed digital design studies how passive circuit elements affect signal propagation (ringing and reflections), interaction between signals (crosstalk), and interactions with the natural world (electromagnetic interference).

### 2.1 Signal Integrity

The term SI addresses two concerns in the electrical design aspects: the timing and the quality of the signal [7][8]. Whether the signal reaches its destination when it is supposed to, and if it is in good condition when it gets there are the most recurring questions. The goal of signal integrity analysis is to ensure reliable high speed data transmission. The quality of the signal needs to be maintained for the receiver in an electronic design to deliver its intended goal.

Failures in high-speed digital circuits are often not readily reproducible, and they are difficult to diagnose and fix, unlike errors at the schematic and layout stage, where it is possible to check the design using simple rules and instruments. High-speed design failures show up as failures at higher operating frequencies, increased data error rates, crosstalk errors and EMI errors. Debugging high-speed errors may require expensive instruments, for example high-bandwidth oscilloscopes, spectrum analysers and time domain reflectometers, to detect and understand the failure mechanism. Therefore, care must be taken at the design stage itself to ensure that the design complies with high-speed design rules.

Modern-day examples of high-speed signals include the DDR bus, the HyperTransport bus, USB, SATA, PCIe, Gigabit Ethernet and optical fibre transmission. High-speed design techniques must therefore be applied to PCBs like FELIX, which contain these and other high-speed signals, to ensure proper and reliable operation.

#### 2.1.1 Transmission

In a digital system, a signal is transmitted from one component to another in the form of logic 1 or 0, which is actually at certain reference voltage levels. At the input gate of a receiver, voltage above the reference value  $V_{ih}$  is considered as logic high, while voltage below the reference value  $V_{il}$  is considered as logic low. The ideal voltage waveform in the logic world would be a square waveform whereas in fact the signal often looks more like a rising wave with noise and oscillations around reference values. More complex data, composed of a string of bits 1 and 0, are actually continuous voltage waveforms and the receiving component needs to sample the waveform in order to obtain the binary encoded information. The data sampling process is usually triggered by the rising edge or the falling edge of a clock signal thus the data must arrive at the receiving gate on time and settle down to a non-ambiguous logic state when the receiving component starts to latch in. Any delay of the data or distortion of the data waveform will result in a failure of the data transmission.
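The role of the two thresholds and of the sampling instant can be illustrated with a toy receiver. This is only a sketch: the threshold values, the waveform samples and the function names are hypothetical, chosen to show how sampling too early in the bit period catches the signal before it has settled.

```python
V_IH = 0.7  # hypothetical logic-high threshold, in volts
V_IL = 0.3  # hypothetical logic-low threshold, in volts


def classify(voltage, v_il=V_IL, v_ih=V_IH):
    """Return 1 above V_ih, 0 below V_il, None in the ambiguous band."""
    if voltage > v_ih:
        return 1
    if voltage < v_il:
        return 0
    return None


def sample_bits(waveform, samples_per_bit, sample_offset):
    """Latch one sample per bit period at the given offset inside the period."""
    bits = []
    for start in range(0, len(waveform) - samples_per_bit + 1, samples_per_bit):
        bits.append(classify(waveform[start + sample_offset]))
    return bits


# Four bit periods (0, 1, 1, 0), four samples each, with slow edges.
waveform = [0.05, 0.10, 0.08, 0.06,
            0.20, 0.75, 0.95, 0.90,
            0.92, 0.88, 0.91, 0.90,
            0.60, 0.15, 0.05, 0.07]

settled = sample_bits(waveform, samples_per_bit=4, sample_offset=2)
too_early = sample_bits(waveform, samples_per_bit=4, sample_offset=0)
print(settled)    # samples taken after the edges have settled
print(too_early)  # samples taken while the edges are still in transit
```

Sampling in the middle of the bit period recovers the intended pattern, whereas sampling at the start of each period yields one wrong bit and one undefined value, which is the kind of transmission failure described above.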

#### 2.1.2 Main issues

In a broader sense, SI refers to all the problems that arise in high-speed products due to the interconnects. It is about how the electrical properties of the interconnects, interacting with the digital signal's voltage and current waveforms, can affect performance. All of these problems can basically be grouped into three categories: timing, noise and Electromagnetic Interference (EMI).

**Timing** Timing, which is a complicated field of study, is everything in a high-speed system. Signal timing depends on the delay caused by the physical length that the signal has to cover. It also depends on the shape of the waveform, which can be distorted, since this determines when the threshold is reached. In one clock cycle, a certain number of operations have to happen: this short amount of time must be divided up and allocated to the various operations. For example, some time is allocated for gate switching, for propagating the signal to the output gate, for waiting for the clock to get to the next gate and for waiting for the gate to read the data at the input. Thus timing plays a crucial role in every high-speed PCB.
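To give a feeling for the scale of the delay due to physical length, the sketch below estimates the propagation delay along a trace and compares it with the bit period (unit interval) of a 25 Gb/s link. The effective dielectric constant of 4.0 and the 100 mm trace length are assumptions picked for illustration, not values taken from the FELIX design.

```python
import math

C_MM_PER_PS = 0.299792458  # speed of light in vacuum, in mm/ps


def propagation_delay_ps(length_mm, eps_eff):
    """Delay along a trace of given length and effective dielectric constant."""
    return length_mm * math.sqrt(eps_eff) / C_MM_PER_PS


def unit_interval_ps(bit_rate_gbps):
    """Duration of one bit in picoseconds."""
    return 1000.0 / bit_rate_gbps


EPS_EFF = 4.0  # assumed effective dielectric constant (typical laminate order of magnitude)

trace_delay = propagation_delay_ps(100.0, EPS_EFF)  # hypothetical 100 mm trace
per_mm = propagation_delay_ps(1.0, EPS_EFF)         # delay added by each extra mm
ui = unit_interval_ps(25.0)                         # bit period at 25 Gb/s

print(f"delay over 100 mm: {trace_delay:.0f} ps")
print(f"delay per mm:      {per_mm:.2f} ps")
print(f"unit interval:     {ui:.0f} ps")
```

With these assumed numbers, a few millimetres of extra length already account for a sizeable fraction of a 40 ps bit period, which is why trace lengths have to be budgeted carefully.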

**Noise** There are several SI noise problems, such as ringing, ground bounce, reflections, near-end crosstalk, switching noise, non-monotonicity, power bounce, attenuation etc. All of these relate to the electrical properties of the interconnects and to how they affect the waveform of the digital signals. Some of the most frequently encountered noise problems are explained in more detail here:

• ringing is an unwanted oscillation, particularly in the step response (the response to a sudden change in input) of a voltage or current. It happens when an electrical pulse causes the parasitic capacitances and inductances in the circuit (those that are not part of the design, but due to the materials used to construct the circuit) to resonate at their characteristic frequency.

Ringing, which can be due to signal reflection, is undesirable because it causes extra current to flow, thereby wasting energy and causing extra heating of the components. It can also cause unwanted electromagnetic radiation to be emitted, it can delay arrival at a desired final state and it may cause unwanted triggering of bistable elements in digital circuits;

• reflection occurs when part of a signal transmitted along a transmission medium, such as a copper cable or an optical fibre is reflected back. This happens because imperfections in the cable cause impedance mismatches and non-linear changes in the cable characteristics that can cause some of the transmitted signal to be reflected.

Impedance discontinuities cause attenuation, distortion, standing waves, ringing and other effects because a portion of a transmitted signal will be reflected back to the transmitting device rather than continuing to the receiver, much like an echo. This effect is compounded if multiple discontinuities cause additional portions of the remaining signal to be reflected back to the transmitter.

When a returning reflection encounters another discontinuity, some of the signal rebounds in the original signal direction, creating multiple echo effects. These forward echoes strike the receiver at different intervals making it difficult for the receiver to accurately detect data values on the signal;

• crosstalk is any phenomenon by which a signal transmitted on one circuit or channel of a transmission system, like a simple copper line, creates an undesired effect in another circuit or channel. It is usually caused by undesired capacitive, inductive, or conductive coupling from one circuit or channel to another which is close to the transmitting line. This effect is closely related to the distance between the two lines, since every electrical signal is associated with a varying field, whether electric or magnetic. Where these fields overlap, they interfere with each other's signals and this electromagnetic interference creates crosstalk.

For example, if two wires next to each other carry different signals, the currents flowing in them will create magnetic fields that will induce a smaller signal in the neighbouring wire.

All of the effects listed above are related to one of the following four families of noise sources.

**Signal quality** A net can be thought in its simplest form as a series of metal wires connecting chips together, including not only the signal path but also the return path for the signal current. Signal quality on a single net depends as much on the physical features of the signal trace as on the return path. When the signal leaves the output driver, the voltage and the current, which make up the signal, see the interconnect as an electrical impedance. As the signal propagates down the net what is important is the instantaneous impedance it meets, because this can reflect and distort the signal.

For example, some features that would change the impedance include changes in the line width, changes of layer routed through a via, and the presence of a connector or branches in the path.

The impact on a signal from any discontinuity depends on the rise time of the signal; as the rise time gets shorter, the magnitude of the distortion will increase. The higher the frequency and the shorter the rise time, the more important it is to keep the impedance seen by the signal constant.
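How much of a signal is reflected at a change in instantaneous impedance can be quantified with the standard transmission-line reflection coefficient, (Z2 - Z1) / (Z2 + Z1), for a signal travelling from impedance Z1 into Z2. The sketch below evaluates it for a hypothetical 50 Ω trace meeting a 40 Ω discontinuity; the impedance values and function names are illustrative assumptions.

```python
def reflection_coefficient(z_from, z_to):
    """Fraction of the incident voltage reflected at an impedance step."""
    return (z_to - z_from) / (z_to + z_from)


def transmission_coefficient(z_from, z_to):
    """Fraction of the incident voltage transmitted past the impedance step."""
    return 2.0 * z_to / (z_to + z_from)


# Hypothetical case: a 50 ohm trace running into a 40 ohm section (e.g. a via).
rho = reflection_coefficient(50.0, 40.0)
tau = transmission_coefficient(50.0, 40.0)

print(f"reflected:   {rho:+.3f} of the incident voltage")
print(f"transmitted: {tau:+.3f} of the incident voltage")
```

A matched line gives a reflection coefficient of zero, while a negative value, as in this example, means the reflected wave is inverted with respect to the incident one.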

Another aspect of SI associated with a single net is the timing difference between two or more signal paths, which is called skew. For example, when the skew between a signal and a clock line differs from the expected value, false triggering and errors can result. Moreover, if there is skew between the two lines that make up a differential pair (such as those present in several data transmission lines on FELIX), part of the differential signal is converted into a common-mode signal and the differential signal is distorted.

**Crosstalk** As mentioned before, when one net carries a signal, some of its voltage and current can pass over to an adjacent net; even though the signal quality on the first net is perfect, some of the signal can couple over and appear as unwanted noise on the second net. Crosstalk occurs in two different environments: when the interconnects are uniform transmission lines, as in most traces on a circuit board, and when they are not uniform transmission lines, as in connectors and packages. In controlled-impedance transmission lines, where the traces have a wide uniform return path, the relative amounts of capacitive and inductive coupling are comparable. In this case, the two effects combine in different ways at the two ends of the quiet line.

**Rail collapse** Noise can affect not only signal paths, but also the power and ground distribution network that powers each chip. When the current through the power and ground paths changes, as when a chip switches its outputs or its gates switch, there is a voltage drop across the impedance of the power and ground paths. This voltage drop means less voltage between the power and ground rails.

In high-performance processors, like the FPGA mounted on FELIX, the trend is towards lower power supply voltages but higher power consumption. This is primarily due to more gates on a chip switching faster. In each cycle a certain amount of energy is consumed: when a chip switches faster, the same energy is consumed in each cycle but more often, leading to a higher average power consumption. These factors combine so that higher currents switch in shorter amounts of time, while the amount of noise that can be tolerated decreases. As the drive voltages decrease and the current levels increase, any voltage drop associated with rail collapse becomes a bigger and bigger problem.
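The trend described above can be made quantitative with the commonly used target-impedance rule of thumb, in which the impedance of the power and ground paths must stay below the tolerated ripple voltage divided by the transient current. The following sketch uses hypothetical supply values, not FELIX specifications, and the function name is an assumption.

```python
def target_impedance(vdd: float, ripple_fraction: float, transient_current: float) -> float:
    """Maximum power-path impedance (ohm) that keeps the rail droop within the ripple budget."""
    return vdd * ripple_fraction / transient_current


# Hypothetical numbers (assumption): an older 1.8 V rail versus a newer 0.9 V rail
# that draws four times the transient current, both with a 5% ripple budget.
older = target_impedance(vdd=1.8, ripple_fraction=0.05, transient_current=5.0)
newer = target_impedance(vdd=0.9, ripple_fraction=0.05, transient_current=20.0)
print(f"older rail: {older * 1e3:.2f} mOhm, newer rail: {newer * 1e3:.2f} mOhm")
```

Halving the supply voltage while quadrupling the current reduces the tolerable impedance by a factor of eight, which is the sense in which rail collapse becomes harder to control.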

**EMI** EMI, also called Radio Frequency Interference (RFI) when in the radio frequency spectrum, is a disturbance generated by an external source that affects an electrical circuit by electromagnetic induction, electrostatic coupling, or conduction. The disturbance may degrade the performance of the circuit or even stop it from functioning. With clock frequencies in the hundreds of MHz range, the first few harmonics fall within the common communication bands of TV, FM radio, cellphones and other communication devices. This means there is a concrete possibility of electronic products interfering with communications unless their electromagnetic emissions are kept below acceptable levels. These considerations are also valid in experiments with a high density of electronic components, such as those that will be present in HL-LHC. Unfortunately, EMI gets worse at higher frequencies: the radiated field strength from common currents increases linearly with frequency, and that from differential currents increases with the square of the frequency, posing a hard challenge to SI.

#### 2.1.3 SI methodology

SI simulations and analyses take place several times during the development of a new PCB, and they are usually divided into pre-layout analysis and post-layout analysis. The most important thing one can do to enable success with simulation is to define a repeatable process with milestones. Whether designing and simulating on one's own, or as part of a large design team as in the big collaborations of physics experiments, it is important to have a strategy that enables an efficient process. The sooner in the design cycle decisions are taken the better, in order to have more options and minimize reworking. Therefore, it is preferable to simulate as early in the design process as possible. An efficient design flow including simulation comprises:

- starting to simulate in the concept and schematic phase of the design. This is the phase where one can use simulation to pick technologies, drive strength and termination, define the topology and so on. It is where design constraints based on actual simulation, rather than on rules of thumb, can be developed. Doing so ensures that the design rules are conservative enough without being overly constraining, enabling a fast layout;
- passing design constraints based on simulation results to the layout once the design is in the placement phase. It is also useful to find ways to automate this process in the design flow;
- taking the board that is being simulated from the schematic editor into the PCB editor several times throughout the design process. One can start simulating from placement, then simulate again after some critical nets are routed, and at the end run a final post-route verification simulation. To find problems as soon as possible, it is suggested not to wait until the board is complete to run post-route verification;
- exporting the affected nets to the schematic tool when failures are found in the simulation, so that the net characteristics can be changed to find a solution. Once the solution is found, one can implement the change in the schematic or layout, and then take the design back to the PCB editor for final verification.

This simulation methodology is intended to guide the designer through the typical steps of simulation and to provide tips and resources along the way. As a concrete example of methodology, the following is the work-flow adopted by the Mentor Graphics (now Siemens) HyperLynx PCB analysis and verification tool.

In the first stage, the pre-layout analysis, the first step is to create a new free-form schematic, essentially the representation of the electrical connections with proper symbols and names of the board circuits.

The subsequent stage is to define the stack-up of the project; current PCBs typically have several layers (24 in the case of FELIX) of different materials and dimensions, which determine their impedance and other parameters.

Next, the routing constraints, based upon the actual stack-up and technologies, are passed to the layout before translating it into a PCB. Once the PCB has been rendered with a proper tool, the post-layout analysis can take place. As a first step, the stack-up, the power supplies, the crosstalk thresholds and other parameters have to be checked; then, net by net and component by component, one assigns a model to each Integrated Circuit (IC).

By assigning a model, often supplied by the part manufacturer, one specifies how that specific IC behaves over a wide range of conditions. Basically, the model contains the results of simulations of the response of the component to a series of stimuli.

Once the models of the ICs on the net of interest have been chosen, the actual simulation (which is explained in detail in the next chapter) can start.

From the analysis of the simulation results one can study all the possible issues treated before, look at their behaviour and see how they are related to the variation of the parameters and topology of the nets. In case a net, or a whole part of the PCB, needs to be optimised, its scheme is brought back to the free-form schematic editor, where it is possible to make crucial changes in order to obtain better performance.



Figure 2.1: Schematic work-flow of SI simulation phases as presented in HyperLynx SI: Basic Signal Integrity Methodology resources.

The area of interest of my thesis work has been the post-layout analysis, as the first FELIX Phase-II boards commissioned to Link Engineering had already been produced. In this sense, the simulations which I have produced with final PCB layout could be compared with real on-board tests done with oscilloscopes and physical connection to transmission lines.

### Chapter 3

## Simulation with HyperLynx

As mentioned in the previous chapter, whenever it comes to realising a high-performance PCB with high-speed digital design, the SI plays a crucial role in driving the decisions. The state of the art of the FELIX project, guided by Brookhaven National Laboratory and ATLAS collaboration at CERN, is the FELIX Phase-II board presented in this thesis and it has been produced by the Link Engineering company of S. Giovanni in Persiceto (BO).

Together with the contribution of the INFN section of Bologna, and the expertise provided by Link Engineering Chief Technical Officer Luca Pelliccioni and the Signal and Power Integrity Analysis Specialist Roberto Carretta, I have performed SI simulations on the FELIX board with the aim of validating and optimising its functioning. This is the first time that such considerations and studies of signal quality have been carried out on a high-performance board of a wide collaboration like this.

### 3.1 Groundwork

The SI simulation that I have conducted is part of the post-layout analysis of the project, and the intent of this work is to provide feedback, based on simulation results, on the reliability of some FELIX transmission lines. In particular, the channels of interest that I have chosen to take into account are the PCIe connections (8 Gb/s per lane) and the transmission lines coupled to the optical fibre (25 Gb/s per channel).

These two kinds of transmission line might be affected by some or many of the main issues introduced in the second chapter, and it is therefore worth investigating how they behave in a typical data-transfer situation that can be reproduced with the HyperLynx tools.
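To give a sense of the time and frequency scales involved, the sketch below converts the two data rates into a unit interval and a fundamental (Nyquist) frequency. It assumes NRZ signalling, for which the fastest alternating bit pattern has a fundamental at half the bit rate; the function names are illustrative and not part of any tool.

```python
def unit_interval_ps(bit_rate_gbps: float) -> float:
    """Duration of one bit (unit interval) in picoseconds."""
    return 1000.0 / bit_rate_gbps


def nyquist_frequency_ghz(bit_rate_gbps: float) -> float:
    """Fundamental frequency of an alternating 0101... NRZ pattern (assumption: NRZ)."""
    return bit_rate_gbps / 2.0


for name, rate in (("PCIe lane", 8.0), ("optical link", 25.0)):
    print(f"{name}: UI = {unit_interval_ps(rate):.0f} ps, "
          f"Nyquist = {nyquist_frequency_ghz(rate):.1f} GHz")
```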

#### 3.1.1 Models

In order to describe and study the behaviour of these nets accurately, one must consider that they cannot simply be reduced to classical lumped circuits. At high frequencies the well-known Kirchhoff laws are no longer suited to this task, because circuits behave like transmission lines, meaning that electromagnetic waves propagate along the circuit. This is the reason why we introduce models to describe circuits in SI simulations.

**S-parameters** Scattering parameter models describe the behaviour of a network when stimulated by electrical signals in a steady state. The term scattering comes from optical engineering, where it refers to the effect observed when a plane electromagnetic wave is incident on an obstruction or passes across dissimilar dielectric media. In the same way, many electrical properties of networks of components (inductors, capacitors, resistors) may be expressed using S-parameters, such as gain, return loss, reflection coefficient etc. In the context of S-parameters, scattering refers to the way in which the travelling currents and voltages in a transmission line are affected when they meet a discontinuity caused by the insertion of a network into the transmission line.

In the S-parameter approach, an electrical network is regarded as a black box containing various interconnected basic electrical circuit components or lumped elements which interacts with other circuits through ports. The network is characterized by a square matrix of complex numbers called its S-parameter matrix, which can be used to calculate its response to signals applied to the ports. The S-parameter matrix describing an N-port network will be square of dimension N and will therefore contain  $N^2$  elements.

S-parameters then describe the way an interconnect affects an incident signal and the ends where signals enter or exit in a Device Under Test (DUT) are called ports. Each S-parameter is the ratio of a sine wave scattered from the DUT at a specific port, to the sine wave incident to the DUT at a specific port. For all linear, passive elements, the frequency of the scattered wave will be exactly the same as the incident wave; the only two qualities of the sine wave that can change are the amplitude and phase of the scattered wave. The magnitude of an S-parameter is the ratio of relative amplitudes of the two sine waves, which is often described in dB through:

$$S_{dB} = 20 \times \log_{10}(S_{mag}) \tag{3.1}$$

where  $S_{dB}$  is the value of the magnitude in dB and  $S_{mag}$  is the value of the magnitude as a number. The phase of an S-parameter is instead simply the phase of the output wave minus the phase of the input wave.
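As a small worked example of Equation 3.1, the sketch below converts a complex S-parameter into its magnitude in dB and its phase in degrees. The function names and sample values are illustrative assumptions.

```python
import cmath
import math


def s_to_db(s: complex) -> float:
    """Magnitude of an S-parameter in dB, following Equation 3.1."""
    return 20.0 * math.log10(abs(s))


def s_phase_deg(s: complex) -> float:
    """Phase of the scattered wave relative to the incident wave, in degrees."""
    return math.degrees(cmath.phase(s))


# Illustrative values (assumption): full, one tenth and one half of the incident amplitude.
for s in (1.0 + 0.0j, 0.1 + 0.0j, 0.5j):
    print(f"S = {s}: {s_to_db(s):.2f} dB, phase = {s_phase_deg(s):.1f} deg")
```

A magnitude of 1 corresponds to 0 dB, and every factor of ten in amplitude corresponds to 20 dB.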

For multiple, coupled transmission lines, like differential channels, the port assignment scheme sets the odd ports on the left of a transmission line and the even ports on the right; with the S-parameter first index referring to the output port, while the second index is the input port. The S-parameter matrix for the 2-port network is probably the most commonly used and serves as the basic building block for generating the higher order matrices for larger networks.



Figure 3.1: Model representation of a 2-port network like those present in PCIe channels.

The matrix in this case is:

$$\begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \tag{3.2}$$

where

$$a_i = \frac{1}{2}k_i(V_i + Z_iI_i) \tag{3.3}$$

$$b_i = \frac{1}{2}k_i(V_i - Z_i^* I_i) \tag{3.4}$$

$$k_i = \frac{1}{\sqrt{|\mathcal{R}\{Z_i\}|}} \tag{3.5}$$

with  $Z_i$  the impedance for the i-th port, and  $V_i$  and  $I_i$  respectively the complex amplitudes of the voltage and current at port i. The 2-port S-parameters have the following generic descriptions:

- $S_{11}$  is the input port voltage reflection coefficient;
- $S_{12}$  is the reverse voltage gain;
- $S_{21}$  is the forward voltage gain;
- $S_{22}$  is the output port voltage reflection coefficient.
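The role of each element can be made concrete by applying Equation 3.2 to a single incident wave. The sketch below uses a made-up, symmetric 2-port matrix (an assumption, not FELIX data) and a hypothetical helper `scatter`: driving only port 1 returns $S_{11}$ as the reflected wave and $S_{21}$ as the transmitted one.

```python
def scatter(s_matrix, a):
    """Return the scattered waves b = S a for an N-port network."""
    return [sum(s_ij * a_j for s_ij, a_j in zip(row, a)) for row in s_matrix]


# Made-up symmetric 2-port (assumption): small reflection, large transmission.
S = [[0.1 + 0.0j, 0.9 + 0.0j],
     [0.9 + 0.0j, 0.1 + 0.0j]]

a = [1.0 + 0.0j, 0.0j]   # unit wave incident on port 1, nothing on port 2
b = scatter(S, a)
print("reflected at port 1:", b[0])     # equals S11
print("transmitted to port 2:", b[1])   # equals S21
```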

For historical reasons, the magnitude of the reflected S-parameter  $S_{11}$  is called return loss, a measure of what is returned to the incident port and lost to the transmitted signal, and the magnitude of the transmitted S-parameter  $S_{21}$  is called insertion loss, a measure of what is lost from the signal when the interconnect is inserted between the two ports of a network analyser. When the interconnect is symmetrical from one end to the other, the return losses  $S_{11}$  and  $S_{22}$  are equal, while in an asymmetric 2-port interconnect they are different. In general, calculating by hand the return and insertion loss from an interconnect line is complicated; it depends on the impedance profile and time delays of each transmission line segment that makes up the interconnect and the frequencies of the two sine waves.

An example of two of the four S-parameters for the 8 Gb/s PCIE\_TX0 net on the FELIX board is reported here.



Figure 3.2: Return loss (green) and insertion loss (yellow) magnitudes of the PCIE\_TX0 channel on the FELIX board. The analysed spectrum ranges from 0 to 16 GHz to study the behaviour of the main harmonics composing the signal.

If the modelled connection were an ideal transparent interconnect, the reflected signal, namely the return loss parameter  $S_{11}$ , would be small (strongly negative on this dB scale). However, the plot shows a return loss that approaches 0 dB as the frequency increases; indeed, for a fully reflected signal the argument of the logarithm in Equation 3.1 would be exactly 1 and thus  $S_{11} = 0$  dB. It is evident that at high frequencies the transmission is not optimal, and a possible cause is an impedance mismatch between the input and the output of the channel. Moreover, there are ripples at certain frequencies that are symptoms of signal reflections.

On the other hand, the insertion loss parameter  $S_{21}$  exhibits a decreasing behaviour as the frequency grows, providing a strong hint of signal degradation: the more the impedance mismatch increases, the more negative  $S_{21}$  becomes. The sharp negative peak in the insertion loss plot, near 7 GHz, is a clear warning sign that something is badly affecting the data line; a deeper analysis and changes to the net design are worth considering.

**IBIS** Input/output Buffer Information Specification (IBIS) models, instead of describing passive interconnect, are used as a behavioural model that describes the electrical characteristics of the digital inputs and outputs of active devices through V/I and V/T data without disclosing any proprietary information. These models do not correspond to the conventional idea of a model such as a schematic symbol or polynomial expression, among others.

An IBIS model consists of tabular data made up of current and voltage values at the output and input pins, as well as the voltage and time relationship at the output pins under rising or falling switching conditions. This tabulated data represents the behaviour of an active device such as the FPGA present on the FELIX board. IBIS models are intended to be used in SI analysis to foresee fundamental signal integrity concerns in the transmission line that connects different devices. Potential problems that can be analysed by means of these simulations include the amount of energy reflected back to the driver from the wave that reaches the receiver, due to mismatched impedance in the line; crosstalk; ground and power bounce; overshoot; undershoot etc. IBIS is an accurate model since it takes into account the non-linear aspects of the I/O structures, the Electrostatic Discharge (ESD) structures, and the package parasitics.

**IBIS-AMI** While IBIS models are suited to describe the analogue behaviour of ICs, the IBIS Algorithmic Modeling Interface (IBIS-AMI) is a modelling standard for Serialiser/Deserialiser (SERDES) devices that enables fast, accurate, statistically significant simulation of multi-gigabit serial links. IBIS-AMI was developed by a consortium of Electronic Design Automation (EDA), semiconductor and systems companies. Before its introduction, system designers faced significant limitations when performing serial link simulations. First, traditional analysis was slow and could not simulate the millions of bits needed to accurately predict link operating margins. Second, open-source statistical analysis tools could simulate many millions of bits but could not accurately model a specific semiconductor vendor's device Intellectual Property (IP). Third, proprietary semiconductor vendor tools accurately model vendor IP and simulate millions of bits, but cannot be used when devices from different semiconductor vendors sit at each end of the link.

The IBIS-AMI specification was developed with the following specific goals: interoperability, meaning that models from different semiconductor vendors can work together; transportability, so that the same model runs in different IBIS-AMI simulators; performance, so that simulations with millions of bits can run in 10 minutes or less; flexibility, for models to support both statistical and time-domain simulation; and the already mentioned IP protection, so that models cannot be reverse-engineered and semiconductor vendors control which details are exposed to the user. In particular, the IBIS-AMI specification defines the required interface and communication protocols between a SERDES model and an EDA tool, and allows the designer to successfully simulate a SERDES model irrespective of the EDA tool. It enables executable, software-based, algorithmic models to work together with traditional IBIS circuit models, and allows SERDES adaptive equalization algorithms to be modelled and used during channel simulation. The intent behind IBIS-AMI models is to have plug-and-play simulation compatibility between SERDES models from different suppliers, in a standard commercial EDA format.

As an example, an application of all the mentioned models to the PCIE\_TX0 net is reported here; as anticipated in Figure 3.2, this is a potentially critical line.



Figure 3.3: Schematic model of the PCIE\_TX0 net. At the two opposites of the net are placed the models for the driver (FPGA) and the receiver (PCIE connector), the block J9 in the middle is the model describing the net itself, while the other blocks describe the package and channel S-parameters of the driver and receiver.

### 3.2 Results

A powerful tool like the HyperLynx Serialiser/Deserialiser (SERDES) analyser can test the transmission of information encoded in bits over a net. These bits are nothing but high and low voltage states, given a reference level, in the time domain of the clock. Each transmission protocol has its own frequency and encoded bit-words, but they all have the eye diagram in common. The eye diagram is a plot obtained by superimposing the switching states of a transmission line, and it is a clear and visible indicator of the presence of signal degradation.

The following are, for example, the eye diagrams of some of the most common data-transfer protocols.



Figure 3.4: Different kinds of eye diagrams with relative transmission types, each diagram has its own shape depending on the data-type and data-rate.

In this kind of plot, each SI issue such as noise, crosstalk or signal reflection produces evident modifications to the ideal eye diagram: for example, a symptom of the bad quality of a channel is a very closed eye, with a large amount of noise and jitter present on both the voltage and the time scales. In these cases the eye mask helps to quantify the magnitude of the degradation of a signal: if the eye diagram stays clear of the borders of a defined mask, then the line passes the test.
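The construction of an eye diagram and a mask check can be sketched in a few lines. The example below is a deliberately simplified illustration and not the HyperLynx procedure: the waveform is synthetic, the helper names are hypothetical, and the mask is reduced to a single minimum vertical opening at the centre of the unit interval, whereas real masks are two-dimensional.

```python
def fold_into_eye(samples, samples_per_ui):
    """Cut a sampled waveform into consecutive one-UI traces to be overlaid."""
    n_traces = len(samples) // samples_per_ui
    return [samples[k * samples_per_ui:(k + 1) * samples_per_ui]
            for k in range(n_traces)]


def eye_height(traces, threshold=0.0):
    """Vertical opening at the centre of the UI: lowest 'high' minus highest 'low'."""
    centre = len(traces[0]) // 2
    highs = [t[centre] for t in traces if t[centre] > threshold]
    lows = [t[centre] for t in traces if t[centre] <= threshold]
    if not highs or not lows:
        raise ValueError("need both logic levels to measure the eye height")
    return min(highs) - max(lows)


def passes_mask(traces, mask_height, threshold=0.0):
    """Simplified mask test (assumption): the opening must exceed the mask height."""
    return eye_height(traces, threshold) > mask_height


# Synthetic waveform (assumption): +/-0.4 V levels with a deterministic amplitude loss.
bits = [0, 1, 1, 0, 1, 0, 0, 1]
samples_per_ui = 4
samples = []
for k, bit in enumerate(bits):
    level = (0.4 if bit else -0.4) * (1.0 - 0.1 * (k % 3))
    samples.extend([level] * samples_per_ui)

traces = fold_into_eye(samples, samples_per_ui)
print("eye height [V]:", round(eye_height(traces), 3))
print("passes 0.5 V mask:", passes_mask(traces, mask_height=0.5))
```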



Figure 3.5: Ideal eye diagram respecting the borders of the applied mask.

Another important diagnostic tool commonly used in SI analysis is the bathtub curve. Eye diagrams are created with sampling oscilloscopes and consist of multiple traces of data bits triggered by a bit clock, superimposed in persistence mode to show the envelope of amplitude and timing fluctuations. Bathtub curves, also referred to as BERT scans, are instead usually created with a Bit Error Ratio Tester (BERT), which measures the Bit Error Rate (BER).

A BERT generates data to pass through a DUT, then measures the transmitted data and checks it for errors, thus determining the BER. As the measurement location is swept across a Unit Interval (UI), a plot of the BER as a function of the position within the UI is constructed. This plot typically resembles the cross section of a bathtub, hence the name bathtub curve. A histogram of the timing location of data transitions passing a voltage threshold can be compiled. Such a histogram can be acquired with an equivalent-time sampling oscilloscope; however, the acquisition rate is generally slow and a very long time is required to achieve a high statistical count. The preferred method for histogram acquisition is to use a time interval analyser. In a short time (several seconds), histogram measurements capture the most probable timing locations of data transitions; this histogram is also known as a jitter histogram. If the histogram is rescaled such that its integral is unity, it becomes the Probability Density Function (PDF) of the jitter.

Once the complete jitter PDF is determined from the jitter histogram, the corresponding bathtub curves can be easily found. Recall that the range of the bathtub curve is the dimensionless BER, which is the probability of bit errors. The integral of the complete jitter PDF gives the probability of bit errors due to timing. Thus, the bathtub curve of timing errors is simply an integral, namely the cumulative distribution function (CDF), of the jitter PDF.
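The link between jitter PDF and bathtub curve can be illustrated under a strong simplifying assumption: purely Gaussian random jitter of equal width on the two crossings that delimit the unit interval, with a transition density of one half. In that case the integral of the PDF tails is the Gaussian Q-function. The sketch below is not the algorithm used by HyperLynx, and all names and numbers are illustrative.

```python
import math


def q_function(x: float) -> float:
    """Tail integral of a unit Gaussian PDF from x to infinity."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bathtub_ber(t_ui: float, sigma_ui: float, transition_density: float = 0.5) -> float:
    """BER at sampling position t_ui (0..1 UI) for Gaussian jitter of width sigma_ui."""
    left = q_function(t_ui / sigma_ui)            # edge at 0 UI arriving late
    right = q_function((1.0 - t_ui) / sigma_ui)   # edge at 1 UI arriving early
    return transition_density * (left + right)


def eye_opening_ui(sigma_ui: float, target_ber: float = 1e-12, steps: int = 1000) -> float:
    """Horizontal eye opening (in UI) at the target BER, found by scanning the UI."""
    points = [i / steps for i in range(steps + 1)]
    good = [t for t in points if bathtub_ber(t, sigma_ui) <= target_ber]
    return (max(good) - min(good)) if good else 0.0


# Illustrative jitter widths (assumption), expressed as a fraction of the UI.
for sigma in (0.03, 0.05, 0.20):
    print(f"sigma = {sigma:.2f} UI -> opening at BER 1e-12: {eye_opening_ui(sigma):.3f} UI")
```

The curve is highest at the two edges of the unit interval and lowest at its centre, which produces the bathtub shape; increasing the jitter width narrows the opening until the eye closes at the chosen BER.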



Figure 3.6: Illustration of relationship between eye diagram, Jitter PDF, and bathtub curve.

Having assigned the right models for the nets, the drivers and the receivers and their relative packages, I have run the first simulations without any substantial operation on the board design, apart from enabling the trace coupling and having removed non-functional pads. This is what I have achieved in the very first launch.



Figure 3.7: Eye density plots for the 25 Gb/s TX61 (b) and RX61 (c) channels from the FPGA to the electro-optical transceiver (a).



Figure 3.8: Eye density plots for the 8 Gb/s PCIE\_TX0 (b) and PCIE\_RX0 (c) transmission lines from the FPGA to the PCIe connector (a).

### Chapter 4

## Increasing performances

What is observed is that three out of four simulations are successful, meaning that the model description and the design itself are precise and operate well enough to pass the integrity test. The transmitting net from the FPGA to the PCIe connector, instead, suffers from a large amount of noise: the eye diagram is almost completely closed and its shape is undefined, with switching states varying in time and voltage. There are two possible main reasons: either the model description of the interconnect is not accurate enough, or the design of the channel and its surroundings can be enhanced by varying its geometry and other parameters. This chapter is devoted to the analysis of this particular poorly performing case.

As we have seen up to now, the success of a simulation relies on a host of considerations and influencing factors that complicate its accomplishment. A first distinction can be made among the factors that have an impact on the outcome: on one side there is the requirement to simulate a situation as close as possible to reality, providing the right modelling of all the parts constituting a network; on the other side there is a wide range of parameters in the PCB design that can be changed and adjusted while developing the project.

In the first instance, I have focused on the modelling part of the net; indeed, there are some critical areas involving other components on the board, which may interfere with the data transmission and deserve a deeper investigation. In particular, the region near the pins connecting the FPGA to the first layer and the area surrounding the capacitor on the PCIE\_TX0 lines can degrade the quality of the signal; this is why I have decided to implement 3D models of these areas in the simulation.



Figure 4.1: Planar and three-dimensional view of a capacitor in the PCIE\_TX0 line.

By doing so for both the critical zones identified with the 3D Area Manager, I have included the effect of these portions of PCB into the net schematic and a second simulation has been run producing the following eye density plot.



Figure 4.2: Eye density plot of the PCIE\_TX0 channel after implementing in the net description the 3D models for critical areas.

Even though the diagram still presents some evident distortions and a non-sharp shape, this time the mask representing the minimum transmission requirements is respected, and therefore the test can be regarded as passed. Still, the definition of the plot is not very good and the margin left between the eye opening and the mask is small; thus, if we want to widen the opening further, it is recommended to keep working on the net.

### 4.1 Parameters

This is where design parameters come into play, especially the ones that have a physical impact on the layout of the FELIX board. We can distinguish between those that are strictly related to the geometry of the project, including the chosen components and ICs, and those that are related to the algorithmic part of the simulation.

#### 4.1.1 Geometrical

Geometrical parameters include: the topology of the net (namely how it is traced across the board), the dimension and relative position of vias and passive components, the thickness of layers and their composition depending on their function (signal, power, structural etc.) and other characteristics like the proximity of the traces that could experience crosstalk.



Figure 4.3: Backdrilling applied to a via connecting the signal net on the first layer (L1\_Top in red) and the net on the third layer (L3\_Signal1 in purple). By doing this the unused part of the via from the last layer to the third one is fully removed.

An important role is played by the stubs, which are the remaining unused parts of vias. Since vias are basically conducting holes connecting traces on different layers, the residual part of the barrel can act as a resonant circuit when excited at specific frequencies. Hence a good choice when possible, as in the case of my third simulation, is to fully remove the stubs. With the backdrilling technique the vias are drilled out from the unused side, to prevent the stub from absorbing some of the energy of the signal, which would inevitably lead to a degradation.
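A rough estimate of the frequency at which a stub becomes critical is given by the widely used quarter-wavelength approximation, $f \approx c / (4 L \sqrt{\varepsilon_r})$, where $L$ is the stub length and $\varepsilon_r$ the relative permittivity of the surrounding dielectric. The sketch below applies it to hypothetical stub lengths and a hypothetical permittivity; these are not measured FELIX values, and the approximation neglects pad and anti-pad capacitance.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def stub_resonance_ghz(stub_length_mm: float, eps_r: float) -> float:
    """Quarter-wave resonance frequency (GHz) of an open via stub."""
    length_m = stub_length_mm * 1e-3
    return C0 / (4.0 * length_m * math.sqrt(eps_r)) / 1e9


# Hypothetical stub lengths in a dielectric with eps_r = 4.0 (assumption).
for length_mm in (0.5, 2.0, 4.0):
    print(f"{length_mm:.1f} mm stub -> about {stub_resonance_ghz(length_mm, 4.0):.1f} GHz")
```

Shortening the stub, which is what backdrilling achieves, pushes the resonance to higher frequencies and further away from the main harmonics of the signal.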

#### 4.1.2 Configuration

The second family of parameters comprises those that can be tuned just before performing the simulation: for example, the pre-emphasis and de-emphasis of transmitters and receivers, the introduction of noise and jitter to condition the signal and simulate worse scenarios, and other parameters related to the transmission protocol itself, such as the word length, the number of bits, the data rate etc. of the line under study.

### 4.2 Proposed solution

Out of all these possibilities I have chosen to work with those that could have a major effect on the simulation, namely the geometrical parameters. In particular, I have applied backdrilling to the crucial vias that were spotted thanks to the 3D Area Manager, to reduce their resonance effects. Moreover, I have edited the stack-up of the FELIX board by redefining the structure of the layers corresponding to the PCIe connector, reducing the number of layers from 24 to 14, which are the ones actually present in that region of the realised board. While doing so, I have also changed the permittivity and the loss tangent of the dielectric layers that surround the data transmission traces, thus simulating a different material.
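The influence of permittivity and loss tangent on the signal can be estimated with the standard low-loss expression for dielectric attenuation of a TEM wave, $\alpha_d = \pi f \sqrt{\varepsilon_r} \tan\delta / c$ (in Np/m). The sketch below compares two sets of purely illustrative material parameters; they are assumptions chosen to represent a standard and a low-loss laminate, not datasheet values of the materials used on FELIX, and conductor losses are ignored.

```python
import math

C0 = 299_792_458.0                   # speed of light in vacuum, m/s
NEPER_TO_DB = 20.0 / math.log(10.0)  # about 8.686 dB per neper


def dielectric_loss_db_per_m(freq_hz: float, eps_r: float, loss_tangent: float) -> float:
    """Dielectric attenuation in dB/m for a low-loss homogeneous dielectric."""
    alpha_np = math.pi * freq_hz * math.sqrt(eps_r) * loss_tangent / C0
    return alpha_np * NEPER_TO_DB


# Illustrative parameters (assumption), evaluated at 4 GHz.
standard = dielectric_loss_db_per_m(4e9, eps_r=4.2, loss_tangent=0.02)
low_loss = dielectric_loss_db_per_m(4e9, eps_r=3.6, loss_tangent=0.004)
print(f"standard laminate: {standard:.1f} dB/m, low-loss laminate: {low_loss:.1f} dB/m")
```

Since the attenuation scales linearly with the loss tangent, a material with a lower loss tangent directly reduces the high-frequency degradation of the signal.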



Figure 4.4: Three-dimensional proportioned view of the edited stack-up of the FELIX board: in this configuration the PCIe connector region is treated separately from the main board (master).

The chosen material is MEGTRON 6, a high performance material that ensures less degradation of the signal. With these additional choices, I have launched a third and final simulation on the PCIE\_TX0 and this is the resulting eye density plot.





Figure 4.5: Evolution of the eye density plot of the critical PCIE\_TX0 connection.

By comparing the eye density plots obtained with the different simulation layouts, the large improvement in signal quality is evident, which supports the choices that I have made in the proposed solution.

These results are supported also by the following plots of eye diagram contours and BER bathtub curves displaying the improvement in SI simulations of the PCIE\_TX0 line.



Figure 4.6: Eye contour and bathtub curve, respectively in Voltage and BER scale, of the 8 Gb/s PCIE\_TX0 transmission line from the FPGA to the PCIe connector.

### Chapter 5

## Conclusions

Research in HEP always poses significant challenges to the technical infrastructures involved in the process of unveiling the undiscovered features of the Universe. In particle physics experiments, new technologies have to cope with more and more sophisticated and complicated requirements every day. In this perspective, the largest particle collider ever built, the LHC, will keep being upgraded and improved in its detection capability. As a matter of fact, by the end of Run-3 in 2024 the HL-LHC installation will take place, and the detector set-up, with its trigger and data acquisition chain, will be renewed to achieve a nominal luminosity greater than the current one by at least a factor of five.

In particular, a new inner detector, the ITk, will be placed inside the ATLAS detector and, together with the calorimeters and the muon system, a new TDAQ scheme based on the FELIX board will be employed.

To conclude, the work presented in this thesis is the first of its kind, intended to study and validate the design of high-performance DAQ boards in HEP experiments. It is the result of a wide collaboration, first of all between BNL and CERN for the project, and between Link Engineering and the INFN section of Bologna, respectively for the hardware and software support.

The FELIX board studied in this work is a candidate to become a fundamental part of the new TDAQ chain being developed for the Phase-II upgrade of ATLAS. Since the PCB will play a major role in data communication, considerable effort is dedicated to ensuring its reliability, so that its features can be exploited to their full potential.

When dealing with high-frequency designs, it is essential to pay close attention to all potentially critical channels and to the related signal integrity issues that could degrade the transmission. The investigation carried out in this thesis aimed to check whether problems of this sort were present and, if so, to overcome them by proposing modifications to the design. Although the results represent only one part of the wider quality assurance study required to fully validate a complex PCB such as the 24-layer FELIX board, the SI knowledge and the experience that I have acquired with the HyperLynx software can be applied, and the analysis reproduced, on all the crucial nets of the project, providing essential support to its development.

### Acknowledgements

First of all, I would like to thank Professor Alessandro Gabrielli, who has accompanied me in my studies since my Bachelor's degree and for whom I have deep and sincere esteem, for the dedication and personal commitment he has shown in helping me to carry out this thesis work as well. Together with him, I thank Fabrizio for the valuable advice he has given me over the years, between one chat and another, in the electronics laboratory in Berti Pichat. I also thank Davide Falchieri of the INFN section of Bologna for the technical support he provided.

Thanks also to Luca Pelliccioni and especially to Roberto Carretta of Link Engineering, for the opportunity I was given to carry out this activity and for their professionalism and expertise.

Thanks to all my Bachelor's and Master's classmates, to all those who have been an integral part of my university life, and the best of luck to those who have almost reached the end of their own path.

A boundless and brotherly thank you to White, Alfa, Jack, Comby and Fede of the Quark Family for all the laughter, the struggles, the curses and the joys shared together, for every toast to our friendship, for every improvised lunch and for every set of notes exchanged.

Special thanks to my flatmates of these years, who brightened my life as a student away from home, in particular to those of LLZ8, to Davide, to my adopted brother Luca and to Giulia, a precious piece of my life.

Thanks to all the friends who welcomed me in Bologna and joyfully chose to share their time with me, thanks to everyone in Team Bombardona, Gianca, Chiaps, Fenno, Fox, Ciccio, Pio, Beppe, Grox, Gigi, Nico, Paolino, Tommy and Gala, and to those of TGIF, Dave, Pizzio, Edo, Enri, Fillo, Franci, Marti, Samu, Silvia and Vale.

A special, heartfelt thank you, the most important of all, to my Family for all the love, support and trust I have received along this long path of educational and personal growth. Dad, Mum and Asia, you are my true strength.

## Appendix A

### FPGA

Field Programmable Gate Arrays (FPGAs) are reprogrammable integrated circuits (ICs), internally made of an array of reconfigurable logic subdivided into blocks. They are flexible semiconductor devices offering hardware-timed speed and reliability and, by their nature, they operate in parallel: different processing operations do not have to compete for the same resources, so the performance of one part is not affected by the addition of another. FPGAs embed a two-dimensional array of programmable logic blocks and a hierarchy of reconfigurable interconnections that connect them. An FPGA contains millions of gates, and its main components are the Configurable Logic Blocks (CLB), which include the digital logic with its inputs and outputs; the Programmable Interconnects (PI), which route signals between the logic blocks to implement the user logic; and the Input/Output (I/O) Blocks, which connect the FPGA to the outside world. These devices have long been used in DAQ systems such as FELIX, since they are extremely versatile in terms of the technologies they can interface with (optical fibres, coaxial cables, ADCs, etc.) and very powerful, thanks to the possibility of embedding a whole processor inside their programmable logic. Figure A.1 shows a general scheme of the FPGA structure.



Figure A.1: FPGA structure design, with the internal component blocks: the Configurable Logic Blocks (CLB), the Programmable Interconnects (PI) and the I/O Blocks.

In general, a logic block consists of a few logic cells; when a circuit design is mapped onto the internal components, these cells, which are regarded as resources, are connected together. To obtain an efficient design, the internal resources should not be pushed to their limits. Finally, routing in an FPGA relies on wire segments of varying lengths, which can be interconnected through electrically programmable switches. Figure A.2 shows a basic cell of a CLB.



Figure A.2: Example of an FPGA unit cell: the main sub-parts are the Lookup Table (LUT), a D-type Flip-Flop (FF) and a Multiplexer (MUX).
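The behaviour of such a unit cell can be mimicked in a few lines of Python. This is a toy behavioural model, not vendor code: the class name `LogicCell`, its fields and the choice of letting the MUX select between the LUT output and the FF output are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LogicCell:
    """Toy model of an FPGA logic cell: n-input LUT, D flip-flop, output MUX."""
    lut_bits: List[int]          # truth table with 2**n entries
    use_register: bool = False   # MUX select: True = FF output, False = LUT output
    q: int = 0                   # state currently stored in the flip-flop

    def lut(self, inputs: List[int]) -> int:
        """Look up the output for the given inputs (first input is the MSB)."""
        index = 0
        for bit in inputs:
            index = (index << 1) | (bit & 1)
        return self.lut_bits[index]

    def clock(self, inputs: List[int]) -> None:
        """Rising clock edge: the flip-flop captures the LUT output."""
        self.q = self.lut(inputs)

    def output(self, inputs: List[int]) -> int:
        """Cell output as selected by the MUX."""
        return self.q if self.use_register else self.lut(inputs)


if __name__ == "__main__":
    # "Programming" the cell means filling the LUT: here a 2-input XOR.
    xor_cell = LogicCell(lut_bits=[0, 1, 1, 0])
    print(xor_cell.output([1, 0]))      # combinational path

    # Same logic, but the MUX now selects the registered path.
    reg_cell = LogicCell(lut_bits=[0, 1, 1, 0], use_register=True)
    reg_cell.clock([1, 0])              # value is captured on the clock edge
    print(reg_cell.output([0, 0]))      # output holds the captured value
```

Changing the contents of `lut_bits` changes the logic function without touching the structure of the cell, which is the sense in which the device is reprogrammable.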

The arrays and components of an FPGA can be arranged according to three main architectures:

- symmetrical arrays, where CLBs are arranged in the rows and columns of a matrix and interconnected by PIs. At the periphery of the FPGA, the I/O Blocks provide communication with the electronics outside the array. Each CLB consists of an n-input LUT and a pair of programmable FFs. The I/O Blocks also control functions such as tristate control and output transition speed. The interconnects provide the routing paths; direct interconnects between adjacent logic blocks have smaller delays than general-purpose interconnects;
- row-based architecture, where rows of logic modules alternate with PI tracks. The I/O Blocks are at the end of each row, and the rows are connected to each other vertically by the interconnect. Combinatorial modules contain only combinational elements, while sequential modules contain both combinational elements and FFs. Routing tracks are divided into smaller segments connected by anti-fuse elements;
- hierarchical programmable logic device, where the top level contains only logic blocks and interconnects. Each logic block contains a number of logic modules, and each logic module has both combinatorial and sequential functional elements, each controlled by the programmed memory. Communication between logic blocks is achieved through PI arrays. I/O Blocks surround this arrangement of logic blocks and interconnects.

## Appendix B

### Supplementary diagnostic diagrams

For completeness, this appendix reports the bathtub curves and eye contours of the transmission lines simulated in this thesis that were not included in Chapter 4.



Figure B.1: Eye contour and bathtub curve, respectively in Voltage and BER scale, of the 25 Gb/s TX61 transmission line from the FPGA to the electro-optical transceiver.



Figure B.2: Eye contour and bathtub curve, respectively in Voltage and BER scale, of the 25 Gb/s RX61 transmission line from the electro-optical transceiver to the FPGA.



Figure B.3: Eye contour and bathtub curve, respectively in Voltage and BER scale, of the 8 Gb/s PCIE\_RX0 transmission line from the PCIe connector to the FPGA.

# List of Figures

| Figure | Caption |
|--------|---------|
| 1.1 | CERN accelerator complex; the LHC in blue is the last stage of the accelerating chain which comprises smaller rings |
| 1.2 | Open view of sub-detectors geometry in the ATLAS experiment, including the central solenoid magnet and the barrel and end-cap toroid magnets |
| 1.3 | Cross sections of the four technologies employed for muon detection |
| 1.4 | Representation of the ATLAS Inner Detector with the SCT and TRT technologies disposition in the barrels and end-caps; inside the barrels it is arranged the Pixel Detector |
| 1.5 | Detailed illustration of the effective dimensions of the ID tracking system around the beam pipe |
| 1.6 | Composition of the Pixel Detector, the main tracking system inside ATLAS used to reconstruct interaction vertices |
| 1.7 | 3D reconstruction of the layers composing the ASIC chips included in the PD |
| 1.8 | Each readout component of the PD is mounted on staves displaced around the beam pipe with a precise angle to cover the entire geometry |
| 1.9 | IBL planar (a) and 3D (b) pixels read by Front End chips and connected to the readout chain through flexible circuits |
| 1.10 | Roadmap of LHC main upgrade stages together with centre of mass energy and integrated luminosity targets to be reached |
| 1.11 | Display of the ATLAS Phase-II Inner Tracker ITk layout with all its shells |
| 1.12 | Schematic layout of one quadrant of the ITk Inclined Duals layout for the HL-LHC, the active elements of the barrel and end-cap Strip Detector are shown in blue, for the Pixel Detector the sensors are shown in red for the barrel layers and in dark red for the end-cap rings |
| 1.13 | Schematic of the overall baseline design of the TDAQ system in Phase-II with the trigger levels and dataflow indicated in the legend aside. In particular the DAQ system is made up of FELIX, the Data Handlers, and the Dataflow subsystem at 1 MHz |
| 1.14 | FLX-712 hosting a Xilinx Kintex Ultrascale FPGA (circled in red), a PCIe Gen.3 X16 connector (circled in orange), the MiniPODs (circled in light blue) and the connector for a Timing and Trigger Control (TCC) mezzanine |


| Figure | Caption |
|--------|---------|
| 3.8 | Eye density plots for the 8 Gb/s PCIE\_TX0 (b) and PCIE\_RX0 (c) transmission lines from the FPGA to the PCIe connector (a) |
| 4.1 | Planar and three-dimensional view of a capacitor in the PCIE\_TX0 line |
| 4.2 | Eye density plot of the PCIE\_TX0 channel after implementing in the net description the 3D models for critical areas |
| 4.3 | Backdrilling applied to a via connecting the signal net on the first layer (L1\_Top in red) and the net on the third layer (L3\_Signal1 in purple). By doing this the unused part of the via from the last layer to the third one is fully removed |
| 4.4 | Three-dimensional proportioned view of the edited stack-up of the FELIX board: in this configuration the PCIe connector region is treated separately from the main board (master) |
| 4.5 | Evolution of the eye density plot of the critical PCIE\_TX0 connection |
| 4.6 | Eye contour and bathtub curve, respectively in Voltage and BER scale, of the 8 Gb/s PCIE\_TX0 transmission line from the FPGA to the PCIe connector |
| A.1 | FPGA structure design, with the internal component blocks: the Configurable Logic Blocks (CLB), the Programmable Interconnects (PI) and the I/O Blocks |
| A.2 | Example of an FPGA unit cell: the main sub-parts are the Lookup Table (LUT), a D-type Flip-Flop (FF) and a Multiplexer (MUX) |
| B.1 | Eye contour and bathtub curve, respectively in Voltage and BER scale, of the 25 Gb/s TX61 transmission line from the FPGA to the electro-optical transceiver |
| B.2 | Eye contour and bathtub curve, respectively in Voltage and BER scale, of the 25 Gb/s RX61 transmission line from the electro-optical transceiver to the FPGA |
| B.3 | Eye contour and bathtub curve, respectively in Voltage and BER scale, of the 8 Gb/s PCIE\_RX0 transmission line from the PCIe connector to the FPGA |
## Bibliography

- 1. Airapetian, A. et al. ATLAS detector and physics performance: Technical Design Report, 1 https://cds.cern.ch/record/391176 (CERN, Geneva, 1999).
- 2. ATLAS inner detector: Technical Design Report, 1 http://cds.cern.ch/record/331063 (CERN, Geneva, 1997).
- 3. Capeans, M. et al. ATLAS Insertable B-Layer Technical Design Report tech. rep. CERN-LHCC-2010-013. ATLAS-TDR-19 (2010). https://cds.cern.ch/record/1291633.
- 4. Béjar Alonso, I. et al. High-Luminosity Large Hadron Collider (HL-LHC): Technical design report (ed Béjar Alonso, I.) https://cds.cern.ch/record/2749422 (CERN, Geneva, 2020).
- 5. Technical Design Report for the ATLAS Inner Tracker Pixel Detector tech. rep. CERN-LHCC-2017-021. ATLAS-TDR-030 (CERN, Geneva, 2017). https://cds.cern.ch/record/2285585.
- 6. Technical Design Report for the Phase-II Upgrade of the ATLAS TDAQ System tech. rep. CERN-LHCC-2017-020. ATLAS-TDR-029 (CERN, Geneva, 2017). https://cds.cern.ch/record/2285584.
- 7. Brooks, D. Signal Integrity Issues and Printed Circuit Board Design (Prentice Hall PTR, Upper Saddle River, 2003).
- 8. Bogatin, E. Signal and Power Integrity Simplified (Prentice Hall, Upper Saddle River, 2010).