

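Rete Ottica di Accesso a Divisione di frequenza e/o di lunghezza d'onda per soluzioni Next Generation Network (Optical Access Network with Frequency and/or Wavelength Division for Next Generation Network Solutions)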

## **ROAD-NGN**

## Digital Signal Processing for FDMA-PON: Evaluation of Processing Complexity of Three Different Architectures

Roberto Cigliutti, POLITO; <u>Roberto Gaudino</u>, POLITO









- Motivations
- FDMA/OFDMA PON Case Studies
- Complexity Evaluations
- Conclusions

## **Toward New Generation-PONs**



## Latest PON ITU-T Standard

- ITU-T G.989 NG-PON2 (TWDM-PON) (March 2013)
- PON Data Capacity:

  - 4λ × 10 Gbps downstream and 4λ × 2.5 Gbps upstream (shared among up to 64 users)
- Simultaneous use of WDM (4 $\lambda$) and TDM (burst mode) technologies



## **Toward New Generation-PONs**



- DS/US asymmetric user data capacity (e.g. for 32 users: 1.25 Gbps / 312.5 Mbps)
- "Colored" ONUs (ONUs are not interchangeable, e.g. 8 ONUs per $\lambda$)
- In-service synchronization of the TDM ONUs
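
The per-user figures quoted above follow from sharing the aggregate NG-PON2 capacity equally among the users. A minimal sketch of that arithmetic, assuming an equal split (the function name is illustrative, not from the slides):

```python
def per_user_rate_gbps(n_wavelengths: int, rate_per_wavelength_gbps: float, n_users: int) -> float:
    """Average capacity per user when the aggregate rate is shared equally."""
    return n_wavelengths * rate_per_wavelength_gbps / n_users


N_WAVELENGTHS = 4
N_USERS = 32

ds = per_user_rate_gbps(N_WAVELENGTHS, 10.0, N_USERS)  # downstream, Gbps
us = per_user_rate_gbps(N_WAVELENGTHS, 2.5, N_USERS)   # upstream, Gbps
onus_per_wavelength = N_USERS // N_WAVELENGTHS

print(f"DS: {ds:.2f} Gbps/user, US: {us * 1000:.1f} Mbps/user")
print(f"ONUs per wavelength: {onus_per_wavelength}")
```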



## **CMOS Technology Evolution**

[Figure: Intel "Innovation-Enabled Technology Pipeline" roadmap, showing the process nodes released for production; visibility continues to go out ~10 years]





## **Latest Released Devices**

Device families by process node: 45 nm (Spartan); 28 nm (Virtex-7, Kintex-7, Artix-7); 20 nm (Virtex UltraScale, Kintex UltraScale); 16 nm (Virtex UltraScale+, Kintex UltraScale+).

|                          | Virtex-7 Family (28 nm) | Virtex UltraScale (20 nm) | Virtex UltraScale+ (16 nm) |
|--------------------------|-------------------------|---------------------------|----------------------------|
| Logic Cells (K)          | 1,955                   | 627-4,433                 | 690-2,863                  |
| DSP Slices               | 3,600                   | 600-2,880                 | 2,280-11,904               |
| DSP Performance (GMAC/s) | 5,335                   | 4,268                     | 21,213                     |

Highlighted gains: 2x DSP Slices, 2x Logic Cells, 2x Clock Speed, 4x DSP Performance.

#### High-Speed & Low-Cost DACs (f<sub>s</sub> &lt; 1 GS/s)

Highest linearity, smallest dual, 16-bit, 800-MSPS DAC

#### Ultra-High-Speed DACs (f<sub>s</sub> &gt; 20 GS/s)

Fujitsu Semiconductor Europe, Factsheet LEIA: 55-65 GSa/s 8-bit DAC

## Facts, Consequences & Open Questions

#### **FACTS**

New high-speed electronic devices are available today.

#### **CONSEQUENCES**

- A 10 GHz analog processing bandwidth can be considered almost "a commodity".
- Electrical FDM-PON can be a realistic solution for future NG-PONs.

#### **OPEN QUESTIONS**

- Can a data capacity of up to 40 Gb/s be handled at the OLT?
- Which solution is the most promising in terms of cost and complexity?







- Motivations
- FDMA/OFDMA PON Case Studies
- Complexity Evaluations
- Conclusions



## **Toward New Generation-PONs**



## **Improving NG-PON2**

- DS/US symmetric (1 Gbps/user) over a single wavelength
- Simplification of ONU operations

#### **DOWNSTREAM CASE STUDY:**

- Delivery of multilevel signals (1 Gbps/user for 32 users) over a single wavelength, over the same ODN classes as NG-PON2
- Use of electrical FDM(A) technology (i.e. electrical FDM access) with sub-band detection at the ONUs
- Mandatory use of "low-cost" HW (DAC & DSP) for the ONUs



Subcarrier Multiplexing for Sub-band Detection





## **OLT TX Architecture**





#### FDMA Approach:

#### User Channel System Parameters

16QAM @ R<sub>s</sub>=275MBaud (incl. FEC); 10% Nyquist spectrum roll-off, BW≈9.8GHz

- **Sub-band DSP** (also in the OLT)
  - Architecture #1 Mixed Analog/Digital
- **Full-band DSP**
  - Architecture #2 Full Digital

#### **OFDMA Approach:**

#### User Channel System Parameters

10 OFDM Subcarriers/channel; each subcarrier 16QAM @  $R_s$ =27.5MBaud (incl. FEC), BW $\approx$ 8.8GHz

- **Full-band DSP** (Sub-band OFDM is not practical)
  - Architecture #3 Full Digital
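
The channel parameters above can be cross-checked with a few lines of arithmetic. This is a sketch under simple assumptions: occupied bandwidth is taken as symbol rate times (1 + roll-off) per FDMA channel and as the sum of subcarrier symbol rates for OFDMA, with no guard bands. The constant names are illustrative. The simple FDMA estimate comes out slightly below the ≈9.8 GHz quoted on the slide; the source does not state what accounts for the difference.

```python
import math

N_CH = 32          # multiplexed user channels
QAM_ORDER = 16     # 16QAM
RS_FDMA = 275e6    # FDMA symbol rate per channel [baud], incl. FEC
ROLL_OFF = 0.10    # Nyquist spectrum roll-off
N_SC = 10          # OFDM subcarriers per channel
RS_SC = 27.5e6     # OFDM symbol rate per subcarrier [baud], incl. FEC

bits_per_symbol = int(math.log2(QAM_ORDER))

# Gross bit rate per user (FEC overhead included)
gross_rate_fdma = bits_per_symbol * RS_FDMA
gross_rate_ofdma = bits_per_symbol * RS_SC * N_SC

# Occupied bandwidth estimates (no guard bands assumed)
bw_fdma = N_CH * RS_FDMA * (1 + ROLL_OFF)
bw_ofdma = N_CH * N_SC * RS_SC

print(f"Gross rate per user: FDMA {gross_rate_fdma / 1e9:.2f} Gb/s, "
      f"OFDMA {gross_rate_ofdma / 1e9:.2f} Gb/s")
print(f"Occupied bandwidth:  FDMA {bw_fdma / 1e9:.2f} GHz, "
      f"OFDMA {bw_ofdma / 1e9:.2f} GHz")
```

Both approaches carry the same 1.1 Gb/s gross rate per user, which leaves room for FEC overhead on top of the 1 Gbps/user target.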



## FDMA #1: Mixed Digital/Analog Solution





## FDMA #2: Full Digital Solution





## **OFDMA: Full Digital Solution**



## HW/DSP Solutions: FDMA vs. OFDMA

|           | FDMA Architecture #1<br>(Mixed Dig./Analog) | FDMA Architecture #2<br>(Full Digital) | OFDMA Architecture<br>(Full Digital) |
|-----------|---------------------------------------------|----------------------------------------|--------------------------------------|
| DSP       | Entry-level FPGA/ASIC<br>(Fs ≥ 555.5 MHz) | High-end FPGA/ASIC<br>(Equiv. Fs ≥ 18.026 GHz) | High-end FPGA/ASIC<br>(Equiv. Fs ≥ 18.026 GHz) |
| DAC       | N ≤ 32 x low-cost CMOS | 1 x high-performance CMOS | 1 x high-performance CMOS |
| Analog    | I/Q mixers, RF amplifiers, 1:32 coupler<br>(critical design) | RF amplifiers only | RF amplifiers only |
| Operation | High power dissipation (analog HW), but scalable w.r.t. the number of active channels | Possible power dissipation scaling w.r.t. the number of active users | DSP scaling with the number of active users not possible in principle (fixed IFFT size) |







- Motivations
- FDMA/OFDMA PON Case Studies
- Complexity Evaluations
- Conclusions



#### FDMA Architectures

- SRRC FIR filter (N<sub>taps</sub>=150):
  - Overlap & Save with 2048 pts/block @ f<sub>s</sub>=550 MSample/s
- Upsampling:
  - Cascaded-Integrator-Comb (CIC) interpolating filters (no multiplications/sums required)
- I/Q modulation:
  - Classical sin()/cos() multiplication + 1 sum @ f<sub>s</sub>=19.8 GSample/s
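
The Overlap & Save step listed above can be sketched as follows, using the slide's block size (2048 points) and filter length (150 taps). This is a minimal NumPy illustration rather than the authors' implementation: the function name is hypothetical, and random coefficients stand in for the actual SRRC taps, which the slides do not give.

```python
import numpy as np


def overlap_save(x, h, n_fft=2048):
    """Filter real sequence x with real FIR taps h using FFT blocks of n_fft points."""
    n_taps = len(h)
    step = n_fft - n_taps + 1                 # valid output samples per block
    H = np.fft.rfft(h, n_fft)                 # filter response, computed once
    n_blocks = -(-len(x) // step)             # ceiling division
    # Prepend n_taps - 1 zeros (initial overlap) and pad the tail to whole blocks.
    padded = np.concatenate(
        [np.zeros(n_taps - 1), x, np.zeros(n_blocks * step - len(x))]
    )
    y = np.empty(n_blocks * step)
    for b in range(n_blocks):
        block = padded[b * step : b * step + n_fft]
        yb = np.fft.irfft(np.fft.rfft(block) * H, n_fft)
        # The first n_taps - 1 samples are corrupted by circular wrap-around.
        y[b * step : (b + 1) * step] = yb[n_taps - 1 :]
    return y[: len(x)]


rng = np.random.default_rng(0)
taps = rng.standard_normal(150)               # placeholder for SRRC coefficients
signal = rng.standard_normal(10_000)
filtered = overlap_save(signal, taps)
reference = np.convolve(signal, taps)[: len(signal)]
print("max abs error vs direct convolution:", np.max(np.abs(filtered - reference)))
```

Using `rfft`/`irfft` exploits the fact that both the sequence and the filter coefficients are real, so only half of the spectrum is computed per block.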

#### OFDMA Architectures

- **IFFT** (size = 1024 pts for a BW ≈ 10 GHz signal):
  - Optimized Radix-2 or Split-Radix FFT algorithms
  - Real-valued signal $\Rightarrow$ the required FFT size halves to 512 pts
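
The slides do not say how the real-valued property is exploited. One common technique, assumed here for illustration, packs the even and odd samples of a length-N real sequence into a single length-N/2 complex sequence, so that one N/2-point complex FFT suffices. The sketch shows the forward transform; the inverse uses the same identities in reverse. The function name is hypothetical.

```python
import numpy as np


def real_fft_via_half_size(x):
    """Bins 0..N/2-1 of the FFT of a real length-N sequence, using one N/2-point complex FFT."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    m = n // 2
    z = x[0::2] + 1j * x[1::2]                # even samples -> real part, odd -> imaginary
    Z = np.fft.fft(z)                         # the only FFT: N/2 complex points
    Zc = np.conj(Z[(-np.arange(m)) % m])      # conj(Z[(m - k) mod m])
    even = 0.5 * (Z + Zc)                     # spectrum of the even-indexed samples
    odd = -0.5j * (Z - Zc)                    # spectrum of the odd-indexed samples
    k = np.arange(m)
    return even + np.exp(-2j * np.pi * k / n) * odd


rng = np.random.default_rng(0)
samples = rng.standard_normal(1024)           # 1024 real points -> one 512-point FFT
half_size_result = real_fft_via_half_size(samples)
print(np.allclose(half_size_result, np.fft.fft(samples)[:512]))
```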



## **FIR Filter Algorithms**



- For a real sequence and a filter with real coefficients.



## **FFT/IFFT Algorithms**

FFT - Real Multiplications Cost



- Number of DSP Multipliers ≤ Number of Algorithm Multiplications



## **Computational Complexity**

| Architecture | Real Multiplications<br>[Operations/bit] | Real Sums<br>[Operations/bit] |
|--------------|------------------------------------------|-------------------------------|
| FDMA Architecture #1<br>(Overlap & Save FIR, N<sub>taps</sub>=150,<br>FFT block N<sub>FFT</sub>=2<sup>11</sup>=2048) | ≈ **72.09** | ≈ **102.13** |
| FDMA Architecture #2<br>(Overlap & Save FIR, N<sub>taps</sub>=150,<br>FFT block N<sub>FFT</sub>=2<sup>11</sup>=2048) | ≈ **108.20** | ≈ 192.40 |
| **OFDMA-R2 Architecture**<br>(**Optimized Radix-2 FFT**,<br>N<sub>FFT</sub>=512, number of subcarriers N<sub>SC</sub>=10) | ≈ **44.09** | ≈ 96.03 |
| **OFDMA-SR Architecture**<br>(**Optimized Split-Radix FFT**,<br>N<sub>FFT</sub>=512, number of subcarriers N<sub>SC</sub>=10) | ≈ **31.16** | ≈ 88.91 |

- Symbol Rate: R<sub>s</sub>=275 MBaud (including FEC)
- QAM order: M=16
- Number of Mux Channels: N<sub>ch</sub>=32
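
The two OFDMA rows of the table can be reproduced from the standard textbook operation counts for a length-N complex FFT (trivial twiddle multiplications removed, each complex multiplication counted as 4 real multiplications and 2 real additions). The sketch below evaluates them for N<sub>FFT</sub>=512. The normalization by N<sub>ch</sub>·log<sub>2</sub>(M) = 128 is inferred from the table values and is not stated on the slides; the function names are illustrative. The FDMA rows are not reproduced here.

```python
import math


def radix2_ops(n_fft):
    """(real multiplications, real additions) of an optimized radix-2 complex FFT."""
    log_n = int(math.log2(n_fft))
    mults = 2 * n_fft * log_n - 7 * n_fft + 12
    adds = 3 * n_fft * log_n - 3 * n_fft + 4
    return mults, adds


def split_radix_ops(n_fft):
    """(real multiplications, real additions) of an optimized split-radix complex FFT."""
    log_n = int(math.log2(n_fft))
    sign = -1 if log_n % 2 else 1             # (-1) ** log2(N)
    mults = (12 * n_fft * log_n - 38 * n_fft + 54 + 2 * sign) // 9
    adds = (24 * n_fft * log_n - 16 * n_fft + 18 - 2 * sign) // 9
    return mults, adds


N_FFT = 512
N_CH = 32
QAM_ORDER = 16
norm = N_CH * int(math.log2(QAM_ORDER))       # inferred normalization: 128

for name, ops in (("OFDMA-R2", radix2_ops(N_FFT)), ("OFDMA-SR", split_radix_ops(N_FFT))):
    mults, adds = ops
    print(f"{name}: {mults} mults, {adds} adds per FFT -> "
          f"{mults / norm:.2f} mult/bit, {adds / norm:.2f} sums/bit")
```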



## **FPGA Resources: Slices**

#### **Multiplications & Sums** $\Rightarrow$ **Physical Multipliers & Adders**



- 1 Multiply-and-Accumulate (MAC) unit (Slice or Block) can be used as 1 x 18-bit adder or 1 x 18-bit multiplier
- Computational complexity in FPGA/DSP:
  - **Overall DSP size** (number of "DSP Slices" or "Blocks")
  - **Computation Performance** (MAC operations per second, i.e. MAC/s)
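
As a rough illustration of how operations per bit translate into a MAC/s requirement, the sketch below multiplies the table values by an aggregate bit rate. Two assumptions are made here that the slides do not confirm: every real multiplication and every real sum occupies one MAC operation, and the reference rate is the gross line rate of 32 channels of 16QAM at 275 MBaud. The names are illustrative.

```python
# (real multiplications, real sums) per bit, taken from the complexity table
OPS_PER_BIT = {
    "FDMA #1": (72.09, 102.13),
    "FDMA #2": (108.20, 192.40),
    "OFDMA-R2": (44.09, 96.03),
    "OFDMA-SR": (31.16, 88.91),
}

N_CH = 32
BIT_RATE_PER_CH = 4 * 275e6                   # assumed: 16QAM at 275 MBaud, incl. FEC
AGGREGATE_BIT_RATE = N_CH * BIT_RATE_PER_CH   # 35.2 Gb/s


def required_gmacs(mults_per_bit, sums_per_bit, bit_rate):
    """GMAC/s needed if each multiplication and each sum occupies one MAC operation."""
    return (mults_per_bit + sums_per_bit) * bit_rate / 1e9


for name, (mults, sums) in OPS_PER_BIT.items():
    print(f"{name}: {required_gmacs(mults, sums, AGGREGATE_BIT_RATE):.0f} GMAC/s")
```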



## **FPGA-DSP Performances**





## **FPGA Resources Usage**









- Motivations
- FDMA/OFDMA PON Case Studies
- Complexity Evaluations
- Conclusions



- The OFDMA approach proves advantageous for DSP implementation in FPGA (particularly when using Split-Radix FFTs).
- All FDMA approaches are less efficient in terms of DSP complexity.
- Nevertheless:
  - The "Full Digital" FDMA approach allows power-saving features when fewer users are active ⇒ OPEX saving.
  - The "Mixed Digital/Analog" FDMA approach is the only solution that does not require very fast DACs ⇒ actually the cheapest solution.







http://www.roadngn.uniroma3.it/index.html

# THANK YOU FOR YOUR ATTENTION







# **BACK-UP SLIDES**



#### Convolution Filter

Overlap&Save Algorithm:



#### Pros and Cons

- Compromise between complexity and I/O latency
- Requires a large FFT size for efficiency (e.g. 2048 pts for 150 taps)
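
The trade-off above can be made concrete: each Overlap & Save block of N<sub>FFT</sub> points yields only N<sub>FFT</sub> − N<sub>taps</sub> + 1 valid output samples, while a longer block takes longer to fill. A minimal sketch, assuming the 550 MSample/s rate used for the FDMA filters (function names are illustrative):

```python
def overlap_save_efficiency(n_fft: int, n_taps: int) -> float:
    """Fraction of each FFT block that yields valid (non-aliased) output samples."""
    return (n_fft - n_taps + 1) / n_fft


def block_duration_us(n_fft: int, sample_rate_hz: float) -> float:
    """Time needed to collect one block of n_fft samples, in microseconds."""
    return n_fft / sample_rate_hz * 1e6


N_TAPS = 150
SAMPLE_RATE = 550e6

for n_fft in (256, 512, 1024, 2048, 4096):
    eff = overlap_save_efficiency(n_fft, N_TAPS)
    dur = block_duration_us(n_fft, SAMPLE_RATE)
    print(f"N_FFT={n_fft:5d}: efficiency {eff:.1%}, block duration {dur:.2f} us")
```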



#### Optimized Radix-2

Classical Cooley-Tukey FFT without the trivial multiplications (i.e. those by twiddle factors $(W_N)^m = +1, -1, +i, -i$).
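
To illustrate what skipping trivial twiddle factors means, the sketch below implements a plain recursive radix-2 decimation-in-time FFT and counts only the twiddle multiplications that cannot be replaced by a sign change or a swap of real and imaginary parts. It is an illustrative software model, not the hardware-oriented implementation considered in the slides; the function name and the counter convention are assumptions.

```python
import numpy as np


def fft_radix2(x, counter):
    """Recursive radix-2 DIT FFT; counter[0] accumulates non-trivial twiddle multiplications."""
    n = len(x)
    if n == 1:
        return np.asarray(x, dtype=complex)
    even = fft_radix2(x[0::2], counter)
    odd = fft_radix2(x[1::2], counter)
    out = np.empty(n, dtype=complex)
    for k in range(n // 2):
        if (4 * k) % n == 0:
            # Trivial twiddle: +1 (k = 0) or -j (k = n/4), no real multiplication needed.
            t = odd[k] if k == 0 else complex(odd[k].imag, -odd[k].real)
        else:
            t = np.exp(-2j * np.pi * k / n) * odd[k]
            counter[0] += 1
        # Butterfly: the -1 factor is absorbed into the subtraction.
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


rng = np.random.default_rng(0)
data = rng.standard_normal(16) + 1j * rng.standard_normal(16)
count = [0]
spectrum = fft_radix2(data, count)
print("matches numpy:", np.allclose(spectrum, np.fft.fft(data)))
print("non-trivial twiddle multiplications for N=16:", count[0])
```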



"Butterfly" computation unit for Radix-2 algorithms

#### Optimized Split-Radix

Modification of the Cooley-Tukey FFT that uses "Butterflies" of multiple radix orders (i.e. Radix-2, 4, 8) to reduce complexity.

Both algorithms are suitable for efficient implementation in VLSI/FPGA (hardware optimization $\Rightarrow$ not included in this analysis).