### §3 Programmable Logic Devices

### 3.0 Introduction

- Low-cost, low-risk way of implementing digital circuits as application-specific ICs (ASICs).
- Technology of choice for low to medium volume products (say hundreds to a few tens of thousands per year).
- Good and low cost design software.
- Latest high density devices are over 1 million gates!

### 3.1 Technologies for Programmable Logic Devices

Current PLDs are based on three different technologies: antifuse, static RAM, EPROM/EEPROM.

### 3.1.1 Antifuse (See figure 3.1)

Invented at Stanford and developed by Actel. Currently mainly used for military applications. See <a href="https://www.actel.com">www.actel.com</a>.



Figure 3.1 Actel antifuse

## Number of antifuses on Actel FPGAs

| Device | Antifuses |
|--------|-----------|
| A1010  | 112,000   |
| A1020  | 186,000   |
| A1225  | 250,000   |
| A1240  | 400,000   |
| A1280  | 750,000   |



The resistance of blown Actel antifuses

### 3.1.2 Static RAM

DIGITAL SYSTEM DESIGN




Figure 3.2 1-bit of static RAM

- Almost all Field Programmable Gate Arrays (FPGAs) are based on static RAMs. (Figure 3.2)
- Static RAM cells are used for three purposes:
  - 1. As lookup tables (LUTs) for implementing logic (as truth-table).
  - 2. As embedded block RAM blocks (for buffer storage etc.).
  - 3. As control to routing and configuration switches.

#### Advantages:

- Easily changeable (even dynamic reconfiguration)
- Good density
- Track latest SRAM technology (moving even faster than technology for logic)
- Flexible: good not only for FSMs but also for arithmetic circuits

#### Disadvantages:

- Volatile
- Generally high power

### 3.1.3 EPROM & EEPROM (Figure 3.3)

- Generally used in product-term type PLDs.
- Non-volatile and reprogrammable.
- Good for FSMs, less good for arithmetic.



#### An EPROM transistor

- **(a)** With a high (>12V) programming voltage, $V_{PP}$, applied to the drain, electrons gain enough energy to "jump" onto the floating gate (gate1)
- **(b)** Electrons stuck on gate1 raise the threshold voltage so that the transistor is always off for normal operating voltages
- **(c)** UV light provides enough energy for the electrons stuck on gate1 to "jump" back to the bulk, allowing the transistor to operate normally

Facts and keywords: Altera MAX 5000 EPLDs and Xilinx EPLDs both use UV-erasable electrically programmable read-only memory (EPROM) • hot-electron injection or avalanche injection • floating-gate avalanche MOS (FAMOS)

Figure 3.3 EPROM/EEPROM technologies for CPLDs


### 3.2 FPGA Architectures

All FPGAs have the following key elements:

- The Programming technology
- The basic logic cells
- The I/O logic cells
- Programmable interconnect
- Software to design and program the FPGA

Currently the three main players in this field are:

- Actel
- Altera
- Xilinx

### 3.3 Actel FPGA's Architecture

- Uses antifuse technology
- Based on channelled gate array architecture as shown in Figure 3.4.
- Each logic element (labelled 'L') is a combination of multiplexers which can be configured as a multi-input gate (as shown in Figure 3.5).



Figure 3.4 Channelled gate array architecture



Figure 3.5 Actel basic logic element configured as a 3-input AND gate
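To illustrate how multiplexers alone can act as a gate, here is a minimal Python sketch of a 3-input AND built from two 2:1 multiplexers. The wiring is a simplified assumption for illustration and is not claimed to match the exact Actel module in the figure; the names `mux2`, `and3` and `truth_table` are hypothetical.

```python
def mux2(sel, d0, d1):
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def and3(a, b, c):
    """3-input AND built only from 2:1 multiplexers (assumed wiring).

    The inner mux computes b AND c (select = b, data inputs 0 and c);
    the outer mux gates that result with a.
    """
    b_and_c = mux2(b, 0, c)
    return mux2(a, 0, b_and_c)


def truth_table(fn, n_inputs):
    """Evaluate fn over all input combinations, first argument as MSB."""
    rows = []
    for i in range(2 ** n_inputs):
        bits = [(i >> (n_inputs - 1 - k)) & 1 for k in range(n_inputs)]
        rows.append(fn(*bits))
    return rows
```

Tying the data inputs to other constants or signals gives other gates; for example `mux2(a, b, 1)` behaves as a 2-input OR.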



Figure 3.6 Actel combinational (C-module) and sequential (S-module) cells

- Combinational modules can also be used to form a latch.
- Using two latches in series (master-slave), a flip-flop can be built from two C-modules.
- Two C-modules take up much more area than a dedicated flip-flop.
- Actel introduced S-modules (sequential) which basically add a flip-flop to the MUX based C-module.
- ACT2 and ACT3 families have a mixture of C and S modules.
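The latch and master-slave construction described above can be sketched behaviourally in Python. This is a simplified model under stated assumptions: each latch is a 2:1 multiplexer with its output fed back to one data input, and the class names `MuxLatch` and `MasterSlaveFlipFlop` are hypothetical, not Actel terminology.

```python
def mux2(sel, d0, d1):
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


class MuxLatch:
    """Level-sensitive latch: a 2:1 mux with its output fed back to input 0."""

    def __init__(self):
        self.q = 0

    def update(self, enable, d):
        # enable = 1: transparent (q follows d); enable = 0: hold previous q.
        self.q = mux2(enable, self.q, d)
        return self.q


class MasterSlaveFlipFlop:
    """D flip-flop made from two mux latches in series."""

    def __init__(self):
        self.master = MuxLatch()
        self.slave = MuxLatch()

    def update(self, clk, d):
        # Master is transparent while clk is low, slave while clk is high,
        # so the output only changes after clk goes from 0 to 1.
        self.master.update(1 - clk, d)
        return self.slave.update(clk, self.master.q)
```

Calling `update` with `clk = 0` loads the master; the value reaches the output on the next call with `clk = 1`, and changes on `d` while the clock stays high are ignored.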


| Capability         | ACT 1               | ACT 2/1200XL                            | 3200DX                                  | ACT 3                                               |
|--------------------|---------------------|-----------------------------------------|-----------------------------------------|-----------------------------------------------------|
| Core Module        | Simple Logic Module | Combinatorial and<br>Sequential Modules | Combinatorial and<br>Sequential Modules | Combinatorial and<br>Enhanced Sequential<br>Modules |
|                    |                     |                                         | Wide Decode Modules                     |                                                     |
|                    |                     |                                         | Embedded Dual-Port<br>SRAM              |                                                     |
| Interconnect       | Channeled           | Channeled                               | Channeled                               | Channeled                                           |
| Clocking Resources | Routed Clock (1)    | Routed Clocks (2)                       | Routed Clocks (2)                       | Routed Clocks (2)                                   |
|                    |                     |                                         | Quad Clocks (4)                         | Dedicated Array Clock                               |
|                    |                     |                                         |                                         | Dedicated I/O Clock                                 |
| I/O Module         | Simple I/O Module   | Latched I/O Module                      | Latched I/O Module                      | Registered I/O Module                               |

Figure 3.6 Summary of Key Actel FPGA Architectures


### 3.4 Altera MAX family of CPLDs

- CPLD = Complex Programmable Logic Device
- FPGA = Field Programmable Gate Array
- Altera has three different PLD families:
  - 1. MAX family: product-term based macrocells
  - 2. FLEX family: SRAM based lookup tables (LUTs)
  - 3. APEX family: mixture of product-term and LUT based devices

### 3.4.1 MAX 7000 Family



Figure 3.7 MAX7000 chip architecture

- Basic logic element is a macrocell which can implement a Boolean expression in the form of sum-of-product (SOP).
- An example of such a sum-of-product is: a•!b•c•!d + a•c•e + !a•f
- Each product term can have many input variables ANDed together. A SOP can have a number of product terms ORed together.
- Each macrocell also contains a flip-flop, which is essential for implementing FSMs.
- 16 macrocells are grouped together to form a Logic Array Block (LAB).
- In the centre is a Programmable Interconnect Array (PIA) which allows interconnection between different parts of the chip.
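As a rough illustration, the example sum-of-products above can be evaluated in Python by treating each product term as a set of selected literals. The data layout (`PRODUCT_TERMS` as a list of dictionaries) and the function names are assumptions made for this sketch; they do not describe how the device is actually programmed.

```python
# Each product term maps a signal name to the polarity it must have:
# True = the signal itself, False = its inverse.
PRODUCT_TERMS = [
    {"a": True, "b": False, "c": True, "d": False},  # a.!b.c.!d
    {"a": True, "c": True, "e": True},               # a.c.e
    {"a": False, "f": True},                         # !a.f
]


def product_term(term, inputs):
    """AND together the selected literals of one product term."""
    return all(bool(inputs[name]) == polarity for name, polarity in term.items())


def sum_of_products(terms, inputs):
    """OR together all product terms."""
    return any(product_term(term, inputs) for term in terms)
```

For example, `sum_of_products(PRODUCT_TERMS, dict(a=0, b=0, c=0, d=0, e=0, f=1))` is true because the third term `!a.f` is satisfied.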



Figure 3.8 Internal structure of a MAX 7000 Macrocell

- Each horizontal line represents a product term.
- Each input is presented to the product terms as the signal and its inverse.
- Each macrocell can normally OR 4 product terms together.
- Each LAB provides an additional 16 shared product terms to cope with more complex Boolean equations.
- The output XOR gate allows either efficient implementation of XOR functions or programmable logic inversion.
- The SOP output can drive the output directly or can be passed through a register.
- This architecture is particularly good for implementing finite state machines.
- Each register can store one state variable. This can be fed back to the logic array via the Programmable Interconnect Array (PIA).
- This architecture is not efficient for adder or multiplier circuits, or for buffer storage (such as register files or FIFO buffers), because these waste the potential of the logic array.

### 3.5 Altera FLEX 8K/10K Families of FPGAs



Figure 3.9 FLEX 8K chip architecture



- Organised as rows of logic cells (similar to Actel's gate arrays) with routing channels in between.
- Each row of logic contains many Logic Array Blocks (LAB).
- Each LAB contains 8 Logic Elements (LE).
- Each LAB has its own Local Interconnect.
- Chip-level wiring is done with row and column interconnections.
- FLEX 8K family is obsolete.
- FLEX 10K family has this basic structure plus an Embedded Array Block (EAB) in each row.
- An EAB is a block of 2K bits of SRAM.



Figure 3.11 Inside a FLEX 10K Logic Array Block (LAB)


Each Logic Element (LE) contains the following:

- A 16-bit SRAM lookup table (LUT), which can implement an arbitrary 4-input logic function (stored as a truth table).
- Circuitry that forms a fast carry chain and a fast cascade chain (see later).
- A D-register that can be by-passed.
- Various preset/reset logic for the register.



Figure 3.12 Internal architecture of a Logic Element (LE)
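The idea of a 16-bit lookup table implementing any 4-input function can be sketched in Python as follows. The bit ordering (input `a` as the most significant address bit) and the helper names `build_lut` and `lut_read` are assumptions chosen for this example.

```python
def build_lut(fn):
    """Precompute the 16-entry truth table of a 4-input function."""
    table = []
    for address in range(16):
        a = (address >> 3) & 1
        b = (address >> 2) & 1
        c = (address >> 1) & 1
        d = address & 1
        table.append(fn(a, b, c, d) & 1)
    return table


def lut_read(table, a, b, c, d):
    """Evaluate the function by addressing the 16 stored bits with the inputs."""
    return table[(a << 3) | (b << 2) | (c << 1) | d]


# Example: a 4-input odd-parity function stored as 16 configuration bits.
parity_lut = build_lut(lambda a, b, c, d: a ^ b ^ c ^ d)
```

Any other 4-input function only changes the 16 stored bits, not the lookup mechanism, which is why the same LE can serve as arbitrary logic.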

|                          | EPF10K10<br>EPF10K10A | EPF10K20 | EPF10K30<br>EPF10K30A | EPF10K40 | EPF10K50<br>EPF10K50V<br>EPF10K50A | EPF10K70 | EPF10K100<br>EPF10K100A | EPF10K130V<br>EPF10K130A | EPF10K250A |
|--------------------------|-----------------------|----------|-----------------------|----------|------------------------------------|----------|-------------------------|--------------------------|------------|
| Typical Gates            | 10,000                | 20,000   | 30,000                | 40,000   | 50,000                             | 70,000   | 100,000                 | 130,000                  | 250,000    |
| Logic Elements           | 576                   | 1,152    | 1,728                 | 2,304    | 2,880                              | 3,744    | 4,992                   | 6,656                    | 12,160     |
| RAM Bits                 | 6,144                 | 12,288   | 12,288                | 16,384   | 20,480                             | 18,432   | 24,576                  | 32,768                   | 40,960     |
| Registers                | 720                   | 1,344    | 1,968                 | 2,576    | 3,184                              | 4,096    | 5,392                   | 7,120                    | 12,624     |
| Maximum User<br>I/O Pins | 134                   | 189      | 246                   | 189      | 310                                | 358      | 406                     | 470                      | 470        |

Figure 3.12a The size of FLEX 10K devices



Figure 3.13 Carry Chain in FLEX 8K & FLEX 10K

- Starts in first LE (LE1) of every LAB
  - Function's carry chain can begin in any LE of a LAB
- Runs downward through LEs of a LAB
- At end of LAB:
  - FLEX 8000: continues to top of next LAB in same row
  - FLEX 10K: continues to top of second-next LAB in same row
- Stops at end of row
- Stops at EAB (FLEX 10K)



Figure 3.14 The propagation of the carry chain for FLEX8K and 10K
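A carry chain is typically used for adders, with one LE per bit passing its carry to the LE below. The following Python sketch models that behaviour only; the function names are hypothetical and the default width of 8 is chosen to match the 8 LEs in one LAB.

```python
def le_arithmetic(a, b, carry_in):
    """One logic element of an adder: returns (sum bit, carry out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return sum_bit, carry_out


def ripple_add(x, y, width=8):
    """Add two unsigned integers using one LE per bit.

    The carry produced by each LE is handed to the next LE, which is the
    role played by the dedicated carry chain.
    """
    carry = 0
    result = 0
    for bit in range(width):
        s, carry = le_arithmetic((x >> bit) & 1, (y >> bit) & 1, carry)
        result |= s << bit
    return result, carry
```

The second return value is the carry leaving the last LE, which in a wider adder would continue into the next LAB on the chain.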



Figure 3.15 FLEX 8K/10K Cascade Chain



Figure 3.16 Cascade chain helps in reducing delay when cascading two or more LEs


### 3.5.1 Embedded Array Block in FLEX 10K

- An EAB is a large block, 2048 bits, of embedded RAM
- Synchronous and asynchronous operation supported
- Synchronous read and write cycle times of 9.5 ns for -3 speed grade



EABs cascaded to create wider RAM

- no speed penalty up to 2048 bits deep
- MAX+PLUS II configures RAM in the fastest way possible

EABs cascaded, muxed to create deeper RAM

- no speed penalty up to 2048 bits deep
- MAX+PLUS II configures RAM in the fastest way possible



Figure 3.17 Three different configurations of EAB memory
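The difference between combining blocks for width and for depth can be sketched in Python. This is an illustrative model only: it assumes each 2048-bit block is organised as 256 words of 8 bits, and the class names `Eab`, `WiderRam` and `DeeperRam` are hypothetical.

```python
class Eab:
    """One embedded block modelled as a 256-word x 8-bit RAM (2048 bits)."""

    DEPTH = 256
    WIDTH = 8

    def __init__(self):
        self.words = [0] * self.DEPTH

    def write(self, address, value):
        self.words[address] = value & 0xFF

    def read(self, address):
        return self.words[address]


class WiderRam:
    """256 x 16: two blocks share the address, each holds one byte lane."""

    def __init__(self):
        self.low, self.high = Eab(), Eab()

    def write(self, address, value):
        self.low.write(address, value)
        self.high.write(address, value >> 8)

    def read(self, address):
        return (self.high.read(address) << 8) | self.low.read(address)


class DeeperRam:
    """512 x 8: the extra address bit selects (multiplexes) one of two blocks."""

    def __init__(self):
        self.banks = [Eab(), Eab()]

    def write(self, address, value):
        self.banks[address >> 8].write(address & 0xFF, value)

    def read(self, address):
        return self.banks[address >> 8].read(address & 0xFF)
```

The wider memory needs no extra logic on the read path, while the deeper one needs an output multiplexer driven by the upper address bit.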



Figure 3.18 An 8 x 8 multiplier using 4 EABs

- A single 4 x 4 multiplier will fit into an EAB
- Larger multipliers can be built from EABs and LEs.
- You can find plenty of useful information and application notes on: www.altera.com
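The multiplier claims can be checked with a short Python sketch. A 4 x 4 multiplier needs 8 address bits and 8 output bits, that is 256 x 8 = 2048 bits, which is exactly one block. The decomposition of the 8 x 8 product into four partial products is standard arithmetic; the function names are hypothetical, and the additions stand in for adder logic built outside the memory blocks.

```python
# A 4 x 4 multiplier stored as a lookup table: the address is the two
# 4-bit operands side by side, the stored word is their 8-bit product.
MUL4_TABLE = [(addr >> 4) * (addr & 0xF) for addr in range(256)]


def mul4(a, b):
    """4 x 4 multiply by table lookup, as a single memory block would do it."""
    return MUL4_TABLE[(a << 4) | b]


def mul8(x, y):
    """8 x 8 multiply from four 4 x 4 lookups plus shifted additions."""
    x_hi, x_lo = x >> 4, x & 0xF
    y_hi, y_lo = y >> 4, y & 0xF
    return (
        (mul4(x_hi, y_hi) << 8)
        + (mul4(x_hi, y_lo) << 4)
        + (mul4(x_lo, y_hi) << 4)
        + mul4(x_lo, y_lo)
    )
```

The four lookups correspond to the four memory blocks of the 8 x 8 multiplier.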


### 3.6 Altera APEX family of FPGAs

### **APEX 20K**



# Combine & Enhance Strengths of Prior Architectures for System-on-a-Chip Applications

© 1998 Altera Corporation

## **MultiCore**<sup>™</sup> **Architecture**

- MultiCore Makes Million-Gate PLD Design Possible
- Facilitates Efficient IP Integration
- Look-up Table Core: FLEX 6000 Model
- Product-Term Core: MAX 7000 Model
- Memory Core: FLEX 10KE Model


### **APEX 20K Features**

- 0.25-μ/0.18-μ, 6LM SRAM Process
  - 4,160 to 42,240 Logic Elements
  - 53,000 to 541,000 Bits of RAM
  - 416 to 4,224 Macrocells
  - 2M Gate Density
- 125-MHz System Performance
  - 64-Bit, 66-MHz PCI Compliant
- MultiCore<sup>™</sup> Embedded Architecture
  - Product Term with 3.9-ns Performance
  - High-Speed Dual-Port RAM
  - Content Addressable Memory (CAM)
- 4-Level Continuous FastTrack Interconnect™
  - New Level of Routing Hierarchy
- 1X, 2X, 4X
- Common I/O Standard Support
  - LVTTL, LVCMOS, SSTL3, GTL/GTL+, LVDS
- MultiVolt™ I/O Interface
- Advanced FineLine BGA™ Packaging

## **APEX 20K Family**



## **APEX 20K Performance**



pykc/2001

## **APEX 20K Power Savings**



Note: 25-MHz System Performance

## **APEX 20K MegaLAB**



- Logic Element (LE)
  - 4-Input LUT
  - D Flip-Flop
  - Carry & Cascade Chains
- Logic Array Block (LAB)
  - 10 LEs
- MegaLAB
  - 16 LABs
  - 1 Embedded System Block

**New Level** of Hierarchy


## **Embedded System Block**

Enhanced Embedded Structure

Optimized for System-Level Integration




## **Product Term Advantage**

- P-Term Superior for Combinatorial Functions
  - Address Decode, State Machines
- LUT Superior for Registered Data Path Functions

| Function                                  | EPF10K100B-1 | EPM7064S-5 |
|-------------------------------------------|--------------|------------|
| 16-state, 5-input/output<br>State Machine | 129 MHz      | 161 MHz    |
| 5 x 5 Registered I/O<br>Multiplier        | 166 MHz      | 59 MHz     |


## **Embedded Product-Term Capability**

- ESB Implements Product-Term Logic
  - 32 Product Terms
  - 16 Programmable DFFs + XOR + Parallel Expanders
- Can Be Cascaded to Implement Wide Fan-in Functions



### **Embedded RAM**

- Variable Width
  - 2,048 Bits per ESB
  - Easily Combined to Build Wider/Deeper Memories
  - 128 x 16, 256 x 8, 512 x 4, 1,024 x 2, 2,048 x 1
- Dual-Port
  - Independent Read/Write
  - 150-MHz Dual-Port FIFOs
  - Synchronous/Asynchronous




## **Content Addressable Memory (CAM)**

- CAM Accelerates Fast Search Applications
  - Functions as a Parallel Comparator
  - Order of Magnitude Faster than RAM (Serial)
- Looks up Data in Memory & Outputs Addresses
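The contrast between a serial RAM search and a CAM lookup can be sketched in Python. This is a behavioural illustration only; the function names are hypothetical, and the list comprehension merely stands in for comparators that, in hardware, all operate in the same cycle.

```python
def ram_search(memory, value):
    """Serial search of a RAM: read one address per step until a match."""
    steps = 0
    for address, word in enumerate(memory):
        steps += 1
        if word == value:
            return address, steps
    return None, steps


def cam_lookup(memory, value):
    """CAM-style lookup: every stored word is compared with the key at once.

    Returns the list of matching addresses (empty if there is no match).
    """
    match_lines = [word == value for word in memory]
    return [address for address, hit in enumerate(match_lines) if hit]
```

Note the inversion of roles: the RAM is given an address and returns data, whereas the CAM is given data and returns the address or addresses where it is stored.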



### **Common in High-Speed Communication Applications**


## **System-Level Memory Integration**

- Efficiently Supports Various RAM Requirements of a System-Level Design
  - Cache RAM, Dual-Port FIFO, ROM

| Function       | Configuration | Total ESBs | Performance |
|----------------|---------------|------------|-------------|
| Cache RAM      | 256 x 32      | 4          | 150 MHz     |
|                | 4,096 x 64    | 128        | 110 MHz     |
| Dual-Port FIFO | 128 x 32      | 2          | 150 MHz     |
|                | 128 x 64      | 4          | 150 MHz     |
| ROM            | 256 x 32      | 4          | 150 MHz     |
|                | 4,096 x 64    | 128        | 110 MHz     |
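The "Total ESBs" column follows from dividing the total number of bits by the 2,048 bits available in each ESB. A minimal Python check, with the hypothetical helper `esbs_needed`, and assuming each memory can be tiled exactly from the available ESB aspect ratios:

```python
ESB_BITS = 2048  # capacity of one ESB


def esbs_needed(words, width):
    """Number of 2,048-bit blocks needed for a words x width memory."""
    total_bits = words * width
    return -(-total_bits // ESB_BITS)  # ceiling division
```

For instance, a 256 x 32 memory holds 8,192 bits and so needs 4 ESBs, while 4,096 x 64 holds 262,144 bits and needs 128.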


## **Phase-Locked-Loop**

- Altera First Shipped PLL on FLEX 10K Devices in 1996
- Next-Generation PLL
  - ClockLock<sup>™</sup> Synchronization Circuitry
  - ClockBoost<sup>™</sup> Multiplication Circuitry (1X, 2X & 4X)
  - Extended Frequency Range

| Parameter            | Min. | Max. | Unit |
|----------------------|------|------|------|
| Output Frequency     | 1    | 133  | MHz  |
| Input Frequency (x1) | 1    | 133  | MHz  |
| Input Frequency (x2) | 1    | 66   | MHz  |
| Input Frequency (x4) | 1    | 33   | MHz  |
| Clock Jitter         |      | 500  | ps   |
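The input frequency limits in the table are consistent with the output limit once the multiplication factor is applied (for example 33 MHz x 4 = 132 MHz, within the 133 MHz maximum). A small Python sketch of that check, using a hypothetical helper `clockboost_output`:

```python
MAX_INPUT_MHZ = {1: 133, 2: 66, 4: 33}  # from the table, per multiplication factor
MAX_OUTPUT_MHZ = 133


def clockboost_output(input_mhz, factor):
    """Return the multiplied clock frequency, or raise if out of range."""
    if factor not in MAX_INPUT_MHZ:
        raise ValueError("factor must be 1, 2 or 4")
    if not 1 <= input_mhz <= MAX_INPUT_MHZ[factor]:
        raise ValueError("input frequency outside the tabulated range")
    output_mhz = input_mhz * factor
    if output_mhz > MAX_OUTPUT_MHZ:
        raise ValueError("output frequency outside the tabulated range")
    return output_mhz
```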




## **APEX 20K: Complete System Integration**

- 1-GBit Ethernet 8-Port Switch
  - 64-Bit, 66-MHz PCI
  - 2.5-V/1.8-V Supply Voltage
  - I/O Interfaces: LVTTL, SSTL-3, GTL+, LVDS




### 3.7 Xilinx 4000 Series FPGA



Figure 3.19 Xilinx 4K FPGA chip level architecture

- Xilinx was the first to introduce SRAM-based FPGAs using lookup tables (LUTs).
- The Xilinx 4000 series contains four main building blocks:
  - Configurable Logic Block (CLB)
  - Switch Matrix
  - VersaRing
  - Input/Output Block (IOB)

### 3.7.1 Configurable Logic Block Architecture



Figure 3.20 Internal architecture of Xilinx 4000 CLB

- Each CLB is more complex than an Altera Logic Element.
- Each CLB has two 4-input LUTs and two registers.
- The two LUTs implement two independent logic functions F and G.
- The outputs F' and G' from the two LUTs inside each CLB can be combined to form a more complex function H.
- Not shown here are carry and cascade chain circuits similar to those in Altera's FLEX devices.
- For the 4000E family, each CLB can be configured as synchronous RAM. Write address, data and control are synchronized to the write clock. This is called distributed RAM.
- Possible configurations are:
  - 1. Two independent 16 x 1 RAMs
  - 2. One 32 x 1 or 16 x 2 RAM
  - 3. One 16 x 1 dual-port RAM (second port is read-only)
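The single-port RAM configurations above can be sketched in Python by treating each 4-input LUT as a 16 x 1 memory. The class names are hypothetical, and the way the fifth address bit selects between the two LUTs in the 32 x 1 case is an assumption made for illustration rather than a description of the actual CLB wiring.

```python
class Lut16x1:
    """One 4-input LUT used as a 16 x 1 RAM."""

    def __init__(self):
        self.bits = [0] * 16

    def write(self, address, bit):
        self.bits[address & 0xF] = bit & 1

    def read(self, address):
        return self.bits[address & 0xF]


class Ram32x1:
    """32 x 1 RAM: address bit 4 selects between the F and G LUTs."""

    def __init__(self):
        self.f, self.g = Lut16x1(), Lut16x1()

    def _select(self, address):
        return self.g if address & 0x10 else self.f

    def write(self, address, bit):
        self._select(address).write(address, bit)

    def read(self, address):
        return self._select(address).read(address)


class Ram16x2:
    """16 x 2 RAM: both LUTs share the address, one data bit each."""

    def __init__(self):
        self.f, self.g = Lut16x1(), Lut16x1()

    def write(self, address, value):
        self.f.write(address, value)
        self.g.write(address, value >> 1)

    def read(self, address):
        return (self.g.read(address) << 1) | self.f.read(address)
```

Both arrangements use the same 32 bits of LUT storage; only the addressing differs.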


### 3.7.2 Neighbourhood Interconnect of Xilinx 4000

- Direct connections from CLB to adjacent CLB or IOB
- Fastest interconnect
  - Less than 1 ns delay
- Abundant in XC5000
- Limited in XC3000
- Limited to special resources in XC4000



### 3.7.3 Switch Matrix Routing

- Flexible but slow if a route crosses many channels
- XC3000
  - 5 lines per channel
- XC4000
  - 8 similar Single-Length lines
  - 4 Double-Length lines skip every other switch matrix
- XC5000
  - Adds local interconnect matrix



### 3.7.4 VersaRing - Routing for I/O Blocks




### 3.7.5 Virtex - The Future for FPGA?

### XC4000 Family (Smallest and Largest)

| Part  | Gates | CLB Matrix | CLBs | Flip-Flops | RAM bits (max) | IOBs |
|-------|-------|------------|------|------------|----------------|------|
| 4003E | 3k    | 10x10      | 100  | 360        | 3.2k           | 80   |
| 4025E | 25k   | 32x32      | 1k   | 2560       | 33k            | 256  |
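The XC4000 figures in this table can be reproduced from the CLB structure described in section 3.7.1: two 16-bit LUTs and two registers per CLB. The sketch below additionally assumes two flip-flops per I/O block, which is not stated in these notes but makes the flip-flop column come out exactly; the function name `xc4000_resources` is hypothetical.

```python
LUTS_PER_CLB = 2   # two 4-input LUTs per CLB
BITS_PER_LUT = 16  # a 4-input LUT holds 16 bits
FFS_PER_CLB = 2    # two registers per CLB
FFS_PER_IOB = 2    # assumption: two flip-flops in each I/O block


def xc4000_resources(rows, cols, iobs):
    """Derive CLB count, flip-flop count and maximum RAM bits for a device."""
    clbs = rows * cols
    return {
        "clbs": clbs,
        "flip_flops": clbs * FFS_PER_CLB + iobs * FFS_PER_IOB,
        "ram_bits": clbs * LUTS_PER_CLB * BITS_PER_LUT,
    }
```

For the 4003E this gives 100 CLBs, 360 flip-flops and 3,200 RAM bits; for the 4025E it gives 1,024 CLBs, 2,560 flip-flops and 32,768 RAM bits, matching the rounded table entries.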

### Virtex Family (Smallest and Largest)

| Part    | Gates | CLB Matrix | CLBs  | Flip-Flops | RAM bits (max) | IOBs |
|---------|-------|------------|-------|------------|----------------|------|
| XCV50   | 58k   | 16x24      | 1.7k  | 7.6k       | 57k            | 180  |
| XCV1000 | 1124k | 64x96      | 27.6k | 112k       | 524k           | 512  |

- Virtex is the new Xilinx family of FPGAs, with higher density, better routing and larger capacity.