# HEWLETT-PACKARD CO.



NOTE: This page provides a running history of changes for a multi-page drawing which cannot conveniently be re-issued completely after each change. When making a change, list for each page all before-and-after numbers (within reason; use judgement, and use an "extensive" revision note if loss of past history is tolerable, or retype the complete page) and associate with each a symbol made up of the change letter and a serial subscript to appear here and on the page involved (there enclosed in a circle, triangle, or other attention-getting outline).

Ltr REVISIONS DATE INITIALS A 10-17-82 As Issued Model No. Stock No. 98561 B Theory of Operation Title Description Sept. 27 1985 By Sheet No. Jon Rubinstein 16 Supersedes Drawing No. A-98561-66519-9



98561-66519 CPU

THEORY OF OPERATION

Jon Rubinstein Oct 1, 1985



#### Introduction

#### 1.1 Scope

This document is the Theory of Operation for the 98561-66519 68020 CPU board for the Series 300 computer family. It discusses the circuit implementation of the '19 board in detail. It is suggested that the 98561-66519 ERS (under same part number as this document) be read since a knowledge of the ERS is assumed. Other related documents are as follows:

- MC68020 User's Manual, Motorola
- MC68881 User's Manual, Motorola
- DIO Bus specification

## 1.2 Overview and Block Diagram (page 1)

The '19 board is divided into six blocks in the block diagram. Each block represents one or two pages of the schematic. The processor block is one page and consists of the main processor and the support circuitry. The Cache\_TLB block contains the cache key on one page and the cache data and TLB tag circuitry on the other. The MMU block contains the MMU data path, the TLB data path, and conversion of the data bus from 32 bits to 16 bits. The DIO INT block contains the DIO interface circuits on one page and the DIO peripherals on the other. The control block contains most of the state machines and control circuitry for the board. Finally, the Test block contains the test connectors and the spare gates.

Each section following will cover specific operation of the various blocks. For signal naming conventions, DAISY format is used: inverted signals are followed by a "~". Timing diagrams can be found under part number 98561-66519-7.

#### 2.0 The Processor Block (page 2)

#### 2.1 The Processors

The processor block contains the 68020 processor (U42), the 68881 co-processor (U29), and associated circuitry. The 68020 has a full 32 bit address and data bus. The address bus pins connect to the Logical Address (LA) bus 0 thru 31; the data bus is shared between both the processor and co-processor and is labeled Processor Data (PD) 0 thru 31. The PD bus connects to the processors and then immediately leaves the block.

The LA bus is decoded and buffered in the processor block. The buffers U40, U41, U58, and U59 convert the LA bus to the Buffered Logical Address bus (BLA) 0 thru 31; note that the address is latched by address strobe (AS~). The LA bus is connected to series resistors RP3, RP4, and RP6 to become the Tag Address bus (TA) 2 thru 21. The TA bus goes directly out of the processor block into the Cache\_TLB block.

LA, BLA, and the function codes FC2-0 are used to decode the various spaces. CPU space (FC = 111) is decoded by U3 and generates CPU Space Access (CSA~). Once CPU space is decoded, interrupt acknowledge (IACKFC~) is generated by U25 and is ORed with AS~ to generate Autovector (AVEC~) for the processor. If an invalid CPU space access occurs, ICSA~ is generated, which causes a bus error. When FC = 011, U3 decodes RSA~, which is used for selective purge.
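The function-code decode described above can be sketched in a few lines. This is only an illustration: the function name and the `NORMAL` label are hypothetical, and on the board the decode is done in hardware by U3.

```python
def decode_space(fc: int) -> str:
    """Classify a 3-bit function code (FC2-0) as the text describes for U3."""
    if fc == 0b111:
        return "CSA"     # CPU space access
    if fc == 0b011:
        return "RSA"     # space used for selective purge
    return "NORMAL"      # hypothetical label for every other function code
```

For example, `decode_space(0b111)` returns `"CSA"`, while a supervisor data access (`0b101`) falls through to the hypothetical `"NORMAL"` label.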

The chip select for the co-processor is generated by the address comparator U4 and the 68881 detect circuits U26 and U45. U26 detects when the 68881 is installed, then the control register signal Floating Point Enable (FPEN) is ANDed in by U45. If the co-processor is installed and enabled, chip select is generated by the correct co-processor address (Address = \$00022000). Otherwise ICSA~ is generated, causing a bus error, and the processor executes an F-line trap.
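As a rough model of the co-processor select logic, the sketch below returns a chip-select flag and an ICSA flag for one access. It assumes an exact match on the quoted address (the real comparator U4 may look at only some address bits), uses active-true booleans instead of the board's inverted signals, and the function and constant names are hypothetical.

```python
COPROCESSOR_ADDRESS = 0x00022000  # co-processor address quoted in the text


def coprocessor_select(address, cpu_space, installed, fpen):
    """Return (chip_select, icsa) for one access, both as active-true flags."""
    # Assumption: only a CPU space access at exactly this address is considered.
    if not (cpu_space and address == COPROCESSOR_ADDRESS):
        return (False, False)
    chip_select = installed and fpen   # 68881 present AND FPEN set
    return (chip_select, not chip_select)
```

With the 68881 missing or FPEN cleared, the same access yields ICSA instead of chip select, which matches the bus error and F-line trap path described above.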

#### 2.2 Clock Circuits

The main clock is derived from clock oscillator Y2. The output of Y2 drives U38, where an external clock can be inserted by installing JP1 (for 3065 and other testing). The clock is divided by U39 and the Processor Clock (PCLK) is generated. PCLK drives the 68020 and, based on the position of JP3, the co-processor. To provide a clock for the rest of the system, PCLK is generated by U22. The bus clock (BUSCLK) is generated by the other half of U39 and is PCLK divided by 2. Note that this clock only runs when the bus state machine is executing a cycle.

Oscillator Y1 is divided by U21 and then can be jumpered by JP3 to drive the 68881. Ideally, the clock frequency required for the 68881 would match the frequency of the processor; however, at introduction the 68881 runs at a slower frequency, thus requiring an additional clock. When the 68020 and the 68881 are specified for the same frequency, JP3 is in the B position and Y1 is deleted.

## 2.3 MMU and Cache Support

The INHIBIT and MAPON signals are generated by U22, U27, U23, U28, and U2 (location B6). These signals are used on a cycle by cycle basis to determine if mapping is enabled for that cycle. When MAPON is asserted then the present cycle is mapped. The INHIBIT signal causes the cache to be disabled for that cycle.

The cache system is enabled only when required, thus reducing the power required for the high speed RAMs. The RAM Enable (RAMEN~) and Data Cache Enable (DCEN~) are generated by half of U21 (location B4), which is controlled by U131 and U132 (location F8). RAMEN~ is asserted when ECS~ is driven from the processor. If on the next falling PCLK edge AS~ is not asserted, RAMEN~ and DCEN~ are deasserted. If AS~ is asserted, then RAMEN~ remains asserted and DCEN~ is asserted; thus the cache and TLB RAMs will stay enabled. When both maps are off and the cache is disabled, U131 and U132 generate RAMDIS~, which blocks the creation of the enable signals.
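The enable sequencing above can be summarized as a small truth function. This is a sketch under stated assumptions: signals are modeled as active-true booleans, DCEN is assumed to be deasserted during the first phase (the text does not say), and the function name is hypothetical.

```python
def ram_enables(ecs, as_asserted, ramdis):
    """Return two (RAMEN, DCEN) pairs: one when ECS is driven, one after
    the next falling PCLK edge."""
    if ramdis or not ecs:
        # RAMDIS blocks the creation of the enable signals entirely.
        return (False, False), (False, False)
    at_ecs = (True, False)  # RAMEN asserted on ECS; DCEN state here is assumed
    # Address strobe decides whether the RAMs stay powered for the cycle.
    after_edge = (True, True) if as_asserted else (False, False)
    return at_ecs, after_edge
```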

#### 3.0 Cache and TLB Tag Circuits

This portion of the board is broken into two pages of schematics. The first page contains the cache key (or tag), the second page has the cache data and the TLB tag.

The cache is a 16 Kbyte set associative direct mapped cache with a line size of 32 bits, a set size of one, and uses virtual addresses. Thus, the cache contains 4096 32 bit entries. The TLB consists of two 1024 entry set associative direct mapped TLBs, one for supervisor accesses and the other for user accesses.

## 3.1 Cache Key (Page 3)

Since the cache has 4K entries, the address is broken up as follows: the lower two bits (A0, A1) are the offset into the line, the next eleven bits (A2-A12) offset into each bank of RAM. A13 selects between the banks, and the rest of the bits (A14-A31) are used as the cache key. The center of the cache key is the two banks of 2Kx8 key RAMs, which are enabled by RAMEN~. The output enable and write signal are qualified with A13 to separate the two banks. The lower bank consists of U71, U72, and U73; the upper bank is U101, U102, and U103.
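The address breakdown given above maps directly onto shifts and masks. The following sketch splits a 32 bit address into the four fields; the function name and the tuple layout are illustrative choices, not part of the design.

```python
def split_cache_address(addr):
    """Split a 32 bit address into (offset, ram_index, bank, key)."""
    offset = addr & 0x3               # A0-A1: byte offset within the line
    ram_index = (addr >> 2) & 0x7FF   # A2-A12: 11 bits into each 2K bank
    bank = (addr >> 13) & 0x1         # A13: lower or upper bank
    key = (addr >> 14) & 0x3FFFF      # A14-A31: 18 bit cache key
    return offset, ram_index, bank, key


ENTRIES = 2 * 2048           # two banks of 2K entries
CACHE_BYTES = ENTRIES * 4    # 32 bit lines, so 16 Kbytes in total
```

The two constants simply restate the sizes in the text: 4096 entries of four bytes each give 16384 bytes.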

The data path into the key comes from the BLA bus. U86, U108, U129 buffer BLA13 thru BLA31 into the key. The output from the key drives the first level of comparators U109, U110, U130. The cache valid bits are stored in U74 (user) and U104 (supervisor). The cache is designed such that each entry in the cache has two valid bits, one for supervisor and one for user. The goal with the two valid bits is to allow the software to purge either user entries or supervisor entries separately.

A 4Kx1 RAM would have been desirable for each of the valid bits; however, only 1Kx4 RAMs are available with a clear capability. To convert the 1Kx4 to a 4Kx1 RAM, two additional devices are used. An 8 to 1 MUX (U128) is used to select the correct valid bit of the 8 possible outputs. BLA2, BLA3, and FC2 are used to select between the possible outputs. For a write to the valid bits, a 20R8 PAL (U105) is used to store the three unchanged bits and modify the correct bit. When either the user or supervisor MMU is off, the function code select to the valid bit circuit is forced to a one by U45 and U140 so that only the supervisor valid bit is used.
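A minimal sketch of how two 1Kx4 RAMs can stand in for 4Kx1 valid-bit arrays follows. Several details are assumptions rather than facts from the text: that address bits 4 thru 13 address the 1Kx4 RAMs, the ordering of the three MUX select inputs, and both function names.

```python
def valid_bit_select(bla, fc2):
    """Return (ram_address, mux_input) for a valid-bit lookup."""
    ram_address = (bla >> 4) & 0x3FF        # assumed: bits 4-13 address the RAM
    bit_in_nibble = (bla >> 2) & 0x3        # BLA2 and BLA3 pick 1 of 4 bits
    mux_input = (fc2 << 2) | bit_in_nibble  # assumed select order; FC2 picks RAM
    return ram_address, mux_input


def update_nibble(nibble, bit, value):
    """Change one bit of a 4 bit word and keep the other three, as the
    write PAL is described as doing."""
    mask = 1 << bit
    return (nibble | mask) if value else (nibble & ~mask & 0xF)
```

The second function shows why a write needs a read-modify-write step: the RAM can only be written four bits at a time, so the three untouched valid bits have to be carried through.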

The hit compare is implemented using two levels of comparators. The first level, U109, U110, and U130, compares the output of the key RAMs with the logical address from the processor. In addition, the valid bit is checked and address strobe is used to qualify the address. The outputs from the first stage feed both the second stage comparator (U111) and also generate Cache Hit (CHIT). The second stage comparator checks for a hit in the cache, and a hit in the TLB if the MMU is enabled for that cycle. If there is a hit in the cache and in the TLB and the cycle is a read, then the 3 Cycle Read (3CYRD) signal is generated, which becomes DTACK.
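The condition for a three cycle read reduces to a short boolean expression. The sketch below is a logical summary only, with a hypothetical function name and active-true inputs; it does not model the comparator timing.

```python
def three_cycle_read(cache_hit, tlb_hit, mmu_on, is_read):
    """True when the 3CYRD condition described in the text is met."""
    # The TLB hit only matters when the MMU is enabled for this cycle.
    translation_ok = tlb_hit or not mmu_on
    return cache_hit and translation_ok and is_read
```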

The data path into the TLB tags is buffered by U86 and U79. These devices buffer BLA22 thru BLA31 as well as BPD6, which becomes the Cache Inhibit (CI) bit. The output enable for all devices on the board that are not actively used is controlled by U26 and CTLOE. If KPUP is driven low, many devices are tristated. This feature is for 3065 test only.

# 3.2 Cache Data and TLB Tag (Page 4)

The cache data RAMs are divided into a high and low bank. Each bank is 32 bits wide and is made up of four 2Kx8 RAMs. The low bank U51, U54, U55, and U77, is enabled when address 13 (BLA13) is low. The upper bank U52, U53, U56, and U57, is enabled when address 13 is high. The address into the data banks is the BLA bus which has been buffered with series damping resistors RP3 and RP5 becoming the CA bus (CA2-CA12).

The entire data array is output enabled only when a cache hit occurs, i.e., 3CYRD~ is asserted. The outputs of the RAM drive the processor data (PD) bus directly. In order to keep the bus loading to a minimum, the PD bus is buffered by U5 thru U8. After buffering, the PD bus becomes the Buffered Processor Data bus (BPD).

The TLB tag RAMs consist of U75, U76, and U106. Since the '19 board uses 4 Kbyte pages, LA0 thru LA11 are used to offset into the page. LA12 thru LA21 are used to offset into the set associative TLB. FC2 is used to select between the two halves of the TLB. The tag stores address 22 thru 31 and the valid (VB) and CI bits. The TLB tag comparators, U78 and U87, compare the tag bits with BLA22 thru BLA31 and the valid bit. The outputs of the comparators are ORed together to generate Tag Hit (THIT), which represents when a tag hit occurs. The outputs are also fed to the cache second level comparators.
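The TLB address fields can be extracted the same way as the cache fields. In this sketch the function name and return layout are illustrative; the mapping of FC2 = 1 to supervisor accesses follows the standard 68000 family function code encoding.

```python
def split_tlb_address(la, fc2):
    """Split a logical address into (page_offset, index, tag, half)."""
    page_offset = la & 0xFFF      # LA0-LA11: offset into the 4 Kbyte page
    index = (la >> 12) & 0x3FF    # LA12-LA21: 1024 entries per TLB half
    tag = (la >> 22) & 0x3FF      # address bits 22-31 stored in the tag
    half = "supervisor" if fc2 else "user"
    return page_offset, index, tag, half
```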

## 4.0 MMU Data Path (page 5)

The MMU data path contains the conversion from the 32 bit data bus to 16 bits, the TLB data RAMs, and the MMU table walk address generation and multiplexers.

The 32 bit BPD bus is converted to a 16 bit bus by the bi-directional bus latches U12 thru U15. These devices allow the Internal Data (ID) bus to be connected to either BPD0-15 or BPD16-31. When reading in 32 bit mode, each pair of latches is loaded from the ID bus before a BPD bus transfer occurs. When a write occurs, the correct half of the BPD bus is transferred to the ID bus.
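The 32 to 16 bit conversion amounts to splitting a longword into two halves and reassembling them. The sketch below shows only the data arithmetic; the function names are hypothetical, and the order in which the two halves cross the ID bus is not modeled.

```python
def split_longword(value):
    """Return the (BPD16-31, BPD0-15) halves of a 32 bit value."""
    upper = (value >> 16) & 0xFFFF
    lower = value & 0xFFFF
    return upper, lower


def join_halves(upper, lower):
    """Rebuild the 32 bit value from two 16 bit transfers."""
    return ((upper & 0xFFFF) << 16) | (lower & 0xFFFF)
```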

The TLB data is stored in U62 and U92. These 2Kx8 RAMs are addressed by the same bits as the tag (BLA12-21 and FC2). The outputs are tied to the MMU Data bus (MD). This 16 bit bus provides the high order 12 bits of the physical address when the MMU is enabled. The MD bus is gated onto the Physical Address bus (PA) by U63 and U64. When the MMU is not enabled, the high order physical addresses are gated from the BLA bus to the PA bus by U44 and U61. The low order physical addresses (PA2-11) are gated from the BLA bus onto the PA bus by U44 and U88.

The rest of the circuitry in this block is used for handling MMU table walks. The table walk begins with enabling the correct root pointer to generate the first table lookup. The supervisor root pointer is stored in U31 and U33. The user root pointer is stored in U30 and U32. The root pointers contain the high order bits of the physical address and are gated onto the MD bus when the first level of a table walk begins.

The low order bits of the physical address are chosen from the BLA bus by the multiplexers U60, U80, and U81. For the first level table access, BLA22-31 are used to offset into the table. Once the 32 bit table entry is loaded into the data bus latches, the next level table address, stored in BPD12-BPD26, is gated onto the MD bus and latched into U63 and U64.

For the second level of table walk, the high order address is driven from U63 and U64, and the low order address comes from the multiplexers (BLA16-21). Once the entry is in the data latches, BPD is again gated onto the MD bus and the data written into the TLB RAMs.

If the referenced or dirty bits need to be set, a write cycle is started to the same address using data from U11. This latch stores a byte of the page table entry and gates it onto the ID bus for the write. The correct dirty and referenced bits are input into this latch and then memory is updated.

## 5.0 DIO Bus Interface

The DIO bus interface consists of two pages. The first page controls the DIO bus cycle and buffers signals to and from the bus. The second page contains the required DIO peripheral circuits.

## 5.1 Interface Circuits (page 6)

The DIO interface circuits handle all connections to the DIO bus. The PA bus is buffered by U68, U98, and U123 (location G6) and becomes the Buffered Address bus (BA) on the backplane. The ID bus is buffered by U70 and U99 (location I3) before becoming the Buffered Data bus (BD). All control signals driven onto the bus (i.e., BAS, BDS, etc.) are buffered by U69 and U100 (location I6).

The seven interrupt lines on the bus are encoded into IPL0, IPL1, and IPL2 by U146 (location B2). The reset and halt lines are conditioned through U145, U85, U90, U144, U145, and U140 (location B4). This reset circuit is identical to the circuit used on the previous Series 200 CPU boards. Bus error timeout capability is provided by counter U95 (location D2). At 16.67 MHz, the timeout is 7.7 usec; at 20 MHz, the timeout is 6.4 usec. In addition, the optional synchronous I/O capability (AUTO-DTACK) is provided by this part. The bus definition in Series 300 does not provide for synchronous DTACK space; however, this feature is provided to allow use of old boards for internal use only. The possible positions for JP7 are in the following table:

| Position | Function                        |
|----------|---------------------------------|
| A        | No synchronous DTACK (standard) |
| OUT      | 16.67 MHz synchronous DTACK     |
| B        | 20 MHz synchronous DTACK        |

Table X1: JP7 Positions
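The two bus error timeout figures quoted for counter U95 can be cross-checked with simple arithmetic. Both work out to roughly 128 processor clocks; that count is an inference from the quoted numbers, not a value stated in this document, and the function name is hypothetical.

```python
def timeout_clocks(timeout_usec, clock_mhz):
    """Number of clock periods that fit in the timeout (usec * MHz)."""
    return timeout_usec * clock_mhz


clocks_at_16 = round(timeout_clocks(7.7, 16.67))  # about 128
clocks_at_20 = round(timeout_clocks(6.4, 20.0))   # about 128
```

A fixed clock count would explain why the timeout shortens as the processor clock rises.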

Bus precharge is provided by U140, U141 and U100 (location E3). BAS is delayed by 60 nsec before enabling the precharge driver which drives DTACK and BERR high.

DTACK and BERR from the bus are double synchronized by U124 and U34 (location G2). The first stage of the synchronizer is clocked by BUSCLK or BUSCLK~, depending on the position of JP2. BUSCLK is half of the frequency of PCLK, thus the bus runs at 8 MHz for the 16 MHz processor, and 10 MHz for the 20 MHz processor. At 20 MHz a later edge of the clock is used to synchronize, thus lengthening the minimum sample time to meet bus specifications. The following table shows the positions for the jumper.

| Position | Function            |
|----------|---------------------|
| A        | 16.67 MHz operation |
|          | 20 MHz operation    |

Table X2: JP2 Positioning

The second stage of the synchronizer, U34, is clocked by PCLK~. U142 is used to add VPADTK~ to the DTACK path and to provide sufficient hold time for 20 Mhz operation.

Address decode for the '19 board is implemented using two PALs, U67 and U96 (location E6). U67 is the first stage of decoding; its outputs are DTACK, CPU peripheral access, and Boot ROM access. The second stage breaks the address select into Boot ROM chip select, timer chip select, LED write, MMU register select, and DTACK.



Bus arbitration is generated by U97 and U122 (location C6). All the control signals are double synchronized by U97 and then fed into the bus arbitration state machine U122. The flow chart of the state machine can be seen in appendix A; however, it is very similar to the standard 68000 bus arbitration. One major enhancement in bus arbitration for the '19 board is the ability of the processor to run out of cache in parallel with alternate bus master transactions occurring on the bus.

Finally, the 6800 control signals for the 6840 timer (and for use in a Series 200 system) are generated by U66 (location C8). This state machine outputs the E clock and VMA based on when SVPA occurs. The bus state machine is DTACKed by the VPADTK signal, signifying the completion of the 6800 bus cycle. The flow chart for this state machine can be seen in appendix B.

# 5.2 DIO Peripherals (page 7)

This page contains the DIO peripheral circuits and control/status register. The control/status register is located at an address of \$5F400E and is split into two parts. The register portion consists of U50, U47, and U48. These registers are clocked on WRCTL~ and cleared on power up. U50 is also cleared when a bus error occurs and the TEST bit is set. The MMU bits are set by the MMU control PALs (control page). For reading the control/status register, the signals are gated onto the ID bus using U35 and U49.

The Boot ROMs are installed in U18, U19, U36, and U37. The lower address ROMs are U18 and U19; the upper address ROMs are U36 and U37. Boot ROM space begins at 0 and extends up to \$0001FFFF. The address decoding for the various size ROMs is controlled by jumpers JP4, JP5, and JP6. The following table shows jumper positioning:

| ROM size | JP6 | JP4 | JP5 | Pinout |
|----------|-----|-----|-----|--------|
| 128K     | A   | B   | C   | JEDEC  |
| 256K     | B   | A   | C   | JEDEC  |
| 512K     | C   | A   | D   | JEDEC  |
| 256K     | B   | NC  | A   | Mostek |

Table X3: JP4, JP5, JP6 Positioning

The test LEDs CR1 and CR2 are driven by U20, which is cleared at power up and clocked with a write to Boot ROM space with PA14 high. The 6840 timer (U17) is controlled by the 6800 state machine and a 250 kHz oscillator (Y3). Its 8 bit data bus is tied to the lower half of the ID bus. The timer is fixed at interrupt level 6.

# 6.0 Control Circuits (page 8)

The control page contains most of the control circuitry for the data path in the other six pages. It can be broken down into the DIO bus state machine, MMU state machine, cache control circuits, register control circuits, and 68020 bus cycle control circuits.

## 6.1 Bus State Machine

The DIO bus state machine consists of U118 and U117 (location H3). It is capable of running several types of cycles: a standard 16 bit read or write cycle, a cache read or write cycle and an MMU read or write cycle. The cache cycle is two back to back DIO bus cycles used to access 32 bits of data from memory. The MMU read cycle is the same as a cache read cycle. The MMU write cycle generates a byte write. These DIO bus cycles look exactly like a 68000 bus cycle running at half the system clock frequency.

As an overview, the bus state machine requests the bus with BREQ; when BMINE is asserted, the bus cycle may begin (see appendix C for flow chart). If it is a read cycle, the right hand path is taken, otherwise the lower path is taken. The bus may be arbitrated away between consecutive 16 bit accesses for a 32 bit cache cycle. Once the bus cycle is begun, the CYCLE signal is asserted, which starts the bus clock. With the bus clock running, DTACK and BERR can be sampled (see DIO Interface page). On completion of the bus cycle, the bus state machine asserts either DRDY or DERR, depending on whether the access was successful.

# 6.2 MMU State Machine

The MMU control is implemented by U43, U83, U135, U137, and associated circuitry (around H5). The state machine flow chart can be found in appendix D. The MMU table walk begins with the segment table lookup, which is executed by requesting the bus state machine to execute an MMU read cycle. Next the page table entry is fetched in a similar manner. Finally, if the dirty or referenced bit needs to be set, an MMU write cycle is requested. If an error occurs during the translation, the correct MMU error bit is set while the others are cleared.

The data path is controlled by Enable User Root Pointer (ENURP~), Enable Supervisor Root Pointer (ENSRP~), Segment-Page Index (S\_PINDX), and MMU Data Enable (MDEN~). The processor is held off by the Walk Enable (WALKEN~) signal, which drives the halt input. Upon completion of the MMU table walk, either bus error is asserted, causing an MMU fault, or halt and bus error are asserted, causing the processor to rerun the bus cycle.

#### 6.3 Bus Cycle Selection and Cache Control

The choice of what kind of bus cycle or MMU cycle is executed is controlled by U93, U94, U84, U138, U119, U120, U28, and associated circuitry (around E4). If the mapper is on (LMAPON is asserted) and there is a TLB miss or a write cycle without the dirty bit set (LLKUP is asserted), then the LOOKUP signal is asserted, which blocks the bus access and starts the MMU state machine.

If LOOKUP is not asserted, then a bus cycle is to be run. The choice of whether to run a cache cycle (CACY) or a normal bus cycle (PRCY) is selected by U93 and U94. If a cache hit has occurred, the bus cycle is blocked by 3CYRD being asserted. Otherwise the choice between the two types of bus cycles is made by the CACHE signal, which is generated in the cache control circuits. If the mapper is on, the Cache Inhibit bit (CIBIT) blocks the CACHE signal so that pages which have the CIBIT set in the TLB are not cached. A bus cycle is also blocked if either RSA or CSA is asserted.
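The cycle selection rules in this section can be collected into one decision function. This is a sketch, not the PAL equations: signals are active-true booleans, the order in which the blocking conditions are tested is an assumption, `three_cyrd` stands in for 3CYRD, and the `"LOOKUP"` and `"NONE"` return labels are hypothetical.

```python
def select_cycle(lmapon, llkup, three_cyrd, cache, cibit, rsa, csa):
    """Return which cycle runs: "LOOKUP", "NONE", "CACY" or "PRCY"."""
    if lmapon and llkup:
        return "LOOKUP"   # bus access blocked, MMU state machine started
    if three_cyrd or rsa or csa:
        return "NONE"     # cache hit, or RSA/CSA access: no bus cycle
    # CIBIT only blocks CACHE when the mapper is on.
    if cache and not (lmapon and cibit):
        return "CACY"     # cache bus cycle
    return "PRCY"         # normal bus cycle
```

Note that with the mapper off, a set CIBIT has no effect in this model, which mirrors the wording "If the mapper is on" in the text.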

The cache subsystem is controlled by the cache PAL U107 (location B8). This device generates the byte write strobes for the cache and the CACHE signal, which determines if a cache bus cycle is run. See the PALASM definitions for further data about this PAL (A-1820-4384-1). The cache enables, output enables, and the write strobes for the keys are generated by U125 and U126 (location B6). BLA13 selects between the two halves of the cache, and the RAMs are powered down when not in use by U114 and U119.

## 6.4 Other Control Circuits

The processor board registers are controlled by U65 (location E8). The address decode on the DIO Interface page generates MMUREG which is asserted when there is an access to \$005F4XXX. When this signal is asserted, the lower addresses decode the specific register being accessed. The decoder generates enable and write strobes for the user and supervisor root pointers (SVREN, USREN, SRPWR and URPWR), read and write strobes for the control/status register (RDCTL and WRCTL), and strobes for the TLB purge register (PTLBWR and PTLBRD). In addition, U89, U90, and U91 generate the cache and TLB flushing control.
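The MMUREG decode can be pictured as a masked address compare. The sketch below treats the "XXX" in the quoted address as twelve don't-care bits and assumes a full 32 bit compare on the remaining bits; the real decode is split across the address PALs, and the function name is hypothetical.

```python
MMUREG_BASE = 0x005F4000
MMUREG_MASK = 0xFFFFF000   # low 12 bits select the individual register


def mmureg(address):
    """True when the address falls in the processor board register block."""
    return (address & MMUREG_MASK) == MMUREG_BASE
```

For instance, an address of `0x005F400E` falls inside this block, while `0x005F5000` does not.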

The processor cycle termination is controlled by the circuits in the B1 area of the page. If a cache cycle is executed, then both DSACKs are asserted to the processor by U115, U116, U138, and U114. When both DSACKs are asserted, the processor executes a 32 bit access. If a 16 bit bus cycle was executed, only NDSK0~ is asserted, informing the processor that only 16 bits of data are available.

If a bus error occurred on the DIO bus, the MMU has asserted MERR~, or an access occurred to an invalid area of the CPU space (ICSA~), U132 and U115 generate BER68~. This signal is fed to the bus error input of the processor, causing it to take an exception or rerun the bus cycle if halt is also asserted.

At the C2 location is the Data Bus Buffer Enable (DBDEN~). This signal controls the enabling of the PD to BPD buffers. U131 generates the I\_O signal (location E1). This signal is used for cache inhibit when the mapper is turned off. I\_O is asserted when the physical address is in the bit-mapped display frame buffer space or I/O space of the DIO address map (\$200000 to \$7FFFFF).
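The I\_O range check is a simple bounds test on the physical address. The sketch below uses the two limits quoted in the text; the function name is hypothetical and the hardware presumably decodes only the high order address bits rather than comparing full addresses.

```python
IO_START = 0x200000
IO_END = 0x7FFFFF


def io_space(physical_address):
    """True when the address is in the frame buffer or I/O region."""
    return IO_START <= physical_address <= IO_END
```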

Finally, the control sequencer U34 is at location A3. This circuit delays address strobe and generates a delayed address strobe (DAS), and a double delayed address strobe (DDAS). These signals are asserted one and two clocks after address strobe and are both removed one clock after address strobe. The DDCTL and DDEND signals are used at the end of the bus cycle and are asserted when address strobe is being deasserted by the processor.

# 7.0 Test Interface and Spare Gates (page 9)

This page contains the 64000 state analyzer interface connections and the spare gates on the board.





