----------------------------------------------------------------------------
CDA3101 - Fall 2013 - Exam #3 Review
----------------------------------------------------------------------------

TOPICS WE HAVE COVERED IN ADDITION TO EXAM-1 and EXAM-2:  
  (Study these in Book and Web Notes)

   4. Processors
            4.1. The Central Processor - Control and Dataflow
            4.2. Datapath Design and Implementation 
            4.3. Single-Cycle and Multicycle Datapaths
            4.4. Controller Finite State Machines
            4.5. Microprogrammed Control

   5. Pipelining
            5.1. Overview of Microprogramming and Pipelining
            5.2. Pipeline Datapath Design and Implementation
            5.3. Pipeline Control and Hazards
            5.4. Pipeline Performance Analysis

   6. Memory Hierarchies
            6.1. Overview of Memory Hierarchies
            6.2. Basics of Cache and Virtual Memory
            6.3. Memory System Performance Analysis & Metrics

TYPES OF QUESTIONS:

      - Short Answer    (1-2 sentences - no dissertations, please)

      - Analysis        (calculate the performance of an arithmetic algorithm,
                         MIPS program, or single/multi-cycle datapath, given
                         IC & CPI for each instruction type, and CPU cycle
                         time)

      - Performance     (calculate the performance of a pipeline given a
                         simple sequence of MIPS instructions)


EXAMPLE QUESTIONS:

	* What is the difference between a single- and multi-cycle datapath?
	  Why is a multi-cycle datapath typically faster?
	  

          Answer:  A single-cycle datapath has CPI = 1 for all instructions,
          since each instruction is performed in one cycle.  A multi-cycle
          datapath can have a different CPI for each instruction type,
          since each type takes only as many cycles as it needs.

          A multicycle datapath is typically faster than a single-cycle
	  datapath because the former can employ a fast clock, since its
	  hardware components dedicated to each cycle are small, and the
	  critical path through the components is relatively short.  As
	  a result, the components take relatively little time to settle.
	  In contrast, the single-cycle datapath requires that all 
	  instructions move at the speed of the slowest instruction, which
	  is determined by the longest path through the entire datapath.
          Every signal must settle along that entire path within a
          single cycle, so the settling time is much longer, which
          results in a slower clock rate.  The gain realized by setting
          CPI=1 in the single-cycle datapath typically does not outweigh
          the gain realized by the faster clock rate in the multicycle
          case.


        * What is an ISA and why is it important in processor design?


          Answer:  An instruction set architecture (ISA) is the
          specification that links the hardware structure and function
          to that of the software.  ISAs are important because they
          clarify processor design and provide a convenient abstraction
          for hardware/software interface design, analysis, and
          maintenance.


	* Know how to solve the types of problems we did in class and
	  recitation re: CPUtime = IC x CPI x CycleTime.  This is
	  *very* important.  Hint: You might be asked to compute the
	  runtime of one or more MIPS programs that you write, as well
	  as discuss the work requirement in an arithmetic algorithm
	  such as Booth's algorithm for signed multiplication.  So
	  practice on the exercises we used for Homework #3.  
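
          The CPUtime = IC x CPI x CycleTime calculation can be sketched
          as follows.  The instruction mix, per-class CPI values, and the
          2 ns cycle time are made-up numbers for illustration, not
          values from the homework.

```python
# Hypothetical instruction mix: class -> (instruction count, cycles per instruction).
# These numbers are illustrative assumptions only.
MIX = {
    "load":   (200, 5),
    "store":  (100, 4),
    "r-type": (500, 4),
    "branch": (200, 3),
}

def cpu_time(mix, cycle_time_ns):
    """Return (total IC, average CPI, CPU time in ns) for an instruction mix."""
    total_ic = sum(ic for ic, _ in mix.values())
    total_cycles = sum(ic * cpi for ic, cpi in mix.values())
    avg_cpi = total_cycles / total_ic
    # CPUtime = IC x CPI x CycleTime = total cycles x CycleTime
    return total_ic, avg_cpi, total_cycles * cycle_time_ns

ic, cpi, t_ns = cpu_time(MIX, cycle_time_ns=2)
print(f"IC = {ic}, CPI = {cpi:.2f}, CPUtime = {t_ns} ns")
```

          Note that the average CPI is weighted by instruction count,
          so a frequent instruction class dominates the result.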

	* What is a finite-state machine and how is it used to design
	  control for a multi-cycle datapath?  What are its limitations?
	  

          Answer:  A finite-state machine is a collection of states with
	  a start state and a transition function that tells you how to
	  go from one state to another.  Finite state machines are used
	  to express decision processes, and are employed in control
	  system design to simplify the design process.

	  A finite-state machine is useful for designing control systems
	  that have only a small number of states.  If the number of
	  states is large, then the graphical representation of the FSM
	  becomes visually intractable.  In such cases, one uses micro-
	  programming techniques instead.
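
          As a rough sketch, the states and transition function of a
          multi-cycle controller can be written as a lookup table.  The
          state names and the four instruction classes below are
          assumptions modeled on a textbook-style MIPS controller, not
          necessarily the exact FSM used in class.

```python
# (current state, instruction class) -> next state.
# State names are hypothetical labels chosen for this sketch.
NEXT_STATE = {
    ("DECODE", "lw"): "MEM_ADDR",
    ("DECODE", "sw"): "MEM_ADDR",
    ("DECODE", "r-type"): "EXECUTE",
    ("DECODE", "beq"): "BRANCH",
    ("MEM_ADDR", "lw"): "MEM_READ",
    ("MEM_ADDR", "sw"): "MEM_WRITE",
    ("MEM_READ", "lw"): "WRITE_BACK",
    ("EXECUTE", "r-type"): "R_COMPLETE",
}

def next_state(state, op):
    """Transition function: FETCH always goes to DECODE; a state with no
    table entry is the last step of its instruction and returns to FETCH."""
    if state == "FETCH":
        return "DECODE"
    return NEXT_STATE.get((state, op), "FETCH")

def cycles_for(op):
    """Walk the FSM from FETCH until it returns to FETCH, counting cycles."""
    state, cycles = "FETCH", 0
    while True:
        cycles += 1
        state = next_state(state, op)
        if state == "FETCH":
            return cycles

for op in ("lw", "sw", "r-type", "beq"):
    print(op, cycles_for(op))
```

          Each state visited costs one clock cycle, so the path length
          through the FSM is the CPI of that instruction class.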


	* What is microprogramming, and why is it better or worse than
	  finite-state control design methodologies?  What are the
	  advantages and disadvantages of microprogramming?
	  

          Answer:  Microprogramming is a software technique for designing
	  datapath control.  A microinstruction format is designed such
	  that (a) each field of the microinstruction represents a control
          signal for the datapath hardware (e.g., PCWrite, RegDst, etc.);
          and (b) the same microinstruction can be used for every micro-
          programming task (orthogonal format, simplicity favors regularity).
	  Microprogramming is more useful than finite-state graphical design
	  when there are a large number of states in finite state control.
          
          There are several advantages of microprogramming.  First, a large
	  number of object code instructions can be handled, much more than
	  with finite-state control graphical design techniques.  Second,
          since microprogramming is a software technique, there exists a
	  large body of software engineering tools to support the proper
	  design of a microprogram, using good software engineering practice.
	  Finally, microprogramming is well established and its techniques
	  and practices are relatively well understood, so it represents a
	  stable technology upon which to build datapath control designs.

          Unfortunately, microprogramming also has several disadvantages.
          First, the microinstructions must be consistent, that is, no
          field can have more than one value for a given microinstruction.
          Second, proofs of correctness for microprograms are not
          necessarily easy, and thus rigorous software engineering is
          somewhat dependent on the hardware design, which can itself
          be erroneous.  Finally, microprogramming using a special
          microprogram memory is not necessarily faster on RISC machines
          with large register files and caches than translating the
          high-level language into a series of simple object-code
          instructions.  In the past, microprogramming was faster
	  because there were no caches and main memory was slower than
	  microprogram memory.  Today, that is not necessarily the case,
	  but speed advantages can still exist for microprogramming with
	  restricted types of architectures, such as those that have only
	  a small number of general-purpose registers.


	* What are exceptions, and how are they handled in MIPS?
	  

          Answer:  An exception is an event that causes a computer
	  to behave in an unexpected way.  "Exception" usually refers
	  to an event that occurs inside the processor (e.g., arithmetic
	  overflow), whereas "interrupt" refers to an event that occurs
	  outside the processor (e.g., I/O request).

	  MIPS handles exceptions using two registers and some special
	  hardware.  The EPC register stores the address of the
	  instruction that raised the exception, or at which an
	  interrupt occurred.  The Cause register stores the type of
          the exception.  An exception handler is invoked when an
          exception is detected.  This requires that the mux that
          selects the next PC value supply the address of the exception
          handler, which is written into PC when the exception is
          detected.  After the exception handler completes, control
          attempts to transfer execution back to (a) the offending
          instruction, whose address is in EPC, or to (b) the next
          instruction (EPC + 4).


	* What exceptions would you expect to be raised in IEEE 754
	  floating point operations?
	  

          Answer:  Overflow and Underflow are the most common
          floating-point exceptions.  Overflow refers to a result whose
          magnitude is too large (exponent too big) to be represented in
          single- or double-precision floating point format.  Underflow
          refers to a nonzero result whose magnitude is too small
          (exponent too negative) to be represented as a normalized
          number in that format.  IEEE 754 also defines the Invalid
          Operation, Division by Zero, and Inexact exceptions.
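
          Overflow and underflow can be observed directly in double
          precision.  The sketch below assumes Python floats are IEEE 754
          doubles (true on common platforms) and relies on the fact that
          plain float multiplication and division do not trap; they
          return infinity or a tiny value instead.

```python
import sys

largest = sys.float_info.max      # largest finite double
smallest = sys.float_info.min     # smallest positive normalized double

overflowed = largest * 2          # magnitude too large: result is infinity
subnormal = smallest / 2          # below the normalized range: denormalized value
flushed = smallest * smallest     # too small even for a denormal: result is 0.0

print(overflowed, subnormal, flushed)
```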


	* What are vectored interrupts?  What software and hardware
	  support do they require?
	  

          Answer:  A vectored interrupt is an exception raised by an
	  event external to the processor, which corresponds to an 
	  address A in microcode memory.  When A is invoked by the
	  microprogram control, an exception-handling routine is
	  executed.  Control can be (but is not always) transferred
	  to the instruction that caused the exception.  

	  Vectored interrupts require (a) interrupt detection hardware,
	  (b) a table of addresses (like a jump table) to which control
	  can be transferred, with one address per interrupt or interrupt
	  type, and (c) a means of transferring microprogram control to
	  the starting address of the interrupt handler pointed to in 
	  the address table.
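
          The address table in (b) behaves like a jump table indexed by
          interrupt number.  A minimal software sketch follows; the
          interrupt numbers, handler names, and the example PC value are
          all hypothetical.

```python
# Hypothetical handlers, one per interrupt type (illustration only).
def handle_timer():
    return "timer serviced"

def handle_keyboard():
    return "keyboard serviced"

def handle_disk():
    return "disk serviced"

# The vector table: interrupt number -> handler entry point.
VECTOR_TABLE = {
    0: handle_timer,
    1: handle_keyboard,
    2: handle_disk,
}

def dispatch(interrupt_number, pc):
    """Save the return address, look up the handler, and run it."""
    epc = pc                                  # remember where to resume
    handler = VECTOR_TABLE[interrupt_number]  # one lookup, no chain of tests
    result = handler()
    return result, epc                        # caller resumes at the saved PC

print(dispatch(1, pc=0x00400024))
```

          The point of the table is that dispatch cost does not grow
          with the number of interrupt types.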


	* Given a brief sequence of 4-5 MIPS instructions, use the
          5-stage MIPS pipeline discussed in class (IF-ID-EX-MEM-WB)
          to build a diagram with (or without) stalls.  Compute CPI
          of this code sequence on the MIPS pipeline.


          Answer:  
          
          MIPS INSTRUCTION  CC1  CC2  CC3  CC4  CC5  CC6  CC7  CC8  CC9 ...
          ...instr...       IF   ID   EX   MEM  WB
          ...instr...            IF   ID   EX   MEM  WB
          ...instr...                stall IF   ID   EX   MEM  WB
               :                       :   :    :    :    :    :

          CPI = no. of cycles consumed / no. of instructions 
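
          A small sketch of this calculation, assuming an in-order
          pipeline where the cycle count includes the time to fill the
          pipeline.  The function name and the example of three
          instructions with one stall are illustrative assumptions.

```python
def pipeline_cpi(num_instructions, stall_cycles, num_stages=5):
    """Return (total cycles, CPI) for an in-order pipeline.

    The first instruction takes num_stages cycles to complete; each later
    instruction completes one cycle after its predecessor, plus any stalls.
    """
    total_cycles = num_stages + (num_instructions - 1) + stall_cycles
    return total_cycles, total_cycles / num_instructions

cycles, cpi = pipeline_cpi(num_instructions=3, stall_cycles=1)
print(f"{cycles} cycles, CPI = {cpi:.2f}")
```

          For long instruction sequences the fill time becomes
          negligible and CPI approaches 1 plus the average number of
          stalls per instruction.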


	* Identify structural, data, and control hazards in a MIPS code
          sequence.
	  

          Answer:  
            
             STRUCTURAL HAZARDS occur when two instructions compete
              for the same hardware component (e.g., read and write
              to data memory on the same cycle).  This can be allev-
              iated by splitting the cycle so that read occurs after
              write, or vice versa.

             DATA HAZARDS occur when there is a dependency between
              the fields of two instructions, for example:

               RAR - Read After Read   -- not usually a problem
               WAR - Write After Read
               RAW - Read After Write  -- can be problematic
               WAW - Write After Write

              Know your data hazard types and how/why they cause
              problems (e.g., stalls) in a pipeline.

             CONTROL HAZARDS occur when the outcome of a branch is not
              yet known at the time the next instruction must be
              fetched, so the branch incurs one or more stalls.  Review
              the text discussion, and Web sections 5.3 and 5.4 for
              details.

              We can reduce branch hazards by (a) moving branch
              target address (BTA) computation to the ID stage instead
              of the EX stage, (b) moving an independent instruction
              into the branch delay slot (BDS), or (c) predicting
              whether or not the branch will be taken (tricky...)
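
          To make the RAW case concrete, here is a rough sketch that
          counts the stalls a RAW dependency causes on the 5-stage
          pipeline.  It assumes NO forwarding, and assumes a register
          written in WB can be read by an instruction in ID during the
          same cycle.  The instruction encoding as (dest, sources)
          pairs is a simplification invented for this example.

```python
def raw_stalls(program):
    """Return the stall cycles inserted before each instruction.

    program is a list of (dest_register, [source_registers]).
    Timing model: IF, ID, EX, MEM, WB in consecutive cycles, no forwarding.
    """
    ready = {}      # register -> cycle in which its producer reaches WB
    prev_id = 1     # so the first instruction decodes in cycle 2 (IF in cycle 1)
    stalls = []
    for dest, sources in program:
        earliest = prev_id + 1                  # ID cycle if nothing stalls
        id_cycle = max([earliest] + [ready.get(r, 0) for r in sources])
        stalls.append(id_cycle - earliest)
        if dest is not None:
            ready[dest] = id_cycle + 3          # ID -> EX -> MEM -> WB
        prev_id = id_cycle
    return stalls

program = [
    ("$t0", ["$s0", "$s1"]),   # add $t0, $s0, $s1
    ("$t1", ["$t0", "$s2"]),   # sub $t1, $t0, $s2   (RAW on $t0)
    ("$t2", ["$s3", "$s4"]),   # or  $t2, $s3, $s4   (independent)
]
print(raw_stalls(program))
```

          With forwarding hardware most of these stalls disappear, which
          is why the assumption matters when you draw the diagram.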



	* Compare and contrast cache and virtual memory.
	  

          Answer:  
            
            CACHE is a way of fooling the CPU into believing it
              has access to a very large, very fast main memory
              when main memory might be slow.  We do this by
              interposing a small, fast memory between the CPU
              and main memory.  This fast memory is called "cache".
              
              Know how cache works, and know the components of the
              cache address (tag field, index field, and offset field)
              and what they are used for.

            VIRTUAL MEMORY is a way of fooling the CPU into viewing
              main memory as being much bigger than the physical
              memory (in the computer) that is actually main memory.
              This is done by using the disk as a backing store for
              main memory.  First, main memory is interposed between
              the CPU and disk, and holds the pages currently in use.
              Second, the page table records where each virtual page
              resides, so main memory functions very much like a
              cache for the disk.  Third, the Translation Lookaside
              Buffer (TLB) caches recent page table entries, which
              implements fast translation of the Virtual Address
              (what the CPU sees) into the Physical Address (the
              computer's physical memory space).
             
            A CACHE MISS occurs when a data block cannot be located
              within the cache.  Therefore, the block must be fetched
              from main memory (slower => time penalty).

            In VM, a PAGE FAULT occurs when the page table shows that
              a requested page is not in main memory, so it must be
              retrieved from disk (very slow => larger time penalty).
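
          The tag, index, and offset fields of a cache address can be
          extracted with shifts and masks.  The sketch below assumes a
          direct-mapped cache whose block size and number of blocks are
          powers of two; the 16-byte block size, 256 blocks, and the
          sample address are arbitrary choices.

```python
def split_address(address, block_bytes, num_blocks):
    """Split an address into (tag, index, offset) for a direct-mapped cache.

    block_bytes and num_blocks are assumed to be powers of two.
    """
    offset_bits = block_bytes.bit_length() - 1    # log2(block size)
    index_bits = num_blocks.bit_length() - 1      # log2(number of blocks)
    offset = address & (block_bytes - 1)          # byte within the block
    index = (address >> offset_bits) & (num_blocks - 1)   # which cache block
    tag = address >> (offset_bits + index_bits)   # identifies the memory block
    return tag, index, offset

tag, index, offset = split_address(0x12345678, block_bytes=16, num_blocks=256)
print(hex(tag), hex(index), hex(offset))
```

          The index selects the cache entry, the stored tag is compared
          against the address tag to decide hit or miss, and the offset
          selects the byte within the block.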


----