EE141 Project: CORDIC Processor

CORDIC is an algorithmic approach that has long been used to compute a wide variety of goniometric and mathematical functions. This project involves the design of a CORDIC processor that translates a vector expressed in Cartesian coordinates into polar representation. As a slight simplification, it is assumed that the vector is located in the first quadrant. CORDIC is based on a sequence of rotations performed on the input vector, with the rotation angle shrinking at each step. The goal is to rotate the input vector until it lies on the x-axis. At that point, its amplitude is given by its x-coordinate (up to the constant scale factor inherent to CORDIC rotations). We implement only 6 rotations (6 stages). The angle is given by summing the signed angles of the rotations performed. Our design involves only the amplitude calculation, but the angle calculation would be similar. At this point, I could go into more detail with a lengthy discussion, but I think you get the basic idea. I conclude this explanation with a block diagram of a single stage of the CORDIC rotation (i-th stage):

[Block diagram: i-th stage of the CORDIC rotation]

The primary goal in this project, aside from getting it to work, was to minimize the energy-delay product and secondarily to minimize the layout area. I won't detail all the specifics of our design (most of which you can figure out from our Magic layouts) but will provide the following overview. For our logic style, we decided to use complementary CMOS and pass-transistor logic instead of dynamic logic because a static CMOS implementation is less complex and easier to design and lay out. Unlike ratioed logic, complementary CMOS and pass-transistor logic also do not consume static power. The sign function above is implemented as a 2-input MUX (implemented with pass-transistor logic) that selects between the input and its 2's complement (implemented with a modified half-adder). Shifting is done simply by wire.
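To make the rotation sequence described above concrete, here is a minimal software sketch of the stages in Python. It is only an illustration under stated assumptions, not our hardware: the names `cordic_stage`, `to_polar`, and `CORDIC_GAIN` are hypothetical, the stage index is assumed to start at 0, and the wired shift is modeled with the `>>` operator. Note that the x output comes out scaled by a constant gain (about 1.6465 for 6 stages), which the sketch divides out at the end.

```python
import math

NUM_STAGES = 6  # the text implements 6 rotations

# Constant gain accumulated by the unscaled rotations (assumes stages i = 0..5).
CORDIC_GAIN = 1.0
for _i in range(NUM_STAGES):
    CORDIC_GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * _i))


def cordic_stage(x, y, i):
    """One vectoring stage: rotate (x, y) toward the x-axis by atan(2**-i).

    Returns the new (x, y) and the direction d (+1 or -1) that was chosen.
    The shift by i models the wired shift; the sign of y selects add or subtract.
    """
    d = 1 if y >= 0 else -1
    x_next = x + d * (y >> i)
    y_next = y - d * (x >> i)
    return x_next, y_next, d


def to_polar(x, y):
    """Convert an integer first-quadrant vector to (amplitude, angle in radians)."""
    angle = 0.0
    for i in range(NUM_STAGES):
        x, y, d = cordic_stage(x, y, i)
        angle += d * math.atan(2.0 ** -i)  # accumulate the signed rotation angles
    return x / CORDIC_GAIN, angle


amplitude, angle = to_polar(300, 400)
print(amplitude, math.degrees(angle))
```

With only 6 stages the result is approximate: for the input (300, 400) the amplitude lands close to 500 and the angle within a couple of degrees of atan(4/3).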
Addition is accomplished with a carry-bypass adder with mirror adders rippled in each stage. Invert functions are built right into the logic design where necessary so that there are no inverters on the critical path. Finally, in assembling the complete datapath, we implemented deep pipelining with edge-triggered flip-flops (implemented in TSPCL to avoid clock overlap) and inserted two-stage buffers where necessary. Testing and analysis were done in HSPICE and IRSIM (results not provided here).

Below is a screen capture of our Magic layout. This is only for a single stage along the x-path (the y-path is very similar). Note that our input vectors are 12 bits. Clicking on a subcell lets you zoom in progressively, all the way down to each basic subcell. There are 3 redundant subcells strung together here; for the sake of convenience, only the leftmost one is clickable. There are actually a few small errors in this layout that I won't correct here (don't want students "borrowing" my work now, do I?). If you can find the error(s), well, good for you. =)
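As a footnote to the design overview, the carry-bypass addition on 12-bit operands can be modeled at the bit level. The sketch below is a behavioral illustration only: the function name `carry_bypass_add` is hypothetical, and the 4-bit block size is an assumption (the text above does not state how the bits are grouped). Each block ripples its carry bit by bit, and when every bit in a block propagates, the block's carry-out is taken directly from its carry-in, which is the bypass path.

```python
WIDTH = 12  # input vectors are 12 bits
BLOCK = 4   # assumed bypass block size; not specified in the text


def carry_bypass_add(a, b, carry_in=0):
    """Add two WIDTH-bit values block by block; return (sum, carry_out)."""
    mask = (1 << WIDTH) - 1
    a &= mask
    b &= mask
    total = 0
    carry = carry_in
    for base in range(0, WIDTH, BLOCK):
        block_in = carry
        all_propagate = True
        for i in range(base, base + BLOCK):
            ai = (a >> i) & 1
            bi = (b >> i) & 1
            p = ai ^ bi                  # propagate signal
            g = ai & bi                  # generate signal
            total |= (p ^ carry) << i    # sum bit
            carry = g | (p & carry)      # rippled carry into the next bit
            all_propagate = all_propagate and p == 1
        if all_propagate:
            # Bypass path: the carry-in skips the block. The value equals the
            # rippled carry; in hardware only the delay differs.
            carry = block_in
    return total, carry


print(carry_bypass_add(0x0FF, 0x001))
print(carry_bypass_add(0xFFF, 0x001))
```

The bypass does not change the arithmetic result. Its benefit is in timing: a long carry chain no longer has to ripple through every bit of a fully propagating block.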