The goal of this project is to design a 9x9 pipelined crossbar megacell. The figure below shows the architecture of an asynchronous (i.e. non-pipelined) 4x4 crossbar. This circuit has 4 input ports and 4 output ports that are connected using four 4-to-1 multiplexer circuits. Each output port “owns” one 4-to-1 multiplexer that is programmed (using a private 2-bit control line) to connect to one input port. The 4-to-1 multiplexer can be built out of smaller 2-to-1 multiplexers (e.g. as a multiplexer tree) or as a single standard cell. A 9x9 crossbar will require nine 9-to-1 multiplexers, each with a private 4-bit control line.

Figure 1. 4x4 Crossbar Architecture.
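To make the architecture concrete, here is a minimal behavioral sketch of a crossbar in which each output port owns one multiplexer built as a tree of 2-to-1 multiplexers. It is only a functional model, not a circuit: the function names, the LSB-first use of the select bits, and the padding of odd tree levels with a constant 0 are assumptions made for illustration.

```python
def mux2(a, b, sel):
    """2-to-1 multiplexer: returns a when sel is 0, b when sel is 1."""
    return b if sel else a


def mux_tree(inputs, select):
    """N-to-1 multiplexer built as a tree of 2-to-1 multiplexers.

    Level k of the tree is steered by bit k of `select` (LSB first, an
    assumption). An odd leftover input at any level is paired with a
    constant 0, so N does not have to be a power of two (e.g. N = 9).
    """
    level = list(inputs)
    bit = 0
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad so every 2-to-1 mux has two inputs
        s = (select >> bit) & 1
        level = [mux2(level[i], level[i + 1], s)
                 for i in range(0, len(level), 2)]
        bit += 1
    return level[0]


def crossbar(inputs, selects):
    """Each output port owns one multiplexer over all input ports."""
    return [mux_tree(inputs, s) for s in selects]


# 4x4 example: output 0 listens to input 3, output 1 to input 2, and so on.
outputs_4x4 = crossbar(["a", "b", "c", "d"], [3, 2, 1, 0])
```

With this numbering, select value i connects an output to input i (counting from 0). A 9-input tree needs four levels, which is one reason the multiplexer is a natural place to insert pipeline registers.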
The performance of the crossbar can be significantly improved by pipelining the crossbar switch. Pipelining can be applied to both the horizontal broadcast bus and the multiplexer tree. A pipelined crossbar adds a clock input signal and requires the input data streams to be synchronized with the clock. The goal of this project is SPEED, SPEED and SPEED! This means that your design should minimize the clock period at which the crossbar operates as far as possible through clever use of pipelining techniques.
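The effect of pipelining on the clock period can be estimated with the usual register-to-register timing constraint: the period must cover the clock-to-Q delay, the slowest logic stage, and the setup time. The sketch below applies that constraint; all delay values are hypothetical placeholders rather than measurements from any design, and clock skew is ignored.

```python
def min_clock_period(stage_delays_ps, t_clk_q_ps, t_setup_ps):
    """Smallest clock period (ps) that the slowest pipeline stage allows.

    Assumes edge-triggered registers between stages and no clock skew.
    """
    return t_clk_q_ps + max(stage_delays_ps) + t_setup_ps


# Hypothetical numbers, for illustration only.
T_CLK_Q = 100   # flip-flop clock-to-Q delay, ps
T_SETUP = 50    # flip-flop setup time, ps

# One register stage around the whole bus + multiplexer path.
unpipelined = min_clock_period([900], T_CLK_Q, T_SETUP)

# Same logic split into three balanced stages.
pipelined = min_clock_period([300, 300, 300], T_CLK_Q, T_SETUP)

print(unpipelined, pipelined)
```

Under these assumed numbers the period drops from 1050 ps to 450 ps. The register overhead (clock-to-Q plus setup) does not shrink with more stages, so a fast flip-flop matters as much as a fast multiplexer.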
The project is to be done in pairs (two students per team). You are free to choose any CMOS implementation style, including dynamic logic. Feel free to mix logic families in your design. All complementary signals must be internally generated, and any number of levels of logic may be used. Registers can be dynamic or static. You are free to use the clocking strategy of your choice (single phase, two phase, four phase, ...). Make sure, however, that races do not occur.
TECHNOLOGY: The design is to be implemented in a 0.8 µm CMOS process.
POWER SUPPLY: A power supply of 3.3 V should be used.
PERFORMANCE METRIC: VOH, VOL: The output signals should settle to within 10% of their final value before the next clock event can be introduced!!!
NOISE MARGINS: The noise margins should be at least 10% of the voltage swing.
LOAD CAPACITANCE: Each output port should have a 20fF load.
CLOCKS: You are given a primary clock signal with a rise and fall time of 50 picoseconds and a duty cycle of 50%. All other clock signals should be derived from this primary signal using actual logic (e.g. complementary clocks, non-overlapping clocks, clocks with a faster rise and fall time, etc.). The logic schematics and the simulated waveforms for these derived clocks should be included in the report.
INPUT/OUTPUT: Your megacell must have the following input signals: 9 input data ports, 1 clock input, 1 reset input (if you choose to implement reset capability for your flip-flops), 36 control inputs. The outputs consist of 9 output data ports.
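The numeric constraints listed above can be collected into a few small checks, sketched below. Two interpretations are assumptions rather than statements from this handout: "within 10% of their final value" is read as within 10% of the voltage swing (so that a final value of 0 V still has a usable tolerance), and the noise margins use the standard definitions NM_H = VOH - VIH and NM_L = VIL - VOL. All function names are hypothetical.

```python
VDD = 3.3          # power supply, volts
NUM_PORTS = 9      # 9x9 crossbar


def select_bits(num_inputs):
    """Control bits one multiplexer needs to address `num_inputs` inputs."""
    return (num_inputs - 1).bit_length()


def control_inputs(num_ports):
    """Total control inputs: one private select bus per output port."""
    return num_ports * select_bits(num_ports)


def settled(v_sample, v_final, swing=VDD, tolerance=0.10):
    """True if the sampled output is within `tolerance` of the swing
    (assumed interpretation) of its final value."""
    return abs(v_sample - v_final) <= tolerance * swing


def margins_ok(v_oh, v_ol, v_ih, v_il, fraction=0.10):
    """True if both noise margins are at least `fraction` of the swing."""
    nm_high = v_oh - v_ih
    nm_low = v_il - v_ol
    required = fraction * (v_oh - v_ol)
    return nm_high >= required and nm_low >= required


print(select_bits(NUM_PORTS), control_inputs(NUM_PORTS))
```

For 9 ports this gives 4 select bits per output and 9 x 4 = 36 control inputs, matching the INPUT/OUTPUT requirement. With a full-rail 3.3 V swing, the required noise margin works out to about 0.33 V.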
The project is divided into 5 parts that are due at various times during the semester. The nominal deadline for turning in material to the TA is the end of class on the due date. Consult the class schedule for a listing of due dates. A late submission will automatically result in a 50% loss for that project part. The grade will be divided as follows:
P1: Multiplexer Design (10%)
Design, capture the schematic, lay out, extract, and simulate the extracted layout of a multiplexer cell that will form the basis of your crossbar design. Turn in the transistor-level circuit schematic, a clearly visible printout of the layout, and simulation waveforms for the schematic and the extracted layout. Given input signals with 50 picosecond rise/fall times and a 20fF load on the multiplexer output, what are the rise/fall times and propagation delay of your multiplexer design? A cover page must be included and identify your names, summarize your multiplexer circuit architecture, and show timing results (e.g. rise/fall times and propagation delay).
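The timing results requested for P1 can be extracted from exported simulation waveforms with a short script. The sketch below assumes the common conventions of measuring rise time between the 10% and 90% points and propagation delay between the 50% crossings of input and output; the handout does not fix these thresholds, and the function names and sample data are hypothetical.

```python
def crossing_time(times, volts, level, rising=True):
    """First time the waveform crosses `level`, by linear interpolation."""
    for i in range(1, len(times)):
        v0, v1 = volts[i - 1], volts[i]
        crossed = (v0 < level <= v1) if rising else (v0 > level >= v1)
        if crossed:
            frac = (level - v0) / (v1 - v0)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("waveform never crosses the requested level")


def rise_time(times, volts, v_low, v_high):
    """10%-to-90% rise time (assumed thresholds)."""
    swing = v_high - v_low
    t10 = crossing_time(times, volts, v_low + 0.1 * swing)
    t90 = crossing_time(times, volts, v_low + 0.9 * swing)
    return t90 - t10


def propagation_delay(times, v_in, v_out, v_mid, in_rising, out_rising):
    """Delay between the 50% crossings of input and output."""
    t_in = crossing_time(times, v_in, v_mid, in_rising)
    t_out = crossing_time(times, v_out, v_mid, out_rising)
    return t_out - t_in


# Made-up sample points (ps, V): a rising input and a falling output.
t = [0, 100, 200, 300]
vin = [0.0, 3.3, 3.3, 3.3]
vout = [3.3, 3.3, 0.0, 0.0]
delay_ps = propagation_delay(t, vin, vout, 1.65, True, False)
```

The same helpers apply to fall time by swapping the thresholds and passing `rising=False` to `crossing_time`.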
P2: Flip-flop Design (10%)
Repeat P1 for the flip-flop that will be used to pipeline your crossbar design. The performance metrics for flip-flops are CLOCK-to-Q delay, setup time, and hold time. Assume 50 picosecond rise/fall times for the input and clock signals and a 20fF load on the flip-flop output. You can forego reset capability on the flip-flops in order to maximize performance. A cover page must be included and identify your names, summarize your flip-flop circuit architecture, and show timing results (e.g. CLOCK-to-Q delay, setup and hold times).
P3: Crossbar Schematic (10%)
Complete the crossbar schematic. Simulate the schematic using a crossover connection pattern (e.g. 1→9, 2→8, 3→7, … 9→1) and a straight connection pattern (e.g. 1→1, 2→2, 3→3, … 9→9). Each input port is to transmit a repeating data pattern consisting of the input port address (e.g. port 1 transmits 00010001, … port 9 transmits 10011001). Assume 50 picosecond rise/fall times for the input and clock signals and a 20fF load on the flip-flop output. Simulate your design, increasing the clock rate until the design fails to operate properly. Turn in the complete schematics and simulation results. A cover page must be included and identify your names, summarize your schematic architecture, and show maximum clock speed (for schematic design).
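The test stimulus described above is regular enough to generate with a script. This sketch builds the per-port data words and the two connection patterns, then derives the word each output port should carry. Reading "a→b" as "input a drives output b" is an assumption (both patterns happen to be symmetric, so the result is the same either way), and the function names are hypothetical.

```python
NUM_PORTS = 9


def port_pattern(port):
    """8-bit word formed by writing the 4-bit port address twice."""
    nibble = format(port, "04b")
    return nibble + nibble


def straight():
    """Straight pattern: 1->1, 2->2, ..., 9->9 (input -> output)."""
    return {p: p for p in range(1, NUM_PORTS + 1)}


def crossover():
    """Crossover pattern: 1->9, 2->8, ..., 9->1 (input -> output)."""
    return {p: NUM_PORTS + 1 - p for p in range(1, NUM_PORTS + 1)}


def expected_outputs(connections):
    """Word expected on each output port for a given connection map."""
    return {out: port_pattern(inp) for inp, out in connections.items()}


for out_port, word in sorted(expected_outputs(crossover()).items()):
    print(out_port, word)
```

Comparing the simulated output streams against `expected_outputs` at each clock rate gives a simple pass/fail criterion when searching for the maximum operating frequency.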
P4: Crossbar Layout (10%)
Complete the crossbar layout. Your layout should be rectangular without too much wasted space. Extract the layout and simulate it using the same assumptions as in P3. Turn in a printout of the layout and the extracted simulation results. A cover page must be included and identify your names, summarize your layout architecture, and show maximum clock speed (for the extracted layout).
P5: Project Poster (20% Performance + Correctness, 20% Creativity, 20% Poster Quality)
Instead of writing a report, you and your partner will present a poster on your results. The posters will be presented during the last two class sessions. You will get 5 minutes to explain why your design is good, and what is special about it. The instructor (and class) will get 5 minutes to ask you questions regarding your design. The poster should contain at most 9 PowerPoint slides. One of these slides should contain your names and summarize in a number of bullets your important design decisions and results. The rest should be used to show the important schematics, simulation results, and everything needed to demonstrate the functionality and performance of your design. MAKE SURE THAT YOUR POSTER CONTAINS THE MAXIMUM INFORMATION IN THE MINIMUM NUMBER OF WORDS. Graphics convey data a lot more effectively. We will also require you to submit a single sheet that summarizes your results (by e-mail) and e-mail us the GDS LAYOUT database and SIMULATION INPUT DECK used to analyze the speed of the crossbar in P4.
You are encouraged to discuss the project with your classmates and to search the literature and the internet for project-related information. However, you are not permitted to copy binary files. If we find evidence of this, all the parties involved will automatically receive a 0% grade on the entire project. For example, it is very easy for us to compare the layouts of two multiplexers. It is OK for the multiplexers to use the same circuit architecture, but if we see that the layouts are identical (e.g. the 100 or so polygons in the layouts have identical position and size), an investigation will immediately be launched.