Cache Design

  - Cache parameters (organization and placement)
  - Cache replacement policy
  - Cache performance evaluation method

Cache Parameters

  - Cache size: Scache (lines)
  - Number of sets: N (sets)
  - Lines per set: K (lines/set)
  - Scache = K * N (lines) = K * N * L (bytes), where L is the line size in bytes
  - The cache is K-way set-associative

Trade-offs in Set-Associativity

  - Fully associative: higher hit ratio and concurrent search, but slow access when the associativity is large.
  - Direct mapped: fast access (on a hit), simple comparison, and a trivial replacement algorithm. However, if a program alternates between two blocks that map to the same cache block frame, thrashing may occur.

Note

  - Main memory size: Smain (blocks)
  - Cache memory size: Scache (blocks)
  - Let P = Smain / Scache
  - Since P >> 1, the average search length is much greater than 1.
  - Set-associativity provides a trade-off between
      - concurrency in search
      - average search/access time per block

Important Factors in Cache Design

  - Address partitioning strategy (three degrees of freedom)
  - Total cache size / memory size
  - Workload

Address Partitioning

  - Byte addressing mode
  - Cache memory size (data part) = N * K * L (bytes)
  - Directory size (per entry): M - log2(N) - log2(L) bits
  - Reduce clustering (randomize accesses)

Note: There Exists a Knee

  - "... the data are sketchy and highly dependent on the method of gathering ..."
  - "... designer must make critical choices using a combination of 'hunches, skills, and experience' as supplement ..."
  - Hunch: a strong intuitive feeling concerning a future event or result.

Basic Principle

  - Typical workload study + intelligent estimate of others
  - Good engineering: a small degree of over-design
  - "30% rule" (A. Smith): each doubling of the cache size reduces misses by 30%
  - It is a rough estimate only

Cache Design Process

  - "Typical", not "standard"
  - Sensitive to:
      - price-performance
      - technology: main memory access time, cache access time, chip density, bus speed, on-chip cache

Cache Design Process

  Step 1: Choose the size (fix K and L, vary N)
  Step 2: Choose L (fix N * K * L = size and K, vary L)
  Step 3: Choose K (fix N * K * L = size and L, vary K)

N: Set Number

  - Number of cache directory entries = N * K
  - Cache size = N * K * L
  - Constraint in the selection of N: page size

K: Associativity

  - Bigger K: lower miss ratio
  - Smaller K is better in speed (faster) and cost (cheaper)
  - K = 4 ~ 8 gets the best miss ratio

L: Line Size

  - Atomic unit of transmission
  - Miss ratio
  - Smaller L:
      - larger average delay
      - less traffic
      - larger average hardware cost for associative search
      - larger possibility of "line crossers"
  - Workload dependent
  - 16 ~ 128 bytes

Cache Replacement Policy

  - FIFO (first-in, first-out)
  - LRU (least recently used)
  - OPT (furthest-future used): do not retain lines whose next occurrence lies in the most distant future
  - Note: LRU performance is close to OPT for frequently encountered program structures.

Program Structure

    for i = 1 to n
        for j = 1 to n
        endfor
    endfor

  - The last-in-first-out nature of nested loops makes the recent past look like the near future.

Why Are LRU and OPT Close to Each Other?

  - LRU: looks only at the past
  - OPT: looks only at the future
  - But the recent past ≈ the nearest future
  - Why?
    (Consider nested loops)

Problem with LRU

  - Not good at mimicking sequential/cyclic access
  - Example: ABCDEF ABC... ABC...
  - With a set size of 3

Sequential Access

Empirical Data

  - OPT can gain about 10% ~ 30% improvement over LRU (in terms of miss reduction)

A Comparison

  - OPT has two candidates for replacement
  - LRU has only one: the least recently used line; it never replaces the most recently referenced line
  - "Dead line" in LRU

Performance Evaluation Methods for Workload

  - Analytical modeling
  - Simulation
  - Measuring

Cache Analysis Methods

  - Hardware monitoring
      - fast and accurate
      - not fast enough (for high-performance machines)
      - cost
      - flexibility/repeatability

Cache Analysis Methods

  - Address traces and machine simulator
      - slow
      - accuracy/fidelity
      - cost advantage
      - flexibility/repeatability
      - OS/other impacts: how to put them in?

Trace-Driven Simulation for Cache

  - Workload dependence
      - difficulty in characterizing the load
      - no generally accepted model
  - Effectiveness
      - simulation is possible for many parameters
      - repeatability

Problems in Address Traces

  - Representative of the actual workload (hard)
      - traces cover only milliseconds of real workload
      - diversity of user programs
  - Initialization transient
      - use long enough traces to absorb the impact
  - Inability to properly model multiprocessor effects

An Example

  - Assume a two-way set-associative cache with 256 sets: Scache = 2 x 256 lines.
  - Assume that the difficulty of deciding whether or not to count the initialization causes 512 more misses than actually required.
  - Assume a trace of length 100,000 with hit rate 0.99; then 1,000 misses are generated, and the 512 make a big difference!
  - If we want the 512 misses to be less than 5% of the count, then total misses = 512 / 5% = 10,240.
  - Thus, with hit rate 0.99, the required trace length > 1,024,000!

  - One may not know the cache parameters beforehand.
  - What to do? Make the trace longer than the minimum acceptable length!

  - 100,000?
      Too small.
  - (10 ~ 100) x 10^6: OK?
  - 1000 x 10^6 or more are being used now.
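The arithmetic in the trace-length example can be reproduced with a short Python sketch. The function names and the use of exact fractions are choices made here for illustration and are not part of the original slides; the inputs (512 start-up misses, a 5% tolerance, hit rate 0.99) are the ones used in the example.

```python
from fractions import Fraction
import math


def expected_misses(trace_length, hit_rate):
    """Misses produced by a trace of the given length at the given hit rate."""
    return trace_length * (1 - hit_rate)


def min_trace_length(startup_misses, tolerance, hit_rate):
    """Trace length at which start-up misses are exactly `tolerance` of all misses.

    Any longer trace pushes the start-up share below the tolerance.
    """
    total_misses = Fraction(startup_misses) / tolerance
    return math.ceil(total_misses / (1 - hit_rate))


hit_rate = Fraction(99, 100)  # exact 0.99, avoids floating-point rounding
short_trace_misses = expected_misses(100_000, hit_rate)

# Share of the short trace's misses that are start-up artifacts.
print(short_trace_misses, Fraction(512) / short_trace_misses)

# Minimum trace length for a 5% start-up share.
print(min_trace_length(512, Fraction(5, 100), hit_rate))
```

With these inputs the sketch gives 1,000 misses for the 100,000-reference trace and a minimum length of 1,024,000 references, in agreement with the example.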
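The "30% rule" from the Basic Principle slide can be written as a small Python sketch. This is only an illustration of the rule of thumb, which the slides themselves call a rough estimate; the function name, the example sizes, and the extension to non-integer numbers of doublings via a logarithm are assumptions made here.

```python
import math


def estimated_misses(base_misses, base_size, new_size, reduction=0.30):
    """Estimate misses after resizing a cache using the 30% rule.

    Each doubling of the cache size multiplies the miss count by
    (1 - reduction). Fractional doublings are interpolated with log2,
    which is an assumption of this sketch, not part of the rule.
    """
    doublings = math.log2(new_size / base_size)
    return base_misses * (1.0 - reduction) ** doublings


# Hypothetical example: 1,000 misses with an 8 KB cache.
for size_kb in (8, 16, 32, 64):
    print(size_kb, round(estimated_misses(1000, 8, size_kb), 1))
```

The same function also works on miss ratios, since the rule is a pure scaling factor per doubling.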
[Figure: main memory (Smain) and cache (Scache); annotation: "You need search!"]
[Figure: N sets in Scache; fully associative, set-associative, and direct-mapped organizations; label: set #]
[Figure: address partitioning; labels: M bits, log N (set number), log L (address in a line), set size]
[Figure: "General Curve Describing Cache Behavior"; y-axis: miss ratio (0.1 to 1.0); x-axis: cache size (10 to 40); marked value: 0.34]
[Figure: cache design process flowchart; boxes: "Choose cache size: fix K, L, varying N"; "Choose line size L for fixed cache size and K, varying L"]
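The quantities on the Cache Parameters and Address Partitioning slides (data size N * K * L, directory entries N * K, and M - log2(N) - log2(L) tag bits per entry) can be illustrated with a minimal Python sketch. The function names and the example configuration (32-bit addresses, 256 sets, 64-byte lines, 2 ways) are hypothetical, and the sketch assumes that N and L are powers of two and that the low-order bits select the byte within a line.

```python
def field_widths(m_bits, n_sets, line_bytes):
    """Return (tag_bits, set_bits, offset_bits) for an M-bit byte address."""
    offset_bits = line_bytes.bit_length() - 1   # log2(L)
    set_bits = n_sets.bit_length() - 1          # log2(N)
    if (1 << offset_bits) != line_bytes or (1 << set_bits) != n_sets:
        raise ValueError("N and L must be powers of two in this sketch")
    tag_bits = m_bits - set_bits - offset_bits  # M - log2(N) - log2(L)
    return tag_bits, set_bits, offset_bits


def split_address(addr, n_sets, line_bytes):
    """Split a byte address into (tag, set number, address within the line)."""
    offset = addr % line_bytes
    set_number = (addr // line_bytes) % n_sets
    tag = addr // (line_bytes * n_sets)
    return tag, set_number, offset


M, N, K, L = 32, 256, 2, 64            # hypothetical configuration
print(field_widths(M, N, L))           # tag, set, and offset widths in bits
print(N * K * L)                       # data part of the cache, in bytes
print(N * K)                           # number of directory entries
print(split_address(0x12345678, N, L))
```

Note that K does not appear in the address split: associativity changes how many lines share a set, not how the address is partitioned.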
[Figure: cache design process flowchart (continued); boxes: "Choose associativity K: fix cache size and L, varying K"; "Pick k = 2 (likely k = 1); K is small"; "Use new k"; "If k = old"]
[Figure: relative number of misses (0.1 to 1.0) versus cache size (N) (10 to 40)]
[Figure: relative number of misses (0.1 to 1.0) versus cache line size (L) (10 to 40)]
[Figure: relative number of misses (0.1 to 1.0) versus cache associativity factor (K) (10 to 40)]
[Figure annotation: "simpler"]
[Figure: panels [a], [b], [c] showing lines ordered from nearest future access to furthest future access; panel [a] labels: B, C, A, D, E, F, G, H; panel [b] labels: B, Z, C, A, D, E, F, G]
[Figure: "Sequential Access"; set contents under OPT and under LRU for the reference string A B C D E F G A B C ... G A B C ... G A B C ...]
[Figure labels, "A Comparison": the most recently referenced; the furthest to be referenced in the future]
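The gap between LRU and OPT on cyclic access, described on the "Problem with LRU" slide, can be reproduced with a minimal Python sketch. It models a single set of `capacity` lines; the function names are made up for illustration, and the reference string is a shortened cycle (A B C D repeated, one line more than the set holds) rather than the exact string on the slide.

```python
from collections import OrderedDict


def lru_misses(refs, capacity):
    """Count misses for one set managed with LRU replacement."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for line in refs:
        if line in cache:
            cache.move_to_end(line)        # mark as most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used
            cache[line] = True
    return misses


def opt_misses(refs, capacity):
    """Count misses for one set managed with OPT (furthest-future-use) replacement."""
    cache = set()
    misses = 0
    for i, line in enumerate(refs):
        if line in cache:
            continue
        misses += 1
        if len(cache) >= capacity:
            def next_use(resident):
                # Position of the next reference to `resident`, or infinity.
                for j in range(i + 1, len(refs)):
                    if refs[j] == resident:
                        return j
                return float("inf")
            cache.remove(max(cache, key=next_use))
        cache.add(line)
    return misses


refs = list("ABCD" * 5)  # cyclic sweep over 4 lines
print(lru_misses(refs, 3), opt_misses(refs, 3))
```

On this 20-reference cycle with a set size of 3, LRU misses on every reference (20 misses), because it always evicts the line that will be needed next, while OPT incurs 9 misses.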
  - Note: when the cache size and K are fixed, N * L = constant; using a smaller L increases N and hence increases the directory size.
  - Example: consider a line A that is discarded by OPT but retained by LRU. A is then dead for LRU until it is replaced. Since A is the most recently referenced line, K - 1 more misses must be generated before A can be ejected, so the effective associativity is reduced by 1.
  - Note: you need to produce enough misses to get the precision required for the miss count. With each quadrupling of the cache size, the trace length increases by roughly a factor of 8.
    For a 2-million-byte cache: 100 million references!

Guang R. Gao