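Book I Read
プログラマーのためのCPU入門 (An Introduction to CPUs for Programmers)
Author: Takenobu Tani
Colleagues at work who had read this book recommended it, so I read it too and am writing up my impressions.
The publisher is Lambda Note.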
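Summary
It was extremely good. I am full of gratitude that this book was published at all.
Specifically, I liked the following points.
- The chapter progression was very easy to follow: to make a CPU faster you need to raise the density of the instruction stream -> the techniques for doing so -> the problems those techniques cause -> the approaches to those problems
- Whenever a new concept appears (pipelining, superscalar, out-of-order execution, branch instructions, and so on), it is always explained with a figure
- When vendors or other books (such as Patterson and Hennessy) use different terms for the same concept, the book points out the variation
- There is verification code in assembly for both x86 and Arm, so you can actually try out the features that are discussed
If your mental model is roughly "a program fetches instructions from memory and executes them in order," I think this book will take your resolution up a notch.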
Chapter 1: How Does a CPU Execute Software Fast?
What CPU performance means
The book looks at the CPU from the standpoint of performance, and it starts from the question of what performance even is.
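$$ \text{CPU time} = \frac{\text{instructions executed}}{\text{program}} \times \frac{\text{clock cycles}}{\text{instruction executed}} \times \frac{\text{seconds}}{\text{clock cycle}} $$
CPU performance is then defined so that the shorter this CPU time, the better.
As for why the book goes to the trouble of using three terms rather than simply writing
$$ \text{CPU time} = \frac{\text{seconds}}{\text{program}} $$
the reason is that each term represents a different aspect of the CPU.
Specifically, the first term is the number of instructions that make up the program, which is determined by the compiler and the instruction set architecture.
The second term corresponds to the CPU's internal structure (microarchitecture), and the third to semiconductor and circuit implementation technology.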
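To make the three-term split concrete, here is a minimal Python sketch. The function name and all of the numbers (one billion instructions, a CPI of 1.0, a 4 GHz clock) are made up for illustration; they are not measurements from the book or from my machine.

```python
def cpu_time_seconds(instructions: int, cycles_per_instruction: float, clock_hz: float) -> float:
    """CPU time = instructions x (cycles / instruction) x (seconds / cycle)."""
    seconds_per_cycle = 1.0 / clock_hz
    return instructions * cycles_per_instruction * seconds_per_cycle

# Hypothetical workload: 1 billion instructions, CPI of 1.0, 4 GHz clock.
baseline = cpu_time_seconds(1_000_000_000, 1.0, 4e9)

# Each term can be improved independently of the others.
fewer_instructions = cpu_time_seconds(500_000_000, 1.0, 4e9)   # compiler / ISA
lower_cpi = cpu_time_seconds(1_000_000_000, 0.5, 4e9)          # microarchitecture
faster_clock = cpu_time_seconds(1_000_000_000, 1.0, 8e9)       # circuit technology

print(f"baseline: {baseline:.3f} s")
print(f"halving one term: {fewer_instructions:.3f} s, {lower_cpi:.3f} s, {faster_clock:.3f} s")
```

Halving any one of the three terms halves the CPU time, which is why it is useful to keep them separate: each one is improved by a different kind of work.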
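Instruction stream
We build/compile a program into machine code and hand it to the CPU. As for how that machine code looks from the CPU's side, the book explains that the CPU sees it as a kind of instruction stream.
The chapters are then organized around two viewpoints: the techniques the CPU uses to make this instruction stream flow faster, and the factors that slow it down.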
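Chapter 2: Techniques for Raising Instruction Density
To speed up the CPU's instruction stream, the book focuses on its density. As I understand it, density means how many instructions are in flight if you take a snapshot of a running CPU.
It starts with sequential processing.
This is a scheme in which the next instruction starts only after the current instruction has finished executing. At any point in time the CPU is executing only one instruction, so this is the lowest-density state.
Furthermore, taking the individual steps (stages) of instruction execution into account, sequential processing can be depicted as follows.
Next comes pipelining. Until now the next instruction waited for the previous instruction to finish completely; now it waits only for the previous instruction to finish a stage before starting.
As a result, at any one point in time, as many instructions as there are stages are in flight.
Note that with pipelining, the next instruction starts without waiting for the previous one to complete. The book describes this as introducing "speculative processing" in a broad sense. This leads into the discussion of data dependencies and branch instructions in the following chapters.
The approach of physically providing several of these pipelines is superscalar execution.
Finally, the stage split is pushed further (superpipelining). With a superscalar width of 4 and 12 pipeline stages, you apparently get something on the scale of a typical modern CPU.
Compared with the sequential processing we started with, you can see that the density has gone up.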
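The progression from sequential to pipelined to superscalar can be sketched as a toy cycle-count model. This is my own idealized simplification, not a model from the book: it assumes every stage takes one cycle and that there are no stalls at all. The function names are hypothetical.

```python
import math

def cycles_sequential(n_instr: int, stages: int) -> int:
    # Each instruction occupies the whole machine for `stages` cycles.
    return n_instr * stages

def cycles_pipelined(n_instr: int, stages: int, width: int = 1) -> int:
    # Ideal pipeline: it takes `stages` cycles to fill, after which
    # `width` instructions complete per cycle. No stalls are modeled.
    if n_instr == 0:
        return 0
    return stages + math.ceil(n_instr / width) - 1

def density(n_instr: int, stages: int, cycles: int) -> float:
    # Average number of instructions in flight per cycle.
    return n_instr * stages / cycles

n = 1000
for label, stages, width, cycles in [
    ("sequential, 5 stages", 5, 1, cycles_sequential(n, 5)),
    ("pipelined, 5 stages", 5, 1, cycles_pipelined(n, 5)),
    ("4-wide superscalar, 12 stages", 12, 4, cycles_pipelined(n, 12, 4)),
]:
    print(f"{label}: {cycles} cycles, density {density(n, stages, cycles):.1f}")
```

In this model the density of sequential processing is always 1, an ideal pipeline approaches the number of stages, and an ideal superscalar pipeline approaches stages times width. Real CPUs fall short of that because of the dependencies and branches covered in the following chapters.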
I already knew superscalar execution and pipelining as concepts, but this chapter's explanation, which organizes them from the viewpoint of instruction stream density, was very easy to follow.
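Chapter 3: Data Dependencies
The pipelining introduced in Chapter 2 raised the density of the instruction stream. However, pipelining means the next instruction starts executing without waiting for the earlier instruction to complete.
This causes a problem when there is a dependency between instructions. Concretely, consider: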
add x1, x2, x3
sub x4, x5, x1
Here the register updated by the add instruction is used by the following sub instruction, so sub cannot execute until add has completed and updated the x1 register.
While this instruction is waiting, some pipeline stages sit unused, and the density of the instruction stream drops.
This is where out-of-order execution is introduced: instead of the instruction that is waiting on a dependency, another instruction is executed first to fill the empty stages.
The book also explains that there are several kinds of data dependencies, and that some kinds can be handled by register renaming.
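As a rough illustration of "several kinds of dependencies," the sketch below classifies the dependency between two instructions using the common textbook names RAW (read after write), WAR (write after read), and WAW (write after write). These labels and the `Instr` type are my assumptions for illustration; the book may use different terms.

```python
from typing import NamedTuple

class Instr(NamedTuple):
    op: str
    dst: str        # destination register
    srcs: tuple     # source registers

def hazards(first: Instr, second: Instr) -> set:
    """Return the dependency kinds that `second` has on the earlier `first`."""
    found = set()
    if first.dst in second.srcs:
        found.add("RAW")   # true dependency: second needs first's result
    if second.dst in first.srcs:
        found.add("WAR")   # second overwrites a register first still reads
    if first.dst == second.dst:
        found.add("WAW")   # both write the same register
    return found

add = Instr("add", "x1", ("x2", "x3"))
sub = Instr("sub", "x4", ("x5", "x1"))
print(hazards(add, sub))   # {'RAW'}

# Reusing x2 as a destination creates only a name conflict, not a data flow.
mul = Instr("mul", "x2", ("x6", "x7"))
print(hazards(add, mul))   # {'WAR'}
```

RAW is a real data flow, so the later instruction has to wait. WAR and WAW are conflicts over a register name only, which is why giving the later write a different physical register (renaming) removes them.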
I knew that out-of-order execution existed from reading Atomics and Locks, but I had not understood why it is done, so this chapter's explanation was very easy to follow.
Instruction latency measurement experiment
There are assembly samples that actually measure the latency of various instructions (add, mul, load).
Running it confirmed that the add instruction executes in 1 cycle.
Executing a billion instructions in about 0.2 seconds is impressive.
There are also experiments that show how things change when the dependencies are altered.
My test environment is as follows.
$ uname --kernel-name --kernel-release --machine --processor --hardware-platform --operating-system
Linux 5.19.0-35-generic x86_64 x86_64 x86_64 GNU/Linux
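Chapter 4: Branch Instructions
Branch instructions are a cause of density loss in the instruction stream that even out-of-order execution cannot cover.
That is because something like an if is executed, the condition is evaluated, and only then is the pc register updated to switch the instruction stream, so all of the work done on instructions after the branch may be wasted.
To deal with this, CPUs implement a mechanism called branch prediction.
With branch prediction, each time a branch instruction executes, its address and branch target are stored in a dedicated storage area (the BTB). On every instruction fetch, the address is looked up there to determine whether the instruction is a branch. In addition, the history of whether each branch's condition held is kept and used for prediction. It was a shock to learn that this is what goes on in the CPU every time an if runs.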
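To get a feel for how outcome history helps, here is a toy predictor based on 2-bit saturating counters, a classic textbook scheme. It is only a sketch under my own assumptions: real predictors are far more elaborate, the BTB's target storage is left out, and the class and function names are hypothetical.

```python
import random

class TwoBitPredictor:
    """Per-branch 2-bit saturating counters, indexed by branch address."""

    def __init__(self):
        self.counters = {}  # branch address -> counter in 0..3

    def predict(self, addr: int) -> bool:
        # 2 or 3 means "predict taken"; unseen branches start at 1.
        return self.counters.get(addr, 1) >= 2

    def update(self, addr: int, taken: bool) -> None:
        c = self.counters.get(addr, 1)
        self.counters[addr] = min(c + 1, 3) if taken else max(c - 1, 0)

def miss_rate(outcomes, addr: int = 0x400000) -> float:
    """Fraction of mispredictions for one branch with the given outcome sequence."""
    predictor = TwoBitPredictor()
    misses = 0
    for taken in outcomes:
        if predictor.predict(addr) != taken:
            misses += 1
        predictor.update(addr, taken)
    return misses / len(outcomes)

always_taken = [True] * 1000
rng = random.Random(0)  # fixed seed so the run is repeatable
coin_flips = [rng.random() < 0.5 for _ in range(10_000)]

print(f"always taken: {miss_rate(always_taken):.3f}")
print(f"coin flips:   {miss_rate(coin_flips):.3f}")
```

A branch that always goes the same way is mispredicted only while the counter warms up, whereas a branch whose outcome is effectively random is mispredicted about half the time no matter how much history is kept.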
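This is not limited to this chapter, but even US patents are cited as references, and the depth of the author's knowledge is astonishing.
Branch misprediction measurement experiment
The experiment in this chapter checks how much difference there is between a program whose branches are mispredicted 50% of the time and one whose branches are predicted correctly 100% of the time.
Below are the measurement results on my machine.
It reports 20 million branches, but half of them are the loop condition, so effectively there are 10 million, of which 5 million are misses.
Next is the case where branch prediction almost always hits.
What stands out is that the cycle count dropped by about 40%, and the execution time improved by about 40% as well.
In this way I was able to confirm the effect of branch prediction for myself.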
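Chapter 5: Cache Memory
A memory access from the CPU takes on the order of 10 to 100 cycles, and there is a limit to how many of those cycles out-of-order execution can fill. So a cache is introduced between the CPU and memory.
Because writes to memory go to the cache, the cache and memory can become inconsistent. That problem is covered in Chapter 10.
Even with a cache, cache misses themselves cannot be avoided, and their impact is large.
So the book classifies the situations in which cache misses occur into three types and addresses each one.
Compulsory (first-reference) misses are handled with cache lines, capacity misses with a cache hierarchy, and conflict misses with set-associative caches.
Cache lines and hierarchies come up often in discussions of caches, but explaining them in relation to the types of cache miss made things easy to understand.
Also, I had not properly understood the difference between fully associative and set-associative caches, or what the number of ways means, so this chapter's explanation was very welcome.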
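The set-associative idea can be sketched by splitting an address into a tag, a set index, and an offset within the line. The geometry below (64-byte lines, 64 sets, which would correspond to a 32 KiB 8-way cache) is an assumption chosen for illustration, not the configuration of any particular CPU, and the function names are hypothetical.

```python
def split_address(addr: int, line_size: int = 64, num_sets: int = 64):
    """Split an address into (tag, set_index, offset) for a set-associative cache."""
    offset = addr % line_size
    set_index = (addr // line_size) % num_sets
    tag = addr // (line_size * num_sets)
    return tag, set_index, offset

def sets_touched(base: int, stride: int, count: int) -> set:
    """Set indexes used by `count` accesses starting at `base`, `stride` bytes apart."""
    return {split_address(base + i * stride)[1] for i in range(count)}

# 16 accesses one cache line apart spread over 16 different sets.
print(len(sets_touched(0, 64, 16)))

# 16 accesses 4096 bytes apart all compete for a single set.
print(len(sets_touched(0, 4096, 16)))
```

With this geometry a stride of 4096 bytes equals line size times number of sets, so every access lands in the same set. A set only holds as many lines as there are ways, so with 8 ways, 16 such lines cannot all stay cached even though the cache as a whole is mostly empty. That is a conflict miss, and a fully associative cache would avoid it because any line could go anywhere.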
I almost never think about cache lines in everyday application code, but when reading library code I occasionally come across comments that are written with cache lines in mind. At the moment my level of understanding is roughly "keep a struct within 64 bytes and it is more likely to fit in the cache."
Measuring cache misses
This chapter experiments with how cache misses affect performance. Below is the program that tends to miss the cache.
I used 4096 as the address increment.
Next is the program whose memory accesses tend to land on cached lines.
It appears to miss only about 0.1% of the time.
In my environment there was no difference in execution speed.
Chapter 6: Virtual Memory
Virtual memory is the layer that translates between virtual addresses and physical addresses. Introducing virtual memory brings various benefits, but the mapping information itself (the page table) lives in memory. So accessing memory requires another memory access to resolve the mapping, for a total of two accesses.
That recreates the very problem the cache was meant to solve. So part of the page table is kept on the CPU (the TLB) to cut down on memory accesses during address translation.
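The relationship between the page table and the TLB can be sketched as a small cache in front of a dictionary. This is a toy under my own assumptions: a single-level page table (real ones are multi-level), 4 KiB pages, a 4-entry LRU TLB, and a made-up mapping. `ToyMMU` is a hypothetical name.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes

class ToyMMU:
    def __init__(self, page_table: dict, tlb_entries: int = 4):
        self.page_table = page_table   # virtual page number -> physical frame number
        self.tlb = OrderedDict()       # small LRU cache of the same mapping
        self.tlb_entries = tlb_entries
        self.page_table_walks = 0      # extra memory accesses caused by TLB misses

    def translate(self, vaddr: int) -> int:
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.tlb:
            self.tlb.move_to_end(vpn)             # TLB hit: no page table access
        else:
            self.page_table_walks += 1            # TLB miss: read the page table
            self.tlb[vpn] = self.page_table[vpn]
            if len(self.tlb) > self.tlb_entries:
                self.tlb.popitem(last=False)      # evict the least recently used entry
        return self.tlb[vpn] * PAGE_SIZE + offset

page_table = {vpn: vpn + 100 for vpn in range(64)}  # made-up mapping

local = ToyMMU(page_table)
for i in range(1000):
    local.translate(i * 4)                   # stays inside one page

spread = ToyMMU(page_table)
for i in range(1000):
    spread.translate((i % 64) * PAGE_SIZE)   # touches 64 pages in rotation

print(local.page_table_walks, spread.page_table_walks)
```

The same number of accesses costs one page table lookup when they stay within a page and one lookup per access when they cycle through more pages than the TLB can hold.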
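My understanding of the relationship between the page table and the TLB had been fuzzy, so this chapter's explanation was very welcome.
It also made sense that TLB misses are part of the reason switching between threads costs less than switching between processes.
In addition, I had not understood the motivation for making pages larger, so the explanation that it is to reduce TLB misses was also very easy to follow.
Once you know the techniques for raising the density of the CPU's instruction stream, it really sinks in that if I/O occurs on a page fault, all of that effort comes to nothing, which I was happy to finally grasp.
TLB miss measurement experiment
As with cache lines, this experiment looks at the effect of TLB misses. Below is the program that tends to hit the TLB.
Next is the program that tends to miss the TLB. The only change is that the code limiting the referenced memory addresses to a fixed range is gone, so the number of pages referenced goes up.
In my environment the execution time increased by about 7x.
This confirmed the effect of TLB misses.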
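Chapter 7: I/O
This chapter covers the mechanisms by which the CPU accesses external devices through instruction execution.
I had been mixing up memory-mapped I/O and access through dedicated I/O instructions, so the way this chapter sorts them out was a great help.
The explanation of the relationship between the DMA controller and the CPU was also easy to follow.
Reading device values with I/O instructions
The experiment in this chapter tries programs that read information from the real-time clock (RTC) and PCI Express.
Unfortunately, on my PC they ended in a Segmentation fault and did not work, but I did learn that I/O is carried out with the in and out instructions.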
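Chapter 8: System Calls, Exceptions, and Interrupts
This chapter covers the cases other than branch instructions in which the instruction stream is switched.
It organizes exception, interrupt, trap, fault, system call, and so on from the viewpoint of special switches of the instruction stream.
The whole book organizes CPU-related topics from the viewpoint of the instruction stream, and I am especially grateful for how this chapter does so.
It depends on how you define the terms, but I highly recommend this chapter to anyone who wants to establish a mental model of system calls, exceptions, and interrupts.
The earlier chapters touched on the interrupt controller, address translation, caches, the I/O bus, and so on, and the figure showing how those components correspond to the various events was also very easy to follow.
In addition, the explanation of what happens on a system call, exception, or interrupt is concrete, and the vector table is explained as well.
I had vaguely assumed that system calls are slow, and here you can understand why they are slow in terms of the pipeline and caches covered so far. You can tell the chapter structure has been carefully thought out.
Experiments with system calls and exceptions
Here we actually execute a system call.
You can see that, as long as you know the specification, a system call itself is just a matter of setting the arguments in registers and then executing a dedicated instruction.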
/* write(2) system-call */
mov eax, 1 /* system-call number: write() */
mov edi, 1 /* fd: stdout */
lea rsi, [rip + msg] /* buf: */
mov edx, 13 /* count: */
syscall
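As I learned from the Hello World book, this is the convention for x86 + Linux, and my understanding is that a design which passes system call arguments via the stack instead of registers is also possible.
There are also examples of division by zero and page faults.
Chapter 9: Multiprocessors
This chapter covers systems made up of two or more CPUs.
The explanation of how the terms multiprocessor and multicore are used depending on context also made sense.
I had assumed that "multicore" implied a single hardware configuration, so it was instructive to learn that a variety of configurations are possible.
In particular, it had never occurred to me that there are message-passing designs in which memory is not shared at all.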
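Chapter 10: Cache Coherence Control
This chapter covers the problems of the shared-memory design described in the previous chapter and how they are handled.
Specifically, each CPU holds its own cache, so multiple copies of the same memory address exist. This breaks consistency between caches, for example when one CPU cannot read the result of another CPU's update, and that has to be dealt with.
The book carefully explains the problem that cache coherence addresses, which made it easy to understand where the MSI protocol fits in.
When I first learned about cache coherence, I would never have imagined that writing to a variable triggers, behind the scenes, an exchange between CPUs to invalidate the corresponding cache line.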
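The invalidation traffic can be sketched as a tiny state machine for a single cache line under a simplified MSI protocol (Modified, Shared, Invalid). This is my own reduction for illustration: real implementations have more states and involve bus or directory messages that are not modeled here, and `MSILine` is a hypothetical name.

```python
class MSILine:
    """State of one cache line across several CPUs under a simplified MSI protocol."""

    def __init__(self, num_cpus: int):
        self.state = ["I"] * num_cpus   # each entry is "M", "S", or "I"
        self.invalidations = 0

    def read(self, cpu: int) -> None:
        if self.state[cpu] != "I":
            return                       # hit: M and S are both readable
        for other, s in enumerate(self.state):
            if s == "M":
                self.state[other] = "S"  # owner writes back and keeps a shared copy
        self.state[cpu] = "S"

    def write(self, cpu: int) -> None:
        for other, s in enumerate(self.state):
            if other != cpu and s != "I":
                self.state[other] = "I"  # other copies must be invalidated first
                self.invalidations += 1
        self.state[cpu] = "M"

line = MSILine(num_cpus=2)
line.write(0)
print(line.state)   # ['M', 'I']
line.read(1)
print(line.state)   # ['S', 'S']
line.write(1)
print(line.state)   # ['I', 'M']

# Two CPUs taking turns writing to the same line keep invalidating each other.
ping_pong = MSILine(num_cpus=2)
for i in range(10):
    ping_pong.write(i % 2)
print(ping_pong.invalidations)   # 9
```

The ping-pong case at the end is what happens when two threads keep writing to the same cache line, whether or not they touch the same variable.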
Coherence miss experiment
The experiment in this chapter runs a program in which two threads keep modifying memory alongside each other.
It checks what effect is observed depending on whether the memory locations being modified are on the same cache line or on different ones.
First, the case where the threads do not share a cache line.
You can see that the miss rate is only 0.19%, almost nothing.
Next, the case where they share a cache line.
The cache miss rate went up by about 70x, and the execution time by about 3x. This confirmed that false sharing of a cache line has an effect.
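Chapter 11: Memory Ordering
This chapter covers memory ordering as a means of enforcing some ordering relationship among multiple memory accesses issued by a single CPU.
The explanation of why memory accesses get reordered within a single CPU is also helpful.
I appreciated that the need for fence instructions and the behavior of acquire and release instructions come with concrete examples.
The chapter also stresses that the memory ordering it describes concerns memory accesses from a single CPU. For that reason I recommend reading this chapter together with Rust Atomics and Locks.
Sorting out that fence and the ldar and stlr instructions are strictly constraints on one CPU, and a separate matter from the happens-before relationship, advanced my understanding a great deal.
Memory ordering experiment
The experiment in this chapter demonstrates that the CPU really does reorder instructions when executing them.
Specifically, with variables x and y initialized to 0, the following two threads are run.
- thread1: performs x = 1, then loads y
- thread2: performs y = 1, then loads x
If the instructions were executed in order, then regardless of how the threads are scheduled, we should never end up with both loads returning 0 (x = 0, y = 0).
However, even on x86, a load is allowed to be reordered ahead of an earlier store to a different address, so the x = 0, y = 0 outcome is in fact observed.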
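Why the x = 0, y = 0 outcome becomes possible can be checked by brute force with a toy model. The sketch below enumerates every interleaving of the two threads, with and without a per-thread store buffer that delays when a store becomes visible to the other thread. The store buffer model and the event names (S, L, F) are my own simplification, not the book's experiment.

```python
from itertools import permutations

def outcomes(store_buffer: bool) -> set:
    """Enumerate (r1, r2) for  T1: x = 1; r1 = y   and   T2: y = 1; r2 = x."""
    # Events per thread: S = store issued, L = load, F = store visible in memory.
    events = [(1, "S"), (1, "L"), (1, "F"), (2, "S"), (2, "L"), (2, "F")]
    results = set()
    for order in permutations(events):
        pos = {e: i for i, e in enumerate(order)}
        ok = True
        for t in (1, 2):
            # Program order: the store is issued before the load and before its flush.
            if not (pos[(t, "S")] < pos[(t, "L")] and pos[(t, "S")] < pos[(t, "F")]):
                ok = False
            # Without a store buffer the store is visible before the load runs.
            if not store_buffer and not pos[(t, "F")] < pos[(t, "L")]:
                ok = False
        if not ok:
            continue
        mem = {"x": 0, "y": 0}
        regs = {}
        for t, kind in order:
            if kind == "F":
                mem["x" if t == 1 else "y"] = 1
            elif kind == "L":
                regs[t] = mem["y" if t == 1 else "x"]
        results.add((regs[1], regs[2]))
    return results

print(sorted(outcomes(store_buffer=False)))
print(sorted(outcomes(store_buffer=True)))
```

Without the store buffer, (0, 0) never appears in any interleaving. With it, both loads can run while both stores are still waiting in their buffers, and (0, 0) shows up. Forcing each store to become visible before the following load, which is roughly what a full fence between the store and the load enforces, corresponds to the `store_buffer=False` case.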
When I actually tried it, the result was as follows.
Next, I tried the version with a memory fence instruction inserted between the store and the load.
mov [rip + value_x], rax /* value_x = 1 */
mfence /* FORCE ORDERING */
mov r14, [rip + value_y] /* r14 = value_y */
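This confirmed that the memory fence instruction really does prevent the instructions from being reordered.
Chapter 12: Atomic Operations
This chapter deals with the case where two or more CPUs update the same shared memory.
Compared with memory ordering, the problem that atomic operations solve is easy to grasp, and the solution is simply to use dedicated instructions, so I found the topic surprisingly approachable.
There is also an explanation of how the swap and compare-and-swap instructions work. These seem to be exposed more or less directly as APIs in programming languages, so I came to think that learning how they work internally is the shortcut in the end.
Atomic operation experiment
The experiment in this chapter checks what happens when a shared variable is incremented from multiple threads without using an atomic instruction.
Specifically, one version increments with an atomic instruction: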
lock xadd [rip + counter], rax
The other version uses a plain add instruction, and the two are compared:
add rax, 1
$ ./counter_atomic
main(): start
child1(): start
child2(): start
child2(): finish: loop-variable = 5000000
child1(): finish: loop-variable = 5000000
main(): finish: counter = 10000000
$ ./counter_bad
main(): start
child1(): start
child2(): start
child2(): finish: loop-variable = 5000000
child1(): finish: loop-variable = 5000000
main(): finish: counter = 5133651
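In ./counter_bad, which uses the plain add instruction, I confirmed that counter does not end up with the intended value.
The book also includes an example where, even with the add instruction, counter happens to come out as intended depending on how often the threads access it, which shows how nasty this kind of bug is.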
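The lost updates can be reproduced deterministically with a toy simulation that spells out one unlucky interleaving by hand. This is only a sketch of the mechanism under my own assumptions: it uses no real threads, and the function names are hypothetical.

```python
def run_non_atomic(increments_per_thread: int) -> int:
    """Two simulated threads increment a shared counter with separate load and store steps."""
    counter = 0
    for _ in range(increments_per_thread):
        # One unlucky interleaving: both threads load before either stores.
        t1 = counter          # thread 1 loads
        t2 = counter          # thread 2 loads the same old value
        counter = t1 + 1      # thread 1 stores
        counter = t2 + 1      # thread 2 overwrites it: one increment is lost
    return counter

def run_atomic(increments_per_thread: int) -> int:
    """Each read-modify-write is indivisible, as with a lock-prefixed instruction."""
    counter = 0
    for _ in range(increments_per_thread):
        counter += 1          # thread 1: load, add, store as one step
        counter += 1          # thread 2: always sees thread 1's result
    return counter

print(run_non_atomic(1000))   # 1000
print(run_atomic(1000))       # 2000
```

In a real run the interleaving varies from moment to moment, so the result falls somewhere between these two patterns, as with the 5133651 out of 10000000 measured above.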
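Chapter 13: What to Focus On When Writing Fast Software
A summary of everything so far.
What to focus on when writing fast software is a hard problem, because it depends not only on the software being run but also on the programming language, the OS, the CPU, and so on.
Within that, the chapter shows how to approach the problem at a high level.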
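Appendix A: Learning More About CPUs, in Breadth and Depth
An introduction to sources for people who want to know more about CPUs.
It is impressive that it lists not only books but also papers, patents, and lecture materials. I am planning to read Systems Performance, Second Edition (詳解システム・パフォーマンス 第2版).
The afterword also lists the author's recommendations, so it is worth checking.
Appendix B: Basic Instructions of Each CPU
This explains the basic instructions of x86, Arm, and RISC-V.
My main source for learning assembly is basically the appendices of books, so I am very grateful for this.
I learned that in RISC-V, compare instructions do not update an implicit register.
Appendix C: An Example of a Modern CPU Implementation (BOOM)
About BOOM, a RISC-V implementation.
I have not managed to read this at all.
It is apparently written in Chisel. I am always surprised when Scala suddenly shows up in the context of hardware description languages.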
Appendix D: Micro-operations and Their Instruction Latency
About the micro-operations that appear in CISC architectures such as x86.
Until now I had thought that assembly corresponds 1:1 to the machine code the CPU actually executes, but this made me think that even assembly is an abstraction layer.
Appendix E: Techniques for Densifying the Pipeline in GPUs and Vector Processors
About GPUs, which were not covered in the main chapters.
I would like to know concretely how the CPU and GPU work together.
According to Chapter 7, the GPU should look like an I/O device from the CPU's side, so I suppose work is handed to the GPU as I/O.
But in that case, how is it handled from a programming language?
Appendix F: The Physical Difficulty of Improving CPU Performance
About the physical side of CPUs.
I want to study physics.