Transcript: Assignment Slides
Seoul National University
Computer Architecture Project #2
Cache Simulator
Objectives
To understand cache memory
Organization
Set associativity
Operation
Cache Read & Write, Hit & Miss
LRU replacement policy
Performance
Hit/miss ratio, miss penalty
To develop your own cache simulator
[Figure: Cache Simulator block diagram. Inputs: Memory Access Pattern, Cache Organization, Display Option. Outputs: Hit/Miss, Performance]
General Cache Organization (S, E, B)
S = 2^s sets
E = 2^e lines per set
B = 2^b bytes per cache block (the data)
Each line holds a valid bit (v), a tag, and the block's data bytes 0 1 2 … B-1.
If E = 1 (e = 0), “Direct Mapped Cache”
else if S = 1 (s = 0), “Fully Associative Cache”
else “E-Way Set Associative Cache”
Cache size:
C = S x E x B data bytes
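The relations on this slide can be sketched in C as follows. This is only an illustration: the type and function names (cache_geometry, num_sets, block_bytes, cache_bytes, classify) are hypothetical and not part of the assignment, and the classification checks E = 1 before S = 1, in the same order as the slide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three cache parameters. */
typedef struct {
    unsigned s;  /* number of set index bits, S = 2^s    */
    unsigned E;  /* number of lines per set               */
    unsigned b;  /* number of block offset bits, B = 2^b  */
} cache_geometry;

typedef enum { DIRECT_MAPPED, FULLY_ASSOCIATIVE, SET_ASSOCIATIVE } cache_kind;

/* S = 2^s */
uint64_t num_sets(cache_geometry g) {
    assert(g.s < 64);
    return UINT64_C(1) << g.s;
}

/* B = 2^b */
uint64_t block_bytes(cache_geometry g) {
    assert(g.b < 64);
    return UINT64_C(1) << g.b;
}

/* C = S x E x B data bytes */
uint64_t cache_bytes(cache_geometry g) {
    return num_sets(g) * g.E * block_bytes(g);
}

/* One line per set: direct mapped. One set: fully associative. */
cache_kind classify(cache_geometry g) {
    if (g.E == 1) return DIRECT_MAPPED;
    if (g.s == 0) return FULLY_ASSOCIATIVE;
    return SET_ASSOCIATIVE;
}
```

For s = 4, E = 1, b = 4 this gives 16 sets of one 16-byte block each, 256 data bytes in total, classified as direct mapped.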
E-way Set Associative Cache (Here: E = 2)
E = 2: Two lines per set
Assume that cache block size is 8 bytes
Address of short int: | t bits (tag) | 0…01 (set index) | 100 (block offset) |
find set: the set index bits 0…01 select the set
[Figure: eight cache lines, two per set; each line holds a valid bit (v), a tag, and block bytes 0 1 2 3 4 5 6 7]
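E-way Set Associative Cache (Here: E = 2)
E = 2: Two lines per set
Assume that cache block size is 8 bytes
Address of short int: | t bits (tag) | 0…01 (set index) | 100 (block offset) |
compare both: the tag is compared against both lines of the selected set
valid? + match: yes = hit
[Figure: the two lines of the selected set; each line holds a valid bit (v), a tag, and block bytes 0 1 2 3 4 5 6 7; the block offset selects a byte within the block]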
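E-way Set Associative Cache (Here: E = 2)
E = 2: Two lines per set
Assume that cache block size is 8 bytes
Address of short int: | t bits (tag) | 0…01 (set index) | 100 (block offset) |
compare both: the tag is compared against both lines of the selected set
valid? + match: yes = hit
[Figure: the two lines of the selected set; each line holds a valid bit (v), a tag, and block bytes 0 1 2 3 4 5 6 7]
block offset: short int (2 Bytes) is here, starting at byte 4 (binary 100) of the block
No match:
• One line in set is selected for eviction and replacement
• Replacement policies: random, least recently used (LRU), …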
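The lookup walked through on these slides (split the address, find the set, compare the tag against every valid line) could look roughly like this in C. The struct and function names are hypothetical, and addresses are assumed to fit in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* The three fields of an address: | tag | set index | block offset | */
typedef struct {
    uint64_t tag;
    uint64_t set;
    uint64_t offset;
} addr_fields;

/* Split addr using s set index bits and b block offset bits. */
addr_fields split_address(uint64_t addr, unsigned s, unsigned b) {
    assert(s + b < 64);
    addr_fields f;
    f.offset = addr & ((UINT64_C(1) << b) - 1);         /* low b bits    */
    f.set    = (addr >> b) & ((UINT64_C(1) << s) - 1);  /* next s bits   */
    f.tag    = addr >> (s + b);                         /* remaining bits */
    return f;
}

/* Bookkeeping for one cache line; the data bytes are not simulated. */
typedef struct {
    int valid;
    uint64_t tag;
} cache_line;

/* "valid? + match": returns the index of the hitting line in a set
 * of E lines, or -1 when no valid line carries the tag (a miss). */
int find_line(const cache_line *set, unsigned E, uint64_t tag) {
    for (unsigned i = 0; i < E; i++) {
        if (set[i].valid && set[i].tag == tag)
            return (int)i;
    }
    return -1;
}
```

With 8-byte blocks (b = 3) and, as an assumed example, four sets (s = 2), the address 0x2C (binary 1 01 100) splits into tag 1, set index 1, and block offset 4, matching the 0…01 / 100 pattern on the slide.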
LRU Replacement Policy
Theoretically…
[Figure: contents of one set after each access, most recently used line first]
Address:  1 2 3 4 1 2 3 1 2 3 4 5
Set:      1 2 3 4 1 2 3 1 2 3 4 5
            1 2 3 4 1 2 3 1 2 3 4
              1 2 3 4 1 2 3 1 2 3
Practically…
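One common practical approach, shown here as a sketch rather than as the method the slide had in mind, is to stamp each line with the value of an access counter and evict the valid line with the smallest stamp. All names below (lru_line, choose_victim, access_set) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int valid;
    uint64_t tag;
    uint64_t last_used;  /* access counter value at the latest hit or fill */
} lru_line;

/* Pick the line to fill: the first invalid line if there is one,
 * otherwise the line whose last use is oldest. */
unsigned choose_victim(const lru_line *set, unsigned E) {
    assert(E > 0);
    unsigned victim = 0;
    for (unsigned i = 0; i < E; i++) {
        if (!set[i].valid)
            return i;
        if (set[i].last_used < set[victim].last_used)
            victim = i;
    }
    return victim;
}

/* Reference one tag in a set of E lines. Returns 1 on a hit, 0 on a miss.
 * *clock is a counter shared by all accesses; it grows by one per call. */
int access_set(lru_line *set, unsigned E, uint64_t tag, uint64_t *clock) {
    ++*clock;
    for (unsigned i = 0; i < E; i++) {
        if (set[i].valid && set[i].tag == tag) {
            set[i].last_used = *clock;  /* refresh recency on a hit */
            return 1;
        }
    }
    unsigned v = choose_victim(set, E);
    set[v].valid = 1;
    set[v].tag = tag;
    set[v].last_used = *clock;
    return 0;
}
```

Replaying the address sequence 1 2 3 4 1 2 3 1 2 3 4 5 against a single three-line set gives three hits (the second 1 2 3 run) and leaves tags 3, 4, and 5 in the set.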
Performance
(Average Access Time) = (Hit Time) + (Miss Rate) × (Miss Penalty)
= (Hit Time) + [1 – (Hit Rate)] × (Miss Penalty)
Example
Suppose cache hit time is 1 cycle,
Miss penalty is 100 cycles,
and hit rate is 97%.
Then average access time is:
1 cycle + ( 1 – 0.97 ) × 100 cycles = 1 + 0.03 × 100 = 4 cycles.
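The formula translates directly into code. In this sketch the function names are hypothetical, and treating an empty trace as a miss rate of 0 is an assumption made only to avoid dividing by zero.

```c
#include <assert.h>

/* Miss rate from raw counters: misses / (hits + misses).
 * Returns 0 when there were no accesses at all (assumption). */
double miss_rate(unsigned long hits, unsigned long misses) {
    unsigned long total = hits + misses;
    return total == 0 ? 0.0 : (double)misses / (double)total;
}

/* (Average Access Time) = (Hit Time) + (Miss Rate) x (Miss Penalty), in cycles. */
double average_access_time(double hit_time, double rate, double miss_penalty) {
    assert(rate >= 0.0 && rate <= 1.0);
    return hit_time + rate * miss_penalty;
}
```

Calling average_access_time(1.0, 1.0 - 0.97, 100.0) reproduces the 4 cycles of the example above, up to floating-point rounding.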
Requirements of the cache simulator (1)
The cache simulator (hereinafter referred to as CSIM) shall implement arbitrary numbers of sets and lines, and an arbitrary block size.
You should implement a way to provide the numbers of sets and lines, and the block size, as inputs to CSIM.
CSIM shall read a trace file line by line and process it.
You should determine whether each memory operation is a cache hit or miss.
You should implement the LRU replacement policy.
CSIM shall report the result of the cache simulation.
You should report these three basic results: the numbers of hits, misses, and evictions.
You should be able to report the average access time of the cache simulation.
You should be able to report whether each memory access in the trace file results in a cache hit or miss.
Restrictions & Advice
Implement a method for passing the input parameters.
You should implement it by argument passing. (full credit)
If you can’t, you can use standard input such as scanf(). (lower credit)
Evaluate only data cache performance.
Therefore, you should ignore instruction loads.
You should assume that the memory accesses are aligned properly.
Therefore, you can ignore the requested size in the trace file.
You should evaluate your CSIM with at least 3 different sets of trace data. You can use the one provided with this project.
Calculate the average access time using the assumption below:
Hit time = 1 cycle, miss penalty = 100 cycles.
Compile your CSIM without warnings.
How to trace memory accesses
“valgrind”
GPL licensed programming tool for memory debugging, memory leak detection,
and profiling. (from http://en.wikipedia.org/wiki/Valgrind)
Usage: >> valgrind --log-fd=1 --tool=lackey -v --trace-mem=yes ls -l
– Valgrind prints out the memory accesses of “ls -l” on stdout, so you need to capture them by:
>> valgrind --log-fd=1 --tool=lackey -v --trace-mem=yes ls -l > ls.trace
Output Format: [space]operation address,size
Output          Type              Example             Naccess  [space]
I 0400d7d4,8    Instruction load  All instructions    1        X
L 04f6b868,8    Data Load         movl (%eax), %ebx   1        O
S 7ff0005c8,8   Data Store        movl %eax, (%ebx)   1        O
M 0421c7f0,4    Data Modify       incl (%ecx)         2        O
([space] column: O = the output line starts with a space, X = it does not)
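A trace line in this format can be parsed with sscanf. The sketch below keeps only data accesses and rejects everything else, which covers instruction loads (no leading space) as well as Valgrind's own log lines. The names trace_record and parse_trace_line are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

typedef struct {
    char op;                  /* 'L' (load), 'S' (store), or 'M' (modify) */
    unsigned long long addr;  /* address, printed in hexadecimal in the trace */
    unsigned size;            /* requested size in bytes */
} trace_record;

/* Parse one line of the form "[space]operation address,size".
 * Returns 1 and fills *out for a data access; returns 0 otherwise. */
int parse_trace_line(const char *line, trace_record *out) {
    assert(line != NULL && out != NULL);
    if (line[0] != ' ')
        return 0;  /* data accesses start with a space; "I" lines do not */
    char op;
    unsigned long long addr;
    unsigned size;
    if (sscanf(line, " %c %llx,%u", &op, &addr, &size) != 3)
        return 0;
    if (op != 'L' && op != 'S' && op != 'M')
        return 0;
    out->op = op;
    out->addr = addr;
    out->size = size;
    return 1;
}
```

In a full simulator this would typically sit inside a loop that reads the trace file with fgets and passes each accepted record to the cache model.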
Reference Cache Simulator
Usage: >>./csim [-v] -s <s> -E <E> -b <b> -t <trace file>
-v: Optional verbose flag that displays trace info
-s <s>: Number of set index bits (S = 2^s is the number of sets)
-E <E>: Associativity (number of lines per set)
-b <b>: Number of block bits (B = 2^b is the block size)
-t <trace file>: Name of the valgrind trace to replay
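A command line of this shape can be handled with POSIX getopt. The following is a minimal sketch under that assumption; the csim_options struct and parse_options function are hypothetical names, and input validation is deliberately light (atoi does not report malformed numbers).

```c
#define _POSIX_C_SOURCE 200809L  /* expose getopt, optarg, optind in <unistd.h> */
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

typedef struct {
    int verbose;             /* 1 when -v was given */
    int s;                   /* number of set index bits */
    int E;                   /* lines per set */
    int b;                   /* number of block bits */
    const char *trace_path;  /* argument of -t */
} csim_options;

/* Parse "[-v] -s <s> -E <E> -b <b> -t <trace file>".
 * Returns 0 on success, -1 on an unknown option or a missing required one. */
int parse_options(int argc, char *const argv[], csim_options *opt) {
    assert(opt != NULL);
    opt->verbose = 0;
    opt->s = -1;
    opt->E = -1;
    opt->b = -1;
    opt->trace_path = NULL;

    optind = 1;  /* restart scanning so the function can be called again */
    int c;
    while ((c = getopt(argc, argv, "vs:E:b:t:")) != -1) {
        switch (c) {
        case 'v': opt->verbose = 1; break;
        case 's': opt->s = atoi(optarg); break;
        case 'E': opt->E = atoi(optarg); break;
        case 'b': opt->b = atoi(optarg); break;
        case 't': opt->trace_path = optarg; break;
        default:  return -1;  /* getopt has already printed a diagnostic */
        }
    }
    if (opt->s < 0 || opt->E < 1 || opt->b < 0 || opt->trace_path == NULL)
        return -1;
    return 0;
}
```

A main function would call parse_options(argc, argv, &opt) first and print the usage line when it returns -1.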
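[Figure: cache organization diagram, as on the General Cache Organization (S, E, B) slide. S = 2^s sets; each line holds a valid bit (v), a tag, and B = 2^b bytes per cache block (the data), numbered 0 1 2 … B-1. Cache size: C = S x E x B data bytes]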
Cache Simulation Example (1)
Usage: >>./csim [-v] -s <s> -E <E> -b <b> -t <trace file>
Example: >>./csim -v -s 4 -E 1 -b 4 -t ./traces/yi.trace
Number of set index bits = 4 (16 sets)
Associativity = 1 (Direct Mapped Cache)
Number of block bits = 4 (16 bytes per cache block)
Output
L 10,1 miss
M 20,1 miss hit
….
hits: 4 misses: 5 evictions: 3
Cache Simulation Example (2)
Example memory access pattern
Oper.    Address  Byte
Load     0x10     1
Modify   0x20     1
Load     0x22     1
Store    0x18     1
Load     0x110    1
Load     0x210    1
Modify   0x12     1
Initial cache state (S = set, V = valid bit: I = invalid, V = valid)
S      V   Tag
0-F    I
Cache Simulation Example (3)
Cache state after access 1 of the example pattern (Load 0x10, 1 byte)
S            V   Tag
1            V   0x0
0, 2-F       I
Hit: 0   Miss: 1   Evict: 0
Cache Simulation Example (4)
Cache state after access 2 of the example pattern (Modify 0x20, 1 byte)
S            V   Tag
1            V   0x0
2            V   0x0
0, 3-F       I
Hit: 1   Miss: 2   Evict: 0
Cache Simulation Example (5)
Cache state after access 3 of the example pattern (Load 0x22, 1 byte)
S            V   Tag
1            V   0x0
2            V   0x0
0, 3-F       I
Hit: 2   Miss: 2   Evict: 0
Cache Simulation Example (6)
Cache state after access 4 of the example pattern (Store 0x18, 1 byte)
S            V   Tag
1            V   0x0
2            V   0x0
0, 3-F       I
Hit: 3   Miss: 2   Evict: 0
Cache Simulation Example (7)
Cache state after access 5 of the example pattern (Load 0x110, 1 byte)
S            V   Tag
1            V   0x1
2            V   0x0
0, 3-F       I
Hit: 3   Miss: 3   Evict: 1
Cache Simulation Example (8)
Cache state after access 6 of the example pattern (Load 0x210, 1 byte)
S            V   Tag
1            V   0x2
2            V   0x0
0, 3-F       I
Hit: 3   Miss: 4   Evict: 2
Cache Simulation Example (9)
Cache state after access 7 of the example pattern (Modify 0x12, 1 byte)
S            V   Tag
1            V   0x0
2            V   0x0
0, 3-F       I
Hit: 4   Miss: 5   Evict: 3
Average Access Time
= 1 + (5 / 9) * 100 ≈ 56.6 cycles
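The whole example can be replayed with a small model that tracks only valid bits, tags, and a last-use stamp per line. This is a sketch under stated assumptions, not the reference CSIM: every name in it is hypothetical, a Modify is counted as a load followed by a store (two references, as in the Naccess column of the trace format), and the requested size is ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    int valid;
    uint64_t tag;
    uint64_t last_used;  /* access counter value at the latest hit or fill */
} line_t;

typedef struct {
    unsigned s, E, b;    /* set index bits, lines per set, block bits */
    line_t *lines;       /* 2^s * E lines, stored set by set */
    uint64_t clock;      /* counts cache references */
    unsigned long hits, misses, evictions;
} cache_t;

/* Allocate an empty cache. Returns 0 on success, -1 if allocation fails. */
int cache_init(cache_t *c, unsigned s, unsigned E, unsigned b) {
    assert(c != NULL && E > 0 && s + b < 64);
    c->s = s; c->E = E; c->b = b;
    c->clock = 0;
    c->hits = c->misses = c->evictions = 0;
    c->lines = calloc(((size_t)1 << s) * E, sizeof *c->lines);
    return c->lines != NULL ? 0 : -1;
}

void cache_free(cache_t *c) {
    free(c->lines);
    c->lines = NULL;
}

/* One cache reference: update the counters and apply LRU replacement. */
void cache_touch(cache_t *c, uint64_t addr) {
    uint64_t set = (addr >> c->b) & ((UINT64_C(1) << c->s) - 1);
    uint64_t tag = addr >> (c->s + c->b);
    line_t *lines = c->lines + set * c->E;
    c->clock++;

    for (unsigned i = 0; i < c->E; i++) {
        if (lines[i].valid && lines[i].tag == tag) {
            lines[i].last_used = c->clock;
            c->hits++;
            return;
        }
    }
    c->misses++;

    /* Prefer an invalid line; otherwise take the least recently used one. */
    unsigned victim = 0;
    for (unsigned i = 0; i < c->E; i++) {
        if (!lines[i].valid) { victim = i; break; }
        if (lines[i].last_used < lines[victim].last_used) victim = i;
    }
    if (lines[victim].valid)
        c->evictions++;
    lines[victim].valid = 1;
    lines[victim].tag = tag;
    lines[victim].last_used = c->clock;
}

/* Apply one trace operation: 'L' and 'S' reference the cache once,
 * 'M' (modify) references it twice, as a load followed by a store. */
void cache_access(cache_t *c, char op, uint64_t addr) {
    cache_touch(c, addr);
    if (op == 'M')
        cache_touch(c, addr);
}
```

Initialised with s = 4, E = 1, b = 4 and fed the seven accesses of the example (L 0x10, M 0x20, L 0x22, S 0x18, L 0x110, L 0x210, M 0x12), this model ends with 4 hits, 5 misses, and 3 evictions, the same counters as on the slide.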
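Report Writing Guidelines (1)
[Figure: project phases: Design, Implementation, Testing]
Include the following:
Design requirements
Restate the given CSIM design requirements so that they fit your own CSIM.
Implementation
Describe how your CSIM works and how it reflects the design requirements.
Describe how to use your CSIM and how it outputs the simulation results.
Testing
Describe how you verified the CSIM requirements.
Perform the verification with at least 3 sets of trace data.
In addition, describe how you obtained the trace data.
Attach screenshots that show what your CSIM implementation does.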
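Report Writing Guidelines (2)
[Figure: project phases: Design, Coding, Testing]
Include the following:
Performance evaluation
Measure the performance of each cache organization (direct mapped, E-way set associative, and fully associative cache) and compare the results.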
Grading Criteria
Title            Pts.  Details                  Pts.  Description
CSIM             70    Submission               10    Warning: -0.5 pt. each / Error: -1 pt. each
                       Parameter Input          10    Argument Passing: 10 pts., Other methods: 5 pts.
                       Cache Organization       5     Using dynamic allocation: 5 pts.
                                                      - Using arrays: 2 pts.
                       Cache Operation          20    Correct handling of hits/misses: 10 pts.
                                                      Replacement policy (LRU): 5 pts.
                                                      - Implementing random replacement: 3 pts.
                                                      Displaying the result (Hit/Miss) of each memory access: 4 pts.
                                                      - Providing an option to turn the display on or off: 1 pt.
                       Performance Evaluation   5     Reporting the correct Average Access Time
                       Comments                 10    Comment each line as far as possible
Report           30    Design Requirements      7
                       Implementation           7
                       Testing                  8
                       Performance Evaluation   8
Late Submission        Per day                  -5    Submissions are accepted up to 1 week after the deadline
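How to Submit
Submit the deliverables in the submission list below by e-mail.
E-mail address: yonghunlee@archi.snu.ac.kr
E-mail subject: “[CSIM]<student ID>_<name>”
Compress the deliverables into “<student ID>_<name>.zip” or “<student ID>_<name>.tar” before submitting.
Submission list
CSIM source code
Project report
Trace files used to verify CSIM
Deadline: Wednesday, Dec. 18, 2013, 23:59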