I am confused about how data is accessed in a 2-way set-associative cache.
For example, C = A*B*S:
C = 16 KB
A = 2
B = 32 bytes
S = 256
offset = lg(B) = 5
index = lg(S) = 8
tag = 32 - offset - index = 19
Say I have the following addresses:
tag | index | offset
1000 0000 0000 0000 000|0 0000 000|1 0000
1000 0000 0000 0000 000|0 0000 000|0 0000
1000 0000 0000 0000 000|0 0000 000|1 1010
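To make the split concrete, here is a rough C sketch of how I am decoding these addresses (field widths taken from my numbers above: 5 offset bits, 8 index bits; the constants and the decode() helper are just my own illustration):

    #include <stdint.h>
    #include <stdio.h>

    #define OFFSET_BITS 5   /* lg(32-byte block) */
    #define INDEX_BITS  8   /* lg(256 sets)      */

    /* Split a 32-bit address into tag / index / offset fields. */
    static void decode(uint32_t addr, uint32_t *tag, uint32_t *index, uint32_t *offset)
    {
        *offset = addr & ((1u << OFFSET_BITS) - 1);
        *index  = (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
        *tag    = addr >> (OFFSET_BITS + INDEX_BITS);
    }

    int main(void)
    {
        uint32_t addrs[] = { 0x80000010u, 0x80000000u, 0x8000001Au }; /* the three addresses above */
        for (int i = 0; i < 3; i++) {
            uint32_t tag, index, offset;
            decode(addrs[i], &tag, &index, &offset);
            printf("addr 0x%08X -> tag 0x%X, index %u, offset %u\n",
                   (unsigned)addrs[i], (unsigned)tag, (unsigned)index, (unsigned)offset);
        }
        return 0;
    }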
and my cache looks like
         way 0                              way 1
index    valid dirty tag      data          valid dirty tag      data
0:       1     0     0x80...  some data1    1     0     0x80...  some data2
1:       .                                  .
2:       .                                  .
3:       .                                  .
How do I determine which of the two cache arrays I should take the data from (data1 vs. data2) when the index and tag bits are the same?
Likewise, how do I determine which of the data in the two arrays I should kick out when I need to update the cache with the same index and tag bits?
I am thinking it has to do with the offset bits, but I am not sure how to use them or what exactly they represent or map to in the cache array.
How could your cache get into this state? An access with the same index and tag would hit instead of allocating a second entry.
Having the same line of physical memory cached twice (under a different index or tag) can happen because of homonym or synonym problems caused by virtual indexing (or tagging), but two entries in the same set with the same tag is just impossible in a correctly-designed cache.
You don't evict in this case; that's a cache hit.
The index selects a set. The tags of the lines in that set (2 of them, in your case) are compared against the tag bits of the address. If one matches, it's a hit; if not, it's a miss.
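In C-like terms, a sketch of that lookup could look like this (the struct layout and the names are just illustrative, sized to your 256-set, 2-way example, not any particular hardware):

    #include <stdbool.h>
    #include <stdint.h>

    #define WAYS 2
    #define SETS 256

    struct line {
        bool     valid;
        bool     dirty;
        uint32_t tag;
        uint8_t  data[32];        /* one 32-byte block */
    };

    struct line cache[SETS][WAYS];

    /* Return the matching way on a hit, or -1 on a miss. */
    int lookup(uint32_t set, uint32_t tag)
    {
        for (int way = 0; way < WAYS; way++) {
            if (cache[set][way].valid && cache[set][way].tag == tag)
                return way;       /* hit: at most one way can match */
        }
        return -1;                /* miss: no valid line in this set has this tag */
    }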
So an access with the same index but different tag is when you need to evict one of the existing lines. The usual replacement policy is LRU. One way to implement this is by having the position in the set be significant. Every time a line is accessed, its tag is moved to the MRU position. When a line has to be evicted from the set, the LRU position is evicted. This will be the line that was accessed least recently.
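A minimal sketch of that, assuming a single LRU bit per set that records which way to evict next (that's enough state to track LRU order for 2 ways; it reuses the cache/SETS definitions from the lookup sketch above):

    /* One entry per set: which way to evict next (the LRU way). */
    uint8_t lru_way[SETS];

    /* On any access to `way`, the other way becomes least recently used. */
    void touch(uint32_t set, int way)
    {
        lru_way[set] = (uint8_t)(1 - way);
    }

    /* On a miss, overwrite the LRU way and make it the MRU way. */
    int replace(uint32_t set, uint32_t tag)
    {
        int victim = lru_way[set];
        cache[set][victim].valid = true;
        cache[set][victim].dirty = false;
        cache[set][victim].tag   = tag;
        /* ...fill cache[set][victim].data from the next level here... */
        touch(set, victim);
        return victim;
    }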
Normally newly-added lines go in the MRU position, but adaptively adding into the LRU position avoids evicting valuable data while looping over giant arrays. See this blog post about Intel IvyBridge's adaptive L3 replacement policy for some clever experimental testing to investigate the hardware behaviour, and some nice explanations.
Nope, the offset bits select bytes within a line; hit/miss/replacement decisions don't look at them at all. The cache-access hardware uses the offset and the access size to select which range of bytes within the line to read or update after the right line has been found.
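For example, once the lookup has found the matching way, a 4-byte read would use only the offset to pick bytes out of the line (again just a sketch reusing the structures above; read_word is a made-up helper):

    #include <string.h>

    /* Read a 32-bit word from a line that already hit: the offset only
       selects which bytes of the 32-byte block to return.             */
    uint32_t read_word(uint32_t set, int way, uint32_t offset)
    {
        uint32_t word;
        memcpy(&word, &cache[set][way].data[offset & ~3u], sizeof word);
        return word;
    }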