Demand Paging: Calculating effective memory access

2020-05-29 17:57发布

问题:

I can't understand the answer to this question:

Consider an OS using one level of paging with TLB registers. If the page fault rate is 10% and dirty pages should be reloaded when needed, calculate the effective access time if:

  • TLB Lookup = 20 ns
  • TLB Hit ratio = 80%
  • Memory access time = 75 ns
  • Swap page time = 500,000 ns
  • 50% of pages are dirty.

Answer:

T = 0.8(TLB+MEM) + 0.2(0.9[TLB+MEM+MEM] + 0.1[TLB+MEM + 0.5(Disk) + 0.5(2Disk+MEM)]) = 15,110 ns

Can you explain why?

回答1:

In this context "effective" time means "expected" or "average" time. So you take the times it takes to access the page in the individual cases and multiply each with it's probability. The expression is somewhat complicated by splitting to cases at several levels. The cases are:

  • 80% of time the physical address is in the TLB cache. That gives us 80% times access to TLB register plus access to the page itself: 0.8(TLB + MEM)
  • remaining 20% of time it is not in TLB cache. That splits into further cases, so it gives us 0.2(loooong expression) (the expression doesn't actually have that parenthesis, but I'll take it as a typo, because the coefficients don't add up to 1 without it and it makes no sense if they don't). The cases are:
    • 90% (of those 20%) of times the page is still mapped, but the address fell out of the cache, so we have to do extra memory read from the page map. So 90% times access to TLB register plus access to the page table plus access to the page itself: 0.9(TLB + MEM + MEM). One-level paging is mentioned, so it's just 1 extra memory access here, but practical implementations generally have two-level paging and would thus have 2 extra memory accesses.
    • 10% (of those 20%; the expression suggests this, but the question is not clear and suggests rather that it's 10% overall) of times the page needs to be loaded from disk. This gives 10% times the (failed) access to TLB register and (failed) access to page table and than it needs to load the page. To load it, it will have to make room for it, so it will have to drop another page. This splits to two options:
      • 50% the page to be dropped is clean, so the system just needs to read the new content: 0.5(Disk).
      • 50% the page to be dropped is dirty, so the system needs to write it to disk (MEM+Disk) and read in the new content (Disk), giving the 0.5(2Disk + MEM)

I think some extra memory accesses should be included in the last two (swap) cases as two accesses are needed to mark the previous page unavailable and the new page available in the page table.

It is also highly unrealistic, because in real system when a room for reading in a page is needed, the system always chooses a clean page to replace. To make sure it has clean pages there is a background process that goes over dirty pages and writes them out. It takes some computing resources, so it should actually count toward memory access a bit, but much less since the page faults don't need to wait for the writes to finish.

The expression is actually wrong. It should be either

T = 0.8(TLB + MEM) + 0.2((0.9(TLB + MEM + MEM)) + 0.1(TLB + MEM + 0.5(Disk) + 0.5(2Disk + MEM)))

if page faults are 10% of TLB misses or

T = 0.8(TLB + MEM) + 0.1(TLB + MEM + MEM) + 0.1(TLB + MEM + 0.5(Disk) + 0.5(2Disk + MEM))

if page-faults are 10% of all accesses. You are not explicit about it, but I would assume the later if the formula didn't include that 0.2*0.9, which suggests the former.



回答2:

reading the question I was thinking about a more realistic scenario based, for instance, on a two-level paging system.

In this scenario, as far as I can understand, there could be the case page table (PT) itself is not resident in memory (PT itself may have been paged out from RAM into swapping area (e.g. Linux) or into pagefile (e.g. Windows)).

Regarding page directory (the first level of paging hierarchy) I believe it has to be always resident in RAM (otherwise, upon context switch, the x86 CR3 register content would be totally useless...)

Thus it exist a percentage of occurrences we have to include at least:

  • Disk access time needed to read & bring in memory (from swapping area or pagefile) the PT itself
  • MEM time needed to access PT now in memory

Does it make sense ?