Does the TLB make a significant impact on the performance of the system?
It might appear not…
Consider a 4 MB process on a system with 4 KB pages. That process will have 1024 pages (roughly 1000).
If the TLB has 32 entries, then only 32 of those 1000 pages can be in the TLB at any one time.
Therefore, one might think that only 32/1000 page requests would be found in the TLB.
That might be true if the process were just making random memory requests.
Luckily, most processes do not behave that way.
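To see how much locality matters, here is a small, hypothetical simulation (not from the notes): a 32-entry TLB with LRU replacement serving a 1000-page process, first under uniformly random references and then under 90/10-style references where 90% of accesses go to 100 "hot" pages.

```python
# Hypothetical simulation: hit rate of a 32-entry LRU TLB for a
# 1000-page process, comparing random vs. localized reference streams.
from collections import OrderedDict
import random

def tlb_hit_rate(refs, capacity=32):
    tlb = OrderedDict()                  # page -> None, ordered by recency
    hits = 0
    for page in refs:
        if page in tlb:
            hits += 1
            tlb.move_to_end(page)        # refresh recency on a hit
        else:
            if len(tlb) >= capacity:
                tlb.popitem(last=False)  # evict least recently used entry
            tlb[page] = None
    return hits / len(refs)

random.seed(1)
N_PAGES, N_REFS = 1000, 100_000
hot = range(100)                         # the "10%" of heavily used pages

uniform = [random.randrange(N_PAGES) for _ in range(N_REFS)]
localized = [random.choice(hot) if random.random() < 0.9
             else random.randrange(N_PAGES) for _ in range(N_REFS)]

print(f"uniform:   {tlb_hit_rate(uniform):.2%}")
print(f"localized: {tlb_hit_rate(localized):.2%}")
```

Under random access the hit rate hovers near 32/1000; with locality it is an order of magnitude higher, even though the TLB cannot hold all 100 hot pages.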
Locality of Reference:
Locality of reference says that most processes make a large number of references to a small number of pages, rather than a small number of references to a large number of pages.
Locality of reference can be summarized by the 90/10 rule of thumb.
A process spends 90% of its time in 10% of its code.
It is not clear that the same holds for a program's data. (But we will ignore that for now.)
What does that do for us as far as the TLB goes?
Well, given that about 90% of the references will be to 10% of the pages, we would do very well if the TLB could hold the translations for that most-often-used 10%.
10% of 1000 pages is 100 pages, so a 32-entry TLB cannot quite hold all 100 translations.
However, holding the 32 most often used pages will still ensure that most page requests can be handled by the TLB.
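The payoff can be quantified with a standard effective-access-time (EAT) calculation. The timings below (1 ns TLB lookup, 100 ns memory access) and the single-level page table are illustrative assumptions, not figures from the notes.

```python
# Back-of-the-envelope effective access time, assuming a single-level
# page table, a 1 ns TLB lookup, and a 100 ns memory access
# (illustrative numbers only).
def eat(hit_ratio, t_tlb=1, t_mem=100):
    hit = t_tlb + t_mem        # translation found in TLB: one memory access
    miss = t_tlb + 2 * t_mem   # TLB miss: extra access to read the page table
    return hit_ratio * hit + (1 - hit_ratio) * miss

# Random-access-like hit ratio vs. locality-driven hit ratios:
for h in (0.032, 0.90, 0.98):
    print(f"hit ratio {h:.1%}: EAT = {eat(h):.1f} ns")
```

At a 3.2% hit ratio the EAT is nearly double the raw memory access time; at 90% or better it is within a few percent of it.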
Most machines that have virtual memory use some combination of multi-level paging with a TLB: the Intel Pentium, Macintosh, Sun SPARC, and DEC Alpha, for example.
A flowchart that shows the use of the TLB is shown above. Note that by the principle of locality, it is hoped that most page references will involve page table entries in the cache.
The page size of a system has a significant impact on overall performance.
Internal fragmentation: a smaller page size means less internal fragmentation, and less internal fragmentation means better use of main memory.
Smaller page size: more pages per process, and therefore larger page tables. In multiprogramming environments this means the page tables themselves may be in virtual memory, which can cause a double page fault: one to bring in the required portion of the page table, and one to bring in the required process page.
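The page-table-size effect is easy to work out. The sketch below assumes 4-byte page table entries (an assumed PTE size, not from the notes) for the 4 MB process discussed earlier.

```python
# Sketch of the page-size / page-table-size trade-off for a 4 MB
# process, assuming 4-byte page table entries (an assumed PTE size).
def page_table_bytes(process_bytes, page_bytes, pte_bytes=4):
    n_pages = -(-process_bytes // page_bytes)   # ceiling division
    return n_pages * pte_bytes

MB = 1 << 20
for page_kb in (1, 4, 16):
    size = page_table_bytes(4 * MB, page_kb * 1024)
    print(f"{page_kb:>2} KB pages -> {size // 1024} KB page table")
```

Quartering the page size quadruples the page table, which is why large tables may end up paged out themselves.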
The graph below shows the effect on page faults of two variables: the page size, and the number of frames allocated to a process.
The leftmost graph shows that as page size increases, the number of page faults correspondingly increases. This is because the principle of locality of reference is weakened. Eventually, as the page size approaches the size of the entire process, the faults begin to decrease.
On the right, the graph shows that for a fixed page size, faults decrease as the number of frames in memory grows. Thus, a software policy (the amount of memory allocated to each process) affects a hardware design decision (page size).
Of course, the actual physical size of memory is important: more memory should reduce page faults. However, as main memory grows, the address space used by applications grows as well, which works against this. Moreover, modern programming techniques such as object-oriented programming (which encourages the use of many small program and data modules, with references scattered over a large number of objects) reduce the locality of reference within a process.
§ A small page size reduces internal fragmentation.
§ A large page size reduces the number of pages needed, thereby reducing the size of the page table (page table takes up less memory).
§ A large page size reduces the overhead in swapping pages in or out. In addition to the processing time required to handle a page fault, transferring two 1 KB blocks of data from disk takes almost twice as long as transferring one 2 KB block.
§ A smaller page size, with its finer resolution, is better able to target the process’s locality of references. This reduces the amount of unused information copied back and forth between memory and swapping storage. It also reduces the amount of unused information stored in main memory, making more memory available for useful purposes.
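The internal-fragmentation point above can be put in numbers using the standard rule of thumb that, on average, each process wastes half of its last page (an assumed model, not a figure from the notes).

```python
# Rule-of-thumb internal fragmentation: on average the last page of
# each process is half empty, so waste per process ~= page_size / 2.
def expected_internal_frag(page_bytes, n_processes):
    return n_processes * page_bytes // 2

for page_kb in (1, 4, 16):
    waste = expected_internal_frag(page_kb * 1024, n_processes=100)
    print(f"{page_kb:>2} KB pages, 100 processes -> ~{waste // 1024} KB wasted")
```

With 100 resident processes, moving from 4 KB to 16 KB pages quadruples the expected waste, which is exactly the tension against the page-table and swapping arguments for large pages.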
When a page fault occurs and all frames are occupied a decision must be made as to what page to swap out. The page replacement algorithm is used to choose what page should be swapped.
Note that if there is no free frame, a busy one must be swapped out and the new one swapped in. This means a double transfer, which takes time. However, if we can find a busy page that has not been modified since it was loaded from disk, then it does not need to be written back out. The modified bit in the page table entry indicates whether a page has been modified or not. Thus, the I/O overhead is cut in half if the victim page is unmodified.
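The modified-bit saving can be sketched directly: the function below (a hypothetical helper, not part of any real OS API) counts the disk transfers needed to service a page fault depending on whether the victim page is dirty.

```python
# Sketch: disk transfers needed to service one page fault, depending
# on whether the victim frame's page has its modified (dirty) bit set.
def fault_transfers(victim_dirty):
    write_back = 1 if victim_dirty else 0   # write victim out only if modified
    read_in = 1                             # always read the new page in
    return write_back + read_in

print(fault_transfers(victim_dirty=True))   # double transfer: write + read
print(fault_transfers(victim_dirty=False))  # single transfer: read only
```

A replacement algorithm that preferentially evicts clean pages can therefore halve the I/O per fault, at the cost of possibly evicting a more useful page.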