Overview of Basic Concepts⌗
The concept of workingset was first introduced by Professor Peter Denning in 1968. The fundamental idea is that programs exhibit temporal locality in their memory usage. At time $t$, the workingset of a program is considered to be the set of pages accessed by the process during the interval $(t-\tau, t)$, denoted as $W(t, \tau)$, where $\tau$ is known as the workingset parameter, a user-defined value. The original paper suggested that memory not accessed within $\tau$ could be reclaimed through sampling.
In the Linux kernel, workingset refers to the inactive
and active
LRU (Least Recently Used) lists. For file pages read via the read
system call, they are placed in the inactive
LRU list upon the first read. If these pages are repeatedly read, they are promoted to the active
LRU list. For pages mapped via mmap
, the process is similar: they are initially placed in the inactive
LRU list and may be activated to the active
LRU list if deemed sufficiently active during memory reclamation.
When memory pressure triggers reclamation, pages are freed from the tail of the inactive
list. If the inactive
list becomes significantly smaller than the active
list, pages from the active
list are moved to the inactive
list to balance their sizes.
Overall, the aging process of pages involves moving from the head of the active
list to its tail, then to the head of the inactive
list, and finally to the tail of the inactive
list before being reclaimed.
This article focuses on the workingset for file pages, although the current kernel also considers anonymous pages, as the underlying principles are consistent.
Lifecycle Model of File Pages⌗
Assuming the machine’s memory is divided into two parts: the active
list and the inactive
list. This assumption simplifies analysis. The sizes of these two parts are mutually adjustable, with the ratio of active:inactive
not exceeding 3:1
according to kernel settings. If the active
list grows too large, pages are moved to the inactive
list to maintain the balance.
The kernel’s page lifecycle model can be summarized as follows:
- File pages are placed at the head of the
inactive
list when read into memory. - Each new file page read causes a page at the tail of the
inactive
list to be reclaimed, making space for the newly inserted page, effectively moving the pages in the inactive list toward the tail by one position. - Pages with high access frequency, detected during reclamation or repeated reads via the
read
system call, are promoted to theactive
list, causing pages ahead of them in theinactive
list to move one position towards the tail. - If the
active
list grows more than three times the size of theinactive
list, pages from the tail of theactive
list are moved to the head of theinactive
list to rebalance the ratio.
This simplified model describes the kernel’s page lifecycle. It also shows that pages in the inactive
list are either activated or reclaimed, causing pages ahead of them to move towards the tail (aging). A page is reclaimed after moving toward the tail through len(inactive)
positions.
Problem Introduction⌗
Some file pages are accessed periodically. When first loaded into memory, they are placed at the head of the inactive
list. As other file pages are read into inactive
list and pages at the tail are reclaimed, these file pages are gradually pushed to the tail of the inactive
list and eventually reclaimed. After being reclaimed for a while, they are accessed again, repeating the process of loading, moving, and reclaiming (referred to as periodically accessed pages).
These pages are not highly active; if their access frequency were high enough, they would have been promoted to the active
list. The fact that they oscillate within the inactive
list indicates moderate access frequency.
However, the active
list only moves pages to the inactive
list when it becomes too long. Consequently, the active
list may contain long-unaccessed inactive pages. Since the active
list is not yet long enough, these pages remain there, even though their access frequency is lower than that of the periodically accessed pages. We need a method to allow these inactive pages in the active
list to compete fairly with periodically accessed pages, retaining genuinely valuable and frequently accessed pages.
Basic Idea of Workingset⌗
Workingset is such a method. The inactive
and active
lists together form the workingset, aiming to retain pages accessed by processes in the workingset and prevent their reclamation.
A periodically accessed page moves from the head of the inactive
list to its tail before being reclaimed, traveling through len(inactive)
positions.
From the moment a page is reclaimed until it is accessed again, the inactive
list continues to activate and reclaim pages. Suppose X
pages are reclaimed and activated during this period.
If the page were initially placed X + len(inactive)
positions from the tail, it would be accessed just as it reaches the tail, avoiding reclamation and preventing oscillation.
The key is that the inactive
list is only len(inactive)
long. What we need is to place the page X + len(inactive)
positions from the tail.
Since the inactive
list is preceded by the active
list, we can insert the page at the X
-th position from the end of the active
list. Instead of inserting in the middle, we can simply place it at the head of the active
list, achieving protection.
Additionally, this approach serves another purpose: once the page is added to the active list, it will compete with other active pages. As the length of the active list gradually increases, when it becomes necessary to age a batch of pages from the active list to the inactive list, the kernel will compare the access frequencies. This comparison ensures that genuinely active pages remain in the active list, while less active pages are moved to the inactive list for potential reclamation.
This is the basic idea of workingset.
Implementation of Workingset⌗
By placing the page at the head of the active
list, it moves through the active
list to the tail of the inactive
list, providing a maximum travel distance of len(active) + len(inactive)
. The required distance to prevent reclamation is X + len(inactive)
. As long as len(active) + len(inactive)
is greater than X + len(inactive)
, placing the page at the head of the active
list provides sufficient protection; otherwise, the page will eventually be reclaimed.
The inequality len(active) + len(inactive) > X + len(inactive)
simplifies to len(active) > X
. Just checking this inequality is sufficient. When a page is read into memory, if the inequality holds, it is placed at the head of the active
list to avoid reclamation before the next access.
The remaining issue is calculating X
. As previously mentioned, it represents the number of pages reclaimed and activated in the inactive
list from the time a periodically accessed page is reclaimed until it is accessed again. We can set up a counter, incremented each time a page in the inactive
list is reclaimed or activated.
Thus, when a periodically accessed page is reclaimed, we record the current value of this counter in its radix tree slot
. When the page is read again, it is reinserted into the same radix tree slot
. We retrieve the old counter value, subtract it from the current counter value to obtain X
.
Then, we compare len(active) > X
to decide whether to place the page at the head of the active
list. If protection is feasible, the page is placed at the head of the active
list; otherwise, it is placed in the inactive
list.