2007-09-17
How mmap(2) requires a unified buffer cache
In a previous entry I mentioned that Sun's addition of mmap() basically
forced their hand on having a unified buffer cache. Today I feel like
elaborating on that.
The problem with having mmap() and not a unified buffer cache is page
coherence. If you have some programs using mmap() and some using
regular read() and write(), you will wind up with two copies of
pages, one mapped into process memory and one in the buffer cache.
Because virtual memory and the buffer cache are not unified, there is
nothing that keeps these two copies in sync with each other; programs
will see an unpredictable mix of new and old data, depending on what
pages got forced out of virtual memory when.
(A related problem is finding already-mapped pages so that you can share mappings across processes, which means you're going to need some sort of mapping index anyways.)
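As a rough illustration of what coherence means in practice, here is a small Python sketch that writes to a file through both a shared mapping and the ordinary file descriptor calls, then checks what each side sees. The function name check_coherence() is made up for this example, and it assumes a Unix system where os.pread() and os.pwrite() are available. On a system with a unified buffer cache I would expect each side to see the other's writes immediately; without one, either check could come back with stale data.

```python
import mmap
import os
import tempfile


def check_coherence():
    """Return (bytes seen via the mapping, bytes seen via pread())."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"old data")
        # On Unix the default mapping is MAP_SHARED with read/write access.
        with mmap.mmap(fd, 8) as m:
            # Change the file through the write() path...
            os.pwrite(fd, b"new", 0)
            # ...and look at it through the mapping.
            seen_via_mmap = bytes(m[:3])
            # Now change the file through the mapping...
            m[4:8] = b"DATA"
            # ...and look at it through the read() path.
            seen_via_read = os.pread(fd, 4, 4)
        return seen_via_mmap, seen_via_read
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(check_coherence())
```

Note that this only tests what one process sees on one machine; it says nothing about how or when the data reaches disk.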
Since you want to let people mmap() more file pages than you have
buffer cache, you can't just have mmap() use the buffer cache to hold
mapped-in file pages. You can reuse your mapping indexing scheme to
create coherence without technically having a unified buffer cache, but
I think that there would be various issues and you're so close to a
unified buffer cache that you might as well go the rest of the way.
(The one benefit of not unifying the buffer cache is that at least theoretically you have a clear way to avoid file IO eating your virtual memory system.)
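To make the mapping index idea a bit more concrete, here is a toy Python model of it. This is not how any real kernel does it; PageIndex, file_write() and map_page() are hypothetical names, and the pages are just in-memory bytearrays. The point it tries to show is that once both the write() path and the mmap() path look pages up in the same index, they end up sharing a single copy of each file page, which is most of what a unified buffer cache is.

```python
PAGE_SIZE = 4096


class PageIndex:
    """Hypothetical index: (file id, page number) -> the one copy of that page."""

    def __init__(self):
        self.pages = {}

    def lookup(self, file_id, page_no):
        # Create the page on first use; afterwards every caller gets the
        # same bytearray object for the same file page.
        return self.pages.setdefault((file_id, page_no), bytearray(PAGE_SIZE))


def file_write(index, file_id, offset, data):
    """write()-style path: copy data into the indexed page (one page only)."""
    start = offset % PAGE_SIZE
    if start + len(data) > PAGE_SIZE:
        raise ValueError("toy model: write must fit within a single page")
    page = index.lookup(file_id, offset // PAGE_SIZE)
    page[start:start + len(data)] = data


def map_page(index, file_id, page_no):
    """mmap()-style path: hand out the indexed page itself, not a copy."""
    return index.lookup(file_id, page_no)


if __name__ == "__main__":
    index = PageIndex()
    mapped = map_page(index, file_id=1, page_no=0)
    file_write(index, file_id=1, offset=10, data=b"hello")
    # The mapping sees the write because there is only one copy of the page.
    print(bytes(mapped[10:15]))
```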