2013-08-27
What templates DWiki uses
Per ProcessingModel, DWiki ultimately produces output by expanding a template. This means that DWiki has to figure out what template to use for this process, and because the TemplateSyntax is fairly limited, it is much simpler for DWiki to start with a separate template for every different view of things it wants to have.
This means that while DWiki tries not to hardcode template names or the structure of the template directory, it does know about a certain number of hardcoded names that must be present for DWiki to operate properly.
The short list of such templates is:
- dwiki/view-*.tmpl, dwiki.tmpl: starting view templates.
- views/*: conventional location for templates that display a particular ordinary view.
- error.tmpl, errors/*: displaying errors (always 404 responses).
- login-error.tmpl: displaying a login error (a regular page, not a 404).
- Comment templates:
  - comment/comment.tmpl: used to show each comment when we're showing all comments.
  - comment/posting.tmpl: used to show the result of posting a comment. By convention, comment/posted-<result>.tmpl is used to display specific results, where <result> is one of 'good' (the comment was posted successfully), 'bad' (something went wrong), 'badchars' (the comment has bad characters in it), or 'nocomment' (the comment was empty and DWiki refused to post it).
- blog/blogdirpage.tmpl: used to show each page in BlogDir view.
- blog/blogentry.tmpl: used to show each page in Blog view.
- syndication/atomentry.tmpl: used to render an Atom feed entry for each page.
- syndication/atomcomment.tmpl: used to render an Atom feed entry for each comment.
- syndication/rss2entry.tmpl: used to render an RSS 2.0 feed entry for each page.
All paths are relative to the template directory.
Determining a template for a view
For views that are displayed using templates, DWiki tries to find the starting template by looking in three places, in order:
- dwiki/view-<view>-<pagetype>.tmpl
- dwiki/view-<view>.tmpl
- dwiki.tmpl
By convention, everything that generates text/html pages just goes through dwiki.tmpl so that there is one place that does top-level 'skinning' for the entire DWiki. Only views that both use templates and generate something besides text/html sidestep this.
The standard dwiki.tmpl uses the #{<...} first-found template inclusion mechanism (see TemplateSyntax) to pull in the real per-view content. It looks in four places to try to find this content, in this order:
- Overrides/...$(page)/$(view-format).tmpl
- Overrides/...$(page)/all.tmpl
- views/$(view-format)-$(pagetype).tmpl
- views/$(view-format).tmpl
The first two allow page and directory hierarchy specific overrides; the latter two are the generic places. Most views don't need to distinguish between file types, but the 'normal' view must use different templates for files and directories (since a directory doesn't have wikitext to display).
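The three-step search for the starting template can be illustrated with a minimal Python sketch. The function names and the direct use of the filesystem are assumptions made for illustration; this is not DWiki's actual code, and it does not attempt to model the Overrides/ lookups.

```python
import os

def start_template_candidates(view, pagetype):
    """Return the starting-template names to try, most specific first."""
    return [
        "dwiki/view-%s-%s.tmpl" % (view, pagetype),
        "dwiki/view-%s.tmpl" % view,
        "dwiki.tmpl",
    ]

def first_existing(tmpl_dir, candidates):
    """Return the first candidate that exists under tmpl_dir, or None."""
    for name in candidates:
        if os.path.isfile(os.path.join(tmpl_dir, name)):
            return name
    return None
```

With only dwiki.tmpl and dwiki/view-atom.tmpl present, a lookup for the atom view would stop at dwiki/view-atom.tmpl, while every other view would fall through to dwiki.tmpl.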
The current template-based views are: normal, history, search, blog, blogdir, atom, atomcomments, sitemap, showcomments, and writecomment. The login and logout views are 'synthetic' and don't actually display anything unless an error happens. The 'source' view simply dumps the page content out straight without getting anywhere near templates.
Note that the atom and atomcomments views are special: although they render through templates, they generate application/atom+xml content instead of text/html. Thus they use dwiki/view-* templates directly, bypassing dwiki.tmpl. The sitemap view is similarly special, although it generates application/xml content.
Error templates
Errors are rendered by the template error.tmpl. There are special error renderers error::title and error::body that look for error-specific additional templates in the subdirectory errors/. Each type of error looks for titles as errors/<error>-title.tmpl and main error body as errors/<error>.tmpl (with internal defaults if they don't exist).
Current error types: badaccess, badformat, badpage, inconsistpage, nopage.
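The per-error template naming, with a fallback to a built-in default, might look roughly like the following Python sketch. The helper names and the idea of passing the default text as an argument are hypothetical; the real error::title and error::body renderers may work differently.

```python
import os

def error_template_names(error):
    """Names of the optional per-error title and body templates."""
    return ("errors/%s-title.tmpl" % error, "errors/%s.tmpl" % error)

def load_or_default(tmpl_dir, name, default):
    """Return the template text if the file exists, else the default."""
    path = os.path.join(tmpl_dir, name)
    if os.path.isfile(path):
        with open(path) as f:
            return f.read()
    return default
```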
Everything else is free and floating
That's it. DWiki has no other hardcoded template names.
2013-03-06
DWiki's caching system
DWiki has optional caching in order to speed up generating results repeatedly. DWiki uses a disk-based cache for this (although the interface is abstracted and alternate forms of caching may be introduced someday). There are three caches, which can be enabled separately: the renderer cache, a brute force page cache, and an in-memory brute force page cache that is only used if DWiki is running as a preforking SCGI server.
DWiki never removes the files of out of date cache entries from the disk cache; instead, it stops considering out of date ones to be valid. Cleaning out the detritus is left for an external process. ChrisSiebenmann considers this safer; giving a program an automated unlink() makes him nervous.
See ConfigurationFile for the options controlling the behavior of the caches.
In theory DWiki's caching is optional. In practice a decent-sized DWiki is simply too slow without caching for some of the more expensive operations, and caching becomes more or less a necessity. ChrisSiebenmann now believes that you should configure all levels of caching in basically any DWiki unless you have some unusual need and are sure you can do without it.
The brute force cache
The brute force page cache (BFC) is about as simple as you can get: it caches complete requests for a configured time (called a time-to-live, or TTL). That's it. The BFC is intended as a load-shedding measure when DWiki is under significant load, so it only acts under certain circumstances:
- only on GET or HEAD requests.
- only on requests without a Cookie: header.
- requests only get put into the cache if the system seems loaded.
(For speed, when something is valid in the cache DWiki just serves it without checking the system load.)
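The three conditions above can be written as a small Python sketch. This is only an illustration: the function names are made up, and how DWiki actually decides that the system 'seems loaded' is not modelled, so that decision is passed in as a plain boolean.

```python
def bfc_applies(method, headers):
    """True if the BFC is allowed to act on this request at all."""
    has_cookie = any(name.lower() == "cookie" for name in headers)
    return method in ("GET", "HEAD") and not has_cookie

def bfc_may_store(method, headers, system_loaded):
    """True if a freshly generated response may be put into the BFC."""
    return bfc_applies(method, headers) and system_loaded
```

Only bfc_may_store looks at the load. A lookup of an already cached entry needs nothing beyond bfc_applies, which matches the note that valid cache entries are served without a load check.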
A good BFC TTL is on the order of 30 seconds to three minutes or so; long enough to shed significant load if you are getting a lot of hits to a few pages and short enough that dynamic pages won't become too outdated. (And that waiting to see a comment show up or whatever is not too annoying.)
Because Atom syndication requests are among the most expensive pages to compute, the BFC can be set to give them a longer TTL than usual. There is a second TTL that can be set for Atom requests that aren't using conditional GET; the idea is that if requesters cannot be bothered to be polite, we can't be bothered to serve fresh content. Setting this option always caches the results of such requests, even if the load is low, which means that even people doing proper conditional GET requests will use the cached results for as long as their (lower) TTL says to.
It's actually faster to serve static pages from the static page server code than from the BFC, so the BFC doesn't try to cache static pages.
The two sides of the BFC
It's important to understand that the BFC does not check load when it is checking to see if something is in its cache. This means there are two stages to processing a request: deciding what TTL to use for cache checks, and deciding whether to cache something that was not current in the cache.
The TTL used is:
- bfc-atom-nocond-ttl if this is an unconditional request for an Atom view, if set.
- bfc-atom-ttl for Atom view requests in general, if set.
- bfc-cache-ttl otherwise.
Pages enter the BFC cache either because the system seems to be loaded or because bfc-atom-nocond-ttl was set and they were an unconditional request for an Atom view.
Once something is in the cache, it will be served from the cache if it is not older than the check TTL. Different requests can use different check TTLs for the same cached page; for example, conditional GETs versus other requests for Atom views.
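The two stages can be sketched in Python as follows. The setting names come from the text above, but the function names, the dictionary standing in for the configuration, and the TTL values in the usage note are all assumptions for illustration.

```python
def check_ttl(cfg, is_atom, is_conditional):
    """Pick the TTL used when checking the cache for this request."""
    if is_atom and not is_conditional and "bfc-atom-nocond-ttl" in cfg:
        return cfg["bfc-atom-nocond-ttl"]
    if is_atom and "bfc-atom-ttl" in cfg:
        return cfg["bfc-atom-ttl"]
    return cfg["bfc-cache-ttl"]

def serve_from_cache(entry_age, ttl):
    """A cached entry is served if it is not older than the check TTL."""
    return entry_age <= ttl

def should_store(cfg, system_loaded, is_atom, is_conditional):
    """Decide whether a freshly generated page enters the cache."""
    forced = (is_atom and not is_conditional
              and "bfc-atom-nocond-ttl" in cfg)
    return system_loaded or forced
```

With made-up settings of bfc-cache-ttl 60, bfc-atom-ttl 300, and bfc-atom-nocond-ttl 3600, a cached Atom feed that is 600 seconds old would still be served to an unconditional request (check TTL 3600) but not to a conditional GET (check TTL 300).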
The in-memory cache
The in-memory cache (IMC) is essentially a version of the brute force cache that holds pages in memory instead of on disk. It's only effective in environments where DWiki serves multiple requests from the same process; currently it's only used if DWiki is running as a preforking SCGI server. Because it holds pages in memory as page response objects, the in-memory cache is about the fastest way that DWiki can serve requests. In particular it's faster to serve static pages from the IMC than from disk, so unlike the BFC the IMC does cache static pages.
Because IMC entries disappear automatically and are essentially free to create, the IMC caches pages unconditionally when active (unlike the BFC). This means that it should normally have a relatively low TTL, often lower than the BFC's TTL. Note that because the IMC is before the BFC, it can load its cache from BFC cache hits.
For obvious reasons, it's pointless to set the IMC cache size to be larger than the number of requests a preforked SCGI process will serve before exiting.
To keep IMC memory usage under control, the IMC has a settable maximum page size that it will cache. Tune this as appropriate for your environment.
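To make the interaction of the TTL, the cache size, and the maximum page size concrete, here is a minimal Python sketch of a cache with those three limits. It is not DWiki's implementation: the class and parameter names are hypothetical, pages are stored as plain bytes rather than page response objects, and evicting the oldest entry when the cache is full is an assumption.

```python
import time
from collections import OrderedDict

class InMemoryCache:
    """Tiny TTL cache with an entry limit and a per-page size limit."""

    def __init__(self, ttl, max_entries, max_page_size, clock=time.time):
        self.ttl = ttl
        self.max_entries = max_entries
        self.max_page_size = max_page_size
        self.clock = clock
        self._entries = OrderedDict()  # key -> (stored_at, body)

    def store(self, key, body):
        """Cache body unconditionally unless it is too large."""
        if len(body) > self.max_page_size:
            return False
        self._entries.pop(key, None)
        self._entries[key] = (self.clock(), body)
        while len(self._entries) > self.max_entries:
            # Assumption: drop the oldest entry when over the limit.
            self._entries.popitem(last=False)
        return True

    def fetch(self, key):
        """Return the cached body, or None if missing or expired."""
        hit = self._entries.get(key)
        if hit is None:
            return None
        stored_at, body = hit
        if self.clock() - stored_at > self.ttl:
            del self._entries[key]
            return None
        return body
```

Because the cache lives inside one process, everything in it vanishes when that process exits, which is why entries never need to be cleaned up separately.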
The IMC can be deliberately forced on with imc-force-on, in case you're running DWiki in some other preforking environment (for example as a WSGI application under a preforking WSGI server such as uWSGI).
Considerations for the IMC TTL
Under some setups, DWiki will only be running as a (preforking) SCGI server when it's under heavy load; in others DWiki is running this way all of the time, even when the load is light. Because the IMC unconditionally caches pages, the latter situation can be annoying; it means that someone who, say, writes and posts a new comment may not see that comment until the IMC TTL expires. DWiki makes some attempt to bypass the IMC (and the BFC) in the common case of someone leaving a comment. However, this is not perfect (in part because it requires the web browser to accept cookies from DWiki).
(This also applies if you're running DWiki in some other preforking environment and have forced the IMC on.)
If you're running DWiki full time in an IMC-on environment, you likely want to set a quite low IMC TTL, such as 15 to 30 seconds. If you're running DWiki with the IMC on only under heavy load you can set a higher IMC TTL, such as two minutes (120 seconds).
The renderer cache
The renderer cache is actually two caches. The renderer cache proper caches the output of various renderers (cf TemplateSyntax). The output is cached with a validator and the cached results are fully validated before they get used; this means that renderer cache entries do not normally use a TTL and in theory could be valid for years.
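The idea of caching with a validator instead of a TTL can be shown with a short Python sketch. Everything here is an assumption for illustration: the names are invented, the cache is an in-memory dictionary rather than DWiki's disk cache, and the validator (a file's modification time and size) is a stand-in for whatever DWiki's renderers actually check.

```python
import os

def file_validator(path):
    """Hypothetical validator: the file's modification time and size."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)

class ValidatedCache:
    """Cache whose entries are only used if their validator matches."""

    def __init__(self):
        self._entries = {}  # key -> (validator, output)

    def fetch(self, key, current_validator):
        hit = self._entries.get(key)
        if hit is not None and hit[0] == current_validator:
            return hit[1]
        return None

    def store(self, key, validator, output):
        self._entries[key] = (validator, output)

def render_cached(cache, path, render):
    """Render path, reusing the cached output while it still validates."""
    validator = file_validator(path)
    output = cache.fetch(path, validator)
    if output is None:
        output = render(path)
        cache.store(path, validator, output)
    return output
```

An entry stored this way never expires on its own; it simply stops being used once the validator no longer matches.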
The (heuristic) generator cache caches the output of some expensive precursor generator routines. These cache entries only have heuristic validators, where DWiki can be fooled if people try hard enough. Generator cache entries do have a TTL, so that if the heuristic is fooled DWiki will pick up the new result sooner or later. Some cache entries can also be explicitly invalidated by DWiki in a pretty reliable process; by default, these have a much longer TTL than plain heuristic cache entries. These are called 'flagged' (heuristic) generator entries, and the various ConfigurationFile settings that control how they behave are named render-heuristic-flagged-....
(Trivia: the 'flagged' name is because such entries are invalidated using a flag file, or more accurately a flag cache entry.)
Currently the main renderer cache caches the output of various wikitext to HTML rendering routines, while the generator cache caches the results of various filesystem 'find all descendants' walks that are used to build lists of comments (for Atom comments feeds and some wikitext macros; this uses explicit invalidation) and lists of pages (for Atom feeds and various blog renderers such as blog::prevnext).
Unfortunately, a DWiki page that has comment or access restrictions must be cached separately for each DWiki user that views it. Under some situations this can result in a number of identical copies being cached under different names. If you want to avoid this, DWiki lets you turn off renderer caching for non-anonymous users.
Force-invalidating list of pages caches
The general validator for blog::prevnext cache entries is the modification time for all of the directories involved that had files in them at the time (the latter condition is for technical reasons). The heuristic validator checks that some of the file timestamps are still the same, but it can't check all of them and still be a useful cache.
So the easy way to invalidate this is to change the modification time of a directory involved, for example with touch.
The 'list of pages' cache is similarly invalidated by changing a directory modification time. Unlike the blog::prevnext case, the directory times are the only thing that this cache checks. This is a bit of a pity but the performance improvements from caching this information are very visible.
Disk space usage and directories
Much like comments, each page that has something cached for it becomes a subdirectory, with the various cached things in files. The different sorts of caches use different top-level directories under the cachedir, so you have paths like cachedir/bfc, cachedir/renderers, and cachedir/generators.
Because some results include absolute URLs that mention the current hostname, DWiki must maintain separate caches for each Host: header it sees in the BFC and the general renderers cache. These are handled as subdirectories in each cache directory, so cachedir/bfc/localhost/... and so on. Entries in the generator cache don't depend on the current Host: header, so there is only one (sub)cache for all requests, cachedir/generators/all/....
Generally the general renderers cache uses the largest amount of disk space, followed by the BFC, and the generator cache is the smallest.
If you're using caching (and as mentioned, you probably want to), you'll want to periodically trim the caches. ChrisSiebenmann just does this by hand every so often by removing the cache directories entirely; DWiki will then rebuild them as necessary.
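To help decide when a trim is due, a small external script can report how much disk space each top-level cache directory uses. The following Python sketch is one possible way to do that; it is not part of DWiki, the function names are made up, and it deliberately only reads the directory tree and removes nothing.

```python
import os

def tree_size(root):
    """Total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total

def cache_usage(cachedir):
    """Map each top-level cache directory name to its size in bytes."""
    usage = {}
    for name in sorted(os.listdir(cachedir)):
        path = os.path.join(cachedir, name)
        if os.path.isdir(path):
            usage[name] = tree_size(path)
    return usage
```

Calling cache_usage on the configured cachedir returns a dictionary keyed by names such as bfc, renderers, and generators, depending on which caches have been enabled and used.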