DWiki Authentication
DWiki has optional support for authenticating users, which is a prerequisite for restricting access to pages and for allowing people to comment. User authentication is done by cookies, which means that people wanting to be authenticated have to accept cookies from the DWiki's web server.
Whether authentication is on is controlled by the authfile setting in the ConfigurationFile; if it is set, it specifies a password file for the DWiki. Once enabled, a login box will appear at the bottom of pages where people can enter their login and password into a form and submit it to the wiki. If the password is correct, DWiki will send back a login cookie and the session is now authenticated (provided that the user's browser then sends the cookie back to DWiki with future requests).
An authenticated person has a login name and may optionally be in some groups. When checking permissions, logins and groups are treated the same (so you should not create groups that have the same name as users; this is either pointless or dangerous, depending on how many people are in the group). What groups a login is part of is specified in the password file.
To be precise, an authenticated request is any request that has a valid associated login name. Normally this happens because the user's browser sent back a valid DWiki login cookie, but a DWiki may have a default login, set in the ConfigurationFile. If the default login is set and exists in the password file, everything is authenticated; either as a 'real' (passworded) login or as the default login.
Because DWiki is hard-coded to require authentication before people can write comments, setting a default user is the only way to let the world (potentially) comment on your DWiki.
Using Authentication
Authentication is used by the {{Restricted}} and {{CanComment}} DWikiText macros. Without arguments they restrict the page to authenticated people or allow comments by authenticated people (respectively). With arguments, they restrict things more tightly. There are two sorts of arguments:
- positive arguments are plain logins or groups, and require the authenticated session to be one of the things named.
- negative arguments start with '-' and are then logins or groups, and require the authenticated session to not be one of the things named.
If only negative arguments are given, anyone not mentioned passes; if both positive and negative arguments are given, you must pass the positive arguments and not fail the negative arguments.
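The argument rules above can be sketched in a few lines of Python. This is only an illustration, not DWiki's actual code: the function name `passes` and the idea of passing the session's login and groups as one set are assumptions made for the example (the set form follows from logins and groups being treated the same).

```python
def passes(identities, args):
    """Decide whether a session passes a Restricted/CanComment argument list.

    identities: set holding the session's login name plus its groups
                (an empty set stands for an unauthenticated request).
    args: macro arguments; a leading '-' marks a negative argument.
    """
    if not identities:
        return False  # both macros require an authenticated session
    positives = {a for a in args if not a.startswith("-")}
    negatives = {a[1:] for a in args if a.startswith("-")}
    if identities & negatives:
        return False  # the session is one of the things ruled out
    if positives and not (identities & positives):
        return False  # positive arguments exist and none of them match
    return True
```

With no arguments any authenticated session passes; with only negative arguments anyone not mentioned passes.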
Directories can create default permissions for everything under them by having a special file called __access with either or both of Restricted and CanComment macros. __access files are checked backwards from the page being looked at, and the first one that contains a Restricted or a CanComment (depending on what is at issue) wins. __access files can have other content, although ChrisSiebenmann doesn't expect people to look at them very often.
Note: this means that subdirectories can give back permissions that were denied by a higher-level directory. This is deliberate.
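A minimal sketch of the backwards search for an __access file, assuming the files have already been read into a dictionary. The function name `find_access_rule`, the dictionary layout (directory path to file text, with "" as the root), and the simple substring test for the macro are all assumptions of this example rather than how DWiki does it.

```python
import posixpath

def find_access_rule(page, access_files, macro):
    """Walk from the page's directory up to the root and return the text of
    the first __access file that mentions the given macro, or None."""
    directory = posixpath.dirname(page)
    while True:
        content = access_files.get(directory)
        if content is not None and ("{{" + macro) in content:
            return content  # nearest matching __access file wins
        if directory == "":
            return None  # reached the root without finding a rule
        directory = posixpath.dirname(directory)
```

Because the nearest file wins, a subdirectory's __access can loosen what a parent directory restricted, which matches the note above.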
Authentication limits
DWiki authentication protects only file contents. It does not protect directory contents and it thus doesn't protect a page's (file) name. Moral: don't put sensitive information into page names.
Password security
Note: DWiki doesn't specially encrypt login / password information while it's being sent to the web server. Unless the entire connection is running over SSL, people can theoretically snoop the password in clear text.
DWiki doesn't store someone's clear text password (even in its password file); instead it stores a hash of the password, using a format that guarantees that if two different people use the same password they will get different hashes. (Barring the hash function itself being broken.)
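The usual way to get different hashes for identical passwords is a per-login random salt. The sketch below shows that general technique with Python's standard library; the `salt$digest` layout, the PBKDF2 parameters, and both function names are assumptions of this example and are not DWiki's actual password format.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    """Return 'salt$digest' (both hex); a fresh random salt is used by default."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt.hex() + "$" + digest.hex()

def check_password(password, stored):
    """Re-hash the candidate password with the stored salt and compare."""
    salt_hex, _ = stored.split("$", 1)
    candidate = hash_password(password, bytes.fromhex(salt_hex))
    return hmac.compare_digest(candidate, stored)
```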
As always, people should be strongly discouraged from using important passwords (eg, their Unix account passwords) for any web service, a DWiki included. Using one's Unix login name as one's DWiki login name is harmless and even convenient.
The cookie
The cookie DWiki uses has the login name in clear text, and is authenticated with an added hash value. If you want the gory details, see authcookie.py and htmlauth.py in the DWiki source code. With a proper global-authseed secret in the ConfigurationFile, it is believed to be secure from all brute-force attacks.
The cookie is normally quite long-lived. It becomes invalid if the user's password or the DWiki global authseed change.
The cookie is not restricted to coming from a single IP address or anything like that.
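One way to build a cookie with these properties is to append an HMAC of the login name, keyed by both the global secret and the stored password hash. This is a sketch of that general idea only; the real scheme lives in authcookie.py and may differ, and the names `make_cookie` and `check_cookie` and the `login:tag` layout are assumptions.

```python
import hashlib
import hmac

def make_cookie(login, password_hash, authseed):
    """Return 'login:tag', where tag depends on the seed and password hash."""
    key = (authseed + ":" + password_hash).encode("utf-8")
    tag = hmac.new(key, login.encode("utf-8"), hashlib.sha256).hexdigest()
    return login + ":" + tag

def check_cookie(cookie, password_hashes, authseed):
    """Return the login name if the cookie is valid, otherwise None."""
    login, sep, _tag = cookie.rpartition(":")
    if not sep or login not in password_hashes:
        return None
    expected = make_cookie(login, password_hashes[login], authseed)
    ok = hmac.compare_digest(cookie.encode("utf-8"), expected.encode("utf-8"))
    return login if ok else None
```

Since the key includes the password hash and the seed, changing either one invalidates every cookie issued earlier. Nothing in the cookie ties it to a client address.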
Format of the password file
The password file has a simple format. Blank lines and comment lines (lines that have a '#' character as their first non-whitespace) are ignored. Otherwise, lines have the format:
    <login> <password-hash> [<group> ....]
There can be any amount of whitespace between elements; groups are optional.
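A parser for this format is short. The following is a sketch, not DWiki's own reader: the function name `parse_password_file` and the decision to silently skip lines with fewer than two fields are assumptions.

```python
def parse_password_file(text):
    """Return a dict mapping login -> (password_hash, [groups])."""
    logins = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or comment line
        fields = stripped.split()  # any amount of whitespace separates elements
        if len(fields) < 2:
            continue  # malformed line; skipping it is a choice of this sketch
        logins[fields[0]] = (fields[1], fields[2:])
    return logins
```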
The easy way to add logins or change passwords is with the dpasswd.py program in the DWiki source. Adding or changing groups, or deleting logins, you get to do by editing the file directly.
DWiki has no support for creating logins or changing passwords over the web. This is deliberate.
How you manage this process in general is up to you; in non-paranoid environments ChrisSiebenmann uses a group-writeable password file owned by an appropriate (Unix) group.
DWiki bugs/needfix
/{....} as a template comment, because I think I want them. (maybe another character, but ehh; this sort of looks like a C/etc comment.)
inode ctime is last modified, inode mtime is created. The split has started. This may or may not work well; I'll have to see. (Partly based on what else screws with ctimes in our Unix environment.)
It seems clear that ctime is not too useful in at least some context. I should use it for safety in Atom feed generation and some other contexts, but not otherwise by default.
http://projects.edgewall.com/trac/wiki/WikiFormatting documents some stuff better than me, plus has 'processors'. I could steal that.
Searching needs to be less lame, at least for searching through the searchbox. It probably wants to be case-independent and possibly only for word starts (instead of word boundaries on both sides; arguably all searches should be only word boundary start ones).
The real rule is not 'identifier boundary', it is 'identifier component boundary', which is \b or a-z0-9 at the start, and \b or A-Z at the end.
It should be possible to create an Atom feed template that included all of the comments as part of the page.
CSS work. This implies that I need to actually understand CSS. I laugh at myself, hollowly. (Progress: we now style some stuff with CSS.)
We should be able to see the history page for any RCS-but-not-displayable page.
There should be some form of RecentChanges that throws in time information. (Clearly not Striped'able.)
Open issues
Do I want a 'render this page as wikitext' magic template option? That's what the injectors hard-code right now.
writecomment needs some way to generate a good link to help/DWikiText, so that people can actually know what to write a comment in. (It has one now, but the way may be a bit lame.)
Do we need a way to turn off WikiWord links? (The current approach is to use [[...|]], which is perhaps good enough for the rare cases.)
Should we forbid switching to alternate views in a virtual directory? The 'normal' view doesn't work entirely right (drops subdirectories); this may be a bug. (Fixed now: the listdir renderer needs to always include all subdirectories, despite their timestamps possibly being outside the restriction.)
We need to sort out when a link stays in the same view and when it doesn't. At the moment it is somewhat ad-hoc.
[[...]] links don't chase redirects, and they should. Well, now they do and I'm not convinced it's the right thing. It's convenient, but it changes the explicitly written link text; this might be good or it might be bad.
Decide: should access restrictions look sort of like Unix access restrictions, being enforced top-down, or the current bottom-up way? I am starting to think that bottom-up is open to some reliability issues. But on the other hand, top-down has semantic issues too.
Profile the code. Laugh hysterically. Fix what I can.
DWiki should be more configurable through the filesystem. Can we support adding new views (directory and/or file) by reading the canonical template directory, for example? This would suffice for anything that doesn't require special handling.
Long-range:
DWiki knows a lot about what views do what. Unfortunately I suspect that this is impossible to work around, especially given how htmlviews.py is set up.
Templates should mark up with <div> and so on.
wikirend.py needs to style-mark much of the things that it emits. I would like to find some general augmentation mechanism, although it's probably not going to be pretty.
We're going to need to genericize access control. I think it will be some matrix of view + file patterns + file attributes. (Punt for now, everyone can see everything.)
This is a permanently FIXME page.
Security Aspects of DWiki
DWiki has a general attitude about security: it really distrusts incoming requests, it somewhat distrusts itself, but it has a rational trust of the people creating DWiki templates and pages. DWiki will try to save people from accidental mistakes, but doesn't bother with things that are just half-hearted attempts to stop people from deliberately sidestepping security restrictions. Moral: don't let people write DWiki pages unless you trust them.
Some knowledge of the ProcessingModel and the ConfigurationFile (and what can be set there) may be helpful for the rest of this discussion.
A Quick Summary
DWiki itself is written in Python (a lot of Python). This means that unless there is a gross implementation error in the Python interpreter, it is secure from simple problems such as buffer overruns. While DWiki uses some components from the standard Python libraries, they too are well-tested and believed to be entirely safe.
Because it is quite careful at multiple levels about how it handles requests, hostile HTTP requests should not be able to trick DWiki into serving anything from outside the page directory (or the comments directory, or the static content directory). InvalidPageNames discusses things it won't serve even inside them.
DWiki doesn't attempt to stop insiders from using DWiki to serve 'bad' content, ultimately because there are so many ways a malicious insider can do that. ChrisSiebenmann feels that it is better to be honest about not making any attempt rather than making an attempt and causing people to put more trust in it than it warrants.
If run as a CGI-BIN, DWiki should not be run with a UID that has any special access to restricted files. But then, no CGI-BIN should be run that way.
DWiki has some degree of optional Authentication, but it is no stronger than the usual run of the mill login and password on other web sites. Really sensitive content is probably best not served from a web server that the public (whatever that means to you) can access.
Pages versus Templates
What people can do with the ability to write DWikiText in DWiki pages is somewhat less powerful than what they can do with the ability to write DWiki templates. Similarly, errors in DWikiText are considered far less fatal than errors in templates; DWikiText errors just result in funny-looking pages, while template errors result in terse web error pages.
Thus: while it's safe to let people write DWiki pages in general, you probably want to restrict (at least somewhat) who can write or modify your templates. Plus, your templates (being, you know, templates) shouldn't need modification all that often. People can create and modify pages all the time.
How DWiki tries to be secure
Cautious processing
Internally, DWiki tries to operate in a relatively 'security conservative' fashion. For example, the frontend rejects clearly invalid things without passing them through to the DWiki core, because the core has a lot more power than the frontend so a mistake has larger ramifications.
DWiki also is deliberately structured so as to give itself as little power as possible.
Errors Abort Processing
DWiki can hit a number of internal problems while processing a request; for example, a template that's called for might be missing. When this happens, DWiki aborts processing the entire request, throwing an error all the way back to the front end, which generates a terse error page about the situation.
This may be abrupt ... but it is safe.
File Access
DWiki reads only a few files: the ConfigurationFile, the global-authseed-file file, the authfile password file, and things under the page, template, RCS, static files, and comments directories (if those are configured on).
Except for the password file, the DWiki core only accesses files through a simple storage layer abstraction, which provides 'storage pools' to the rest of DWiki. Each storage pool confines all file requests to relative paths under the pool's root, explicitly ruling out InvalidPageNames when retrieving files for the rest of DWiki.
The storage layer has no general file writing capabilities. The only interface it has for writing files is specifically designed for comments, using a specific naming and storage scheme. And only the comments directory uses a storage pool that supports this abstraction.
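The confinement idea can be illustrated with a tiny read-only pool class. This is a sketch under stated assumptions, not DWiki's storage layer: the class name `StoragePool`, its methods, and the exact set of rejected path components are invented for the example.

```python
import os

class StoragePool:
    """Read-only pool that only accepts relative paths under its root."""

    def __init__(self, root):
        self.root = root

    def _resolve(self, relpath):
        parts = relpath.split("/")
        # A leading '/' or a doubled '/' produces an empty component.
        if any(p in ("", ".", "..") for p in parts):
            raise ValueError("bad path: %r" % relpath)
        return os.path.join(self.root, *parts)

    def read(self, relpath):
        # Symbolic links are followed as usual; this sketch makes no
        # attempt to detect links that point outside the root.
        with open(self._resolve(relpath), "r", encoding="utf-8") as f:
            return f.read()
```

Note that the class has no write method at all, mirroring the point that general file writing is simply not part of the interface.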
Following Symlinks
Unlike some web servers (eg, Apache), DWiki takes no special care to not follow symbolic links that point outside one of its storage pool directory roots. If you put such a symbolic link into a storage pool area, DWiki assumes that you know what you're doing.
This is deliberate. Attempting to duplicate the kernel's namei() function in user space is inevitably very complicated (and prone to surprising races). Rather than run the risk of making a mistake in the amount of code required, DWiki is honest about the whole situation.
Limitations DWiki imposes on itself
Limited URL scope
DWiki refuses to serve any request that is not under staticurl (if set) or rooturl. Anything under staticurl must be a static request and is served only as such.
Limited static-content serving
In addition to dynamic DWiki pages, DWiki can serve static content via the staticdir ConfigurationFile directive. Since DWiki's goals for serving static content are very modest (CSS files, images, etc), DWiki refuses requests for static directories. As mentioned in ProcessingModel, static content is served by the frontend, thereby keeping the amount of code involved in the process down.
In addition, DWiki rejects any request for static content that is not in the default 'normal' view.
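The URL scope rule and the view restriction on static content can be sketched as a small classifier. The function names, the string return values, and the `view` parameter are assumptions of this example; the check that refuses static directories needs the filesystem and is left out.

```python
def _under(path, prefix):
    """True if path is the prefix itself or lies below it."""
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")

def classify_request(path, rooturl, staticurl=None, view="normal"):
    """Return 'static', 'dynamic' or 'reject' for a request path."""
    if staticurl and _under(path, staticurl):
        # Anything under staticurl is served only as static content,
        # and only in the default 'normal' view.
        return "static" if view == "normal" else "reject"
    if _under(path, rooturl):
        return "dynamic"
    return "reject"
```

Comparing against the prefix plus a slash keeps a path such as /wikipedia from being treated as if it were under /wiki.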
A brief sketch of the DWiki processing model
The core of DWiki is a template expansion engine and a collection of (text) renderers; DWiki displays pages by figuring out what template to use and then rendering it out.
Renderers generate text based on the current context, such as the page that is being displayed. The most important (and largest) renderer is the wikitext renderer, which takes page content in DWiki's wiki text format and turns it into HTML.
Other renderers create things like the navigation 'breadcrumbs' up at the top of this page and the page tools and last-modified lines at the bottom. Renderers generally create only the essential pieces of that information; surrounding text is created through template expansion. Renderers are hardcoded parts of DWiki and are thus written in Python.
Templates are text files; they get expanded by the template engine through a recursive process of applying template 'macros' to their text. Template macros can insert other (expanded) templates, insert text taken from context variables, and insert the results of renderers. A typical template might look like:
    <html><head><title>${|wikititle|wikiname} :: ${page}</title></head>
    <body>
    @{breadcrumbs} <br/>
    @{wikitext}
    <hr>
    #{footer.tmpl}
    #{site-sig.tmpl}
    </body>
    </html>
(The actual templates that render this DWiki are somewhat more complicated than that, but this shows the flavour.)
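To make the recursive expansion concrete, here is a toy expander for the three macro forms shown in the example template. It is a sketch, not DWiki's template engine: the function name `expand`, the regular expression, the depth limit, and passing the variables dictionary to each renderer as its context are assumptions, and the `${|a|b}` form is not handled (it simply expands to nothing here).

```python
import re

# $ = context variable, @ = renderer, # = another template
MACRO = re.compile(r"([$@#])\{([^{}]+)\}")

def expand(text, variables, renderers, templates, depth=0):
    """Expand ${var}, @{renderer} and #{template} macros in text."""
    if depth > 10:
        raise RuntimeError("template recursion too deep")

    def replace(match):
        kind, name = match.group(1), match.group(2)
        if kind == "$":
            return variables.get(name, "")
        if kind == "@":
            return renderers[name](variables)  # renderer output is used as-is
        # Included templates are themselves expanded, recursively.
        return expand(templates[name], variables, renderers, templates, depth + 1)

    return MACRO.sub(replace, text)
```

A missing renderer or template raises KeyError, so the whole expansion aborts instead of producing a partial page.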
DWiki produces all pages this way. Displaying different types of pages (regular pages versus directories) and different views of the same page (such as the history view) is done by selecting a different starting template; the template (presumably) uses different renderers than the normal view.
Errors are also rendered using templates (if an appropriate template exists). This allows some error pages to reuse renderers as appropriate; for example, the no-such-page error template includes breadcrumbs just as regular pages do, as you can see at NoSuchPage.
Wart: the view source display is not done by a template: it just barfs the content out straight as plain text. One current limitation of renderers and templates is that they can't control the content-type, which is set in the HTML view core.
Wart: the mapping of view + file attributes to templates is currently hard-coded.
The frontend versus the core
DWiki is divided into two components: the front end and the core. The front end receives raw HTTP requests, figures out if they are proper requests, and then passes them to the core to go through the core's processing. If the front end can detect that an HTTP request is not something that the core can handle, it rejects it immediately with a terse error.
Similarly, if the core encounters a processing error it throws an exception up to the front end, which logs it and generates another terse error.
It is the front end that can optionally serve static files; the core is not involved in that process.
DWiki features
DWiki's job is to be a good way to display version controlled wiki-text pages that you write in a real editor.
The important DWiki features:
- simple but reasonably powerful text rendering (based on WikiText).
- natural support for arbitrarily-named links: you don't have to follow some magic page naming standard that doesn't fit well with the natural names for things.
- pages are normal, simple files, and you edit them directly in Unix.
- support for putting pages in RCS, with strong disincentives to hand-edit files without checking them out (they stop displaying).
- directories can display like changelogs: pages inline, most recent first.
- can generate Atom syndication feeds for recently changed things.
The inevitable feature list:
In no particular order:
- simple WikiText-like text rendering. (Chris wrote pages in GNU Emacs and relentlessly smushed anything that got in the way of how GNU Emacs wanted to autoformat things.)
- The text rendering choices are designed to make it easy to write about Unix systems.
- full support for directly editing wiki pages.
- does not force a flat page namespace; uses straightforward Unix files and directories to organize the DWiki page space. (Thereby keeping the Unix view of DWiki's pages simple.)
- supports a blog-like view of a directory that inlines pages there, most recent first.
- in-filesystem page redirects make it trivial to support plurals, moved/renamed pages, etc.
- text-based page templates control how all pages appear, making it easy to control various bits of a DWiki's appearance.
- pages can be put in RCS for version control and multi-person editing access. RCS files can live in either the page directory hierarchy (for simplicity) or another parallel directory tree (for neatness).
- forces people not to edit RCS-controlled files without locking them by refusing to display inconsistent unlocked files.
- generates Atom syndication feeds for recently changed pages and recent comments, for the entire DWiki or any subtree thereof.
- written in Python.
- simple-ish yet powerful enough (I hope) user authentication system, with an equally simple yet powerful way of restricting who can read DWiki pages.
- supports the option of letting people (possibly including the world) comment on some or all of the pages.
- takes some pride in properly generating and handling Last-Modified: and ETag: headers in HTTP responses.
- wikitext to HTML generates fully HTML 4.01 Transitional compliant HTML provided only that you don't jump multiple indent levels in at once in lists (thus Formatting doesn't validate).
- can run as a CGI-BIN or standalone, and support for additional environments (SCGI, WSGI, whatever) should be easy to add if it is needed. Disclaimer: standalone does not use a production-quality webserver implementation; it uses Python's BaseHTTPServer with a hack to use threading.
Missing DWiki features
Also in no particular order:
- you can't edit DWiki pages from the web, but see WhyNotWebEditing.
- no user authentication.
- therefore, no access restrictions on who can read what.
- searching is primitive at best.
A necessary acknowledgement:
A number of DWiki's features and design decisions are shamelessly inspired by C.J. Silverio's as yet (22 May 2005) unfinished Snippy. Note that Snippy is much more powerful than DWiki probably ever will be, plus if it had been finished when I was writing DWiki I probably wouldn't have.
Page Names That DWiki Won't Serve
There are some paths and page names that DWiki categorically refuses to serve, even if they seem to resolve to real files. Because they're enforced by both low-level code and high-level code, they apply to DWiki pages, static files being served by DWiki, and even templates. (Technically they apply to comments too, but comments can't generate file names that violate these rules.)
What gets rejected:
Any path that includes a path component that starts with a '.', ends with ',v' or a '~', or is 'RCS'.
Any non-relative path that includes '..', '.', or a sequence '//'; usually this might appear in the URL of an incoming request. (Incoming requests are not supposed to include things like that. But ChrisSiebenmann declines to believe that everyone sending DWiki requests is going to do what they're supposed to.)
DWiki will reject REDIRECT files that either have too many '..' entries (so that they are trying to escape the root of the page directory) or that fail these checks after they've potentially been converted from relative path names to absolute inside-DWiki paths.
When DWiki rejects bad paths, generally it says that there is no page by that name. Sometimes it rejects the request entirely in huge flames.
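The rejection rules can be sketched as a single check over path components. This is an illustration only: the function name `is_invalid_path` is hypothetical, and the sketch assumes it is given a slash-separated path relative to the page root, with no leading or trailing slash.

```python
def is_invalid_path(path):
    """Return True if any component of the path breaks the rules above."""
    for comp in path.split("/"):
        if comp == "":
            return True  # empty component, i.e. a '//' sequence
        if comp.startswith("."):
            return True  # covers '.', '..' and dotfiles
        if comp.endswith(",v") or comp.endswith("~"):
            return True  # RCS files and editor backup files
        if comp == "RCS":
            return True  # RCS directories
    return False
```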
Redirection Files
Files in the page directory can create HTTP redirections, making it trivial to support plurals, moved/renamed pages, and so on. There are two ways of doing it: REDIRECT content and symbolic links.
If a file starts with a line that says 'REDIRECT somewhere', and does not have more than a few lines of content, DWiki considers it a redirection. The somewhere is basically interpreted as if it was appearing in a [[....]], so it can be:
- redirection to another DWiki page.
- redirection to an external web site, written as http://....
- redirection to an absolute URL on this web site, written as <...>
These files are generically called REDIRECT files.
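Recognising a REDIRECT file and sorting its target into the three kinds listed above might look like the sketch below. The function name `parse_redirect`, the tuple it returns, and the cutoff of three lines for 'a few lines' are assumptions made for the example, not DWiki's actual values.

```python
def parse_redirect(content, max_lines=3):
    """Return (kind, target) for a REDIRECT file, or None for a normal page."""
    lines = content.splitlines()
    if not lines or len(lines) > max_lines:
        return None  # empty, or too long to count as a redirection
    first = lines[0].split(None, 1)
    if len(first) != 2 or first[0] != "REDIRECT":
        return None
    target = first[1].strip()
    if target.startswith("http://"):
        return ("external", target)
    if target.startswith("<") and target.endswith(">"):
        return ("absolute", target[1:-1])  # absolute URL on this web site
    return ("page", target)  # another DWiki page
```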
A symbolic link is only considered a redirect if DWiki can 'resolve' it into an existing page. To resolve the symbolic link redirect, DWiki tries to interpret the symbolic link's value as if it was appearing in a [[...]] as a DWiki relative page name.
If the symbolic link doesn't resolve this way, DWiki treats the whole thing as an ordinary page; this keeps 'ordinary' uses of symlinks intact in most cases, including when the symlinks point to something outside the DWiki page directory.
Redirects to http:// links or absolute URL links are a convenient way of creating WikiWord abbreviations to external things for local use. Make an appropriate REDIRECT file, stick it in your Aliases area, and now every page in the DWiki can say GoogleSearch or something and get a link, bam.
(WikiWord redirection rewriting means that in many cases the generated link will even point to the real target instead of the REDIRECT file, as you can see here.)