NewFeatures: Chronological entries

This is a log of new features of note in DWiki.

DWiki can now generate RSS 2.0 format syndication feeds for recently changed pages. This is a terrible hack that should not exist but ChrisSiebenmann has to deal with a few things that don't accept Atom format feeds, only RSS 2.0 feeds. RSS 2.0 page feeds are just like Atom page feeds and all Atom page feed restrictions and configuration options apply to them too. They are not advertised anywhere (either in page tools or in feed autodiscovery); to get access to them you must specify the feed URL directly, using the view name 'rss2' (as in http://you.cim/dwiki/?rss2).

See dwiki/view-rss2.tmpl and syndication/rss2entry.tmpl for what RSS 2.0 elements are used and how.

There is no RSS 2.0 feed for page comments.

(Because this is a hack, asking for the RSS 2.0 feed of VirtualDirs that are restricted such that they get redirections to the base directory, per AtomFeedsAndVirtualDirs, will get you a redirection to the Atom feed for that base directory. This is considered acceptable since people aren't supposed to be using those feeds anyways.)

DWiki can now restrict what sorts of VirtualDirs advertise AtomFeeds (both in SyndicationDiscovery and in the Atom toolbar) and/or provide them if they're requested by URL.

It turns out that when you have a fair amount of content in a DWiki your VirtualDirs and thus your AtomFeeds proliferate like over-active rabbits. Then SyndicationDiscovery kicks in so that anyone who looks at a virtual directory can discover its Atom feed and either start polling it or just crawl it. Once your DWiki gets big enough this becomes not really a good thing, as Chris has found out with his techblog.

Directories can now say that they don't want to be rendered in specific view types. The usage case Chris has in mind is his techblog, where the blogdir view of categories is utterly huge because it renders hundreds of entries. Because this is intended to be a graceful gentle fix, trying to view a directory in a disallowed view generates a (permanent) redirection to the default view of the directory. To avoid redirection loops, this redirection only happens if the view has been specified explicitly as a URL parameter.

(For obvious reasons, disallowed views are also disallowed in virtual directories derived from a particular real directory.)

This is done similarly to DefaultDirViews: touch a file in the directory called .flag.noview:<viewname>. Unlike default views, this is not currently inherited by child directories.
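
As a rough illustration of the flag-file approach (this is not DWiki's actual code; the function names and the 'normal' default view name are assumptions made for the example), the check might look something like this:

```python
import os

NOVIEW_PREFIX = ".flag.noview:"

def disallowed_views(dirpath):
    """Return the set of view names disallowed by flag files in dirpath."""
    views = set()
    for name in os.listdir(dirpath):
        if name.startswith(NOVIEW_PREFIX) and len(name) > len(NOVIEW_PREFIX):
            views.add(name[len(NOVIEW_PREFIX):])
    return views

def view_to_serve(dirpath, requested_view, default_view="normal"):
    """Return (view, redirect).

    requested_view is None when no view was given as a URL parameter.
    redirect is True only when an explicitly requested view is disallowed,
    which is what avoids redirection loops.
    """
    if requested_view is None:
        return default_view, False
    if requested_view in disallowed_views(dirpath):
        return default_view, True
    return requested_view, False
```

Only the directory's own flag files are consulted, matching the note that this is not inherited by child directories.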

The 'See As' page tools links also exclude disallowed view types. Right now they do so a little bit too thoroughly, in that they exclude the default view if it's also disallowed. Moral: don't do that, even though the code saves you from a redirection loop in this case.

Right now there are no restrictions on what (directory) views can be disallowed, so you can disallow Atom feeds. This is probably not a feature and will probably not be staying, although Chris may change his mind about this or just be lazy.

New: various new configuration options

Not covered before now are various new configuration options that have been quietly added to DWiki over the five or so years that I have been using it as mostly a blogging engine. As you might expect, a bunch of these have to do with dealing with obnoxious clients of various sorts.

They are by and large documented in ConfigurationFile. I am not going to try to remember them here.

New: DWikiText has 'processing notes' (and better quoting)

These are directives that change how DWikiText is interpreted to do things like turn off certain font characters or map a simple-to-type character sequence like '->' to an HTML entity. They are documented in DWikiText so I am not going to repeat myself here.

In the process I added a new and less annoying plain quoting mechanism: ``...''. It looks better in ASCII than it probably does in the font here.

(The code for this was written in August of 2007 or so, but I sat on it because I wasn't entirely sure I liked the feature. Well, nuts to that; time to roll things out and just go.)

New: directories can have an index 'page'

I've decided that sometimes I really want a directory to have an index page, not just a list of the contents. So now I can, with the unimaginatively named 'index' view. It's mostly template based; the normal template uses inject::index to display an __index file. However, the view has several special properties:

  • unlike other directory views, it is not inherited by subdirectories.
  • it is only listed as an available view in the page tools area if there is a file called __index in the directory.
  • if __index is a redirection, the index view will just generate a redirect to the target.

(Thus, the index view could be used to replace the wikiroot configuration directive.)

The index view is valid on directories even without a __index page in the directory. Right now the template just displays a normal directory listing in that case.

New: #pragma search ...

I got tired of the relative links in my blog drafts (written in an entirely different directory than their eventual home) not working. So, introducing #pragma search; this adds additional directories to search for pages when resolving relative links, both WikiWords and explicit [[...]] links. The directories listed in a search pragma must be absolute path names, and are searched only after the hard-coded possibilities have been exhausted (including the alias directory for WikiWords).

You can't have both a #pragma search ... and a #pragma pre in the same page, partly because it doesn't make any sense. (This is the simple way of getting out of handling multiple #pragma directives.)
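
The lookup order described above can be sketched roughly as follows. This is only an illustration under assumptions: the function name, the idea of passing the hard-coded locations as a list, and the example paths are all made up, and the real resolution code surely has more cases.

```python
import posixpath

def resolve_relative_link(name, builtin_dirs, search_dirs, page_exists):
    """Return the first existing page path for name, or None.

    builtin_dirs stands in for the hard-coded possibilities (including the
    alias directory for WikiWords); search_dirs are the absolute directories
    from '#pragma search', which are only tried after the built-in ones.
    """
    for base in list(builtin_dirs) + list(search_dirs):
        candidate = posixpath.join(base, name)
        if page_exists(candidate):
            return candidate
    return None
```

A draft in a hypothetical /drafts directory with '#pragma search /blog/entries' would then find /blog/entries/SomeEntry once the built-in locations come up empty.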

New: optional disk-based caching

Having run out of other ways to really improve performance, I added a disk-based caching infrastructure to DWiki and then put in two caches.

The real cache is the renderer cache, which stores the results of selected renderers (currently just the wikitext renderers). Via some glue it's also used to store the results of the filesystem walk that's the expensive bit of blog::prevnext.

The Brute Force Cache is for dealing with Slashdotting style situations; it just caches complete requests for N seconds when the system seems to be under load. I also hijacked it as a convenient place to add extra caching for Atom feeds and to force this caching on software that doesn't do conditional GET.

(For more details, see Caching.)

This required a new storage pool class. Like the comment store, it uses a customized and restricted interface to write things (and a new interface to read them). The cache storage pool stores objects, not data blobs, using the cPickle module to make the swap back and forth. (This may be a mistake, but it's fast and easy.)

Since removing files in DWiki makes me nervous, I didn't bother to implement any sort of cache cleaning; you get to do that by hand. The cache has TTLs, and the renderer cache has validation layered on top of the cache object store, but when they detect something invalid they just ignore it. (On the other hand, the cache storage layer does use temporary files and rename(), so in a sense it's already removing files.)
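
A minimal sketch of the general technique (pickled objects, a TTL, temporary files plus rename(), and ignoring rather than deleting stale entries) might look like the following. It is not the real storage pool: the function names and on-disk layout are assumptions, and it uses Python 3's pickle module where the original used cPickle.

```python
import os
import pickle
import tempfile
import time

def cache_store(path, obj):
    """Pickle (timestamp, obj) to a temporary file, then rename it into place."""
    dirname = os.path.dirname(path) or "."
    fd, tmpname = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, "wb") as fp:
            pickle.dump((time.time(), obj), fp, pickle.HIGHEST_PROTOCOL)
        # rename() is atomic on POSIX, so readers never see a partial file.
        os.replace(tmpname, path)
    except Exception:
        # Only the never-published temporary file is cleaned up here.
        if os.path.exists(tmpname):
            os.unlink(tmpname)
        raise

def cache_fetch(path, ttl, now=None):
    """Return the cached object, or None if it is missing, damaged or too old.

    Invalid or expired entries are simply ignored; nothing is removed.
    """
    try:
        with open(path, "rb") as fp:
            stamp, obj = pickle.load(fp)
    except (OSError, EOFError, pickle.UnpicklingError, ValueError, TypeError):
        return None
    if now is None:
        now = time.time()
    if now - stamp > ttl:
        return None
    return obj
```

Keeping the interface down to a store and a fetch call is also what would make it plausible to put something like memcached behind it later.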

In theory the cache interface is generic, so later I can hook up a memcached setup or something without having to change higher-level code.

New: Text formatting via macros

I (ChrisSiebenmann) found myself with a genuine need for text set in HTML <strike>. Rather than try to invent a formatting setup for this that did not make me cringe, I just punted to the easy solution: macros that do text formatting.

So, DWiki now has grown the new macros:

  • ST for the font styles DWiki didn't already have.
  • C for character entities, and ShowCharEnts to show the named entities we support.
  • because I was there anyways, AB for <abbr>, so I could have those cute inline abbreviation expansion things. Tragically <abbr> is not supported by IE 6 and less, so AB may quietly change to generating <acronym> someday.
  • and finally, IMG to generate <img>. Width, height, and alt text are mandatory, and there is a hacky way to also roll title text in too. (Title text is optional.)

Through special black magic, ST, C, and AB can be used in comments (the omission of IMG is deliberate). In theory this lets a commenter cause character set explosions, but in practice a bad commenter can just write UTF-8 directly (UTF-8 is the common and only sane character set choice, so).

Implementing these as macros means that they have some limitations. You can't nest C, AB, or a differently styled ST inside an ST, and currently none of them can be done inside link text ([[....]]).

These macros are a bit of a hack. It's relatively easy to implement bits of HTML this way, but I'm not sure if it's good design overall.

New: Previous and Next links in pages in blog views

The more I have used DWiki for a blog, the more I've realized that I want individual entries to be able to have Previous (entry) and Next (entry) links. At first I resisted because this would require an expensive filesystem walk on even individual page views, but I have now given in and made the blog::prevnext renderer, which will do this if I want.

blog::prevnext generates links directly to the pages, not to /range/N-N/ virtual directories, because I think that works better. (It would be less code the other way, but better links are worth the code.)

New: DWiki can generate Google Sitemaps

I added the ability for DWiki to generate Google Sitemap format XML files, using the view 'sitemap'. The information included is very basic currently: just the URL for each file page, all of them set to priority 0.8 (in the hopes that Google will decide that all of the directories are priority 0.5 and prefer returning file page results).

Google does not say what Content-Type you should return sitemaps in, so I have opted for 'application/xml'.

In the future, something as elaborate as Atom rendering may be done. For now, everything is hardcoded in the sitemap::minurlset renderer.

Updated: now directories are shown too, at priority 0.6. This feature is clearly going to be in flux for a while.
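
For illustration only, a sitemap along these lines could be generated as below. This is not the sitemap::minurlset renderer; the function name and example URLs are invented, and the namespace used is the current sitemaps.org one, which is an assumption and may not be what DWiki emits.

```python
import xml.etree.ElementTree as ET

# Assumption: the sitemaps.org 0.9 namespace, not necessarily DWiki's.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def make_sitemap(file_urls, dir_urls=()):
    """Build a <urlset> with file pages at priority 0.8 and directories at 0.6."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for urls, priority in ((file_urls, "0.8"), (dir_urls, "0.6")):
        for u in urls:
            url = ET.SubElement(urlset, "url")
            ET.SubElement(url, "loc").text = u
            ET.SubElement(url, "priority").text = priority
    return ET.tostring(urlset, encoding="unicode")
```

The result would be served with the 'application/xml' Content-Type mentioned above.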

New: https:// now supported in plain text

As part of fixing Atom feeds to not break embedded https:// urls, I decided that we should support plaintext https:// URLs, like say

DWiki should now support non-HTTP URLs much better in general (before, there were a number of problems and issues). You can even include mailto: links if you really want to.

New: Atom feed autodiscovery

There's a standard for autodiscovery of Atom feeds, involving a <link rel="alternate" type="application/atom+xml" href="..."> element in your <head>. Now DWiki has an atom::autodisc renderer to create them.

The current code only generates 'recently changed pages' Atom feed links, and so disappears entirely when there isn't one. In theory one can have multiple autodiscoverable feeds (the first is the default, and they get title="..." elements), but I don't quite feel like being that daring just yet.
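
A small sketch of generating such an element, including the optional title="..." attribute that multiple feeds would need, is shown here. The function name and the example feed URL are assumptions for illustration, not what atom::autodisc actually does.

```python
from html import escape

def autodisc_link(feed_url, title=None):
    """Return a <link> autodiscovery element for an Atom feed URL."""
    attrs = [("rel", "alternate"), ("type", "application/atom+xml")]
    if title:
        attrs.append(("title", title))
    attrs.append(("href", feed_url))
    # Attribute values are HTML-escaped so that '&' in URLs stays valid.
    body = " ".join('%s="%s"' % (k, escape(v, quote=True)) for k, v in attrs)
    return "<link %s>" % body
```

When there is no 'recently changed pages' feed for a page, the caller would simply emit nothing.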

(I am also not confident that clients have the UI issues involved sorted out. I'm not sure I have the issues sorted out; for example, should file pages have only the comment feed in the autodiscovery, or should they also have the recent changes for their directory feed in? Which better matches practical user expectations? Can I expect users to be aware of the difference between directory pages and file pages?)

New: linktocomments renderer

This is a little new renderer that creates a link to a page in the view necessary to show comments. In turn, this has caused the 'your comment has been posted' template page to be tarted up so that it uses it, thereby letting people who have posted comments see them in the page they go to.

(I have decided not to have it link to the comment section of the page, just on general principle. I may change my mind about this.)

New: feed-max-size and feed-max-size-ips

This is all because LiveJournal has undocumented size limits on incoming syndication feeds, limits that DWiki can easily blow past. Since I actually wanted LiveJournal to be able to get syndication feeds from me, DWiki has grown two new configuration settings.

feed-max-size is an integer number of kilobytes. It is a rough limit on how large any feed can be; once DWiki generates a feed that is this many kilobytes or larger it stops adding more entries, regardless of the setting for atomfeed-display-howmany. If unset, there is no size limit.

feed-max-size-ips restricts feed-max-size to the whitespace separated list of IP addresses or tcpwrappers style IP address prefixes (eg '66.150.15.' to get all of 66.150.15.*). Syndication fetches from other addresses will behave as if there was no feed-max-size.
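
The address matching can be sketched like this (the function name is made up; the only behaviour assumed is what is described above, namely exact addresses plus prefixes that end in a dot):

```python
def ip_matches(ip, patterns):
    """Check ip against a whitespace-separated list of addresses and prefixes.

    An entry ending in '.' is a tcpwrappers-style prefix; anything else
    must match the whole address exactly.
    """
    for pat in patterns.split():
        if pat.endswith("."):
            if ip.startswith(pat):
                return True
        elif ip == pat:
            return True
    return False
```

Requiring the trailing dot on prefixes is what keeps '66.150.15.' from also matching 66.150.150.*.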

Strictly speaking, feed-max-size limits only the size of the atom::pages or atom::comments output to that size. Whatever else is tacked on to make up a feed (hopefully not very big) will add some extra size.
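
The size limit itself amounts to something like the sketch below. It assumes entries arrive as already-rendered strings and that a kilobyte is 1024 bytes; both are guesses for the sake of the example, as is the function name.

```python
def limit_entries(rendered_entries, max_kb=None):
    """Keep adding entries until the total reaches max_kb kilobytes.

    The check happens after an entry is added, so the result can run
    somewhat over the limit; that is why it is only a rough limit.
    """
    if max_kb is None:
        return list(rendered_entries)
    limit = max_kb * 1024
    kept, size = [], 0
    for entry in rendered_entries:
        if size >= limit:
            break
        kept.append(entry)
        size += len(entry.encode("utf-8"))
    return kept
```

Since the last entry added can overshoot, and the feed wrapper adds more on top, undersizing the setting is the safe move.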

Moral: undersize feed-max-size a bit. For LiveJournal, the limit is apparently 150 kilobytes (currently), so setting it to '120' or so should provide a comfortable safety margin.

Although I'm not entirely fond of this (to put it one way), the documentation has been updated appropriately, making this feature more or less official.

New: /oldest/ virtual directory restriction

DWiki has long been able to give people the latest N things in a virtual directory context (as latest/<N>). Now it can give them the oldest N things, using the obvious syntax: oldest/<howmany>.

Just to show off, ranges properly convert themselves into 'oldest/<N>' at the end of their run, just as they convert themselves into 'latest/<N>' at the start.

Documentation has been updated appropriately.

New: Better Last-Modified handling

Over the past while it has become increasingly obvious that it's useful for as many responses as possible to carry a Last-Modified: header. (The last straw was wanting Google's index to show modification dates for DWiki pages.)

My reason for killing Last-Modified: was so that things like logging in and logging out, which can't be reflected in the timestamp, would still have conditional GETs be served new pages. But since the conditional GET logic is in DWiki itself, I can have DWiki be smarter about it.

DWiki now separates the page timestamp from the idea of whether the page timestamp is reliable or simply vaguely useful information. The page timestamp will always be served if it exists at all, but conditional GETs only look at the page timestamp if it's reliable (which means that if authentication is on, the answer is generally 'not').
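
In outline, the split between 'has a timestamp' and 'timestamp is reliable' could work like the following sketch. The function name, the (status, headers) return shape and the exact comparison are assumptions; only the overall behaviour comes from the description above.

```python
from email.utils import formatdate, parsedate_to_datetime

def respond(page_mtime, reliable, if_modified_since=None):
    """Return (status, headers) for a page with Unix timestamp page_mtime.

    Last-Modified is always sent when a timestamp exists, but a conditional
    GET is only honoured when the timestamp is marked reliable.
    """
    headers = {}
    if page_mtime is not None:
        headers["Last-Modified"] = formatdate(page_mtime, usegmt=True)
    if reliable and page_mtime is not None and if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since).timestamp()
        except (TypeError, ValueError):
            since = None  # unparseable header: treat as unconditional
        if since is not None and int(page_mtime) <= since:
            return 304, headers
    return 200, headers
```

With authentication on, the caller would generally pass reliable=False and so always serve a full page.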

This should work much better.

New: Page Titles

Pages now have accessible 'titles', sort of. A page's title is taken to be the value of the header that starts the page, if said header is on the very first line. (So this page's nominal title is 'New: Page Titles'.) The header level doesn't matter; a <h6> is as good as a <h1>, so long as it's the first line on the page.

This info is available only after the page has been rendered, in the new global context variable :wikitext:title. Fortunately for us, Atom feed entries can have their fields in any order, so we are free to generate <title> after <content>.

Why did I do this? First, it's suitably low rent, and second I decided I wanted some vague way to generate semi-real page titles in Atom feeds instead of the current full path to the page (ever so helpful and informative as it is).

The only tricky bit was making sure that only the appropriate magic wikitext renderers set the page title, and not all the times that we spin through wikitext looking for, eg, permissions. (Especially important in Atom feeds, as Atom feeds look at everyone's permissions before they do the real rendering.)

A DWiki page (technically, any wikitext, so comments too) can now start with the line '#pragma pre' to declare that the entire rest of the page is simply preformatted text and should be barfed out as such (minus the #pragma line, which is swallowed). '#pragma plaintext' is accepted too.

This is a much more convenient and maintainable way to stick plaintext files (such as program source or something) into a DWiki than indenting the entirety of their text one space.

Note that this does not make the page come out as text/plain. The page is still text/html and fully templated, it's just that the wikitext is one big <pre> lump, instead of more sophisticated formatting.
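
As a hedged sketch of the idea (the function name is invented and the real wikitext renderer certainly does this differently), handling the pragma boils down to:

```python
from html import escape

def render_pragma_pre(text):
    """Return a <pre> block if text starts with the pragma, else None.

    The pragma line itself is swallowed; everything after it is escaped
    and emitted as one preformatted lump.
    """
    first, _, rest = text.partition("\n")
    if first.strip() in ("#pragma pre", "#pragma plaintext"):
        return "<pre>%s</pre>" % escape(rest, quote=False)
    return None
```

A None result means the page should go through normal wikitext rendering; either way the output still ends up inside the usual HTML templates.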

It's unlikely that DWiki will acquire any other sorts of pragmas (eg to say 'format this as nicely HTML-ized Python code'), partly because ChrisSiebenmann is dubious about the 'nicely HTML-ized' bit of any formatters since they invariably involve aesthetic decisions that people (eg, him) can and do object to. Having an easy way of including plaintext is the 80%-90% solution, and that is the DWiki way.

DWiki has a new template handling scheme: the core idea is that we now have a way of a) picking the first existing template from a list of them and b) generating candidate templates by variable substitution and 'all parent directories' expansion. This gives DWiki a simple and general framework for doing things like 'template injection', which lets us skin an entire directory hierarchy (but not the entire wiki) with things like blog sidebars.
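
The two pieces of that core idea can be illustrated with a short sketch. The names here (parent_dirs, candidate_templates, first_existing, and the sidebar.tmpl example) are hypothetical and the real template syntax is described in TemplateSyntax, not here.

```python
import posixpath

def parent_dirs(page_path):
    """Yield the page's directory and each parent, deepest first, then ''."""
    d = posixpath.dirname(page_path)
    while d:
        yield d
        d = posixpath.dirname(d)
    yield ""

def candidate_templates(template_name, page_path):
    """'All parent directories' expansion of one template name."""
    return [posixpath.join(d, template_name) for d in parent_dirs(page_path)]

def first_existing(names, exists):
    """Pick the first template in names for which exists(name) is true."""
    for name in names:
        if exists(name):
            return name
    return None
```

A page at blog/python/Entry would thus pick up blog/sidebar.tmpl if there is one, which is how a whole subtree gets skinned without touching the rest of the wiki.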

This also gives us a single top-level template that generates all normal HTML-based pages, thereby giving us a single place to skin the entire site. The per-view templates in views/* (now only a convention) now just generate view-specific information, leaving all of the rest up to the top-level template.

The clarity and lack of stupid template piece duplication of the result is a clear indication of how it is a better scheme. (And no more silly things like splitting a <div> start and end into different files and hoping they get included in the right spots.)

TemplateSyntax and TemplatesUsed have been revised appropriately.

One obvious way of handling blogs with categories is to create appropriate directory hierarchies for each category, then hardlink a page's file into all of the appropriate 'category' directories. However, this raises a problem: DWiki's idea of a page's identity is its path.

Directories can now have Readme files, called __readme. Readme files are injected into pages via the new renderer inject::readme (probably the first of several injectors).

The current templates don't inject __readme in normal directory views, but do inject them for blog and blogdir views (as you may see from this directory). Blog and blogdir views now drop all files starting with __, taking out __readme and __access and any future special magic files.

DWiki can now generate Atom feeds for recently changed pages and recently made comments, either for the entire DWiki or for some subtree of it. For comments, this can be down to an individual article.

At the moment, pages in the Atom feed are rendered without macros except for CutShort, for efficiency reasons. All of the links are turned into absolute links (with http:// et al), since this is basically required. Nulled-out macros produce a small message to that effect in the generated content, so that people reading the Atom feed can tell that something is going on.

The primary way of getting nested lists is now to indent the nested list entries relative to the parent list (entry). This looks visually better in plain ASCII for cases when there is a decent amount of text.

Although ChrisSiebenmann thought he wasn't going to, the old style of nesting lists (multiple list start characters, eg ***) still works. It turns out that GNU Emacs will properly autoindent for these lists but not for real indented lists, plus sometimes they actually look visually better.

The amount of old-style nesting is ignored in an indented context; it's treated as just a new level.

You can now use LinkAbbrevs by name (not by URL) without a |; ie, instead of writing [[<text>|]], you can just write [[<text>]]. This only happens if <text> wouldn't result in a link to a real page or an external URL if there was no abbreviation.

Thus, one can write [[Google]] in the page once, and later write [[Google]], and have it work out.

DWiki now lets you use spaces to separate things in [[....]] links instead of |. If you do this, the last word is taken as the link URL or page, and the rest are the link name. (| has priority over this; DWiki tries space-separation only if there is no |.)
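
The splitting rule can be sketched as below. This ignores LinkAbbrevs and everything else the real link code does; the function name and the behaviour for a single word are assumptions made for the example.

```python
def split_link(body):
    """Split the text inside [[...]] into (name, target).

    '|' has priority; space-separation is only tried when there is no '|',
    in which case the last word is the target and the rest is the name.
    """
    if "|" in body:
        name, target = body.split("|", 1)
        return name.strip(), target.strip()
    words = body.split()
    if len(words) > 1:
        return " ".join(words[:-1]), words[-1]
    # A single word names itself (abbreviation lookup is not modelled here).
    return body.strip(), body.strip()
```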

Thus [[Google Rules The Web]] turns into Google Rules The Web.

You can use either side as an abbreviation later, for example: Google Rules The Web, Google Rules The Web. (See View Source.)

LinkAbbrevs done this way don't have to use |, as long as there is a space in the value: [[Google Rules The Web]] still turns into Google Rules The Web.

This allows somewhat more aesthetic long link name things.

Note that the opening [[ and the closing ]] have to be on the same line in the wikitext.

A DWiki RedirectFile can now point to absolute URLs as well as local DWiki pages. Absolute URLs are written like they would be in [[...]]; either http:// or <...>.

DWiki now supports generating links to URLs on the wiki's web server that are outside the DWiki itself. These are written as links in the format <...> and have to happen inside [[...]]. For example:

See [[the root of this web server|</>]].

generates a link to the DWiki's web server's root, which is all but certain to be outside the DWiki space if you're running DWiki as a CGI-BIN.

DWiki now remembers the name and URL for links that you write with [[...|...]] and thereafter allows you to omit one side of the |, at which point it will fill in the remembered values. This means that if you want to link to Google a lot in your document, you only have to write (eg) the URL once, and can thereafter refer to Google much more compactly.

This does collide a little bit with using [[...|]] to write totally unstyled text: it only comes out unstyled if it's not already been used as the name of a link.

For now I will live with that.

The RecentChanges DWikiText macro now accepts additional arguments that are the list of directories to restrict the listing to, or with a dash at the front, directories to exclude from the listing. If you combine both, both criteria must match: in an included directory and not excluded.

This lets a RecentChanges listing exclude areas that churn too much or are otherwise less interesting to list. (Perhaps, for example, a blog sub-hierarchy in a DWiki.)

If the included directories share a common directory prefix, the scan is efficient: only that directory and the directories below it are looked at at all.
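
The include/exclude rule could be expressed roughly like this. The function names are invented and the sketch only covers the matching, not the efficient scanning.

```python
def under(path, directory):
    """True if path is directory itself or somewhere below it."""
    directory = directory.rstrip("/")
    return path == directory or path.startswith(directory + "/")

def wanted(path, args):
    """Apply RecentChanges-style directory arguments to one page path.

    Plain arguments are directories to include; arguments with a leading
    dash are directories to exclude. With both, both criteria must match.
    """
    include = [a for a in args if not a.startswith("-")]
    exclude = [a[1:] for a in args if a.startswith("-")]
    if include and not any(under(path, d) for d in include):
        return False
    return not any(under(path, d) for d in exclude)
```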

Template expansion via #{...} now removes a final newline if such a final newline is present. (It doesn't remove more than one newline.)

The final newline is really an implementation artifact of files; it's there because lines end with newlines, not because people consider it to be part of the file's real content. Deleting it thus brings template expansion closer to inserting people's idea of the file's contents into place.

It also means that we avoid having templates introduce whitespace into undesired places. For example:

[There's more starting at %{blog::seemore}#{blog/rangemore}]

and blog/rangemore of:

or %{range::moreclip}

doesn't introduce a gap between the end of %{range::moreclip}'s output and the ']' in what the browser displays. (See how we didn't think of blog/rangemore as actually having a newline at the end as part of it?)
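
A toy version of this behaviour, handling only #{...} inclusion from an in-memory dictionary of templates (an assumption for the example; it does no recursion and knows nothing about %{...} renderers), might read:

```python
import re

def strip_final_newline(text):
    """Remove exactly one trailing newline, if there is one."""
    return text[:-1] if text.endswith("\n") else text

def expand_includes(text, templates):
    """Replace each #{name} with that template's text minus one final newline."""
    def substitute(match):
        return strip_final_newline(templates[match.group(1)])
    return re.sub(r"#\{([^}]+)\}", substitute, text)
```

With a template stored as " or MORE\n", expanding "[start#{blog/rangemore}]" gives "[start or MORE]" with nothing stray before the closing bracket.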

If {{RecentChanges}} is invoked in a {{Striped:...}} context, it now names the links to the pages by the name of the page instead of the page's full path. This turns out to be much less annoying if many (or even some) of the changes are deep in a hierarchy.

Rationale: you asked for a compact display so you're going to get it.

All wikitext macros and template renderer functions now have Python docstrings. This is because there is a new macro, {{DocAll}}, that spits out a list of macros or renderers and their docstrings, thereby creating a one-stop shop for a list of them all and documentation for them.
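
The general trick is easy to sketch. Here the registry is just a dictionary from names to functions, which is an assumption about how things are organized, and doc_all is a made-up name rather than the real macro implementation.

```python
import inspect

def doc_all(registry):
    """Return 'name: docstring' lines for every function in registry."""
    lines = []
    for name in sorted(registry):
        doc = inspect.getdoc(registry[name]) or "(undocumented)"
        lines.append("%s: %s" % (name, doc))
    return lines
```

Anything without a docstring shows up visibly as undocumented, which is a decent nudge to go and write one.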

Formatting and TemplateSyntax have been changed to use this, thereby killing several birds and a fix-me with one somewhat extended stone.

In the process of writing docstrings, I fixed several irritating limitations and renamed a few renderers.

ChrisSiebenmann likes this approach best because it keeps the documentation closely attached to the function, thereby serving as a clear visual reminder to a) update the documentation that's right there in front of him when he changes a function and b) write a docstring to go with any new function he writes.

Directories can now be displayed as a 'blog', which treats all file descendants of the directory as if they were blog entries and attempts to be somewhat intelligent about how much to show and how to navigate things, supporting navigation by page date or Nth to Mth most recent pages.

ChrisSiebenmann doesn't believe this makes BlogDir obsolete. BlogDir is an excellent view for ChangeLog style situations, which is exactly what NewFeatures uses it for; however, it makes a bad way of displaying a true blog-style environment because of eventual information overload. BlogView is designed to deal with that by trimming what gets displayed and providing time-based and range-based navigation.

Default directory views, mentioned before in DefaultDirViews, are now inherited from parent directories if the child specifies no particular desire about the whole thing.

This is convenient for setting an entire hierarchy to a BlogView (qv).
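
Inheritance of this sort is basically a walk up the directory tree looking for the .flag.prefview:<viewname> files that DefaultDirViews uses. A sketch under assumptions (invented function names, a 'normal' fallback view, and taking the first flag in sorted order if a directory somehow has several):

```python
import os

PREFVIEW_PREFIX = ".flag.prefview:"

def own_default_view(dirpath):
    """Return the view named by a flag file directly in dirpath, or None."""
    for name in sorted(os.listdir(dirpath)):
        if name.startswith(PREFVIEW_PREFIX) and len(name) > len(PREFVIEW_PREFIX):
            return name[len(PREFVIEW_PREFIX):]
    return None

def default_view(root, reldir, fallback="normal"):
    """Find the default view for root/reldir, inheriting from parents."""
    parts = [p for p in reldir.split("/") if p]
    while True:
        view = own_default_view(os.path.join(root, *parts))
        if view is not None:
            return view
        if not parts:
            return fallback
        parts.pop()  # nothing here; try the parent directory
```

A child directory with its own flag file wins over whatever its parents say.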

DWiki has a new formatting feature. If you write just '* * *' on a line by itself, it is centered and becomes a separator. Like, well, this:

* * *

You can't indent them, because I decided that would run too much risk of confusing things with pre-formatted text blocks.

Why? I just decided I liked that style of chunk separator. (Maybe if I could get a <hr> variant that didn't stretch the entire width...)

Note: feature subject to me changing my mind.

Symbolic links inside the DWiki page area now cause redirections if their value would be a valid redirection in a REDIRECT line.

For example: FrobTig, which is a symlink in /Aliases with the value '../People'.

We could have tried using os.path.realpath() on the symlink and taking the result relative to the store root, but I think that that has more subtle explosive breakages.

Symbolic links that don't resolve to a real DWiki page this way are interpreted normally, so you can still have symlinks that point to files outside the DWiki pagestore root.
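
One way to picture the approach of reading the symlink's value rather than using os.path.realpath() is the sketch below. What counts as a valid redirection is an assumption here (a relative path that stays inside the page store and names something that exists), as is the function name.

```python
import os
import posixpath

def symlink_redirect(root, relpath):
    """Return the page path a symlink should redirect to, or None.

    relpath is the page's slash-separated path under the store root.
    None means 'treat the symlink as an ordinary file'.
    """
    full = os.path.join(root, relpath)
    if not os.path.islink(full):
        return None
    value = os.readlink(full)
    if value.startswith("/"):
        return None  # assumption: absolute values are not redirections
    target = posixpath.normpath(
        posixpath.join(posixpath.dirname(relpath), value))
    if target in (".", "..") or target.startswith("../"):
        return None  # points outside the page store
    if not os.path.exists(os.path.join(root, target)):
        return None
    return target
```

For the FrobTig example, 'Aliases/FrobTig' with the value '../People' comes out as 'People'.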

DWiki now supports 'virtual directories': directories that don't really exist but instead serve to limit what's shown for a real directory. For example, you can limit what's shown for a real directory to only the most recent 5 things, or to only things written on 2005/05/29.

DWiki's tables can now have rows that span multiple lines, using indentation to continue them on subsequent lines just like with lists. The appearance is straightforward; just write:

| start your table | another cell that
  is continued on another line | the end cell |
| cell one | cell two | cell three |

A row is closed by having a ' |' at the end of a line, or just by starting a new row.

DWiki now supports definition lists (<dl>, <dt>, <dd> in HTML). I started with what I think was Wikipedia's syntax but decided it was ugly in plain text so came up with my own that I like better.

DWiki now has a simple hierarchical way of handling access to pages for various things (both access and commentability), so you can give (or take away) permissions for things to entire directory trees at a shot. We use a simple implementation where directories can have a magic file called __access, which creates default permissions for everything under them.

DWiki now has an extremely low-rent generic from-the-web search functionality. It's so low-rent I'm not sure I'm going to keep it, but we'll have to see.

DWiki now supports comments on pages; comments are themselves written in DWikiText. Currently pages have to be specifically enabled for this; in the future I will have a better global or semi-global mechanism for this.

As part of the fallout of other work, I fixed the nit where 'View Source' would show as a page tool on pages where viewing the source would get you a permission or unavailability error.

With blog-style things now decently supported, there is the problem that some pages want to be longer than is comfortable for display in a blog setting. Now wikitext can signal that only the front should be shown in some contexts, like so:

Read more »

Directories can now have default view types associated with them, by the simple yet sleazy method of touching a file in them called .flag.prefview:<viewname>.

This lets us auto-set a directory to default to a blogdir view.

The page tools now autogenerate a list of 'See As <whatever>' links for alternate directory formats.

This required some innards magic to distinguish a normal view being explicitly specified from no view being specified and us just defaulting to a normal view.

Directories can now be displayed as a 'blogdir', which treats the files in the directory as if they were blog entries: they are sorted by modification time, most recent first, and then they are all displayed inline with a title.
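
The ordering part is simple enough to sketch. The function name is made up, and skipping dotfiles is an assumption; dropping files that start with __ follows what blog and blogdir views are described as doing elsewhere in this log.

```python
import os

def blogdir_entries(dirpath):
    """Return the file names in dirpath, most recently modified first."""
    entries = []
    for name in os.listdir(dirpath):
        full = os.path.join(dirpath, name)
        # Skip magic files (__readme, __access), flag files, and directories.
        if name.startswith("__") or name.startswith("."):
            continue
        if not os.path.isfile(full):
            continue
        entries.append((os.path.getmtime(full), name))
    entries.sort(reverse=True)
    return [name for _, name in entries]
```

Since every entry gets rendered inline, the cost grows with the number of files, which is the scalability worry.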

A chunk of this behavior is controlled by templates and new renderers. 'blogdir' is a new view, only valid for directories, which uses blogdir.tmpl. The blog::blogdir renderer does the direct expansion, running each file through the template blog/blogentry.tmpl.

There are also new renderers for the new blog-like time format and for showing the owner of a file. (Currently the Unix and/or RCS owner.)

This is ... how shall we say it ... not hugely scalable over the long run in terms of time structure and number of entries.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.