Wandering Thoughts archives

2011-12-14

Practical issues with REST's use of the Accept header

In full-blown REST, a single resource can have multiple representations (formats). Which representation you ask for, and which one is served, is decided by the HTTP Accept header of the request; if you want an HTML version of the URL you say that you will accept text/html, and if you want a JSON version you ask for application/json (see eg).
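
For instance, with curl this looks something like the following (a minimal sketch; the URL is made up, and any HTTP client that lets you set request headers would do):

    # One URL, two representations, selected purely by the Accept header.
    curl -H 'Accept: text/html'        https://example.com/some/resource
    curl -H 'Accept: application/json' https://example.com/some/resource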

The issue I have with this in practice is that it makes the Accept header (well, really the MIME type in it) part of the name of the specific representation of the resource. You cannot talk about, for example, 'the URL of the Atom feed'; you must specify both the URL and the type when you are talking about it. And by 'talk about' I mean more than just 'share with people': any time you want to retrieve a specific representation of the resource, the tools you use must be told the type as well as the URL, or must default to the right type.

(And if the tool doesn't support specifying the type or has an awkward procedure for this, you lose.)

This is a problem. To start with, we simply don't have an agreed-on notation for 'a URL in a specific type' in the same sense that we have a notation for 'a URL'. Names and notation are important, and the lack of a good notation is a serious drawback because (among other things) it gets in the way of communication: communication between people, between people and programs, and between programs (and even inside programs).

Of course, REST has a good reason for doing this. From my outsider perspective, the problem REST is solving is auto-discovery of how to retrieve a specific representation of a resource. If you have a base URL, in a pure REST environment you now know how to retrieve any available representation of that resource. Want a JSON or an Atom feed version of URL <X>? Request <X> with an Accept of JSON or Atom (or what have you) and you're done. And this works regardless of the initial default representation; you do not have to assume, eg, an initial HTML representation and then parse <link> elements out of its <head> to discover the Atom feed URL.
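
To make the contrast concrete, here is a sketch of both approaches (the URL is made up, and the scraping step assumes the feed is advertised via a <link> element with its attributes in this order and an absolute href; real HTML parsing would be more robust):

    # Without content negotiation: fetch the default HTML, scrape the
    # Atom feed URL out of its <head>, then fetch the feed.
    url='https://example.com/blog/'
    feed=$(curl -s "$url" |
           sed -n 's/.*<link[^>]*type="application\/atom+xml"[^>]*href="\([^"]*\)".*/\1/p' |
           head -1)
    curl -s "$feed"

    # With full REST content negotiation, a single request suffices:
    curl -s -H 'Accept: application/atom+xml' "$url"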

Of course this raises practical questions about what the equivalent of a given page is in another representation. For example, what is the proper Atom feed representation of the front page of a blog that shows only the five most recent entries: an Atom feed with just those five entries, or a full-sized Atom feed with more? I suspect that the proper REST answer is the former, while most people would consider the latter more useful.

I'm a pragmatist. I care a lot about clear names and easy communication, and I live in a world of imperfect tools (many of which were designed before the Accept type became important this way). So while I can admire the intellectual purity and usefulness of the full-blown REST approach, I don't agree with it in practice and I'm not likely to build services that work that way (unless there's a compelling technical need to avoid the discovery problem). Even if it's less pure, I'll park my different representations of more or less the same thing on different URLs; it's simply easier in practice.
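
In other words, something like this (made-up URLs): every representation is itself a plain URL, so any tool that can fetch a URL can fetch the right representation with no special header support.

    curl https://example.com/blog/          # the HTML version
    curl https://example.com/blog/atom/     # the Atom feed version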

web/PracticalRESTAccept written at 23:51:44

What makes backups real

DEVOPS_BORAT tweeted today:

Is not about backup, is about restore.

YES. Many, many times yes.

DEVOPS_BORAT is undeniably funny, but sometimes those funny tweets are also pithily saying something very important, something you shouldn't just laugh at and then move on from. This is one of those times.

Until you have tested restores, you do not have backups; you have a superstitious ritual that may or may not write some useful bits to some place. What is important is not making those bits; what is important is getting things back. If you are not testing restores, you are just going through the motions of backups without knowing if they actually work. Restores are what makes backups real instead of cargo cult rituals.

Make your backups real today, before you find out the hard way that you've just been performing a superstitious ritual.

(The ideal test is an end-to-end restoration where you don't just test that you can, say, restore a database's files from backups; you also test that your database software is happy with the files and that all of the information is there.)
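
As an illustration, here is a minimal sketch of such a test for a database; every command and path in it is a hypothetical stand-in for whatever your backup system and database actually provide:

    #!/bin/sh
    # Hypothetical end-to-end restore test. 'restore-from-backup',
    # 'db-server', and 'db-client' are stand-ins, not real commands.
    set -e
    scratch=/scratch/restore-test
    rm -rf "$scratch" && mkdir -p "$scratch"

    # 1. Get the database's files back out of the most recent backup.
    restore-from-backup --dest "$scratch" /var/lib/db

    # 2. Bring up a throwaway database instance on the restored files.
    db-server --datadir "$scratch" --port 5433 &

    # 3. Verify the data itself is intact, not just that files exist.
    db-client --port 5433 'SELECT count(*) FROM important_table'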

If you want hair-raising reading, I've written before about all of the things that can go wrong with backups.

sysadmin/WhatMakesBackupsReal written at 22:15:45

Shell functions versus shell scripts

As part of Sysadvent I recently read Jordan Sissel's Data in the Shell, where one of his examples was a suite of tools for doing things with field-based data (things like summing a field). I approve of this in general, but there's one problem: he wrote his tools as shell functions. My immediate reaction was some approximation of a wince.

I have nothing against shell functions; I even use them in scripts sometimes, because they can be the best tool for the job. But using shell functions for tools like this has one big drawback: shell functions aren't really reusable. Jordan's countby function is neat, but if he wants to use it in a shell script he's out of luck; he's going to have to put a copy of the function into the script. If it were a shell script, he could have used it interactively just as he did, and he could reuse it in future shell scripts.

Your default should always be to write tools as shell scripts. As nifty as they may be, shell functions are for two special cases: when you need to manipulate the shell's current environment, or when you are absolutely sure that what you're writing will only ever be used interactively in your shell and never in a shell script (even a shell script that you wrote purely to record that neat pipeline you put together and may want some day in the future). Frankly, there are very few tools that you will never want to reuse in shell scripts, especially if the reason you're writing them in the first place is to make pipelines work better.
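
To illustrate the difference, here's a sketch of my own (not Jordan's actual code) of a countby-style field counter done both ways, plus the kind of thing that genuinely has to stay a function:

    # As a shell function: handy interactively, invisible to scripts.
    countby() {
        awk -v n="${1:-1}" '{ c[$n]++ } END { for (k in c) print c[k], k }'
    }

    # As a standalone script: put the following in, say,
    # /usr/local/bin/countby and chmod +x it; now it works
    # interactively *and* in any future shell script.
    #   #!/bin/sh
    #   # usage: ... | countby [FIELD]
    #   exec awk -v n="${1:-1}" '{ c[$n]++ } END { for (k in c) print c[k], k }'

    # This, though, must stay a function; a script runs in its own
    # process and cannot change your shell's current directory.
    cdsrc() { cd "$HOME/src/$1"; }

(Used as, say, 'ps aux | countby 1', either version of countby would give you a count of processes per user.)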

(Shell scripts are also generally easier to write and debug, since you can work on them in a full editor, try new versions easily, and run them under 'sh -x' and similar things. They are also more isolated from each other.)

By the way, I'll note that I've learned this lesson the hard way. When I started out with my shell I wrote a lot of things as shell functions; over time it's turned out that I want to use many of them from shell scripts for various reasons, and so I've quietly added shell script versions of any number of them. If I were clever I would do a systematic overhaul of my remaining shell functions to sort out which I no longer use at all, which should be shell scripts, and which need to remain functions.

unix/ShellScriptsVsFunctions written at 01:43:18

