2013-03-25
My (current) view of using branches in VCSes
In a comment on this entry, Aristotle Pagaltzis asked:
(Though I admit I wonder why you do have fred-1 and fred-2 [source
directories] rather than branches in your VCS.)
The simple answer is that my favorite way of changing branches is with
cd. This is especially the case if I'm developing things in parallel
and may well wind up throwing one of them away; for good reasons VCSes
make totally deleting a branch much harder than a plain 'rm -rf'.
I'll admit that part of my preference is because I haven't yet gotten
around to mastering branching in either git or Mercurial (partly because,
frankly, I don't entirely like it and don't need it yet).
(Some people will say that you should keep even experimental stuff that didn't work out in your VCS in case you ever want to go back to it later. This may work for them but it doesn't work for me; I want my VCS to be neater than that.)
But even without that I think cd is much easier than going back and
forth between branches in the same directory hierarchy, especially
if you're developing on both branches. With today's VCSes, flipping
back and forth between branches generally requires you to have actually
committed your work on both of them, which itself leads to messy
history (or to a lot of redone commits or the equivalent, as you commit
only so that you can flip branches, then annul and redo the commit
the next time). I also personally think that separate directories are
a cleaner and more natural model for (multi-)branch development, with
the only downside being the extra disk space used.
(And disk space is generally cheap unless you're dealing with huge repos. The two largest repos I have handy are Mozilla and the Linux kernel; Mozilla is 2 GB and Linux is 1.2 GB. That's not going to break the bank on modern machines.)
I understand why VCSes have branch-switching commands (they can't not have them, to put it one way) and the benefits of having multiple branches in the same repo (including things like being able to do easy diffs between branches). But it just doesn't fit into the way that I prefer to interact with VCSes and I like to keep my life simple.
2013-03-18
The wrong way for a framework to lay out projects
I admire Django's attempts to screw up my entire source repository structure, where by 'admire' I actually mean 'am madly hacking around'.
I'm sad to say that Django 1.4 is a great illustration of two bad things: first, of how not to lay out a project hierarchy, and second, of how not to handle a layout transition.
Up until Django 1.4, the more or less canonical source layout of a Django project looked something like this:
mysite/
    manage.py
    settings.py
    ...
    myapp/
        models.py
        ....
The manage.py file is basically the central hub for doing a lot of
things with (and to) your project. In Django 1.4 they decided that
manage.py should live at the top level, in a layout that now looks
like this:
manage.py
mysite/
    settings.py
    ....
The problem with this layout is that it breaks the first rule of sane
code layout: everything goes in a single directory hierarchy that
is your VCS repo. Modern VCSes manage a directory hierarchy, so you
really want to give them one. The Django 1.4 layout requires you to
invent a container directory purely so that you can put manage.py in
the same repo as mysite. This container directory has no sensible name
the way mysite did (or alternately the only sensible name is once
again 'mysite', so you have a mysite/mysite directory to confuse
everyone). By contrast the old layout was perfect for VCSes (you made
mysite the root of your VCS repo and everything was great).
The other problem is transitioning an existing project that has, of
course, set itself up with the mysite directory being the VCS repo.
In order to keep manage.py under VCS and to keep Django happy, you
get to push absolutely everything else in your repo down a level in
a massive (and completely artificial) rename changeset. Depending
on how your VCS works this may well completely screw up VCS history
and the ability to trace changes back over the discontinuity. Since
I decline to do this to myself (and to our Django-based web app), I'm instead forced into very ugly
hacks in manage.py to make it work where it is.
(Django people will say that Django is forced to do this because of
Python module handling issues. My view is that it is a mistake to make
things into modules in the first place when they are in fact not, and
mysite is not a module in any meaningful sense.)
2013-03-03
Why a netcat-like program is a good test of a language
When I talked about my first Go experience, I mentioned in passing that a netcat-like program is actually not a bad test program for a language (or for certain sorts of libraries in, eg C). Today I feel like explaining that.
To start with, it's not an empty and artificial challenge; a netcat-like
program does something meaningful and practical (although it may not
be necessary if you already have netcat). The problem itself touches
many levels of a language and its library, since it has to interact with
standard input and output, deal with command line arguments, look up
hostnames and ports, make network connections and talk over them, and
deal with buffering and byte input and output. It also involves some
level of network concurrency, either through real concurrency (as in
Go) or through the equivalent with select(),
poll(), or the like. There are also some subtle and taxing aspects
to the problem, such as shutdown(), that test whether the language
(and library) designers were paying attention or thought it worthwhile
to expose the entire underlying system API.
(In a low-level language like C you'll also wind up exploring things
like memory allocation and any safe buffer handling libraries that are
available. If you're working with select() et al you can also extend
the problem to playing around with nonblocking IO, again if the language
gives you access to this.)
Of course there are many aspects to a language and its libraries beyond relatively low level networking, so this problem doesn't come anywhere near to exploring all of a language and its libraries. Still, I've found that it covers a lot of ground that's interesting to me personally and the whole experience is a good way of seeing what the language feels like.
Some people will want to write HTTP-based test programs instead because that's more directly relevant to them. I'm the kind of cynical person who wants to see the low-level plumbing in action too, partly because I think it's more revealing of the language's core attitudes. Since the web is so pervasive and important, my feeling is that everyone doing a new language environment is going to make sure they have good HTTP support (assuming they care about such usability at all). And if a language doesn't have either high-level HTTP support or good low-level networking support, well, that tells me a lot about its priorities.
Go: when I'd extend an interface versus making a new one
One of the reddit suggestions in response to my entry on using
type assertions to reach through interfaces noted that you could
embed one interface inside another one, effectively extending the
interface that you embed, so my Closer interface could have been:
type ConnCloser interface {
    net.Conn
    CloseWrite() error
}
When I saw this my instinctive reaction was that it was wrong for my situation; since then I've spent some time thinking about why I feel that way. My conclusion is that I think I have good reasons, but I may be wrong.
Simplifying, the dividing point for me is whether all of the values I'm
dealing with would be instances of the new interface, for example if
I was writing code that only dealt with TCP and Unix stream sockets.
In that situation my life would be simpler if I immediately converted
the net.Conn values into ConnCloser values and then had the rest of
my code deal with the latter (freely calling .CloseWrite() when it
wanted to). What I'm doing is converting net.Conn values into what
they really are, which is values that have a wider interface.
But if not all of the values I'm dealing with are convertible and if
I'm only doing the conversion in one spot (and only once), extending
net.Conn doesn't feel like an accurate description of what I'm doing.
I'm just fishing through it to see if I can call another routine and
then immediately calling that routine. Using just an interface with
CloseWrite() makes my actual intentions clear.
I'd feel differently if I was passing the converted values around between
functions or storing them in something. The issue here is that such
functions don't really want to accept anything that simply has a
CloseWrite() method with the right signature; they want to deal
specifically with net.Conn values that also have that method. A bare
Closer interface that only specifies a CloseWrite() method is too
broad an allowance for what I actually mean and thus would be the wrong
approach. (At this point I start waving my hands vaguely.)
The more I think about it the less I'm sure what proper Go style
should be here, and I have to admit that part of my feelings against
ConnCloser are based purely on it having another line that doesn't do
anything in my original situation (I'm often a terseness person).