Wandering Thoughts

2019-02-15

Accumulating a separated list in the Bourne shell

One of the things that comes up over and over again when formatting output is that you want to output a list of things with some separator between them, but you don't want this separator to appear at the start or the end, or to appear at all if there is only one item in the list. For instance, suppose that you are formatting URL parameters in a tiny little shell script and you may have one or more parameters. If you have more than one parameter, you need to separate them with '&'; if you have only one parameter, the web server may well be unhappy if you stick an '&' before or after it.

(Or not. Web servers are often very accepting of crazy things in URLs and URL parameters, but one shouldn't count on it. And it just looks irritating.)

The very brute force approach to this general problem in Bourne shells goes like this:

tot=""
for i in "$@"; do
  ....
  v="var-thing=$i"
  if [ -z "$tot" ]; then
    tot="$v"
  else
    tot="$tot&$v"
  fi
done

But this is five or six lines and involves some amount of repetition. It would be nice to do better, so when I had to deal with this recently I looked into the Dash manpage to see if it's possible to do better with shell substitutions or something else clever. With shell substitutions we can condense this a lot, but we can't get rid of all of the repetition:

tot="${tot:+$tot&}var-thing=$i"

It annoys me that tot is repeated in this. However, this is probably the best all-around option in normal Bourne shell.
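
For the record, the full loop using this substitution comes out as:

tot=""
for i in "$@"; do
  ....
  tot="${tot:+$tot&}var-thing=$i"
done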

Bash has arrays, but the manpage's documentation of them makes my head hurt, and using them results in Bash-specific scripts (or at least scripts specific to shells with support for arrays). I'm also not sure if there's any simple way of doing a 'join' operation to put the array elements together with a separator between them, which is the whole point of the exercise.

(But now I've read various web pages on Bash arrays so I feel like I know a little bit more about them. Also, on joining, see this Stackoverflow Q&A; it looks like there's no built-in support for it.)
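
For what it's worth, a common Bash workaround for a single-character separator is to collect the pieces in an array and then expand the array as "${array[*]}" with IFS set to the separator. A quick sketch of that (Bash-specific, of course):

parms=()
for i in "$@"; do
  parms+=("var-thing=$i")
done
# join with '&' by expanding the array as "$*" with IFS set to '&';
# doing it inside command substitution keeps the IFS change in a subshell
tot=$(IFS='&'; echo "${parms[*]}")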

In the process of writing this entry, I realized that there is an option that exploits POSIX pattern substitution after generating our '$tot' to remove any unwanted prefix or suffix. Let me show you what I mean:

tot=""
for i in "$@"; do
  ...
  tot="$tot&var-thing=$i"
done
# remove leading '&':
tot="${tot#&}"

This feels a little bit unclean, since we're adding on a separator that we don't want and then removing it later. Among other things, that seems like it could invite accidents where at some point we forget to remove that leading separator. As a result, I think that the version using '${var:+word}' substitution is the best option, and it's what I'm going to stick with.

BourneSeparatedList written at 23:12:33; Add Comment

2019-02-06

Using a single git repo to compare things between two upstreams

The other day I wrote about hand-building an updated upstream kernel module. One of the things I wanted to do as part of that was to compare the code of the nct6775 module I wanted to build between the 4.20.x branch in the stable tree and the hwmon-next branch in Guenter Roeck's tree. In my entry, I did this by cloning each Git repo separately and then running diff by hand, but this is a little awkward and I said that there was probably a way to do this in a single Git repo. Today I worked out how to do that, and so I'm going to write it down.

To do this we need a single Git repo with both trees present in it, which means that both upstream repos need to be remotes. We can set up one as a remote simply by cloning from it:

git clone [...]/groeck/linux-staging.git

(I've chosen to start with the repo I'm theoretically going to be building from, instead of the repo I'm only using to diff against.)

Then we need to add the second repo as a remote, and fetch it:

cd linux-staging
git remote add stable [...]/stable/linux.git
git fetch stable

At this point 'git branch -r' will show us that we have all of the branches from both sides. With the data from both upstreams in our local repo and a full set of branches, we can do the full form of the diff:

git diff stable/linux-4.20.y..origin/hwmon-next drivers/hwmon/nct6775.c

We can make this more convenient by shortening one or both names, like so:

git checkout linux-4.20.y
git checkout hwmon-next

git diff linux-4.20.y.. drivers/hwmon/nct6775.c

I'm using 'git checkout' here partly as a convenient way to run 'git branch' with the right magic set of options:

git branch --track linux-4.20.y stable/linux-4.20.y

Actually checking out hwmon-next means we don't have to name it explicitly.

We can also diff against tags from the stable repo, and we get to do it without needing to say which upstream the tags are from:

git diff v4.20.6.. drivers/hwmon/nct6775.c
git diff v4.19.15.. drivers/hwmon/nct6775.c

The one drawback I know of to a multi-headed repo like this is that I'm not sure how you get rid of an upstream that you don't want any more. At one level you can just delete the remote, but that leaves various things cluttering up your repo, including both branches and tags. Presumably there is a way in Git to clean those up and then let Git's garbage collection eventually delete the actual Git objects involved and reclaim the storage.
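
My best guess at that cleanup is the following sketch, which I haven't verified end to end (the branch and tag names are just the ones from above):

# delete the local branch that tracks the stable remote
git branch -D linux-4.20.y
# delete the remote itself, which also removes its remote-tracking branches
git remote remove stable
# tags fetched from it stay behind and have to be deleted by hand
git tag -d v4.20.6 v4.19.15
# then git's garbage collection can reclaim the now-unreferenced objects
git gc --prune=now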

(One can do more involved magic by not configuring the second repo as a remote and using 'git fetch' directly with its URL, but I'm not sure how to make the branch handling work nicely and so on. Setting it up as a full remote makes all of that work, although it also pulls in all tags unless you use '--no-tags' and understand what you're doing here, which I don't.)
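
For what it's worth, a minimal sketch of that remote-less approach looks like this; the branch you fetch only shows up as FETCH_HEAD rather than as a named branch, which is part of why the branch handling is less nice:

# fetch a single branch by URL; it becomes FETCH_HEAD, nothing more
git fetch [...]/stable/linux.git linux-4.20.y
# diff from it to the currently checked out tree
git diff FETCH_HEAD.. drivers/hwmon/nct6775.c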

Looking back, all of this is relatively basic and straightforward and I think I knew most of the individual bits and pieces involved. But I'm not yet familiar and experienced enough with git to confidently put them all together on the fly when my real focus is doing something else.

(Git is one of those things that I feel I should be more familiar with than I actually am, so every so often I take a run at learning how to do another thing in it.)

GitCompareAcrossUpstreams written at 23:10:50; Add Comment

2019-01-31

What getopt package I use for option handling in my Go programs

One of Go's famous issues is that the standard library's flag package doesn't handle command line options in the normal Unix getopt style. When I started with Go this quite irritated me, and then later I gave up and decided to use flag anyway. Since then, I've changed my mind again and now all of my recent Go programs use a third party package that gives me more Unix-style option handling. There are many of these packages; the one that I settled on is github.com/pborman/getopt/v2.

(At one point I played around with github.com/spf13/cobra, but apparently it didn't stick. I think it was too complicated for the sort of small programs I usually write, which don't normally have sub-commands and so on.)

What I like about pborman/getopt is that it's straightforward to use and it gives you Unix style option handling with no fuss. I use the API format that uses existing variables, so my code tends to look like:

getopt.Flag(&oneline, 'l', "List one machine per line")
getopt.FlagLong(&help, "help", 'h', "Print this help")
[...]

The exception to this is counters, such as the usual -v:

vp := getopt.Counter('v', "Increase verbosity for some operations")

Generally I've found the API straightforward, and I like that it's relatively simple to add some additional help text after the flags using getopt.SetUsage() and so on. My programs not infrequently have some extra usage information, but not enough to call for a full separate option to dump it out; tacking it on the end of the flags is my usual approach.
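
Put together, a complete little program looks roughly like the following sketch; the specific flags and what it does with its arguments are invented for the example, and I'm going from my memory of the package's API:

package main

import (
    "fmt"
    "os"

    "github.com/pborman/getopt/v2"
)

func main() {
    var oneline, help bool

    getopt.Flag(&oneline, 'l', "List one machine per line")
    getopt.FlagLong(&help, "help", 'h', "Print this help")
    vp := getopt.Counter('v', "Increase verbosity for some operations")

    getopt.Parse()
    if help {
        getopt.Usage()
        os.Exit(0)
    }

    // getopt.Args() is whatever is left after option parsing, eg machine names.
    for _, m := range getopt.Args() {
        fmt.Printf("machine: %s (verbosity %d, one per line: %v)\n", m, *vp, oneline)
    }
}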

(This is where I point to the GoDoc for a full API reference.)

A lot of people seem to like the Kingpin package; I know that many Prometheus programs use it, for example, as does Square's certigo and acmetool. If I had complicated needs I would probably look into it.

(Like Cobra, Kingpin handles sub-commands with sub-flags and so on. You can see an example of that in certigo.)

PS: Now that I look at GoDoc, this package doesn't seem to be very popular nowadays. Things like cobra, pflag, and Kingpin are reported as being used by many more packages. If you're afraid of package rot and other problems, this may influence your choice. On the other hand, I tend to think that Unix getopt style option handling probably doesn't need much further development.

GoMyGetoptChoice written at 23:54:29; Add Comment

2019-01-28

Go 2 Generics: some features of contracts that I like

A major part of the Go 2 generics draft design is its use of contracts, which are essentially formalized function bodies, to specify constraints on what types generic functions accept. On the one hand I think that contracts today are too clever, but on the other hand I think that interfaces are not the right model for type constraints. Although the existing first Go 2 generics draft is probably dead at this point, since it's clear that the community is split, I still feel like writing down some features of contracts that I like (partly because I'd like to see them preserved in any future proposal).

The first thing is that contracts are effectively an API and are explicitly decoupled from the implementation of generic functions. This is good for all of the reasons that an API is good; it both constrains and frees the implementer, and lowers the chances that either users or implementers (or both) are quietly relying on accidental aspects of the implementation. As an API, in the best case contracts can provide straightforward documentation that is independent from a potentially complex implementation.

(I think it would be hard to provide this with the current version of contracts, due to them being too clever, but that's a separate thing.)

Because contracts have an independent existence, multiple things can all use the same contract. Because they're all using the same contract, it's guaranteed that they all accept the same types and that a type that can be used with one can be used with all. This is directly handy for methods of generic types (which might be a bit hard otherwise), but I think it's a good thing in general; it makes it both clear and easy to create a related group of generic functions that all operate on the same things, for example.

(In theory you can do this even if each implementation has a separate expression of the type constraints; you just make all of the separate type constraints be the same. But in practice this is prone to various issues and mistakes. A single contract both makes it immediately clear that everything accepts the same type and enforces that they do.)

The final thing I like about contracts is that they can explicitly use (and thus require) struct fields. This is a somewhat contentious issue, but I don't like getters and setters and I think that allowing for direct field access is more in the spirit of Go's straightforward efficiency. Perhaps a sufficiently clever compiler could inline those little getter and setter functions, but with direct struct fields you don't have to count on that.

(I also feel that direct access to struct fields is in keeping with direct access to type values, which very much should be part of any generics implementation. If you cannot write Min() and Max() generic functions that are as efficient as the non-generic versions, the generics approach is wrong.)
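
As a rough sketch of what I mean, in the draft design's syntax (so the details are approximate and this isn't compilable Go today), a Min() of this sort constrains and uses values of T directly, with no methods anywhere in sight:

contract ordered(t T) {
  t < t
}

func Min(type T ordered)(a, b T) T {
  if a < b {
    return a
  }
  return b
}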

Go2ContractsLike written at 23:39:20; Add Comment

2019-01-17

Why C uninitialized global variables have an initial value of zero

In C, uninitialized local variables are undefined but uninitialized global variables (whether static or not) are defined to start out as zero. This difference periodically strikes people as peculiar and you might wonder why C is this way. As it happens, there is a fairly simple answer.

One answer is certainly 'because the ANSI C standard says that global variables behave that way', and in some ways this is the right answer (but we'll get to that). Another answer is 'because C was documented to behave that way in "The C Programming Language" and so ANSI C had no choice but to adopt that behavior'. But the real answer is that C behaves this way because it was the most straightforward way for it to behave in Unix on PDP-11s, which was its original home.

In a straightforward compiled language like the early versions of C, all global variables have a storage location, which is to say that they have a fixed permanent address in memory. This memory comes from the operating system, and when operating systems give you memory, they don't give it to you with random contents; for good reasons they have to set it to something, and they tend to fill it with zero bytes. Early Unix was no exception, so the memory locations for uninitialized global variables were known to start out as all zero bytes. Hence early K&R C could easily and naturally declare that uninitialized global variables were zero, as they were located in memory that had been zero-filled by the operating system.

(Programs did not explicitly ask Unix for this memory. Instead, executable files simply had a field that said 'I have <X> bytes of bss', and the kernel set things up when it loaded the executable.)

The fly in the ointment for this simple situation is that there are some uncommon architectures where zero-filled memory doesn't give you zero valued variables for all types and instead the 0 value for some types has some of its bits turned on in memory. When this came up, people decided that C meant what it said; uninitialized values of these types were still zero, even though you could no longer implement this with no effort by just putting these variables in zero-filled memory. This is where 'the ANSI C standard says so' is basically the answer, although it is also really the only good answer since any other answer would make the initial value of uninitialized global variables non-portable.

(You can read more careful discussion of this on Wikipedia, and probably in many C FAQs. The comp.lang.c FAQ section 5.17 lists some architectures where null pointers are not all-bits-zero values. I suspect that there have been C compilers on architectures where floating point 0 is not all-bits-zero, although it is in IEEE 754 floating point, which pretty much everyone uses today.)

As a side note, the reason that this logic doesn't work for uninitialized local variables is that in a straightforward C implementation, they go on the stack and the stack is reused. The very first time you use a new section of stack, it's fresh memory from the operating system, so it's been zero-filled for you and your uninitialized local variables are zero, just like globals. But after that the memory has 'random' values left over from its previous use. And for various reasons you can't be sure when a section of the stack is being used for the first time.

(In a modern C environment, even completely untouched sections of the stack may not be zero. For security reasons, they may have been filled with random values or with specific 'poison' ones.)

CWhyGlobalsZeroDefault written at 00:42:22; Add Comment

2018-12-16

The Go 2 Error Handling proposal will likely lead to more use of error in return types

One of the bits of the Go 2 error handling proposal that some people dislike is that the new check keyword only works on values of type error. I've seen a number of suggestions in the wiki feedback page that widen this, making check more generally applicable. I don't have a strong opinion on this, but I do have an observation on the current approach.

One of my firm beliefs is that most programmers are strongly driven to do what their languages make easy. If a language makes something the easiest path to what programmers want, programmers will use that path, even if it's not what the path is really for and even if it requires some contortions. What the language designers feel about this or advocate doesn't really matter; people will not take the hard road just because you want them to and think they should.

In the current Go 2 error handling proposal, using a check on an error value is the easiest way to check for exceptional conditions. The obvious thing to expect here is that people will increasingly signal exceptional conditions through error values, including adding them to APIs where necessary or changing return values from other types (especially bool) to error instead. The resulting error may or may not be meaningful (in some cases it will effectively be a boolean), but it can be used with check to make exception handling easy, and that's what a lot of people will care about.

(Some of these APIs will be public, but a lot more will be internal or even the informal 'APIs' of internal functions and methods inside a package.)
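
To make this concrete, here's a hypothetical sketch in the draft proposal's check and handle syntax (so it's not real Go today, and the names and functions here are invented for the example):

// A helper that signals 'not found' with a bool; check can't be used with it,
// so callers have to write the if themselves.
func lookup(m map[string]string, key string) (string, bool) {
  v, ok := m[key]
  return v, ok
}

// The tempting rewrite: the same thing as an error that is really a boolean.
func lookup2(m map[string]string, key string) (string, error) {
  if v, ok := m[key]; ok {
    return v, nil
  }
  return "", errors.New("not found")
}

// With the draft proposal, only the second form lets a caller write this:
func describe(m map[string]string, key string) (string, error) {
  handle err { return "", err }
  v := check lookup2(m, key)
  return "value: " + v, nil
}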

Whether or not this is a good thing probably depends on your perspective. On the one hand, we're likely to see a lot more use of error rather than people making up their own error mechanisms. On the other hand, some of those uses of error will not be for actual errors and the error values won't be particularly meaningful. Some of them will really be booleans, where the only thing that matters is whether the value is nil or non-nil. This risks confusing other people who expect the error values to have some meaning or use beyond the nil check that check performs.

If people consider this change in error usage to be a bad thing, I don't think that there's any way out short of either not implementing error handling or of making check more general somehow.

(This is related to my views on the Go 2 error inspection proposal, where I also expect people to do what the proposal makes easiest, whether or not it's what the language designers want.)

Sidebar: A reason to keep check restricted to error values despite this

An error value is a clear signal that this is where you find out whether or not something has succeeded (well, mostly; CGo is an exception, and it's not really in normal Go APIs). This means that you can be very confident that checking an error for non-nil is a proper activity; you are not accidentally checking something that is not really an error signal. You can't have as much confidence in bool values or the like, and therefore there's a higher chance that programmers who are slapping check in front of functions that return bool are actually making a (tempting) mistake.

You might think that no one would ever make this sort of confusion. I disagree; exactly what is and isn't an error, and how they're signaled, is something that can easily go wrong if you're not working with something that is completely unambiguous. See, for example, how ZFS on Linux accidentally misunderstood an error return in a way that caused a kernel crash.

(In Go, a non-nil error value meaning an error is socially unambiguous, and it would become more so in a world with check in it. CGo is an unfortunate forced exception that I don't think anyone likes.)

Go2ErrorHandlingHammer written at 00:57:21; Add Comment

2018-12-07

Modern Bourne shell arithmetic is pretty pleasant

I started writing shell scripts sufficiently long ago that the only way you had to do arithmetic was to use expr. So that settled in my mind as how you had to do arithmetic in shell scripts, and since using expr is kind of painful (and it has an annoying, obscure misfeature), mostly I didn't. If I actually had to do arithmetic in a shell script I might reach for relatively heroic measures to avoid expr, like running numbers through awk. In one relatively recent occasion, I had enough calculations to do that I resorted to dc (in retrospect this was probably a mistake). However, lately I've been using shellcheck, and it's been nagging at me about my occasional use of expr, which had the effect of raising my awareness of the modern alternatives.

Today I needed to write a script that involved a decent amount of math, so I decided to finally actually use modern built-in Bourne shell arithmetic expressions for all of my calculations. The whole experience was pretty pleasant, everything worked, and the $((...)) syntax is a lot nicer and more readable than any of the other alternatives (including expr). Since $((...)) is part of the POSIX shell standard and so supported by basically anything I want to use, I'm going to both switch my habits to it and try to remember that I can use arithmetic in shell scripts now if I want to.

Since some of what I was doing in my shell script was division and percentages, I found it a little bit irksome that Bourne shell arithmetic is entirely integer arithmetic; it got in the way of writing some expressions in the natural way. For example, if you want N% of a number (where both the number and the percent are variables), you'd better not write it as:

$(( avar * (npercent/100) ))

That only works in floating point. Instead you need to restructure the expression to:

$(( (avar * npercent) / 100 ))

It's a little thing, but it was there. And since at least some Bourne shells truncate instead of round in this situation, I found myself carefully looking at every division I was doing to see how it was going to come out.
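
As a small, made-up illustration of both the syntax and the truncation issue (the variable values here are arbitrary):

#!/bin/sh
avar=7
npercent=50

# Integer arithmetic truncates: (7 * 50) / 100 is 3, not 3.5.
pct=$(( (avar * npercent) / 100 ))
echo "$npercent% of $avar is $pct"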

One thing I found both pleasant and interesting is how you don't write '$avar' in arithmetic expansion context, just 'avar'. Unlike almost everywhere else in Bourne shell syntax, here the Bourne shell treats a bare word as a variable instead of straight text. This is a completely sensible decision, because arithmetic expansion is a context where you're not going to use bare words. This context-dependent shift is a pretty clever way to partially bridge the gulf between shells and scripting languages.

(For an example of how annoying it is to make the decision the other way, see my view of where TCL went wrong.)

PS: You might ask why it took me so long to switch. Partly it's habit, partly it's that I spent a long time writing shell scripts that sometimes had to run on Solaris machines (where /bin/sh is not a POSIX shell), and partly it's that I spent a long time thinking of arithmetic in the shell as a Bash-specific feature. It was only within the past few years that it sunk in that arithmetic expressions are actually a POSIX shell feature and so it's portable and widely supported, not a Bash-ism.

(It took me a similarly long amount of time to switch to $(...) instead of `...` for command substitution, even though the former is much superior to the latter.)

BournePleasantArithmetic written at 00:47:32; Add Comment

2018-11-29

Go 2 Generics: Interfaces are not the right model for type constraints

A significant and contentious part of the Go 2 draft generics design is its method of specifying constraints on the types that implementations of generic functions can be used with. There are excellent reasons for this; as a motivating example, I will quote the overview:

[...] In general an implementation may need to constrain the possible types that can be used. For example, we might want to define a Set(T), implemented as a list or map, in which case values of type T must be able to be compared for equality. [...]

The Go 2 generics proposal adopts a method called contracts, which are basically Go function bodies with somewhat restricted contents. From the beginning, one of the most commonly proposed changes to the design has been that type constraints should instead be represented through something like Go interfaces. I believe that this would be a mistake and that interfaces are fundamentally the wrong starting point for a model of type constraints.

First, let's observe that we can't really use stock interfaces as they stand as type constraints. This is because stock interfaces are too limited; they only let us say things about a single type and those things have fixed types (whether interfaces or concrete types). Pretty much every interface-based proposal that I've seen immediately expands generic-constraint interfaces to allow for multiple type variables that are substituted into the interface. Axel Wagner's proposal is typical:

type Graph(type Node, Edge) interface {
  Nodes(Edge) []Node
  Edges(Node) []Edge
}

However, I believe that this is still the wrong model. The fundamental issue is that interfaces are about methods, but many type constraints are about types.

An interface famously does not say things about the type itself; instead, it says things about the methods provided by the type. This provides implementations of interfaces with great freedom; in a well-known example, it's possible and even simple for functions to fulfill an interface. Providing interfaces with type variables and so on does not fundamentally change this. Instead it merely expands the range and flexibility of what one can say about methods.

However, a significant amount of what people want to do with generic functions is about the types themselves, and thus wants constraints on the types. The starting example of Set(T) is an obvious one. The implementation does not care about any methods on T; it simply wants to be able to compare values of T for equality. This is completely at odds with what interfaces want to talk about. Fundamentally, a significant number of generics want to operate on types themselves. It is not just Set(T); there are also often expressed desires like Max(T), Min(T), Contains(), Sort([]T), and so on.
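
To illustrate, here is roughly what I mean in the draft design's syntax (approximate, and the details of generic types here are my guess at how they would look); the constraint on T is about '==' on values of the type itself, not about any method:

contract Equal(t T) {
  t == t
}

type Set(type T Equal) struct {
  m map[T]bool
}

func (s Set(T)) Contains(v T) bool {
  return s.m[v]
}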

A related issue is that interfaces are about a single type, while type constraints in generic functions are not infrequently going to be about constraints on the relationship between types. The Graph example from the overview is an example; it talks about two separate types, each of which is required to have a single method with a specific type signature:

contract Graph(n Node, e Edge) {
  var edges []Edge = n.Edges()
  var nodes []Node = e.Nodes()
}

In Axel Wagner's proposal, this is modified (as it has to be) into a single type that implements both methods. These two are not the same thing.

An example that combines both problems is the convertible contract from the draft design:

contract convertible(_ To, f From) {
  To(f)
}

This is expressing a constraint about the relationship between two types; From must be convertible into To. There is no single type and no methods in sight, and so expressing this in the interface model would require inventing both.

All of this is a sign that using interfaces to express type constraints is forcing a square peg into a round hole. It is not something that naturally fits the problem; it is simply something that Go already has. Interfaces would be a fine fit in a world where generics were about methods, but that is not the world that people really want; they want generics that go well beyond that. If Go 2 is to have generics, it should deliver that world and do so in a natural way that fits it.

Given my view that contracts are too clever in their current form, I'm not sure that contracts are the right answer for type constraints for generics. But I'm convinced that starting from interfaces is definitely the wrong answer.

(This entry was sparked by a discussion with Axel Wagner on Reddit, where questions from him forced me to sit down and really think about this issue instead of relying on gut feelings and excessive hand waving.)

Sidebar: Interfaces and outside types

In their current form, interfaces are limited to applying to existing methods, for good reasons. But if type constraints must be expressed through methods, what do you do when you want to apply a type constraint to a type that does not already implement the method for that constraint? When it is your own package's type, you can add the method; however, when it is either a standard type or a type from another package, you can't.

I think it's clear that you should be able to apply generic functions to types you obtain from the outside world, for example from calling other packages' functions or methods. You might reasonably wish to find the unique items from a bunch of slices that you've been given, for example, where the element type of these slices is not one of your types but comes from another package or is a standard type such as int.

(It would be rather absurd to be unable to apply Max() to int values from elsewhere, and a significant departure from current Go for the language to magically add methods to int so that it could match type constraints in interfaces.)

Go2GenericsNotWithInterfaces written at 01:02:56; Add Comment

2018-11-28

Go 2 Generics: A way to make contracts more readable for people (if not programs)

In the Go 2 generics draft design, my only real issue with contracts is their readability; contracts today are too clever. Contracts have been carefully set up to cover the type constraints that people will want to express for generics, and I think that something like contracts is a better approach to type constraints for generics than something like interfaces (but that's another entry). Can we make contracts more readable without significantly changing the current Go generics proposal? I believe we can, through use of social convention and contract embedding.

For being read and written by people, the problem with contracts today is that they do not always clearly say what they mean. Instead they say it indirectly, through inferences and implicit type restrictions. To make contracts more readable, we need to provide a way to say tricky things directly. One obvious way to do this is with a standard set of embeddable contracts that would be provided in a package in the standard library. Let's call this package generic/require (because there probably will be a standard generic package of useful generic functions and so on).

The require package would contain contracts for standard and broadly useful properties of types, such as require.Orderable(), require.String(), and require.Unsigned(). When writing contract bodies it would be standard practice to use the require versions instead of writing things out by hand yourself. For example:

contract Something(t T) {
  require.Unsigned(t)
  // instead of: 1 << t
}

Similarly, you would use 'require.Integer(t)' instead of any of the various operations that implicitly restrict t to being an integer (signed or unsigned), and require.Numeric if you just needed a number type, and so on. Of course, contracts from the require package could be used on more than direct type parameters of your contract; they could be used for return values from methods, for example.

The obvious advantage of this for the reader is that you have directly expressed what you require in a clear way. For cases like require.Integer, where there are multiple ways of implementing the same restriction, you've also made it so that people don't have to wonder if there's some reason that you picked one over the other.

It's likely that not everything in contracts would be written using things from the require package. Some restrictions on types would be considered obvious (comparability is one possible case), and we would probably find that require contracts don't make other cases any easier to read or write. One obvious set of candidates for require contracts is cases where there are multiple ways of creating the same type restriction.

(In the process the Go developers might find that there were gaps in the type restrictions that you could express easily, as exposed by it being difficult or impossible to write a require contract for the restriction. For example, I don't think there's currently any easy way to say that a type must be a signed integer type.)

Use of the require package would be as optional as the use of Go's standard formatting style, and it would be encouraged in much the same way, by the tacit social push of tools. For instance, 'gofmt -s' could rewrite your contracts to use require things instead of hand-rolled alternatives, and golint or even vet could complain that 'you should use require.<X> instead of ...'. Faced with this push towards a standard and easy way of expressing type restrictions, I believe that most people would follow it, much as gofmt has been accepted and become essentially universal.

An open question is if this would add enough more readability to significantly improve contracts as they stand. I'm not sure that a number of the trickier restrictions in the full draft design could be implemented in this sort of reusable contract restriction in a way that both is usable and is amenable to a good, clear name. People may also disagree over the relative readability of, say:

contract stringer(x T) {
   var s string = x.String()
   // equivalent:
   require.String(x.String())
   // or perhaps:
   require.Returns(x.String, string)
}

At a certain point it feels like one would basically be spelling out phrases and sentences in the form of require package contracts. I'm not sure that's a real improvement, or if a more fundamental change to how contracts specify requirements is needed for true readability improvements.

Sidebar: Augmenting the language through the back door

The require package would also provide a place to insert special handling of type restrictions that can't be expressed in normal Go, in much the way that the unsafe package is built in to the compiler (along with various other things). People might find this too awkward for some sorts of restrictions, though, since using require has to at least look like normal Go and even then, certain sorts of potentially implementable things might be too magical.

Go2ContractsMoreReadable written at 02:01:13; Add Comment

2018-11-16

Go 2 Generics: Contracts are too clever

The biggest and most contentious thing in the Go 2 Draft Designs is its proposal for generics. Part of the proposal is a way to specify things about the types that generic functions can be used on, in the form of contracts. Let me quote from the overview:

[...] In general an implementation may need to constrain the possible types that can be used. For example, we might want to define a Set(T), implemented as a list or map, in which case values of type T must be able to be compared for equality. To express that, the draft design introduces the idea of a named contract. A contract is like a function body illustrating the operations the type must support. For example, to declare that values of type T must be comparable:

contract Equal(t T) {
    t == t
}

This is a clever trick (among other things it basically avoids adding any new syntax to Go to define contracts), but my view is that it is in fact too clever. Using function bodies with operations in them to define the valid types is essentially prioritizing minimal additions to the language and the compiler over everything else. The result is easy for the compiler to use (it just runs types through contract bodies in a regular typechecking process) but provides generally bad to terrible ergonomics for everything else.

In practice, contracts will be consumed and used by far more than the compiler. People will read contracts to understand error messages about 'your type doesn't satisfy this contract', to understand what they need to do to create types that can be used with generic functions they're interested in, to see how to specify something for their own generic functions, to try to understand mistakes that they made in their own contracts, and to understand what a contract actually requires (as opposed to what the code comments claim it requires, because we all know what happens to code comments eventually). IDE systems will read contracts so they can figure out what types can be suggested as type parameters for generic functions. Code analysis tools will read contracts for all sorts of reasons, including spotting unnecessary requirements that the implementation doesn't actually need.

All of these parties and all of these purposes are badly served by contracts which require you to understand the language and its subtle implications at a deep level. For instance, take this contract:

contract Example(t T) {
    t.Fetch()[:5]
    t.AThing().Len()
}

The implications of both lines are reasonably subtle; the first featured in a quiz by Dave Cheney, and the second constrains both what sort of value AThing() returns and what the .Len() method is defined on, as covered as part of addressable values in Go. I would maintain that neither is obvious to a person (certainly they aren't to me without careful research), and they might or might not be easily understood by code (certainly they're the kind of corner cases that code could easily be wrong on).

Part of the bad ergonomics of contracts here is that they look simple and straightforward while hiding subtle traps in the fine details. I've illustrated one version of that here, and the detailed draft design actually shows several other examples. I could go on with more examples (consider the wildly assorted type requirements of various arithmetic operators, for example).

Go contracts do not seem designed to be read by anything but the compiler. This is a mistake; computer languages are in large part about communicating with other people, including your future self, not with the computer. If we're in doubt, we should bias language design toward being clearly and easily read by people, not toward the compiler. Go generally has this bias, preferring clean communication over excessive cleverness. The current version of contracts seem quite at odds with this.

(This is especially striking because other parts of the Go 2 Draft Designs are about making Go clearer for people. The error handling proposal, for example, is all about making Go code read better by hiding repetitive patterns.)

PS: The Go team may have the view that only a few people will ever create generics and thus contracts, and mostly everyone else will read the excellent documentation that these very few people have carefully written. I think that the Go team is quite wrong here; I expect to see generics sprout like mushrooms across Go codebases, at least for the first few years. In general I would argue that if generics are successful, we can expect to see them widely used, which means that many generics and contracts will not be written or read by people who are deep experts with Go. A necessary consequence of this wide use will be that some amount of it will be with contracts and code created partly through superstition in various forms.

Go2ContractsTooClever written at 01:04:00; Add Comment
