States in a state machine aren't your only representation of state

November 16, 2014

I think in terms of state machines a lot; they're one of my standard approaches to problems and I wind up using them quite a bit. I've recently been planning out a multi-threaded program that has to coordinate back and forth between threads as they manipulate the state of host authentication. At first I had a simple set of states, then I realized that these simple states only covered the main flow of events and needed to be more and more complicated, and then I had a blinding realization:

Not all state needs to be represented as state machine states.

When you have a state machine it is not so much tempting as obvious to represent every variation in the state of your entities as another state machine state. But if you do this, then like me you may wind up with an explosion of states, many of which are extremely similar to each other and more or less handled the same way. This isn't necessary. Instead, it's perfectly sensible to represent certain things as flags, additional or detailed status fields, or the like. If you want to mark something as going to be deleted once it's unused, there's no need to add new states to represent this if you can just add a flag. If you have three or four different ways for something to fail and they all lead to basically the same processing, well, you don't need three or four different states for 'failed X way'; you can have just one 'failed' state and then another field with the details of why.

Off the top of my head now, I think that states are best for things that have a different flow of processing (ideally a significantly different flow). The more both the origin state and the processing of two 'states' resembles each other, the less they need to be separate states and the more the difference can be captured in a different variable or field (and then handled in the code with only some ifs).

(On the other hand, if two different states were handled the same way but came from different origin states and transitioned to different destination states, I think I'd definitely keep them as separate states and just share the common code somehow. This would preserve the clarity of state flow in the system. Although if two separate states needed exactly the same handling in the code, I might think I was overlooking something about the states in general.)


Comments on this page:

This seems to fall somewhere under the general banner of “capture similarities in code, differences in data”.

By Jamie McRae at 2014-11-19 06:21:05:

Even in hardware protocol design, where state is to be expressed in a small finite number of bits, this is common.

Normally, there's a state machine (possibly multiple state machines), plus additional timers, counters, etc. For example, consider the ethernet transmit backoff algorithm. It retries ten times, then fails the transmission and kicks a collision error upstairs.

While you could have ten states for this, if the retransmit operation itself is a small state machine, you end up with a combinatorial exposion of states. Better to have one retransmit state machine with an if (++counter > 10) condition.

To work properly in the larger state machine, define a set of states in which the additional state is valid (timer active, counter counting) and ensure that it's properly initialized on all transitions into the set (and the timer is deactivated on all transitions out).

Written on 16 November 2014.
« Our current problems with 10G Intel networking on OmniOS
We've started to decommission our Solaris 10 fileservers »

Page tools: View Source, View Normal, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Sun Nov 16 01:14:43 2014
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.