Chris's Wiki :: blog/techhttps://utcc.utoronto.ca/~cks/space/blog/tech/?atomDWiki2024-03-24T03:08:33ZRecently changed pages in Chris's Wiki :: blog/tech.tag:cspace@cks.mef.org,2009-03-24:/blog/tech/DNSIpLookupsManyPossibilitiescks<div class="wikitext"><p>One of the things that you can do with the <a href="https://en.wikipedia.org/wiki/Domain_Name_System">DNS</a> is ask it to
give you the DNS name for an IP address, in what is called a <a href="https://en.wikipedia.org/wiki/Reverse_DNS_lookup">reverse
DNS lookup</a>. A
full and careful reverse DNS lookup is more complex than it looks
and has more possible results than you might expect. As a result,
it's common for system administrators to talk about <em>validated
reverse DNS lookups</em> versus plain or unvalidated reverse DNS lookups.
If you care about the results of the reverse DNS lookup, you want
to validate it, and this validation is where most of the extra
results come into play.</p>
<p>(To put the answer first, a validated reverse DNS lookup is one
where the name you got from the reverse DNS lookup also exists in
DNS and lists your initial IP address as one of its IP addresses.
This means that the organization responsible for the name agrees
that this IP is one of the IPs for that name.)</p>
<p>The result of a plain reverse DNS lookup can be zero, one, or even
many names, or a timeout (which is in effect zero results but which
takes much longer). Returning more than one name from a reverse DNS
lookup is uncommon and some APIs for doing this don't support it
at all, although DNS does. However, you cannot trust the name or
names that result from reverse DNS, because reverse DNS lookups are
done using a completely different set of <a href="https://en.wikipedia.org/wiki/DNS_zone">DNS zones</a> than domain names use, and
as a result can be controlled by a completely different person or
organization. I am not Google, but I can make reverse DNS for an
IP address here claim to be a Google hostname.</p>
<p>(Even within an organization, people can make mistakes with their
reverse DNS information, precisely because it's less used than the
normal (forward) DNS information. If you have a hostname that
resolves to the wrong IP address, people will notice right away;
if you have an IP address that resolves to the wrong name, people
may not notice for some time.)</p>
<p>So for each name you get in the initial reverse DNS lookup, there are a
number of possibilities:</p>
<ul><li>The name is actually an (IPv4, generally) IP address in text form.
People really do this even if they're not supposed to, and your DNS
software probably won't screen these out.<p>
</li>
<li><a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/ReverseDNSCleverness">The name is the special DNS name used for that IP address's
reverse DNS lookup</a> (or at
least some IP's lookup). It's possible for such names to also
have IP addresses, and so you may want to explicitly screen
them out and not consider them to be validated names.<p>
</li>
<li>The name is for a private or non-global name or zone. People do
sometimes leak internal DNS names into reverse DNS records for
public IPs.</li>
<li>The name is for what should be a public name but it doesn't exist
in the DNS, or it doesn't have any IP addresses associated with it
in a forward lookup.<p>
In both of these cases we can say the name is <em>unknown</em>. If you
don't treat 'the name is an IP address' specially, such a name
will also turn up as unknown here if you make a genuine DNS query.<p>
</li>
<li>The name exists in DNS with IP addresses, but the IP address you
started with is not among the IP addresses returned for it in a
forward lookup. We can say that the name is <em>inconsistent</em>.<p>
</li>
<li>The name exists in DNS with IP addresses, and one of those IP
addresses is the IP address you started with. The name is
<em>consistent</em> and the reverse DNS lookup is <em>valid</em>; the IP address
you started with is really called that name.</li>
</ul>
<p>(<a href="https://utcc.utoronto.ca/~cks/space/blog/programming/ReverseDNS">There may be a slight bit of complexity in doing the forward
DNS lookup</a>.)</p>
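<p>(As an illustration of the whole process, here is a minimal sketch of
a validated reverse DNS lookup in Python, using only the standard library.
The function name is mine and real code would want better IPv6 address
normalization, but it shows the reverse lookup, the screening out of
'names' that are really IP addresses, and the forward check.)</p>
<pre>
import ipaddress
import socket

def validated_reverse_dns(ip: str) -> list[str]:
    """Return the names from a reverse DNS lookup of 'ip' that also
    resolve back (forward) to that same IP address."""
    try:
        primary, aliases, _ = socket.gethostbyaddr(ip)
    except (socket.herror, socket.gaierror):
        return []          # no reverse DNS data (or a lookup failure)
    valid = []
    for name in [primary] + aliases:
        # Screen out 'names' that are really IP addresses in text form;
        # some forward-lookup APIs will happily hand them right back.
        try:
            ipaddress.ip_address(name)
            continue
        except ValueError:
            pass
        try:
            addrs = {ai[4][0] for ai in socket.getaddrinfo(name, None)}
        except socket.gaierror:
            continue       # the name is 'unknown': no forward DNS data
        if ip in addrs:    # consistent: the forward lookup includes our IP
            valid.append(name)
    return valid
</pre>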
<p>If a reverse DNS lookup for an IP address gave you more than one
name, you may only care whether there is one valid name (which gives
you a name for the IP), you may want to know all of the valid names,
or you may want to check that all names are valid and consider it
an error if any of them aren't. It depends on why you're doing the
reverse DNS lookup and validation. And you might also care about
why a name doesn't validate for an IP address, or that an IP address
has no reverse DNS lookup information.</p>
<p>Of course if you're trying to find the name for an IP address, you
don't necessarily have to use a reverse DNS lookup. In some sense,
the 'name' or 'names' for an IP address are whatever DNS names point
to it as (one of) their IP address(es). If you have an idea what
those names might be, you can just directly check them all to see if
you find the IP you're curious about.</p>
<p>If you're writing code that validates IP address reverse DNS lookups,
one reason to specifically check for and care about a name that is
an IP address is that some languages have 'name to IP address' APIs
that will helpfully give you back an IP address if you give them
one in text form. If you don't check explicitly, you can look up an
IP address, get the IP address in text form, feed it into such an API,
get the IP address back again, and conclude that this is a validated
(DNS) name for the IP. </p>
<p>It's extremely common for IP addresses to have names that are unknown
or inconsistent. It's also pretty common for IP addresses to not
have any names, and not uncommon for reverse DNS lookups to time
out because the people involved don't operate DNS servers that
return timely answers (for one reason or another).</p>
<p>PS: It's also possible to find out who an IP address theoretically
belongs to, but that's an entirely different discussion (or several
of them). Who an IP address belongs to can be entirely separate
from what its proper name is. For example, in common colocation
setups and VPS services, the colocation provider or VPS service
will own the IP, but its proper name may be a hostname in the
organization that is renting use of the provider's services.</p>
</div>
The many possible results of turning an IP address into a 'hostname'2024-03-24T03:08:33Z2024-03-24T03:07:31Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/SSDsUnderstandingDramlesscks<div class="wikitext"><p>Over on the Fediverse, <a href="https://mastodon.social/@cks/112124784351423486">I grumbled about trying to find SATA SSDs for
server OS drives</a>:</p>
<blockquote><p>Trends I do not like: apparently approximately everyone is making
their non-Enterprise ($$$) SATA SSDs be kind of terrible these days,
while everyone's eyes are on NVMe. We still use plenty of SATA SSDs
in our servers and we don't want to get stuck with terrible slow
'DRAM-less' (QLC) designs. But even reputable manufacturers are
nerfing their SATA SSDs into these monsters.</p>
</blockquote>
<p>(By the '(QLC)' bit I meant SATA SSDs that were both DRAM-less and
used <a href="https://en.wikipedia.org/wiki/Quad-level_cell#Quad-level_cell">QLC flash</a>,
which is generally not as good as other flash cell technology but
is apparently cheaper. The two don't have to go together, but if
you're trying to make a cheap design you might as well go all the
way.)</p>
<p>In a reply to that post, <a href="https://mastodon.social/@cesarb/112125110964541395">@cesarb noted that the SSD DRAM is most
important for caching internal metadata</a>, and shared
links to <a href="https://sabrent.com/blogs/storage/dram-hmb">Sabrent's "DRAM & HMB"</a> and <a href="https://phisonblog.com/host-memory-buffer-2/">Phison's "NAND
Flash 101: Host Memory Buffer"</a>, both of which cover
this issue from the perspective of NVMe SSDs.</p>
<p>All SSDs need to use (and maintain) metadata that tracks things
like where logical blocks are in the physical flash, what parts of
physical flash can be written to right now, and how many writes
each chunk of flash has had for wear leveling (since flash can only
be written to so many times). The master version of this information
must be maintained in flash or other durable storage, but an old
fashioned conventional SSD with DRAM had some amount of DRAM that
was used in large part to cache this information for fast access
and perhaps fast bulk updating before it was flushed to flash. A
DRAMless SSD still needs to access and use this metadata, but it
can only hold a small amount of it in the controller's internal
memory, which means it must spend more time reading and re-reading
bits of metadata from flash and may not have as comprehensive a
view of things like wear leveling or the best ready-to-write flash
space.</p>
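<p>(To put a rough number on why this metadata can't fit in a controller's
own memory, here is some back of the envelope arithmetic. The 4 KiB mapping
granularity and 4-byte entries are assumptions for illustration, not
anything specific to a real controller.)</p>
<pre>
# Rough size of a flat logical-to-physical mapping table for an SSD,
# assuming 4 KiB mapping granularity and 4-byte entries (both assumptions).
def map_table_bytes(capacity_bytes: int, page: int = 4096, entry: int = 4) -> int:
    return (capacity_bytes // page) * entry

one_tb = 10**12
print(round(map_table_bytes(one_tb) / 2**20), "MiB of mapping for a 1 TB drive")
# Roughly 930 MiB. A drive with DRAM can cache most of that; a DRAM-less
# controller with only a few MiB of SRAM must keep re-reading pieces of
# the map from flash as the workload's access pattern moves around.
</pre>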
<p>Because they're PCIe devices, DRAMless NVMe SSDs can borrow some
amount of host RAM from the host (your computer), much like some or
perhaps all integrated graphics 'cards' (which are also nominally
PCIe devices) borrow host RAM to use for GPU purposes (the NVMe
"Host Memory Buffer (HMB)" of the links). This option isn't available
to SATA (or SAS) SSDs, which are entirely on their own. The operating
system generally caches data read from disk and will often buffer data
written before sending it to the disk in bulk, but it can't help with
the SSD's internal metadata.</p>
<p>(DRAMless NVMe drives with a HMB aren't out of the woods, since I
believe the HMB size is typically much smaller than the amount of
DRAM that would be on a good NVMe drive. There's an interesting
looking academic article from 2020, <a href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0229645">HMB in DRAM-less NVMe SSDs:
Their usage and effects on performance</a>
(<a href="https://doi.org/10.1371/journal.pone.0229645">also</a>).)</p>
<p>How much the limited amount of metadata affects the drive's performance
depends on what you're doing, based on both anecdotes and Sabrent's
and Phison's articles. It seems that the more internal metadata
whatever you're doing needs, the worse off you are. The easily
visible case is widely distributed random reads, where a DRAMless
controller will apparently spend a visible amount of time pulling
metadata off the flash in order to find where those random logical
blocks are (enough so that it clearly affects SATA SSD latency, per
the Sabrent article). Anecdotally, some DRAMless SATA SSDs can
experience terrible write performance under the right (or wrong)
circumstances and actually wind up performing worse than HDDs.</p>
<p>Our typical server doesn't need much disk space for its system disk
(well, the mirrored pair that we almost always use); even a generous
Ubuntu install barely reaches 30 GBytes. With automatic weekly
<a href="https://en.wikipedia.org/wiki/Trim_(computing)">TRIMs</a> of all
unused space (<a href="https://utcc.utoronto.ca/~cks/space/blog/linux/LinuxBlockDiscardInPractice">cf</a>), the
SSDs will hopefully easily be able to find free space during writes
and not feel too much metadata pressure then, and random reads will
hopefully mostly be handled by Linux's in RAM disk cache. So I'm
willing to believe that a competently implemented DRAMless SATA SSD
could perform reasonably for us. One of the problems with this
theory is finding such a 'competently implemented' SATA SSD, since
the reason that SSD vendors are going DRAMless on SATA SSDs (and
even NVMe drives) is to cut costs and corners. A competent, well
performing implementation is a cost too.</p>
<p>PS: I suspect there's no theoretical obstacle to <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/ServerNVMeU2U3AndOthers2022">a U.2 form factor
NVMe drive</a> being DRAMless and using
a Host Memory Buffer over its PCIe connection. In practice U.2
drives are explicitly supposed to be hot-swappable and I wouldn't
really want to do that with a HMB, so I suspect DRAM-less NVMe
drives with HMB are all M.2 in practice.</p>
<p>(I also have worries about how well the HMB is protected from stray
host writes to that RAM, and how much the NVMe disk is just trusting
that it hasn't gotten corrupted. Corrupting internal flash metadata
through OS faults or other problems seems like a great way to have
a very bad day.)</p>
</div>
About DRAM-less SSDs and whether that matters to us2024-03-20T03:16:42Z2024-03-20T03:15:41Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/WriteBufferingAndSyncscks<div class="wikitext"><p>Pretty much every modern system defaults to having data you write
to filesystems be buffered by the operating system and only written
out asynchronously or when you specially request for it to be flushed
to disk, which gives you <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/WriteBufferingHowMuch">general questions about how much write
buffering you want</a>. Now suppose, not
hypothetically, that you're doing write IO that is pretty much
always going to be specifically flushed to disk (with <code>fsync()</code> or
the equivalent) before the programs doing it consider this write
IO 'done'. You might get this situation where you're writing and
rewriting mail folders, or where the dominant write source is
updating a <a href="https://en.wikipedia.org/wiki/Write-ahead_logging">write ahead log</a>.</p>
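<p>(A minimal sketch of this write pattern, where a write only counts as
done once it has been flushed to disk; the file name here is made up.)</p>
<pre>
import os

def append_record(log_fd: int, record: bytes) -> None:
    # The record only counts as 'done' once fsync() returns, so the
    # program's effective write rate is bounded by the disk, no matter
    # how much the OS is willing to buffer on its behalf.
    os.write(log_fd, record)
    os.fsync(log_fd)

fd = os.open("wal.log", os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
append_record(fd, b"some log entry\n")
</pre>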
<p>In this situation where the data being written is almost always
going to be flushed to disk, I believe the tradeoffs are a bit
different than in <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/WriteBufferingHowMuch">the general write case</a>.
Broadly, you can never actually write at a rate faster than the
write rate of the underlying storage, since in the end you have to
wait for your write data to actually get to disk before you can
proceed. I think this means that you want the OS to start writing
out data to disk almost immediately as your process writes data;
delaying the write out will only take more time in the long run,
unless for some reason the OS can write data faster when you ask
for the flush than before then. In theory and in isolation, you may
want these writes to be asynchronous (up until the process asks for
the disk flush, where you have to synchronously wait for them),
because the process may be able to generate data faster if it's not
stalling waiting for individual writes to make it to disk.</p>
<p>(In OS tuning jargon, we'd say that you want writeback to start
almost immediately.)</p>
<p>However, journaling filesystems and concurrency add some extra
complications. Many journaling filesystems have the journal as a
central synchronization point, where only one disk flush can be in
progress at once and if several processes ask for disk flushes at
more or less the same time they can't proceed independently. If you
have multiple processes all doing write IO that they will eventually
flush and you want to minimize the latency that processes experience,
you have a potential problem if different processes write different
amounts of IO. A process that asynchronously writes a lot of IO and
then flushes it to disk will obviously have a potentially long
flush, and this flush will delay the flushes done by other processes
writing less data, because everything is running through the
chokepoint that is the filesystem's journal.</p>
<p>In this situation I think you want the process that's writing a lot
of data to be forced to delay, to turn its potentially asynchronous
writes into more synchronous ones that are restricted to the true
disk write data rate. This avoids having a large overhang of pending
writes when it finally flushes, which hopefully avoids other processes
getting stuck with a big delay as they try to flush. Although it
might be ideal if processes with less write volume could write
asynchronously, I think it's probably okay if all of them are forced
down to relatively synchronous writes with all processes getting an
equal fair share of the disk write bandwidth. Even in this situation
the processes with less data to write and flush will finish faster,
lowering their latency.</p>
<p>To translate this to typical system settings, I believe that you
want to aggressively trigger disk writeback and perhaps deliberately
restrict the total amount of buffered writes that the system can
have. Rather than allowing multiple gigabytes of outstanding buffered
writes and deferring writeback until a gigabyte or more has
accumulated, you'd set things to trigger writebacks almost immediately
and then force processes doing write IO to wait for disk writes to
complete once you have more than a relatively small volume of
outstanding writes.</p>
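<p>(On Linux, the knobs for this are things like
<code>vm.dirty_background_bytes</code> and <code>vm.dirty_bytes</code>.
Here is a sketch of setting them; the sysctls are real but the specific
values are only illustrative, not recommendations.)</p>
<pre>
# A sketch of aggressive writeback settings on Linux (run as root).
# The sysctls are real; the specific values here are just illustrative.
settings = {
    "dirty_background_bytes": 64 * 1024 * 1024,   # start writeback at 64 MiB
    "dirty_bytes": 256 * 1024 * 1024,             # stall writers past 256 MiB
}
for name, value in settings.items():
    with open(f"/proc/sys/vm/{name}", "w") as f:
        f.write(str(value))
</pre>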
<p>(This is in contrast to typical operating system settings, which
will often allow you to use a relatively large amount of system RAM
for asynchronous writes and not aggressively start writeback. This
especially would make a difference on systems with a lot of RAM.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/WriteBufferingAndSyncs?showcomments#comments">6 comments</a>.) </div>Disk write buffering and its interactions with write flushes2024-03-18T01:59:40Z2024-03-18T01:59:25Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/ServerCPUDensityAndRAMLatencycks<div class="wikitext"><p>When I wrote about <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/ServersSpeedOfChangeDown">how the speed of improvement in servers may
have slowed down</a>, I didn't address CPU
core counts, which is one area where the numbers have been going
up significantly. Of course you have to keep those cores busy, but
if you have a bunch of CPU-bound workloads, the increased core count
is good for you. Well, it's good for you if your workload is genuinely
CPU bound, which generally means it fits within per-core caches.
One of the areas I don't know much about is how the increasing CPU
core counts interact with RAM latency.</p>
<p>RAM latency (for random requests) has been relatively flat for a
while (it's been flat in time, which means that it's been going up
in cycles as CPUs got faster). Total memory access latency has
apparently been 90 to 100 nanoseconds for several memory generations
(although <a href="https://en.wikipedia.org/wiki/DDR5_SDRAM">individual DDR5 memory module access is apparently only
part of this</a>, <a href="https://www.crucial.com/articles/about-memory/everything-about-ddr5-ram">also</a>).
Memory bandwidth has been going up steadily between the DDR
generations, so per-core bandwidth has gone up nicely, but this is
only nice if you have the kind of sequential workloads that benefit
from it. As far as I know, the kind of random access that you get
from things like pointer chasing is all dependent on latency.</p>
<p>(If the total latency has been basically flat, this seems to imply
that bandwidth improvements don't help too much. Presumably they
help for successive non-random reads, and my vague impression is
that reading data from successive addresses from RAM is faster than
reading random addresses (and not just because RAM typically transfers
an entire cache line to the CPU at once).)</p>
<p>So now we get to the big question: how many memory reads can you
have in flight at once with modern DDR4 or DDR5 memory, especially
on servers? Where the limit is presumably matters since if you have
a bunch of pointer-chasing workloads that are limited by 'memory
latency' and you run them on a high core count system, at some point
it seems that they'll run out of simultaneous RAM read capacity.
I've tried to do some reading and gotten confused, which may be
partly because modern DRAM is a pretty complex thing.</p>
<p>(I believe that individual processors and multi-socket systems have
some number of memory channels, each of which can be in action
simultaneously, and then there are <a href="https://en.wikipedia.org/wiki/Memory_rank">memory ranks</a> (<a href="https://www.crucial.com/support/articles-faq-memory/what-is-a-memory-rank">also</a>)
and <a href="https://en.wikipedia.org/wiki/Memory_bank">memory banks</a>. How
many memory channels you have depends partly on the processor you're
using (well, its memory controller) and partly on the motherboard
design. For example, 4th generation AMD Epyc processors apparently
support 12 memory channels, although not all of them may be populated
in a given memory configuration (<a href="https://www.phoronix.com/review/ddr5-epyc-9004-genoa">cf</a>). I think
you need at least N (or maybe 2N) DIMMs for N channels. And <a href="https://chipsandcheese.com/2022/11/08/amds-zen-4-part-2-memory-subsystem-and-conclusion/">here's
a look at AMD Zen4 memory stuff</a>,
which doesn't seem to say much on multi-core random access latency.)</p>
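<p>(One way to get a very rough upper bound on 'reads in flight' is Little's
law: concurrency equals bandwidth times latency. The figures below, 12
channels of DDR5-4800 and a flat 100 nanosecond latency, are assumptions
taken from the numbers above, not measurements.)</p>
<pre>
# Little's law: requests in flight = throughput * latency.
# Assumed figures: 12 channels of DDR5-4800 (about 38.4 GB/s per channel),
# a flat ~100 ns total access latency, and 64-byte cache lines.
channels = 12
per_channel_bw = 4800e6 * 8          # bytes/second per channel
latency = 100e-9                     # seconds
line = 64                            # bytes moved per random read

lines_in_flight = channels * per_channel_bw * latency / line
print(round(lines_in_flight))        # ~720 cache lines outstanding

# This is only an upper bound from bandwidth and latency; banks, ranks,
# and memory controller queue depths impose their own limits.
</pre>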
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/ServerCPUDensityAndRAMLatency?showcomments#comments">2 comments</a>.) </div>Something I don't know: How server core count interacts with RAM latency2024-03-03T03:55:27Z2024-03-03T03:54:58Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/ServersSpeedOfChangeDowncks<div class="wikitext"><p>One of the bits of technology news that I saw recently was that AWS
was changing how long it ran servers, from five years to six years.
Obviously one large motivation for this is that it will save Amazon
a nice chunk of money. However, I suspect that one enabling factor
for this is that old servers are more similar to new servers than
they used to be, as part of what could be called the great slowdown
in computer performance improvement.</p>
<p>New CPUs and to a lesser extent memory are somewhat better than
they used to be, both on an absolute measure and on a performance
per watt basis, but the changes aren't huge the way they used to
be. SATA SSD performance has been more or less stagnant for years;
NVMe performance has improved, but from a baseline that was already
very high, perhaps higher than many workloads could take advantage
of. Network speeds are potentially better but it's already hard to
truly take advantage of 10G speeds, especially with ordinary workloads
and software.</p>
<p>(I don't know if SAS SSD bandwidth and performance has improved,
although raw SAS bandwidth has and is above what SATA can provide.)</p>
<p>For both AWS and people running physical servers (like <a href="https://support.cs.toronto.edu/">us</a>) there's also the question of how
many people need faster CPUs and more memory, and related to that,
how much they're willing to pay for them. It's long been observed
that a lot of what people run on servers is not a voracious consumer
of CPU and memory (and IO bandwidth). If your VPS runs at 5% or 10%
CPU load most of the time, you're probably not very enthused about
paying more for a VPS with a faster CPU that will run at 2.5% almost
all of the time.</p>
<p>(Now that I've written this it strikes me that this is one possible
motivation for cloud providers to push 'function as a service'
computing, because it potentially allows them to use those faster
CPUs more effectively. If they're renting you CPU by the second and
only when you use it, faster CPUs likely mean more people can be
packed on to the same number of CPUs and machines.)</p>
<p><a href="https://support.cs.toronto.edu/">We</a> have a few uses for very
fast single-core CPU performance, but other than those cases (and
<a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/SlurmHowWeUseIt">our compute cluster</a>) it's hard to
identify machines that could make much use of faster CPUs than they
already have. It would be nice if <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/ZFSFileserverSetupIII">our fileservers</a> had <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/ServerNVMeU2U3AndOthers2022">U.2 NVMe drives</a> instead of SATA SSDs but I'm not sure
we'd really notice; the fileservers only rarely see high IO loads.</p>
<p>PS: It's possible that I've missed important improvements here
because I'm not all that tuned in to this stuff. One possible area
is PCIe lanes directly supported by the system's CPU(s), which
enable all of those fast NVMe drives, multiple 10G or faster network
connections, and so on.</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/ServersSpeedOfChangeDown?showcomments#comments">2 comments</a>.) </div>The speed of improvement in servers may have slowed down2024-03-01T03:44:15Z2024-03-01T03:43:13Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/OpenSourceCultureAndPublicWorkcks<div class="wikitext"><p>A while back I wrote about how <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/RequirementToScaleYourWork">doing work that scales requires
being able to scale your work</a>, which
in the open source world requires time, energy, and the willingness
to engage in the public sphere of open source regardless of the
other people there and your reception. Not everyone has this sort
of time and energy, and not everyone gets a positive reception by
open source projects even if they have it.</p>
<p>This view runs deep in open source culture, which valorizes public
work even at the cost of stress and time. Open source culture on
the one hand tacitly assumes that everyone has those available, and
on the other hand assumes that if you don't do public work (for
whatever reason), you are less virtuous or not virtuous at all.
To be a virtuous person in open source is to contribute publicly
at the cost of your time, energy, stress, and perhaps money, and
to not do so is to not be virtuous (sometimes this is phrased as
'not being dedicated enough').</p>
<p>(Often the most virtuous public contribution is 'code', so people who
don't program are already intrinsically not entirely virtuous and lesser
no matter what they do.)</p>
<p>Open source culture has some reason to praise and value 'doing work
that scales', public work; if this work does not get done, nothing
happens. But it also has a tendency to demand that everyone do it and
to judge them harshly when they don't. This is the meta-cultural issue
behind things like <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/BugReportExperienceObligation">the cultural expectations that people will file bug
reports</a>, often no matter what the bug
reporting environment is like or if filing bug reports does any good
(<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/BugReportBenefit">cf</a>).</p>
<p>I feel that this view is dangerous for various reasons, including
because it blinds people to other explanations for a lack of public
contributions. If you can say 'people are not contributing because
they're not virtuous' (or not dedicated, or not serious), then you
don't have to take a cold, hard look at what else might be getting
in the way of contributions. Sometimes such a cold hard look might
turn up rather uncomfortable things to think about.</p>
<p>(Not every project wants or can handle contributions, because they
generally require work from existing project members. But not all
such projects will admit up front in the open that they either don't
want contributions at all or they gatekeep contributions heavily
to reduce time burdens on existing project members. And part of
that is probably because openly refusing contributions is in itself
often seen as 'non-virtuous' in open source culture.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/OpenSourceCultureAndPublicWork?showcomments#comments">One comment</a>.) </div>Open source culture and the valorization of public work2024-02-26T21:43:52Z2024-02-26T04:21:12Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/DesktopECCOptions2024cks<div class="wikitext"><p>A traditional irritation with building (or specifying) desktop
computers is <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/UseECCIrritation">the issue of ECC RAM</a>, which for
a long time was either not supported at all or <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/IntelCPUSegmentationIrritation">was being used by
Intel for market segmentation</a>.
First generation AMD Ryzens sort of supported ECC RAM with the right
motherboard, but <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/ECCRAMSupportLevels">there are many meanings of 'supporting' ECC RAM</a> and questions lingered about how meaningful
the support was (<a href="https://utcc.utoronto.ca/~cks/space/blog/linux/AMDWithECCKernelMessages">recent information suggests the support was real</a>). Here in early 2024 the situation
is somewhat better and I'm going to summarize what I know so far.</p>
<p>The traditional option to getting ECC RAM support (along with a
bunch of other things) was to buy a 'workstation' motherboard that
was built to support Intel Xeon processors. These were available
from a modest number of vendors, such as SuperMicro, and were
generally not inexpensive (and then you had to buy the Xeon). If
you wanted a pre-built solution, vendors like Dell would sell you
desktop Xeon-based workstation systems with ECC RAM. You can still
do this today.</p>
<p>Update: I forgot AMD Threadripper and Epyc based systems, which you
can get motherboards for and build desktop systems around. I think
these are generally fairly expensive motherboards, though.</p>
<p>Back in 2022, Intel introduced their <a href="https://en.wikipedia.org/wiki/LGA_1700#Alder_Lake_chipsets_(600_series)">W680 desktop chipset</a>.
One of the features of this chipset is that it officially supported
ECC RAM with 12th generation and later (so far) Intel CPUs (or at
least apparently the non-F versions), along with official support
for memory overclocking (and CPU overclocking), which enables faster
'XMP' memory profiles than the stock ones (should your ECC RAM
actually support this). There are a modest number of W680 based
motherboards available from (some of) the usual x86 PC desktop
motherboard makers (and SuperMicro), but they are definitely priced
at the high end of things. Intel has not yet announced <a href="https://en.wikipedia.org/wiki/LGA_1700#Raptor_Lake_chipsets_(700_series)">a 'Raptor
Lake' chipset version of this</a>,
which would presumably be called the 'W780'. At this date I suspect
there will be no such chipset.</p>
<p>(The Intel W680 chipset was brought to my attention <a href="https://mastodon.social/@bshanks/111897549472732911">by Brendan
Shanks on the Fediverse</a>.)</p>
<p>As mentioned, AMD support for ECC on early generation Ryzens was a
bit lackluster, although it was sort of there. With the current
<a href="https://en.wikipedia.org/wiki/Socket_AM5">Socket AM5</a> and <a href="https://en.wikipedia.org/wiki/Zen_4">Zen
4</a>, a lot of mentions of ECC
seem to have (initially) been omitted from documentation, as discussed
in Rain's <a href="https://sunshowers.io/posts/am5-ryzen-7000-ecc-ram/">ECC RAM on AMD Ryzen 7000 desktop CPUs</a>, and <a href="https://www.tomshardware.com/pc-components/cpus/amd-confirms-ryzen-8000g-apus-dont-support-ecc-ram-despite-initial-claims">Ryzen
8000G series APUs don't support ECC at all</a>.
However, at least some AM5 motherboards do support ECC with recent
enough firmware (provided that you have recent BIOS updates and
enable ECC support in the BIOS, per Rain). These days, it appears
that a number of current AM5 motherboards list ECC memory as supported
(although <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/ECCRAMSupportLevels">what supported means is a question</a>)
and it will probably work, especially if you find people who already
have reported success. It seems that even some relatively inexpensive
AM5 motherboards may support ECC.</p>
<p>(Some un-vetted resources are <a href="https://old.reddit.com/r/truenas/comments/10lqofy/ecc_support_for_am5_motherboards/">here</a>
and <a href="https://forum.level1techs.com/t/am5-consumer-motherboards-with-full-reporting-and-correcting-ecc/200543">here</a>.)</p>
<p>If you can navigate the challenges of finding a good motherboard,
it looks like an AM5, Ryzen 7000 system will support ECC at a lower
cost than an Intel W680 based system (or an Intel Xeon one). If you
don't want to try to thread those rapids and can stand Intel CPUs,
a W680 based system will presumably work, and a Xeon based system
would be even easier to purchase as a fully built desktop with ECC.</p>
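<p>(On Linux, one way to see whether ECC is actually active and reporting
is the kernel's EDAC sysfs interface; here is a small sketch. These files
only exist if an EDAC driver has loaded for your memory controller, which
is itself a useful sign.)</p>
<pre>
# List EDAC memory controllers and their error counts on Linux.
import glob, os

for mc in sorted(glob.glob("/sys/devices/system/edac/mc/mc[0-9]*")):
    counts = {}
    for kind in ("ce_count", "ue_count"):   # corrected / uncorrected errors
        path = os.path.join(mc, kind)
        if os.path.exists(path):
            with open(path) as f:
                counts[kind] = f.read().strip()
    print(os.path.basename(mc), counts)
</pre>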
<p>(Whether ECC makes a meaningful difference that's worth paying for
is a bit of an open question.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/DesktopECCOptions2024?showcomments#comments">6 comments</a>.) </div>Options for genuine ECC RAM on the desktop in (early) 20242024-02-26T21:43:52Z2024-02-17T04:52:09Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/IPv6NowReliableForMecks<div class="wikitext"><p>I've had IPv6 at home for a long time, first in tunneled form and
<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/IPv6ComplicationsAgain">later in native form</a>, and recently I
brought up <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/Ubuntu2204WireGuardIPv6Gateway">more or less native IPv6 for my work desktop</a>. When I first started
using IPv6 (at home) and for many years afterward, there were <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/IPv6IsGoingToBeFun">all</a> <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/IPv6ConfigurationFun">sorts</a> of
complications and <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/IPv6ComplicationsAgain">failures</a> that could
be attributed to IPv6 or that went away when I turned off IPv6. To
be honest, when I enabled IPv6 on my work desktop I expected to run
into a fun variety of problems due to this, since before then it
had been IPv4 only.</p>
<p>To my surprise, my work desktop has experienced no problems since
enabling IPv6 connectivity. I know I'm using some websites over
IPv6 and I can see IPv6 traffic happening, but at the personal
level, I haven't noticed anything different. When I realized that,
I thought back over my experiences at home and realized that it's
been quite a while since I had a problem that I could attribute to
IPv6. Quietly, while I wasn't particularly noticing, the general
Internet IPv6 environment seems to have reached a state where it
just works, at least for me.</p>
<p>Since <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/IPv6IsTheFuture">IPv6 is everyone's future</a>, this is good
news. We've been collectively doing this for long enough and IPv6
usage has climbed enough that it should be as reliable as IPv4, and
hopefully people don't make <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/IPv6ConfigurationFun">common oversights</a> any more. Otherwise, we would
collectively have a real problem, because turning on IPv6 for more
and more people would be degrading the Internet experience of more
and more people. Fortunately that's (probably) not happening any
more.</p>
<p>I'm sure that there are still IPv6 specific issues and problems
that come up, and there will be more for a long time to come (until
perhaps they're overtaken by year 2038 problems). But you can
have problems that are specific to anything, including IPv4 (and
people may already be having those).</p>
<p>(As more people add IPv6 to servers that are currently IPv4 only,
we may also see a temporary increase in IPv6 specific problems as
people go through 'learning experiences' of operating IPv6 environments.
I suspect that <a href="https://support.cs.toronto.edu/">my group</a> will
have some of those when we eventually start adding IPv6 to various
parts of our environment.)</p>
</div>
Using IPv6 has quietly become reliable (for me)2024-02-26T21:43:52Z2024-02-01T03:26:21Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/HistogramsNeedTotalsToocks<div class="wikitext"><p>A true <a href="https://en.wikipedia.org/wiki/Histogram">histogram</a> is
generated from raw data. However, in things like metrics, we generally
don't have the luxury of keeping all of the raw data around; instead
we need to summarize it into histogram data. This is traditionally
done by having some number of buckets with either <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusHistogramsWantSums">independent or
cumulative values</a>. A lot
of systems stop there; for example <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/ZFSOnLinuxGettingPoolIostats">OpenZFS provides its histogram
data this way</a>. Unfortunately
by itself this information is incomplete in an annoying way.</p>
<p>If you're generating histogram data, you should go the extra distance
to also provide a true total of all of the raw data. The reason is
simple; only with a true total can one get a genuine and accurate
average value, or anything derived from that average. Importantly,
one thing you can potentially derive from the average value is an
indication of what I'll call skew in your buckets.</p>
<p>The standard assumption when dealing with histograms is that the
values in each bucket are randomly distributed through the range
of the bucket. If they truly are, then you can do things like get
a good estimate of the average value by just taking the midpoint
of each bucket, and so people will say that you don't really need
the true total. However, this is an assumption and it's not necessarily
correct, especially if the size of the buckets is large (as it can
be at the upper end of a 'powers of two' logarithmic bucket size
scheme, which is pretty common because it's convenient to generate).</p>
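<p>(Here is a small illustration of how the midpoint estimate can drift from
the true average when values skew inside wide buckets. The bucket scheme and
the data are invented for the example.)</p>
<pre>
# Power-of-two buckets: [1,2), [2,4), [4,8), ...; bucket midpoints only
# let you estimate the mean, and skew inside wide buckets makes it wrong.
raw = [1500, 1600, 1700, 1900, 2000, 35]      # made-up latency values
buckets = {}                                  # bucket upper bound -> count
for v in raw:
    top = 1
    while v >= top:
        top *= 2
    buckets[top] = buckets.get(top, 0) + 1

# Midpoint of the [top/2, top) bucket is 0.75 * top.
est = sum(count * (top * 0.75) for top, count in buckets.items()) / len(raw)
true = sum(raw) / len(raw)
print(f"estimated mean {est:.0f}, true mean {true:.0f}")
# With a true total exported alongside the bucket counts, 'true' is exact.
</pre>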
<p>I've certainly looked at a number of such histograms where it's
clear (from various other information sources) that this assumption
of even distribution wasn't correct. How incorrect it was wasn't
all that clear, though, because the information necessary to have
a solid idea wasn't there.</p>
<p>Good histogram data takes more than counts in buckets. But including a
true total as an additional piece of data is at least a start, and it's
probably inexpensive (both to export and to accumulate).</p>
<p>(Someone has probably already written a 'best practices for gathering
and providing histogram data' article.)</p>
</div>
Histogram data is most useful when they also provide true totals2024-02-26T21:43:52Z2024-01-27T03:41:23Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/CPUIGPCoolingAdvantagecks<div class="wikitext"><p>Once upon a time, you could readily get basic graphics cards,
generally passively cooled and certainly single-width even if they
had to have a fan in order to get you dual output support; this is,
for example, more or less what I had in <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/HomeMachine2011">my 2011 era machines</a>. These days these cards are mostly
extinct, so when I put together <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/WorkMachine2017">my current office desktop</a> I wound up with a dual width, definitely
fan-equipped card that wasn't dirt cheap. For some time I've been
grumpy about this, and sort of wondering where they went.</p>
<p>The obvious answer for where these cards went is that CPUs got
integrated graphics (although not all CPUs, especially higher end
ones, so <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/WorkMachine2017">you could wind up using a CPU without an IGP and needing
a discrete GPU</a>). When thinking about why
integrated graphics displaced such basic cards, it recently struck
me that one practical advantage integrated graphics has is cooling.</p>
<p>The integrated graphics circuitry is part of the CPU, or at least
on the CPU die. General use CPUs have been actively cooled for well
over a decade now, and for a long time they've been the focus of
high performance cooling and sophisticated thermal management. The
CPU is probably the best cooled thing in a typical desktop (and it
needs to be). Cohabiting with this heat source constrains the <a href="https://en.wikipedia.org/wiki/Graphics_processing_unit#Integrated_graphics_processing_unit">IGP</a>,
but it also means that the IGP can take advantage of the CPU's
cooling to cool itself, and that cooling is generally quite good.</p>
<p>A discrete graphics card has no such advantage. It must arrange its
own cooling and its own thermal management, both of which cost money
and the first of which takes up space (either for fans or for passive
heatsinks). This need for its own cooling makes it less competitive
against integrated graphics, probably especially so if the card is
trying to be passively cooled. I wouldn't be surprised if the options
were a card that didn't even compare favorably to integrated graphics
or a too-expensive card for the performance you got. There's also
the question of whether the discrete GPU chipsets you can get are
even focused on low power usage or whether they're designed to
assume full cooling to allow performance that's clearly better than
integrated graphics.</p>
<p>(Another limit, now that I look, is the amount of power available
to a PCIe card, especially one that uses fewer than 16 PCIe lanes;
apparently an x4 or x8 card may be limited to 25W total (with an x16
going to 75W), <a href="https://en.wikipedia.org/wiki/PCI_Express#Power">per Wikipedia</a>. However, I don't
know how this compares to the amount of power an IGP is allowed to
draw, especially in CPUs with more modest overall power usage.)</p>
<p>The more I look at this, the more uncertainties I have about the
thermal and power constraints that may or may not face discrete GPU
cards that are aiming for low cost while still offering, say,
multi-monitor support. I imagine that the readily available and more
or less free cooling that integrated graphics gets doesn't help the
discrete GPUs, but I'm not sure how much of a difference it really
makes.</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/CPUIGPCoolingAdvantage?showcomments#comments">5 comments</a>.) </div>The cooling advantage that CPU integrated graphics has2024-02-26T21:43:52Z2024-01-25T03:22:06Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/MotherboardFeaturesPCIeCostscks<div class="wikitext"><p>My current <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/WorkMachine2017">office desktop</a> and <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/HomeMachine2018">home
desktop</a> are now more than five years old
(although they've had some storage tuneups since then), so I've
been looking at PC hardware off and on. As it happens, PC desktop
motherboards that have the features I'd like also not infrequently
include extra features that I don't need, such as built in wifi
connectivity. I'm somewhat of a hardware minimalist so in the past
I've reflexively attempted to avoid these features. The obvious
reason to do this is that they tend to increase the cost. But lately
it's struck me that there's another reason to want a desktop PC
motherboard without extra features, and that is <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/PCIeAndModernCPUs">PCIe lanes</a>.</p>
<p>Processors (CPUs) and motherboard chipsets only have so many PCIe
lanes in total, partly because supporting more PCIe lanes is one
of those product features that both Intel and AMD use to segment
the market. This matters because these days, almost everything built
into a PC motherboard is actually implemented as a PCIe device,
which means that it normally consumes some number of those PCIe
lanes. The more built in devices your motherboard has, the more
PCIe lanes they consume out of the total ones available, which can
cut down on other built in devices and also on connectivity you
want, such as <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/M2SSDsAndNVMe">NVMe drives</a> and physical PCIe card
slots. <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/PCIeSlotsLimitations">Physical PCIe slots can already have peculiar limitations
on which ones can be used together</a>, which
has the effect of reducing the total PCIe lanes they consume, but
you generally can't play very many of these games with built in
hardware.</p>
<p>(You can play some games; on my <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/HomeMachine2018">home desktop</a>, the motherboard's
second NVMe slot shares two PCIe lanes with some of my SATA ports. If
I want to run the NVMe drive with x4 PCIe lanes instead of x2, I can
only have four SATA ports instead of six.)</p>
<p>Of course, all of this is academic if you can only find the motherboard
features you want on higher end motherboards that also include these
extra features. Provided that there aren't any surprise limitations
that affect things you're going to use right away, you (I) just get
to live with whatever limitations and constraints on PCIe lane usage
you get, or you have to drop some features you want. This is where
you have to read motherboard descriptions quite carefully, including
all of the footnotes, and perhaps even consult their manuals.</p>
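<p>(Once you have a system in hand, Linux will at least tell you what PCIe
link widths and speeds everything actually negotiated, which is one way to
spot surprises after the fact. A sketch:)</p>
<pre>
# Show the negotiated PCIe link width and speed of each device on Linux;
# handy for spotting an NVMe drive or card that came up at x2 instead of x4.
import glob, os

for dev in sorted(glob.glob("/sys/bus/pci/devices/*")):
    width = os.path.join(dev, "current_link_width")
    speed = os.path.join(dev, "current_link_speed")
    if os.path.exists(width) and os.path.exists(speed):
        with open(width) as w, open(speed) as s:
            print(os.path.basename(dev), "x" + w.read().strip(), s.read().strip())
</pre>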
<p>(What features I want is another question, and there are tradeoffs
I could make and may have to.)</p>
<p>Fortunately (given the growth of things like NVMe drives), the
number of PCIe lanes available from CPUs and chipsets has been going
up over time, as has their speed. However I suspect that we're
always going to see Intel and AMD differentiate their server
processors from their desktop processors partly by the number of
PCIe lanes available, with the 'desktop' processors having the
smaller number. My impression is that AMD desktop CPUs have more
CPU PCIe lanes than Intel desktop CPUs and also I believe more
chipset PCIe lanes, but Intel is potentially ahead on PCIe bandwidth
between the chipset and the CPU (and thus between chipset devices
and RAM, which has to go through the CPU). Whether you'll ever
stress the CPU to chipset bandwidth that hard is another question.</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/MotherboardFeaturesPCIeCosts?showcomments#comments">One comment</a>.) </div>Desktop PC motherboards and the costs of extra features2024-02-26T21:43:52Z2024-01-23T04:29:05Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/SelectiveRestoresAndIndexescks<div class="wikitext"><p>Recently we discovered first that <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/AmandaReadsTarRestoresToEnd">the Amanda backup system has
to read some tar archives all the way to the end when restoring a
few files from them</a> and
then <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/AmandaReadsTarRestoresToEndII">sometimes it can do quick restores from tar archives</a>. What is going on is
the general issue of <em>indexed (archive) formats</em>, and also the
potential complexities involved in them in a full system.</p>
<p>To simplify, tar archives are <a href="https://www.gnu.org/software/tar/manual/html_node/Standard.html">a series of entries for files and
directories</a>. Tar
archives contain no inherent index of their contents (unlike some
archive formats, such as <a href="https://en.wikipedia.org/wiki/ZIP_(file_format)">ZIP archives</a>), but you can
build an external index of where each file entry starts and what
it is. Given such an index and its archive file on a storage medium
that supports random access, you can jump to only the directory and
file entries you care about and extract only them. Because tar
archives have not much special overall formatting, you can do this
either directly or you can read the data for each entry, concatenate
it, and feed it to '<code>tar</code>' to let tar do the extraction.</p>
<p>(The trick with clipping out the bits of a tar archive you cared
about and feeding them to tar as a fake tar archive hadn't occurred
to me until I saw what Amanda was doing.)</p>
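<p>(As a sketch of the general idea, Python's <code>tarfile</code> module
exposes each member's offsets, which is enough to build an external index
of an uncompressed tar archive and later pull out a single member's data
with a seek and a read. This illustrates the technique; it's not what
Amanda itself does.)</p>
<pre>
# Build an external index of a tar archive (member name -> offset and size)
# and later read one member's data with a seek, without rescanning the
# whole archive. Works on the uncompressed archive as stored.
import tarfile

def build_index(path):
    index = {}
    with tarfile.open(path, "r:") as tf:
        for member in tf:
            # offset_data is where the file's bytes start; member.offset
            # would be the header, if you wanted to hand pieces to tar.
            index[member.name] = (member.offset_data, member.size)
    return index

def read_member(path, index, name):
    offset, size = index[name]
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)
</pre>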
<p>If tar was a more complicated format, this would take more work and
more awareness of the tar format. For example, if tar archives had
an internal index, either you'd need to operate directly on the raw
archive or you would have to create your own version of the index
when you extracted all of the pieces from the full archive. Why
would you need to extract the pieces if there was an internal index?
Well, one reason is if the entire archive file was itself compressed,
and your external index told you where in the compressed version
you needed to start reading in order to get each file chunk.</p>
<p>The case of compressed archives shows that indexes need to correspond
to how the archive is eventually stored. If you have an index
of the uncompressed version but you're storing the archive in
compressed form, the index is not necessarily of much use. Similarly,
it's necessary for the archive to be stored in such a way that you
can read only selected parts of it when retrieving it. These days
that's not a given, although I believe many remote object stores
support <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests">HTTP Range requests</a> at
least some of the time.</p>
<p>(Another case that may be a problem for backups specifically is
encrypted backups. Generally the most secure way to encrypt your
backups is to encrypt the entire archive as a single object, so
that you have to read it all to decrypt it and can't skip ahead
in it.)</p>
</div>
Indexed archive formats and selective restores2024-02-26T21:43:52Z2024-01-14T04:28:30Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/MFAIsBothSimpleAndWorkcks<div class="wikitext"><p>Over on the Fediverse <a href="https://mastodon.social/@cks/111564970574073609">I said something a while back</a>:</p>
<blockquote><p>I have a rant bubbling around my head about 'why I'll never enable MFA
for Github'. The short version is that I publish code on GH because
it's an easy and low effort way to share things. So far, managing
MFA and MFA recovery is neither of those; there's a lot of hassles,
worries, work to do, and 'do I trust you to never abuse my phone
number in the future?' questions (spoiler, no).</p>
<p>I'll deal with MFA for work. I won't do it for things I'm only doing
for fun, because MFA makes it not-fun.</p>
</blockquote>
<p><a href="https://utcc.utoronto.ca/~cks/space/blog/tech/MFABasicOptionsIn2023">Basic MFA</a> is ostensibly pretty simple these
days. You get a trustworthy app for your smartphone (that's two strikes
right there), you scan the QR code you get when you enable MFA on your
account, and then afterward you use your phone to generate the MFA
<a href="https://en.wikipedia.org/wiki/Time-based_one-time_password">TOTP</a>
code that you type in when logging in along with your password. That's
a little bit more annoying than the plain password, but think of the
security, right?</p>
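<p>(What the phone app computes is just standard TOTP (RFC 6238). Here is a
sketch of the usual SHA-1, six digit, 30 second variant in Python, where the
shared secret is the base32 string encoded in the QR code.)</p>
<pre>
import base64, hashlib, hmac, struct, time

def totp(secret_base32: str, digits: int = 6, period: int = 30) -> str:
    # The shared secret from the QR code is base32; pad it for b32decode.
    s = secret_base32.replace(" ", "").upper()
    key = base64.b32decode(s + "=" * (-len(s) % 8))
    counter = int(time.time()) // period
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
</pre>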
<p>But what if your phone is lost, damaged, or unusable because it has
a bulging battery and it's taking a week or two to get your carrier
to exchange it for a new one (which happened to us with our work
phones)? Generally you get some one time use special codes, but now
you have to store and manage them (obviously not on the phone). If
you're cautious about losing access to your phone, you may want to back
up the TOTP QR code and secret itself. Both the recovery codes and
the TOTP secret are effectively passwords and now you need to handle
them securely; if you use a password manager, it may or may not be
willing to store them securely for you. Perhaps you can look into <a href="https://age-encryption.org/">age</a>.</p>
<p>(Printing out your recovery codes and storing the paper somewhere leaves
you exposed to issues like a home fire, which is an occasion where you
might also lose your phone.)</p>
<p>Broadly, <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/MFAAccountRecoveryDistrust">no website has a good account recovery story for MFA</a>. Figuring out how to deal with this
is not trivial and is your problem, not theirs. And while TOTP is
not the 'in' MFA thing these days, the story is in many ways worse
with physical hardware tokens, because you can't back them up at
all (unlike TOTP secrets). Some environments will back up software
<a href="https://en.wikipedia.org/wiki/WebAuthn">Passkeys</a>, but so far
only between the same type of thing and often at the price of
synchronizing things like all of your browser state.</p>
<p>However, all of this is basically invisible in the simple MFA story.
The simple MFA story is that everything magically just works and
that you can turn it on without problems or serious risks. Of course,
websites have a good reason for pushing this story; they want their
users to turn on MFA, for various reasons. My belief is that the
gap between the simple MFA story and the actual work of doing MFA
in a way that you can reliably maintain access to your account is
dangerous, and sooner or later this danger is going to become
painfully visible.</p>
<p>(Like many other versions of <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/SecurityIsPeople">mathematical security</a>,
the simple MFA story invites blaming people (invariably called 'users'
when doing this) when something goes wrong. They should have carefully
saved their backup codes, not lost track of them; they should have
sync'd their phone's TOTP stores to the cloud, or done special export
and import steps when changing phones, or whatever else might have
prevented the issue. This is as wrong as it always is. <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/SecurityIsPeople">Security is not
math, it is people</a>.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/MFAIsBothSimpleAndWork?showcomments#comments">5 comments</a>.) </div>MFA today is both 'simple' and non-trivial work2024-02-26T21:43:52Z2024-01-11T03:49:15Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/TLSCertificateExpiryHackcks<div class="wikitext"><p>Famously, <a href="https://en.wikipedia.org/wiki/Transport_Layer_Security">TLS</a>
certificates expire, which even today can take websites offline because
they didn't renew their TLS certificate in time. This doesn't just affect
websites; people not infrequently create certificates that are supposed to
be long lived, except <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/TenYearsNotLongEnough">sometimes they make them last (only) ten years,
which isn't long enough</a>. When people
argue about this, let's be clear; TLS certificate expiry times, like most
forms of key expiry, are fundamentally a hack that exists to deal with the
imperfections of the world.</p>
<p>In a spherical and frictionless ideal world, TLS certificate keys would
never be compromised, TLS certificates would never be issued to anyone
other than the owner of something, TLS certificates could be effectively
invalidated through revocation, and there would be no need to have TLS
certificates ever expire. In this world, TLS certificates would be
perpetual, and when you were done with a website or some other use of a
TLS certificate, you would publish a revocation of it just to be sure.</p>
<p>We don't live in a spherical and frictionless world, so TLS
certificates expire in order to limit the damage of key compromise
and mis-issued certificates. That TLS certificates expire only
imperfectly limits the damage of key compromise, not only because
you have to wait for the certificate to expire (hence the move to
shorter and shorter lifetimes) but also because there's generally
nothing that stops you from re-using the same key for a whole series
of TLS certificates. Since we don't have effective certificate
revocation at scale, both mis-issued certificates and certificates
where you know the key is compromised can only really be handled
by letting them expire. If they didn't expire, they would be dangerous
forever.</p>
<p>(If you're a big place, the browsers will give you a hand by shipping
an update that invalidates the required certificates and keys, but this
isn't available to ordinary mortals.)</p>
<p>This isn't particularly specific to TLS; other protocols with public
keys often have the same issues and adopt the same solution of
expiry times (PGP is one example). There are protocols that use
keys without expiry times, such as <a href="https://en.wikipedia.org/wiki/DomainKeys_Identified_Mail">DKIM</a>. However,
DKIM has extremely effective key revocation; to revoke a key, you
remove the public part from your DNS, and then no one can validate
anything signed by that key (well, unless they have their own saved
copy of your old DNS). Other protocols punt and leave the whole
problem up to you, for example SSH keypairs.</p>
<p>(Some protocols have other reasons for limiting the lifetime of keys,
such as making encrypted messages 'expire' by default.)</p>
<p>The corollary of this is that if you're dealing with TLS certificates
(or keypairs in general) and these issues aren't a concern for you,
there's not much reason to limit your TLS certificate lifetimes.
<a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/TenYearsNotLongEnough">Just don't make their lifetimes be ten years</a>.</p>
<p>(My current personal view is that there are two reasonable choices
with TLS certificate lifetimes. Either you have automated issuance
and renewal, in which case you should have short lifetimes, or you
have manual issuance and rollover, in which case they should be as
long as you can get away with. TLS certificates that live for a
year or three and have to be manually rolled over are the worst of
both worlds; a key compromise or a mis-issuance is dangerous for a
comparatively long time, and the rollover period is long enough
that you'll have issues keeping track of it and doing it.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSCertificateExpiryHack?showcomments#comments">One comment</a>.) </div>TLS certificate expiry times are fundamentally a hack2024-02-26T21:43:52Z2024-01-07T23:10:40Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/FilesystemCacheAndNFSBandwidthcks<div class="wikitext"><p><a href="https://utcc.utoronto.ca/~cks/space/blog/linux/ZFSFileserverSetupIII">Our current ZFS fileserver hardware</a>
is getting long in the tooth, so we're working on moving to new
hardware (with the same software and operational setup, which we're
happy with). This new hardware has 512 GB of RAM instead of the 192
GB of RAM in our current fileservers, which means that we're going
to have a very big ZFS filesystem cache. Today, I was idly wondering
how long it would take to fill the cache to a reasonable level with
NFS (read) traffic, since <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusGrafanaSetup-2019">we have metrics</a> that include, among other
things, the typical bandwidth our current fileservers see (which
usually isn't all that high).</p>
<p>ZFS doesn't use all of the system's memory for its ARC cache, and
not all of the ARC cache is file data; some of it is filesystem
metadata like the contents of directories, the ZFS equivalent of
inodes, and so on. As a ballpark, I'll use 256 GBytes of file data
in the cache. A single server with a 1G connection can read over
NFS at about 110 Mbytes a second. This is a GByte read in just under
ten seconds, or about 6.4 GBytes a minute, and about 40 minutes
of continuous full-rate 1G NFS reads to fill a 256 GByte cache
(assuming that the ZFS fileserver puts everything read in the cache
and there are no re-reads, which are some big assumptions).</p>
<p>Based on what I've seen on our dashboards, a reasonable high NFS
read rate from a fileserver is in the area of 300 to 400 Mbytes a
second. This is about 23.4 GBytes a minute (at 400 Mbytes/sec), and
would fill the ZFS fileserver cache from a cold start in about 11
minutes (again with the assumptions from above). <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/NetworkRelatedSpeeds2022">400 Mbytes/sec
is well within the capabilities of SSD-based fileservers</a>.</p>
<p>However, most of the time our fileservers are much less active than
that. Last Thursday, the average bandwidth over the workday was in
the area of 1 Mbyte/sec (yes, Mbyte not GByte). At this rate filling
a 256 GByte cache of file data would take three days. A 20 Mbyte/sec
sustained read rate fills the cache in only a few hours. At the low
end, relatively 'small' changes in absolute value clearly have an
outsized effect on the cache fill time.</p>
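<p>(To make the arithmetic easy to reproduce or re-run with different
numbers, here is a small Python sketch of the same ballpark estimates,
with the same big assumption that everything read over NFS lands in the
cache and nothing is re-read.)</p>
<pre>
CACHE_GB = 256   # ballpark amount of file data in the cache

def fill_time(mbytes_per_sec):
    # Seconds to read CACHE_GB GBytes at the given sustained rate.
    seconds = (CACHE_GB * 1024) / mbytes_per_sec
    hours, rem = divmod(seconds, 3600)
    return "%dh %dm" % (hours, rem // 60)

for rate in (110, 400, 20, 1):
    print("%4d MBytes/sec: %s" % (rate, fill_time(rate)))
</pre>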
<p>In practice, this cache fill requires 256 GBytes of different data
that people want to read (possibly in a hurry). This is much more
likely to be the practical limit on filling our fileserver caches,
as we can see by the typical 1 Mbyte/sec data rate.</p>
<p>(All of this is actually faster than I expected before I started
writing this and ran the numbers.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/FilesystemCacheAndNFSBandwidth?showcomments#comments">2 comments</a>.) </div>Some ballpark numbers for fun on filling filesystem cache with NFS traffic2024-02-26T21:43:52Z2024-01-07T02:31:39Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/EmailAddressesBadPermanentIDscks<div class="wikitext"><p>Every so often someone needs to create a more or less permanent
internal identifier in their system for every person's account. Some
of the time they look at how <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/OIDCThreeEmailAddresses">authentication systems like OIDC
return email addresses among other data</a> and decide that since pretty
much everyone is giving them an email address, they'll use the email
address as the account's permanent internal identification.
As <a href="http://regex.info/blog/2006-09-15/247">the famous saying</a> goes,
now you have two problems.</p>
<p>The biggest problem with email addresses as 'permanent' identifiers
is that people's email addresses change even within a single
organization (for example, <a href="https://www.utoronto.ca/">a university</a>).
They change for the same collection of reasons that people's
<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/FullLegalNamesProblems">commonly used names</a> and <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/LoginsDoChange">logins change</a>. An organization that refuses to change
or redo the email addresses it assigns to people is being unusually
cruel in ways that are probably not legally sustainable in any
number of places.</p>
<p>(Some of the time there will be some sort of access or forwarding
from the old email address to the new one, but even then the old
email address may no longer work for non-email purposes such as
OIDC authentication. And beyond that, the person won't want to keep
using their old and possibly uncomfortable email address with you,
they want to use their new current one.)</p>
<p>The lesser problem is that you have no particular guarantee that
an organization won't reuse email addresses, either in general or
for particularly desirable ones that get reused or reassigned as
an exception because someone powerful wants them. Sometimes you
sort of have no choice, because account recovery has to run through
the email address you have on file, but at other times (such as in
theory with <a href="https://en.wikipedia.org/wiki/OpenID#OpenID_Connect_(OIDC)">OIDC</a>), you
have some form of internal ID that is supposed to be unique and
permanent, which you should use.</p>
<p>Even if you have to remember an email address for account recovery,
<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/PowerOfMeaninglessIDs">you want your internal identifier for accounts to be meaningless</a>. This will make your life much simpler in
the long run, even if this is never exposed to people.</p>
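<p>(As a minimal sketch of what a 'meaningless' internal identifier looks
like in practice, not tied to any particular system: the account's key is
an opaque UUID and the email address is just a mutable attribute of it.)</p>
<pre>
import uuid

accounts = {}   # account_id (a UUID) -> account attributes

def create_account(email):
    account_id = uuid.uuid4()
    accounts[account_id] = {"email": email}
    return account_id

def change_email(account_id, new_email):
    # Nothing else that refers to account_id has to change.
    accounts[account_id]["email"] = new_email
</pre>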
<p>(There are also security issues lurking in the underbrush of reading
too much into email addresses, <a href="https://trufflesecurity.com/blog/google-oauth-is-broken-sort-of/">cf</a> (<a href="https://news.ycombinator.com/item?id=38720544">via</a>).)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/EmailAddressesBadPermanentIDs?showcomments#comments">2 comments</a>.) </div>Email addresses are not good 'permanent' identifiers for accounts2024-02-26T21:43:52Z2023-12-31T04:22:46Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/StandardsAndBadContentcks<div class="wikitext"><p>In a comment on <a href="https://utcc.utoronto.ca/~cks/space/blog/spam/SMTPSmugglingConsequences">my entry on what I think SMTP Smuggling enables</a>, <a href="https://leahneukirchen.org/">Leah Neukirchen</a> noted something important, which is
that <a href="https://en.wikipedia.org/wiki/Simple_Mail_Transfer_Protocol">SMTP</a>
messages that contain a CR or a LF by itself aren't legal:</p>
<blockquote><p>I disagree. The first mail server is also accepting a message
with a non-CRLF LF, which violates RFC 5322 <a href="https://datatracker.ietf.org/doc/html/rfc5322#section-2.3">section 2.3</a></p>
<blockquote><p>CR and LF <strong>MUST</strong> only occur together as CRLF; they <strong>MUST NOT</strong>
appear independently in the body.</p>
</blockquote>
</blockquote>
<p>The capitalization in the RFC quote is original, the emphasis is
mine, and <a href="https://datatracker.ietf.org/doc/html/rfc2119">the meaning of these terms is covered in RFC 2119</a>. What it adds up to
is unambiguous at one level; a SMTP message that contains a bare CR
or LF isn't an RFC 5322 compliant message, much like a C program with
undefined behavior isn't a valid ANSI C program.</p>
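<p>(The check for bare CRs and LFs is simple enough to express directly.
Here is a minimal Python sketch of it; this illustrates the rule, not how
any particular mailer implements it.)</p>
<pre>
import re

# A bare CR is a CR not followed by LF; a bare LF is a LF not
# preceded by CR. RFC 5322 says neither may appear.
BARE_CR_OR_LF = re.compile(rb"\r(?!\n)|(?<!\r)\n")

def has_only_crlf_line_endings(raw_message):
    return BARE_CR_OR_LF.search(raw_message) is None

print(has_only_crlf_line_endings(b"a\r\nb\r\n"))   # True
print(has_only_crlf_line_endings(b"a\nb\r\n"))     # False (bare LF)
</pre>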
<p>But just like the ANSI C standard doesn't (as far as I know) put
any requirements on how a C compiler handles a non-ANSI-C program,
RFC 5322 provides no requirements or guidance on what you should
or must do with a non-compliant message. This is quite common in
standards; standards often spell out only what is within their scope
and what must be done with those things. They've historically been
silent about non-standard things, leaving it entirely to the
implementer. When it comes to protocol elements, this generally
means rejecting them (you don't try to guess what unknown SMTP
commands are), but when it comes to things you don't act on like
email message content, things are much fuzzier.</p>
<p>At this point two things often intervene. The first is <a href="https://en.wikipedia.org/wiki/Robustness_principle">Postel's
Law</a>, which
suggests people accept things outside the standard. The second is
that strong standards compliance is often actively inconvenient or
problematic for people using the software. I've lived life behind
a SMTP mailer that had strong feelings about RFC compliance (at
least in some areas), and by and large we didn't like it. Strict
software is often unpopular software, which pushes people writing
software to appeal to Postel's Law in the absence of anything else.
If you don't even have an RFC to point to that says 'you SHOULD
reject this' (or 'you MUST reject this') and you have people banging
on your door wanting you to be liberal, often the squeaky wheel
gets the grease (or has gotten until recently; these days people
are somewhat less enamored of <a href="https://en.wikipedia.org/wiki/Robustness_principle">Postel's Law</a>, for various reasons
including security issues).</p>
<p>(C compilers and their reaction to undefined behavior is a complex
subject, but I don't know of any mainstream compiler that will
actually reject code that has known undefined behavior.)</p>
<p>At this point there's not much we can do here. It's obviously much
too late for existing RFCs and standards that don't have any
requirements or guidance on what you should do about bad contents,
and I'm not sure that people would agree on adding it anyway. People
can attempt to be strict and hope that not much will be affected,
or they can try to write rules about error recovery (which HTML
eventually did in HTML5) to encourage software to all do the same,
agreed-on thing. But these will probably mostly be reactive things,
not proactive ones (so we're probably about to see a wave of SMTP
mailers getting strict in the wake of <a href="https://utcc.utoronto.ca/~cks/space/blog/spam/SMTPSmugglingBackground">SMTP Smuggling</a>).</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/StandardsAndBadContent?showcomments#comments">One comment</a>.) </div>Standards often provide little guidance for handling 'bad' content2024-02-26T21:43:52Z2023-12-26T02:44:15Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/TLSInternalCANameConstraintsIIcks<div class="wikitext"><p>A while back I wrote an entry about <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSInternalCANameConstraints">TLS CA root certificate name
constraints for internal Certificate Authorities</a>. One of the traditional problems
with your own TLS CA is that this CA can be used to sign any name,
even names you don't want to sign. A name constraint would limit
that, but traditionally these weren't widely supported, especially
on TLS CA root certificates. Then I read Michal Jirků's <a href="https://wejn.org/2023/09/running-ones-own-root-certificate-authority-in-2023/">Running
one's own root Certificate Authority in 2023</a>
and had a realization about a general way out so that everything
would accept your TLS name constraints.</p>
<p>Some software will accept name constraints on root CA certificates,
so you create your root certificate with them. Some software will
only accept name constraints on intermediate CA certificates, so
then you create an intermediate certificate with the same constraints
as your root certificate; it should also have the same validity
period as your root certificate (or as long a validity period as
you expect to need your CA for). At this point, you throw away the
CA root certificate's private key, so no one can make any more
intermediate certificates. This ensures an attacker can't create a
new intermediate certificate without name constraints and then issue
certificates from it that will be accepted by older Chrome versions
and other things that ignore root CA name constraints.</p>
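<p>(For illustration, here is a sketch of minting such a name-constrained
CA certificate with Python's third-party 'cryptography' package. The
names, lifetime, and 'example.org' constraint are all placeholders, and
an intermediate made from this root would carry the same NameConstraints
extension.)</p>
<pre>
import datetime
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

key = ec.generate_private_key(ec.SECP256R1())
name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "Internal CA")])
now = datetime.datetime.now(datetime.timezone.utc)

cert = (
    x509.CertificateBuilder()
    .subject_name(name)
    .issuer_name(name)                  # self-signed root
    .public_key(key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(now)
    .not_valid_after(now + datetime.timedelta(days=365 * 20))
    .add_extension(x509.BasicConstraints(ca=True, path_length=None),
                   critical=True)
    # Only names under example.org can validate against this CA.
    .add_extension(
        x509.NameConstraints(permitted_subtrees=[x509.DNSName("example.org")],
                             excluded_subtrees=None),
        critical=True,
    )
    .sign(key, hashes.SHA256())
)
</pre>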
<p>(<a href="https://serverfault.com/questions/1012847/does-chrome-support-x509v3-permitted-name-constraints">Modern Chrome versions support root CA name constraints</a>,
but I expect that older Chrome versions will linger on for quite
some time in the form of old Android devices that aren't getting
updates.)</p>
<p>I suspect that there are some uncertainties and potential problems
with this approach. Most obviously, you're in trouble if TLS clients
decide to start limiting the lifetime of intermediate CA certificates,
since you can't make any more once you destroy the root CA key. As
far as I can tell, the <a href="https://cabforum.org/baseline-requirements/">CA/Browser Forum Baseline Requirements</a> don't currently limit
intermediate certificate lifetimes, but this may change someday.</p>
<p>(One option would be to pre-mint a series of intermediates with
relatively short (and overlapping) lifetimes as a precaution. But
then you have to protect all of these intermediate certificates
for possibly several decades.)</p>
<p>Since using intermediate CAs is what's done in <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSThreeWorlds">public (web) TLS</a>, probably pretty much everything supports it so
maybe there aren't really any uncertainties beyond the long validity
period of the intermediate certificate. I don't know how many levels
of intermediates are commonly supported, so maybe there could be
problems if you wanted two levels of intermediate (the one that's
effectively your 'root' CA and then a second, shorter-term one below
it that you used to directly issue TLS certificates).</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSInternalCANameConstraintsII?showcomments#comments">3 comments</a>.) </div>A possible path to reliable name constraints on internal TLS CAs2024-02-26T21:43:52Z2023-12-09T04:02:40Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/RequirementToScaleYourWorkcks<div class="wikitext"><p>Several years ago I read Tobias Bernard's <a href="https://blogs.gnome.org/tbernard/2020/01/17/doing-things-that-scale/">Doing Things That Scale</a>.
To summarize the article badly, Tobias Bernard talked about how
they had moved away from customizing things (and then having to
maintain the customizations) in favour of doing generally useful
changes to upstreams. I have various feelings about this as a general
principle (and comments on <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/CustomizationSensibleLimits">my entry on thinking about the sensible
limits of customization</a>
gave me more things to think about), but one of the thoughts that
firmed up is that <strong>you can only do work that scales if you can
scale your work</strong>.</p>
<p>As I read Bernard's article, scaling your work in this context means
getting the things you want into the upstream, even in some imperfect
form, or perhaps becoming an upstream yourself (by creating something
of and for general use). In order to scale your work this way, you
must be both willing to undertake this and able to do so. For the
most common way to scale your work, this means working with the
upstream and even becoming part of it, if you think there are
significant things that need it. This is work, which is to say that
it takes time and energy, and with some upstreams it can be a
challenging and fraught endeavour (I suspect readers will have their
own examples).</p>
<p>(In the other option, you need to have the time and energy to build
and run a successful, generally useful thing that becomes reasonably
popular. This is often considered an especially thankless and
wearying task.)</p>
<p>If you won't be scaling your work for whatever reason (including
lack of the time required or lack of willingness to wrestle with
muddy pigs), your choices are to either live with the current state
of affairs, whatever it is, or do work that doesn't scale, the sort
of personal, individual customization that <a href="https://blogs.gnome.org/tbernard/2020/01/17/doing-things-that-scale/">Tobias Bernard discusses
moving away from</a>. I
have some broader views on this that I've decided don't fit into
the margins of this entry, so I will confine myself to saying that
I think there are plenty of people who fit into this general category.</p>
<p>(<a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/CustomizationSensibleLimits">As noted in comments on my earlier entry</a>, there's also the aspect
of customization as something you enjoy doing for fun. There's no
reason to restrict yourself from fun because it only benefits you;
if anything, that's kind of the point of fun.)</p>
</div>
Doing work that scales requires being able to scale your work2024-02-26T21:43:52Z2023-12-06T03:40:54Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/MailClientsAndCreatingHTMLMailcks<div class="wikitext"><p>It's not news that a great deal of email in the world today is
written in HTML, and has been for some time. If you insist on plain
text email, you're increasingly an anachronism. Many people writing
email probably don't even think about people who prefer plain text,
and I think many mail clients will default to HTML even if you're
replying to a plain text message, so even if you write to me in
HTML and I write back in plain text, your reply is back to HTML
again.</p>
<p>But while that description is true at the level of what people
experience, it's not true at the technical level (at least, not
usually). Even today, most 'HTML' email is actually a MIME
multipart/alternative, with text/plain and text/html alternate
parts. An awful lot of the time, the contents of the text/plain
part isn't a little message saying 'read the HTML', it's in fact a
faithful plain text version of the HTML that people wrote. Pretty
much universally, mail clients quietly create that plain text version
from the HTML version that people write, following an assortment
of conventions for how to render HTML-isms in plain text. Looked
at from this angle, it is quietly impressive. Here is a feature
and a chunk of code that could be considered partially vestigial,
yet almost everyone implements it.</p>
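<p>(The MIME structure itself is easy to produce; the quietly impressive
part is generating the faithful plain text rendering. Here is a minimal
Python sketch of the structure, with made-up addresses and content:
set_content() provides the text/plain part and add_alternative() then
turns the message into multipart/alternative with a text/html part.)</p>
<pre>
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "An example"
msg["From"] = "someone@example.org"
msg["To"] = "someone-else@example.org"
msg.set_content("Hi,\n\n> the quoted bit\n\nSome plain text.\n")
msg.add_alternative(
    "<p>Hi,</p><blockquote>the quoted bit</blockquote>"
    "<p>Some plain text.</p>",
    subtype="html",
)
print(msg.get_content_type())   # multipart/alternative
</pre>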
<p>One of the reasons this may be somewhat easier than it looks is
that people rarely literally write HTML in their mail client. Instead
they tend to work in a WYSIWYG environment, where the mail client
can mark up the text with intentions, like 'bold' or 'links to <X>',
and then render the intentions in both HTML and plain text. But I'm
only guessing about how mail clients implement this. I don't think
it's as simple as pushing the HTML through some sort of plain text
rendering, because the plain text and the HTML sometimes change styles
for things. For instance, in the HTML, bits quoted from the message
being replied to may be indented, while in the plain text they get
rendered with the customary '> ' in front of them.</p>
<p>It's not only mail clients used by people that (still) do this. A
fair number of major sources of (HTML) email more or less automatically
generate a plain text version as well, often coming at least partially
from people's input. For one example I experience regularly, Github
issues are natively in Markdown and are commonly seen in HTML format,
but Github faithfully makes a quite usable text/plain version. It
might not be much effort with Markdown, but it's at least some.</p>
<p>(Not all plain text plus HTML email has the same content in both
forms, and <a href="https://utcc.utoronto.ca/~cks/space/blog/spam/FadingPlaintextParts">sometimes the plain text content is broken in various
ways</a> (<a href="https://utcc.utoronto.ca/~cks/space/blog/spam/PlaintextAndHTMLDriftApart">also</a>). But this is still the exception;
the vast majority of these emails that I get have functionally the same
content in both the plain text and the HTML version.)</p>
<p>(This entry was sparked by me idly wondering if it would be possible
to easily write HTML-format emails in <a href="https://www.gnu.org/software/emacs/manual/html_mono/mh-e.html">MH-E</a>,
and then realizing that it wasn't enough to just write HTML emails;
I'd need to generate the plain text version too.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/MailClientsAndCreatingHTMLMail?showcomments#comments">One comment</a>.) </div>The quietly impressive thing mail clients do when you write HTML mail2024-02-26T21:43:52Z2023-11-30T03:18:34Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/WireGuardAndLinkLocalIPv6cks<div class="wikitext"><p>Suppose, <a href="https://mastodon.social/@cks/111416319651490140">not hypothetically</a>, that you are
setting up a WireGuard tunnel to extend IPv6 connectivity to a
machine that is (still) on an IPv4-only subnet. One part of IPv6
in general is <a href="https://en.wikipedia.org/wiki/Link-local_address">link-local addresses</a>, which are
required for <a href="https://en.wikipedia.org/wiki/Neighbor_Discovery_Protocol">the IPv6 Neighbor Discovery Protocol (NDP)</a> and are
used for other things. However, under Linux <a href="https://social.treehouse.systems/@grawity/111416856752696832">WireGuard interfaces
disable automatic kernel link local generation</a>,
leaving either your higher level software or you to configure them.
So the obvious question is whether you should set up IPv6 link local
addresses by hand (if your software will do it for you, you might
as well let it).</p>
<p>WireGuard interfaces are point to point links and don't do NDP, so
they don't need a link-local address for that, and I don't know if
you can run <a href="https://en.wikipedia.org/wiki/DHCPv6">DHCPv6</a> over
one even if you want to. Apparently <a href="https://en.wikipedia.org/wiki/Open_Shortest_Path_First#OSPF_v3">OSPFv3</a>
requires link-local addresses, and you might want to run that in
some more complicated WireGuard IPv6 situations. A simple point to
point WireGuard link to extend IPv6 to a host will work (as far as
I can tell) if you only configure it with the global peer IPv6
addresses involved and don't have link local addresses, but <a href="https://mastodon.social/@cks/111416337437328726">this
may be an IPv6 crime</a>.</p>
<p>However, it may be that one or both ends doesn't have a fixed IPv6
address; for example, they may obtain them through <a href="https://en.wikipedia.org/wiki/IPv6#Stateless_address_autoconfiguration_(SLAAC)">IPv6 Stateless
Address Autoconfiguration (SLAAC)</a>
and change them over time. In this case you can't configure a fixed
global (peer) address on the WireGuard interface, because it's not
fixed (if you tried, you'd have to coordinate updates with SLAAC
address changes). The only fixed addresses you have are link local
ones you generate yourself.</p>
<p>(Hopefully you at least know what IPv6 /64s (or greater) are on
each end of the WireGuard link so that you can set up appropriate
routing and allowed IP information.)</p>
<p>The other reason I see to set up link local addresses even if you
don't strictly need them is that it gives you an address for the
peer that's generally going to sidestep any routing configuration
issues. You can use this peer IP (with a scope or interface
specification) to ping or talk to the peer over the WireGuard link
to test it, and be pretty sure that this is exactly and only what's
happening. Now that I've realized this, I think I'm going to configure
link local addresses on all future IPv6 point to point links just for
this.</p>
<p>(I've spent enough time being puzzled by IPv4 routing issues involving
clever WireGuard configurations that I don't want to repeat it with
IPv6, although right now I'm not doing anything complicated.)</p>
<p>PS: Learn from my mistakes and remember to add your IPv6 link local
address range to the WireGuard allowed IPs (on both sides, if
applicable; in my case one side's allowed IPs is already all of
IPv6, so I only needed to add fe80::/64 to the other).</p>
</div>
WireGuard and the question of link-local IPv6 addresses2024-02-26T21:43:52Z2023-11-16T03:25:55Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/BackupSystemsStorageFormatDividecks<div class="wikitext"><p>One of the divides in large scale systems for handling backups is
whether they have their own custom storage format (or formats) for
backups, or whether they rely on outside tools to create what I'll
call 'backup blobs' that the backup system then manages. This
division is fractal, because sometimes what you're backing up is,
for example, database snapshots or dumps, and even if the backup
system has its own custom storage format it may well treat the
database dump as an opaque blob of a file that it only deals with
as a unit. (It's a lot of work to be able to peer inside all of
the storage formats you might run into, or even recognize them.)</p>
<p>The advantage of a backup system that relies on other tools is that
it doesn't have to write the tools. This has two benefits. First,
standard tools for making backups of filesystems and so on are often
much more thoroughly tested and hardened against weird things than
a new program. Second, if you allow people to specify what tools
to use and provide their own, they can flexibly back up a lot of
things in a lot of special ways; for example, you could write a
custom tool that took a ZFS snapshot of a filesystem and then
generated a backup of the snapshot. More complex tricks are possible
if people want to write the tools (imagine a database 'backup'
program that treated the database as something like a filesystem,
indexing it and allowing selective restores).</p>
<p>(Generally, backup systems insist that tools have certain features
and capabilities, for example being able to report a list of contents
(an index) of a just-made backup in a standard format. It's up to
you to adapt existing programs to fit these requirements, perhaps
with cover programs.)</p>
<p>The advantage of a backup system that has its own storage format
for backups and its own tools for creating them, restoring them,
and so on is that the backup system can often offer more features
(and better implemented ones). A backup system that relies on other
tools for the actual work of creating backups and performing restores
is forced to treat those tools as relatively black boxes; a backup
system that does this work in-house can tightly integrate things
to provide various nice features, like knowing exactly where a
specific file you want to restore is within a large backup, or
easily performing fine grained backup scheduling and tracking across
a lot of files. And the storage format itself can be specifically
designed for the needs of backups (and this backup system), instead
of being at the mercy of the multiple purposes and historical origins
of, say, tar format archives.</p>
<p>(But then the backup system has to do all of the work itself, and
fix all of the bugs, if it manages to find them before they damage
someone's backup.)</p>
<p>In practice, backup systems seem to go back and forth on this over
time depending on their goals (including where they're backing up
to) and the state of the commonly available tools on the platforms
they want to work on. For commercial backup systems, there can also
be commercial reasons to use a custom format that only your own
tools can deal with. Over some parts of the past, general tools
have been limited and not considered suitable so even open-source
people built fully custom systems. Over other parts, the tools have
been considered good enough for the goals, so open-source backup
systems tended to use them and focus on other areas.</p>
<p>(For open source backup systems it is in some sense a waste to have
to write your own tools. There's only so much programming resources
you have available and there are lots of things a good backup system
needs to implement that can't be outsourced to other tools.)</p>
</div>
Backup systems and how much they do or don't know about storage formats2024-02-26T21:43:52Z2023-11-11T04:13:11Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/FirewallsAndMACscks<div class="wikitext"><p>Over on the Fediverse <a href="https://mastodon.social/@cks/111315098180201473">I mentioned that on some networks we authorize
machines by controlling what Ethernet addresses ('MACs') get what
IP addresses</a>. In
response, I was asked <a href="https://mastodon.sdf.org/@smammy/111315374869667323">a very good question about why not have the
firewall work by Ethernet address instead of IP</a>. A starting
answer is that firewalls have traditionally not had particularly good
support for working on MACs, and have instead focused on IPs. But
why do firewalls prefer to work this way? There are probably several
reasons, but I will theorize that a good part of it is that IP
addresses are both more general and easier to work with.</p>
<p>First, IP addresses are more general, as they're preserved across
router hops and the destination address is available much earlier
even for destinations that are directly attached to the same network
as the firewall. An Ethernet address is only useful if the sending
or receiving machine is on the same network as the firewall and
talking to it directly; otherwise it will be the MAC of some router.
Or, to put it differently, the Ethernet addresses of packets are
constantly being rewritten by everything that touches them (more
or less) in the natural course of events, while IPs only change if
someone does it deliberately. IPs tell you the (purported) endpoint,
while MACs only tell you the immediate next or previous hop. Often
the endpoints are much more meaningful to firewalls than the immediate
hops.</p>
<p>(Partly this is a result of most network traffic being routed
traffic. Of course, controlling the ability to get off a network
is an exception.)</p>
<p>Second, IPs are generally easier for a firewall to work with because
they're more structured. Ethernet addresses are not quite random,
but from a firewall's perspective they usually might as well be;
the list of MACs that are or could be on some network generally has
no structure or organization to it, since the devices may come from
many vendors with many assigned Ethernet prefixes. IP addresses are
much more hierarchical (by design) and so in many situations are
much more amenable to compact representations and data structures
with good performance. Or to put it another way, there's no MAC
equivalent of '192.168.1.0/24'.</p>
<p>(Often there will also be fewer active IPs than there are known and
potentially active MACs, so even without structure to the IP addresses
the firewall has fewer objects to care about.)</p>
<p>Since DHCP servers turn MACs into IPs, they have to deal with much
the same challenges as firewalls. However, they have two advantages.
The first is that they're in user space instead of in the kernel,
which means they have a lot more options (and the consequences are
lower). The second is that they have to do this mapping process
only infrequently and they're not in the critical path for per-packet
latency. If it takes half a second or a second or even ten seconds
to get a DHCP lease, this is probably not critical; if it takes a
tenth of a second for your packet to go through the firewall, this
is very bad. A DHCP server can usually afford to do slow things
where a firewall can't.</p>
<p>(Also, a linear array of MACs is not that space consuming, at least
by the standards of user space programs. If you don't have any
additional data, you can pack a table of 100,000 of them into just
under 600 KiB. If the DHCP server keeps them in sorted order,
matching a MAC can be done with binary search. Most DHCP servers
probably don't have anywhere near 100,000 registered MACs.)</p>
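<p>(A minimal Python sketch of that user space approach, with a couple of
made-up MACs standing in for a real registration list.)</p>
<pre>
import bisect

def pack_mac(mac):
    # '00:11:22:33:44:55' -> 6 raw bytes
    return bytes.fromhex(mac.replace(":", ""))

# Stand-in registered MACs; at 6 bytes each, 100,000 of them
# packed this way is just under 600 KiB.
registered = ["00:11:22:33:44:55", "52:54:00:12:34:56"]
table = sorted(pack_mac(m) for m in registered)

def is_registered(mac):
    needle = pack_mac(mac)
    i = bisect.bisect_left(table, needle)
    return i < len(table) and table[i] == needle

print(is_registered("52:54:00:12:34:56"))   # True
</pre>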
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/FirewallsAndMACs?showcomments#comments">One comment</a>.) </div>Network firewalls and Ethernet addresses2024-02-26T21:43:52Z2023-11-03T03:01:06Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/TLSClientHostCertVerificationscks<div class="wikitext"><p>One of the lesser used aspects of <a href="https://en.wikipedia.org/wiki/Transport_Layer_Security">TLS</a> is that
TLS clients can send a certificate to the TLS server, in addition
to the server sending one to clients. In private deployments, these
client certificates are often issued out of a private Certificate
Authority, possibly with custom fields that are understood by the
software involved. However, you can also use conventional public
TLS certificates for hosts as client certificates, and there are
situations where you might want to do this; for a non-hypothetical
example, you might want to verify some sort of 'identity' of third
party <a href="https://en.wikipedia.org/wiki/Simple_Mail_Transfer_Protocol">SMTP</a>
mail sending machines that are contacting your (public) SMTP server
in order to give them extra privileges.</p>
<p>(The advantage of not using a private Certificate Authority for
this is that you don't have to run a CA or validate the identity
of clients when they request certificates from you; you delegate
all of those hassles to the public 'web PKI' infrastructure.)</p>
<p>If you get sent a TLS client certificate that is a host certificate,
there are at least two decent approaches to verifying the identity
(as well as an obvious third terrible one). First, in all cases you
need to <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSCertVerifyTwoParts">verify the TLS certificate chain</a>
and perhaps check for revocation and so on. But then, <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSCertVerifyTwoParts">just as
with servers</a>, you need to verify the hostname
(or host names). To do this, you could have a list of allowed host
names and check that the TLS client certificate is for one of them,
or you could check that the TLS client certificate verifies for a
particular allowed host name.</p>
<p>One difference between these two is wild card TLS certificates. A
wild card TLS certificate for '*.example.org' will validate for
'host.example.org', but its host name isn't 'host.example.org'.
Another difference is that you might wind up with easier code for
validation, because you can simply ask your TLS library 'does this
TLS certificate chain validate for host <X>' rather than having to
ask 'does this certificate chain validate' and then checking the
DNS names in the host certificate.</p>
<p>On the other hand, if you have a lot of host names to accept you
probably want to validate the certificate chain only once, since
it's expensive, and then do the cheap host name checks. And provided
that you're careful, you can handle matching wild card TLS certificate
names yourself, or perhaps your TLS library will have explicit
support for it.</p>
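<p>(If you do handle wildcard names yourself, the matching step is small
but easy to get subtly wrong. Here is a minimal Python sketch of it,
which assumes the certificate chain has already been verified separately
and ignores corner cases such as internationalized names.)</p>
<pre>
def name_matches(cert_name, host):
    # Accept a single leading '*.' wildcard label; otherwise require
    # an exact, case-insensitive match.
    cert_name, host = cert_name.lower(), host.lower()
    if cert_name.startswith("*."):
        return "." in host and host.split(".", 1)[1] == cert_name[2:]
    return cert_name == host

def allowed(cert_dns_names, allowed_hosts):
    return any(name_matches(n, h)
               for n in cert_dns_names for h in allowed_hosts)

print(allowed(["*.example.org"], ["host.example.org"]))   # True
print(allowed(["*.example.org"], ["example.org"]))        # False
</pre>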
<p>In general I suspect that people mostly use the 'verify the TLS
chain and then check the host name separately' approach unless they
have only a single host name they accept (and are confident that
it will stay that way). Wild card TLS certificates are probably
uncommon enough that you can get away with ignoring them here.</p>
<p>(All of this only occurred to me recently when I needed to deal with TLS
client host certificates for reasons outside the scope of this entry,
and wound up doing some tooling work so that I could see what my test
machine was sending as a TLS client certificate.)</p>
<h3>Sidebar: The terrible approach</h3>
<p>The terrible approach is to (securely) look up the DNS name for the
IP address that contacted you and then verify the TLS certificate
it gave you against that hostname. This is terrible primarily for
operational reasons; people often have many outgoing IP addresses,
each of which will usually have a unique name, but they probably
don't want to give all of them a wild card TLS certificate and you
probably don't want to have to list (and update) all of those
individual names. Just like people load balance inbound HTTP (or
even SMTP) connections to a pool of servers, all of which may have
the same TLS certificate, it's sensible to have multiple outgoing
IPs all use a TLS certificate for a specific (and possibly generic)
host name.</p>
<p>(The other issue is that it converts DNS lookup problems into TLS
certificate validation failures.)</p>
</div>
There are at least two ways to 'verify' TLS client host certificates2024-02-26T21:43:52Z2023-10-26T02:47:44Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/FirewallsStatefulAndStatelesscks<div class="wikitext"><p>One of the distinctions in <a href="https://en.wikipedia.org/wiki/Firewall_(computing)">network firewalls</a> is between
ones that primarily operate with state entries for connections and
ones that are primarily stateless. Somewhat famously, <a href="https://man.openbsd.org/pf.conf#PACKET_FILTERING">OpenBSD's
PF packet filter wants to be stateful</a>, although you
can coerce it to operate in a stateless mode. <a href="https://utcc.utoronto.ca/~cks/space/blog/unix/OpenBSDPfStatesAndDoS">This stateful mode
can create problems with things like denial of service attacks</a>, so you might ask why OpenBSD prefers
stateful operation. The simple answer is that a stateful packet
filter will often perform better, although this depends on what
sort of traffic it sees.</p>
<p>The reason a stateful packet filter can perform better is that it
usually has an easier job. Generally speaking, the state entries for
'connections' (which for UDP and other stateless protocols are
really flows) are orderless and almost always unique. For example,
for UDP and TCP, the tuple of source IP, source port, destination
IP, and destination port uniquely identify a connection (possibly
augmented with local context such as whether the packet is being
received or sent, and on what network interface). When you have
orderless and generally unique state entries like this, there are
plenty of data structures that provide fast lookups for them, so
it's easy for the packet filter to find the state entry for a given
packet (if it exists) and proceed from there.</p>
<p>By contrast, packet filtering rules are almost always ordered and
not easily established as unique. A given packet often may potentially
match multiple rules, and the order that you check rules for a match
is usually semantically meaningful (often the first matching rule
decides the packet's fate). This means that you need to evaluate
rules in order, or at least provide results that are the same as
if you actually did that. In practice, even though you may have a
lot of packet filtering rules, a given packet can only possibly
match a few of them; however, transforming a general list of packet
filtering rules into efficient data structures for rapid minimal
matching is a non-trivial programming exercise and is rarely done
automatically. This leaves packet filters to do a bunch of rule
checking for every stateless packet.</p>
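<p>(A toy Python sketch of the difference; the packet fields and the rule
objects with their matches() method are purely hypothetical stand-ins.)</p>
<pre>
# State entries are keyed by the connection tuple, so finding one is a
# single hash table probe no matter how many entries exist.
state = {}   # (src_ip, src_port, dst_ip, dst_port, proto) -> entry

def lookup_state(src_ip, src_port, dst_ip, dst_port, proto):
    return state.get((src_ip, src_port, dst_ip, dst_port, proto))

# Rules are ordered and a packet may match several, so without clever
# preprocessing you check them first to last until one matches.
def first_matching_rule(packet, rules):
    for rule in rules:
        if rule.matches(packet):
            return rule
    return None
</pre>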
<p>(In theory you can hand-optimize your firewall rules. In practice
the optimized versions may be much harder to understand and change,
which isn't necessarily a good tradeoff.)</p>
<p>The fly in this ointment is the question of what sort of traffic
your packet filter actually sees. Common firewall implementations
only establish state for successful connections (or flows); if the
firewall rejects packets, they don't create state. Thus, the more
rejected traffic that your firewall sees compared to accepted
traffic, the less state entries are helping you since every rejected
packet always requires checking some or all of your firewall rules.</p>
<p>(But at least the accepted traffic gets a fast pass through the
filter, which may be important even if you're spending a lot of
system resources on checking rules for packets that you reject.)</p>
</div>
Stateful versus stateless firewalls, and why stateful is attractive2024-02-26T21:43:52Z2023-10-15T01:53:37Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/KeysAndCharacterscks<div class="wikitext"><p>When I wrote about <a href="https://utcc.utoronto.ca/~cks/space/blog/programming/EmacsUnderstandingCompletion">my understanding of completion in Emacs</a>, I mentioned that
on demand completion is accessible through what Emacs calls "M-TAB",
a notation that means 'Meta-<tab>', where Meta is normally your Alt
key(s). Over on the Fediverse, <a href="https://emacs.ch/@wirthy/111212441359594729">Jason P mentioned</a> that this completion
is also accessible through "C-M-i", Ctrl-Meta-i. This will raise
the eyebrows of some people, because Ctrl-i is also a tab; in ASCII
(and thus Unicode), tab is literally Ctrl-i. What's going on here
is a distinction in modern graphical environments between keys and
characters. In fact there are often at least three layers involved.</p>
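<p>(You can see the ASCII identity directly; Ctrl with a letter strips
the high bits of the letter's character code.)</p>
<pre>
print(ord("i") & 0x1f)   # 9
print(ord("\t"))         # 9, ie the same character as Ctrl-i
</pre>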
<p>At the lowest level, your keyboard typically generates <a href="https://en.wikipedia.org/wiki/Scancode">scancodes</a> when keys are pressed. On
a normal keyboard, each physical key has an associated scancode (or
two, one when it's pressed and one when it's released), and it
always generates that scancode. In your operating system and graphical
environments, these scancodes may then get remapped for various
different reasons to create, for example, an <a href="https://www.x.org/releases/X11R7.6/doc/xproto/x11protocol.html#keyboards">X keycode</a>.
At this point these are broadly just numbers, although there is a
default meaning associated with them based on standards (ie,
everything knows that the key normally labeled 'a' on a USB keyboard
will generate a certain scancode).</p>
<p>(Some people use special keyboards with firmware (such as <a href="https://qmk.fm/">QMK</a>) that has things like <a href="https://docs.qmk.fm/#/feature_layers">'layers'</a> that allow them to change
the scancodes generated by physical keys on the fly.)</p>
<p>Then your graphical environment will take these numbers and assign
a meaning to them; in X, these are 'keysyms'. The reason we assign
meanings at this layer is that it handles things like different
keyboard layouts, such as QWERTY versus Dvorak, national keyboards
with alternate symbols associated with them, and people's desire
to remap bits of their keyboards (for example making Caps Lock not
that). Graphical programs, such as Emacs in graphical mode, tend
to do key binding based on keysyms, although often at a slightly
abstract layer where there is just 'Alt' instead of 'Left Alt' and
'Right Alt'. At the keysym layer (and the scancode layer), there
is a clear distinction between the TAB key and the 'i' key hit with
Ctrl held down (or active), and so programs like Emacs can bind
them separately.</p>
<p>(Programs can also do internal mappings and translations so that,
for example, you don't have to separately bind 'TAB' and 'Ctrl-i',
or that 'Ctrl-Return' is treated like Return unless you have a
special binding set for it. These translations are often program
specific; one program may treat Ctrl-Return as Return by default,
and another one may reject it as 'no binding set'.)</p>
<p>Then we have the character layer, where key presses ultimately
generate characters (usually UTF-8 these days). At the character
layer, TAB and Ctrl-i (and Ctrl-Shift-i) are generally indistinguishable,
as they all generate the ASCII (and Unicode) character 9. You could
make them generate different characters if you wanted to, but then
you'd have to decide what the other character is. Various parts of
the intermediate keysym layer aren't representable any more at the
character level; generally all modifiers (like Alt, Meta, and Ctrl)
are lost unless they can be transformed into ASCII control characters.
At the keysym layer you can generally easily tell the difference
between Ctrl-Return, Shift-Return, and plain Return, but not at the
character level.</p>
<p>One reason the character layer matters is because the character
layer is what you get in terminal windows, including in remote
logins over SSH. Programs like Emacs that want to both work in a
terminal and take advantage of richer keysym bindings get to adopt
various workarounds. One long standing Emacs workaround is that an
Escape character gives the next character the Meta marker; 'ESC a'
is M-a. Emacs also has a system where C-x @ <character> applies
various modifiers to the next real key; this is sufficiently awkward
that you probably only want to do it in desperation, or if your
terminal program automatically generates the right prefix for you.
I believe that people who expect to use Emacs in terminal windows
try not to wind up depending on key bindings that don't work well
in a character environment.</p>
<p>(Perhaps partly because of Emacs, many Unix terminal emulators
translate 'Alt-<key>' into the sequence 'Escape <key>'. Other
modifiers are either not translated at all or are translated to
create standard ASCII characters. In the past, terminal emulators
sometimes set the 8th bit in characters to represent Alt (or Meta),
but that died off when UTF-8 became a common thing.)</p>
</div>
Keyboard keys and characters2024-02-26T21:43:52Z2023-10-11T03:02:54Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/MFAPushFatigueQuestionscks<div class="wikitext"><p>In a comment on <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/MFABasicOptionsIn2023">my entry about common multi-factor authentication
methods</a>, Ben Hutchings said (in part):</p>
<blockquote><p>I tend to think push-based approval is the least secure of the
three. I've read several incident reports where the attacker got a
user's password and then repeatedly tried to log in, spamming them
with approval requests until they gave in and tapped Approve.</p>
</blockquote>
<p>I've read similar reports (perhaps the same reports), so I believe
that this MFA push notification fatigue is real, but <a href="https://mastodon.social/@cks/111156714799915002">I have
questions</a> because
I don't understand how it actually works. Or, to put it another
way, I don't understand how organizations set up (or break) their
MFA environment such that it works.</p>
<p>In a sensibly set up MFA environment, I would assume that you don't
get unsolicited, unprompted MFA requests out of the blue as an
ordinary part of your ordinarily daily activities. Instead, you
only get MFA requests if you're specifically doing something that
needs authentication, such as logging in or <code>sudo</code> or whatever.
I'd also expect the organization's authentication and MFA endpoints
to require that a valid password be presented first (although if
an invalid password was presented on a public endpoint, the endpoint
might pretend it was doing an MFA prompt, to not provide an attacker
a password validation service). I'd especially expect this to be
required for anything that can be reached from outside the organization,
by unauthenticated people on the Internet.</p>
<p>In this environment, getting a surprise MFA push request (or worse,
several) out of the blue means that someone else has your password,
which should cause you to hit some sort of big red security problem
button to trigger at least a password change. It would also mean
that if someone explicitly rejects one (or several) MFA push
authentications, that should be a red alert to the security team
(even if the person being spammed by notifications doesn't report
it themselves). An MFA push notification might time out on its own
for various reasons, but an active rejection is a very bad sign;
it's the person telling you (the security system) that they did not
make this request and actively rejected it.</p>
<p>(If your organization has internal, non-guarded endpoints that can
trigger MFA push notifications without someone knowing your password,
this at least means that someone is inside your network hitting
those endpoints. That ought to be a security issue all by itself.)</p>
<p>Since MFA push notification fatigue is real (as Ben Hutchings
mentioned), presumably one or more of these assumptions I'm making
is wrong in these environments, either (or both) of the technical
assumptions or the social assumptions of, say, there not being
consequences to you of reporting that your password has been
compromised (or just quietly changing it).</p>
<p>(Although common MFA push notification systems are provided by third
party companies and so might be used by multiple organizations that
you have accounts with, I believe that they tell you who is requesting
a push authorization. Hopefully in some way that can't be forged
by another customer, even if they allow customers to supply some
text to be shown to you.)</p>
<p>PS: Although it's not push notification fatigue itself, having an
organizational policy of locking someone's MFA after N
failed requests is rather dangerous. If I've gotten N-1 MFA push
notifications that I've ignored or rejected, I know that I am about
to get locked out if I say no to the next one. That may force me
to roll the dice on whether this is an attacker trying to get me
to say yes or something gone wrong in our systems that is triggering
unexpected MFA challenges. The harder it is to get my MFA unlocked
and the more unpredictable my organization's MFA handling is, the
more I may be likely to take the risk.</p>
<p>(I think what you want is for each service or MFA endpoint to have
its own separate token or other authorization signature, and when
you see N MFA failures from a particular service, you only lock out
that user on that service. A global user lockout should only be
triggered if you have high confidence that the user's password must
have been compromised, based on the services that are failing to
pass MFA.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/MFAPushFatigueQuestions?showcomments#comments">7 comments</a>.) </div>I have questions about MFA push notification fatigue2024-02-26T21:43:52Z2023-10-01T03:00:12Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/LatencyImpactMyXExperiencecks<div class="wikitext"><p>As <a href="https://utcc.utoronto.ca/~cks/space/blog/web/ExperiencingWebBloat">I mentioned recently</a>, I recently
had an extended outage on my home Internet. When my Internet came
back, it was a little bit different. My old home Internet was DSL
with 14 Mbits down, 7 Mbits up, and about 7 milliseconds pings to
work. The new state of my home Internet is still DSL from the same
provider, but now it's 50 Mbits down, 4 Mbits up, and about 18
milliseconds pings to work at the moment. When my Internet first
came back, I didn't expect to feel or see any real difference in
the experience. It turns out that I was naive.</p>
<p>(Almost all of the ping latency is in the first hop, over my DSL
link. Because I have a home <a href="https://prometheus.io/">Prometheus</a>
setup, I actually have historical data on ping round trip times,
so I can verify the pre-outage details.)</p>
<p>As far as I can tell, my experience of plain text mode SSH sessions
is unchanged, with nothing feeling any different. Unfortunately the
same is not true of my use of remote X (as forwarded over SSH).
Since early 2020, I've become accustomed to doing <a href="https://utcc.utoronto.ca/~cks/space/blog/unix/RemoteXLifesaver">a number of
lightly graphical X things over remote X</a>;
after my link came back, all of these started feeling variously laggy
and slow (especially my remote <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/ToolsEmail">exmh</a>,
which I normally handle much of my email in). They weren't unusable,
but they were sluggish enough to make me unhappy. I would type a
key to take some action, and then I'd have a perceptible lag before
the program's visible state updated, in a way I hadn't really had
to before.</p>
<p>Interestingly, not all X programs are particularly affected by this.
In particular (and conveniently for me), GNU Emacs doesn't seem to
be; a remote X session of GNU Emacs is quite snappy and about as
responsive as a text mode version for most things (although not all
of them). This has led to <a href="https://mastodon.social/@cks/111014415431327200">me suddenly being interested in reading
my (N)MH email through GNU Emacs' MH-E mode</a> (and then some
other latency issues led me to <a href="https://mastodon.social/@cks/111025146725560766">build a little system for remotely
opening URLs without the latency of remotely manipulating X properties</a>). Since X text
these days is all graphics (the remote client draws glyphs locally
and then sends the drawn glyphs as graphical blobs), I'm not sure
why this is, but part of it may be that exmh is written in TCL/TK,
which haven't seen much work for a long time, while GNU Emacs these
days is based on modern graphical libraries that may have seen more
optimization.</p>
<p>Now that I've written it out, it seems obvious at some level that
more than doubling my ping round trip times would have a visible
effect. But on the other hand, it's not particularly visible in my
text typing over SSH, and the vastly increased incoming bandwidth
should help with X programs pushing big glyphs or graphics blobs
to me.</p>
<p>(I think this impact on some X programs is more likely to be from
the increased latency than from my decreased upstream bandwidth,
but I admit I can't be sure.)</p>
<p>PS: In a way this was (and is) also an interesting experience in
seeing how even a bit of response lag can cause people to be unhappy.
<a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/ToolsEmail">Exmh</a> was still pretty prompt in updating
to show new messages and things like that, but just a little bit
of visible lag between typing an 'n' and seeing the next message
displayed was enough to get to me.</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/LatencyImpactMyXExperience?showcomments#comments">3 comments</a>.) </div>The effects of modest TCP latency (I think) on my experience with some X programs2024-02-26T21:43:52Z2023-09-10T02:19:57Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/MFABasicOptionsIn2023cks<div class="wikitext"><p>I am broadly a MFA (Multi-Factor Authentication) skeptic (<a href="https://mastodon.social/@cks/111012545139396564">cf</a>) and as a result
I don't have much exposure to it. For reasons beyond the scope of
this entry, I've recently been needing to understand more than usual
about how it works from the perspective of people using it, so here
is my current understanding of your generally available non-hardware
options that can be used in a desktop environment (security keys
are out of scope).</p>
<p>There are three generally available and used approaches to MFA at
the moment: SMS, <a href="https://en.wikipedia.org/wiki/Time-based_one-time_password">time-based one time passwords (TOTP)</a>, and
what I've heard called 'push-based approval' using smartphone apps.
Of these, I believe that TOTP is the most popular, and a place that
simply talks about 'MFA' is probably talking about <a href="https://en.wikipedia.org/wiki/Time-based_one-time_password">TOTP</a> MFA
authentication, especially if they say they support multiple
smartphone apps.</p>
<p>(At some point this list may include <a href="https://en.wikipedia.org/wiki/WebAuthn">WebAuthn</a>, but right now you mostly
need a hardware security key to use it on your desktop or laptop.)</p>
<p>In SMS MFA, the place you're trying to log in to sends a SMS message
to your phone number with a code that you have to enter (sometimes
you can also get the code emailed to you). Websites vary on whether
you can enroll more than one phone number for these messages. SMS
MFA is considered insecure, partly because it's generally not that
difficult to get someone's number 'ported' to a new device under
your control, at which point you get their SMS messages. On the
other hand, SMS is easy for people to start using, because practically
everyone can already receive text messages.</p>
<p>In push-based approval, a special app on your phone gets push
notifications of pending logins and asks you whether or not you
approve them. With some hand waving, this is pretty secure, as it
requires possession of your phone in an unlocked state and perhaps
unlocking the app itself. An attacker can't step into the middle
to impersonate your phone (or the app on it) the way they can with
SMS and ported numbers. Again, websites and companies vary on whether
you can enroll multiple devices for push based approvals. One current
drawback of push based approvals is that there is no standard
protocol for this, so each provider of this service has their own
custom app (and, of course, you have to trust their app to not be
scraping your phone for every bit of marketable information it can
extract).</p>
<p>The third option is <a href="https://en.wikipedia.org/wiki/Time-based_one-time_password">TOTP</a>. In
<a href="https://en.wikipedia.org/wiki/Time-based_one-time_password">TOTP</a> the website and you share a common secret code (often
provided as a <a href="https://en.wikipedia.org/wiki/QR_code">QR code</a>)
and you use a standard public algorithm to combine this secret with
the current time to generate a numeric code. If you give the server
the right code, the server 'knows' that you know the shared secret
at this time (well, within a time window). Unlike push based
approvals, there's no explicit communication between any server and
the TOTP app on your phone; the app is a standalone, isolated thing.
Because the algorithm is standard and public, it's been implemented
by many different smartphone apps and those apps aren't tied to the
website or provider they're from; any proper TOTP app can do TOTP
with any proper TOTP website (or other server).</p>
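<p>(To make the mechanism concrete, here's a minimal sketch of the standard
TOTP computation (RFC 6238 with its default 30-second step and 6 digits)
in Go, using only the standard library. The base32 secret in it is a
widely used documentation example, not anything real, and this is an
illustration of the algorithm rather than a vetted implementation.)</p>
<pre>
// totp.go: a minimal sketch of the standard TOTP computation, assuming a
// base32-encoded shared secret like the ones websites show next to their
// QR codes.
package main

import (
    "crypto/hmac"
    "crypto/sha1"
    "encoding/base32"
    "encoding/binary"
    "fmt"
    "strings"
    "time"
)

// totpCode derives the 6-digit code for the given time from the shared secret.
func totpCode(secretBase32 string, t time.Time) (string, error) {
    secret, err := base32.StdEncoding.WithPadding(base32.NoPadding).
        DecodeString(strings.ToUpper(secretBase32))
    if err != nil {
        return "", err
    }
    // The moving factor is the number of 30-second steps since the Unix epoch.
    var counter [8]byte
    binary.BigEndian.PutUint64(counter[:], uint64(t.Unix()/30))
    // HOTP (RFC 4226): HMAC-SHA1 the counter with the secret, then
    // dynamically truncate the result to a 31-bit number.
    mac := hmac.New(sha1.New, secret)
    mac.Write(counter[:])
    sum := mac.Sum(nil)
    offset := sum[len(sum)-1] & 0x0f
    code := binary.BigEndian.Uint32(sum[offset:offset+4]) & 0x7fffffff
    return fmt.Sprintf("%06d", code%1000000), nil
}

func main() {
    // 'JBSWY3DPEHPK3PXP' is a common documentation example, not a real secret.
    code, err := totpCode("JBSWY3DPEHPK3PXP", time.Now())
    if err != nil {
        panic(err)
    }
    fmt.Println(code)
}
</pre>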
<p>Looked at from a suitable angle, TOTP is really a second password
(the shared secret) with a weird way of proving to the server that
you know this password. It is 'Multi-Factor Authentication' partly
because you're normally supposed to use another device to generate
the TOTP code, not your desktop or laptop, and partly because you
don't memorize the TOTP secret, you store it somewhere. If you're
logging in on your smartphone in the first place, TOTP's MFAness
boils down to 'your physical phone is what knows the TOTP secret,
not you', so only someone in possession of your phone (in an unlocked
state) can get at it.</p>
<p>TOTP is a popular way of doing MFA, perhaps the most popular right
now, and it's not hard to see why. It's more secure than SMS and
doesn't require the website to find (and pay) a SMS provider, and
while it's probably less secure than push based approval, it doesn't
require a bespoke mobile application along with a push notification
backend cloud server setup. There are plenty of client applications
for people with smartphones to choose from and, as I understand it, the
server support is relatively widely available in open source
libraries.</p>
<p>(There are some TOTP desktop applications, but I think your
choices aren't as broad or as polished as on phones. On the
other hand, you can get them even on Linux.)</p>
<p>Websites using TOTP MFA may allow you to enroll multiple devices,
each with their own TOTP secret code. However, even if they don't
explicitly offer this option, there is nothing stopping you from
loading the same TOTP secret code into multiple TOTP apps
on multiple devices, or even directly saving the TOTP secret code
so that you can later feed it into whatever you want. Websites often
ask you not to do this (and especially tell you to throw away the
initial TOTP secret code or QR code, not keep it anywhere people
can find it), but they can't force you and they can't tell if you're
doing this because there's no explicit communication between them
and your TOTP app the way there is with push based approval.</p>
<p>(The advantage of enrolling multiple devices with separate TOTP
secret codes is that you can hopefully revoke just one device's
TOTP secret code if something goes wrong. If everyone has the
same code, everyone has a flag day if it has to be revoked and
redone. You might also get better auditing.)</p>
<p>This means that if for some reason you have to add MFA to a shared
administrative account on some website, you're generally best off
if the website supports TOTP MFA. You can probably
get TOTP clients for any environment the relevant people use and
load the TOTP secret code into all of them, enabling each person
to MFA to the website as the shared account. You can probably even
print out the QR code the website generates for you, fold it up,
and seal it in a 'break glass in case of emergency' envelope in
your password safe.</p>
<p>(Each TOTP app knows the TOTP secret code that's encoded in the
original QR code, but they may well not support any way of giving
it back to you, especially in usable form, partly because that's a
security exposure.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/MFABasicOptionsIn2023?showcomments#comments">4 comments</a>.) </div>What I understand about two-factor/multi-factor authentication (in 2023)2024-02-26T21:43:52Z2023-09-07T02:56:22Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/TLSInternalCANameConstraintscks<div class="wikitext"><p>For a long time, one of the pieces of advice for dealing with various
<a href="https://en.wikipedia.org/wiki/Transport_Layer_Security">TLS</a>
certificate problems is that you should establish your own internal
Certificate Authority with its own CA root certificate, have your
systems trust it, and then issue certificates from your internal
CA with whatever names and other qualities you needed. My reaction
to this suggestion has traditionally been that it was extremely
dangerous. If your internal CA was compromised in some way you had
given an attacker the ability to impersonate anything, and generally
properly operating a truly secure internal CA is within neither the
skills nor the budget of a typical organization or group (it's
certainly not within ours). Fortunately, this issue was obvious to
a lot of people for a long time, so as part of <a href="https://datatracker.ietf.org/doc/html/rfc5280">RFC 5280</a> we got <a href="https://datatracker.ietf.org/doc/html/rfc5280#section-4.2.1.10"><em>name
constraints</em></a>,
which restricted the names (in most contexts, the DNS names) that
the CA could sign certificates for. You could include only some
(sub)domains, or exclude some.</p>
<p>(So, for example, you could make an internal CA for your BMC
IPMI web servers that was only valid for '.ipmi.internal.example.com'.)</p>
<p>All of this sounds good. However, in the real world, some things
appear to have intervened. To start with, TLS libraries, browsers,
and so on didn't immediately add support for these name constraints;
as a result, even today you probably want to do some testing to see
if your particular environment does (possibly using some resources
from <a href="https://bettertls.com/">BetterTLS</a>). The good news is that
<a href="https://systemoverlord.com/2020/06/14/private-ca-with-x-509-name-constraints.html">according to this 2020 article</a>,
browsers now support this, which is probably the most important
case. Another issue is that creating TLS CA certificates with name
constraints isn't the easiest thing in the world, at least with
OpenSSL; other tools may be better, but I haven't looked for any.</p>
<p>(I care about how easy and straightforward it is to add name
constraints because if it's tricky, we're going to need to test
that we actually did it right. I can imagine unpleasant scenarios
where we think we've created a CA root certificate with name
constraints but we actually haven't.)</p>
<p>A third issue is that <a href="https://serverfault.com/questions/1012847/does-chrome-support-x509v3-permitted-name-constraints">until Chrome 112 in April 2023, Chrome
didn't pay any attention to name constraints on CA root certificates</a>,
and <a href="https://alexsci.com/blog/name-non-constraint/">see also</a>,
based on their interpretation of <a href="https://datatracker.ietf.org/doc/html/rfc5280#section-6">RFC 5280 Certificate Validation</a>. As I
understand it, until then Chrome only applied name constraints from
intermediate CA certificates; the root CA certificate was unconstrained.
This is not exactly useful if you're worried about an attacker
managing to compromise your root CA key in some way. Other TLS
code and TLS libraries may have similar issues, although if you
test them directly you can know for yourself.</p>
<p>(Looking at Go, since it's one of my areas of interest, it appears
to support name constraints on CA root certificates and enforces
them. See <a href="https://go.googlesource.com/go/+/refs/heads/master/src/crypto/x509/name_constraints_test.go">src/crypto/x509/name_constraints_test.go</a>.)</p>
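<p>(As a small illustration of what the non-OpenSSL route can look like,
here's a minimal Go sketch of generating a self-signed internal CA root
certificate that carries a DNS name constraint. The CA name and the
'.ipmi.internal.example.com' constraint are just the hypothetical example
from earlier; this is a sketch, not real CA tooling.)</p>
<pre>
// A sketch of creating a self-signed internal CA root certificate with a
// DNS name constraint, using Go's crypto/x509. The subject and the
// permitted domain are made-up examples; real CA tooling needs much more
// care (key storage, serial numbers, lifetimes, and so on).
package main

import (
    "crypto/ecdsa"
    "crypto/elliptic"
    "crypto/rand"
    "crypto/x509"
    "crypto/x509/pkix"
    "encoding/pem"
    "math/big"
    "os"
    "time"
)

func main() {
    key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
    if err != nil {
        panic(err)
    }
    tmpl := &x509.Certificate{
        SerialNumber:          big.NewInt(1),
        Subject:               pkix.Name{CommonName: "Example internal IPMI CA"},
        NotBefore:             time.Now(),
        NotAfter:              time.Now().AddDate(10, 0, 0),
        IsCA:                  true,
        BasicConstraintsValid: true,
        KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageCRLSign,
        // The name constraint: certificates chaining to this root are only
        // acceptable for DNS names under this subdomain.
        PermittedDNSDomainsCritical: true,
        PermittedDNSDomains:         []string{".ipmi.internal.example.com"},
    }
    der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
    if err != nil {
        panic(err)
    }
    pem.Encode(os.Stdout, &pem.Block{Type: "CERTIFICATE", Bytes: der})
}
</pre>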
<p>We don't currently have any real internal CAs, although <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/FailingAtTLSRootRollover">we have
one for OpenVPN</a>. If we ever
set up one for some reason, I'm going to try to make sure to give
it a name constraint, and ideally as narrow a one as possible.</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSInternalCANameConstraints?showcomments#comments">One comment</a>.) </div>TLS CA root certificate name constraints for internal CAs2024-02-26T21:43:52Z2023-09-04T02:13:27Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/YamlIsOkayEnoughcks<div class="wikitext"><p><a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusWhyHistory">Ever since we set up Prometheus</a>,
I've had to deal with everyone's favorite configuration syntax to
hate, <a href="https://en.wikipedia.org/wiki/YAML">YAML</a>. Although YAML
isn't universal in the Prometheus and Grafana ecosystem, it's pretty
pervasive and many components and things you want to use are
configured using it as the configuration syntax, so I've had to
write and read plenty of it. While I have my issues with YAML, over
time I've come to feel that it's an okay enough syntax and that
often, <a href="https://utcc.utoronto.ca/~cks/space/blog/programming/YAMLAndConfigurationFiles">the big picture issues aren't because of its syntax</a>.</p>
<p>There have definitely been general languages for configuration that
I am really not fond of (I have a low opinion of writing XML by hand,
for example). I don't find YAML to be like this. The syntax is
simultaneously picky and lax, and deeply nested things can be hard
to follow, but overall it's inoffensive to write, modify, and read
(although you really want to get your editor to cooperate with its
indentation; <a href="https://utcc.utoronto.ca/~cks/space/blog/unix/VimSettingsForYaml">YAML is the one thing I actively configure vim for</a>).</p>
<p>There are simpler formats for simple situations, such as <a href="https://toml.io/en/">TOML</a>, but YAML has mostly won in practice in the
areas that I work in. I believe that Python has steadily moved
toward liking TOML, to the extent that <a href="https://docs.python.org/3/library/tomllib.html">tomllib</a> is now in Python's
standard library. In a way I wholeheartedly support that; if your
program needs enough of a complex configuration that TOML doesn't
work well, you probably should take the effort to create a focused
configuration language for it instead of leaning on <a href="https://utcc.utoronto.ca/~cks/space/blog/programming/YAMLAndConfigurationFiles">a serialization
format</a>. But as what is
fundamentally a serialization format, YAML is okay.</p>
<p>(Well, the subset of YAML that people use in practice is okay. There
are some esoteric features that people mostly don't touch, for good
reason, like repeated nodes that use '&' and '*'.)</p>
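<p>(For illustration, this is roughly what that feature looks like, with
made-up keys; an '&' anchor names a node and a '*' alias repeats that
whole node elsewhere.)</p>
<pre>
# Made-up example: '&common' gives this mapping a name, and each
# '*common' alias repeats the entire node where it appears.
common_labels: &common
  region: ca-on
  team: sysadmin

job_one:
  labels: *common
job_two:
  labels: *common
</pre>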
<p>It feels <a href="https://mastodon.social/@cks/110964693660842507">a bit heretical</a> to say this, but
sometimes there are things that it's not worth having really strongly
expressed views on. For me, YAML is one of those things. I may not
really like it but I can certainly live with it if a project decides
to use it. I'm not going to pick one project over another merely
because one uses TOML and the other YAML, for example.</p>
<p>(This is pretty much a system administrator's view, which is to say
the view of someone who uses systems configured with YAML and writes
their configuration files. Programmers who have to decide how their
system is configured can and probably should have stronger views
and better reasons for picking one particular format than 'it's
inoffensive and it's there'.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/YamlIsOkayEnough?showcomments#comments">One comment</a>.) </div>YAML is an okay enough configuration file format2024-02-26T21:43:52Z2023-08-28T03:19:07Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/TLSShortCertDurationVsBlackBoxescks<div class="wikitext"><p>Back in March, <a href="https://www.chromium.org/Home/chromium-security/root-ca-policy/moving-forward-together/">the Chrome team said that they wanted to reduce
the maximum TLS certificate duration down to 90 days</a>
(because I'm not always completely in touch with the TLS ecology,
I only found out about this recently). In general I'm in favour of
short <a href="https://en.wikipedia.org/wiki/Transport_Layer_Security">TLS</a>
certificate lifetimes and in automation for TLS certificate renewals
and deployment, so you might expect me to be all in favour of this.
But I actually think that this proposal would cause real problems
and get significant pushback from people.</p>
<p>(The reduction in certificate lifetime wouldn't directly affect my
group, since we already get all of our TLS certificates from <a href="https://letsencrypt.org/">Let's
Encrypt</a>, which only gives out 90 day
certificates.)</p>
<p>The problem I see is black box devices with TLS that aren't built
with support for automated (certificate) management and deployment,
and instead only support manual installation of new TLS certificates
(for example, through an administrative web interface). One such
class of devices that I'm painfully familiar with is <a href="https://en.wikipedia.org/wiki/Intelligent_Platform_Management_Interface#Baseboard_management_controller">server
management processors (BMCs)</a>.
A typical BMC generates (or comes with) a self-signed TLS certificate
but provides some way for you to equip it with a proper TLS certificate
through its web interface. We don't bother to go through the hassle
of giving our BMCs proper public DNS names and then getting proper
TLS certificates for them, but I'm sure there are some people who
do. And I'm also sure that there are plenty of other types of black
box devices and appliances out there that have similar features for
their TLS support.</p>
<p>This sort of manual update is tolerable if you only have to do it
rarely (and you don't have too many of the things to do it to). If
you keep having to do it every 80 days or so, people are going to
be rather unhappy. Many of these people will be in small organizations
(because that's the kind of place that buys black box devices) and
so not well placed to spend a bunch of money to upgrade their
devices, or spend a bunch of staff time to try to automate this
from the browser, or get their voices heard about the problems.</p>
<p>In an ideal world all of these devices would get replaced with ones
that have interfaces and APIs for automated TLS certificate deployment.
In the real world, that will take years even if tomorrow all TLS
certificates became valid for only 90 days, and so the vendors of
these devices were immediately forced into developing it.</p>
<p>(These devices aren't necessarily directly connected to the Internet,
so it isn't sufficient for them to have <a href="https://en.wikipedia.org/wiki/Automatic_Certificate_Management_Environment">ACME</a>
clients, although for some of them it would be a nice extra. In
general they need a way to push a TLS certificate to them, often
along with a private key for it.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSShortCertDurationVsBlackBoxes?showcomments#comments">4 comments</a>.) </div>One challenge in reducing TLS certificate lifetimes down to 90 days2024-02-26T21:43:52Z2023-08-25T03:03:00Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/CLAsImpedeContributionsIIcks<div class="wikitext"><p>I've seen a view expressed that Contributor License Agreements are
only a small extra piece of formality over contributing bugfixes
and other open source changes. I think this is wrong. Often, the
decisions that are made over whether or not to contribute changes
to open source projects are significantly different than the decisions
that must be made over CLAs, such that my university and similar
institutions have little to lose from the former and a great deal
to lose from the latter.</p>
<p>When I make a bugfix or some other small change to an open source
project's source code as part of my work, my university has only
two real options for what to with it; we can either keep the change
private or publish it under the original open source license used
by the project. Since universities are nominally not in competition
with each other in the way that companies are and are instead into
sharing things with the world, this is an easy and uncontroversial
call for everyone to make. There is pretty much no reason not to
share such small things.</p>
<p>(In a company, sharing your bugfixes for an open source project may
help your competitors who also use the project, so you have some
reason to keep them private. For large changes, the code I write
might in theory reveal intellectual property that the university
would like to keep private in order to patent or otherwise license,
and in general might give the university some leverage to negotiate
license changes or other things with the project. We have no leverage
for small bugfixes or changes.)</p>
<p>A Contributor License Agreement is a legal document and a legal
agreement. No institution enters into legal agreements without care,
and I am specifically not authorized to enter into such agreements
on behalf of my institution; very few people are, and they are all
busy and senior. As with any legal document, signing a CLA requires
the institution's lawyers to scrutinize the terms to see if there's
anything dangerous we're accepting in the process, because a CLA
may contain all sorts of surprising clauses and grants that the
institution would specifically be agreeing to give the other party.
This makes CLAs not anywhere near as simple as 'do we publish this
change under the project's open source license or keep it private'.
<strong>Signing a CLA is not at all the same as publishing a change under
the open source license its project requires</strong>, especially if the
project uses a standard, widely known open source license or a very
close variation of it.</p>
<p>(And releasing something under what is fundamentally a copyright
grant is quite different from executing a signed agreement with a
specific counterparty, who may acquire new legal rights or causes
of action against you due to clauses in the agreement.)</p>
<p>It's not at all odd or unusual that it's much easier to do one than
the other at my institution. Probably this is the case at any number
of organizations. This is a big factor in why <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/CLAsImpedeContributions">CLAs impede modest
contributions</a>, even if and when the
organization is fully in favour of publishing and sharing such
things. One corollary is that it's extremely unwise to assume that
someone's inability to execute a CLA means that they can't
actually publish or share their change.</p>
<p>Requiring a CLA is a strong move by the owner of the project. It
says that they would rather have fewer legitimate, fully allowed
changes because they want all of the changes they do accept
to be fully covered by their chosen license agreement (whatever the
terms of it are, and sometimes these will include 'we can later
relicense your code on any terms we choose, including commercial
licenses only').</p>
<p>PS: This doesn't make CLAs intrinsically bad. I accept that there
are some organizations that are sufficiently large lawsuit targets
that they feel they need to take strong defensive measures, and
CLAs are one of those measures. I do feel unhappy when such
organizations react to bug reports with 'please write the small
patch for us, and by the way you need to sign a CLA'.</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/CLAsImpedeContributionsII?showcomments#comments">5 comments</a>.) </div>CLAs create different issues than making (small) open source contributions2024-02-26T21:43:52Z2023-08-20T01:43:57Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/CLAsImpedeContributionscks<div class="wikitext"><p>Over on the Fediverse, <a href="https://mastodon.social/@cks/110903036381089808">I said something</a>:</p>
<blockquote><p>As a university employee, can I sign an individual <a href="https://en.wikipedia.org/wiki/Contributor_License_Agreement">CLA</a> to
contribute a bugfix I made while at work, for something we use at
work? I don't know, but I'm also pretty certain that I can't get
the university's lawyers and senior management to come near your
organizational CLA, and neither my management nor the university's
lawyers probably want to even look into the individual CLA issue.</p>
<p>So basically a CLA means I'm not sending in our bug fixes. Not because
I'm nasty, but because I can't.</p>
</blockquote>
<p><a href="https://mastodon.social/@cks/110903016948288395">I have some views on CLAs in general</a>, but those only
really apply to work I might do on my own. If I'm doing things as
part of work, the university can decide whether or not to keep it
private or send it upstream and by default not carrying private
changes is easier and better (even if this feeds someone's monetization
in the end).</p>
<p>However, as far as I know (and I did look), my university has no
blanket policy on employees signing individual CLAs to contribute
work they did on university time. Obtaining permission from the
university would likely take multiple people each spending some
time on this. Many of them are busy people, and beyond that you
might as well think of this as a meeting where all of us are sitting
around a table for perhaps half an hour, and we all know how much
meetings cost once you multiply the cost of each person's time out.
<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/UniversitiesSunkStaff">Universities may feel that staff time is often almost free</a>, but that isn't universal and there are
limits.</p>
<p>Things get much worse if the university would have to sign some
sort of group or institutional CLA. Officially signing agreements
on the behalf of (parts of) the university is a serious matter, as
it should be. There is no such thing as a trivial legal agreement
for an institution, especially an institution that's engaged in
intellectual property licensing (possibly with one of the very
companies that it previously executed an institutional CLA with).</p>
<p>The university and its sub-parts could probably overcome all of
this if we were doing something large and significant; if someone's
research group was collaborating, or a PhD student was doing a major
chunk of work, or the like (and research work is somewhat different
than work done by staff). But for a modest or trivial change? Forget
it.</p>
<p>This doesn't make me happy. If I have a simple bugfix and I can
make a trivial change and contribute it as a pull request, that's
a win over filing a bug report and forcing other people to duplicate
work I may already have done privately. But that's life in the land
of CLAs. <strong>When you require CLAs, you're creating barriers to
contributions</strong>.</p>
<p>(The same is true of a requirement for copyright assignment,
although probably less obviously.)</p>
</div>
Contributor License Agreements (CLAs) impede modest contributions2024-02-26T21:43:52Z2023-08-19T01:54:24Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/ContourMouseReviewcks<div class="wikitext"><p>The short background is that I'm <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/MouseFear">strongly attached</a>
to real three button mice (mice where the middle mouse button is
not just a scroll wheel), <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/MouseButtonVsScrollWheel">for good reason</a>,
but what I really wanted was a three button mouse that also had a
scroll wheel. For a long time people pointed me to the <a href="https://www.ergocanada.com/detailed_specification_pages/contour_design_contour_mouse_optical.html">Contour
mouse</a>
as the single mouse they knew of that featured this (see comments
<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/MouseFear">here</a> and <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/HomePeripherals2015">here</a>), but
I kept balking at the price. Then in late 2015, I finally talked
myself into spending the money to get a Contour (<a href="https://utcc.utoronto.ca/~cks/space/blog/FurnitureThought">it's a bit like
talking myself into a decent desk chair</a>), and
once I started using it I came to love it. I always told myself I
would write a review someday, but I never got around to it. Then
<a href="https://mastodon.social/@cks/110142152567571648">I discovered that this mouse has been discontinued</a>, which is why I
call this a pointless review.</p>
<p>(Before I got the Contour I tried out <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/ScrollMouseExperiment">a hack with two mice</a>.)</p>
<p>The Contour (Optical) mouse is an ergonomic mouse that has three
old fashioned mouse buttons on the top (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/HP3ButtonUSBMouseReview">like the HP 3 button mouse
I once reviewed</a>), plus a scroll wheel and
a rocker button on the side where your thumb rests (well, the scroll
wheel is above and the rocker button below; my thumb naturally rests
comfortably between them). Because the scroll wheel has to be on
the thumb side, the mouse comes in right and left handed versions,
and also in three different sizes. The 'ergonomic' bit is mostly
that the back of the mouse is comfortably shaped for my palm and
the front mouse button area slopes down to one side (to the right
on a right handed mouse).</p>
<p>When I started using the Contour, I thought the rocker button was
a bit silly, especially since all it did for me by default was go
back and forward in web browsers. Since then I've become quietly
addicted to it to the point where I get irritated when it doesn't
work in some browser (<a href="https://utcc.utoronto.ca/~cks/space/blog/web/BrowsersBackImpressiveTricks">or some browser context</a>) or browser-like environment.
I almost never use the keyboard or click the buttons; I move my
thumb down slightly and flick the rocker in the appropriate direction.
It's so automatic now that I don't think about it.</p>
<p>(In X, the rocker button generates button 8 and button 9 events,
which is apparently the standard for this.)</p>
<p>In general, the Contour is (and has been) everything I could ask from
a mouse. It's been comfortable, responsive when I move it around, and
the mouse buttons and the scrollwheel all work fine. It feels quite
natural to work the scrollwheel with my thumb, especially scrolling
down. People who need to move the scroll wheel long ranges might feel
slightly differently, but I don't consciously notice re-positioning my
thumb to start scrolling again.</p>
<p>Except that we now get to a slight fly in the ointment that was one
factor delaying this review, because there have been at least three
versions of the Contour mouse. The first version of the Contour
that I received had a conventional (black) scroll wheel with
conventional and readily apparent click stops in the scrolling
action. However, another one I received later had changed the scroll
wheel action to be basically free of stops, which was apparently a
deliberate change in the name of ergonomics (you're pushing less
hard to move the scroll wheel) but had the side effect of making
the scroll wheel much more hair trigger. On this Contour, brushing
my thumb against the scroll wheel with a little too much friction
could trigger an inadvertent scroll action, which was a little too
easy to do when just moving the mouse around.</p>
<p>The third version of the Contour is the one that I received when I
hastily bought some spares after I found out it had been discontinued.
This version has a chunkier knurled scroll wheel (that's not all
black), and in a quick test its scroll wheel action is back to the
standard click stop style of regular mouse scroll wheels. I haven't
used these so I can't comment about how the new scroll wheel feels
in real use.</p>
<p>Overall I'm sad to see the Contour mouse be discontinued. Not only
was it a good mouse (even in the hair trigger scroll wheel variation),
but as far as I know this leaves us with no free standing relatively
conventional mouse with three top buttons and a side scroll wheel.</p>
<p>PS: Contour still makes a variety of mice and other ergonomic things,
but not this specific 'three buttons with side scroll wheel and
rocker button' mouse. Past versions of this mouse were known as the
'Contour Perfit' or 'Perfit Optical'. My two currently active Contour
mice report in Linux lsusb as 'Perfit Optical' (the older, clickier
scroll wheel) and 'Contour Mouse' (the newer smooth scroll wheel).
I believe the newest one also reports as 'Contour Mouse'. These
different generations of mice may also use different USB versions,
although it's hard for me to tell right now.</p>
<p>(I believe some versions of this mouse may have been wireless. I
have the wired USB version, partly because I'm not a believer in
wireless mice. Or wireless keyboards.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/ContourMouseReview?showcomments#comments">2 comments</a>.) </div>A pointless review of my (current) favorite mouse, the Contour optical mouse2024-02-26T21:43:52Z2023-08-17T03:26:09Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/FullLegalNamesProblemscks<div class="wikitext"><p>One response to <a href="https://utcc.utoronto.ca/~cks/space/blog/programming/WhyNotFirstAndLastNameFields">my entry on the problems with 'first' and 'last'
name data fields</a> is
that one should make forms that (only) ask for someone's legally
recognized name, which should be unambiguous and complete. While
superficially appealing, this is a terrible minefield that you
should never step into unless you absolutely have to, which is
generally because you are legally required to collect this information.</p>
<p>The first question is what you mean by legally recognized name or
'legal name'. I have several pieces of government ID and some
well-attested things like credit cards (which are normally supposed
to be in your name), and even the government IDs don't always have
exactly the same name, never mind the credit cards. Depending on
what you're doing with my name and what you need it to match, I
would need to give you some different version of it. If I don't
know why you're specifically demanding my legal name, I'm going to
have to guess which one you need and the one you get may not be the
one you want.</p>
<p>(If you really insist on legally recognized names and you deal with
non-English people, be prepared to accept all sorts of Unicode input
in non-English languages. The true legal name of a Japanese, Chinese,
Korean, Egyptian, etc person is not written in Latin characters,
and even Western names are not infrequently written with some
accented Latin characters. Legal names absolutely do not fit in
plain ASCII. If you're asking for 'legal name, but in the Latin
alphabet', well, that's certainly something.)</p>
<p>The second issue (not so much a question) is who you are to be
demanding to know the name on my government ID. If you ask for my
legally recognized name, I am going to require you to explain why
you specifically need that, instead of the name that I commonly go
by or that I want to give you. If you are doing this to send me
friendly greetings, using my full legal name is not the way to do
it; you should be using whatever name I want to give you for this.
If you're doing this to show my name to other people, even on purely
functional grounds I want you to use the name that those people
will know me by, not the full, formal legal name I only use in
interactions with the government.</p>
<p>(And I'm someone in a position of privilege where it's not particularly
dangerous for me to be known to your random service by my real world
name (<a href="https://utcc.utoronto.ca/~cks/space/blog/web/WhyNotProfilePictures">or even my real world picture, not that I want you to have
that either</a>). This is very much not
always the case for people; real name only policies are toxic and
dangerous for various reasons, and <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/GoogleWhenEvilRealized">forcing them is being evil</a>.)</p>
<p>The third issue is that people not infrequently have good reasons
to not be addressed or known by their current legal name but instead
by another name of their choice. One example is that in the West,
a number of women (although not all) will change their last name
under various circumstances. There are situations where the legal
change to their chosen new last name will lag the actual desire to
use that last name. If you insist on people using their legally
recognized name, <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/LoginsDoChange">you're inflicting pain in the same way that not
allowing people to change their logins does</a>,
and on the flipside <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/RealNamesBroadcastProblem">you may be forcing people to broadcast changes
in their status before they want to</a>.</p>
<p>There are relatively few situations where you actually need to know
someone's legally recognized name as opposed to what they want you
to call them, and you should never ask for it unless you're actually
in one of those situations. Otherwise, you and everyone else are much
better off if you simply ask people for their name, in the sense of
'what do you want to be called'.</p>
<p>(And of course you need to allow people to change their name, legal
or otherwise, because people's names do change.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/FullLegalNamesProblems?showcomments#comments">3 comments</a>.) </div>The tangled problems of asking for people's '(full) legal name'2024-02-26T21:43:52Z2023-08-13T23:25:51Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/RPCSystemsGoodVersusBasiccks<div class="wikitext"><p>In my entry on <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/HTTPUniversalDefaultProtocol">how HTTP has become the default, universal
communication protocol</a>, I mentioned
that HTTP's conceptual model was simple enough that it was easy to
view it (plus JSON) as an <a href="https://en.wikipedia.org/wiki/Remote_procedure_call">RPC (Remote Procedure Call)</a> system. I saw
some reactions that took issue with this (<a href="https://lobste.rs/s/hq1sdr/http_has_become_default_universal">eg comments here</a>),
because HTTP (plus JSON) lacks a lot of features of real RPC systems.
This is true, but I maintain that it's incomplete, because there's
a difference between a good RPC system and something that people
press into service to do RPC with.</p>
<p>Full scale RPC systems have a bunch of features beyond the RPC
basics of 'request that <thing> be done and get a result'. In particular,
they generally have introspection and metadata related operations,
where you can ask what RPC services exist, what operations they
support, and perhaps what arguments they take and what they return.
Often they have (or will eventually grow) some sort of support for
versioning. Although it's usually described as a message bus instead
of an RPC system, <a href="https://en.wikipedia.org/wiki/D-Bus">Linux's D-Bus</a>
is a good example of this sort of full scale RPC system (including
features like service registration).</p>
<p>(Large scale RPC systems may or may not have explicit schemas that
exist outside of the source code, but generally the idea is there.
Historically, some large RPC systems have tried to generate both
client and server interface code from schemas, and people have
sometimes not felt happy with the end result.)</p>
<p>These RPC system features haven't been added because the programmers
involved thought they were neat. Full scale RPC systems are designed
with these features (or have them added) because these features are
increasingly useful when you operate RPC systems at scale, both how
big your systems are now and how long you'll operate them. Sooner
or later you really want ways to find out what versions of what
services are registered and active, and introspection tools help
supplement never up to date documentation (or reading the source)
when you have to interact with someone else's RPC endpoint (or
provide a new endpoint for a service where you need to interact
with existing callers).</p>
<p>However, programmers don't need these features to do basic RPC
things. What programmers often start out wanting (and building) is
an interface that looks like '<code>res, err := MyRPC(some-name).MyCall(...)</code>'.
Maybe there's a connection pool and so on behind the scenes in the
library, but the programmers using this system don't have to care.
And you can easily and naturally use HTTP (with JSON payloads) to
implement this sort of basic RPC system. Your 'some-name' is an
URL, your MyCall() packs up everything in a JSON payload and returns
results usually generated from a JSON reply, and so on. On the
server side, your RPC handling is equally straightforward; you
attach handlers to URLs, extract JSON, do operations, create reply
JSON, and so on. Since HTTP has become so universal, libraries and
packages for doing this are widely available, making such a basic
RPC system quite straightforward to code up on top of them. Plus,
you can test and even use this basic RPC system with readily
tools like '<code>curl</code>' (<a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusQueryWithCurl">for example, using curl to query your
metrics system</a>).</p>
<p>(If you need authentication you may need to do some additional
work, but this sort of thing is often used for basic internal
services.)</p>
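<p>(As an illustration of how little code such a basic system needs, here's
a minimal Go sketch of both sides of a HTTP plus JSON 'RPC' call, using
only the standard library. The 'add' operation and its request and reply
fields are invented purely for this example.)</p>
<pre>
// A minimal sketch of a basic HTTP plus JSON 'RPC' system in Go, using
// only the standard library.
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
    "net/http/httptest"
)

type AddArgs struct{ A, B int }
type AddReply struct{ Sum int }

// Server side: attach a handler to a URL, unpack the JSON request,
// do the operation, and send back a JSON reply.
func addHandler(w http.ResponseWriter, r *http.Request) {
    var args AddArgs
    if err := json.NewDecoder(r.Body).Decode(&args); err != nil {
        http.Error(w, err.Error(), http.StatusBadRequest)
        return
    }
    json.NewEncoder(w).Encode(AddReply{Sum: args.A + args.B})
}

// Client side: the 'some-name' is a URL; the call packs its arguments
// into JSON, POSTs them, and unpacks the JSON reply.
func addCall(url string, args AddArgs) (AddReply, error) {
    var reply AddReply
    body, err := json.Marshal(args)
    if err != nil {
        return reply, err
    }
    resp, err := http.Post(url, "application/json", bytes.NewReader(body))
    if err != nil {
        return reply, err
    }
    defer resp.Body.Close()
    err = json.NewDecoder(resp.Body).Decode(&reply)
    return reply, err
}

func main() {
    // httptest gives us a throwaway local server so this runs as-is.
    srv := httptest.NewServer(http.HandlerFunc(addHandler))
    defer srv.Close()
    res, err := addCall(srv.URL+"/rpc/add", AddArgs{A: 2, B: 3})
    fmt.Println(res, err)
}
</pre>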
<p>It's not particularly easy or straightforward to make a HTTP based
system into a good RPC system. But often you can get away with a
basic HTTP based 'RPC' system for a surprisingly long time, and it
may be the best or easiest option when you're just starting out.</p>
<p>(The history of programming has any number of things that were built
to be good general RPC systems, but didn't catch on well enough to
survive and prosper. See, for example, <a href="https://en.wikipedia.org/wiki/Remote_procedure_call#General">this list in the Wikipedia
page on RPC</a>;
some of these are still alive and in active use, but none of them
have achieved the kind of universality that HTTP plus JSON has.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/RPCSystemsGoodVersusBasic?showcomments#comments">One comment</a>.) </div>Good RPC systems versus basic 'RPC systems'2024-02-26T21:43:52Z2023-08-08T03:23:09Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/CheapOnlyWhileThereIsVolumecks<div class="wikitext"><p>Among other things, I'm a recreational bicyclist, and as this I'm
a fan of basic wired <a href="https://en.wikipedia.org/wiki/Cyclocomputer">cycling computers</a>, which in their own
way are fascinating artifacts. A wired cycling computer is a little
electronic device with an LCD display that shows you things like
your current speed, how far you've come on this trip, the time, and
so on (generally not all at once; typically there are several screens
and you switch between them with a button). A wired bike computer
gets all of this from an internal timer and a simple sensor to track
wheel revolutions; this sensor is basically a <a href="https://en.wikipedia.org/wiki/Reed_switch">reed switch</a> mounted on <a href="https://en.wikipedia.org/wiki/Bicycle_fork">your
bicycle's front fork</a>,
and closed by a magnet you stick on a spoke of your front wheel.
When the magnet goes past the reed switch, the circuit closes and
the bike computer gets a pulse, which it counts (and wakes it from
idle if it was idle). The whole mechanism is so simple and
straightforward (and reliable) that I've always loved it, and the
whole thing is sufficiently power efficient to run for years from
a single coin cell.</p>
<p>(The cycle computer goes from wheel RPM to speed by having you tell
it your wheel's size, through a setup process that also sets the
current time.)</p>
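<p>(The arithmetic involved is tiny. As a toy illustration, and not anything
from a real cycle computer's firmware: each reed switch pulse is one wheel
revolution, so speed is just the wheel circumference divided by the time
between pulses.)</p>
<pre>
// A toy sketch of the speed arithmetic a wired bike computer does.
// The 2.1 metre circumference is a typical value for a 700c road wheel;
// nothing here comes from a real cycle computer.
package main

import (
    "fmt"
    "time"
)

func main() {
    const circumference = 2.1          // metres per wheel revolution
    pulseGap := 300 * time.Millisecond // time between magnet passes

    metresPerSecond := circumference / pulseGap.Seconds()
    fmt.Printf("speed: %.1f km/h\n", metresPerSecond*3.6)
}
</pre>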
<p>For a time in the 00s and 10s, capable basic wired bike computers
were readily available from many places for not very much money,
which I found impressive. Here was a moderately complex computing
device, capable of math, storing data (bike computers generally
keep an odometer), displaying things, and handling a simple UI, and
all this was cheap enough to sell to you for prices such as $30
Canadian. It was a real demonstration of how cheap basic computing
had become.</p>
<p>Today, if you go looking around your local bicycle retailer or
favorite online outlet, you'll probably find much less selection
and rather higher prices (and what's left is often much more basic
than before). This isn't necessarily because the basic components
of a bike computer have gotten more expensive; if anything, tiny
low powered computers have gotten even cheaper. Instead, it's
probably because these basic bike computers have gotten much less
popular. Today, most people who're interested in this sort of
information use either their phones or a GPS based bike computer
(which doesn't even need a sensor, although it does need a GPS
signal).</p>
<p>The lesson I draw here is that the cheap price and wide availability
of these basic bike computers didn't come just from that their
components (the simple computer internals and the basic LCD screen)
had become inexpensive. That the components were inexpensive was
simply a necessary prerequisite. What drove the finished units to
be cheap was that they were popular and sold in volume. When they
stopped being popular and the volume shrunk a lot, the price went
up even though I rather suspect that the components are no less
inexpensive than they used to be, or would be in large enough volume.</p>
<p>Application of this idea to aspects of computing, servers, and so
on that I now take for granted are left as an exercise for the
nervous. Are SATA SSDs safe, or will someday everything be NVMe?
Hopefully <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/SortingOutModernUSB">USB-A connectors and USB 2.0</a> are
safe, because we sure do have a lot of devices that really want
that and I hope to be using some of them for decades to come.</p>
<p>(This is related to but not quite the same as how some computing
things seem to have a floor price and what you get for that floor
price keeps rising; disk drives are one example of this.)</p>
<p>(I'm currently thinking about wired bike computers for <a href="https://mastodon.social/@cks/110765558071873023">reasons</a>.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/CheapOnlyWhileThereIsVolume?showcomments#comments">One comment</a>.) </div>Some cheap things are only cheap if they have enough volume2024-02-26T21:43:52Z2023-07-24T03:24:15Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/HTTPUniversalDefaultProtocolcks<div class="wikitext"><p>Back when I wrote about <a href="https://utcc.utoronto.ca/~cks/space/blog/web/URLPresenceNotGoodSignal">how the mere 'presence' of a URL on a web
server wasn't a good signal</a>, I
casually mentioned that you were more likely to have everything
answered with a HTTP 200 status on things that weren't really 'web
servers' as such, but which were just using HTTP. You might ask why
you'd use HTTP if you weren't a web server, and the answer is
straightforward and widely known: <strong>HTTP has become the de facto
default communication protocol</strong>. Today, if you need to create a
system where you pull some information from something or push some
information to something, you're most likely to use HTTP for this
purpose. In the process, <a href="https://utcc.utoronto.ca/~cks/space/blog/web/WebServersShouldServeMinimally">the software may be coded in such
a way that it provides a default answer to nearly everything</a>.</p>
<p>In my view, the reason that people see HTTP as the default communication
protocol and why so many things use it this way is simple; it works
well enough and it's there. Because web servers have become common,
pretty much every modern programming environment will have HTTP
clients and simple servers in their more or less standard libraries.
Some environments (like Go and Python) directly support them in the
standard library; others, such as Rust, defer these to third party
packages but have good implementations and make it easy to use them.
There's also wide agreement on and support for a way to transport
data over HTTP, in the form of JSON; again, most everything you
want to use supports this relatively easily. The conceptual model
of HTTP is also simple, in that you can basically view it as an RPC
system.</p>
<p>(An additional bonus of using HTTP is that you generally get a bunch
of tools and features more or less for free, although people may
start asking you for extra things in your software once they start
taking advantage of this. For example, people who run your 'web server'
behind a reverse proxy often start asking you for <a href="https://utcc.utoronto.ca/~cks/space/blog/web/HelpingReverseProxying">some basic URL
mapping features</a>.)</p>
<p>You don't have to use HTTP and in a number of cases you're better off
doing something else. But it's pretty certain that doing something else
will require you to write more code, and you'll be cut off from the
ecology of tools that support HTTP and can act as ad-hoc clients and
servers for testing, diagnosis, and so on. Similarly, you don't have
to use JSON as your data encoding over HTTP, but if you don't you're
probably writing more code and putting yourself through a harder time
(<a href="https://cohost.org/tef/post/1877226-why-i-think-rpc-suck">cf</a>).</p>
<p>(Using HTTP will in theory also give you a number of best practices,
although how applicable they are to things that are only using HTTP
as basically a transport protocol is an open question. There are
ideas you can borrow from <a href="https://cohost.org/tef/post/1794038-special-interest-inf">'REST'</a>, though.)</p>
<p>None of this is new or a novel observation, of course. I don't know
when HTTP reached this state of being the default client/server
communication method that got used, but it's certainly been quite
a while.</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/HTTPUniversalDefaultProtocol?showcomments#comments">3 comments</a>.) </div>HTTP has become the default, universal communication protocol2024-02-26T21:43:52Z2023-07-20T02:14:48Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/SocialMediaPostsNotSimplecks<div class="wikitext"><p>A while back, I read Tristan Hume's <a href="https://thume.ca/2023/01/02/one-machine-twitter/">Production Twitter on One
Machine? 100Gbps NICs and NVMe are fast</a> (<a href="https://lobste.rs/s/umatgz/production_twitter_on_one_machine">via</a>),
which goes through the mental exercise of sketching out such a
thing. As part of this, Hume must assess the size of individual
tweets, and if you were doing this for the Fediverse, you'd want
to assess the size of individual posts there. Fediverse posts on
my current server are limited to 500 characters, so things look
simple. But reading Hume's article and thinking about it from a
Fediverse perspective made me realize that things aren't actually
that simple, and the actual posts are significantly more than their
apparent text.</p>
<p>One of the things that isn't common on the Fediverse is using link
shorteners. The reason for this is that they aren't necessary; on
the Fediverse, URLs count for only 32 characters or so of your post
length, no matter how long the actual URL is. As we've seen, <a href="https://utcc.utoronto.ca/~cks/space/blog/web/BrowserSmartCutPasteCanBeGood">longer
URLs are truncated to this length when displayed</a>. This is a good feature, but
it means that Fediverse posts aren't as straightforward as they
look; at a minimum, they contain the full URL, even if this puts
them over 500 characters.</p>
<p>(I'm not quite sure how <a href="https://en.wikipedia.org/wiki/ActivityPub">ActivityPub</a> represents post data
and handles cases like this.)</p>
<p>Over the years, I believe that both Twitter and the Fediverse have
added quiet convenience features to their storage of post data.
Some of this is pure metadata; for example, both know if a post is
a reply to a previous post, and if so which post (and by who, and
so on). Other features affect the contents of posts themselves. For
example, I believe that Twitter has for some time tracked @mentions
in tweets using the internal Twitter identifier for the user, so
that if the account is renamed things still work (and an account
name takeover can't suddenly mis-identify who the tweet was to or
mentioning). I believe this is in addition to the raw '@<name>'
text, which you want to retain in case the account vanishes entirely.</p>
<p>All of this is perfectly reasonable, and obviously it's something
that the existing environments deal with fine. But it does mean
that the actual storage of posts is more complicated and larger
than just '240 (Unicode) characters' or so. As is not unusual,
there's more complexity hiding underneath the rock when you turn
it over.</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/SocialMediaPostsNotSimple?showcomments#comments">One comment</a>.) </div>Social media posts aren't as small and simple as you might think2024-02-26T21:43:52Z2023-07-17T02:55:36Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/LetsEncryptCertIssueErrorWhycks<div class="wikitext"><p>On June 15th (2023), <a href="https://letsencrypt.org/">Let's Encrypt</a>
paused issuing certificates for about an hour (<a href="https://letsencrypt.status.io/pages/incident/55957a99e800baa4470002da/648b36899c7c1405303ea8c4">their status issue</a>).
Later, Andrew Ayer wrote up the outside details of what happened
in <a href="https://www.agwa.name/blog/post/last_weeks_lets_encrypt_downtime">The Story Behind Last Week's Let's Encrypt Downtime</a>, and
Let's Encrypt's Aaron Gable <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1838667#c6">explained the technical details in
the Mozilla issue about it</a>. The
reasons for what happened are interesting, at least to me, and
make a lot of sense even if the result is unfortunate.</p>
<p>What was wrong is connected to <a href="https://certificate.transparency.dev/">Certificate Transparency</a>. When a Certificate Authority
issues a TLS certificate, it gets <a href="https://www.rfc-editor.org/rfc/rfc9162#name-signed-certificate-timestam"><em>Signed Certificate Timestamps</em>
(SCTs)</a>
for a <em>precertificate</em> version of the certificate from some CT logs
and includes them in the TLS certificate it issues. When <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSCertTransLogsClientView">TLS
clients interact with Certificate Transparency</a>, they verify that a TLS certificate
has the required SCTs from acceptable logs. However, the SCTs aren't
for the actual issued TLS certificate but instead the precertificate,
which is deliberately poisoned so that it can't be used as a
real TLS certificate. So in order to verify that the SCTs are for
this TLS certificate, the browser has to reconstruct the precertificate
version of the certificate. In order for this to be possible, the
precertificate and the issued certificate have to be identical apart
from the poisoned extension and the SCTs (allowing the browser to
accurately reconstruct the precertificate so it can verify that the
SCTs are for it).</p>
<p>During the incident, Let's Encrypt issued a number of TLS certificates
where the precertificate and issued certificate weren't identical.
These TLS certificates didn't pass <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSCertTransLogsClientView">browser CT checking</a> and also implied a technical compliance
failure that made them improper as TLS certificates (see Andrew
Ayer's explanation). As explained by Let's Encrypt, one factor in
this failure is that Let's Encrypt constructed the issued certificate
completely separately from the precertificate, rather than by taking
the precertificate and manipulating it. The reason for this decision
is, well, let me quote Let's Encrypt directly (without the embedded
links, sorry, <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1838667#c6">see the comment itself</a>):</p>
<blockquote><p>As Rob Stradling suggests in Comment #2, having requests for pre-
and final certificate issuance routed to CA instances with different
profiles configured would not be an issue if the final certificate was
produced as a direct manipulation of the precertificate (effectively,
by reversing the algorithm described in RFC6962 Section 3.1).</p>
<p>However, Let’s Encrypt is aware of multiple incidents that have
arisen due to CAs trusting client input (e.g. SANs or extensions in a
CSR) and/or directly manipulating DER in this way: Bug 1672423, Bug
1445857, Bug 1716123, Bug 1542793, and Bug 1695786 are just a few
examples.</p>
<p>We designed our issuance pipeline specifically to avoid bugs such
as these. Every issuance, both of precertificates and of final
certificates, follows the same basic pattern: a limited set of
variables are combined with a strict profile to produce a new
certificate from scratch.</p>
<p>[...]</p>
</blockquote>
<p>TLS certificates are complex structured objects in an arcane and
famously complex nested set of standards, <a href="https://en.wikipedia.org/wiki/X.509">X.509</a> using <a href="https://en.wikipedia.org/wiki/ASN.1">ASN.1</a> and all sorts of other fun
things. Manipulating complex structured objects that use complex
formats is a famously dangerous thing, especially if you need the
result to be exactly identical at the binary level and what you're
dealing with a flexible serialization format. We've seen security
bugs with '<X> serialization' for many <X>s for years, if not
decades. For entirely sensible reasons, Let's Encrypt opted to
completely sidestep all of this by constructing each variant of
the certificate from scratch, as they described.</p>
<p>(Unfortunately Let's Encrypt could do this in two different places,
and for a brief period the configurations that drove all of this
in the two places diverged, creating the incident.)</p>
<p>My personal view is that Let's Encrypt made the right decision on
how to construct precertificates and certificates, even though it
was one factor in their issuance failure. This particular issuance
failure is much less severe than other sorts of potential failures
you could get from trying to manipulate TLS certificates, so I'd
rather have it. And the failure caused things to 'fail closed',
with the certificates failing to validate in browsers that check
Certificate Transparency status.</p>
<p>Overall, I think this is an interesting failure case. A sensible
security focused decision combined with an oversight when planning
a deployment created a surprise issue. It feels like there's no
obvious moral, though (and as always, saying it was human error to
not catch the deployment issue is <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/HumanErrorNotRootCause">the wrong answer</a>).</p>
</div>
Let's Encrypt's interesting certificate issuance error2024-02-26T21:43:52Z2023-06-27T03:35:41Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/RISCVServersNotSooncks<div class="wikitext"><p>Recently on the Fediverse, <a href="https://mastodon.social/@danluu/110539383653947021">Dan Luu was dubious about a prediction
that RISC-V would take over in datacenters in the next 5 to 10 years</a> (<a href="https://www.eetimes.com/jim-keller-on-ai-risc-v-tenstorrents-move-to-edge-ip/">here's the
EETimes article being quoted from</a>).
Much like Dan Luu, I was skeptical, considering that <a href="https://mastodon.social/@cks/110539533647432485">under nearly
ideal circumstances AMD didn't make much of a dent</a>. But let's take
this from the top, and ask what RISC-V would need and when if it's
going to do this.</p>
<p>(This is implicitly 64-bit RISC-V. No one is going to put 32-bit
RISC-V into datacenters, much less have it take over.)</p>
<p>Obviously if RISC-V is going to take over in datacenters, there
need to be RISC-V servers that people can buy, including off the
shelf. This is especially the case for non-cloud datacenter usage
of servers; only the cloud players and a few other big places design
and manufacture their own servers. These servers need suitable good
RISC-V CPUs and chipsets (either as systems on a chip or separately).
Apart from performance, these systems need multi-socket support,
lots of PCIE lanes, ECC with large modern RAM standards, and so on.
Given that <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/NonX86MakesLifeHarder">moving to RISC-V will make people's life harder</a>, these servers and their CPUs
need to be unambiguously better than the x86 (and ARM) server systems
available at the same time. Given that <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/DominationHasALeadTime">domination has a lead time</a> these servers need to be available in
quantity and proven quality before that five (or ten) year deadline,
probably years before.</p>
<p>(Realistically the first generation of RISC-V datacenter servers
would probably not take over, unless they were amazing marvels that
utterly eclipse the competition. I would expect it to need two or three
generations, just to prove things, shake issues out, and convince people
that these servers really are enough better than the competition.)</p>
<p>These RISC-V datacenter servers will also need proven operating
systems and other software to run, and that software will need
proven and good compilers and other tools to build it. Shaking the
architecture specific bugs out of compilers and operating systems
takes time, probably years of increasingly serious usage. The
developers of all of this software will need RISC-V hardware to use
for this, and this hardware mostly can't be early versions of those
datacenter servers (datacenter servers are too loud, too large, and
too expensive for many people). Some developers will want to use
RISC-V hardware as their daily desktop, but I suspect many others
will want a quiet mini-sized box they can put in the corner (and
use over the network). There will also need to be early servers
that can be used to set up the infrastructure of open source (Linux)
development, for things like dedicated builders for Debian and other
large projects (GCC, clang, Rust, the Linux kernel, etc), CI/CD
build servers that smaller open source projects can use, and so on.</p>
<p>(As a practical matter, the quality of compiler optimization, kernel
tuning, and so on has a significant effect on the realized CPU
performance of anything. Bringing all of this optimization up to
speed to take advantage of the raw capabilities of good RISC-V CPUs
will take (more) time.)</p>
<p>All of this will take money both literally, for hardware, and
possibly figuratively, for people's time. The amount of time this
RISC-V bringup takes will be influenced by how much actual money
is spent on it. If interested companies wait for Linux developers
and other parties to spend their own money and time on buying
developer hardware and working on RISC-V kernels, software, and
Linux distributions, it's probably going to take quite a while. If
interested companies spend money, they can to some extent accelerate
this process.</p>
<p>At the moment, RISC-V has very little of this as far as I know
(based partly on <a href="https://mastodon.social/@cks/110573251793445478">replies to my Fediverse post about this</a>). RISC-V is
probably in a somewhat better place than ARM64 was a decade ago
(partly because RISC-V people have learned lessons from ARM's
experiences), but it's not all that far along. On top of that, even
ARM is not doing all that well in competition with x86. I believe
that the only competitive ARM64 servers available today are the
proprietary ones Amazon made for AWS, and while those see real usage
(as covered in comments on <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/NonX86MakesLifeHarder">my earlier entry</a>), they haven't exactly taken
over even AWS.</p>
<p>Given all of the steps between current reality and the prediction,
I believe there's no way it can be reached in five years. Ten years
might be possible, but it feels like an aggressive timeline that
needs a lot of fast development. I'd want to see the first generation
of RISC-V datacenter servers in five years, which means we need
high-performance RISC-V CPUs in only a couple of years, along with
developer hardware (probably in large quantity in order to kickstart
a lot of development that will be necessary if those first generation
datacenter servers are going to sell to anyone in any quantity).</p>
<p>(If we have the first generation datacenter servers in five years, that
gives two years to get a better second or even third generation out,
a year for people to come to trust those servers, and then two years
to ramp up purchases to take over the installed base at year ten. If
people keep datacenter servers long enough that RISC-V servers need to
be dominating sales well before year eight, the timeline gets worse and
thus less plausible.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/RISCVServersNotSoon?showcomments#comments">One comment</a>.) </div>I don't expect to see competitive RISC-V servers any time soon2024-02-26T21:43:52Z2023-06-23T02:51:03Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/DominationHasALeadTimecks<div class="wikitext"><p>Suppose that someone makes a prediction like 'PCIE 5.0 will dominate
the datacenter in five years' (as a hypothetical example). If we're
looking at how likely this is and what would have to happen when
in order to get there, one important thing to remember is that
<strong>domination has a lead time</strong>, and this lead time will shorten all
of the timelines from what you might otherwise think. The ultimate
cause of this lead time is the inertia of the installed base.</p>
<p>Let's take 'dominate the datacenter' as meaning that at least half
of the systems in datacenters will use PCIE 5.0. Systems in a
datacenter (and things in general) have a lifetime; they're bought,
operated for a while, and then replaced with new systems. Thus, at
any given time the systems in datacenters are a mix of ages. Some
were just bought as replacements or expansions, some are halfway
through their expected lifetime, and others are almost at the end
of their lifetime and replacements are being planned. This implies
that if you want 50% of the systems installed at the five year mark
to have PCIE 5.0, such systems need to start
coming in the datacenter doors in quantity much earlier than five
years from now. This is the lead time in action.</p>
<p>This lead time pushes back all of the other timelines involved (even
apart from any lead times they may have themselves). If you have
to start having PCIE 5.0 systems arriving in datacenters well before
five years, obviously those systems have to be available, proven,
and attractive by whenever they have to start coming in the door.
Before that can happen, you probably need early PCIE 5.0 systems
so that people can get experience and shake out bugs (and maybe
develop PCIE 5.0 peripherals, get them sold and integrated into
systems, and so on). This goes all the way back up the chain of
dependencies that leads to people buying PCIE 5.0 systems in quantity
in time.</p>
<p>How long a lead time this prediction needs depends in significant
part on how fast people turn over their existing systems. As an
extreme case, if the typical datacenter system lasted ten years,
it's probably already too late to see PCIE 5.0 dominating datacenters
in five; even if 100% of the systems sold from now onward were PCIE
5.0, that might only barely reach half of the installed systems in
five years (there's some wiggle room for datacenter expansion).</p>
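<p>A tiny sketch of this arithmetic (in Python, with made-up numbers)
makes the effect of the replacement cycle visible; it assumes a uniform
replacement rate and ignores datacenter expansion.</p>
<pre>
# Minimal sketch: what fraction of the installed base uses the new technology
# after a given number of years, assuming a uniform replacement cycle and that
# a fixed share of all systems bought from year 0 onward use it. The numbers
# are made up for illustration.
def installed_fraction(years, lifetime_years, adoption_share=1.0):
    replaced = min(years / lifetime_years, 1.0)   # share of old systems cycled out
    return replaced * adoption_share

print(installed_fraction(5, 10))        # 0.5: barely half even with 100% of sales
print(installed_fraction(5, 10, 0.6))   # 0.3: nowhere close with 60% of sales
print(installed_fraction(5, 5))         # 1.0: a five-year lifetime changes things
</pre>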
<p>The corollary of this is that you can start to rule out predictions
of domination if the lead times necessary given the current state
of the world are getting implausibly short. If there are multiple
remaining steps necessary between here and there, plausible lead
times will depend on how fast you think these steps can be done (or
are likely to be done). If you think it will take three years to
get PCIE 5.0 fully out into the world and widely available in
attractive systems, that leaves only two years to dominate datacenter
system sales in order to get the datacenter population up.</p>
<p>PS: This also means there's a big difference between predicting
that something will 'dominate the datacenter' and that something
will 'dominate sales of datacenter systems'. Dominating sales has
basically no required lead time, depending on what sales period you
look at.</p>
<p>(However, you may find it implausible that sales will abruptly go
from a very low level to 'dominating', and therefore want to put in a
lead time for sales to ramp up to that level. But abrupt transitions
in sales are at least plausible, unlike abrupt transitions in
installed bases.)</p>
</div>
Domination has a lead time2024-02-26T21:43:52Z2023-06-22T01:21:20Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/GabonCountryDNSEvaporationcks<div class="wikitext"><p>Over on the Fediverse, <a href="https://mastodon.social/@cks/110543764399829520">I talked about something that might not
be widely known</a>:</p>
<blockquote><p>Today I found my (first) .ga site with useful information that
disappeared in the recent great .ga purge,
<a href="https://web.archive.org/web/20220404145446/https://lekstu.ga/posts/hello-moduledata/"><web.archive.org link></a>
(from 2019 and so not necessarily right for current Go, but it gives
me/you some general ideas).</p>
<p>(I believe this is by <a href="https://infosec.exchange/@joakimkennedy">@joakimkennedy</a> who may either have re-homed
it elsewhere or not done so because these articles are now obsolete
and misleading.)</p>
</blockquote>
<p>(<a href="https://infosec.exchange/@joakimkennedy/110547585375523531">Per the author</a>, these
articles are now available at <a href="https://tcm1911.github.io/">tcm1911.github.io</a> (in <a href="https://tcm1911.github.io/posts/">the blog section</a>).)</p>
<p>The .ga top level domain (TLD) is the country code TLD (ccTLD) for
<a href="https://en.wikipedia.org/wiki/Gabon">Gabon</a>. For many years, .ga
domain registration was handled by Freenom, which allowed people
to register domains in .ga for free, had lots of bad people set up
.ga domains, and finally <a href="https://krebsonsecurity.com/2023/03/sued-by-meta-freenom-halts-domain-registrations/">got sued by Meta (aka Facebook) and
closed down .ga registration</a>.
In the wake of all of this, Gabon decided to take .ga back from
Freenom and run it itself (<a href="https://www.afnic.fr/wp-media/uploads/2023/05/ga-domain-names-soon-to-return-to-Gabonese-management-1.pdf">press release (PDF)</a>,
<a href="https://news.ycombinator.com/item?id=36183824">also some commentary</a>).
As part of taking .ga back, Gabon removed quite a lot of previously
registered .ga domain names.</p>
<p>At one level, there is nothing wrong or surprising about this mass
removal of .ga domain names, including non-cybercriminal ones like
the example I encountered. Every country is ultimately fully in
control of its ccTLD, and may allow, not allow, or remove domain
names within that ccTLD at its own pleasure. A country can opt to
let more or less anyone register a domain under its ccTLD, or they
may decide that they want to restrict their ccTLD to people and
organizations that are in their country or at least sufficiently
associated with it. This is the bargain people take on when they
register under some country code TLD, either because it's free or
because they wanted a particular attractive domain name.</p>
<p>At another level this probably surprised people, and still surprises them.
An organization waved its hand and millions of domain names evaporated
with (probably) no warning and no recourse; they were there one day
and gone the next. I'm sure there were people with .ga domain names
who experienced quite some disruption as a result of this. Honesty
calls for admitting this fact, even if it's a little bit inconvenient
to a nice neat narrative of 'Gabon took back its ccTLD and purged
the cybercriminal domains' (or even the one where we say 'people
should have known better').</p>
<p>PS: This is obviously yet another example of <a href="https://utcc.utoronto.ca/~cks/space/blog/web/CoolUrlsChange">how cool URLs
definitely do change</a>. Even if people have
their own domain, operate their websites perfectly, and never change
their URL structure, maybe their entire domain will get removed for
reasons entirely outside of their own control. This can happen even
in your own properly approved country TLD(s). For example, Canada
could decide to make a new rule that <domain>.ca is not allowed any
more and it has to be <domain>.<type>.ca, with all of the existing
holders given some time to migrate.</p>
<p>(There's also the famous case of people in the UK who registered
.eu domains when the UK was in the EU, <a href="https://eurid.eu/en/register-a-eu-domain/brexit-notice/">and then the UK left the
EU</a>.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/GabonCountryDNSEvaporation?showcomments#comments">One comment</a>.) </div>The evaporation of lots of .ga domains2024-02-26T21:43:52Z2023-06-16T02:15:13Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/TLSCertTransLogsCAViewcks<div class="wikitext"><p><a href="https://certificate.transparency.dev/">TLS Certificate Transparency</a>
is a system where browser vendors require Certificate Authorities
to publish information about all of their TLS certificates in
cryptographically validated logs, which are generally run by third
parties (see also <a href="https://en.wikipedia.org/wiki/Certificate_Transparency">Wikipedia</a>). I've
written before about <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSCertTransLogsClientView">the TLS client's view of this</a> and how it relates to potentially
untrustworthy CT log operators, but I haven't written about the
Certificate Authority's view of things. The CA's view is also
relatively similar to the view a TLS server has.</p>
<p>The normal behavior of a CA is that when it wants to issue a TLS
certificate, it will ask various CT logs to give it <a href="https://www.rfc-editor.org/rfc/rfc9162#name-signed-certificate-timestam"><em>Signed
Certificate Timestamps</em> (SCTs)</a>
for the CA's pre-certificate version of the TLS certificate. It
will then take those SCTs and stick them in the issued TLS certificate
as <a href="https://www.rfc-editor.org/rfc/rfc9162#cert_transinfo_extension">a certificate extension</a>
(<a href="https://www.rfc-editor.org/rfc/rfc9162#name-tls-servers">also</a>).
The minimal thing that CAs can do is verify that the SCTs they
receive are properly signed by the CT log and are otherwise well
formed to the best of the CA's ability to tell. A CA that wants to
be more thorough can save information about new SCTs and watch the
relevant CT logs to verify that the log does list the (pre-)certificate
before too long, or request <a href="https://www.rfc-editor.org/rfc/rfc9162#name-retrieve-merkle-inclusion-p">a proof of inclusion from the CT log
operator</a>.</p>
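<p>As a small illustration (not anything a CA is required to do in
this exact form), here is how you can read the embedded SCTs back out
of an issued certificate with the Python 'cryptography' library; the
file name is a placeholder. This only parses the extension, it doesn't
do the cryptographic verification against each log's public key or
check inclusion.</p>
<pre>
# Minimal sketch: list the embedded SCTs in an issued TLS certificate using the
# Python 'cryptography' library. 'cert.pem' is a placeholder file name; this
# does not verify the SCT signatures or check inclusion in the logs.
from cryptography import x509
from cryptography.x509 import PrecertificateSignedCertificateTimestamps

with open("cert.pem", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())

scts = cert.extensions.get_extension_for_class(PrecertificateSignedCertificateTimestamps)
for sct in scts.value:
    print(sct.log_id.hex(), sct.timestamp)
</pre>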
<p>(I'm not sure if the CAs then also submit the issued TLS certificate
to CT logs.)</p>
<p>Since software is fallible, I believe it's smart for CAs to do some
degree of spot checking to make sure that CT log operators are
successfully adding certificates to their log (SCTs or no SCTs).
If your concern is that the CT log updates break in general, you
can do this by checking only a small fraction of certificates (either
on a percentage basis or just a 'check something every so often'
basis). However, I don't know if any CA does this today or if they
merely verify the SCTs, <a href="https://utcc.utoronto.ca/~cks/space/blog/web/BrowsersAndCertTrans">given that this is what browsers limit
themselves to at the moment</a>.</p>
<p>(A CA verifying that its TLS certificates are in the CT logs doesn't
leak information to the CT log operator, unlike the situation with
browsers.)</p>
<p>A TLS server with a TLS certificate that has embedded SCTs can
verify them in the same way, either minimally by checking that
they're signed and otherwise proper or more thoroughly by verifying
that the certificate is in the relevant CT logs. Since TLS servers
normally deal with a lot fewer TLS certificates than CAs do, this
thorough verification may be less of a burden for them than it would
be for a CA. If the TLS server has a TLS certificate without embedded
SCTs, in theory the TLS server can obtain SCTs, possibly verify them
thoroughly, and then <a href="https://www.rfc-editor.org/rfc/rfc9162#name-tls-servers">provide them through TLS extensions</a>. In
practice I believe that all commonly obtained TLS certificates will
have embedded SCTs, because that's how you get all of this to work
without trying to get tons of people to update tons of web servers.</p>
<p>(See also <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSCertTransBadLogOptions">my entry about what a compromised TLS Certificate
Transparency Log can do</a>. For CAs, mostly
what a 'bad' CT log can do is provide you with valid looking SCTs
but then never add the TLS certificates to the public version of
the CT log. The good news is that this will be clearly visible as
the CT log's fault, not the CA's fault.)</p>
</div>
The Certificate Authority's view of Certificate Transparency and CT Logs2024-02-26T21:43:52Z2023-06-14T03:00:25Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/CAsNotAlwaysCertificateAuthoritiescks<div class="wikitext"><p>The CA news of the time interval is, to quote <a href="https://infosec.exchange/@mholt/110514501634238185">Matt Holt</a> (<a href="https://mstdn.social/@jschauma/110515516368024063">also</a>):</p>
<blockquote><p>Well, I inadvertently discovered a zero-day RCE in acme.sh and got a
Chinese CA to shut down overnight:
<a href="https://github.com/acmesh-official/acme.sh/issues/4659"><github issue link></a></p>
</blockquote>
<p>The CA in question was called 'HiCA', and even if you keep good
track of the (many) TLS Certificate Authorities that your browser
or operating system trusts, you may be scratching your head in
puzzlement because you've never heard of it before. That's because
this (now-ex) 'CA' was not an actual Certificate Authority, and
it's far from the only such one.</p>
<p>What's going on is that these days there's a lot of white label
reselling of what I could call <em>root CAs</em>, ie the Certificate
Authorities that are trusted by browsers and systems. Resellers
have their own website and brand, but they don't have a root CA
certificate of their own in browsers; instead the TLS certificates
they get for you are ultimately signed by someone else. Sometimes
this is a very direct relationship, and the TLS certificate visibly
belongs to the CA that the reseller is in front of. At other times,
the reseller has a TLS intermediate certificate in their own name
(<a href="https://crt.sh/?id=8970603416&opt=problemreporting">for example</a>).
As <a href="https://news.ycombinator.com/item?id=36256426">Andrew Ayer explains on HN (I know)</a>, sometimes this
involves the root CA generating and holding the intermediate CA
certificate itself, basically so that the branding in the end TLS
certificate can have the reseller's name on it instead of the root
CA's name (<a href="https://news.ycombinator.com/item?id=36255748">see also</a>).</p>
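<p>If you want to see this for yourself on a particular chain, you can
look at the subject and issuer fields of the leaf and its intermediate;
here is a minimal sketch with the Python 'cryptography' library (the
file names are placeholders). The leaf's issuer may name the
reseller-branded intermediate rather than the root CA that actually
stands behind it.</p>
<pre>
# Minimal sketch: print who nominally issued what in a certificate chain, using
# the Python 'cryptography' library. The file names are placeholders for a leaf
# certificate and the intermediate that signed it.
from cryptography import x509

for filename in ("leaf.pem", "intermediate.pem"):
    with open(filename, "rb") as f:
        cert = x509.load_pem_x509_certificate(f.read())
    print(filename)
    print("  subject:", cert.subject.rfc4514_string())
    print("  issuer: ", cert.issuer.rfc4514_string())
</pre>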
<p>Earlier this year, Andrew Ayer wrote a whole post on this complex
area, <a href="https://www.agwa.name/blog/post/the_certificate_issuer_field_is_a_lie">The SSL Certificate Issuer Field is a Lie</a>,
which is well worth reading. That some CAs aren't real CAs matters
not just because it's confusing at times like this, but also because
the situation makes it rather unclear who you should talk to if a
TLS certificate needs to be revoked as improperly issued or
compromised. If you're lucky and contact a reseller, they will be
able to pass your message on; if you're unlucky, you may have no
way of doing this, especially on a timely basis (<a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1698936">cf</a>). Since such
a 'CA' is only a reseller, not an actual root CA, it's not bound
by any requirements for reporting certificate problems or having a
timely response to reports, unlike actual root CAs.</p>
<p>(To deal with this, <a href="https://crt.sh/">crt.sh</a> has a button on each
TLS certificate's page to tell you who and how to report problems
to. I found this out <a href="https://abyssdomain.expert/@filippo/110083248981135895">via Filippo Valsorda</a>. I don't
know how you find out if you're not able to use crt.sh.)</p>
<p>There's no particular fix for this; it's just something we have
to remember when dealing with the TLS certificate ecology. An
organization that you deal with that calls itself a 'Certificate
Authority' may or may not actually be one, and it can be hard to
tell.</p>
<p>PS: It would be nice if all CA resellers were required to have a
clearly accessible way of reporting problems with their resold TLS
certificates, especially in the cases where the certificate is
issued through an intermediate CA certificate with the reseller's
name on it. However, I won't be holding my breath for that being a
CA requirement; the current situation has been there for years and
years.</p>
</div>
(Apparent) Certificate Authorities aren't always actual CAs2024-02-26T21:43:52Z2023-06-10T02:56:49Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/TLSCSRsAreADefaultcks<div class="wikitext"><p>As I found out a while back (in <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/LetsEncryptAndMultipleNames">this entry</a>), the <a href="https://en.wikipedia.org/wiki/Automatic_Certificate_Management_Environment">ACME protocol</a>
that <a href="https://letsencrypt.org/">Let's Encrypt</a> invented and used
submits its actual requests for <a href="https://en.wikipedia.org/wiki/Transport_Layer_Security">TLS</a> certificates
as <a href="https://en.wikipedia.org/wiki/Certificate_signing_request">Certificate Signing Requests (CSRs)</a>, despite
CSRs being famously complicated things that are theoretically full
of information that Let's Encrypt and ACME don't care about. The
story I heard was that this was initially done because Let's Encrypt
worried that the Certificate Authority <a href="https://cabforum.org/baseline-requirements/">Baseline Requirements</a> might require CSRs
to properly issue a TLS certificate, but recently <a href="https://infosec.exchange/@mattm/110510447410465505">Matthew McPherrin
shared a different and great reason on the Fediverse</a>:</p>
<blockquote><p><a href="https://mastodon.social/@cks/110510379332110748">@cks</a>
We don't believe (anymore) that CSRs are required, but the biggest
reason is for compatibility with existing systems. Like some security
cameras will give me a CSR for their web UI, as it's the defacto
format for public keys to request certs.</p>
</blockquote>
<p>Before reading McPherrin's post, it hadn't occurred to me that an
ACME client could submit an externally generated CSR to Let's Encrypt
(or anyone else supporting ACME), but of course this is perfectly
allowed. Since you can submit externally generated CSRs, the ACME
protocol can be used to get a certificate for anything that can
generate a CSR, including self-contained black boxes that generate
keys internally and never expose them to you. As McPherrin notes,
CSRs are the de facto format to use for this sort of thing, simply
because so many CAs spent so long requiring you to <a href="https://mastodon.social/@cks/110506723402360252">create and
submit CSRs</a>.</p>
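<p>As an illustration of what 'externally generated' can look like,
here is a minimal sketch (using the Python 'cryptography' library,
with a made-up hostname) of generating a keypair and CSR the way a
self-contained device might, with the private key never leaving it.
The resulting PEM CSR is what you'd hand to an ACME client that
accepts external CSRs.</p>
<pre>
# Minimal sketch: generate a keypair and a CSR outside of any ACME client, the
# way a self-contained device might. Uses the Python 'cryptography' library;
# the hostname is a made-up example and the private key stays where it was made.
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec

key = ec.generate_private_key(ec.SECP256R1())
csr = (
    x509.CertificateSigningRequestBuilder()
    .subject_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "camera-01.example.org")]))
    .add_extension(x509.SubjectAlternativeName([x509.DNSName("camera-01.example.org")]),
                   critical=False)
    .sign(key, hashes.SHA256())
)
print(csr.public_bytes(serialization.Encoding.PEM).decode())
</pre>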
<p>(Whether any particular ACME client will support this is another
issue entirely, and your mileage will vary. Plus, <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/LetsEncryptAndMultipleNames">there are
protocol issues involved</a>. In a
quick check, <a href="https://certbot.eff.org/">Certbot</a> seems to
support supplying your own CSR with the '--csr' switch to
'certbot certonly'.)</p>
<p>One additional thing that may want to work with CSRs is <a href="https://en.wikipedia.org/wiki/Security_token">hardware
security keys</a>
(generally, <a href="https://en.wikipedia.org/wiki/Hardware_security_module">HSMs</a>, but most
places will probably not have full scale HSMs). Since CSRs are a
de facto standard for getting CA-signed keys, the software involved
may want to generate them, and certainly they won't give you the
private key (that's the whole point) so even without a CSR you'd
have to be able to work with the public key alone.</p>
<p>(With that said, many HSMs will let you generate a keypair externally
and then import it. <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/KeyGenerationAndHSMs">History has suggested that this may be more
secure in practice</a>.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSCSRsAreADefault?showcomments#comments">2 comments</a>.) </div>Let's Encrypt (really ACME) has a decent reason for (still) using CSRs2024-02-26T21:43:52Z2023-06-09T02:27:28Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/DNSSECFailureDrivesDisablementcks<div class="wikitext"><p>The news of the time interval is that the people in charge of the
New Zealand country zones (things directly under .nz) fumbled a
DNSSEC key (KSK) rollover in such a way as to break DNSSEC resolution
for those domains (see <a href="https://internetnz.nz/news-and-articles/dnssec-chain-validation-issue-for-nz-second-level-domain/">DNSSEC chain validation issue for .nz
domains</a>,
<a href="https://www.nzherald.co.nz/business/banking-apps-some-websites-down-as-internet-glitches-strike-local-sites/AAC63F6I5JHABFB2JPNZYHHEF4/">this news article</a>,
and <a href="https://cloudisland.nz/@ewenmcneill/110454725511113115">more</a>).
The suggested resolution to return these domains to working DNSSEC
was for all of the people running DNSSEC validating resolvers to
flush the zone information for everything under .nz. Or you could
wait for things to time out in a day or two.</p>
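<p>(If you want to check what your own validating resolver is doing
for a given name, a minimal sketch with the dnspython module looks
like the following. Whether the AD flag comes back set depends on your
resolver validating, and a broken DNSSEC chain like the .nz one
typically shows up as resolution failing outright rather than as an
unvalidated answer.)</p>
<pre>
# Minimal sketch: ask your resolver for a name and see whether the answer was
# DNSSEC-validated (the AD flag). Uses dnspython; a broken DNSSEC chain usually
# shows up as the query failing (SERVFAIL) instead of an unvalidated answer.
import dns.flags
import dns.resolver

resolver = dns.resolver.Resolver()          # your normal resolver(s)
resolver.use_edns(0, dns.flags.DO, 1232)    # request DNSSEC records
try:
    answer = resolver.resolve("internetnz.nz", "A")
    print("validated:", bool(answer.response.flags & dns.flags.AD))
except dns.resolver.NoNameservers:
    print("resolution failed (possibly a DNSSEC validation failure)")
</pre>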
<p>You know what else you could do in your DNSSEC validating resolver
to fix this and other future DNSSEC 'we shot ourselves in the foot'
moments? That's right: you could disable DNSSEC validation entirely.
The corollary is that every prominent DNSSEC failure is another
push for people operating resolvers to give up on the whole set of
complexity and hassles.</p>
<p>Some people are required to operate DNSSEC validating resolvers,
and others are strongly committed to it (and are so far willing to
pay the costs of doing so in staff time, people's complaints, and
so on). But other people are not so committed and so the more big
DNSSEC failures there are, the more of them are going to solve the
problem once and for all by dropping out. And then DNSSEC becomes
that much harder to adopt widely even if you think it's a good idea.</p>
<p>(As for whether DNSSEC is a useful idea, see for example <a href="https://ripe86.ripe.net/presentations/51-2023-05-23-dnssec.pdf">this
RIPE86 slide deck by Geoff Huston</a>,
<a href="https://txt.udp53.org/@rr/statuses/01H1NSH0GJEQCA1VSRKCVSZH8H">via</a>,
<a href="https://ripe86.ripe.net/archives/video/1018/">also</a>.)</p>
<p>An additional contributing factor to this dynamic is that attacks
that are (or would be) stopped by DNSSEC seem relatively uncommon
these days. In practice, for almost all people and almost all of
the time, it seems to be that a DNSSEC validation failure happens
because a zone operator screwed up. This gives us <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/SecurityAlertProblem">the security
alert problem</a>, where the typical person's
experience is dominated by false positives that just get in their
way.</p>
<p>PS: At this point it's probably too late to fix the core problem,
since DNSSEC is already designed and deployed, and my impression
is that it has low protocol agility (the ability to readily change).
Exhorting people to not screw up things like DNSSEC KSK rollover
clearly hasn't worked, so the only real solution would be better
ways to automatically recover from it. Maybe there are practical
changes that could be made to resolving DNS servers to work around
the issue, for example heuristics that trigger automatic flushing
and re-fetching of affected zones.</p>
</div>
DNSSEC failures are how you get people to disable DNSSEC2024-02-26T21:43:52Z2023-06-01T02:26:39Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/StreamProtocolsAndEncryptioncks<div class="wikitext"><p>In <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/ProtocolsAndEncryption">my views on protocols and encryption</a>,
I called SMTP and IMAP 'stream based protocols' without really
explaining what I meant and why this mattered (and why NFS v3 isn't
really one, even though it's also transported over TCP). While
writing a comment on that entry I came to a realization about this
in the context of encryption. The short version is that <strong>stream
based protocols have context</strong>, or equivalently that a specific
connection for such a protocol has state, state that's not explicitly
specified in each of the messages that are exchanged over the
connection (but instead established from the sequence of messages).</p>
<p>When you make a SMTP or IMAP connection, you go through a series
of steps to establish what you're doing. In SMTP, there is a sequence
of (for example) EHLO, STARTTLS, EHLO, MAIL FROM, RCPT TO, and then
DATA and end of DATA. In IMAP, you'll LOGIN and then likely sooner
or later SELECT a mailbox and begin reading and manipulating messages
in that mailbox. These initial setup operations establish the context
for later commands and create a state that your connection is in.
What effect various commands have depends on the context, and some
commands are only valid if the connection is in certain states.</p>
<p>(As a corollary, it's well defined what happens and what you do
if a given stream connection breaks. You have to start over from
scratch, re-establishing this context.)</p>
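<p>Here's a minimal sketch of that SMTP sequence in Python's smtplib
(the server and addresses are made up); each call is only meaningful
given the state established by the calls before it, which is exactly
the context an encryption layer has to protect.</p>
<pre>
# Minimal sketch of the stateful SMTP sequence using Python's smtplib. The
# server name and addresses are made-up examples; each step depends on the
# state the earlier steps established on this particular connection.
import smtplib

with smtplib.SMTP("mail.example.org", 587) as conn:
    conn.ehlo()                          # establish capabilities
    conn.starttls()                      # upgrade the connection's state
    conn.ehlo()                          # re-establish capabilities over TLS
    conn.sendmail("a@example.org", ["b@example.org"],
                  "Subject: test\r\n\r\nhello\r\n")   # MAIL FROM, RCPT TO, DATA
</pre>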
<p>Because the context that commands are issued in is a critical part
of what they mean, if you add encryption to the protocol this context
must be protected in order to ensure the integrity of any given
connection. In a stream based protocol, you must protect not just
the individual commands and their responses; you need to protect
the sequence of operations, because removing an operation, repeating
an operation, or splicing in another operation from earlier in the
stream can completely change the effects and meanings of subsequent
commands. Most commonly this is done by simply encrypting the
entire stream from beginning to end.</p>
<p>In many RPC based protocols, such as NFS v3, this is at least
theoretically not an issue; each RPC operation is supposed to stand
on its own (in a protocol sense; clients definitely do have a context
that each operation is happening in). If an operation needs some
context, such as what directory you're looking up a name in, the
RPC request will explicitly carry that context as part of the RPC
data (and the RPC reply carries the context of what it's replying
to). If the other end doesn't recognize the context or the context
is now invalid, the operation will be rejected. In theory this
makes it much less critical to protect the entire sequence of RPC
operations if you're encrypting things, although you almost certainly
still need protection against repeated (replayed) operations.</p>
<p>(RPC operations may not even have a global ordering, the way that
all commands and responses in a single connection of a stream
protocol do.)</p>
<p>There are a number of interesting RPC protocols, or at least ones
that are of interest to me. DNS is a RPC protocol in that you send
individual DNS queries and receive replies to them without any
particular context being supplied. NFS v3 is definitely an RPC
protocol; any context is explicit in NFS RPC requests and replies
(although in practice the NFS v3 protocol has protection against
repeated requests). Arguably HTTP is an RPC protocol as well, even
with mechanisms like <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Keep-Alive">Keep-Alive</a> to
reuse an existing connection for additional requests.</p>
<p>(HTTP is an interesting case since it's transported over a stream,
and you may have to consume and send significant volumes of a stream
as part of each individual HTTP request and reply.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/StreamProtocolsAndEncryption?showcomments#comments">One comment</a>.) </div>Encryption for stream based protocols versus 'RPC' protocols2024-02-26T21:43:52Z2023-05-25T01:23:03Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/ProtocolsAndEncryptioncks<div class="wikitext"><p>In the context of <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/NFSEncryptionOptions">encryption for NFS</a>, a
good question was raised in the comments for <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/NFSVsNFSWithKerberos">this entry</a>:</p>
<blockquote><p>This makes me wonder, what is it about file-access protocols that they
fundamentally "have to" go through a VPN or be tunneled through SSH,
in your opinion, instead of using protocol-integrated security like
everything else can? (This is in context of workstation-to-server, not
server-to-server.)</p>
<p>That is, why we use SSH-over-Internet instead of Telnet-over-VPN, or
for example IMAPS-over-Internet instead of IMAP-over-VPN, trusting
their built-in encryption and authentication, but reject the same
thing in NFS or SMB?</p>
</blockquote>
<p>The start of my views is that encrypted NFS with Kerberos is different
from IMAPS or encrypted SMTP because both of the latter are instances
of '<X> over TLS', while encrypted NFS with Kerberos is its own
bespoke, unique cryptographic protocol and implementation. I like
'<X> over TLS' (provided that TLS identities are competently handled),
because TLS is a well studied, reasonably well understood, and
usually well implemented thing (if you use a common implementation,
and everyone should). Bespoke cryptography is something I consider
dangerous because historically it's had a rather bad track record
(both in implementations and in protocols). A lot of effort from
many people and hard lessons learned have gone into TLS, far more
than into a niche bespoke system (which encrypted NFS with Kerberos
definitely is).</p>
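<p>Part of why 'something over TLS' is attractive is how little of it
you have to build yourself. Here's a minimal sketch of the pattern
using Python's ssl module (the hostname is a made-up example): wrap an
ordinary TCP connection in TLS and then speak the protocol, here IMAP,
over the encrypted stream.</p>
<pre>
# Minimal sketch of the 'protocol over TLS' pattern with Python's ssl module:
# wrap a TCP connection in TLS and then talk IMAP (port 993) over it. The
# hostname is a made-up example.
import socket
import ssl

context = ssl.create_default_context()   # well-studied defaults, verifies the server
with socket.create_connection(("imap.example.org", 993)) as raw_sock:
    with context.wrap_socket(raw_sock, server_hostname="imap.example.org") as tls:
        print(tls.recv(1024))            # the server's IMAP greeting, over TLS
</pre>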
<p>(SSH has both historical and practical reasons to not use TLS, and it's
often used in environments where non-mesh VPNs would be difficult to
substitute for it.)</p>
<p>But that just pushes the question back one level, to why SMTP and
IMAP were able to add TLS while NFS or SMB did not (NFS is apparently
starting to support NFS over TLS). One part of my answer is 'agility'.
SMTP was able to add TLS because it could add a 'STARTTLS' command
to the protocol and get people to use it; IMAP was able to add TLS
partly because it could also add a 'STARTTLS' command and partly
because it was able to add an entire new TCP port that was 'IMAP
over TLS'. As a practical matter, NFS has never had this agility;
the existing NFS protocols had no easy place to add the equivalent
of EHLO and STARTTLS, and the IP ports to use were either fixed or
found in arcane ways that made it hard to add a new port for an
all-TLS version.</p>
<p>(Here is a <a href="https://freebsdfoundation.org/wp-content/uploads/2021/07/Using-TLS-to-Improve-NFS-Security.pdf">mid 2021 FreeBSD article on their NFS over TLS</a>
[PDF].)</p>
<p>A somewhat deeper reason is that TLS is not quite a natural match
for NFS. TLS provides an encrypted stream, which works fine for
IMAP and SMTP because both of those are already stream protocols.
However, NFS is in theory a mostly stateless RPC protocol, although
today it's transported over TCP streams. This means that a
straightforward version of 'NFS over TLS' is encrypting the (TCP)
transport stream, not 'NFS' as such. There are similar protocol
challenges with DNS over TLS.</p>
<p>(I suspect that the mere existence of this mismatch helps create
arguments over the 'right' way to add TLS or encryption in general
to NFS and other such protocols, which of course slows down doing
it.)</p>
<p>The deepest reason I see is that we never successfully created a
generic 'random TCP streams over encryption' system that could be
applied to encrypt something without the cooperation of the protocol.
People sort of tried in the form of <a href="https://en.wikipedia.org/wiki/IPsec">IPsec</a>, but for various reasons IPsec
has not caught on and isn't considered desirable today. Without
such a system, (mesh) VPN protocols are the de facto solution if
you need to encrypt traffic for a protocol that either hasn't been
or can't be upgraded to be transported over TLS (or, if you must,
some other commonly used encrypted transport protocol).</p>
<p>VPN protocols are sort of like 'encrypt arbitrary TCP streams without
the upper layer caring', but they aren't implemented in a way that
makes that transparent. With a VPN, you may need a whole connection
establishment and monitoring system, and you need to force the upper
layer you care about to go over the VPN in some way through tricks
as opposed to simply specifying 'this traffic should be encrypted'
(a feature that <a href="https://en.wikipedia.org/wiki/IPsec">IPsec</a> did support). Tunneling a protocol through
SSH port forwarding is narrower than a full scale VPN but still
requires fiddling with how the system behaves to redirect its traffic
through the tunnel.</p>
<p>(There are programs that you can use to tunnel arbitrary things
over TLS in the manner of SSH port forwarding, but they've become
less popular over time. One reason for this may be that more and
more things have picked up direct support for TLS; another may be
that using TLS exposes you to annoying issues with configuring
certificate validation and so on.)</p>
<p>As for why file access protocols specifically seem to have been
affected by this, one obvious theory is that it's because file
access servers and clients are much more likely to be deeply embedded
into the kernel and the system as a whole, and seen as very critical.
Deeply embedded and critical systems are hard to wrap and resistant
to change (you can't just add libssl to them, and TLS in specific has
a bunch of complex requirements for things like certificate validation
that people don't want to do in the kernel).</p>
<p>PS: If you must use a non-TLS encrypted transport protocol, consider
using the <a href="https://www.wireguard.com/">WireGuard</a> protocol. There
are user level implementations and in general plenty of people are
studying its cryptography, so we can probably have reasonable
confidence that it's sound. Or there's <a href="https://en.wikipedia.org/wiki/QUIC">QUIC</a>, although QUIC is still quite
TLS like for obvious reasons, and even the SSH protocol (again,
well studied and there are good server and client implementations
that you can drop into your system).</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/ProtocolsAndEncryption?showcomments#comments">3 comments</a>.) </div>Some views on protocols and encryption2024-02-26T21:43:52Z2023-05-24T02:16:46Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/NFSEncryptionOptionscks<div class="wikitext"><p>A comment on <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/NFSVsNFSWithKerberos">my entry on NFS with Kerberos versus normal NFS</a> mentioned that one advantage of NFS with
Kerberos is that it can encrypt all of the NFS traffic between your
servers (whether they be NFS clients or NFS servers). My view is
that there are better ways to achieve this in today's world, ones
that I trust more for this purpose.</p>
<p>The first option is to use <a href="https://en.wikipedia.org/wiki/IPsec">IPsec</a>
for at least the NFS traffic between NFS servers and NFS clients.
IPsec has the advantage that IPsec security policies will generally
let you encrypt all NFS traffic and only the NFS traffic, so you
don't have to spend CPU cycles encrypting other traffic (if any).
You'd most likely want to set up an <a href="https://en.wikipedia.org/wiki/Internet_Key_Exchange">IKE</a> environment
to establish IPsec keys between relevant machines and to authenticate
them.</p>
<p>The second option is to use some kind of VPN system, with the NFS
servers running VPN endpoints and the NFS clients authenticating
to them to create encrypted connections. To force all NFS traffic
to be encrypted, you would give the NFS servers and NFS clients IP
addresses that can only be reached over a VPN connection, then do your
NFS mounts (and NFS mount permissions) using those IP addresses (or
names associated only with them). Like IPsec, a well done version of
this would have the side effect of authenticating all of the machines
involved. You'd distribute the necessary VPN configurations, keys, and
identification information through whatever existing configuration
management system you use for your servers. If I was doing this, I
would use <a href="https://www.wireguard.com/">WireGuard</a> if possible.</p>
<p>(Since you want all of the NFS servers to be VPN endpoints, you're
naturally interested in <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/VPNMeshAppeal">mesh-capable VPN solutions</a>.)</p>
<p>Given that encryption is encryption, I would expect good implementations
of any of these options to have performance comparable to NFS with
Kerberos. WireGuard is reportedly capable of significant bandwidth
under the right circumstances, so hopefully other options can perform
decently as well. You wouldn't get as much NFS bandwidth (with as little
CPU overhead) as you would without encryption, so if the best bandwidth
possible is your priority you need to build a physically secure network
and run NFS unencrypted over it.</p>
<p>Even if NFS with Kerberos had no other effects, I would rather use
one of these other options to get encrypted NFS, for two reasons.
The largest reason is that I have far more trust in the cryptographic
quality and security of something like WireGuard than I do in NFS
with Kerberos, because the former is much more of interest to people
and thus much more scrutinized. The lesser reason is that using
Kerberos necessarily means your Kerberos server (or servers) are now
a critical part of your NFS infrastructure.</p>
<p>(Of course, you might already have Kerberos as a critical part of
your environment for other reasons.)</p>
<p>PS: Done well, both IPsec and VPNs will authenticate all of the machines
involved to each other. It's not sufficient for an imposter to grab an
IP address of a NFS server or client and start talking; they'd also
need the machine's key material as well. In some environments this is
important. Of course at this point you get into provisioning issues
and how secure the provisioning process is.</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/NFSEncryptionOptions?showcomments#comments">2 comments</a>.) </div>What I see as good options today for encrypted NFS2024-02-26T21:43:52Z2023-05-23T01:57:25Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/NFSVsNFSWithKerberoscks<div class="wikitext"><p>One of the big divisions between <a href="https://en.wikipedia.org/wiki/Network_File_System">NFS</a> v3 and NFS v4
is that NFS v4 is normally used with Kerberos. In normal configurations,
this means that Kerberos is used to authenticate user requests; a
NFS v4 client that claims to be making a NFS request on behalf of
a given user must prove this by presenting an appropriate Kerberos
ticket. This is different from NFS v3, where normally NFS clients
are trusted by the NFS server to identify which UID the client is
making the NFS request on behalf of. A while back <a href="https://mastodon.social/@cks/110125522353687869">I expressed an
opinion about this on the Fediverse</a>:</p>
<blockquote><p>Hot take on NFS: NFS with Kerberos and NFS without Kerberos are two
different things that aren't all that comparable, and Kerberos for NFS
is a very limited fix for a very limited vulnerability that doesn't
apply to most people.</p>
<p>(NFS with Kerberos could in theory be used for wide-area file
sharing access, but in practice I believe this is almost never
used. Especially over the Internet.)</p>
</blockquote>
<p>In my view, the primary threat this form of Kerberos protects you
from is untrusted single-user machines that claim the user has
logged in to them when the user hasn't. If you trust the machines,
you can trust their claims of what user they're acting on behalf
of. If you don't trust a machine and people log into it, it's game
over for every user that logs into the machine; the machine acquires
access to their Kerberos information and can now make whatever NFS
requests in their name that it wants to. If you can't trust a
multi-user machine, you shouldn't allow people to log into it. So
the primary threat you're protecting against is an untrusted machine
operated by user A claiming that it's actually acting on behalf of
user B, when user B has never touched the machine.</p>
<p>This is a common threat profile in the individual laptop and desktop
space. But very few people are trying to use NFS in that environment
(and for good reason); most commonly people use <a href="https://en.wikipedia.org/wiki/Server_Message_Block">SMB/CIFS</a> there. NFS
today is mostly used between servers, where this threat is mostly
not applicable.</p>
<p>Why this form of NFS with Kerberos isn't particularly comparable
to NFS without it is that this form of NFS with Kerberos requires
you to give up essentially all unattended operations that NFS clients
might perform on behalf of users. Crontab entries, web CGIs or
<a href="https://utcc.utoronto.ca/~cks/space/blog/web/UserRunWebservers">long running web server processes</a>
(including <a href="https://utcc.utoronto.ca/~cks/space/blog/web/ReverseProxiesForFilePermissions">ones you want to use to get around file access permissions</a>), server side filtering in
your mail system, sufficiently long running compute jobs in <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/SlurmHowWeUseIt">your
SLURM compute cluster</a>, automatically
distributed containerized workloads, generally none of those work
because none of them are in a position to get the necessary Kerberos
authentication from the actual person involved. All of them have
to say 'trust me that I'm acting on behalf of this person', and in
the usual form of NFS with Kerberos the NFS server's answer is 'no
I don't trust you, prove it'.</p>
<p>NFS without Kerberos can be used on NFS clients as more or less
just another Unix filesystem, although sharing a single area between
NFS clients may require some additional work. The normal form of
NFS with Kerberos can't be used that way because to a large extent
it can't be used for what we could call 'background' activities,
only for 'foreground' ones that happen when people are actively
logged in. In many environments, <a href="https://support.cs.toronto.edu/">ours</a>
included, these 'background' activities are quite important for
the overall system, which means a switch from NFS without Kerberos
to the common form of NFS with Kerberos would be a serious loss of
capabilities.</p>
<p>(If you do face the threat that NFS with Kerberos is good at defeating,
then you probably have no choice; you can't use NFS without Kerberos.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/NFSVsNFSWithKerberos?showcomments#comments">7 comments</a>.) </div>NFS with Kerberos and NFS without Kerberos are two quite different things2024-02-26T21:43:52Z2023-05-22T03:12:09Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/MailingListsVsForumscks<div class="wikitext"><p>Once upon a time, the thing for open source projects to have was
one or more mailing lists (a '-user' and a '-devel' split was
common). These days, an increasingly common option is some form of
forum (or sometimes a chat system), to the point where people will
sometimes turn up on mailing lists and ask if there's a forum
version. As an old timer I often rolled my eyes at this shift, but
it recently occurred to me that there is something to be said for
forums in practice. And that is that <strong>in practice forums are much
better at having separate threads of discussion and even entire
topic areas</strong>.</p>
<p>In theory, mail clients can group messages by thread, let people
mute entire threads that aren't of interest to them, and layer on
additional things for topic areas and so on. In practice this relies
on mail clients both being pretty sophisticated and doing the right
thing on replies, which means that in practice it can be fragile.
The default experience of an active mailing list with a mail client
is a steady rain of relatively undifferentiated email. By contrast,
forums don't give you any choice about the matter; your message or
reply will be associated with a specific, distinct thread (and
possibly in a specific topic area). This is more or less enforced
by both the software and the social expectations; even if you can
technically do otherwise with enough work, it won't get you the
results you want.</p>
<p>In turn this matters because threads and topic areas are major filtering
mechanisms. If you ask a question on a thread it's easy to watch the
thread to find replies, and if you're not interested in a thread or an
entire topic, it's easy to not look at it at all. In a sense, forums
are optimized for skimming. Mailing lists aren't; mailing lists are by
default a firehose and it's up to you to figure out how to skim them.</p>
<p>(Another way to put this is that in a forum, the default is to not
read things. In a mailing list, the default is to read things.)</p>
<p>I've come to believe that this has the follow on effect that forums
are easier for newcomers to ask questions in and for experienced
people to stick around in. Newcomers don't have to swallow the
firehose; they can start a thread with their question and watch
only the thread, or cherry pick threads to read that might be
applicable. Experienced people can selectively expose themselves
to however much volume they want by looking at only some topics,
threads, and so on.</p>
<p>In a low volume environment everyone probably benefits from the mailing
list approach; the experienced people are probably interested in
everything and the newcomers won't be overwhelmed by other content as
their question gets answered, and in the meantime all of the experienced
people will likely see it. But as the volume goes up I think filtering
becomes more important for both groups, and especially I've come to
think it's critical for keeping experienced people around.</p>
<p>(All of this makes me rather more sympathetic with projects that have
chosen forums instead of mailing lists than I used to be.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/MailingListsVsForums?showcomments#comments">6 comments</a>.) </div>Mailing lists versus forums, some thoughts2024-02-26T21:43:52Z2023-05-11T02:12:50Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/CuringColdLockupMachinecks<div class="wikitext"><p>I've had <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/ColdLockupMachineMysteryII">a long running mystery where my home desktop would lock up
if it got cold</a>, where over time it seemed that 'too cold' began not much below 68 F (which is
hardly cold, at least for Canadians). Back in December of last year <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/ATXChassisPowerSwitchNotes">I
had the idea of 'replacing' the case front panel with just a stand-alone
ATX chassis power switch</a>. I hesitated for a
bit, but the situation was getting more irritating this past winter and
stand-alone ATX chassis power switches are not expensive items. Finally
at one point I carried through with this plan (possibly <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/AlwaysMakeAChecklist">when I had
to open up the case after forgetting a critical step in software disk
shuffling</a>). Somewhat to my surprise,
this relatively simple change seems to have fixed all of the problems.</p>
<p>When I was vaguely planning the change, I had expected to disconnect all
of the various cables from the front panel. When it actually came time
to do it, I only pulled the primary motherboard front panel connectors
on <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/HomeMachine2018">my home desktop's motherboard</a>, which I
believe has left USB and audio connected, and I reconnected only the
chassis power switch. These days I suspect that this is more than I
needed to disconnect and both the power LED and 'HDD' LED are probably
safe, but at the time I didn't feel like going through several iterations of
testing.</p>
<p>One of the lessons learned for me is that very odd PC symptoms can
(apparently) be caused by quite small underlying problems. It seems
pretty likely that the underlying cause of my desktop's problems
was the power button and possibly the reset button shorting to a
'pressed' state. In retrospect it's easy to see how this could turn
my desktop off, but coming back on when things warm up (and presumably
the buttons stop being 'pressed') is less obvious. I can imagine a lot
of possibilities, including the motherboard having some sort of short
detection where if things are held 'pressed' for too long, it concludes
something is wrong and refuses to power on until the situation clears.</p>
<p>(I believe the ATX power on behavior is implemented by special
motherboard circuitry, instead of being handled in BIOS by the CPU.
Possibly all of the real behavior is implemented in hardware, with
the BIOS only receiving a 'the user pressed the power button briefly'
signal when the system is running.)</p>
<p>In the future, I hopefully will remember that other mysterious
hardware problems might be dealt with in equally simple ways.
<a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/NetworkCablesGoBad">Cables can go bad, for example</a>
(and I've certainly heard stories of disk problems that were cured
with new SATA cables).</p>
</div>
Curing my home desktop from locking up in the cold (so far)2024-02-26T21:43:52Z2023-05-10T02:53:33Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/AdvisoryVsMandatoryFileLockscks<div class="wikitext"><p>On the surface, it sounds like advisory file locks and mandatory
file locks are almost the same thing, with only a little change;
they're both file locks, you're just changing one word and some
small behavior. It's my view that this is a linguistic artifact,
an effect of the words we're using, and they are actually very
different things that are much further apart than their names make
them sound.</p>
<p>Advisory file locks are in effect a form of broadcast interprocess
communication (IPC) between vaguely cooperating processes. Processes
use 'file locking' to broadcast information about what they're doing
(such as reading or modifying a file) and what other processes
shouldn't do (such as modify or sometimes read the file). Generally
there's a simple system to regulate who can broadcast what sort of
messages; for example, in Unix you may need to be able to open a
file for writing before you can obtain an exclusive lock on it (ie,
to broadcast your desire that no one else access the file).</p>
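<p>On Unix, the classic form of this broadcast is flock(); here's a
minimal sketch in Python (the file name is a placeholder). Nothing
stops a process that never takes the lock from touching the file,
which is exactly what makes the locks advisory.</p>
<pre>
# Minimal sketch: advisory locking as cooperative IPC on Unix, using Python's
# fcntl.flock(). The file name is a placeholder; processes that never call
# flock() are not restricted at all, which is what makes this advisory.
import fcntl

with open("shared.dat", "a") as f:
    fcntl.flock(f.fileno(), fcntl.LOCK_EX)   # broadcast: "I'm modifying this file"
    f.write("an update\n")
    f.flush()
    fcntl.flock(f.fileno(), fcntl.LOCK_UN)   # done; other cooperating processes may go
</pre>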
<p>By contrast, mandatory file locks are a form of dynamic mandatory
access control (MAC) that's applied to other processes. When a
process obtains a given sort of mandatory file lock, it actively
prohibits other processes from doing certain things to the file
while the lock is held (what's prohibited depends on the lock type
and the system, but it's common for exclusive locks to prevent both
reading and writing and shared locks to prevent writing). Since
this is <em>mandatory</em> access control, the other processes don't have
to be cooperating ones and they more or less have no say in this.
This isn't an accident, it's the entire point of using mandatory
locks instead of advisory locks.</p>
<p>These two quite different things have quite different design needs.
They also have very different impacts and effects on the rest of a
system. It is hopefully obvious to everyone that there's much less
impact to adding another IPC system (or two, if you have multiple
forms of locks) than adding a new dynamic mandatory access control
system. A new access control system will affect many other things
in your overall system and will likely have interactions all over
the place (for example, with your other access control systems).</p>
<p>(My personal view is that your entire set of access control systems
need to be designed together in order to be coherent, usable, and not
surprising. Especially, adding new MACs after the initial system design
is done has historically not given really great results; there are
often rough corners and unpleasant surprises. MACs often don't compose
together but instead conflict with each other.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/AdvisoryVsMandatoryFileLocks?showcomments#comments">One comment</a>.) </div>Advisory file locks and mandatory file locks are two quite different things2024-02-26T21:43:52Z2023-05-07T23:57:33Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/OpenSSHVersusSSHcks<div class="wikitext"><p>When I started to write yesterday's entry on <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/OpenSSHCertificatesNotX509">how OpenSSH certificates
aren't X.509 certificates</a>, I initially
titled it as being about 'SSH certificates'. This wouldn't be
unusual; Matthew Garrett's article <a href="https://mjg59.dreamwidth.org/65874.html">We need better support for SSH
host certificates</a> also
uses 'SSH' here. I changed my entry's title out of a sense of
pickiness, because although OpenSSH is the dominant SSH implementation,
it's not the only one. Or maybe it is, depending on your perspective,
or at least the only SSH that matters and so we might as well talk
about 'SSH certificates'.</p>
<p>In theory, SSH is a protocol, specified across a number of RFCs,
and there are multiple implementations of this protocol (for example,
Go has an implementation in <a href="https://pkg.go.dev/golang.org/x/crypto/ssh">golang.org/x/crypto/ssh</a>). In practice, well,
you can see <a href="https://www.openssh.com/specs.html">OpenSSH Specifications</a>,
which is a handy list of everything OpenSSH supports and implements.
These range from RFCs, to RFC drafts, to <a href="https://cvsweb.openbsd.org/src/usr.bin/ssh/PROTOCOL.certkeys?annotate=HEAD">OpenSSH's extensions for
certificates</a>.
I think you can probably interoperate with OpenSSH if you only
implement the RFCs, but your users may not enjoy it very much.</p>
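<p>(As an illustration that the Go implementation is a real, usable one,
here's a minimal sketch of running a command over SSH with
golang.org/x/crypto/ssh. The host, user, and password are invented
placeholders, and a real program should verify the server's host key
instead of ignoring it.)</p>
<pre>
package main

import (
	"log"

	"golang.org/x/crypto/ssh"
)

func main() {
	// All of these values are placeholders for illustration.
	config := &ssh.ClientConfig{
		User: "someuser",
		Auth: []ssh.AuthMethod{ssh.Password("not-a-real-password")},
		// Never do this outside of examples; verify the host key.
		HostKeyCallback: ssh.InsecureIgnoreHostKey(),
	}

	client, err := ssh.Dial("tcp", "server.example.org:22", config)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	session, err := client.NewSession()
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	out, err := session.CombinedOutput("uptime")
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("%s", out)
}
</pre>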
<p>The other thing is that the evolution of SSH seems to be pretty
much the OpenSSH project's show. I don't think anyone else is working
on new protocol features; instead, OpenSSH comes up with them and
then people with other SSH implementations either follow along or
not. This makes OpenSSH fairly synonymous with 'SSH'; if only OpenSSH
is moving the protocol forward and everyone else follows along
sooner or later, you might as well say 'SSH certificates' and then
mention in passing that other implementations may not support them
(yet).</p>
<p>At the same time, there's a not insignificant number of other SSH
implementations being used out in the world (in important and
relevant places). In one recently relevant example, one reason
Github didn't take advantage of OpenSSH's protocol extension for
offering multiple host keys (to enable upgrades or transitions) is
that they don't use OpenSSH but instead a different implementation.
As an implementation, OpenSSH is a monolith that's focused on its
particular usage case of general computer access; if you're not
doing that (as eg Github isn't), then you may find using other
implementations easier than trying to (securely) bend OpenSSH to
your needs. These implementations still clearly matter even if
the OpenSSH project is the only one really evolving the protocol.</p>
<p>(One option would be to use 'OpenSSH' when I talk about some aspect
of the OpenSSH programs and 'SSH' when I talk about protocol level
things, even when the element of the protocol is only in OpenSSH
(maybe unless OpenSSH doesn't yet consider it stable). This would
make it 'SSH certificates', since they're a protocol element, but
<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/OpenSSHAndSHA1DeprecationII">OpenSSH's deprecation of 'ssh-rsa' SHA1-based signatures</a> since that's something the programs
do.)</p>
</div>
Some thoughts on OpenSSH versus SSH2024-02-26T21:43:52Z2023-04-15T23:42:41Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/OpenSSHCertificatesNotX509cks<div class="wikitext"><p>Recently I wrote about <a href="https://utcc.utoronto.ca/~cks/space/blog/web/WebServerMTLSHazards">learning about the extra hazards of mutual
TLS in web server programs</a>, where the
extra hazard is that your Apache or other web server program must
now parse <a href="https://en.wikipedia.org/wiki/Transport_Layer_Security">TLS</a>
<a href="https://en.wikipedia.org/wiki/X.509">X.509</a> certificates and
understand ASN.1 encoding and so on, which is a lot of code that
it probably doesn't currently run. When writing that entry, it
occurred to me to wonder if (Open)SSH had the same problem, since
OpenSSH supports user authentication through signed certificates
(instead of personal keypairs). It turns out that the answer is no.</p>
<p>(I found out the answer more or less without looking for it, because
Matthew Garrett mentioned this in <a href="https://mjg59.dreamwidth.org/65874.html">We need better support for SSH
host certificates</a>, written
in the wake of <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/KeyRotationVersusKeyRevocation">Github exposing their RSA private key</a>.)</p>
<p>The specifics are covered in <a href="https://cvsweb.openbsd.org/src/usr.bin/ssh/PROTOCOL.certkeys?annotate=HEAD">PROTOCOL.certkeys</a>,
and the important quote is:</p>
<blockquote><p>[...] The certificates used are not traditional X.509 certificates,
with numerous options and complex encoding rules, but something rather
more minimal: a key, some identity information and usage options that
have been signed with some other trusted key.</p>
</blockquote>
<p>The certificate format reuses the encoding scheme from the SSH
protocol, as covered in <a href="https://www.rfc-editor.org/rfc/rfc4251#section-5">RFC 4251 section 5</a>. This isn't just
clever code reuse; since all of these encodings are used in the
protocol, handling all of them is already security critical in a
SSH server and having to parse them in the context of certificates
should add minimal new attack surface.</p>
<p>(The actual certificates themselves are just a set of fields in
a fixed order; each field uses an already defined encoding from
<a href="https://www.rfc-editor.org/rfc/rfc4251">RFC 4251</a>.)</p>
<p>One simplification over X.509 certificates is that OpenSSH doesn't
support certificate chains. Your SSH certificate is signed directly
by some key, and the OpenSSH server either trusts that key or it
doesn't. This simplifies the life of the OpenSSH server at relatively
low cost for SSH client certificates, since you probably already
want to be able to distribute new SSH Certificate Authority keys
to all of your servers.</p>
<p>(Where it hurts more is for SSH host certificates, where a change in
your CA key will require all your clients to update their copy of it.)</p>
</div>
OpenSSH's (signed) certificates are not TLS X.509 certificates2024-02-26T21:43:52Z2023-04-14T03:14:09Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/PS2ToUSBPragmaticJourneycks<div class="wikitext"><p>Today, <a href="https://digipres.club/@foone/110091175239332243">I was reminded</a>
that at one point I had strong feelings on the issue of <a href="https://en.wikipedia.org/wiki/PS/2_port">PS/2</a> versus USB for keyboards
and mice, where <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/USBKeyboardDislike">I didn't like USB keyboards</a>
and <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/HomePeripherals2015">my preferred mouse</a> (or variety
of them) was also a PS/2 mouse. However, these days I am entirely
USB based for both keyboard and mouse and I don't particularly want
to go back. What got me here is an assortment of issues, including
the relentless march of time and 'progress' in the computing sense.
Or, the short version, PS/2 is a de facto obsolete connector format.</p>
<p>Because PS/2 is de facto obsolete, increasingly many motherboards
have one or zero PS/2 connectors; insisting on two or even one was
clearly constraining my choices even back in 2015. I could have
tried to keep using <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/HomePeripherals2015">my 2015 favorite keyboard and mouse</a> through PS/2 to USB converters, but
<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/PS2ToUSBMyPlans">I ran into issues with them</a> that made me start
to question the wisdom of that. In addition, new mice and keyboards
were mostly or entirely USB (although there are probably mechanical
keyboards that are PS/2 based). When I made <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/MechanicalKeyboardFeelings">my first foray into
mechanical keyboards</a>, it was with a
USB mechanical keyboard, and that's continued to my current one. I
also later changed my mouse to one that I'm now very fond of, and
that again is USB based.</p>
<p>(This shift had happened by the time I put together my current <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/WorkMachine2017">work</a> and <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/HomeMachine2018">home</a>
desktops; I didn't bother to specifically look for PS/2 ports on either
of their motherboards.)</p>
<p>My pragmatic results are that USB has worked reliably for me here,
more or less as expected; I don't do operating system development
or, generally, shuffle USB things around or do weird USB stuff that
could cause heartburn to Linux's USB stack. In addition, moving to
USB has allowed me to switch to a keyboard and a mouse that I've
come to believe are clearly nicer than <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/HomePeripherals2015">my old keyboard and mouse</a>. However fond I was of the BTC-5100C and
its small size, I'm pretty sure that it had worse keyboard feel than
my current mechanical keyboard. And the old plain three-button mice
I used to use are clearly inferior to <a href="https://www.ergocanada.com/detailed_specification_pages/contour_design_contour_mouse_optical.html">my current mouse</a>.</p>
<p>(And the switch made selecting desktop motherboards much easier,
since I no longer had to care about PS/2 port(s).)</p>
<p>USB is still more complex than PS/2 and is subject to <a href="https://digipres.club/@foone/110091175239332243">fun issues</a> from time to time.
I mitigate some of these issues by connecting my keyboard directly to
a desktop USB port, instead of going through a hub (and on my work
machine, I think my mouse may also be directly connected).</p>
<p>Over time, the same shift has happened on our servers. Old servers
used to have a PS/2 port, and we tended to use a PS/2 keyboard on
them. Then we started getting some servers that were USB only, so
we switched to USB keyboards in the machine room and for the spares lying around.
Now I believe all of our servers are USB only, with no PS/2 left,
and certainly I don't know where any remaining PS/2 keyboards we
have would be. We may have thrown them out by now.</p>
<p>(Okay, I kept my PS/2 BTC-5100C keyboards and mice, so we still
technically have some at work. I even have the PS/2 to USB converters
for them, somewhere. If I really wanted to I could connect one up for a
side by side comparison, although there's no real point; I'm not going
back.)</p>
<p>All of this is unsurprising. Shifts in connector and interface
technology have happened before over my time with computers. Disks
have moved from SCSI to SATA (or IDE to SATA in some environments)
and now toward NVMe; *-ROM drives moved from whatever they used
to be to SATA and are now basically obsolete; AGP and PCI gave way
to PCIe (with some digressions along the way). Keyboards and mice
are different only in that we directly touch them and so I and
others have strong opinions about what ones we want to use (and as
a system administrator I get to reflexively worry about 'will it
work even when there are problems').</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/PS2ToUSBPragmaticJourney?showcomments#comments">3 comments</a>.) </div>My pragmatic shift from PS/2 keyboards and mice to USB ones2024-02-26T21:43:52Z2023-03-27T02:13:23Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/KeyRotationVersusKeyRevocationcks<div class="wikitext"><p>As you may have heard, <a href="https://github.blog/2023-03-23-we-updated-our-rsa-ssh-host-key/">Github changed its RSA SSH host key today</a>,
because the private key had been exposed (apparently briefly) in a
public Github repository. As a result of this, a lot of people got
scary looking SSH warnings. One reaction people had to this was to
note that Github could have avoided these warnings if it had used
<a href="https://lwn.net/Articles/637156/">OpenSSH host key rotation</a> to
provide a second RSA host key in advance to prepare for the rotation.
However, there is one little issue with this, <a href="https://mastodon.social/@cks/110079058547714455">which I alluded to
on the Fediverse</a>:</p>
<blockquote><p>One thing Github is doing today is making people extremely aware that
the prior Github RSA key is now invalid. This is a good thing with the
key being presumed compromised, and one I don't think you can get with
ordinary OpenSSH key rotation.</p>
<p>(I first saw the key invalidation issue pointed out in someone's
Fediverse post, which I can't find now.)</p>
</blockquote>
<p>This is something worth repeating: <strong>key rotation doesn't give you
key revocation</strong>, and the two are different things. Key rotation
gets people to accept and use a new key; key revocation gets them
to not accept the old one. Of course if you revoke the current key
you generally want people to rotate into using a new one, but you
can want people to rotate into a new key without any particular
revocation of the old one. </p>
<p>Broadly, key rotation by itself is a precautionary measure (much like
periodic password changes) or a way to get people to upgrade to a better
key (for example, to move from a 1024 or 2048 bit RSA key to a bigger
one, or to switch key types). Key rotation doesn't actively force people
to stop accepting the old key (although if the old key has an embedded
expiry, it may fall out of validity on its own), it just enables them
to also accept the new one so some day you can switch to only using the
new one. If your key has actually been compromised, passively switching
away from it isn't sufficient; you need to get people to actively stop
accepting and using it. You have to assume that your old key is in the
hands of an attacker who can still use it, even if you don't, which lets
the attacker target anyone who'll still accept the old key.</p>
<p>What Github has done isn't actual revocation (<a href="https://cloudisland.nz/@ewenmcneill/110080219126850455">as Ewen McNeill noted</a>); numbed by one
alert, people could be coaxed to accept another alert and go back to the
old key. Or an attacker could target people who haven't hit this yet
(or haven't updated their keys) and feed them the old key, which they'd
accept without warnings. But by making this a noisy event, Github has
probably come as close to actual SSH key revocation as SSH allows.</p>
<p>(That SSH doesn't have anything better than this for key revocation
of ordinary host keys is not really its fault. Github is using SSH in
a situation that is really a better fit for the server authentication
properties of <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSThreeWorlds">public web TLS</a>.)</p>
<p>PS: If an attacker can use the Github situation to get people to accept
a second 'remote host identification has changed' key change for Github,
they don't actually need Github's old private key; any new key will do.</p>
</div>
Key rotation is not the same as key revocation (or invalidation)2024-02-26T21:43:52Z2023-03-25T02:49:29Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/RAIDSSDBlockDiscardProblemcks<div class="wikitext"><p>One of the things that's good for the performance of modern SSDs
is <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/SSDsAndBlockDiscardTrim">explicitly discarding unused blocks so the SSD can erase flash
space in advance</a>. My impression is that
modern SSDs support this fairly well these days and people consider
it relatively trustworthy, and modern filesystems can discard unused
blocks periodically (Linux has <a href="https://man7.org/linux/man-pages/man8/fstrim.8.html">fstrim</a>, which is
sometimes enabled by default). However, in some environments there's
a little fly in the ointment, and that's RAID (whether software or
'hardware').</p>
<p>The issue facing RAID is that in a RAID environment (other than
RAID-0), by default there's some relationship between the contents
of sector X on one disk and sector X on another disk. In RAID-1 the
two sectors are supposed to be identical; in other RAID levels the
sectors (along with sectors on other disks) are supposed to have
one or more correct checksums. If you TRIM the same sector on two
or more SSDs, the basic version of block discard support doesn't
promise to give you any particular data, which means that the
relationship between the data on different disks is now potentially
gone.</p>
<p>(Modern SSDs support 'Deterministic Read After TRIM (DRAT)', <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/SSDBlockDiscardHowSecure">cf</a>, but this doesn't promise to return the
same data on two different drives, <a href="https://superuser.com/a/1750461">you might get read errors
instead</a>, and this doesn't deal
with RAID-N checksums.)</p>
<p>Some or perhaps many modern SSDs support 'Deterministic read ZEROs
after TRIM' (variously called DZAT, RZAT, or DRZAT). A RAID-1 mirror
on SSDs with reliable DZAT can TRIM sector X on all mirrors and be
confident that its expected relationship between sectors on disks
still holds. A RAID-N parity system might have more troubles here,
but it can at least only have to (re)write the parity blocks for
an all-zero set of data blocks; the data blocks themselves could
be left TRIM'd.</p>
<p>(Probably a RAID-N system could also do this for SSDs supporting
DRAT; it would TRIM the data and parity blocks, then re-read the
data blocks, calculate the parity for whatever deterministic values
it reads, and write the parity out.)</p>
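<p>(Here's a toy Go sketch of that parity recomputation idea for a
simple XOR-parity stripe. Everything here, including how a DRAT drive
behaves after a TRIM, is an invented stand-in for illustration; a real
RAID implementation would be talking to actual devices.)</p>
<pre>
package main

import "fmt"

const blockSize = 4

// A toy 'array': disks[i] is one block from disk i, with the last
// disk holding the XOR parity. Stand-in for real devices.
var disks = [][]byte{
	{1, 2, 3, 4},
	{5, 6, 7, 8},
	{1 ^ 5, 2 ^ 6, 3 ^ 7, 4 ^ 8}, // parity block
}

// trimBlock pretends to TRIM a block on a DRAT drive: afterwards it
// reads back as some fixed but otherwise arbitrary value.
func trimBlock(disk int) {
	for i := range disks[disk] {
		disks[disk][i] = 0xAA // deterministic, but not necessarily zero
	}
}

func main() {
	// TRIM the data blocks and the parity block.
	for d := range disks {
		trimBlock(d)
	}
	// Re-read whatever deterministic values the data blocks now hold
	// and recompute the parity from them, so the stripe is consistent
	// again even though its contents are junk.
	parity := make([]byte, blockSize)
	for d := 0; d < len(disks)-1; d++ {
		for i, b := range disks[d] {
			parity[i] ^= b
		}
	}
	copy(disks[len(disks)-1], parity)
	fmt.Println("stripe re-consistent:", disks)
}
</pre>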
<p>The other option I can think of is for the RAID system to keep track of
what block ranges have been TRIM'd and so don't have consistent contents
on the actual disks. Some higher end storage systems already support
<a href="https://en.wikipedia.org/wiki/Thin_provisioning">thin provisioning</a>,
which requires them to keep track of what user-visible blocks are valid;
it's straightforward to use this for SSD block discarding as well.
Otherwise the RAID system will require some sort of data structure to
keep track of this, which will probably be new.</p>
<p>(Perhaps RAID systems have come up with other clever solutions to this
problem.)</p>
</div>
The problem RAID faces with discarding blocks on SSDs2024-02-26T21:43:52Z2023-03-23T02:27:23Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/WhyUseNOPsForThingscks<div class="wikitext"><p>A while back I wrote about <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/PowerPCInstructionOddity">an instruction oddity in the PowerPC
64-bit architecture</a>, where the architecture
reused a number of CPU instructions that otherwise had no effect
in order to signal various things to the CPU. You might reasonably
wonder why the PowerPC <a href="https://en.wikipedia.org/wiki/Instruction_set_architecture">ISA</a> decided
to use unofficial NOPs for this, instead of using explicit dedicated
instructions. Although I don't know the actual reasons, I can think
of some reasonable explanations.</p>
<p>First, this reuse of instructions is fully compatible with CPUs
that don't support this feature, either because they're old or
because it's not applicable to them (for example, because they
don't support hardware threads). Of course, new CPUs that don't
care about new instructions could always ignore them, but that
would likely require some work to recognize the new instructions.
Here, a non-supporting CPU can simply handle the instruction as a
normal '<code>or</code>' instruction that will wind up having no effect.
This is potentially especially useful if these instructions might
occur relatively commonly in 'hot' situations, where you don't want
to take various sorts of overhead by having to check CPU features
and so on.</p>
<p>(There are various approaches to this, but they all have some
sort of overhead.)</p>
<p>Second, there may be only a limited amount of explicitly reserved
instruction space in the instruction set architecture, and the ISA
may have to last for decades as you keep expanding it to cover more
and more features (consider all of the vectorization instructions
that keep appearing in the 64-bit x86 ISA). You might not consider
every potential feature to be valuable enough to permanently consume
some of your remaining free instruction space. Reusing unofficial
NOPs is an attractive proposition by contrast, since the instruction
space is already allocated.</p>
<p>Third, using unofficial NOPs instead of new instructions also
provides you a graceful off-ramp to no longer supporting the feature,
if you decide it's not worth it any more. You can just remove the
special handling of the instruction and let it return to its old
meaning of just being an unofficial NOP (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/WhatIsAModernNOP">of which you may have a
lot</a>). People's code keeps running and you don't
need any additional work in your future CPUs to recognize and ignore
these instructions.</p>
<p>(This was sort of sparked by one of the comments <a href="https://old.reddit.com/r/golang/comments/10i3svb/an_instruction_oddity_in_the_ppc64_powerpc_64bit/">here</a>.)</p>
</div>
Some reasons why CPUs might re-use unofficial NOPs for other things2024-02-26T21:43:52Z2023-03-18T02:17:11Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/SSDBlockDiscardHowSecurecks<div class="wikitext"><p>Suppose hypothetically that you have some SSDs to securely dispose
of, and that for one reason or another you can't use <a href="https://ata.wiki.kernel.org/index.php/ATA_Secure_Erase">built-in SSD
secure erase</a>
on them, for example (apparently) because your BIOS automatically
locks out that option when it boots. You might wonder how well
protected you are if you simply tell the SSD to <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/SSDsAndBlockDiscardTrim">discard all of
its data</a>. Unsurprisingly, the answer is
that it depends.</p>
<p>First off, any SSD you want to use today will support what's called
'Deterministic Read After TRIM (DRAT)' (sort of <a href="https://en.wikipedia.org/wiki/Trim_(computing)#ATA">cf</a>), where the
SSD will always return a fixed result when you read data after a
TRIM operation. Some SSDs also promise to always return zeros in
this situation; this is 'Deterministic read ZEROs after TRIM',
variously abbreviated as 'DZAT', 'RZAT', or 'DRZAT'. These are the
(S)ATA versions, but NVMe has a similar system. All of these mean
that once you TRIM the entire drive, the previous data on the drive
can't be read through normal means, so someone who gets your drive
and puts it in a computer will get garbage (or <a href="https://superuser.com/questions/1652827/how-to-tell-whether-zeros-originate-from-trim-or-from-actually-writing-zeros-on">possibly errors
on NVMe drives</a>).</p>
<p>(My impression is that vendors initially supported DZAT only on
higher end (or at least more expensive) SSDs they sold for use in
RAID arrays, although support for this seems to have trickled down
to at least some modern consumer SSDs. Supporting DRAT but not DZAT
strikes me as mostly a market segmentation thing; if you're going
to be deterministic, making it always zero seems as easy as anything
else.)</p>
<p>If the drive then goes on to actually erase all of the flash blocks with
any copies of what you've TRIM'd, then as far as I know the data is
completely unrecoverable. Flash storage, unlike traditional hard drives,
can really be completely and irrecoverably erased, with no lingering
magnetic ghosts that a sufficiently determined person could in theory
reconstruct. However, SSDs don't particularly promise to actually erase
all of your blocks after you've TRIM'd them. Erasing blocks is a time
and power consuming activity, so while a SSD probably wants to keep a
pool of already erased blocks for new writes, it might not keep going
on at full pace once it thinks it has a big enough pool. SSDs make no
promises here and as far as I know there is no reliable, normal way to
tell how many erased blocks they have or if they've erased all your
blocks. Letting a TRIM'd SSD sit powered on but idle for minutes or
hours likely increases your chances that everything gets erased, but
doesn't guarantee it. There's also no certainty that a SSD will erase a
block that it's decided is too unreliable to reuse.</p>
<p>The lack of certainty on erasure matters because SSDs can be put
into a special <a href="https://blog.elcomsoft.com/2019/01/life-after-trim-using-factory-access-mode-for-imaging-ssd-drives/">factory mode</a>
that generally allows raw access to the flash storage and allows
you to stop the SSD from doing any further block erasure. If you
can put a drive into this state you can read out TRIM'd but not yet
erased blocks, although you may not know what logical blocks they
were. Serious data recovery companies can probably put pretty much
any SSD from any mainline maker into this recovery mode, which means
that anyone who wants to spend enough money can probably pull out
any not yet erased data from a TRIM'd drive. If they can't get the
pre-TRIM mapping of logical blocks to flash storage, making sense of
the result may take a lot of work but it's probably not impossible.</p>
<p>So on my taxonomy of <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/DiskErasingWhoAreYouStopping">who you're trying to stop when securely
erasing disks</a>, simply TRIM'ing your
SSDs definitely stops the basic threat of 'someone plugs it into
a computer and tries', but probably doesn't entirely stop the threat
of 'someone is willing to spend a bunch of money to send it to a
data recovery firm'. Letting your drives sit so that they erase
as many blocks as possible will make the life of the second sort
of person harder, but not impossible.</p>
<p>(TRIM'ing your SSD and then filling it up with new junk data will
probably help here, because it will push the drive to erase almost
everything. Randomly rewriting scattered bits afterward with more
junk will probably push the drive into erasing its overprovisioned
blocks too. But all of that is a speculative guess, because SSDs
are black boxes. If this matters a lot to you, you want to use
SSDs that have good implementations of secure erase. How you find
out what those SSDs are, I don't know.)</p>
<p>(Under Linux, you can use '<code>hdparm -I</code>' to see what SATA SSDs support
(or claim to), and <a href="https://unix.stackexchange.com/questions/472211/list-features-of-nvme-drive-like-hdparm-i-for-non-nvme">see this stackexchange question and answer for
how to do it on NVMe drives</a>.)</p>
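<p>(A small Go sketch of shelling out to hdparm and picking out the
TRIM-related capability lines might look like the following; the device
path is a placeholder, this needs root, and the exact wording of
hdparm's output can vary between versions.)</p>
<pre>
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

func main() {
	// /dev/sda is just an example device; hdparm -I needs root.
	out, err := exec.Command("hdparm", "-I", "/dev/sda").Output()
	if err != nil {
		fmt.Println("hdparm failed:", err)
		return
	}
	for _, line := range strings.Split(string(out), "\n") {
		// Print the TRIM and deterministic-read capability lines.
		if strings.Contains(line, "TRIM") {
			fmt.Println(strings.TrimSpace(line))
		}
	}
}
</pre>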
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/SSDBlockDiscardHowSecure?showcomments#comments">One comment</a>.) </div>How secure is merely discarding (TRIMing) all of a SSD's blocks?2024-02-26T21:43:52Z2023-03-05T03:48:35Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/DiskErasingWhoAreYouStoppingcks<div class="wikitext"><p>Various people, <a href="https://support.cs.toronto.edu/">us</a> included,
periodically have the need to securely dispose of disk drives that
we no longer need or want, where by 'securely' we mean that people
shouldn't be able to get our data from the drives after we've gotten
rid of them. Often there are questions of what you need or want
to be doing in order to achieve this security. In my view, part of
the answer to this is depends on who you want to stop from getting
your data (and how many resources you think they have).</p>
<p>(I'm not sure this should be called a <a href="https://en.wikipedia.org/wiki/Threat_model">threat model</a>, but it's the same
sort of general idea.)</p>
<p>So here is my take on multiple levels of threat you might face,
from the most common to the rarest (assuming that you're starting
from working drives before you begin disposing of them).</p>
<ol><li>Someone who gets their hands on your disks (either buying them
second hand or picking them up from somewhere), sticks them in a
computer, and tries to read them. This used to happen all of the
time; people would buy surplus disk drives (or entire computers)
from eBay, plug them in, and all sorts of sensitive data would
come flying out.<p>
If this happens today people will be very upset at you, and for
good reasons, because this is basically 'Hardware Disposal 101'.
Everyone should know you can't just turn computers off then toss
them out the door; you have to do something to make it so your
data doesn't leak out with them.<p>
</li>
<li>People who can put drives into <a href="https://blog.elcomsoft.com/2019/01/life-after-trim-using-factory-access-mode-for-imaging-ssd-drives/">a special factory mode</a>
that allows access to low-level data reading commands. On SSDs
this will probably allow them access to reserved space and <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/SSDsAndBlockDiscardTrim">blocks
that have been discarded but not yet erased</a>.<p>
I believe that at least some data recovery services have this
capability, so you're also effectively worrying about people who
can send your old drives to a data recovery service and talk them
into having a go at it. Thus, you should probably assume that any
actual attacker is at this level (as opposed to people who just
picked up some of your drives and are curious what they'll see).<p>
In general, data recovery services go some way to making it so
that an attacker mostly needs money (and your drives) instead of
good technical capabilities. Attackers can to some extent outsource
the technical expertise, assuming they can find a suitable firm
and the firm is willing to work for them on your drives.<p>
</li>
<li>People who can load custom firmware onto drives, giving them at
least as much access as the most powerful factory mode (regardless
of what the drive's normal factory mode supports). These people
can definitely read all of the raw storage (flash or spinning
rust) and otherwise exert very low level control over the drive.
Sometimes, or perhaps a lot of the time, the drive's standard factory mode
will make loading your own firmware unnecessary, so this may
basically be the same as the previous level.<p>
</li>
<li>People who can directly read data from any intact physical storage,
either flash chips or hard drive platters. These people (probably)
don't need the controller to be intact and operable, so even
physical damage or destruction of it alone isn't enough. For
example, these people wouldn't be stopped if you drilled holes in
a SSD's PCB and snapped it apart, as long as the flash chips are
still fine.<p>
</li>
<li>People who can directly read (some) data even from partially
damaged physical storage, such as drilled (or snapped) hard drive
platters or a partially damaged flash chip. To stop these people
you need either complete physical destruction or for the data
that's on the storage to be useless, for example because it's
encrypted and you've destroyed the encryption keys.</li>
</ol>
<p>(There is a related dimension of how much repair people can do to disk
drives that you've deliberately damaged. A data recovery firm that's
given a decent amount of money might be able to repair moderate damage,
like a snapped SSD PCB, and then go on to recover data from it.)</p>
<p>As mentioned, not stopping the first sort of people from getting
at your data is basic negligence by this point. You absolutely have
to do that much. On the other extreme, against the last level of
people any method of destroying a disk drive that isn't using a
feature specifically designed to securely erase its data is probably
not good enough. On the other hand, you're probably not going to be
targeted by such people, and if you are being targeted by them the
<a href="https://www.usenix.org/system/files/1401_08-12_mickens.pdf">Mickens 'Mossad' rule</a> (<a href="https://www.schneier.com/blog/archives/2015/08/mickens_on_secu.html">also</a>)
probably applies.</p>
<p>Modern SSDs have <a href="https://ata.wiki.kernel.org/index.php/ATA_Secure_Erase">(S)ATA secure erase</a> (<a href="https://www.thomas-krenn.com/en/wiki/Perform_a_SSD_Secure_Erase">also</a>) and
<a href="https://wiki.archlinux.org/title/Solid_state_drive/Memory_cell_clearing#NVMe_drive">NVMe secure erase</a>
features that, if implemented properly, will normally protect you
against everyone. As mentioned, the most certain approach is competent
host level encryption where you do your best to totally destroy the
real underlying encryption keys (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/DiskEncryptionAndKeying">which haven't always been the
keys you enter yourself</a>), and then probably
you also do a SSD level secure erase. However, all of this requires
the drive to be in working order; if the drive has failed already
and you're worried about someone bringing it back to life and getting
your data, you may have a problem (although host level encryption may
still save you).</p>
<p>PS: As far as I know, once a SSD has erased a given flash block,
the data in that block is irretrievably gone (<a href="https://www.electronics-notes.com/articles/electronic_components/semiconductor-ic-memory/how-flash-memory-works-operation.php">cf</a>,
<a href="https://en.wikipedia.org/wiki/Flash_memory#Writing_and_erasing">also</a>).
This is different from (some) hard drive technologies, where magnetic
echos of old data could remain potentially detectable even after a
sector had been rewritten.</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/DiskErasingWhoAreYouStopping?showcomments#comments">5 comments</a>.) </div>When securely erasing disks, who are you trying to stop?2024-02-26T21:43:52Z2023-03-04T03:17:38Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/InternetPKIViaWebPKIcks<div class="wikitext"><p>On the Fediverse, <a href="https://mastodon.social/@cks/109939166885735933">I said</a></p>
<blockquote><p>Thesis: any realistic, viable Internet <a href="https://en.wikipedia.org/wiki/Public_key_infrastructure">PKI</a> scheme in
the moderate future will have to bootstrap from web PKI, because web
PKI is where the usage is to drive people to address ('solve' in a
non-mathematical sense) the hard problems. You don't literally have to
use HTTPS, but you need public TLS.</p>
<p>Counterpoint: end to end encrypted messaging. But I think that broadly
that has an 'introduction' (identity) problem.</p>
<p>(Brought on by thinking of DNSSEC cf <a href="https://infosec.exchange/@tqbf/109938525731567458"><@tqbf post></a>)</p>
</blockquote>
<p>A core element of any public key infrastructure (<a href="https://en.wikipedia.org/wiki/Public_key_infrastructure">PKI</a>) is
identifying things, because by themselves public keys are relatively
useless; you care about using public keys to talk to something or
authenticate some information, and for that you need to know who
you're talking to or who is giving you this information. Identifying
things on the Internet can sound simple ('root of trust' everyone
says in chorus) but it turns out to be very hard to do in practice
in the face of attackers, misaligned incentives, mistakes, and other
issues. There is exactly one Internet PKI system that is solving this
problem in practice with a demonstrated ability to operate at scale
and despite problems, and that is <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSThreeWorlds">public web TLS</a>.</p>
<p>Public web TLS is doing the hard work to deal with these problems
on an ongoing basis because HTTPS websites are a dominating thing
that people care about on the Internet. Various organizations put
a lot of money toward this, operating significant infrastructure
and spending expensive people's time on both operational issues
and design issues. The realistic odds that any new Internet PKI
scheme can either get that level of resources or duplicate the
effectiveness of the results without them are low. However, a new
scheme can get many of the benefits by bootstrapping itself from
web PKI in some way, relying on web PKI for at least a first level
of identification protection.</p>
<p><a href="https://en.wikipedia.org/wiki/DNS_over_TLS">DNS over TLS</a> and
<a href="https://www.rfc-editor.org/rfc/rfc8461">RFC 8461 SMTP MTA Strict Transport Security (MTA-STS)</a> are both examples of
bootstrapping additional Internet PKI using web PKI. MTA-STS directly
uses HTTPS as part of this bootstrapping; DNS over TLS merely relies
on public (web) TLS for identifying things, and so depends on all
of the pieces of modern TLS that make it hard to sidestep that. By
contrast, <a href="https://en.wikipedia.org/wiki/Domain_Name_System_Security_Extensions">DNSSEC</a> is
a completely independent Internet PKI scheme, one that lacks
protections such as an equivalent of <a href="https://en.wikipedia.org/wiki/Certificate_Transparency">TLS Certificate Transparency</a> (see eg
<a href="https://a2mi.social/@dadrian/109939691963999693">this Fediverse post</a>).</p>
<p>The counterpoint to my thesis is end to end encrypted messaging systems,
which don't make any core use of the web PKI ecology. However, these
have an 'introduction' problem, which is the question of how you
establish both your identity within the messaging system and the
identity of the other people you're talking to. In high security
environments, often this requires out of band mechanisms to verify
in-system identities in some way (you and your counterparty might meet
in person to exchange identifying 'safety numbers', for example).</p>
<p>(Web PKI can't be used to solve this problem because web PKI identifies
names on the Internet, in a hierarchy, not people's abstract identities
within some system.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/InternetPKIViaWebPKI?showcomments#comments">One comment</a>.) </div>Future Internet PKI schemes need to be bootstrapped through web PKI2024-02-26T21:43:52Z2023-02-28T03:46:00Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/UniversityAccountsDistributedcks<div class="wikitext"><p>Over on the Fediverse, <a href="https://mastodon.social/@cks/109781073885598561">I said something about university account
systems</a>:</p>
<blockquote><p>Welcome to the university. We have a central account and
identification system, of course. Does everyone use the central
account system? Of course not. Does everyone know what central account
IDs their local accounts have? I see you're new here. Is there always
a one to one mapping between local and central accounts? Not a chance.</p>
<p>Universities are fun places.</p>
</blockquote>
<p>There are all sorts of reasons for this distributed and anarchic
environment. An obvious one is that some of the local systems can
predate the central one. Universities have generally long had the
idea of 'identities' for staff, students, and professors, but these
identities haven't been exposed as computer things, with passwords
and an authentication system that other people could use. Instead
these identities existed in databases, often in separate ones
depending on what sort of role a person had (staff were in the HR
database, students were in the registrar's database, and so on).
For a long time local areas were left to build their own systems
to create a merged view for people they cared about.</p>
<p>Related to this, even once a central computer account database existed,
it was generally not widely available to arbitrary people without a
lot of work. Some of this is (or was) simply the state of software for
delegating accounts and authorizations to third parties, but beyond that
universities are in the unusual position where this central database
can hold a lot of sensitive information, what is often called 'Personal
Identifiable Information (PII)'. A university database of 'all people
here' necessarily includes 'all of the students here', often with some
amount of juicy information that needs to be exposed in order for
authentication systems to be useful.</p>
<p>(And in the days before <a href="https://en.wikipedia.org/wiki/Multi-factor_authentication">MFA</a>, allowing
arbitrary parties within the university to use your central
authentication service also potentially enabled mass attacks against
people's valuable central accounts. Dealing with this is a complex
issue, one that requires real resources and doesn't always have good
answers, partly because a university has some unusual threats. For
example, sometimes students are happy to attack each other.)</p>
<p>Another issue is that a university wide account is sometimes too
powerful of a thing. If there is an outside professor or researcher
visiting one department for a month, the university as a whole may
not want to issue them an account that will give them central email,
enroll them in various things the university is licensed for, and
so on. And then when they leave at the end of the month, <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/UniversityAccountsComplicated">the
university may want to deactivate the account but the department
may not</a>, because the department
wants to foster an ongoing relationship with that person.</p>
<p>Additionally, once you have local accounts one thing that people start
wanting is additional accounts for projects and the like. These accounts
will belong to someone, but they're not the person's primary account
(and who the account belongs to can change over time). Project accounts
are especially important in an environment where a lot of projects are
done by graduate students who do graduate and leave (we hope), but
professors often want to keep them around and perhaps running.</p>
</div>
Universities are often environments with distributed accounts and identities2024-02-26T21:43:52Z2023-02-27T03:50:42Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/IPRecursiveRoutingProblemcks<div class="wikitext"><p>There's a general problem in certain sorts of IP network setups
that I will call the <em>recursive routing</em> problem. The recursive
routing problem comes up when you want to send some <em>inner</em> traffic
to an IP address in protected form, by taking the traffic, encrypting
it, and then sending it to the same IP address as <em>outer</em> traffic.
You need this <em>outer</em> traffic to be treated differently from the
inner traffic so it's not recursively re-routed back through the
encryption process again, because otherwise your traffic to the IP
will just spin around in a little circle and never get anywhere.</p>
<p>The simplest way to break the recursive routing problem is to do
the encryption at a level above IP routing. You establish a (TCP)
connection, then you have the application level use the connection
to arrange an encryption layer and transport your data over that.
This is how a huge amount of encrypted data crosses the Internet
every day, in the form of HTTPS (and some SSH and other application
level protocols). This has the problem that it's not a particularly
general solution, and so you wind up with people working out all
sorts of ways to tunnel traffic over HTTPS partly because they don't
have an accessible, usable general purpose encrypted transport layer
they can count on.</p>
<p>One of the things that <a href="https://en.wikipedia.org/wiki/IPsec">IPsec</a>
was supposed to be used for was encrypting such direct IP to IP
traffic. IPsec solved the resulting recursive routing problem with,
basically, <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/IPSecLimitation">magic</a>. Systems knew what
IPsec was and knew that once traffic had gone through IPsec, it was
fired out onto the network without looping back through again. You
could write an IPsec policy that said 'everything to host X must
be IPsec'd', and your system would make this work so that IPsec
traffic didn't actually count as 'everything' (including, I believe,
IPsec setup and negotiations). IPsec got this support partly because
it was '<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/IPv6AndIPsec">part of IP(v6)</a>', and so was obviously
worthy of special and more or less standardized treatment (and, for
example, <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/IptablesBlockNonIpsec">its own IP protocol along side things like TCP, UDP, and
GRE</a>). Other encrypted protocols
were not so lucky.</p>
<p>(<a href="https://utcc.utoronto.ca/~cks/space/blog/linux/IKEForPointToPointGRE">You could restrict IPsec's scope so it only applied to certain
traffic</a>, but I don't think you had
to.)</p>
<p>You can solve a related version of this, what I call the VPN recursive
routing problem, by using a highly specific IP route as well as a
more general one. You can say '128.100.X.0/24 is routed through the
VPN tunnel, but 128.100.X.VPN/32 is routed through <my default
gateway>'. Your traffic to everything on 128.100.X.0/24 except the
VPN itself will be shoved through the VPN tunnel, but the more
specific route for the VPN's IP diverts its traffic away from the
tunnel. When you don't actually talk to the VPN server other than
for the VPN itself, this is fine, but it's potentially less fine
if you'd like <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/VPNMeshAppeal">a mesh-capable VPN solution where you can make
individual hosts VPN endpoints</a>.</p>
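<p>(A sketch of setting up that pair of routes on Linux, done here in
Go via os/exec for concreteness; the network, VPN server IP, tunnel
device, and gateway are all made up documentation-style values, and
this needs root to actually work.)</p>
<pre>
package main

import (
	"fmt"
	"os/exec"
)

// All addresses and device names here are invented for illustration
// (203.0.113.0/24 and 192.0.2.1 are documentation ranges).
func main() {
	cmds := [][]string{
		// Send the whole protected network through the VPN tunnel...
		{"ip", "route", "add", "203.0.113.0/24", "dev", "tun0"},
		// ...except for the VPN server itself, which gets a more
		// specific /32 route out through the regular default gateway.
		{"ip", "route", "add", "203.0.113.5/32", "via", "192.0.2.1"},
	}
	for _, c := range cmds {
		if out, err := exec.Command(c[0], c[1:]...).CombinedOutput(); err != nil {
			fmt.Printf("%v failed: %v (%s)\n", c, err, out)
		}
	}
}
</pre>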
<p>(One hack around this in a mesh-capable VPN environment is to not
use the same IP. VPN endpoints have <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/VPNRoutingProblem">an inside IP as well as an
outside IP</a>, and all of their services are
accessed using the inside IP. But this gives you other issues.)</p>
<p>Otherwise you are looking at operating system specific hacks. Linux
has <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/WireGuardEarlyNotes">a 'fwmark' system that you can use to tag specific packets
and route them specially</a>, using
Linux's general system for <a href="https://en.wikipedia.org/wiki/Policy-based_routing">policy based routing</a>. I believe
that Windows, macOS, and iOS all have their own approaches. Being
Linux based, Android may use fwmark and other things under the hood,
or it may have a different approach.</p>
<p>(If you're lucky, <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/NetworkManagerWireGuardClient">your software will magically hide this from
you</a>. However, even then
the issue can resurface if <a href="https://mastodon.social/@cks/109832254154580906">you want to do something a bit tricky</a>.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/IPRecursiveRoutingProblem?showcomments#comments">One comment</a>.) </div>The general 'recursive routing' problem in IP networking2024-02-26T21:43:52Z2023-02-09T04:06:48Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/UniversityPeopleWantOurIPscks<div class="wikitext"><p>Suppose that your organization has a VPN server that people use to
access internal resources that you don't expose to the Internet. One of
the traditional decisions you had to make when you were setting up such
a VPN server was whether you would funnel all traffic over the VPN, no
matter where it was to, or whether you'd funnel only internal traffic
and let external traffic go over people's regular Internet connections.
In many environments the answer is that the VPN server is only really
for internal traffic; it's either discouraged or impossible to use it for
external traffic.</p>
<p>Universities are not one of those places. In universities, quite often
you'll find that people actively need to use your VPN server for all
of their traffic, or otherwise things will break in subtle ways. One
culprit is the world of academic publishing, or more exactly online
electronic access to academic publications. These days, many of these
online publications are provided to you directly by the publisher's
website. This website decides if you are allowed to access things by
seeing if your institution has purchased access, and it often figures
out your institution by looking at your IP address. As a result, if a
researcher is working from home but wants to read things, their traffic
had better be coming from your IP address space.</p>
<p>(There are other access authentication schemes possible, but
this one is easy for everyone to set up and understand, and it
doesn't reveal very much to publishers. Universities rarely
change their IP address space, and in <a href="https://en.wikipedia.org/wiki/COVID-19_pandemic">the before times</a> you could assume that
most researchers were working from on-campus most of the time.)</p>
<p>In an ideal world, academic publishers (and other people restricting
access to things to your institution) could tell you what IP addresses
they would be using, so you could add them to your VPN configuration
as a special exemption (ie, as part of the IP address space that
should be sent through the VPN). In the real world, there are clouds,
frontend services, and many other things that mean the answer is large,
indeterminate, and possibly changing at arbitrary times, sometimes
out of the website operator's direct control. Also, the visible web
site that you see may be composited (in the browser) from multiple
sources, <a href="https://utcc.utoronto.ca/~cks/space/blog/web/BrowserNetworkDebuggingTweak">with some sub-resources quietly hosted in some cloud</a>. For sensible reasons, the website
engineering team does not want to have to tell the customer relations
team every time they want to change the setup and then possibly wait for
a while as customers get onboard (or don't).</p>
<p><a href="https://support.cs.toronto.edu/">Our</a> VPNs default to sending all
of people's traffic through us. At one point we considered narrowing
this down (<a href="https://utcc.utoronto.ca/~cks/space/blog/web/BrowserNetworkDebuggingTweak">for reasons</a>);
feedback from people around here soon educated us that this was not
feasible, at least not while keeping our VPN really useful to them.
When you're a university, people want your IPs, and for good reasons.</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/UniversityPeopleWantOurIPs?showcomments#comments">One comment</a>.) </div>In a university, people want to use our IPs even for external traffic2024-02-26T21:43:52Z2023-02-04T04:21:38Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/WhatIsAModernNOPcks<div class="wikitext"><p>I recently wrote about <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/PowerPCInstructionOddity">an instruction oddity in the PowerPC 64-bit
architecture</a>, where a number of <code>or</code>
instructions with no effects were reused to signal <a href="https://en.wikipedia.org/wiki/Simultaneous_multithreading">hardware thread</a> priority
to the CPU. This came up when Go accidentally used one of those
instructions for its own purposes and accidentally lowered the
priority of the hardware thread.
One of the reactions I've seen has been a suggestion that people
should consider all unofficial NOPs (ie, NOPs other than the
officially documented ones) to be reserved by the architecture.
However, this raises a practical philosophical question, namely
what's considered a <a href="https://en.wikipedia.org/wiki/NOP_(code)">NOP</a>.</p>
<p>In the old days, CPU architectures might define an explicit NOP
instruction that was specially recognized by the CPU, such as <a href="https://www.masswerk.at/6502/6502_instruction_set.html">the
6502's NOP</a>.
Modern CPUs generally don't have a specific NOP instruction in this
way; instead, the architecture has a significant number of instructions
that have no effects (for various reasons, including the regularity
of instruction sets) and one or a few of those instructions is
blessed as the official NOP and may be specially treated by CPUs. The
PowerPC 64-bit official NOP is 'or r1, r1, 0', for example (which
theoretically OR's register r1 with 0 and puts the result back into
r1).</p>
<p>Update: I made a mistake here; the official NOP uses register r0,
not r1, so 'or r0, r0, 0', sometimes written 'ori 0, 0, 0'.</p>
<p>So if you say that all unofficial NOPs are reserved and should be
avoided, you have to define what exactly a 'NOP' is in your
architecture. One aggressive definition you could adopt is that any
instruction that always has no effects is a NOP; this would make
quite a lot of instructions NOPs and thus unofficial NOPs. This
gives the architecture maximum freedom for the future but also means
that all code generation for your architecture needs to carefully
avoid accidentally generating an instruction with no effects, even
if it naturally falls out by accident through the structure of that
program's code generation (which could be a simple JIT engine).</p>
<p>Alternately, you could say that (only) all variants of your standard
NOP are reserved; for PowerPC 64-bit, this could be all <code>or</code>
instructions that match the pattern of either 'or rX, rX, rX' or
'or rX, rX, 0' (let's assume the immediate is always the third
argument). This leaves the future CPU designer with fewer no-effect
operations they can use to signal things to the CPU, but makes the
life of code generators simpler because there are fewer instructions
they have to screen out as special exceptions. If you wanted to you
could include some other related types of instructions as well, for
example to say that 'xor rX, rX, 0' is also a reserved unofficial
NOP.</p>
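<p>(As an illustration of what a code generator might have to screen
for, here's a Go sketch that checks whether a 32-bit ppc64 instruction
word looks like 'or rX, rX, rX'. It assumes the usual X-form encoding
of <code>or</code>, primary opcode 31 and extended opcode 444, which
you should check against the ISA manual before relying on it.)</p>
<pre>
package main

import "fmt"

// isOrSelfNop reports whether insn looks like 'or rX, rX, rX', an
// instruction with no architectural effect that some ppc64 CPUs reuse
// as a priority hint. It assumes the X-form 'or' encoding: primary
// opcode 31, extended opcode 444, with the record (Rc) bit clear.
func isOrSelfNop(insn uint32) bool {
	op := insn >> 26          // primary opcode
	xo := (insn >> 1) & 0x3ff // extended opcode
	rc := insn & 1            // record bit
	rs := (insn >> 21) & 0x1f // source register
	ra := (insn >> 16) & 0x1f // destination register
	rb := (insn >> 11) & 0x1f // second source register
	return op == 31 && xo == 444 && rc == 0 && rs == ra && ra == rb
}

func main() {
	// 0x7c210b78 should encode 'or r1, r1, r1' under the assumptions above.
	fmt.Println(isOrSelfNop(0x7c210b78)) // true
	fmt.Println(isOrSelfNop(0x60000000)) // false: 'ori 0, 0, 0', the official NOP
}
</pre>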
<p>A CPU architecture can pick whichever answer it wants to here, but
I hope I've convinced my readers that there's more than one answer
here (and that there are tradeoffs).</p>
<p>PS: Another way to put this is that when an architecture makes some
number of otherwise valid instructions into 'unofficial NOPs' that
you must avoid, it's reducing the regularity of the architecture
in practice. We know that the less regular the architecture is, the
more annoying it can be to generate code for.</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/WhatIsAModernNOP?showcomments#comments">4 comments</a>.) </div>The CPU architectural question of what is a (reserved) NOP2024-02-26T21:43:52Z2023-01-30T03:23:44Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/PowerPCInstructionOdditycks<div class="wikitext"><p>Over on the Fediverse, <a href="https://mastodon.social/@cks/109717006467114169">I reported my discovery of a ppc64 oddity</a>:</p>
<blockquote><p>TIL that the ppc64 (PowerPC 64-bit) architecture overloads 'or
r1,r1,r1' (and the same using all r6 or r2) to change the (hardware)
priority of your thread. This came up in a Go code generation issue,
and Raymond Chen mentioned it in passing in 2018.</p>
<p><a href="https://github.com/golang/go/issues/57741"><Go issue 57741></a> <br>
<a href="https://devblogs.microsoft.com/oldnewthing/20180809-00/?p=99455"><Raymond Chen 2018 article></a> <br>
Also see the discussion in this PDF: <a href="https://student.ing-steen.se/unix/aix/redbooks/sg245768.pdf"><"Advanced POWER Virtualization
..."></a></p>
</blockquote>
<p>(<a href="https://mastodon.social/@cks/109717211018601975">Also</a>.)</p>
<p>As Raymond Chen notes, 'or rd, ra, ra' has the effect of 'move ra
to rd'. Moving a register to itself is a NOP, but several Power
versions (the Go code's comment says Power8, 9, and 10) overload
this particular version of a NOP (and some others) to signal that
the priority of your hardware thread should be changed by the CPU;
in the specific case of 'or r1, r1, r1' it drops you to low priority.
That leaves us with the mystery of why such an instruction would
be used by a compiler, instead of the official NOP (per Raymond
Chen, this is 'or r0, r0, 0').</p>
<p>The answer is kind of interesting and shows how intricate things can get
in modern code. Go, like a lot of modern languages, wants to support
stack tracebacks from right within its compiled code, without the aid
of an external debugger. In order to do that, the Go runtime needs to
be able to unwind the stack. Unwinding the stack is a very intricate
thing on modern CPUs, and you can't necessarily do it past arbitrary
code. Go has a special annotation for 'you can't unwind past here',
which is automatically applied when the Go toolchain detects that some
code (including assembly code) is manipulating the stack pointer in a
way that it doesn't understand:</p>
<blockquote><p>SPWRITE indicates a function that writes an arbitrary value to SP (any
write other than adding or subtracting a constant amount).</p>
</blockquote>
<p>As covered in <a href="https://go-review.googlesource.com/c/go/+/425396/2/src/runtime/asm_ppc64x.s">the specific ppc64 diff in the change that introduced
this issue</a>,
Go wanted to artificially mark a particular runtime function this
way (see <a href="https://go-review.googlesource.com/c/go/+/425396">CL 425396</a>
and <a href="https://github.com/golang/go/issues/54332">Go issue #54332</a>
for more). To do this it needed to touch the stack pointer in a
harmless way, which would trigger the toolchain's weirdness detector.
On ppc64, the stack pointer is in r1. So the obvious and natural
thing to do is to move r1 to itself, which encodes as 'or r1, r1,
r1', and which then triggers this special architectural behavior
of lowering the priority of that hardware thread. Oops.</p>
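<p>To make the encoding side of this a bit more concrete, here's a small
Go sketch of a check for 'or rX, rX, rX' forms in a 32-bit ppc64
instruction word. The field layout is my reading of the Power ISA X-form
(primary opcode 31, extended opcode 444 for 'or'); the function name is
made up and this is an illustration, not code from the Go toolchain:</p>
<pre>
package main

import "fmt"

// isOrToSelf reports whether insn (a 32-bit ppc64 instruction word) is
// 'or rX, rX, rX' for some register X: a register OR'd with itself and
// written back to itself, which is logically a NOP but which some Power
// CPUs overload as a hardware thread priority hint.
func isOrToSelf(insn uint32) (reg uint32, ok bool) {
    op := insn >> 26          // primary opcode
    rs := (insn >> 21) & 31   // source register RS
    ra := (insn >> 16) & 31   // destination register RA
    rb := (insn >> 11) & 31   // second source register RB
    xo := (insn >> 1) & 0x3ff // extended opcode
    rc := insn & 1            // 'record' bit
    if op == 31 && xo == 444 && rc == 0 && rs == ra && ra == rb {
        return ra, true
    }
    return 0, false
}

func main() {
    // 0x7c210b78 is 'or r1, r1, r1' under this field layout.
    if reg, ok := isOrToSelf(0x7c210b78); ok {
        fmt.Printf("or r%d, r%d, r%d is a register-to-itself move\n", reg, reg, reg)
    }
}
</pre>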
<p>(<a href="https://go-review.googlesource.com/c/go/+/461597/4/src/runtime/asm_ppc64x.s">The fix changes this to another operation that is apparently
harmless due to how the Go ABI works on ppc64</a>.
Based on <a href="https://go.googlesource.com/go/+/refs/heads/master/src/cmd/compile/abi-internal.md#ppc64-architecture">the ppc64 architecture section of the Go internal ABI</a>,
Go seems to define r0 as always zero.)</p>
<p>I don't know why PowerPC decided to make r1 (the stack pointer) the
register used to signal lowering hardware thread priority, instead
of some other register. It's possible r1 was chosen specifically
because very few people were expected to write an or-NOP using the
stack pointer instead of some other register.</p>
<p>(The whole issue is a useful reminder that modern architectures can
have some odd corners and weird cases.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/PowerPCInstructionOddity?showcomments#comments">2 comments</a>.) </div>An instruction oddity in the ppc64 (PowerPC 64-bit) architecture2024-02-26T21:43:52Z2023-01-21T03:47:58Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/SSDsAndBlockDiscardTrimcks<div class="wikitext"><p>Although things became complicated later, HDDs started out having
a specific physical spot for each and every block (and even today
most HDDs mostly have such a thing). You could in theory point at
a very tiny spot on a HDD and correctly say 'this is block 5,321
and (almost) always will be'. Every time you wrote to block 5,321,
that tiny spot would get new data, as an in-place update. SSDs
famously don't work like this, because in general you can't immediately
rewrite a chunk of <a href="https://en.wikipedia.org/wiki/Flash_memory">flash memory</a> that's been written
to the way you can a HDD platter; instead, you need to write to
<a href="https://en.wikipedia.org/wiki/Flash_memory#Block_erasure">newly erased flash memory</a>. In
order for SSDs to pretend that they were rewriting data in place,
SSDs need both a data structure to map from logical block addresses
to wherever the latest version of the block is in physical flash
memory and a pool of ready to use erased flash blocks that the SSD
can immediately write to.</p>
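<p>As a deliberately over-simplified illustration of those two structures
(the logical-to-physical map and the pool of erased blocks), here is a
toy Go sketch; all of the names are invented and real SSD firmware is
vastly more complicated than this:</p>
<pre>
package main

import "fmt"

// toySSD models only the two structures mentioned above: a map from
// logical block addresses to physical flash blocks, and a pool of
// ready-to-use erased physical blocks.
type toySSD struct {
    mapping map[int]int // logical block address -> physical flash block
    erased  []int       // pool of erased physical blocks
}

// write handles a logical (over)write by taking a fresh erased block and
// repointing the logical address at it; the previous physical block
// becomes garbage to be erased later (not modelled here).
func (s *toySSD) write(lba int) error {
    if len(s.erased) == 0 {
        return fmt.Errorf("no erased blocks ready; the write must wait")
    }
    phys := s.erased[0]
    s.erased = s.erased[1:]
    s.mapping[lba] = phys
    return nil
}

func main() {
    ssd := &toySSD{mapping: map[int]int{}, erased: []int{10, 11, 12}}
    for i := 0; i < 4; i++ {
        if err := ssd.write(5321); err != nil {
            fmt.Println("write", i, "stalled:", err)
        } else {
            fmt.Println("write", i, "went to physical block", ssd.mapping[5321])
        }
    }
}
</pre>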
<p>In general the size of this erased blocks pool is a clear contributor
to the long term write performance of SSDs. By now we're all aware
that a fresh or newly erased SSD generally has better sustained write
performance than one that's been written to for a while. The general
assumption is that a large part of the difference comes from the pool
of immediately ready erased flash blocks shrinking as more and more
of the SSD is written to.</p>
<p>(SSDs are very complicated black boxes so we don't really know this for
sure; there could be other contributing factors that the manufacturers
don't want to tell us about.)</p>
<p>One way that SSDs maintain such a pool (even after they've been
written to a lot) is through over-provisioning. If a SSD claims to
be 500 GB but really has 512 GB of flash memory, it has an extra
12 GB of flash that it can use for its own purposes, including for
a pool of pre-erased flash blocks. Such a pool won't hold up forever
if you keep writing to the SSD without pause, but by now we expect
that sustained write speed will drop on a SSD at some point. One
of the many unpredictable variables in SSD performance is how fast
a SSD will be able to refresh its pool given some amount of idle
time.</p>
<p>The other way that SSDs can maintain such a pool is that you tell
them that some logical blocks can be thrown away. One way to do
this is <a href="https://wiki.archlinux.org/title/Solid_state_drive/Memory_cell_clearing">erasing the drive</a>,
which has the drawback that it erases everything. The more modern
way is for your filesystem or your block layer to use <a href="https://en.wikipedia.org/wiki/Trim_%28computing%29">a SSD 'TRIM'
command</a> to
tell the SSD that some blocks are unused and so can be entirely
discarded (the actual specifics in SATA, SCSI/SAS, and NVMe are
<a href="https://en.wikipedia.org/wiki/Trim_%28computing%29#Hardware_support">impressively varied</a>).
Obviously TRIM can be used to implement 'erase drive', although
this may not be quite the same inside the SSD as a real erase; this
use of TRIM for drive erase is what I believe <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/ErasingSSDsWithBlkdiscard">Linux's <code>blkdiscard</code>
does by default</a>.</p>
<p>For obvious reasons, correctly implementing TRIM operations in your
filesystem and block storage layers is critical. If there are any
bugs that send TRIM commands for the wrong blocks (either to the
wrong block addresses or mistaking which blocks are unused), you've
just created data loss. People also used to worry about SSDs
themselves having bugs in their TRIM implementations, since modern
SSDs contain fearsome piles of code. By now, my impression is that
TRIM has been around long enough and enough things are using it by
default that the bugs have been weeded out (but then <a href="https://wiki.debian.org/SSDOptimization">see the Debian
wiki page</a>).</p>
<p>(I believe that modern Linux systems default to TRIM being on for
common filesystems. On the other side, OpenZFS still defaults
automatic TRIM to off except on FreeBSD, although it's been long
enough since <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/ZFSNoTrimForMeYet">my initial caution about TRIM on ZFS</a> that I should try it.)</p>
<p>One of the interesting issues with TRIM is how it interacts with
encrypted disks or filesystems, which are increasingly common on
laptops and desktops. On the one hand, supporting TRIM is probably
good for performance and maybe SSD lifetime; on the other hand, it
raises challenges and potentially leaks information about how big
the filesystem is and what blocks are actually used. I honestly
don't know what various systems do here.</p>
<p>In many Linux environments, filesystems tend to sit on top of various
underlying layers, such as LVM and software RAID (and disk encryption).
In order for filesystem TRIM support to do any good it must be
translated and passed through those various layers, which is something
that hasn't always happened. According to <a href="https://wiki.archlinux.org/title/Solid_state_drive">the Arch Wiki SSD page</a>, modern versions
of LVM support passing TRIM through from the filesystem, and I
believe that software RAID has for some time.</p>
<p>A further complication in TRIM support is that if you're using SATA
SSDs behind a SAS controller, apparently not all models of (SATA)
SSDs will support TRIM in that setup. We have Crucial MX500 2 TB
SSDs in <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/ZFSFileserverSetupIII">some Ubuntu 22.04 LTS fileservers</a> where 'lsblk -dD' says the SATA
connected ones will do TRIM operations but the SAS connected ones
won't. However, WD Blue 2 TB SSDs say they're happy to do TRIM even
when connected to the SAS side of things.</p>
<p>(Also, I believe that TRIM may often not work if you're connecting
a SATA SSD to your system through a USB drive dock. This is a pity
because it's otherwise a quite convenient way to work through a
bunch of SSDs to blank out and reset. I wouldn't be surprised if
this depends on both the USB drive dock and the SATA SSD. Now that
I've discovered 'lsblk -D' I'm going to do some experimentation.)</p>
<p>At one point I would have guessed that various SSDs might specially
recognize writes of all-zero blocks or similar things and trigger
TRIM-like functionality, where the write is just discarded and the
logical block is marked as 'trim to <zero or whatever>' (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/SSDBenchmarkingConcerns">I worried
about this in the context of benchmarking</a>).
I can't rule out SSDs doing that today, but given widespread support
for TRIM, recognizing all-zero writes seems like the kind of thing
you'd quietly drop from your SSD firmware to simplify life.</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/SSDsAndBlockDiscardTrim?showcomments#comments">One comment</a>.) </div>Some things on SSDs and their support for explicitly discarding blocks2024-02-26T21:43:52Z2023-01-19T04:10:24Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/LetsEncryptAndMultipleNamescks<div class="wikitext"><p>One of the things that people don't like about <a href="https://letsencrypt.org/">Let's Encrypt</a>'s ACME protocol for getting TLS certificates
is that it's complicated (even beyond using <a href="https://en.wikipedia.org/wiki/JSON_Web_Token">JSON Web Tokens (JWT)</a>). Part of this
complexity is that it famously requires you to create and register
an account, and <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/LetsEncryptAuthorizations">the actual authorization process to get a TLS
certificate for a domain involves this account in a multi-step
process</a>. You can readily come up with
simpler single-step processes (such as the one in <a href="https://rachelbythebay.com/w/2023/01/04/cert/">Another look
at the steps for issuing a cert</a>), so you can ask
why ACME requires its multi-step process. It recently occurred to
me that one reason for this is that it probably makes the logic of
issuing a TLS certificate for multiple names simpler.</p>
<p>(We think about multiple names in TLS certificates a lot; most of
<a href="https://support.cs.toronto.edu/">our</a> TLS certificates have at
least two names for what boils down to historical reasons.)</p>
<p>A single-step process for getting a TLS certificate for a domain
would be to tell the Certificate Authority that you want a certificate
(with a given private key), the CA challenges you to demonstrate
control of that domain, and when you do it issues you with that
certificate. However, extending this process to a TLS certificate
for multiple names makes it somewhat more involved. You would have
to send the CA the list of names you wanted in the certificate, it
would have to challenge you separately to demonstrate control of
each domain, and keep careful track of which domains you'd proven
control of and which you hadn't. You'd want to carefully consider
'control of domain' timeouts in this process to decide how close
together you have to prove control of all domains; is it within a few
minutes, a few hours, or a few days? If you'd made a mistake in
a domain name, you'd probably have to cancel or abandon the current
request and start all over from scratch (re-proving control of all
correct names in a second attempt).</p>
<p>The ACME protocol's two step process makes this simpler and more
uniform. You create an account, prove that that account has control
of one or more domains, and then have that account ask for a TLS
certificate for some name or names that it's proven control of
within a specific time period. If you ask for a TLS certificate
that includes a name you haven't demonstrated control over, the
request fails (more or less, <a href="https://www.rfc-editor.org/rfc/rfc8555.html#section-7.4">it's a bit complicated</a>). The
protocol and the server can consider each 'prove control of' request
separately and independently, without having to maintain state on
how many domains you've proven control of (and how recently) and
trigger issuing a TLS certificate when all of the boxes are ticked
off.</p>
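<p>Here's a toy Go sketch of that account-centric bookkeeping (the names
are invented; this is not the real ACME protocol or any actual CA's data
model). The server only has to remember which names an account has proven
control of and until when, and issuance is an independent check over that
record:</p>
<pre>
package main

import (
    "fmt"
    "time"
)

// account tracks, per name, when its proven authorization expires.
type account struct {
    validated map[string]time.Time
}

// recordAuthorization notes that this account proved control of a name.
func (a *account) recordAuthorization(name string, lifetime time.Duration) {
    a.validated[name] = time.Now().Add(lifetime)
}

// canIssue reports whether a certificate for these names could be issued
// right now; each name just needs its own unexpired authorization, and
// nothing else about the request has to be remembered between steps.
func (a *account) canIssue(names []string) bool {
    now := time.Now()
    for _, name := range names {
        expiry, ok := a.validated[name]
        if !ok || expiry.Before(now) {
            return false
        }
    }
    return true
}

func main() {
    acct := &account{validated: map[string]time.Time{}}
    acct.recordAuthorization("www.example.org", 30*24*time.Hour)
    fmt.Println(acct.canIssue([]string{"www.example.org"}))                  // true
    fmt.Println(acct.canIssue([]string{"www.example.org", "example.org"})) // false
}
</pre>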
<p>(Instead it's the ACME client's job to request actual issuance of
the certificate, which is when it submits the actual <a href="https://en.wikipedia.org/wiki/Certificate_signing_request">Certificate
Signing Request (CSR)</a>. I admit
I was a little bit surprised to find out that ACME actually uses
real CSRs, since they're complicated objects and these days almost
all of their contents get ignored by the CA, or maybe cause your
request to be rejected.)</p>
<p>Using a keypair to identify the account instead of what would in
effect be a (web) cookie means that any leak or interception of the
traffic between you and Let's Encrypt doesn't give out information
a potential attacker can use to impersonate you (and thus use your
identity to get TLS certificates). This is important in a two-step
protocol, but not in a one-step one where you tell the CA the keypair
you're using for the TLS certificate before it starts authorizing
you (all the CA has to do to stop attackers intercepting traffic
is to bind all of the results to the future certificate's public
key).</p>
<p>(The two-step process could make you pre-commit to a TLS keypair
for your eventual certificate or certificates and use that as your
identity, but then you'd have to decide in advance what names would
be in the eventual TLS certificate.)</p>
<p>PS: <a href="https://www.rfc-editor.org/rfc/rfc8555.html#section-7.4">The actual ACME protocol steps for getting a TLS certificate</a> don't
require you to start in a state where you've already proven control
over all of the domains. Instead, the ACME server will tell you if
you need to do some authorizations, and starting from a certificate
request is the normal flow of establishing them. There's a separate
<a href="https://www.rfc-editor.org/rfc/rfc8555.html#section-7.4.1">'Pre-authorization'</a> process
that ACME servers can implement that allows you to authorize accounts
for names before you make a certificate request.</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/LetsEncryptAndMultipleNames?showcomments#comments">One comment</a>.) </div>Let's Encrypt's complex authorization process and multi-name TLS certificates2024-02-26T21:43:52Z2023-01-09T04:19:14Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/ATXChassisPowerSwitchNotescks<div class="wikitext"><p><a href="https://utcc.utoronto.ca/~cks/space/blog/tech/HomeMachineAndSunkCosts">My problematic home desktop</a> continues
to be problematic, so recently over on the Fediverse <a href="https://mastodon.social/@cks/109571590236666394">I had a
clever idea about 'replacing' the case front panel with just a
stand-alone ATX chassis power switch tucked away somewhere</a>, which <a href="https://mastodon.org.uk/@penguin42/109571593386362502">@penguin42
suggested improving by relying on setting the BIOS to 'always power
on when AC is restored' and not having a chassis power switch at all</a>. This left
me with a certain amount of curiosity about ATX chassis power
switches and the matrix of possibilities for what you can do to
your computer and how that interacts with BIOS 'AC loss' settings.</p>
<p>Famously, ATX power supplies are really controlled by the motherboard,
not by any front panel case switches (although better PSUs will
have a hard power switch so you don't have to yank the cord). The
front panel case power switch is a soft switch that communicates
with the BIOS or triggers power on, and your motherboard can have
the PSU 'turn off' (which still leaves standby power flowing to the
motherboard), which is what enabled modern PC Unixes to have commands
like '<code>poweroff</code>' and '<code>halt -p</code>'. Physically, an ATX chassis power
switch (the front panel switch) is normally a momentary-contact
switch. It is normally off (no current flowing), but when pushed
it connects the circuit for as long as you keep it pressed. Since
the circuit is normally open, not having a chassis power switch
connected is the same as not pressing it, so your system can still
power up in this state under the appropriate conditions.</p>
<p>(I was reasonably certain of this but not completely sure, so I
looked it up.)</p>
<p>BIOSes for ATX motherboards tend to have three options for what to
do if AC power is lost and then return: they can stay powered off,
they can always power on, or they can return to their last state
(either powered off if you turned them off, or powered on if you'd
turned them on). If you run 'poweroff' on your Linux system and
your BIOS is set to either 'stay powered off' or 'return to last
state', you will have to use the chassis power button to trigger
the motherboard to power up (or possibly use the BMC on servers
with it). If you run 'poweroff' on a system set to always power on
when AC comes back, you can cut AC power and then turn it back on
in order to get the motherboard to power back up.</p>
<p>(The downside of 'always power on when AC comes back' is that you
need to make physical changes to make a system stay down after a
power loss and restart, even if the changes are only flipping a
switch.)</p>
<p>If you don't have a chassis power button, you can no longer power
off the system from the front panel, either by triggering some sort
of orderly shutdown or by holding the front panel power button down
long enough that the BIOS does an immediate power off (and if you
have no reset switch too, you can't trigger that either). Instead,
your only option is an immediate hard power off by flipping the PSU
switch. You can also no longer power the system on from the front
panel; instead you need the BIOS to be set to 'always power on after
AC loss' (and stay that way) and then switch AC power off and back
on again, or you're going to have to open up the case to get the
system back on. Some motherboards (including <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/HomeMachine2018">my home desktop</a>) have a motherboard push switch that
you can use to trigger a power on; otherwise you'll have to reconnect
the front panel (or connect a stand alone switch).</p>
<p>(Naturally, you want to make sure that your BIOS has a reliable
'always power on when AC power comes back' setting before you
disconnect the chassis power switch from the motherboard. Otherwise
you may wind up having an annoying time. This is definitely one of
those changes that you want to make in the right order.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/ATXChassisPowerSwitchNotes?showcomments#comments">3 comments</a>.) </div>Sorting out PC chassis power switches for ATX power supplies2024-02-26T21:43:52Z2022-12-26T04:09:36Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/YamlPracticalDocumentationIssuecks<div class="wikitext"><p>These days, YAML is used as the configuration file format for an
increasing amount of systems that I need to set up and operate for
<a href="https://support.cs.toronto.edu/">work</a>. I have my issues with
YAML in general (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/YamlWhitespaceProblem">1</a>, <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/YamlComplexityProblem">2</a>),
but in the process of writing configuration files for programs that
use YAML, I've found an entirely practical one, which I will summarize
this way: <strong>a YAML schema description is not actually documentation
for a system's configuration file</strong>.</p>
<p>It's entirely possible to write good documentation for a configuration
system that happens to use YAML as its format. But while showing
people the schema is often easy, it's not sufficient. Even a complete
YAML schema merely tells the reader what values can go where in
your configuration file's structure. It doesn't tell them what those
values mean, what happens if they're left out, why you might
want to set (or not set) certain values, or how settings interact
with each other.</p>
<p>(And people don't even always do complete YAML schemas. I've seen
more than one where the value of some field is simply documented as
'<string>', when in fact only a few specific strings have meaning and
others will cause problems.)</p>
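<p>To illustrate the gap, here's a hypothetical configuration section
written as a Go struct with YAML tags; the field names and their semantics
are entirely made up. The struct is roughly what a schema dump tells you,
and the trailing comment is the part that actual documentation has to
supply:</p>
<pre>
package main

// CheckerConfig is all that a schema-style dump gives you: field names
// and their types.
type CheckerConfig struct {
    Mode     string `yaml:"mode"`     // schema says: a string
    Interval int    `yaml:"interval"` // schema says: an integer
}

// What real documentation would have to add (again, hypothetical):
//
//   mode:     only "strict" and "relaxed" are meaningful; any other string
//             silently disables checking entirely. Defaults to "relaxed"
//             if omitted.
//   interval: seconds between checks; 0 means "never run checks", not
//             "run them continuously". Must be set if mode is "strict".

func main() {}
</pre>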
<p>I don't know why just dumping out your YAML schema is so popular
as a way to do configuration file 'documentation'. Perhaps it's
because you have to do it as a first step, and once you've done
that it's attractive to add a few additional notes and stop,
especially if you think the names of things in your schema are
already clear about what they're about and mean. Good documentation
is time consuming and hard, after all.</p>
<p>I suspect that this approach of reporting the schema and stopping
is used for YAML things other than configuration files, but I
haven't encountered such things yet. (I've only really encountered
YAML in the context of configuration files, where it's at least
better than some of the alternatives I've had to deal with.)</p>
<p>(All of this assumes that your configuration file is as simple as
a set of keys and simple values. <a href="https://utcc.utoronto.ca/~cks/space/blog/programming/YAMLAndConfigurationFiles">Not all configuration files are
so simple</a>, but systems
with more complex values tend to write better documentation. Possibly
this is because a dump of the schema is obviously insufficient when
the values can be complex.)</p>
</div>
A practical issue with YAML: your schema is not actually documentation2024-02-26T21:43:52Z2022-12-17T03:55:24Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/RealNamesBroadcastProblemcks<div class="wikitext"><p>One of the ever popular ideas for a 'better Internet' (or a better
service of some sort) brought up by potentially well intentioned
people is that you should have to use your real name instead of any
pseudonym. There are many, many problems with this idea and you can
find it thoroughly answered (and debunked) in various places. But
recently on the Fediverse <a href="https://mastodon.social/@cks/109417934712425607">I had an additional thought about it</a>:</p>
<blockquote><p>One of the not 100% obvious problems of a genuinely enforced "real
names only" policy is that it forces a number of types of people to
immediately broadcast various changes of status by renaming themselves
to their new 'proper name'.</p>
<p>In re this rundown of other issues and how it's a bad idea in
general: <br>
<a href="https://mastodon.laurenweinstein.org/@lauren/109417029952957232"><link></a></p>
</blockquote>
<p>If a place has a true, enforced "real names only" policy, it follows
that you must not only use your real name when you create your account
but also promptly change the real name for your account if your normal
or legal real name changes. Interested or nefarious parties can then
watch for such changes in name to detect interesting (to them) changes
in status. Did you get married and change your last name? Broadcast
that. Did you get divorced and change your last name back? Broadcast
that. Did you decide to legally cut ties with your abusive family and
change your name in the process? Again, you get to broadcast that.</p>
<p>(Of course, there are additional and <a href="https://en.wikipedia.org/wiki/Deadnaming">more sensitive</a> reasons to change a name,
reasons that the person doing it may well very much not want to
broadcast.)</p>
<p>Of course, if you don't update your real name, in practice the place
will not automatically know that your listed real name is now incorrect
and you're out of compliance with its policies. But this makes life
dangerous for you, because by being in violation of policies you're
handing any enemies a means of attacking your account; if they know
of your change in status through other means, all they have to do is
file a complaint.</p>
<p>(This is of course related to the issue that <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/LoginsDoChange">login names do change</a>, as kind of the flipside of it.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/RealNamesBroadcastProblem?showcomments#comments">One comment</a>.) </div>An enforced 'real names only' policy forces people to advertise things2024-02-26T21:43:52Z2022-12-13T02:05:57Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/CAPartialDistrustWhyGoodcks<div class="wikitext"><p>One of the arguments I've heard against supporting partial distrust
of Certificate Authorities in places like Linux root certificate
stores (<a href="https://utcc.utoronto.ca/~cks/space/blog/linux/CARootStoreTrustProblem">which you currently can't really do</a>) is that a bad CA can simply
backdate TLS certificates to get around things like 'certificates
issued from December 1st 2022 onward won't be trusted'. On the one
hand, this is technically true (although these days either such a
TLS certificate wouldn't be usable in the majority of web browsers
or it would soon be detected through Certificate Transparency logs).
On the other hand, there are a collection of reasons to think that
it's a good thing that browsers can do this sort of thing (and thus
that more software should support it).</p>
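<p>Mechanically, this kind of partial distrust can be as simple as
comparing a certificate's issuance date against a cutoff. Here's a minimal
Go sketch using a hypothetical cutoff date; real root stores and browsers
are more nuanced than this:</p>
<pre>
package main

import (
    "crypto/x509"
    "fmt"
    "time"
)

// distrustAfter is a hypothetical cutoff: TLS certificates from the
// questionable CA issued on or after this date are no longer accepted,
// while older ones keep working.
var distrustAfter = time.Date(2022, time.December, 1, 0, 0, 0, 0, time.UTC)

// stillAcceptable is the partial-distrust check in its simplest form.
func stillAcceptable(cert *x509.Certificate) bool {
    return cert.NotBefore.Before(distrustAfter)
}

func main() {
    older := &x509.Certificate{NotBefore: time.Date(2022, time.June, 1, 0, 0, 0, 0, time.UTC)}
    newer := &x509.Certificate{NotBefore: time.Date(2023, time.January, 1, 0, 0, 0, 0, time.UTC)}
    fmt.Println(stillAcceptable(older), stillAcceptable(newer)) // true false
}
</pre>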
<p>The original view of CA trust was that it was binary; either the
CA was working perfectly fine and would in the future, or the CA
was entirely bad and compromised, and should be distrusted immediately.
While there have been some CA incidents like this, such as <a href="https://en.wikipedia.org/wiki/DigiNotar">the
DigiNotar compromise</a>, in
practice a significant number of the CA problems that have come up
over the years have been less clear cut than this, such as <a href="https://utcc.utoronto.ca/~cks/space/blog/web/WoSignBrowsersNotBlinking">the
WoSign case</a> (and the WoSign case
was exceptionally bad in that WoSign actively issued bad TLS
certificates). The recent case of TrustCor is illustrative; as far
as was reported to Mozilla (<a href="https://groups.google.com/a/mozilla.org/g/dev-security-policy/c/oxX69KFvsm4/m/yLohoVqtCgAJ">and summarized by them</a>),
TrustCor never mis-issued any TLS certificates or committed any
other clear violations of CA requirements. They were merely sketchy.</p>
<p>The problem with fully distrusting a CA is that you cause problems
for people (and software) talking to legitimate sites making
legitimate use of TLS certificates from the CA. Sometimes (as with
DigiNotar) there is no real choice, but often there is a balance
between the harm you would do now and the harm that you will prevent
in the future (and perhaps now). A partial distrust is a way to
shift the balance of harm, so that you do less harm to people today
at an acceptable cost for potential future harm. By reducing the
clear cost, you make people more likely to take action, which improves
the situation even if it's not theoretically perfect.</p>
<p>(Distrusting a questionable CA a year in the future, or six months in
the future, or a month in the future, or starting now, is better than
continuing to trust it without limit.)</p>
<p>The second order effect of being able to partially distrust a CA is that
it gives browsers more power compared to CAs. To copy the joke about
owing a bank money, if you entirely distrust a popular CA today, that's
your problem (people will blame you for the scary TLS messages), while
if you distrust a CA starting in six months, that's much more the CA's
problem. Being able to credibly threaten CAs with distrusting future
TLS certificates without breaking current users is a powerful weapon,
and browsers have repeatedly been able to use it to force changes in CA
behavior.</p>
<p>(This includes somewhat more subtle issues like browsers limiting
the trust interval (starting with future certificates) or requiring
<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSCertTransLogsClientView">CT log attestations</a> starting at a
certain point. Both of those were 'partial distrust' issues in that
they initially applied to future TLS certificates, not current
ones.)</p>
<p>Today, Linux and more generally Unix software is out in the cold
on these decisions and is stuck in the binary world, with the binary
choices about how many people they harm today in exchange for harm
reduction. <a href="https://mastodon.social/@cks/109462273755447425">Ubuntu entirely removing TrustCor's certificates is
probably the right decision overall</a>, but it does
potentially harm people using Ubuntu who have previously been talking
to hosts with TrustCor TLS certificates.</p>
</div>
Why being able to partially distrust a Certificate Authority is good2024-02-26T21:43:52Z2022-12-07T04:44:08Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/IntelCPUsAndXMPAnnoyancecks<div class="wikitext"><p><a href="https://mastodon.social/@cks/109424423390226799">I said something on the Fediverse</a>:</p>
<blockquote><p>Once again I must applaud Intel for making it as hard as possible to
find out if XMP RAM profiles work on Intel desktop CPUs that can't be
overlocked (non-K series, or whatever it is today). Will they give you
a definite answer, even in their website's XMP FAQ? Of course not.</p>
<p>(And I genuinely don't know what the answer is, especially today.)</p>
</blockquote>
<p>Here, 'XMP' is the common and official abbreviation for '<a href="https://www.intel.ca/content/www/ca/en/gaming/extreme-memory-profile-xmp.html">(Intel)
Extreme Memory Profile</a>',
which potentially allows you to run your DIMMs at higher speeds
than the official <a href="https://www.jedec.org/">JEDEC</a> RAM timings. AMD
has an equivalent version (<a href="https://en.wikipedia.org/wiki/Serial_presence_detect#AMD_Extended_Profiles_for_Overclocking_(EXPO)">cf Wikipedia</a>),
but apparently many AMD motherboards also support using XMP DIMM
timing information. As covered in the WikiChip entry on <a href="https://en.wikichip.org/wiki/intel/xmp">XMP</a>, XMP information comes
from the DIMM(s) via <a href="https://en.wikipedia.org/wiki/Serial_presence_detect">SPD</a> and as far
as I know is then interpreted by the BIOS, if enabled.</p>
<p>The situation with XMP/EXPO support on AMD desktop Ryzen CPUs is
straightforward; as far as I know, they all support it, leaving you
only a question of whether the motherboard does. The situation with
Intel CPUs is much less clear, and this is a deliberate choice on
Intel's part. Having poked around, there seem to be two answers.</p>
<p>The practical answer is that it seems that non-overclockable Intel
desktop CPUs have traditionally supported XMP; I've even seen
assertions in online sources that faster XMP speeds mostly don't
involve the CPU at all, so it's all up to the motherboard and its
BIOS. It used to be that you had to use Intel's top end Z series
chipsets to get XMP support, but these days apparently B series
chipsets also support it. So it seems likely that if you buy a
non K-series Core series desktop CPU and pair it with XMP capable
DIMMs on a suitable motherboard, you will get faster RAM.</p>
<p>(How much of a difference faster RAM will make is an open question,
but I've read some things that suggest it's especially helpful if
you're using integrated graphics, as I am on <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/HomeMachineAndSunkCosts">my problematic home
desktop</a>.)</p>
<p>The Intel answer is that while Intel won't say that you have to
have an overclockable K-series Core desktop CPU in order to use
XMP, all of their examples of qualified DIMMs and systems with
desktop CPUs use overclockable ones as far as I can see. Intel
certainly wants you to buy a K-series Core i5/i7/i9 CPU if you want
to use XMP and it will clearly do quite a lot to nudge you that way
without actually saying anything untrue that could get it in trouble
with authorities (such as 'on desktop CPUs, you must have a K-series
overclockable CPU', which is likely false today since Intel isn't
actually saying that).</p>
<p>Since Intel only officially qualifies K-series Core desktop CPUs,
this lets Intel off the hook if XMP doesn't work with a non-K CPU
in your configuration, even if it looks like everything should be
fine. Will the uncertainty help push you toward spending a bit extra
on a K-series CPU? Intel certainly hopes so. With that said, it does
appear that the price difference between K-series and non K-series
CPUs may have gotten fairly small (for the moment). Still, the whole
thing is yet another irritation of dealing with Intel's desktop CPU
market segmentation.</p>
<p>(<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/DDR4RAMSpeedQuestions">I wrestled with DDR RAM speed questions a few years ago</a>, with no answers then either.)</p>
</div>
The annoying question of Intel CPU support for XMP RAM profiles2024-02-26T21:43:52Z2022-11-29T02:39:29Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/GettingSizePrefixesStraightcks<div class="wikitext"><p>As a system administrator and general computer person, I deal with
at least three different sorts of sizes; bytes in powers of ten
(beloved by disk makers and not me), bytes in powers of two (for
RAM and many other things), and bits in powers of ten (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/EthernetHowFastIsIt">used for
Ethernet speeds</a>). For various reasons I default
to bytes in powers of two, and I've been inconsistent about the
unit suffixes that I use for them when I write about things here
on <a href="https://utcc.utoronto.ca/~cks/space/blog/">Wandering Thoughts</a>. So in the spirit of things like
<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/NVMeGettingTermsStraight">getting my NVMe terminology straight</a>,
today I'm going to cover what I should use (with no guarantees that
I actually will).</p>
<p>In theory, the official <a href="https://en.wikipedia.org/wiki/Metric_prefix">metric (power of ten) prefixes</a> are written as 'T',
'G', 'M', and 'k'. This isn't in accordance with customary computer
use, which upper-cases the 'k' to 'K'. According to Wikipedia,
<a href="https://en.wikipedia.org/wiki/Binary_prefix">binary prefixes</a> are
written as 'Ti', 'Gi', 'Mi', and 'Ki', although Wikipedia also notes
that there's plenty of usage (my phrasing) of plain 'T', 'G', and
so on to mean the binary versions. However, both usages leave it
ambiguous whether you're writing about bytes or bits.</p>
<p>As covered in Wikipedia's <a href="https://en.wikipedia.org/wiki/Megabyte">Megabyte</a> page, in theory this is
disambiguated with a trailing 'B' to mean bytes. Thus, 'TB' means
decimal terabytes and 'TiB' means binary terabytes. Or you can just
write out 'TBytes' and 'TiBytes'. Per Wikipedia's <a href="https://en.wikipedia.org/wiki/Bit_rate">Bit rate</a> and <a href="https://en.wikipedia.org/wiki/Data-rate_units">Data-rate units</a> pages, units of
bits (or bits per second) are written out in full, as 'Gbit/s'
(decimal) or 'Gibit/s' (binary, should you find a use for binary
bitrates).</p>
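<p>As a reminder of how far apart the decimal and binary units drift,
here's a tiny Go program that just does the arithmetic (there's nothing
here beyond multiplication and division):</p>
<pre>
package main

import "fmt"

func main() {
    const (
        GB  = 1000 * 1000 * 1000 // decimal gigabyte
        GiB = 1024 * 1024 * 1024 // binary gibibyte
        TB  = 1000 * GB
        TiB = 1024 * GiB
    )
    fmt.Printf("1 GiB is %.1f%% bigger than 1 GB\n", 100*(float64(GiB)/GB-1))
    fmt.Printf("1 TiB is %.1f%% bigger than 1 TB\n", 100*(float64(TiB)/TB-1))
    // A '500 GB' (decimal) disk, expressed in binary units:
    fmt.Printf("500 GB is about %.1f GiB\n", 500.0*GB/GiB)
}
</pre>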
<p>Actual usage by real programs and people does not correspond to
this nice picture. It's very common for programs and people, myself
included, to use 'G' or 'GB' to mean a power of two gigabyte; for
example, this is what several versions of Unix df will produce with
'df -h'. Modern computer users like things such as memory and disk
usage in powers of two bytes because these things are normally
allocated in sizes like 4 KiBytes (to write it out in full).</p>
<p>For completely correct usage I should use 'GiB', 'TiB', and 'MiB'
when I mean power of two bytes instead of power of ten bytes (which
is almost all of the time), and 'Gbits' when I mean power of ten
bits (or bitrate). In practice, if I say '10G' or '1G' in the context
of Ethernet, people are going to know what I mean (and they may not
know what the data rate is exactly, <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/EthernetHowFastIsIt">just as I didn't until recently</a>). Similarly, since almost no one uses power
of ten sizes for things related to RAM and memory, '1 G' or '1 GB'
is relatively unambiguous to people, even though it's technically
incorrect.</p>
<p>In the end there's no good answer to this mismatch between official
usage and customary usage (and expectations). The metric/SI focus
on powers of ten is right for general usage, but in computing a lot
of things are based around powers of two (once bytes became fixed
at 8 bits), making a default to that very natural. We're probably
never going to reconcile the two sides, especially in informal usage
(which my writing generally is).</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/GettingSizePrefixesStraight?showcomments#comments">2 comments</a>.) </div>Getting my unit size 'prefixes' (really suffixes) straight, sort of2024-02-26T21:43:52Z2022-11-28T03:54:36Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/TwitterQuoteTweetsPathcks<div class="wikitext"><p>Twitter's 'quote tweets' feature is back in the news in my circles
because <a href="https://mastodon.social/@Gargron/99662106175542726">the Fediverse's Mastodon software famously deliberately
doesn't have them</a>.
I find 'quote tweets' to be a fascinating example and case of how what
looks like relatively neutral technical or design 'solutions' can
drastically change the social side of a service. But to understand this,
I need to cover the path that Twitter took to having quote tweets,
because they didn't spring out of nowhere.</p>
<p>In the beginning, Twitter had no quoted tweets at all. However, people
still wanted to do things like discuss things that other people had said
or point their followers to something with additional commentary. So
people did the obvious thing; they wrote a new tweet and linked to the
original. Like this:</p>
<blockquote><p>I find this argument for abandoning UTC leap seconds to be interesting
but ultimately wrong-headed. http://twitter.com/auser/....</p>
</blockquote>
<p>If you were sufficiently interested in whatever the tweet was talking
about, you could follow the link and read it, but otherwise you were
probably flying pretty blind. My vague memory is that people did this
every so often but not very much.</p>
<p>Later, the web version of Twitter got a general link preview feature
that I believe is (still) called 'cards'. If a tweet has a link,
Twitter will put a little snippet, preview, or whatever of the link
target below the tweet itself, so you can see something of where
you'll be going before you click (and maybe you won't click at all,
especially as Twitter will do things like play a Youtube video
inline). Naturally, if your link was to a tweet, Twitter would
basically inline the tweet in the card. This pretty much created
the visual presentation of a quoted tweet even if you still created
them by hand, and my memory is that this made them rather more
popular since when you linked to a tweet this way, people could see
what you were reacting to or commenting on right away, making it far
less opaque and more interesting.</p>
<p>Then finally Twitter decided that enough people were quoting tweets
this way that they would make it an actual feature, 'Quote Tweet'.
Although the visual appearance of the result didn't change much (or
maybe at all), the actual feature made it much easier to do,
especially on smartphones (I'm not sure if it changed what
notifications the original tweet author got, but it may have).
Naturally this significantly increased use of the general feature
and led to a situation where many people consider it to contribute
to negative behavior (<a href="https://mastodon.social/@Gargron/99662106175542726">cf Mastodon's reasons for not having an
equivalent</a>).
What had once been a relatively esoteric and little used thing
suddenly became a common thing, even the source of memes (eg various
'quote tweet this with ...').</p>
<p>(I suspect that making it much easier to quote tweets on smartphones was
a big part of increasing their usage, since my understanding is that a
significant amount of Twitter usage is from smartphones.)</p>
<p>Each step in this evolution is reasonable and appealing to people using
Twitter in isolation, and is probably not large in either technology or
design (if you accept the general idea of cards). But the end result is
a quite different social experience.</p>
<p>(I'm sure that this has happened in other systems. But Twitter's step by
step evolution from extremely minimal beginnings makes it visible and
fascinating this way, especially as the early people using it came up
with many of the core ideas that were later implemented as features.)</p>
</div>
Twitter's 'quoted tweets' feature and how design affects behavior2024-02-26T21:43:52Z2022-11-23T02:22:08Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/EthernetHowFastIsItcks<div class="wikitext"><p>I've been around computer networking for a long time, but in all
of that time I've never fully understood what the speed numbers of
the various Ethernet standards actually meant. I knew what sort of
bandwidth performance I could expect from '1 Gigabit' Ethernet as
measured on TCP streams, but I didn't know exactly what '1G' meant,
beyond that it was measured in bits (for example, was it powers of
ten G or powers of two G). My quiet confusion was likely increased
by <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusCheckingNetworkInterfaces">the numbers that the Prometheus host agent reports</a>, where a Gigabit
Ethernet interface is reported as '125,000,000'.</p>
<p>('1G' in powers of ten decimal is 1,000,000,000. I'm writing it
out with commas because otherwise comparing the two numbers by eye
is painful.)</p>
<p><a href="https://en.wikipedia.org/wiki/Gigabit_Ethernet">The Wikipedia page on Gigabit Ethernet</a> starts out a bit
confusing (for people like me) by talking about how the physical
link generally uses <a href="https://en.wikipedia.org/wiki/8b/10b_encoding">'8b/10b encoding'</a>, which inflates the
line rate by 25% over the data rate. However, what is '1G' about
1G Ethernet is the data rate, not the raw on the wire rate. That
data rate is '1000 Mbit/s' (per the Wikipedia page), which tells
us that we're dealing with powers of ten bit rates. When you divide
that 1000 Mbit/s of data by 8 (to convert to bytes), you get 125
(decimal) Mbytes/s, and the network interface rate reported by the
Prometheus host agent now makes sense.</p>
<p>(As a practical matter this means that the 8b/10b encoding is
something we can ignore. The line rate of '1250 Mbit/s' looks so
similar to the '125 Mbytes/s' data rate because 8b/10b puts each
8-bit byte on the wire as 10 bits, making the line bit rate exactly
ten times the data byte rate.)</p>
<p>If we then convert from powers of ten to powers of two, we would
expect a theoretical bandwidth of around 119 MiBytes/sec (sort of
using <a href="https://en.wikipedia.org/wiki/Binary_prefix">the official binary prefix notation</a>). We're not going to
get that bandwidth at the TCP data level, because this doesn't
account for the overhead of either Ethernet framing or TCP itself.</p>
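<p>The conversions are simple enough to write down. Here's a little Go
version of the arithmetic above (decimal data bit rate to bytes, bytes to
binary MiBytes, and the 8b/10b line rate):</p>
<pre>
package main

import "fmt"

func main() {
    // 1G Ethernet: the '1G' is a decimal data rate of 1000 Mbit/s.
    const dataBits = 1_000_000_000.0 // bits of data per second
    dataBytes := dataBits / 8        // what the Prometheus host agent reports
    fmt.Printf("data rate: %.0f bytes/s\n", dataBytes)               // 125000000
    fmt.Printf("data rate: %.1f MiBytes/s\n", dataBytes/(1024*1024)) // about 119.2
    // With 8b/10b encoding, every 8-bit byte takes 10 bits on the wire.
    fmt.Printf("line rate: %.0f bits/s\n", dataBytes*10) // 1250000000
}
</pre>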
<p>I'm not as clear about how <a href="https://en.wikipedia.org/wiki/10_Gigabit_Ethernet">10G Ethernet</a> is encoded on
the wire, but it doesn't matter for this because what we care about
is still the data rate. 10G Ethernet has a data rate of 10,000
Mbit/s ('10 Gbit/s' in <a href="https://en.wikipedia.org/wiki/List_of_interface_bit_rates">Wikipedia's list of interface bit rates</a>), which
translates to 1,250 Mbytes/sec (power of ten). This theoretically
maps to 1192 MiBytes/s (ten times 1G Ethernet), but you're never
going to get that bandwidth for TCP data flows because of the
assorted overheads.</p>
<p>(This of course explains the Prometheus host agent's report that
the 10G network speed is '1,250,000,000', and similarly that the
100 Mbit speed is '12,500,000'. All of which are much clearer when
written out with commas for separation, or whatever character is
used for this in your locale.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/EthernetHowFastIsIt?showcomments#comments">One comment</a>.) </div>Understanding how fast Ethernet really is (and in what units)2024-02-26T21:43:52Z2022-11-18T04:15:49Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/FediverseWhoRunsInstancescks<div class="wikitext"><p>One of the current ideas floating around the Fediverse is that
(significant) organizations should be encouraged to run their own
instances, in part to implicitly verify the identities of people
posting from them (one writeup of this idea is <a href="https://newsletter.danhon.com/archive/4230/">here</a>). If someone with the
name of my <a href="https://en.wikipedia.org/wiki/Member_of_parliament">MP</a>
posts from the official instance run by the Canadian (federal)
Parliament, for example, I can be pretty sure of what I'm getting.
In related news, <a href="https://mastodon.mit.edu/">MIT has stood up their own Fediverse instance</a>. This got me thinking some thoughts
<a href="https://mastodon.social/@cks/109311286409915223">over on the Fediverse</a>,
which I'll repeat here in slightly edited form with some annotations.</p>
<p>(You may want to read the replies over on the Fediverse; people had
things to contribute, and not everyone is as pessimistic as me.)</p>
<blockquote><p><a href="https://mastodon.social/@cks/109311286409915223">@cks</a>:
The more I think about it, the less certain I am that lots of
organizations should run Fediverse instances and encourage their
members to use them. The big issue is: what happens when you stop
being associated with the organization, especially if the parting is
an unhappy one? Is it a good idea to tie people's Fediverse presences
to their employer or university or etc?</p>
</blockquote>
<p>If people will only use their organization's Fediverse instance for
'official communication', then that's one thing; it's useful in
some circumstances (for example, an official Canadian Parliament
instance) but not generally. But in practice this is not how a
Fediverse instance would normally be used if you opened it up to a
general university population, or even a population of professors.
This sort of official communication is also not what's in the air
of 'academic Twitter'; what's in the air is university people moving
their personal presence from Twitter to the Fediverse.</p>
<p>(While you can migrate from one Fediverse instance to another, to
the best of my knowledge this requires the cooperation of your
original instance. If you left it and its organization on good
terms, this cooperation will probably be available and you can even
migrate in advance. If you had an abrupt and unhappy parting, it may
not be.)</p>
<p>I continued:</p>
<blockquote><p>I agree that running your own instance is the clearest way for
organizations to show that something is the real X (for appropriate
Xs). But a lot of people probably shouldn't tie their identity to
their current location that way, because they will move in the future.</p>
<p>(At universities, students definitely move on and professors sometimes
do too.)</p>
</blockquote>
<p>With universities specifically, the kinds of online activity that
people are interested in moving from Twitter to the Fediverse are
not primarily the anodyne press release level announcements. It's
the live interactions of academic twitter, with people talking about
their work, interacting with each other, and so on.</p>
<blockquote><p>The early days of the web and the Internet featured a lot of people
using their work/university email and web pages as their online
presence. In the long run this often did not go well and generally the
advice today is to not do that.</p>
<p>(<a href="https://mastodon.social/@cks/109311346775073591">...</a>)</p>
<p>The current social expectation is that if someone on your Fediverse
instance becomes inactive, their posts still remain forever. This is
a recipe for steadily increasing costs over time, which organizations
often have a bad reaction to sooner or later, especially for costs
from people no longer associated with you.</p>
<p>(We've seen this before with alumni email eventually getting shut
down.)</p>
</blockquote>
<p>What I didn't cover is that if people continue using your instance
after they depart (as in the alumni case), you're taking on a steadily
increasing support burden due to this. Some number of the now departed
people will forget their credentials or otherwise run into access
problems, some number of them may get phished, and so on and so forth.</p>
<p>All of this isn't a short term concern. But if we're going to stand
up a Fediverse instance and encourage general use of it, I think
we should be in it for the long haul. My Fediverse account has been
around on its server for more than five years now, and I'm a
comparative latecomer; going forward, we should be looking at least
five to ten years into the future of any instance we stand up. That
sort of long term planning is the responsible thing to do (<a href="https://utcc.utoronto.ca/~cks/space/blog/web/WebsiteShortDesignLifetime">even
if it's uncommon on the web</a>).</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/FediverseWhoRunsInstances?showcomments#comments">4 comments</a>.) </div>Some thoughts on organizations running their own Fediverse instance2024-02-26T21:43:52Z2022-11-09T03:26:17Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/PeopleLikeFileExtensionscks<div class="wikitext"><p>In some circles, it's popular to denigrate file extensions as a
Windows-ism that's only necessary because of (historical) limitations
of that platform. However, we have a fair amount of evidence that
people like file extensions even on platforms where they aren't
necessary, and adopt them by choice in various circumstances even
without technical need.</p>
<p>The obvious primary source for this evidence is people's habits on
Unix. Unix doesn't need file extensions and they're by no means
universally used in file names, yet they are popular in a variety
of situations. The most obvious example is that Unix based programming
languages very frequently have a convention of using specific file
extensions on their source files. For modern programming languages
you can say that this is some degree of wanting to go along with
the convention and wanting to be cross-platform to Windows, but
it's harder to make this view stick for older languages that predate
assumptions that everything would wind up on Windows sooner or later.</p>
<p>(It's not just programming languages on Unix that use file extensions.
For example, it's entirely a convention to put tar archives in files
with a .tar extension, but it's pretty universal.)</p>
<p>The situation with C is especially striking, because C uses two
file extensions, .c and .h, despite there being no technical
requirement for this within the language. You can #include a
file with a .c extension just as readily as you can a .h file
(and sometimes people do), but purely by convention we use the
two extensions to signal the generally intended use of the file
(and then the C compiler frontends go along with this by looking
at you funny if you tell them to compile a .h file). People
clearly like being able to see a clear split of intended use.</p>
<p>Compilers for C and other languages show one use of using file
extensions, namely that they let you create a bunch of different
file names from a base name plus a varying extension. Starting with
fred.c you can traditionally ask the C compiler to generate an
assembler version, which winds up in 'fred.s', or turn it into an
object file, 'fred.o'. Sometimes you'll go straight to a final
executable, conventionally called just 'fred' (with no extension).
And you can have 'fred.h', the visible API for the code in 'fred.c',
and most everyone will understand what you mean just from seeing the
two files together in a directory listing.</p>
<p>One potential reason for the original popularity of file extensions
on Unix is that Unix shells and other Unix tools such as '<code>find</code>'
makes it convenient to work with file extensions if you want to do
something to 'all C files' or 'all C headers' or the like. Another
is that putting this information about the intended file type into
the file name makes it immediately visible in directory listings
and other places you see file names (which may include other files,
for example Makefiles).</p>
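<p>As a small program-level illustration of that convenience (a sketch;
the shell's 'for f in *.c' equivalent is even shorter), matching 'all C
files' and deriving related names is nearly a one-liner when the intended
type is right there in the file name:</p>
<pre>
package main

import (
    "fmt"
    "path/filepath"
    "strings"
)

func main() {
    // Assumes a directory with some .c files in it; adjust the pattern
    // to taste.
    sources, err := filepath.Glob("*.c")
    if err != nil {
        panic(err)
    }
    for _, src := range sources {
        base := strings.TrimSuffix(src, ".c")
        fmt.Printf("%s -> %s.o (and maybe %s.h)\n", src, base, base)
    }
}
</pre>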
<p>You can argue that all of these things should be done through
file 'type' metadata (which would still let you distinguish C
headers from C source code). However, the drawback of this is
that people would have to learn additional sets of syntax and
features for searching and operating on this file type metadata,
and some things would be more awkward than they are today (where
you can ask for file name patterns like 'vdev*.h').</p>
<h3>Sidebar: The original Macintosh and its lack of file extensions</h3>
<p>The original Macintosh operating system (ie, classic Mac OS)
explicitly stored both the 'file type' and creator code of every
file separately from its name (see the Wikipedia entry on <a href="https://en.wikipedia.org/wiki/Resource_fork">resource
forks</a>). As a result,
I believe that people mostly didn't use file extensions in classic
Mac OS file names. I suspect that C programmers on Mac OS may still
have called things 'file.c', but that was probably partly through
custom and habit.</p>
<p>(I once did write some C code on classic Mac OS, but it was so long ago
that I've long since forgotten how it worked.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/PeopleLikeFileExtensions?showcomments#comments">2 comments</a>.) </div>People like file extensions whether or not they're necessary2024-02-26T21:43:52Z2022-10-29T02:49:46Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/SecurityItsOurOwnFaultcks<div class="wikitext"><p>Over <a href="https://twitter.com/thatcks/status/1585308481326059520">on Twitter</a>
and <a href="https://mastodon.social/@cks/109236880041951999">the Fediverse</a> I said
something:</p>
<blockquote><p>A quiz: you're a normal, ordinary person and you've received an email
with a PDF invoice attached (so it says). You click on the invoice in
your mail program and it shows you this. What are you seeing and how
alarmed should you be?</p>
<p><img style="max-width: 100%" src="https://utcc.utoronto.ca/~cks/pics/spam/fake-adobe-pw.png" width="2000" height="auto" alt="A blurred invoice in the background with an 'Adobe PDF / Sign in to view invoice payment' dialog on top, asking for your password." title="The email address has been modified from the real version."></p>
</blockquote>
<p>The spoiler is that this is <a href="https://utcc.utoronto.ca/~cks/space/blog/spam/PhishViaFiletypeConfusion">the 'HTML attachment presented as a
PDF attachment' phish</a> that I
talked about yesterday. This isn't a real PDF that's been encrypted
and magically needs your password to unlock; this is a HTML form
that will send your password to the phisher if you try to 'sign in'
to see the PDF.</p>
<p>(This image is from a browser instead of a mail client. <a href="https://twitter.com/thatcks/status/1585437209783574528">A mail
client I tried rendered it somewhat differently</a>, but I
don't know how other ones behave. <a href="https://utcc.utoronto.ca/~cks/space/blog/spam/PhishViaFiletypeConfusion">As covered</a>, since spammers do it I have
to assume it works in enough environments to be useful.)</p>
<p>We (the computing community) did this to ourselves. We created a
situation where a HTML attachment received in email could plausibly
look, to ordinary people, like something that would appear from
trying to look at an ordinary PDF. There are a whole bunch of
individual pieces and steps that got us here, each sensible on their
own in some view, but the collective result is that we did this to
ourselves. We have no one else to blame when ordinary people fill
in their password and hit 'Sign in'.</p>
<p>These steps aren't just displaying HTML attachments and PDF attachments
in mail clients in a way that's hard for ordinary people to immediately
tell apart if they're not already suspicious. It's also things like
creating a world where opening an attachment might plausibly require
a password or additional authentication to actually see it. It's a
computing world where you can be challenged for authentication at
what feels like random times for random reasons and there's enough
noise that what's one more roadblock in the way of getting your work
done.</p>
<p>(On a larger scale, there's also the issue that we have no general
secure file transfer system beyond 'mail people documents that are
encrypted in a variety of ways', ranging from 'locked' PDFs to encrypted
ZIP archives to <a href="https://age-encryption.org/">more technical options</a>.)</p>
<p>I don't have any solutions. I'm not sure a solution is even possible
at this point. Come back in fifty or a hundred years; maybe we'll
have figured one out by then. Or everything will have changed so much
that the problem is irrelevant.</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/SecurityItsOurOwnFault?showcomments#comments">5 comments</a>.) </div>Our computer security problems are our own fault2024-02-26T21:43:52Z2022-10-27T01:08:26Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/FilessytemProgressiveDeletecks<div class="wikitext"><p>I recently read Taras Glek's <a href="https://taras.glek.net/post/curious-case-of-maintaining-sufficient-free-space-with-zfs/">Curious Case of Maintaining Sufficient
Free Space with ZFS</a>,
where Glek noticed that ZFS wasn't immediately updating its space
accounting information when things were deleted. This isn't necessarily
surprising and I'm not sure it's unique to ZFS. In practice, I believe
that many filesystems don't actually perform all steps of deleting a
file at once (as we see it from the outside).</p>
<p>There are two conjoined problems for filesystems when deleting
things. First, in order to really delete things from a filesystem,
you need to know what they are. So to delete a file, the filesystem
needs to know specifically what disk blocks the file uses so the
filesystem can go mark them as free in the data structures it uses
to do this. This information about what disk blocks are used is
not necessarily in memory; in fact, very little about the file may
be in memory. This means that in order to delete the file, the
filesystem may need to read a bunch of data about it off of the
disks and then process it. For large files, there are several levels
of this data in a tree structure of <em>indirect blocks</em>. This isn't
necessarily a fast process, especially if the system uses HDDs and
is under IO pressure already.</p>
<p>(Generally each indirect block you have to read from disk requires
a seek, and HDDs still can only do on the order of 100 of them a
second. SATA and SAS SSDs are much faster, and NVMe SSDs even faster
still, but there is still some latency and delay for each block.)</p>
<p>The second part is that this information about what disk blocks are
in use may be larger than you want to hold in memory and process
at once (especially when combined with all of the metadata for free
filesystem blocks that you're about to update). You can reduce
memory usage (and perhaps complexity) by freeing the file's disk
blocks in conveniently sized batches. In a filesystem with some
kind of journaling (including ZFS), this can also reduce the size
of the journal record(s) you need to commit in order to make things
work.</p>
<p>This progressive deletion is mostly invisible to people, but one
place that it can materialize is in filesystem space accounting and
space allocation. If you're freeing blocks and updating metadata
in batches, it's natural to update the visible information about
disk space used and free in batches too, rather than try to do it
all at the end (or worse, all at the start). This is probably
especially the case if you're committing things in batches too.</p>
<p>(A filesystem does generally know how many blocks of disk space a
file takes up, so it can choose to update the accounting information
right away at the start of the deletion. But then it creates a
situation where not all of the claimed free space is actually usable
right now, although there are other workarounds for that.)</p>
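<p>(If you want to see whether your filesystem does this, one crude way
is to delete a large file and then watch the reported free space for a
while. Here's a minimal sketch in Python; the file path is whatever
large, expendable file you point it at, and other activity on the
filesystem will of course muddy the numbers.)</p>
<pre>
#!/usr/bin/env python3
# Sketch: watch free space after deleting a file, to see whether the
# space accounting updates all at once or trickles in over time.
import os, sys, time

def free_bytes(where):
    st = os.statvfs(where)
    return st.f_bavail * st.f_frsize

path = sys.argv[1]                      # a large file you can afford to lose
fsdir = os.path.dirname(os.path.abspath(path))
before = free_bytes(fsdir)
os.unlink(path)
for _ in range(30):
    grown = free_bytes(fsdir) - before
    print("free space has grown by", grown, "bytes")
    time.sleep(1)
</pre>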
<p>PS: It's also possible to have deletion happen asynchronously from
the perspective of user level programs, where their calls to
'<code>unlink()</code>' return almost immediately while the files are actually
deleted in the background. But I don't know if any filesystems
actually do this.</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/FilessytemProgressiveDelete?showcomments#comments">One comment</a>.) </div>Filesystems and progressive deletion of things2024-02-26T21:43:52Z2022-10-25T02:40:51Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/TLSWhyCAARecordsMattercks<div class="wikitext"><p>I've known about <a href="https://en.wikipedia.org/wiki/DNS_Certification_Authority_Authorization">DNS Certification Authorization (CAA) records</a>
for a while, but I've generally considered them mostly an interesting
curiosity instead of something that people should generally care
about. If you knew that you only got TLS certificates from Let's
Encrypt (for example), you could set a CAA record on your domain
to this and get what I thought of as 'a bit of extra security'. But
yesterday, when writing about <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSCertTransAboutEcology">how Certificate Transparency is
about improving the TLS ecology</a>, I had
a sudden realization of why you should care about CAA records and
consider them.</p>
<p><a href="https://utcc.utoronto.ca/~cks/space/blog/web/SSLCoreProblem">The fundamental Certificate Authority problem</a>
is that any CA in the world can issue valid TLS certificates for
your domain, regardless of how good or bad their practices are.
These days all CAs at least claim to be careful and there is some
assurance that they are, but their actual operational practices
vary and a specific CA may be vulnerable to issues like <a href="https://twitter.com/rmhrisk/status/1574995320727293952">BGP based
attacks</a>
(<a href="https://www.coinbase.com/blog/celer-bridge-incident-analysis">also</a>).</p>
<p>What having a CAA record does is let you <strong>confine TLS certificate
issuance to CAs whose processes you actually trust</strong>. You don't
have to trust that some random CA reseller with an intermediate
TLS certificate will properly do multi-network domain validation
and not be fooled by anything short of epic BGP route hijacking,
for one non-hypothetical example. You can restrict what CAs can
ruin your life down to ones that you use and ones that you believe
have good, thorough processes (for example, Let's Encrypt, which
apparently takes significant care to avoid being fooled by issues
like this).</p>
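<p>(As a concrete illustration, here's a minimal sketch of looking at a
domain's CAA records with the third party dnspython module; the domain
is a placeholder and the calls are my assumption of dnspython's current
API, so treat this as a sketch. A real CAA check also climbs toward the
parent domain if the name itself has no CAA records.)</p>
<pre>
#!/usr/bin/env python3
# Sketch: print a domain's CAA records using dnspython ('pip install
# dnspython'). No CAA records at all means issuance isn't restricted.
import dns.resolver      # third party module, dnspython 2.x assumed

domain = "example.org"   # placeholder; use your own domain
try:
    answers = dns.resolver.resolve(domain, "CAA")
    for rr in answers:
        # e.g. '0 issue "letsencrypt.org"'
        print(domain, "CAA:", rr.to_text())
except dns.resolver.NoAnswer:
    print(domain, "has no CAA records; any CA may issue for it")
</pre>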
<p>(My feeling is that this is especially useful for smaller organizations
that don't necessarily have the pull to get prompt action from the
TLS ecology. Of course, smaller organizations are probably less
likely to be targeted this way in the first place.)</p>
<p>In the old days of TLS, CAA records would probably have been less
useful in practice than they are now. This is because <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSCertTransAboutEcology">Certificate
Transparency has forced CAs to clean up their acts</a>. Today, anyone can monitor CT logs and
cross-verify the CA issuing a TLS certificate against the domain's
CAA record. That anyone can do this means that a CA not getting CAA
handling right is potentially much more visible, so CAs are pushed
to get it right. In the old days, you could have mandated that CAs
respect CAA records but the practical odds would be that they'd
have process problems, and you probably wouldn't have caught them
before an incident.</p>
<p>(What CT logs do is give you visibility into cases where one part
of an organization sets a CAA record and another part goes out to
get a TLS certificate from someone not in the CAA record. That
request should fail, but if CA processes are imperfect, it would
succeed and probably not get noticed without CT logs.)</p>
</div>
Why I feel DNS CAA records are a real TLS security improvement in practice2024-02-26T21:43:52Z2022-10-24T01:52:31Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/TLSCertTransAboutEcologycks<div class="wikitext"><p>In Emily M. Stark's <a href="https://emilymstark.com/2022/08/23/certificate-transparency-is-really-not-a-replacement-for-key-pinning.html">Certificate Transparency is really not a
replacement for key pinning</a>,
one thing that Stark notes is that <a href="https://certificate.transparency.dev/">Certificate Transparency</a> doesn't really have strong
security properties. You can say some fuzzy things about security
properties that CT perhaps offers (although <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSCertTransLogsClientView">they get fuzzier when
you look at the details</a>), but there are
very few concrete security claims you can make (or that people try
to make, for example in <a href="https://www.rfc-editor.org/rfc/rfc9162">RFC 9162</a>). Having been thinking
about this for a while, I think that Stark is correct here, and
that Certificate Transparency is not about security as much as it
is about improving the 'Web <a href="https://en.wikipedia.org/wiki/Public_key_infrastructure">PKI</a>' ecology.</p>
<p>Certificate Transparency has unquestionably improved the TLS ecology
in general. Concretely, people have used CT logs to detect problematic
TLS certificates, ones with various sorts of bad field values and
so on. More broadly, requiring Certificate Authorities to log their
issued certificates to CT logs has pushed CAs away from bad practices,
because they know those practices will now be visible (or they will
be in violation of browser requirements). It may have contributed
to CAs being more willing to close down subordinate CA signing
certificates (since those will now be more visible). And people
certainly use CT logs to scan for (probably) mis-issued certificates
for domains of interest (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSCertTransBadCertMeanings">which can have multiple causes</a>).</p>
<p>Further, Certificate Transparency has raised the costs of advanced
attacks for powerful attackers, because it's vastly increased the
chances that the first use of a compromised or suborned CA will be
the last use. In the pre CT days, you could quietly get a TLS
certificate for 'facebook.com' from a CA and have some chance of
hiding this unless you made wide-scale use of it. Today, this is
impossible; Chrome (the dominant browser) won't accept a TLS
certificate without endorsements from multiple CT logs, and that
makes it very likely that your 'facebook.com' TLS certificate will
appear in those logs, be spotted, and trigger alarms.</p>
<p>All of this has vastly improved the public TLS ecology. It's not
all roses (attackers reportedly scan CT logs for new hosts to probe,
for example, and internal hosts with valid TLS certificates are
much more visible), but it's much better than it used to be. But,
as Stark covers, Certificate Transparency doesn't offer lots to
the individual host or domain, especially a small one, especially
if you don't want to (or can't) sign up for a real time monitoring
system. And as Stark also covers, if you do detect that there's a
mis-issued TLS certificate for your domain, there's not necessarily
much you can do about it.</p>
<p>The other way to put this is to say that <strong>Certificate Transparency
is more about holding Certificate Authorities to account than it
is about directly improving the security of any individual website</strong>.
Holding CAs to account is critical for the overall security of all
TLS sites, but it doesn't necessarily directly help anyone in
specific, especially small sites.</p>
<p>(This general improvement in CA practices has enabled what I feel
is one real TLS security improvement for (small) websites, but
that's another entry.)</p>
<p>PS: I don't think this is a new observation about Certificate
Transparency, but I feel like writing it down anyway.</p>
</div>
TLS Certificate Transparency is about improving the (web) TLS ecology2024-02-26T21:43:52Z2022-10-23T01:35:20Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/TLSCertTransBadCertMeaningscks<div class="wikitext"><p>In Emily M. Stark's <a href="https://emilymstark.com/2022/08/23/certificate-transparency-is-really-not-a-replacement-for-key-pinning.html">Certificate Transparency is really not a
replacement for key pinning</a>,
Stark asks a good question:</p>
<blockquote><p>[...] what is a domain owner supposed to actually do if they find a
malicious certificate for their domain in a CT log? [...]</p>
</blockquote>
<p>I don't have an answer to this, but we can ask a related question:
what does it mean if your CT log monitoring turns up a TLS certificate
for your domain that you don't know about?</p>
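<p>(For scale, here's a minimal sketch of the sort of monitoring that can
turn such a certificate up, using the crt.sh aggregator's JSON output.
The crt.sh URL, query parameters, and JSON field names are assumptions
about that third party service and may change; this is an illustration,
not a recommendation of how to monitor.)</p>
<pre>
#!/usr/bin/env python3
# Sketch: list certificates that CT logs (as aggregated by crt.sh) know
# about for a domain, so unexpected ones stand out.
import json
import urllib.parse
import urllib.request

domain = "example.org"   # placeholder; use your own domain
url = "https://crt.sh/?q=" + urllib.parse.quote(domain) + "&output=json"
with urllib.request.urlopen(url) as resp:
    certs = json.load(resp)
for c in certs:
    # field names assumed from crt.sh's current JSON output
    print(c.get("not_before"), c.get("issuer_name"), c.get("name_value"))
</pre>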
<p>I think that there are four things that it could mean, which I'll
order from the least likely to the most likely. First, the Certificate
Authority could be compromised and the attacker has chosen to burn
that compromise (and probably the entire CA) in order to get a TLS
certificate for your host or domain. This is probably the least
likely option but the most valuable thing for the overall TLS ecology
to detect.</p>
<p>('Compromised' here includes the government of the CA's jurisdiction
turning up with an order for it to issue some TLS certificates.)</p>
<p>Second, the CA's processes for issuing TLS certificates could have
problems that have been exploited (deliberately or accidentally),
what's sometimes called a 'mis-issuance'. Historically mis-issuance
has come in all sorts of forms, including trying to trick the CA
about your identity (corporate or otherwise). Mis-issuance is a CA
issue that the CA is going to have to fix right away once it's
detected, complete with officially revoking the mis-issued TLS
certificates (for all the good that will or won't do). Mis-issuance
is tragically still not completely stamped out.</p>
<p>Third, the CA's attempt to do domain validation could have been
fooled through technical means like BGP route hijacking, <a href="https://twitter.com/rmhrisk/status/1574995320727293952">which
apparently is something that has happened (or been attempted)
repeatedly</a>
(<a href="https://www.coinbase.com/blog/celer-bridge-incident-analysis">also</a>).
You might call this mis-issuance, but here the flaw is outside of
the CA's processes and code, although the CA still needs to change
its processes to make this harder (or impossible).</p>
<p>Fourth and most likely, some elements of your domain have been
compromised by an attacker who used those elements to pass domain
validation and get a TLS certificate. Depending on how the CA in
question does domain validation, this could be DNS, email (possibly
only specific addresses), a particular host, or a firewall. To a
CA doing domain validation, an attacker with sufficient control
over the right thing looks just like you requesting a real TLS
certificate. Unfortunately this is both probably the most common
and the least likely to be dealt with in a useful way unless you're
a big site, since ordinary TLS certificate revocation is still not
very useful as far as I know.</p>
<p>(The major browsers are working to change this situation with some
clever tricks and so life may be better someday. Today, if you're
an ordinary site I think TLS certificate revocation only affects
people using Firefox who haven't disabled <a href="https://en.wikipedia.org/wiki/Online_Certificate_Status_Protocol">OCSP</a> checks.)</p>
<p>Some bad TLS certificates may point to signs of multiple things.
For example, if you have <a href="https://en.wikipedia.org/wiki/DNS_Certification_Authority_Authorization">a CAA DNS record</a>
but an unexpected TLS certificate is issued for your domain from a
CA that's not listed, you could have both an attacker in control
of your host and a CA with a mis-issuance problem (that they're not
properly checking CAA records).</p>
<p>(I may have missed more possibilities. I'm deliberately excluding
all varieties of 'it was basically legitimate activity inside your
organization', which in some cases can look very much like an
attacker in action.)</p>
</div>
What it means to see a 'bad' certificate in TLS Certificate Transparency logs2024-02-26T21:43:52Z2022-10-17T02:57:10Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/SSDNoPerformanceIntuitionscks<div class="wikitext"><p>Back in the old days of mechanical hard drives (HDDs, aka 'spinning
rust'), it was possible to feel that you had a reasonable general
understanding of their performance because they were physical objects
with relatively straightforward general operating principles. For
example, they read your data by moving 'the' drive head to the track
and then listening to the track as it spun past underneath the head
to read either the individual sectors you wanted or (toward the
end) the entire track (and then extracting what you wanted). You
could almost always assume that these physical actions were the
limiting factor on IO performance, and for a long time they didn't
change very fast (especially the time it took to move the head to
a track).</p>
<p>Flash based solid state disks are much more complex and opaque
objects, without this reassuring mechanical nature. A SSD has your
data in 'flash', which is often divided up into more than one bank,
which can allow some reads (and maybe even writes) to happen in
parallel. There is a Flash Translation Layer that turns 'disk block
addresses' into locations in the flash; necessary portions of this
FTL itself may need to be fetched from flash, or maybe the SSD loads
it into RAM right away when it powers on. Writes are famously even
more complicated, and because of that complexity there's often
background processing happening in the SSD that may affect how your
IO performs.</p>
<p>(The many pieces involved in a SSD's performance also provide plenty
of room for differences between SSDs, much more so than there's
been in HDDs for a long time. Manufacturers can even switch components
and change designs within the same model over time, with visible
performance effects (generally it goes down).)</p>
<p>Because there are no slow mechanical parts to SSDs (only somewhat
slow flash parts), the speed of everything else in the system also
increasingly matters. This is both the speed that the SSD's CPUs can
do their work (and they have plenty of work) and the speed that the
host system can send them work to do and respond when the work is
complete. Because they have such fast communication between the host
and the SSD, decent NVMe disks can make this very visible, with it
requiring significant efforts on the host to achieve their theoretical
performance.</p>
<p>I had (and still have) performance intuitions for HDDs, although
<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/HDDsNowSomewhatBetter">I have to revise my HDD intuitions every so often</a>. I don't really have performance intuitions
for SSDs, except that I expect individual SATA SSDs to come close
to saturating the theoretical SATA maximum read rate. Not only are
SSDs pretty complex objects with hard to understand performance
(and performance that can vary drastically from model to model),
but the conditions around them are constantly changing since the
host-side software keeps changing (and since it matters, you may
need to think about reasonably specific configurations, not general
intuitions).</p>
<p>My lack of performance intuitions (especially ones I trust) casts
a shadow over my feelings about various pieces of software design.
For the case that started me thinking about this, I have relatively
little feel for whether my old entry on <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/WhenMaildirWins">what problems the Maildir
mail storage format does and doesn't solve</a> is
still reasonably applicable, and under what circumstances (SATA SSD
versus NVMe SSD, for example).</p>
<p>(And even if it still hurts on an NVMe SSD to read a bunch of little
files instead of reading sequentially through one big one, it probably
hurts for different reasons, such as overheads in OS system calls and
issuing all those separate IOs.)</p>
<p>PS: Even when I can remember general SSD performance numbers, these
'raw' numbers don't necessarily translate into observed system
performance the way that they did in the era of HDDs. A HDD that
could do 150 seeks per second would probably deliver 150 random
reads a second through your filesystem (because it was by far the
limiting factor). A NVMe SSD that can do 10,000 read IOPS may or
may not deliver 10,000 random reads through the filesystem, OS,
kernel, and hardware that you're using (because the NVMe SSD may
well no longer be the limiting factor).</p>
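<p>(If you want to turn that vague feeling into a number for a specific
system, here's a minimal sketch of measuring delivered random read rates
through the filesystem. It's synchronous and single-threaded, so it
mostly measures per-read latency rather than the device's best case with
a deep queue, and reads can come from the page cache unless the file is
much bigger than RAM or you drop caches first. The file path is whatever
large file you point it at.)</p>
<pre>
#!/usr/bin/env python3
# Sketch: count how many random 4 KiB reads per second actually get
# delivered through the filesystem and OS for a given (large) file.
import os, random, sys, time

path = sys.argv[1]          # an existing file, at least a few MiB
blocksize = 4096
duration = 10               # seconds to run for

fd = os.open(path, os.O_RDONLY)
blocks = os.fstat(fd).st_size // blocksize
reads = 0
deadline = time.monotonic() + duration
while deadline > time.monotonic():
    offset = random.randrange(blocks) * blocksize
    os.pread(fd, blocksize, offset)
    reads += 1
os.close(fd)
print("about", reads // duration, "random 4 KiB reads/second")
</pre>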
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/SSDNoPerformanceIntuitions?showcomments#comments">One comment</a>.) </div>My performance intuitions and the complexities of SSD performance2024-02-26T21:43:52Z2022-10-07T03:20:31Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/UniversityBYODAndSecuritycks<div class="wikitext"><p><a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/BYODOurView">Bring Your Own Device (BYOD) is perpetually popular within
universities for multiple reasons</a>, including
the straightforward reason that it saves the university a lot of
money to assume that <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/Shifting2FAViewsHere">various groups can be required to have (and
use) their own devices for some things</a>.
However, there is a fundamental difference between BYOD at a
university and BYOD in a corporate environment that makes university
BYOD more risky in a security sense.</p>
<p>Organizations have a strong interest in making sure that the devices
people use to do their work (and to access the organization's resources)
are secure. When BYOD is in effect, my impression is that the common
corporate approach is to require people to put their BYOD devices
under some sort of corporate remote management (sometimes called
'Mobile Device Management (MDM)' when it applies to smartphones and
the like). This remote management is then used to apply security
settings, ensure things are up to date, look for signs of compromise
(and perhaps remotely wipe the device if they're detected), and
often intrusively track what's done on the device in the name of
the organization's security.</p>
<p>This is a non-starter in a university environment. If the university
was to demand that people enroll personal devices in university
MDM, ceding control over them to the university, the responses would
probably be rather rude. This isn't necessarily because universities
are a hotbed of resistance to authority and support of personal
freedom and so on; instead, it's partly because <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/UniversityNonEmployeesII">a lot of people
at the university don't consider themselves to be working for it</a>. If you're working for someone and you
bring your own device, maybe it seems reasonable to give your bosses
power over it. But if you're not working for someone and they show
up to demand the ability to spy on and control your smartphone,
that's another thing entirely. This is especially so if the university
refuses to provide you with a smartphone and requires you to have
and use your own for the organization's purposes.</p>
<p>So as a practical matter, the university on the one hand literally
or effectively requires graduate students and various other people
to use their own personal smartphones, laptops, and so on for some
aspects of university work, and on the other hand can't demand that
those personal devices be enrolled in a generally hypothetical
organizational device management. This leaves universities with a
rather different BYOD security posture than normal organizations.
A normal organization can at least hope that there are no compromised
devices used by employees on their (internal) networks and it's
probably alarming if you detect some. At a (large) university,
that's a Tuesday.</p>
<p>(Universities can and do enroll university owned and provided devices
in device management, but generally this only really applies to staff,
and at that only some staff. In practice a lot depends on who is paying
for the staff and their devices.)</p>
<p>(This entry was sparked by reading Matthew Garrett's <a href="https://mjg59.dreamwidth.org/61089.html">Bring Your Own
Disaster</a>, which raises a lot
of good points about problems created by BYOD in general.)</p>
</div>
Universities, "Bring your own device", and security2024-02-26T21:43:52Z2022-10-04T01:52:34Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/UniversityNonEmployeesIIcks<div class="wikitext"><p>In an ordinary conventional company (or more broadly many organizations),
everyone present not merely <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/UniversityNonEmployees">matters to the company</a> but also is paid by the company and actively
works for it. They're an employee or a contractor of the company,
they get paid for it, and they think of their relationship with the
company this way. Even in sectors that are plagued by 'unpaid
internships', I think that such people still think of themselves
as working for the company, just for free.</p>
<p>This is not how universities work for rather a large number of
people present. Obviously it's not how it works for undergraduate
students, who are giving the university money in exchange for an
education (in theory), but it's also generally not how it works for
graduate students either, both in practice and in how people think
of it. Graduate students may receive a stipend and work as paid
teaching assistants and the like, but even when they do they mostly
don't think of themselves as employees of the university in general
and working for it. Graduate students are here to get a degree, to
the benefit of both themselves and the university, and stipends and
the like are tools that the university provides to support them in
this. Even paid postdocs are somewhat in this situation, because
there is an implicit bargain going on; postdocs are not mere employees
working for the university, instead they are being supported while
they advance their career with the aid of the university.</p>
<p>(The situation with professors is more complicated and tangled, but
there is definitely a strong aspect of give and take because a professor
is expected to bring in a certain amount of external funding.)</p>
<p>That members of the university population are this way and feel this way
matters for much more than legalities. In a conventional company, the
company can impose things on people partly because it can say 'you have
to do this because you work for us'. If a university tried to do that,
the replies from a lot of its people would likely be rather profane,
because the people involved don't consider themselves to be working for
the university and so don't concede that the university has that sort of
power over them. This is so even if, technically, some of those people
are effectively paid part-time or even full-time employees at the time.</p>
<p>I would be remiss if I didn't point out that universities tacitly or
explicitly encourage this mindset in their members because in practice
it's rather to the university's benefit. If graduate students and
postdocs viewed the university purely as an employer, they would be
basically certain to demand much better pay and working conditions. The
university very much doesn't want that to happen (as shown by how
universities react to the often quite reasonable requests from graduate
student teaching assistants, not infrequently leading to strikes or
threats of same).</p>
<p>(The university even encourages this mindset among its paid, full
time staff; among other things, this saves the typical university
a lot of money in aggregate in staff salaries. Sometimes this causes
staffing problems, when the disconnect between university pay rates
and market pay rates gets too large to be bridged by 'higher missions'
and the like.)</p>
<p>(I've previously written <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/UniversityNonEmployees">Universities and their non-employees</a> about how universities are in practice
indifferent to whether or not many individual members can get their
work done, unlike how companies do generally care about this because
an idle employee who can't work because of something like a dead
computer costs the company money.)</p>
</div>
Universities and their non-employees (part two)2024-02-26T21:43:52Z2022-10-03T01:44:48Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/TLSCertTransBadLogOptionscks<div class="wikitext"><p>One of the potential concerns in the <a href="https://certificate.transparency.dev/">Certificate Transparency</a> ecosystem is that a CT Log
could be compromised. But what can an attacker who's in control of
a CT log actually do? That's a question both of how CT logs work
in general and of the current uses that people make of them, both
clients (ie browsers) and Certificate Authorities. So here's what
I can see about that, based partly on <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSCertTransLogsClientView">the TLS client's view of
CT logs</a>. To start with, let's restate
an obvious thing: a CT Log cannot by itself create a valid TLS
certificate. Any real attack requires not just a compromised CT log
(or several), but a Certificate Authority that's either compromised
or can be used to mis-issue some certificates for you.</p>
<p>(The other thing to say is that no browser relies on a single CT log,
and neither should you. Currently, both Chrome and Safari require
validation from at least two CT logs, which means that an attacker
needs to compromise two of them in order to have a chance of fooling a
browser.)</p>
<p>Even without a compromised CA, an attacker in control of a CT Log
can make it not work globally in some way or ways. The log can stop
giving <em>Signed Certificate Timestamps</em> (SCTs) to CAs when they ask
for them as the CAs issue new TLS certificates, either for all
certificates or for some of them. It can decline to actually add some or
all TLS certificates to the log even though it gave out SCTs for them.
It can stop answering some or all queries about the log (what <a href="https://www.rfc-editor.org/rfc/rfc9162">RFC
9162</a> calls <a href="https://www.rfc-editor.org/rfc/rfc9162#name-log-client-messages">Log Client
Messages</a>), from
some or all parties. If detected, all of these would be taken as
a sign of a malfunctioning log and would eventually result in the
log being dropped from CAs and browsers.</p>
<p>I believe that not answering some queries globally can be used to
(temporarily) hide the specifics of some TLS certificates in the
log. As far as I can see from <a href="https://www.rfc-editor.org/rfc/rfc9162">RFC 9162</a>, the only way for an
outside party to see that specific TLS certificates are in the log
is to ask for them (by index position) with a <a href="https://www.rfc-editor.org/rfc/rfc9162#name-retrieve-entries-and-sth-fr">Retrieve Entries and
STH from Log</a>
request. If a compromised log wants to hide the presence of some
TLS certificates, it can refuse to answer entry queries for any
ranges that include the TLS certificates. These certificates are
in the log's Merkle tree and it can provide valid proofs of their
inclusion, but you can only discover them if you know some details
about them (for example, a TLS server gave them to you). To other
people trying to audit the log, it might look like <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/TLSCertTransLogsAndLoad">ongoing load
problems</a> or some other more innocent
excuse.</p>
<p>(I suspect that there are enough heavyweight CT log auditors
that this excuse wouldn't pass muster for very many days, but
it might last long enough for a very special TLS certificate
the attacker got to be of some use.)</p>
<p>The simplest non-global thing for a compromised CT log to do is to
give the attacker SCTs for special certificates (which can then be
put in those certificates) and then not add the certificates to its
public, global log. Depending on <a href="https://utcc.utoronto.ca/~cks/space/blog/web/BrowsersAndCertTrans">how much use a browser makes of
CT</a>, a browser may well accept a
properly signed TLS certificate with these SCTs in it, because all
the browser checks is that the SCTs are properly formed and signed.
If this was detected, a compromised CT Log could then try to blame
the lack of inclusion on general operational issues. Right now, it
appears that <a href="https://utcc.utoronto.ca/~cks/space/blog/web/BrowsersAndCertTrans">this SCT-only compromise would probably fool browsers</a> while hiding those bad TLS certificates
from all of the people who watch CT logs.</p>
<p>I believe that a compromised CT Log can generate a special version
of its Merkle tree that includes extra certificates any time it
wants to. This special tree could then be used to create a <em>Signed
Tree Head</em> (STH), proofs of inclusion for those certificates in the
tree, and a proof that this special tree is an update of a previous
legitimate, globally visible tree. However, the careful examination
of the STH information might reveal more and more oddities as the
tree is generated further and further away from the times in the
STHs for the TLS certificates. This special tree's STH will also
not be related to the next legitimate, globally visible STH; it
will be a one time, frozen fork of the CT log's log.</p>
<p>With more work, a compromised CT log could fork its log into two
versions, a private one with extra TLS certificates that were added
at the time they were nominally generated and a public version
without them. Since this is a fork, the public and private Signed
Tree Heads aren't compatible with each other (there's no path from
one to the other), so the CT log would have to make sure that any
given client only saw either the public version or the private
version. How difficult this is depends on how identifiable the
client is to the CT log, how it queries the log, and whether or
not the client talks to TLS servers that include STHs from the
log in their TLS handshake or stapled OCSP response.</p>
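<p>(The reason the two versions can never be reconciled is how the tree
head is computed over the log's entries. Here's a minimal sketch of the
RFC 9162 Merkle Tree Hash, with made-up byte strings standing in for
real log entries; adding even one extra entry produces a different root
hash, so a forked log has to keep serving its two versions to disjoint
audiences forever.)</p>
<pre>
#!/usr/bin/env python3
# Sketch: the Merkle Tree Hash (MTH) from RFC 9162 section 2.1.1, and a
# demonstration that a forked log's root hash differs from the public one.
import hashlib

def _h(data):
    return hashlib.sha256(data).digest()

def leaf_hash(entry):
    # the 0x00 prefix domain-separates leaf hashes from interior nodes
    return _h(b"\x00" + entry)

def merkle_root(entries):
    n = len(entries)
    if n == 0:
        return _h(b"")
    if n == 1:
        return leaf_hash(entries[0])
    # split at the largest power of two smaller than n
    k = 1
    while n > k * 2:
        k *= 2
    return _h(b"\x01" + merkle_root(entries[:k]) + merkle_root(entries[k:]))

public = [b"cert-1", b"cert-2", b"cert-3"]   # stand-ins for real log entries
private = public + [b"sneaky-cert"]
print(merkle_root(public).hex())
print(merkle_root(private).hex())            # different root, different STH
</pre>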
<p>Finally, a compromised CT log could return special replies (possibly
corrupt ones) to log queries to some clients in an attempt to exploit
bugs in the clients. Right now I believe this wouldn't affect
browsers, which work only with SCTs and which don't directly get
them from the CT logs. It might affect CAs, who directly request
SCTs from CT logs and who may audit the logs to make sure that the
CA's certificates are present as they're supposed to be, although
you would hope that the CA's code would be hardened and well confined,
and it could definitely affect anyone who audits CT logs. I suspect
that current CT log auditing programs aren't particularly hardened,
except through basic implementation language safety (one log
monitoring program I know of is written in Go, for example).</p>
<p>(This entry was sparked by Emily M. Stark's <a href="https://emilymstark.com/2022/08/23/certificate-transparency-is-really-not-a-replacement-for-key-pinning.html">Certificate Transparency
is really not a replacement for key pinning</a>
which got me to start thinking about the cluster of issues around
CT logs and CT log compromise.)</p>
</div>
What can a compromised TLS Certificate Transparency Log do?2024-02-26T21:43:52Z2022-09-26T02:49:05Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/TLSCertTransLogsClientViewcks<div class="wikitext"><p><a href="https://certificate.transparency.dev/">TLS Certificate Transparency</a>
is a system where browser vendors require TLS Certificate Authorities
to publish information about all of their TLS certificates in
cryptographically validated logs, which are generally run by third
parties (see also <a href="https://en.wikipedia.org/wiki/Certificate_Transparency">Wikipedia</a>). This
raises the question of how clients (generally browsers) interact
with Certificate Transparency. As far as I can tell, it depends
on how thorough a client wants to be about verifying that a TLS
certificate really is in a given CT log.</p>
<p>The current version of Certificate Transparency is described in
<a href="https://www.rfc-editor.org/rfc/rfc9162">RFC 9162</a>. Following RFC
9162, when a client gets a TLS certificate issued by a participating
CA (which is all of them that want to work with Chrome and Safari),
it will also receive (<a href="https://www.rfc-editor.org/rfc/rfc9162#name-tls-servers">in one way or another</a>) some
number of <a href="https://www.rfc-editor.org/rfc/rfc9162#name-signed-certificate-timestam"><em>Signed Certificate Timestamps</em> (SCTs)</a>.
Each SCT is a promise by some CT log to include the certificate
(broadly speaking) in the log within a time interval specified by
the log, and is signed by the CT log's private key. A garden variety
client can verify the SCT signatures (for CT logs that it knows of
and accepts) and stop there. Generating a valid SCT requires (some)
control of that log's private key and its activities, and if the
key or the log is compromised, there's potentially not much point
in going further.</p>
<p>(A client may also receive additional CT related information from
the TLS server, up to all of the information it needs to validate
things more thoroughly; see <a href="https://www.rfc-editor.org/rfc/rfc9162#name-clients">TLS Client in the RFC</a>.)</p>
<p>A client that wants to be more thorough can then <a href="https://www.rfc-editor.org/rfc/rfc9162#name-retrieve-merkle-inclusion-p">request a proof
of inclusion in the CT log from the log operator</a>,
provided that enough time has gone by since the SCT's timestamp.
I believe that it may need to bootstrap a <a href="https://www.rfc-editor.org/rfc/rfc9162#name-retrieve-latest-sth"><em>Signed Tree Head</em> (STH)</a>
from the CT log, unless it got one (and an inclusion proof) from
the TLS server. That the TLS server can provide the STH and inclusion
proof from the CT log is good for privacy but potentially bad for
your confidence in the SCT, because it means that your client has
no outside check on all of them. If an attacker had access to the
CT log's private keys, they could potentially manufacture a STH and
inclusion proof along side their signed SCT and have their server
give all of them to you.</p>
<p>(I don't know how common it is for TLS servers to provide the
additional CT information to clients. In modern usage TLS certificates
have embedded SCTs, so they take no extra configuration work to
provide; the other information requires the server operator to set
it up and do things, and possibly for the client to have special
features.)</p>
<p>I believe this means that a thorough client must learn and save the
STHs for CT logs, and then (periodically) <a href="https://www.rfc-editor.org/rfc/rfc9162#name-retrieve-merkle-consistency">get a Merkle consistency
proof between two STHs</a>,
possibly alongside getting the latest STH for the CT log. Assuming
that STHs are relatively coarse-grained and aren't issued for every
new TLS certificate sent to the CT log, it presumably leaks less
information to the log operator to ask for a consistency proof
between some STH and the current STH than it does to ask for a proof
of inclusion of some specific certificate (if the server gave you
that).</p>
<p>(Since this thoroughness requires state and state management, it's
probably mostly restricted to browsers.)</p>
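<p>(As a concrete sketch of what that periodic check involves, here's
roughly how you'd fetch a log's current STH and a consistency proof
against a previously saved tree size. This uses the widely deployed RFC
6962 'v1' HTTP API rather than RFC 9162's newer one, the log URL is a
placeholder, and actually verifying the proof and the STH signature is
left out; treat the details as assumptions.)</p>
<pre>
#!/usr/bin/env python3
# Sketch: fetch a CT log's current Signed Tree Head and a consistency
# proof from a saved tree size to the current one (RFC 6962 "v1" API).
import json
import urllib.request

log = "https://ct.example.net/log"     # placeholder CT log base URL
saved_tree_size = 123456               # from the STH we saved last time

def get_json(url):
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

sth = get_json(log + "/ct/v1/get-sth")
print("current tree size:", sth["tree_size"])

proof = get_json(log + "/ct/v1/get-sth-consistency?first=%d&second=%d"
                 % (saved_tree_size, sth["tree_size"]))
print("consistency proof has", len(proof["consistency"]), "hashes")
</pre>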
<p>Periodically verifying that two STHs are properly related to each
other means that if someone has lied to you about a proof of inclusion
(which requires a false STH), they have to keep lying from then
onward in order to remain undetected. Otherwise, you will someday
get a current STH for the real CT log (without the TLS certificate)
and then there will be no path between your latest false STH and the
real one.</p>
<p>Continually lying to you this way will be very difficult (if not
impossible) if a bunch of TLS servers provide you with proofs of
inclusion and their view of the log's STH during your TLS conversations.
These TLS servers are seeing the true log and so getting true STHs from
it and then providing these true STHs to you. Really, you don't need a
bunch of TLS servers to be doing this, you just need some really popular
ones to be doing it for common CT logs.</p>
<p>PS: <a href="https://www.rfc-editor.org/rfc/rfc9162">RFC 9162</a> is surprisingly readable, especially its general
discussion sections on server and client stuff. Interested people
may want to at least skim them.</p>
</div>
The TLS client's view of Certificate Transparency and CT Logs2024-02-26T21:43:52Z2022-09-23T02:38:15Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/USBTemper2ReadingsNotescks<div class="wikitext"><p>A while back, <a href="https://twitter.com/thatcks/status/1565481315872526343">I tweeted something</a> that has a
story attached:</p>
<blockquote><p>A person with a single machine room temperature sensor knows the room
temperature (where the sensor is). A person with three temperature
sensors lined up next to each other knows only uncertainty (and has a wish
for a carefully calibrated and trustworthy thermometer).</p>
</blockquote>
<p>If you set out to get some inexpensive USB temperature sensors to
supplement <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/MachineRoomTempMonitoring">a more well developed and expensive temperature sensor
system</a>, it's quite likely
that you'll wind up with the <a href="https://old.pcsensor.com/usb-thermometer/temper2.html">PCsensor TEMPer2</a> or something
in that line, and then you might be curious about how accurate their
readings are. Having now collected readings from three of them over
a while, my summary is that you shouldn't expect industrial or lab
grade results from them, although the results are probably useful
if handled cautiously. So here are some observations, which are
almost certainly specific to our version of the TEMPer2 (<a href="https://utcc.utoronto.ca/~cks/space/blog/linux/USBTemper2SensorToPrometheus">which
Linux reports as having USB vendor and product IDs of 1a86:e025</a>).</p>
<p>(It's clear that 'TEMPer2' is a brand name that's been used on
a variety of different pieces of electronics over the years,
even if the external appearance and general functionality have
stayed the same.)</p>
<p>Since this is long and observational, I'll put the summary up front.
If you're going to use TEMPer2s, <strong>you need to test the behavior
of each of your specific units</strong>, <strong>you probably want to trust the
probe temperature more than the internal temperature</strong>, and <strong>you
want to use a USB extender cable</strong> (partly to get the probe far
enough away from your computer, since the TEMPer2 only comes with
a relatively short wire for the probe).</p>
<p>The TEMPer2 has two temperature sensors, one inside the USB stick
and another that uses a temperature probe on a wire. As far as the
internal USB or 'inner' temperature goes, you definitely want to
read <a href="https://halestrom.net/darksleep/blog/048_indoorairsensing/">Halestrom's article on indoor air sensor products</a> and
then get yourself a USB extender cable so that you aren't plugging
the USB stick directly into your computer. Although both temperature
sensors have anomalies, you're probably better off with the probe's
temperature if you have to use one (although you want to test this).
When I initially had all of our TEMPer2s plugged directly into
computers (two servers and one desktop), their USB temperature was
extremely stable and unchanging; using a USB extender cable gave
them more realistic variation.</p>
<p>At this point, we have three TEMPer2 units in use, all with the
probe fitted, the USB stick on an extender cable, and the probe
sensor next to the USB stick (so they should read the same). One
is in a machine room right next to <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/MachineRoomTempMonitoring">our regular machine room sensor
for that room</a>, one is in a
machine room relatively near our existing sensor, and the third is
attached to an office desktop.
Both machine room TEMPer2s show changes in both USB and probe
temperature readings that track temperature changes from our regular
sensor, and they appear to be similar in magnitude (although we
haven't had any big temperature swings). None of the TEMPer2 readings
agree with the regular sensor readings, not even the one where the
three sensors are next to each other; however, they aren't that far
off (it looks like typically around 1 C below our regular sensor
reading). One machine room TEMPer2 has the probe reading higher
than the USB one (by about 0.5 C I think), while the other one has
them generally the same or very close to it. The desktop TEMPer2
has the probe reading below by about 1 C.</p>
<p>(When I've looked at my desktop thermometer, which is near the
desktop TEMPer2 USB and probe, it's been quite close to the USB's
measured temperature.)</p>
<p>All three TEMPer2 units sometimes show what I consider to be reading
anomalies where the USB temperature, the probe temperature, or both
will latch on to some value for a long period of time, with absolutely
no variation in reading for hours on end. One machine room unit
does this quite frequently for the USB temperature (and sometimes
for the probe), the desktop unit does it quite frequently for the
probe temperature (and sometimes for the USB), and the other machine
room unit doesn't seem to do it very much or for very long.</p>
<p>(The probe temperature on the desktop unit commonly latches at 21.06
C and its USB temperature at 21.68 C, while the machine room TEMPer2
latches the USB temperature at 19.56 C and sometimes latches the
probe temperature at 19.81 C. The other machine room TEMPer2 sometimes
latches the probe temperature at 23.56 C. As you can see, there's
a lot of variation here. These TEMPer2 units apparently report their
temperatures over USB in hundredths of a degree C, so the two digits
are authentically what they're reporting.)</p>
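<p>(If your readings end up somewhere you can export and script against,
spotting this latching is straightforward; here's a trivial sketch that
flags long runs of identical values. The list of (timestamp, value)
pairs is just an assumed format for illustration, not how our metrics
system actually stores things.)</p>
<pre>
# Sketch: flag suspiciously long runs of identical temperature readings,
# which is what the "latching" described above looks like in the data.
# 'readings' is assumed to be a list of (timestamp, value) pairs in
# time order; returns (start_ts, end_ts, value) for each long run.
def latched_runs(readings, min_run=20):
    runs = []
    start = 0
    for i in range(1, len(readings) + 1):
        if i == len(readings) or readings[i][1] != readings[start][1]:
            if i - start >= min_run:
                runs.append((readings[start][0], readings[i - 1][0],
                             readings[start][1]))
            start = i
    return runs
</pre>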
<p>Unless something changes when the desktop TEMPer2 unit is moved
into a machine room and connected to a different computer, it seems
like it's a less trustworthy unit than the other two. It definitely
seems to have a different behavior where the USB sensor varies its
readings more often than the probe and may be more trustworthy (my
office is probably not the same temperature to the hundredth of a
degree for hours on end). Given this unit's behavior and the varied
behavior of all three of them, we definitely need to test any future
units under controlled circumstances to see how they behave, even
if that's just putting them next to another unit with more known
behavior to compare.</p>
<p>In the past, people have measured various PCsensor branded hardware
(including 'TEMPer2' units) and found that its accuracy varied with
the true temperature. If precise accuracy is important to you, you
probably don't want to use a TEMPer2 in the first place but if you
have to, I think you should calibrate it over the temperature range
you care about. Given our results so far, the only use we'd make
of TEMPer2 units is to get a vague idea of the temperature of some
place and be able to tell if it's gone up a lot.</p>
<p>PS: The TEMPer2 units can report small temperature variations from
reading to reading; for example, I've seen probe readings of 19.62
C, 19.68 C, and 19.75 C, and USB readings of 22.87 C and 22.93 C.
So in general the 'latching' isn't as simple as the internal resolution
being much coarser than 0.01 C (although I can't imagine it's really
that precise, so presumably there's some rounding and noise happening).</p>
<p>(Although the raw data is in <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusGrafanaSetup-2019">our metrics system</a>, there's no convenient way
to find out how many different sensor readings we've seen. It does
appear that the smallest variation between any two readings is 0.06
C, and the largest one is 0.75 C (observed for the internal sensor;
the largest variation for a probe is 0.56 C). This comes from readings
that are generally taken 30 seconds apart.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/USBTemper2ReadingsNotes?showcomments#comments">3 comments</a>.) </div>Some notes on the readings you get from USB TEMPer2 temperature sensors2024-02-26T21:43:52Z2022-09-22T03:40:17Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/IPTunnelsAndRoutingcks<div class="wikitext"><p>Let's suppose that you have an inside network I on which you have
a bunch of things used in your environment, like <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/CentralizeSyslog">central syslog
servers</a>, your client mail gateway
machine, and so on, and you also have an external machine E, that
you would like to be able to use those services. One obvious seeming
way that you could do this is by setting up some form of network
tunnel between E and a touchdown machine T that has access to your
inside network I (these days you might use <a href="https://www.wireguard.com/">WireGuard</a>, for example). Your exterior machine
E sets a network route to I that goes through the tunnel, the traffic
pops out from T (behind your perimeter firewall), and everything is
happy, right?</p>
<p>Unfortunately, often not so much these days, because of stateful
firewalls. The problem is that the nice simple model I've described
here has <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/SymmetricAndAsymmetricIPRouting"><em>asymmetric routing</em></a>.
E's packets to the inside network I goes via the tunnel and T, but
under normal circumstances the return traffic from machines on I
will go out their regular network gateway, not back via T. If there
is a stateful firewall between your I machines and E, it may well
become unhappy and start blocking this traffic, since it looks like
a half-open connection (since it's only seeing half of the traffic).</p>
<p>One answer to this is to have the tunnel endpoint T be a NAT gateway,
not just a simple tunnel endpoint that passes traffic in the clear;
then E's traffic to machines on I emerges with T's IP address on
it, so return packets go back through T. However this leaves you
with a different asymmetric routing problem if machines on I ever
reach out to E on their own (for example to SSH in to it or to
collect metrics from it). Their packets will flow out normally,
un-NAT'd, but E will try to send packets for them back through the
tunnel and T's NAT'ing. You can solve this with simple "policy
based" routing on E, so that reply packets go out the interface
they came in on.</p>
<p>(You can also solve this by having E only route through the tunnel
for machines on I that never reach out to it, setting host routes
instead of network routes, although this is potentially fragile.)</p>
<p>Another solution is to teach all machines on your internal network
I that the external machine E is actually reached through a special
route to the tunnel gateway T. If T is not on I itself and is reached
through the same router as the default route, you might be able to
do this by a change to your gateway router alone. The obvious
drawback to this is that now the tunnel gateway T becomes an
additional point of failure for reaching E (well, for machines on
the internal network I).</p>
<p>(In a sufficiently complex environment, this can be automated through
routing announcements; T and your collection of other tunnel gateways
all announce what IP addresses are reachable through them, and various
things listen and respond appropriately. When a tunnel is down, the
routes are withdrawn and things go back to the defaults.)</p>
<p>Your life is generally easier if the external machine E is not
directly reachable from the Internet and instead you have to go
through a different IP address to reach it (such as a gateway);
often this means you have no conflicting routes, since you can't
reach E's (private) IP address except through the tunnel. Of course
you may have a similar problem if you need to manage the gateway
machine itself, since that machine definitely has a public IP
address.</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/IPTunnelsAndRouting?showcomments#comments">One comment</a>.) </div>The problem of network tunnels and (asymmetric) routing2024-02-26T21:43:52Z2022-09-17T02:45:27Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/ServerMemoryShiftingAmountscks<div class="wikitext"><p>One of the things happening <a href="https://support.cs.toronto.edu/">here</a>
is that we're in the process of rolling over our Ubuntu 18.04 servers
on to <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/OurServerAges2022">our current generations of server hardware</a> as we rebuild them as Ubuntu 22.04
based machines. This (and other local events) has caused us to take
a look at what older servers we want to keep and what ones we want
to get rid of, or at least exile to the depths of the back shelves.
Surprisingly, one factor is their CPU performance, but another one
is how much RAM they have, and this has set me thinking about the
(slowly) shifting scales of how much memory basic 1U servers come
with and how much is what we consider 'adequate'.</p>
<p>These days, going through Dell's configurator for R350s and R250s
(the current generation of what we have), it seems hard to get
something with less than 16 GB of RAM and impossible to get something
with less than 8 GB. What we consider to be <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/OurServerAges2022">our current servers</a> all came with 8 GB of RAM as our
floor amount (I don't know if we could have gotten them with less,
but I suspect not). However, our smallest older servers go as low
as 2 GB of RAM, with some having 4 GB. According to DMI information,
the smallest DIMMs we have in any Ubuntu server are 2 GB DIMMs, so
based on that we couldn't have servers with less than 2 GB of RAM
in general.</p>
<p>As a practical matter, I don't think we'd deploy any reused server
with less than 4 GB of RAM, and we might take the effort to bring
them up to 8 GB. We have very few machines with less than 8 GB now,
and it's not just because of the hardware generation they're on.
We've simply wound up in a situation where we default to thinking
that 8 GB is the minimum amount of RAM that a server should have
(and we add more if it seems called for). Of course this isn't
absolutely necessary; we probably have plenty of servers that don't
really need 8 GB, and I've never had problems on my virtual machines
with 4 GB.</p>
<p>I'm not energetic enough to trawl our records to see how much RAM
various generations of servers were bought with, but there certainly
was a day when 1 GB or 2 GB was what they came in the door with.
Some very quick exploration suggests that we were getting basic
servers with 1 GB of RAM as far back as fifteen years ago, and ten
years ago we seem to have been on the cusp of only being able to
get new servers with a minimum of 2 GB of RAM. We probably had a
while when servers came in the door with 4 GB, but for the past few
years 8 GB has been the minimum.</p>
<p>I suspect that this shift in server RAM sizes is driven by a similar
effect to the shift in hard drive sizes (both SSDs and HDDs), where
manufacturers mostly hold the price constant and keep increasing
the DIMM size. I do sort of wish that memory DIMM sizes had risen
at the same rate that SSD sizes did; instead, they seem to have
stagnated for a while, and certainly didn't rise aggressively. (This
is one reason that the current generation and the past generation
of my desktops have had the same 32 GB of RAM, although my current
generation is getting a bit old by now and prices might have shifted
lately.)</p>
<p>(It certainly would be nice for 32 GB or 64 GB or even 128 GB to be a
standard, inexpensive memory size, but so far that hasn't happened, at least for us,
although it is now much more reasonable to have 32 GB machines and we
have a number of them.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/ServerMemoryShiftingAmounts?showcomments#comments">4 comments</a>.) </div>The amount of memory in basic 1U servers and our shifting views of it2024-02-26T21:43:52Z2022-09-12T03:21:32Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/ServerNVMeU2U3AndOthers2022cks<div class="wikitext"><p><a href="https://utcc.utoronto.ca/~cks/space/blog/tech/ServerWithU2NVMeIn2022">The other day</a> I casually looked around
to see how readily available <a href="https://en.wikipedia.org/wiki/U.2">U.2</a>
NVMe drives were compared to SATA SSDs. In the process I saw some
mention of '2.5" U.3' NVMe drives, which was a connector type I'd
never heard of, and did some digging.</p>
<p>The short summary of <a href="https://en.wikipedia.org/wiki/U.2">U.2</a> is that it's NVMe drives in more or
less the 2.5" SSD form factor (although according to Wikipedia, U.2
can also deliver two SATA lanes), with a different edge connector.
<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/ServerWithU2NVMeIn2022">Our recent experience with some U.2 based servers</a> says that this works; our U.2 NVMe drives
in drive carriers look and handle basically the same as SATA SSDs
in drive carriers in other servers. To tell them apart, you have
to either look at the back of the drive where the connectors are
or notice the big 'NVMe' sticker on the front of the drive carrier.</p>
<p>U.3 is sort of an evolution of the U.2 connector and form factor,
but it's sufficiently unloved that <a href="https://en.wikipedia.org/wiki/NVM_Express#U.3_(SFF-8639_or_SFF-TA-1001)">Wikipedia barely has a mention
of it</a>.
The goal of U.3 is to create a 'tri-mode' standard where the same
server drive bay can support U.3 NVMe, 2.5" SAS, or 2.5" SATA drives
(and also the same server backplane and controller; see <a href="https://quarch.com/news/what-you-need-know-about-u3/">here</a>, <a href="https://www.storagereview.com/news/evolving-storage-with-sff-ta-1001-u-3-universal-drive-bays">here</a>,
and <a href="https://www.boston-it.ch/blog/2022/03/17/boston-labs-test-micron-7400-ssds.aspx">here</a>).
A U.3 NVMe drive is backward compatible with U.2 drive bays, but a
U.2 NVMe drive can't be used in a U.3 drive bay, presumably for
reasons.</p>
<p>For people like <a href="https://support.cs.toronto.edu/">us</a>, ordinary
1U servers with U.3 drive bays would be reasonably attractive. We'd
mostly use them with SATA SSDs, but if we had a server that could
benefit from NVMe it would be easy to switch over to it. If we had
an NVMe drive failure and had no spare for some reason, we could
swap in a SATA SSD to get the server back on the air. And we wouldn't
need specific spares for NVMe servers the way we do with U.2, because
any server could be an NVMe server if we needed it to be.</p>
<p>(The natural number of 2.5" drive bays for a 1U server seems to be
four, and since NVMe drives normally use four PCIe lanes each, that only needs PCIe x16, which is pretty
widely available.)</p>
<p>However if you do Internet searches for U.3 you'll soon discover
that there's a competing set of standards for NVMe disks on servers,
the <a href="https://en.wikipedia.org/wiki/Enterprise_%26_Data_Center_SSD_Form_Factor">EDSFF</a>
series, and <a href="https://www.servethehome.com/e1-and-e3-edsff-to-take-over-from-m-2-and-2-5-in-ssds-kioxia/">some people feel that U.2 and U.3 are doomed in the
face of them</a>.
The <a href="https://www.snia.org/forums/cmsi/knowledge/formfactors">EDSFF form factors</a> are specific
to NVMe SSDs; there's no concession for backward compatibility to
the 2.5" form factor.</p>
<p>I have no idea how this is going to shake out. <a href="https://www.anandtech.com/show/17517/phison-and-seagate-announce-x1-ssd-platform-u3-pcie-40-x4-with-128l-etlc">People are still
announcing new U.3 NVMe drives today</a>,
but <a href="https://www.anandtech.com/tag/edsff">there's EDSFF activity too</a>.
My biased perspective is that right now we're more interested in
U.3's flexibility to choose between SATA and NVMe for SSDs than
what EDSFF might deliver. But if EDSFF causes NVMe prices to drop
to the level of SATA SSDs, sure, we'd be happy to go NVMe. It's
flash storage either way, so if everything else is equal we'd rather
have the faster version.</p>
</div>
U.2, U.3, and other server NVMe drive connector types (in mid 2022)2024-02-26T21:43:52Z2022-08-26T02:40:31Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/ServerWithU2NVMeIn2022cks<div class="wikitext"><p>Back in early 2021 I wrote about <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/ServerSSDVsNVMeIn2021">my impressions of NVMe versus
SATA (or SAS) SSDs for basic servers</a>. At
that point I didn't expect us to get NVMe based servers any time
soon, especially for servers not focused on fast storage. Well,
times change, and we now have a number of 1U servers with U.2 NVMe
drives. These aren't really "basic" servers in our usual sense;
instead they tend to be <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/BigServerFastFirefoxBuild">pretty powerful compute servers</a>. But they're still 1U servers
and in theory there's nothing to stop people from having lower end
ones with NVMe SSDs. Our experiences with these servers have been
positive, in that everything works as we expect and basically how
things would be if these were SATA SSDs instead.</p>
<p>(Obviously the U.2 NVMe drives are a lot faster and have lower
latency, but these servers mostly don't put any real stress on
their storage.)</p>
<p>We didn't get these servers with <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/NVMeGettingTermsStraight">NVMe disks</a>
instead of SATA (or SAS) disks because we had some attraction to
NVMe; if anything, we prefer SATA SSDs to U.2 NVMe SSDs because
it's much easier to get spares and replacements (SATA SSDs are
commodity items; U.2 NVMe SSDs are more expensive and harder to
find). Instead, we got these servers with U.2 NVMe drives because
that's the configuration they really wanted to come in. All of these
servers have four hot swap drive bays (taking their own proprietary
drive carriers), although we normally only use two (for a mirrored
pair of system disks), and we opted to get them with four U.2 NVMe
drives each in order to build up a pool of spares.</p>
<p>Physically and in operation these are just like conventional SATA
or SAS drive carriers (from this particular system vendor) and more
or less just like conventional 2.5" SATA and SAS drives (they may
be thicker, but I don't pay close attention to that). In fact they're
so physically similar that I'm glad the vendor puts a big 'NVMe'
label on the front, because otherwise we could easily get confused
about which drive carrier is U.2 NVMe and which drive carrier is
SATA SSD.</p>
<p>One particular area in which they are just like SATA drives in drive
carriers is that we've hot-swapped inactive U.2 NVMe drives without
problems. Linux certainly didn't explode. This gives me hope that
we'll be able to deal gracefully if a system drive fails and has
to be replaced. Hopefully, a failed NVMe drive won't have adverse
consequences for the PCIe fabric it's connected to.</p>
<p>(Our hot-swapping of inactive drives came about because we left
all four drives inserted in some servers, although we were only
using two, and then later wanted to pull the two inactive drives
out.)</p>
<p>I don't know why this particular vendor decided to make these systems
be basically native U.2, although they're not really storage servers
(being 1U systems with only four drive bays). All of the systems
that are this way are dual-socket AMD Zen3 Epyc based ones, so maybe
it's partly because they have so many PCIe lanes available.</p>
</div>
We now have some 1U servers with U.2 NVMe SSDs and they're okay2024-02-26T21:43:52Z2022-08-25T02:51:48Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/SMARTAttributeNamesMadeUpcks<div class="wikitext"><p>A well known part of <a href="https://en.wikipedia.org/wiki/S.M.A.R.T.">SMART</a>
is its system of <a href="https://en.wikipedia.org/wiki/S.M.A.R.T.#ATA_S.M.A.R.T._attributes"><em>attributes</em></a>, which
provide assorted information about the state of the disk drive.
When we talk about SMART attributes we usually use names such as
"Hardware ECC Recovered", as I did in <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/SMARTAttributesVolatile">my entry on how SMART
attributes can go backward</a>. In an ideal
world, the names and meanings of SMART attributes would be standardized.
In a less than ideal world, at least each disk drive would tell you
the name of each attribute, <a href="https://en.wikipedia.org/wiki/CPUID#EAX=80000002h,80000003h,80000004h:_Processor_Brand_String">similar to how x86 CPUs tell you their
name</a>.
Sadly we don't live in either such world, so in practice those nice
SMART attribute names are what you could call made up.</p>
<p>The only actual identification of SMART attributes provided by disk
drives (or obtained from them) is an ID number. Deciding what that
ID should be called is left up to programs reading SMART data (as
is how to interpret the raw value). Because of this flexibility in
the standard, disk drive makers have different views on both the
proper, official names of their SMART attributes as well as how to
interpret them. Some low-numbered SMART attributes have almost
standard names and interpretations, but even that is somewhat
variable; SMART ID 9 is commonly used for 'power on hours', but
both the units and the name can vary from maker to maker.</p>
<p>Disk drive makers may or may not share information on SMART ID names
and interpretations with people; usually they don't, except perhaps
with some favoured drive diagnostic programs. Often, information
about the meaning and names of SMART attributes must be reverse
engineered from various sources, especially in the open source
world. Open source programs such as <a href="https://www.smartmontools.org/">smartmontools</a> often come with an extensive
database of per-model attribute names and meanings; <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/SMARTUpdateDriveDatabase">in smartmontools'
case, you probably want to update its database every so often</a>.</p>
<p>As a corollary of this, names for SMART attributes aren't necessarily
unique; the same name may be used for different SMART IDs across
different drives. Across our collection of disk drives, "Total LBAs
Written" may be any of SMART ID 233 (some but not all Intel SSDs),
241 (most brands and models of our SSDs and even some HDDs), or 246
(Crucial/Micron). Meanwhile, SMART IDs 241 and 233 have five different
names across our fleet, according to smartmontools.</p>
<p>(SMART ID 233 is especially fun; the names are "media wearout
indicator", "nand gb written tlc", "sandforce internal", "total
lbas written", and "total nand writes gib". The proper interpretation
of values of SMART ID 233 thus varies tremendously.)</p>
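<p>(To illustrate how tools have to cope with this, here is a minimal Python sketch of the sort of per-model table that smartmontools maintains at much larger scale. The drive model names below are invented purely for illustration; the SMART IDs are the ones mentioned above.)</p>
<pre>
# A tiny, hand-built illustration of per-model SMART attribute naming.
# Real tools (smartmontools' drivedb.h, for example) keep a far larger
# database keyed on model name patterns; the model names below are
# invented purely for illustration.
TOTAL_LBAS_WRITTEN_ID = {
    "Example Intel SSD": 233,       # some (not all) Intel SSDs
    "Example generic SSD": 241,     # the most common choice
    "Example Crucial/Micron": 246,  # Crucial/Micron
}

def total_lbas_written(model, attributes):
    """Return the raw 'Total LBAs Written' value for a drive, if we
    know which SMART ID this particular model uses for it."""
    smart_id = TOTAL_LBAS_WRITTEN_ID.get(model)
    if smart_id is None:
        return None        # unknown model: we can't trust a guess
    return attributes.get(smart_id)

# The same logical quantity, read from different IDs on two drives:
print(total_lbas_written("Example generic SSD", {241: 123456789}))
print(total_lbas_written("Example Crucial/Micron", {246: 987654321}))
</pre>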
<p>Fortunately, <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/NVMeAndSMART">NVMe is more sensible about its drive health information</a>. The NVMe equivalents of (some) SMART attributes are
standardized, with fixed meanings and no particularly obvious method
for expansion.</p>
<p>PS: Interested parties can peruse the smartmontools <a href="https://github.com/smartmontools/smartmontools/blob/master/smartmontools/drivedb.h">drivedb.h</a>
to find all sorts of other cases.</p>
</div>
The names of disk drive SMART attributes are kind of made up (sadly)2024-02-26T21:43:52Z2022-08-17T02:13:49Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/SMARTAttributesVolatilecks<div class="wikitext"><p>Recently, <a href="https://support.cs.toronto.edu/">we</a> had a machine stall
hard enough that I had to power cycle it in order to recover it.
Since the stall seemed to be related to potential disk problems, I
took a look at SMART data from before the problem seemed to have
started and after the machine was back (this information is captured
in <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusGrafanaSetup-2019">our metrics system</a>).
To my surprise, I discovered that several SMART attributes had gone
backward, such as the total number of blocks read and written
(generally SMART IDs 241 and 242) and 'Hardware ECC Recovered'
(here, SMART ID 195). I already knew that <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/SMARTWeirdPowerOnHours">the SMART 'power on
hours' value was unreliable</a>, but I hadn't
really thought that other attributes could be unreliable this way.</p>
<p>This has led me to look at SMART attribute values over time across
our fleet, and there certainly do seem to be any number of attributes
that see 'resets' of some sort despite being what I'd think was
stable. Various total IO volume attributes and error attributes
seem most affected, and it seems that the 'power on hours' attribute
can be affected by power loss as well as <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/SMARTWeirdPowerOnHours">other things</a>.</p>
<p>Once I started thinking about the details of how drives need to
handle SMART attributes, this stopped feeling so surprising. SMART
attributes are changing all the time, but drives can't be constantly
persisting the changed attributes to stable storage, whether that's
some form of NVRAM or the HDD itself (for traditional HDDs with no
write endurance issues). Naturally drives will be driven to hold
the current SMART attributes in RAM and only persist them periodically.
On an abrupt power loss they may well not persist this data, or at
least only save the SMART attributes after all other outstanding
IO has been done (which is the order you want, the SMART attributes
are the least important thing to save). It also looks like some
disks may sometimes not persist all SMART attributes even during
normal system shutdowns.</p>
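<p>One practical consequence for metrics collection is that you can't simply subtract successive samples of a cumulative attribute like 'total LBAs written'; you have to allow for the value going backward. Here is a minimal sketch of the usual guard (my own illustration, similar in spirit to how Prometheus handles counter resets, not code from any particular system):</p>
<pre>
def counter_increase(samples):
    """Sum the increases across successive samples of a cumulative
    SMART counter, treating a backward jump as a reset rather than
    as a (nonsensical) negative amount of IO."""
    total = 0
    for prev, cur in zip(samples, samples[1:]):
        if cur >= prev:
            total += cur - prev
        else:
            # The drive apparently lost state (power cycle, unclean
            # shutdown, etc); count only what we've seen since then.
            total += cur
    return total

# A counter that went backward after a power cycle:
print(counter_increase([1000, 1500, 1800, 200, 450]))  # 1250, not -550
</pre>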
<p>This probably doesn't matter very much in practice, especially since
SMART attributes are so variable in general that it's hard to use
them for much unless you have a very homogenous set of disk drives.
There's already no standard way to report the total amount of data
read and written to drives, for example; across our modest set of
different drive models we have drives that report in GiB, MiB, or
LBAs (probably 512 bytes).</p>
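<p>(As an illustration of the unit problem, here's a small Python sketch of the conversion you wind up doing once you've determined, per drive model, which unit that model reports in. The unit table itself is the part you have to look up or reverse engineer for each model.)</p>
<pre>
# Bytes per reporting unit for the conventions we've seen; which one a
# given drive model actually uses has to be determined per model.
UNIT_BYTES = {
    "GiB": 1024 ** 3,
    "MiB": 1024 ** 2,
    "LBA": 512,          # assuming 512-byte sectors
}

def data_written_bytes(raw_value, unit):
    """Convert a drive's raw 'total data written' attribute to bytes."""
    return raw_value * UNIT_BYTES[unit]

# The same ~1 TiB of writes, as three different raw values:
print(data_written_bytes(1024, "GiB"))
print(data_written_bytes(1024 * 1024, "MiB"))
print(data_written_bytes(2 ** 31, "LBA"))
</pre>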
<p>(Someday I may write an entry on fun inconsistencies in SMART
attribute names and probably meaning that we see across our disks.)</p>
<p>PS: I don't know how NVMe drives behave here, since <a href="https://utcc.utoronto.ca/~cks/space/blog/tech/NVMeAndSMART">NVMe drives
don't have conventional SMART attributes</a> and we're
not otherwise collecting the data from our few NVMe drives that
might someday let us know for sure, but for now I'd assume that
the equivalent information from NVMe drives is equally volatile
and may also go backward under various circumstances.</p>
</div>
Disk drive SMART attributes can go backward and otherwise be volatile2024-02-26T21:43:52Z2022-08-16T01:51:50Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/MFAAccountRecoveryDistrustcks<div class="wikitext"><p>A bunch of third party websites really want you to use multi-factor
authentication these days. Some of them aren't giving some people
a choice about it; for example, <a href="https://www.bleepingcomputer.com/news/security/pypi-mandates-2fa-for-critical-projects-developer-pushes-back/">PyPI recently mandated MFA for
sufficiently popular projects</a>.
I have decidedly mixed feelings about this in general, and I've
realized that one reason for them is that I don't trust some
of the potential failure modes of multi-factor authentication.
Specifically, the ones related to 'account recovery', also known
as what happens when things go wrong with your MFA-related devices.</p>
<p>There's no general account recovery problem with MFA. For example, if
the MFA hardware token from my employer was lost or destroyed, I'd
report it and various processes would happen and a new one would show up
and get registered to me. If the MFA I used with my bank was lost, I'd
go to my bank branch to talk to them, and eventually things would get
reset. But both of these situations have some things in common. I can
actually talk to real people in both situations, and both have out of
band means of identifying me (and communicating with me).</p>
<p>Famously, neither of these is the case with many large third party
websites, which often have functionally no customer support and
generally no out of band ways of identifying you (at least not ones
they trust). If you (I) suffer total loss of all of your means of
doing MFA, you are probably completely out of luck. One consequence
of this is that you really need to have multiple forms of MFA set
up before you make MFA mandatory on your account (better sites will
insist on this). People advise things like multiple hardware tokens,
with some of them carefully stored offsite in trusted locations.
This significantly (or vastly) raises the complexity of using MFA
with these sites.</p>
<p>More broadly, this is a balance of risks issue. I care quite a bit
about the availability of my accounts, and I feel that it's much
more likely that I will suffer from MFA issues than it is that I
will be targeted and successfully phished for my regular account
credentials (or that someone can use 'account recovery' to take
over the account). If loss of MFA is fatal, my overall risks go way
up if I use MFA, although the risk of account compromise goes way
down.</p>
<p>(As a side note, this is likely not PyPI's situation. PyPI is
apparently giving people security keys, and is clearly in touch
with these people through additional channels. If PyPI considers
you and your package critical, it's very likely that you can recover
from an MFA loss. PyPI here is much more like my employer than it
is like, say, Google. But most random websites that ask me to enable
MFA are much more like Google than PyPI.)</p>
<p>(This isn't my only issue with 'you must have MFA' requirements,
but it's a starting point.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/MFAAccountRecoveryDistrust?showcomments#comments">5 comments</a>.) </div>My distrust of multi-factor authentication's account recovery story2024-02-26T21:43:52Z2022-07-11T03:31:11Ztag:cspace@cks.mef.org,2009-03-24:/blog/tech/FilesystemsVsGeneralTreescks<div class="wikitext"><p>A while back I read Felix Kohlgrüber's <a href="https://fkohlgrueber.github.io/blog/tree-structure-of-file-systems/">The Tree Structure of File
Systems</a>
(<a href="https://lobste.rs/s/ydno8w/tree_structure_file_systems">via</a>).
As I read it, it argues that filesystem directories should be a
richer and more general data structure, one that could contain both
data and children (<a href="https://utcc.utoronto.ca/~cks/space/blog/web/WebPathsNotQuiteFilesystemAPI">which would make web paths map better on to
the 'filesystem API'</a>). As it
happens, I think there is a relatively good reason that directories
are organized this way, one beyond simple history.</p>
<p>To put it simply, filesystem directories are optimized for out of
memory traversal in order to look up names. Generally their actual
representation is some form of data structure filled with <name,
reference> pairs, where the reference tells you where (on disk) to
find almost all substantive information about the name. In the old
days and in basic filesystems, the containing data structure was a
list; in modern filesystems, it's usually some form of balanced
tree (sometimes once the directory is big enough). Directories are
optimized to be searched through with as few disk reads as possible,
because disk reads are assumed to be expensive.</p>
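<p>(As a deliberately simplified Python sketch of the shape of this, nothing like a real on-disk format: a directory is essentially just a searchable collection of name to reference pairs, with everything substantive about each name living behind its reference.)</p>
<pre>
# A directory, as stored on disk, is essentially a searchable set of
# (name, reference) pairs; everything substantive about each name
# lives behind the reference (here, a pretend inode number).
directory = [
    ("README", 101),
    ("bin", 102),
    ("src", 103),
]

def lookup(name):
    """Find the reference for a name. Real filesystems care about doing
    this with as few disk reads as possible, which is why on disk this
    is often a balanced tree instead of a flat list."""
    for entry_name, inode_ref in directory:
        if entry_name == name:
            return inode_ref
    return None

print(lookup("src"))       # 103
print(lookup("Makefile"))  # None
</pre>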
<p>(Modern filesystems often put some sufficiently commonly used metadata
about each name into the directory as well, because that way you can
tell programs about it without them having to do a separate disk read,
possibly a separate disk read for each name in the directory.)</p>
<p>If you augment this basic data structure with data content associated
with the directory (well, the directory name) itself, you don't want
to put this new data in line with the lookup data structure, where
it will force you to read more data from disk in order to do the
same traversals. Instead it's most likely going to be referred to
separately. There will be one set of disk storage associated with the
inventory of the directory's children and a second set of disk storage
for the data content. Entities without children will have a 'no such
thing' marker for their first sort of disk storage and ones without
data will have a 'no such thing' marker for their second. If a lot
of filesystem entities only have one or the other (ie, if they're
conventional files or directories), then a lot of the time having
two slots is a waste of (limited) space in an area that filesystems
have traditionally cared about (the metadata kept for most or all
filesystem entities).</p>
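<p>(To make the cost concrete, here is a toy sketch of such an augmented entry; this is purely illustrative, not how any real filesystem lays out its metadata. For a conventional file or a conventional directory, one of the two reference slots is always dead weight.)</p>
<pre>
from dataclasses import dataclass
from typing import Optional

@dataclass
class AugmentedNode:
    """A filesystem entity that may have both children and data content.
    Each reference stands in for 'where on disk this lives', so an
    unused slot is space paid for in the metadata of every entity."""
    children_ref: Optional[int] = None   # on-disk name lookup structure
    data_ref: Optional[int] = None       # on-disk data content

regular_file = AugmentedNode(data_ref=1001)         # children_ref wasted
plain_directory = AugmentedNode(children_ref=2002)  # data_ref wasted
hybrid = AugmentedNode(children_ref=2003, data_ref=1002)
</pre>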
<p>More broadly, I think it's a mistake to look at filesystems through
the eyes of general tree structures. Filesystems originated in a
very constrained situation and continue to be focused on a fairly
constrained one, one where any indirect reference to something is
very slow and the less you need to access, the better.</p>
<p>(It's true that modern memory references have gotten slower and
slower relative to raw CPU speed, but they're still faster than
even NVMe speeds. Plus, a lot of in memory data structures are still
not being designed by programmers to minimize references and pack
things as densely as possible, for various reasons.)</p>
</div>
<div> (<a href="https://utcc.utoronto.ca/~cks/space/blog/tech/FilesystemsVsGeneralTrees?showcomments#comments">3 comments</a>.) </div>Filesystems versus general tree structures2024-02-26T21:43:52Z2022-07-04T02:32:01Z