Picking FreeType CJK fonts for xterm on a modern Linux system

January 10, 2017

Once I worked out how to make xterm show Chinese, Japanese, and Korean characters, I had to figure out what font to use. I discussed the general details of using FontConfig to hunt for CJK fonts in that entry, so now let's get down to specifics.

The Arch Linux xterm example uses 'WenQuanYi Bitmap Song' as its example CJK font. This is from the Wen Quan Yi font collection, which is available for Fedora as a set of wqy-*-fonts packages. So I started out with 'WenQuanYi Zen Hei Mono' as the closest thing that I already had installed on my system.

(Descriptions of Chinese fonts often talk about them being an 'X style' font. It turns out that Chinese has different styles of typography, analogous to how Latin fonts have serif and sans-serif styles; see here or here or here for three somewhat random links that talk about eg Heiti vs Mingti. Japanese apparently has a similar but simpler split, per here, with the major divisions being called 'gothic' and 'Mincho'. Learning this has suddenly made some Japanese font names make a lot more sense.)

Fedora itself has a Localization fonts requirements wiki page. The important and useful bit of this page is a matrix of languages and the default and additional fonts Fedora apparently prefers for each. Note that each of Chinese, Japanese, and Korean picks different fonts here; there isn't one CJK font that's the first or even second preference for all of them. Since you have to pick only one font for xterm's CJK font, you may want to think about which language you care most about.

(This is probably where Han unification sticks its head up, too. Fedora talks about maybe influencing font rendering choices here on its Identifying fonts page.)

In Ubuntu, apparently some CJK default fonts have changed to Google's Noto CJK family. A discussion in that bug suggests that Fedora may also have changed its defaults to the Noto CJK fonts, contrary to what its wiki sort of implies. The Arch Wiki has its usual comprehensive list of CJK font options and there's also Wikipedia's general list. Neither particularly mentions monospaced fonts, though, assuming that this is even something that one has to consider in CJK fonts for xterm.

All of this led me to peer into the depths of /etc/fonts/conf.d on my Fedora machines to look for mentions of monospace. Here I found interesting configuration file snippets that said things like:

   <match>
       <test name="lang">
           <string>ja</string>
       </test>
       <test name="family">
           <string>monospace</string>
       </test>
       <edit name="family" mode="prepend">
           <string>Noto Sans Mono CJK JP</string>
       </edit>
   </match>

   <alias>
       <family>Noto Sans Mono CJK JP</family>
       <default>
           <family>monospace</family>
       </default>
   </alias>

I'm not really up on FontConfig magic, but this sure looked like it was setting up a 'Noto Sans Mono CJK JP' font as a monospace font if you wanted things in Japanese. There are also KR, SC (Simplified Chinese), and TC (Traditional Chinese) variants of Noto Sans Mono CJK lurking in the depths of my Fedora system.
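
As a rough illustration of what that rule says, the sketch below parses the <match> block quoted above with Python's xml.etree.ElementTree and reports which font gets prepended for which language and generic family. The <fontconfig> wrapper element and the function name prepended_families are additions for the example, and it only understands this one simple rule shape, not FontConfig's full matching logic.

```python
import xml.etree.ElementTree as ET

# The <match> rule quoted above, wrapped in a <fontconfig> root element
# (added here) so that it parses as a standalone XML document.
SNIPPET = """
<fontconfig>
  <match>
    <test name="lang"><string>ja</string></test>
    <test name="family"><string>monospace</string></test>
    <edit name="family" mode="prepend">
      <string>Noto Sans Mono CJK JP</string>
    </edit>
  </match>
</fontconfig>
"""


def prepended_families(xml_text):
    """Return (lang, generic family, prepended font) for each simple rule."""
    rules = []
    for match in ET.fromstring(xml_text).iter("match"):
        # Collect the <test> conditions of this rule, keyed by property name.
        tests = {t.get("name"): t.findtext("string") for t in match.findall("test")}
        for edit in match.findall("edit"):
            # Only look at edits that prepend to the family list.
            if edit.get("name") == "family" and edit.get("mode") == "prepend":
                for s in edit.findall("string"):
                    rules.append((tests.get("lang"), tests.get("family"), s.text.strip()))
    return rules


print(prepended_families(SNIPPET))
```

The same function could in principle be fed the text of a file from /etc/fonts/conf.d, although real configuration files use more constructs than this sketch handles.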

After looking at an xterm using WenQuanYi Zen Hei Mono side by side with one using Noto Sans Mono CJK JP, I decided that the Noto version was probably better looking (on my very limited sample of CJK text, mostly in file names and font names). I also felt slightly more confident in picking it, since it seemed likely to be closer both to how eg gnome-terminal was operating and to the general trend of CJK font choices in various Linuxes. I wish I could find out what CJK font(s) gnome-terminal was using, but the design of current versions makes that difficult.

(Some experimentation suggests that in my setup, gnome-terminal may be using VL Gothic here. I guess I can live with all of this, however it comes out; mostly I just want CJK characters to show up as something other than boxes or especially spaces.)


Comments on this page:

I'd say I hate to nitpick, but that would be lying. I think you know this already; consider this my own notes:

You haven't worked out how to make xterm show Chinese, Japanese, and Korean characters. You worked out a configuration that shows Japanese characters, and mis-displays Chinese and Korean strings which include unihan characters.

https://en.wikipedia.org/wiki/Han_unification#Examples_of_language-dependent_glyphs

xterm and Unix text strings don't know about language. You need formatted text with language information. Which is needed for other purposes too, e.g. screen readers. Like the lang attribute this website appears to lack :-P.

By cks at 2017-01-10 12:39:50:

Alan: you're right here, and your link provides a handy source of text that can be copied into an xterm or any other terminal program to test which language rendering is in use. For my own use I deliberately picked Japanese as the unihan glyph choice, but other people will want a different one.

(As you note, the general problem is unsolvable in a terminal program because Unix text does not come with language markers. This is where I mutter that life would be simpler here without Han unification.)

One advantage of using Noto Sans Mono CJK <language> is that you can switch which glyphs you get by switching the language suffix. Eg, if I need to get correct Chinese characters I can use 'xterm -fd "Noto Sans Mono CJK SC"'. I suspect (but haven't tested) that there's no good way to do this with just locale settings, at least for xterm.
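
A minimal sketch of that suffix switching, written as a small Python helper that builds the xterm command line. The language keys ('ja', 'ko', 'zh-cn', 'zh-tw') and the function name xterm_cjk_command are hypothetical choices for this example; only the font names and the -fd option come from the discussion above.

```python
import shlex

# Suffixes of the Noto Sans Mono CJK variants mentioned in the entry.
# The dictionary keys are hypothetical labels chosen for this sketch.
NOTO_MONO_CJK_SUFFIX = {
    "ja": "JP",
    "ko": "KR",
    "zh-cn": "SC",  # Simplified Chinese
    "zh-tw": "TC",  # Traditional Chinese
}


def xterm_cjk_command(lang):
    """Build an xterm argument list that selects the CJK font for `lang`."""
    suffix = NOTO_MONO_CJK_SUFFIX[lang.lower()]
    return ["xterm", "-fd", f"Noto Sans Mono CJK {suffix}"]


# Show the command as it would be typed in a shell.
print(shlex.join(xterm_cjk_command("zh-cn")))
```

The list form can be passed directly to subprocess.run() to start the terminal; an unknown language key raises KeyError rather than silently picking a font.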

Written on 10 January 2017.

Last modified: Tue Jan 10 00:48:32 2017