== A brief mention of some tools for debugging Linux NFS client issues

Someone here recently asked for tips on debugging a mysterious Linux NFS client hang. I didn't have any answers, but I did happen to know where to look for some Linux-specific tools. (The person had already exhausted the abilities of things like _tcpdump_ to help.)

The most obvious thing is to use the [[magic SysRq UsingMagicSysrq]] to get a dump of the kernel call stacks of all processes (the _t_ command). Once you find the hanging processes in all of the output, you can usually see what operations they're hanging on, both high level and somewhat low level.

(Here's where I observe that it's a pity that there's no way to ask for a magic SysRq dump of a specific process. Hopefully someone will now tell me that I'm wrong.)

The Linux NFS client also has its own debugging hooks, accessible through _/proc/sys/sunrpc_; unfortunately, they're rather underdocumented and magical. What you want are the files ((rpc_debug)) and ((nfs_debug)), each of which is a bitmap of flags that control which RPC or NFS events get logged; you write a decimal integer to them to set the bitmap's value, or a _0_ to turn off all logging.

(In addition, writing any number to ((rpc_debug)) will give you a cryptic dump of RPC 'task' information. Having just read through a bunch of kernel source code, my opinion is that there is almost nothing useful in it unless you are a kernel hacker. If you really want this dump and nothing else, write a _0_ to ((rpc_debug)).)

The values for the various things you can get reports of are found in the kernel source in ((include/linux/sunrpc/debug.h)) (the ((RPCDBG_)) #defines) and ((include/linux/nfs_fs.h)) (the ((NFSDBG_)) #defines). You can use a suitably large value like 32767 to turn everything on.

Note that this can produce a lot of kernel messages very fast, especially if you turn on lots of things. Also, one of the big reasons this stuff is not documented is that it is primarily intended for kernel hackers, so to understand the results you may need to go dig in the kernel NFS and RPC code (in _fs/nfs_ and _net/sunrpc_ respectively).

(There are similar debug files for the NFS server and for the NLM. Exploring these is left as an exercise for the reader.)
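
Since the debug files want a decimal integer built from bit flags, here is a minimal sketch of assembling and writing one. The flag table is an assumption: it lists a few ((RPCDBG_)) values as I remember them, and you should check them against the header in your own kernel source before trusting them. The helper names (((debug_mask)), ((set_debug))) are made up for this illustration.

```python
import sys
from pathlib import Path

# ASSUMED values for a few RPCDBG_ flags; verify them against
# include/linux/sunrpc/debug.h in your kernel source before relying on them.
RPCDBG_FLAGS = {
    "xprt": 0x0001,
    "call": 0x0002,
    "debug": 0x0004,
    "nfs": 0x0008,
    "auth": 0x0010,
    "bind": 0x0020,
    "sched": 0x0040,
    "trans": 0x0080,
}

SUNRPC_DIR = Path("/proc/sys/sunrpc")


def debug_mask(names, table=RPCDBG_FLAGS):
    """OR together the bits for the named flags; no names gives 0 (logging off)."""
    mask = 0
    for name in names:
        mask |= table[name]
    return mask


def set_debug(filename, mask, base=SUNRPC_DIR):
    """Write the mask as a decimal integer to, e.g., rpc_debug (needs root)."""
    (base / filename).write_text(f"{mask}\n")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: sudo python3 sunrpc_debug.py rpc_debug call trans
    target, flags = sys.argv[1], sys.argv[2:]
    mask = debug_mask(flags)
    print(f"writing {mask} to {SUNRPC_DIR / target}")
    set_debug(target, mask)
```

Running it with a file name and no flag names writes a _0_, which turns logging back off. The table only covers ((rpc_debug)); for ((nfs_debug)) you would pass a different table built from the ((NFSDBG_)) #defines.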