Something Rotten In The State Of Linux

I can’t remember how SunOS 4 startup worked, but I’m certain that I was first exposed to the “System V” init system with SunOS 5, marketed as “Solaris” from its release in 1991.

Right from the beginning, I thought it was absurd. I was OK with the concept of “run levels”, representing “single user”, “multiple users”, “graphical interface”, and so on, but each run level was implemented by a set of scripts accessed by symbolic links, the name of which defined its activation and order of execution.

As an example, if I look at this current Linux laptop wot I’m typing on, /etc/rc3.d/S13networking does whatever is needed to get the machine’s network working. But that isn’t a real file: it’s a link to /etc/init.d/networking. The “S13” prefix is what tells the init system to start it before /etc/rc3.d/S15nfs-common (a link to /etc/init.d/nfs-common), because network file systems can’t work before the network is up.

The same real file /etc/init.d/networking is also linked to /etc/rc1.d/S13networking and /etc/rc5.d/S13networking in order to start the network in these different run levels. That means you need to remember that changing the real file to fix something affects all run levels, while changing the prefix because something else needs to run first has to be done separately for each run level.
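
To make the naming scheme concrete, here is a minimal sketch that rebuilds it inside a scratch directory rather than the real /etc. The function names make_rc_links and start_order are invented for the illustration, and the file names are only examples.

```shell
#!/bin/sh
# Sketch of the System V link scheme, built in a scratch directory
# (nothing under the real /etc is touched). Function names are made up.

# make_rc_links ROOT: create two real scripts and link them into run levels.
make_rc_links() {
    root=$1
    mkdir -p "$root/init.d" "$root/rc1.d" "$root/rc3.d" "$root/rc5.d"
    : > "$root/init.d/networking"
    : > "$root/init.d/nfs-common"
    for level in 1 3 5; do
        # The link name (S = start, 13 = position) is the whole "configuration".
        ln -sf ../init.d/networking "$root/rc$level.d/S13networking"
    done
    ln -sf ../init.d/nfs-common "$root/rc3.d/S15nfs-common"
}

# start_order DIR: print the start links sorted by name, which is the
# order in which they would be run.
start_order() {
    ls "$1" | grep '^S' | LC_ALL=C sort
}
```

Calling make_rc_links on an empty directory and then start_order on its rc3.d lists S13networking before S15nfs-common. Moving the network later would mean renaming the link in rc1.d, rc3.d and rc5.d, one by one.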

It’s a rat’s nest, and many people don’t like it, so there have been many ideas for different schemes. But the fact that my Linux machine in 2015 still uses the same system is evidence that none of the replacements was significantly successful.

Until now. An init system called “systemd” is beginning to be implemented in most major varieties of Linux.

And it’s an abomination, worse in many ways than the ancient System V init. All software has bugs, but some software is designed wrong, and systemd is one of those, because the thinking behind it is wrongheaded. A key strength of Unix-style operating systems has always been the loose coupling of functions, encapsulated in the idea that programs should “do one thing, and do it well”.

Systemd tries to do many, many things. From a developer’s perspective, that inevitably makes it big and complex and difficult to maintain. And some of the things it wants to do are actually operating system functions. It’s clear that what the originators of systemd have in mind is an operating system on top of an operating system. Systemd will control users. Systemd will control devices. Systemd will control security.

The thing is, a Linux system already has all of those functions. They are loosely coupled, with software that “does one thing, and does it well”, so that any bug is localized, and easy(-ish) to isolate and fix.

So why has systemd been widely adopted if it’s obviously not fit for purpose? Well, it’s actually one of its worst flaws which has propelled it to success. The monolithic nature and lack of separation mean that you can’t have just a bit of systemd, you have to eat the whole elephant.

The Gnome project, for example, has adopted the “logind” part of systemd to manage the different users logged in, thus making systemd what developers call a “dependency”: you can’t easily have a recent version of the Gnome desktop unless you have systemd installed. (“Wait a minute,” you may say, “Different users logged in? I have a laptop with one user: me.” Well, exactly. The dependency on systemd is to handle a situation that doesn’t apply to the majority of users, but you still have to have it, or else the whole thing won’t work.)

Another project with a dependency on systemd is udev, the Linux process which looks for hardware changes and makes the device available to other software. For example, plugging in a USB hard drive will allow the folders in it to be accessed. Part of that process is handled by udev.

It’s udev which is bending my brain at the moment. I was lying when I talked about the whole elephant; or, rather, being foresighted. The current version of udev only needs one software library from systemd, but the project development has been merged with systemd, and it looks certain that the whole elephant will return, angrier than ever.

My current Linux systems use udev, and thus are contaminated by systemd, even though I don’t use it as the init system, and never will. The basic graphical interface, xorg, depends on udev to tell it about mice and keyboards, which means you can plug in, say, a second mouse and have it work immediately (but how often do you do that?). However, I’ve discovered how to configure xorg to use separate drivers, and that’s working fine.

The other essential thing that isn’t working yet without udev is network devices, ethernet and wifi, which will take more work. And not absolutely essential, but nice, would be the USB drive thing. It would be easy to set it up as a fixed device, but having it appear and disappear will require some programming.

If you found this blog by desperate internet search, wanting to get your Linux system working properly and efficiently, well, I’m only a seeker too. I have no definite answers, but what I do may well incorporate the “mdev” element of busybox, or maybe “eudev” from the Gentoo project. Go search.


One thought on “Something Rotten In The State Of Linux”

  1. It took me a while to get round to doing more work, but I have now confirmed that it is quite possible to have a Linux system which does not run udev. It’s a different matter to have a system without udev *installed*, because of package dependencies. That’s the case with Debian, anyway. As an example, the very useful video player VLC won’t install without a library to support MTP access to mobile devices, and that library won’t install without the udev library. In an ideal world, that would be an optional feature (and in my opinion, a video player has no business accessing devices directly) but it’s hardwired in.

    I wasted some time and effort developing a non-udev “initial ramdisk”. In Debian and many other Linux systems, the kernel is supplied with a ramdisk by the boot loader at boot time. The main purpose of the ramdisk is that when mounted as the root drive, it provides kernel modules, plus scripts or other executables which the kernel may need to get up and running. After that the real root disk is mounted in place of the ramdisk.

    I did get it working perfectly, but realized in a flash of inspiration that I didn’t need it. If the kernel has the drivers compiled in to allow it to boot the real root drive directly (rather than as modules which it has to somehow load off a drive before it can access a drive) then everything else for the boot process comes off the real root drive. The initial ramdisk in Debian only exists so that the kernel can load drivers as modules for different hardware. In reality, only a minority of systems need special drivers, but that’s a common Linux feature: the fringe cases drive the whole design and make it more complicated.

    So, once I had the basic SATA and ext4 drivers built in to the kernel, it could boot without a ramdisk.
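
    Whether a given driver is built in or a module can be read from the kernel's build configuration. The sketch below is one way to check; the function name driver_state is invented, the config path in the example is the usual Debian location (an assumption here), and which symbols matter depends on the hardware.

```shell
#!/bin/sh
# driver_state CONFIG SYMBOL: report how a kernel option was built.
# Prints "builtin" (=y), "module" (=m) or "unset".
driver_state() {
    case $(grep "^$2=" "$1" 2>/dev/null) in
        *=y) echo builtin ;;
        *=m) echo module ;;
        *)   echo unset ;;
    esac
}

# Example (symbols chosen for an AHCI SATA controller and an ext4 root):
#   for sym in CONFIG_SATA_AHCI CONFIG_EXT4_FS; do
#       echo "$sym: $(driver_state "/boot/config-$(uname -r)" "$sym")"
#   done
```

    Anything on the path to the root file system that reports "module" is something a ramdisk-free boot cannot load in time.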

    When I stopped running udev at boot time though, a number of things stopped working, such as the ethernet and wireless. The explanation was that udev had been loading the driver modules at boot time. In fact, it’s an aspect of udev’s design. Although it’s a “hotplug” handler, designed to react to new hardware being plugged in, at boot time it treats the existing devices as though they had just been plugged in (even if they are permanently attached, like a sound chip on the motherboard).

    There was a simple solution. The file /etc/modules contains a list of modules to be loaded at boot time. I simply added the correct module name every time I found something that wasn’t working. Arguably, anything which is part of the permanent hardware could have its driver configured to be part of the kernel rather than a module, but the former approach enabled me to get things working without rebuilding the kernel.
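
    For anyone unfamiliar with the file, /etc/modules is just a list of names, one per line, with "#" comments allowed. A rough sketch of how such a list can be read follows; the function name modules_to_load is invented, and the real boot scripts may well do it differently.

```shell
#!/bin/sh
# modules_to_load FILE: print the module names listed in an
# /etc/modules-style file, one per line, skipping blank lines and
# "#" comments. Anything after the name (module options) is dropped.
modules_to_load() {
    awk '{ sub(/#.*/, ""); if ($1 != "") print $1 }' "$1"
}

# Loading them by hand would then be roughly:
#   for m in $(modules_to_load /etc/modules); do modprobe "$m"; done
```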

    (In fact, I later built a new kernel with all the drivers included, but it didn’t work completely. The wifi chip in the netbook needs a firmware file downloaded to it and while it happens automatically when the module is loaded, it does not when the driver is in the kernel. There is kernel code to load firmware, but as of 4.0.4 it appears not to work.)

    Here’s a tip: when you boot with udev, do “lsmod” to see which modules it has loaded, and save the result in a file. Then, when you don’t run udev, you can compare and you’ll have an idea what is missing. One useful thing that udev does is to assign owner and access to devices. I had to do that in /etc/rc.local instead, for example “chown -R root:audio /dev/snd” then “chmod 776 /dev/snd/*”.
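
    The comparison in that tip can be automated. This is a small sketch, assuming the two "lsmod" outputs have been saved to files; the function names and the file names in the usage note are invented.

```shell
#!/bin/sh
# module_names FILE: extract the module names from saved "lsmod" output
# (first column, skipping the "Module  Size  Used by" header), sorted.
module_names() {
    awk 'NR > 1 { print $1 }' "$1" | LC_ALL=C sort -u
}

# missing_modules WITH_UDEV WITHOUT_UDEV: print the modules present in
# the first lsmod dump but absent from the second.
missing_modules() {
    a=$(mktemp) && b=$(mktemp) || return 1
    module_names "$1" > "$a"
    module_names "$2" > "$b"
    LC_ALL=C comm -23 "$a" "$b"
    rm -f "$a" "$b"
}
```

    Usage would be along the lines of "lsmod > lsmod.udev" on a boot with udev, "lsmod > lsmod.noudev" on a boot without, then "missing_modules lsmod.udev lsmod.noudev". Each name printed is a candidate for /etc/modules.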

    The graphical user interface xorg also has a package dependency on udev. It uses udev to “find” mice and keyboards, using the evdev driver, but you can manually specify the alternate mouse and keyboard drivers as entries in xorg.conf (although, in an inexplicable quirk, you still need to have the evdev driver installed, or nothing will work).
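
    As an illustration of what such entries can look like, the sketch below prints an xorg.conf fragment. It is an assumption about one workable setup, not a copy of the configuration described above: the "kbd" and "mouse" drivers, the identifiers and the device path are all choices made for the example, and a ServerLayout section may also need to reference the two devices.

```shell
#!/bin/sh
# print_xorg_input_conf: emit an xorg.conf fragment that turns off
# automatic (udev-driven) input detection and names fixed drivers instead.
# Identifiers and the device path are illustrative assumptions.
print_xorg_input_conf() {
    cat <<'EOF'
Section "ServerFlags"
    Option "AutoAddDevices" "false"
EndSection

Section "InputDevice"
    Identifier "Keyboard0"
    Driver     "kbd"
EndSection

Section "InputDevice"
    Identifier "Mouse0"
    Driver     "mouse"
    Option     "Device"   "/dev/input/mice"
    Option     "Protocol" "auto"
EndSection
EOF
}
```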

    At this stage, I had a system which booted and ran normally, and all the hardware was working, both built-in and plugged in (the latter just a USB mouse). But I needed the system to react correctly to new devices being plugged in.

    One piece of good news was that the modules I’d added to the list in /etc/modules to get the mouse working also worked for an additional mouse or external keyboard, so I didn’t need to do any work there. But one feature I use heavily is plugging in USB storage. And doing so didn’t cause anything to happen.

    This is where we have to replace what is supposed to be udev’s core functionality, handling “hotplugging”. If you write the name of a program into /proc/sys/kernel/hotplug then the kernel will execute that program every time the hardware changes. You could probably code up something from scratch, but I chose to use “mdev”.
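
    For a sense of what "from scratch" would involve: the kernel runs the named program once per event and describes the event in environment variables such as ACTION, SUBSYSTEM and DEVPATH. A minimal helper that only records events might look like this sketch; the function name and the log path are invented.

```shell
#!/bin/sh
# Sketch of a from-scratch hotplug helper. The kernel passes the event
# in environment variables; this helper does nothing but log it.

# log_event LOGFILE: append one line describing the current event.
log_event() {
    printf '%s %s %s\n' "${ACTION:-?}" "${SUBSYSTEM:-?}" "${DEVPATH:-?}" >> "$1"
}

# A real helper script would end with something like:
#   log_event /var/log/hotplug.log
```

    Everything useful (creating device nodes, setting permissions, running commands) would have to be added on top, which is the work mdev already does.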

    In a flagrant violation of the traditional Unix principle of “do one thing, and do it well” the program busybox does everything. If you link the executable to a name of a utility, then busybox will make an attempt to emulate that utility, in a simple way. For example, if you had executed “ln -s /usr/bin/busybox /usr/local/bin/ls”, then the “ls” command will actually invoke busybox, which, knowing it was called as “ls”, will give a cut-down performance as “ls”.
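
    The trick of behaving differently depending on the invoked name is easy to imitate. The sketch below is a toy, not busybox's actual code: the applet names "hello" and "bye" and the function name applet_for are invented.

```shell
#!/bin/sh
# Sketch of the "multi-call" trick: one program, several names.
# Saved as a script and symlinked as "hello" and "bye", it would behave
# differently depending on the name used to run it.

# applet_for PATH: decide which personality to adopt from the invoked name.
applet_for() {
    case $(basename "$1") in
        hello) echo "Hello, world" ;;
        bye)   echo "Goodbye" ;;
        *)     echo "unknown applet: $(basename "$1")" ;;
    esac
}

# In the saved script the last line would be:  applet_for "$0"
```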

    “mdev” isn’t an existing Linux utility, but linking busybox (the convention is “ln -s /usr/bin/busybox /sbin/mdev”) effectively creates it as a new utility. You get it to handle hotplug events using the command “echo /sbin/mdev > /proc/sys/kernel/hotplug” at boot time, for example in /etc/rc.local. Now mdev (really busybox) will run whenever you plug something in, or unplug it.
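
    Put together as something that could sit in /etc/rc.local, the two commands might be wrapped like this. It is a sketch only: the function name enable_mdev is invented, and the optional arguments exist purely so it can be tried against scratch files instead of the real paths, which need root.

```shell
#!/bin/sh
# enable_mdev [BUSYBOX] [LINK] [HOTPLUG_FILE]
# Defaults are the paths quoted in the text. Creates the mdev link and
# registers it as the kernel's hotplug helper.
enable_mdev() {
    busybox=${1:-/usr/bin/busybox}
    link=${2:-/sbin/mdev}
    hotplug=${3:-/proc/sys/kernel/hotplug}
    [ -x "$busybox" ] || { echo "no busybox at $busybox" >&2; return 1; }
    ln -sf "$busybox" "$link" || return 1
    echo "$link" > "$hotplug"
}
```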

    mdev has a configuration file at /etc/mdev.conf which tells it what to do for different events. For my purposes, the important part was the last argument, a command to execute when device “sd??” was created. That’s an external hard drive or USB stick. I had already written a similar program for udev to execute, one which automatically mounted the new drive, with a sensible name. For example, I have two 16GB USB devices, one red and one grey, with the same brand, but they identify themselves differently. That enabled me to have them automount as /media/usb/Red_16Gb and /media/usb/Grey_16Gb. I liked that.

    Well, it only took me about half an hour to adapt the script from udev to mdev. Like udev, mdev fills in environment variables with information relating to the event. There’s a little less direct info, but instead there’s a link into the /sys filestructure (complex and messy) where you can grope for relevant data.
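
    To give an idea of the groping involved, here is a sketch of one piece of such a script: deriving a mount-point name from what the drive reports in /sys. It is not the script described above. The function name mount_name, the script path in the comment and the choice of the vendor and model files are all assumptions for the example.

```shell
#!/bin/sh
# Helper for an automount script that mdev might call. mdev exports MDEV
# (device name, e.g. sdb1) and ACTION (add or remove). A matching line in
# /etc/mdev.conf could look like this (script path is hypothetical):
#   sd[a-z][0-9]* root:disk 660 */usr/local/sbin/usb-automount

# mount_name SYSROOT DEV: build a name from the drive's vendor and model
# strings in sysfs, falling back to the device name if they are missing.
mount_name() {
    disk=$(printf '%s' "$2" | sed 's/[0-9]*$//')    # sdb1 -> sdb
    dir="$1/block/$disk/device"
    name=$(cat "$dir/vendor" "$dir/model" 2>/dev/null \
        | tr -s ' \n' '__' | sed 's/^_*//; s/_*$//')
    echo "${name:-$2}"
}

# In the real script, roughly:
#   n=$(mount_name /sys "$MDEV")
#   [ "$ACTION" = add ] && mkdir -p "/media/usb/$n" && mount "/dev/$MDEV" "/media/usb/$n"
```

    A lookup table from these strings to friendlier names would be one way to get results like Red_16Gb.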

    But that was just one case, one line in /etc/mdev.conf. Apart from input devices (they use USB and HID drivers) the only other one I had lying around was a digital television stick. I plugged it in and nothing happened. However, when I used the modprobe command to load its drivers, it did come to life, which actually involved downloading firmware into it. I don’t even know if it was the kernel or mdev which did the downloading, but I don’t really care. All I needed to do was get modprobe executed for the relevant module when the stick is plugged in.

    I don’t know why, but for the TV stick, mdev has the USB identity in an environment variable, which it didn’t for the storage devices. It meant I could put a rule in /etc/mdev.conf that triggered on “$PRODUCT=2040/7060/100” since the lsusb command showed the same numbers, albeit as 2040:7060 (I don’t know what the third term means). All the rule had to do was call a script to run “modprobe dvb-usb-dib0700”. A tiny bit of scripting.
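
    The tiny bit of scripting could be organized as a lookup, so that later hardware only needs one more line. This is a sketch under assumptions: the function name module_for_product and the script path in the comment are invented, and the match ignores the third term of PRODUCT.

```shell
#!/bin/sh
# module_for_product PRODUCT: map the USB identity string found in the
# PRODUCT variable to the driver module to load. Returns non-zero for
# devices it does not know.
module_for_product() {
    case $1 in
        2040/7060/*) echo dvb-usb-dib0700 ;;   # the TV stick from the text
        *)           return 1 ;;
    esac
}

# Body of a hypothetical /usr/local/sbin/load-usb-driver called by mdev:
#   [ "$ACTION" = add ] && mod=$(module_for_product "$PRODUCT") && modprobe "$mod"
```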

    But that’s how it will be. Every time I need to support some new bit of hardware, I’ll have to write a bit of code to load the drivers. For my next project, my everyday working PC, I’ll need to take care of the USB scanner and the interchangeable caddy (DVD or SATA drive).

    That’s where udev has an advantage. It comes with about 40 files with hundreds of ready-written “rules” to recognise events and do stuff. I’ll have to do the work myself. But I think it’s worth it.
