Category Archives: Hackery

Eggs

If we’re to take the title of Kevin Forbes’s Simulated Comic Product at its word, we can only conclude that, on a day-to-day basis, it is at least as tasty as the real thing.

His recent Easter-themed comic, Eggs, is in a league of its own, though. In three short panels it tells the story of two quick-thinking children who narrowly thwart the escape of a genetic freak. Or of two ruthless brats who unhesitatingly betray a gentle victim of science back to his creator-tormentors. Take your pick. Forbes doesn’t shove an interpretation down your throat, and that’s what elevates this particular strip to the level of genius.

vesafb

During one of my recent visits to Chez Dirk, I got to see Linux’s VESA framebuffer console, something I’d previously encountered only in the context of the Gentoo installer, running on a regular system, and I found myself liking what I saw. The higher resolution is gentle on the eyes, and having a viewport of more than 80 columns by 25 lines makes the text-mode experience a lot easier to stomach.

It also happens that I’m in the last throes of yielding to the inevitable, and replacing my 19″ CRT with a 20″ flat-panel: the thought of seeing the VGA text mode, at 640×480, interpolated up to the 1600×1200 native to the monitor is enough to make my eyes hurt in anticipation.

So I spent some time last weekend trying to get vesafb working on one of my boxen as a proof-of-concept. In the end, I succeeded, but… Jeebus. The process highlighted everything that can be other-than-fun about the Linux experience. While it was not quite enough to propel me into the rarefied stratosphere of bitterness currently home to Jamie (“Fuck the skull of ALSA“) Zawinski, it was definitely irksome. Pieces of canonical reference material on the subject were at least one major stable kernel version out of date; others seemed flat-out at odds with reality. There was a bit there where I found myself muttering “VESA: it’s everywhere you can’t quite get to.” I’m better now.

In the deluded but psychologically vital belief that this will somehow help the next poor schmuck down the pike, here’s what I learned in the process of getting vesafb working on a system running Linux kernel 2.6:

  1. In order to even be presented the VESA VGA framebuffer option, you’ve got to build framebuffer support into the kernel. Building it as a module won’t do: vesafb is a builtin-only option, so if its parent is modularized, it will be silently hidden from you. Navigate your way into Device drivers -> Graphics support, in the kernel configurator, and enable (with ‘y’, not ‘m’) “Support for frame buffer devices” and “VESA VGA graphics support”. Then dig down another level, into Console display driver support, and enable both “Video mode selection support” and “Framebuffer Console support”. (Again, modules are an inadvisable selection here, given that all of this stuff runs near the very beginning of the boot process, before even the loading of any hypothetical initrd.)
  2. Several of the docs indicate that once you’ve built the framebuffer into your kernel, you can add the option vga=ask to your kernel invocation in LILO or GRUB, reboot, and then enter the number of a graphical mode at the prompt. As far as I can tell, this is a big stinkin’ lie. Let me emphasize that. Big! And stinky! While it’s true that you can specify an assortment of text modes at that prompt, anything that would require a switch to framebuffer mode gets rejected out of hand as invalid. I’m not sure why that is. Casual perusal of vesafb.c suggests that the switch from text to graphics mode must happen while the processor’s still in real mode; presumably by the time the kernel is prompting interactively, we’re already in protected mode, and it’s too late.
  3. You will find, in the Framebuffer HOWTO and various other bits of documentation, a table listing the allegedly-canonical VESA mode numbers. This table, based on the VESA BIOS Extensions 2.0 specification, may or may not correspond to your hardware’s capabilities, especially if the latter’s compatible with VBE 3.0. The best way to find out what your card can actually do is to boot into GRUB, call up a command prompt, and invoke the vbeprobe command, which will spit out a list of mode numbers, along with their associated resolutions and color depths. (If you’re not running GRUB, I’m very sorry for you. Consider upgrading at your earliest convenience. Seriously.)
  4. Armed with a mode number emitted by vbeprobe, and a little bit of extra arcana from the Framebuffer HOWTO and its ilk, you are now almost ready to get down to business. Add 200 hexadecimal to your desired mode number, convert the result to decimal, and use that value as the parameter to the vga= argument your bootloader passes to the kernel. (You might be able to get away with skipping the hex-to-decimal conversion, but this varies by bootloader flavor and version; if you’re not in a violently-disinclined-to-fuck-around frame of mind by this point in the process, you haven’t been playing long or hard enough.)

    Here’s a table of values that have worked for me. You are strongly encouraged to determine your own. No guarantee is expressed or implied, not responsible for items lost or stolen, etcetera, etcetera. That having been said, the first of these should be safe enough, since it’s what the Gentoo installer defaults to.

    Decimal value   Mode
    791             1024×768, 16 bpp
    792             1024×768, 32 bpp
    794             1280×1024, 16 bpp
    838             1600×1200, 16 bpp

With all this done, you should be able to boot into a high-resolution framebuffer console.
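The arithmetic in step 4 is easy to get wrong by hand, so here is a minimal sketch of it in Python. The function name and the constant are made up for illustration, and the four mode numbers are simply the table's decimal values with 200 hexadecimal subtracted back out; what vbeprobe reports on your hardware may well differ.

```python
# Hypothetical helper: turn a VBE mode number into a vga= value.
VESAFB_OFFSET = 0x200  # the "200 hexadecimal" from step 4


def vga_parameter(vbe_mode: int) -> int:
    """Return the value to pass as vga= for a given VBE mode number."""
    return vbe_mode + VESAFB_OFFSET


# Mode numbers back-derived from the table above (assumption: your
# card reports the same ones; check with vbeprobe).
for mode in (0x117, 0x118, 0x11A, 0x146):
    # Python prints integers in decimal by default, which takes care
    # of the hex-to-decimal conversion.
    print(f"VBE mode 0x{mode:03X} -> vga={vga_parameter(mode)}")
```

Run as-is, this should reproduce the four decimal values in the table.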

This isn’t needed for the proof-of-concept boot, but you will likely want to pass a few extra options to the vesafb driver once you have things up and running. vesafb tends to be CPU-intensive, since it involves a lot of unassisted memory movement. Compiling support for Memory Type Range Registers (MTRRs) into your kernel, and telling vesafb to use that support, will help mitigate the hit. Back in the kernel configurator, under Processor type and features, make sure that “MTRR (Memory Type Range Register) support” is enabled. Then add video=vesafb:mtrr to your set of bootloader kernel arguments. Actually, depending upon your particular card’s degree of brain damage, you will probably also want to request one of the video-memory-based scrolling modes, either ypan or ywrap. See Documentation/fb/vesafb.txt for details.

To re-summarize the required kernel options, here they are in their raw form, the one native to .config (or /proc/config.gz, if you’re reading them out of a suitably-configured live kernel):

CONFIG_MTRR=y
CONFIG_FB=y
CONFIG_FB_VESA=y
CONFIG_VIDEO_SELECT=y
CONFIG_FRAMEBUFFER_CONSOLE=y
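
Since a module ('m') quietly fails to do the job here, it can be worth checking a .config mechanically before rebuilding. What follows is only a sketch: the function name is invented, the sample text is fabricated for the demonstration, and it assumes the usual NAME=value layout of a kernel .config.

```python
# The five options listed above, all of which need to be built in.
REQUIRED = (
    "CONFIG_MTRR",
    "CONFIG_FB",
    "CONFIG_FB_VESA",
    "CONFIG_VIDEO_SELECT",
    "CONFIG_FRAMEBUFFER_CONSOLE",
)


def missing_builtin_options(config_text: str, required=REQUIRED) -> list:
    """Return the required options that are not set to 'y' in config_text."""
    builtin = set()
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and "# CONFIG_X is not set" comments
        name, _, value = line.partition("=")
        if value == "y":
            builtin.add(name)
    return [name for name in required if name not in builtin]


# Made-up sample: FB built as a module, FB_VESA consequently absent.
sample = """\
CONFIG_MTRR=y
CONFIG_FB=m
# CONFIG_FB_VESA is not set
CONFIG_VIDEO_SELECT=y
CONFIG_FRAMEBUFFER_CONSOLE=y
"""
print(missing_builtin_options(sample))
```

To run it against a real tree, pass in the contents of the file instead of the sample, for instance open(".config").read() from the top of the kernel source directory.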

Enjoy. If the preceding text either helps you, or conspicuously fails to help you, in getting vesafb mode working on your system, I’d be interested in hearing about it.

Tick Tock, Tick Tock, Clarice

One of the incidental consequences of having the Squeezebox 2 — a recent acquisition which I’ll have to describe in detail another time — in the bedroom is a sudden blinding awareness of the craptacular nature of PC clock hardware.

I should explain.

The Squeezebox 2 is a very clever device intended to act as an interface between your stereo and your massive digitized music collection. Connect it to your receiver via RCA cables, connect it to your computer — running the server software — via an Ethernet cable, and you’re off to the races. It sports a dimmable, eminently-legible vacuum-fluorescent display, comes with a thoughtfully-designed remote, and has no moving parts of its own, making it utterly silent and thus something you can put on your nightstand without fearing that it will trouble your dreams.

Part of its cleverness lies in its reliance upon the power and versatility of general-purpose hardware, running easily-modified software — the thing on the far end of the Ethernet cable, in other words — to do the heavy lifting. It has just enough intelligence of its own to ask someone smarter for help.

It never really shuts off, either, unless you pull the plug, which allows it to do the gratifying trick of starting up instantly when called upon. Instead it goes into, at best, a light doze when you hit the Power button on the remote. It can, depending upon your preference, be configured to do any number of things while snoozing, from playing convincingly dead to displaying RSS feeds to showing a clock.

This brings us back to the approximate neighborhood of the original point. In keeping with its “let the server do the work” philosophy, the Squeezebox 2 doesn’t actually keep time on its own, being wholly dependent instead upon the server’s clock. It so happens that I already have a clock on the bedroom wall — an Oregon Scientific unit that tells the temperature and synchronizes itself to WWVB nightly — and thus it is that I conspired to present myself with inescapable evidence of just how awful a job the PC does of keeping time without help.

When I first plugged in the Squeezebox 2, I noticed that its clock was eight or so hours off. “Right”, thought I, realizing that the server’s clock had never been properly set, and proceeded to build and run ntpdate pool.ntp.org so as to prime the hourglass. The Squeezebox 2’s clock and the wall clock were in perfect sync, and I was happy — except that next morning I woke up and noticed that the Squeezebox 2 had gained about six seconds on the wall clock. By evening that amount had reached ten seconds.

Ten seconds. In a day!

Fine. Time to go whole hog and set up ntpd to run long-term so as to determine and eventually correct for clock drift. Only it turned out that ntpd never seemed to get around to calculating a reasonable drift value, let alone correcting the clock. Eventually, after recompiling the ntp package with the debug flag set — easy, thanks to Gentoo and Portage — and quite a bit of confused poking around, I figured out what the problem was. My ntp.conf contained the notrust directive as a parameter to the default configuration, meaning that the daemon rejected each reply it was busily soliciting for lack of cryptographic authentication. Brilliant. (To be fair, Portage tried to warn me about this at build time. Alas, that warning went the way of all Portage warnings, off the far end of stdout. That’s something else I need to correct, and soon.)
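
For a sense of scale, the ten-seconds-a-day figure can be converted into parts per million, which, as far as I know, is the unit ntpd uses for its drift value. This is back-of-the-envelope arithmetic rather than anything ntpd itself computes, and the function name is invented for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def drift_ppm(seconds_gained: float, elapsed_seconds: float) -> float:
    """Express a clock error accumulated over an interval as parts per million."""
    return seconds_gained / elapsed_seconds * 1_000_000


# Ten seconds gained over one day, as observed against the wall clock.
print(round(drift_ppm(10, SECONDS_PER_DAY), 1))
```

That comes out to a little under 116 ppm.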

With that fixed, ntpd seems to have finally figured out what time it is. At some point I’ll go back and figure out if there’s a way for it to communicate securely with the servers in pool.ntp.org, at least. (Still and all, I have to admit that “spoofed time” is not foremost on my list of personal anxieties.)

Morals of this story:

  1. PC hardware sucks on an intrinsic design level, irrespective of manufacturer or vendor. (“Sure, it sucks, but it’s an industry-standard sucking!”) I have cheap plastic made-in-China quartz clocks that do a better job of keeping time.
  2. pool.ntp.org is your friend. It’s really nice to be able to specify a server name in your ntp.conf without having to feel like a leech unless you approach the server administrator cap in hand.
  3. ntpd is your friend. Sort of. It does the job admirably once it’s configured, but good Lord, could its error reporting use some work. Anything that can spend several hours busily querying the world at large for the time of day, only to consign every answer it receives to the bit bucket for lack of a cryptographic signature, without making so much as a peep in either the system log or its own, has… self-expression issues. Perhaps this is controllable via an ntp.conf configuration directive or ntpd command-line argument, but if so, I haven’t found the trick of it. Which brings me to my next point:
  4. The NTP documentation is not your friend. All honor and respect to David Mills for his perseverance in creating and maintaining a system that does a more-than-decent job of synchronizing clocks over a variably-laggy packet-switched network, but someone needs to sit the man down and explain the basics of good manual design to him.

    One: scrap, for God’s sake, the Tcl/Tk-inspired color scheme. Two: ditch the Pogo cartoons. Cuteness is tolerable when your documentation is otherwise up to snuff. When this is not the case, cuteness is nothing but an aggravating reminder to your reader that your attention was elsewhere than where it should have been. Three: focus on making the package as a whole simple, straightforward, and accessible. For openers:

    • Make sure it’s sanely searchable. That means one or more of the following: indexing the site with your own search engine; paying the few bucks a year to buy your own domain so that searches using Google’s “site:” keyword work properly; just flattening the entire documentation down into a single page so that I can use my browser’s built-in search facility to scan the whole thing. Under no circumstances should I have to open the five different subsections of the ntpd manual, each on a separate page, in order to track down the description of the configuration directive I’m looking for. (Of course, if you just had something as radically newfangled as an index for configuration directives, or even just a coherent organization for said directives, maybe I wouldn’t have to rely upon searching so much.)
    • Try to make it reasonably portable. HTML is nice and even justifiable if you’ve got a lot of cross-references, but if you’re going to bother providing a man page at all, make sure that the meaty bits haven’t been badly truncated by your HTML-to-man conversion. (This may in a sense be the distribution’s fault rather than yours: maybe you didn’t provide a man page at all, and they just did the best they could with what they had. This doesn’t really get you off the hook, though. Failure to provide man pages is itself a capital crime, or should be. The fact that you’re not alone in committing it — the GNOME folks come to mind, as does the GNU project’s dedication to info pages — does nothing to exonerate you either.)

Thank you, Lord Almighty

And I thank you, Lord Almighty up above,
Just for sending down the ‘F’ train to me.

— Mike Doughty,
“Thank you, Lord, for sending me the F train”

Sketches from the notebook of a man walking around in a daze, trying to come to terms with just how hosed he is in terms of the data he has suddenly lost access to:

I left reiserfsck running overnight, having disabled DMA for the wounded drive. (Since reiserfsck failed with lots of ominous-sounding warnings about DMA timeouts and lost interrupts, it seemed worth a shot. Of course, it slowed the process by about a factor of ten — hence the “overnight” part.) In the morning, no joy.

I had to run some errands this afternoon, partly in preparation for John and Jody’s wedding later this week. While I was out and about, I picked up a low-end router, a D-Link DI-604, so that I could restore some reasonable level of network connectivity to the apartment. (I’d have gotten a DGL-4100, but no one had them in stock. It’s probably just as well.)

Walking through the software section at Fry’s, and glancing at the various data-recovery programs on display there, reminded me of the existence of a tool I hadn’t thought about in a while, and had never had occasion to use before: Steve Gibson’s SpinRite. I resolved to look at it more closely once I got home and had re-established web access.

When I got home, I plugged in the little D-Link router, and realized that I should have acquired it, or something like it, a long time ago. Having your own highly-tweakable Linux-based router is nice, but having a foolproof, solid-state box as a backup makes for amazing peace of mind. I will go back to the Linux-based approach in short order, of course, but it will be nice to know that the D-Link is waiting in the wings, ready to pinch-hit the next time I find myself having to juggle hardware. Being able to browse the web for tools and tips did wonders for my peace of mind, to say nothing of having the phone working again.

The first thing I did was scrutinize SpinRite, which looked sufficiently promising that I decided to try it. (Possible salvation for $89 a pop? I’ll take that action!)

Before loosing SpinRite upon the drive, I thought I’d let reiserfsck --rebuild-tree have one last crack at it. In the morning, after the unsuccessful overnight reiserfsck attempt, I had noticed that the drive was a bit dusty, and that some of the dust appeared to be lodged under the integrated-controller PCB. Having a screwdriver nearby, and too much time on my hands, I unscrewed the PCB and blew it clean with a few blasts from a can of compressed air before reassembling the whole thing.

Desperate. Pathetic. Everyone knows that that sort of blind, ritualistic hardware voodoo never does any good.

Except when it does. Except when it does. Because this time around, reiserfsck --rebuild-tree plowed right past the point where it had stopped dead during the five previous recovery attempts, and left me with a successfully-rebuilt and mountable filesystem. I have no idea what convinced the drive to venture back from the sunless lands — for all I know, it was knocking it about eight inches to the carpeted floor when I bumped the box I’d rested it on — and I’m not inclined to care. I’m just grateful for the undeserved third chance I’ve been given. (Yes, tar is chugging away as I write this. I may be stupid, but I’m not criminally stupid.)

All the files I really care about have already been backed up to another disk; I’ll be burning them to optical media in the morning, just to be safe. At this point, I’m down to saving the data I could afford to lose, but would rather not have to recreate. Having paid for SpinRite, I find myself not needing it at the moment. I am utterly unconcerned. I’ll probably unleash it upon the old disk once the backups are complete, just to see what it finds and reports.

I am insanely lucky. My father is fond of saying that it is better to be lucky than good, but I will strive not to push my luck quite so aggressively in the future. Next up: an actual backup strategy.

Important Safety Tip

If you are trying to get Gentoo Linux running on a machine with a SATA hard disk, you will either need to pass nolvm2 to the kernel at CD-boot time, or issue the command dmsetup remove_all after you’ve booted.

Otherwise, mount will frustratingly claim that the SATA-disk partitions are busy when you try to mount them after having created them and their resident filesystems with cfdisk and mkfs.

Given current hardware trends, the percentage of first-time users attempting to set up Gentoo on a SATA disk would seem likely to vastly outstrip the percentage of those needing LVM2 functionality out-of-the-box; given that, the decision to favor the former at the expense of the latter seems… ill-considered at best. But then, I wasn’t consulted.

How Long? Not Long.

“…’cause what you reap is what you sow.”

Well.

It turns out that the problem might have been with avestriel’s disk after all.

That seems the obvious conclusion, anyway, given that the thing died an ugly death this morning. I took it offline to check up on a filesystem inconsistency, and was informed that the inconsistency could not be resolved without using --rebuild-tree. “No problem,” thought I. “I’ve done it once before.”

Well, yeah, except that last time I hadn’t been treated to controller-level DMA timeout errors when I was roughly a third of the way through the process. “Ooookay,” I thought, trying not to panic, “maybe it’s the controller. Or the cable. Or the power supply.” Easily tested: yank the drive, plug it into a different machine using a different cable. Hope, hope, hope. Nope. Same result.

Bugger.

The drive is currently sitting atop the case of the second machine, powered off. I’m hoping that being allowed to cool off for a bit will somehow make it happy. (It should give you some idea of just how desperate I am at this point, that I’d be willing to clutch a straw so thin.)

If that fails, I will have to do something I’ve never done before, and solicit the services of a professional data-recovery firm. There’s an assload of mail on that disk that I’d rather not lose. (Recommendations as to reputable firms in this area would be gratefully accepted.)

Once that’s arranged, I will have to hire the services of a different sort of professional firm to kick my own ass to the degree it deserves. I don’t think I’m up to the task myself. I mean, the drive all but sent me an engraved suicide pre-announcement. I have no excuse for not having made backups by now. None. Yet here I am.

Memo to self: the next time smartmontools so much as sneezes, buy a new drive and toss the old one. The headache spared will more than offset the money spent. And start making regular backups, for frigsake. Idiot.

Puzzling Evidence

avestriel crashed again on Sunday and Monday nights — both times around 2:00 in the morning. Strange. It’s been fine since, having made it through Tuesday and Wednesday nights uneventfully. Stranger.

I left the office window wide open on the crash nights, and it got pretty cold in there. It seems unlikely that that could be the cause, but stranger things have happened. (I also had a chance to open up a power supply that’s the twin of avestriel’s yesterday, and was properly horrified at the sight of the great big amoebic blobs of solder that lurked beneath the cover plates. I’d like to believe that something that ugly is the cause of my problems. Of course, if it is, I have to ask the attendant question: is it worth spending any additional money buying replacement parts for a machine that’s that old?)

No, I still haven’t backed it up, and I still haven’t brought its replacement up. Too much other stuff to do.

Another Satisfied Customer!

I spent a couple of hours this afternoon walking my new brother-in-law through the diagnosis and repair of the damage done to his computer by a freak moving accident that occurred yesterday. Somehow, relocating the thing between outlets killed its power supply. Don’t ask me to explain how that’s supposed to work, but at least the power supply, an Enermax, did its job by dying without taking any of the more delicate components with it.

The recovery process involved recommending a good power-supply tester, and a good power supply. (He decided to get both at once, figuring that if the problem was his old power supply, he’d save himself a second trip, while if it wasn’t, he could always return the new power supply later.)

Fortunately, it was. The new Antec NeoPower 480 he bought seems to fit the bill perfectly, his machine is once more chugging along, and my sister doesn’t have to cede back the box he handed down to her so that he can play World of Warcraft.

I think this makes the third weekend in a row that I’ve been on the phone doing technical support for members of my extended family. On the one hand, it’s satisfying to help folks get their stuff working again; on the other, I can’t help but feel the occasional temptation to fake a learning disability.

(Then again, it marks one of my few positive accomplishments this weekend, so perhaps I shouldn’t knock it.)

udev, how do I love thee?

When the final trumpet sounds, and the Linux kernel is called to account, the ledger eternal will bear witness to its numerous crimes and misdeeds — yeah, autofs, I’m looking at you — but udev, the 2.6 device-management infrastructure, will emphatically not be among these. No, it will go in the other column, the one that makes the case for redemption and eternal life.

udev is device management done right, elegantly and with just the right degree of abstraction: it handles USB dongles and PCI cards with equal aplomb, yet its configuration syntax is not so abstruse as to defy mortal aspirations. Quite the contrary, in fact. I was able to go from square one to a working configuration in about half an hour, with a little help.

Using udev, my USB devices can finally have predictable names, regardless of the order in which I plug them in. This in turn means that my fstab entries regarding removable devices and their respective filesystems can finally be something more than a laughable exercise in starry-eyed optimism.

Perhaps I will butt up against, and curse, udev’s limitations in short order, but for the time being, it’s a pure delight to use.

Gentoo 2005.0

Gentoo Linux 2005.0 has been released. Of course, the beauty of Gentoo is that nearly everyone who’s already using it couldn’t care less — they’re kept current by the magic of Portage.

Still, the Gentoo install CD doubles as an amazingly versatile rescue disc, not to mention makes a dandy temporary-terminal boot disc, for those times when you don’t want the full glory (and overhead) of Knoppix.

If Gentoo and Knoppix don’t constitute Exhibits A and B as to why P2P applications like BitTorrent are not merely legitimate, but insanely useful, I don’t know what does.