| Friday, 1 July 2005 | Echunga | Images for 1 July 2005 | 
More work on code analysis tools today, including setting up a web page describing my experiences. Sadly, the tools aren't too good, but I started looking at a couple which promise to be better; probably they'll end up being incomplete.
In the evening to a dinner hosted by Ian Gilfillan and colleagues of the South Australian Democrats at St Paul's Retreat, a monastery at the end of the freeway, and thus not too far from us. A pleasant evening, more intimate than last time.
| Saturday, 2 July 2005 | Echunga | |
Quiet day today. First thing in the morning installed Fedora Core 4 on the old sat-gw machine, which wasn't easy. The first time it failed with some disk error, the second time it discovered, long after it had started the installation, that it didn't have enough disk space. At least the third time succeeded. Interestingly, it didn't have any trouble with the DVD+RW that I used to install: it's slightly scratched, and the Digitrex seems to have difficulty writing to it.
Spent the rest of the morning brewing, for a change doing only a single brew and bottling a previous brew at the same time. That seems easier; I'll try it for a while and see how it works out.
In the afternoon had intended to do more work on mplayer, but apart from some playing around to understand what it's doing, didn't get very far. I need to make a list of all the things that need to be done.
Di Saunders along for dinner in the evening. Haven't seen much of her lately.
| Sunday, 3 July 2005 | Echunga | |
Had meant to finish off AUUGN today—the big thing was to get the bootstrap checksums right—but I wasn't able to work out how to do that. The documentation at my disposal went into all sorts of detail about things that I know, but I couldn't work out how the checksums are calculated.
Instead turned my attention to mplayer, and worked out a “to do” list, then started work on it. There's a surprising amount to be done, and even more surprisingly, it wasn't difficult. The change to get mplayer to detect the aspect ratio was a single line:
--- libvo/x11_common.c  2005/07/03 04:41:01     1.2
+++ libvo/x11_common.c  2005/07/03 22:36:46
@@ -433,7 +433,7 @@
       vo_screenwidth = modeline.hdisplay;
     if (!vo_screenheight)
       vo_screenheight = modeline.vdisplay;
+    monitor_aspect = ((float) vo_screenwidth) / ((float) vo_screenheight);
     }
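For what it's worth, the calculation in that added line is easy to sketch outside mplayer. The Python below is only an illustration: the function name and the sample resolutions are assumptions, not anything taken from the mplayer source, and it assumes square pixels.

```python
from fractions import Fraction


def monitor_aspect(screen_width: int, screen_height: int) -> float:
    """Return width / height as a float, mirroring the C expression
    ((float) vo_screenwidth) / ((float) vo_screenheight)."""
    if screen_height <= 0:
        raise ValueError("screen height must be positive")
    return screen_width / screen_height


# Hypothetical mode sizes, chosen only for illustration.
for width, height in [(1024, 768), (1280, 720), (1920, 1200)]:
    ratio = Fraction(width, height)  # reduced form, e.g. 4:3
    print(f"{width}x{height}: {monitor_aspect(width, height):.4f} "
          f"({ratio.numerator}:{ratio.denominator})")
```

The reduced fraction is printed only to make the result easier to recognize; the C code needs nothing more than the float.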
       
| Monday, 4 July 2005 | Echunga | |
Meeting day again today, so got nothing done. It didn't help that I was very tired all day.
| Tuesday, 5 July 2005 | Echunga | |
More evaluation of code analysis software today, and spent some time looking at CScout, which looks good. It's still under development, and there are a few unevennesses, unfortunately including the display of the call graphs. CScout uses a web browser to display its results, but what I see is:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd" [
I wish I understood enough XML to know what's wrong here.
| Wednesday, 6 July 2005 | Echunga | |
I had intended to give up on code analysis tools today and just continue with my code rearrangement, but somehow that didn't work well. I really need some better understanding of the code, and it would seem that the current crop don't (quite) cut it. Spent some more time looking at the CScout issue, and determined that at least one of the problems was an incorrect MIME type, but unfortunately the program is in binary only, so I can't do much about it. Also heard from Guo-Rong Koh that Source Navigator should be able to display caller graphs, but my attempts to get it to do so resulted in an inordinate use of CPU time and nothing much else. So far it almost looks as if cscope is the way to go.
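The entry doesn't record which MIME type CScout was sending, so the following is only a sketch of the general check, in Python. SVG is registered as image/svg+xml, and a browser that receives it under some other type may show the source instead of the drawing. The helper name and the file name are made up for illustration.

```python
import mimetypes

SVG_MIME_TYPE = "image/svg+xml"  # the registered media type for SVG


def is_served_as_svg(content_type_header: str) -> bool:
    """True if a Content-Type header value names the SVG media type.

    Parameters such as '; charset=utf-8' are ignored.
    """
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type == SVG_MIME_TYPE


# What Python's own table suggests for a file name ending in .svg
# (the file name is hypothetical).
guessed_type, _encoding = mimetypes.guess_type("callgraph.svg")
print(guessed_type)
print(is_served_as_svg("image/svg+xml; charset=utf-8"))
print(is_served_as_svg("text/plain"))
```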
Recently I've been making fun of messages like this, which you frequently see at the end of Microsoft-domain mail messages:
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.323 / Virus Database: 267.8.1/28 - Release Date: 24/06/2005
I started by adding to my .sig:
The virus contained in this message was not detected.
That caused a surprising amount of confusion, but after a while it got boring, and I currently have:
The virus contained in this message was detected by LEMIS anti-virus.
That, too, has caused confusion, notably from somebody (admittedly not very bright) who equated it with the PGP signature that I add to my mail messages. I got the following reply. And yes, the text is original, including the breakage.
Of course there's no file.bin in the message. I don't know whether he imagined that, or whether his MUA (Yahoo! web mail) invented it. Still, I know I'm hitting home when I find, in somebody else's message,
- ------- Greg's virus checker didn't find this one either.
| Thursday, 7 July 2005 | Echunga | |
Finally gave up on the work I've been doing on code analysis tools over the last couple of weeks and started working my way manually through the code I have to restructure. The job was made no easier by the fact that we have a coding standard that requires that comments about externally visible functions must be in the header file, and which prohibits any comments in the source (.c) file. I've been trying to get this changed since I've been with the company, but it's not going to happen. What a pain! Yes, it's easy enough to open another window into the header file and search for the documentation for each function—but why? And of course, that's two windows per file. It's difficult enough working through multiple files in the first place without doubling the number you need to examine. Very frustrated.
| Friday, 8 July 2005 | Echunga | |
More work on API design. This must be some of the most painful stuff I've ever done. It almost came as a relief when my keyboard started beeping whenever I pressed any key. Investigation showed that it wasn't the keyboard, and I suspected X, but of course I couldn't just change to a different virtual terminal, so I had to connect remotely and stop X (which had been running for a little over three months at the time).
That confirmed that it was X: the keyboard worked perfectly. Unfortunately, due to a bug in the version of X I'm running, I couldn't restart X: it couldn't initialize one of the display cards, so I had to reboot. Looks like it's time to upgrade the system, which is over a year old and is giving me continuous problems when trying to build ports.
In the evening, took another crack at compiling VLC. It's no easier than last time. As then, ran into problems building this sillily named autom4te program:
=== root@teevee (/dev/ttyp6) /usr/ports/multimedia/vlc 394 -> pd /usr/ports/multimedia/libmpeg2/
=== root@teevee (/dev/ttyp6) /usr/ports/multimedia/libmpeg2 395 -> l
I had done a make clean before the build; obviously it didn't clean all the dependency subdirectories. In addition, ports age far too quickly:
Finally got it all compiled and installed, and ran into exactly the same problems I had in February:
libdvdnav: Language 'en' not found, using 'ÿÿ' instead
libdvdnav: Menu Languages available: ÿÿ
*** libdvdread: CHECK_VALUE failed in nav_read.c:351 ***
*** for dsi->dsi_gi.zero1 == 0 ***
libdvdnav: Language 'en' not found, using 'ÿÿ' instead
libdvdnav: Menu Languages available: ÿÿ
libdvdnav: Language 'en' not found, using 'ÿÿ' instead
libdvdnav: Menu Languages available: ÿÿ
X Error of failed request: BadShmSeg (invalid shared segment parameter)
Major opcode of failed request: 145 (MIT-SHM)
Minor opcode of failed request: 2 (X_ShmDetach)
Spent some time looking for that. Google confirms that I'm not the only person to see it, but it seems that nobody has posted a solution. Sent a message to the FreeBSD multimedia mailing list and gave up for the day.
The telemarketeers are at it again. In the afternoon, received a call purporting to be from Primus and offering me significant discounts on my Telstra phone line (never mind that the line was with Call Australia). The first conversation, on my office number, went something like this:
Woman: Hello, can I speak to the telephone owner, please? I am calling from Primus. We are offering discounts to Telstra customers.
Me: What is your name?
Woman: My name is .. ah .. Michael.
Me: How do you spell that?
Woman: ah ... click
The next time was some hours later, on the private number. It sounded like the same woman, and the name sounded similar but unintelligible (Micha?). She claimed that this was the first time she had called, though I think the concept of anybody having two voice lines confused her. Certainly she was confused when I said that the line was with Call Australia, but she still knew that Primus was cheaper. It's not; I did quite a comparison last November and found:
Rates:          rental  local   mobile  capped  flagfall  national  capped      .US     .DE     .GB
                        call            mobile                      national
Telstra         26.95
NewTel          26.95   .16     .33     none    .25     .22/.16       1.99      .17/2   .27/2.5  .16/2
Primus A        27.4    .19     .33     2       .35     .242/0.099              .176/2.5 .286/2.5 .176/2.5
Primus B        31.09   .165                    .35
Primus C        29.95   .165                    .35
Primus D        28.45   .185                    .35
Primus 1        55      0                       .35
Optus 1T        29.95   .15
Optus advance   25.95   .2                      .37     .16
At this point, I was dealing with NewTel, a different name for what appears to be practically the same company as CallAustralia. CallAustralia is in fact marginally cheaper than NewTel. But what really worries me is the way these people do their business. On the second call the woman again tried to make it look as if I were getting a discount on my existing phone bill, when in fact it would be a completely new contract. This is obviously intentional.
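To put rough numbers on the comparison, a monthly bill can be approximated as line rental plus local calls. The Python sketch below uses only the rental and local-call figures from the table above. The call volume is an assumption, mobile, national and international calls are ignored entirely, and Telstra is left out because the table gives no local-call price for it.

```python
from decimal import Decimal

# (monthly rental, price per local call), copied from the table above.
PLANS = {
    "NewTel": (Decimal("26.95"), Decimal("0.16")),
    "Primus A": (Decimal("27.40"), Decimal("0.19")),
    "Primus B": (Decimal("31.09"), Decimal("0.165")),
    "Primus C": (Decimal("29.95"), Decimal("0.165")),
    "Primus D": (Decimal("28.45"), Decimal("0.185")),
    "Primus 1": (Decimal("55"), Decimal("0")),
    "Optus 1T": (Decimal("29.95"), Decimal("0.15")),
    "Optus advance": (Decimal("25.95"), Decimal("0.20")),
}


def monthly_cost(plan: str, local_calls: int) -> Decimal:
    """Rental plus local calls only; every other charge is ignored."""
    rental, per_call = PLANS[plan]
    return rental + per_call * local_calls


def cheapest(local_calls: int) -> str:
    """Name of the plan with the lowest simplified monthly cost."""
    return min(PLANS, key=lambda plan: monthly_cost(plan, local_calls))


LOCAL_CALLS_PER_MONTH = 100  # assumed volume, not a figure from the diary
for plan in sorted(PLANS, key=lambda p: monthly_cost(p, LOCAL_CALLS_PER_MONTH)):
    print(f"{plan:14} {monthly_cost(plan, LOCAL_CALLS_PER_MONTH):7.2f}")
```

The ranking depends on the assumed volume: in this simplified model a flat-rate plan such as Primus 1 overtakes the others once the number of local calls is high enough, so the result for one volume shouldn't be read as general.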
| Saturday, 9 July 2005 | Echunga | |
Back to looking at AUUGN this morning, and finally got it to create a bootable disk. As so often, I had been looking in the wrong place, and the real issue was the combination of boot parameters for mkisofs. To make matters worse, Glen Turner had given me the correct parameters last week:
Give this a go:
  mkisofs -J                           Joliet extensions
          -R                           Rock Ridge extensions
          -v                           Verbose
          -T                           Generate TRANS.TBL for Joliet
          -m TRANS.TBL                 Ignore any existing TRANS.TBL
          -o disc1.iso                 Output
          -b isolinux/isolinux.bin
          -c isolinux/boot.cat
          -no-emul-boot
          -boot-load-size 4            Completely bogus for bad BIOSs
          -pad                         Bogosity for Linux bugs
          -boot-info-table             Patch isolinux.bin with ElTorrito magic
          disc1/                       Input
Of these, the ones that made the difference were -boot-load-size 4 and -boot-info-table, but I needed Krzysztof Krawczyk to remind me. Updated my mkcd script to do this automatically; I'm sure it will need more tweaking next time I try to build a bootable CD or DVD for a different operating system.
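A wrapper for this kind of invocation might look something like the following Python sketch. It is not the mkcd script, which isn't shown here; the function name is hypothetical, and the flags are just the ones from Glen's message with the comments stripped. It only builds and prints the command line, so it runs without mkisofs installed.

```python
import shlex


def mkisofs_command(source_dir: str, output_iso: str) -> list[str]:
    """Build the argument list for a bootable isolinux image."""
    return [
        "mkisofs",
        "-J",                      # Joliet extensions
        "-R",                      # Rock Ridge extensions
        "-v",                      # verbose
        "-T",                      # generate TRANS.TBL
        "-m", "TRANS.TBL",         # ignore any existing TRANS.TBL
        "-o", output_iso,
        "-b", "isolinux/isolinux.bin",
        "-c", "isolinux/boot.cat",
        "-no-emul-boot",
        "-boot-load-size", "4",
        "-boot-info-table",
        "-pad",
        source_dir,
    ]


command = mkisofs_command("disc1/", "disc1.iso")
print(shlex.join(command))
# To build the image for real: subprocess.run(command, check=True)
```

Keeping the arguments as a list rather than one string avoids quoting problems if a directory name ever contains spaces.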
Google Maps is an interesting service. Recently some people pointed out that the maps of the area round here are better than those of many much more important places in the world. For example, maximum resolution maps are available for my own property (the right-angled triangle around the houses in the centre of the map). At present you can't say the same thing for Jerusalem, Mecca or Vienna: you have to zoom out three steps to see anything at all. Spent some time with the xemarkers file, originally from the xearth distribution, that I have been modifying for years, and converted it to a web page. Many of the locations prove to be inexact, and like the ones above, they may need zooming out.
| Sunday, 10 July 2005 | Echunga | |
Since I had finished AUUGN yesterday, I spent all today on—AUUGN. Somehow the little bits and pieces add up, and they took time. By late afternoon everything was complete except for the President's column—that came in late in the evening.
Still, I did get a chance to go out riding. This should be a weekly thing, but somehow recently the track record hasn't been too good.
| Monday, 11 July 2005 | Echunga | |
Wasn't feeling too well today, and didn't go to the Monday Meeting. That had the undoubted advantage that I thus had time to upgrade wantadilla to the latest version of FreeBSD. Well, I should have had time. At the same time, did more revision of my upgrade instructions, most of which I followed. It'll still be a while before they're ready for prime time.
Things were complicated by the fact that wantadilla was running FreeBSD 5.2, and you can't build a 6.0 kernel from 5.2, so I first had to install 5.3. By the time all that was done, with the help of a few PEBKACs, it was evening. I rebooted and discovered that I couldn't start the shell I had installed beforehand: libraries were missing. Using DESTDIR to specify installation targets doesn't work overly well: the build process had still checked for the presence of the libraries on the local system, so after rebooting they weren't there.
As if that wasn't enough, the issue of Makefiles, which caused me so much grief in February, has cropped up again. Since then we've removed FunnelWeb from the equation, so fixing things should be easier now.
| Tuesday, 12 July 2005 | Echunga | |
Up early this morning to try to get wantadilla back up and running. It wasn't easy. My upgrade instructions turned out to be relatively reliable, but as usual, the devil is in the detail. Spent a couple of hours messing around with X, which at the best of times isn't easy to upgrade. Turned out that one of the problems had been the double kernel upgrade: I had upgraded my kernel file for release 5.4, but not for 6.0, and in the meantime some drivers had become optional.
Even after that, though, I couldn't get it to start. It's not made any easier by existing bugs in X.org which mean that when I stop X, I can't start it again without rebooting. After a lot of comparison, discovered that I had previously been starting X from /etc/XF86Config, and now I was trying from an old /etc/xorg.conf. Spent some time merging the output of X -configure, without too much success: I still couldn't start one of the screens. Strangely, starting from the (unchanged) /etc/XF86Config worked, so stayed with that.
After that, things worked relatively well, but once again I ran into problems building over NFS (the dreaded autom4te problem). They shouldn't happen, and next time I reboot echunga I'll start rpc.lockd and see if that helps. In the meantime, though, it's easy enough to check out a ports tree locally.
| Wednesday, 13 July 2005 | Echunga | |
Things are gradually coming back to normal: in fact, apart from the X problems yesterday, things went relatively smoothly. There are still myriads of loose ends to gather together, but none seem to be overly complicated. Probably the biggest is that there are more things than ports to be installed: locally developed software is one of them.
In the process, ran into space problems on my backup disks, and finally got round to looking at the temporary files that I saved a year ago tomorrow. In the process, found more strangenesses in my mklinks program, but after a bit of restructuring it looks like it's working pretty well. Maybe I should write it up and announce it.
| Thursday, 14 July 2005 | Echunga | |
Spent most of the day revisiting the Makefile work that I last looked at in February. Things are much simpler now that we no longer use FunnelWeb, even simpler than I had expected. The main issue was doing regression testing, which also included rewriting the regression tests. The only ugliness remaining is the insistence on building code in directories outside the immediate hierarchy. But I've given up trying to change that.
| Friday, 15 July 2005 | Echunga | Images for 15 July 2005 | 
Spent most of the day working on the build infrastructure, and got it working more or less the way I wanted (though we still have these layering violations building directories outside the hierarchy). Hopefully we won't have any more trouble with that.
Letter from Alexander Downer today, not about my telephones, but about a mail message I had sent him 6 weeks ago. His response surprised me: yes, he knows that the Indonesian word “bahasa” means simply “language”, but he still uses it to refer to the Indonesian language. When I pointed it out, his response was surprising: that's the colloquial way of saying it. Words like “nigger”, ”boong” and ”chink” were once also colloquial. Is he endorsing this too? I think that his way of ignoring the language of another country is not conducive to better international relations. I'd be interested in the opinion of any Indonesian or Malay readers of this diary.
In the evening started to put together another machine for multimedia. Using FreeBSD is so painful that I'm prepared to give Linux another try, so put together yet another box with an Athlon XP and the same MSI K7N2 Delta motherboard that I'm already using in wantadilla and echunga. This new Athlon XP with the Barton core comes in two versions: one supporting a 200 MHz front-side bus and another that only supports 166 MHz. To see which it was, set the FSB value to 200 MHz in the BIOS setup.
Bad idea. When I tried to reboot, it reported a BIOS checksum error and a not-ready drive A (after I had just disabled the floppy controller). After powering down and up again, it was dead as a doornail. Checked the manual and reset the CMOS RAM, but that made no difference. Followed the links in the manual to a dead URL (http://www.msi.com.tw/support/bios/boot.htm), and spent some time looking round the site, only to find something basically saying “take back to your vendor”. Called the vendor, spoke to Alan, and got some advice to disconnect all power, including the CMOS battery and the power connectors, and to wait 30 minutes: under these circumstances, said Alan, a basic backup BIOS would kick in. Did that without any success, and then discovered one of the very few jumpers on the motherboard, which sets the FSB to 100 MHz. Tried that, and it came up again. My guess is that the FSB speed is so basic that it has to be set at reset time, so it must be set in flash ROM or some such.
Then installed Fedora Core 4 on the machine, using the AUUGN DVD that I had burnt earlier; that worked fine, thus proving that the thing works.
Somehow also found time to play around with mklinks, and seem to have solved the directory deletion problem. It's becoming almost a worthwhile tool.
| Saturday, 16 July 2005 | Echunga | |
Brewing in the morning—making only one brew at a time is really much more relaxing, and I can spend the intervening time bottling the previous brew.
In the afternoon, more work on the new machine. I installed Fedora Core 4 on it because it seems to have the best third-party multimedia support, but it certainly drives me mad. Everything seems to be so difficult to set up, and this although I've done a lot of work with this particular distro of Linux. In particular, I couldn't find a way to stop it displaying this silly GNOME desktop. Somehow GNOME and KDE are splitting the free UNIX world more than anything else, and every time I look at Linux it annoys me. Today did something constructive about it and started on a “why I hate Linux” page. Like my other rants, it's really intended to be constructive. Presently it just shows the issues I have at the moment, but in time it could grow into something like a migration guide, at least for me.
This page didn't last very long; it did, indeed, migrate into a “How to set up Linux” page.
| Sunday, 17 July 2005 | Echunga | |
More work on the Linux box today, and gradually managed to get most things sorted out. Some were more difficult: I've come to the conclusion that some of the utilities have been recompiled for Fedora. For example, the rm command asks for confirmation by default. I couldn't find anything in the environment to make it do so, and the man page documents the correct behaviour. Also, Emacs appears not to read its startup file (~/.emacs). The latter seemed straightforward enough: I had the latest Emacs tarball on echunga, so tried configuring and building it. The results weren't quite what I expected:
Finding pointers to doc strings...done
Wrote /var/tmp/emacs-21.3/lib-src/fns-21.3.1.el
Dumping under names emacs and emacs-21.3.1
make[1]: *** [emacs] Segmentation fault
I've seen that before, but not in the past ten years. Along with the myriads of warnings, this looks like some problem with the header files. Trying it on asterix, the AMD64 box, wasn't much better:
=== root@asterix (/dev/pts/1) /var/tmp/emacs-21.3 6 -> ./configure
creating cache ./config.cache
checking host system type...  x86_64-unknown-linux-gnu
configure: error: Emacs hasn't been ported to `x86_64-unknown-linux-gnu' systems.
Check `etc/MACHINES' for recognized configuration names.
In addition, looking at an NFS-mounted file system showed evidence of extreme corruption:
=== grog@deeveear (/dev/pts/1) ~ 1 -> l emacs
total 128
-r--r--r--  1 grog 1000  1112 Feb 24  1994 2iso.el
drwxr-xr-x  2 grog 1000   512 Jul  7 12:54 RCS
?---------  ? ?    ?        ?            ? a2ps-print.el
?---------  ? ?    ?        ?            ? a2ps.el
?---------  ? ?    ?        ?            ? autoconf-mode.el
?---------  ? ?    ?        ?            ? autotest-mode.el
-r--r--r--  1 grog 1000 28473 Jul  7 12:54 c-mode.el
-r--r--r--  1 grog 1000   159 Feb 15  1993 decr.el
-r--r--r--  1 grog 1000 13172 Jun  7  1993 din-tastatur.el
It took me some time to realize that this wasn't NFS' problem at all. The ? files were broken symlinks; but that's not the way to show them, and it's not the way other Linux distributions (here SuSE) show them either:
total 1
drwxr-xr-x   3 1004 root  1024 2005-07-07 12:54 ./
drwxr-xr-x 276 1004 1000 64512 2005-07-18 07:47 ../
-r--r--r--   1 1004 1000  1112 1994-02-24 00:09 2iso.el
lrwxr-xr-x   1 1004 1000    40 2005-06-17 11:53 a2ps.el -> /usr/local/share/emacs/site-lisp/a2ps.el
lrwxr-xr-x   1 1004 1000    46 2005-06-17 11:53 a2ps-print.el -> /usr/local/share/emacs/site-lisp/a2ps-print.el
lrwxr-xr-x   1 1004 1000    49 2005-06-17 11:53 autoconf-mode.el -> /usr/local/share/emacs/site-lisp/autoconf-mode.el
lrwxr-xr-x   1 1004 1000    49 2005-06-17 11:53 autotest-mode.el -> /usr/local/share/emacs/site-lisp/autotest-mode.el
-r--r--r--   1 1004 1000 28473 2005-07-07 12:54 c-mode.el
-r--r--r--   1 1004 1000   159 1993-02-15 19:46 decr.el
-r--r--r--   1 1004 1000 13172 1993-06-07 20:17 din-tastatur.el
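Telling a broken symlink from real corruption is easy to do by hand. The Python sketch below (POSIX only, since it creates symlinks) builds a scratch directory containing a file, a good link and a dangling link, then classifies each entry. The file names echo the listing above, but the directory layout and the helper names are invented for the example.

```python
import os
import tempfile


def classify(path: str) -> str:
    """Describe a directory entry without being fooled by dangling links."""
    if os.path.islink(path):
        # os.path.exists() follows the link, so it is False when the
        # target is missing.
        return "symlink" if os.path.exists(path) else "broken symlink"
    if os.path.isdir(path):
        return "directory"
    return "file"


def demo() -> dict[str, str]:
    """Create a scratch directory and classify everything in it."""
    with tempfile.TemporaryDirectory() as directory:
        real = os.path.join(directory, "c-mode.el")
        with open(real, "w") as handle:
            handle.write(";; placeholder\n")
        os.symlink(real, os.path.join(directory, "good.el"))
        # The target below does not exist, so this link dangles.
        os.symlink(os.path.join(directory, "missing", "a2ps.el"),
                   os.path.join(directory, "a2ps.el"))
        return {name: classify(os.path.join(directory, name))
                for name in sorted(os.listdir(directory))}


for name, kind in demo().items():
    print(f"{name}: {kind}")
```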
Spent some time discussing the matter on IRC with Chris Yeoh, who is already running a system quite similar to what I am building. He's using Debian, so it sounded like a good idea to install that instead; hopefully it won't have as many wrinkles. Started downloading a couple of DVD images from the Internode mirror server; what an advantage ADSL is (not to mention the free download; in the early 1990s it would have cost me USD 2,700,000 to download 9 GB). But even with 1536 kbps download speed, it still takes about 8 hours to download a DVD, so I had to postpone further work to later.
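The download time is easy to sanity-check. The Python sketch below computes the ideal transfer time at full line rate. The 4.5 GB per DVD is an assumption (half of the roughly 9 GB mentioned above), and the function name is made up. It gives about six and a half hours, so "about 8 hours" is plausible once protocol overhead and a link that doesn't run flat out all the time are taken into account.

```python
def download_hours(size_bytes: float, link_kbps: float) -> float:
    """Ideal transfer time in hours, with no protocol overhead."""
    bits = size_bytes * 8
    seconds = bits / (link_kbps * 1000)
    return seconds / 3600


DVD_BYTES = 4.5e9   # assumed: half of the roughly 9 GB mentioned above
LINK_KBPS = 1536    # download speed quoted above

print(f"{download_hours(DVD_BYTES, LINK_KBPS):.1f} hours at full line rate")
```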
In the middle of all that, finally managed to go riding. Despite the weather forecast, it was dry and sunny. Nice afternoon.
| Monday, 18 July 2005 | Echunga | |
Up early to tidy up my mail before going to the weekly meeting, and also tried a few more things with the Linux box. The DVD downloads had both crashed during the night, so I wasn't able to do anything on that front, but did check the latest version of Emacs: it's 21.4a. Downloaded that and tried to compile it; still the same result on both machines. I had at least expected it to know x86_64. It looks as if the Fedora project isn't giving its fixes back to the GNU project.
After that to the weekly meeting in town, which finished off the work day. Back at home, the Debian downloads had finally finished, so I burnt the DVDs. These DVD+Rs that I bought last year are of impossibly bad quality: I'm ending up with a failure rate of over 50%. A good reason to stick to rewritable media: at least there you can try again, while my waste paper basket is filling up with DVD+R coasters.
Installing Debian is certainly very different from installing Fedora! Although the Fedora distribution DVD is just shy of 3 GB, and the Debian DVDs total about 9.5 GB, I ended up with a NetBSD-style minimal installation, without even X. I wonder if I did something wrong there.
| Tuesday, 19 July 2005 | Echunga | |
Spent most of the day looking at how to clean up the warnings in our product build. The first step was to increase the warning levels, which I managed quite spectacularly: after adding all the likely looking gcc warning flags (by no means all of them), the product compiled with over 70,000 warnings. Managed to get that down to under 10,000 before it started getting too painful:
v_trans.c:308: warning: passing arg 2 of `tr_os' with different width due to prototype
v_trans.c:308: warning: passing arg 2 of `os_wr' discards qualifiers from pointer target type
It would be so much easier to know what type and what qualifier. In search of that, went off looking for various flavours of lint, and on Tim Stoakes' recommendation installed splint.
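With that many warnings, it can help to see which kinds dominate before deciding what to tackle first. A minimal Python sketch, assuming the build output has the file:line: warning: text shape shown above: the function names are invented, and the sample log is just the two lines quoted here, not the real build output.

```python
import re
from collections import Counter

WARNING_RE = re.compile(
    r"^(?P<file>[^:\s]+):(?P<line>\d+): warning: (?P<text>.*)$")


def normalise(text: str) -> str:
    """Replace quoted identifiers and argument numbers so that similar
    warnings are counted together."""
    text = re.sub(r"`[^']*'", "`X'", text)
    return re.sub(r"\barg \d+\b", "arg N", text)


def tally(log: str) -> Counter:
    """Count warnings in a build log by normalised message text."""
    counts = Counter()
    for line in log.splitlines():
        match = WARNING_RE.match(line.strip())
        if match:
            counts[normalise(match.group("text"))] += 1
    return counts


SAMPLE_LOG = """\
v_trans.c:308: warning: passing arg 2 of `tr_os' with different width due to prototype
v_trans.c:308: warning: passing arg 2 of `os_wr' discards qualifiers from pointer target type
"""

for kind, count in tally(SAMPLE_LOG).most_common():
    print(f"{count:6d}  {kind}")
```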
Somehow I don't like any version of lint, and splint wasn't able to convince me. Spent some time playing around with various options, but still ended up with messages like this:
style.h:1519: Include file <sys/types.h> matches the name of a POSIX library,
    but the POSIX library is not being used.  Consider using +posixlib or
    +posixstrictlib to select the POSIX library, or -warnposix to suppress this
    message.
  Header name matches a POSIX header, but the POSIX library is not selected.
  (Use -warnposixheaders to inhibit warning)
After doing that, I still got things like:
arg.c:90:21: New fresh storage (type string) passed as implicitly temp (not
                released): rp_str(serv_id)
  A memory leak has been detected.  Storage allocated locally is not released
  before the last reference to it is lost.  (Use -mustfreefresh to inhibit
  warning)
arg.c:101:12: Null storage returned as non-null: NULL
  Function returns a possibly null pointer, but is not declared using
  /*@null@*/ annotation of result.  If function may return NULL, add /*@null@*/
  annotation to the return value declaration.  (Use -nullret to inhibit warning)
It's not at all clear why I'm even getting the first message; it's just passing a string as a parameter to a function. I don't see any leak. And the second appears to result from a rather strange view on what a function may and may not return. Sure, I can go and add all this stuff, but at some point I'm left wondering whether it's the way to go. Spent some time looking at other alternatives, but didn't find anything.
In the background, made another couple of attempts at installing Debian 3.1 “Sarge”. Sure, I could recover from the mistakes that I made, but since there was no hurry, and I'd like to get things right, I went to the trouble. Here's a rough log:
After that, spent some time building a new kernel, and once again fell foul of the Linux kernel build system. All was installed nicely, but “update LILO” didn't. With some help from David Kaiser, tried to install GRUB instead, but that just hung: the command
grub-install --no-floppy /dev/hda
failed with the messages (in dmesg only):
inserting floppy driver for 2.4.27-2-k7
devfs_mk_dir(floppy): using old entry in dir: c1c18640 ""
floppy0: no floppy controllers found
It looks like floppies are dead; long live the floppy. Now to re-enable the disabled floppy controller in the BIOS.
| Wednesday, 20 July 2005 | Echunga | |
More investigation of code cleanliness tools today. I still haven't found an alternative to splint, so spent some time looking at which messages I should take seriously and which not. I really can't see why it's an error to take advantage of C language features like promotion (for example, where you pass a 32 bit integer as a parameter to a function which expects a 64 bit integer; the compiler implicitly widens the value). splint reports this kind of thing, apparently expecting you to write:
fun ((int64_t) value);
This is just ugly. I can disable it, but then, it appears, it also no longer reports on the inverse (shrinking 64 bits to 32, which would lose data). I think we need some discussion on this, so since I have other things to do (database comparisons in particular), I'll do that instead.
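The asymmetry between the two directions can be shown with a little arithmetic. Python integers don't overflow, so the sketch below simulates fixed-width two's-complement values by masking. The helper names are invented, and the wrap-around on narrowing is what typical C implementations do, not something the C standard guarantees for signed types.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    mask = (1 << bits) - 1
    value &= mask
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def widen_32_to_64(value: int) -> int:
    """Simulate passing a 32 bit value where 64 bits are expected."""
    return to_signed(to_signed(value, 32), 64)


def narrow_64_to_32(value: int) -> int:
    """Simulate passing a 64 bit value where 32 bits are expected."""
    return to_signed(to_signed(value, 64), 32)


small = -12345
large = (1 << 40) + 7  # does not fit in 32 bits

print(widen_32_to_64(small) == small)   # widening keeps the value
print(narrow_64_to_32(large) == large)  # narrowing may not
print(narrow_64_to_32(large))
```

Every 32 bit value survives widening unchanged, which is why a warning about it is only noise; narrowing silently drops the high bits, which is the case worth being told about.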
Into town to Scoozi for an ADUUG lunch. It's pizza and pasta only, just about, so I had a pizza. At least it was better than average.
In the evening returned to the issue of getting my Debian box up. I had received a message from James Andrewartha, saying, amongst other things,
If you boot the debian installer with "expert" or "expert26" at the isolinux prompt, you'll get a step by step installation which will
- let you specify an IP manually
- let you choose which kernel (expert uses 2.4, expert26 uses 2.6)
- might let you choose the boot loader - I thought it defaulted to grub, but it does depend on how you partition the drives (eg grub doesn't boot into lvm on amd64, so you get lilo unless you leave /boot as a normal partition).
- possibly give you more choice over package selection
- prompt you for more questions in general
Further investigation showed that I could have found this information by pressing F1. But I needed to, and it's the only installer that doesn't offer this kind of information up front.
This evening I spent over an hour trying to boot my new Debian kernel, in vain. Spent some time on IRC with David Kaiser, who guided me through the minefield that was GRUB. I have kept the complete transcript, but it boils down to:
VFS: Cannot open root device "hda1" or unknown-block(0,0)
Please append a correct "root=" boot option.
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
I was able to boot back to the old kernel, and investigation of the menu showed nothing obvious:
title Debian GNU/Linux, kernel 2.6.12.3
root (hd0,0)
kernel /boot/vmlinuz-2.6.12.3 root=/dev/hda1 ro
savedefault
boot

title Debian GNU/Linux, kernel 2.4.27-2-k7
root (hd0,0)
kernel /boot/vmlinuz-2.4.27-2-k7 root=/dev/hda1 ro
initrd /boot/initrd.img-2.4.27-2-k7
savedefault
boot
Somehow this whole thing is completely ridiculous. The system doesn't give me a choice of bootstrap (LILO or nothing); when installing a kernel, the “update LILO” function doesn't work; installing GRUB comes very close to being impossible; and when I do, the kernel fails to boot with messages that make no sense. The FreeBSD project got rid of this sort of issue 10 years ago, and I had thought that it was a thing of the past in Linux as well.
| Thursday, 21 July 2005 | Echunga | |
Into the office this morning to find the ADSL link down since about 1:30 am. No problems with the phone lines themselves, and the modem was happy, but the link showed no response, not even ARP. Called the Internode help desk and spoke to Zac, who established that it was a Telstra problem, and warned me that it could take up to 24 hours to fix. Half an hour later, just as I was reflecting that a day without network access is like a day without work, Zac called back and told me the line was back up. Thank God for that. I wonder how often this is going to happen.
More consideration of code quality tools, and my enquiries just brought me back to splint. I need to think about that a bit, and probably discuss it with the others. Spent the rest of the day looking at database technology, specifically MySQL.
My Debian woes continue on two fronts: on the one hand, a number of people have sent me suggestions. On the other, I still haven't been able to boot my kernel. Spent some time on IRC again today, with the result that I was able to install a kernel from the DVD and to boot it (in the process, accidentally reinstalling LILO, which still doesn't give me a choice of what to boot; but I'm assuming that that's an RTFM). I still can't boot the kernel I built on Tuesday. It's rather ironic that, in the course of my database investigations, I rebooted asterix (Fedora Core 3, using GRUB) as obelix (FreeBSD 5-CURRENT). Since the Linux system is on a BIOS extended partition, I can't use the FreeBSD boot, so I had to modify GRUB to boot FreeBSD. That worked immediately. Why am I having so much trouble with the Debian installation?
| Friday, 22 July 2005 | Echunga | |
I had intended to look at databases today, but first I wanted to send out a mail message about code quality analysis, using one medium-size source file as an example. It took all day! In the process, I managed to get the number of warnings down from 376 (about one for every two source lines) to 0. Lots of the output from splint and some from gcc seem dubious, though; I need to understand why we're getting messages like this:
cable.c:112:14: Arrow access from possibly null pointer p_cb: p_cb->cbf_mag1
  A possibly null pointer is dereferenced.  Value is either the result of a
  function which may return null (in which case, code should check it is not
  null), or a global, parameter or structure field declared with the null
  qualifier.  (Use -nullderef to inhibit warning)
This one may make some sense, but specifying whether your pointer can be null or not is outside the scope of the C language.
cable.c:141:29: New fresh storage (type ubyte_ *) passed as implicitly temp
                   (not released): rp_adr((p_dir_name))
  A memory leak has been detected.  Storage allocated locally is not released
  before the last reference to it is lost.  (Use -mustfreefresh to inhibit
  warning)
I don't know what splint has been thinking, but crack seems to come to mind. The source line in question creates storage and returns it to the caller. Maybe splint thinks this is wrong. I think splint is wrong.
cable.c:150:8: Implicitly only storage (type p_rp_t) not released before
    assignment: global_get()->globalf_cable_cbg_cabledir = rp_cre(NULL)
  A memory leak has been detected.  Only-qualified storage is not released
  before the last reference to it is lost.  (Use -mustfreeonly to inhibit
  warning)
This seems to be a variant of the previous. Memory leak indeed!
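A minimal sketch of the pattern these two messages are about: a function that allocates storage and hands ownership to its caller. The name copy_name is hypothetical and not from cable.c. In splint's annotation language, /*@only@*/ on a return value is meant to say that the caller receives the sole reference and the obligation to release it. Whether adding it would silence these particular warnings is an assumption, not something tested against this source.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns newly allocated storage that the
   caller owns and must free.  The annotations are ordinary comments
   as far as the compiler is concerned; only splint reads them. */
/*@only@*/ /*@null@*/ char *copy_name(const char *name)
{
    size_t len = strlen(name) + 1;      /* include the terminating NUL */
    char *copy = malloc(len);

    if (copy != NULL)
        memcpy(copy, name, len);
    return copy;                        /* NULL if allocation failed */
}
```

A caller would use it as `char *n = copy_name("cable");` and later call `free(n)`. Nothing leaks as long as the caller keeps its side of the bargain, which is the part a single-function analysis cannot see.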
cable.c:141:85: Null storage passed as non-null param: file_create (..., NULL)
  A possibly null pointer is passed as a parameter corresponding to a formal
  parameter with no /*@null@*/ annotation.  If NULL may be used for this
  parameter, add a /*@null@*/ annotation to the function parameter declaration.
  (Use -nullpass to inhibit warning)
There's another kind, too:
cable.c:407:9: Null storage p_err returned as non-null: p_err
  Function returns a possibly null pointer, but is not declared using
  /*@null@*/ annotation of result.  If function may return NULL, add /*@null@*/
  annotation to the return value declaration.  (Use -nullret to inhibit warning)
   cable.c:323:17: Storage p_err becomes null
I suppose it would be nice to think in advance about whether NULL is a valid value for each and every pointer, but I don't think that I should be annotating source code for one tool's view of the world.
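For reference, a minimal sketch of what the annotations splint asks for look like in practice. The functions find_suffix and suffix_length are hypothetical, not from cable.c. The /*@null@*/ marker is the one named in the messages above; to the compiler it is only a comment.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example: the result may be NULL, and the annotation
   says so.  strrchr returns NULL when the name contains no dot. */
/*@null@*/ const char *find_suffix(const char *name)
{
    return strrchr(name, '.');
}

/* Hypothetical example: NULL is an acceptable argument here, so the
   parameter carries the annotation and the body checks for it. */
size_t suffix_length(/*@null@*/ const char *suffix)
{
    if (suffix == NULL)
        return 0;
    return strlen(suffix);
}
```

The code compiles identically with or without the comments, which is both the attraction and the objection: the information lives outside the language, for the benefit of one tool.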
style.h:1648:18: Type implemented as macro: bool
  A type is implemented using a macro definition.  A typedef should be used
  instead.
This source code uses the type bool for boolean values. So does splint, it seems. It accepts the source code even if I remove the definition of bool altogether; gcc doesn't, of course. If I change it to a typedef, I get:
style.h:1648:16: Datatype bool declared with inconsistent type: int
  A function, variable or constant is redefined with a different type.  (Use
  -incondefs to inhibit warning)
   load file standard.lcd: Specification of bool: boolean
This is nonsense. This is the only definition, and it matches exactly the definition that splint uses for C++:
     bool.h:14:typedef int bool;
After everything else, I'm left with five messages of the type:
cable.h:417:8: Function exported but not used outside cable: cb_a_b
  A declaration is exported, but not used outside this module.  Declaration can
  use static qualifier.  (Use -exportlocal to inhibit warning)
How can an analysis of a single file guess that? I suspect that it's intended to go over all source files together. It'll be a while before I can do that.
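A minimal sketch of what the message is asking for, with hypothetical names (clamp_channel and select_channel are not from cable.c): a function used only inside its own file gets the static qualifier, so it is no longer exported.

```c
/* Hypothetical helper used only within this file, so it can be
   static and is invisible to other modules. */
static int clamp_channel(int channel)
{
    if (channel < 1)
        return 1;
    if (channel > 99)
        return 99;
    return channel;
}

/* Hypothetical public entry point: part of the module's interface,
   so it stays non-static. */
int select_channel(int requested)
{
    return clamp_channel(requested);
}
```

Knowing that a function really is unused elsewhere requires seeing every other source file, which fits the suspicion that splint expects to be run over the whole program at once.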
So: does splint help or not? I still don't know. I'll use it for a while, until I can find something better: it finds things that gcc doesn't (and the converse; it's frightening how few issues both tools find).
Yana home for the weekend. Haven't seen her in a while.
| Saturday, 23 July 2005 | Echunga | Images for 23 July 2005 | 
Out riding today, Yana riding for the first time in four years.
Diane Saunders also came along; as she observed, it's also been a while since we went out together. Pleasant afternoon.
Back to looking at Debian after that. Spent some time trying to do a make xconfig in the kernel build directory:
* Unable to find the QT installation. Please make sure that the
* QT development package is correctly installed and the QTDIR
* environment variable is set to the correct location.
So I went looking for QT, and found nothing. Then I looked for qt and found a whole lot of packages. The most likely one was qt3-dev-tools. I installed that, but it didn't help. This shows a basic issue with Debian: how do I convert these messages into something that apt-get will understand? Chris Yeoh came in and told me that it was libqt3-dev. When I asked him how he knew, he replied “no real way, but generally if you need to compile against feature foo, debian has it packaged as libfoo-dev, generally with a version number appended.”
After that I was able to look at the configuration along with the comments, and confirmed a suspicion that I had had, that the kernel didn't have XFS support in it. Yes, there was a module, but it seems that that doesn't help much without an initrd file, and so far nobody has been able to tell me how to build that under 2.6. Built a kernel and also got LILO to prompt for the kernel, but on booting it still died with a root-related error which I didn't write down (to be left for later).
Instead went out looking for mplayer for Debian, and found some instructions that almost worked: I was able to download and install the package, but the fonts were missing. More investigation needed.
| Sunday, 24 July 2005 | Echunga | |
Continued with my Debian saga today, and spent some time looking for the mplayer-fonts package; it doesn't seem to be there. Finally gave up and downloaded the original tarball from the official web site. With only a little trouble, and with help from the existing FreeBSD implementation, was able to get mplayer to work—sort of. But it doesn't say much for the method of distribution if I need to spend hours looking for the package.
The “sort of” was that there was no sound, and that mplayer was spitting out error messages from a function in libdvdread once a second. The latter may be related to the former, but giving a source line number in a package distributed in binary form only wasn't very helpful. Spent much more time looking for the cause of the lack of sound, and established that the sound hardware hadn't been detected, and that the tuner card seemed to be confusing the probe routines. Removing the card enabled the sound hardware to be recognized, but it still didn't work. Spent more time scouring the web for documentation, and found some out-of-date documentation on the Debian web site, referring to the 2.2 kernel and “sound cards”, and giving information that may or may not still be correct (for example, a reference to /proc/sound; I found similar information in /proc/asound). It didn't say what to do if it didn't work.
About here I decided that I've been wasting too much time with Debian. Installed from the SuSE 9.3 DVD-ROM I got last month, and made a lot more progress, though not as far as completion. At least SuSE doesn't leave you scratching your head and wondering what to do next. I wonder if my experience with Debian parallels other people's experience with FreeBSD; I fear it might, though FreeBSD at least is a little more uniformly documented. Still, I feel like I'm chickening out by not staying with Debian. For the while, I'm leaving the Debian partition intact, so I may go back.
Later heard from tridge, who appears to have found a day-one bug in the BSD directory navigation functions. A regression test for a new release of Samba didn't delete all the entries in a file. Spent a bit of time looking at that, though not enough to decide whether it's a bug or a feature.
| Monday, 25 July 2005 | Echunga | |
Meeting day today, so as usual we didn't get much done. In the meeting, at least, I got general confirmation that my approach to the code tidy-up is acceptable. That will push off the database work for at least another day or two.
Heard again from tridge, who is now sure that he has found a bug in BSD telldir, so he has worked around the problem in Samba. Time to see what's really going on and to fix it.
It's gratifying how, no matter which version of Linux I try, I get mail from people with suggestions. Many have suggested that my problems with rm are due to the presence of an alias; they're not: I've checked that. While I was trying Debian, I got useful input from many, notably James Andrewartha. Now I'm trying SuSE 9.3, and I am now getting helpful mail from Mads Martin Joergensen, who has contacted me in the past. Also received c't magazine today, including an article on how to use apt-get with SuSE. It's looking like a reasonable choice at this stage, even if I did waste a lot of time trying to stop it from running ls in colour.
We're coming up to another AUUG board meeting, and I still haven't booked my flight. Looked around for “last minute” on Google and discovered lastminute.com, which deserves the distinction of being one of the very few web sites so broken that I can't use it at all:
On a medium-resolution screen, an even marginally legible font size makes it impossible to use. I don't know how to start a search; no button is visible, and I don't even know if there is one. Note that this window fits on a single screen (use “full screen” to view).
| Tuesday, 26 July 2005 | Echunga | |
Another day spent cleaning up a single file and documenting it. And that despite the fact that the fixes I made last week make it less of a problem.
In the afternoon, spent fully two hours trying to make an online booking for a flight to Sydney; I'm due to leave on Thursday, so it's high time. I failed. Without exception, all web sites are broken. None of them render correctly with 18 pixel characters (which on my display corresponds to 10 pt).
Webjet have a completely broken site that keeps looping on the entry of personal details. By contrast, their rendering is almost acceptable, if you ignore the truncated fields and the tiny text (about 4 pt) in the tabs at the top.
Flight Centre is a company I deal with on a regular basis. They have a broken web page, of course, and I had difficulty selecting the times of day, which overlap into the advertising on the right. It doesn't make much difference, though: their search engine seems to ignore the specification, and suggested flights at 6 am when I asked for evening flights. Still, not to worry: they're nearly 50% more expensive than the rest.
http://www.studentflights.com.au/ looks like it might be cheap, but in fact it was very expensive. It appears to be an operation of Flight Centre.
Travel.com seem to do it almost right, and they had the best prices, but they appear to want me to enter credit card details over an insecure web connection. I tried going to https://www.travel.com.au, but the page redirected to http://www3.travel.com.au/home.html. I changed that to https://www3.travel.com.au/home.html, which worked until I tried to make a booking, when I ended up at the credit card details page under http. Tried to call them on the phone. After 35 minutes, got an answer from Zoe, who obviously had never heard of a secure web connection before. Asked to be connected with the web master, but ended up being transferred to Jeunnie, with an American accent. She told me to select (sorry, “click on”) the (tiny) tab at the bottom right, but the page rendered so badly that it didn't work. Then she tried to send me the page by email. That didn't work either:
Their DNS is also misconfigured: their mail server claims to be maildom.travel.com.au, a non-resolvable name. So I can't receive mail from them either. Finally she read it to me, without letting me interrupt, until she got to the part saying that I could recognize a secure transaction because of the https: in the URL. I pointed out that that's what I was complaining about. Finally she talked to somebody technical, who relayed that they had an SSL connection hidden inside the frame. Ran tcpdump and found that yes, indeed, some of the traffic was https. But which part? It's reasonable to assume that it includes the credit card number, but the very fact that any of the transaction takes place without encryption worries me greatly.
To add insult to injury, during this time the last cheap flight disappeared, and I was left facing a price hike of $200. Very frustrated.
| Wednesday, 27 July 2005 | Echunga | |
Another day spent removing warnings from a single file! The document is now at 9 pages, and the groff source is 50% longer than the original file. In the process came across a whole lot of issues that can't be solved that simply. What a can of worms!
| Thursday, 28 July 2005 | Echunga | |
Somebody sent me money via PayPal to an incorrect email address. While searching their web site, came across this.
Finally managed to tie up my code tidyup investigations today; it took nearly all day. Now to let other people go through the pain of thinking about the alternatives. Moved on to look at MySQL, which, if for nothing else, shines by the quality and layout of its documentation. Spent some time looking at the internals documentation, which has become slightly out of date as a result of the recent BitKeeper fiasco. MySQL uses BitKeeper internally, and the documentation refers to a free license, now no longer available. Discovered from BitKeeper that there's a bkbits BitKeeper client, so installed that. Being free, it doesn't have any documentation beyond a simple demonstration script, so took the opportunity to write a brief HOWTO for it, and also spent a little bit of time updating the MySQL documentation.
The free client seems to work OK. I'm left wondering what tridge and the others were so worried about back in April.
Into town in the afternoon for a couple of meetings of the ICT Council for South Australia, including a Special General Meeting to officially agree to the name “ICT Council”.
| Friday, 29 July 2005 | Echunga | |
Spent most of the day looking at MySQL today, and had the bad idea of trying to build the development version. That started with unclear instructions, continued with yet another problem with the FreeBSD Ports Collection (it installs my beloved GNU autoconf and friends with non-standard names: for example, autoconf is currently installed at autoconf259). Even after I clarified that, ended up with errors from autoconf that suggest that there was something inconsistent about what I checked out. Tried the mysql50-server port instead, and ran into my usual problem: not complete lack of documentation this time (MySQL provides that), but gratuitous and undocumented differences (for example, safe_mysqld gets installed as /usr/local/bin/mysqld_safe). I just can't believe that nobody cares enough about their work to document things properly.
| Saturday, 30 July 2005 | Echunga | |
People have been talking about RSS feeds for a while, so I thought I should join in. It's a pity that I don't understand what it's all about, nor—more importantly—why it should be good. Anyway, started with a script to extract the relevant headings from my diary, but it quickly became apparent that that's not enough: to do it right, it needs to be a summary of the events, and it's difficult to automate that. Looks like I'll just have more manual work for the moment.
In the afternoon, more work on SuSE 9.3, including more updates to my new system installation method, which has been a work in progress for a year now, and still is. It's making some progress, though.
Mads Jørgensen had sent me some ideas on how to use yast2 and friends more effectively, but after some playing around still couldn't manage it. They should resolve all dependencies, but after a couple of hours playing around with it, I still can't work out how.
Also had problems using the combination of bash, xterm and a remote FreeBSD display: for some reason, the Alt key generated incorrect characters. It worked fine with bash locally on the FreeBSD system, fine with bash locally on the Linux system, fine with Linux Emacs remotely on the FreeBSD system; only Linux bash on the FreeBSD display showed the problem. To make matters even more confusing, once I got my fvwm2 menus set up and was able to start the remote xterms from the FreeBSD system, I had no problem any more either. I suppose it must be some environment issue, but I can't think which. One thing that did add to the confusion, though, was that after reading my .bashrc, the standard bash on SuSE 9.3 SIGSEGVed when doing file name completion. Recompiled bash and that problem no longer occurred, but I suppose I need to find out how to enter a bug report.
The tuner card is also interesting: every second I get this message:
Jul 30 11:00:35 deeveear kernel: cx88[0]/0: AUD_STATUS: 0x1772 [mono/no pilot] ctl=BTSC_AUTO_STEREO
The AUD_STATUS alternates through three or four different values varying in single bits. Sent a message to Ryan Verner, to whom I had lent the card a few months back, and discovered that he, too, had the problem, and that it doesn't seem to stop the thing from running. It seems easy enough to silence:
--- drivers/media/video/cx88/cx88-tvaudio.c~    2005-04-23 05:11:34.000000000 +0930
+++ drivers/media/video/cx88/cx88-tvaudio.c     2005-07-31 09:24:49.855143992 +0930
@@ -720,10 +720,12 @@
        mode  = reg & 0x03;
        pilot = (reg >> 2) & 0x03;
+#if 0
        if (core->astat != reg)
                dprintk("AUD_STATUS: 0x%x [%s/%s] ctl=%s\n",
                        reg, m[mode], p[pilot],
                        aud_ctl_names[cx_read(AUD_CTL) & 63]);
+#endif
        core->astat = reg;
        t->capability = V4L2_TUNER_CAP_STEREO | V4L2_TUNER_CAP_SAP |
But it looks like it'll be a while before that will help much.
| Sunday, 31 July 2005 | Echunga | |
Spring is in the air! The daffodils and wattles are blooming, the sun shining, so we had to go off riding. Pleasant day, though the horses (Yvonne was riding La Tigre) were also full of beans.
Spent more time looking at RSS feeds. Part of the problem is that I don't know what I should be looking for. People on the FreeBSD IRC channel helped with conflicting info, but in the end I came to the conclusion that I needed to do enough work on the RSS feed version that a simple extractor wouldn't work, and that I need to do it manually. Hopefully I'll get some feedback telling me what I'm doing wrong.
Spent more time today attempting to install software on SuSE. On the positive side, the KDE help center contains useful information, and you can start it without KDE from /opt/kde3/bin/khelpcenter, but it required a search index, and that didn't build:
Why do I have to go through three windows just to find a message that's incorrect (“The KDE libraries are not designed to run with suid privileges”)?
This was attempted from a non-root environment. It worked after reboot, so there seems to be some glitch there.
Again I had this irritating flicker and insistence on auto-raise that I had also noticed with kscope. Tried GNOME, which doesn't have this irritating behaviour on the screen, but it also doesn't have the SuSE docco. Maybe because of the way it was installed?
Did a complete update, and was left with the feeling that nothing had happened. Certainly the kernel update completed without rebooting. Went back and selected “System update”, on the way getting the interesting message:
That wouldn't be so interesting in itself if it weren't completely wrong. Yes, these file systems are low on space, but the values shown there have no relationship with reality:
=== grog@deeveear (/dev/pts/11) ~ 12 -> df -m
Filesystem           1M-blocks      Used Available Use% Mounted on
battunga:/                3970      2936       718  81% /battunga
echunga:/dump            76286     75199       325 100% /dump
wantadilla:/dumpa        76286     69297       887  99% /dumpa
wantadilla:/dumpb       187781    183278      2625  99% /dumpb
echunga:/                 8922      7673       536  94% /echunga
Even /battunga, which isn't over 4 GB in size, has more space left than yast2 thinks it had in total. That's not an NFS issue: the df output (in MB) is from the same system.
After that, it did reboot the system. But there was no indication at the end of the update step that anything else needed to be done.
At the end of the day I was left wondering whether I had done something obviously wrong, whether I had done something non-obviously wrong, or whether I had run into one or more bugs. Still, despite everything, this is no worse than what I've experienced with commercial software (see my experience with MacOS X in February), and better than most. I have a feeling that it's worth persevering, especially since I found that it appears to have some support for digital video recording, though it doesn't exactly reach out and grab you.
Do you have a comment about something I have written? This is a diary, not a “blog”, and there is deliberately no provision for directly adding comments. It's also not a vehicle for third-party content. But I welcome feedback and try to reply to all messages I receive. See the diary overview for more details. If you do send me a message relating to something I have written, please indicate whether you'd prefer me not to mention your name. Otherwise I'll assume that it's OK to do so.