This is an abridged version of my personal diary for January 2005, limited to the entries of interest to computer people. See http://www.lemis.com/grog/diary.php for my personal diary.
Every barbecue is different; this time was marked at least by fewer photos. Also spent some time talking about infrared receivers. Daniel brought a kit that he had built a while ago, but (strangely) it only recognized some of the codes generated by the Philips remote control that I wanted to use. Chris wrote a program for TiVo some time ago, mainly intended to help the TiVo send commands to foreign VCRs, but it should also be able to help us identify what the difference is with the “dead” keys.
Essey Deayton held a barbecue today, the last one now: she's moving to Queensland. It's sad to see her go, but maybe there's a bright side to it; like Assurancetourix in the Astérix stories, she has the ability to attract bad weather. Last time we had such a storm that we can still see the results on the water tank; today, just before the barbecue, we had another storm and 13 mm rain, enough to interrupt my satellite reception several times:
    Jan 3 12:06:26 sat-gw sm200d[20817]: Tuner +*** TunerLock
    Jan 3 12:06:28 sat-gw sm200d[20817]: Tuner ++++ Running UP
    Jan 3 12:06:42 sat-gw sm200d[20817]: Tuner +*** TunerLock
    Jan 3 12:06:50 sat-gw sm200d[20817]: Tuner **** No signal
    Jan 3 12:07:12 sat-gw sm200d[20817]: Tuner ++++ Running UP
    Jan 3 15:19:17 sat-gw sm200d[20817]: Tuner +*** TunerLock
    Jan 3 15:19:26 sat-gw sm200d[20817]: Tuner **** No signal
    Jan 3 15:19:39 sat-gw sm200d[20817]: Tuner ++++ Running UP

I also lost modem connections a couple of times, though it's difficult to blame that on the weather unless we're about to see yet another telephone cable failure.
Having to correct mail messages is a pain, and I spent some time working on a document Communicating with email. It's only a start, but hopefully it'll come good soon (after which this link will automatically be updated).
One of the issues was completing the Fedora Core 3 installation that has been dragging on for days. After reading the man pages for rpm and yum and a HOWTO for rpm, it was still not clear to me how to find out what packages I needed to install to do software development.
On FreeBSD I'd fire up sysinstall and get a list by category of what's available, along with a one-liner stating what it is. The best I could find on the Fedora CDs is a directory listing or rpm -qa, which gave me things like:
    jwhois-3.2.2-6
    libxml2-2.6.14-2
    make-3.80-5
    irda-utils-0.9.16-3
    bind-libs-9.2.4-2
    pdksh-5.2.14-30
    ftp-0.17-22
    gettext-0.14.1-12
    rpm-python-4.3.2-21
    stunnel-4.05-3

There doesn't seem to be anything corresponding to FreeBSD pkg_info, which at least gives the one-liner. And by comparison, even ls sorts the names alphabetically.
Sent out a message to the Linux SA mailing list and got a number of replies that fell into the following categories:
Well, no. In the meantime I was given something else to do, coding for once: set O_DIRECT for a particular file, since it will be read in once and then discarded. That sounded trivial: just change the open flags. Unfortunately, it wasn't that simple: the program uses buffered I/O via a library wrapper, definitely a suboptimal choice. Considered that I might be able to do it anyway by calling fcntl to set the open flags after the return from fopen, but for some reason that didn't work: the flag was set and stayed that way, and there was no error indication, but then it appears that the file returned an immediate end-of-file indication. Using gdb wasn't easy: all this happens in a thread, and my breakpoints didn't stick. I obviously need to RTFM about debugging threaded programs. I was able to put an int3 instruction in the thread and hit it that way, but that's a bit hit and miss.
(Update, 10 January 2005) In fact, it seems that the issues weren't with threading at all; as I had originally expected, recent versions of gdb have no problem with threaded applications. The problem is function renaming. Like other POSIX.1 based systems, Linux provides a function called fopen, and it actually links in a function of that name (so you can set a breakpoint on it), but it's not what it calls:

    [Switching to Thread -151038272 (LWP 22706)]
    Breakpoint 3, file_open (p_filename=0x815021c "/fooblah/fooblah/settings.dat", flags=130) at file.c:738
    738 handle = fopen(p_filename, "rb");
    === gdb -> s
    755 handle = fopen(p_filename, "ab");
    Huh? Where did the call to fopen go?
    === gdb -> zs
    0x0808ee1f 755 handle = fopen(p_filename, "ab");
    0x808ee1f <file_open+663>: call 0x804a0f4 <_init+1608>
    === gdb ->
    0x0804a0f4 in ?? ()
    0x804a0f4 <_init+1608>: jmp *0x812a2f8
    === gdb ->
    0x0097f050 in fopen64 () from /lib/tls/libc.so.6
    0x97f050 <fopen64>: push %ebp
    === gdb ->

It's also very irritating that the s command doesn't step into the function, so you have to go at the assembler level.
Yana is leaving for a year in Europe in a couple of weeks, and she wanted to take her ancient Dell Latitude CPi laptop with her. It's done very good service: it's now 7 years old and has a 266 MHz processor and 96 MB of memory, but Yana has never complained. The batteries still seem to be in good condition:
    Battery 0:
        Battery status: high
        Remaining battery life: 83%
        Remaining battery time: 3:31:00

I was going to give her my next machine, an Inspiron 7500 built in mid-2000, but that has already had two sets of batteries die on it, the keyboard is falling apart, and the display hinges (which I replaced once in November 2002) appear to be dying again. Instead, decided to get a new machine, and after some discussion on the lists, decided to buy a Dell Inspiron 1150, not too different from adelaide, my Inspiron 5100, on eBay. It seems that the rules have changed since then: at the time I had to have it shipped to the USA (that hasn't changed, presumably because Dell wants it that way), but I could pay with my own non-US funds. That doesn't work any more, and I had to enlist the help of Wes Peters to pay for the thing (and to ship it on to me). Spent quite some time doing that.
Di Saunders along in the afternoon to use the phone. Hers has been out all day, and Telstra have told her that it won't be repaired before Tuesday evening, a clear violation of their Customer Service Guarantee that they didn't even bother to justify. And our federal MP thinks that the penalties for failure to observe the CSG are enough.
What a mess Linux system calls are! There are a number which have been paired, such as lseek and lseek64; at least in this case, the latter is designed to handle 64 bit offsets. Now it's true that the name lseek came when they extended offsets from 16 bits (seek), but that's a long time ago and a recognized mistake. Linux goes one step further and hides things so that a call to lseek actually issues a system call lseek64, at least in my case.
Got that sorted out, along with a surprising number of missing functions (lchmod, for example), and got it as far as running before running into trouble. Investigation showed that the system library call open gets bound to a function open64. I have no idea why, but it's really confusing. This also seems to be the real reason why I thought I couldn't debug threaded libraries last week: it wasn't the threads, it was the renamed function. What a mess. Anyway, proved that it can work, so left the rest to Monday.
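A possible explanation for the renaming, offered as an assumption since the project's build flags aren't shown: glibc's large-file support. When a program is compiled with `_FILE_OFFSET_BITS` defined as 64, the glibc headers redirect calls such as `open`, `fopen` and `lseek` to `open64`, `fopen64` and `lseek64`, and `off_t` becomes a 64-bit type. The sketch below shows the setting in isolation; the helper names `offset_bits` and `open_for_reading` are made up for illustration.

```c
/* Must come before any #include, or the headers won't see it. */
#define _FILE_OFFSET_BITS 64

#include <limits.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: width of off_t in bits for this compilation unit. */
int offset_bits(void)
{
    return (int) (sizeof(off_t) * CHAR_BIT);
}

/*
 * Hypothetical helper: open a file for reading.  The source says fopen,
 * but with the define above a 32-bit glibc build typically ends up
 * calling fopen64, which is what the gdb session stepped into.
 */
FILE *open_for_reading(const char *path)
{
    return fopen(path, "rb");
}
```

On a 64-bit system `off_t` is 64 bits wide with or without the define, so the effect is mainly visible on 32-bit builds like the one in the gdb session.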
Essey Deayton along round noon to give Yana an old Kodak DX3215 digital camera. Confirmed that it didn't present a disk-like interface via USB, so off to install gphoto—not for the first time—and ran into the same old problems. There's no man page gphoto(1): after some investigation, discovered it's called gphoto2(1). How I wish that people wouldn't include version numbers in the names of their programs!
gphoto2 is pretty basic, and accessing a digital camera is one of those things where a GUI interface can be of use, so located and installed gtkam, which was able to identify the camera pretty quickly, and sometimes even present the names of the directories on the camera. When I tried to download them, though, all I got was:
No idea what causes that. Tried it with my Nikon CoolPix 880 with similar results (identified the camera but couldn't access it). The “help” button was no help—literally: somehow it didn't get installed. It seems that every time I install a port I run into problems like this.
    Under Linux 2.4 transfer sizes, and the alignment of user buffer and file offset must all be multiples of the logical block size of the file system. Under Linux 2.6 alignment to 512-byte boundaries suffices.

This is a problem with Linux only; FreeBSD has no such restriction. On Friday I had done some testing which suggested that this problem didn't even exist, but it bit me here.
It would be easy to just allocate 512 bytes more data and use the first 512 byte boundary as the I/O base, but after going through one instance of doing so, it became clear that we need a way to allocate memory with particular alignment. The most obvious way would be to put a wrapper around the existing membank allocator. But where to store the information? Clearly we need to pass the aligned address to the application, which then frees it at some later time.
One possibility is to store the base address in the block itself (after all, we have 512 bytes to store it in). The exact method isn't so obvious, though: the alignment of the original block could be arbitrary (even with one of the last two address bits set), so it's not clear how to do it right. Didn't come to any conclusion.
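One way to store the base address in the block itself, sketched under assumptions: `malloc` and `free` stand in for the membank allocator, and the names `alloc_aligned` and `free_aligned` are hypothetical. The base pointer is copied with `memcpy` into the bytes just below the aligned address, which sidesteps the worry about the alignment of the original block. The extra `sizeof (void *)` bytes guarantee room for it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Return len bytes aligned to `alignment` (a power of two), or NULL. */
void *alloc_aligned(size_t len, size_t alignment)
{
    unsigned char *base;                /* block as returned by malloc */
    unsigned char *aligned;             /* address handed to the caller */
    uintptr_t addr;

    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    base = malloc(len + alignment + sizeof base);
    if (base == NULL)
        return NULL;
    /* Leave room for the stored pointer, then round up to the boundary. */
    addr = (uintptr_t) base + sizeof base;
    addr = (addr + alignment - 1) & ~((uintptr_t) alignment - 1);
    aligned = base + (addr - (uintptr_t) base);
    /* memcpy copes with any alignment of the slot below `aligned`. */
    memcpy(aligned - sizeof base, &base, sizeof base);
    return aligned;
}

/* Free a block obtained from alloc_aligned. */
void free_aligned(void *aligned)
{
    unsigned char *base;

    if (aligned == NULL)
        return;
    memcpy(&base, (unsigned char *) aligned - sizeof base, sizeof base);
    free(base);
}
```

A caller would use `buf = alloc_aligned(size, 512)` for the I/O buffer and later `free_aligned(buf)`; only the aligned address ever leaves the wrapper.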
My relay board for the sprinkler has arrived. What a nuisance that it requires a 25 pin printer extension cable to connect it. It should have been easy enough to put a 26 pin header on the thing as well, so that it could be mounted inside the computer case and connected internally. That also has the advantage that such cables are readily available.
                     User    System  Elapsed  malloced  context switches
                                              memory    voluntary  involuntary
    Old (buffered)   451.16  135.19  465      95174     36795      24322
    unbuffered       443.87  134.02  459      87654     36528      23685
    u+direct         418.46   35.16  428      84240      3641       5410

This shows a minor improvement as the result of the unbuffered I/O (about 1.3%, the order of magnitude that I would have expected), and about 8% using the combination of unbuffered I/O and O_DIRECT. This is considerably more than I had expected. Looking at the summary above, it's clear that the big win is in the system time, which is only about 25% of the previous values, and the number of context switches, which is down by up to 90%. I'm assuming that the speed improvement we're seeing is largely a reduction in the CPU time, since my test programs on the same hardware showed almost no difference.
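The percentages can be checked from the raw figures with a one-line calculation. This is only a sketch; the function name `percent_saved` is made up for illustration.

```c
#include <assert.h>

/* Percentage by which `after` is smaller than `before`. */
double percent_saved(double before, double after)
{
    assert(before > 0.0);
    return 100.0 * (before - after) / before;
}
```

With the elapsed times, `percent_saved(465, 459)` is about 1.3 and `percent_saved(465, 428)` about 8.0. For system time, `percent_saved(135.19, 35.16)` is about 74, so roughly a quarter of the original remains. For voluntary context switches, `percent_saved(36795, 3641)` is about 90.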
We're not done yet, of course: this is with old hardware. We need to check it again on the Opteron. I'm sure we'll see something very similar, but the performance improvements could be less or more.
    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 182901467904 (LWP 14754)]
    mb_alloc_aligned (len=6430760, alignment=512) at mem_aligned.c:126
    126 hint->lbyte [7] = 0;
    (gdb) bt
    #0 mb_alloc_aligned (len=6430760, alignment=512) at mem_aligned.c:126
    ...
    (gdb) i loc
    encompass = (ubyte_ *) 0x2a96472038 ""
    aligned = (ubyte_ *) 0x96472200 <Address 0x96472200 out of bounds>

Looking at the code, we had:
    ubyte_ *encompass;                                  /* encompassing block */
    ubyte_ *aligned;                                    /* and aligned block */

    encompass = (ubyte_ *) mb_all (len + alignment);    /* allocate memory */
    aligned = (ubyte_ *) (((int) encompass + alignment) /* get last address to use */
                          & ~(alignment - 1));
    hint = &((struct basehint *) aligned) [-1];         /* hang off the start */
    header_length = aligned - encompass;

This is very ugly code, and I had spent some time wondering how to do it correctly, but I hadn't expected the problems it caused. After some consideration, came to the conclusion that it's safest to use real address arithmetic (in other words, array addressing) as far as possible. Finally replaced the code with:
    int offset;

    encompass = (ubyte_ *) mb_all (len + alignment);    /* allocate memory */
    offset = ((int) encompass) & (alignment - 1);
    aligned = &encompass [alignment - offset];
    hint = &((struct basehint *) aligned) [-1];         /* hang off the start */

Still, there's something basically wrong with address arithmetic in C. It seems that the pedanticism of ANSI C hasn't helped much.
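The failure in the first version comes from squeezing a 64-bit address through a 32-bit integer. The sketch below reproduces the two addresses from the gdb output; it models the cast with `uint32_t`, assuming a 32-bit `int`, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Round up as the faulty code did: the address passes through 32 bits. */
uint64_t align_up_truncated(uint64_t addr, uint32_t alignment)
{
    uint32_t low;

    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    low = (uint32_t) addr;              /* the upper 32 bits are lost here */
    return (low + alignment) & ~(alignment - 1);
}

/* The same calculation carried out at full address width. */
uint64_t align_up_full(uint64_t addr, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr + alignment) & ~(alignment - 1);
}
```

For the block at 0x2a96472038 with an alignment of 512, the truncated version yields 0x96472200, the out-of-bounds address gdb reported, while the full-width version yields 0x2a96472200. The replacement code avoids the problem because it only takes the low bits of the address through the cast and does the rest with array indexing.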
In the process, ran into incredible problems with HTML tables. Either the browsers aren't doing what I expect of them, or I'm doing something wrong with the markup. In any case, it seems really difficult to get the table at the head of the page to display correctly. It should have my photo on the left, a heading and the date in the middle, and three links on the right. Even when there's enough space, my browsers tend to either wrap one of the lines or not to use the width of the page. And since HTML is in the eye of the beholder, it's impossible for me to know whether it will work properly or not.
While I was pondering this, I saw on IRC:
    <brueffer> groogle: how does it feel to get slashdotted?
    <groogle> brueffer: Relatively painless.
    <groogle> brueffer: Is it happening again?
    * groogle notes that sfr is just installing the new machine, so that's unlikely.

It turned out that the answer was “yes”: somebody had posted an article on my temperature control system. It's amazing how the load from a Slashdot posting looks the same every time:
This is pretty much the same as the T-shirt we had made when Tridge posted his TiVo hacks. Here's David Gibson wearing one:
In this case, though, it suffered a couple of dents where Stephen took down the server for the scheduled replacement. They could have chosen a better time to slashdot me.

I'm now getting regular requests for brewing temperature information, sometimes too regular:
    Jan 16 02:09:09 brewer tempcontrol: Query from ip68-10-120-61.hr.hr.cox.net (68.10.120.61)
    Jan 16 02:09:24 brewer tempcontrol: Query from unknown (195.159.15.218)
    Jan 16 02:09:56 brewer last message repeated 13 times
    Jan 16 02:09:59 brewer tempcontrol: Query from unknown (195.159.15.218)
    Jan 16 02:10:01 brewer tempcontrol: Query from unknown (195.159.15.218)
    Jan 16 02:10:33 brewer last message repeated 13 times
    Jan 16 02:10:46 brewer last message repeated 4 times

It seems that the machine at 195.159.15.218 is in some kind of loop. It's not clear whether it's malicious or not, but I firewalled them anyway. In another case, it was much more obvious:
    07:48:21.890190 < 68.194.48.16.49458 > 192.109.197.147.35846: S 1756009650:1756009650(0) win 1024
    07:48:21.961893 < 68.194.48.16.49459 > 192.109.197.147.27225: S 1756075187:1756075187(0) win 3072
    07:48:22.204138 < 68.194.48.16.49459 > 192.109.197.147.30331: S 1756075187:1756075187(0) win 3072
    07:48:22.208404 < 68.194.48.16.49460 > 192.109.197.147.59035: S 1756140724:1756140724(0) win 4096
    07:48:22.220790 < 68.194.48.16.49457 > 192.109.197.147.61826: S 1755944113:1755944113(0) win 3072
    08:49:29.940607 sm200d < ool-44c23010.dyn.optonline.net.49459 > brewer.lemis.com.16637: S 1756075187:1756075187(0) win 4096
    08:49:29.940692 sm200d < ool-44c23010.dyn.optonline.net.49459 > brewer.lemis.com.6813: S 1756075187:1756075187(0) win 3072

Sent a message to abuse@optonline.net, but of course got neither a reply nor a stop to the attempts. My firewall tells me:
    pkts  bytes  target  prot  opt  in  out  source            destination
    5981  239K   DROP    all   --   *   *    68.194.48.0/24    0.0.0.0/0
    1389  83500  DROP    all   --   *   *    195.159.15.0/24   0.0.0.0/0
A better solution would be for the called function to be able to wander down the stack and find the first unknown return address, and then resolve this into file and line number. Spent some time looking at the documentation for BFD, but only came to the renewed conclusion that GNU info is really a pretty terrible documentation system.
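On systems with glibc, the first half of that idea (wandering down the stack to collect return addresses) can be sketched with `backtrace` and `backtrace_symbols` from `<execinfo.h>`. This is an illustration under that assumption, not the approach the diary settled on, and the function names `capture_stack` and `print_stack` are made up. It yields addresses and symbol names only; resolving them to file and line still needs debug information and a tool such as addr2line or BFD, which is not shown.

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 32

/* Collect return addresses of the current call chain; returns the count. */
int capture_stack(void **frames, int max_frames)
{
    return backtrace(frames, max_frames);
}

/* Print the current call chain in symbolic form to stderr. */
void print_stack(void)
{
    void *frames[MAX_FRAMES];
    int count = backtrace(frames, MAX_FRAMES);
    char **names = backtrace_symbols(frames, count);
    int i;

    if (names == NULL)                  /* out of memory: nothing to show */
        return;
    for (i = 0; i < count; i++)
        fprintf(stderr, "%s\n", names[i]);
    free(names);                        /* one allocation holds all strings */
}
```

The called function would then skip the frames whose addresses it recognizes and report the first one it doesn't.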
Yana's new laptop (a Dell Inspiron 1150) arrived today, and spent some time installing that. It doesn't show FreeBSD in its best light that the current release (5.3) contains two different tools for resizing (obsolete) Microsoft FAT partitions, but nothing for resizing the standard NTFS partitions. Fortunately, found ntfsresize, the same tool I used on adelaide 18 months ago, on a Knoppix CD-R, but that shouldn't be necessary.
Problems with the laptop in another area too. Unlike adelaide, I was able to start X with no problems, but this time it doesn't recognize the touch pad. Spent some time investigating that, without success.
The ICT council meeting was at the Adelaide TAFE, a nice place but not an easy one to find your way around. Many people late as a result, and we had a relatively quiet meeting, in which, however, I found unexpected support for my idea of a South Australian “open source” multimedia project.
Got some replies about how to get the touchpad on Yana's laptop to work: it needs a flag to the psm driver, after which it worked fine:
    --- /boot/device.hints 2004/11/05 01:27:17 1.1
    +++ /boot/device.hints 2005/01/19 17:36:23
    @@ -27,6 +27,8 @@
     hint.atkbd.0.irq="1"
     hint.psm.0.at="atkbdc"
     hint.psm.0.irq="12"
    +# Needed on Inspiron 1150
    +hint.psm.0.flags="0x1000"
     hint.vga.0.at="isa"
     hint.sc.0.at="isa"
     hint.sc.0.flags="0x100"

This information was from Rob de Graaf. He also states that you need to add entries to /etc/rc.conf, but that's not correct: you just need to remove the entry disabling the mouse.
More work on Yana's laptop. This is just too complicated for non-technical people to install.
Spent some time working on my sprinkler project, and got the power supply working. It's surprising how much time it can take when you're out of practice. The result was also pretty ugly; I think I should consider making my own PCBs just to make things look better.
At least the power supply worked. Connecting it to the existing relays confirmed a vague suspicion I had had: to control the water pump (700 W), Barry “too much is barely enough” Engel had installed a three-phase power relay that probably handles 10 kW. The coil also draws 1.35 A at 24 V (yes, about 32 W, nearly 5% of the rating of the pump, just to keep the contacts closed):
More to the point, of course, is that the old sprinkler controller was only rated for a total of 0.7 A for everything. So it's no wonder that it died. Thanks, Barry.
Despite my expression of disdain, Michael also sent me an invitation to join gmail, Google's web mail service. I did sign up and got it to work, but I was left with the singularly powerful feeling “so what?”. What I found was:
In the end gave up with POP and made a slight modification to the manual method I have used to get my mail in the past. It basically compresses the mail spool and transfers it with scp, making it an order of magnitude faster than fetchmail. It also leaves behind backups on the source system in case of problems. The whole thing took me about 10 minutes, much less than the frustration of trying to get fetchmail working.
In the evening finally connected up my sprinkler controller:
The most difficult part was the wiring: the software worked out of the box. Also had heat problems with the power supply: it's surprising how little heat these small heat sinks can dissipate. The power supply delivers about 35 V unfiltered from the 24 VAC power supply, and the sprinkler solenoids draw about 0.35 A at 24 V. That means that the voltage regulator must dissipate 11 × 0.35 = 3.85 W. That's enough for the heat sink to be too hot to touch. Similar considerations apply for the 12 V supply. It looks as if I'll have to rebuild the power supply with bigger heat sinks. The regulators can handle 1 A each, so that's not the issue.
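The dissipation figure follows from the voltage dropped across the regulator multiplied by the load current. A minimal sketch of that calculation, with a made-up function name:

```c
#include <assert.h>

/* Power in watts that a linear regulator turns into heat. */
double regulator_dissipation(double v_in, double v_out, double current)
{
    assert(v_in >= v_out && current >= 0.0);
    return (v_in - v_out) * current;
}
```

`regulator_dissipation(35.0, 24.0, 0.35)` gives the 3.85 W above. If the 12 V regulator is fed from the same 35 V (an assumption), it drops 23 V and so dissipates 23 W for every ampere drawn, which is why it runs hot even at modest loads.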
Quiet day today. Chris Yeardley had arrived last night, but she wasn't staying long, and it was too hot to go riding anyway. She brought some old computers with her, though: a 386 and 486, each with 8 MB memory. They may come in handy for things like the sprinkler project. Started thinking about installing an old version of FreeBSD on them, but didn't get very far. The 386 didn't have a CD-ROM drive, and the 486 appears to have problems with the disk drive, though that could be a BIOS problem.
She also brought a Dell Inspiron 8100 with intermittent disk problems. There wasn't much to be seen, but it sounds like another contact problem. Spent yet another fruitless effort searching the impossible Dell web site for documentation. This time even Google didn't help much, though Dell's habit of giving the manuals slightly different titles each time is partly to blame. Found an online version on their Japanese support site, but who knows how long it'll stay there. There's nothing obvious on the site to suggest that the information might be there.
Good question. Jaycar's web site is no more obnoxious than others, but it looks really bad compared with a good catalogue. It's nice to be able to find things by keyword, but you need to enter the keywords one at a time. Looking for a housing for the power supply for the sprinkler, I entered housing and got things related to security cameras, only. The entries for the individual items are not as well formatted as in the catalogue, and they display at only 10 items per screen, making navigation difficult on the one hand, and requiring clicks on the more >> button on the other hand to be able to see enough detail. On the positive side, many have links to documentation; the catalogue doesn't have anything similar.
The real problem was placing the order, though: I started entering stuff and then was distracted by something else. Came back and my order had expired. That's common enough, but what a pain! You'd be really upset if you wrote something down on paper, came back an hour later and discovered that the ink had faded away. The real problem here is the issue of sessions: at a time when most computers didn't have much power, it seemed sensible to get the server to do most of the work. It doesn't any more, but communication between software on the client and on the server is far too complicated. It needs a complete rethink.
On the work side, got my O_DIRECT kludges sorted out, and discovered it was using inordinate amounts of memory. While testing that, discovered that the newest version of the project doesn't work at all on my system: it SIGSEGVs immediately. Strangely, nobody else had seen that.
Shane Adcock, the electrician, put in a brief appearance with some news: he's building a new office not too far from here and installing networking. He asked a number of questions, but also came up with some interesting information: Cat 6 cable requires different tools from Cat 5E, and they don't seem to be available.
More important, though, is that the Echunga telephone exchange will be equipped with an ADSL DSLAM in the next few weeks. Checked out the impossible Telstra web site (what suburb? Echunga has a population of 500) and found nothing (“Congratulations! BigPond Broadband Satellite is available”). Why do so many companies have such completely useless web sites, and why do almost none of them make it easy to find contact addresses and phone numbers? Found a phone number anyway, called them up and got somebody who was more interested in transferring my phone lines back to Telstra, but finally determined (after I insisted) that ADSL was indeed coming to Echunga. No further information, but she did put me on a list, though it's not clear what the purpose of the list is.
A couple of weeks ago I bought a second-hand Apple laptop, and it arrived today. It's a PowerBook G3 SCSI, which is apparently also known as WallStreet—maybe. This machine has a 400 MHz processor and an 18 GB disk, both of which are only available on the “bronze keyboard” and “FireWire” versions, but this is neither of those.
Initial experiences were interesting: on the one hand it's full of eye candy, and it's anything but intuitive to use. On the other hand, it's the first laptop I've ever been able to use out of the box, sort of: the BSD interface makes up for a lot of other pain. It remains to be seen whether I can get it to behave the way I want.
Into town in the afternoon for a meeting. Looks like we have some new things under way, including a beer bust on Fridays—shades of Tandem—and other things that will require me in town more frequently. We're living in interesting times.
$Id: hackers-diary-jan2005.html,v 1.1 2008/02/10 23:51:58 grog Exp $