Greg's diary

Tuesday, 20 February 2018 Dereel

Panic!
Topic: technology, opinion

Updated the system on teevee today, as I do about once a month. The last time was:

FreeBSD teevee.lemis.com 11.1-STABLE FreeBSD 11.1-STABLE #2 r327971: Mon Jan 15 10:55:53 AEDT 2018     grog@teevee.lemis.com:/home/obj/eureka/home/src/FreeBSD/svn/stable/11/sys/GENERIC  amd64

Nothing very interesting there, which is why I almost never mention it. Today it finished, I rebooted, went away briefly, and came back in time to see the system displaying:

reboot after panic: page fault
writing core to /var/crash/vmcore.1

Huh? I can't recall when I last had a panic with a FreeBSD -STABLE kernel. OK, watch what it does next. Finish rebooting normally, (automatically) start X and...

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address    = 0x8
fault code               = supervisor read data, page not present
instruction pointer      = 0x20:0xffffffff82798f0b
stack pointer            = 0x28:0xfffffe011773b570
frame pointer            = 0x28:0xfffffe011773b640
code segment             = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 3
current process          = 1140 (Xorg)
trap number              = 12
panic: page fault
cpuid = 2
KDB: stack backtrace:
#0 0xffffffff80ac1637 at kdb_backtrace+0x67
#1 0xffffffff80a7b5a6 at vpanic+0x186
#2 0xffffffff80a7b413 at panic+0x43
#3 0xffffffff80ef7f22 at trap_fatal+0x352
#4 0xffffffff80ef7f79 at trap_pfault+0x49
#5 0xffffffff80ef77e6 at trap+0x2c6
#6 0xffffffff80ed7c80 at calltrap+0x8
#7 0xffffffff827a65f3 at nvidia_dev_dtor+0x23
#8 0xffffffff80946885 at devfs_fpdrop+0xc5
#9 0xffffffff80949592 at devfs_open+0x142
#10 0xffffffff8106754c at VOP_OPEN_APV+0x7c
#11 0xffffffff80b4d003 at vn_open_vnode+0x203
#12 0xffffffff80b4cbdd at vn_open_cred+0x34d
#13 0xffffffff80b45ec2 at kern_openat+0x212
#14 0xffffffff80ef8fa8 at amd64_syscall+0xa38
#15 0xffffffff80ed84f1 at fast_syscall_common+0x105
Uptime: 1m43s
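
A console trace like this scrolls past too quickly to read, but once the text has been saved somewhere it is easy to pick apart. The following is a minimal sketch, assuming the backtrace is available as plain text. The function names and the list of trap-handling frames are illustrative choices based on the trace above, not part of any FreeBSD tool.

```python
import re

# One KDB frame looks like: "#7 0xffffffff827a65f3 at nvidia_dev_dtor+0x23"
FRAME_RE = re.compile(r"#(\d+)\s+(0x[0-9a-f]+)\s+at\s+(\w+)\+(0x[0-9a-f]+)")

# Frames that belong to the panic and trap machinery rather than to the
# faulting code. Assumption: taken from the trace above, not exhaustive.
TRAP_FRAMES = {"kdb_backtrace", "vpanic", "panic", "trap_fatal",
               "trap_pfault", "trap", "calltrap"}


def parse_kdb_backtrace(text):
    """Return a list of (frame number, address, function, offset) tuples."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append((int(m.group(1)), int(m.group(2), 16),
                           m.group(3), int(m.group(4), 16)))
    return frames


def first_interesting_frame(frames):
    """Return the first frame that is not part of the trap handling path."""
    for frame in frames:
        if frame[2] not in TRAP_FRAMES:
            return frame
    return None


# Four frames copied from the trace above.
SAMPLE = """\
#5 0xffffffff80ef77e6 at trap+0x2c6
#6 0xffffffff80ed7c80 at calltrap+0x8
#7 0xffffffff827a65f3 at nvidia_dev_dtor+0x23
#8 0xffffffff80946885 at devfs_fpdrop+0xc5
"""

frame = first_interesting_frame(parse_kdb_backtrace(SAMPLE))
print(frame[0], frame[2])   # prints: 7 nvidia_dev_dtor
```

The first frame below the trap handlers is the one worth looking at first, and here it is in the nvidia driver.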

I couldn't read all that, of course, so I went in to look at the dump. After all, when it comes to FreeBSD dumps, I wrote the book. OK:

=== root@teevee (/dev/pts/5) /var/crash 2 -> gdb -k /boot/kernel/kernel vmcore.1
gdb: unrecognized option `-k'
Use `gdb --help' for a complete list of options.
=== root@teevee (/dev/pts/5) /var/crash 3 -> kgdb /boot/kernel/kernel vmcore.1
GNU gdb 6.1.1 [FreeBSD]
...
This GDB was configured as "amd64-marcel-freebsd"...(no debugging symbols found)...
Attempt to extract a component of a value that is not a structure pointer.
Attempt to extract a component of a value that is not a structure pointer.
#0  0xffffffff80a7b38b in doadump ()
(kgdb) bt
#0  0xffffffff80a7b38b in doadump ()
#1  0xffffffff80a7b3b4 in doadump ()
#2  0xfffffe011773b200 in ?? ()
#3  0xffffffff80a7b10e in kern_reboot ()
Previous frame identical to this frame (corrupt stack?)

What went wrong there? Really corrupt kernel stack? What appeared on the screen seems plausible enough. And then I found a new file that never used to be there:

=== root@teevee (/dev/pts/5) /var/crash 4 -> l
total 802
...
-rw-------  1 root  wheel      146,739 20 Feb 15:54 core.txt.0
-rw-------  1 root  wheel          473 20 Feb 15:59 info.last
-rw-------  1 root  wheel  313,876,480 20 Feb 15:59 vmcore.1

That seemed to be related to the previous panic, but presumably the panic was the same. Looking inside showed all kinds of goodies, far beyond what I needed, but including a proper gdb trace:

#0  doadump (textdump=<value optimized out>) at pcpu.h:229
229     pcpu.h: No such file or directory.
        in pcpu.h
#0  doadump (textdump=<value optimized out>) at pcpu.h:229
#1  0xffffffff80a7b10e in kern_reboot (howto=260)
    at /eureka/home/src/FreeBSD/svn/stable/11/sys/kern/kern_shutdown.c:366
#2  0xffffffff80a7b5e0 in vpanic (fmt=<value optimized out>,
    ap=<value optimized out>)
    at /eureka/home/src/FreeBSD/svn/stable/11/sys/kern/kern_shutdown.c:759
#3  0xffffffff80a7b413 in panic (fmt=<value optimized out>)
    at /eureka/home/src/FreeBSD/svn/stable/11/sys/kern/kern_shutdown.c:690
#4  0xffffffff80ef7f22 in trap_fatal (frame=0xfffffe01176c04b0, eva=8)
    at /eureka/home/src/FreeBSD/svn/stable/11/sys/amd64/amd64/trap.c:817
...
#8  0xffffffff82799f0b in rm_free_unused_clients ()
   from /boot/modules/nvidia.ko
#9  0xfffffe01176c05d8 in ?? ()
#10 0xfffff80004bd0a00 in ?? ()
#11 0xfffffe01176c05c0 in ?? ()
#12 0xffffffff827a89fe in os_pci_read_dword () from /boot/modules/nvidia.ko
#13 0xffffffff827a75f3 in nvidia_dev_dtor () from /boot/modules/nvidia.ko
#14 0xffffffff80946885 in devfs_fpdrop (fp=<value optimized out>)
    at /eureka/home/src/FreeBSD/svn/stable/11/sys/fs/devfs/devfs_vnops.c:193

So it seems that since I last debugged a kernel, people have changed the method Yet Again. I had half expected that—that's why I tried two different incantations, but they're now both useless. Time for another RTFM.

   Debugging a crash dump
     By default, crash dumps are stored in the directory /var/crash.
     Investigate them from the kernel build directory with:

           gdb -k kernel.debug /var/crash/vmcore.29

That doesn't help much. It looks more like a case of WTFM, or possibly the issue is that this was the version of kgdb relating to the old kernel.

In any case, core.txt.0 (why isn't there a core.txt.1?) told me everything I needed to know: it's related to the nvidia X driver. Whose fault? Difficult to say, but my outstanding bug report means that I'm running an old version of the driver. Does it occur with the newest driver? That'll take a bit of testing. In the meantime the workaround was to revert to the old kernel.
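
The attribution to the driver can be read straight off the gdb trace: frames in a loadable module carry a "from /path/to/module.ko" suffix, while frames in the kernel proper carry an "at source-file:line" suffix. The following is a minimal sketch that counts frames per module, assuming the trace text has been extracted from a file like core.txt.0. The function name and regular expression are illustrative, and frames that gdb could not resolve (shown as ??) are skipped.

```python
import re
from collections import Counter

# Matches e.g. "#13 0xffffffff827a75f3 in nvidia_dev_dtor () from /boot/modules/nvidia.ko".
# The "from ..." part may be wrapped onto the next line, so \s+ spans newlines.
MODULE_FRAME_RE = re.compile(
    r"#(\d+)\s+0x[0-9a-f]+ in (\w+) \([^)]*\)\s+from (\S+)")


def module_frames(trace):
    """Return (frame number, function, module path) for frames inside modules."""
    return [(int(num), func, path)
            for num, func, path in MODULE_FRAME_RE.findall(trace)]


# Frames copied from the gdb trace above.
SAMPLE = """\
#4  0xffffffff80ef7f22 in trap_fatal (frame=0xfffffe01176c04b0, eva=8)
    at /eureka/home/src/FreeBSD/svn/stable/11/sys/amd64/amd64/trap.c:817
#8  0xffffffff82799f0b in rm_free_unused_clients ()
   from /boot/modules/nvidia.ko
#9  0xfffffe01176c05d8 in ?? ()
#12 0xffffffff827a89fe in os_pci_read_dword () from /boot/modules/nvidia.ko
#13 0xffffffff827a75f3 in nvidia_dev_dtor () from /boot/modules/nvidia.ko
"""

frames = module_frames(SAMPLE)
per_module = Counter(path for _, _, path in frames)
for path, count in per_module.items():
    print(path, count)   # prints: /boot/modules/nvidia.ko 3
```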


Wednesday, 21 February 2018 Dereel → Ballarat → Dereel

Shopping again
Topic: general

Off to Ballarat again this morning for shopping, also taking Yvonne with me to go to the physiotherapist's. While she was there, I went on to some furniture shops to see what was available. Now that I have given up on wanting an ottoman, which is called a chaise in Australia, the selection seems much more straightforward. Either a conventional lounge room suite like we have always had:


[Image: Furniture 1]

 

or an all-in-one round-the-corner job:


[Image: Furniture 3]

 

On the whole I'm more in favour of the former, but first we need more endless discussions. The other thing that was surprising was how big a difference there was in prices of what appeared to be similar suites, up to a factor of 4.

More fun at Woolworths. Last week I discovered that dried white beans are international food, but it seems that Thai red curry is not. It's fish, at least if it contains any:


[Image: International fish]

[Image: International fish detail 2]

Once again we needed the help of multiple assistants to find where to look for what we were looking for (sprats), which, unfortunately, we didn't find.


Car service intervals
Topic: general

To Ballarat Automotive to pay for the repairs to my car. I can't pick it up until I have another driver with me, and currently Yvonne doesn't fit that bill. But I had already established from CJ Ellis that the oil change interval for the car is indeed 15,000 km. How many had I done since the last service nearly 2 years ago? 5,000!

OK, there's the second half: “or 12 months”. Why 12 months? When I was a lad, the answer was simple: people who do low mileages do short trips, and the engine never warms up properly, so it accumulates moisture. This is exactly what my ancient “Book of the Car” (1970) states.

But what were the time intervals in those days? Around 1 year, I suspect, though at the time oil change intervals were typically 5,000 km. The distance has increased, but the time hasn't. And in any case, I don't do short journeys: I do few trips, but they're typically 60 or 70 km or more. So I decided against an oil change, not hindered by the fact that the car is on its last legs anyway.


nVidia pain, next step
Topic: technology, opinion

I've had a lot of pain with the nvidia driver for FreeBSD lately: first the performance bug I experienced last month, and then yesterday's panic. The two are not completely unrelated: as the result of the performance issue, I'm using an old version of the driver. Could it be that only this driver causes problems?

In any case, I had a rather strange request from the person handling the driver bug:

Can we get video showing performance drop when only single display is connected to the gpu?

I have tested with Ubuntu 16.04.2 + 384.111/390.32 drivers + GeForce GT 710 + Unity desktop . I observed these messages are generating in /var/log/Xorg.0.log file when display goes to sleep mode. I put display in sleep mode with command "xset dpms forece off" and checked Xorg logs. I observed below message generating after every second. But I didn't observed perf drop while interacting with desktop. Also I see below message when opened ubuntu applications like Power Management Settings, Abut this Computer, System Settings , Displays etc.

[  8618.484] (--) NVIDIA(GPU-0): HP 22uh (DFP-0): connected
[  8618.484] (--) NVIDIA(GPU-0): HP 22uh (DFP-0): Internal TMDS
[  8618.484] (--) NVIDIA(GPU-0): HP 22uh (DFP-0): 330.0 MHz maximum pixel clock
...

OK, this is clearly a different monitor. In any case the presence of the messages should be enough to investigate. But I suppose I should humour them, though it's a fair amount of work. In this case, though, it would also make sense to see whether the panic still occurs.

So I saved the old driver and installed the latest. Ran as before. No problem! No messages.
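
Checking for the messages by eye is tedious. The following is a minimal sketch of how the repeated "connected" lines could be counted, assuming the log format quoted above. Only the first timestamp in the sample comes from the quoted log; the other two are invented to illustrate a message that repeats roughly once a second. The function names are illustrative.

```python
import re

# Matches e.g. "[  8618.484] (--) NVIDIA(GPU-0): HP 22uh (DFP-0): connected"
CONNECTED_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+\(--\)\s+NVIDIA\(GPU-\d+\):\s+(.*?)\s+"
    r"\(([A-Z]+-\d+)\):\s+connected\s*$")


def connected_times(log_text):
    """Map each connector (e.g. DFP-0) to the timestamps of its 'connected' lines."""
    times = {}
    for line in log_text.splitlines():
        m = CONNECTED_RE.search(line)
        if m:
            times.setdefault(m.group(3), []).append(float(m.group(1)))
    return times


def repeat_intervals(timestamps):
    """Return the gaps in seconds between consecutive timestamps."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


# First line is from the quoted log; the later timestamps are hypothetical.
SAMPLE = """\
[  8618.484] (--) NVIDIA(GPU-0): HP 22uh (DFP-0): connected
[  8618.484] (--) NVIDIA(GPU-0): HP 22uh (DFP-0): Internal TMDS
[  8619.486] (--) NVIDIA(GPU-0): HP 22uh (DFP-0): connected
[  8620.489] (--) NVIDIA(GPU-0): HP 22uh (DFP-0): connected
"""

for connector, stamps in connected_times(SAMPLE).items():
    print(connector, len(stamps), repeat_intervals(stamps))
```

With a driver that does not show the problem, the dictionary should stay at one entry per connector for each time the X server starts.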

Why? It's a new driver, 390.25, released on 29 January 2018, as the archive shows. When I entered the bug report on 21 January, the driver release was 384.111, though the archive now shows that version 390.12 was released on 4 January. Was it really? Potentially the ports maintainer didn't include it in the port, though the change log reports:

r460676 | danfe | 2018-02-02 19:34:33 +1100 (Fri, 02 Feb 2018) | 4 lines

Update to the latest long lived branch version, 390.25.

PR: 225574

------------------------------------------------------------------------
r459638 | danfe | 2018-01-22 20:05:44 +1100 (Mon, 22 Jan 2018) | 5 lines

Update nVidia drivers to their latest versions which fix frequent kernel
panics reported by some users.

PR: 225346

------------------------------------------------------------------------
r457308 | danfe | 2017-12-27 05:55:18 +1100 (Wed, 27 Dec 2017) | 5 lines

Update nVidia driver ports to their most recent versions, bringing assorted
bugfixes and support for X.Org xserver ABI 23 (xorg-server version 1.19).

PR: 224597

The reference to panics in revision 459638 is interesting. Following up the bug report shows that one of the versions in question was 340.104, the version I was running. But what was the panic? No mention, only that it happened about once a day, not quite the same situation.
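
To line the port updates up against the date of the bug report, the header lines of the change log can be parsed mechanically. The following is a minimal sketch, assuming the svn log header format shown above. The function names are illustrative, and only the date part of each header is used, in the local time zone recorded in the log.

```python
import re
from datetime import date

# Matches e.g.
# "r460676 | danfe | 2018-02-02 19:34:33 +1100 (Fri, 02 Feb 2018) | 4 lines"
HEADER_RE = re.compile(
    r"^r(\d+) \| (\S+) \| (\d{4})-(\d{2})-(\d{2}) \S+ \S+ \([^)]*\) \| \d+ lines?$")


def parse_headers(log_text):
    """Return (revision, author, commit date) for each log header line."""
    commits = []
    for line in log_text.splitlines():
        m = HEADER_RE.match(line.strip())
        if m:
            commits.append((int(m.group(1)), m.group(2),
                            date(int(m.group(3)), int(m.group(4)),
                                 int(m.group(5)))))
    return commits


def commits_after(commits, day):
    """Return the commits made strictly after the given day."""
    return [c for c in commits if c[2] > day]


# The three header lines from the change log above.
SAMPLE = """\
r460676 | danfe | 2018-02-02 19:34:33 +1100 (Fri, 02 Feb 2018) | 4 lines
r459638 | danfe | 2018-01-22 20:05:44 +1100 (Mon, 22 Jan 2018) | 5 lines
r457308 | danfe | 2017-12-27 05:55:18 +1100 (Wed, 27 Dec 2017) | 5 lines
"""

bug_report_day = date(2018, 1, 21)   # date of the bug report, from the text
for rev, author, day in commits_after(parse_headers(SAMPLE), bug_report_day):
    print(rev, author, day.isoformat())
```

On this sample, two of the three updates, including the one that mentions panics, were committed after the bug report was entered.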

Yet another race condition, but hopefully I won't need to worry about it any more.


This page contains (roughly) yesterday's and today's entries. I have a horror of reverse chronological documents, so all my diary entries are chronological. This page normally contains the last two days, but if I fall behind it may contain more. You can find older entries in the archive. Note that I often update a diary entry a day or two after I write it.

Do you have a comment about something I have written? This is a diary, not a “blog”, and there is deliberately no provision for directly adding comments. But I welcome feedback and try to reply to all messages I receive. See the diary overview for more details. If you do send me a message relating to something I have written, please indicate whether you'd prefer me not to mention your name. Otherwise I'll assume that it's OK to do so.

