Saturday, November 08, 2008

Something broke...

Day One

Three kernel panics already and the day has just started. Looks like something is seriously wrong with my computer, since it used to run fine before.

45 minutes of memcheck'ing later, still no hint as to what is wrong. Nothing suspicious in dmesg or other logs.
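
A minimal sketch of how such a log check could be scripted, rather than eyeballing dmesg by hand. The keyword list and the function name find_suspicious_lines are assumptions made up for illustration, not an exhaustive list of what the kernel prints, so a clean result here proves little.

```python
import re
import sys

# Hypothetical keyword list (an assumption): strings the Linux kernel
# commonly prints around crashes and hardware faults.
SUSPICIOUS = re.compile(
    r"kernel panic|oops|bug:|machine check|segfault|i/o error|call trace",
    re.IGNORECASE,
)


def find_suspicious_lines(lines):
    """Return (line_number, text) pairs for lines matching SUSPICIOUS."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if SUSPICIOUS.search(line)
    ]


if __name__ == "__main__":
    # Read a log file given as the first argument, or stdin otherwise.
    if len(sys.argv) > 1:
        source = open(sys.argv[1], errors="replace")
    else:
        source = sys.stdin
    for number, text in find_suspicious_lines(source):
        print(f"{number}: {text}")
```

Saved under a name of your choosing (say scan_log.py, also hypothetical), it can be fed from a pipe, e.g. "dmesg | python3 scan_log.py", or pointed at a log file path.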

Kernel panics continue. I have now cleaned up some older Nvidia drivers, just in case any of them were causing trouble... no success.

Now did an explicit downgrade from Nvidia 173 to 96, but that didn't help either. The thing continues to crash every hour or two.

Day Two

Next day, just a single kernel panic the whole day, still clueless about the cause. Switching to the console didn't work at one point and only resulted in a graphical mess; might be related or not.

Day Three

One kernel panic so far. This one was interesting, since the machine crashed a second after I unmounted a disk, with the umount command segfaulting before the crash. The rest of the day was crash-free.

Day Four

A very crash-happy day so far: one crash with garbled graphics on the screen, two random crashes, and one crash while booting. Now playing around with lm-sensors and cpuburn to see if I can reproduce the crashes somehow. I have also downgraded the kernel, maybe that will help or maybe not, and disabled swap. Side note: the Wacom tablet no longer works under Ubuntu 8.10 with an old 2.6.24 kernel; movement is registered, but pressing down isn't.

All to no avail: the thing continues to crash, and quite frequently.

Next step: try "acpi=off apm=on apm=power-off irqpoll" as boot parameters.

Day Five

The boot parameters had no effect: two crashes already, one of them during boot right after switching the computer on this morning. Now checking whether "apt-get remove acpid acpi --purge" will do anything.

Day Six

Crashes are getting worse and worse; now the thing will only survive for minutes at a time. After running memcheck again for 1:30h it finally showed errors. However, it shows errors in *both* RAM modules, and removing one or the other still gives errors. So it might not be the RAM that's bad, but the mobo or whatever. No idea. I have now reset the BIOS to "Save default". No success.

This PC is screwed. Now preparing to move stuff to another box.

