UFies.org – The Plan

Another late night, it seems. OK, so here’s the plan. Cat5 has put me in his debt by lending me his desktop machine, a nice little Athlon XP 1800+, to use for UFies. Originally it was just going to be for the motherboard, but in the end I decided to do a bit more with it: take two of the hard drives out of the current UFies.org box (as you saw, there are enough), set them up in RAID1, and copy the entire UFies system onto them from the backup drive. Suddenly the box is back up and running exactly as it was before, with only a little less power in it. With the server in a new box, the old server is free and clear to be tested, poked, prodded, and whatever. No dickering around with moving hardware from one place to another; just plug it in and the same SCSI/IDE/memory is there to be used.

The “new” server can go back to the server farm for a week or so until new hardware is ordered and a new system is built and tested. Because there is no downed server that I’m feeling guilty about and no people wanting to go home looking over my shoulder, I’m free to do it slowly and properly. Well, until Cat5 arrives after a week of computer withdrawal and steals my systems, that is.

So for the last few hours I was doing (again) what I thought would be relatively quick and easy. Hardware transferred over, boot up, make sure things are good, create partitions, create RAID, all good. Had some trouble convincing the new server to boot up with a third hard drive in it, so I ended up copying all the data from the backup onto just one half of the RAID1, but that’s OK. When the data was transferred over I put the other drive back in, booted up Knoppix, started the RAID, and waited for things to sync up (again).

Again, right at the end an odd hard drive error came up. No huge worries, hard drive errors are what I’m all about, and doing a software RAID resync seems to be a good way to flush them out. So I take out the second drive, throw in yet another spare IDE from the original server and let that sync. I’m currently waiting for the last of the three partitions to finish with 8.2… 8.0…. 7.9…. minutes remaining. My fingers are crossed 🙂
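For anyone who would rather not squint at that countdown by hand, here is a minimal sketch of reading the same numbers out of `/proc/mdstat`, assuming the usual Linux software RAID progress line (`resync = …% … finish=…min`). The function name `resync_status` and the sample text are made up for illustration; the array names, sizes, and percentages are not from the actual UFies box.

```python
import re

# Progress line the md driver prints while a resync or recovery is running.
PROGRESS = re.compile(
    r"(?P<kind>resync|recovery)\s*=\s*(?P<percent>[\d.]+)%"
    r".*?finish=(?P<minutes>[\d.]+)min"
)
# Start of an array stanza, e.g. "md0 : active raid1 ...".
ARRAY = re.compile(r"^(md\d+)\s*:")


def resync_status(mdstat_text):
    """Return {array: (kind, percent, minutes_left)} for arrays still syncing."""
    status = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY.match(line)
        if header:
            current = header.group(1)
            continue
        progress = PROGRESS.search(line)
        if progress and current:
            status[current] = (
                progress.group("kind"),
                float(progress.group("percent")),
                float(progress.group("minutes")),
            )
    return status


# Illustrative sample only (hypothetical devices and sizes).
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 hdc1[1] hda1[0]
      1048576 blocks [2/2] [UU]

md2 : active raid1 hdc3[2] hda3[0]
      30000000 blocks [2/1] [U_]
      [=============>.......]  recovery = 68.4% (20520000/30000000) finish=7.9min speed=19800K/sec

unused devices: <none>
"""

if __name__ == "__main__":
    for array, (kind, percent, minutes) in resync_status(SAMPLE).items():
        print(f"{array}: {kind} {percent}% done, about {minutes} min left")
```

On a live box you would pass in `open("/proc/mdstat").read()` instead of `SAMPLE`; arrays that have finished syncing simply don't show up in the result.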

Once this is done I boot up the old root with knoppix in rescue mode, run lilo, make sure the partition changes are in ok, and reboot a couple of times to make sure things come up properly and then get the box ready to be dropped off sometime tomorrow morning. The hosting guys were nice enough to agree to come in and let me drop the box in (should be a matter of plugging things in… in theory).

6.0… 5.8… 5.6…

Update: Had a minor speed bump in there (/usr was corrupted after the drive muck-up I mentioned, so I had to re-copy it… now the box boots with three IDE drives… odd, happy, but odd), but we’re almost there. Files are copied over and I’m ready to run lilo and reboot.

And the crowd goes wild! Up and in single user mode! Why the hell have /home and /var suddenly disappeared? Got me. Ok, up and going… drives don’t go into DMA mode… argh! Oh, forgot to enable VIA support, make modules modules_install, modprobe via82cxxx, ok, back up and going… ok, reboot, can’t boot from CD again, oh well, attach backup hard drive, mount, check, ok, all good, copy data, compile a kernel at the same time as well, why the hell not, it’s 2am and I’m in a crazy mood! Ok, /var copied over and 9 gigs of /home left… bed time! (2:21am)

2 Comments on “UFies.org – The Plan”

  1. So are we talking bad hard drives still? argh..
    I’ll send Neil out with a WD 40GB with the motherboard if you require it… Too bad there isn’t any warranty left on these suckers, or is there? I guess you’re going to be testing them when rebuilding the main server?
    Also when you do that, run the SCSI drives through tests as well. Seatools is what we use at work (http://www.seagate.com/support/seatools/) Not sure, but it might work on the IDE drives as well.

  2. By any chance are you mixing different drive manufacturers (e.g. Seagate, Western Digital, IBM…)?
    I have a PC which had MAJOR issues with a WD and Seagate drive on the same bus, mixed in with the CDROM drive, which was causing me NO END of grief. It wasn’t until I figured out the magic scheme to connect everything that all the DMA errors disappeared, and ever since then the machine has been working tirelessly.
    Another odd thing I noticed with AMD processors is that when they start to overheat, they also begin exhibiting oddball DMA errors and such. Make sure that your CPU fan isn’t doing something silly like sticking from a cable caught in its fins.
    (and if these seem like simplistic suggestions, trust me, the DOH! factor when I resolved everything wasn’t lost on me …)