While the procedure I mentioned in my original summary (at the end of this message) worked, there were, and still are, some caveats. With additional input from Derek Smythe and Noel Milton Vega, and a gob more testing on my part, I thought I would post a complete procedure. This allows me to take ufsdump backup tapes from one system to another when the first has failed and I don't have an identical machine to recover to. In my case I'm planning for a situation where I might have to recover a Sun Blade 100 to a Sun Enterprise 250 (of which I have several hand-me-downs in waiting).

----------
Caveats:

I'm using Solaris 9. With 10, I understand, I would need to do a "root_archive" (see the man page).

Many of the variations I tried with devfsadm to rebuild the device tree ended up with the internal drives being c1 rather than c0. The procedure below simply works. When I did it with -v for verbose, I was astonished at how much it did.

Both machines (the "dead" one and the replacement) are sun4u platform with UltraSPARC II CPUs. I don't know how this procedure would work if the differences were more extreme.

I brought the recovered system up without a network connection. I could see that the system booted, that the expected applications came up, and that the console messages were as expected. The applications were doing their regular things, trying to punch their way out of the box, and complaining they had no network. If I had given it a network connection, it would have played havoc with my network.

Those who have the infrastructure set up might use flash archives to accomplish what I am doing here. I'm using hand-me-down machines and still lobbying for a tape robot so that I can centralize backups.

-------------
Procedure:

Dead machine. Have backups from ufsdump.

Set up the replacement machine with an appropriate tape drive, a CD drive, and a boot drive that can be reformatted.

Boot off the Solaris 9 install CDROM, choose a language, and when it asks about formatting drives, quit. This gets you to a unix prompt.

# format

Choose the boot drive, and format/partition it according to the boot drive on the original machine. It is important that you have kept such information as part of your backups -- a printout of /etc/vfstab, `df -k`, and a printout of the partition table. The drive I had was actually bigger than the one I was replacing, so I had extra space. I just made sure the relevant partitions were large enough.

# newfs /dev/dsk/c0t0d0s0

Set up a file system on whichever partitions you are going to need to recover from the tapes. Put the tape in.

# mount /dev/dsk/c0t0d0s0 /mnt

Mount the partition you are going to recover.

# cd /mnt

Get into the partition where you are going to recover.

# rm -r lost+found

I don't know that that is really necessary, but it eliminates an error complaint when the restore tries to recover that directory.

# mt status

Make sure you are at the right position for the recovery. My tapes have several "files" per tape corresponding to the partitions on the boot drive. It is important to have a copy of the backup script or command that was used to make the tape so that you know exactly what is on it and in what order. This should have been done in advance, since the machine is now dead.

# ufsrestore rf /dev/rmt/0n

Do the recovery. From a console, verbose mode can actually slow it down, so I leave that off. I use the no-rewind device "0n" so that I can then grab the next partition off the same tape.

# ls

Check to make sure you have it.
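If the dump you need is not the next file on the tape, the no-rewind device can be repositioned with mt before running ufsrestore. A minimal sketch, assuming for the sake of example that the partition you want is the third dump file on the tape (the actual order depends entirely on the backup script that wrote it):

# mt -f /dev/rmt/0n rewind
# mt -f /dev/rmt/0n fsf 2
# mt -f /dev/rmt/0n status

The fsf 2 skips forward over the first two dump files; the status output should then show "file no= 2" (mt counts from zero), which is the third file on the tape. The ufsrestore above then picks up exactly where the tape is positioned. This matters mostly when you come back around to restore the other partitions, below.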
# cd /
# umount /mnt

Repeat the above (from the mount down to the umount) for each partition that needs to be recovered. Be sure to check the mt status and do an ls to make sure you have what you think you should have.

# installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t0d0s0

Get your boot blocks in. It is worth noting here that, although I have documented things with s0 as the root partition, I actually had my system set up with s3 as the root partition. I installed the boot blocks on s3.

Now, because the hardware is actually somewhat different (e.g. SCSI devices rather than IDE), some changes have to be made to information in the root partition.

# mount /dev/dsk/c0t0d0s0 /mnt
# mv /mnt/etc/hostname.eri0 /mnt/etc/hostname.hme0

This was specific to my switch of systems; others may differ. The Sun Blade uses eri for the network interface, whereas the E250 uses hme. I also had 10 virtual interfaces on this machine, so I had to repeat the above command that many times, modified each time like:

# mv /mnt/etc/hostname.eri0:1 /mnt/etc/hostname.hme0:1

Then toss and rebuild the device tree:

# rm /mnt/etc/path_to_inst
# rm -r /mnt/dev
# rm -r /mnt/devices
# /usr/sbin/devfsadm -r /mnt -p /mnt/etc/path_to_inst

If you put -v on the end, it will scroll off the things it is adding for 5 minutes or more. It is of interest that I didn't find the -p option in my man pages. I could tell the rebuild wasn't coming out right, and I kept trying slightly different things. Finally I found a page describing Veritas recovery that mentioned the -p. It worked. I had also tried a variety of alternatives that were more selective about the rm. I kept ending up with my internal drives being c1 rather than c0. Complete removal of the device tree got rid of that problem and worked. I actually newfs'd this partition and started from scratch several times, just to make sure I had a straight-through clean procedure and wasn't muddying things up with multiple retries.

# touch /mnt/reconfigure

Just for good measure; it can't hurt.

# umount /mnt

Now, at this point, if the EEPROM is in order, I could just reboot and I would be in business. However, I have had issues at one time or another with inherited machines having settings that got me in trouble in one way or another. So ...

# stop-a   (that is, hold the Stop key and press "a" to drop to the ok prompt)

{1} ok printenv

Check all the EEPROM settings. In particular, I want:

{1} ok setenv auto-boot? true
{1} ok setenv boot-device disk

Remember, I said I had my root partition on s3? To make that work, I needed boot-device disk:d rather than disk.

{1} ok setenv diag-switch? false

That's just to keep it from going to diagnostics and then booting to diag-device, which is net.

{1} ok boot disk

Or I would say "boot disk:d" to boot off s3 as the root partition. Also, note that these settings are particular to my hardware; depending on the hardware you are setting up on, this may differ.

At this point, I should be in business.

------------------------
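While I'm on the subject of preparation: the vfstab, df -k, and partition table printouts I mentioned at the format step are the kind of thing that can be captured from the live system ahead of time and kept with the backup documentation. A minimal sketch of what I mean -- the destination directory and the backup script name here are just placeholders for wherever you actually keep such notes and whatever actually writes your tapes:

# df -k > /recovery-docs/df-k.txt
# prtvtoc /dev/rdsk/c0t0d0s2 > /recovery-docs/partition-table.txt
# cp /etc/vfstab /recovery-docs/vfstab.txt
# cp /usr/local/sbin/nightly-backup.sh /recovery-docs/

prtvtoc prints the disk's partition table (the VTOC) by way of the s2 slice, which covers the whole disk; that printout is what you work from when you repartition the replacement drive in format. The last line stands in for keeping a copy of the backup script itself, so you know what is on each tape and in what order.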
------------------------

Now, since I had to do all those virtual interfaces, and the rebuilding of the device tree, I decided to put that in a script and store it on the machine that "might die" so that it would be on the backup tapes I would be recovering from. I put it in /etc/rebuild. It contains some documentation of the rebuild process and the following executable lines (all with absolute paths, to get rid of the risk of someone executing the script in the wrong situation). I figured this might save me some typing, and possible typos, some day when I'm in panic mode.

#!/bin/sh
mv /mnt/etc/hostname.eri0 /mnt/etc/hostname.hme0
mv /mnt/etc/hostname.eri0:1 /mnt/etc/hostname.hme0:1
mv /mnt/etc/hostname.eri0:2 /mnt/etc/hostname.hme0:2
mv /mnt/etc/hostname.eri0:3 /mnt/etc/hostname.hme0:3
mv /mnt/etc/hostname.eri0:4 /mnt/etc/hostname.hme0:4
mv /mnt/etc/hostname.eri0:5 /mnt/etc/hostname.hme0:5
mv /mnt/etc/hostname.eri0:6 /mnt/etc/hostname.hme0:6
mv /mnt/etc/hostname.eri0:7 /mnt/etc/hostname.hme0:7
mv /mnt/etc/hostname.eri0:8 /mnt/etc/hostname.hme0:8
mv /mnt/etc/hostname.eri0:9 /mnt/etc/hostname.hme0:9
mv /mnt/etc/hostname.eri0:10 /mnt/etc/hostname.hme0:10
rm /mnt/etc/path_to_inst
rm -r /mnt/dev
rm -r /mnt/devices
/usr/sbin/devfsadm -r /mnt -p /mnt/etc/path_to_inst
touch /mnt/reconfigure

Then I could sidestep all that typing by doing:

# mount /dev/dsk/c0t0d0s0 /mnt
# /mnt/etc/rebuild
# umount /mnt

That's it.

---------------

Chris Hoogendyk

-
   O__  ---- Systems Administrator
  c/ /'_ --- Biology & Geology Departments
 (*) \(*) -- 140 Morrill Science Center
~~~~~~~~~~ - University of Massachusetts, Amherst

<hoogendyk@bio.umass.edu>

---------------

Erdvs 4


-------- Original Message --------
Subject: [Summary] Recovery backups to slightly different hardware
Date: Wed, 18 Oct 2006 16:54:38 -0400
From: Chris Hoogendyk <hoogendyk@bio.umass.edu>
To: Sun Managers List <sunmanagers@sunmanagers.org>
References: <45365B4A.6040905@bio.umass.edu>

Thanks to everyone. [Original message at bottom.]

Essentially all had the same suggestion with slight variants -- Karl Rossing from Federated Ins., CA; Claude Charest from Hydro-Quebec, CA; Steve Beuttel from cox.net; Francisco from Ann Arbor, MI, US (www.blackant.net); Michael Maciolek from world.std.com; Stan Pietkiewicz from Statistics Canada; and Christopher Manly from Cornell University.

I used Steve's suggestion, because he provided step-by-step detail that accounted for the idiosyncrasies of copying device trees:

   Assuming you're booted from the CD, and your "/" is mounted on "/a", try:

   "cd /a"
   "mv dev <yymmdd>_dev"
   "mv devices <yymmdd>_devices"
   "mkdir dev devices"
   "chmod 755 dev devices"
   "chown root:sys dev devices"
   "cd /dev; find . -depth -print | cpio -pdm /a/dev"
   "cd /devices; find . -depth -print | cpio -pdm /a/devices"
   "cd /a/etc"
   "mv path_to_inst <yymmdd>_path_to_inst"
   "cp -p /etc/path_to_inst /a/etc/path_to_inst"

   Then reboot.

   -Steve-

Others suggested using devfsadm. I should probably look into that for the future. However, Steve's method worked. I also did a touch /a/etc/reconfigure for good measure.

---------------

Chris Hoogendyk

-
   O__  ---- Systems Administrator
  c/ /'_ --- Biology & Geology Departments
 (*) \(*) -- 140 Morrill Science Center
~~~~~~~~~~ - University of Massachusetts, Amherst

<hoogendyk@bio.umass.edu>

---------------

Erdvs 4


Chris Hoogendyk wrote:
> I have been trying to do a proof of concept and document the details
> to recover one of our critical servers in case it fails for some
> reason. (Just last month we had a building-wide power snafu that
> caused untold $$$ damage. My servers survived, but the event instilled
> the fear of God, so to speak.) The server in question is a Sun Blade
> 100 (yeah, I know, it's not a Server) that is running our name
> services for our internal network. If it goes down, the network starts
> falling apart.
>
> Anyway, most of our departmental servers are E250's, and we happen to
> have a few extra E250's for backup.
>
> Both of these systems are sun4u and we are running Solaris 9.
> I have backup tapes that are done using ufsdump from an fssnap snapshot
> piped through ssh to a remote tape drive on another server. I've used
> these to recover files and directories, but never had to do a full
> recovery. So, I figured I would grab a backup tape, a spare E250, plop
> some drives in it, and try to do a recovery.
>
> I started out by booting off the Solaris 9 install CD, formatting and
> partitioning c0t0d0 to match the boot drive on the Sun Blade, and then
> doing newfs and recovering all the partitions from the backup tape
> using ufsrestore. Everything seems to be there. I went into /mnt/etc
> and did `mv hostname.eri0 hostname.hme0` for each of the interfaces,
> 'cause I knew that would hit. Then I did the installboot, got back to
> the OK prompt and did a `boot disk:d` (that's where the root partition
> is). It goes through all its stuff and finishes up with:
>
> -----------------------------
>
> Rebooting with command: boot disk:d
> Boot device: /pci@1f,4000/scsi@3/disk@0,0:d File and args:
> Loading ufs-file-system package 1.4 04 Aug 1995 13:02:54.
> FCode UFS Reader 1.12 00/07/17 15:48:16.
> Loading: /platform/SUNW,Ultra-250/ufsboot
> SunOS Release 5.9 Version Generic_118558-03
> 64-bit|\-/|\-/|\-/|\-/|\-/|\-/|\-/|\-/|\-/|\-/|\-/|\-/|\-/
> Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved.
> Use is subject to license terms.
> WARNING: status 'fail' for '/rsc'-/|\-/|\-/
> configuring IPv4 interfaces: hme0 hme0:1 hme0:10 hme0:2 hme0:3 hme0:4
> hme0:5 hme0:6 hme0:7 hme0:8 hme0:9.
> Hostname: pilot
> /dev/dsk/c0t0d0s1: No such device or address
> The / file system (/dev/rdsk/c0t0d0s3) is being checked.
> Can't open /dev/rdsk/c0t0d0s3
> /dev/rdsk/c0t0d0s3: CAN'T CHECK FILE SYSTEM.
> /dev/rdsk/c0t0d0s3: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
>
> WARNING - Unable to repair the / filesystem. Run fsck
> manually (fsck -F ufs /dev/rdsk/c0t0d0s3). Exit the shell when
> done to continue the boot process.
>
>
> Type control-d to proceed with normal startup,
> (or give root password for system maintenance):
>
> -----------------------------
>
> When I went in and tried `format`, it said "no disks found".
>
> I rebooted off the cdrom, did `format`, and they are there.
>
> I actually did 2 more things in the process of debugging and getting
> to this point.
>
> I did `mount /dev/dsk/c0t0d0s3 /mnt`, went into /mnt/etc and did a
> `touch reconfigure`.
>
> I also went into /mnt/platform/SUNW,Ultra-250 and didn't find a
> "unix", whereas I did find it in /mnt/platform/sun4u. So, I did
> `mv SUNW,Ultra-250 SUNW,Ultra-250.orig` followed by a
> `ln -s sun4u SUNW,Ultra-250`. This got me past an earlier error,
> ... I think.
>
>
> So, now I'm stuck and not quite sure whether this is impossible or I'm
> just missing the magic trick. I thought since they were both UltraSPARC
> and sun4u that I would be able to do it. Any suggestions or insight
> would be much appreciated.
>
> ---------------
>
> Chris Hoogendyk
>
> -
>    O__  ---- Systems Administrator
>   c/ /'_ --- Biology & Geology Departments
>  (*) \(*) -- 140 Morrill Science Center
> ~~~~~~~~~~ - University of Massachusetts, Amherst
>
> <hoogendyk@bio.umass.edu>
>
> ---------------
>
> Erdvs 4

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers