Booting up Solaris 10 from a SAN replicated LUN on a different Sun SPARC server

The quickest way to recover from a total disaster is to have some form of replication already in place. There are two methods of real-time replication: hardware and software. My experience with software replication, such as Symantec Veritas Volume Replicator on AIX, was not pleasant; it required constant maintenance and troubleshooting. Hardware replication is the better choice if you can afford it. Many organizations pick software replication because it generally costs a lot less up front, but the cost of maintenance eventually adds up.

I will explain how to recover a Solaris 10 server from a hardware-replicated SAN disk. It took me some time to figure out how to boot from the replicated SAN LUN (disk), and many more hours to understand why the steps I applied work.

In this example I have a Sun SPARC Enterprise M3000 server with two QLogic Fibre Channel cards (HBAs) installed in the PCI slots. The HBAs were already configured to connect to the SAN disk (LUN). This LUN contained the replicated copy of a production Solaris 10 server. The production server had two ZFS pools residing on a single LUN.

Using the Solaris 10 installation CD, boot the SPARC server into single-user mode.

boot cdrom -s

The first thing you need to do is check that the HBAs are working. The CONNECTED status indicates that communication between the server and the switch is working. There are two HBAs installed for redundancy, both connected to the same LUN.

# luxadm -e port
/devices/pci@0,600000/pci@0/pci@8/SUNW,qlc@0/fp@0,0:devctl CONNECTED
/devices/pci@1,700000/pci@0/pci@0/SUNW,qlc@0/fp@0,0:devctl CONNECTED
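
If both ports report CONNECTED but the LUN does not show up in the next steps, cfgadm can be used to check whether the fabric devices are configured. This is only a sketch assuming the standard Solaris FC (Leadville) driver stack; the attachment point IDs c1 and c2 are taken from this example and will differ on other systems.

# cfgadm -al
# cfgadm -c configure c1 c2     (only needed if the disks show as unconfigured)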

Now you need to find out if the SAN disk is visible from the server. Even though both HBAs are connected to the same SAN disk, you will see two separate disks in the results below; it just means there are two paths to the same LUN.

# luxadm probe
No Network Array enclosures found in /dev/es

Found Fibre Channel device(s):
Node WWN:50060e80058c7b10 Device Type:Disk device
Logical Path:/dev/rdsk/c1t50060E80058C7B10d1s2
Node WWN:50060e80058c7b00 Device Type:Disk device
Logical Path:/dev/rdsk/c2t50060E80058C7B00d1s2

In the above example the first LUN is c1t50060E80058C7B10d1s2. This is the logical device name, which is a symbolic link to the physical device name stored in the /devices directory. Logical device names contain the controller number (c1), target number (t50060E80058C7B10), disk number (d1), and slice number (s2).
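
To see this mapping for yourself, list the symbolic link for the logical device; the link target is the physical device name under /devices. A quick sketch using the first LUN from the probe above (the exact target will match the path that format prints in the next step):

# ls -l /dev/rdsk/c1t50060E80058C7B10d1s2

The link should point into /devices and end with ssd@w50060e80058c7b10,1 followed by the slice designation.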

The next step is to find out how the disk is partitioned; the format command will give you that information. You need it to understand how to boot from the disk.

# format
Searching for disks…done

AVAILABLE DISK SELECTIONS:
0. c1t50060E80058C7B10d1 /pci@0,600000/pci@0/pci@8/SUNW,qlc@0/fp@0,0/ssd@w50060e80058c7b10,1

1. c2t50060E80058C7B00d1 /pci@1,700000/pci@0/pci@0/SUNW,qlc@0/fp@0,0/ssd@w50060e80058c7b00,1

Select the first disk 0.

Specify disk (enter its number): 0
selecting c1t50060E80058C7B10d1
[disk formatted]

FORMAT MENU:
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save - save new disk/partition definitions
inquiry - show vendor, product and revision
volname - set 8-character volume name
!<cmd> - execute <cmd>, then return
quit

Display the label and slices (partitions). In Solaris each slice is treated as a separate physical disk. In the example below you can tell that the disk has a VTOC (Volume Table of Contents) label, also known as an SMI label, because the output shows cylinders. If the disk were labeled with EFI (Extensible Firmware Interface), you would see sectors instead of cylinders. Note that you cannot boot from a disk with an EFI label. Slice 0 (partition 0) holds the operating system files; it is the boot slice. Slice 2 represents the entire physical disk because it contains all cylinders, 0 - 65532.

format> verify

Primary label contents:

Volume name = < >
ascii name = < >
pcyl = 65535
ncyl = 65533
acyl = 2
nhead = 15
nsect = 1066
Part Tag Flag Cylinders Size Blocks
0 root wm 0 - 3356 25.60GB (3357/0/0) 53678430
1 unassigned wm 0 0 (0/0/0) 0
2 backup wm 0 - 65532 499.66GB (65533/0/0) 1047872670
3 unassigned wm 0 0 (0/0/0) 0
4 unassigned wm 0 0 (0/0/0) 0
5 unassigned wm 0 0 (0/0/0) 0
6 unassigned wm 0 0 (0/0/0) 0
7 unassigned wm 3357 - 65532 474.07GB (62176/0/0) 994194240
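
If you prefer to check the label and slice layout without stepping through the format menus, prtvtoc prints the same VTOC information directly. A sketch, run against the raw device for slice 2 (the whole disk):

# prtvtoc /dev/rdsk/c1t50060E80058C7B10d1s2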

Here is what we know so far: the disk name is c1t50060E80058C7B10d1, slice 0 on this disk contains the boot files, and the physical path for the disk is /pci@0,600000/pci@0/pci@8/SUNW,qlc@0/fp@0,0/ssd@w50060e80058c7b10,1. Now we need to find the physical path for slice 0.

I know that the disk contains ZFS filesystems because it is a replica of the production disk. When a ZFS pool is moved to a different SPARC server it must first be imported, because the pool was last accessed under a different hostid.

List the ZFS pools contained on the disk using the zpool import command. There are two pools in the example below, epool and rpool. Take note of the status and action fields.

# zpool import
pool: epool
id: 16865366839830765202
state: ONLINE
status: The pool was last accessed by another system.
action: The pool can be imported using its name or numeric identifier and the '-f' flag.
see: http://www.sun.com/msg/ZFS-8000-EY
config:

epool ONLINE
c2t50060E80058C7B00d1s7 ONLINE

pool: rpool
id: 10594898920105832331
state: ONLINE
status: The pool was last accessed by another system.
action: The pool can be imported using its name or numeric identifier and the '-f' flag.
see: http://www.sun.com/msg/ZFS-8000-EY
config:

rpool ONLINE
c2t50060E80058C7B00d1s0 ONLINE

Import the pools with the zpool import command. The -a option imports all the ZFS pools it can find, and the -f option forces the import. If you do not specify the force option, the import may fail with the error "cannot import 'rpool': pool may be in use from other system, it was last accessed by <server name> (hostid: 123456)". Ignore the "failed to create mountpoint" messages below; they appear because the CD's miniroot is read-only, so ZFS cannot create the /epool and /rpool mount points.

# zpool import -af
cannot mount '/epool': failed to create mountpoint
cannot mount '/rpool': failed to create mountpoint
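
If you would rather not import everything the installation environment can see, the same thing can be done one pool at a time by name, still forcing past the hostid check. A sketch of the equivalent individual imports:

# zpool import -f rpool
# zpool import -f epool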

List the imported ZFS pools.

# zpool list
NAME SIZE USED AVAIL CAP HEALTH ALTROOT
epool 472G 260G 212G 55% ONLINE –
rpool 25.5G 5.75G 19.8G 22% ONLINE –

List the ZFS filesystems. In the example below, notice that the mountpoint / belongs to the ZFS filesystem rpool/ROOT/zfsboot. This is the boot filesystem, and it resides in rpool.

# zfs list
NAME USED AVAIL REFER MOUNTPOINT
epool 260G 205G 21K /epool
rpool 7.74G 17.4G 99K /rpool
rpool/ROOT 4.74G 17.4G 21K legacy
rpool/ROOT/zfsboot 4.74G 17.4G 4.48G /
rpool/ROOT/zfsboot/var 139M 17.4G 105M /var
rpool/dump 1.00G 17.4G 1.00G –
rpool/swap 2.00G 19.4G 16K –
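
A quick way to cross-check which dataset the pool boots from is the bootfs pool property; on this system it should point at rpool/ROOT/zfsboot. This is just a sanity check, and the exact output depends on the Solaris 10 update level:

# zpool get bootfs rpool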

Change the mountpoint for rpool/ROOT/zfsboot to /mnt so you can mount it and read its contents.

# zfs set mountpoint=/mnt rpool/ROOT/zfsboot

Confirm that the mountpoint was changed.

# zfs list
NAME USED AVAIL REFER MOUNTPOINT
epool 260G 205G 21K /epool
rpool 7.74G 17.4G 99K /rpool
rpool/ROOT 4.74G 17.4G 21K legacy
rpool/ROOT/zfsboot 4.74G 17.4G 4.48G /mnt
rpool/ROOT/zfsboot/var 139M 17.4G 105M /mnt/var
rpool/dump 1.00G 17.4G 1.00G –
rpool/swap 2.00G 19.4G 16K –

Now mount rpool/ROOT/zfsboot.

# zfs mount rpool/ROOT/zfsboot
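
At this point the replicated root filesystem should be readable under /mnt. A quick sanity check (illustrative, the listing will reflect whatever is on the production server's root):

# ls /mnt

You should see the usual root directories such as etc and var.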

List the logical disks.

# cd /dev/dsk
# ls
c1t50060E80058C7B10d1s0 c2t50060E80058C7B00d1s0
c1t50060E80058C7B10d1s1 c2t50060E80058C7B00d1s1
c1t50060E80058C7B10d1s2 c2t50060E80058C7B00d1s2
c1t50060E80058C7B10d1s3 c2t50060E80058C7B00d1s3
c1t50060E80058C7B10d1s4 c2t50060E80058C7B00d1s4
c1t50060E80058C7B10d1s5 c2t50060E80058C7B00d1s5
c1t50060E80058C7B10d1s6 c2t50060E80058C7B00d1s6
c1t50060E80058C7B10d1s7 c2t50060E80058C7B00d1s7

As stated earlier, the physical path for the disk we are looking for is /pci@0,600000/pci@0/pci@8/SUNW,qlc@0/fp@0,0/ssd@w50060e80058c7b10,1, and the boot slice is 0. From the physical path we can see that the target WWN is 50060e80058c7b10. We also know from the output of the format command that this physical disk maps to the logical disk c1t50060E80058C7B10d1. Therefore the logical boot device is c1t50060E80058C7B10d1s0.

Now find out which physical path c1t50060E80058C7B10d1s0 is a symbolic link to; that is your complete boot path. In the example below the boot path starts at the first slash (/) right after /devices, so it is /pci@0,600000/pci@0/pci@8/SUNW,qlc@0/fp@0,0/ssd@w50060e80058c7b10,1:a. You need to replace ssd@ with disk@ when entering the path into the EEPROM.

# ls -l c1t50060E80058C7B10d1s0
lrwxrwxrwx 1 root root 82 Jul 5 10:21 c1t50060E80058C7B10d1s0 ->
../../devices/pci@0,600000/pci@0/pci@8/SUNW,qlc@0/fp@0,0/ssd@w50060e80058c7b10,1:a
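
Putting it together, the only change needed to turn the Solaris device path into an OpenBoot boot path is replacing ssd@ with disk@; the trailing :a selects slice 0.

Solaris device path: /pci@0,600000/pci@0/pci@8/SUNW,qlc@0/fp@0,0/ssd@w50060e80058c7b10,1:a
OpenBoot boot path:  /pci@0,600000/pci@0/pci@8/SUNW,qlc@0/fp@0,0/disk@w50060e80058c7b10,1:a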

If you have more than one ZFS pool, the non-root pool may not get mounted when the server boots, as in this example. You may get the error below.

SUNW-MSG-ID: ZFS-8000-D3, TYPE: Fault, VER: 1, SEVERITY: Major
EVENT-TIME: Mon Jul 5 11:54:14 EDT 2010
PLATFORM: SUNW,SPARC-Enterprise, CSN: PX654321, HOSTNAME: Andrew-Lin
SOURCE: zfs-diagnosis, REV: 1.0
EVENT-ID: 33e5a9f1-49ac-6ebc-f2a9-dff25dea6b86
DESC: A ZFS device failed. Refer to http://sun.com/msg/ZFS-8000-D3 for more information.
AUTO-RESPONSE: No automated response will occur.
IMPACT: Fault tolerance of the pool may be compromised.
REC-ACTION: Run 'zpool status -x' and replace the bad device.

# zpool status -x
pool: epool
state: UNAVAIL
status: One or more devices could not be opened. There are insufficient
replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
see: http://www.sun.com/msg/ZFS-8000-3C
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
epool UNAVAIL 0 0 0 insufficient replicas
c3t60060E8005652C000000652C00002100d0s7 UNAVAIL 0 0 0 cannot open

The above error is caused by the zpool.cache file, which contains the old device paths from the previous server. By default, Solaris 10 reads the paths from zpool.cache to speed up the boot sequence. Delete (or rename) this file and the system will create a fresh one during the boot sequence.

Below are the steps to rename the zpool.cache file.

# cd /mnt/etc/zfs
# ls
zpool.cache
# mv zpool.cache zpool.cache.old
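
Renaming keeps a copy around in case you want to look at the old device paths later; simply deleting the file works just as well, since Solaris rebuilds it on the next boot. The equivalent would be:

# rm /mnt/etc/zfs/zpool.cache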

Now you need to reverse the change you applied to the mountpoint earlier. Make sure you change directory out of /mnt (back to /), otherwise the set mountpoint command will fail with a "device busy" error. Ignore the "cannot mount '/': directory is not empty" message.

# zfs set mountpoint=/ rpool/ROOT/zfsboot
cannot mount '/': directory is not empty
property may be set but unable to remount filesystem

Confirm that the mount points were changed.

# zfs list
NAME USED AVAIL REFER MOUNTPOINT
epool 260G 205G 21K /epool
rpool 7.74G 17.4G 99K /rpool
rpool/ROOT 4.74G 17.4G 21K legacy
rpool/ROOT/zfsboot 4.74G 17.4G 4.48G /
rpool/ROOT/zfsboot/var 139M 17.4G 105M /var
rpool/dump 1.00G 17.4G 1.00G –
rpool/swap 2.00G 19.4G 16K –

Shut down the server.

# init 0

Now set the boot device in the EEPROM at the OpenBoot ok prompt.

{0} ok setenv boot-device /pci@0,600000/pci@0/pci@8/SUNW,qlc@0/fp@0,0/disk@w50060e80058c7b10,1:a
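
Before booting, it is worth confirming that the value was stored. At the ok prompt, printenv shows the current and default settings:

{0} ok printenv boot-device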

The server is ready to be booted with the boot command.

{0} ok boot
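
Once the system comes up, a quick check that both pools attached cleanly and that the zpool.cache problem described above did not recur:

# zpool status -x
# zfs list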
