Replacing a disk in VxVM
--------------------------------------------------------
Replacing a disk has always been a nerve-racking task. The main worry is that no data should be lost during the activity; if it is, you know the pain of facing the other team.
Here I will explain some very simple steps that should make you comfortable with such critical activities.
I will start with a concatenated volume, where there are no redundant copies of the data, and then cover the mirrored volume.
Disk replacement in concatenated volume
Here is the vxprint -htq output
bash-3.00# vxprint -htq
Disk group: appdg
dg appdg default default 4000 1347953587.80.vcs1
dm disk03 disk_3 auto 65536 2027168 -
dm disk04 disk_4 auto 65536 2027168 -
v appvol - ENABLED ACTIVE 3121152 SELECT - fsgen
pl appvol-01 appvol ENABLED ACTIVE 3121152 CONCAT - RW
sd disk03-01 appvol-01 disk03 0 2027168 0 disk_3 ENA
sd disk04-01 appvol-01 disk04 0 1093984 2027168 disk_4 ENA
In the appdg disk group above, the first subdisk is disk03-01, which sits on disk03, and the plex is of the concatenated (CONCAT) type.
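To double-check which disk holds the start of a concatenated plex before touching anything, the subdisk lines can be filtered on the plex-offset column. This is a minimal sketch, assuming the column order of the listing above (the seventh field of an sd line is the offset within the plex). The function name first_disk_of_plex is my own.

```shell
# Print the disk media name whose subdisk sits at offset 0 of the given plex.
# Reads `vxprint -htq` output on stdin. The field positions are an assumption
# taken from the listing above:
#   sd NAME PLEX DISK DISKOFFS LENGTH PLEXOFFS DEVICE MODE
first_disk_of_plex() {
    awk -v plex="$1" '$1 == "sd" && $3 == plex && $7 == 0 { print $4 }'
}

# Typical use on a live system:
#   vxprint -g appdg -htq | first_disk_of_plex appvol-01
```

With the listing above this prints disk03, the disk this post says should not be pulled.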
In this scenario, disk03 is the first disk added to the disk group, and Veritas created the first subdisk of the "appvol" volume on it. There is no way to replace this disk: if you try, you will end up corrupting the filesystem and the data.
Replacing the second disk (for example, one flagged as failing) is possible without data loss, but the volume will not be accessible during the activity.
First we will use vxdiskadm to replace the disk; later we will try it from the command prompt only.
Connect the new disk in an empty slot and initialize it.
Run vxdiskadm from the command prompt; it opens a menu of operations.
-> Select option 4 - Remove a disk for replacement
Select the disk you want to replace.
(Caution - Do not select the first disk in the disk group.)
Once the disk has been removed, you will be prompted to select a replacement disk. Select the disk you initialized.
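The same replacement can also be driven from the command prompt alone. The sketch below only prints the commands unless DRY_RUN is set to 0, so it can be reviewed first. It assumes the -k option of vxdg rmdisk and vxdg adddisk (which keeps and then reuses the disk media record), that vxdisksetup is on the PATH, and that disk_5 is the new device. The helper names run and replace_disk_cli are hypothetical.

```shell
# Default to dry-run: print commands instead of executing them.
DRY_RUN=${DRY_RUN:-1}

# Echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# replace_disk_cli DISKGROUP DISKNAME NEWDEVICE
replace_disk_cli() {
    # Detach the old device but keep the disk media record.
    run vxdg -g "$1" -k rmdisk "$2" || return 1
    # Initialize the new device for VxVM use.
    run vxdisksetup -i "$3" || return 1
    # Bind the kept disk media record to the new device.
    run vxdg -g "$1" -k adddisk "$2=$3" || return 1
}

# Example: replace_disk_cli appdg disk04 disk_5
```

As with the vxdiskadm route, the plex still has to be repaired afterwards before the volume can be started.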
Once you are done with the above steps, you will find the volume DISABLED and the plex in the DISABLED/RECOVER state:
# vxprint -htqv
Disk group: appdg
v appvol - DISABLED ACTIVE 3121152 SELECT - fsgen
pl appvol-01 appvol DISABLED RECOVER 3121152 CONCAT - RW
sd disk03-01 appvol-01 disk03 0 2027168 0 disk_3 ENA
sd disk04-01 appvol-01 disk04 0 1093984 2027168 disk_5 ENA
#
This needs to be corrected. Run the following sequence of commands to fix it:
# vxmend -g appdg fix stale appvol-01
# vxmend -g appdg fix clean appvol-01
# vxvol -g appdg start appvol
That's it, you are done. Mount the volume and it is ready to use.
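Before mounting, it is worth confirming that the volume really came back. This is a small sketch that reads vxprint output and succeeds only when the volume is ENABLED and ACTIVE. The function name volume_is_active is my own, and the field positions are assumed from the listings in this post.

```shell
# Succeeds only if the named volume shows kernel state ENABLED and state ACTIVE.
# Reads `vxprint -htq` output on stdin (v NAME RVG KSTATE STATE LENGTH ...).
volume_is_active() {
    awk -v vol="$1" '
        $1 == "v" && $2 == vol { found = 1; ok = ($4 == "ENABLED" && $5 == "ACTIVE") }
        END { exit !(found && ok) }'
}

# Typical use on a live system:
#   vxprint -g appdg -htq | volume_is_active appvol && echo "appvol is ready to mount"
```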
Replacing a failed disk in mirrored volume
Here is my volume status
# vxprint -htqg mir
dg mir default default 4000 1348198265.15.vcs1
dm mir01 disk_3 auto 65536 2027168 -
dm mir02 disk_4 auto 65536 2027168 -
v appvolmir - ENABLED ACTIVE 1024000 SELECT - fsgen
pl appvolmir-01 appvolmir ENABLED ACTIVE 1024000 CONCAT - RW
sd mir01-01 appvolmir-01 mir01 0 1024000 0 disk_3 ENA
pl appvolmir-02 appvolmir ENABLED ACTIVE 1024000 CONCAT - RW
sd mir02-01 appvolmir-02 mir02 0 1024000 0 disk_4 ENA
Here appvolmir is a mirrored volume in the mir disk group, with two plexes, appvolmir-01 and appvolmir-02.
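Since the next steps throw away one copy of the data, a quick guard that at least two healthy plexes exist is cheap insurance. This is a minimal sketch; the name active_plex_count is hypothetical and the column positions are assumed from the listing above.

```shell
# Count the plexes of a volume that are ENABLED and ACTIVE.
# Reads `vxprint -htq` output on stdin (pl NAME VOLUME KSTATE STATE ...).
active_plex_count() {
    awk -v vol="$1" '
        $1 == "pl" && $3 == vol && $4 == "ENABLED" && $5 == "ACTIVE" { n++ }
        END { print n + 0 }'
}

# Example guard before taking a plex offline:
#   [ "$(vxprint -g mir -htq | active_plex_count appvolmir)" -ge 2 ] || echo "no healthy second copy"
```

For the listing above the count is 2, so removing appvolmir-02 still leaves one good copy.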
Change the plex status to offline:
# vxmend -g mir off appvolmir-02
# vxprint -htqg mir
dg mir default default 4000 1348198265.15.vcs1
dm mir01 disk_3 auto 65536 2027168 -
dm mir02 disk_4 auto 65536 2027168 -
v appvolmir - ENABLED ACTIVE 1024000 SELECT - fsgen
pl appvolmir-01 appvolmir ENABLED ACTIVE 1024000 CONCAT - RW
sd mir01-01 appvolmir-01 mir01 0 1024000 0 disk_3 ENA
pl appvolmir-02 appvolmir DISABLED OFFLINE 1024000 CONCAT - RW
sd mir02-01 appvolmir-02 mir02 0 1024000 0 disk_4 ENA
#
Disassociate the offline plex from the volume:
# vxplex -g mir dis appvolmir-02
# vxprint -htqg mir
dg mir default default 4000 1348198265.15.vcs1
dm mir01 disk_3 auto 65536 2027168 -
dm mir02 disk_4 auto 65536 2027168 -
pl appvolmir-02 - DISABLED - 1024000 CONCAT - RW
sd mir02-01 appvolmir-0 mir02 0 1024000 0 disk_4 ENA
v appvolmir - ENABLED ACTIVE 1024000 SELECT - fsgen
pl appvolmir-01 appvolmir ENABLED ACTIVE 1024000 CONCAT - RW
sd mir01-01 appvolmir-01 mir01 0 1024000 0 disk_3 ENA
#
Remove the plex:
# vxedit -g mir -r rm appvolmir-02
# vxprint -htqg mir
dg mir default default 4000 1348198265.15.vcs1
dm mir01 disk_3 auto 65536 2027168 -
dm mir02 disk_4 auto 65536 2027168 -
v appvolmir - ENABLED ACTIVE 1024000 SELECT - fsgen
pl appvolmir-01 appvolmir ENABLED ACTIVE 1024000 CONCAT - RW
sd mir01-01 appvolmir-01 mir01 0 1024000 0 disk_3 ENA
#
Once the plex is removed, remove the disk from Veritas control:
# vxdg -g mir rmdisk mir02
# vxprint -htqg mir
dg mir default default 4000 1348198265.15.vcs1
dm mir01 disk_3 auto 65536 2027168 -
v appvolmir - ENABLED ACTIVE 1024000 SELECT - fsgen
pl appvolmir-01 appvolmir ENABLED ACTIVE 1024000 CONCAT - RW
sd mir01-01 appvolmir-01 mir01 0 1024000 0 disk_3 ENA
#
So the disk removal is done. Now it's time to attach a new disk and sync the data. Identify a new disk of the same size or larger.
In this example I will use disk_5 as the replacement disk.
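One way to find candidates is to filter the vxdisk list output for devices that are online but not yet in any disk group. This is a rough sketch, assuming the five-column layout (DEVICE TYPE DISK GROUP STATUS) that vxdisk list prints on this system. The function name free_vxvm_disks is my own.

```shell
# List devices that are online and not assigned to any disk group.
# Reads `vxdisk list` output on stdin. Lines with extra status words
# (for example "online invalid") are skipped by the NF == 5 check.
free_vxvm_disks() {
    awk 'NF == 5 && $3 == "-" && $4 == "-" && $5 == "online" { print $1 }'
}

# Typical use on a live system:
#   vxdisk list | free_vxvm_disks
```

The result still has to be checked by hand for size and for whether the device really is unused.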
# vxprint -htqg mir
dg mir default default 4000 1348198265.15.vcs1
dm mirreplacedisk disk_5 auto 65536 41764864 -
dm mir01 disk_3 auto 65536 2027168 -
v appvolmir - ENABLED ACTIVE 1024000 SELECT - fsgen
pl appvolmir-01 appvolmir ENABLED ACTIVE 1024000 CONCAT - RW
sd mir01-01 appvolmir-01 mir01 0 1024000 0 disk_3 ENA
#
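The listing above already shows disk_5 in the disk group as mirreplacedisk, but the commands that put it there are not shown. A minimal sketch follows, assuming the usual vxdisksetup -i and vxdg adddisk steps and that vxdisksetup is on the PATH. It only prints the commands for review, and adds a size check against the volume length. Both function names are my own.

```shell
# Print (not run) the commands that initialize DEVICE and add it
# to DISKGROUP under the name DISKNAME.
# Usage: add_disk_commands DISKGROUP DISKNAME DEVICE
add_disk_commands() {
    printf 'vxdisksetup -i %s\n' "$3"
    printf 'vxdg -g %s adddisk %s=%s\n' "$1" "$2" "$3"
}

# Succeeds if the public region of DISKNAME is at least as long as VOLUME.
# Reads `vxprint -htq` output on stdin; both lengths are the sixth field.
disk_fits_volume() {
    awk -v disk="$1" -v vol="$2" '
        $1 == "dm" && $2 == disk { dlen = $6 + 0 }
        $1 == "v"  && $2 == vol  { vlen = $6 + 0 }
        END { exit !(vlen > 0 && dlen >= vlen) }'
}

# Examples:
#   add_disk_commands mir mirreplacedisk disk_5     # review, then pipe to "sh -x"
#   vxprint -g mir -htq | disk_fits_volume mirreplacedisk appvolmir && echo "big enough"
```

The size check compares the whole public region with the volume length, which is only a fair test for a disk with nothing else allocated on it.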
Once the new disk is initialized and added to the disk group, mirror the volume. Here is the vxdisk list output:
# vxdisk list
DEVICE TYPE DISK GROUP STATUS
c0d0s2 auto:sliced ibmdg01 ibmdg online
disk_3 auto:cdsdisk mir01 mir online
disk_4 auto:cdsdisk - - online
disk_5 auto:cdsdisk mirreplacedisk mir online
disk_6 auto:SVM - - SVM
disk_7 auto:SVM - - SVM
disk_8 auto:ZFS - - ZFS
disk_9 auto:ZFS - - ZFS
disk_10 auto:ZFS - - ZFS
disk_11 auto:cdsdisk - - online
disk_12 auto:cdsdisk - - online
"mirreplacedisk" is the replacement disk.
# vxassist -g mir mirror appvolmir mirreplacedisk
# vxprint -htqg mir
dg mir default default 4000 1348198265.15.vcs1
dm mirreplacedisk disk_5 auto 65536 41764864 -
dm mir01 disk_3 auto 65536 2027168 -
v appvolmir - ENABLED ACTIVE 1024000 SELECT - fsgen
pl appvolmir-01 appvolmir ENABLED ACTIVE 1024000 CONCAT - RW
sd mir01-01 appvolmir-01 mir01 0 1024000 0 disk_3 ENA
pl appvolmir-02 appvolmir ENABLED ACTIVE 1024000 CONCAT - RW
sd mirreplacedisk-01 appvolmir-02 mirreplacedisk 0 1024000 0 disk_5 ENA
#
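As a last check, every plex should be back in the ENABLED/ACTIVE state once the resynchronization has finished (vxtask list shows its progress while it runs). This is a small sketch that prints any plex that is not healthy. The function name unhealthy_plexes is hypothetical and the column positions are assumed from the listings in this post.

```shell
# Print "NAME KSTATE STATE" for every plex that is not ENABLED and ACTIVE.
# Reads `vxprint -htq` output on stdin; prints nothing when all plexes are healthy.
unhealthy_plexes() {
    awk '$1 == "pl" && !($4 == "ENABLED" && $5 == "ACTIVE") { print $2, $4, $5 }'
}

# Typical use on a live system:
#   vxprint -g mir -htq | unhealthy_plexes
```

Empty output for the final listing above means both copies of appvolmir are in service again.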