Wednesday, September 19, 2012

Replacing a disk in Veritas Volume Manager

Replacing a disk in VxVM
--------------------------------------------------------

Replacing a disk has always been a difficult task. The main worry is that data must not be lost during the activity; it should stay intact, otherwise you know the pain of facing the other teams.


Here I will explain some very simple steps that should make you comfortable with such critical activities.


I will first talk about a concatenated volume, where there are no redundant copies of the data; later I will cover the mirrored volume.


Disk replacement in concatenated volume


Here is the vxprint -htq output


bash-3.00# vxprint -htq

Disk group: appdg

dg appdg        default      default  4000     1347953587.80.vcs1

dm disk03       disk_3       auto     65536    2027168  -
dm disk04       disk_4       auto     65536    2027168  -

v  appvol       -            ENABLED  ACTIVE   3121152  SELECT    -        fsgen
pl appvol-01    appvol       ENABLED  ACTIVE   3121152  CONCAT    -        RW
sd disk03-01    appvol-01    disk03   0        2027168  0         disk_3   ENA
sd disk04-01    appvol-01    disk04   0        1093984  2027168   disk_4   ENA


In the appdg diskgroup above, the first subdisk is disk03-01, which is associated with disk03, and the plex is of the concatenated type.
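To see at a glance which disk holds the start of a concatenated plex, the sd records can be sorted out with a little awk. This is only a sketch: the function name plex_layout is made up for this post, and it assumes the column order of the vxprint -htq output shown above (the seventh sd column being the plex offset).

```shell
#!/bin/sh
# Sample copied from the vxprint -htq output above (pl and sd records only).
sample='pl appvol-01    appvol       ENABLED  ACTIVE   3121152  CONCAT    -        RW
sd disk03-01    appvol-01    disk03   0        2027168  0         disk_3   ENA
sd disk04-01    appvol-01    disk04   0        1093984  2027168   disk_4   ENA'

# plex_layout PLEX (hypothetical helper): read vxprint -htq text on stdin and
# print "plex_offset disk subdisk length" for each subdisk of PLEX, followed
# by a summary line comparing the summed subdisk lengths with the plex length.
plex_layout() {
    awk -v plex="$1" '
        $1 == "pl" && $2 == plex { plen = $6 }
        $1 == "sd" && $3 == plex {
            print $7, $4, $2, $6
            total += $6
        }
        END { print "sum=" total, "plex=" plen, (total == plen ? "OK" : "MISMATCH") }
    '
}

layout=$(printf '%s\n' "$sample" | plex_layout appvol-01)
printf '%s\n' "$layout"
```

On a live system the same function could be fed with "vxprint -g appdg -htq | plex_layout appvol-01". The subdisk reported with plex offset 0 sits on the first disk of the volume.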

Consider disk03 in this scenario: it was the first disk added to the diskgroup, and Veritas created the first subdisk of the "appvol" volume on it, so it holds the start of the volume. There is no way to replace this disk; if you try, you will end up corrupting the filesystem and the data.


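Replacing the second disk (for example, one flagged with a failing status) is absolutely possible without losing what is stored on the first disk, but the volume will not be accessible during this activity. Since a concatenated plex keeps no second copy of the data, make sure you have a current backup of the volume before you start.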

First we will use vxdiskadm to replace the disk; later we will try with only the command prompt.


Connect the new disk in an empty slot and initialize it.


Run vxdiskadm from the command prompt; it will open a list of menu items.



-> Select option no 4 - Remove a disk for replacement


Select the correct disk which you want to replace


(Caution - Do not select the first disk in the diskgroup)


Once you have removed the disk, it will prompt you to select the disk to use as the replacement. Select the disk which you have initialized.


Once you are done with the above steps, you will find the volume and its plex DISABLED, with the plex in the RECOVER state


# vxprint -htqv

Disk group: appdg

v  appvol       -            DISABLED ACTIVE   3121152  SELECT    -        fsgen
pl appvol-01    appvol       DISABLED RECOVER  3121152  CONCAT    -        RW
sd disk03-01    appvol-01    disk03   0        2027168  0         disk_3   ENA
sd disk04-01    appvol-01    disk04   0        1093984  2027168   disk_5   ENA
#
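On a diskgroup with many volumes it is easy to miss a record like this one. As a rough sketch, the following filter prints every volume or plex that is not ENABLED and ACTIVE. The function name needs_attention is invented for this post, and the filter assumes the KSTATE and STATE columns are the fourth and fifth fields, as in the output above.

```shell
#!/bin/sh
# Sample copied from the vxprint -htqv output above.
sample='v  appvol       -            DISABLED ACTIVE   3121152  SELECT    -        fsgen
pl appvol-01    appvol       DISABLED RECOVER  3121152  CONCAT    -        RW
sd disk03-01    appvol-01    disk03   0        2027168  0         disk_3   ENA
sd disk04-01    appvol-01    disk04   0        1093984  2027168   disk_5   ENA'

# needs_attention (hypothetical helper): read vxprint -htq text on stdin and
# print "type name kstate state" for each v or pl record that is not
# ENABLED ACTIVE. Subdisk and disk records are ignored.
needs_attention() {
    awk '($1 == "v" || $1 == "pl") && !($4 == "ENABLED" && $5 == "ACTIVE") {
        print $1, $2, $4, $5
    }'
}

report=$(printf '%s\n' "$sample" | needs_attention)
printf '%s\n' "$report"
```

An empty result means every volume and plex in the input is ENABLED and ACTIVE.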


You need to correct this DISABLED / RECOVER state.


The following sequence of commands needs to be executed to fix it


# vxmend -g appdg fix stale appvol-01


# vxmend -g appdg fix clean appvol-01


# vxvol -g appdg start appvol

That's it, you are done. Mount the volume and it is ready to use.
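For the command-prompt route, here is a dry-run sketch that only prints a candidate command sequence instead of running anything. It rests on my assumption that the two vxdiskadm menu steps correspond to "vxdg -k rmdisk" and "vxdg -k adddisk", and that the new disk is initialized with vxdisksetup -i; check this against the documentation for your VxVM release before running any of it. The function name replace_cmds is invented for this post.

```shell
#!/bin/sh
# replace_cmds DG DM NEWDEV PLEX VOL (hypothetical helper): print, but do not
# run, the commands for replacing disk media DM with device NEWDEV and then
# restarting the concatenated volume VOL whose plex is PLEX.
# Assumption: vxdg -k rmdisk / -k adddisk mirror what vxdiskadm does here.
replace_cmds() {
    dg=$1 dm=$2 newdev=$3 plex=$4 vol=$5
    cat <<EOF
vxdg -g $dg -k rmdisk $dm
vxdisksetup -i $newdev
vxdg -g $dg -k adddisk $dm=$newdev
vxmend -g $dg fix stale $plex
vxmend -g $dg fix clean $plex
vxvol -g $dg start $vol
EOF
}

plan=$(replace_cmds appdg disk04 disk_5 appvol-01 appvol)
printf '%s\n' "$plan"
```

Printing the plan first gives you a chance to review the disk and plex names (and to confirm the disk is not the first one in the volume) before typing anything at a root prompt.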


Replacing a failed disk in mirrored volume

Here is my volume status

# vxprint -htqg mir
dg mir          default      default  4000     1348198265.15.vcs1

dm mir01        disk_3       auto     65536    2027168  -
dm mir02        disk_4       auto     65536    2027168  -

v  appvolmir    -            ENABLED  ACTIVE   1024000  SELECT    -        fsgen
pl appvolmir-01 appvolmir    ENABLED  ACTIVE   1024000  CONCAT    -        RW
sd mir01-01     appvolmir-01 mir01    0        1024000  0         disk_3   ENA
pl appvolmir-02 appvolmir    ENABLED  ACTIVE   1024000  CONCAT    -        RW
sd mir02-01     appvolmir-02 mir02    0        1024000  0         disk_4   ENA


Here appvolmir is a mirrored volume in the mir diskgroup, with two plexes, appvolmir-01 and appvolmir-02

Change the status of the plex that sits on the disk being replaced (appvolmir-02 here) to offline

# vxmend -g mir off appvolmir-02

# vxprint -htqg mir
dg mir          default      default  4000     1348198265.15.vcs1

dm mir01        disk_3       auto     65536    2027168  -
dm mir02        disk_4       auto     65536    2027168  -

v  appvolmir    -            ENABLED  ACTIVE   1024000  SELECT    -        fsgen
pl appvolmir-01 appvolmir    ENABLED  ACTIVE   1024000  CONCAT    -        RW
sd mir01-01     appvolmir-01 mir01    0        1024000  0         disk_3   ENA
pl appvolmir-02 appvolmir    DISABLED OFFLINE  1024000  CONCAT    -        RW
sd mir02-01     appvolmir-02 mir02    0        1024000  0         disk_4   ENA


Dissociate the offline plex from the volume

# vxplex -g mir dis appvolmir-02

# vxprint -htqg mir
dg mir          default      default  4000     1348198265.15.vcs1

dm mir01        disk_3       auto     65536    2027168  -
dm mir02        disk_4       auto     65536    2027168  -

pl appvolmir-02 -            DISABLED -        1024000  CONCAT    -        RW
sd mir02-01     appvolmir-02 mir02    0        1024000  0         disk_4   ENA

v  appvolmir    -            ENABLED  ACTIVE   1024000  SELECT    -        fsgen
pl appvolmir-01 appvolmir    ENABLED  ACTIVE   1024000  CONCAT    -        RW
sd mir01-01     appvolmir-01 mir01    0        1024000  0         disk_3   ENA
#
 

Remove the plex

# vxedit -g mir -r rm appvolmir-02

# vxprint -htqg mir
dg mir          default      default  4000     1348198265.15.vcs1

dm mir01        disk_3       auto     65536    2027168  -
dm mir02        disk_4       auto     65536    2027168  -

v  appvolmir    -            ENABLED  ACTIVE   1024000  SELECT    -        fsgen
pl appvolmir-01 appvolmir    ENABLED  ACTIVE   1024000  CONCAT    -        RW
sd mir01-01     appvolmir-01 mir01    0        1024000  0         disk_3   ENA
#

Once the plex is removed, remove the disk from the diskgroup

# vxdg -g mir rmdisk mir02




# vxprint -htqg mir
dg mir          default      default  4000     1348198265.15.vcs1

dm mir01        disk_3       auto     65536    2027168  -

v  appvolmir    -            ENABLED  ACTIVE   1024000  SELECT    -        fsgen
pl appvolmir-01 appvolmir    ENABLED  ACTIVE   1024000  CONCAT    -        RW
sd mir01-01     appvolmir-01 mir01    0        1024000  0         disk_3   ENA
#



So the disk removal is done. Now it's time to attach a new disk and sync the data. Identify a new disk of the same size or larger.

In this example I will use disk_5 as the replacement disk.

# vxprint -htqg mir
dg mir          default      default  4000     1348198265.15.vcs1

dm mirreplacedisk disk_5     auto     65536    41764864 -
dm mir01        disk_3       auto     65536    2027168  -

v  appvolmir    -            ENABLED  ACTIVE   1024000  SELECT    -        fsgen
pl appvolmir-01 appvolmir    ENABLED  ACTIVE   1024000  CONCAT    -        RW
sd mir01-01     appvolmir-01 mir01    0        1024000  0         disk_3   ENA
#
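The size requirement for the new disk can be checked from this same output by comparing the public region length of the dm record with the length of the volume. A minimal sketch follows; the function name fits is invented for this post, and it compares against the whole public region, which is only meaningful for a disk that has no subdisks on it yet.

```shell
#!/bin/sh
# Sample copied from the vxprint -htqg mir output above (dm and v records).
sample='dm mirreplacedisk disk_5     auto     65536    41764864 -
dm mir01        disk_3       auto     65536    2027168  -
v  appvolmir    -            ENABLED  ACTIVE   1024000  SELECT    -        fsgen'

# fits DM VOLUME (hypothetical helper): read vxprint -htq text on stdin and
# print "fits" when the public length of disk DM is at least the length of
# VOLUME, "too small" when it is not, "unknown" when either record is missing.
fits() {
    awk -v dm="$1" -v vol="$2" '
        $1 == "dm" && $2 == dm  { publen = $6 }
        $1 == "v"  && $2 == vol { vlen = $6 }
        END {
            if (publen == "" || vlen == "") {
                print "unknown"
            } else if (publen + 0 >= vlen + 0) {
                print "fits"
            } else {
                print "too small"
            }
        }
    '
}

result=$(printf '%s\n' "$sample" | fits mirreplacedisk appvolmir)
printf '%s\n' "$result"
```

Here the new disk offers 41764864 sectors against the 1024000 needed for one plex of appvolmir, so there is plenty of room.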


Once the new disk is initialized and added to the diskgroup, mirror the volume. Here is the vxdisk list output

# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
c0d0s2       auto:sliced     ibmdg01      ibmdg        online
disk_3       auto:cdsdisk    mir01        mir          online
disk_4       auto:cdsdisk    -            -            online
disk_5       auto:cdsdisk    mirreplacedisk  mir          online
disk_6       auto:SVM        -            -            SVM
disk_7       auto:SVM        -            -            SVM
disk_8       auto:ZFS        -            -            ZFS
disk_9       auto:ZFS        -            -            ZFS
disk_10      auto:ZFS        -            -            ZFS
disk_11      auto:cdsdisk    -            -            online
disk_12      auto:cdsdisk    -            -            online


"mirreplacedisk" is the replacement disk

# vxassist -g mir mirror appvolmir mirreplacedisk

# vxprint -htqg mir
dg mir          default      default  4000     1348198265.15.vcs1

dm mirreplacedisk disk_5     auto     65536    41764864 -
dm mir01        disk_3       auto     65536    2027168  -

v  appvolmir    -            ENABLED  ACTIVE   1024000  SELECT    -        fsgen
pl appvolmir-01 appvolmir    ENABLED  ACTIVE   1024000  CONCAT    -        RW
sd mir01-01     appvolmir-01 mir01    0        1024000  0         disk_3   ENA
pl appvolmir-02 appvolmir    ENABLED  ACTIVE   1024000  CONCAT    -        RW
sd mirreplacedisk-01 appvolmir-02 mirreplacedisk 0 1024000 0      disk_5   ENA
#
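As a final sanity check, the number of healthy plexes can be counted from this output. The sketch below uses an invented function name, active_plexes, and assumes the same column order as the vxprint output above. It only looks at the plex states and says nothing about whether a resynchronization is still running.

```shell
#!/bin/sh
# Sample copied from the vxprint -htqg mir output above (v, pl and sd records).
sample='v  appvolmir    -            ENABLED  ACTIVE   1024000  SELECT    -        fsgen
pl appvolmir-01 appvolmir    ENABLED  ACTIVE   1024000  CONCAT    -        RW
sd mir01-01     appvolmir-01 mir01    0        1024000  0         disk_3   ENA
pl appvolmir-02 appvolmir    ENABLED  ACTIVE   1024000  CONCAT    -        RW
sd mirreplacedisk-01 appvolmir-02 mirreplacedisk 0 1024000 0      disk_5   ENA'

# active_plexes VOLUME (hypothetical helper): read vxprint -htq text on stdin
# and print how many plexes of VOLUME are ENABLED and ACTIVE.
active_plexes() {
    awk -v vol="$1" '
        $1 == "pl" && $3 == vol && $4 == "ENABLED" && $5 == "ACTIVE" { n++ }
        END { print n + 0 }
    '
}

count=$(printf '%s\n' "$sample" | active_plexes appvolmir)
printf '%s\n' "$count"
```

A count of 2 for appvolmir means both sides of the mirror are back in place.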
 












