How to undo pvremove

By Joel Yliluoma, August 2008

Preface

When you have committed a major blunder and you realize you are really screwed, you have two choices:
  1. You accept the situation and start over
  2. You hack the reality and overcome the puzzle

This is a story of how I chose the latter.

Oh, and this talks about Linux, LVM2, swsusp and disk management. If you are not interested in those topics, feel free to close this page now.

LVM, for the uninitiated, is a mechanism which combines multiple physical volumes (i.e. disk partitions) into a pool (a "volume group"), from which logical volumes can be allocated (and filesystems created on those logical volumes). The logical volumes can span multiple harddisks and can be migrated at runtime, without unmounting; in fact, the migration happens completely transparently to any userspace programs. Say you have five 200 GB harddisks, and you need to form a 500 GB filesystem. Not a problem. At some point, you realize you need a 10 GB swap; not a problem, just lvcreate+mkswap+swapon. Later, you want to remove one of those harddisks but it's got filesystems on it. Not a problem, just pvmove and everything is migrated to the other disks, while the system is operating normally.

Background

A little background on my situation.

On my little server that runs a number of websites such as Kanjidict and TASVideos, there are three SATA harddisks: sda, sdb and sdc. These three harddisks all contain just huge partitions, which together comprise a large LVM volume group.

I was going to move my server from one room to another in my house, and I didn't want to shut down and restart all services on the computer. I wanted to use Linux's neat "hibernate" feature, which saves everything the system is working on to a harddisk and powers off. Upon the next boot, it loads everything from the harddisk and continues as if nothing happened. To be more precise, it saves the system state to a swap partition. However, there is a limitation: the swap partition cannot be an LVM logical volume. It must be a real disk partition.

Lead-in to problem

The problem was, there was no room anywhere on my harddisks to create a real disk partition for swapping, so I had to make room.

Luckily, due to its design, LVM allows stuff to be migrated across harddisks.

I deleted unneeded logical volumes and shrank some others, and used lvm2defrag until I arrived at the point where the sdc2 physical volume was completely empty and could be safely removed from the volume group.

  lvm> pvs
  PV         VG    Fmt  Attr PSize   PFree
  /dev/sda1  owner lvm2 a-   372.61G  29.11G
  /dev/sdb1  owner lvm2 a-   372.61G   5.96G
  /dev/sdc2  owner lvm2 a-   186.11G 186.11G
This is where I made a mistake.

   lvm> pvremove /dev/sdc2
     Can't pvremove physical volume "/dev/sdc2" of volume group "owner" without -ff
Huh?

   lvm> pvremove --help
     pvremove: Remove LVM label(s) from physical volume(s)
   pvremove
           [-d|--debug]
           [-f[f]|--force [--force]]
           [-h|-?|--help]
           [-t|--test]
           [-v|--verbose]
           [-y|--yes]
           [--version]
           PhysicalVolume [PhysicalVolume...]
   lvm> pvremove -f /dev/sdc2
     Can't pvremove physical volume "/dev/sdc2" of volume group "owner" without -ff
   lvm> pvremove -ff /dev/sdc2
   Really WIPE LABELS from physical volume "/dev/sdc2" of volume group "owner" [y/n]? n
     /dev/sdc2: physical volume label not removed
     Command failed with status code 5.
   lvm> pvremove -ff /dev/sdc2
   Really WIPE LABELS from physical volume "/dev/sdc2" of volume group "owner" [y/n]? y
     WARNING: Wiping physical volume label from /dev/sdc2 of volume group "owner"
     Labels on physical volume "/dev/sdc2" successfully wiped

I wondered why it required these double --force flags. Soon I wondered no longer.

   lvm> lvs
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find all physical volumes for volume group owner.
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find all physical volumes for volume group owner.
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find all physical volumes for volume group owner.
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find all physical volumes for volume group owner.
     Volume group "owner" not found
Huh???

   lvm> pvs
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find all physical volumes for volume group owner.
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find all physical volumes for volume group owner.
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find all physical volumes for volume group owner.
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find all physical volumes for volume group owner.
     Volume group "owner" not found
     Skipping volume group owner
     <repeated three times>
   lvm> lvscan
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find all physical volumes for volume group owner.
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find all physical volumes for volume group owner.
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find all physical volumes for volume group owner.
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find all physical volumes for volume group owner.
     Volume group "owner" not found
At this point, it soon became apparent to me that I had screwed up, majorly.

See, I used this command:

   pvremove        Remove LVM label(s) from physical volume(s)
When I should have in fact used this command:

   vgreduce        Remove physical volume(s) from a volume group
Luckily, the server was still operating just fine. It's just that the LVM tools did not seem to honor the existence of my volume group anymore. I became desperately aware of the fact that if the server is now rebooted, it may not be able to access any volumes at all. Literally all data on the server would be lost!

Solving the problem

First, I tried to use LVM's tools to rescue the lost partition.

   lvm> lvscan
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find all physical volumes for volume group owner.
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find all physical volumes for volume group owner.
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find all physical volumes for volume group owner.
     Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
     Couldn't find all physical volumes for volume group owner.
     Volume group "owner" not found
   lvm> pvdata --help
     pvdata: Display the on-disk metadata for physical volume(s)
   pvdata
           [-a|--all]
           [-d|--debug]
           [-E|--physicalextent]
           [-h|-?|--help]
           [-L|--logicalvolume]
           [-P[P]|--physicalvolume [--physicalvolume]]
           [-U|--uuidlist]
           [-v[v]|--verbose [--verbose]]
           [-V|--volumegroup]
           [--version]
           PhysicalVolume [PhysicalVolume...]
   lvm> pvdata
     There's no 'pvdata' command in LVM2.
     Use lvs, pvs, vgs instead; or use vgcfgbackup and read the text file backup.
     Metadata in LVM1 format can still be displayed using LVM1's pvdata command.
I also tried vgcfgrestore, which is supposed to restore lost metadata. It didn't help either.

Soon I realized that I needed to look further for help.

I looked into the partitions themselves. I wanted to see what the partition has eaten.

   root@chii:/dev# hexdump -C /dev/sdc2 |more
   00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   *
   00000800  00 07 c0 00 0c 00 00 00  80 94 f2 45 00 07 c8 00  |...........E....|
   00000810  0c 00 00 00 80 94 f2 46  04 07 d0 00 0c 00 00 00  |.......F........|
   00000820  80 94 f2 47 00 07 d8 00  0c 00 00 00 80 94 f2 48  |...G...........H|
   00000830  00 07 e0 00 0c 00 00 00  80 94 f2 49 00 07 e8 00  |...........I....|
   00000840  0c 00 00 00 80 94 f2 4a  04 07 f0 00 0c 00 00 00  |.......J........|
   00000850  80 94 f2 4b 00 07 f8 00  0c 00 00 00 80 94 f2 4c  |...K...........L|
   00000860  00 08 00 00 0c 00 00 00  80 94 f2 4d 00 08 08 00  |...........M....|
   <snip>
   00000ff0  0c 00 00 00 80 94 f3 11  04 0d 10 00 0c 00 00 00  |................|
   00001000  ae fa 4b 8a 20 4c 56 4d  32 20 78 5b 35 41 25 72  |..K. LVM2 x[5A%r|
   00001010  30 4e 2a 3e 01 00 00 00  00 10 00 00 00 00 00 00  |0N*>............|
   00001020  00 f0 02 00 00 00 00 00  00 ca 00 00 00 00 00 00  |................|
   00001030  a0 0c 00 00 00 00 00 00  e1 1c c6 87 00 00 00 00  |................|
   00001040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   *
   00001200  2d 62 73 4b 67 2d 77 54  49 72 2d 56 6a 34 4b 63  |-bsKg-wTIr-Vj4Kc|
   00001210  37 22 0a 64 65 76 69 63  65 20 3d 20 22 2f 64 65  |7".device = "/de|
   00001220  76 2f 73 64 63 32 22 0a  0a 73 74 61 74 75 73 20  |v/sdc2"..status |
   00001230  3d 20 5b 22 41 4c 4c 4f  43 41 54 41 42 4c 45 22  |= ["ALLOCATABLE"|
   root@chii:/dev# hexdump -C /dev/sdb1 |more
   00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   *
   00000200  4c 41 42 45 4c 4f 4e 45  01 00 00 00 00 00 00 00  |LABELONE........|
   00000210  87 0a f4 de 20 00 00 00  4c 56 4d 32 20 30 30 31  |.... ...LVM2 001|
   00000220  64 66 4f 78 66 46 41 56  54 59 44 63 44 67 71 4e  |dfOxfFAVTYDcDgqN|
   00000230  36 44 70 47 65 70 46 34  32 57 49 42 44 63 56 62  |6DpGepF42WIBDcVb|
   00000240  00 04 f9 26 5d 00 00 00  00 00 03 00 00 00 00 00  |...&]...........|
   00000250  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   00000260  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
   00000270  00 f0 02 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   00000280  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   *
   00001000  c7 58 f6 36 20 4c 56 4d  32 20 78 5b 35 41 25 72  |.X.6 LVM2 x[5A%r|
   00001010  30 4e 2a 3e 01 00 00 00  00 10 00 00 00 00 00 00  |0N*>............|
   00001020  00 f0 02 00 00 00 00 00  00 f4 00 00 00 00 00 00  |................|
   00001030  a0 0c 00 00 00 00 00 00  e1 1c c6 87 00 00 00 00  |................|
   00001040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   *
   00001200  31 0a 0a 73 65 67 6d 65  6e 74 31 20 7b 0a 73 74  |1..segment1 {.st|
   00001210  61 72 74 5f 65 78 74 65  6e 74 20 3d 20 30 0a 65  |art_extent = 0.e|
   00001220  78 74 65 6e 74 5f 63 6f  75 6e 74 20 3d 20 32 35  |xtent_count = 25|
   00001230  0a 0a 74 79 70 65 20 3d  20 22 73 74 72 69 70 65  |..type = "stripe|
   <snip>
   root@chii:/dev# hexdump -C /dev/sda1 |more
   00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   *
   00000200  4c 41 42 45 4c 4f 4e 45  01 00 00 00 00 00 00 00  |LABELONE........|
   00000210  54 73 f8 85 20 00 00 00  4c 56 4d 32 20 30 30 31  |Ts.. ...LVM2 001|
   00000220  78 64 4e 63 70 6c 32 6b  65 46 73 73 45 68 5a 71  |xdNcpl2keFssEhZq|
   00000230  4e 69 4f 69 47 65 79 58  55 32 41 38 6a 66 56 77  |NiOiGeyXU2A8jfVw|
   00000240  00 04 f9 26 5d 00 00 00  00 00 03 00 00 00 00 00  |...&]...........|
   00000250  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   00000260  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
   00000270  00 f0 02 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   00000280  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   *
   00001000  c7 58 f6 36 20 4c 56 4d  32 20 78 5b 35 41 25 72  |.X.6 LVM2 x[5A%r|
   00001010  30 4e 2a 3e 01 00 00 00  00 10 00 00 00 00 00 00  |0N*>............|
   00001020  00 f0 02 00 00 00 00 00  00 f4 00 00 00 00 00 00  |................|
   00001030  a0 0c 00 00 00 00 00 00  e1 1c c6 87 00 00 00 00  |................|
   00001040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   *
   00001200  31 0a 0a 73 65 67 6d 65  6e 74 31 20 7b 0a 73 74  |1..segment1 {.st|
   00001210  61 72 74 5f 65 78 74 65  6e 74 20 3d 20 30 0a 65  |art_extent = 0.e|
   00001220  78 74 65 6e 74 5f 63 6f  75 6e 74 20 3d 20 32 35  |xtent_count = 25|
   00001230  0a 0a 74 79 70 65 20 3d  20 22 73 74 72 69 70 65  |..type = "stripe|
   <snip>
As I looked at this information, it became apparent to me what pvremove had done to the partition. It had overwritten a section from 00000200..00000FFF, which describes the contents of the LVM physical volume. So to make it functional again, I would need to reconstruct that section on the disk. "That should be easy", I thought to myself, as I've already got two backups, on sda1 and sdb1.
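
The check that the hexdumps were used for (is there a label near the start of the partition?) can be written as a minimal Python sketch. It assumes 512-byte sectors and that LVM2 looks for the LABELONE magic in the first four sectors; that is my reading of the LVM2 sources, not something the dumps prove. find_label_sector is a hypothetical helper name, and the sketch works on a byte buffer rather than a real device.

```python
SECTOR_SIZE = 512          # assumed sector size
LABEL_SCAN_SECTORS = 4     # assumption: LVM2 scans the first four sectors
LABEL_MAGIC = b"LABELONE"

def find_label_sector(image: bytes):
    """Return the index of the first sector starting with LABELONE, or None."""
    for sector in range(LABEL_SCAN_SECTORS):
        start = sector * SECTOR_SIZE
        if image[start:start + len(LABEL_MAGIC)] == LABEL_MAGIC:
            return sector
    return None

# An intact PV as in the sda1 dump: magic at offset 0x200 (sector 1).
intact = bytearray(4 * SECTOR_SIZE)
intact[0x200:0x208] = LABEL_MAGIC
# After pvremove, as in the sdc2 dump: the label sector is all zeros.
wiped = bytearray(4 * SECTOR_SIZE)

print(find_label_sector(bytes(intact)))  # 1
print(find_label_sector(bytes(wiped)))   # None
```

To try it on a real partition, the buffer would be the first 2048 bytes read from the device node (as root), which is a read-only operation.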

Just in case I accidentally do even more harm, I made some backups:

   root@chii:/dev# dd if=/dev/sdc2 bs=16 count=288 of=/sdc2-backup
   root@chii:/dev# dd if=/dev/sdb1 bs=16 count=288 of=/sdb1-backup
   root@chii:/dev# dd if=/dev/sda1 bs=16 count=288 of=/sda1-backup
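
A quick check of what those dd parameters cover, as a small Python sketch (the variable names are mine). With bs=16 and count=288 the span is 4608 bytes, i.e. offsets 0x0000..0x11FF. That is slightly more than the 0x0200..0x0FFF region: it also takes in the block at 0x1000 that begins with the " LVM2 x[5A%r0N*>" signature in the dumps.

```python
BLOCK_SIZE = 16   # dd bs=
COUNT = 288       # dd count=

span = BLOCK_SIZE * COUNT
last_offset = span - 1

print(span, hex(last_offset))   # 4608 0x11ff
# The label sector (0x200) and the block at 0x1000 both fall inside the span.
print(0x200 <= last_offset, 0x1000 <= last_offset)   # True True
```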
And then I copied the section from sdb1 to sdc2:

   root@chii:/dev# dd if=/dev/sdb1 bs=16 count=288 of=sdc2
   288+0 records in
   288+0 records out
   4608 bytes (4.6 kB) copied, 0.00936218 s, 492 kB/s
   root@chii:/dev# hexdump -C /dev/sdc2 |more
   00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   *
   00000200  4c 41 42 45 4c 4f 4e 45  01 00 00 00 00 00 00 00  |LABELONE........|
   00000210  87 0a f4 de 20 00 00 00  4c 56 4d 32 20 30 30 31  |.... ...LVM2 001|
   00000220  64 66 4f 78 66 46 41 56  54 59 44 63 44 67 71 4e  |dfOxfFAVTYDcDgqN|
   00000230  36 44 70 47 65 70 46 34  32 57 49 42 44 63 56 62  |6DpGepF42WIBDcVb|
   00000240  00 04 f9 26 5d 00 00 00  00 00 03 00 00 00 00 00  |...&]...........|
   00000250  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   00000260  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
   00000270  00 f0 02 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   00000280  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   *
   00001000  c7 58 f6 36 20 4c 56 4d  32 20 78 5b 35 41 25 72  |.X.6 LVM2 x[5A%r|
   00001010  30 4e 2a 3e 01 00 00 00  00 10 00 00 00 00 00 00  |0N*>............|
   00001020  00 f0 02 00 00 00 00 00  00 f4 00 00 00 00 00 00  |................|
   00001030  a0 0c 00 00 00 00 00 00  e1 1c c6 87 00 00 00 00  |................|
   00001040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   *
   00001200  2d 62 73 4b 67 2d 77 54  49 72 2d 56 6a 34 4b 63  |-bsKg-wTIr-Vj4Kc|
   00001210  37 22 0a 64 65 76 69 63  65 20 3d 20 22 2f 64 65  |7".device = "/de|
   00001220  76 2f 73 64 63 32 22 0a  0a 73 74 61 74 75 73 20  |v/sdc2"..status |
   00001230  3d 20 5b 22 41 4c 4c 4f  43 41 54 41 42 4c 45 22  |= ["ALLOCATABLE"|
   <snip>
This did not help: LVM still complained.

   Couldn't find device with uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'.
I checked out the volume group backup file at /etc/lvm/backup/owner to see what the "uuid"s for sda1 and sdb1 were, and sure enough, they matched the regions at 0220..023F in each of those partitions (minus the dashes). So I entered the uuid to sdc2 manually:

   echo -n '92MWt7CPjsZt5Diac0bsKgwTIrVj4Kc7' | dd of=/dev/sdc2 bs=1 seek=544 conv=notrunc
   32+0 records in
   32+0 records out
   32 bytes (32 B) copied, 0.00629778 s, 5.1 kB/s
   root@chii:/dev# hexdump -C /dev/sdc2 |more
   00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   *
   00000200  4c 41 42 45 4c 4f 4e 45  01 00 00 00 00 00 00 00  |LABELONE........|
   00000210  87 0a f4 de 20 00 00 00  4c 56 4d 32 20 30 30 31  |.... ...LVM2 001|
   00000220  39 32 4d 57 74 37 43 50  6a 73 5a 74 35 44 69 61  |92MWt7CPjsZt5Dia|
   00000230  63 30 62 73 4b 67 77 54  49 72 56 6a 34 4b 63 37  |c0bsKgwTIrVj4Kc7|
   00000240  00 04 f9 26 5d 00 00 00  00 00 03 00 00 00 00 00  |...&]...........|
   <snip>
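
The UUID handling in that step can be sketched in Python. It assumes what the dumps suggest: the label stores the 32 UUID characters without dashes, starting 0x20 bytes into the label sector at 0x200 (offset 544), while the LVM tools print them in groups of 6-4-4-4-4-4-6. The helper names are hypothetical.

```python
UUID_OFFSET = 0x200 + 0x20          # 544, the dd seek= value used above
GROUPS = (6, 4, 4, 4, 4, 4, 6)      # dash grouping as printed by the LVM tools

def uuid_to_disk(dashed: str) -> bytes:
    """Strip the dashes and return the 32 ASCII bytes stored in the label."""
    raw = dashed.replace("-", "")
    if len(raw) != 32:
        raise ValueError("expected 32 UUID characters, got %d" % len(raw))
    return raw.encode("ascii")

def uuid_from_disk(raw: bytes) -> str:
    """Re-insert the dashes for display."""
    text = raw.decode("ascii")
    parts, pos = [], 0
    for width in GROUPS:
        parts.append(text[pos:pos + width])
        pos += width
    return "-".join(parts)

on_disk = uuid_to_disk("92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7")
print(UUID_OFFSET)              # 544
print(on_disk.decode())         # 92MWt7CPjsZt5Diac0bsKgwTIrVj4Kc7
print(uuid_from_disk(on_disk))  # 92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7
```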
To my surprise, this did not help. LVM was still complaining as before. So I investigated in more detail:

   lvm> pvs -d -v -v -d
         Setting global/locking_type to 1
         File-based locking selected.
         Setting global/locking_dir to /var/lock/lvm
         report/aligned not found in config: defaulting to 1
         report/buffered not found in config: defaulting to 1
         report/headings not found in config: defaulting to 1
         report/separator not found in config: defaulting to
         report/prefixes not found in config: defaulting to 0
         report/quoted not found in config: defaulting to 1
         report/columns_as_rows not found in config: defaulting to 0
         report/pvs_sort not found in config: defaulting to pv_name
         report/pvs_cols_verbose not found in config: defaulting to pv_name,vg_name,pv_fmt,pv_attr,pv_size,pv_free,dev_size,pv_uuid
       Scanning for physical volume names
         /dev/sda1: lvm2 label detected
         Label checksum incorrect on /dev/sdc2 - ignoring
         /dev/sdc2: No label detected
         /dev/sdb1: lvm2 label detected
         Label checksum incorrect on /dev/sdc2 - ignoring
   <snip>
It's complaining about a label checksum, and consequently ignoring the whole partition. This would be troublesome. How can I fix a broken checksum?

At this point, I realized where the solution was. It is in the source code! So I looked in the Linux kernel source code.

   root@chii:/usr/src/linux/drivers/md# grep -i checksum dm*
   root@chii:/usr/src/linux/drivers/md#
Huh? No matches? It must be userspace then.

   root@chii:/usr/local/src# apt-get source lvm2
   Reading package lists... Done
   Building dependency tree
   Reading state information... Done
   Need to get 610kB of source archives.
   Get:1 http://ftp.de.debian.org testing/main lvm2 2.02.39-2 (dsc) [1132B]
   Get:2 http://ftp.de.debian.org testing/main lvm2 2.02.39-2 (tar) [594kB]
   Get:3 http://ftp.de.debian.org testing/main lvm2 2.02.39-2 (diff) [14.5kB]
   Fetched 610kB in 1s (306kB/s)
   dpkg-source: extracting lvm2 in lvm2-2.02.39
   dpkg-source: info: unpacking lvm2_2.02.39.orig.tar.gz
   dpkg-source: info: applying lvm2_2.02.39-2.diff.gz
   root@chii:/usr/local/src# cd lvm2*/
   root@chii:/usr/local/src/lvm2-2.02.39# find -type f|xargs grep -i checksum
   ./po/lvm2.po:msgid "%s: Checksum error"
   ./po/lvm2.po:msgid "Incorrect metadata area header checksum"
   ./po/lvm2.po:msgid "Label checksum incorrect on %s - ignoring"
   ./lib/label/label.c:                            log_info("Label checksum incorrect on %s - "
   <snip>
Jackpot! So I would need to check out what lib/label/label.c is doing when it complains about a checksum.

             if (calc_crc(INITIAL_CRC, &lh->offset_xl, LABEL_SIZE -
                     ((uintptr_t) &lh->offset_xl - (uintptr_t) lh)) !=
                 xlate32(lh->crc_xl)) {
                 log_info("Label checksum incorrect on %s - "
                      "ignoring", dev_name(dev));
                 continue;
             }
Oh, CRC32? Neat. How about label.h?

   /* On disk - 32 bytes */
   struct label_header {
       int8_t id[8];       /* LABELONE */
       uint64_t sector_xl; /* Sector number of this label */
       uint32_t crc_xl;    /* From next field to end of sector */
       uint32_t offset_xl; /* Offset from start of struct to contents */
       int8_t type[8];     /* LVM2 001 */
   } __attribute__ ((packed));
Oh, neater! So now I know the structure of the LVM disklabel header.

   00000200  4c 41 42 45 4c 4f 4e 45  01 00 00 00 00 00 00 00  |LABELONE........|
   00000210  87 0a f4 de 20 00 00 00  4c 56 4d 32 20 30 30 31  |.... ...LVM2 001|
   00000220  39 32 4d 57 74 37 43 50  6a 73 5a 74 35 44 69 61  |92MWt7CPjsZt5Dia|
   00000230  63 30 62 73 4b 67 77 54  49 72 56 6a 34 4b 63 37  |c0bsKgwTIrVj4Kc7|
   00000240  00 04 f9 26 5d 00 00 00  00 00 03 00 00 00 00 00  |...&]...........|
   00000250  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   00000260  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
   00000270  00 f0 02 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   00000280  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
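
Read against the struct, the first 32 bytes of that dump decode as in the following Python sketch. It assumes the fields are stored little-endian on disk, which the sector number (01 00 ...) and the offset (20 00 00 00) suggest; the variable names are mine.

```python
import struct

# First 32 bytes of the label sector on sdc2, copied from the hexdump above.
header = bytes.fromhex(
    "4c4142454c4f4e45"   # id: "LABELONE"
    "0100000000000000"   # sector_xl
    "870af4de"           # crc_xl (still the value copied over from sdb1)
    "20000000"           # offset_xl
    "4c564d3220303031"   # type: "LVM2 001"
)

# "<" = little-endian, no padding: 8s + Q + I + I + 8s = 32 bytes.
magic, sector, crc, offset, label_type = struct.unpack("<8sQII8s", header)

print(magic, sector, "%08x" % crc, offset, label_type)
# b'LABELONE' 1 def40a87 32 b'LVM2 001'
```

The four checksum bytes 87 0a f4 de therefore represent the number DEF40A87, and the same byte order applies when a new value is written back.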
So in this, I would need to replace 87 0a f4 de with the CRC32 checksum of the data from 0214 to 03FF (end of the sector). That's easy. We have a CRC32 tool. To verify that I'm getting it right, I checked using one of the intact partitions.

   root@chii:/dev# hexdump -C /dev/sda1 |head -n 20
   00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   *
   00000200  4c 41 42 45 4c 4f 4e 45  01 00 00 00 00 00 00 00  |LABELONE........|
   00000210  54 73 f8 85 20 00 00 00  4c 56 4d 32 20 30 30 31  |Ts.. ...LVM2 001|
   00000220  78 64 4e 63 70 6c 32 6b  65 46 73 73 45 68 5a 71  |xdNcpl2keFssEhZq|
   00000230  4e 69 4f 69 47 65 79 58  55 32 41 38 6a 66 56 77  |NiOiGeyXU2A8jfVw|
   00000240  00 04 f9 26 5d 00 00 00  00 00 03 00 00 00 00 00  |...&]...........|
   00000250  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   00000260  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
   00000270  00 f0 02 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   00000280  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   *
   root@chii:/dev# dd if=/dev/sda1 skip=$[512+16+4] bs=1 count=$[512-8-8-4] > ~/tmptmp; crc32 ~/tmptmp
   492+0 records in
   492+0 records out
   492 bytes (492 B) copied, 0.0014547 s, 338 kB/s
   b7b1c8e3
Hmm, this does not match. B7B1C8E3 is not the same as 85F87354 (the stored bytes 54 73 f8 85, read as a little-endian number). Why does it not match?

Well, the secret turned out to be that the INITIAL_CRC value in the source code from label.c is not the same as the initial CRC value used in the standard CRC32 implementation.

I checked lib/misc/crc.h and found this:

   #define INITIAL_CRC 0xf597a6cf
   uint32_t calc_crc(uint32_t initial, const void *buf, uint32_t size);
So I would need to write a custom CRC32 program that would use this particular initialization value for the CRC checksum.

Luckily, the CRC32 program I used was a Perl script, so it would be easy to edit it. Here's the original:

   #!/usr/bin/perl -w
   eval 'exec /usr/bin/perl -w -S $0 ${1+"$@"}'
       if 0; # not running under some shell
   # computes and prints to stdout the CRC-32 values of the given files
   use lib qw( blib/lib lib );
   use Archive::Zip;
   use FileHandle;
   my $totalFiles = scalar(@ARGV);
   foreach my $file (@ARGV) {
       if ( -d $file ) {
           warn "$0: ${file}: Is a directory\n";
           next;
       }
       my $fh = FileHandle->new();
       if ( !$fh->open( $file, 'r' ) ) {
           warn "$0: $!\n";
           next;
       }
       binmode($fh);
       my $buffer;
       my $bytesRead;
       my $crc = 0;
       while ( $bytesRead = $fh->read( $buffer, 32768 ) ) {
           $crc = Archive::Zip::computeCRC32( $buffer, $crc );
       }
       printf( "%08x", $crc );
       print("\t$file") if ( $totalFiles > 1 );
       print("\n");
   }
Here's how I changed it:

   <snip>
       my $crc = ~0xf597a6cf;
       while ( $bytesRead = $fh->read( $buffer, 32768 ) ) {
           $crc = Archive::Zip::computeCRC32( $buffer, $crc );
       }
       $crc = ~$crc;
       printf( "%08x", $crc );
   <snip>
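
Why the two bitwise complements do the job: a zlib-style CRC-32 routine inverts the running value on the way in and again on the way out, so feeding it ~INITIAL_CRC and inverting the result leaves a plain CRC update that starts at INITIAL_CRC. The Python sketch below illustrates this under the assumption that LVM's calc_crc is the usual reflected CRC-32 update (polynomial 0xEDB88320) with no final inversion. raw_crc32 and lvm_crc are my own names, and Python's zlib.crc32 stands in for Archive::Zip::computeCRC32.

```python
import zlib

INITIAL_CRC = 0xF597A6CF   # from lib/misc/crc.h
MASK = 0xFFFFFFFF

def raw_crc32(data: bytes, crc: int) -> int:
    """Bit-by-bit reflected CRC-32 update with no inversion at either end."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc

def lvm_crc(data: bytes) -> int:
    """Same value via zlib, using the complement trick from the Perl edit."""
    return ~zlib.crc32(data, ~INITIAL_CRC & MASK) & MASK

sample = b"LVM2 001" + bytes(484)   # arbitrary 492-byte input, not real label data

# The complement trick and the plain update agree.
print(lvm_crc(sample) == raw_crc32(sample, INITIAL_CRC))     # True
# Sanity check: all-ones seed plus final inversion gives the standard CRC-32.
print("%08x" % (raw_crc32(b"123456789", MASK) ^ MASK))       # cbf43926
```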
Now to test:

   root@chii:/usr/local/src/lvm2-2.02.39/lib# dd if=/dev/sda1 skip=$[512+16+4] bs=1 count=$[512-8-8-4] > ~/tmptmp; /root/crc32tmp ~/tmptmp
   492+0 records in
   492+0 records out
   492 bytes (492 B) copied, 0.00149226 s, 330 kB/s
   85f87354
85f87354 -- that's a match. Now I would take a checksum of sdc2 and poke it back in:

   root@chii:/dev# dd if=/dev/sdc2 skip=$[512+16+4] bs=1 count=$[512-8-8-4] > ~/tmptmp; ~/crc32tmp ~/tmptmp
   492+0 records in
   492+0 records out
   492 bytes (492 B) copied, 0.0147573 s, 33.3 kB/s
   414798fb
   root@chii:/dev# echo -ne '\xfb\x98\x47\x41'|dd of=/dev/sdc2 bs=1 seek=$[512+16]
   4+0 records in
   4+0 records out
   4 bytes (4 B) copied, 0.00801839 s, 0.5 kB/s
   root@chii:/dev# hexdump -C /dev/sdc2 |head -n 20
   00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   *
   00000200  4c 41 42 45 4c 4f 4e 45  01 00 00 00 00 00 00 00  |LABELONE........|
   00000210  fb 98 47 41 20 00 00 00  4c 56 4d 32 20 30 30 31  |..GA ...LVM2 001|
   00000220  39 32 4d 57 74 37 43 50  6a 73 5a 74 35 44 69 61  |92MWt7CPjsZt5Dia|
   00000230  63 30 62 73 4b 67 77 54  49 72 56 6a 34 4b 63 37  |c0bsKgwTIrVj4Kc7|
   00000240  00 04 f9 26 5d 00 00 00  00 00 03 00 00 00 00 00  |...&]...........|
   <snip>
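
The compute-and-poke step can also be done in one go. The following Python sketch works on an in-memory copy of the 512-byte label sector rather than on the device, and rests on the same assumptions as the result above: the checksum runs from offset 0x14 of the sector to its end, starts at INITIAL_CRC, and is stored little-endian at offset 0x10. fix_label_crc and label_crc_ok are hypothetical helper names.

```python
import struct
import zlib

INITIAL_CRC = 0xF597A6CF    # from lib/misc/crc.h
MASK = 0xFFFFFFFF
SECTOR_SIZE = 512
CRC_OFFSET = 0x10           # crc_xl within the label sector
DATA_OFFSET = 0x14          # offset_xl: first byte covered by the checksum

def lvm_crc(data: bytes) -> int:
    """CRC-32 seeded with INITIAL_CRC and without the final inversion."""
    return ~zlib.crc32(data, ~INITIAL_CRC & MASK) & MASK

def fix_label_crc(sector: bytes) -> bytes:
    """Return a copy of the label sector with crc_xl recomputed."""
    if len(sector) != SECTOR_SIZE or sector[:8] != b"LABELONE":
        raise ValueError("not an LVM2 label sector")
    patched = bytearray(sector)
    crc = lvm_crc(bytes(patched[DATA_OFFSET:]))
    struct.pack_into("<I", patched, CRC_OFFSET, crc)
    return bytes(patched)

def label_crc_ok(sector: bytes) -> bool:
    """Check the stored crc_xl against a freshly computed value."""
    (stored,) = struct.unpack_from("<I", sector, CRC_OFFSET)
    return stored == lvm_crc(sector[DATA_OFFSET:])

# Demo on a synthetic sector (not real device data).
sector = bytearray(SECTOR_SIZE)
sector[0:8] = b"LABELONE"
sector[0x18:0x20] = b"LVM2 001"
good = fix_label_crc(bytes(sector))
stale = bytearray(good)
stale[CRC_OFFSET] ^= 0xFF                     # deliberately break the checksum

print(label_crc_ok(good))                     # True
print(label_crc_ok(bytes(stale)))             # False
print(fix_label_crc(bytes(stale)) == good)    # True
```

For sdc2, the sector would be the 512 bytes read at offset 512 of the device, written back to the same offset after patching. The run above produced 414798fb for it, which lands on disk as fb 98 47 41.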
Time to test whether it worked!

Problem averted

   lvm> pvs
     /dev/sdc2: Checksum error
     /dev/sdc2: Checksum error
     /dev/sdc2: Checksum error
     /dev/sdc2: Checksum error
     PV         VG    Fmt  Attr PSize   PFree
     /dev/sda1  owner lvm2 a-   372.61G  29.11G
     /dev/sdb1  owner lvm2 a-   372.61G   5.96G
     /dev/sdc2  owner lvm2 a-   186.11G 186.11G
Still complains about a checksum error, but now it finds sdc2! A fatal error has turned into a warning. This is a big win!

Lastly, to fix the warning, I did this:

   root@chii:~# vgcfgbackup
     Volume group "owner" successfully backed up.
   root@chii:~# vgcfgrestore owner
     Restored volume group owner
After that, everything was back in order:
   lvm> pvs
     PV         VG    Fmt  Attr PSize   PFree
     /dev/sda1  owner lvm2 a-   372.61G  29.11G
     /dev/sdb1  owner lvm2 a-   372.61G   5.96G
     /dev/sdc2  owner lvm2 a-   186.11G 186.11G
And I could safely delete the partition this time:

   lvm> vgreduce owner /dev/sdc2
     Removed "/dev/sdc2" from volume group "owner"

   lvm> pvremove /dev/sdc2
     Labels on physical volume "/dev/sdc2" successfully wiped

   lvm> pvs
     No physical volume label read from /dev/sdc2
     PV         VG    Fmt  Attr PSize   PFree
     /dev/sda1  owner lvm2 a-   372.61G 29.11G
     /dev/sdb1  owner lvm2 a-   372.61G  5.96G
Hmm, still a little glitch, but it went away with another round of vgcfgbackup and vgcfgrestore.

Back to business

Next I deleted sdc2 and replaced it with a swap partition, and tried the hibernate option, but it didn't work. For some reason, after writing 1 GB of stuff to the swap partition and shutting down all non-essential harddisks, the kernel suddenly was unable to find any swap device!

   [1749825.798000] PM: Cannot find swap device, try swapon -a.
   [1749826.271752] Restarting tasks ... done.
And it brought the system back up and running. I guess Linux has problems hibernating to SATA devices. But that's another story.

The point of this article was to show how to salvage a partition accidentally destroyed with pvremove, and that it did :)

In retrospect, this command would probably have saved me the hassle of doing hex editing and stuff:

   pvcreate /dev/sdc2 --uuid '92MWt7-CPjs-Zt5D-iac0-bsKg-wTIr-Vj4Kc7'
It would create a disklabel on sdc2 with exactly the right UUID, without touching the volume group metadata, i.e. filling in precisely the information that was missing, which I did manually. Oh well. Next time I'll know. At least I learned a thing or two: about the format of LVM disklabels, and to never ignore two forcing flags and capital-letter warnings.

In 2010, a Michal Docekal wrote to me, sharing this detail:

I just got to your article named "How to undo pvremove" on your web page. I got to a very similar situation a while ago:
debian:~# pvremove -ff /dev/hdb1
debian:~# vgdisplay
Couldn't find device with uuid '3wvOQk-M0uS-XJN6-nlrl-x8Uz-CJDF-in2fEs'.
Couldn't find all physical volumes for volume group data.
...
Volume group "data" doesn't exist

In the end, I discovered a much faster way to undo this:
vgreduce data --removemissing
And that was it. I don't know if there was this option back in 2008, but in any way, it would be good if you added this information to the article, so that other desperate admins can find it.

Last edited: 2010-02-25 18:05:06