16 July 2013

Hard Drive to .VHD

Let’s say you have an XP PC that died, and you want to run that installation in a virtual machine (VM).  The first step will be to harvest that installation into a virtual hard drive.  Different virtualization host software supports different file types for virtual hard drives, but I’ll be using the .VHD standard, which should work in Virtual PC, VirtualBox and VMware Player.

Preparing the physical hard drive

The hard drive’s from a failed PC, so step one is to be sure of the hardware and file system.  The drive is taken out of the dead PC, and dropped into a known PC that is set to boot off safe maintenance OSs (mOS) and tools such as Bart, WinPE, Sardu, BootIt Next Generation, etc.  Do not allow the hard drive to boot!  You will also want to connect another hard drive with enough space to swallow what you will harvest, and make both connections using internal S-ATA or IDE rather than USB, for speed and SMART access.  FAT32 limits the size of a file to 4G, so if you are using Disk2VHD or capturing a .WIM image, your target drive file system should be NTFS.

First, I boot my Bart DVD and from there, launch HD Tune, and look at the SMART details.  You can use any SMART tool you like, as long as it shows the raw Data column for these four attributes: Reallocated Sectors, Reallocation Events, Pending Sectors and Offline Uncorrectable sectors.  For all of these, the raw data should be zero; if any is not, evacuate the stricken drive first as files, then as partition image, before doing anything else, including surface scan and other diagnostics.  Don’t trust SMART tools that only show “OK” or “Fail” status; that’s next to useless.
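As a rough sketch of that rule, here is how the check could be written down in Python.  The attribute IDs are the usual ATA numbers for these four counters; the raw_values dictionary is a hypothetical stand-in for whatever you read off your SMART tool’s Data column, and not every drive reports all four.

```python
# Usual ATA SMART attribute IDs for the four counters discussed above.
CRITICAL_ATTRIBUTES = {
    5: "Reallocated Sectors",
    196: "Reallocation Events",
    197: "Pending Sectors",
    198: "Offline Uncorrectable",
}


def failing_attributes(raw_values):
    """Return a list of reasons to evacuate the drive first.

    raw_values is a hypothetical dict of attribute ID -> raw integer,
    copied by hand from the SMART tool's raw Data column.
    """
    problems = []
    for attr_id, name in CRITICAL_ATTRIBUTES.items():
        value = raw_values.get(attr_id)
        if value is None:
            # Some drives do not expose every attribute; flag it for a human.
            problems.append(f"{name}: not reported")
        elif value != 0:
            problems.append(f"{name}: raw value {value}")
    return problems
```

An empty list means all four raw values are zero; anything else means harvest files and a partition image before running any diagnostics.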

If the hard drive is physically OK, then check and fix the file system from a compatible mOS.  If the volume to be salvaged is NTFS, then the mOS version should be the same as, or newer than, the OS installed on the drive.  So you can use WinPE 4.0 for any Windows, WinPE 3.0 for Windows 7 and older, WinPE 2.0 for Vista and older, and as we’re after XP in this case, we can use any of those plus Bart, which is built from the XP code base.  Run ChkDsk /F and capture the results, either via command line redirection (you may have to press Y if nothing happens, in case there are “Do you want to…?” prompts you can’t see) or by capturing text from the command window.
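The “same or newer” rule can be sketched as a comparison of NT kernel versions.  The two tables below are my own illustration of the generations mentioned above, not anything these tools expose.

```python
# NT kernel generation behind each maintenance OS (illustrative table).
MOS_KERNEL = {
    "Bart": (5, 1),       # built from the XP code base
    "WinPE 2.0": (6, 0),  # Vista generation
    "WinPE 3.0": (6, 1),  # Windows 7 generation
    "WinPE 4.0": (6, 2),  # Windows 8 generation
}

# NT kernel generation of the installation to be salvaged.
INSTALLED_KERNEL = {
    "XP": (5, 1),
    "Vista": (6, 0),
    "Windows 7": (6, 1),
    "Windows 8": (6, 2),
}


def safe_mos_choices(installed):
    """Maintenance OSs that are the same generation as, or newer than, the install."""
    target = INSTALLED_KERNEL[installed]
    return [name for name, version in MOS_KERNEL.items() if version >= target]
```

For XP this returns all four choices; for Windows 7 it leaves only WinPE 3.0 and WinPE 4.0.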

Next, you will want to retrieve the Windows product key from the installation, in case that has to be re-entered to activate Windows when booted in the different (virtualized) PC.  I use Bart as the mOS for that, along with two free add-ons.  The first is the RunScanner plugin for Bart, which “wraps” the inactive installation’s registry hives as if that installation had booted these into action, so that registry-aware tools will see these as native (unfortunately, there’s no equivalent to RunScanner for Vista and later).  The second is ProduKey from Nirsoft, which reads the keys you need.

The final preparation step is to zero out free space, so that these sectors are not included by certain types of harvesting process; otherwise they will bloat the size of the .vhd you will eventually create.  You can download SDelete and use that from Bart; the -c -z options will create a file full of zeros to fill the free space, and then delete the file.
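To show what “a file full of zeros that fills the free space” amounts to, here is a minimal Python sketch of the idea.  It is not how SDelete is implemented; the file name zerofill.tmp and the max_bytes safety cap are my own inventions for illustration.

```python
import os


def zero_fill_free_space(directory, chunk_size=1024 * 1024, max_bytes=None):
    """Write zeros into a filler file under `directory`, then delete it.

    max_bytes is a hypothetical cap for trial runs; with None, writing
    continues until the volume reports that it is full.
    Returns the number of zero bytes written.
    """
    filler = os.path.join(directory, "zerofill.tmp")
    written = 0
    try:
        # buffering=0 so a full volume shows up as an error on write, not on close.
        with open(filler, "wb", buffering=0) as handle:
            while max_bytes is None or written < max_bytes:
                size = chunk_size if max_bytes is None else min(chunk_size, max_bytes - written)
                try:
                    written += handle.write(bytes(size))
                except OSError:
                    break  # no space left: the free space now holds zeros
    finally:
        # Deleting the filler returns the (now zeroed) space to the file system.
        if os.path.exists(filler):
            os.remove(filler)
    return written
```

Try it with a small max_bytes on a scratch folder first; on FAT32 the filler would also run into the 4G file size limit, so a real tool has more to do than this.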

Harvesting as loose files

In Windows 9x, if you copied every file to a new hard drive and created the correct MBR and PBR contents, Windows would boot and run just fine.  This is no longer true for NT-based OSs such as XP, but you may still want access to loose files, and if the drive is failing and dies soon, that may be all you get - and more useful than a partial partition or installation image.

I use Bart as the mOS for this, finding it easier to work from a GUI shell than command line.  I have Windows Directory Statistics integrated into my Bart, and use that to compare file counts to be sure I haven’t left anything out.
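If you’d rather script the comparison than eyeball it, something like the following Python sketch would do the same job from any OS that can see both drives.  The function names are mine, and it checks names and sizes only, not file contents.

```python
import os


def tree_summary(root):
    """Count files and total bytes under root, skipping symbolic links."""
    files = 0
    total = 0
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            if os.path.islink(path):
                continue
            files += 1
            total += os.path.getsize(path)
    return files, total


def missing_files(source_root, copy_root):
    """Relative paths that exist under source_root but not under copy_root."""
    missing = []
    for folder, _subdirs, names in os.walk(source_root):
        for name in names:
            rel = os.path.relpath(os.path.join(folder, name), source_root)
            if not os.path.exists(os.path.join(copy_root, rel)):
                missing.append(rel)
    return sorted(missing)
```

Matching results from tree_summary on both sides, and an empty list from missing_files, is a reasonable sign that nothing was left behind.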

Harvesting as partition image

There are various partition managers that can boot off CD, and the one I’ve been using is BING (BootIt Next Generation).  If you use BING in this way, it may show an installation dialog when it starts; cancel that, as you don’t want to install it as a boot manager.  Then go into partition maintenance mode and work from there.

You can also boot BING within the virtual machine, from physical disc or captured .ISO, as long as the VM’s “BIOS” is set to boot optical before hard drive.  BING can be used in this way to resize partitions within .VHD, but cannot change the size of the “physical hard drive” as seen within the VM; use VHDResizer for that, from the host OS.

BING can also save a partition or volume image as a set of files, of a maximum size that you can select to best fit CDRs, FAT32 file system limitations, etc. and that’s quite an advantage over .WIM and .VHD images and the tools that create them.
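To get a feel for how many pieces a given limit produces, here is a small arithmetic sketch in Python.  It has nothing to do with BING’s own file format; the constant is the FAT32 per-file limit of one byte under 4 GiB.

```python
FAT32_MAX_FILE_BYTES = 2**32 - 1  # largest file FAT32 can hold


def split_sizes(total_bytes, max_piece_bytes):
    """Sizes of the pieces needed to hold an image of total_bytes."""
    if max_piece_bytes <= 0:
        raise ValueError("max_piece_bytes must be positive")
    full_pieces, remainder = divmod(total_bytes, max_piece_bytes)
    sizes = [max_piece_bytes] * full_pieces
    if remainder:
        sizes.append(remainder)
    return sizes
```

For example, an image of exactly 8 GiB needs three pieces on FAT32 rather than two, because each piece must stop one byte short of 4 GiB.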

When in BING, it’s a good idea to display the partition table, and take a picture of that via digital camera, in case there are any “surprises” there that kick in when attempting to boot the virtual machine later.

I can also use DriveImage XML as plugged into Bart, as a tool to create and restore partition and volume images; unlike BING, you can browse and extract files from within the image it creates.  There are probably other backup tools that can do the same, but make sure they have tools to work from bare metal, and that these tools work within virtual machines, as Bart and BING can do.

Harvesting as installation image

You can use Microsoft’s imaging tools to create a .WIM, which is a file-based partition image with the OS-specific smarts to leave out page file, hibernation file, System Volume Information etc.  Because in this case the original PC is dead, we don’t have the option to generalize the installation via SysPrep; a relief in a way, as SysPrep is so rough you’d want some other full backup before using it.

Access to these imaging tools was difficult at best, in XP and older versions of Windows; you had to be in corporate IT or a large OEM to legitimately get these.  You can now download what you need from Microsoft for free, though it’s a large download, and in any case the full OS installation disc can now boot to a command line and function as a WinPE.

In Vista and 7, you’d use a command line tool called ImageX to harvest (capture) to .WIM and apply .WIM to new hard drives, and can add a free 3rd-party tool called GImageX for a less mistake-prone UI.  In Windows 8, you’d use the DISM command instead, and I’ve not sought or found a GUI front-end for that; instead, I’m using batch files to remember the syntax involved.
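The same “remember the syntax for me” idea can be sketched in Python, building the capture command as an argument list.  The switches below are as I understand them from Microsoft’s documentation, so check them against imagex /? and dism /? in your own WinPE before relying on them; the drive letters, paths and image name in any call are hypothetical.

```python
def imagex_capture(source_volume, wim_path, image_name):
    """Argument list for an ImageX capture at maximum compression."""
    return [
        "imagex", "/compress", "maximum",
        "/capture", source_volume, wim_path, image_name,
    ]


def dism_capture(source_dir, wim_path, image_name):
    """Argument list for the equivalent DISM capture (Windows 8 generation)."""
    return [
        "dism", "/Capture-Image",
        f"/ImageFile:{wim_path}",
        f"/CaptureDir:{source_dir}",
        f"/Name:{image_name}",
        "/Compress:max",
    ]
```

Either list can be handed to subprocess.run from a Python that runs in the maintenance environment, or simply printed and pasted into a batch file.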

Harvesting directly to .VHD

There are said to be many tools for this, but I’ve only found one: Disk2VHD.  There’s also something called P2V, but it appears to be part of a costly software product aimed at the corporate IT sector, and may apply more (only?) to Microsoft’s Hyper-V virtualization technologies.

Disk2VHD boasts the ability to image partitions that are in use by the running OS, via that OS’s shadow copy engine.  Unfortunately, that is the only way it can work – so it will not run from Bart or WinPE.  You are obliged to boot a hard drive based Windows to host the tool, exposing the hard drive to be harvested to whatever that OS may do. 

That’s too risky for at-risk hard drives, as Windows tends to fiddle with every hard drive volume it can see.  WinME and XP are the worst offenders, as they enable the System Restore engine on every hard drive volume detected and immediately start writing to those file systems.  At least Vista, 7 and 8 don’t do that!

It’s important to remember that Disk2VHD captures entire hard drives, not just partitions or volumes, even though the UI implies the latter by allowing selection of partitions and volumes to be included.  For example, if you have a 64G C: on a 500G hard drive and you deselect all volumes other than C:, you will create a virtual hard drive 500G in size with a 64G C: partition on it, the rest being left empty.  You may have hoped for a 64G drive filled with C: but that is not what you’ll get.

Size matters

Guess what the sizes of these various harvestings will be, compared to the original drive?  Then check out the results of doing this for real, for a mature in-use XP installation with shell folders relocated out of C:

  • 500G – capacity of original hard drive
  • 30G – size of original NTFS C: partition
  • 12.0G – size of files harvested
  • 13.5G – size of files harvested, as occupied space on FAT32
  • 7.48G – size of BING image file set
  • 4.69G – size of .WIM image created from WinPE 3.0 and ImageX
  • 12.6G – size of .VHD file as created by Disk2VHD after zeroing preparation
  • 10.8G – as seen within .VHD via Bart boot within the VM

Note that .VHDs are ignorant of the file system and OS within; this is why it’s inappropriate to blame the tool’s creators when a harvested installation fails to boot within a VM.  A significant effect of this is that any sectors containing anything other than zeros will be included as explicit blocks within the dynamic and differencing types of .VHD, which would otherwise have saved host space by leaving out empty sectors.  The .VHD manages space in large blocks, so this effect is made worse; if any sector in a block is non-zero, the whole block is added to the .VHD.
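The block effect is easy to model.  The Python sketch below assumes 512-byte sectors and a 2 MB block size, which I believe is the usual default for dynamic .VHD, and it ignores the file’s own header and table overhead.

```python
SECTOR_BYTES = 512
BLOCK_BYTES = 2 * 1024 * 1024                    # assumed dynamic .VHD block size
SECTORS_PER_BLOCK = BLOCK_BYTES // SECTOR_BYTES  # 4096 sectors per block


def allocated_blocks(nonzero_sectors):
    """Block indices that must exist in the file, given non-zero sector numbers."""
    return {sector // SECTORS_PER_BLOCK for sector in nonzero_sectors}


def stored_bytes(nonzero_sectors):
    """Host bytes taken by data blocks (headers and tables not counted)."""
    return len(allocated_blocks(nonzero_sectors)) * BLOCK_BYTES
```

Three stray non-zero sectors that happen to land in three different blocks cost 6 MB on the host, which is why leftover junk in free space bloats the file so effectively.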

Of these, the .WIM is the most compact (I capture using the strongest compression offered); then the BING image file set.  After that, things are pretty much as you’d expect, though even with zero optimizing preparation and before using the .VHD, the (dynamic) .VHD file is already significantly larger than the files it contains.

Creating a new .VHD

If you used the Disk2VHD tool, you already have your .VHD populated – but it may not be a physical size (as seen from within the VM) that you’d like.  In theory, if the partition size is limited, the “physical” space outside that should never be written to, thus never contain anything other than zeros, and thus never add size to any dynamic or differencing .VHD file on the host.  In practice, you may prefer to constrain the physical size of the virtual hard drive, especially if choosing the fixed type of .VHD that always contains every sector as-is, regardless of content.

When creating a new .VHD you set the capacity of the hard drive it pretends to be, and whether the .VHD will be of fixed or dynamic type.  The fixed type is the .VHD equivalent of a fixed-size pagefile; if it’s created as an unfragmented file, it should perform better than one that grows in fragments over time.

Either way, your host volume should have enough free space to contain the full size of the .VHD’s internal capacity, or at least that of all partitions and volumes within the .VHD you intend to ever use.

You can also layer .VHD over each other, with a fixed or dynamic .VHD as the base.  Each layer above that will be a differencing .VHD, valid only as long as the lower layers do not independently change.  Both differencing and dynamic .VHD use the same storage model, which is like a “super-FAT” chain of large blocks that explicitly exist only if any contents within have changed, relative to the layer below.  Everything under the base layer is assumed to be all zeros, so a block that has never contained anything other than zeros need not explicitly exist in any differencing or dynamic .VHD.
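Here is a toy Python model of that layering, to make the lookup rule concrete.  It is a simplification of my own: real .VHD files track changes more finely within each block, and the 4-byte block size is only there to keep the example readable.

```python
BLOCK_BYTES = 4  # tiny block size, for illustration only


class Layer:
    """One .VHD in a chain; stores only the blocks written to this layer."""

    def __init__(self, parent=None):
        self.parent = parent  # the layer below, or None for the base
        self.blocks = {}      # block index -> bytes

    def write(self, index, data):
        self.blocks[index] = data

    def read(self, index):
        # Walk down the chain until some layer explicitly holds the block.
        layer = self
        while layer is not None:
            if index in layer.blocks:
                return layer.blocks[index]
            layer = layer.parent
        # Below the base layer, everything is assumed to be zeros.
        return bytes(BLOCK_BYTES)
```

Writing to a differencing layer never touches the base, but writing to the base changes what every layer above it reads, which is why the upper layers stop being trustworthy once the base changes on its own.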

That means every .VHD layer may grow as large as the size of all partitions and volumes within it; host free space should be available accordingly.
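As a back-of-envelope check, the worst case can be worked out like this in Python.  The function names are mine; shutil.disk_usage is the standard library call that reports free space on the volume holding a given path.

```python
import shutil


def worst_case_bytes(partition_bytes, layer_count):
    """Host space needed if every layer grows to hold every partition block."""
    return partition_bytes * layer_count


def host_has_room(host_path, partition_bytes, layer_count):
    """True if the volume holding host_path could absorb that worst case."""
    free = shutil.disk_usage(host_path).free
    return free >= worst_case_bytes(partition_bytes, layer_count)
```

A 30G partition under a base plus two differencing layers could, in the worst case, want 90G of host space.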

Because changes to an underlying .VHD layer will invalidate all layers above, they are generally used in two ways.  You can have a base image that is kept read-only so that it can never change, and this can be the “installation image” over which multiple VMs can run, each with their own changing differencing .VHD; this is how XP Mode is set up.  You can also use a differencing .VHD above that, which is considered disposable until it is merged with the .VHD below; you may use that for guest accounts, malware investigation and other testing, kiosk use etc.  Virtual PC may use this as “undo” disks, though these use the .VUD rather than .VHD file type.

To merge .VHDs, you need enough free disk space on the host for the output .VHD, which could be as large as the .VHD being merged.  To compact a .VHD (i.e. discard any blocks full of zeros in dynamic .VHDs) you need enough free host disk space to create the new .VHD; these considerations make .VHDs costly in terms of disk space, and in hard drive head travel to partitions beyond where they are stored.
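Compaction can be pictured with a small Python model, where a dynamic .VHD is just a dictionary of block index to block contents.  This is an illustration of the space cost only, not of what any real compacting tool does internally.

```python
def compact(blocks):
    """Return a new block map without blocks that contain only zeros."""
    return {index: data for index, data in blocks.items() if any(data)}


def peak_host_bytes(blocks):
    """Space held while the old file and the new compacted file both exist."""
    old_size = sum(len(data) for data in blocks.values())
    new_size = sum(len(data) for data in compact(blocks).values())
    return old_size + new_size
```

The peak is the old file plus the new one, so a .VHD with few zero blocks needs nearly double its own size in free host space while being compacted.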

Populating the new .VHD

If you used partition imaging tools like BING or DriveImage XML, or installation imaging tools like ImageX from WinPE, you will need to write these images to the new .VHD you created above.  You may also need to move contents from one .VHD to another, if you need to change the base .VHD type, or don’t want to use VHDResizer to change the size of the physical hard drive contained within the .VHD.

One way to do this is to use these “real” bootable discs within the virtual machine, either by booting the VM from the physical disc, or by capturing the relevant .ISO and booting the VM into that.  If you can’t “see” host hard drive volumes within the VM, then the materials should be copied into another .VHD that is attached as an additional hard drive before starting the VM session.  You can do that in suitable host OSs (e.g. Windows 7) by mounting the .VHD for use as if it were a native hard drive volume; otherwise you may need a VM that can see outside via network shares etc.

1 comment:

Anonymous said...

Great article thank you! A GUI for DISM is here: https://dismgui.codeplex.com/