20 November 2009

Sysprep Fails, WinPE Sees Wrong Drive Letters

If you…

…then you may find…

  • Sysprep fails before it completes
  • WinPE “sees” partitions with incorrect drive letters

The impact can be severe: you may find you’ve built your .WIM from the wrong partition, or have Sysprep ruin both the .WIM you harvest and the reference system you’d built.  Attempts to maintain the Windows installation via WinRE, or to “just” re-install Windows, may fail too; I haven’t tested those scenarios.

The fix is to make sure the Windows boot partition is set as active in the partition table before you apply Sysprep or attempt access from WinPE, WinRE, OS installation disk, etc.  You can do this after Windows and Ubuntu have been installed; it won’t affect these, or how grub works.

The cause is a combination of the way grub works (which bypasses the normal MBR “boot the partition that is set as ‘active’” code logic) and the way Microsoft code assigns drive letters to Windows-visible partitions and logical volumes.

Standard MBR logic

The Master Boot Record (MBR) is the first sector of the physical hard drive, and acts as an extension of the system BIOS.  It exists outside of any OS, running as it does before any particular OS has come into effect.

The standard MBR contains a partition table defining up to 4 partitions, one of which may be flagged as “active”.  The standard MBR code logic is to look for the (first?) active partition and chain into code within the first sector of this space.  At this point, the system phase of the boot process ends, and the OS phase begins.
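To make the partition table layout concrete, here is a minimal Python sketch that decodes the four table entries from a 512-byte MBR sector and reports which are flagged “active”.  It relies on the standard MBR layout (table at offset 446, four 16-byte entries, 0x80 as the active flag, 0x55 0xAA signature at the end).  The function names and the demo sector, including its start and size values, are made up for illustration and are not taken from the test PCs.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446          # partition table follows the boot code area
ENTRY_SIZE = 16
ACTIVE_FLAG = 0x80
# byte 0 = boot flag, byte 4 = partition type,
# bytes 8-11 = start LBA, bytes 12-15 = sector count (little-endian)
ENTRY_FORMAT = "<B3xB3xII"


def read_partition_table(mbr: bytes):
    """Return one dict per non-empty entry in a 512-byte MBR sector."""
    if len(mbr) != SECTOR_SIZE or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    entries = []
    for slot in range(4):
        start = TABLE_OFFSET + slot * ENTRY_SIZE
        flag, ptype, lba, count = struct.unpack(
            ENTRY_FORMAT, mbr[start:start + ENTRY_SIZE])
        if ptype == 0:      # type 0 marks an unused slot
            continue
        entries.append({"slot": slot, "active": flag == ACTIVE_FLAG,
                        "type": ptype, "start_lba": lba, "sectors": count})
    return entries


def make_entry(flag, ptype, lba, count):
    """Build one 16-byte table entry (demo helper, illustrative only)."""
    return struct.pack(ENTRY_FORMAT, flag, ptype, lba, count)


# Synthetic sector: Linux (flagged active), swap, NTFS, extended.
# The LBA and size numbers are arbitrary placeholders.
demo = bytearray(SECTOR_SIZE)
demo[446:510] = (make_entry(0x80, 0x83, 2048, 1000)
                 + make_entry(0x00, 0x82, 4096, 1000)
                 + make_entry(0x00, 0x07, 8192, 1000)
                 + make_entry(0x00, 0x0F, 16384, 1000))
demo[510:512] = b"\x55\xaa"

table = read_partition_table(bytes(demo))
active_slots = [e["slot"] for e in table if e["active"]]
print("active slots:", active_slots)
```

In this demo the NTFS entry (type 0x07) is not the one flagged as active, which is the situation this post is about.  To inspect a real disk you would pass in the first 512 bytes of the drive; how you obtain those depends on the OS and tools at hand.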

How grub works

The grub boot manager adds some initial code to the MBR which links to the bulk of its code within the Ubuntu partition.  At boot time, this modified MBR code will always chain into the rest of grub, irrespective of which partition entry in the partition table is set as “active”.  The partition table is still referenced to find partitions, but the “active” setting is now ignored, and is thus irrelevant.

You may assume that the partition you booted via grub will be set as “active” in the partition table, but this is not the case; grub (at least grub 2, as contained in Ubuntu 9.10) does not update the “active” flag to match what you booted last, even if it is set to default to the last-booted OS on the next boot.

How Microsoft assigns drive letters

Microsoft OSs can “see” two groups of partition types; primary partitions that may be bootable and define a single volume, and an extended partition type that is not bootable but can contain multiple logical volumes.  Each volume contains a single file system and is typically assigned a single drive letter.

Drive letters have validity only within a Microsoft OS.  In the absence of “remembered” settings within that OS, they are assigned as follows…

A: and B: reserved for legacy diskette drives
For each physical hard drive…
  Assign ascending letters to each “active” primary partition
…next drive until all done
For each physical hard drive…
  Assign ascending letters to each logical volume in extended partition
…next drive until all done
For each physical hard drive…
  Assign ascending letters to each “inactive” primary partition
…next drive until all done
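The three passes above can be sketched in Python as follows.  This models only the ordering as I have described it, not Microsoft’s actual code, and the function name and the “kind” labels are my own.  Only volumes that Windows can see should be listed.

```python
from string import ascii_uppercase


def assign_letters(disks):
    """Assign drive letters in three passes over all physical drives.

    disks: list of drives; each drive is a list of (name, kind) tuples,
    where kind is "active", "logical" or "inactive".
    Returns a dict mapping volume name to drive letter.
    """
    letters = iter(ascii_uppercase[2:])   # A: and B: are reserved
    assigned = {}
    for kind in ("active", "logical", "inactive"):
        for drive in disks:               # "...next drive until all done"
            for name, volume_kind in drive:
                if volume_kind == kind:
                    assigned[name] = next(letters) + ":"
    return assigned


logicals = [("logical 1", "logical"), ("logical 2", "logical"),
            ("logical 3", "logical")]

# Same drive, differing only in whether the NTFS primary is "active".
with_active = assign_letters([[("NTFS primary", "active")] + logicals])
without_active = assign_letters([[("NTFS primary", "inactive")] + logicals])

print(with_active)
print(without_active)
```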

For example, if you have an NTFS primary partition and an extended partition containing three logical volumes, these will be lettered as C:, D:, E: and F: if the primary is set as “active”, and F:, C:, D: and E: if the primary is not set as “active” – and so…

Here comes the pain

When Windows boots off the hard drive, it can override the above logic in two ways. 

Firstly, it is aware of which partition it booted from, and which volume contains the bulk of its own code; these drive letters are recorded within the OS and can’t be changed.

Secondly, it remembers drive letters assigned to volumes it has “seen” before.  Unlike the letters for boot and OS volumes, these can be changed by the user, causing new values to be “remembered” and applied on subsequent boots.

But when you don’t boot this OS code, e.g. you boot WinRE, WinPE or the OS installation disk instead, then all those remembered settings do not apply.  I suspect Sysprep applies fresh logic during its processing as well, thus breaking its assumption base and causing it to fail.

Further, you may not be aware that the “active” flag status is at variance with boot history, and therefore assume that because you last booted Windows, the Windows partition will be the one currently set as “active”.  But that is not what happens when grub is in effect.

Best practices

I would suggest the following, to reduce these sorts of risk…

1.  Always do an image backup prior to Sysprep

Sysprep can be as destructive as “just” re-installing Windows, or shifting/resizing existing partitions.  In practice, I have had far more destructive failures with Sysprep than with repair installs of XP, upgrades over older OS versions and partition management, all of which have been safer than Service Pack installs.  So if you would always back up before doing those sorts of things, then all the more reason to back up before Sysprep.

Unlike Win9x and older Microsoft OSs, accurately copying every single file from one drive to another will not result in a bootable system, even if the drives and partitions are identical in size and you also copy over PBR contents that exist outside the file system. 

That is why you have to do a partition image backup (e.g. from BING boot, using Drive Image from Bart boot, etc.) to preserve your “undo” trail.

2.  Check that the Windows primary is set as “active”

This should now be added to your sanity-checks before signing off on a system build, running Sysprep, harvesting .WIM images from WinPE, etc. 

If you have a WinRE installation set up to boot in the event of Windows boot failure, then it may be important for the correct partition to be set as “active” at all times.
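As a rough illustration of what “setting the partition as active” changes on disk, the Python sketch below flips the flag on an in-memory copy of an MBR sector.  It uses the standard MBR layout (table at offset 446, 16-byte entries, 0x80 as the active flag); the function name and the demo sector are hypothetical.  It deliberately does not write to any disk; for the real change, use a partition tool (I used BING).

```python
TABLE_OFFSET = 446      # start of the partition table in the MBR sector
ENTRY_SIZE = 16
ACTIVE_FLAG = 0x80


def set_active(mbr: bytes, slot: int) -> bytes:
    """Return a copy of the sector with only `slot` flagged as active."""
    if len(mbr) != 512 or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    if not 0 <= slot < 4:
        raise ValueError("slot must be 0..3")
    out = bytearray(mbr)
    if out[TABLE_OFFSET + slot * ENTRY_SIZE + 4] == 0:
        raise ValueError("slot is empty")   # type byte 0 = unused entry
    for i in range(4):
        # Only the first byte of each entry (the boot flag) is touched.
        out[TABLE_OFFSET + i * ENTRY_SIZE] = ACTIVE_FLAG if i == slot else 0x00
    return bytes(out)


# Synthetic sector: Linux (active), swap, NTFS, extended.
demo = bytearray(512)
for slot, ptype in enumerate((0x83, 0x82, 0x07, 0x0F)):
    demo[TABLE_OFFSET + slot * ENTRY_SIZE + 4] = ptype
demo[TABLE_OFFSET] = ACTIVE_FLAG
demo[510:512] = b"\x55\xaa"

fixed = set_active(bytes(demo), 2)          # flag the NTFS entry instead
print(fixed[:TABLE_OFFSET] == bytes(demo[:TABLE_OFFSET]))
```

The final line checks that the boot code area ahead of the partition table is byte-for-byte unchanged, which fits the observation that fixing the flag does not affect how grub works.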

3.  Apply descriptive names to disk volumes

I apply the names “C-Drive”, “D-Drive” etc. to partitions and volumes as I create them in BING, so that these are the names I will see when working in BING to manipulate them as partitions. 

BING writes these names into the boot record of the volume, whereas the name you apply in Windows is held as a Volume Label entry within the root directory of that volume.  So you can have “pretty” names in Windows, Bart CDR boot, etc. and accurate names in BING.

My own practice is to choose “pretty” names that happen to start with the expected drive letter, so I get a quick visual sanity-check before operating on them in Windows.  For example, if I see “Core” is C: but “Data”, “Extras” and “Factory” are E:, F: and G:, then I know something’s gone wrong and should be fixed before I generate new paths based on these wrong letters.  I’d know to look for an optical drive or other intruder that has become “D:”, and fix that.
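That visual check is easy to automate.  The sketch below assumes you already have a mapping of drive letters to volume labels (how you collect it is up to you) and simply reports any label whose first character does not match its letter.  The function name is my own.

```python
def label_mismatches(volumes):
    """Return (letter, label) pairs whose label does not start with the letter.

    volumes: dict mapping a drive letter such as "C:" to its volume label.
    """
    return [(letter, label)
            for letter, label in sorted(volumes.items())
            if not label or label[0].upper() != letter[0].upper()]


# The example from the text: an intruder has taken D:.
seen = {"C:": "Core", "E:": "Data", "F:": "Extras", "G:": "Factory"}
problems = label_mismatches(seen)
print(problems)
```

An empty result means every label agrees with its letter; anything else should be fixed before generating new paths.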

How this was tested

I tested this on new PCs built with the following hardware:

  • Intel “GoldTree” G43 chipset motherboard, latest BIOS applied
  • E6300 processor, VT enabled in BIOS (is off by duhfault)
  • 2 x 2G = 4G DDR2-800 Kingston Value RAM
  • S-ATA Seagate 1.5T hard drive as S-ATA 0
  • S-ATA LG DVD writer as S-ATA 3 (last)

Partitions and OSs were:

  • 30G Ubuntu 9.10 partition (not visible to Windows)
  • 4G Ubuntu swap partition (not visible to Windows)
  • 64G primary partition, Windows 7 64-bit, as C:
  • Extended partition containing FAT32 logicals D:, E: and F:
  • MBR contains grub 2 as installed with Ubuntu 9.10

Two PCs were tested, one with Home Basic and one with Pro as the Windows 7 edition, both being DSP (small OEM) installations.  The grub menu was set to default to the OS that was booted last, and this was always Windows during these tests. 

BING was used to create and manage partitions (unlike Windows, it can format FAT32 volumes larger than 32G) and was not installed as boot manager.

Test procedure:

  • BING boot, image backup Win7 primary to logical E:
  • Set Win7 primary as “active”
  • Boot hard drive; grub defaults to last selected (Windows), OK
  • Boot Windows; works, drive letters OK
  • Boot WinPE; what should be C: D: E: F: seen as C: D: E: F:, OK
  • Boot Windows, run Sysprep; works OK
  • BING boot; now…
  • Set Ubuntu primary as “active”
  • Boot hard drive; grub defaults to last selected (Windows), OK
  • Boot Windows; works, drive letters OK
  • Boot WinPE; what should be C: D: E: F: seen as F: C: D: E: - Fail
  • Boot Windows, run Sysprep; fails before post-processing boot
  • Windows is now not functioning, and remains so after reboot - Fail

In each case, Sysprep was run without answer file or CLI parameters; OOBE was selected, Generalize was checked, and Reboot selected as the post-processing action.
