2 August 2007

When ChkDsk Doesn't


Subject: Re: Unable to run CHKDSK with "Fix" option

On Wed, 25 Jul 2007 17:46:03 -0700, Paulie

I think it is beyond time that we had a proper interactive file system maintenance tool for NTFS.  ChkDsk is a relic from the MS-DOS 5 days; I wish NT would at least catch up to, say, MS-DOS 6 Scandisk.

Now folks will flame me for saying that.  "File systems are too complex for users to understand, just trust us to fix everything for you".  Fine; let's read on and see how well that works...

>My new Notebook is unable to run a CHKDSK with the Fix option selected.
>I can run a normal CHKDSK within VISTA and it works without a problem.
>If I choose the Fix option it schedules a scan on the next boot. Upon
>CHKDSK will begin but it will freeze after 8% of the scan. The
>Notebook does not respond and I have to power it down and restart.

Great, so now we combine a possibly corrupted file system in need of repair, with recurrent bad exits.  What's wrong with this picture?

>Any idea what this may be?

Given that ChkDsk and AutoChk are closed boxes with little or no documentation of what they are doing (and little or no feedback to you while they are doing it), one can only guess.

My guesses would be one of:

1)  Physically failing HD

When a sector can't be read, the HD will retry the operation a number of times before giving up.  Whatever driver code calls the operation will probably also retry a few times before giving up, and so may the higher-level code that called that, and so on.

The result can be an apparent "hang" lasting seconds to minutes while the system beats the dying disk to death.
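To put rough numbers on that, here is a minimal sketch of how layered retries compound.  The function name, the attempt counts and the half-second timing are all made up for illustration; real drives and drivers use their own values, which I can't see from here.

```python
def worst_case_stall_seconds(attempt_seconds, attempts_per_layer):
    """Worst-case time spent on one unreadable sector.

    Each layer repeats the whole layer beneath it, so the attempt
    counts multiply rather than add.
    """
    total_attempts = 1
    for attempts in attempts_per_layer:
        total_attempts *= attempts
    return attempt_seconds * total_attempts


# Illustrative figures only (assumptions, not measurements): a failed read
# costs 0.5 s, and is attempted 8 times by the drive, 3 times by the driver
# and 3 times by the code that called the driver.
stall = worst_case_stall_seconds(0.5, [8, 3, 3])
print(stall)  # 36.0 seconds for a single bad sector
```

Multiply that by however many bad sectors the scan runs into, and a "freeze" at a fixed percentage starts to look plausible.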

That's before you factor in futile attempts to paper over the problem and pretend it isn't there, both by the HD itself, and by the NTFS code.  Each will attempt to read the sick allocation unit's data and write it to a "good" replacement, then switch usage so that the dead sector is avoided in future.  And so on, for the next dead sector, etc.

2)  Lengthy repair process

Scandisk and ChkDsk have no "big picture" awareness.  If they were you, walking from A to B, they would take a step, calculate if they were at B, then take another, and repeat.  If they were walking in the wrong direction, away from B, they'd just keep on walking forever.

So when something happens that invalidates huge chunks of the file system, these tools don't see the "big picture" and STOP and say "hey, something is invalidating the way this file system is viewed".  No; they look at one atom of the file system, change it to fit the current view, and repeat for the next.  If that means changing every atom in the file system, that is what they will do.  Result: garbage.
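What "seeing the big picture" might look like, as a minimal sketch: survey first, change nothing, and refuse to go on if an implausible share of the file system looks wrong.  Everything here is hypothetical (the `plan_repairs` name, the 5% threshold, records as a plain list); it is not how ChkDsk or Scandisk work inside, which is rather the point.

```python
class TooManyRepairs(Exception):
    """Raised when the repair plan would rewrite a suspicious share of the volume."""


def plan_repairs(records, is_consistent, max_fraction=0.05):
    """Collect the records that need fixing without changing anything yet.

    Stops early if more than max_fraction of all records look wrong,
    on the theory that the viewpoint, not the data, is probably broken.
    """
    limit = max_fraction * len(records)
    suspects = []
    for index, record in enumerate(records):
        if not is_consistent(record):
            suspects.append(index)
            if len(suspects) > limit:
                raise TooManyRepairs(
                    f"{len(suspects)} of {len(records)} records flagged; "
                    "refusing to continue"
                )
    return suspects


# Hypothetical volume of 100 records where only two are bad: a plan comes back.
print(plan_repairs(list(range(100)), lambda r: r % 50 != 0))  # [0, 50]
```

If every record fails the check, the same call raises `TooManyRepairs` after the sixth hit instead of "fixing" all hundred.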

3)  Bugginess

Whereas (2) is a bad design working as designed, sometimes the code doesn't work as designed and falls off the edge.

Needless to say, AutoChk and ChkDsk don't maintain any undoability. They also "know better than you", so they don't stop and ask you before "fixing" things, they just wade in and start slashing away.
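Undoability need not be exotic.  As a rough sketch, assuming a toy "disk" held as a dictionary of sector contents (the `UndoableRepair` class and its methods are invented for illustration, not anything a real tool exposes), a repair tool could log what it overwrites and roll back on request:

```python
class UndoableRepair:
    """Toy repair session that remembers what each write replaced."""

    def __init__(self, sectors):
        self.sectors = sectors  # dict: sector number -> bytes
        self.journal = []       # (sector, previous contents), in write order

    def write(self, sector, data):
        # Save the old contents (None if the sector was unused) before changing it.
        self.journal.append((sector, self.sectors.get(sector)))
        self.sectors[sector] = data

    def undo_all(self):
        # Replay the journal backwards so the earliest saved state wins.
        while self.journal:
            sector, previous = self.journal.pop()
            if previous is None:
                self.sectors.pop(sector, None)
            else:
                self.sectors[sector] = previous


disk = {7: b"old"}
session = UndoableRepair(disk)
session.write(7, b"fixed")
session.write(9, b"new")
session.undo_all()
print(disk)  # {7: b'old'}
```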

>I have run full virus scans and updates on the drive and
>there are no issues. Other than this the Notebook runs fine. Its just a
>strange issue that i am unable to fix at this stage.

I would at least exclude (1) by checking the HD's surface using the appropriate tests in HD Tune (www.hdtune.com), after backing up my data.  You should be able to get a "second opinion" on the file system, but you can't; ChkDsk and AutoChk are all you have.
