18 November 2020

SSDs and Windows 7

Geek summary: This post attempts to collect everything you need to use a SATA SSD in Windows 7

  1. Check CMOS Setup SATA is AHCI, not IDE or RAID (no TRIM)
  2. Check whether SSD is listed for Defrag
  3. Check/refresh WEI Storage score; should be > 5.9
  4. Check SuperFetch and PreFetch via Regedit; DWORD 0
  5. Check SuperFetch via Services; ?Disable
  6. Admin Cmd: fsutil behavior query disabledeletenotify = 0

Windows abstracts SSDs and disk drives together as "storage", although the two technologies are completely different, with different gamuts of strengths and weaknesses. Windows Vista, XP and older will treat SSDs as if they were hard drives, which will shorten their life due to increased write operations.

Windows 7 can safely use SSDs, but the process is less transparent than it is on Windows 8.x and Windows 10, so you have to dig deeper to ensure all is set and working as it should be.

AHCI vs. IDE and RAID

SSDs that connect via PCI-e or M.2 slots are beyond the scope of this post; I don't expect Windows 7 to "see" these interfaces at all, especially if the M.2 SSD is NVMe.  Motherboards new enough to include an M.2 slot, are probably "too new" for Windows 7 anyway.

CMOS Setup can set SATA to operate as legacy IDE, AHCI, or RAID.  Windows 7 may only apply TRIM via AHCI mode, but switching between these modes may cause the next Windows 7 boot to fail on a BSoD, so ensure Windows is OK for AHCI before changing this setting in CMOS!

Note: Bart PE and Windows XP and older will need "F6 diskette" boot drivers added in order to boot in AHCI mode, else they will fail with the same BSoD.

While in CMOS Setup, you may also want to disable Hot Plugging (eSATA) for SATA ports connecting to internal drives, so these are not shown as "removable" in the Safe To Remove UI.  This is far easier than chasing registry settings etc. via methods that vary with Windows version, Microsoft vs. Intel drivers, etc. as it fixes the issue at its source.

Does Windows 7 see drive as SSD?

This is easy to determine; right-click a drive letter, Properties, Tools tab, Defrag, drill into Schedule, Drives, and see if the drive letters from the SSD are listed.  If they are, Windows 7 is seeing and treating these as hard drives; bad news!  If they are not listed to be selected, they are seen as SSD, OK.

Note that you'll still see SSD drive letters listed to be manually defragged in Windows 7 - but even if you leave Scheduled Defrag enabled, the SSD won't be defragged automatically, which is good.

This is where Windows 8.x/10 are so much better; "Defrag" is now "Optimize", each drive is listed as SSD or Hard Drive, and the Optimize operation will show as Defrag or TRIM.  But while Windows 7 may have the right clue behind the scenes, everything is still UI'd as if SSDs don't exist!

Does Windows 7 see the speed?

ReadyBoost, Prefetch and SuperFetch are storage behaviors that are enabled or disabled according to the detected speed of the storage device.  This is tested by the WinSat CLI tool, which is in turn called by the Windows Experience Index (WEI) UI. Subsystem scores are shown by WEI up to the 7.9 maximum value, but are stored in the registry as decimal integer values; e.g. the typical hard drive score shown as 5.9 is stored in the registry as 59 decimal.
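As a minimal sketch of that score encoding (the function names are mine, and the cut-off is just the "> 5.9" from the checklist at the top of this post, not a documented Windows threshold):

```python
# Raw registry value that displays as 5.9, the typical hard drive score.
HARD_DRIVE_RAW = 59


def wei_display_score(raw: int) -> float:
    """Convert a stored integer score (e.g. 59) to the value WEI shows (5.9)."""
    return raw / 10


def scored_above_hard_drive(raw: int) -> bool:
    """True if the stored score beats the typical hard drive score of 5.9."""
    return raw > HARD_DRIVE_RAW
```

So a stored 59 displays as 5.9 and fails the check, while a stored 79 (the 7.9 maximum) passes it.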

WinSat and WEI are not updated automatically, so the old hard drive score may persist, causing a newly-added SSD to be treated as if it were a slower hard drive; Prefetch and Superfetch will be enabled to help speed it up, while ReadyBoost will be disabled as the drive will still be considered too slow to speed up anything else.

There are registry entries to manage Prefetch and Superfetch, but in my experience they don't "stick", so it's better to tackle the root of this issue by refreshing the WEI scores.  This post suggests why; the behaviors periodically check the storage speed score and act accordingly, and this may include overriding the registry settings that "control" these behaviors. 
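For checking (not fixing) those registry entries, here is a hedged sketch. The key path and the EnablePrefetcher / EnableSuperfetch value names are the ones I know for Windows 7; the helper names are my own, and reading the registry only works on Windows.

```python
PREFETCH_KEY = (
    r"SYSTEM\CurrentControlSet\Control\Session Manager"
    r"\Memory Management\PrefetchParameters"
)
VALUE_NAMES = ("EnablePrefetcher", "EnableSuperfetch")


def read_prefetch_settings():
    """Read both DWORDs from HKLM; a missing value is reported as None."""
    import winreg  # Windows-only module, imported here so the rest runs anywhere

    settings = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, PREFETCH_KEY) as key:
        for name in VALUE_NAMES:
            try:
                settings[name], _value_type = winreg.QueryValueEx(key, name)
            except FileNotFoundError:
                settings[name] = None
    return settings


def not_set_to_zero(settings):
    """List the values that are not explicitly 0 (i.e. not disabled)."""
    return [name for name in VALUE_NAMES if settings.get(name) != 0]
```

On the Windows 7 PC itself, not_set_to_zero(read_prefetch_settings()) returning an empty list means both DWORDs are 0; if they creep back to non-zero later, that matches the "don't stick" behavior described above.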

However, manually re-testing the storage speed, either via the WEI UI or digging into WinSat, may fail.  Probable causes include boot-time storage filter drivers that are part of resident antivirus; these can also affect Safe Mode, where such drivers remain integrated.  There is a Microsoft Hotfix for this, but as it's no longer available from Microsoft, you have to grope for it elsewhere.  After checking the digital signature and uploading to VirusTotal for safety, I installed the Hotfix, and WEI then worked without having to uninstall Avast on my client's PC (disabling Avast did not fix the problem).

I also tried editing the storage score via Regedit; that was ignored, and a good thing too - else "price hero" PC vendors could more easily fake these scores to hide poor performance!

I didn't delve into ReadyBoost, but found these interesting posts.

Is TRIM enabled?

The following details are from my fuzzy memory, and are not required for the scope of this "how to" reference; search and read up multiple sources for more accurate information.

At the SSD firmware level, there is no awareness of partitions, file systems, whether things are deleted or not, so any block which has ever been written and/or has non-zero contents, will be preserved by the firmware.  Flash memory blocks are large, and the firmware's wear-leveling logic may spread writes to unused blocks first, which are also faster to write (hence best out-the-box performance and benchmarks).  Eventually, all blocks will have been used (even if the drive has never contained much data), and write amplification may set in.

TRIM is an SSD firmware feature that optimizes write operations, by adding awareness of what blocks need not be preserved and/or can be erased and/or zero'd out.  Windows 7+ can send the firmware "hints" as to what blocks are to be considered deleted, so the firmware can better inform its internal garbage collection etc. - but it's a horse-and-water situation as to whether the firmware will apply these "hints" immediately, enqueue them for action when "idle", lose them from the queue when it fills up before "idle", or ignore them completely.

Windows 7 sends these hints to the SSD, if this "as admin" command...

fsutil behavior query disabledeletenotify

...returns the value 0.  It's far harder to determine whether the SSD firmware is actually doing TRIM at all, and I've not delved into such detail; hopefully we can assume recent SSDs should "just work" for TRIM?
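If you'd rather script that check, this is a rough sketch; it assumes the command prints a line of the form "DisableDeleteNotify = 0" (which is what I recall from Windows 7), and the function names are mine. It only tells you Windows is sending the hints, not whether the SSD firmware acts on them.

```python
import re
import subprocess


def trim_hints_enabled(fsutil_output: str) -> bool:
    """True if every DisableDeleteNotify value in the output is 0."""
    values = re.findall(
        r"DisableDeleteNotify\s*=\s*(\d+)", fsutil_output, re.IGNORECASE
    )
    if not values:
        raise ValueError("no DisableDeleteNotify value found in output")
    return all(value == "0" for value in values)


def query_trim_hints() -> bool:
    """Run the fsutil query (Windows, elevated prompt) and interpret it."""
    result = subprocess.run(
        ["fsutil", "behavior", "query", "disabledeletenotify"],
        capture_output=True,
        text=True,
        check=True,
    )
    return trim_hints_enabled(result.stdout)
```

Call query_trim_hints() from an "as admin" session; the parser is kept separate so it can be tried against pasted output on any machine.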

Windows 7 has no UI to initiate a TRIM (strictly speaking, a "re-trim" operation to send hints to the SSD in the hope its firmware will TRIM).  Once again, Windows 8.x/10 are far better at this; the updated "Defrag" (now "Optimize") UI shows SSDs as such, and that it is TRIM rather than Defrag that happens when an SSD is Optimized.  These newer OSs also improve the chances that the SSD firmware will actually TRIM, by sending re-trim hints when idle, etc.

TRIM, NTFS and Defrag

In Windows 10, TRIM requires NTFS, and presumably this is the case behind Windows 7's opaque UI also.  Ironically, the design of NTFS means it needs Defrag, though less often, for different reasons, and with different logic to be effective.

The SSD itself doesn't need Defrag at all; not only is there no head travel to optimize, the actual location of data within flash memory addressing is most likely scrambled by the SSD firmware wear-leveling logic.  TRIM helps clean up the garbage, whereas Defrag just moves the garbage around.

But whereas FATxx file systems set aside a fixed and duplicated File Allocation Tables to hold cluster chaining information, NTFS attempts to improve scalability by holding pointers to each extent (i.e. run of contiguous clusters) within each file or directory's metadata.  

This is fine when the cluster chain is unfragmented; the overhead is the same as a FATxx directory entry (one pointer, to the start of the cluster chain), effectively dumping the FAT tables for free. But a fragmented file needs an additional metadata pointer for each fragment, and that can bloat the metadata into trouble.
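A toy illustration of why fragmentation bloats NTFS metadata; this is not NTFS's actual on-disk format, just a sketch that collapses a file's cluster chain into (start, length) extents, with made-up cluster numbers:

```python
def extents(clusters):
    """Collapse an ordered cluster chain into (start, length) runs."""
    runs = []
    for cluster in clusters:
        if runs and cluster == runs[-1][0] + runs[-1][1]:
            # Cluster continues the current run; extend it by one.
            start, length = runs[-1]
            runs[-1] = (start, length + 1)
        else:
            # Gap in the chain; a new extent (and a new pointer) is needed.
            runs.append((cluster, 1))
    return runs
```

A contiguous four-cluster file such as [100, 101, 102, 103] needs one extent, (100, 4); the chain [100, 101, 500, 501, 250] needs three, and every extra fragment adds another pointer to the file's metadata.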

For this reason, Windows 10 will do some sort of defrag on SSD volumes where the Volume Shadow Copy service is active, once a month or so, to prevent NTFS from blowing itself up.  This wouldn't be necessary for FATxx, but Windows won't TRIM FATxx, so you can't avoid the issue.

Volume Shadow Copy (VSC) is the engine behind System Protection, System Restore, File History, and the ability to back up files that are "in use".  It creates and populates "\System Volume Information", where ChkDsk results are also stored.  You may be able to avoid VSC by disabling System Protection, either entirely or for particular volumes that are on SSD.

Windows 7 may predate this awareness of the need to defrag NTFS, and so there may be reason to do a manual Defrag of SSD volumes, though rarely; perhaps twice a year or so?  

However, the standard Defrag logic will not defrag files larger than 64M - exactly the files that are most likely to blow up NTFS, as they will have the largest number of metadata cluster chain pointers.  Concurrent write operations will do the most damage, and huge slowly-growing files (hello, Outlook .pst) are most likely to get into trouble.  VSC increases the risk of concurrent writes, which is probably why Windows 10 takes that as a cue to surreptitiously defrag SSD volumes, at the risk of horrified Internet posters screaming "never defrag an SSD!!"

14 November 2020

Intel 10nm GPU Driver Blanks BCD Boot Menu

This is the second case of a brand new laptop based on Intel's 10nm 10xxGx processors, in which the Intel Display Adapter driver causes the BCD {BootMgr} menu to be invisible (black-on-black), though still working.  Here's how to demonstrate the bug:

  1. Run BCDEdit from an "As Admin" Cmd or PowerShell
  2. Add an OSLoader entry to BCD {BootMgr}, so BCD boot menu will be invoked
  3. Set the Timeout to 15 or so
  4. Restart system from cold, i.e. not "Fast Startup", Resume from Sleep or Hibernate, etc.
  5. Rotating dots will vanish to blank black screen, where you should have seen the boot menu
  6. Wait for timeout or press Enter; system will boot as expected
  7. Device Manager, select Intel Display Adapter, Disable
  8. Repeat from (4), menu will now appear as it should, OK
  9. Device Manager, select Intel Display Adapter, Enable
  10. Repeat from (4), menu will fail to be visible again
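Steps (2) and (3) above could be sketched as follows; copying {current} is just one way to get a second OSLoader entry, and the description text and helper names are placeholders of mine. The sketch only builds the command lines, it doesn't run them.

```python
def boot_menu_commands(description: str, timeout_seconds: int = 15):
    """bcdedit argument lists: add a second OSLoader entry, set menu timeout."""
    return [
        ["bcdedit", "/copy", "{current}", "/d", description],
        ["bcdedit", "/timeout", str(timeout_seconds)],
    ]


def as_command_lines(commands):
    """Render argument lists as Cmd-style lines, quoting args with spaces."""
    return [
        " ".join(f'"{arg}"' if " " in arg else arg for arg in command)
        for command in commands
    ]
```

For example, as_command_lines(boot_menu_commands("Test Entry")) gives the two lines to type into an "As Admin" Cmd.  In PowerShell, put quotes around {current}, or the braces are taken as a script block.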

The current case is a Dell Inspiron 3593 based on i7-1065G7, whereas the first case was an Asus X509JA-i541GT based on i5-1035G1.  Both processors are 10nm but have different integrated GPUs.  Windows 10 versions 1909 (Dell original), 2004 (Asus updated) and 20H2 (Dell updated) equally affected.

Click the link words to drill down into detail, and here for a fuller description of the problem.

Don't Kick Away The Ladder

You climb a ladder onto the roof, then kick away the ladder. 
How do you safely get down to put the ladder back up so you can safely get down?

Such an obvious f-up, you wonder why I need to write this post.  Like a cartoon character who runs off a cliff, panics while briefly suspended in mid-air, then inevitably plunges to a cartoon death; surely, system designers aren't that stupid?

Examples abound, especially in the age of Class 3 UEFI that forces us to share these stupid risks. The "Extensibility" of UEFI allows code to be integrated before any OS can boot, and this code can persist into the OS runtime, so it's hard to see how this can be "more secure" when fundamentally unsafe.  Years of (U)EFI's buggy "growing up in public" makes it clear such code is insufficiently trivial to be free of exploitable bugs.  A ladder should be trivial enough to never break; flaky firmware can break systems in ways that cannot be fixed!

I'm setting up a new Dell laptop, and in the firmware setup, is an on-by-default option to allow UEFI firmware to connect to the Internet and grope for "updates", before any OS is booted from which malware could be tackled, as if vendor supply-chain attacks had not already happened.  Specifically, if attempts to boot Windows fail successively "too many times", the firmware will try to launch Dell's repair, and if that in turn fails, will try to download the repair material via the Internet.

This Dell laptop also suffers from this bug, based as it is on the same 10nm Intel processor family.  The problem applies to both the original Windows 10 1909 installation, and after this was upgraded to 20H2 as per current Media Creation Tool.  The .iso created by this tool no longer fits a standard DVDR disc, so a bootable USB stick was created instead, and the file set copied from there. 

The laptop's nice NVMe SSD can't be seen from my rescue WinPE boot discs, nor from a freshly-downloaded Kaspersky Rescue Disk; none of these can see the drive via the PCI interface.  Once again, the ladder is kicked away; crucial boot-time code should "always just work", i.e. standard trivial code baked into the firmware, not requiring "special drivers" to work.

As it is, the nature of the UEFI display bug demonstrates how a setting at the top of the ladder (Windows 10 Device Manager, enabling or disabling the Intel display adapter) kicks away the bottom of the ladder (OS-level setting affects pre-OS UEFI, such that pre-OS BCD boot menu fails to display).

Do we really have to wait for more shoes to drop?

26 October 2020

Firefox Memory Leaks

I use Firefox without extensions and plugins, and find it "leaks memory"; specifically, the memory footprint as seen via Win10 Ctl+Alt+Del Task Manager increases towards 2G over days of multi-tab use, whereupon the entire system slows down and becomes less responsive, while Firefox becomes as crabby as a sleepless toddler (prolly for much the same reasons).

For this reason, I use 32-bit Firefox, to limit its address range and thus impact on the rest of the system.  Going 64-bit would only "solve" the problem in the same way a bigger gas tank would "solve" a leak, i.e. delay the onset of inevitable problems.  32 bits can address a 4G range, halved to 2G for signed offset addressing as is likely for Firefox's internal memory (mis-)management.
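The arithmetic behind those figures, as a trivial sketch (the "signed offset" halving is my guess at the mechanism, as stated above, not something confirmed from Firefox's source):

```python
GIB = 2**30  # bytes per binary gigabyte


def addressable_gib(bits: int) -> float:
    """Size of the range reachable with the given number of address bits."""
    return 2**bits / GIB
```

addressable_gib(32) is 4.0, and losing one bit to the sign leaves addressable_gib(31), i.e. 2.0.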

So when I saw "CVE-2020-15254: Undefined behavior in bounded channel of crossbeam rust crate" (seriously, WTF is a "crossbeam rust crate"?), I went Aha!  And when I read "The impact on Firefox is undetermined", I went Aha! again, as in "geez Mozilla, don't you know Firefox leaks like a tennis net in a wind tunnel, haven't you even begun to wonder why?"

At the meta level, there's a familiar problem of per-instance vs. aggregate cost; since the days of DOS and Borland's program compilers, it's been "do we frequently ask the OS for small memory allocations, or do we seldom ask for large allocations and manage the details in-house?".  Do you pull cash from an ATM for each cash purchase, or do you draw once a week and manage your own wallet of cash?

Such details may be managed in-house by Firefox developers, or, more likely, "sub-contracted out" to some 3rd-party generic code library, prolly whatever came with the source code compiler or other development tools.  As a cross-platform program, this is less likely to be handed off to the platform-specific OS; in fact, platform independence is a strong reason for in-house memory management.

Which brings us to another meta-level problem; "black box" code re-usability.  The idea is that such blocks of code should hide their internal details and only expose a limited surface that is trivial enough to rely on (if all non-trivial code has bugs, keep all crucial code trivial!) but in practice they always leak, and such leaks may be exploitable - hence the CVE number.

26 September 2020

Invisible BCD Boot Menu; Intel Graphics Driver

Geek summary: First post-install Win10 update of Intel Graphics drivers for i5-1035G1 renders the BCD Boot Menu invisible, although it still works.  Fixed if Device Manager, Display Adapter is Disabled; problem reproduced if Enabled, effects taking place after Windows restart.

I suspect the cause is failure of the driver to attain color values when started in the raw EFI context, as using the Win10 Settings, Recovery, Advanced UI will show the boot menu in proper color.  That UI reaches a different boot menu, with the normal boot menu seen via Other Operating Systems UI, without restarting through raw EFI boot.  Either the first menu applies the needed color settings, or bypassing the raw EFI phase preserves the successful Win10 OS context.

Test system where problem encountered; brand new Asus laptop based on new 10nm 10xxGx series processor, specifically i5-1035G1.  Not encountered in a new desktop PC built on Gigabyte motherboard with Pentium Gold G6400 processor, also as set up last week.

Background

EFI boot from internal storage enters that storage via {bootmgr}, which displays a boot menu if there is more than one OSLoader entry in the "DisplayOrder".  By default there's only one entry to boot Windows 10, so this boot menu is normally bypassed, and the bug is thus unobserved.

As part of my standard setup, I add boot entries for Safe Mode and Safe Cmd, to float these less-destructive troubleshooting opportunities above the deceptively-named "Refresh Your PC" (a bit more than a F5 web page "refresh") and "Reset Your PC" (far beyond pressing the Reset button to force a bad-exit Restart) bear-traps that you'd have to walk past to eventually find the Safe Modes.  This causes {bootmgr} to display the BCD Boot Menu for the Timeout seconds, thus revealing the bug.

Failure pattern

This particular system displays a GUI "Asus" image during the EFI firmware phase of the boot process, which fades before the BCD Boot Menu appears.  As this logo fades, the color undergoes a subtle shift to a less-blue hue of white; possibly a switch to greyscale, rather than a Win10 "night light" setting (as changing that setting does not change this behavior).  When the failure pattern is not in effect, the Asus logo does not change hue as it fades.

Normally, you'd then see the Boot Menu, but instead, the screen stays black.  There's still display signal present, and if you blindly use the arrow keys before pressing Enter, the menu works; you'd load whichever menu item you'd blindly selected.  If you use the trackpad or a mouse to move the mouse pointer, it will appear as the expected white arrow, and blindly clicking will also succeed in selecting and launching a menu entry.

If you do nothing, the screen remains black for Timeout seconds and then boots normally.  The initial impression is that the system has "hung" or "crashed" (untrue, as safely tested by pressing Caps Lock to toggle the keyboard LED) or that the system is way slower to boot than expected, especially for an NVMe SSD.

Problem onset

I set up systems offline, to limit problems to one system rather than whatever is being pushed from the entire Internet.  During this phase, the BCD Boot Menu worked normally as expected, both before and after upgrading the "new laptop" Windows 10 version to a freshly-made version 2004.

Problem only appeared after attempting to disable Asus's aggressive underfootware, and initially I ascribed it to this and quickly reversed changes back to the default non-Microsoft Services, Startup entries, and Scheduled Tasks. However, this was also the first Restart after going online and letting Windows Update pull down and install updates, which included "driver updates", which in turn included OEM programs now pushed as "drivers" to evade user management via Settings, Apps or Control Panel, Programs and Features.

The fix

BIOS update, re-defaulting CMOS Setup settings, power off at the mains, holding down Power switch (part of keyboard) for 20+ seconds, BCDEdit nudge to {bootmgr} do not fix.  Device Manager, Display Adapter, Update Driver reports the latest (thus surely the "best") driver is already installed, and the Rollback Driver button is greyed out.

What fixes the problem, is Device Manager, Display Adapter, Disable and then a Shutdown UI, Restart to put this change into effect across the EFI boot phase.  Enabling the Display Adapter reproduces the failure pattern after the Restart; the problem remains present until Display Adapter is Disabled again.

Note; I also disable the Windows 10 "Fast Startup" setting via the convoluted Settings, Power UI required.  So at least we know we're not resuming a flawed system runtime after a fake "shutdown".

Likely cause

I suspect the Intel graphics driver depends on context established by Windows, which is absent (null pointer, anyone?) when the driver is run from raw EFI.  It either sets an incorrect graphics mode, or draws color values from zero'd memory such that "ink" and "paper" are both black.

Safety implications

Class 3 UEFI forces EFI boot, and thus all the flaky complexities of "Extensibility".  Whereas the ancient BIOS/MBR code was sufficiently trivial to be free of bugs, EFI is not, and adds the risk of malware positioning itself to run before any OS or storage device can boot.

The fact that a Windows driver can poison the pre-OS EFI boot process is worrying, especially as the choice of driver to load is either read by pre-OS EFI from Windows, or has been latched into pre-OS EFI behavior by a setting applied from within Windows.

Scenario 1

EFI executable .efi files are able to read the Windows registry, and do so, as the BCD is in fact a Windows registry hive in structure.  However, {bootmgr} is expected to be OS-agnostic, as at the time the Boot Menu is displayed, no decision has been taken as to what OS to boot - could be any version of installed Windows, a PreOS WinPE, a Linux, anything.  So the code that runs before the Boot Menu should not dip into Windows registry hives, e.g. to load drivers or pull variables such as the colors to use for the boot menu, etc.

In fact, safest would be for pre-OS {bootmgr} code to use the lowest default screen resolution, rather than loading any 3rd-party "drivers" for a "better visual experience".  This is a similar safety issue as code integration into "safe modes" (e.g. screen savers).

Scenario 2

When a device driver is selected in Windows, e.g. by disabling or enabling a Display Adapter, Windows may also be changing drivers within firmware EFI.  If so, then a different EFI driver will load, depending on that Windows setting, and a buggy EFI display driver could cause the problem directly, rather than via using null data.

All this is hard to assess, as modern systems blur hardware, firmware, "BIOS", drivers and OSs.  Everything is now likely to contain non-trivial and thus buggy code, and everything is treated as a black-box object that may "leak".  The interface programming model is supposed to blacken the boxes of the object-orientated model, hiding the gooey details more effectively; instead of the "calling code" examining exposed variables (object Properties), it now asks the object to return these variables (object Methods), trusting the object's code to do that - which is not a great safety/security idea.

04 September 2020

The Clutch Effect

I'll start from the familiar, then delve into the implications.

You have a stack of paper, with a sheet near the bottom peeking out.  You grab that, and pull gently and slowly; the whole stack moves towards you.  You grab it and pull hard and fast; just that sheet emerges, leaving the rest of the pile where it is.

So you scale that up to the "tablecloth trick".  Disaster!

You have an old car with a shot clutch.  If you accelerate slowly and gently, the clutch "works" and the car accelerates proportionally in line with your expectations, based on the engine's revs and selected gear.  But if you stomp the gas, the engine revs speed up nicely but the car doesn't move much faster.

You have Diabetes Mellitus ("peeing lots of sweet urine", as per the tyranny of the measurable... a topic for another day). If you digest carbs slowly, you may be OK; if you digest fast, less so.

You're piloting a fighter aircraft, pursued by a guided missile, and you try to turn and climb to evade it - but at 9G, you black out.  The human-free missile has no such limitations.

You're a 1kg block of whatever, moving through space a nudge above zero.  Your experience is vastly different to that of a similar block moving a nudge below c.  See "The universe has two speeds c, and zero; everything else is just a rounding error" for that and more.

You're a man carrying a bucket of water, and at that familiar scale, a liquid generally sinks to fill a container, with a flat surface on top that usually curls upwards a bit round the rim, unless it's mercury, which curves downwards instead.  Contrast that with an ant carrying a bead of water; though the scale isn't that much smaller than our familiar, the experience is already very different, and the opposite of how liquid helium climbs out of its container.

You're a steady DC current, flowing effortlessly through a coil but stopping dead at a capacitor.  You're perfect (i.e. highest frequency) AC, skipping effortlessly across a capacitor but stopping dead at a coil.  So far, so good... now you're a bolt of lightning striking a phone line, frying almost everything in an old 286 PC, yet the monitor remains unscathed - because the 90 degree bend of the thick copper wires at the graphic card's signal port melted before the current could reach it.  What's going on here, if lightning is DC?  Yes, but that very fast rise time behaves more like perfect AC... so to protect against lightning, tie knots in your cables - big ju-ju, works good!
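For the coil-and-capacitor part, the textbook ideal-component reactance formulas make the same point; a small sketch, with component values in the note below picked purely for illustration:

```python
import math


def inductive_reactance(freq_hz: float, inductance_h: float) -> float:
    """Ideal coil: X_L = 2*pi*f*L, zero at DC and growing with frequency."""
    return 2 * math.pi * freq_hz * inductance_h


def capacitive_reactance(freq_hz: float, capacitance_f: float) -> float:
    """Ideal capacitor: X_C = 1/(2*pi*f*C), infinite at DC, shrinking with f."""
    if freq_hz == 0:
        return math.inf
    return 1 / (2 * math.pi * freq_hz * capacitance_f)
```

At 0 Hz a 1 mH coil presents no opposition while a 1 uF capacitor presents infinite opposition; at 1 GHz the roles swap, with the coil in the megaohms and the capacitor a tiny fraction of an ohm.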

What's common to all these scenarios that I've clumped together as "the clutch effect"?  It ties in with layers of abstraction, within which certain models work (e.g. Newtonian motion and speeds near 0) but beyond which, different models may be needed (e.g. Relativity at speeds near c).

28 August 2020

PDF: Safe, Print; Pick One

Geek summary: If you print a .pdf and "enable all features", you're enabling all risks!

Adobe's PDF is a significant edge-facing risk, and unlike their wretched Flash, no signs of it going away soon.

The email risk

"Opening" a .pdf in a web browser is risky enough, but a bigger risk are all those automated .pdf emaul attackments; "invoices" spawned by generic accounting packages, "forms to print, sign, scan and return", etc. Even if there's a pretense at certificate-based "security", the sender still expects the recipient to trust a handful of easily-forged pixels, boilerplate text, and a From: address as being "from someone you know".

Evaluating risk of incoming email links and attachments needs proof of trusted human intent to send, not just which human's system appears to have sent it, as malware is likely to spread via harvested addresses that can be used to populate both the From:, and the To:, CC: and BCC: sides of the fence. When both From: and To: addresses are harvested on the same infected system, the malware will most likely be "From: someone you know"!

Digital signatures proving it came from that user's system don't address that problem; you need to read the message "text" to see if the human sender intended to send the message and the attached file(s).  A smart sender will write such text, but the lazy, clueless or disinterested will not; when the accounting program or whatever pops up a boilerplate message (typically "you need Adobe Reader to open this file" with a link that one hopes doesn't point to a malware server), they'll just click Send without changing (personalizing) this text at all.

So we have all these near-identical boilerplate messages bouncing around, telling users to "open" files and/or click links to install software, with attached .pdf - and those files are exploitable enough, without the need to trigger heuristics by faking the file type to "open" raw code.

Risky "data files"

Risks from "data" files stem from the object model, that treats everything as an object, and all objects can have Properties (internal variables) and Methods (internal code) that other objects can use.  The human user is just another "object" that happens to be at the end of peripheral UI input devices.

This is the code equivalent of dumbing down user concepts of "read", "edit" and "run" to just "open"; the same safety-oblivious mindset underlies both, and the interface programming model makes this worse by swinging from reading exposed Properties to running Methods to return these.  That implies the object is now trusted to run code, to get anything out of it.

Specifically, .pdf is designed to run JavaScript hidden from the user, and even if this script is prevented from "doing anything nasty", it can be leveraged to set up the stack to climb the exploit ladder to running raw code.  If you look in the Acrobat (Reader) Edit, Preferences section, you'll see other exploit opportunities such as multimedia, "opening" other file types, etc.

These risks have been obviously non-theoretical since the Concept "prank macro" and subsequent destructive (including hardware-killing CIH payload) Word and other MS Office macro malware, yet that didn't stop Adobe baking the same risk into the .pdf standard years later.  

Today's reading suggests Microsoft quickly saw the risks, but at the time, I remember the first response had two purposes; reassure users this was "not a virus" but just a "prank macro", and provide a removal method specific to that particular vir.. uh, "prank macro", as if there'd never be another one. It took years to (more or less) fix that "works as designed, won't fix" easy exploitability, and that happy period of easy scripting arguably established the commercial viability of malware.

Protected Mode

Fortunately, we have Adobe's Protected Mode to keep us safe - at least until we print.

I've always been creeped out when "reading" a .pdf and having to print it out, when that asks me to "enable all features".  What "features"? Allowing the "data file" to drop and run code?  Apparently yes, this is exactly what happens if this actually exits Protected Mode (not that Adobe tells you this in that happy "enable all features" dialog box).

It's also good to ensure you use Reader and not the full Acrobat as your default .pdf "open" file association, as full Acrobat duhfaults to disabling Protected Mode!

I hope you've been clicking this article's implicit links along the way, as that's where I "show my workings" as to how I understand this situation.  

Now consider how many of those links "opened" a .pdf in your web browser...

18 August 2020

Wisdom Of The Ages

Starting a new project?  

Ask a 10-year-old if it's ethical

Ask a 20-year-old how to do it

Ask a 30-year-old how to do it better

Ask a 40-year-old who can do it for you

Ask a 50-year-old how to get away with it

Ask a 60-year-old why it's not worth the bother


15 August 2020

Weakly Horrorscope

As competitive animals, we are naturally fearful.  Secular fundamentalists such as I should also be modest, if true to their faith. Accordingly, I weakly offer my horroscope...

Sunday

A time to consider the stars* and other large matter(s).

Moonday

Scaling down to rocks large enough to pull themselves round; most detectable when orbiting stars, planets (e.g. moons of Jupiter) or each other (e.g. Luna, Charon). 

Twosday

Duality; a deeply ingrained falsehood?

Windsday

Your climate matters, at least to you; handle with care.

Thorsday

Ponder human hostility, from GBV to WW3.

Faraday

Cage yourself from technology and self-reference for a moment!

Scatterday

Alter your state; zoom out for insight instead of drill-down taxonomy.

* As in "Twinkle twinkle little**..." rather than "dancing with the...".  Snap to grid, humanists/populists!

** Why "little"?

After Scatterday, you'd need a day in the sun to recover  ;-)

28 April 2020

Trump, Coronavirus, Disinfectants


OK, let's put this one to bed quickly... the challenge with viruses is not destroying them; that is easy.  It's how to target them within an infected host, without damaging the host.

I can understand the frustration that causes one to wish it were possible to call in an air-strike, but that only works if the enemy and your friends are not mixed.  So, general biocidal strategies are great outside the body, but useless within... unless they can be focused on the target.  That's ID-specific policing and sniping, not a carpet-bombing airstrike!

Internal intelligence


Humans have two intelligent systems, only one of which we experience as our consciousness; the one that stems from the animal strategy of physically moving around.

The other intelligence defends the self internally, and has deeper roots than multicellular animals.  This is the immune system, the highest level of which crafts particular-shaped proteins to bind specifically to stuff that isn't white-listed as part of the body's own organic chemicals.  It is this that is expected to give post-infection immunity, at least until the virus mutates beyond recognition (as RNA viruses like Influenza and Coronavirus tend to do), and this is the basis for vaccination as a pre-infection defense.

Why not antimicrobial drugs?

 

Larger infective agents such as bacteria are easy to attack using simpler chemistry, because their core biological processes involve proteins sufficiently different to our own, so they can be specifically targeted without harming the host.

Viruses are different, because they are pure genetic information that use the host cell processes to reproduce.  So, all those biological processes you can uniquely attack in bacteria, are your own processes when it comes to how a virus "lives".

Viruses coat their genetic material in protein(s) coded within that same genetic material.  This coating may both hide the genetic material from the host immune system, and bind the virus to the targeted cells of the host.  Aside from our internal immune systems figuring out how to target this protein and/or genetic material within, cruder chemistry could find ways to disrupt the process whereby the virus binds to host cells, and target that as a means of treatment.

I'm starting to write more about biology here, starting with how it works, and how the biosphere compares with the infosphere, etc.

09 April 2020

Win10 Temp .evtx Flood Revisited


Executive summary of this bug: To curb rapid free space loss to %Windir%\Temp\*.evtx , do this:
  • Regedit, HKLM\System\CCS\Services\AppXSvc, change Start from 3 to 4 (Disable)
  • "Lifeboat" batch file to Del %WinDir%\Temp\*.evtx every 15 secs, loop forever
  • "Run As Admin" desktop shortcut to batch file, for rapid emergency access
This is a workaround, not a cure; the cure must come from Microsoft as a meta-level bugfix of the Microsoft Store and App subsystem present in Windows 8.x and 10.  It's not enough to "step on ants" on a case-by-case basis via Feedback, Help or Support - and it's not a matter of fixing particular Apps that trigger the problem, as the bug lies not in what causes this error handling response, but in the error handling logic itself; endless, rapid, and uncontrolled retries and logging.

Why has this bug persisted for years?

Current Troubleshooters miss the issue entirely, making it harder to visualize the problem.  Storage Sense may launch appropriately, and show a massive Temporary Files footprint, but the bulk of this is not shown within the sub-categories that are offered to be cleared.

Space management utilities such as TreeSize or Windows Directory Statistics can't normally "see" into %WinDir%\Temp due to permissions issues that require "Run As Admin", making it harder for even tech-literate users to track down the problem.

Because the bug is in code that is infrequently invoked, there hasn't been a massive single outbreak of cases to attract the vendor attention we need.  As a "blind spot" to both vendor (not handled by Storage Sense) and user ("As Admin" blocks on inspecting %WinDir%\Temp), it's both under- and inadequately-reported; most threads run for pages before the .evtx files are seen, so that an accurate description of the bug is slower to surface.  As the usual end-point is "I dunno, just try re-installing Windows, maybe that will fix it", no "clean cure" emerges, so new victims may just give up.

However, when you do find threads on this problem, you see hundreds of "I have the same question" victims, so it's not so rare that we can forget about it.  As an unexpected bug in an exposed surface, it may be an exploitable vulnerability as well.

Metro, Modern, UWP...

When a vendor keeps shuffling the branding of a feature or product, it suggests attempts to re-launch it after initially being rejected in the marketplace.  We've seen that with MSN, and we're seeing that with "Metro", "Modern", "Universal Windows Platform" and evolving drifts from there.

UWP is a new subset platform added to Windows 8 to bridge the UI and platform divide between PCs and sub-PC mobile devices, so that programs could run both on large screens with keyboard and mouse, and tiny screens prodded by fat fingers.

The original form as added to Windows 8 was a grotesque throwback to Windows before 3.x, i.e. before there really was "windows".  Apps ran full-screen, with no visible UI to close them, and screen space was wasted on massive UI elements to work on tiny touch screens.  The first Apps didn't do anything better than properly-behaved Windows "desktop" programs, so there was nothing to attract PC users, and everything "called home" all the time, pushing you into losing anonymity and accepting the increased risks of being permanently logged into a Microsoft online Account.  Just why does a Calculator App need to access the Internet, anyway?

By Windows 10, Apps can at least be windowed, finally catching up with the Windows 3.yuk UI feature set, but UWP still feels like an unwanted blob stuck on what we'd rather use instead.

The UWP installer/updater subsystem 

The nature of UWP seems to be to run underfoot, similar to the way it's not UI-obvious on a smartphone as to what apps are still running in the background.

In particular, UWP appears to have a separate installer and updater subsystem, outside Windows Update and related user controls.  Compare installation and update activity as shown in Reliability, with what you see in Windows Update History, to see what I mean.

So in effect, Windows 10 has the Windows "desktop" .exe and .msi installation system, the UWP App system, and added to that by MS Office, "Click To Run".  The last two appear to be not only the least documented for our troubleshooting purposes, but the most invasive and buggiest as well.

It's "coding 101" to never fall into an endless loop, exhaust resources such as storage space, or lack situational awareness such as how often you are doing something, how long it takes to do, and whether it has worked.  The ".evtx flood bug" is such an embarrassing failure at so many of these points, undermining confidence in the UWP App system for developers, techs, and users.
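
As a rough illustration of those "coding 101" points, here is a minimal Python sketch of a retry loop that counts its attempts, waits longer after each failure, and eventually gives up.  The names (run_with_retry_limit, max_attempts, base_delay, always_fails) are my own inventions for this example; it makes no claim about how AppXSvc or any Microsoft code is actually written.

```python
import time


def run_with_retry_limit(task, max_attempts=3, base_delay=0.01, sleep=time.sleep):
    """Call task() until it succeeds or max_attempts is reached.

    Returns (succeeded, attempts_made).  The delay doubles after each
    failure, so a persistent fault cannot turn into a tight loop.
    """
    attempts = 0
    delay = base_delay
    while attempts < max_attempts:
        attempts += 1
        try:
            task()
            return True, attempts
        except Exception:
            # Hypothetical policy: wait longer before the next try,
            # but do not bother waiting after the final attempt.
            if attempts < max_attempts:
                sleep(delay)
                delay *= 2
    return False, attempts


def always_fails():
    # Stand-in for an operation that can never succeed on this system.
    raise RuntimeError("simulated deployment failure")


if __name__ == "__main__":
    print(run_with_retry_limit(always_fails))
```

The point of the sketch is only that the loop has an exit; a task that can never succeed costs a fixed, small number of attempts instead of filling the drive.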


03 April 2020

Win10 Bug: .evtx Files Rapidly Fill C: Free Space


This very nasty Windows 10 bug has been around for over two years at least; crippling, often associated with "Feature Updates", i.e. new versions of Windows 10, and still there from at least as far back as 1803, to current 1909.

Note that each of those hyperlinked words in the previous paragraph, is a link to a forum thread on this issue, so while it may not be common enough "all at once" to attract attention and get fixed, it's always around, and always eating systems - no magic bullets, typical advice is a shrug and "just re-install Windows" or equally high-impact brute-force "fixes".

I suspect it's a generic category of bug within the Microsoft Store and UWP Apps subsystem, regardless of which of these Apps is the "cause" of the problem on any particular system. Never use Apps, Microsoft Store, or UWP stuff?  Too bad, that "updater" or "installer" will still flood your drive with pointless error messages and make it impossible to use your PC.

What you will see

The bug presents as an inexplicable runaway filling of free space on the C: drive, no matter how much free space you had there before.  Disk Cleanup doesn't show the bulk that needs to be cleared; Settings, Storage Sense may pop up and show the material as in "Temporary Files", yet not in any of the checkbox sub-categories offered to be cleared.

Users will then turn to Windows Directory Statistics (WDS) and/or TreeSize or similar, and may get side-tracked into arguing which is better, etc.  I use WDS, and it will show a massive "Unknown" accounting for the lost storage space.  If I right-click WDS and "Run As Admin", I will then see this bulk as thousands of small files (between 68k and 20M) within %WinDir%\Temp

Most of these will be .evtx files, as "opened" by Event Logger; the rest will be .txt files, and these will be date-stamped as being spawned several times a minute, if not every few seconds, until the free space is exhausted.  Deleting these doesn't help; they will immediately start flooding again.
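
If you'd rather confirm the pattern without a space-mapping tool, something like the following Python sketch will count files and bytes per extension in one folder.  It assumes Python is installed and, for %WinDir%\Temp, that it is run from an "As Admin" prompt (otherwise expect a permissions error); summarize_by_extension is just a name I made up for this example.

```python
import os
from collections import defaultdict


def summarize_by_extension(folder):
    """Return {extension: (file_count, total_bytes)} for files directly in folder."""
    totals = defaultdict(lambda: [0, 0])
    with os.scandir(folder) as entries:
        for entry in entries:
            try:
                if not entry.is_file(follow_symlinks=False):
                    continue
                size = entry.stat(follow_symlinks=False).st_size
            except OSError:
                # The file vanished or is locked mid-scan; skip it.
                continue
            ext = os.path.splitext(entry.name)[1].lower()
            totals[ext][0] += 1
            totals[ext][1] += size
    return {ext: (count, size) for ext, (count, size) in totals.items()}


if __name__ == "__main__":
    # Uses %WinDir%\Temp when WINDIR is set, else the current folder.
    windir = os.environ.get("WINDIR")
    target = os.path.join(windir, "Temp") if windir else "."
    summary = summarize_by_extension(target)
    # Largest consumers first.
    for ext, (count, size) in sorted(summary.items(), key=lambda kv: -kv[1][1]):
        print(f"{ext or '(none)':10} {count:8d} files {size / 2**20:10.1f} MiB")
```

On an affected system I would expect .evtx to top that list by a wide margin, with .txt a distant second.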

What appears to be the cause


AppXSvc is a Windows service that "deploys UWP Apps"; I found little documentation of the service, but did find this Fortinet zero-day alert, FWIW.  Looking at...

Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\AppXSvc

...via Regedit, we see the following settings:

Start = 3, i.e. Manual
Type = 32 (0x20), i.e. can share address space with other services of the same type
ErrorControl = 1, i.e. warn but do not abort starting Windows

So, something else starts it all the time, as it's always running yet not set to start automatically.
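
For anyone who wants to decode such values without looking them up each time, here is a small Python sketch that maps the numbers to their usual meanings.  The tables cover only the common values, and describe_service is a helper name of my own; the sketch takes the numbers as input rather than reading the registry, so it runs anywhere.

```python
# Common values found under HKLM\SYSTEM\CurrentControlSet\Services\<name>
START_TYPES = {0: "Boot", 1: "System", 2: "Automatic", 3: "Manual", 4: "Disabled"}
ERROR_CONTROL = {0: "Ignore", 1: "Normal", 2: "Severe", 3: "Critical"}
SERVICE_TYPES = {
    0x01: "Kernel driver",
    0x02: "File system driver",
    0x10: "Win32 service in its own process",
    0x20: "Win32 service sharing a process",
}


def describe_service(values):
    """Turn {'Start': 3, ...} into {'Start': '3 (Manual)', ...}."""
    tables = {"Start": START_TYPES, "Type": SERVICE_TYPES, "ErrorControl": ERROR_CONTROL}
    described = {}
    for name, table in tables.items():
        if name in values:
            number = values[name]
            # Values outside the tables above are reported as unknown.
            described[name] = f"{number} ({table.get(number, 'unknown')})"
    return described


if __name__ == "__main__":
    # The AppXSvc settings quoted above.
    for name, text in describe_service({"Start": 3, "Type": 32, "ErrorControl": 1}).items():
        print(f"{name} = {text}")
```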

Recovery actions, as seen via the Services UI (where everything is "greyed out"), appear reasonable...

1st failure: Restart the service
2nd failure: Restart the service
3rd and subsequent failures: Take no action

...so if this service repeatedly falls on its ass, it should stop running, limiting the impact to at most 2 sets of .evtx and related error logging files in %WinDir%\Temp.  So what's going wrong, here?

I suspect whatever is starting this wretched (and for most of us who only use "real" Windows programs, totally useless) service isn't paying any attention to those Recovery actions, and is just endlessly banging away, restarting the service "Manually".  If each time the service considers itself to have been launched for the first time, it will always "Restart the service".
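
To make that suspicion concrete, here is a toy Python model of it; this is my guess at the failure pattern, not a description of how Windows actually counts failures.  A service that always crashes is started repeatedly by some outside component, and the only difference between the two cases is whether the failure count survives each outside start.  All names here (count_launches, external_relaunches, counter_persists) are hypothetical.

```python
# Recovery policy as shown in the Services UI:
# 1st and 2nd failure restart the service, later failures take no action.
RECOVERY_ACTIONS = ["restart", "restart"]


def count_launches(external_relaunches, counter_persists):
    """Count launches of a service that crashes every time it runs.

    external_relaunches: how many times some other component starts the
    service "manually" (a stand-in for whatever keeps starting it).
    counter_persists: whether the failure count survives those starts.
    """
    launches = 0
    failures = 0
    for _ in range(external_relaunches):
        if not counter_persists:
            failures = 0      # every manual start looks like a first launch
        launches += 1         # the manual start itself, which then crashes
        failures += 1
        while failures <= len(RECOVERY_ACTIONS):
            launches += 1     # recovery action: restart, which crashes again
            failures += 1
    return launches


if __name__ == "__main__":
    print("count remembered:", count_launches(10, counter_persists=True))
    print("count forgotten: ", count_launches(10, counter_persists=False))
```

With ten outside starts, the model gives 12 launches when the count is remembered and 30 when it is forgotten.  Either way the total grows with every outside start, so in this model the Recovery settings alone cannot cap the damage if the caller never gives up.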

Also, in one of the two cases I've seen first-hand (manual Media Creation Tool upgrade to 1909 from inside Windows), I noticed odd behavior in the Services UI.  Specifically, the service Properties (as seen via Services UI) offered to Start the service, even though Ctrl+Alt+Del Task Manager showed it to be still running.  If the service and/or managing code gets confused about whether it's running or not, that too may screw up the "FFS stop trying to start the &^$& thing, it's already failed 3 times" logic.

The other system I saw and managed via TeamViewer, was after an auto-upgrade to 1903.  On that system, setting the AppXSvc Start value to 4 (Disable) will hopefully kill the service, plus I wrote a brute-force batch file set to "Run As Admin" as follows:
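@Echo Off
Rem Emergency workaround: keep deleting the flood files; run this "As Admin"
Set Secs=15
Set Mask=*.evtx
Echo.
Echo Deleting %WinDir%\Temp\%Mask% files every %Secs% seconds...
:LoopForever
    Echo.
    Rem Quotes guard against spaces in the expanded path
    Del "%WinDir%\Temp\%Mask%"
    Echo.
    Rem Wait the interval set in Secs above; Ctrl+C ends the loop
    Timeout /T %Secs%
GoTo LoopForever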
None of this is a proper fix, especially on an SSD where you don't want tens of thousands of pointless file writes every hour or few.  At the basic level, Microsoft needs to muzzle the UWP App subsystem so it doesn't stomp all over the system willy-nilly, and ensure that every logging process has a basic LIFO clue so as not to consume all available storage space.  A specific fix would be nice, too, but we also need a more respectful vendor-to-user relationship.
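
For what a size-capped log can look like in practice, here is a small Python sketch using the standard library's RotatingFileHandler, which starts a new file before a size limit is reached and keeps only a fixed number of older files.  The helper names, file names and limits are arbitrary choices for the demo, and this is of course not how Windows event logging is implemented.

```python
import logging
import logging.handlers
import os
import tempfile


def make_capped_logger(path, max_bytes=4096, backups=2, name="capped_demo"):
    """Return (logger, handler) writing to path, capped by size and file count."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    logger.addHandler(handler)
    return logger, handler


def total_log_bytes(folder):
    """Sum the sizes of all files in folder."""
    return sum(os.path.getsize(os.path.join(folder, n)) for n in os.listdir(folder))


if __name__ == "__main__":
    folder = tempfile.mkdtemp()
    logger, handler = make_capped_logger(os.path.join(folder, "demo.log"))
    # Simulate a runaway error loop; the oldest entries are discarded.
    for i in range(10_000):
        logger.info("simulated failure number %d", i)
    handler.close()
    logger.removeHandler(handler)
    print(total_log_bytes(folder), "bytes kept on disk in", folder)
```

However many times the loop runs, the footprint here stays under (backups + 1) x max_bytes, which is the sort of ceiling the .evtx flood so obviously lacks.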

PS: What is it with HTML text editing (e.g. in Blogger) that messes up blank line spacing around subheadings, etc.?




Pharma As A Service


Covid-19 is the first of what I expect will be many "bio-pollution" crises du jour going forward.  Experience with Ebola and similar previous outbreaks has triggered an unprecedented global response, and brings to light some challenges familiar from software development, e.g. time-to-market, rapid scalability, etc.

The software industry has already changed with this in mind, though driven by vendors' self-interest.  How would Big Pharma look, if adopting the same strategies?  Think about how EUL"A"s trump the common-law rights of users, how products are forever in "beta" i.e. unfit to be guaranteed safe for release, how systems are left open for vendor-pushed updates, etc.

Can we wait for formal FDA etc. approval of tests, vaccines, etc.?

What will happen if we don't?

02 April 2020

I Live... Again...

Yep, back after a long gap, using extra time available during the Covid-19 LockDown.  Unlike WordPress, looks like Blogger doesn't have a new post editor.

Does anyone remember ye olde DOS game "Blood"?  I guess we're all pre-trained for Covid-19 by all those "zombie apocalypse" fillums  ;-)