Heh, this is where a non-coder (well, OK; ex-coder) sounds off on programming. File under "teaching grandma to suck eggs" if you like, unless you find yourself asking "I make what I think is pretty good software, yet my clients shout at me and go off and use something else, and I can't figure out why".
Features may make folks warm to a product, but there are two things that make them hate a product and never want to use it again:
1) It corrupts or loses data
2) It acts beyond the user's intention
Sure, "difficult to use" issues may cause folks to bounce off a product, but nothing else engenders pure hatred as the above two crises will do. Today's post goes about (1).
User data is the only component that cannot be replaced by throwing money at the problem. It's unique to the user, and should be treated with the utmost respect. User preferences and settings should be included within this umbrella of care.
If your application creates and manages user data, then you have to ensure that the full infrastructure exists to manage that data, including backup, transfer between systems, integration into other data sets, and recovery or repair.
The easiest way to do that is to use a generic data structure that already enjoys such infrastructure, to ensure the location of the data can be controlled by the user, and that the data set is free of infectable code, version-specific program material, or other bloat. This way, the user can locate the data within their existing data set and it will be backed up with that.
If you have to create a proprietary binary file structure, then it's best (from a recovery perspective) to document this structure so that raw binary repair or salvage is possible. When the app handles the data, it should sanity-check the structure and fail gracefully if need be, with useful error messages. It's particularly important not to allow malformed data to overrun buffers or get other opportunities to act as code.
Large, slowly-growing files pose the largest risk for fragmentation, long critical periods during updates, and corruption. Don't stick everything in one huge file, such as a .PST, so that if this file blinks, all data is lost. It's also helpful to avoid file and folder names with the first 6 characters in common (as these generate ambiguous 8.3 names) and deeply-nested folders. Ask yourself what you'd like to see if you had to do raw disk data recovery, and be guided by that.
When it comes to data portability and integration, this works best if you avoid version dependencies and "special" names. For example, an email app that has "special" structure for InBox and OutBox is going to be a problem if one wants to drop these from another system, so they can be integrated into an existing data set. It should be possible to rename these so they don't overwrite what is there already, and have the application see them as ordinary extra mailboxes.
From a survivability perspective, it should be possible to manage the data from the OS, i.e. simply drop the files into place and the application will see and use them. If you fuss with closed-box indexing or data integration, then you're forced to provide your own special import and export tools, and things become very brittle if there is only one "special" code set that can manage the data.
Don't forget whose data it is - it's the users, not yours. Warn the user about consequences, but it is not your data to "own" in the sense that nothing else can touch it, and that anything outside the app that nudges the data should cause the app to be unable to use it.
What makes for good survivability, may be bad for data security. If there's a need to keep data private, you may have to impact on data survivability to the point that you have to assume full responsibility for data management - something not to be taken lightly.
This is a different issue from security, and goes about limiting the behavior of data to the level of risk that is anticipated by the user. But that's another day's bloggery :-)