Legacy Documents – Friend or Foe

As technology keeps reinventing what we do and how we do it, the old often gets distorted or forgotten.  History gets rewritten, languages and word meanings change, and over time we lose the ability to work with items from the past, even the recent past.

It has been said that we can’t rebuild the Saturn V moon rocket because the magnetic tapes on which the data and plans were recorded are unreadable by current hardware.  Historically speaking, archeologists have had a very difficult time understanding some written documents, in cases as recent as 500-600 years ago, because no one living can read them and there are no references that translate them.  The famous Rosetta Stone (not the language education company), carved in 196 BC, was an amazing find when it was unearthed in 1799 because it was a stone slab that carried the same passage in three scripts: Egyptian hieroglyphs, Demotic and ancient Greek.

It doesn’t take 2000 years or 500 years to lose the ability to read a document.  When it comes to working with current documents or images, there is a long and varied list of formats from the last 30 years that are increasingly difficult to access today.  Consider all the old word processing file formats, picture and drawing formats, company proprietary formats that were variants of TIFF, more elaborate company formats like IBM’s AFP and ABIC, and IBM/FileNet’s method of storing documents as single pages.  Worse yet, think of the documents that were created incorrectly by junior programmers who misread the specs.

In many cases, you can hope that old documents don’t matter and are better relegated to the trash. But with the uncertainty caused by Sarbanes-Oxley and other government mandates regarding the preservation of documents and emails, how do you decide which files should be discarded and which should be retained?  It is often safer and cheaper to preserve these old documents.  The most likely approach today for handling older documents is to convert them to current formats like PDF or TIFF, where standard readers can work with them.  But an insurance company or a financial institution with millions of records in an obsolete format must ask whether there is a reasonable ROI in converting such old documents.

Alternatively, companies in many industries choose either to preserve the old viewing technology (a risky choice, since newer operating systems may not run older viewers; see how well Vista runs Windows 95 or even XP applications) or to find current viewing technology that reads their old documents. The choice comes down to converting the documents or using current viewing technology to read them in their obsolete formats.

Until recently, many major Enterprise Content Management (ECM) companies simply told their customers to use their existing viewers to open their documents.  You could view PDF documents with Adobe Reader and Word documents with MS Office.  Customers began asking their vendors, “If I’m spending so much money porting to this new, high-performance system, why don’t you give me viewing technology as part of the product?” Fortunately, several major content repository players have listened to their customers and provided them with more integrated, though still quite limited, solutions.

For example, IBM/FileNet offers a Java applet with viewing capabilities for TIFF documents.  When it comes to AFP documents, a customer that has an IBM Content Manager system with embedded AFP capability can use a FileNet-to-CM connector that allows viewing of those stored documents.  EMC/Documentum provides a basic TIFF viewer.  If you need more, they will sell you an ActiveX viewer from one of their partners that can handle TIFF, DWG or PDF.

These options are not full solutions for legacy documents, but they will get you started.  TIFF and PDF are not exactly the most common legacy formats.  If you need AFP support, proprietary TIFF support, Word support, or more esoteric document formats, many of the vendors don’t have a universal solution.  So your users have to open one application for one set of files and something else for another.  That leaves a lot of opportunity for confusion.
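
Before choosing between viewers and converters, it can help to know how mixed a repository really is. The following is only a minimal sketch in Python, not any vendor's method: it assumes the files sit on an ordinary file system and recognizes just three well-known leading signatures (PDF, TIFF, and the OLE2 container used by older Word documents). The names SIGNATURES, sniff_format and inventory are hypothetical, chosen for illustration.

```python
from collections import Counter
from pathlib import Path

# Leading "magic" bytes for a few formats mentioned in this article.
# This list is an illustrative assumption, not a complete catalog.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"II*\x00", "TIFF"),   # little-endian TIFF header
    (b"MM\x00*", "TIFF"),   # big-endian TIFF header
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2"),  # e.g. legacy Word .doc
]


def sniff_format(header: bytes) -> str:
    """Return a format label based on the first bytes of a file."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def inventory(root: str) -> Counter:
    """Count files under root by sniffed format (hypothetical helper)."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            with path.open("rb") as fh:
                counts[sniff_format(fh.read(8))] += 1
    return counts
```

A large "unknown" count from something like inventory("/path/to/archive") suggests formats that standard viewers will not open. Note that a proprietary TIFF variant may still report as TIFF here, since the sketch reads only the header and not the tags or compression scheme inside the file.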

If you want to view AFP documents in an EMC system or IBM/FileNet system, the vendor may point you to a universal conversion technology or viewer, something like Oracle/Stellent or Snowbound’s VirtualViewer.  These two choices also work if you need Macintosh or Unix viewers. However, Oracle now competes with IBM, EMC and other ECM vendors, so those vendors may be hesitant to tell you about all the alternatives.

Therefore it is important to educate yourself on the market. There are lots of ways to spend your company’s resources, and you want to be sure that your investment brings you the best return.  Handling legacy documents is a critical and demanding project, so research carefully.