You Can't Find a Document if You Don't Have It: How to Migrate Document Content into a Cloud Application








As part of implementation planning for a cloud-based content management system, plan the migration of existing documents and content. Cloud content management systems (CCMS) differ in how they migrate existing content, ranging from uploading one file at a time to moving many files at once. If there is a need to move 1,000 or more files, for example, consult the CCMS vendor: many CCMS vendors do not have the tools to move very large numbers of files, very large files (such as files greater than five gigabytes), or a mass migration of several hundred gigabytes.


Basic migration strategies are:


1. Allow users to move documents on an as-needed basis. 

Users move documents from their current repository to the new repository when the documents are needed. There is no need to pre-populate sites with documents because the user is in charge.


2. Work with the users to establish a range of documents to be moved to the new repository. 

This is typically a date range, as in all documents dated from xx to xx. Users may move the older documents left behind on an as-needed basis. IT may either assist with these moves or offer hands-on guidance for the first several.


3. Migrate complete repositories (file shares or other document management system repositories) prior to the new system going live. 

IT would typically work with the CCMS vendor to do this. The transfer may be done via FTP, or even by copying complete directories to a hard drive that is sent to the vendor.
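
The date-range approach in strategy 2 can be sketched as a short script. This is a minimal sketch, assuming the current repository is an ordinary file share reachable as a local path and that "dated" means the file's last-modified time. The function name `select_by_date_range` and the example path are hypothetical and not part of any CCMS tool.

```python
from datetime import datetime, timezone
from pathlib import Path


def select_by_date_range(root, start, end):
    """Return files under root whose last-modified time falls in [start, end].

    Assumption: the last-modified time stands in for the document date.
    Both start and end must be timezone-aware datetimes.
    """
    selected = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue  # skip directories and other non-file entries
        modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        if start <= modified <= end:
            selected.append(path)
    return sorted(selected)


# Example call (the share path is a placeholder):
# files = select_by_date_range(
#     "/mnt/fileshare/projects",
#     datetime(2012, 1, 1, tzinfo=timezone.utc),
#     datetime(2014, 3, 20, tzinfo=timezone.utc),
# )
```

The resulting list can be reviewed with the users before anything is moved, and the files outside the range stay behind for as-needed moves.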



Prior to moving any files, a test move should be completed to ensure that (1) all the files targeted are actually moved and (2) the existing file metadata is moved without being changed. For example, if the author is Mike Smith and the creation date is January 1, 2012, the metadata should not change when the file is moved. However, some CCMS will change the author to whoever owns the directory in the CCMS that the document is being transferred to and may change the creation date to the date of the move. Mike Smith may become Mary Jones, and the creation date may change from January 1, 2012 to March 20, 2014. Even if this happens, the metadata stored inside the file itself does not change. If you open the Word document and look at its properties, it will still show the original author and creation date, so you may not be in trouble from a legal point of view. It is still worth checking how the CCMS handles this.
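
A test move like the one described above can be checked with a script rather than by eye. The following is a minimal sketch, assuming the migrated copies can be reached as a local folder (for example a synced or exported copy of the CCMS directory). It compares content hashes and last-modified times only. It does not read the author or creation date stored inside a Word document, and it does not query the metadata the CCMS itself records, which would require the vendor's own tools. The function names and the two-second tolerance are illustrative choices.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large documents do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_test_move(source_root, dest_root, tolerance_seconds=2.0):
    """Compare each source file with its migrated copy; return a list of problems.

    Each problem is a (relative path, description) tuple. An empty list means
    every file arrived with identical content and an unchanged modified date.
    """
    source_root, dest_root = Path(source_root), Path(dest_root)
    problems = []
    for src in sorted(p for p in source_root.rglob("*") if p.is_file()):
        relative = src.relative_to(source_root)
        dst = dest_root / relative
        if not dst.is_file():
            problems.append((str(relative), "missing"))
            continue
        if sha256_of(src) != sha256_of(dst):
            problems.append((str(relative), "content differs"))
        # Assumption: a small tolerance absorbs filesystem timestamp rounding.
        if abs(src.stat().st_mtime - dst.stat().st_mtime) > tolerance_seconds:
            problems.append((str(relative), "modified date changed"))
    return problems
```

Running this against a small test batch before the full migration shows whether files are dropped or re-dated by the transfer method being used.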



In addition to correctly migrating the documents, there should be some means of reporting to confirm that the documents selected were the documents migrated and that any errors were recorded. For example, if you selected 200 files from a file share and did a drag-and-drop operation, will the system (or migration tool) report that 200 files were selected and 200 files were moved? Some CCMSs may have, for example, a file size limit of 50 MB, and if one of the 200 files was 75 MB, that file may have been dropped from the migration. When there are many files, it is difficult to manually compare the files selected with the files moved.
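
If the CCMS or migration tool does not produce such a report, a simple reconciliation can be scripted. The sketch below is illustrative only: it assumes you can obtain a mapping of selected file names to sizes and a list of the names that arrived in the CCMS (for example from an export). `MigrationReport` and `reconcile` are hypothetical names, and the 50 MB limit is the example figure from the paragraph above, not a property of any particular product.

```python
from dataclasses import dataclass, field

# Assumption: 1 MB is treated as 1024 * 1024 bytes; a vendor may define it differently.
SIZE_LIMIT_BYTES = 50 * 1024 * 1024


@dataclass
class MigrationReport:
    selected: int = 0
    migrated: int = 0
    missing: list = field(default_factory=list)   # selected but not found after the move
    oversize: list = field(default_factory=list)  # larger than the size limit


def reconcile(selected_sizes, migrated_names, size_limit=SIZE_LIMIT_BYTES):
    """Compare what was selected (name -> size in bytes) with what arrived."""
    migrated = set(migrated_names)
    report = MigrationReport(selected=len(selected_sizes))
    for name, size in sorted(selected_sizes.items()):
        if name in migrated:
            report.migrated += 1
        else:
            report.missing.append(name)
        if size > size_limit:
            report.oversize.append(name)
    return report
```

For the example above, 200 selected files with one 75 MB file dropped would produce a report with `selected` of 200, `migrated` of 199, and the same file name in both `missing` and `oversize`, which points directly at the likely cause.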



When content and documents are migrated from a file share or other repository, consider what to do with the documents in the original repository. Leaving the original documents in the repository creates duplicates and, over time, different versions of the same document. Depending on why the CCMS site is being created, the documents in the original repository can be deleted; however, review this with your IT, information management, and/or legal teams before deleting the original files. If the documents came from a file share, consider making the file share “read only” so that existing documents cannot be edited and saved back and new documents cannot be created and saved to the file share.



Also, not all the documents you may want to migrate will be in obvious places, like a file share. Many users maintain substantial document repositories on their hard drives and/or USB hard drives. Users may be reluctant to let you know this in order to keep their working files private. Other places to look for documents are your Internet and intranet sites. Many documents are stored inside an Internet or intranet site in a content folder and can only be accessed by a user through the web interface. Or, documents may be available in an intranet via a link to a file share, and that drive location may not be obvious and may even be hidden to limit visibility to others. If these files are on a CCMS and then linked to the Internet or intranet site, then the files will be available from both the web location and the CCMS, which makes the documents more open for use and collaboration.



The purpose of having a CCMS is to enable file visibility, searchability, and collaboration, and the only way that can truly happen is to have all the files in the CCMS. It is hard to find and collaborate on documents when they reside in different repositories that are not connected or searchable.