Most discussion of cloud migration focuses on the challenge of getting data to the cloud. That challenge is usually described as one of data volume: when there are large quantities of data, it’s difficult to move it into the cloud without creating application downtime or requiring an update/synchronization process once the data is loaded into the cloud.
It’s often overlooked that there are quality challenges when getting data to the cloud, too. Business applications rarely have complete, consistent, correct data. Deciding how—or whether—to address those issues can have a big impact on the success of your application in the cloud.
Data Quality Challenges
We’ve talked before about examining your data before you migrate it. There are several data factors to consider before planning to migrate a data set:
• Data age. Even a “good” set of data may not be worth migrating if it’s too old to be useful.
• Data consistency. If there are gaps in the data or fields were used inconsistently, it’s difficult to draw good conclusions or support business processes with that data.
• Data model. A flawed data model, even when used consistently, doesn’t capture the data in a format that enables the business to work effectively.
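As a rough illustration of checking the first two factors, the sketch below counts stale rows and missing values in a table. It uses Python’s built-in sqlite3 module with an in-memory database purely so it runs on its own; the customers table, its columns, and the cutoff date are hypothetical stand-ins for your own data, not a recommendation.

```python
import sqlite3


def profile_table(conn, table, date_column, required_columns, cutoff_date):
    """Return simple age and completeness counts for one table.

    Table and column names are interpolated into the SQL directly,
    so pass only trusted, known identifiers.
    """
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

    # Data age: rows last updated before the cutoff (ISO dates compare as text).
    stale = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {date_column} < ?", (cutoff_date,)
    ).fetchone()[0]

    # Data consistency: rows where a required field is NULL or blank.
    missing = {}
    for col in required_columns:
        missing[col] = conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE {col} IS NULL OR TRIM({col}) = ''"
        ).fetchone()[0]

    return {"total_rows": total, "stale_rows": stale, "missing": missing}


def build_demo_db():
    """Create a small in-memory table of made-up rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers "
        "(id INTEGER PRIMARY KEY, email TEXT, region TEXT, updated_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?, ?)",
        [
            (1, "a@example.com", "EU", "2023-05-01"),
            (2, None, "US", "2015-01-10"),
            (3, "c@example.com", "", "2012-07-22"),
            (4, "d@example.com", "EU", "2024-02-14"),
        ],
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print(profile_table(demo, "customers", "updated_at", ["email", "region"], "2018-01-01"))
```

Counts like these won’t tell you whether the data model itself is flawed, but they can give you a sense of how much of a data set is old or incomplete before you commit to moving it.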
Addressing Data Quality Challenges in Cloud Migration
Once you know where your data quality challenges are, you can make an informed decision about how to deal with them. You can take one of these approaches:
• Ignore it. Acknowledging that there are problems with the data but choosing to migrate it “as is” anyway is a perfectly valid approach. Your business has been working with this data for some time, and most likely has workarounds and other solutions for coping with the problems. You can accept that you’ll continue to use those workarounds even once your workloads are in the cloud.
• Correct it before you migrate. If you aren’t going to ignore the problems, you’ll have to address them. You can address the data issues in your current, non-cloud environment before you begin the migration. This has the advantage that you’ll solve the problems in an environment you’re familiar with and where your team has expertise, but it delays your move to the cloud. And if you plan to use a different database in the cloud, you’ll have to redesign a second time for that migration.
• Correct during the migration. Alternatively, you can make changes to the data model or the data as part of the migration. You’ll revise the database, data models, and schemas, then load data from its existing storage directly into the new structure in the cloud. This approach adds complexity and risk and can make it more difficult to validate the data migration after all the data is loaded into the cloud.
• Correct after migration. This approach offers a middle ground. You identify the problems with the data and make a plan to address them after you’ve completed and validated the migration to the cloud in the data’s existing format. Like correcting the data before migrating, this approach takes time, but at least you’ll be in the cloud while you’re working on it.
Verify Data Quality After Migration
Whichever approach you take to handling data problems during migration, even if you choose not to address them, you should plan to verify the data quality after the migration completes. The migration process can introduce errors of its own: data can be damaged in transfer, load incompletely, or load incorrectly. Don’t rely on the absence of error messages to decide that the process succeeded. Document a validation process as part of your migration plan.
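One possible shape for such a validation step is to compare row counts and per-row checksums between the source and the migrated copy. The sketch below is a minimal example under stated assumptions: it uses two in-memory SQLite databases as stand-ins for the source and cloud databases, a hypothetical orders table, a unique key column, and the same column order on both sides. A real migration would also need to normalize data types before hashing, since different database drivers can return the same value in different forms.

```python
import hashlib
import sqlite3


def table_fingerprint(conn, table, key_column):
    """Return (row_count, {key: sha256 digest of the row}) for one table.

    Assumes key_column is unique and that the identifiers are trusted.
    """
    cur = conn.execute(f"SELECT * FROM {table} ORDER BY {key_column}")
    names = [d[0] for d in cur.description]
    key_index = names.index(key_column)
    digests = {}
    for row in cur:
        digests[row[key_index]] = hashlib.sha256(repr(row).encode("utf-8")).hexdigest()
    return len(digests), digests


def compare_tables(source, target, table, key_column):
    """Report count differences plus missing, unexpected, and changed rows."""
    src_count, src = table_fingerprint(source, table, key_column)
    tgt_count, tgt = table_fingerprint(target, table, key_column)
    return {
        "source_rows": src_count,
        "target_rows": tgt_count,
        "missing_in_target": sorted(set(src) - set(tgt)),
        "unexpected_in_target": sorted(set(tgt) - set(src)),
        "changed": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }


def make_db(rows):
    """Build an in-memory orders table from made-up rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    source = make_db([(1, "acme", 10.0), (2, "globex", 25.5), (3, "initech", 7.25)])
    # Simulated faulty load: row 3 never arrived and row 2 has a changed total.
    target = make_db([(1, "acme", 10.0), (2, "globex", 25.0)])
    print(compare_tables(source, target, "orders", "id"))
```

Neither load in this example would necessarily have raised an error, which is the point: the dropped row and the altered value only show up when the two sides are compared. If you correct data during the migration, a check like this would have to compare against the expected transformed values rather than the raw source rows.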
Addressing data quality issues as part of your cloud migration can be an important factor in the success of your cloud project and digital transformation. Contact VAST IT Services to learn more about succeeding with data in the cloud.