|
Author: David Masri Founder & CEO |
I really hate the Salesforce's Apex Data Loader, I really do. Not because it's a bad app, it's quite a good one (at least as far as utilities go), the reason I hate it is simply because it exists. Let me explain, but first a bit of background.
I recently authored an article on Salesforce Ben titled: Choosing the Right ETL Tool or Middleware for Your Salesforce Data Migration or Integration , in that article I noted:
"The APEX data loader is almost never the right tool for the job because it doesn't have any data transformation capabilities and is relatively difficult to automate."
In my book,(Developing Data Migrations and Integrations with Salesforce. ), I introduce the data loader as follows:
"The Apex Data Loader is a free utility provided by Salesforce. It's great for testing the SOAP and Bulk REST APIs under various configurations, as well as doing one-off updates and exports. I do not recommend it for a full-blown data migration or integration because of its lack of features as an ETL tool, not because of any limitations in working with the Salesforce APIs."
When discussing Best Practices (Chapter 6 of my book), Best Practice #20 is "Use Proper Tools" here is an excerpt:
I want to stress (again) that the Apex Data Loader is not the right tool for any serious migration or integration work. It has no capability for data transformations, and although you can use it to connect to Java Database Connectivity (JDBC) data sources, doing so is tedious and error prone. In addition, all the success and error data can only be outputed to .csv files. Using .csv files with Excel as a data transformation tool is also not a good idea. Excel does not produce clean .csv files.*
Excel does not support relational joins, and vertical lookups (VLookup) are slow. Excel simply stops responding when dealing with large files. Most importantly, using Excel reinforces bad habits of transforming data manually rather than automating it. I am convinced that the reason this combination of Excel and .csv files is so common for data migrations is because people tend to know of the Apex Data Loader and instinctively want to use it. (It’s free! And data migrations are one-time run code!**) Then, when they want to edit the .csv files, they double-click and it opens in Excel. Don’t fall into this trap.
*I'm not saying you can't code a migration or integration using the Apex Data Loader that meets all the attributes I listed in Chapters 4 and 5. You absolutely can, and I have. You can use the command line to automate it fully, and some scripting language for all transformation code. I'm just saying it's not the right tool for the job. [Again, you can see my article on SalesforceBen.com on choosing the right tool]
[**Data Migrations are NOT one time run, and thinking that it is the biggest mistakes you can make when migrating data to Salesforce.]
So, you can see from my previous statements that I'm not a fan, but why do I hate it? Because it encourages an error prone manual process. It encourages the violation of every one of the "six attributes of a good data migration" I outline in my book. Every single time I have been brought in to clean up a data migration mess it has been an Apex data loader migration. To stress how bad Data Loader data migrations are, I recently diagrammed out what the data migration process using the Apex Data Loader looks like, here it is:
This is not a joke! That is the actual process the vast majority of data migrations to Salesforce looks like! It's insane! Yet, people wonder why many Salesforce projects fail (or at least struggle to go live) simply because of the data migration process.
But here is the worse part of that process, and why I hate it so much: because it works for small data migrations, and because It works for small migrations –people keep doing it, and begin to scale up, as you scale up it begins to fail, and then it fails catastrophically.
Here are a just few of the problems that arise with this nut-so process:
I really wish people would exclusively use the APEX data loader for its intended purpose, that is one off data extractions, and simple loads as part of basic administration (Importing leads or campaign members in bulk). But as long as it exists, people will be using it to perform data migrations so complex they are doomed to fail. And that's why I hate it.
Needless to say, some people just never learn and will insist on using it anyway. They will argue with me that just because some people abuse the Data Loader, that's not the Data Loaders' fault! Here is my response from the FAQ section of my book (Chapter 13):
Why can't I use the Data Loader? I love the Apex data loader. Look, you absolutely can build data migrations and even data integrations using the Apex Data Loader while following every single best practice in this book. It's just simply not designed to be used that way and is an error-prone process. But if you insist, I recommend that you:
I want to stress that the difficulties you face in automating the Apex Data Loader, particularly in automating your data transformations, are not an excuse to introduce manual steps.
I get that the Apex Data Loader is a necessary evil, but I beg you, please stop using the Apex Data Loader for your data migrations.
This article is adapted from my book: Developing Data Migrations and Integrations with Salesforce.