Founder & CEO
Just like most relational database systems, Salesforce locks records when they are being modified, to prevent two people from updating the same record simultaneously, which would result in a conflict. So when someone "asks" Salesforce to update a record, Salesforce first locks the record so no one else can update it until the lock is released. If someone tries to (asks for an) update the record while it's locked, Salesforce will try up to 10 times to obtain a record lock, before giving up, and throwing an error.
Record locking errors is a common source of headache for people coding Data Migrations or Integrations with Salesforce. The good news is that most of the time It's our own integration code that is causing self-contention as opposed to some other user locking the record on us (because we are plowing too many conflicting updates at once, usually in an effort to improve performance). It can also be caused by Salesforce code that is triggered by our updates that require the locks, and then, when not getting them, ultimately fail. Then, that error bubbles up and our updates fail.
Suppose we want to update 10,000 contacts in Salesforce. We configure our ETL to push the updates in batches of 200, serially, using the SOAP API. Per batch, Salesforce will do the following:
Continuing with the same example, again suppose we want to send 10,000 contact records for update, and our ETL tool will break our 10,000-row dataset into batches of 200 records each, based on the order of the records in the dataset. If we don't order our data, we increase the number of locks taken, and with it the potential for record locking issues.
What would happen if we decide to send our 10,000-record dataset using the Bulk API, unordered, and set it to process batches in parallel. Salesforce takes those 10,000 records, breaks them up into batches of 200 (server side), and processes them five batches at a time (in parallel).
Let's say we have four contacts that belong to the same parent account in the dataset. Because they are unordered, the four contacts each land in a different batch, and because we are running things in parallel, Salesforce tries to lock the parent account four times at once, by four different processes. None can get the lock, and all four fail, or one gets the lock and the other 3 fail, or all pass but after wasting time on retry cycles slowing our load down considerably.
What can we do about this? The easiest thing to do is to switch to serial processing, so only one batch is processed at a time, but that affects performance negatively (and in a serious way) , so we don't want to do that. Our other option is to sort the contacts by parent account. In this way, all the contacts that belong to the same account have a high likelihood of being in the same batch, reducing the lock potential.
To be clear, I'm not saying that if we don’t order our data we will always get a locking issue. (Remember, Salesforce tries to gets the lock up to ten times, and may get it.) I'm just saying it increases the likelihood of it happening. I'm also saying that locking is a time-intensive process, and the retries are time intensive; so, even if we get no errors, an ordered load will perform better than an unordered one. Even if we get no locking conflicts and Salesforce can lock the account record every time it needs to, in our example the unordered list will lock the account record four times, but our ordered load will most likely take only one lock-two, at a maximum. (Most likely, all four contacts will be in the same batch, but if they happen to be right at the end of the batch, some can spill over into the next.)
If we have a junction object, it has two parents, and we can only sort by one! So, what do we do? There's not much we can do, except to sort by the one most likely to cause locking issues based on our knowledge of what custom Salesforce code will run when pushing the updates. Barring that, as a general guideline, we can do the following:
Sort by Master-Detail fields before lookups.
For lookups, sort required lookups before nonrequired ones.
For nonrequired lookups, sort ones set to "Don't allow deletion of the lookup record that's part of a lookup relationship” before ones set to "Clear the value of this field. You can't choose this option if you make this field required" (see the Image below).
It's important to know that, when we select "Clear the value of this field. We cant choose this option if we make this field required.", updates to that record will not lock the parent. Unfortunately, this is not the default selection, so our Salesforce SA (or administrator or developer) may have left the default selection ("Don't allow deletion...") simply because they had no business rules or reason to change it. We should work with our Salesforce development team to determine whether this setting can be changed (based on the business use case) and, if so, change it to "Clear the value of this field..."
If you are still having locking issues see this article (also authored by me :-): FAQ : One bad record is causing a batch of records to fail! How can I prevent triggers from impacting my ETL code?
The Salesforces record locking cheat sheet can be found here.
This article is adapted from my book: Developing Data Migrations and Integrations with Salesforce.