drm/amdkfd: fix restore worker race condition

In free memory of gpu path, remove bo from validate_list to make sure
restore worker don't access the BO any more, then unregister bo MMU
interval notifier. Otherwise, the restore worker will crash in the
middle of validating BO user pages if MMU interval notifer is gone.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This commit is contained in:
Philip Yang 2020-05-21 09:56:58 -04:00 committed by Alex Deucher
parent 62cc895c02
commit f7646585a3
1 changed files with 3 additions and 3 deletions

View File

@ -1302,15 +1302,15 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
return -EBUSY;
}
/* No more MMU notifiers */
amdgpu_mn_unregister(mem->bo);
/* Make sure restore workers don't access the BO any more */
bo_list_entry = &mem->validate_list;
mutex_lock(&process_info->lock);
list_del(&bo_list_entry->head);
mutex_unlock(&process_info->lock);
/* No more MMU notifiers */
amdgpu_mn_unregister(mem->bo);
ret = reserve_bo_and_cond_vms(mem, NULL, BO_VM_ALL, &ctx);
if (unlikely(ret))
return ret;