Since GPU memory is very limited, the GPU can only handle one job at a time, but it runs each job much faster than a CPU does. I now have 100 jobs to do, and I want as many of them as possible to run on the GPU while still making full use of the slower but memory-rich CPUs.
Currently I have no way to guarantee that enough GPU memory is free, so I wrap each GPU attempt in a try-catch block and move the data back to CPU memory when an exception occurs. However, once an out-of-GPU-memory exception is thrown on a worker, that worker never frees the GPU memory it has already consumed. As a result, if there are enough jobs, every worker ends up holding its own chunk of useless GPU memory and none of them can run an actual GPU job, because the GPU memory is full of these dead allocations.
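To make the failure mode concrete, here is a minimal Python sketch of the fallback pattern I mean. The `run_on_gpu`, `run_on_cpu`, and `release_gpu_memory` functions are placeholders I made up for illustration (the post doesn't name a framework); the point is that the release step lives in a `finally` block, so a failed GPU attempt can never leave its memory pinned:

```python
def run_on_gpu(job):
    # Placeholder: real code would allocate GPU buffers and compute here.
    if job["too_big"]:
        raise MemoryError("out of GPU memory")
    return ("gpu", job["id"])

def run_on_cpu(job):
    # Placeholder for the slower but memory-rich CPU path.
    return ("cpu", job["id"])

def release_gpu_memory():
    # Placeholder: with PyTorch this would be roughly
    # `del <tensors>; torch.cuda.empty_cache()`; with CuPy,
    # `cupy.get_default_memory_pool().free_all_blocks()`.
    pass

def run_job(job):
    try:
        return run_on_gpu(job)
    except MemoryError:
        # Fall back to the CPU when the GPU allocation fails.
        return run_on_cpu(job)
    finally:
        # Always free GPU allocations, on success AND on failure --
        # skipping this is exactly what leaves dead memory behind.
        release_gpu_memory()
```

With this shape, `run_job({"id": 1, "too_big": True})` falls back to the CPU but still releases whatever the failed GPU attempt allocated.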
That's why I am starting to consider how to pin the GPU to one specific worker, or two (if the GPU can hold two jobs in its memory). Or, at the very least, I need a way to fully release a worker's GPU memory once it decides not to use the GPU for its job.