I understand that the TPL uses work-stealing queues for its tasks when I execute things like Parallel.For and similar constructs.
If I understand this correctly, the construct will spin up a number of tasks, where each will start processing items. If one of the tasks complete their allotted items, it will start stealing items from the other tasks which hasn't yet completed theirs. This solves the problem where items 1-100 are cheap to process and items 101-200 are costly, and one of the two tasks would just sit idle until the other completed. (I know this is a simplified exaplanation.)
However, how will this scale on a terminal server or in a web application (assuming we use TPL in code that would run in the web app)? Can we risk saturating the CPUs with tasks just because there are N instances of our application running side by side?
Is there any information on this topic that I should read? I've yet to find anything in particular, but that doesn't mean there is none.