
Why is infinite loop protection being triggered on

Posted 2020-06-03 08:30

Question:

Update

I should have added from the outset - this is in Microsoft Dynamics CRM 2011


I know CRM well, but I'm at a loss to explain behaviour on my current deployment.

Please read the outline of my scenario to help me understand which of my presumptions / understandings is wrong (and therefore what is causing this error). It's not consistent with my expectations.

Basic Scenario

  • Requirement demands that a web service is called every X minutes (it adds pending items to a database index)
  • I've opted for a workflow / custom entity trigger model: a custom Trigger entity has a CREATE plugin registered, and the plugin executes my logic. An accompanying workflow waits until the "completed" time + [timeout period] has passed. On expiry, it creates a new Trigger record and the workflow ends.
  • The plugin logic works just fine. The workflow concept works fine to a point, but after a period of time the workflow stalls with a failure:

    This workflow job was canceled because the workflow that started it included an infinite loop. Correct the workflow logic and try again. For information about workflow logic, see Help.

So in a nutshell - standard infinite loop detection. I understand the concept and why it exists.

Specific deployment

Firstly, I think it's quite safe for us to ignore the content of the plugin code in this scenario. It works fine, it's atomic and hardly touches CRM (to be clear, it is a pre-event plugin which runs the remote web service, awaits a response and then sets the "completed on" date/time attribute on my Trigger record before passing the Target entity back into the pipeline). So long as a Trigger record is created, this code runs and does what it should.

Having discounted the content of the plugin, there might be an issue that I don't appreciate in having the plugin registered on the pre-create step of the entity...

So that leaves the workflow itself. It's a simple one. It runs thusly:

  1. On creation of a new Trigger entity...
  2. it has a Timeout of Trigger.new_completedon + 15 minutes
  3. on timeout, it creates a new Trigger record (with no "completed on" value - this is set by the plugin remember)
  4. That's all - no explicit "end workflow" (though I've just added one now and will set it testing...)

With this set-up, I manually create a new Trigger record and the process spins nicely into action. In the last cycle I ran, 1h 58 mins later (remembering that my plugin code may take a minute to finish running) and after 7 successful execution cycles (i.e. new workflow jobs being created and completed), the 8th one failed with the aforementioned error.

What I already know (correct me where I'm wrong)

Recursion depth, by default, is set to 8. If a workflow / plugin calls itself 8 times then an infinite loop is detected.

Recursion depth is reset every hour (or every 10 minutes - see "Warnings" in linked blog?)

Recursion depth settings can be set via PowerShell or SDK code using the Deployment Web Service in an on-premise deployment only (via the Set-CrmSetting Cmdlet)

What I don't want to hear (please)

"Change recursion depth settings"

I cannot change the Deployment recursion depth settings as this is not an option in an online scenario - ultimately I will be deploying to CRM Online too.

"Increase the timeout period on your workflow"

This is not an option either - the reindex needs to occur every 15 minutes, ideally sooner.

Update

@Boone suggested below that the recursion depth timeout is reset after 60 minutes of inactivity rather than every 60 minutes. Therein lies the first misunderstanding.

While discussing with @alex, I suggested that there may be some persistence of CorrelationId between creating an entity via the workflow and the workflow that ultimately gets spawned... Well, there is. The CorrelationId is the same in the plugin, in the workflow and in any records that spool from that thread. I am now looking at ways to decouple the CorrelationId (or perhaps the creation of records) from the entity and the workflow.
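To make the CorrelationId observation concrete, here is a toy model in Python (not CRM SDK code). It assumes that every operation spawned inside a chain inherits the parent's correlation id and gets a depth one higher, while a request arriving from outside CRM starts a fresh chain. The class name `ExecutionContext` and the `spawn` method are hypothetical names for illustration only.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExecutionContext:
    """Toy stand-in for a plugin/workflow execution context (assumption, not the SDK type)."""
    correlation_id: uuid.UUID = field(default_factory=uuid.uuid4)
    depth: int = 1

    def spawn(self) -> "ExecutionContext":
        # Assumed rule: a child operation keeps the id and increments the depth.
        return ExecutionContext(self.correlation_id, self.depth + 1)


# Manual create -> workflow -> record created by the workflow: same chain.
root = ExecutionContext()
grandchild = root.spawn().spawn()
print(grandchild.correlation_id == root.correlation_id, grandchild.depth)

# A record created by an unrelated external request starts a new chain.
external = ExecutionContext()
print(external.correlation_id == root.correlation_id, external.depth)
```

Under these assumptions, the only way to stop the depth from climbing is for the next Trigger record to be created by something outside the chain.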

Answer 1:

For the one hour "reset" to take place you have to have NO activity for an hour. It doesn't reset just 1 hour from the original. So since you have an activity every 15 minutes, it never has a chance to reset. I don't know that this is set in stone anywhere... but that is my experience.
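The difference between the two readings of the reset rule can be checked with a small simulation. This is a simplified model in Python, not CRM's actual implementation: it assumes the depth counter goes up by one per cycle, that the job is cancelled when the counter reaches the limit of 8 (chosen to match the "7 succeed, 8th fails" report in the question), and that the counter resets after 60 minutes. The function name and its parameters are made up for the sketch.

```python
def first_failing_cycle(interval_min, max_depth=8, reset_min=60,
                        sliding=True, max_cycles=1000):
    """Return the 1-based cycle at which loop detection fires, or None.

    sliding=True  -> reset only after `reset_min` minutes with NO activity
    sliding=False -> reset `reset_min` minutes after the window started
    """
    depth = 0
    window_start = 0  # minute at which the reset timer last (re)started
    for cycle in range(1, max_cycles + 1):
        now = (cycle - 1) * interval_min
        if now - window_start >= reset_min:
            depth = 0
            window_start = now
        depth += 1
        if depth >= max_depth:
            return cycle
        if sliding:
            window_start = now  # any activity pushes the reset further out
    return None


print(first_failing_cycle(15))                 # inactivity-based reset
print(first_failing_cycle(15, sliding=False))  # fixed one-hour reset
print(first_failing_cycle(61))                 # cycles more than an hour apart
```

In this model, a 15-minute cycle with an inactivity-based reset fails on the 8th cycle, while a fixed hourly reset never fails, so only the inactivity reading agrees with what the question describes.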

In CRM 4 it was possible to create a CRM Service (search for "creating a CRM service in the child pipeline") and reset the correlation ID (using CorrelationToken.NewToken()). I don't see anything so easy in the 2011 SDK. No idea if this trick worked in the online environment. Is 2011 online backwards compatible with CRM 4 plug-ins?

One thing you could try would be to use the IExecutionContext.CorrelationId to scavenge the asyncoperation (System Job) table. But according to the metadata, the attributes I think might be useful (CorrelationId, CorrelationUpdatedTime, Depth) are NOT valid for update. Maybe you could delete the rows? Even that may not help.



Answer 2:

I doubt this can be solved like this.

I'd suggest a different approach: deploy a simple application alongside CRM and let it call the web service, which in turn can use the XRM endpoints in order to change the records.
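A minimal sketch of such a companion application, in Python and assuming nothing about the web service itself: `run_periodically` is a hypothetical helper, and `task` stands for whatever function actually calls the reindex service. The `sleep` and `clock` parameters exist only so the loop can be exercised without waiting.

```python
import time


def run_periodically(task, interval_s=15 * 60, iterations=None,
                     sleep=time.sleep, clock=time.monotonic):
    """Call `task` every `interval_s` seconds; return how many times it ran.

    iterations=None runs forever; a number stops after that many calls.
    """
    runs = 0
    while iterations is None or runs < iterations:
        started = clock()
        try:
            task()  # e.g. the call to the reindex web service
        except Exception as exc:
            # One failed run should not stop the schedule.
            print(f"reindex failed: {exc}")
        runs += 1
        if iterations is not None and runs >= iterations:
            break
        # Subtract the time the task took so cycles stay interval_s apart.
        elapsed = clock() - started
        sleep(max(0.0, interval_s - elapsed))
    return runs


if __name__ == "__main__":
    # Demo with a 1-second interval and a placeholder task.
    run_periodically(lambda: print("reindex requested"),
                     interval_s=1, iterations=3)
```

Because each run starts from outside CRM, no workflow re-triggers itself and no depth counter accumulates. A Windows scheduled task or service would serve the same purpose.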

UPDATE

Or, you can try something like this upon your crm service initialization in the plugin (dug it up from one of my plugins) leaving your workflow untouched:

CrmService service = new CrmService();
// initialize service here, then...

// Build a correlation token from the current plugin context
CorrelationToken corToken = new CorrelationToken();
corToken.CorrelationId = context.CorrelationId;
corToken.CorrelationUpdatedTime = context.CorrelationUpdatedTime;

// WILD GUESS: Enforce unlimited depth ?
corToken.Depth = 0; // THIS WAS: context.Depth;

// updating correlation token
service.CorrelationTokenValue = corToken;

I admit I don't really remember much about this (code dates back to about 2 years ago), but it might help.