I've had a question open on here about improving the performance of my 30 GB workflow application in XPages. There were lots of suggestions, but most involved recycling, improving code and so on. What actually fixed the speed issues is rarely talked about: the settings on the Advanced tab of the application properties (see my last post).
Now I have an application that runs really well. It is fast and people are happy, BUT the server still periodically crashes. Or rather, HTTP becomes unresponsive and in extreme cases runs at 100% CPU, so Domino is sluggish but still running.
I've been monitoring the HTTP threads with
tell http show thread state
In most cases I see 80 HTTP threads that are either idle or doing something but quickly released. Since the last update to the application, where we focused much more on recycling the Notes objects in SSJS, I thought we would see the end of the hung HTTP threads, but they are still there.
I'm almost positive that an infinite loop is not causing this problem, because the two cases I have confirmed with end users are completely different and there are no loops as far as I can tell.
In the first case, a user was editing a document and pressed a workflow button to approve it and send it on to the next user. She was using Chrome. The spinning circle on the Chrome tab started; the server is then supposed to run the workflow agent, send emails, and close the page in the browser. I noticed that there were 2 or 3 hung HTTP threads that hadn't been released after an hour, so I contacted the user. She told me the page hadn't refreshed and the circle was still spinning in Chrome, suggesting the server was doing something. I checked the logs and the workflow agent HAD run, the emails had been sent and the document was updated. She refreshed the page and could then see that it had been updated, but for whatever reason Chrome sat there waiting patiently and never received the message that the LotusScript agent had run. I use notesAgent.runOnServer and return the resulting integer to confirm whether the agent has run. If it returns 1 (I think), the page is supposed to close; otherwise it should display a message. The page never refreshed, so it didn't display anything, but the agent did complete.
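To make the intended flow concrete, here is a minimal sketch of the decision the button is supposed to make. It is not the production code: the function name, the result shape and the stubbed agent objects are all hypothetical, and the success value is passed in as a parameter because I'm not certain which integer runOnServer returns on success (the documentation suggests 0, so treat that as an assumption too).

```javascript
// Sketch only: `agent` stands in for the NotesAgent handle and is stubbed
// below so the snippet runs outside Domino. All names here are hypothetical.
function decideAfterAgentRun(agent, noteId, successStatus) {
  try {
    const status = agent.runOnServer(noteId);
    if (status === successStatus) {
      // Agent reported success: the page should close.
      return { action: "close", message: "" };
    }
    // Any other status: stay on the page and tell the user.
    return { action: "message", message: "Workflow agent returned status " + status };
  } catch (e) {
    // Report a thrown error as a message too, so every path produces a result.
    return { action: "message", message: "Workflow agent failed: " + e.message };
  }
}

// Stub agents for trying the function outside the server.
const okAgent = { runOnServer: function (noteId) { return 0; } };
const failingAgent = { runOnServer: function (noteId) { throw new Error("agent error"); } };

console.log(decideAfterAgentRun(okAgent, "NT00001", 0).action);
console.log(decideAfterAgentRun(failingAgent, "NT00001", 0).message);
```

In the hung case neither branch seemed to reach the browser, which is why I suspect the problem sits somewhere after the agent call rather than in the agent itself.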
In the second case, a user in the evening ended up with about 15 hung HTTP threads. In the logs I could see she had tried to reload the page multiple times. Then there was a search for the document she wanted, followed by more attempts to open it. When I checked with her, she said she had searched for the document, the search page showed the results (in a repeat control), and every time she clicked the document to open it nothing happened. So she never even got into the document, yet a thread was left hanging after each attempt. I took the URL from the Notes log and tried it: the document opened fine. I ran the same search: the document opened fine. I sent her a direct link to the document and it opened fine for her. Weird!!
Is there ANY way to diagnose this sort of behaviour? Right now I have to keep Domino Administrator open and run the command above all day, keeping an eye on it to make sure threads aren't hanging. It usually gets to lunch time and the server needs a reboot, which is rubbish.
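What I'm effectively doing by hand is comparing successive snapshots and looking for threads that never go back to idle. A rough sketch of that check follows, assuming the console output has already been parsed into plain objects. The `id` and `state` fields and the "idle" value are hypothetical and are not the actual Domino output format.

```javascript
// Hypothetical snapshot shape: an array of { id: string, state: string },
// one array per run of the console command, oldest first.
function findStuckThreads(snapshots, minConsecutive) {
  const busyRuns = new Map(); // thread id -> consecutive non-idle snapshots
  for (const snapshot of snapshots) {
    const seen = new Set();
    for (const thread of snapshot) {
      seen.add(thread.id);
      if (thread.state === "idle") {
        busyRuns.set(thread.id, 0);
      } else {
        busyRuns.set(thread.id, (busyRuns.get(thread.id) || 0) + 1);
      }
    }
    // A thread missing from this snapshot breaks its busy streak.
    for (const id of busyRuns.keys()) {
      if (!seen.has(id)) {
        busyRuns.set(id, 0);
      }
    }
  }
  // Keep only threads whose current streak is long enough.
  return Array.from(busyRuns)
    .filter(function (entry) { return entry[1] >= minConsecutive; })
    .map(function (entry) { return entry[0]; })
    .sort();
}

const snapshots = [
  [{ id: "a", state: "idle" }, { id: "b", state: "busy" }],
  [{ id: "a", state: "busy" }, { id: "b", state: "busy" }],
  [{ id: "a", state: "idle" }, { id: "b", state: "busy" }]
];
console.log(findStuckThreads(snapshots, 3));
```

This only flags candidates: a thread that is busy in every snapshot could be serving a different request each time, so each hit would still need checking against the logs.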
Please help my sanity :-)