Recommendations for designing a long-running, reso

Posted 2019-04-08 06:21

Question:

I have a .NET function that does some complex calculation. Depending on the parameters that are passed in, the function:

  • Takes anywhere from several minutes to several hours to run
  • Uses 100% of a single core during the computation
  • Requires anywhere from 100s of MB to several GB of memory
  • Writes anywhere from several MB to several GB of data to disk
  • May throw an exception, including an OutOfMemoryException

The amount of data to be written to disk can be accurately predicted from the function parameterisation. There is no easy way to predict the other resource requirements from the function parameterisation.

I need to expose this function via a web service. This service needs to be:

  • Resilient, and able to gracefully report any problems during the calculation
  • Capable of handling concurrent requests, as long as there are sufficient resources to handle the request without significant performance degradation, and to gracefully deny the request otherwise.

I'm intending to handle the long-running nature by having the initial request return a status resource that can be polled for progress. Once the calculation is complete this resource will provide the location of the output data, which the client can download (probably via FTP).

I'm less clear on how best to handle the other requirements. I'm considering some sort of "calculation pool" that maintains instances of the calculator and keeps track of which ones are currently being used, but I haven't figured out the details.

Does anyone with experience of similar situations have any suggestions? As long as the solution can run on a Windows box, all technology options can be considered.

Answer 1:

I'd suggest splitting your application in two parts.

  1. The web service itself. Its functionality:
    • Get a work item from a client;
    • Transfer this work to a backend service that performs the actual work;
    • Report progress and the result;
  2. The backend service. Its functionality:
    • Process the requests from the web service;
    • Perform the actual computation.

The reasons for this design are:
1) it's relatively difficult to handle the workload in the hosted application (ASP.NET) because the server (IIS) will manage the resources, while in a separate app you have more direct control;
2) a two-tier design is more scalable - for instance, later you could easily move the backend to another physical machine (or several machines).

The web service should be stateless - for instance, after a request is accepted, the user gets back some ID and uses this ID to poll the service for the result.
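The ID-and-poll flow could be sketched roughly as follows. This is a minimal illustration in Python rather than .NET, and every name in it (`JobStore`, `JobStatus`, `submit`, `poll`) is hypothetical. A real service would keep this state in a store shared with the backend, not in process memory, so that the web tier itself stays stateless.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class JobStatus:
    state: str = "queued"                  # queued | running | done | failed
    progress: float = 0.0                  # fraction complete, 0.0 to 1.0
    output_location: Optional[str] = None  # set once the calculation is done
    error: Optional[str] = None            # set if the calculation failed


class JobStore:
    """Hypothetical stand-in for the shared store behind the web service."""

    def __init__(self) -> None:
        self._jobs: Dict[str, JobStatus] = {}

    def submit(self) -> str:
        # Accept a request and hand back an opaque ID the client can poll with.
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = JobStatus()
        return job_id

    def update(self, job_id: str, **changes) -> None:
        # Called by the backend as the calculation progresses.
        status = self._jobs[job_id]
        for name, value in changes.items():
            setattr(status, name, value)

    def poll(self, job_id: str) -> Optional[JobStatus]:
        # Returns None for an unknown ID; a web service would map that to a 404.
        return self._jobs.get(job_id)
```

The client only ever holds the ID, so any web server instance that can reach the store can answer a poll.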

The backend service will probably need to maintain a queue of requests to process and a set of worker threads that process them. The workers should monitor the available resources and take care not to overload the machine (and, of course, gracefully handle all possible error conditions).
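A rough sketch of such a backend, again in Python and with hypothetical names (`Backend`, `try_submit`), might look like this. It assumes the only resource checked up front is disk space, since the question says that is the one requirement that can be predicted; the fixed number of workers caps CPU use, and memory failures are caught and reported, not prevented.

```python
import queue
import shutil
import threading
from typing import Callable, Dict


class Backend:
    """Hypothetical queue plus worker threads with a disk-space admission check."""

    def __init__(self, workers: int, output_dir: str = ".") -> None:
        self._queue: queue.Queue = queue.Queue()
        self._output_dir = output_dir
        self._lock = threading.Lock()
        self._reserved_bytes = 0            # disk promised to accepted jobs
        self.results: Dict[str, str] = {}   # job ID -> outcome
        for _ in range(workers):            # one worker per core to be used
            threading.Thread(target=self._work, daemon=True).start()

    def try_submit(self, job_id: str, calc: Callable[[], None],
                   disk_bytes: int) -> bool:
        # Deny the request if its predicted output would not fit on disk
        # alongside the output of jobs that were already accepted.
        with self._lock:
            free = shutil.disk_usage(self._output_dir).free
            if self._reserved_bytes + disk_bytes > free:
                return False
            self._reserved_bytes += disk_bytes
        self._queue.put((job_id, calc, disk_bytes))
        return True

    def _work(self) -> None:
        while True:
            job_id, calc, disk_bytes = self._queue.get()
            try:
                calc()
                outcome = "done"
            except MemoryError:
                outcome = "failed: out of memory"
            except Exception as exc:        # report the problem, keep the worker alive
                outcome = f"failed: {exc}"
            with self._lock:
                self._reserved_bytes -= disk_bytes
                self.results[job_id] = outcome
            self._queue.task_done()

    def wait(self) -> None:
        # Block until every accepted job has been processed.
        self._queue.join()
```

Threads are used here only to mirror the wording above. Running each calculation in its own process is another option, and would keep an out-of-memory failure in one job from affecting the others.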



Answer 2:

While you may want to provide a web service interface, web services are typically not designed for this kind of process. What you might want to do is forward the request to a Windows service (on a dedicated machine) that can handle it. Windows services won't get recycled, and you have much more control over the process.

About the calculation pool: one thing you can try is to create a calculation queue (for instance, a table in the database). This way you can have multiple Windows services on dedicated machines processing the calculations, which allows you to scale more easily.
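As an illustration of the table-as-queue idea, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for whatever database is actually used. The table name, column names, and helper functions are all assumptions made for the example. The key point is that a worker claims a row inside a single transaction, so two services cannot pick up the same calculation.

```python
import sqlite3
from typing import Optional, Tuple

SCHEMA = """
CREATE TABLE IF NOT EXISTS calculation_queue (
    id     INTEGER PRIMARY KEY AUTOINCREMENT,
    params TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending',  -- pending | running | done | failed
    worker TEXT                              -- which service claimed the row
)
"""


def open_queue(path: str) -> sqlite3.Connection:
    # isolation_level=None puts the connection in autocommit mode,
    # so transactions are started and ended explicitly below.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(SCHEMA)
    return conn


def enqueue(conn: sqlite3.Connection, params: str) -> int:
    cur = conn.execute(
        "INSERT INTO calculation_queue (params) VALUES (?)", (params,))
    return cur.lastrowid


def claim_next(conn: sqlite3.Connection, worker: str) -> Optional[Tuple[int, str]]:
    # BEGIN IMMEDIATE takes the write lock before reading, so the
    # select-then-update pair cannot interleave with another claimer.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id, params FROM calculation_queue "
            "WHERE status = 'pending' ORDER BY id LIMIT 1").fetchone()
        if row is not None:
            conn.execute(
                "UPDATE calculation_queue SET status = 'running', worker = ? "
                "WHERE id = ?", (worker, row[0]))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return row


def finish(conn: sqlite3.Connection, job_id: int, ok: bool) -> None:
    conn.execute(
        "UPDATE calculation_queue SET status = ? WHERE id = ?",
        ("done" if ok else "failed", job_id))
```

Each Windows service would loop on `claim_next`, run the calculation, and call `finish`; the web service can read the same table to answer status polls.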