How to limit the number of running instances in C+

2019-01-15 03:21发布

问题:

I have a c++ class that allocates a lot of memory. It does this by calling a third-party library that is designed to crash if it cannot allocate the memory, and sometimes my application creates several instances of my class in parallel threads. With too many threads I have a crash. My best idea for a solution is to make sure that there are never, say, more than three instances running at the same time. (Is this a good idea?) And my current best idea for implementing that is to use a boost mutex. Something along the lines of the following pseudo-code,

MyClass::MyClass(){
  my_thread_number = -1; //this is a class variable
  while (my_thread_number == -1)
    for (int i=0; i < MAX_PROCESSES; i++)
      if(try_lock a mutex named i){
        my_thread_number = i;
        break;
      }
  //Now I know that my thread has mutex number i and it is allowed to run
}

MyClass::~MyClass(){
    release mutex named my_thread_number
}

As you see, I am not quite sure of the exact syntax for mutexes here.. So summing up, my questions are

  1. Am I on the right track when I want to solve my memory error by limiting the number of threads?
  2. If yes, Should I do it with mutexes or by other means?
  3. If yes, Is my algorithm sound?
  4. Is there a nice example somewhere of how to use try_lock with boost mutexes?

Edit: I realized I am talking about threads, not processes. Edit: I am involved in building an application that can run on both linux and Windows...

回答1:

Here's a simplistic way to implement your own 'semaphore' (since I don't think the standard library or boost have one). This chooses a 'cooperative' approach and workers will wait for each other:

#include <boost/thread.hpp>
#include <boost/phoenix.hpp>

using namespace boost;
using namespace boost::phoenix::arg_names;

void the_work(int id)
{
    static int running = 0;
    std::cout << "worker " << id << " entered (" << running << " running)\n";

    static mutex mx;
    static condition_variable cv;

    // synchronize here, waiting until we can begin work
    {
        unique_lock<mutex> lk(mx);
        cv.wait(lk, phoenix::cref(running) < 3);
        running += 1;
    }

    std::cout << "worker " << id << " start work\n";
    this_thread::sleep_for(chrono::seconds(2));
    std::cout << "worker " << id << " done\n";

    // signal one other worker, if waiting
    {
        lock_guard<mutex> lk(mx);
        running -= 1;
        cv.notify_one(); 
    }
}

int main()
{
    thread_group pool;

    for (int i = 0; i < 10; ++i)
        pool.create_thread(bind(the_work, i));

    pool.join_all();
}

Now, I'd say it's probably better to have a dedicated pool of n workers taking their work from a queue in turns:

#include <boost/thread.hpp>
#include <boost/phoenix.hpp>
#include <boost/optional.hpp>

using namespace boost;
using namespace boost::phoenix::arg_names;

class thread_pool
{
  private:
      mutex mx;
      condition_variable cv;

      typedef function<void()> job_t;
      std::deque<job_t> _queue;

      thread_group pool;

      boost::atomic_bool shutdown;
      static void worker_thread(thread_pool& q)
      {
          while (auto job = q.dequeue())
              (*job)();
      }

  public:
      thread_pool() : shutdown(false) {
          for (unsigned i = 0; i < boost::thread::hardware_concurrency(); ++i)
              pool.create_thread(bind(worker_thread, ref(*this)));
      }

      void enqueue(job_t job) 
      {
          lock_guard<mutex> lk(mx);
          _queue.push_back(std::move(job));

          cv.notify_one();
      }

      optional<job_t> dequeue() 
      {
          unique_lock<mutex> lk(mx);
          namespace phx = boost::phoenix;

          cv.wait(lk, phx::ref(shutdown) || !phx::empty(phx::ref(_queue)));

          if (_queue.empty())
              return none;

          auto job = std::move(_queue.front());
          _queue.pop_front();

          return std::move(job);
      }

      ~thread_pool()
      {
          shutdown = true;
          {
              lock_guard<mutex> lk(mx);
              cv.notify_all();
          }

          pool.join_all();
      }
};

void the_work(int id)
{
    std::cout << "worker " << id << " entered\n";

    // no more synchronization; the pool size determines max concurrency
    std::cout << "worker " << id << " start work\n";
    this_thread::sleep_for(chrono::seconds(2));
    std::cout << "worker " << id << " done\n";
}

int main()
{
    thread_pool pool; // uses 1 thread per core

    for (int i = 0; i < 10; ++i)
        pool.enqueue(bind(the_work, i));
}

PS. You can use C++11 lambdas instead boost::phoenix there if you prefer.



回答2:

UPDATE My other answer addresses scheduling resources among threads (after the question was clarified).

It shows both a semaphore approach to coordinate work among (many) workers, and a thread_pool to limit workers in the first place and queue the work.

On linux (and perhaps other OSes?) you can use a lock file idiom (but it's not supported with some file-systems and old kernels).

I would suggest to use Interprocess synchronisation objects.

E.g., using a Boost Interprocess named semaphore:

#include <boost/interprocess/sync/named_semaphore.hpp>
#include <boost/thread.hpp>
#include <cassert>

int main()
{
    using namespace boost::interprocess;
    named_semaphore sem(open_or_create, "ffed38bd-f0fc-4f79-8838-5301c328268c", 0ul);

    if (sem.try_wait())
    {
        std::cout << "Oops, second instance\n";
    }
    else
    {
        sem.post();

        // feign hard work for 30s
        boost::this_thread::sleep_for(boost::chrono::seconds(30));

        if (sem.try_wait())
        {
            sem.remove("ffed38bd-f0fc-4f79-8838-5301c328268c");
        }
    }
}

If you start one copy in the back ground, new copies will "refuse" to start ("Oops, second instance") for about 30s.

I have a feeling it might be easier to reverse the logic here. Mmm. Lemme try.

some time passes

Hehe. That was more tricky than I thought.

The thing is, you want to make sure that the lock doesn't remain when your application is interrupted or killed. In the interest of sharing the techniques for portably handling the signals:

#include <boost/interprocess/sync/named_semaphore.hpp>
#include <boost/thread.hpp>
#include <cassert>
#include <boost/asio.hpp>

#define MAX_PROCESS_INSTANCES 3

boost::interprocess::named_semaphore sem(
        boost::interprocess::open_or_create, 
        "4de7ddfe-2bd5-428f-b74d-080970f980be",
        MAX_PROCESS_INSTANCES);

// to handle signals:
boost::asio::io_service service;
boost::asio::signal_set sig(service);

int main()
{

    if (sem.try_wait())
    {
        sig.add(SIGINT);
        sig.add(SIGTERM);
        sig.add(SIGABRT);
        sig.async_wait([](boost::system::error_code,int sig){ 
                std::cerr << "Exiting with signal " << sig << "...\n";
                sem.post();
            });
        boost::thread sig_listener([&] { service.run(); });

        boost::this_thread::sleep_for(boost::chrono::seconds(3));

        service.post([&] { sig.cancel(); });
        sig_listener.join();
    }
    else
    {
        std::cout << "More than " << MAX_PROCESS_INSTANCES << " instances not allowed\n";
    }
}

There's a lot that could be explained there. Let me know if you're interested.

NOTE It should be quite obvious that if kill -9 is used on your application (forced termination) then all bets are off and you'll have to either remove the Name Semaphore object or explicitly unlock it (post()).

Here's a testrun on my system:

sehe@desktop:/tmp$ (for a in {1..6}; do ./test& done; time wait)
More than 3 instances not allowed
More than 3 instances not allowed
More than 3 instances not allowed
Exiting with signal 0...
Exiting with signal 0...
Exiting with signal 0...

real    0m3.005s
user    0m0.013s
sys 0m0.012s