Multithreaded execution where order of finished Wo

2020-05-19 04:37发布

I have a flow of units of work, lets call them "Work Items" that are processed sequentially (for now). I'd like to speed up processing by doing the work multithreaded.

Constraint: Those work items come in a specific order, during processing the order is not relevant - but once processing is finished the order must be restored.

Something like this:

   |.|
   |.|
   |4|
   |3|
   |2|    <- incoming queue
   |1|
  / | \
 2  1  3  <- worker threads
  \ | /
   |3|
   |2|    <- outgoing queue
   |1|

I would like to solve this problem in Java, preferably without Executor Services, Futures, etc., but with basic concurrency methods like wait(), notify(), etc.

Reason is: My Work Items are very small and fine grained, they finish processing in about 0.2 milliseconds each. So I fear using stuff from java.util.concurrent.* might introduce way to much overhead and slow my code down.

The examples I found so far all preserve the order during processing (which is irrelevant in my case) and didn't care about order after processing (which is crucial in my case).

9条回答
我命由我不由天
2楼-- · 2020-05-19 05:09

Pump all your Futures through a BlockingQueue. Here's all the code you need:

public class SequentialProcessor implements Consumer<Task> {
    private final ExecutorService executor = Executors.newCachedThreadPool();
    private final BlockingDeque<Future<Result>> queue = new LinkedBlockingDeque<>();

    public SequentialProcessor(Consumer<Result> listener) {
        new Thread(() -> {
            while (true) {
                try {
                    listener.accept(queue.take().get());
                } catch (InterruptedException | ExecutionException e) {
                    // handle the exception however you want, perhaps just logging it
                }
            }
        }).start();
    }

    public void accept(Task task) {
        queue.add(executor.submit(callableFromTask(task)));
    }

    private Callable<Result> callableFromTask(Task task) {
        return <how to create a Result from a Task>; // implement this however
    }
}

Then to use, create a SequentialProcessor (once):

SequentialProcessor processor = new SequentialProcessor(whatToDoWithResults);

and pump tasks to it:

Stream<Task> tasks; // given this

tasks.forEach(processor); // simply this

I created the callableFromTask() method for illustration, but you can dispense with it if getting a Result from a Task is simple by using a lambda instead or method reference instead.

For example, if Task had a getResult() method, do this:

queue.add(executor.submit(task::getResult));

or if you need an expression (lambda):

queue.add(executor.submit(() -> task.getValue() + "foo")); // or whatever
查看更多
Root(大扎)
3楼-- · 2020-05-19 05:14

If you allow BlockingQueue, why would you ignore the rest of the concurrency utils in java? You could use e.g. Stream (if you have java 1.8) for the above:

List<Type> data = ...;
List<Other> out = data.parallelStream()
    .map(t -> doSomeWork(t))
    .collect(Collectors.toList());

Because you started from an ordered Collection (List), and collect also to a List, you will have results in the same order as the input.

查看更多
女痞
4楼-- · 2020-05-19 05:14

Just ID each of the objects for processing, create a proxy which would accept done work and allow to return it only when the ID pushed was sequential. A sample code below. Note how simple it is, utilizing an unsynchronized auto-sorting collection and just 2 simple methods as API.

public class SequentialPushingProxy {

    static class OrderedJob implements Comparable<OrderedJob>{
        static AtomicInteger idSource = new AtomicInteger();
        int id;

        public OrderedJob() {
            id = idSource.incrementAndGet();
        }

        public int getId() {
            return id;
        }

        @Override
        public int compareTo(OrderedJob o) {
            return Integer.compare(id, o.getId());
        }
    }

    int lastId = OrderedJob.idSource.get();

    public Queue<OrderedJob> queue;

    public SequentialPushingProxy() {
        queue = new PriorityQueue<OrderedJob>();
    }

    public synchronized void pushResult(OrderedJob job) {
        queue.add(job);
    }

    List<OrderedJob> jobsToReturn = new ArrayList<OrderedJob>();
    public synchronized List<OrderedJob> getFinishedJobs() {
        while (queue.peek() != null) {
            // only one consumer at a time, will be safe
            if (queue.peek().getId() == lastId+1) {
                jobsToReturn.add(queue.poll());
                lastId++;
            } else {
                break;
            }
        }
        if (jobsToReturn.size() != 0) {
            List<OrderedJob> toRet = jobsToReturn;
            jobsToReturn = new ArrayList<OrderedJob>();
            return toRet;
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        final SequentialPushingProxy proxy = new SequentialPushingProxy();

        int numProducerThreads = 5;

        for (int i=0; i<numProducerThreads; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    while(true) {
                        proxy.pushResult(new OrderedJob());
                    }
                }
            }).start();
        }


        int numConsumerThreads = 1;

        for (int i=0; i<numConsumerThreads; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    while(true) {
                        List<OrderedJob> ret = proxy.getFinishedJobs();
                        System.out.println("got "+ret.size()+" finished jobs");
                        try {
                            Thread.sleep(200);
                        } catch (InterruptedException e) {
                            // TODO Auto-generated catch block
                            e.printStackTrace();
                        }
                    }
                }
            }).start();
        }


        try {
            Thread.sleep(5000);
        } catch (InterruptedException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

        System.exit(0);
    }

}

This code could be easily improved to

  • allow pushing more than one job result at once, to reduce the synchronization costs
  • introduce a limit to returned collection to get done jobs in smaller chunks
  • extract an interface for those 2 public methods and switch implementations to perform tests
查看更多
【Aperson】
5楼-- · 2020-05-19 05:14

Preprocess: add an order value to each item, prepare an array if it is not allocated.

Input: queue (concurrent sampling with order values 1,2,3,4 but doesnt matter which tread gets which sample)

Output: array (writing to indexed elements, using a synch point to wait for all threads in the end, doesn't need collision checks since it writes different positions for every thread)

Postprocess: convert array to a queue.

Needs n element-array for n-threads. Or some multiple of n to do postprocessing only once.

查看更多
smile是对你的礼貌
6楼-- · 2020-05-19 05:16

You could have 3 input and 3 output queues - one of each type for each worker thread.

Now when you want to insert something into the input queue you put it into only one of the 3 input queues. You change the input queues in a round robin fashion. The same applies to the output, when you want to take something from the output you choose the first of the output queues and once you get your element you switch to the next queue.

All the queues need to be blocking.

查看更多
够拽才男人
7楼-- · 2020-05-19 05:16

Reactive programming could help. During my brief experience with RxJava I found it to be intuitive and easy to work with than core language features like Future etc. Your mileage may vary. Here are some helpful starting points https://www.youtube.com/watch?v=_t06LRX0DV0

The attached example also shows how this could be done. In the example below we have Packet's which need to be processed. They are taken through a simple trasnformation and fnally merged into one list. The output appended to this message shows that the Packets are received and transformed at different points in time but in the end they are output in the order they have been received

import static java.time.Instant.now;
import static rx.schedulers.Schedulers.io;

import java.time.Instant;
import java.util.List;
import java.util.Random;

import rx.Observable;
import rx.Subscriber;

public class RxApp {

  public static void main(String... args) throws InterruptedException {

    List<ProcessedPacket> processedPackets = Observable.range(0, 10) //
        .flatMap(i -> {
          return getPacket(i).subscribeOn(io());
        }) //
        .map(Packet::transform) //
        .toSortedList() //
        .toBlocking() //
        .single();

    System.out.println("===== RESULTS =====");
    processedPackets.stream().forEach(System.out::println);
  }

  static Observable<Packet> getPacket(Integer i) {
    return Observable.create((Subscriber<? super Packet> s) -> {
      // simulate latency
      try {
        Thread.sleep(new Random().nextInt(5000));
      } catch (Exception e) {
        e.printStackTrace();
      }
      System.out.println("packet requested for " + i);
      s.onNext(new Packet(i.toString(), now()));
      s.onCompleted();
    });
  }

}


class Packet {
  String aString;
  Instant createdOn;

  public Packet(String aString, Instant time) {
    this.aString = aString;
    this.createdOn = time;
  }

  public ProcessedPacket transform() {
    System.out.println("                          Packet being transformed " + aString);
    try {
      Thread.sleep(new Random().nextInt(5000));
    } catch (Exception e) {
      e.printStackTrace();
    }
    ProcessedPacket newPacket = new ProcessedPacket(this, now());
    return newPacket;
  }

  @Override
  public String toString() {
    return "Packet [aString=" + aString + ", createdOn=" + createdOn + "]";
  }
}


class ProcessedPacket implements Comparable<ProcessedPacket> {
  Packet p;
  Instant processedOn;

  public ProcessedPacket(Packet p, Instant now) {
    this.p = p;
    this.processedOn = now;
  }

  @Override
  public int compareTo(ProcessedPacket o) {
    return p.createdOn.compareTo(o.p.createdOn);
  }

  @Override
  public String toString() {
    return "ProcessedPacket [p=" + p + ", processedOn=" + processedOn + "]";
  }

}

Deconstruction

Observable.range(0, 10) //
    .flatMap(i -> {
      return getPacket(i).subscribeOn(io());
    }) // source the input as observables on multiple threads


    .map(Packet::transform) // processing the input data 

    .toSortedList() // sorting to sequence the processed inputs; 
    .toBlocking() //
    .single();

On one particular run Packets were received in the order 2,6,0,1,8,7,5,9,4,3 and processed in order 2,6,0,1,3,4,5,7,8,9 on different threads

packet requested for 2
                          Packet being transformed 2
packet requested for 6
                          Packet being transformed 6
packet requested for 0
packet requested for 1
                          Packet being transformed 0
packet requested for 8
packet requested for 7
packet requested for 5
packet requested for 9
                          Packet being transformed 1
packet requested for 4
packet requested for 3
                          Packet being transformed 3
                          Packet being transformed 4
                          Packet being transformed 5
                          Packet being transformed 7
                          Packet being transformed 8
                          Packet being transformed 9
===== RESULTS =====
ProcessedPacket [p=Packet [aString=2, createdOn=2016-04-14T13:48:52.060Z], processedOn=2016-04-14T13:48:53.247Z]
ProcessedPacket [p=Packet [aString=6, createdOn=2016-04-14T13:48:52.130Z], processedOn=2016-04-14T13:48:54.208Z]
ProcessedPacket [p=Packet [aString=0, createdOn=2016-04-14T13:48:53.989Z], processedOn=2016-04-14T13:48:55.786Z]
ProcessedPacket [p=Packet [aString=1, createdOn=2016-04-14T13:48:54.109Z], processedOn=2016-04-14T13:48:57.877Z]
ProcessedPacket [p=Packet [aString=8, createdOn=2016-04-14T13:48:54.418Z], processedOn=2016-04-14T13:49:14.108Z]
ProcessedPacket [p=Packet [aString=7, createdOn=2016-04-14T13:48:54.600Z], processedOn=2016-04-14T13:49:11.338Z]
ProcessedPacket [p=Packet [aString=5, createdOn=2016-04-14T13:48:54.705Z], processedOn=2016-04-14T13:49:06.711Z]
ProcessedPacket [p=Packet [aString=9, createdOn=2016-04-14T13:48:55.227Z], processedOn=2016-04-14T13:49:16.927Z]
ProcessedPacket [p=Packet [aString=4, createdOn=2016-04-14T13:48:56.381Z], processedOn=2016-04-14T13:49:02.161Z]
ProcessedPacket [p=Packet [aString=3, createdOn=2016-04-14T13:48:56.566Z], processedOn=2016-04-14T13:49:00.557Z]
查看更多
登录 后发表回答