What “thread safe” really means… in practical terms

Posted 2020-05-13 13:21

Question:

Please bear with my newbie questions.

I was trying to convert PDF to PNG using Ghostscript, with ASP.NET and C#. However, I also read that Ghostscript is not thread safe. So my questions are:

  1. What exactly does "Ghostscript is not thread safe" mean in practical terms? What impact does it have if I use it in a live ASP.NET (aspx) web application with many concurrent users accessing it at the same time?

  2. I also read on another site that the major feature of Ghostscript ver. 8.63 is multithreaded rendering. Does this mean the thread safety issue is now resolved? Is Ghostscript thread safe now?

  3. I am also evaluating PDF2Image from PDFTron, which is supposed to be thread safe. But the per-CPU license doesn't come cheap. Is it worth paying the extra money for "thread safe" vs "not safe"?

Answer 1:

Given that a collection, for instance, is not thread safe:

var myDic = new Dictionary<string, string>();

In a multithreaded environment, this can throw:

string s = null;
if (!myDic.TryGetValue("keyName", out s)) {
    s = new string('#', 10);
    myDic.Add("keyName", s);
}

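While one thread is adding the KeyValuePair to the dictionary myDic, another may be calling TryGetValue(). A Dictionary supports multiple concurrent readers only while nobody is modifying it, so a read that overlaps a write can throw or corrupt the dictionary's internal state. Two threads can also both miss the key and both call Add(), in which case the second Add() throws an ArgumentException for the duplicate key.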

However, on the other hand, if you try this:

// Other threads will wait here until the variable myDic gets unlocked from the preceding thread that has locked it.
lock (myDic) {
    string s = null;
    if (!myDic.TryGetValue("keyName", out s)) {
        s = new string('#', 10);
        myDic.Add("keyName", s);
    }
} // The first thread that locked the myDic variable will now release the lock so that other threads will be able to work with the variable.

Now the second thread asking for the same "keyName" key will not have to add it to the dictionary, because the first thread has already added it by the time the second thread gets the lock.

So in short, thread safe means that an object can be used by multiple threads at the same time, either because concurrent use is harmless or because it does the necessary locking for you, so you don't have to worry about thread safety yourself.
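
As a minimal sketch of a collection that "locks appropriately for you", the same get-or-add logic can be written with ConcurrentDictionary from System.Collections.Concurrent, with no explicit lock. The class and method names (ConcurrentCacheDemo, GetOrCreate) are made up for this illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ConcurrentCacheDemo
{
    // ConcurrentDictionary synchronizes internally,
    // so callers do not need their own lock.
    private static readonly ConcurrentDictionary<string, string> Cache =
        new ConcurrentDictionary<string, string>();

    public static string GetOrCreate(string key)
    {
        // Under contention the factory may run more than once,
        // but only one value is stored and returned for a given key.
        return Cache.GetOrAdd(key, k => new string('#', 10));
    }

    public static int Count
    {
        get { return Cache.Count; }
    }

    public static void Main()
    {
        // Many threads ask for the same key at the same time.
        Parallel.For(0, 1000, i => GetOrCreate("keyName"));

        Console.WriteLine(Count);                   // 1
        Console.WriteLine(GetOrCreate("keyName"));  // ##########
    }
}
```

Note that GetOrAdd does not guarantee the factory runs exactly once, so it suits cheap, side-effect-free value creation like the one here.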

2. I don't think Ghostscript is thread safe now. It uses multiple threads internally to perform its tasks, which improves performance, that's all.

3. Depending on your budget and your requirements, it may be worth it. But if you build a wrapper around the library, you could lock() only where needed. And if your application never calls the library from more than one thread at a time, it is definitely not worth paying for thread safety: a library that is not thread safe only hurts you when YOUR application is multithreaded. Keep in mind that an ASP.NET application handles concurrent requests on different threads, so it is multithreaded even if you never create a thread yourself.
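
A wrapper of the kind mentioned in point 3 could look roughly like this. UnsafeRenderer is a hypothetical stand-in for any library object that is not thread safe (it is not a real Ghostscript or PDFTron type), and SerializedRenderer is the assumed wrapper that lets only one thread in at a time.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical library object that is NOT thread safe.
public sealed class UnsafeRenderer
{
    public int PagesRendered; // plain field, updated without synchronization

    public void Render(string path)
    {
        // Read-modify-write: not atomic, so concurrent calls can lose updates.
        PagesRendered = PagesRendered + 1;
    }
}

// Wrapper that serializes every call to the unsafe object.
public sealed class SerializedRenderer
{
    private readonly object _gate = new object();
    private readonly UnsafeRenderer _inner = new UnsafeRenderer();

    public void Render(string path)
    {
        lock (_gate) // only one thread at a time gets past this point
        {
            _inner.Render(path);
        }
    }

    public int PagesRendered
    {
        get { lock (_gate) { return _inner.PagesRendered; } }
    }
}

public static class WrapperDemo
{
    public static int Run(int calls)
    {
        var renderer = new SerializedRenderer();
        Parallel.For(0, calls, i => renderer.Render("page" + i + ".pdf"));
        return renderer.PagesRendered;
    }

    public static void Main()
    {
        Console.WriteLine(Run(1000)); // 1000
    }
}
```

The price of this approach is throughput: requests queue up behind the lock, so the wrapped library is used by one request at a time no matter how many cores the server has.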



Answer 2:

A precise technical definition that everyone agrees on is difficult to come up with.

Informally, "thread safe" simply means "is reasonably well-behaved when called from multiple threads". The object will not crash or produce crazy results when called from multiple threads.

The question you actually need to get answered if you intend to do multi-threaded programming involving a particular object is "what is the threading model expected by the object?"

There are a bunch of different threading models. For example, the "free threaded" model is "do whatever you want from any thread; the object will deal with it." That's the easiest model for you to deal with, and the hardest for the object provider to provide.

On the other end of the spectrum is the "single threaded" model -- all instances of all objects must be accessed from a single thread, period.

And then there's a bunch of stuff in the middle. The "apartment threaded" model is "you can create two instances on two different threads, but whatever thread you use to create an instance is the thread you must always use to call methods on that instance".

The "rental threaded" model is "you can call one instance on two different threads, but you are responsible for ensuring that no two threads are ever doing so at the same time".

And so on. Find out what threading model your object expects before you attempt to write threading code against it.
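
As an illustration of working with one of these models, here is a minimal sketch of satisfying an "apartment threaded" object by giving each thread its own instance through ThreadLocal<T>. ApartmentBound is a hypothetical class written for this example; it simply throws if it is called from a thread other than the one that created it.

```csharp
using System;
using System.Threading;

// Hypothetical object that must only be used on the thread that created it.
public sealed class ApartmentBound
{
    private readonly int _ownerThreadId = Environment.CurrentManagedThreadId;

    public void DoWork()
    {
        if (Environment.CurrentManagedThreadId != _ownerThreadId)
            throw new InvalidOperationException("Called from the wrong thread.");
    }
}

public static class ThreadingModelDemo
{
    // ThreadLocal runs the factory once per thread, on that thread,
    // so every thread creates and uses its own instance.
    private static readonly ThreadLocal<ApartmentBound> PerThread =
        new ThreadLocal<ApartmentBound>(() => new ApartmentBound());

    public static int Run(int threadCount)
    {
        int succeeded = 0;
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                PerThread.Value.DoWork(); // the calling thread's own instance
                Interlocked.Increment(ref succeeded);
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
        return succeeded;
    }

    public static void Main()
    {
        Console.WriteLine(Run(4)); // 4
    }
}
```

For a "rental threaded" object the instance could be shared instead, with the callers taking a lock around every use so that no two threads are inside it at once.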



Answer 3:

I am a Ghostscript developer, and won't repeat the general theory about thread safety.
We have been working on getting GS to be thread safe so that multiple 'instances' can be created using gsapi_new_instance from within a single process, but we have not yet completed this to our satisfaction (which includes our QA testing of this).
The graphics library is, however, thread safe and the multi-threaded rendering relies on this to allow us to spawn multiple threads to render bands from a display list in parallel. The multi-threaded rendering has been subjected to a lot of QA testing and is used by many commercial licensees to improve performance on multi-core CPUs.

You can bet we will announce when we finally support multiple instances of GS. Most people who want to use current GS from applications that need multiple instances spawn a separate process for each instance, so that GS doesn't need to be thread safe. GS can run a job as determined by the argument list options, or I/O can be piped to/from the process to provide data and collect output.
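
The process-per-job approach could be sketched from C# as follows. This is an assumption-laden example rather than an official recipe: the class and method names are invented, the executable name you pass in (for example gswin64c.exe on Windows or gs on Unix) depends on your installation, and the options shown are the common command-line switches for PNG output.

```csharp
using System;
using System.Diagnostics;

public static class GhostscriptProcess
{
    // Builds the argument list for rendering a PDF to PNG files.
    // outputPattern may contain %03d so that each page gets its own file.
    public static string BuildArguments(string inputPdf, string outputPattern, int dpi)
    {
        return "-dSAFER -dBATCH -dNOPAUSE -sDEVICE=png16m -r" + dpi +
               " -sOutputFile=\"" + outputPattern + "\" \"" + inputPdf + "\"";
    }

    // Runs one Ghostscript process for one job and returns its exit code.
    // Each call gets its own process, so jobs do not share any state.
    public static int Convert(string gsExecutable, string inputPdf,
                              string outputPattern, int dpi)
    {
        var info = new ProcessStartInfo
        {
            FileName = gsExecutable, // assumed to be installed and reachable
            Arguments = BuildArguments(inputPdf, outputPattern, dpi),
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (var process = Process.Start(info))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }

    public static void Main()
    {
        // Only prints the command line; nothing is launched here.
        Console.WriteLine(BuildArguments("in.pdf", "page-%03d.png", 150));
    }
}
```

A call such as GhostscriptProcess.Convert("gswin64c.exe", "in.pdf", "page-%03d.png", 150) would then run one conversion. On a busy web server it is worth capping how many such processes run at once, since each one uses its own memory and CPU.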



Answer 4:

1) It means that if you share the same Ghostscript objects or fields among multiple threads, it can crash. For example:

private GhostScript someGSObject = new GhostScript();
...
// Uh oh, 2 threads using shared memory. This can crash!
thread1.Use(someGSObject);
thread2.Use(someGSObject);

2) I don't think so - multithreaded rendering suggests GS is internally using multiple threads to render. It doesn't address the problem of GS being unsafe for use from multiple threads.

3) Is there a question in there?

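To use Ghostscript safely from multiple threads, make sure only one thread at a time is accessing it. You can do this via locks:

// someObject is a lock object shared by both threads.
// Code running on thread 1:
lock (someObject)
{
   thread1.Use(someGSObject);
}
// Code running on thread 2; it waits here while thread 1 holds the lock:
lock (someObject)
{
   thread2.Use(someGSObject);
}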


Answer 5:

If you are using Ghostscript from a shell object (i.e. running a command line to process the file), you will not be caught by threading problems, because every running instance will be in a different process on the server. Where you need to be careful is when you are using a DLL from C# to process the PDF: that code would need to be synchronized to keep two threads from executing the same code at the same time.



Answer 6:

  1. Thread safe basically means that a piece of code will function correctly even when accessed by multiple threads. Multiple problems can occur if you use non-thread-safe code in a threaded application. Deadlocks are one well-known problem. However, race conditions are more nefarious, because they show up only intermittently and threading issues are notoriously difficult to debug.

  2. No. Multithreaded rendering just means that GS will be able to render faster because it is using threads to render (in theory, anyway - not always true in practice).

  3. That really depends on what you want to use your renderer for. If you are going to be accessing your application with multiple threads, then, yes, you'll need to worry about it being thread safe. Otherwise, it's not a big deal.
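
To make the race condition from point 1 concrete, here is a small sketch (all names invented for the example) that counts in parallel twice: once with a plain increment, which can lose updates, and once with Interlocked.Increment, which cannot.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class RaceDemo
{
    // count++ is a read-modify-write sequence, so two threads can read the
    // same old value and one of the increments is lost.
    public static int UnsafeCount(int iterations)
    {
        int count = 0;
        Parallel.For(0, iterations, i => { count++; });
        return count;
    }

    // Interlocked.Increment performs the whole update atomically.
    public static int SafeCount(int iterations)
    {
        int count = 0;
        Parallel.For(0, iterations, i => { Interlocked.Increment(ref count); });
        return count;
    }

    public static void Main()
    {
        // The unsafe result varies from run to run and may be below 1000000.
        Console.WriteLine("unsafe: " + UnsafeCount(1000000));
        Console.WriteLine("safe:   " + SafeCount(1000000)); // safe:   1000000
    }
}
```

The unsafe version may even print the right number on some runs, which is exactly why this kind of bug is hard to reproduce and debug.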



Answer 7:

In general it is an ambiguous term.

Thread safety can be at the conceptual level, where you have correct synchronization of your shared data. This is usually what library writers mean.

Sometimes it means that concurrency is defined at the language level, i.e. the memory model of the language supports concurrency. This is the tricky case: if the language gives no guarantees for the essential primitives, a library writer cannot produce correct concurrent libraries on top of it. This concerns compiler writers more than library users. C# is thread safe in that sense.

I know I didn't answer your question directly, but hope that helps.