How many CPUs are needed before Erlang is faster t

Posted 2019-03-25 17:59

I am currently using Java, I've read a lot about Erlang on the net, and I have 2 big questions:

  1. How much slower (if any) will Erlang be over simple Java?
    I'm assuming here that Java is going to be faster from the shootout benchmarks on the net (Erlang doesn't do that well). So, how many more CPUs am I going to need to make Erlang shine over single-threaded Java (in my particular situation, given below)?

  2. After reading around about Erlang for a while I've hit on a number of comments/posts that say that most large Erlang systems contain a good amount of C/C++.
    Is this for speed reasons (my assumption) or something else? i.e. Why is this required?

I have read about the number of processors in most machines going up and threading models being hard (I agree) but I am looking to find out when the "line" is going to be crossed so that I can change language/paradigm at the right time.

A bit of background/context:
I am working server-side on Java services which are very CPU-bound and easily made parallel. This is due to, typically, a single incoming update (via TCP) triggering a change to multiple (100s of) outputs.

The calculations are typically pretty simple (few loops, just lots of arithmetic) and the inputs are coming in pretty fast (100/s).

Currently we are running on 4 CPU machines and running multiple services on each (so multi-threading is pretty pointless and Java seems to run faster without the sync blocks, etc required to make it multi-threaded). There is now a strong push for speed and we now have access to 24 processor machines (per process if required) so I am wondering how best to proceed - massively multi-threaded Java or something easier to code, like Erlang.

5 answers
看我几分像从前
#2 · 2019-03-25 18:24

It depends on several factors. The quick answer is that you will need to benchmark each different program to find where the break-even point is.

Here are some of the relevant aspects that could impact that benefit ratio:

1) Computational dependencies: whether the logic flow has many dependencies on external resources (DBMS, disk access, networking). The more of those dependencies can be split up and handled concurrently, the higher the benefit of adopting a distributed computation platform such as Erlang.

2) Logical flow atomicity: whether your program has to spend a large amount of computation time in a single sequential, synchronous control flow that cannot be broken down into smaller logical segments of code. The larger the atomic portion of your code, the less it can be split into flows that spread across CPUs.

3) State sharing overhead: the larger the amount of data that has to be distributed across various functions, the higher the overhead the framework incurs simply to transmit and receive the state. In other words, if you are sending large amounts of data repeatedly without a common shared cache area, the benefits will decrease, although this can be approached differently depending on the programming patterns adopted.

Therefore, given the vast possibilities and variations based on criteria such as the above, it is not possible to have a common estimate that is acceptable to all scenarios.
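Point 2 above can be made a little more concrete with Amdahl's law. The answer does not name it, so treat this as an illustration rather than the answerer's method: if a fraction p of the work can be spread over n cores, the best possible speedup is 1 / ((1 - p) + p / n). The class name, method name, and the example value p = 0.95 are assumptions chosen for the sketch; only a benchmark can tell you the real p for your service.

```java
// Hypothetical helper, not from the original answer: an Amdahl's law estimate.
final class AmdahlSketch {
    /** Upper bound on speedup when a fraction p of the work can be spread over n cores. */
    static double speedup(double parallelFraction, int cores) {
        if (parallelFraction < 0.0 || parallelFraction > 1.0 || cores < 1) {
            throw new IllegalArgumentException("need 0 <= p <= 1 and cores >= 1");
        }
        // The serial part (1 - p) is unaffected; only the parallel part shrinks with more cores.
        return 1.0 / ((1.0 - parallelFraction) + parallelFraction / cores);
    }

    public static void main(String[] args) {
        // p = 0.95 is an assumed value for illustration, not a measurement.
        for (int cores : new int[] {4, 24}) {
            System.out.printf("p=0.95, cores=%d -> at most %.2fx%n", cores, speedup(0.95, cores));
        }
    }
}
```

The bound applies to any language. A slower per-core runtime has to make up its constant-factor deficit within this limit, which is why the serial fraction matters more than the raw core count.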

老娘就宠你
#3 · 2019-03-25 18:25

The question of speed when it comes to programming languages is as complex as a question can get. Java advocates can point to a lot of areas and claim to be fastest, and they would be 100% correct. Ruby/Python advocates point to a different set of parameters and claim to be faster, and they would also be correct. Erlang advocates then point to concurrent connections and claim to be fastest when dealing with hundreds or thousands of concurrent connections or calculations, and they would not be wrong either.

Looking at the basic description of the project in question it seems to me that Erlang would be a perfect fit for your needs. Not knowing the details I would say that this would actually be a pretty darn simple Erlang program and could be done in a very short time indeed.

Bombasti
#4 · 2019-03-25 18:28

Have you compared the cost of new hardware versus the cost of retraining staff in Erlang and re-architecting your software in a new language?

I wouldn't underestimate the expense of retraining yourself (or others) and the cost of hiring people conversant in Erlang (who are going to be a lot harder to find than Java people). Servers obviously cost in terms of their storage costs / power / maintenance etc., but they're still a lot cheaper than qualified staff. If you can make progress and remain scalable whilst using your current skillsets, I suspect that's the most pragmatic approach.

做个烂人
#5 · 2019-03-25 18:31

If you get 100 per second but they take 100s each how can it possibly keep up? Maybe I am misreading that part, but anyway unless it's thousands or millions of requests a second your synchronization code should not be taking long. If it is, you are doing something wrong, possibly locking while you execute the whole job or something.

For multithreaded code, going to an even higher-level language is probably a mistake. Even if you write the application part in Erlang or whatever, the multithreading should probably stay in Java, or move to C++ if performance really becomes an issue.
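The point about synchronization can be sketched in Java as follows. This is a minimal sketch under stated assumptions, not code from the thread: `FanOutSketch`, `recompute`, and `applyUpdate` are invented names, and the arithmetic is a placeholder for the real per-output calculation. Each task works only on its own inputs and returns a value, so no lock is held while the job runs; the results are collected once all tasks have finished.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: names and arithmetic are placeholders, not from the thread.
final class FanOutSketch {
    /** Stand-in for the simple arithmetic done for one output. */
    static double recompute(double update, int outputIndex) {
        return update * (outputIndex + 1) + 0.5;
    }

    /** Recomputes every output for one update; tasks share no mutable state, so no locks are needed. */
    static double[] applyUpdate(ExecutorService pool, double update, int outputs) throws Exception {
        List<Callable<Double>> tasks = new ArrayList<>();
        for (int i = 0; i < outputs; i++) {
            final int index = i;
            tasks.add(() -> recompute(update, index));
        }
        // invokeAll blocks until every task is done and returns futures in task order.
        List<Future<Double>> futures = pool.invokeAll(tasks);
        double[] results = new double[outputs];
        for (int i = 0; i < outputs; i++) {
            results[i] = futures.get(i).get();
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            double[] out = applyUpdate(pool, 2.0, 300);
            System.out.println("outputs recomputed: " + out.length);
        } finally {
            pool.shutdown();
        }
    }
}
```

With very small tasks the per-task overhead can dominate, so in practice the outputs would likely be grouped into a few chunks per core; that grouping is left out to keep the sketch short.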

何必那么认真
#6 · 2019-03-25 18:41

Since this is an arithmetic-heavy workload and you have already split the code into separate service processes, you wouldn't gain much from Erlang. Your job seems to fit Java comfortably. Erlang is good at tiny transactions, such as message switching or serving static or simple dynamic web pages. It is not innately good at enterprise number-crunching or database workloads.

However, you could build on external numerical libraries and databases and use Erlang as a message switch :D that's what CouchDB does :P

-- edit --

  1. If you move your arithmetic operations into an Erlang async-IO driver, Erlang will be just as good as the language shoot-out stuff, but with 24 CPUs perhaps it won't matter that much. The Erlang database is procedural and therefore quite fast; this can be exploited in your application, which updates 100 entities on each transaction.

  2. The Erlang runtime system needs to be a mix of C and C++ because (a) the Erlang emulator is written in C/C++ (you have to start somewhere), (b) you have to talk to the kernel to do async file IO and network IO, and (c) certain parts of the system need to be blisteringly fast, e.g., the backend of the database system (Mnesia).

-- discussion --

With 24 CPUs in a 6 core * 4 CPU topology using a shared memory bus, you have 4 NUMA entities (the CPUs) and one central memory. You need to be wise about the paradigm: the shared-nothing multi-process approach might kill your memory bus.

To get around this you need to create 4 processes with 6 processing threads each and bind each processing thread to the corresponding core in the corresponding CPU. These 6 threads need to do collaborative multi-threading. Erlang and Lua have this innately; Erlang does it in a hard-core way, as it has a full-blown scheduler as part of its runtime which it can use to create as many processes as you want.

Now, if you were to partition your tasks across the 4 processes (1 per physical CPU) you would be a happy man; however, you are running 4 Java VMs doing (presumably) serious work (yuck, for many reasons). The problem needs to be solved with a better ability to slice and dice the problem.
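The partitioning idea above can be sketched in Java, with heavy caveats. This is an assumption-laden illustration, not the answerer's design: `PartitionedPools` and its methods are invented names, and it uses 4 thread pools of 6 threads inside one JVM rather than 4 separate processes. The Java standard library does not expose CPU affinity, so this only shows how work could be routed to a fixed partition per entity; actually binding threads to cores would have to happen outside this code, for example with operating-system tools.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: the 4 x 6 layout mirrors the topology described above; all names are invented.
final class PartitionedPools {
    private final ExecutorService[] pools;

    PartitionedPools(int partitions, int threadsPerPartition) {
        pools = new ExecutorService[partitions];
        for (int i = 0; i < partitions; i++) {
            pools[i] = Executors.newFixedThreadPool(threadsPerPartition);
        }
    }

    /** Maps an entity key to a partition so that its work always lands in the same pool. */
    int partitionFor(Object key) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), pools.length);
    }

    /** Runs the work on the pool that owns the given key. */
    void submit(Object key, Runnable work) {
        pools[partitionFor(key)].execute(work);
    }

    void shutdown() {
        for (ExecutorService pool : pools) {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        PartitionedPools pools = new PartitionedPools(4, 6);
        try {
            for (int entityId = 0; entityId < 8; entityId++) {
                final int id = entityId;
                pools.submit(id, () -> System.out.println(
                        "entity " + id + " handled by partition " + pools.partitionFor(id)));
            }
        } finally {
            pools.shutdown();
        }
    }
}
```

Keeping each entity in one partition means its state is only touched by that partition's threads, which is the property the answer is after when it talks about keeping traffic off the shared bus.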

In comes the Erlang OTP system. It was designed for redundant, robust networked systems, but it is now moving towards same-machine NUMA-esque CPUs. It already has a kick-ass SMP emulator, and it will become NUMA-aware soon as well. With this paradigm of programming you have a much better chance of saturating your powerful servers without killing your bus.

Perhaps this discussion has been theoretical; however, when you get an 8x8 or 16x8 topology you will be ready for it as well. So my answer is: when you have more than 2 modern physical CPUs on your mainboard, you should probably consider a better programming paradigm.

As an example of a major product following the discussion here: Microsoft's SQL Server is CPU-Level NUMA-aware in the SQL-OS layer on which the database engine is built.
