I've recently come across Chapel and I'm very keen to try it out. I have a two-fold problem I'm hoping it can solve.
I typically work in Python or C++. Java when backed into a corner.
I have two matrices I and V. Both are sparse, of dimension about 600K x 600K, and populated at about 1% density.
First, using SciPy, I can load both from a SQL database into memory at the moment. However, I expect our next iteration (perhaps 1.5M x 1.5M) will simply be too large for our machines. In a case like that, RDDs from Spark may work for the load; I wasn't able to get PyTables to make this happen. I understand this is described as an "out-of-core" problem.
Even if they do get loaded, computing I'IV (where I' is the transpose of I) goes OOM within minutes, so I'm looking into distributing this multiplication over multiple cores (which SciPy can do) and multiple machines (which it cannot, so far as I know). Here Spark falls down, but Chapel appears to answer my prayers, so to speak.
A serious limitation is budget on machines. I can't afford a Cray, for instance. Does the Chapel community have a pattern for this?
Starting with a few high-level points:

- Chapel programs run on commodity clusters and networked workstations, not just Crays.
- Distributed dense arrays are well supported; distributed sparse arrays are much newer and less optimized.
- Linear algebra libraries are a work in progress, and there is no SQL interface yet.

In more detail:
The following program creates a Block-distributed dense array:
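Here's a minimal sketch using the Chapel 1.15-era Block distribution syntax; the problem size n and the values assigned are purely illustrative:

    use BlockDist;

    config const n = 10;

    // a 2D domain distributed across the locales (nodes) with the Block distribution
    const D = {1..n, 1..n} dmapped Block(boundingBox={1..n, 1..n});

    // a dense array declared over the distributed domain
    var A: [D] real;

    // owner-computes: each locale fills in the elements it owns,
    // using its cores to parallelize the local work
    forall a in A do
      a = here.id;

    writeln(A);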
For example, when run on 6 nodes (./myProgram -nl 6), the output shows each element set to the ID of the locale that owns it, making the block decomposition across the 6 locales visible. Note that running a Chapel program on multiple nodes requires configuring it to use multiple locales. Such programs can be run on clusters or networked workstations in addition to Crays.
Here's a program that declares a distributed sparse array:
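Again a sketch; the specific indices added to the sparse domain are arbitrary:

    use BlockDist;

    config const n = 10;

    // the same Block-distributed dense domain as before
    const D = {1..n, 1..n} dmapped Block(boundingBox={1..n, 1..n});

    // a sparse subdomain of D: its indices (the nonzeroes) are distributed
    // across the locales according to D's distribution
    var SD: sparse subdomain(D);

    // add a handful of arbitrary nonzero indices
    SD += (1, 1);
    SD += (n/4, n/2);
    SD += (3*n/4, 3*n/4);
    SD += (n, n);

    // a sparse array storing values only for the indices in SD
    var A: [SD] real;

    // owner-computes over the distributed sparse indices
    forall (i, j) in SD do
      A[i, j] = here.id;

    writeln(A);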
Running on six locales similarly shows each stored nonzero set to the ID of the locale that owns it.
In both the examples above, the forall loops will compute on the distributed arrays / indices using multiple nodes in an owner-computes fashion, and using the multiple cores per node to do the local work.
Now for some caveats:
Distributed sparse array support is still in its infancy as of Chapel 1.15.0, as most of the project's distributed-memory focus to date has been on task parallelism and distributed dense arrays. A paper and talk from Berkeley at this year's annual Chapel workshop, "Towards a GraphBLAS Library in Chapel", highlighted several performance and scalability issues, some of which have since been fixed on the master branch while others still require attention. Feedback and interest from users in such features is the best way to accelerate improvements in these areas.
As mentioned at the outset, Linear Algebra libraries are a work-in-progress for Chapel. Past releases have added Chapel modules for BLAS and LAPACK. Chapel 1.15 included the start of a higher-level LinearAlgebra library. But none of these support distributed arrays at present (BLAS and LAPACK by design, LinearAlgebra because it's still early days).
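For local (single-locale) data, usage looks roughly like the following sketch; the values are purely illustrative, and depending on your Chapel configuration, compiling with the LinearAlgebra module may require a BLAS installation:

    use LinearAlgebra;

    // a plain (non-distributed) 3 x 2 matrix with arbitrary values
    var A: [1..3, 1..2] real;
    forall (i, j) in A.domain do
      A[i, j] = i + 2*j;

    // A' * A computed on a single locale, yielding a 2 x 2 result
    var B = dot(A.T, A);
    writeln(B);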
Chapel does not have an SQL interface (yet), though a few community members have made rumblings about adding such support. It may also be possible to use Chapel's I/O features to read the data in some textual or binary format. Or, you could potentially use Chapel's interoperability features to interface with a C library that could read the SQL.
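As a rough sketch of the I/O route, using the 1.15-era IO module (the file name and the "row col value" text format are assumptions about your export, not anything Chapel-specific):

    use IO;

    // path to a text file of "row col value" triples exported from the database
    config const infile = "matrix.txt";

    var f = open(infile, iomode.r);
    var r = f.reader();

    var i, j: int;
    var v: real;
    while r.read(i, j, v) {
      // here you would add (i, j) to a sparse domain and store v,
      // as in the sparse-array example above
      writeln((i, j, v));
    }
    r.close();
    f.close();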