The benchmark results on the Julia home page (http://julialang.org/) show that Fortran is about 4x slower than Julia/NumPy in the "rand_mat_mul" benchmark.
I don't understand why Fortran would be slower when all three call the same Fortran library (BLAS).
I also ran a simple matrix multiplication test involving Fortran, Julia and NumPy and got similar results:
Julia
n = 1000; A = rand(n,n); B = rand(n,n);
@time C = A*B;
>> elapsed time: 0.069577896 seconds (7 MB allocated)
Numpy in IPython
from numpy import *
n = 1000; A = random.rand(n,n); B = random.rand(n,n);
%time C = dot(A,B);
>> Wall time: 98 ms
Fortran
PROGRAM TEST
    IMPLICIT NONE
    INTEGER, PARAMETER :: N = 1000
    INTEGER :: I,J
    REAL*8 :: T0,T1
    REAL*8 :: A(N,N), B(N,N), C(N,N)

    CALL RANDOM_SEED()
    DO I = 1, N, 1
        DO J = 1, N, 1
            CALL RANDOM_NUMBER(A(I,J))
            CALL RANDOM_NUMBER(B(I,J))
        END DO
    END DO

    call cpu_time(t0)
    CALL DGEMM ( "N", "N", N, N, N, 1.D0, A, N, B, N, 0.D0, C, N )
    call cpu_time(t1)

    write(unit=*, fmt="(a24,f10.3,a1)") "Time for Multiplication:",t1-t0,"s"
END PROGRAM TEST
gfortran test_blas.f90 libopenblas.dll -O3 & a.exe
>> Time for Multiplication: 0.296s
I have since changed the timing function to system_clock() (running the multiplication five times in one program). The result is now roughly the same as NumPy, but still about 20% slower than Julia.
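A likely reason the timer matters, assuming OpenBLAS runs DGEMM on several threads: cpu_time reports processor time for the whole process, which adds up across threads, while system_clock reports elapsed wall time, which is what Julia's @time and IPython's %time show. The sketch below illustrates the two kinds of clock in Python using only the standard library. The helper name time_both and the repeats parameter are made up for this illustration and are not part of any library.

```python
import time

def time_both(fn, repeats=5):
    """Call fn() `repeats` times; return (best wall seconds, best CPU seconds).

    Hypothetical helper for illustration only.
    """
    wall, cpu = [], []
    for _ in range(repeats):
        w0 = time.perf_counter()   # elapsed wall-clock time
        c0 = time.process_time()   # CPU time of the whole process (all threads)
        fn()
        w1 = time.perf_counter()
        c1 = time.process_time()
        wall.append(w1 - w0)
        cpu.append(c1 - c0)
    return min(wall), min(cpu)

if __name__ == "__main__":
    # Stand-in workload; replace with e.g. lambda: dot(A, B) from the NumPy test.
    wall, cpu = time_both(lambda: sum(i * i for i in range(200_000)))
    print(f"wall: {wall:.4f} s, cpu: {cpu:.4f} s")
```

Passing lambda: dot(A, B) from the NumPy test above as fn would show whether the CPU time exceeds the wall time on a given machine. If it does, the BLAS call is running on several threads and the two clocks are not comparable.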