In Fortran 90, what is a good way to write an arra

2019-01-23 06:58发布

问题:

I am new to Fortran, and I would like to be able to write a two-dimensional array to a text file, in a row-wise manner (spaces between columns, and each row on its own line). I have tried the following, and it seems to work in the following simple example:

PROGRAM test3
  IMPLICIT NONE

  INTEGER :: i, j, k, numrows, numcols
  INTEGER, DIMENSION(:,:), ALLOCATABLE :: a

  numrows=5001
  numcols=762
  ALLOCATE(a(numrows,numcols))
  k=1
  DO i=1,SIZE(a,1)
    DO j=1,SIZE(a,2)
      a(i,j)=k
      k=k+1
    END DO
  END DO

  OPEN(UNIT=12, FILE="aoutput.txt", ACTION="write", STATUS="replace")
  DO i=1,numrows
    WRITE(12,*) (a(i,j), j=1,numcols)
  END DO
END PROGRAM test3

As I said, this seems to work fine in this simple example: the resulting text file, aoutput.txt, contains the numbers 1-762 on line 1, numbers 763-1524 on line 2, and so on.

But, when I use the above ideas (i.e., the last fifth-to-last, fourth-to-last, third-to-last, and second-to-last lines of code above) in a more complicated program, I run into trouble; each row is delimited (by a new line) only intermittently, it seems. (I have not posted, and probably will not post, here my entire complicated program/script--because it is rather long.) The lack of consistent row delimiters in my complicated program/script probably suggests another bug in my code, not with the four-line write-to-file routine above, since the above simple example appears to work okay. Still, I am wondering, can you please help me think if there is a better row-wise write-to-text file routine that I should be using?

Thank you very much for your time. I really appreciate it.

回答1:

There's a few issues here.

The fundamental one is that you shouldn't use text as a data format for sizable chunks of data. It's big and it's slow. Text output is good for something you're going to read yourself; you aren't going to sit down with a printout of 3.81 million integers and flip through them. As the code below demonstrates, the correct text output is about 10x slower, and 50% bigger, than the binary output. If you move to floating point values, there are precision loss issues with using ascii strings as a data interchange format. etc.

If your aim is to interchange data with matlab, it's fairly easy to write the data into a format matlab can read; you can use the matOpen/matPutVariable API from matlab, or just write it out as an HDF5 array that matlab can read. Or you can just write out the array in raw Fortran binary as below and have matlab read it.

If you must use ascii to write out huge arrays (which, as mentioned, is a bad and slow idea) then you're running into problems with default record lengths in list-drected IO. Best is to generate at runtime a format string which correctly describes your output, and safest on top of this for such large (~5000 character wide!) lines is to set the record length explicitly to something larger than what you'll be printing out so that the fortran IO library doesn't helpfully break up the lines for you.

In the code below,

  WRITE(rowfmt,'(A,I4,A)') '(',numcols,'(1X,I6))'

generates the string rowfmt which in this case would be (762(1X,I6)) which is the format you'll use for printing out, and the RECL option to OPEN sets the record length to be something bigger than 7*numcols + 1.

PROGRAM test3
  IMPLICIT NONE

  INTEGER :: i, j, k, numrows, numcols
  INTEGER, DIMENSION(:,:), ALLOCATABLE :: a
  CHARACTER(LEN=30) :: rowfmt
  INTEGER :: txtclock, binclock
  REAL    :: txttime, bintime

  numrows=5001
  numcols=762
  ALLOCATE(a(numrows,numcols))
  k=1
  DO i=1,SIZE(a,1)
    DO j=1,SIZE(a,2)
      a(i,j)=k
      k=k+1
    END DO
  END DO

  CALL tick(txtclock)
  WRITE(rowfmt,'(A,I4,A)') '(',numcols,'(1X,I6))'
  OPEN(UNIT=12, FILE="aoutput.txt", ACTION="write", STATUS="replace", &
       RECL=(7*numcols+10))
  DO i=1,numrows
    WRITE(12,FMT=rowfmt) (a(i,j), j=1,numcols)
  END DO
  CLOSE(UNIT=12)
  txttime = tock(txtclock)

  CALL tick(binclock)
  OPEN(UNIT=13, FILE="boutput.dat", ACTION="write", STATUS="replace", &
       FORM="unformatted")
  WRITE(13) a
  CLOSE(UNIT=13)
  bintime = tock(binclock)

  PRINT *, 'ASCII  time = ', txttime
  PRINT *, 'Binary time = ', bintime

CONTAINS

    SUBROUTINE tick(t)
        INTEGER, INTENT(OUT) :: t

        CALL system_clock(t)
    END SUBROUTINE tick

    ! returns time in seconds from now to time described by t
    REAL FUNCTION tock(t)
        INTEGER, INTENT(IN) :: t
        INTEGER :: now, clock_rate

        call system_clock(now,clock_rate)

        tock = real(now - t)/real(clock_rate)
    END FUNCTION tock
END PROGRAM test3


回答2:

This may be a very roundabout and time-consuming way of doing it, but anyway... You could simply print each array element separately, using advance='no' (to suppress insertion of a newline character after what was being printed) in your write statement. Once you're done with a line you use a 'normal' write statement to get the newline character, and start again on the next line. Here's a small example:

program testing

implicit none

integer :: i, j, k

k = 1

do i=1,4
   do j=1,10
      write(*, '(I2,X)', advance='no') k
      k = k + 1
   end do
   write(*, *) ''  ! this gives you the line break
end do

end program testing

When you run this program the output is as follows:

 1  2  3  4  5  6  7  8  9 10  
11 12 13 14 15 16 17 18 19 20  
21 22 23 24 25 26 27 28 29 30  
31 32 33 34 35 36 37 38 39 40


回答3:

Using an "*" is list-directed IO -- Fortran will make the decisions for you. Some behaviors aren't specified. You could gain more control using a format statement. If you wanted to positively identify row boundaries you write a marker symbol after each row. Something like:

  DO i=1,numrows
    WRITE(12,*) a(i,:)
    write (12, '("X")' )
  END DO

Addendum several hours later:

Perhaps with large values of numcols the lines are too long for some programs that are you using to examine the file? For the output statement, try:

WRITE(12, '( 10(2X, I11) )' ) a(i,:)

which will break each row of the matrix, if it has more than 10 columns, into multiple, shorter lines in the file.