Sparse Storage in Fortran: Only Reading and Writin

2019-08-27 21:35发布

问题:

I have an array with multiple dimensions (the goal is to allow for about 100) and each dimension has a size of about 2^10 and I only need to store in it about 1000 double precision coefficients. I don't need to do any operation with this array aside from reading and writing into it. The code is written in Fortran 90.

I assume that if I a library like one of the ones mentioned in this answer I would be able to store the do this, but would this be optimized for the simple reading and writing operations? Is there a library that would be most efficient for that purpose?

Edit: By "simple reading and writing operations" I mean the following. Suppose

REAL(8), DIMENSION(1000)   :: coeff1
INTEGER, DIMENSION(1000,5) :: index

I want to define coeff2 to store the values in coeff1 and then read itat the indices in index, that is

DO i = 1,1000
     index(i,:) = [something]
     coeff1(i)  = [another something]
     coeff2(index(i,1),index(i,2),index(i,3),index(i,4),index(i,5)) = coeff1(i)
ENDDO

Then, for any i I would like to access the value of

coeff2(index(i,1),index(i,2),index(i,3),index(i,4),index(i,5))

as quickly as possible. Being able to do this fast is what I mean by "efficient".

Since the indices in [something] are at most 2^10 I am currently defining coeff2 as follows:

REAL(8), DIMENSION(2**10,2**10,2**10,2**10,2**10) :: coeff2

but this is too wasteful of memory specially since I need to increase the number of dimensions, now 5, to the order of 100 and most elements of this array are equal to 0. So, another measure of efficiency that is relevant to me is that the memory necessary to store coeff2 should not explode as I increase the number of dimensions.

回答1:

Well, It's still not totally clear to me the nature of your data and the way you want to use it.

If what you need is indexed data, whose indices are not consecutive, Sparse matrix can be an answer, and there are many solutions already implemented over the internet (as shown in the link you provided). But maybe it would be overkill for what I think you are trying to do. Maybe a simple datatype could serve your purpose, like this:

program indexed_values
  implicit none

  type :: indexed
    integer :: index
    real(8) :: value
  end type

  integer, parameter :: n_coeffs = 1000
  integer, parameter :: n_indices = 5
  integer :: i

  real(8), dimension(n_coeffs) :: coeff1
  integer, dimension(n_coeffs, n_indices) :: index
  type(indexed), dimension(n_coeffs, n_indices) :: coeff2
  type(indexed) :: var

  do i = 1, n_coeffs
    index(i, :) = [1, 2, 4, 16, 32] * i ! your calc here
    coeff1(i) = real(i * 3, 8) ! more calc here
    coeff2(i, :)%index = index(i, :)
    coeff2(i, :)%value = coeff1(i)
  end do

  ! that's how you fetch the indices and values by stored position
  var = coeff2(500, 2)
  print*, var%index, var%value ! outputs: 1000   1500.0

  ! that's how you fetch a value by its index
  print*, fetch_by_index(coeff2(500, :), 1000) ! outputs: 1500.0

contains

  real(8) function fetch_by_index(indexed_pairs, index)
    type(indexed), dimension(:) :: indexed_pairs
    integer, intent(in) :: index
    integer :: i

    do i=1, size(indexed_pairs)
      if(index == indexed_pairs(i)%index) then
        fetch_by_index = indexed_pairs(i)%value
        return
      end if
    end do
    stop "No value stored for this index"
  end
end

The provided function for fetching values by its indices could be improved if your indices will be alwyas stored in ascending order (no need to traverse the whole list to fail). Moreover, if you will assing a constant result of coeff1 to all the indices at each row, you could do even better and just not having a coeff2 array at all, just have coeff1 for values and index for the indices, and correlate them by position.