Maximum Array of Strings VisualBasic WSH

Posted 2019-02-19 09:14

Question:

I'm writing a WSH script in VBScript to read a massive directory listing. The listing is produced by a directory command whose output is redirected, launched via the .Run method.

The directory listing is about 8400 lines, but every time I run the script, the following loop

Do Until DirList.AtEndOfStream
    Redim Preserve arrData(i)
    arrData(i) = DirList.ReadLine
    i = i + 1
Loop

cuts out early, in a seemingly random range of 1800 to 3500 lines. Does this sound like an array size issue or a shell memory limit?

I have heard of people parsing LARGE log files, reading them all in at once like I have.

Answer 1:

Would it not be better, in this instance, to read through the file once to count the lines, then Redim the array to the exact size required? You would then close the file, open it again, and this time assign each line to an array element.
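The two-pass idea in this answer could be sketched roughly as follows. This is only an illustration: the function name ReadLinesTwoPass and its path argument are hypothetical, and it assumes the listing is a text file that is no longer being written to when the script reads it.

```vbscript
Option Explicit

' Hypothetical helper (not from the original post): reads every line of a
' text file into an array that is sized exactly once.
Function ReadLinesTwoPass(path)
    Const ForReading = 1
    Dim fso, stream, lineCount, lines(), i

    Set fso = CreateObject("Scripting.FileSystemObject")

    ' Pass 1: count the lines without storing them.
    lineCount = 0
    Set stream = fso.OpenTextFile(path, ForReading)
    Do Until stream.AtEndOfStream
        stream.SkipLine
        lineCount = lineCount + 1
    Loop
    stream.Close

    ' An empty file yields an empty array (UBound = -1).
    If lineCount = 0 Then
        ReadLinesTwoPass = Array()
        Exit Function
    End If

    ' Allocate the array once, at its final size.
    ReDim lines(lineCount - 1)

    ' Pass 2: reopen the file and store each line.
    ' The extra bound check guards against the file growing between passes.
    i = 0
    Set stream = fso.OpenTextFile(path, ForReading)
    Do Until stream.AtEndOfStream Or i >= lineCount
        lines(i) = stream.ReadLine
        i = i + 1
    Loop
    stream.Close

    ReadLinesTwoPass = lines
End Function
```

The trade-off is that the file is read twice, in exchange for never resizing the array.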



Answer 2:

The size of arrays in VBScript is limited by a few different things, whichever comes first:

  • A maximum of (2 ^ 31) - 1 elements (because the number of elements is stored internally as a Long value, and because there is no larger data type available to use as an indexer).

  • A maximum of 60 dimensions.

  • Available system memory.

Most of these limits are of purely theoretical interest, however, and I'd be very suspicious of any code that had to be concerned with them.

Because you say that you only have 8400 lines to process, I doubt you're running into the theoretical limits placed on the size of an array. Instead, the biggest problem with your code is that you're using Redim Preserve inside of a loop.

The MSDN reference explains that Redim Preserve is used to resize the last dimension of the array dynamically, while preserving its existing contents. What it doesn't necessarily mention is how it works. Each time you use Redim Preserve, a new array is created with the number of elements that you specify, and the values of the elements in the previous array are copied into the new one. This should be sending up red flags immediately, because it means that on each iteration of your loop, you're allocating space for and filling an entirely new array. The problem only gets worse the more iterations you've made in the loop, because the size of each new array that is created is growing incrementally larger.
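To put rough numbers on that growth, here is a small sketch that counts how many element copies each strategy would perform. It assumes, as the paragraph above describes, that every Redim Preserve copies all existing elements into a new array. Both function names are made up for illustration.

```vbscript
Option Explicit

' Hypothetical model: the array grows by one element per line, so resizing
' from `size` to `size + 1` elements copies `size` existing elements.
Function CopiesGrowingByOne(lineCount)
    Dim copies, size
    copies = 0
    For size = 1 To lineCount - 1
        copies = copies + size
    Next
    CopiesGrowingByOne = copies
End Function

' Hypothetical model: the array starts at `initialCapacity` elements and
' doubles whenever it is full, copying every element it holds at that point.
Function CopiesDoubling(lineCount, initialCapacity)
    Dim copies, capacity
    copies = 0
    capacity = initialCapacity
    Do While capacity < lineCount
        copies = copies + capacity
        capacity = capacity * 2
    Loop
    CopiesDoubling = copies
End Function
```

Under this model, the 8400 lines from the question give CopiesGrowingByOne(8400) = 35275800, while CopiesDoubling(8400, 1) = 16383 and CopiesDoubling(8400, 10000) = 0.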

Thus, it's more likely that you're overflowing the stack space that VBScript allocates for local variables. (How appropriate—a stack overflow error?) Eventually those unused arrays will be garbage collected, but you're putting a giant amount of pressure on memory and resources when you do this in a tight loop.

You are far better off simply allocating enough space in your array up front to hold the whole directory listing. You don't have to get the size exactly right: allocating more than you need is still far cheaper than continually creating and destroying arrays. If you're concerned that the initial guess won't be enough, check inside the loop whether the current index exceeds the array's upper bound and, if so, allocate a lot more space (say, by doubling the current size). Once you're finished, you can release the excess space while retaining the good data with Redim Preserve, but this time only once! For a rough knock-up example:

Dim MyData()                           ' declare a dynamic array
Dim MaxSize                            ' current upper bound of the array
MaxSize = 10000                        ' guess an initial size (VBScript cannot assign in a Dim statement)
Redim MyData(MaxSize)                  ' allocate an array of the initial size

Dim Size                               ' number of lines stored so far
Size = 0
Do Until DirList.AtEndOfStream
    If Size > MaxSize Then
        ' The array is full, so allocate more space
        MaxSize = MaxSize * 2          ' determine a new size (doubling is a good guess)
        Redim Preserve MyData(MaxSize) ' add more space in the array
    End If

    MyData(Size) = DirList.ReadLine    ' store this line at the next free index
    Size = Size + 1                    ' advance to the next free index
Loop

If Size > 0 Then
    Redim Preserve MyData(Size - 1)    ' deallocate the extra space, keeping only the lines read
End If

This just might save your memory from the abuse of the constant allocation and deallocation of arrays.



Answer 3:

Using ReDim Preserve inside of a loop is a fine practice. It's the only method of dynamically resizing an array in VBScript, and Microsoft's own sample code does it all the time. I've personally done this with arrays well into the tens of thousands of iterations without a problem.

The problem you are encountering is that you are running out of system resources. While a new array is allocated on each iteration, the old one is released, so the cost of resizing is nominal and in most cases completely negligible.

You have to keep in mind that your scripts are executed within a host environment. In the case of the WSH, this means that all of your actions are performed in a single thread. The WSH does not provide any methods of managing memory usage within your scripts.

The best advice I can give you is to limit the number of iterations or read the input file in chunks (releasing each chunk before reading the next). Without seeing the file that causes the error, or the actual error message you are receiving, I can't give any more direct advice. I can only say that this situation does not arise very often and almost always points to a poor machine configuration or poorly written code.
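Since the actual error message is missing from the question, one way to capture it is sketched below. The helper name ReadWithDiagnostics is hypothetical, and it assumes DirList is an already opened TextStream, as in the question. It reads to the end of the stream and reports how many lines it got through and any runtime error that stopped it, so an early stop can be told apart from a genuine end of file.

```vbscript
Option Explicit

' Hypothetical helper (not from the original post): reads an already opened
' TextStream and returns Array(linesRead, errNumber, errDescription).
Function ReadWithDiagnostics(stream)
    Dim linesRead, errNumber, errDescription, text

    linesRead = 0
    On Error Resume Next
    Do
        If stream.AtEndOfStream Then Exit Do
        If Err.Number <> 0 Then Exit Do   ' AtEndOfStream itself failed
        text = stream.ReadLine
        If Err.Number <> 0 Then Exit Do   ' ReadLine failed
        linesRead = linesRead + 1
    Loop

    ' Copy the error details before they are cleared.
    errNumber = Err.Number
    errDescription = Err.Description
    On Error GoTo 0

    ReadWithDiagnostics = Array(linesRead, errNumber, errDescription)
End Function
```

Calling result = ReadWithDiagnostics(DirList) and echoing result(0), result(1) and result(2) shows whether the loop ended cleanly (error number 0) or was cut short by an error.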