I have the following functions to generate md5-hashes for files. The functions work great for small files, but crashes and generate Run-time error 7 - Out of memory when I try to hash files over ~250 MB (I don't actually know at which exact size it breaks, but files below 200 MB work fine).
I don't understand why it breaks at a certain size, so if anyone could shed some light on that I would appreciate it a lot.
Also, is there anything I can do to make the functions handle larger files? I intend to use the functions in a larger tool where I will need to generate hashes for files of unknown sizes. Most will be small enough for the current functions to work, but I will have to be able to handle large files as well.
I got my current functions from the most upvoted answer this post How to get the MD5 hex hash for a file using VBA?
Public Function FileToMD5Hex(ByVal strFileName As String) As String
Dim varEnc As Variant
Dim varBytes As Variant
Dim strOut As String
Dim intPos As Integer
Set varEnc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
'Convert the string to a byte array and hash it
varBytes = GetFileBytes(strFileName)
varBytes = varEnc.ComputeHash_2((varBytes))
'Convert the byte array to a hex string
For intPos = 1 To LenB(varBytes)
strOut = strOut & LCase(Right("0" & Hex(AscB(MidB(varBytes, intPos, 1))), 2))
Next
FileToMD5Hex = strOut
Set varEnc = Nothing
End Function
Private Function GetFileBytes(ByVal strPath As String) As Byte()
Dim lngFileNum As Long
Dim bytRtnVal() As Byte
lngFileNum = FreeFile
'If file exists, get number of bytes
If LenB(Dir(strPath)) Then
Open strPath For Binary Access Read As lngFileNum
ReDim bytRtnVal(LOF(lngFileNum)) As Byte
Get lngFileNum, , bytRtnVal
Close lngFileNum
Else
MsgBox "Filen finns inte" & vbCrLf & "Avbryter", vbCritical, "Filen hittades inte"
Exit Function
End If
GetFileBytes = bytRtnVal
Erase bytRtnVal
End Function
Thank you
It looks like you reached the memory limit.
A better way would be to compute the MD5 of the file by block:
Public Function ComputeMD5(filepath As String) As String
Dim buffer() As Byte, svc As Object, hFile%, blockSize&, i&
blockSize = 2 ^ 16
' open the file '
If Len(Dir(filepath)) Then Else Err.Raise 5, , "file not found" & vbCr & filepath
hFile = FreeFile
Open filepath For Binary Access Read As hFile
' allocate buffer '
If LOF(hFile) < blockSize Then blockSize = ((LOF(hFile) + 1024) \ 1024) * 1024
ReDim buffer(0 To blockSize - 1)
' compute hash '
Set svc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
For i = 1 To LOF(hFile) \ blockSize
Get hFile, , buffer
svc.TransformBlock buffer, 0, blockSize, buffer, 0
Next
Get hFile, , buffer
svc.TransformFinalBlock buffer, 0, LOF(hFile) Mod blockSize
buffer = svc.Hash
' cleanup '
svc.Clear
Close hFile
' convert to an hexa string '
ComputeMD5 = String$(32, "0")
For i = 0 To 15
Mid$(ComputeMD5, i + i + 2 + (buffer(i) > 15)) = Hex(buffer(i))
Next
End Function
This is an extension to FlorentB's answer, which worked brilliantly for me until my files surpassed the 2GB LOF() size limit.
I tried to adapt for getting file length by alternate means as follows:
Public Function ComputeMD5(filepath As String) As String
If Len(Dir(filepath)) Then Else Err.Raise 5, , "File not found." & vbCr & filepath
Dim blockSize As Long: blockSize = 2 ^ 20
Dim blockSize_f As Double
Dim buffer() As Byte
Dim fileLength As Variant
Dim hFile As Integer
Dim n_Reads As Long
Dim i As Long
Dim svc As Object: Set svc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
fileLength = DecGetFileSize(filepath)
If fileLength < blockSize Then blockSize = ((fileLength + 1024) \ 1024) * 1024
ReDim buffer(0 To blockSize - 1)
n_Reads = fileLength / blockSize
blockSize_f = fileLength - (CDbl(blockSize) * n_Reads)
hFile = FreeFile
Open filepath For Binary Access Read As hFile
For i = 1 To n_Reads
Get hFile, i, buffer
svc.TransformBlock buffer, 0, blockSize, buffer, 0
Next i
Get hFile, i, buffer
svc.TransformFinalBlock buffer, 0, blockSize_f
buffer = svc.Hash
svc.Clear
Close hFile
ComputeMD5 = String$(32, "0")
For i = 0 To 15
Mid$(ComputeMD5, i + i + 2 + (buffer(i) > 15)) = Hex(buffer(i))
Next
End Function
Public Function DecGetFileSize(fname As String) As Variant
Dim fso As New FileSystemObject
Dim f: Set f = fso.GetFile(fname)
DecGetFileSize = CDec(f.Size)
Set f = Nothing
Set fso = Nothing
End Function
This all runs fine, returning a string, however that string does not equal the MD5 calculated using other tools on the same file.
I can't work out where the discrepancy is originating.
I've checked and double checked filelength, n_reads, blockSize and blockSize_f and I'm sure those values are all correct.
I had some trouble with the Get function, where if I didn't explicitly tell it the block number, it dies at block 2048.
Any ideas / pointers would be much appreciated.