FileSystemObject - Reading Unicode Files

Posted 2019-04-06 07:34

Classic ASP, VBScript context.

A lot of articles, including this Microsoft one, say you cannot use FileSystemObject to read Unicode files.

I ran into this issue a while back and switched to ADODB.Stream, following the ReadText example here, instead of FileSystemObject.OpenTextFile (which does accept a final parameter indicating whether to open the file as Unicode, but didn't actually work for me).

However, ADODB.Stream results in a world of pain when trying to read a file on a UNC fileshare (permissions-related issue). So, investigating this, I stumbled across the following approach which works a) with unicode files, and b) across UNC fileshares:

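Const ForReading = 1     ' not predefined in VBScript unless a type library is referenced
Const TristateTrue = -1  ' open the file as Unicode
dim fso, file, stream
set fso = Server.CreateObject("Scripting.FileSystemObject")
set file = fso.GetFile("\\SomeServer\Somefile.txt")
set stream = file.OpenAsTextStream(ForReading, TristateTrue)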

This is using the FSO to read a unicode file without any apparent problem, so I'm confused as to all the references, including MS, saying you can't use the FSO to read unicode files.

Has anyone else used this approach for reading unicode files? Are there any hidden gotchas I'm missing, or can you really actually read unicode files using FSO?

5 answers
老娘就宠你
Reply #2 · 2019-04-06 07:55

I think MS does not officially state that it supports unicode because:

  1. It does not detect unicode files using the byte-order mark at the start of the file, and
  2. It only supports Little-Endian UTF-16 unicode files (and you need to remove the byte order mark if present).

Here is some sample code that I have been using successfully (for a few years) to auto-detect and read unicode files with FSO (assuming they are little-endian and contain the BOM):

'Detect Unicode Files
Set Stream = FSO.OpenTextFile(ScriptFolderObject.Path & "\" & FileName, 1, False)
intAsc1Chr = Asc(Stream.Read(1))
intAsc2Chr = Asc(Stream.Read(1))
Stream.Close
If intAsc1Chr = 255 And intAsc2Chr = 254 Then 
    OpenAsUnicode = True
Else
    OpenAsUnicode = False
End If

'Get script content
Set Stream = FSO.OpenTextFile(ScriptFolderObject.Path & "\" & FileName, 1, 0, OpenAsUnicode)
TextContent = Stream.ReadAll()
Stream.Close
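
For reuse, the two steps above could be folded into a pair of helpers. This is only a sketch: the names HasUtf16LeBom and ReadTextFile are hypothetical, and it keeps the same assumptions as the code above (little-endian UTF-16 with a BOM, and a single-byte ANSI code page so that Asc returns the raw byte values 255 and 254). It also guards against files too short to hold a BOM and against calling ReadAll on an empty file.

```vbscript
Const ForReading = 1
Const TristateFalse = 0   ' open as ANSI
Const TristateTrue = -1   ' open as Unicode

' Hypothetical helper: True when the file starts with the bytes FF FE.
Function HasUtf16LeBom(fso, path)
    Dim probe, b1, b2
    HasUtf16LeBom = False
    ' A file shorter than two bytes cannot contain a UTF-16 BOM.
    If fso.GetFile(path).Size < 2 Then Exit Function
    Set probe = fso.OpenTextFile(path, ForReading, False, TristateFalse)
    b1 = Asc(probe.Read(1))
    b2 = Asc(probe.Read(1))
    probe.Close
    HasUtf16LeBom = (b1 = 255 And b2 = 254)
End Function

' Hypothetical helper: reads the whole file as Unicode or ANSI,
' depending on what the BOM check reports.
Function ReadTextFile(fso, path)
    Dim fmt, ts
    If HasUtf16LeBom(fso, path) Then
        fmt = TristateTrue
    Else
        fmt = TristateFalse
    End If
    Set ts = fso.OpenTextFile(path, ForReading, False, fmt)
    If ts.AtEndOfStream Then
        ReadTextFile = ""          ' ReadAll raises an error on an empty stream
    Else
        ReadTextFile = ts.ReadAll
    End If
    ts.Close
End Function
```

Usage would look something like `TextContent = ReadTextFile(FSO, ScriptFolderObject.Path & "\" & FileName)`.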
小情绪 Triste *
Reply #3 · 2019-04-06 07:59

Yes, that documentation is out of date. The scripting component went through a set of changes in its early days (some of them breaking changes if you were using early binding), but since at least Windows 2000 SP4 and XP SP2 it has been very stable.

Just be careful what you mean by Unicode. The word is sometimes used broadly to cover any encoding of Unicode. FSO does not, for example, read UTF-8 encoded files. For that you would need to fall back on ADODB.Stream.
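
As a rough sketch of that fallback, a UTF-8 file can be read with ADODB.Stream by setting its Charset before loading the file. The function name ReadUtf8File is hypothetical, and the constants are declared by hand because VBScript does not predefine them. Keep in mind that the question reports permission problems with ADODB.Stream on UNC shares, so this may only be suitable for local paths.

```vbscript
Const adTypeText = 2   ' StreamTypeEnum: text stream
Const adReadAll = -1   ' StreamReadEnum: read to the end of the stream

' Hypothetical helper: returns the contents of a UTF-8 encoded file.
Function ReadUtf8File(path)
    Dim stm
    Set stm = CreateObject("ADODB.Stream")
    stm.Type = adTypeText
    stm.Charset = "utf-8"      ' decode the bytes as UTF-8
    stm.Open
    stm.LoadFromFile path
    ReadUtf8File = stm.ReadText(adReadAll)
    stm.Close
    Set stm = Nothing
End Function
```

In Classic ASP, `Server.CreateObject("ADODB.Stream")` can be used in place of `CreateObject`.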

贪生不怕死
Reply #4 · 2019-04-06 08:02

I am writing a Windows 7 gadget and ran into the same problem. If possible, you can simply convert your files to another encoding, for example the ANSI encoding "windows-1251". With that encoding it works fine.
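
If the conversion has to be scripted rather than done in an editor, something like the following might do it with FSO alone. It is a sketch under several assumptions: the name ConvertUtf16ToAnsi is hypothetical, the source file is little-endian UTF-16, and the target is whatever ANSI code page the machine uses (windows-1251 only on a system configured that way). Writing is likely to fail for characters that have no equivalent in that code page.

```vbscript
Const ForReading = 1
Const TristateTrue = -1   ' open as Unicode

' Hypothetical helper: re-saves a UTF-16 file in the system ANSI code page.
Sub ConvertUtf16ToAnsi(srcPath, dstPath)
    Dim fso, src, dst, text
    Set fso = CreateObject("Scripting.FileSystemObject")

    Set src = fso.OpenTextFile(srcPath, ForReading, False, TristateTrue)
    If src.AtEndOfStream Then
        text = ""
    Else
        text = src.ReadAll
    End If
    src.Close

    ' Defensive: drop a leading BOM character if it came through as text.
    If Left(text, 1) = ChrW(&HFEFF) Then text = Mid(text, 2)

    ' Second argument True = overwrite, third argument False = ANSI output.
    Set dst = fso.CreateTextFile(dstPath, True, False)
    dst.Write text
    dst.Close
End Sub
```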

If you are using this to build a website, it would be better to use a different development approach that avoids these objects.

在下西门庆
Reply #5 · 2019-04-06 08:06
'assume we have detected that it is Unicode file - then very straightforward 
'byte-by-byte crawling sorted out my problem:
'.
'.
'.
else
   eilute=f.ReadAll
   'response.write("&#268;IA BUVO &#268;ARLIS<br/>")
   'response.write(len(eilute))
   'response.write("<br/>")
   elt=""
   smbl=""
   for i=3 to len(eilute)  'skip the first 2 bytes (BOM: 255 and 254)
     baitas=asc(mid(eilute,i,1))  'low byte of the UTF-16 LE code unit
     if (i+1) <= len(eilute) then
       i=i+1
     else
       exit for
     end if
     antras=asc(mid(eilute,i,1))*256  'high byte; enough for letters
     'response.write(baitas)
     'response.write(asc(mid(eilute,i,1)))
     'response.write("<br/>")
     if baitas=13 and antras=0 then  'carriage return: end of line
       response.write(elt)
       response.write("<br/>")
       elt=""
       if (i+2) <= len(eilute) then i=i+2  'skip over the line feed that follows
     else
       skaicius=antras+baitas
       smbl="&#" & skaicius & ";"  'emit the character as an HTML numeric entity
       elt=elt & smbl
     end if
   next
   if elt<>"" then
     response.write(elt)
     response.write("<br/>")
     elt=""
   end if
 end if
 f.Close
 '.
 '.
我欲成王,谁敢阻挡
Reply #6 · 2019-04-06 08:07

I'd say if it works, use it ;-)

I notice the MS article you refer to is from the Windows 2000 (!) scripting guide. Maybe it's obsolete.
