ASP: I can't decode some characters from UTF-8 to i

Posted 2020-04-29 01:55

Question:

I use this function to decode UTF-8:

function DecodeUTF8(s)
  dim i
  dim c
  dim n
  i = 1
  do while i <= len(s)
    c = asc(mid(s,i,1))
    if c and &H80 then
      n = 1
      do while i + n <= len(s) ' <= so a sequence ending on the last character is still scanned
        if (asc(mid(s,i+n,1)) and &HC0) <> &H80 then
          exit do
        end if
        n = n + 1
      loop
      if n = 2 and ((c and &HE0) = &HC0) then
        c = asc(mid(s,i+1,1)) + &H40 * (c and &H01)
      else
        c = 191 
      end if
      s = left(s,i-1) + chr(c) + mid(s,i+n)
    end if
    i = i + 1
  loop

  DecodeUTF8 = s
end function

But it fails to decode these characters:

€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ

In those cases the function falls back to:

c = 191 --> c = '¿'

I found some information related to this problem: http://www.i18nqa.com/debug/utf8-debug.html

Do you know of a function that decodes these characters correctly?
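
A likely reason for the ¿ results, sketched in code: the function above only converts two-byte sequences, and its arithmetic only fits code points U+0080 to U+00FF. A character such as € is U+20AC, stored in UTF-8 as the three bytes E2 82 AC, so n becomes 3 and the else branch assigns 191. The helper below, Utf8CodePoint, is a hypothetical name and not part of the original function. It assumes the byte values are already available as numbers and only illustrates the bit arithmetic for one-, two- and three-byte sequences.

```vbscript
' Hypothetical helper (not from the original post): compute the Unicode
' code point of a UTF-8 sequence of up to three bytes.
' b1 is the lead byte; pass 0 for b2/b3 when the sequence is shorter.
Function Utf8CodePoint(b1, b2, b3)
  If (b1 And &H80) = 0 Then
    ' 0xxxxxxx: plain ASCII, one byte
    Utf8CodePoint = CLng(b1)
  ElseIf (b1 And &HE0) = &HC0 Then
    ' 110xxxxx 10xxxxxx: 5 bits from the lead byte, 6 from the second
    Utf8CodePoint = CLng(b1 And &H1F) * 64 + (b2 And &H3F)
  ElseIf (b1 And &HF0) = &HE0 Then
    ' 1110xxxx 10xxxxxx 10xxxxxx: 4 + 6 + 6 bits
    Utf8CodePoint = CLng(b1 And &H0F) * 4096 _
                  + CLng(b2 And &H3F) * 64 _
                  + (b3 And &H3F)
  Else
    ' Four-byte sequences and invalid lead bytes are out of scope here
    Utf8CodePoint = -1
  End If
End Function

' The euro sign: bytes E2 82 AC give code point &H20AC
' Hex(Utf8CodePoint(&HE2, &H82, &HAC)) evaluates to "20AC"
```

ChrW(Utf8CodePoint(...)) would then give the character itself, provided the page's output encoding can represent it.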

Answer 1:

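Public Function DecodeUTF8(s)
  Dim stmANSI
  Set stmANSI = Server.CreateObject("ADODB.Stream")
  s = s & ""  ' coerce Null/Empty to a string
  On Error Resume Next

  With stmANSI
    .Open
    .Position = 0
    ' Write the mis-decoded text back out as Windows-1252 bytes
    .CharSet = "Windows-1252"
    .WriteText s
    ' Rewind and reinterpret those same bytes as UTF-8
    .Position = 0
    .CharSet = "UTF-8"
  End With

  DecodeUTF8 = stmANSI.ReadText
  stmANSI.Close

  If Err.number <> 0 Then
    ' lib.logger is not a built-in ASP object; substitute your own logging
    lib.logger.error "str.DecodeUTF8( " & s & " ): " & Err.Description
    DecodeUTF8 = s  ' fall back to the unmodified input
  End If
  On Error GoTo 0
End Function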
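
The answer gives no usage, so here is a minimal sketch of how the function might be called. It assumes the DecodeUTF8 function above is defined on the same page and that ADODB.Stream is available (classic ASP on Windows). The variable names garbled and fixed are illustrative. The input is built with ChrW so the example does not depend on the encoding of the .asp source file.

```vbscript
' Assumes DecodeUTF8 from the answer above is defined on this page.
Dim garbled, fixed

' The mis-decoded euro sign, built from code points U+00E2, U+201A, U+00AC.
' Written out as Windows-1252 these become the bytes E2 82 AC,
' which is the UTF-8 encoding of U+20AC.
garbled = ChrW(&HE2) & ChrW(&H201A) & ChrW(&HAC)

fixed = DecodeUTF8(garbled)

' Intended result: a single character, U+20AC (the euro sign).
Response.Write Len(fixed) & " character(s), first code point U+" & Hex(AscW(fixed))
```

Reporting the code point instead of the character keeps the check independent of the response's own charset.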