I'm having a hard time with a special character (Unicode U+2028, in XML it shows as , and in Chrome and Excel cell text, it shows as a L SEP in a dashed square). It's this one:
I've made a macro which opens many webpages, copying part of their contents into separate cells in an excel sheet. Most of the time it performs well, but recently one of the content distributors whose webpage I need to research started using the LSEP character (I don't know why, maybe he's using a new text or HTML editor; I have no contact with any of them).
As a result, Excel can't open the xlsx, returning an XML error in /xl/sharedStrings.xml file. When I renamed MySheet.xlsx to MySheet.xlsx.zip and opened the troubling file, I've discovered the error was caused by that dreadful character. Manually, I removed the LSEP characters, replaced sharedStrings.xml, renamed MySheet.xlsx.zip to MySheet.xlsx and it opened all right.
Having figured out my problem, the next step was to write some piece of code into my macro, to automatically replace the LSEP character with nothing. Being a special character, my approach was to try referring it by ASC
function:
Asc(ActiveCell.Formula)
Here was the first spooky thing I got: the code above use to return 63
, but one time it returned 93
to the same cell content! Trying ActiveCell.Text
got me the same result.
Anyway, I've tried a normal Replace
:
Replace(ActiveCell.Formula, Chr(63), "")
The Replace
returned exactly the same string, with the infamous LSEP (tried both ActiveCell.Formula
and ActiveCell.Text
, both returned the same; so I've tried the same with Chr(93)
instead of Chr(63)
, with no sucess).
Then, I've tried to locate the infamous sneaky ninja pirate LSEP character, so I tried:
InStr(1, ActiveCell.Formula, Asc(63)) 'Or Asc(93), again the same results
The return was 0
. So, astounded but not defeated, I rubbed my magical lamp and asked this to VBA genius:
MsgBox "Where's Chr(63)? " & InStr(1, ActiveCell.Formula, Chr(63)) & vbCrLf & _
"Where's Chr(93)? " & InStr(1, ActiveCell.Formula, Chr(93)) & vbCrLf & _
"Asc(Right(ActiveCell.Formula, 1)): " & Asc(Right(ActiveCell.Formula, 1))
In response, VBA tells me 0
to first and second lines (like there isn't Chr(63)
nor Chr(93)
), at the same time telling me the rightmost character of the same string is the Ascii 63!
How is this possible? Does anyone have an idea? I'm totally lost. How can I detect and replace this stealthy ninja pirate LSEP character in advance, in my code?
Thanks for your time!
Asc()
&Chr()
only work for ANSI characters.Instead use the Unicode aware versions:
Chrw$(&h2028)
is the LSep character, its counterpart isAscW()
.