I had to solve a small problem today: trimming the trailing whitespace that a PDF converter had added to every cell of an MS Word document. I quickly found out that this isn't possible through the standard Word interface, so I wrote a small VBA script:
Sub TrimCellSpaces()
    Dim itable As Table
    Dim C As Cell
    For Each itable In ThisDocument.Tables
        For Each C In itable.Range.Cells
            C.Range.Text = Trim(C.Range.Text)
        Next C
    Next itable
End Sub
I was surprised that not only did this fail to remove the trailing spaces, it even added paragraph markers at the end of each cell. So I tried a regex approach:
Sub TrimCellSpaces()
    Dim myRE As New RegExp
    Dim itable As Table
    Dim C As Cell
    myRE.Pattern = "\s+$"
    For Each itable In ThisDocument.Tables
        For Each C In itable.Range.Cells
            With myRE
                C.Range.Text = .Replace(C.Range.Text, "")
            End With
        Next C
    Next itable
End Sub
Same result. I added a breakpoint, copied the value of C.Range.Text (before replacement) into a hex editor, and found that it ended in the hex sequence 0D 0D 07 (07 is the ASCII Bell character (!)).
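For anyone who wants to reproduce that inspection without a hex editor, here is a minimal sketch that prints the character codes at the end of a cell's text to the Immediate window. The name DumpCellTail is made up for illustration, and the code assumes the document has at least one table and simply looks at its first cell; adjust as needed.

```vba
' Hypothetical helper: prints the codes of the last few characters of a cell.
Sub DumpCellTail()
    Dim s As String
    Dim i As Long
    Dim startPos As Long
    Dim codes As String

    ' Assumption: the document has at least one table; inspect its first cell.
    s = ThisDocument.Tables(1).Range.Cells(1).Range.Text

    ' Look at the last four characters only (or fewer for short cells).
    startPos = Len(s) - 3
    If startPos < 1 Then startPos = 1

    For i = startPos To Len(s)
        ' AscW returns the UTF-16 code unit; the mask keeps it non-negative,
        ' and Right$ pads the value to four hex digits.
        codes = codes & Right$("000" & Hex$(AscW(Mid$(s, i, 1)) And &HFFFF&), 4) & " "
    Next i

    ' Output goes to the Immediate window (Ctrl+G in the VBA editor).
    Debug.Print codes
End Sub
```

Each character is printed as a four-digit hex value, so a carriage return shows up as 000D and the Bell character as 0007.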
I changed the regex to \s+(?!.*\w), and the script worked flawlessly. After the replace operation, the value of C.Range.Text ended only in 0D 07 (one 0D fewer).
I also tried this with a newly created table, not one generated by Word's PDF importer - same results.
What's going on here? Is Word using 0D 0D 07 as an "end of cell" marker? Or is it 0D 07? Why did \s+ remove only one 0D?