Extract text content from cell (With bold, italic,

2020-03-05 02:34发布

I'm trying to extract text content from Excel using a macro. This is my code:

Dim i As Integer, j As Integer
Dim v1 As Variant
Dim Txt As String

v1 = Range("A2:C15")
For i = 1 To UBound(v1)
    For j = 1 To UBound(v1, 2)
        Txt = Txt & v1(i, j)
    Next j
    Txt = Txt & vbCrLf
Next i

MsgBox Txt

But it is showing the raw characters only meaning that it doesn't show any formatting information like bold, italic, underline, etc..

I want to extract the text along with the formatting information.

Example: This is sample text

Expected output: This is sample text

Actual output: This is sample text

Can someone explain what's wrong with the code and tell if anything is wrong?

3条回答
爱情/是我丢掉的垃圾
2楼-- · 2020-03-05 03:12

A messagebox does not permit formatiing without changing system defaults, which is not a starightforward approach. If you want to display formatted text in a prompt then you are probably easiest to create a userform and format the label appropriately.

For example, you can determine if a cell has bold fomatting using:

Dim isBold As Boolean
isBold = v1(i, j).Font.Bold

And apply this to a userform label font using:

label.Font.Bold = isBold

If you want to output to a text (ie .txt) file then this cannot store any formatting information. The best you could hope to achieve is to create a markup style output where:

If isBold Then
    txt = "<b >mytext< /b>" 'Ignore the spaces
Else
    txt = "mytext"
End If

The range.Font.Bold property has three return options:

v1(i, j).Font.Bold = True  'if the entire cell IS bold
v1(i, j).Font.Bold = False 'if the entire cell IS NOT bold
v1(i, j).Font.Bold = Null  'if the cell is PARTIALLY bold

Calling IsNull(v1(i, j).Font.Bold) will tell you whether you have partial fomatting in a cell. Unfortunately you must then assess each character in the string individually to determine the bold characters. This function should determine where the bold formatting is switched on or off in a string contained in the Range object passed and add the appropriate markup tag:

Function markup(rng As Range) As String

Dim chr As Integer
Dim isCharBold As Boolean
Dim str As String
Dim tempChar As Characters

isCharBold = False
str = ""

If IsNull(rng.Font.Bold) Then
    For chr = 1 To rng.Characters.Count
        Set tempChar = rng.Characters(chr, 1)
        If isCharBold Then
            If tempChar.Font.Bold Then
                str = str + tempChar.Text
            Else
                isCharBold = False
                str = str & "</b>" & tempChar.Text
            End If
        Else
            If tempChar.Font.Bold Then
                isCharBold = True
                str = str + "<b>" & tempChar.Text
            Else
                str = str & tempChar.Text
            End If
        End If
    Next chr
Else
    str = rng.Value
End If

markup = str

End Function

Notice that the Else case just returns the default string values. You can modify this approach to work for any of the .Font properties e.g. strikethrough, underline, italic....

The framework in the OP suggests that you are assigning the contents of a range of cells into an array of type Variant. This essentially leaves you with an unformatted string of characters in each array index. In this case you won't be able to extract any formatting from the array strings. To access the Characters().Font.Bold property you must be operating on a Range object so it might be best to iterate through each cell in Range("A2:C15") directly. This could be achieved by modifying your initial code as such, so it now calls the markup function:

Sub OutputText()

Dim i As Integer, j As Integer
Dim rng As Range
Dim Txt As String

Set rng = Range("A2:C15")
For i = 1 To rng.Rows.Count
    For j = 1 To rng.Columns.Count
        Txt = Txt & markup(rng(i, j)) & " "
    Next j
    Txt = Txt & vbCrLf
Next i

Debug.Print Txt

End Sub
查看更多
疯言疯语
3楼-- · 2020-03-05 03:14

OK, let's have the algorithm from @stucharo a little bit simpler to extend.

Public Function getHTMLFormattedString(r As Range) As String

 isBold = False
 isItalic = False
 isUnderlined = False
 s = ""
 cCount = 0
 On Error Resume Next
 cCount = r.Characters.Count
 On Error GoTo 0

 If cCount > 0 Then

  For i = 1 To cCount

   Set c = r.Characters(i, 1)

   If isUnderlined And c.Font.Underline = xlUnderlineStyleNone Then
    isUnderlined = False
    s = s & "</u>"
   End If

   If isItalic And Not c.Font.Italic Then
    isItalic = False
    s = s & "</i>"
   End If

   If isBold And Not c.Font.Bold Then
    isBold = False
    s = s & "</b>"
   End If


   If c.Font.Bold And Not isBold Then
    isBold = True
    s = s + "<b>"
   End If

   If c.Font.Italic And Not isItalic Then
    isItalic = True
    s = s + "<i>"
   End If

   If Not (c.Font.Underline = xlUnderlineStyleNone) And Not isUnderlined Then
    isUnderlined = True
    s = s + "<u>"
   End If

   s = s & c.Text

   If i = cCount Then
    If isUnderlined Then s = s & "</u>"
    If isItalic Then s = s & "</i>"
    If isBold Then s = s & "</b>"
   End If

  Next i

 Else
  s = r.Text
  If r.Font.Bold Then s = "<b>" & s & "</b>"
  If r.Font.Italic Then s = "<i>" & s & "</i>"
  If Not (r.Font.Underline = xlUnderlineStyleNone) Then s = "<u>" & s & "</u>"
 End If

 getHTMLFormattedString = s
End Function

To be clear, this function works only with a range containing a single cell. But it should be easy calling this function for each cell in a bigger range and concatenating the returned strings into one.

Edit by the OP:

I called the function by the below code:

Sub ReplaceFormattingTags()

Dim i As Integer, j As Integer
Dim rng As Range
Dim Txt As String

Set rng = Range("A2:C15")
For i = 1 To rng.Rows.Count
    For j = 1 To rng.Columns.Count
        Txt = Txt & getHTMLFormattedString(rng(i, j)) & " "
    Next j
    Txt = Txt & vbCrLf
Next i

Debug.Print Txt

End Sub
查看更多
姐就是有狂的资本
4楼-- · 2020-03-05 03:23

VBA string does not support formatting like that. It will purely take the string from the range. No formatting at all. If you want to format the string, you can not see this through msgbox.

Only way to do it would be to store it in a cell then format the cell. But then that does not give you the output in a messagebox as a formatted string.

If you are planning to then put the string in a cell with formatting, you will need to save the formatting somewhere, or copy it from the cell you got the text from. And then apply the formatting to the cell

查看更多
登录 后发表回答