How can I pull hashtags out of a text column?

2019-09-09 10:23发布

问题:

I have an Excel sheet in which there is a "description" column. The values in this column often contain anywhere from 0-3 tags, all starting with the # symbol. Is there a way to pull all of these tags out in to columns?

Perhaps just have 3 blank columns called hashtag 1, 2, 3 and pull them in to each column.

It isn't even important that it remove them from the description column while pulling them out.

Example of descriptions:

"#0034 #lost client lost file"            - pull out 0034    and   lost  
"worker has bad quality #SusanB #quality" - pull out SusanB  and   quality  
"#0840 client complaint"                  - pull out 0840  
"lots of ipsum"                           - pull out       nothing

回答1:

Lets say Column A is Description column, and in A2 you have the first cell with hashtags
In B2 enter:

=MID(A2;(FIND("#";A2))+1;(FIND(" ";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2)))-(FIND("#";A2))-1)

In C2 enter:

=MID(A2;(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2)))+1;(FIND(" ";MID(A2;(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2)))+1;LEN(A2)-(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2)))))+(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2))))-(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2)))-1)

In D2 enter:

=MID(A2;(FIND("#";MID(A2;(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2)))+1;LEN(A2)-(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2)))))+(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2))))+1;(FIND(" ";MID(A2;(FIND("#";MID(A2;(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2)))+1;LEN(A2)-(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2)))))+(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2))))+1;LEN(A2)-(FIND("#";MID(A2;(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2)))+1;LEN(A2)-(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2)))))+(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2))))))+(FIND("#";MID(A2;(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2)))+1;LEN(A2)-(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2)))))+(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2)))))-(FIND("#";MID(A2;(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2)))+1;LEN(A2)-(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2)))))+(FIND("#";MID(A2;(FIND("#";A2))+1;LEN(A2)-(FIND("#";A2))))+(FIND("#";A2))))-1)


回答2:

I'm fond of an extension that can let you use REGEX in Excel ...

Without this :

1) find the position of the separator character (# ?) in your string with FIND()

2) then use LEFT(), MID() and RIGHT() to explode your string into 3 columns

3) you can delete the # using MID() instead of LEFT() and RIGHT()

--

It would be something like this for the first tag with the # :

=LEFT(A1,FIND("#",A1)-1)

--

Hope this will help !



回答3:

This can always be done using regular expression.

In VBE, write following function in a module:

Function getHashTags(rng As Range) As Variant
    Dim regEx As RegExp
    Set regEx = New RegExp
    regEx.Pattern = "#\w*\b"
    regEx.IgnoreCase = True
    regEx.Global = True
    Set myMatches = regEx.Execute(rng.Value)
    Dim arr(1 To 1, 1 To 3) As Variant
    For i = 1 To 3
        If i > myMatches.Count Then
            arr(1, i) = ""
        Else
            arr(1, i) = Replace(myMatches(i - 1), "#", "")
        End If
    Next i
    getHashTags = arr
End Function

Now, let's suppose Column A is the Description column, and in cell A2 you have the first cell with hash tags.

In cell B2 enter this:

=getHashTags(B$2)

Now select the cells B2, C2, D2, Press F2 and then ctrl+shift+enter. This will populate the variant return from the function getHashTags to the selected cells.

I hope this helps.

PS: And, yes for this to work you also need to give reference to Microsoft VBScript Regular Expressions 5.5 library.