How to make a template file of CRF++?

Posted 2019-03-27 13:40

I'm new to CRF++. I'm teaching myself by reading its manual: http://crfpp.googlecode.com/svn/trunk/doc/index.html?source=navbar#templ

I don't understand what this passage means:

This is a template to describe unigram features. When you give a template "U01:%x[0,1]", CRF++ automatically generates a set of feature functions (func1 ... funcN) like:

func1 = if (output = B-NP and feature="U01:DT") return 1 else return 0
func2 = if (output = I-NP and feature="U01:DT") return 1 else return 0
func3 = if (output = O and feature="U01:DT") return 1 else return 0
....
funcXX = if (output = B-NP and feature="U01:NN") return 1 else return 0
funcXY = if (output = O and feature="U01:NN") return 1 else return 0

The number of feature functions generated by a template amounts to (L * N), where L is the number of output classes and N is the number of unique strings expanded from the given template.

Why are there many lines for the Unigram features and what do they mean?

Tags: crf crf++
2 answers
Rolldiameter
#2 · 2019-03-27 14:21

After looking at the documentation for long enough, I think I figured it out.

Take the example in the documentation where the input data is:

He        PRP  B-NP
reckons   VBZ  B-VP
the       DT   B-NP 
current   JJ   I-NP 
account   NN   I-NP

and the feature template in question is %x[0,1]. The format is %x[row, col], where row is relative to the current position and col is an absolute column index.

When %x[0,1] is expanded, it is replaced by the string found in column 1 of the current token, so depending on the token it becomes one of the strings in the set [PRP, VBZ, DT, JJ, NN] (i.e. one of the unique strings from column 1, where the leftmost column is column 0). For each of these strings, CRF++ creates a set of feature functions of the following form (shown here for the 3rd row of the input data):

func1 = if (output = B-NP and feature="U01:DT") return 1 else return 0
func2 = if (output = I-NP and feature="U01:DT") return 1 else return 0
func3 = if (output = O    and feature="U01:DT") return 1 else return 0
...

where that particular string (DT in the listing above) is paired with every output class, giving one feature function per pair.

So if the output classes are [B-NP, I-NP, O], the template expands into the following feature functions:

# Row 1 (He, PRP, B-NP)
func1 = if (output = B-NP and feature="U01:PRP") return 1 else return 0
func2 = if (output = I-NP and feature="U01:PRP") return 1 else return 0
func3 = if (output = O    and feature="U01:PRP") return 1 else return 0

# Row 2 (reckons, VBZ, B-VP)
func4 = if (output = B-NP and feature="U01:VBZ") return 1 else return 0
func5 = if (output = I-NP and feature="U01:VBZ") return 1 else return 0
func6 = if (output = O    and feature="U01:VBZ") return 1 else return 0

# Row 3 (the, DT, B-NP)
func7 = if (output = B-NP and feature="U01:DT") return 1 else return 0
func8 = if (output = I-NP and feature="U01:DT") return 1 else return 0
func9 = if (output = O    and feature="U01:DT") return 1 else return 0

# Row 4 (current, JJ, I-NP)
func10 = if (output = B-NP and feature="U01:JJ") return 1 else return 0
func11 = if (output = I-NP and feature="U01:JJ") return 1 else return 0
func12 = if (output = O    and feature="U01:JJ") return 1 else return 0

# Row 5 (account, NN, I-NP)
func13 = if (output = B-NP and feature="U01:NN") return 1 else return 0
func14 = if (output = I-NP and feature="U01:NN") return 1 else return 0
func15 = if (output = O    and feature="U01:NN") return 1 else return 0

Regarding where the documentation mentions:

The number of feature functions generated by a template amounts to (L * N), where L is the number of output classes and N is the number of unique strings expanded from the given template.

In this case L would be 3 (B-NP, I-NP, O) and N would be 5 (PRP, VBZ, DT, JJ, NN), which gives 3 * 5 = 15 feature functions.
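The enumeration above can be reproduced with a short Python sketch. This is only an illustration of the counting argument, not how CRF++ is implemented internally; the helper names (unique_features, feature_functions) and the three-class label set are assumptions carried over from this answer.

```python
from itertools import product

# Toy data from the answer above: (word, POS tag, output tag) per token.
rows = [
    ("He", "PRP", "B-NP"),
    ("reckons", "VBZ", "B-VP"),
    ("the", "DT", "B-NP"),
    ("current", "JJ", "I-NP"),
    ("account", "NN", "I-NP"),
]

# Output classes assumed in this answer (the manual's simplified label set).
output_classes = ["B-NP", "I-NP", "O"]


def unique_features(rows, prefix="U01", col=1):
    """Expand %x[0,col] at every position; keep unique strings in first-seen order."""
    seen = []
    for row in rows:
        feature = f"{prefix}:{row[col]}"
        if feature not in seen:
            seen.append(feature)
    return seen


def feature_functions(rows, output_classes):
    """Return one (output, feature) pair per generated feature function."""
    features = unique_features(rows)
    # Feature-major order, matching func1..func15 in the listing above.
    return [(out, feat) for feat, out in product(features, output_classes)]


funcs = feature_functions(rows, output_classes)
for i, (out, feat) in enumerate(funcs, start=1):
    print(f'func{i} = if (output = {out} and feature="{feat}") return 1 else return 0')

# L * N = 3 output classes * 5 unique strings
print(len(funcs))
```

If a POS tag occurred more than once in the data, it would still be counted once in N, because only the unique expanded strings matter.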

啃猪蹄的小仙女
#3 · 2019-03-27 14:44

For a particular template %x[i,j], i is the row offset relative to the current position and j is the feature (column) you want to use. Given the data:

He        PRP  B-NP
reckons   VBZ  B-VP
the       DT   B-NP
current   JJ   I-NP  << CURRENT TOKEN
account   NN   I-NP

%x[0,1] refers to column 1 (the POS tag) of the current token, since the row offset is 0. Here it expands to JJ, and the output tag of that token is I-NP.

Moving forward one token, %x[0,1] expands to the POS tag NN, with output tag I-NP.

Each feature function corresponds to one pair: a possible output tag and a possible value of the expanded feature (here, the POS tag of the current token).
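The offset-based lookup can be sketched in Python as follows. This is a minimal illustration of how a %x[row,col] macro resolves relative to the current token, not CRF++'s actual code; the function name expand and the "<PAD>" placeholder for out-of-range rows are hypothetical choices made for this sketch.

```python
import re

# Same toy data as above: one token per row, columns are word, POS tag, output tag.
rows = [
    ["He", "PRP", "B-NP"],
    ["reckons", "VBZ", "B-VP"],
    ["the", "DT", "B-NP"],
    ["current", "JJ", "I-NP"],
    ["account", "NN", "I-NP"],
]

# Matches macros such as %x[0,1] or %x[-1,0]; whitespace after the comma
# is tolerated here as a convenience of this sketch.
MACRO = re.compile(r"%x\[(-?\d+),\s*(\d+)\]")


def expand(template, rows, pos):
    """Replace every %x[row,col] in template, with row relative to pos."""

    def substitute(match):
        row = pos + int(match.group(1))  # relative row offset
        col = int(match.group(2))        # absolute column index
        if 0 <= row < len(rows):
            return rows[row][col]
        # Hypothetical placeholder for positions outside the sentence.
        return "<PAD>"

    return MACRO.sub(substitute, template)


print(expand("U01:%x[0,1]", rows, 3))           # current token is "current"
print(expand("U01:%x[0,1]", rows, 4))           # moved forward to "account"
print(expand("U02:%x[-1,1]/%x[0,1]", rows, 3))  # previous and current POS tag
```

With the current token at index 3, the first call yields U01:JJ; after moving forward one token, the same template yields U01:NN, which matches the walk-through above.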

Update:

I think the explanation above is quite straightforward, provided that you understand the CRF model well.

CRF Model Reference

CRF++ is a replication of Sha and Pereira (2003)
