TensorFlow feature column for variable list of val

Posted 2019-04-06 07:50

From the TensorFlow docs it's clear how to use tf.feature_column.categorical_column_with_vocabulary_list to create a feature column that takes a string as input and outputs a one-hot vector. For example:

import tensorflow as tf

vocabulary_feature_column = tf.feature_column.categorical_column_with_vocabulary_list(
    key="vocab_feature",
    vocabulary_list=["kitchenware", "electronics", "sports"])

Let's say "kitchenware" maps to [1,0,0] and "electronics" maps to [0,1,0]. My question is about using a list of strings as a feature. For example, if the feature value were ["kitchenware","electronics"], the desired output would be [1,1,0]. The length of the input list is not fixed, but the output dimension is.
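
To make the desired mapping concrete, here is a minimal plain-Python sketch. It is not TensorFlow code; the helper name multi_hot is hypothetical and is used only to illustrate the behaviour being asked for, assuming out-of-vocabulary strings should simply be ignored.

```python
# Plain-Python illustration of the desired mapping.
# multi_hot is a hypothetical helper, not part of TensorFlow.
def multi_hot(values, vocabulary):
    """Return a fixed-length 0/1 vector marking which vocabulary entries occur in values."""
    present = set(values)
    return [1 if term in present else 0 for term in vocabulary]


vocabulary = ["kitchenware", "electronics", "sports"]

# Inputs of different lengths always produce a vector of len(vocabulary).
print(multi_hot(["kitchenware", "electronics"], vocabulary))  # [1, 1, 0]
print(multi_hot(["sports"], vocabulary))                      # [0, 0, 1]
```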

The use case is a straight bag-of-words type model (obviously with a much larger vocabulary list!).

What is the correct way to implement this?

Tags: tensorflow machine-learning neural-network feature-extraction

2 answers

你好瞎i

#2 · 2019-04-06 07:57

You should use tf.feature_column.indicator_column, which wraps a categorical column and produces a multi-hot representation. See https://www.tensorflow.org/versions/master/api_docs/python/tf/feature_column/indicator_column


劳资没心，怎么记你

#3 · 2019-04-06 08:06

Here is an example of how to feed data to the indicator column:

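# Uses the TensorFlow 1.x API (tf.Session, tf.feature_column.input_layer).
import tensorflow as tf

# Each example is a list of strings for the 'letter' feature.
features = {'letter': [['A','A'], ['C','D'], ['E','F'], ['G','A'], ['X','R']]}

# Categorical column over a fixed vocabulary; other values are out of vocabulary.
letter_feature = tf.feature_column.categorical_column_with_vocabulary_list(
    "letter", ["A", "B", "C"], dtype=tf.string)

# Wrap the categorical column to get a dense vector with one slot per vocabulary entry.
indicator = tf.feature_column.indicator_column(letter_feature)
tensor = tf.feature_column.input_layer(features, [indicator])

with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    # The vocabulary lookup table must be initialized before use.
    session.run(tf.tables_initializer())
    print(session.run([tensor]))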

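This outputs the following. Note that the repeated 'A' in the first row is counted twice, and letters outside the vocabulary contribute nothing: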

[array([[2., 0., 0.],
       [0., 0., 1.],
       [0., 0., 0.],
       [1., 0., 0.],
       [0., 0., 0.]], dtype=float32)]
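
The output above contains counts rather than strict 0/1 values. As a rough illustration of why, here is a plain-Python sketch that reproduces the same numbers and then clips them to a binary vector. It is not TensorFlow's implementation; count_vector and to_binary are hypothetical helper names. In TensorFlow itself, an analogous clipping step could presumably be done on the dense tensor, for example with tf.minimum(tensor, 1.0).

```python
from collections import Counter

# Hypothetical helpers for illustration only; not TensorFlow APIs.
def count_vector(values, vocabulary):
    """Count how often each vocabulary entry occurs; other values are ignored."""
    counts = Counter(values)
    return [float(counts[term]) for term in vocabulary]


def to_binary(vector):
    """Clip counts so that any positive count becomes 1.0."""
    return [min(x, 1.0) for x in vector]


rows = [['A', 'A'], ['C', 'D'], ['E', 'F'], ['G', 'A'], ['X', 'R']]
vocabulary = ["A", "B", "C"]

counted = [count_vector(row, vocabulary) for row in rows]
binary = [to_binary(row) for row in counted]

print(counted[0])  # [2.0, 0.0, 0.0], matching the first row of the output above
print(binary[0])   # [1.0, 0.0, 0.0] after clipping
```

If a strict bag-of-words presence vector is needed, the clipped form is the one to use; if term frequency matters, the raw counts may be preferable.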
