When I write the following pyspark command:
# comment 1
df = df.withColumn('explosion', explode(col('col1'))).filter(col('explosion')['sub_col1'] == 'some_string') \
# comment 2
.withColumn('sub_col2', from_unixtime(col('explosion')['sub_col2'])) \
# comment 3
.withColumn('sub_col3', from_unixtime(col('explosion')['sub_col3']))
I get the following error:
.withColumn('sub_col2', from_unixtime(col('explosion')['sub_col2']))
^
IndentationError: unexpected indent
Is there a way to write comments between the lines of multi-line commands in pyspark?
This is not a `pyspark` issue, but rather a violation of Python syntax. Consider the following example and the error it produces:
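A minimal plain-Python reproduction (no Spark involved); the failing source is held in a string and fed to `compile` here so the error can be caught and printed:

```python
# The string below mimics the shape of the failing command: a complete
# statement, a backslash continuation, a full-line comment, and an
# indented continuation line.
broken = (
    "a = 1 \\\n"     # physical line ends with a backslash
    "# comment\n"    # the comment truncates the joined logical line
    "    + 2\n"      # leaving this indented line as an orphaned statement
)

try:
    compile(broken, "<example>", "exec")
except IndentationError as err:
    print(type(err).__name__ + ":", err.msg)  # IndentationError: unexpected indent
```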
The `\` is a continuation character, and Python interprets anything on the next line as occurring immediately after it, causing your error. One way around this is to use parentheses instead:
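For instance, once the whole expression is wrapped in parentheses, full-line comments can sit between chained method calls (a plain-Python sketch of the same idea):

```python
# Parentheses enable implicit line joining, so each piece of the
# chain can be preceded by a comment on its own line.
print(("some_string"
       # comment 1
       .upper()
       # comment 2
       .replace("_", " ")))  # prints SOME STRING
```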
When assigning to a variable, this would look like:
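For example (same toy chain as above, now bound to a name):

```python
result = ("some_string"
          # comment 1
          .upper()
          # comment 2
          .replace("_", " "))
print(result)  # SOME STRING
```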
Or in your case:
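An untested sketch of your snippet rewritten this way; it assumes `col`, `explode`, and `from_unixtime` come from `pyspark.sql.functions` as in your code:

```python
from pyspark.sql.functions import col, explode, from_unixtime

# comment 1
df = (df.withColumn('explosion', explode(col('col1')))
      .filter(col('explosion')['sub_col1'] == 'some_string')
      # comment 2
      .withColumn('sub_col2', from_unixtime(col('explosion')['sub_col2']))
      # comment 3
      .withColumn('sub_col3', from_unixtime(col('explosion')['sub_col3'])))
```

Because the whole right-hand side is parenthesized, no `\` is needed and the comment lines no longer break the statement.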