How to write a language with Python-like indentati

Posted 2019-06-21 17:22

I'm writing a tool with its own built-in language, similar to Python. I want indentation to be meaningful in the syntax, so that tabs and spaces at the beginning of a line represent the nesting of commands.

What is the best way to do this?

I've written recursive-descent and finite automata parsers before.

Tags: parsing

3 answers

Melony?

#2 · 2019-06-21 17:33

Check out the Python compiler, and in particular compiler.parse (note that the compiler module is Python 2 only; it was removed in Python 3).


#3 · 2019-06-21 17:38

I'd suggest ANTLR for any lexer/parser generation ( http://www.antlr.org ).

Also, this website ( http://erezsh.wordpress.com/2008/07/12/python-parsing-1-lexing/ ) has some more information, in particular:

Python’s indentation cannot be solved with a DFA. (I’m still perplexed at whether it can even be solved with a context-free grammar).

PyPy produced an interesting post about lexing Python (they intend to solve it using post-processing the lexer output)

CPython’s tokenizer is written in C. It’s ad-hoc, hand-written, and complex. It is the only official implementation of Python lexing that I know of.
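To see what that tokenizer actually produces for indented code, one option is the standard-library tokenize module, which exposes the same token stream from Python. The following is only a small sketch: the sample source and the helper name layout_tokens are made up for illustration.

```python
import io
import token
import tokenize


def layout_tokens(text):
    """Return the names of the layout-related tokens emitted for `text`."""
    wanted = {token.NEWLINE, token.INDENT, token.DEDENT}
    readline = io.StringIO(text).readline
    return [
        token.tok_name[tok.type]
        for tok in tokenize.generate_tokens(readline)
        if tok.type in wanted
    ]


# Hypothetical sample input: one indented block followed by a top-level line.
sample = "if x:\n    y = 1\nz = 2\n"
print(layout_tokens(sample))
```

The INDENT token shows up right after the NEWLINE that ends the `if x:` line, and the matching DEDENT is emitted before the first token of `z = 2`, so a parser never has to look at whitespace itself.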


何必那么认真

#4 · 2019-06-21 17:52

CPython's current parser seems to be generated using something called ASDL.

Regarding the indentation you're asking for, it's done using special lexer tokens called INDENT and DEDENT. To replicate that, just implement those tokens in your lexer (that is pretty easy if you use a stack to store the starting columns of previous indented lines), and then plug them into your grammar as usual (like any other keyword or operator token).
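The stack idea above can be sketched roughly as follows. This is an illustration under stated assumptions, not CPython's actual code: the function name tokenize_indentation, the ("LINE", text) token shape, the 8-column tab stops, and the treatment of `#` lines as comments are all choices made for the example. A real lexer would split each line into finer tokens instead of yielding it whole.

```python
def tokenize_indentation(source):
    """Yield (kind, value) pairs, adding INDENT/DEDENT when nesting changes."""
    indent_stack = [0]  # starting columns of the currently open blocks
    for line_number, raw_line in enumerate(source.splitlines(), start=1):
        line = raw_line.expandtabs(8)  # assumption: tab stops every 8 columns
        stripped = line.lstrip(" ")
        if not stripped or stripped.startswith("#"):
            continue  # blank and comment-only lines do not affect nesting
        column = len(line) - len(stripped)
        if column > indent_stack[-1]:
            # Deeper than the enclosing block: open a new one.
            indent_stack.append(column)
            yield ("INDENT", column)
        else:
            # Shallower: close blocks until we are back at a known column.
            while column < indent_stack[-1]:
                indent_stack.pop()
                yield ("DEDENT", column)
            if column != indent_stack[-1]:
                raise IndentationError(
                    f"line {line_number}: dedent does not match any outer level"
                )
        yield ("LINE", stripped.rstrip())
    # Close any blocks still open at end of input.
    while len(indent_stack) > 1:
        indent_stack.pop()
        yield ("DEDENT", 0)


if __name__ == "__main__":
    program = "loop\n    step\n        inner\n    step2\ndone\n"
    for kind, value in tokenize_indentation(program):
        print(kind, value)
```

Once the lexer emits these tokens, a recursive-descent grammar can treat a block as INDENT, one or more statements, DEDENT, exactly as it would treat an opening and closing brace.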
