Using sub-generators for lexical scanning in Python
A few days ago I watched a very interesting talk by Rob Pike about writing a non-trivial lexer in Go. Rob discussed how the traditional switch-based state machine approach is cumbersome to write, because it doesn’t match the shape of the algorithm we want to express. The main problem is that whenever we emit a token, a traditional state-machine structure forces us to explicitly pack up the state of where we are before returning to the caller. Especially in cases where we just want to stay in the same state, this makes the code unnecessarily convoluted.
This struck a chord with me, because I’ve already written about simplifying state machine code in Python with coroutines. I couldn’t help but wonder what an elegant, Pythonic implementation of Rob’s template lexer would look like (watch the talk or take a look at his slides for the syntax).
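To make the complaint concrete, here is a minimal sketch of the explicit style, written in Python rather than Go. It is not Rob’s code: the SwitchLexer class, its WORD and NUMBER states, and the toy input are all made up for illustration. Note how the current state and position have to live on the object so that next_token can pick up where the previous call left off.

```python
class SwitchLexer:
    """Toy explicit state machine (hypothetical example, not from the talk)."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        # The state must be stored between calls, because every returned
        # token hands control back to the caller.
        self.state = 'WORD'

    def next_token(self):
        text = self.text
        while self.pos < len(text):
            start = self.pos
            if self.state == 'WORD':
                # Consume everything up to the next digit.
                while self.pos < len(text) and not text[self.pos].isdigit():
                    self.pos += 1
                self.state = 'NUMBER'   # remember where to resume
                if self.pos > start:
                    return ('WORD', text[start:self.pos])
            elif self.state == 'NUMBER':
                # Consume a run of digits.
                while self.pos < len(text) and text[self.pos].isdigit():
                    self.pos += 1
                self.state = 'WORD'     # remember where to resume
                if self.pos > start:
                    return ('NUMBER', text[start:self.pos])
        return None                      # end of input


lexer = SwitchLexer('abc123de')
tokens = []
tok = lexer.next_token()
while tok is not None:
    tokens.append(tok)
    tok = lexer.next_token()
```

Even in this tiny case, the bookkeeping (self.state, self.pos) is spread through the scanning logic, and it only grows as states are added.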
What follows is my attempt, which uses the new yield from syntax from PEP 380, and hence requires Python 3.3 (which is currently in beta, but should be released soon). I’ll present the code in small chunks with explanations; the full source is available for download here. It’s heavily commented, so it should be easy to grok.
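For readers who haven’t used it yet, here is a small sketch of what yield from does. The generator names (digits, letters, combined) are hypothetical and have nothing to do with the lexer; the point is only that the delegating generator passes through everything the sub-generator yields, and then receives the sub-generator’s return value.

```python
def digits():
    # Sub-generator: yields values, then returns a result to whoever
    # delegated to it with 'yield from'.
    yield 1
    yield 2
    return 'digits done'


def letters():
    yield 'a'
    yield 'b'
    return 'letters done'


def combined():
    # 'yield from' forwards each value from the sub-generator to our own
    # caller; the whole expression evaluates to the sub-generator's
    # return value once it is exhausted.
    result = yield from digits()
    yield result
    result = yield from letters()
    yield result


items = list(combined())
print(items)
```

The caller of combined() sees a single flat stream of values and never needs to know that sub-generators were involved, which is the property a lexer made of per-state generators can rely on.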
First, some helper types and constants:
from collections import namedtuple

TOK_TEXT = 'TOK_TEXT'
TOK_LEFT_META = 'TOK_LEFT_META'
TOK_RIGHT_META = 'TOK_RIGHT_META'
TOK_PIPE = 'TOK_PIPE'
TOK_NUMBER = 'TOK_NUMBER'
TOK_ID = 'TOK_ID'

# A token has
#   type:   one of the TOK_* constants
#   value:  string value, as taken from input
#
Token = namedtuple('Token', 'type value')
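To show how these pieces might fit together, here is a simplified sketch of a lexer built from sub-generators. It is not the full implementation from the download: it assumes {{ and }} as the meta delimiters, and the function names lex and lex_inside_action, as well as the rule that identifiers consist of letters, digits and underscores, are my own simplifications for illustration.

```python
from collections import namedtuple

TOK_TEXT, TOK_LEFT_META, TOK_RIGHT_META = 'TOK_TEXT', 'TOK_LEFT_META', 'TOK_RIGHT_META'
TOK_PIPE, TOK_NUMBER, TOK_ID = 'TOK_PIPE', 'TOK_NUMBER', 'TOK_ID'
Token = namedtuple('Token', 'type value')

LEFT_META, RIGHT_META = '{{', '}}'   # assumed delimiters


def lex(text):
    """Yield Tokens for text: plain text outside {{ }}, small tokens inside."""
    pos = 0
    while pos < len(text):
        start = text.find(LEFT_META, pos)
        if start < 0:
            # No more actions: the rest of the input is plain text.
            yield Token(TOK_TEXT, text[pos:])
            return
        if start > pos:
            yield Token(TOK_TEXT, text[pos:start])
        yield Token(TOK_LEFT_META, LEFT_META)
        # Delegate to the sub-generator for the inside of the action. Its
        # tokens flow straight to our caller, and its return value tells us
        # where to resume scanning plain text.
        pos = yield from lex_inside_action(text, start + len(LEFT_META))


def lex_inside_action(text, pos):
    """Sub-generator: yield tokens up to and including }}; return the new position."""
    while pos < len(text):
        if text.startswith(RIGHT_META, pos):
            yield Token(TOK_RIGHT_META, RIGHT_META)
            return pos + len(RIGHT_META)
        ch = text[pos]
        if ch.isspace():
            pos += 1
        elif ch == '|':
            yield Token(TOK_PIPE, ch)
            pos += 1
        elif ch.isdigit():
            end = pos
            while end < len(text) and text[end].isdigit():
                end += 1
            yield Token(TOK_NUMBER, text[pos:end])
            pos = end
        elif ch.isalpha() or ch == '_':
            end = pos
            while end < len(text) and (text[end].isalnum() or text[end] == '_'):
                end += 1
            yield Token(TOK_ID, text[pos:end])
            pos = end
        else:
            raise ValueError('unexpected character %r at position %d' % (ch, pos))
    raise ValueError('unterminated action')


tokens = list(lex('Hi {{name | pad 10}}!'))
for tok in tokens:
    print(tok)
```

Each function reads as a straight-line description of one lexer state, and no state variable has to be saved when a token is emitted: the suspended generator frames hold it for us.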
From the ITPUB blog, link: http://blog.itpub.net/301743/viewspace-740958/. If you repost this, please credit the source; otherwise legal liability will be pursued.