Using sub-generators for lexical scanning in Python
A few days ago I watched a very interesting talk by Rob Pike about writing a non-trivial lexer in Go. Rob discussed how the traditional switch-based state machine approach is cumbersome to write, because it’s not really compatible with the algorithm we want to express. The main problem is that when we return a new token, a traditional state-machine structure forces us to explicitly pack up the state of where we are and return to the caller. Especially in cases where we just want to stay in the same state, this makes code unnecessarily convoluted.
This struck a chord with me, because I’ve already written about simplifying state machine code in Python with coroutines. I couldn’t help but wonder what would be an elegant Pythonic way to implement Rob’s template lexer (watch the talk or take a look at his slides for the syntax).
What follows is my attempt, which uses the new yield from syntax from PEP 380 and hence requires Python 3.3 (which is currently in beta, but should be released soon). I’ll present the code in small chunks with explanations; the full source is available for download here. It’s heavily commented, so it should be easy to grok.
First, some helper types and constants:
CODE:
from collections import namedtuple

TOK_TEXT = 'TOK_TEXT'
TOK_LEFT_META = 'TOK_LEFT_META'
TOK_RIGHT_META = 'TOK_RIGHT_META'
TOK_PIPE = 'TOK_PIPE'
TOK_NUMBER = 'TOK_NUMBER'
TOK_ID = 'TOK_ID'
# A token has
# type: one of the TOK_* constants
# value: string value, as taken from input
#
Token = namedtuple('Token', 'type value')
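To make the idea concrete, here is a minimal sketch of how these token types could be produced by a generator that delegates to a sub-generator with yield from. It is not the lexer from the full source: the function names lex and lex_inside_action are hypothetical, the `{{` and `}}` delimiters are an assumption about the template syntax, and the number/identifier rules are deliberately simplified.

```python
from collections import namedtuple

TOK_TEXT = 'TOK_TEXT'
TOK_LEFT_META = 'TOK_LEFT_META'
TOK_RIGHT_META = 'TOK_RIGHT_META'
TOK_PIPE = 'TOK_PIPE'
TOK_NUMBER = 'TOK_NUMBER'
TOK_ID = 'TOK_ID'

Token = namedtuple('Token', 'type value')

# Assumed delimiters; the article itself does not spell them out.
LEFT_META, RIGHT_META = '{{', '}}'


def lex_inside_action(text, pos):
    """Hypothetical sub-generator: yields the tokens found inside an
    action and returns the position just past it (or the end of input)."""
    while pos < len(text):
        if text.startswith(RIGHT_META, pos):
            yield Token(TOK_RIGHT_META, RIGHT_META)
            return pos + len(RIGHT_META)
        ch = text[pos]
        if ch.isspace():
            pos += 1
        elif ch == '|':
            yield Token(TOK_PIPE, ch)
            pos += 1
        else:
            # Consume a run of characters up to whitespace, a pipe,
            # or the closing delimiter, and emit it as one token.
            end = pos
            while (end < len(text)
                   and not text[end].isspace()
                   and text[end] != '|'
                   and not text.startswith(RIGHT_META, end)):
                end += 1
            word = text[pos:end]
            yield Token(TOK_NUMBER if word.isdigit() else TOK_ID, word)
            pos = end
    return pos  # input ended before the action was closed


def lex(text):
    """Hypothetical top-level generator: yields plain-text tokens itself
    and hands each action over to the sub-generator."""
    pos = 0
    while pos < len(text):
        start = text.find(LEFT_META, pos)
        if start == -1:
            yield Token(TOK_TEXT, text[pos:])
            return
        if start > pos:
            yield Token(TOK_TEXT, text[pos:start])
        yield Token(TOK_LEFT_META, LEFT_META)
        # The sub-generator's tokens flow straight to our caller; its
        # return value tells us where to resume scanning.
        pos = yield from lex_inside_action(text, start + len(LEFT_META))


for tok in lex('Hi {{ name | 42 }}!'):
    print(tok)
```

The point of the sketch is that neither function has to pack up its state before handing a token to the caller: each one simply yields and later resumes where it left off, and the position reached by the sub-generator comes back as the value of the yield from expression.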
Source: ITPUB Blog, link: http://blog.itpub.net/301743/viewspace-740958/. If you repost this article, please credit the source; otherwise legal action may be pursued.