Nixtu: Simple Tokenizer in Python

Friday, July 2, 2010

Simple Tokenizer in Python

Yesterday I came across a series that aims to build an ECMAScript interpreter in Python. As homework, the author set a simple task: implementing a tokenizer. I hadn't implemented one before, so I thought it might be fun to come up with one.

What is a tokenizer, then? Basically, it reads a given input string and returns a result: a sequence of tokens, each carrying some metadata such as its type. Instead of trying to complicate things further, it's probably easier if you take a look at my simple (hopefully!) implementation. I tried to keep it as readable as possible, really. :) Anyway, here we go:
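The listing from the original post is not part of this copy. To illustrate the idea anyway, here is a minimal sketch of a regex-based tokenizer. It is not the implementation from the post: the `Token` record, the `TOKEN_SPEC` categories and the `tokenize` function are all hypothetical names picked for this example, and the patterns cover only a tiny, assumed subset of ECMAScript.

```python
import re
from collections import namedtuple

# Hypothetical token record: a type label, the matched text and its offset.
Token = namedtuple("Token", ["type", "value", "position"])

# Assumed token categories, tried in this order. This is a toy subset,
# not the real ECMAScript lexical grammar.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("IDENTIFIER", r"[A-Za-z_$][A-Za-z0-9_$]*"),
    ("STRING", r"\"[^\"\n]*\"|'[^'\n]*'"),
    ("OPERATOR", r"[+\-*/=<>!]=?"),
    ("PUNCTUATION", r"[(){}\[\];,.]"),
    ("WHITESPACE", r"\s+"),
]

# One combined pattern with a named group per category.
MASTER_PATTERN = re.compile(
    "|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC)
)


def tokenize(source):
    """Split source into a list of Token tuples, skipping whitespace."""
    tokens = []
    position = 0
    while position < len(source):
        match = MASTER_PATTERN.match(source, position)
        if match is None:
            raise ValueError(
                "Unexpected character %r at position %d"
                % (source[position], position)
            )
        # lastgroup is the name of the category that matched.
        if match.lastgroup != "WHITESPACE":
            tokens.append(Token(match.lastgroup, match.group(), position))
        position = match.end()
    return tokens
```

With these assumed categories, `tokenize("var x = 42;")` yields five tokens: two identifiers (`var` and `x`), the operator `=`, the number `42` and the punctuation `;`. Keywords are deliberately not distinguished from identifiers here; that would be a natural next step once the simple cases pass.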

It's probably far from perfect (pointers are welcome), but it appears to work just fine in the given cases. I found the development work really straightforward, thanks to a test-driven approach. I started out with simple tests and slowly expanded the test suite. Later on it should be expanded to support more complicated cases.
