Free search string tokenization in Python

Want to do some simple lexical parsing in Python? With the standard library's shlex module, you may be able to get something that meets your requirements almost for free. Here is an example I used recently to tokenize a search string. The requirements were that tokens could be separated by spaces or commas, and that a double-quoted phrase counts as a single token.

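import shlex

def _tokens(query):
    # shlex.split applies shell-style rules: it splits on whitespace
    # and keeps a quoted phrase together as a single token.
    return shlex.split(str(query))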

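Examples (note that in the first one the commas stay attached to the tokens, because shlex.split only splits on whitespace):
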
>>> _tokens("java, perl, c++")
['java,', 'perl,', 'c++']

>>> _tokens("java perl c++")
['java', 'perl', 'c++']

>>> _tokens("java perl c++ \"Phil's Staffing\"")
['java', 'perl', 'c++', "Phil's Staffing"]

Chase Seibert