A spec compliant CSV parser in a (mostly) bare minimum of lines
csv_parse.py Collect code 2024-03-17 12:35:59 -04:00
csv_parse_no_regex.py Add the regex-free version of the parser 2024-03-17 12:40:49 -04:00
my_data.csv Collect code 2024-03-17 12:35:59 -04:00
README.md Add the regex-free version of the parser 2024-03-17 12:40:49 -04:00

A Spec-Compliant CSV Parser in 17 Lines of Python

Just an exercise I did for fun. I wanted to see how practical it would be to build a spec-compliant CSV parser in Python using parser combinators. Turns out, it's not that hard!
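If parser combinators are new to you, here's a rough sketch of the idea. It is not the code in csv_parse.py: every name below (`char`, `none_of`, `many`, `sep_by`, `fmap`) is made up for illustration, and it only handles unquoted fields, so it is nowhere near spec-compliant. It assumes a parser is a function that takes `(text, pos)` and returns `(value, new_pos)` on success or `None` on failure.

```python
# Illustrative only: hypothetical names, unquoted fields only (Python 3.8+).

def char(c):
    """Atomic parser: match the single character c."""
    def parse(text, pos):
        if pos < len(text) and text[pos] == c:
            return c, pos + 1
        return None
    return parse

def none_of(chars):
    """Atomic parser: match any one character that is not in chars."""
    def parse(text, pos):
        if pos < len(text) and text[pos] not in chars:
            return text[pos], pos + 1
        return None
    return parse

def many(p):
    """Combinator: apply p zero or more times and collect the results."""
    def parse(text, pos):
        values = []
        while (result := p(text, pos)) is not None:
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def fmap(f, p):
    """Combinator: run p, then transform its value with f."""
    def parse(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return f(value), new_pos
    return parse

def sep_by(p, sep):
    """Combinator: one or more p, separated by sep."""
    def parse(text, pos):
        first = p(text, pos)
        if first is None:
            return None
        value, pos = first
        values = [value]
        while (s := sep(text, pos)) is not None and (nxt := p(text, s[1])) is not None:
            value, pos = nxt
            values.append(value)
        return values, pos
    return parse

# An unquoted field is a (possibly empty) run of non-delimiter characters.
field = fmap("".join, many(none_of(',"\r\n')))
# A record is a comma-separated sequence of fields.
record = sep_by(field, char(","))

print(record("a,b,,c", 0))  # (['a', 'b', '', 'c'], 6)
```

The appeal is that small parsers like `field` compose into bigger ones like `record` without any shared state, which is what makes a very short implementation possible.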

My end result was about 17 lines (two helper functions, two atomic parsers, seven combinators, and six parsers). Admittedly, there are a couple of ugly lines. In particular, in the definition of the regex parser, I used lambda expressions as let expressions instead of writing a multi-line function definition, which looks a little sloppy. Still, I'm pretty happy with the result, especially given that conciseness was an explicit goal.
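To show what "lambda as let expression" means, here's a hedged sketch of a regex parser written both ways. This is not the actual line from csv_parse.py; the names `regex` and `regex_one_line` and the `(text, pos) -> (value, new_pos) or None` parser shape are assumptions made for the example.

```python
import re

# Multi-line version: bind the match object to a local name, then use it.
def regex(pattern):
    def parse(text, pos):
        m = re.compile(pattern).match(text, pos)
        return (m.group(), m.end()) if m else None
    return parse

# One-expression version: an immediately applied lambda stands in for
# "let m = <match> in <result>", since a lambda body can't contain statements.
regex_one_line = lambda pattern: lambda text, pos: (
    lambda m: (m.group(), m.end()) if m else None
)(re.compile(pattern).match(text, pos))

print(regex(r"[^,]*")("ab,c", 0))         # ('ab', 2)
print(regex_one_line(r"\d+")("x42", 1))   # ('42', 3)
```

Both versions behave the same. The second saves lines at the cost of readability, which is the trade-off described above.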

If you want to take a look, the code is in csv_parse.py, or you can clone the repo and run it. I also put together another version that doesn't use Python's regex module, which you can find in csv_parse_no_regex.py. The sample data in this repository is stolen from this handy repo of sample CSV files.

It's worth noting that this was built purely as an exercise. While it is spec-compliant and could in principle be used for real work, I would recommend using Python's built-in csv module, or just splitting on commas if you can get away with it. Please don't try to use this as a library.
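For reference, here's a small sketch of both recommended options using only the standard library. The sample data is made up for the example. It also shows when you can't get away with splitting on commas: as soon as a field contains a quoted comma or an escaped quote.

```python
import csv
import io

# Made-up sample: the second record has a quoted comma and escaped quotes.
data = 'id,name,quote\r\n1,"Doe, Jane","She said ""hi"""\r\n'

# The built-in csv module handles quoting and escaping.
rows = list(csv.reader(io.StringIO(data, newline="")))
print(rows[1])  # ['1', 'Doe, Jane', 'She said "hi"']

# Naive splitting breaks the quoted field apart and leaves the quotes in.
naive = data.splitlines()[1].split(",")
print(naive)    # ['1', '"Doe', ' Jane"', '"She said ""hi"""']
```

When reading from a real file, the csv documentation recommends opening it with `newline=""` so that newlines embedded in quoted fields are preserved.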