2024-03-17 16:35:59 +00:00

# A Spec-compliant CSV Parser in 17 Lines of Python

This is just an exercise I did for fun. I wanted to see how practical it would be to build a spec-compliant CSV parser in Python using parser combinators. Turns out, it's not that hard!

My end result was about 17 lines (two helper functions, two atomic parsers, seven combinators, and six parsers). Admittedly, there are a couple of ugly lines. In particular, in the definition of the regex parser, I used lambda expressions as let expressions rather than a multi-line function definition, which looks a little sloppy. Still, I'm pretty happy with the result, especially given that conciseness was an explicit goal.

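
For readers who haven't seen the lambda-as-let trick, here is a minimal sketch of what it can look like for a regex parser. It is not the code from `csv_parse.py`: the names `regex`, `regex_verbose`, and `unquoted`, and the convention that a parser takes `(text, pos)` and returns `(value, new_pos)` or `None`, are all assumptions made for illustration.

```python
import re

# Assumed convention for this sketch only: a parser is a function that takes
# (text, pos) and returns (value, new_pos) on success or None on failure.

# One-expression version: the inner lambda acts as a `let`, binding the match
# object to `m` so it can be tested and then used within a single expression.
regex = lambda pattern: lambda text, pos: (
    lambda m: (m.group(), m.end()) if m else None
)(re.compile(pattern).match(text, pos))


# The equivalent multi-line definition that the one-liner avoids.
def regex_verbose(pattern):
    compiled = re.compile(pattern)

    def parse(text, pos):
        m = compiled.match(text, pos)
        return (m.group(), m.end()) if m else None

    return parse


# Hypothetical atomic parser: a run of characters allowed in an unquoted field.
unquoted = regex(r'[^,"\r\n]*')
print(unquoted("abc,def", 0))  # ('abc', 3)
```

The one-liner saves a few lines at the cost of readability, which is roughly the trade-off described above.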

If you want to take a look, the code is in [`csv_parse.py`](./csv_parse.py), or you can clone the repo and run it. I also put together another version that doesn't use Python's regex module, which you can find in [`csv_parse_no_regex.py`](./csv_parse_no_regex.py). The sample data in this repository is stolen from [this handy repo of sample CSV files](https://github.com/datablist/sample-csv-files).

It's worth noting that this was built purely as an exercise. While it is spec-compliant and could in theory be used for real work, I would recommend using Python's built-in `csv` module instead, or just splitting on commas if you can get away with it. Please don't try to use this as a library.
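
For comparison, here is a rough sketch of the two recommended alternatives, using only the standard library. The sample data is made up for illustration and is not taken from this repository.

```python
import csv
import io

# Made-up sample data: a quoted field containing a comma, and a field with
# an escaped (doubled) double quote.
data = 'name,quote\r\n"Doe, Jane","She said ""hi"""\r\n'

# csv.reader accepts any iterable of lines. newline="" leaves line endings
# untranslated so the csv module can handle them itself.
rows = list(csv.reader(io.StringIO(data, newline="")))
print(rows)  # [['name', 'quote'], ['Doe, Jane', 'She said "hi"']]

# Splitting on commas only works when no field contains commas, quotes,
# or newlines; here the quoted field is torn in two.
naive = '"Doe, Jane",42'.split(",")
print(naive)  # ['"Doe', ' Jane"', '42']
```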