I want to use the Python csv reader but keep the quotes in the output. That is, I want:
>>> import csv
>>> s = '"simple|split"|test'
>>> reader = csv.reader([s], delimiter='|', skipinitialspace=True)
>>> next(reader)
['"simple|split"', 'test']
But I actually get:
['simple|split', 'test']
In my case I want the quoted string to be passed on still quoted.
I know the CSV reader is working as intended and my use case is an abuse of it, but is there some way to bend it to my will? Or do I have to write my own string parser?
Solution
You're going to have to write your own parser, because the quote handling lives in the C part of the module, specifically in parse_process_char in Modules/_csv.c:
    else if (c == dialect->quotechar &&
             dialect->quoting != QUOTE_NONE) {
        if (dialect->doublequote) {
            /* doublequote; " represented by "" */
            self->state = QUOTE_IN_QUOTED_FIELD;
        }
        else {
            /* end of quote part of field */
            self->state = IN_FIELD;
        }
    }
    else {
        /* normal character - save in field */
        if (parse_add_char(self, c) < 0)
            return -1;
    }
That "end of quote part of field" branch is what's eating your double quote. Alternatively, you might be able to remove that else branch and rebuild Python from source, but to be honest that isn't very maintainable.
Edit: Sorry, I meant add the parse_add_char call from the last else before self->state = IN_FIELD, so that the quote character gets added to the field.
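If you go the write-your-own route, a minimal sketch could look like the following. It is not a drop-in replacement for csv.reader: split_keep_quotes is a hypothetical helper name, and it assumes a single-character delimiter and quote character, one record per string, and no escape characters or skipinitialspace handling. It only tracks whether it is inside quotes and keeps every quote character it sees.

```python
def split_keep_quotes(line, delimiter="|", quotechar='"'):
    """Split line on delimiter, ignoring delimiters inside quotes.

    Hypothetical helper: quote characters are kept in the returned fields.
    """
    fields = []
    current = []
    in_quotes = False
    for ch in line:
        if ch == quotechar:
            # Toggle the quoted state, but keep the quote character itself.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == delimiter and not in_quotes:
            # Unquoted delimiter: the current field is complete.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    # Whatever is left after the last delimiter is the final field.
    fields.append("".join(current))
    return fields


print(split_keep_quotes('"simple|split"|test'))
# ['"simple|split"', 'test']
```

Because each quote character simply toggles the state, a doubled quote inside a quoted field ("") toggles twice and passes through unchanged, which is usually what you want when the goal is to leave the text as-is.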