Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: avoid conflict with double quote within data #233

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

aleen42
Copy link

@aleen42 aleen42 commented Sep 13, 2023

This PR contains:

  • bugfix
  • feature
  • refactor
  • tests
  • documentation
  • metadata

Breaking Changes?

  • yes
  • no

If yes, please describe the breakage.

Please Describe Your Changes

Strictly define what a stating quote is except for the case like that:

a,b"d

Copy link

@jcapcik jcapcik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was also looking for a way to account for quotes within an unquoted field, which some other parsers seem to handle just fine (Excel and Rainbow CSV being a few).

This change breaks stuff in a lot of situations though. However, it lead me to a slight change that seems to work at least in my current set of tests.

Instead of checking if i === 1, which is not correct since i is the zero-based index of the character in the entire file, not each line, check if i === start, which would indicate the start of the line:

const isStartingQuote = !isQuoted && buffer[i] === quote && (i === start || buffer[i - 1] === comma)

So we only consider a quote a starting quote if we aren't within quotes already, and it either is the first character of the line or it follows a comma.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants