Use orjson for faster JSONL output #1019

Merged: 2 commits, Nov 13, 2024
Conversation

craigds (Member) commented Nov 13, 2024

Description

When generating a large (2GB) diff as JSON-Lines, orjson takes 20-30% less
time than the stdlib json module.

It may be possible to use orjson in other places, but note that it
doesn't support streaming encoding (iterencode), so it is of
limited use where we're trying to stream JSON diffs of huge
datasets.
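For context, a minimal sketch of what streaming encoding looks like with the stdlib: `json.JSONEncoder.iterencode()` yields the output in small string chunks, whereas `orjson.dumps()` returns one complete bytes object. The `stream_json` helper and the payload shape are hypothetical, for illustration only, and are not taken from this change.

```python
import io
import json


def stream_json(obj, out, encoder=None):
    """Write obj to a text stream chunk by chunk; return the number of chunks."""
    encoder = encoder or json.JSONEncoder()
    chunks = 0
    for chunk in encoder.iterencode(obj):
        # Each chunk is a small piece of the final document, so the
        # complete encoded string never has to exist in memory at once.
        out.write(chunk)
        chunks += 1
    return chunks


# Hypothetical diff-like payload, for illustration only.
payload = {"features": [{"id": i, "name": f"feature-{i}"} for i in range(3)]}
buf = io.StringIO()
n_chunks = stream_json(payload, buf)
```
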

This change uses it only for individual features in JSONL diffs, where
the lack of iterencode() isn't a concern.
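A rough sketch of the per-feature approach, not the code from this PR: each feature is a small object, so encoding it in one call and writing one line at a time keeps memory bounded without iterencode(). The helper names and the record shape are hypothetical; the stdlib fallback is only there so the sketch runs where orjson isn't installed.

```python
import io
import json

try:
    import orjson
except ImportError:  # fall back so the sketch still runs without orjson
    orjson = None


def encode_jsonl_line(obj):
    """Return one JSON-Lines record as newline-terminated UTF-8 bytes."""
    if orjson is not None:
        # orjson.dumps returns bytes; OPT_APPEND_NEWLINE adds the trailing "\n".
        return orjson.dumps(obj, option=orjson.OPT_APPEND_NEWLINE)
    text = json.dumps(obj, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8") + b"\n"


def write_jsonl(records, out):
    """Write each record as its own line to a binary stream; return the count."""
    count = 0
    for record in records:
        out.write(encode_jsonl_line(record))
        count += 1
    return count


# Hypothetical feature-change records, for illustration only.
records = [{"type": "feature", "change": {"+": {"fid": i}}} for i in range(3)]
buf = io.BytesIO()
written = write_jsonl(records, buf)
```

Because `records` can be any iterable, including a generator, only one feature needs to be encoded in memory at a time.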

orjson is dual licensed under MIT and Apache 2.

Related links:

refs #1018

Checklist:

  • Have you reviewed your own change?
  • Have you included test(s)?
  • Have you updated the changelog?

@craigds craigds merged commit 764953a into master Nov 13, 2024
37 checks passed
@craigds craigds deleted the orjson branch November 13, 2024 19:33
rcoup (Member) commented Nov 13, 2024

What is the indicative performance difference for #1018?
