Possible memory leak in parse
#119
Comments
interesting! do you happen to have a good source of data I could reproduce this with?
Here's my attempt at reproducing something approximating what you describe:

import { garbage } from '@ipld/garbage'
import * as json from '@ipld/dag-json'

const objTargetSize = 1024 * 100 // 100k
const utf8Dec = new TextDecoder()
const utf8Enc = new TextEncoder()

function nfmt (n) {
  return Intl.NumberFormat().format(n)
}

let lastGbg
const src = {
  async * [Symbol.asyncIterator] () {
    while (true) {
      await new Promise((resolve) => setImmediate(resolve))
      lastGbg = garbage(objTargetSize)
      yield utf8Dec.decode(json.encode(lastGbg))
    }
  }
}

let c = 0
let byts = 0
for await (const line of src) {
  let obj
  try {
    obj = json.decode(utf8Enc.encode(line))
  } catch (err) {
    if (err.message === 'Invalid encoded CID form' ||
        err.message === 'To parse non base32 or base58btc encoded CID multibase decoder must be provided' ||
        err.message === 'Non-base58btc character') {
      // @ipld/garbage may throw up an object with `{'/':'whatever'}` that @ipld/dag-json won't bork at encoding (yet)
      continue
    }
    console.log('Got error:')
    console.log(err.message)
    console.log('On content:')
    console.log(line)
    console.log('Object:')
    console.log(lastGbg)
  }
  if (obj === undefined) {
    throw new Error('undefined')
  }
  byts += line.length
  if (++c % 1000 === 0) {
    console.log('%s lines / %s Mb processed', nfmt(c), nfmt(Math.floor(byts / 1024 / 1024)))
  }
}

But running it (Node.js v20.5.1) just seems to hover somewhere above 100k in memory. I'll leave it running a bit longer but as of writing I've got up to ~800k lines / 50 Gb worth of decoding. Can you help tweak this to replicate your environment a bit better @alanshaw?
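To put numbers on "hovers" versus "grows", it may help to sample the process's own memory counters alongside the line count. The following is a minimal sketch, not part of the script above: `memSnapshot` and `createMemTracker` are hypothetical helper names, and the only real API used is Node's `process.memoryUsage()`.

```javascript
// Hypothetical helpers (not from the original script): sample Node's
// memory counters and remember the highest RSS seen so far.
function memSnapshot () {
  const { rss, heapUsed, external, arrayBuffers } = process.memoryUsage()
  const mb = (n) => Math.round(n / 1024 / 1024)
  return {
    rssMb: mb(rss), // total resident memory of the process
    heapUsedMb: mb(heapUsed), // live JS objects on the V8 heap
    externalMb: mb(external), // memory held outside the V8 heap by JS objects
    arrayBuffersMb: mb(arrayBuffers) // ArrayBuffer/Buffer backing stores
  }
}

function createMemTracker () {
  let peakRssMb = 0
  return {
    // Returns the current snapshot plus the running RSS peak, which makes
    // steady growth easier to tell apart from normal GC sawtooth.
    sample () {
      const snap = memSnapshot()
      peakRssMb = Math.max(peakRssMb, snap.rssMb)
      return { ...snap, peakRssMb }
    }
  }
}
```

One way to use it would be to create a tracker before the loop and add `console.log(tracker.sample())` inside the existing `++c % 1000 === 0` branch. A leak in decoded objects would be expected to show up in `heapUsedMb`, while retained byte buffers would more likely show up in `arrayBuffersMb` or `externalMb`.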
Parsing a stream of ndjson entries causes OOM in Node.js 20:
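The reporter's own snippet is not reproduced here. As a rough illustration only, the general pattern described (decode one entry per ndjson line and retain nothing between lines) might look like the sketch below. `splitLines` and `countEntries` are hypothetical names, and `JSON.parse` is a stand-in decoder so the sketch runs without dependencies.

```javascript
// Hypothetical sketch: split an async iterable of string chunks into
// complete lines, skipping empty ones.
async function * splitLines (chunks) {
  let buffered = ''
  for await (const chunk of chunks) {
    buffered += chunk
    let nl
    while ((nl = buffered.indexOf('\n')) !== -1) {
      const line = buffered.slice(0, nl)
      buffered = buffered.slice(nl + 1)
      if (line.length > 0) yield line
    }
  }
  // Emit a trailing entry that has no final newline
  if (buffered.length > 0) yield buffered
}

// `decode` is a parameter so a real decoder can be swapped in;
// JSON.parse is only a stand-in (assumption).
async function countEntries (chunks, decode = JSON.parse) {
  let count = 0
  for await (const line of splitLines(chunks)) {
    decode(line) // result is dropped, so each entry should be collectable
    count++
  }
  return count
}
```

To exercise @ipld/dag-json instead of the stand-in, `decode` could be passed as `(line) => json.decode(new TextEncoder().encode(line))`, mirroring the call in the reproduction script. If memory still grows with nothing retained by the loop itself, that would point at the decoder or the stream source rather than the caller.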