
Create 100 billion transactions #26

Closed
mgoelswirlds opened this issue Aug 23, 2022 · 1 comment · May be fixed by jasperpotts/hedera-records-processor#6
Assignees: MarcKriguerAtHedera
Labels: BigDataPOC Mirror Node POC

Comments

@mgoelswirlds
Collaborator

Currently we have 2.5 billion transactions in Pinot, generated by the importer.
We need to write code to either duplicate this data or generate fake data.

@mgoelswirlds mgoelswirlds added the BigDataPOC Mirror Node POC label Aug 23, 2022
@mgoelswirlds mgoelswirlds moved this to 🏃‍♀ Sprint backlog in Mirror Node Aug 29, 2022
@MarcKriguerAtHedera MarcKriguerAtHedera moved this from 🏃‍♀ Sprint backlog to 👷 In progress in Mirror Node Aug 29, 2022
@MarcKriguerAtHedera MarcKriguerAtHedera self-assigned this Aug 29, 2022
@MarcKriguerAtHedera
Collaborator

The current transactions take up 9800 files, from "transaction_1.avro" through "transaction_9799.avro".

I am going to write a Java program that takes 3 arguments:

  • input filename
  • output filename
  • number of 3-year shifts to apply to the consensus_timestamp field.

For every record in the file, it will apply that offset (e.g. "1" -> all dates get shifted forward by 3 years, "-1" -> all dates get shifted back by 3 years). We will eventually end up making 39 copies of the data: 16 going back in time (the earliest timestamp would be 48 years before September 2019, i.e. September 1971, still after the Unix epoch) and 23 going forward in time (to August 2091). Once the program is written, we can test it on a single file, verify it works, then determine whether we want to copy all of the existing "transactions" (with a -3-year time shift) or just a fraction of them.
Eventually, we will want 39 copies of the data ingested into Pinot, but it doesn't all have to be within one directory at the same time.
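A minimal sketch of such a program, using the Avro generic API, is below. This is an illustration rather than the final implementation: the class name is hypothetical, it assumes consensus_timestamp is a long holding nanoseconds since the Unix epoch, and it ignores leap years when computing the 3-year offset.

```java
import java.io.File;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class ShiftConsensusTimestamps {

    // Assumption: consensus_timestamp is nanoseconds since the Unix epoch.
    // 3 years approximated as 3 * 365 days (leap years ignored for test data).
    private static final long NANOS_PER_THREE_YEARS =
            3L * 365 * 24 * 60 * 60 * 1_000_000_000L;

    public static void main(String[] args) throws Exception {
        File input = new File(args[0]);                 // input filename
        File output = new File(args[1]);                // output filename
        long shifts = Long.parseLong(args[2]);          // number of 3-year shifts, e.g. 1 or -1
        long offsetNanos = shifts * NANOS_PER_THREE_YEARS;

        GenericDatumReader<GenericRecord> datumReader = new GenericDatumReader<>();
        try (DataFileReader<GenericRecord> reader = new DataFileReader<>(input, datumReader)) {
            // Reuse the input file's schema for the output file.
            Schema schema = reader.getSchema();
            GenericDatumWriter<GenericRecord> datumWriter = new GenericDatumWriter<>(schema);
            try (DataFileWriter<GenericRecord> writer = new DataFileWriter<>(datumWriter)) {
                writer.create(schema, output);
                GenericRecord record = null;
                while (reader.hasNext()) {
                    record = reader.next(record);
                    long ts = (Long) record.get("consensus_timestamp");
                    record.put("consensus_timestamp", ts + offsetNanos);
                    writer.append(record);
                }
            }
        }
    }
}
```

Hypothetical usage for a single file, shifting it back by 3 years: `java ShiftConsensusTimestamps transaction_1.avro transaction_1_minus3y.avro -1`.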

@MarcKriguerAtHedera MarcKriguerAtHedera moved this from 👷 In progress to 👀 In review in Mirror Node Sep 6, 2022
Repository owner moved this from 👀 In review to ✅ Done in Mirror Node Sep 20, 2022