Scraping email #1

Open
rhm2k opened this issue Jan 23, 2024 · 2 comments

Comments

rhm2k commented Jan 23, 2024

The documentation that explains what one must do to scrape email from a Gmail account is very unclear.

  • I assume (but am not sure) that the credentials file I download from Google needs to be renamed 'credentials.json' and placed in /scrapper.
  • You mention scripts in the 'scrapper' directory, as well as 'final.sh' and 'makefileforingestion.sh'.
  • There's a vague reference that 'makefileforingestion.sh' may need to be given execute permissions.
  • Does 'makefileforingestion.sh' need to be run BEFORE restarting with the 'poetry run … ' command to start privategpt?
  • In the repo there is a file /scrapper/threads.txt, which appears to be the results of a previous scraping session.

MostlyKIGuess (Owner) commented

Oh yeah, thanks for raising the issue. I've removed my data 😭

  • I will update the documentation; I forgot to mention how that Python file works.

  • Basically, I mentioned tmux because final.sh can't be run in a single terminal window. It needs new sessions, and tmux is used for that.

  • But you can just run the scripts manually to make it work.

  • Yes, makefileforingestion.sh is supposed to run first, because it ingests the scraped data into the model. There's also another script inside scrapper; I will update the docs soon.

  • Thank you so much for letting me know about the data.
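
For anyone following along, the setup order discussed above can be sanity-checked with a small script. This is only a sketch under assumptions not confirmed in this thread: that the Google credentials file is saved as `scrapper/credentials.json`, and that `makefileforingestion.sh` sits in the repository root. The `preflight` function and its parameters are hypothetical names, not part of the repo.

```python
import stat
from pathlib import Path


def preflight(repo_root,
              credentials_rel="scrapper/credentials.json",  # assumed location
              script_rel="makefileforingestion.sh"):        # assumed location
    """Return a list of problems; make the ingestion script executable if present."""
    root = Path(repo_root)
    problems = []

    # The Gmail scraper presumably needs the downloaded Google credentials.
    creds = root / credentials_rel
    if not creds.is_file():
        problems.append(f"missing credentials file: {creds}")

    # Equivalent of `chmod u+x`: add the owner-execute bit only.
    script = root / script_rel
    if script.is_file():
        script.chmod(script.stat().st_mode | stat.S_IXUSR)
    else:
        problems.append(f"missing ingestion script: {script}")

    return problems


if __name__ == "__main__":
    for problem in preflight("."):
        print(problem)
```

If it prints nothing, both files were found and the ingestion script can be run before starting privategpt. Adjust the two paths if your checkout is laid out differently.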

MostlyKIGuess (Owner) commented

  • It's done now. Let me know if you are still getting any errors.
