Lidiasaes/PDF2texto


PDF2texto ✨

Welcome to PDF2texto ✨

This is an open-source project that deploys a web page where you can upload a PDF file and download its content as a .txt file.

There are 2 versions available:

  • PDF2texto.v.1: only works for PDFs with selectable text (the code is in this repo).
  • PDF2texto.v.2: works for any type of PDF file, including scanned ones (OCR implemented; the code is in the Streamlit_Colaboratory folder).
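
To illustrate the difference between the two versions, here is a minimal sketch of the upload, extract, download flow. It is not the code from this repo: it assumes the third-party packages pypdf and streamlit are installed, and the helper names (txt_name, needs_ocr, extract_pages, render_app) are hypothetical. When a PDF has no selectable text, extraction returns empty pages, which is the case where the OCR version is needed.

```python
from io import BytesIO
from typing import List


def txt_name(pdf_name: str) -> str:
    """Map an uploaded file name such as 'report.pdf' to 'report.txt'."""
    stem = pdf_name[:-4] if pdf_name.lower().endswith(".pdf") else pdf_name
    return stem + ".txt"


def needs_ocr(page_texts: List[str]) -> bool:
    """True when no page yielded any selectable text (likely a scanned PDF)."""
    return not any(text.strip() for text in page_texts)


def extract_pages(pdf_bytes: bytes) -> List[str]:
    """Return the selectable text of each page (assumes pypdf is installed)."""
    from pypdf import PdfReader  # third-party, imported lazily

    reader = PdfReader(BytesIO(pdf_bytes))
    return [page.extract_text() or "" for page in reader.pages]


def render_app() -> None:
    """Hypothetical Streamlit page: upload a PDF, download its text."""
    import streamlit as st  # third-party, imported lazily

    uploaded = st.file_uploader("Upload a PDF", type="pdf")
    if uploaded is None:
        return  # nothing uploaded yet
    pages = extract_pages(uploaded.getvalue())
    if needs_ocr(pages):
        st.warning("No selectable text found; this file probably needs the OCR version.")
        return
    st.download_button(
        "Download .txt", "\n".join(pages), file_name=txt_name(uploaded.name)
    )
```

To try it, call render_app() at the bottom of a script and launch that script with `streamlit run`.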

How does it work?

How to use it?

  1. Run the Colaboratory notebook cell by cell.
  2. The last cell will return an output similar to this one:
You can now view your Streamlit app in your browser.
 
Local URL: http://localhost:8501

Network URL: http://123.45.6.78:8501

External URL: http://12.345.678.90:8501
 
 
npx: installed 22 in 2.441s
 
your url is: https://nice-forks-start.loca.lt
  3. In 'External URL', copy the number that appears between http:// and the port (:8501). In this example: 12.345.678.90
  4. Click the link shown after 'your url is:'. It will open a new window.
  5. Paste the number you copied in step 3 and click 'Submit'. The app will start running ✨✨✨
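
If you would rather not copy the values by hand, the steps above can be sketched in a few lines of Python. This is only an illustration that assumes the cell output looks like the example shown here; SAMPLE_OUTPUT, external_address and tunnel_url are hypothetical names, not part of the notebook.

```python
import re
from typing import Optional

# Example output copied from the steps above (assumed format).
SAMPLE_OUTPUT = """\
Local URL: http://localhost:8501
Network URL: http://123.45.6.78:8501
External URL: http://12.345.678.90:8501
your url is: https://nice-forks-start.loca.lt
"""


def external_address(output: str) -> Optional[str]:
    """Return the host between 'http://' and the port on the 'External URL' line."""
    match = re.search(r"External URL:\s*https?://([^:/\s]+)", output)
    return match.group(1) if match else None


def tunnel_url(output: str) -> Optional[str]:
    """Return the link printed after 'your url is:'."""
    match = re.search(r"your url is:\s*(\S+)", output)
    return match.group(1) if match else None


print(external_address(SAMPLE_OUTPUT))  # value to paste in the new window
print(tunnel_url(SAMPLE_OUTPUT))        # link to open
```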

I used this tutorial as a starting point for building the website with Streamlit.

Language2Language ✨

This is the second part of the project. You can upload a .txt file, select the source language and the target language, and have it translated automatically! Type the languages as they appear in this list. The models used are found here, in Helsinki-NLP/Opus-MT.

Not all languages are supported, as this is a prototype. If you want more, just add them easily in the code! ✨
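
As a rough idea of what adding a language can look like, here is a minimal sketch. It is not the code from this repo: the LANGUAGE_CODES table and the model_id and translate helpers are hypothetical, and it assumes the transformers package is installed. The Helsinki-NLP models on Hugging Face are generally named opus-mt-{source}-{target}, but not every language pair has a model, so check that the pair exists before adding it.

```python
# Hypothetical table of supported languages; extend it to add more.
LANGUAGE_CODES = {
    "english": "en",
    "spanish": "es",
    "french": "fr",
    "german": "de",
}


def model_id(source: str, target: str) -> str:
    """Build the Hugging Face model name for a source/target language pair."""
    try:
        src = LANGUAGE_CODES[source.strip().lower()]
        tgt = LANGUAGE_CODES[target.strip().lower()]
    except KeyError as err:
        raise ValueError(f"Unsupported language: {err.args[0]}") from None
    return f"Helsinki-NLP/opus-mt-{src}-{tgt}"


def translate(text: str, source: str, target: str) -> str:
    """Translate text (assumes transformers is installed; downloads the model)."""
    from transformers import pipeline  # third-party, imported lazily

    translator = pipeline("translation", model=model_id(source, target))
    return translator(text)[0]["translation_text"]
```

For example, translate("Hello, world!", "English", "Spanish") would load Helsinki-NLP/opus-mt-en-es. Very long .txt files may need to be split into smaller pieces first, since these models only accept a limited input length.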

You may need a Hugging Face account in order to get your own Access Token.
