Enable manually specifying the desired OPUS model? #74

MoritzLaurer · 2022-07-31T10:13:15Z

I really like the library, great work!
Is there a way to manually specify which OPUS model to use? For example, EasyNMT with OPUS currently does not support English as the source and Portuguese as the target language, because it tries to download 'opus-mt-en-pt' by default, which does not exist.
There is, however, an en2pt model on the hub now (https://huggingface.co/Helsinki-NLP/opus-mt-tc-big-en-pt) with a slightly different name. I don't know how to tell EasyNMT to use this specific model instead of throwing the following error:

OSError: Helsinki-NLP/opus-mt-en-pt is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
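
One possible workaround is to resolve the model name yourself before loading anything. The sketch below is only an illustration: the default naming pattern is inferred from the error message above, and `DEFAULT_PATTERN`, `MODEL_OVERRIDES`, and `resolve_opus_model` are hypothetical names that are not part of EasyNMT.

```python
# Hypothetical helper, not part of EasyNMT: map a language pair to a Hub model ID.

# Naming pattern inferred from the error message in this issue (an assumption).
DEFAULT_PATTERN = "Helsinki-NLP/opus-mt-{src}-{tgt}"

# Language pairs whose model on the Hub does not follow the default pattern.
MODEL_OVERRIDES = {
    ("en", "pt"): "Helsinki-NLP/opus-mt-tc-big-en-pt",
}


def resolve_opus_model(src: str, tgt: str) -> str:
    """Return the override for (src, tgt) if one exists, else the default name."""
    return MODEL_OVERRIDES.get((src, tgt), DEFAULT_PATTERN.format(src=src, tgt=tgt))


print(resolve_opus_model("en", "pt"))
print(resolve_opus_model("nl", "en"))
```

The resolved name could then be handed to whatever loads the model; whether EasyNMT itself can accept such an override is exactly the open question of this issue.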

lucasfariaslf · 2022-11-10T23:29:15Z

For some specific sentences, it also seems that model.translate constructs a non-existent model identifier. For example, for some sentences in Dutch, instead of the correct Helsinki-NLP/opus-mt-nl-en, it looks for the non-existent Helsinki-NLP/opus-mt-nds-en, which then throws the same error @MoritzLaurer mentioned.

glowinthedark · 2022-12-12T18:42:25Z

You can bypass easynmt entirely and do it for example with transformers (pip install transformers):

```python
from transformers import pipeline

pipe = pipeline("translation", model="Helsinki-NLP/opus-mt-tc-big-en-pt")
print(pipe("Nobody Expects the Spanish Inquisition")[0]["translation_text"])
```

In this case, however, you'll need to handle sentence splitting yourself, so it's not as easy as with easynmt. Alternatively, you can use EasyNMT.sentence_splitting(): https://github.com/UKPLab/EasyNMT/blob/main/easynmt/EasyNMT.py#L444
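
A rough sketch of what handling the sentence splitting manually could look like is shown below. The regex splitter is a deliberately naive stand-in (an assumption, not what EasyNMT does), and `split_sentences`, `translate_document`, and `translate_fn` are hypothetical names used only for illustration.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def translate_document(text: str, translate_fn: Callable[[List[str]], List[str]]) -> str:
    """Split text into sentences, translate them as one batch, and rejoin them."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    return " ".join(translate_fn(sentences))


# With transformers installed, translate_fn could wrap the pipeline shown above:
# pipe = pipeline("translation", model="Helsinki-NLP/opus-mt-tc-big-en-pt")
# translate_fn = lambda batch: [r["translation_text"] for r in pipe(batch)]
```

Passing the translation step in as a function keeps the splitting logic independent of any particular model. A real splitter would also need to cope with abbreviations and languages that do not use these punctuation marks.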

tansaku · 2023-02-20T17:20:27Z

I'm seeing this same problem:

Helsinki-NLP/opus-mt-pt-en is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'

This is a bit of a blocker for me, as I'm trying to translate multiple different languages, and it would be nice if easynmt just handled them all correctly. Does anyone know how to go about fixing this?

wasifferoze · 2024-06-03T18:03:53Z

Adding source_lang resolved this problem for me.
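
Presumably this refers to the source_lang argument of translate. When it is passed, the source language no longer has to be detected automatically, which would plausibly avoid lookups such as the opus-mt-nds-en one reported earlier in this thread. It does not help when the requested pair simply has no model under the default name, as with en-pt. A minimal sketch follows; the wrapper name `translate_with_known_source` is hypothetical.

```python
from typing import List


def translate_with_known_source(model, sentences: List[str],
                                source_lang: str, target_lang: str) -> List[str]:
    """Translate with an explicit source language so no detection is needed."""
    return model.translate(sentences, source_lang=source_lang, target_lang=target_lang)


# Example with EasyNMT (requires `pip install easynmt` and a model download):
# from easynmt import EasyNMT
# model = EasyNMT("opus-mt")
# print(translate_with_known_source(model, ["Dit is een zin."], "nl", "en"))
```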
