-
Hi, I am looking for ways to load a pre-trained PyTorch model in Java and then continue to fine-tune both the pre-trained model and additional layers (via JIT, or any other mechanism similar to torch.load_state_dict). I found some discussion suggesting that this is possible: pytorch/pytorch#17614

So far, with the javacpp-presets for PyTorch, it seems we can load JIT modules, which means we can run forward passes on a pre-trained model. Is there a way to run backward passes on JIT-compiled TorchScript models as well? I can give the C++ suggestions here a try: pytorch/pytorch#17614 (comment)

BTW, thanks @saudet for suggesting this package in another forum 😄.
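For concreteness, here is a minimal sketch of the forward-plus-backward flow I have in mind, written against the javacpp-presets bindings. The names (`JitModule`, `IValueVector`, `load()`, the `model.pt` path, and the tensor shapes) are my assumptions based on how the presets mirror the libtorch C++ API, so they may differ by preset version:

```java
import org.bytedeco.pytorch.*;
import static org.bytedeco.pytorch.global.torch.*;

public class FineTuneJitSketch {
    public static void main(String[] args) {
        // Assumed mapping of torch::jit::load(): load a TorchScript
        // module exported from Python with torch.jit.trace/script.
        // "model.pt" and the shapes below are placeholders.
        JitModule module = load("model.pt");
        module.train(true); // training mode, as in torch::jit::Module::train()

        // Forward pass: JIT modules take IValues rather than raw Tensors.
        IValueVector inputs = new IValueVector();
        inputs.push_back(new IValue(ones(1, 784)));
        Tensor output = module.forward(inputs).toTensor();

        // Backward pass: autograd records the forward graph, so calling
        // backward() on a scalar loss populates the parameters' gradients.
        Tensor target = zeros(1, 10);
        Tensor loss = mse_loss(output, target);
        loss.backward();
        // An optimizer step over the module's parameters would follow here,
        // assuming the parameters() accessor is exposed by the bindings.
    }
}
```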
-
Looks like we'll need to work on mapping that parameters() function, yes.
-
It looks like it's now possible to train with higher-level functions. BTW, I have already proposed using JavaCPP for PyTorch in DJL. They are already using it for TensorFlow, but TensorFlow's C++ API is quite limited, so I think it would be nice if DJL or anyone else could provide a user-friendly Java API for PyTorch, which should be a lot easier to do with JavaCPP than manually with JNI, as DJL is doing right now.
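Roughly, a training loop with those higher-level functions could look like the sketch below. It builds a small non-JIT network; names like `LinearImpl`, `SGD`, and `SGDOptions` follow the mapped C++ classes, and the batch tensors are synthetic stand-ins, so treat this as illustrative rather than tested:

```java
import org.bytedeco.pytorch.*;
import org.bytedeco.pytorch.Module;
import static org.bytedeco.pytorch.global.torch.*;

public class TrainSketch {
    // A small network built from the mapped torch::nn module classes.
    static class Net extends Module {
        final LinearImpl fc1, fc2;
        Net() {
            fc1 = register_module("fc1", new LinearImpl(784, 64));
            fc2 = register_module("fc2", new LinearImpl(64, 10));
        }
        Tensor forward(Tensor x) {
            x = relu(fc1.forward(x));
            return log_softmax(fc2.forward(x), 1);
        }
    }

    public static void main(String[] args) {
        Net net = new Net();
        // SGD over the module's parameters, mirroring torch::optim::SGD.
        SGD optimizer = new SGD(net.parameters(), new SGDOptions(0.01));

        for (int step = 0; step < 100; step++) {
            Tensor input = randn(64, 784);                  // stand-in batch
            Tensor target = randint(0, 10, new long[]{64}); // stand-in labels
            optimizer.zero_grad();
            Tensor loss = nll_loss(net.forward(input), target);
            loss.backward();
            optimizer.step();
        }
    }
}
```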