2022 Practice Space Station Question Summary 3 #739

Open
zzzkey23 opened this issue Jun 7, 2022 · 1 comment

Comments

@zzzkey23
Contributor

zzzkey23 commented Jun 7, 2022


Corresponding GitHub lab link: https://github.com/UEFI-code/MSRA_thePracticeSpaceProject_PyTorchCUDA/wiki/Forward-and-Backward-Design

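Given that grad_output.data is the gradient and input.data is the input saved during forward: for example, with Batchsize = 4, InputDim = 10, and 5 neurons, grad_output is a tensor of shape [4, 5], input is a tensor of shape [4, 10], and grad_weights is the result of the computation. What is grad_weights used for?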
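For reference, here is a minimal sketch of how these shapes fit together in a plain linear layer. It assumes the weights are stored as [neurons, InputDim] = [5, 10] (the layout torch.nn.Linear uses) and that forward computes output = input · weightsᵀ. The linked wiki may differ in detail. The helpers `matmul` and `transpose`, the learning rate, and the random data are illustrative only and written in pure Python so the example runs without PyTorch.

```python
import random

def matmul(a, b):
    """Multiply a (n x k) by b (k x m); both are lists of row lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(row) for row in zip(*m)]

random.seed(0)
batch, in_dim, neurons = 4, 10, 5

# input saved in forward: shape [4, 10]
x = [[random.gauss(0, 1) for _ in range(in_dim)] for _ in range(batch)]
# layer weights (assumed layout): shape [5, 10]
w = [[random.gauss(0, 1) for _ in range(in_dim)] for _ in range(neurons)]
# gradient of the loss w.r.t. the layer output: shape [4, 5]
grad_output = [[random.gauss(0, 1) for _ in range(neurons)] for _ in range(batch)]

# Gradient of the loss w.r.t. the weights: grad_output^T @ input -> [5, 10].
# It has the same shape as w, one entry per weight.
grad_weights = matmul(transpose(grad_output), x)

# Gradient passed back to the previous layer: grad_output @ w -> [4, 10].
grad_input = matmul(grad_output, w)

# What an optimizer does with grad_weights: a plain SGD step (lr is illustrative).
lr = 0.1
w_updated = [[w[i][j] - lr * grad_weights[i][j] for j in range(in_dim)]
             for i in range(neurons)]
```

Under these assumptions, grad_weights tells the optimizer how to change each weight of this layer, while grad_input is what lets gradients keep flowing to earlier layers.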

@UEFI-code

UEFI-code commented Jun 8, 2022

I ran an experiment: even when grad_weights is returned as an all-zero tensor, the model can still converge.
Experiment code:
https://github.com/UEFI-code/MSRA_thePracticeSpaceProject_PyTorchCUDA/blob/main/Demo_myLinear.py
https://github.com/UEFI-code/MSRA_thePracticeSpaceProject_PyTorchCUDA/blob/main/myKakuritsu_Linear_backend/myKakuritsuCPU.cpp

Run it with the --no-cuda flag to use the path where grad_weights is returned as all zeros.
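One possible reading of this result, not checked against the linked demo: if a layer's grad_weights is always zero, that layer's weights simply stay at their initial values, but other parameters that still receive gradients can keep learning. The sketch below illustrates the general effect on a toy regression problem. The network, data, and hyperparameters are all hypothetical and unrelated to Demo_myLinear.py.

```python
import math
import random

random.seed(0)
hidden = 8
xs = [i / 10.0 for i in range(-20, 21)]   # 41 scalar inputs
ts = [math.sin(x) for x in xs]            # regression targets

# First layer: weights never change, as if its grad_weights were always zero.
w1 = [random.gauss(0, 1) for _ in range(hidden)]
feats = [[math.tanh(w * x) for w in w1] for x in xs]

# Second layer: weights v and bias c are trained normally.
v = [0.0] * hidden
c = 0.0

def mse():
    """Mean squared error of the current second layer on the fixed features."""
    return sum((sum(vj * fj for vj, fj in zip(v, f)) + c - t) ** 2
               for f, t in zip(feats, ts)) / len(xs)

loss_before = mse()
lr = 0.05
for _ in range(500):
    grad_v = [0.0] * hidden
    grad_c = 0.0
    for f, t in zip(feats, ts):
        err = sum(vj * fj for vj, fj in zip(v, f)) + c - t
        for j in range(hidden):
            grad_v[j] += 2 * err * f[j] / len(xs)
        grad_c += 2 * err / len(xs)
    # Gradient descent step on the second layer only.
    v = [vj - lr * gj for vj, gj in zip(v, grad_v)]
    c -= lr * grad_c
loss_after = mse()

print(loss_before, loss_after)
```

In this toy setup the loss drops even though the first layer is never updated, so a falling loss alone does not show that grad_weights is unnecessary. It only shows that the remaining trainable parameters were enough for this task.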
