Given that grad_output.data is the gradient and input.data is the input data backed up during the forward pass: for example, with Batchsize = 4, InputDim = 10, and 5 neurons, grad_output is a [4, 5] tensor, input is a [4, 10] tensor, and grad_weights is the result of the computation. What is grad_weights actually used for?
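For reference, here is a minimal sketch of how grad_weights is typically computed and consumed in a plain linear layer (y = x @ W.T). The class name MyLinearFunction is made up for illustration and is not the repo's Kakuritsu backend; the actual kernels there may differ.

```python
import torch

class MyLinearFunction(torch.autograd.Function):
    """Illustrative custom linear layer: y = x @ W.T (no bias)."""

    @staticmethod
    def forward(ctx, input, weight):
        ctx.save_for_backward(input, weight)   # back up input for the backward pass
        return input.mm(weight.t())            # [4, 10] x [10, 5] -> [4, 5]

    @staticmethod
    def backward(ctx, grad_output):
        input, weight = ctx.saved_tensors
        # grad_weights = dL/dW = grad_output.T @ input:
        # [5, 4] x [4, 10] -> [5, 10], the same shape as the weight matrix.
        grad_weights = grad_output.t().mm(input)
        # grad_input = dL/dx, propagated to earlier layers:
        # [4, 5] x [5, 10] -> [4, 10]
        grad_input = grad_output.mm(weight)
        return grad_input, grad_weights

# Shapes from the question: Batchsize = 4, InputDim = 10, 5 neurons.
x = torch.randn(4, 10)
W = torch.randn(5, 10, requires_grad=True)
y = MyLinearFunction.apply(x, W)
y.sum().backward()
print(W.grad.shape)  # torch.Size([5, 10])
```

So grad_weights is the gradient of the loss with respect to this layer's weight matrix: after backward() it lands in weight.grad, and the optimizer's step() uses it to update the weights. If it is returned as all zeros, this layer's weights receive no gradient signal (ignoring weight decay and momentum).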
I ran an experiment: even when grad_weights is an all-zero Tensor, the model still converges. Experiment code:
https://github.com/UEFI-code/MSRA_thePracticeSpaceProject_PyTorchCUDA/blob/main/Demo_myLinear.py
https://github.com/UEFI-code/MSRA_thePracticeSpaceProject_PyTorchCUDA/blob/main/myKakuritsu_Linear_backend/myKakuritsuCPU.cpp
Run it with the --no-cuda flag to use the version where grad_weights is all zeros.
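One way to narrow down where that convergence comes from is to check whether the custom layer's weight actually moves during training. Below is a rough sketch; the helper name, the layer lookup, and the train_one_epoch callback are hypothetical and would need to be adapted to Demo_myLinear.py.

```python
import torch

def check_weight_update(model, layer_name, train_one_epoch):
    """Snapshot the custom layer's weight, train for one epoch, then
    report how much the weight changed. With grad_weights forced to zero
    (and no weight decay or momentum), it should stay essentially
    unchanged, which would suggest the observed convergence is driven by
    the other layers in the network."""
    layer = dict(model.named_modules())[layer_name]
    before = layer.weight.detach().clone()
    train_one_epoch(model)
    delta = (layer.weight.detach() - before).abs().max().item()
    print(f"max |dW| for {layer_name}: {delta:.6g}")
    return delta
```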
Related GitHub experiment link: https://github.com/UEFI-code/MSRA_thePracticeSpaceProject_PyTorchCUDA/wiki/Forward-and-Backward-Design