EP_FAIL : Non-zero status code returned while running Conv node. Name:'/features/features.0/Conv' Status Message: Failed to initialize CUDNN Frontend #23301
Comments
I also built onnxruntime-gpu == 1.20.0 and got the same error in the same place. |
@tianleiwu , it is related to cudnn frontend. |
@snnn @tianleiwu any update on this issue... |
@m0hammadjaan, please try setting some environment variables to collect the cuDNN debug log:
Then run your tests. CUDNN_STATUS_SUBLIBRARY_LOADING_FAILED means it cannot load a sub library (.so), and I think it is likely an environment setup issue (try add |
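As a rough sketch of how such logging variables could be set from Python before the test runs: the names below are the ones documented for cuDNN 9 (`CUDNN_LOGLEVEL_DBG`, `CUDNN_LOGDEST_DBG`) and for cudnn-frontend (`CUDNN_FRONTEND_LOG_INFO`, `CUDNN_FRONTEND_LOG_FILE`). They are an assumption and not necessarily the exact set suggested in this thread, and `enable_cudnn_logging` is a hypothetical helper.

```python
import os

# Assumed logging variables (taken from the cuDNN 9 and cudnn-frontend docs);
# not necessarily the exact ones suggested in this thread.
CUDNN_LOG_ENV = {
    "CUDNN_LOGLEVEL_DBG": "3",           # cuDNN 9: most verbose level (errors, warnings, info)
    "CUDNN_LOGDEST_DBG": "stderr",       # where cuDNN writes its log
    "CUDNN_FRONTEND_LOG_INFO": "1",      # turn on cudnn-frontend logging
    "CUDNN_FRONTEND_LOG_FILE": "stderr", # where cudnn-frontend writes its log
}


def enable_cudnn_logging(env=os.environ):
    """Hypothetical helper: set the logging variables and return the ones applied."""
    env.update(CUDNN_LOG_ENV)
    return dict(CUDNN_LOG_ENV)


# Call this before importing onnxruntime so the libraries see the settings when they load.
applied = enable_cudnn_logging()
```

On older cuDNN 8 installs the per-level switches (for example `CUDNN_LOGINFO_DBG=1`) may be needed instead of `CUDNN_LOGLEVEL_DBG`.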
@tianleiwu I set these environment variables and am still getting the same error. Then I tried to add
|
@m0hammadjaan, when you installed cudnn-front-end (although not needed by ORT) from source, did you verify that the installation is good following https://github.com/NVIDIA/cudnn-frontend?tab=readme-ov-file#checking-the-installation? You can check DLL (*.so) loading like
OR
You should be able to see which *.so file failed to load during your test. |
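A complementary way to narrow down which shared library fails, sketched here with Python's `ctypes`: try to load each cuDNN sub-library directly and print the loader's error message. The library names assume a cuDNN 9 layout and are not taken from this thread, so the version suffix may need adjusting; `check_libs` is a hypothetical helper.

```python
import ctypes

# Assumed cuDNN 9 sub-library names; adjust the version suffix to the installed cuDNN.
CUDNN_LIBS = [
    "libcudnn.so.9",
    "libcudnn_graph.so.9",
    "libcudnn_ops.so.9",
    "libcudnn_cnn.so.9",
    "libcudnn_engines_precompiled.so.9",
    "libcudnn_engines_runtime_compiled.so.9",
    "libcudnn_heuristic.so.9",
]


def check_libs(names):
    """Try to load each library; map name -> None on success or the loader error text."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            results[name] = None
        except OSError as exc:
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    for name, err in check_libs(CUDNN_LIBS).items():
        print(f"{name}: {'OK' if err is None else 'FAILED: ' + err}")
```

A failure here usually points to a missing file or to a directory that is absent from the dynamic loader's search path (for example `LD_LIBRARY_PATH`), which would be consistent with `CUDNN_STATUS_SUBLIBRARY_LOADING_FAILED`.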
@tianleiwu, yes I have followed the same
|
@tianleiwu any updates on it? |
@m0hammadjaan, could you try building a binary from the tlwu/conv_cudnn_fe_fallback branch? It makes Conv fall back to a path that does not use the cuDNN frontend. Let me know if it resolves the issue. |
I have an EC2 instance of type g5g.xlarge. I have installed the following:
On the following code:
I am getting the following Error:
However, the output of the code below confirms that the installation itself looks correct:
Output:
To resolve this, I installed nvidia_cudnn_frontend v1.9.0 from source, but the error persists.
nvidia-smi is working. Its version is: NVIDIA-SMI 550.127.08
nvcc is also working fine.
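An installation check of the kind mentioned above might look like the following sketch. It is not the code from this issue; it only uses `onnxruntime.__version__`, `onnxruntime.get_device()`, and `onnxruntime.get_available_providers()`, and `describe_onnxruntime` is a hypothetical helper.

```python
def describe_onnxruntime():
    """Return basic onnxruntime facts, or None if the package cannot be imported."""
    try:
        import onnxruntime as ort
    except ImportError:
        return None
    return {
        "version": ort.__version__,
        "device": ort.get_device(),
        "providers": ort.get_available_providers(),
    }


info = describe_onnxruntime()
print(info)
```

A check like this reports which execution providers the build includes; it does not run a Conv node, so it can pass even when cuDNN fails to load its sub-libraries at inference time.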
Versions