Weird network issues when running inside Singularity container #3531
Comments
Hi @DeepHorizons, this is indeed weird. How did you run the image?
The image header is
It also doesn't have a runscript, which I'm assuming makes |
We depend on some folders in the home directory but I tried my best to run it in a contained environment as such
But the problem still persists. One thing I'm noticing is that it seems to work if I run the command manually, but not when the command runs on boot.
Are you running this on a cluster? Are there certain nodes where it runs and certain nodes where it fails? Does it run reliably as a single job and maybe it fails when it becomes parallel?
It has all been run on one machine, and only one instance is running at a time.
Looking at the Azure issue, is DNS resolution actually succeeding, with the failure occurring on the endpoint connection? It says you could manually run dig to query... but what about when it's running by itself and fails? Would it be possible for you to try subverting the DNS lookup by adding an entry into |
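One way to answer the "is it DNS or is it the connection?" question is to time and report the two steps separately from inside the container. The following is only a minimal sketch using Python's standard `socket` module; the function name `probe`, its parameters, and the shape of the returned dict are made up for illustration and are not part of Singularity or the Azure library.

```python
import socket


def probe(host, port, timeout=5.0, resolve=socket.getaddrinfo):
    """Report separately whether name resolution and the TCP connect succeed.

    `resolve` is injectable so the function can be exercised without a network;
    by default it is the system resolver via socket.getaddrinfo.
    """
    result = {
        "host": host,
        "port": port,
        "addresses": [],        # addresses returned by the resolver
        "dns_error": None,      # set if resolution itself failed
        "connected_to": None,   # first address that accepted a TCP connection
        "connect_error": None,  # last connect error, if no address worked
    }

    # Step 1: name resolution only.
    try:
        infos = resolve(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        result["dns_error"] = str(exc)
        return result
    result["addresses"] = sorted({info[4][0] for info in infos})

    # Step 2: try a plain TCP connect to each resolved address in turn.
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            result["connected_to"] = sockaddr[0]
            result["connect_error"] = None
            return result
        except OSError as exc:
            result["connect_error"] = str(exc)
    return result
```

Calling something like `probe("<your storage endpoint>", 443)` at the point where the application starts, and logging the result, would show whether a failing run has a populated `dns_error` or only a `connect_error`.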
I got around to doing more testing and found some interesting behaviors
Since I was only seeing issues on boot, I thought maybe it was related to the boot order. So I decided to try sleeps in different places.
So it seems there is some system state that Singularity is "remembering" but gets fixed/settled a little after boot. Perhaps adjusting the boot order would fix this?
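If the state that needs to "settle" turns out to be name resolution (which this thread has not confirmed), a fixed sleep could be replaced by polling until the name resolves. Here is a rough sketch of that idea; `wait_for_dns` and all of its parameters are hypothetical names chosen for this example, and the host to wait for is something you would have to supply.

```python
import socket
import time


def wait_for_dns(host, timeout=60.0, interval=2.0,
                 resolve=socket.getaddrinfo,
                 sleep=time.sleep, clock=time.monotonic):
    """Block until `host` resolves; return the number of attempts it took.

    Raises TimeoutError if the name still does not resolve once `timeout`
    seconds have passed. `resolve`, `sleep` and `clock` are injectable only
    to make the function testable without a network or real waiting.
    """
    deadline = clock() + timeout
    attempts = 0
    while True:
        attempts += 1
        try:
            resolve(host, None)
            return attempts
        except socket.gaierror:
            # Resolution failed; give up if the deadline has passed,
            # otherwise wait a little and try again.
            if clock() >= deadline:
                raise TimeoutError(
                    f"{host} did not resolve within {timeout} seconds"
                )
            sleep(interval)
```

Running this before launching the container (or as the first thing inside it) and logging the attempt count would also show how long after boot resolution starts working, which is more informative than a sleep of arbitrary length.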
I tried starting it after |
@DeepHorizons Looks like a side effect of how systemd starts services: even if singularity is the last one, that doesn't necessarily mean the other services have fully started, and you said a |
Any feedback on @cclerget's suggestion to try using the |
Closing; feel free to reopen it if required.
For the sake of completeness, the |
Version of Singularity:
3.1.0
Expected behavior
We run our application and it works without issues or errors
Actual behavior
When we run our application inside a Singularity image about 50% of the time we get an error from one of the libraries we use. The other 50% it works fine.
When running outside of Singularity there are no issues.
Steps to reproduce behavior
We have not yet been able to reproduce it reliably; it seems to be random.
I originally thought this was an issue with the library we were using, but after testing our application outside of Singularity and having it work 100% of the time, I'm looking into whether Singularity is doing something weird. There is some additional information on the library end here: Azure/azure-storage-cpp#259
What is Singularity's role with containers and the network? Is there a way I can output network information?
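On the question of outputting network information: one generic option, independent of Singularity itself, is to dump what the process can see and compare a run inside the container with a run on the host. The snippet below is a small sketch using only the Python standard library; `network_snapshot` is a made-up helper name, and the two default paths are the conventional Linux locations, assumed here rather than taken from this issue.

```python
import socket
from pathlib import Path


def network_snapshot(resolv_conf="/etc/resolv.conf", hosts="/etc/hosts"):
    """Collect basic network facts visible to the current process."""

    def read(path):
        # Return the file contents, or a marker string if it cannot be read.
        try:
            return Path(path).read_text()
        except OSError as exc:
            return f"<unreadable: {exc}>"

    try:
        interfaces = [name for _index, name in socket.if_nameindex()]
    except OSError as exc:
        interfaces = [f"<unavailable: {exc}>"]

    return {
        "hostname": socket.gethostname(),
        "interfaces": interfaces,
        "resolv.conf": read(resolv_conf),
        "hosts": read(hosts),
    }
```

Printing this dict at application start, once at boot and once when started manually, would show whether the resolver configuration or the set of interfaces differs between the failing and the working case.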