Message boards : Number crunching : horns named project files causing pgtables out of memory errors in boinc all others run fine
Author | Message |
---|---|
at90systems Send message Joined: 19 Apr 20 Posts: 7 Credit: 700,368 RAC: 0 |
I have two Raspberry Pis running Ubuntu. All other project files (those with other names) run fine. Every file named horns_.....xxxxxx etc. starts but eventually causes multiple errors, such as pgtables out of memory, and I have to abort them to clear the issue. Rebooting the Pi does not help, and it happens on both units. It is only the horns-named files. Has anyone noticed similar problems or found a solution? I hate to keep aborting one particular strand. TIA |
Brian Nixon Send message Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
Many of the horns tasks require a large amount of memory and are likely to fail on smaller machines in the way you have seen. You are probably best off aborting any that you receive; there are plenty of other task types that require less RAM and will run fine on a Pi. There’s some related discussion in another thread, starting here. |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1994 Credit: 9,552,383 RAC: 6,167 |
"You are probably best off aborting any that you receive" As I wrote, it would be better to let users choose apps in their profile. |
Brian Nixon Send message Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
"better to create the possibility to choose apps in the user's profile." As a way of allowing users to select which strands of research they contribute to, certainly. It would require substantially clearer categorisation of tasks than we have at present, though. But avoiding performance issues like this one is a separate matter. It would be better for the server to have some knowledge of the characteristics of each task type, so it could automatically refrain from sending large tasks to small hosts that stand little chance of completing them. |
at90systems Send message Joined: 19 Apr 20 Posts: 7 Credit: 700,368 RAC: 0 |
I kind of figured that was the problem to start with; thanks for the pointer to the other thread. I am surprised to see it, though, since both of my units are 4GB Pi 4s. Not to sound crazy, but doesn't needing large amounts of memory to process files like that kind of defeat the purpose of the distributed computing model? I know it's tough to find platforms that support the Pi equipment (trust me, there are only a handful); it is just strange that only this one form (the horns) requires so much. It seems like an easy fix, though: split the files down into smaller subsets, or limit them to a particular set of BOINC users. The BOINC registration should be able to supply some of that information, since if you try to register a Pi with a project that does not support it, it tells you so from the get-go. My only real concern was that it gets repetitive having to go to my units (which are headless for the most part) and abort one single file every day or every other day when the other subclasses work fine. I am wondering whether using an 8GB Pi for this project and keeping the 4GB units on the machine learning project would solve the issue. |
Jim1348 Send message Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
"I know its tough to find platforms that support the PI equipment (trust me there are only a handful) just strange that there is only this one form (the horns) that requires so much. Seems like an easy fix though, split the files programming down to smaller subsets or limit them to a particular set of boinc users, the boinc registration should be able to supply some type of information as I know if you try to register the Pi with a project that is not supported it tells you that from the get go." No, the models usually start out large. They then get smaller with development. I don't know whether it is feasible to separate them at the large stage. The real question is why they allow a Pi at all. It is bound to fail at some point. |
jeff_b Send message Joined: 8 Apr 20 Posts: 3 Credit: 12,168,723 RAC: 161 |
So far my 8GB Pi is working OK with the horns project files, but my 4GB Pi cluster doesn't like them, so I have to delete them. |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1994 Credit: 9,552,383 RAC: 6,167 |
"The real question is why do they allow a Pi at all? It is bound to fail at some point." Not only Raspberry Pis, but also smartphones and PCs with less than 4GB per core... |
wolfman1360 Send message Joined: 18 Feb 17 Posts: 72 Credit: 18,450,036 RAC: 0 |
"The real question is why do they allow a Pi at all? It is bound to fail at some point." Because most WUs take 1 GB per core, if not a bit less. I wish there were a way to separate out the WUs with larger requirements on this project. I have quite a variety of machines, and it's frustrating to have super huge WUs pop up unexpectedly. RPi or not, smartphone or not, everything should be able to contribute. Especially now. |
at90systems Send message Joined: 19 Apr 20 Posts: 7 Credit: 700,368 RAC: 0 |
Good to know; I was going to ask that question. Thanks for the response. |
at90systems Send message Joined: 19 Apr 20 Posts: 7 Credit: 700,368 RAC: 0 |
Well, shout out to whoever fixed the problem; I hope it wasn't simply by stopping the horns work units. My Pi units have been working flawlessly for quite a few days now without errors. As for why Pi units should be allowed to run: they are more robust than they used to be, and with the 4GB and 8GB versions they can contribute a lot to the program. Sure, they don't do as much as a computer cluster, but they do contribute. The entire idea behind distributed computing is to allow anything to contribute and take the workload off a much larger supercomputer, so in this day and age I say yes, more BOINC projects need to support them. On top of that, the units' small footprint, low power consumption, and cost (when compared with new purchases rather than reusing old hardware, though) are great. |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1994 Credit: 9,552,383 RAC: 6,167 |
"I wish there was a way to separate out these larger requirements for WUs on this project. I have quite a variety of machines and it's frustrating to have super huge WUs pop up unexpectedly. RPI or not, smartphone or not, everything should be able to contribute. Especially now." As I wrote in another thread, there is App_Plan in the BOINC server scheduler... |
mikey Send message Joined: 5 Jan 06 Posts: 1895 Credit: 9,152,433 RAC: 4,296 |
"I wish there was a way to separate out these larger requirements for WUs on this project. I have quite a variety of machines and it's frustrating to have super huge WUs pop up unexpectedly. RPI or not, smartphone or not, everything should be able to contribute. Especially now." There's no way they will test that here at Rosetta; it would happen at the beta project, Ralph@home, first, and right now they have zero tasks to crunch. |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1994 Credit: 9,552,383 RAC: 6,167 |
"There's no way they will test that here at Rosetta, it would happen at the Beta Project first, Ralph@home, and right now they have zero tasks to crunch." Ralph@home is often, very often, underused. |
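
For anyone who has to keep aborting horns tasks by hand on headless units, as described earlier in the thread, the chore could in principle be scripted. The following is only a minimal sketch, not something posted or tested by anyone in this thread. It assumes `boinccmd` is installed and on the PATH, that `boinccmd --get_tasks` prints a `name:` line followed later by a `project URL:` line for each task, and that matching on the `horns_` name prefix is good enough. The function names `find_tasks` and `abort_tasks` are made up for illustration.

```python
import subprocess


def find_tasks(get_tasks_output, prefix="horns_"):
    """Return (project_url, task_name) pairs for tasks whose name starts with prefix.

    Assumes each task in the text has a 'name:' line followed later by a
    'project URL:' line (an assumption about the boinccmd --get_tasks layout).
    """
    matches = []
    name = None
    for raw in get_tasks_output.splitlines():
        line = raw.strip()
        if line.startswith("name:"):
            # 'WU name:' lines do not match here because they start with 'WU'.
            name = line.split(":", 1)[1].strip()
        elif line.startswith("project URL:") and name is not None:
            url = line.split(":", 1)[1].strip()
            if name.startswith(prefix):
                matches.append((url, name))
            name = None
    return matches


def abort_tasks(prefix="horns_", dry_run=True):
    """List tasks via boinccmd and abort those matching prefix.

    With dry_run=True (the default) the abort commands are only printed.
    """
    listing = subprocess.run(
        ["boinccmd", "--get_tasks"],
        capture_output=True, text=True, check=True,
    ).stdout
    for url, name in find_tasks(listing, prefix):
        cmd = ["boinccmd", "--task", url, name, "abort"]
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling `abort_tasks()` only prints the commands it would run; `abort_tasks(dry_run=False)` actually aborts the matching tasks, so it is worth checking the dry-run output first. Run from cron, something like this would save walking over to each unit, although it does nothing to stop the server from sending those tasks in the first place.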
©2024 University of Washington
https://www.bakerlab.org