Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
MStenholm Send message Joined: 18 Apr 20 Posts: 19 Credit: 27,931,117 RAC: 57,879 |
You ran out of memory. Six jobs of 2.6 GB and you have 16 GB. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1758 Credit: 18,534,891 RAC: 473 |
You ran out of memory. Six jobs of 2.6 GB and you have 16 GB. That might do it. I've got half that many cores/threads & twice that amount of RAM, and over the last couple of days, when I had mostly Rosetta_VS Tasks, there have been times I've had over 60% of my RAM in use. Even without the 2GB+ Tasks, there were plenty of others using 1-1.5GB. But normally, if lack of RAM is an issue, the Tasks should have suspended with a "Waiting for memory" note. It shouldn't cause things to crash & burn. Grant Darwin NT |
Bryn Mawr Send message Joined: 26 Dec 18 Posts: 412 Credit: 12,535,572 RAC: 13,691 |
You ran out of memory. Six jobs of 2.6 GB and you have 16 GB. Ach, I thought I had 32gb. I remember now, the 2 sticks wouldn't play with each other :-( The other machine has 64gb, I'll update this one to match Thanks |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1758 Credit: 18,534,891 RAC: 473 |
Looks like the boinc-process server is having issues yet again: Rosetta beta Validator & Assimilator are down (along with a few other processes). How far behind will the Validator get this time? Presently 11,825 Workunits waiting for Validation. Grant Darwin NT |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1758 Credit: 18,534,891 RAC: 473 |
Looks like the boinc-process server is having issues yet again: Rosetta beta Validator & Assimilator are down (along with a few other processes). How far behind will the Validator get this time? Backlog is now 20,000, but Validator now shows as running. Will have to wait a while to see if it actually is. Grant Darwin NT |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 2030 Credit: 10,081,426 RAC: 12,283 |
Backlog is now 20,000, but Validator now shows as running. Will have to wait a while to see if it actually is. Now is 0. Validator queue is empty. |
Dr Who Fan Send message Joined: 28 May 06 Posts: 87 Credit: 278,104 RAC: 136 |
My wingmen and I are all seeing lots of errors on Android due to what appears to be a misconfigured Rosetta task: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1395840251 [ ERROR ]: Caught exception: File: src/core/pack/dunbrack/SingleResidueDunbrackLibrary.hh:306 chi angle must be between -180 and 180: nan ------------------------ Begin developer's backtrace ------------------------- BACKTRACE: ------------------------- End developer's backtrace -------------------------- |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1758 Credit: 18,534,891 RAC: 473 |
Me & all wingman Seeing lots of errors on Android due to what appears to be misconfigured Rosetta task: Been a problem for years now. Grant Darwin NT |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 2030 Credit: 10,081,426 RAC: 12,283 |
chi angle must be between -180 and 180: nan A great classic!! |
dcs1955 Send message Joined: 2 Dec 22 Posts: 13 Credit: 6,721,475 RAC: 14,715 |
Waiting for Memory... For the past two weeks I have had one of four core processes held up for needing memory. It happens on two of my desktops with 16 GB RAM. In over 8 years crunching WCG and Rosetta I have not had this happen. All the work is Rosetta Beta 6.04. Is this a known issue? |
kotenok2000 Send message Joined: 22 Feb 11 Posts: 276 Credit: 523,512 RAC: 744 |
RosettaVS tasks use more memory than 8a_hal |
dcs1955 Send message Joined: 2 Dec 22 Posts: 13 Credit: 6,721,475 RAC: 14,715 |
Thanks. Do you know if it is significantly more memory? Currently, 50% of my tasks are VS. Two VS are running (the others are 8a-e__hal). One of the 4 processes is using 1.8-2.2G; the others are using 100-300M. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1758 Credit: 18,534,891 RAC: 473 |
Thanks.. Do you know if it is significantly more memory? Currently, 50% of my tasks are VS. You just answered your own question. Generally they need between 500MB & 2.5GB, depending on the Task. 1-1.5GB tends to be more common. Grant Darwin NT |
dcs1955 Send message Joined: 2 Dec 22 Posts: 13 Credit: 6,721,475 RAC: 14,715 |
Thanks. I tweaked the computer preferences to up the memory use percentage, something I have not needed to do before. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1758 Credit: 18,534,891 RAC: 473 |
I've had mine set to "When computer is in use, use at most 95 %" and "When computer is not in use, use at most 95 %" without issues (with a fair bit more RAM per core/thread than your 8GB RAM systems). Grant Darwin NT |
dcs1955 Send message Joined: 2 Dec 22 Posts: 13 Credit: 6,721,475 RAC: 14,715 |
I wimped out and stopped at 90%. :) |
äxl Send message Joined: 30 Dec 08 Posts: 11 Credit: 497,080 RAC: 0 |
Rosetta Beta 6.05 I've had to put RAM usage to 25% for now since it would crash my PC. (Could be faulty modules.) I even aborted 3 of 4 WUs since they would stay in RAM and I don't think I could have finished them anyway. The one I kept is still at 24% and it says Elapsed Time ~5h, Remaining Time ~6h, Deadline is in ~6h. It's running through ScienceUnited so here are the WUs if someone cares: RosettaVS_SAVE_ALL_OUT_NOJRAN_UBA5_3H8V_fulldb_IGNORE_THE_REST_WwaHIZ_1_3192_2978231_3 The ones I stopped: RosettaVS_SAVE_ALL_OUT_NOJRAN_UBA5_3H8V_fulldb_IGNORE_THE_REST_WwaHIZ_3_1857_2978234_3_0 RosettaVS_SAVE_ALL_OUT_NOJRAN_UBA5_3H8V_fulldb_IGNORE_THE_REST_WwaHIZ_6_7045_2978237_3_0 RosettaVS_SAVE_ALL_OUT_NOJRAN_UBA5_3H8V_fulldb_IGNORE_THE_REST_WwaHIZ_4_6655_2978235_3_0 It's an old computer: https://scienceunited.org/su_hosts.php?action=detail&host_id=87101 Running BOINC because: 1) I'm using 100% green energy (no certificates or other non-sense) 2) My computer runs mostly anyway (due to BT and other non-sense) 3) To help |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1758 Credit: 18,534,891 RAC: 473 |
The fact that it's through Science United makes it impossible to see what's going on (we can't view your computer without being logged in to your account there), and will probably affect what you're able to do about it. I've had to put RAM usage to 25% for now since it would crash my PC. (Could be faulty modules.) Under Preferences, Computing preferences, make sure Memory, "Leave non-GPU tasks in memory while suspended" is not selected. When running more than one project, no cache is best. Less chance of deadline issues. Preferences, Computing Preferences, Other: "Store at least 0.1 days of work", "Store up to an additional 0.01 days of work". Run Memtest on the system to see if there is an issue with the memory, but most likely it's a lack of memory on the system, as most of the RosettaVS_ and Rosetta 4.20 Tasks need plenty of RAM: 500MB to 2.5GB (1-1.5GB tends to be most common). And reducing the amount of memory that BOINC can use will just make things worse. Luckily, there have been very few of those Tasks released in the last 24hrs or so. Also check your completed Valid Tasks and compare the Run time to the CPU time: if there's more than a few minutes' difference, it means you're using your system a bit. If there's 30min or so, then you're using it a lot. Hours+, you or something else on the computer is making huge use of your CPU's time. Grant Darwin NT |
Jean-David Beyer Send message Joined: 2 Nov 05 Posts: 202 Credit: 6,881,503 RAC: 10,850 |
Run Memtest on the system to see if there is an issue with the memory, most likely it's a lack of memory on the system as most of the RosettaVS_ and Rosetta 4.20 Tasks need plenty of RAM- 500MB to 2.5GB (1-1.5GB tends to be most common). And reducing the amount of memory that BOINC can use, will just make things worse. I have not run memtest in years. Back when I had 8 GBytes of RAM and dual Intel Xeon processors, it took almost a day to run memtest. Now that this machine has 128 GBytes of RAM, it would probably take over a week to run it. This machine has 8 memory modules, and when I raised it from 64 GBytes to 128 GBytes it was a little flaky, but it was pretty easy to find which module it was and the RAM supplier replaced it free of charge. As far as RosettaVS tasks are concerned, I have only two of them waiting to start out of 22 tasks on the machine. At times, half of the tasks on my machine have been RosettaVS, and sometimes two of them have run at the same time. Right now, I have one Rosetta 4.20 Task waiting to run. The biggest tasks I have run have been CPDN like this one: Task 22317868; Name oifs_43r3_bl_a4ck_2016092300_15_991_12212423_2; Workunit 12212423; Created 15 Apr 2023, 5:23:15 UTC; Sent 15 Apr 2023, 5:24:02 UTC; Report deadline 14 Jun 2023, 5:24:02 UTC; Received 15 Apr 2023, 12:23:18 UTC; Server state Over; Outcome Success; Client state Done; Exit status 0 (0x00000000); Computer ID 1511241; Run time 6 hours 18 min 49 sec; CPU time 6 hours 13 min 2 sec; Validate state Valid; Credit 1,813.14; Device peak FLOPS 6.06 GFLOPS; Application version OpenIFS 43r3 Baroclinic Lifecycle v1.11 x86_64-pc-linux-gnu; Peak working set size 5,592.19 MB; Peak swap size 5,930.79 MB; Peak disk usage 1,277.90 MB |
äxl Send message Joined: 30 Dec 08 Posts: 11 Credit: 497,080 RAC: 0 |
Grant (SSSF) wrote: The fact that it's through Science United makes it impossible to see what's going on (we can't view your computer without being logged in to your account there), and will probably affect what you're able to do about it. Even I can't see much. I can't see done WUs, for example. Under Preferences, Computing preferences, make sure Memory, "Leave non-GPU tasks in memory while suspended" is not selected. Yes, that helped. When running more than one project, no cache is best. Less chance of deadline issues. Yes, this is the default, isn't it? Run Memtest on the system to see if there is an issue with the memory, I've been running memtester on 1GB since yesterday. I don't think it covers much, but it's a start, I guess. most likely it's a lack of memory on the system as most of the RosettaVS_ and Rosetta 4.20 Tasks need plenty of RAM- 500MB to 2.5GB (1-1.5GB tends to be most common). And reducing the amount of memory that BOINC can use, will just make things worse. I finished the 1 WU ~3h before deadline. (I think the only thing that got me into trouble was that the system froze and then I didn't have time over the weekend.) But you're saying that because I didn't do parts 2 to 4 it's bad for the project? Also check your completed Valid Tasks and compare the Run time to the CPU time- if there's more than a few minutes difference, it means you're using your system a bit. If there's 30min or so then you're using it a lot. I can at least check the running WUs. Are you saying that if the difference is too big I shouldn't crunch at all? Jean-David Beyer wrote: it was pretty easy to find which module it was You mean by turning the computer off, pulling a module, turning the computer on, turning it off again etc.? Running BOINC because: 1) I'm using 100% green energy (no certificates or other non-sense) 2) My computer runs mostly anyway (due to BT and other non-sense) 3) To help |
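
The memory arithmetic that comes up repeatedly in this thread (six 2.6 GB jobs on a 16 GB machine, or raising the "use at most" percentage to 90% or 95%) can be sketched in a few lines of Python. This is only a rough illustration, not how the BOINC client actually schedules work: the function names are made up for the example, the per-task sizes are the figures quoted by posters above, and it ignores whatever the operating system and other programs need.

```python
def memory_budget_gb(total_ram_gb, use_at_most_pct):
    """RAM that BOINC may use under a 'use at most N %' preference (hypothetical helper)."""
    return total_ram_gb * use_at_most_pct / 100


def tasks_fit(total_ram_gb, use_at_most_pct, task_sizes_gb):
    """True if the listed tasks together stay within the memory budget."""
    return sum(task_sizes_gb) <= memory_budget_gb(total_ram_gb, use_at_most_pct)


def max_concurrent(total_ram_gb, use_at_most_pct, per_task_gb):
    """How many equally sized tasks fit within the budget at the same time."""
    return int(memory_budget_gb(total_ram_gb, use_at_most_pct) // per_task_gb)


if __name__ == "__main__":
    six_big_jobs = [2.6] * 6  # "Six jobs of 2.6 GB", as quoted in the thread
    print(tasks_fit(16, 90, six_big_jobs))   # 16 GB machine, 90 % allowed
    print(tasks_fit(32, 90, six_big_jobs))   # same jobs with 32 GB
    print(max_concurrent(16, 90, 2.6))       # 2.6 GB tasks that fit at once
```

With the numbers from the first post, the six jobs need about 15.6 GB against a 14.4 GB budget at 90%, which is consistent with tasks either waiting for memory or the machine running short.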
©2025 University of Washington
https://www.bakerlab.org