Message boards : Number crunching : Minirosetta v1.40 bug thread
Author | Message |
---|---|
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Just delete those lockfiles and you should be able to get back on your way again. Hopefully the new work you get will not contain these problems. I saw a while back that they were going to look into that problem and fix it. |
Evan Joined: 23 Dec 05 Posts: 268 Credit: 402,585 RAC: 0 |
Instant remedy for getting out of lockfile depression - have a change of scenery - go over to RALPH - about 30,000 work units at last count ready to send! |
A Few Good Men Joined: 25 Mar 07 Posts: 14 Credit: 2,031,382 RAC: 0 |
Result task IDs for the last 12 hours of Rosetta after resetting the client, all errors. Client errors: 211601578, 211514071, 211514058, 211512399, 211512310, 211512309, 211512308, 211512306, 211512305. Compute errors: 211522797, 211514072. Please advise. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
[quote]Result task IDs for the last 12 hours of Rosetta after resetting the client.[/quote] Looks like it's a whole load of defective tasks. Two different systems bombed them. It is also possible, if your system is being OC'd, that your speed is too fast for Rosetta to handle. I was working with my OC percentage last night and crashed a whole bunch. Some of the tasks were successful with other users and some of them crashed again. Keep an eye on your current tasks and see if they crash with the same kind of error code. If you're running OC'd, lower your speed a little bit to see where the threshold is for Rosetta. 5-10 MHz can make the difference between a success and a crash. |
A Few Good Men Joined: 25 Mar 07 Posts: 14 Credit: 2,031,382 RAC: 0 |
I'll do a run at stock CPU, RAM and FSB values. Thanks. |
Dave Mickey Joined: 29 Dec 07 Posts: 33 Credit: 4,136,957 RAC: 0 |
Just another data point: I still have 1.40 tasks that do not respond to BOINC's command to suspend. This is not fixed yet. Dave |
DJStarfox Joined: 19 Jul 07 Posts: 145 Credit: 1,250,162 RAC: 0 |
This is just ridiculous. Rosetta Mini 1.40 on Linux does NOT obey the BOINC API to suspend the task. I think it does this whenever it's creating the first decoy in the simulation. Other than a dedicated server, this makes it really hard to let Rosetta run on a workstation box. No new work for me until you fix this and I hear back. |
robertmiles Joined: 16 Jun 08 Posts: 1233 Credit: 14,284,221 RAC: 1,121 |
[quote]This is just ridiculous. Rosetta Mini 1.40 on Linux does NOT obey the BOINC API to suspend the task. I think it does this whenever it's creating the first decoy in the simulation.[/quote] Suggestion of how to handle at least part of a fix: Allow it to suspend even during the first decoy if the leave in memory option is selected, as long as paging to the swapfile won't hurt. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
What's with all this text that shows up in the stderr output? recovering checkpoint of tag S_U12X5X_00000001 with id abrelax_rg_state recovering checkpoint of tag S_U12X5X_00000001 with id stage_1 recovering checkpoint of tag S_U12X5X_00000001 with id stage_2 This keeps showing up in a lot of tasks. The tasks complete OK. |
Tony Joined: 12 Dec 05 Posts: 7 Credit: 6,724,341 RAC: 0 |
I think it is not only on Linux that minirosetta doesn't suspend. It seems to be like an unruly child that will not mind. In Windows, start the Task Manager to see all running processes and sort by CPU usage. Some of the processes seem to obey, but some keep running after a snooze or suspend. A restart seems to make it behave. I think it may be errors that will not let it stop the running task. I seem to be having lots of errors on three different computers I just started crunching with. |
Tony Joined: 12 Dec 05 Posts: 7 Credit: 6,724,341 RAC: 0 |
[quote]I think it is not only on Linux that minirosetta doesn't suspend. It seems to be like an unruly child that will not mind. In Windows, start the Task Manager to see all running processes and sort by CPU usage. Some of the processes seem to obey, but some keep running after a snooze or suspend. A restart seems to make it behave. I think it may be errors that will not let it stop the running task. I seem to be having lots of errors on three different computers I just started crunching with.[/quote] Mostly problems with a new computer I just built. It is not overclocked but is running Vista Ultimate 64-bit with 8 GB of memory and an AMD 9950 processor. |
Tony Joined: 12 Dec 05 Posts: 7 Credit: 6,724,341 RAC: 0 |
[quote]I think it is not only on Linux that minirosetta doesn't suspend. It seems to be like an unruly child that will not mind. In Windows, start the Task Manager to see all running processes and sort by CPU usage. Some of the processes seem to obey, but some keep running after a snooze or suspend. A restart seems to make it behave. I think it may be errors that will not let it stop the running task. I seem to be having lots of errors on three different computers I just started crunching with.[/quote] This is on an older computer. 12/4/2008 12:08:18 PM||Suspending computation - user is active 12/4/2008 12:08:18 PM||Suspending network activity - user is active 12/4/2008 12:08:35 PM|rosetta@home|Task cc_0_8_nocst4_homo_bench_foldcst_chunk_general_t303__olange_IGNORE_THE_REST_2GO7A_7_5161_15_0 exited with zero status but no 'finished' file 12/4/2008 12:08:35 PM|rosetta@home|If this happens repeatedly you may need to reset the project. This is repeated many times. With this message the task seems to be still running even though BOINC says computation is suspended. |
robertmiles Joined: 16 Jun 08 Posts: 1233 Credit: 14,284,221 RAC: 1,121 |
[quote]I think it is not only on Linux that minirosetta doesn't suspend. It seems to be like an unruly child that will not mind. In Windows, start the Task Manager to see all running processes and sort by CPU usage. Some of the processes seem to obey, but some keep running after a snooze or suspend. A restart seems to make it behave. I think it may be errors that will not let it stop the running task. I seem to be having lots of errors on three different computers I just started crunching with.[/quote] Under 32-bit Windows Vista SP1, my results indicate that the suspend problem occurs under Vista also, but not in all workunits. I suspect that it's only in workunits that use the new features added under minirosetta 1.39 and 1.40, and not even all of those. I would like to see Rosetta@home add the option to select which types of workunits a particular computer gets, in order to avoid some of the more problematic new types. |
robertmiles Joined: 16 Jun 08 Posts: 1233 Credit: 14,284,221 RAC: 1,121 |
[quote]This on an older computer.[/quote] The first part of that seems likely for workunits that go for a long time between checkpoints on machines that don't have enough memory to allow the workunit to stay in memory, and don't allow BOINC to use enough disk space and swap file space to save the current contents of the memory during user interruptions. For my computer, about US $50 worth of added memory put it up to the maximum amount of memory that model of computer can handle. |
mfbabb2 Joined: 10 Oct 08 Posts: 4 Credit: 10,345 RAC: 0 |
Running on Vista w/SP1: Computation Error and no apparent progress. Project has been reset several times. Rosetta used to work. 12/4/2008 11:57:10 AM|rosetta@home|Restarting task cc_1_0_nocst4_homo_bench_foldcst_chunk_general_t364__olange_IGNORE_THE_REST_1S5UA_5_5206_5_0 using minirosetta version 140 12/4/2008 11:57:51 AM|rosetta@home|Task cc_1_0_nocst4_homo_bench_foldcst_chunk_general_t364__olange_IGNORE_THE_REST_1S5UA_5_5206_5_0 exited with zero status but no 'finished' file 12/4/2008 11:57:51 AM|rosetta@home|If this happens repeatedly you may need to reset the project. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
The version being tested now on Ralph is 1.45. I'm pretty sure the issue with tasks not suspending when BOINC tells them to has been resolved. Hopefully coming very soon to Rosetta. Rosetta Moderator: Mod.Sense |
Ma3threeX Joined: 22 Aug 08 Posts: 3 Credit: 347,217 RAC: 0 |
I don't know if this is the best thread for it, but whatever. I now have at least 8 WUs that are 100% crunched and uploaded, but they don't disappear from the list; it seems like they're waiting for something. I also get a message from the Rosetta server: "Cant attach shared Memory". Does anybody know the problem? Greetings, Ma3threeX |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
The team moved the server to a new address. Just let BOINC Manager sort it out. It needs to get the new info from the new master file. Some guys are hitting update 10 times to get to the new master file, but the team says just let the program take its course; it will self-correct. [quote]i don't know if its the best thread for it but...whatever[/quote] |
Nicolai Joined: 21 Jun 08 Posts: 1 Credit: 142,530 RAC: 0 |
[quote]Running on Vista w/SP1:[/quote] I have been having the same problem for quite a while now... |
Alec Rosa Joined: 11 Nov 08 Posts: 18 Credit: 2,635 RAC: 0 |
I don't care anymore! Version 1.45 works! (For now anyway.) Yay. |
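Several posts above quote BOINC message-log lines ending in "exited with zero status but no 'finished' file". For anyone who wants to see which tasks keep hitting that, here is a rough Python sketch that counts those lines per task. It assumes the `timestamp|project|message` layout seen in the quoted logs; the function name and the sample variable are made up for illustration, and a real log would have to be saved to a file and read in by the user.

```python
import re
from collections import Counter

# Assumed line layout, taken from the log excerpts quoted in this thread:
#   <timestamp>|<project>|<message>
NO_FINISHED_FILE = re.compile(
    r"^[^|]*\|(?P<project>[^|]*)\|Task (?P<task>\S+) "
    r"exited with zero status but no 'finished' file$"
)


def count_no_finished_exits(lines):
    """Return a Counter mapping task name -> number of matching log lines."""
    counts = Counter()
    for line in lines:
        match = NO_FINISHED_FILE.match(line.strip())
        if match:
            counts[match.group("task")] += 1
    return counts


# Sample lines copied from the posts above (hypothetical variable name).
SAMPLE_LOG = """\
12/4/2008 12:08:18 PM||Suspending computation - user is active
12/4/2008 12:08:35 PM|rosetta@home|Task cc_0_8_nocst4_homo_bench_foldcst_chunk_general_t303__olange_IGNORE_THE_REST_2GO7A_7_5161_15_0 exited with zero status but no 'finished' file
12/4/2008 12:08:35 PM|rosetta@home|If this happens repeatedly you may need to reset the project.
12/4/2008 11:57:51 AM|rosetta@home|Task cc_1_0_nocst4_homo_bench_foldcst_chunk_general_t364__olange_IGNORE_THE_REST_1S5UA_5_5206_5_0 exited with zero status but no 'finished' file
"""

if __name__ == "__main__":
    # Print the tasks with the most repeats first.
    for task, n in count_no_finished_exits(SAMPLE_LOG.splitlines()).most_common():
        print(n, task)
```

A task that shows up many times in this count is a reasonable candidate to report in the thread, together with its result ID.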
©2024 University of Washington
https://www.bakerlab.org