Message boards : Number crunching : Problems with Rosetta versions 5.72 and 5.73
Author | Message |
---|---|
Rhiju Volunteer moderator Joined: 8 Jan 06 Posts: 223 Credit: 3,546 RAC: 0 |
Thanks to everyone for continuing to post and discuss issues with the applications! This version should not be much different from the previous one (5.70). For the aficionados, this is the "rosetta_beta" application getting updated. |
KWSN THE Holy Hand Grenade! Joined: 3 May 07 Posts: 5 Credit: 2,542,452 RAC: 0 |
"Thanks to everyone for continuing to post and discuss issues with the applications! This version should not be much different from the previous one (5.70). For the aficionados, this is the 'rosetta_beta' application getting updated." One thing I've noted that is different from both 5.68 and 5.70 is that the RMSD area doesn't scroll; when items are outside the current range |
Feet1st Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
Two of us failed on this one 1d3z_non_ideal_BOINC_MFR_ABRELAX_PICKED_1850 Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ |
Stevea Joined: 19 Dec 05 Posts: 50 Credit: 738,655 RAC: 0 |
This one never got started.. On either rig.. 1d3z_non_ideal_BOINC_MFR_ABRELAX_PICKED_1850_5161_0 https://boinc.bakerlab.org/rosetta/result.php?resultid=94897692 BETA = Bahhh Way too many errors, killing both the credit & RAC. And I still think the (New and Improved) credit system is not ready for prime time... |
Tulio Lazarini Joined: 3 Apr 07 Posts: 6 Credit: 55,216 RAC: 0 |
Hi! I'm crunching work units using BOINC 5.10.13 and Rosetta 5.72 under Windows Vista Home Premium. Yesterday, my computer froze suddenly. I did a hard reset (no response from the keyboard at all) and started all my other applications as I usually do. When I authorized BOINC to run (annoying thing: every time I turn the machine on, I need to tell Windows that BOINC Manager is a 'good boy' and can run, tsk, tsk...), my computer froze again three or four minutes later. Another restart, another run of BOINC, and this time the computer hung ten minutes later with a BSOD (blue screen of death) reporting a 'stack frame'-like problem. The dump was NOT generated (the computer froze first) and the error was not recorded in the event log. When I do not authorize BOINC to run, my machine flies gracefully. Is there a chance that Rosetta is causing the problem? My computer is an HP DV-6150 notebook with an AMD Turion64 x2 TL-50 processor, 1 GB of DDR2 667 MHz RAM, Windows Vista Home Premium, and a 120 GB SATA HD. Thanks in advance! |
Susie HomeMaker Joined: 12 Nov 06 Posts: 22 Credit: 2,511,881 RAC: 0 |
"Two of us failed on this one 1d3z_non_ideal_BOINC_MFR_ABRELAX_PICKED_1850" Got this one last night.... 1d3z_non_ideal_BOINC_MFR_ABRELAX_PICKED_1850_6515 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=86135842 |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
"Is there a chance that Rosetta is causing the problem?" It is more likely that your machine has something going on that is not exposed by other applications. Sometimes on a notebook there are heat problems, but seems unlikely they would occur so quickly after startup. But then again if you are in a hot room, or just turned it off and back on, then it is hot to begin with. Suggest that you try setting your general preference to use at most 50% of the CPU and see if that makes any difference. Once you make the change, you have to update to the project for your machine to use it. If that resolves it then it would seem to confirm you have a heat problem with that machine. Looks like you run SETI as well, you might try suspending one project and letting the other run and just see if it appears to make any difference which project is running. Rosetta Moderator: Mod.Sense |
Tulio Lazarini Joined: 3 Apr 07 Posts: 6 Credit: 55,216 RAC: 0 |
"Sometimes on a notebook there are heat problems, but seems unlikely they would occur so quickly after startup. But then again if you are in a hot room, or just turned it off and back on, then it is hot to begin with." I thought so too, initially, and when the computer froze for the first time, I turned it off and only switched it back on half an hour later. I'm Brazilian and it's winter here; the temperature is actually pleasant (18°C at night). After booting with a cold machine, the computer froze again. "Suggest that you try setting your general preference to use at most 50% of the CPU and see if that makes any difference. Once you make the change, you have to update to the project for your machine to use it. If that resolves it then it would seem to confirm you have a heat problem with that machine." The Turion64 x2 is a very powerful processor, and it really does generate a lot of heat. In the beginning, when I started crunching Rosetta@Home WUs, I noticed some overheating and decided to reduce CPU usage to 60%. My 'home' profile is defined with these settings. "Looks like you run SETI as well, you might try suspending one project and letting the other run and just see if it appears to make any difference which project is running." I was running Climateprediction.net until 2 weeks ago, and I lost all the work when migrating from XP to Vista. Currently my BOINC crunches only Rosetta@Home WUs. However, the processor has two cores, so two tasks run at the same time on the same machine. I might try suspending one of these tasks and running Rosetta on a single core, but I really don't believe overheating is the root cause. Are there any chances that the problem could be caused by some instruction (assembler instruction) on Rosetta 5.72 that executes fine on Intel and doesn't execute well on AMD? Or, maybe, a set of instructions Windows Vista recognizes as some dangerous stuff? Hope someone, with a computing environment near mine, could offer us some clue about this. Thanks for your help! |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
"Are there any chances that the problem could be caused by some instruction (assembler instruction) on Rosetta 5.72 that executes fine on Intel and doesn't execute well on AMD? Or, maybe, a set of instructions Windows Vista recognizes as some dangerous stuff?" If that were the case, there would be dozens of similar posts on the boards, but I've not seen them. If you review the task manager, do you see the idle task running half the time, thus confirming your preference of 60% of CPU is being used? In order for it to be used, you must select "run based on preferences" on the activity menu rather than "run always". Sometimes the exit status code on the tasks reported back gives some clues as to what to look for. Since your machines are hidden, your completed tasks are not visible to review the status codes (you can change this in your Rosetta Preferences). Did your work units happen to have the "non_ideal" in their names? The prior posts seem to show several are having issues with those. Rosetta Moderator: Mod.Sense |
Tulio Lazarini Joined: 3 Apr 07 Posts: 6 Credit: 55,216 RAC: 0 |
"If you review the task manager, do you see the idle task running half the time, thus confirming your preference of 60% of CPU is being used? In order for it to be used, you must select 'run based on preferences' on the activity menu rather than 'run always'." Yes, the CPU usage hovers around 60%, sometimes a bit above, sometimes a bit below, but near the limit. "Did your work units happen to have the 'non_ideal' in their names? The prior posts seem to show several are having issues with those." The task I'm currently crunching on my HP notebook with Vista and Rosetta 5.72 is "1bgf__BOINC_ABINITIO_SAVE_ALL_OUT-1bgf_-frags83__1838_3950_0". Well, in fact the error occurred again yesterday, and I tried to find an explanation for the STOP error I received: it's "Stop 0x00000077" or "KERNEL_STACK_INPAGE_ERROR", which occurs whenever Windows needs to page memory in or out and something goes wrong (e.g. the hard disk drive is not accessible or memory fails). In other words, this has absolutely nothing to do with Rosetta... I'm sorry about the false alarm; I think this is some kind of BIOS problem that occurs when the computer starts power-saving procedures, like shutting down the hard disk drive while it's in use by the system. Thanks anyway for your support! |
Rhiju Volunteer moderator Joined: 8 Jan 06 Posts: 223 Credit: 3,546 RAC: 0 |
This doesn't sound right -- we're looking into this job (1850) now. "This one never got started.." |
Stevea Joined: 19 Dec 05 Posts: 50 Credit: 738,655 RAC: 0 |
Yep, something's not right: 0 CPU seconds on each rig. BETA = Bahhh Way too many errors, killing both the credit & RAC. And I still think the (New and Improved) credit system is not ready for prime time... |
soriak Joined: 25 Oct 05 Posts: 102 Credit: 137,632 RAC: 0 |
My Vista Home Premium desktop computer has received a few BSODs as well - only noticed this since last night. (I usually work on my laptop) Didn't get them at all before. I seem to have completed a few 5.72 work units, so it may not be it... but I'm going to abort the upcoming ones anyway. I still have mostly 5.68 so no point risking more crashes, which only result in lost work. |
Tulio Lazarini Joined: 3 Apr 07 Posts: 6 Credit: 55,216 RAC: 0 |
"My Vista Home Premium desktop computer has received a few BSODs as well - only noticed this since last night. (I usually work on my laptop) Didn't get them at all before. I seem to have completed a few 5.72 work units, so it may not be it..." Well, I think it really is a coincidence that the BSODs appeared while Rosetta 5.72 beta was running. I started my machine in safe mode with network support, found some BIOS and hardware driver updates on the HP support site, installed them, and apparently the problem was solved. "...but I'm going to abort the upcoming ones anyway. I still have mostly 5.68 so no point risking more crashes, which only result in lost work." Before suspending any more Rosetta 5.72 beta tasks, I strongly suggest that you try some other things, such as: (1) start your machine in safe mode, and then restart in normal mode - Windows does some refreshing of the pagefile, prefetching, registry, I don't know for sure what it is - things may be better after this; (2) find updates for your BIOS, especially for the disk and memory controllers - I had an issue with AMD Hypertransport(tm) controller support on my old BIOS - it would definitely help; and (3) write down the STOP error code and diagnosis code, and contact the Microsoft Support site; some useful information may be obtained to solve your problem. Well, Vista user like me, GOOD LUCK!! You'll need it! |
adrianxw Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 10 |
Another crashed wu. Result ID 96127994 Name 1r69__BOINC_GENERIC__ABRELAX-1r69_-generic__1870_8796_0 Workunit 87041574 Created 30 Jul 2007 2:05:34 UTC Sent 30 Jul 2007 4:05:31 UTC Received 30 Jul 2007 6:59:34 UTC Server state Over Outcome Client error Client state Compute error Exit status 193 (0xc1) Computer ID 544079 Report deadline 9 Aug 2007 4:05:31 UTC CPU time 2376.856544 stderr out <core_client_version>5.8.16</core_client_version> <![CDATA[ <message> process exited with code 193 (0xc1) </message> <stderr_txt> Graphics are disabled due to configuration... # cpu_run_time_pref: 10800 # random seed: 2191065 SIGSEGV: segmentation violation Stack trace (14 frames): [0x8d3bd1b] [0x8d36b4c] [0xb7f54420] [0x8ca6153] [0x8bac6ba] [0x8bb26c0] [0x8c8db88] [0x84b42d8] [0x80d8651] [0x85ee9b7] [0x871c6e3] [0x871c78e] [0x8d9fc14] [0x8048111] Exiting... </stderr_txt> ]]> Validate state Invalid Claimed credit 3.44072738375063 Granted credit 0 application version 5.72 Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Rhiju Volunteer moderator Joined: 8 Jan 06 Posts: 223 Credit: 3,546 RAC: 0 |
I contacted the person in charge of this workunit, and we won't be sending out any more until it gets fixed... "This doesn't sound right -- we're looking into this job (1850) now." |
Rhiju Volunteer moderator Joined: 8 Jan 06 Posts: 223 Credit: 3,546 RAC: 0 |
We figured out this issue -- when this workunit gets resent, it should work fine. "I contacted the person in charge of this workunit, and we won't be sending out anymore until it gets fixed..." |
P . P . L . Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Not a problem, I guess; I just haven't seen this before, and the result was a small 11 KB. BTW, this was with 5.73. WARNING! Not sure non-ideal rotamers are compatible with symmetry yet... WARNING! Not sure non-ideal rotamers are compatible with symmetry yet... WARNING! Not sure non-ideal rotamers are compatible with symmetry yet... |
M.L. Joined: 21 Nov 06 Posts: 182 Credit: 180,462 RAC: 0 |
5.73 08/08/2007 16:38:52|rosetta@home|Computation for task 1c26__BOINC_SYMM_FOLD_AND_DOCK_RELAX-1c26_-foldanddock__1878_6494_0 finished In BOINC LOGX I see the message -- Warning! Not sure non-ideal rotamers are compatible with symmetry yet. etc etc.(as in message 44768). This does not appear in Rosie's message logs. Can I get an explanation? |
ramostol Joined: 6 Feb 07 Posts: 64 Credit: 584,052 RAC: 0 |
Result: 1fe6-task, 1 model totalling 15:22:48 After 14h 30 m crunching (and still continuing) this wu refused to display the graphic screen and the screensaver; the screensaver showed merely a black screen, the graphic display never appeared at all although the American flag was visible in the top menu bar as usual when this display is active. Half an hour earlier the display was normal. (And now, after starting another wu, the graphics once more display properly.) At 15:22 h I was rapidly approaching the upper limit for analyzing 1 model (default 4 hours x 4) and the model was still running, so I changed my preferences to 6 h and connected to the server to update my computer. Then I discovered that the computer was uploading the wu as completed. I have no idea if the wu really completed in exactly the same moment as I connected to the server, or if the wu in reality had finished earlier but decided to stay "running, high priority" anyhow. I seem to remember that the graphics likewise refused to appear once before in connection with another (large?) wu in a late crunching stage (with an earlier version of Boinc), so this case is not unique. Rosetta beta 5.73, Boinc 5.10.17. MacOS 10.3.9 |
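
The stderr output adrianxw posted above (exit status 193, SIGSEGV, a 14-frame stack trace) is the kind of detail that helps when triaging a failed task. As a rough sketch only, and not part of BOINC or Rosetta, the Python snippet below pulls the exit code, signal name, and stack frame addresses out of such a dump. The function name `summarize_stderr` and the regular expressions are assumptions based solely on the text format shown in that post; other client versions may print something different.

```python
import re

# Text copied from the stderr section of the failed result posted in this thread.
SAMPLE = (
    "<message> process exited with code 193 (0xc1) </message> "
    "<stderr_txt> Graphics are disabled due to configuration... "
    "# cpu_run_time_pref: 10800 # random seed: 2191065 "
    "SIGSEGV: segmentation violation Stack trace (14 frames): "
    "[0x8d3bd1b] [0x8d36b4c] [0xb7f54420] [0x8ca6153] [0x8bac6ba] "
    "[0x8bb26c0] [0x8c8db88] [0x84b42d8] [0x80d8651] [0x85ee9b7] "
    "[0x871c6e3] [0x871c78e] [0x8d9fc14] [0x8048111] Exiting... </stderr_txt>"
)


def summarize_stderr(text):
    """Hypothetical helper: extract the key fields from a stderr dump.

    The patterns are guesses from the one sample above, not a documented format.
    """
    exit_m = re.search(r"process exited with code (\d+) \((0x[0-9a-fA-F]+)\)", text)
    signal_m = re.search(r"\b(SIG[A-Z]+)\b", text)
    declared_m = re.search(r"Stack trace \((\d+) frames\)", text)
    # Frame addresses are printed in square brackets, e.g. [0x8d3bd1b].
    frames = re.findall(r"\[(0x[0-9a-fA-F]+)\]", text)
    return {
        "exit_code": int(exit_m.group(1)) if exit_m else None,
        # Sanity check: the decimal and hex forms should be the same number.
        "exit_hex_matches": (
            int(exit_m.group(2), 16) == int(exit_m.group(1)) if exit_m else None
        ),
        "signal": signal_m.group(1) if signal_m else None,
        "declared_frames": int(declared_m.group(1)) if declared_m else None,
        "frames": frames,
    }


if __name__ == "__main__":
    for key, value in summarize_stderr(SAMPLE).items():
        print(f"{key}: {value}")
```

Comparing `declared_frames` with `len(frames)` is a cheap way to notice a truncated paste before posting it to the boards. The raw addresses on their own do not identify the failing function.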