Message boards : Number crunching : Problems with Rosetta version 5.68
Author | Message |
---|---|
bblum Send message Joined: 15 Aug 06 Posts: 6 Credit: 4,077 RAC: 0 |
Hi all. Please let us know here if any issues arise with the new Rosetta version. |
Stevea Send message Joined: 19 Dec 05 Posts: 50 Credit: 738,655 RAC: 0 |
This rig has not had a w/u crash in over a month. Now its first 5.68 w/u crashed. https://boinc.bakerlab.org/rosetta/result.php?resultid=83768981 BETA = Bahhh Way too many errors, killing both the credit & RAC. And I still think the (New and Improved) credit system is not ready for prime time... |
Thomas Leibold Send message Joined: 30 Jul 06 Posts: 55 Credit: 19,627,164 RAC: 0 |
Out of the first dozen or more workunits with 5.68 (different computers) three failed. Workunit #s: 73754058, 73753552, 73753540 ResultId #s: 83930186, 83928164, 83928154 The successful 5.68 workunits had 'TREEJUMP' in their name while all the failing ones had 'CHAINBREAKS' and 'ALTSECSTRUCT' in their name. <core_client_version>5.8.17</core_client_version> <![CDATA[ <message> process exited with code 1 (0x1) </message> <stderr_txt> Graphics are disabled due to configuration... # cpu_run_time_pref: 28800 trouble finding jump_templates_RNA_basepairs_v2.dat ERROR:: Exit from: read_paths.cc line: 360 </stderr_txt> ]]> Team Helix |
svincent Send message Joined: 30 Dec 05 Posts: 219 Credit: 12,120,035 RAC: 0 |
Workunit 75682536 stuck at 0% progress for several hours. |
Peter Send message Joined: 12 Sep 06 Posts: 3 Credit: 6,800,495 RAC: 16 |
I know these versions are supposed to be distributed automatically, but I am still running 5.64. I have a continuous web connection, so that isn't the problem. Are the new versions announced? I see no way to download a new version myself. I don't think I have ever received an e-mail from R@H. Another problem (not with 5.68): A new BOINC version downloaded a large number of WU's at one time, which are all due today; about 10 are not done. Should I abort them before the deadline, or will they still count if finished after the deadline? I have turned off getting new WU's until the backlog of R@H and seti@Home is cleaned up. |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Peter, it is your pile of tasks which has caused you not to see version 5.68 come down yet. The version is defined at the time the task is sent to you. If I were you, I'd allow new work and just abort those that you cannot complete prior to the deadline. Then you should get more tasks downloaded. Don't worry about setting the "no new work" just because you've already got too many tasks. BOINC will figure that out as well and not request any more work. Rosetta Moderator: Mod.Sense |
Astro Send message Joined: 2 Oct 05 Posts: 987 Credit: 500,253 RAC: 0 |
Peter, you could set the projects to "no new work", then change your rosetta "run time" pref to 1 hour, then update the project. Run those wus until finished or deadline has been reached. Reset your "run time" pref to normal, then update the project, and then allow new work. If you leave it at 1 hour, and don't stop getting work, then it'll get a whole bunch of wus to fill your cache based upon that pref. |
Paul Oosterlaken Send message Joined: 10 Sep 06 Posts: 2 Credit: 12,923,494 RAC: 3,376 |
For a few days now I have not received any work. Is this down to the new version or am I doing something wrong? Thanks Paul |
Knorr Send message Joined: 18 Feb 06 Posts: 21 Credit: 373,953 RAC: 0 |
For a few days now I have not received any work. Is this down to the new version or am I doing something wrong? I see you're running SETI, so I would think the problem is debt related. Because SETI has had a lot of downtime lately, you have been running Rosetta more than your resource share is set for. So now BOINC is paying its debt to SETI back. If you want the host to run Rosetta you can: 1. Wait until the debt to SETI is repaid 2. Raise your resource share for Rosetta 3. Reset the debt in client_state.xml (Note: debt is offsetting, so you have to reset both the Rosetta and SETI long-term debt.) |
Stevea Send message Joined: 19 Dec 05 Posts: 50 Credit: 738,655 RAC: 0 |
Here is another w/u that crashed. This is from a different rig that has never had a w/u crash before. https://boinc.bakerlab.org/rosetta/result.php?resultid=84070989 Result ID 84070989 Name 1n0u__TREEJUMP_ABRELAX__NEWRELAXFLAGS_TJTOP3_TOR_BARCODE_BARCODE__1769_4930_0 Workunit 75769002 Created 3 Jun 2007 12:14:12 UTC Sent 3 Jun 2007 12:15:25 UTC Received 4 Jun 2007 13:03:13 UTC Server state Over Outcome Client error Client state Compute error Exit status 1 (0x1) Computer ID 411074 Report deadline 13 Jun 2007 12:15:25 UTC CPU time 6080.59375 stderr out <core_client_version>5.8.16</core_client_version> <![CDATA[ <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> # cpu_run_time_pref: 14400 # random seed: 10737 sin_cos_range ERROR: -1.#IND000 is outside of [-1,+1] sin and cos value legal range sin_cos_range ERROR: 1.#QNAN00 is outside of [-1,+1] sin and cos value legal range sin_cos_range ERROR: 1.#QNAN00 is outside of [-1,+1] sin and cos value legal range sin_cos_range ERROR: -1.#IND000 is outside of [-1,+1] sin and cos value legal range sin_cos_range ERROR: -1.#IND000 is outside of [-1,+1] sin and cos value legal range sin_cos_range ERROR: -1.#IND000 is outside of [-1,+1] sin and cos value legal range it goes on and on.... BETA = Bahhh Way too many errors, killing both the credit & RAC. And I still think the (New and Improved) credit system is not ready for prime time... |
Paul Oosterlaken Send message Joined: 10 Sep 06 Posts: 2 Credit: 12,923,494 RAC: 3,376 |
Knorr, Thanks I did not know it was that precise between the projects. Paul |
Stevea Send message Joined: 19 Dec 05 Posts: 50 Credit: 738,655 RAC: 0 |
Here is another from a different rig that has never had a w/u crash before. Result ID 84162036 Name 1gidA_BOINC_MG_CHAINBREAK5_RNA_ABINITIO_RNA_CONTACT_RNA_LONG_RANGE_CONTACT_RNA_SASA-1gidA-_1734_87054_1 Workunit 73952005 Created 3 Jun 2007 21:01:08 UTC Sent 3 Jun 2007 21:02:21 UTC Received 4 Jun 2007 20:53:12 UTC Server state Over Outcome Client error Client state Compute error Exit status 1 (0x1) Computer ID 410071 Report deadline 13 Jun 2007 21:02:21 UTC CPU time 6 stderr out <core_client_version>5.8.16</core_client_version> <![CDATA[ <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> # cpu_run_time_pref: 14400 trouble finding jump_templates_RNA_basepairs_v2.dat ERROR:: Exit from: .read_paths.cc line: 360 </stderr_txt> ]]> Validate state Invalid Claimed credit 0.0264771778312024 Granted credit 0 application version 5.68 BETA = Bahhh Way too many errors, killing both the credit & RAC. And I still think the (New and Improved) credit system is not ready for prime time... |
M.L. Send message Joined: 21 Nov 06 Posts: 182 Credit: 180,462 RAC: 0 |
This WU ran for 8 seconds; it seems to be a similar error to that in message 41834. Result ID 84242806 Name 1gidA_BOINC_MG_SASAPAIR_EVENRES_RNA_ABINITIO_SAVE_ALL_OUT_BARCODE_RNA_CONTACT_RNA_LONG_RANGE_CONTACT_RNA_SASA-1gidA-_1760_15471_1 Workunit 74015516 Created 4 Jun 2007 5:54:29 UTC Sent 4 Jun 2007 5:55:03 UTC Received 4 Jun 2007 23:25:09 UTC Server state Over Outcome Client error Client state Compute error Exit status 1 (0x1) Computer ID 510574 Report deadline 14 Jun 2007 5:55:03 UTC CPU time 8.0625 stderr out <core_client_version>5.8.16</core_client_version> <![CDATA[ <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> # cpu_run_time_pref: 14400 # random seed: 1669100 trouble finding jump_templates_RNA_basepairs_v2.dat ERROR:: Exit from: .read_paths.cc line: 360 </stderr_txt> ]]> Validate state Invalid Claimed credit 0.0242118324118417 Granted credit 0 application version 5.68 |
turbo_lag Send message Joined: 30 May 07 Posts: 1 Credit: 0 RAC: 0 |
I recently updated BOINC (5.8.16 on win2k) and joined Rosetta. All of my work is marked as client error. The message logs show checksum errors on some text files for every download session: 6/4/2007 11:00:14 PM|rosetta@home|[file_xfer] Started download of file bblum_lars_description2.txt 6/4/2007 11:00:14 PM|rosetta@home|[file_xfer] Started download of file 1ogw_.fasta.gz 6/4/2007 11:00:15 PM|rosetta@home|Incomplete read of 437.000000 < 5KB for bblum_lars_description2.txt - truncating 6/4/2007 11:00:15 PM|rosetta@home|[file_xfer] Finished download of file bblum_lars_description2.txt 6/4/2007 11:00:15 PM|rosetta@home|[file_xfer] Throughput 1755 bytes/sec 6/4/2007 11:00:15 PM|rosetta@home|[file_xfer] Finished download of file 1ogw_.fasta.gz 6/4/2007 11:00:15 PM|rosetta@home|[file_xfer] Throughput 543 bytes/sec 6/4/2007 11:00:15 PM|rosetta@home|[file_xfer] Started download of file 1ogw_.psipred_ss2.gz 6/4/2007 11:00:15 PM|rosetta@home|[file_xfer] Started download of file cc1ogw_03_05.200_v1_3.gz 6/4/2007 11:00:15 PM|rosetta@home|[error] Checksum or signature error for bblum_lars_description2.txt 6/4/2007 11:00:16 PM|rosetta@home|[file_xfer] Finished download of file 1ogw_.psipred_ss2.gz 6/4/2007 11:00:16 PM|rosetta@home|[file_xfer] Throughput 2535 bytes/sec 6/4/2007 11:00:16 PM|rosetta@home|[file_xfer] Started download of file cc1ogw_09_05.200_v1_3.gz 6/4/2007 11:01:24 PM|rosetta@home|[file_xfer] Finished download of file cc1ogw_03_05.200_v1_3.gz 6/4/2007 11:01:24 PM|rosetta@home|[file_xfer] Throughput 14320 bytes/sec 6/4/2007 11:01:24 PM|rosetta@home|[file_xfer] Started download of file ccfrags200.txt 6/4/2007 11:01:26 PM|rosetta@home|Incomplete read of 1478.000000 < 5KB for ccfrags200.txt - truncating 6/4/2007 11:01:26 PM|rosetta@home|[file_xfer] Finished download of file ccfrags200.txt 6/4/2007 11:01:26 PM|rosetta@home|[file_xfer] Throughput 1132 bytes/sec 6/4/2007 11:01:26 PM|rosetta@home|[file_xfer] Started download of file 1ogw.pdb.gz 6/4/2007 11:01:26 PM|rosetta@home|[error] Checksum or signature error for ccfrags200.txt 6/4/2007 11:01:29 PM|rosetta@home|[file_xfer] Finished download of file 1ogw.pdb.gz 6/4/2007 11:01:29 PM|rosetta@home|[file_xfer] Throughput 3551 bytes/sec 6/4/2007 11:01:29 PM|rosetta@home|[file_xfer] Started download of file 1ogw__LA_barcode06.txt.gz 6/4/2007 11:01:36 PM|rosetta@home|[file_xfer] Finished download of file 1ogw__LA_barcode06.txt.gz 6/4/2007 11:01:36 PM|rosetta@home|[file_xfer] Throughput 6170 bytes/sec 6/4/2007 11:02:14 PM|rosetta@home|[file_xfer] Finished download of file cc1ogw_09_05.200_v1_3.gz 6/4/2007 11:02:14 PM|rosetta@home|[file_xfer] Throughput 18925 bytes/sec Is this a configuration error on my side? Thanks. |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
this WU 1gidA was missing a .dat file and crashed. |
Fullhouse07 Send message Joined: 10 Sep 06 Posts: 11 Credit: 14,703,260 RAC: 0 |
to: Dr. Who Fan, Nice Move, you just gave the WORLD someone's IP Address or in other words his phone number away. I Hope the person you violated has the ability to block hackers and help us as a group to continue crunching to reach a cure! FirePage |
MattDavis Send message Joined: 22 Sep 05 Posts: 206 Credit: 1,377,748 RAC: 0 |
to: Umm, all his sig does is show each reader his own IP and ISP. I see mine, you see yours, etc. etc. |
Fullhouse07 Send message Joined: 10 Sep 06 Posts: 11 Credit: 14,703,260 RAC: 0 |
to: Mat, you are right, but I still hope this cruncher is not hacked. |
AMD_is_logical Send message Joined: 20 Dec 05 Posts: 299 Credit: 31,460,681 RAC: 0 |
I have a number of WUs that errored out after a few seconds. In all cases, the WU was an old one that was re-issued requesting 5.68. It seems that these old WUs weren't tested with the new Rosetta version. https://boinc.bakerlab.org/rosetta/workunit.php?wuid=73901800 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=73894861 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=73971201 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=73987715 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=74072673 |
Odysseus Send message Joined: 3 May 07 Posts: 14 Credit: 241,831 RAC: 0 |
Another crash from v5.68 on my G4 Mac, about twenty minutes into 2tif__LARS_ABRELAX_SAVE_ALL_OUT-2tif_-_BARCODE__1775_7010, with exit code 6 (0x6); aside from the crash-dump the output says SIGBUS: bus error and finishes with Exiting... pure virtual method called terminate called without an active exception Caught SIGABRT in graphics thread |
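
The failures reported in this thread fall into a few recognisable patterns in the stderr output and the BOINC message log: a missing data file ("trouble finding ..."), non-numeric values reaching sin_cos_range, a SIGBUS bus error, and checksum errors on downloaded files. For anyone sorting through a batch of failed results, here is a minimal sketch that tags pasted log text with those patterns. It assumes the text has been copied by hand from a result page or the Messages tab; the function name `classify` and the labels are made up for illustration and are not part of BOINC or Rosetta.

```python
import re

# Patterns copied from the stderr excerpts and message logs quoted in this
# thread. The labels on the left are hypothetical names chosen for this sketch.
PATTERNS = [
    ("missing_data_file", re.compile(r"trouble finding (\S+)")),
    ("nan_in_sin_cos", re.compile(r"sin_cos_range ERROR: (\S+) is outside")),
    ("bus_error", re.compile(r"SIGBUS: bus error")),
    ("checksum_error", re.compile(r"Checksum or signature error for (\S+)")),
]


def classify(text):
    """Return a sorted list of unique (label, detail) pairs found in text.

    detail is the captured file name or value, or "" when the pattern
    has no capture group.
    """
    found = set()
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            detail = match.group(1) if pattern.groups else ""
            found.add((label, detail))
    return sorted(found)


if __name__ == "__main__":
    sample = (
        "# cpu_run_time_pref: 14400 trouble finding "
        "jump_templates_RNA_basepairs_v2.dat ERROR:: Exit from: "
        ".read_paths.cc line: 360"
    )
    for label, detail in classify(sample):
        print(label, detail)
```

Repeated lines collapse into one entry, so a result that prints the same sin_cos_range error hundreds of times still yields only one pair per distinct value.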
©2024 University of Washington
https://www.bakerlab.org