Problems with Rosetta version 5.68

bblum
Joined: 15 Aug 06
Posts: 6
Credit: 4,077
RAC: 0
Message 41741 - Posted: 2 Jun 2007, 0:53:37 UTC

Hi all. Please let us know here if any issues arise with the new Rosetta version.
ID: 41741 · Rating: 1
Stevea
Joined: 19 Dec 05
Posts: 50
Credit: 738,655
RAC: 0
Message 41784 - Posted: 3 Jun 2007, 4:19:01 UTC

This rig has not had a w/u crash in over a month.
Now its first 5.68 w/u has crashed.

https://boinc.bakerlab.org/rosetta/result.php?resultid=83768981
BETA = Bahhh

Way too many errors, killing both the credit & RAC.

And I still think the (New and Improved) credit system is not ready for prime time...
ID: 41784 · Rating: 0
Thomas Leibold
Joined: 30 Jul 06
Posts: 55
Credit: 19,627,164
RAC: 0
Message 41786 - Posted: 3 Jun 2007, 4:48:22 UTC - in response to Message 41741.  

Of the first dozen or so workunits run with 5.68 (across different computers), three failed.
Workunit #s: 73754058, 73753552, 73753540
ResultId #s: 83930186, 83928164, 83928154

The successful 5.68 workunits had 'TREEJUMP' in their name while all the failing ones had 'CHAINBREAKS' and 'ALTSECSTRUCT' in their name.

<core_client_version>5.8.17</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1)
</message>
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 28800
trouble finding jump_templates_RNA_basepairs_v2.dat
ERROR:: Exit from: read_paths.cc line: 360

</stderr_txt>
]]>
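
The stderr block above names both the missing data file and the source location of the exit. Here is a minimal sketch of pulling those fields out of a result's stderr text so that failing results can be compared quickly. The function name and the regular expressions are illustrative assumptions, based only on the two error lines quoted in this thread.

```python
import re

# Patterns are assumptions derived from the two error lines quoted in this thread.
MISSING_RE = re.compile(r"trouble finding (\S+)")
# Some reports show ".read_paths.cc", others "read_paths.cc"; accept both.
EXIT_RE = re.compile(r"ERROR:: Exit from: \.?(\S+) line: (\d+)")


def summarize_stderr(text):
    """Return the missing file and exit location found in stderr text, if any."""
    missing = MISSING_RE.search(text)
    exit_at = EXIT_RE.search(text)
    return {
        "missing_file": missing.group(1) if missing else None,
        "source_file": exit_at.group(1) if exit_at else None,
        "line": int(exit_at.group(2)) if exit_at else None,
    }


sample = """# cpu_run_time_pref: 28800
trouble finding jump_templates_RNA_basepairs_v2.dat
ERROR:: Exit from: read_paths.cc line: 360
"""
print(summarize_stderr(sample))
```

Results that share the same missing file and exit location are likely failing for the same reason, which is one way to check the pattern seen in the workunit names.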
Team Helix
ID: 41786 · Rating: 0
svincent
Joined: 30 Dec 05
Posts: 219
Credit: 12,120,035
RAC: 0
Message 41802 - Posted: 3 Jun 2007, 16:11:35 UTC

Workunit 75682536 has been stuck at 0% progress for several hours.
ID: 41802 · Rating: 0
Peter
Joined: 12 Sep 06
Posts: 3
Credit: 6,800,495
RAC: 16
Message 41806 - Posted: 3 Jun 2007, 20:36:46 UTC

I know these versions are supposed to be distributed automatically, but I am still running 5.64. I have a continuous web connection, so that isn't the problem. Are the new versions announced? I see no way to download a new version myself. I don't think I have ever received an e-mail from R@H.

Another problem (not with 5.68): a new BOINC version downloaded a large number of WUs at one time, all of which are due today; about 10 are not done. Should I abort them before the deadline, or will they still count if finished after the deadline?
I have turned off getting new WUs until the backlog of R@H and SETI@home is cleared.


ID: 41806 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 41807 - Posted: 3 Jun 2007, 21:26:43 UTC

Peter, it is your pile of tasks which has caused you not to see version 5.68 come down yet. The version is defined at the time the task is sent to you.

If I were you, I'd allow new work and just abort the tasks that you cannot complete before the deadline. Then you should get more tasks downloaded. Don't worry about setting "no new work" just because you already have too many tasks. BOINC will figure that out as well and not request any more work.
Rosetta Moderator: Mod.Sense
ID: 41807 · Rating: 0
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 41808 - Posted: 3 Jun 2007, 21:26:53 UTC
Last modified: 3 Jun 2007, 21:27:52 UTC

Peter, you could set the projects to "no new work", then change your rosetta "run time" pref to 1 hour, then update the project. Run those wus until finished or deadline has been reached. Reset your "run time" pref to normal, then update the project, and then allow new work. If you leave it at 1 hour, and don't stop getting work, then it'll get a whole bunch of wus to fill your cache based upon that pref.
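
The sequence above can be written down as an ordered checklist. The sketch below only builds and prints the steps; it does not run anything. It assumes the `boinccmd` command-line tool (named `boinc_cmd` in some older clients) and its `--project URL nomorework|update|allowmorework` operations, and it assumes the project URL as it appears in this thread. The run time preference itself is changed on the project website, so those steps are listed as manual reminders.

```python
# Assumption: project URL as it appears in links in this thread.
PROJECT_URL = "https://boinc.bakerlab.org/rosetta/"


def short_run_time_plan(project_url=PROJECT_URL, tool="boinccmd"):
    """Return the ordered steps; steps done by hand are prefixed with 'MANUAL:'."""

    def cmd(operation):
        return f"{tool} --project {project_url} {operation}"

    return [
        cmd("nomorework"),
        "MANUAL: set the Rosetta run time preference to 1 hour on the website",
        cmd("update"),
        "MANUAL: let the queued tasks finish or reach their deadline",
        "MANUAL: set the run time preference back to its normal value",
        cmd("update"),
        cmd("allowmorework"),
    ]


for step in short_run_time_plan():
    print(step)
```

The order matters: new work is allowed again only after the preference is back to normal, otherwise the client would fill its cache based on the 1 hour setting.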
ID: 41808 · Rating: 0
Paul Oosterlaken
Joined: 10 Sep 06
Posts: 2
Credit: 12,923,494
RAC: 3,376
Message 41813 - Posted: 3 Jun 2007, 21:53:57 UTC

For the past few days I have not received any work. Is this down to the new version, or am I doing something wrong?

Thanks Paul
ID: 41813 · Rating: 0
Knorr
Joined: 18 Feb 06
Posts: 21
Credit: 373,953
RAC: 0
Message 41818 - Posted: 4 Jun 2007, 8:41:03 UTC - in response to Message 41813.  

I see you're running SETI, so I would think the problem is debt related. Because SETI has had a lot of downtime lately, you have been running Rosetta more than your resource share is set for. So now BOINC is paying its debt back to SETI.

If you want the host to run Rosetta you can:

1. Wait until the debt to SETI is repaid
2. Raise your resource share for Rosetta
3. Reset the debt in client_state.xml (Note: debt is offsetting, so you have to reset both the Rosetta and the SETI long-term debt.)
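
The point that debt is offsetting can be illustrated with a simplified bookkeeping model. This is a sketch under stated assumptions, not BOINC's actual scheduler code: each project is owed CPU time in proportion to its resource share, and its debt grows by what it was owed minus what it actually received. The function name and all numbers are hypothetical.

```python
def update_debts(debts, shares, cpu_seconds):
    """Simplified long-term debt update (illustrative model only)."""
    total_share = sum(shares.values())
    total_cpu = sum(cpu_seconds.values())
    updated = {}
    for name, debt in debts.items():
        # CPU time this project was entitled to under its resource share.
        entitled = total_cpu * shares[name] / total_share
        updated[name] = debt + entitled - cpu_seconds.get(name, 0.0)
    return updated


debts = {"rosetta": 0.0, "seti": 0.0}
shares = {"rosetta": 100, "seti": 100}

# Hypothetical day in which SETI is down and Rosetta gets all 86400 seconds.
debts = update_debts(debts, shares, {"rosetta": 86400.0, "seti": 0.0})
print(debts)  # {'rosetta': -43200.0, 'seti': 43200.0}
```

In this model the debts always sum to zero, so the client favors SETI until the balance is restored, and resetting only one project's value would leave the other out of balance.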
ID: 41818 · Rating: 0
Stevea
Joined: 19 Dec 05
Posts: 50
Credit: 738,655
RAC: 0
Message 41831 - Posted: 4 Jun 2007, 18:09:45 UTC
Last modified: 4 Jun 2007, 18:11:42 UTC

Here is another w/u that crashed. This is from a different rig that has never had a w/u crash before.

https://boinc.bakerlab.org/rosetta/result.php?resultid=84070989

Result ID 84070989
Name 1n0u__TREEJUMP_ABRELAX__NEWRELAXFLAGS_TJTOP3_TOR_BARCODE_BARCODE__1769_4930_0
Workunit 75769002
Created 3 Jun 2007 12:14:12 UTC
Sent 3 Jun 2007 12:15:25 UTC
Received 4 Jun 2007 13:03:13 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 411074
Report deadline 13 Jun 2007 12:15:25 UTC
CPU time 6080.59375
stderr out

<core_client_version>5.8.16</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# cpu_run_time_pref: 14400
# random seed: 10737
sin_cos_range ERROR: -1.#IND000 is outside of [-1,+1] sin and cos value legal range
sin_cos_range ERROR: 1.#QNAN00 is outside of [-1,+1] sin and cos value legal range
sin_cos_range ERROR: 1.#QNAN00 is outside of [-1,+1] sin and cos value legal range
sin_cos_range ERROR: -1.#IND000 is outside of [-1,+1] sin and cos value legal range
sin_cos_range ERROR: -1.#IND000 is outside of [-1,+1] sin and cos value legal range
sin_cos_range ERROR: -1.#IND000 is outside of [-1,+1] sin and cos value legal range

it goes on and on....




ID: 41831 · Rating: 0
Paul Oosterlaken
Joined: 10 Sep 06
Posts: 2
Credit: 12,923,494
RAC: 3,376
Message 41833 - Posted: 4 Jun 2007, 22:55:07 UTC

Knorr, thanks. I did not know the balancing between the projects was that precise.

Paul
ID: 41833 · Rating: 0
Stevea
Joined: 19 Dec 05
Posts: 50
Credit: 738,655
RAC: 0
Message 41834 - Posted: 4 Jun 2007, 22:55:45 UTC

Here is another from a different rig that has never had a w/u crash before.

Result ID 84162036
Name 1gidA_BOINC_MG_CHAINBREAK5_RNA_ABINITIO_RNA_CONTACT_RNA_LONG_RANGE_CONTACT_RNA_SASA-1gidA-_1734_87054_1
Workunit 73952005
Created 3 Jun 2007 21:01:08 UTC
Sent 3 Jun 2007 21:02:21 UTC
Received 4 Jun 2007 20:53:12 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 410071
Report deadline 13 Jun 2007 21:02:21 UTC
CPU time 6
stderr out

<core_client_version>5.8.16</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# cpu_run_time_pref: 14400
trouble finding jump_templates_RNA_basepairs_v2.dat
ERROR:: Exit from: .read_paths.cc line: 360

</stderr_txt>
]]>

Validate state Invalid
Claimed credit 0.0264771778312024
Granted credit 0
application version 5.68
ID: 41834 · Rating: 0
M.L.
Joined: 21 Nov 06
Posts: 182
Credit: 180,462
RAC: 0
Message 41835 - Posted: 5 Jun 2007, 0:00:07 UTC
Last modified: 5 Jun 2007, 0:03:54 UTC

This WU ran for 8 seconds; it seems to be a similar error to the one in message 41834.

Result ID 84242806
Name 1gidA_BOINC_MG_SASAPAIR_EVENRES_RNA_ABINITIO_SAVE_ALL_OUT_BARCODE_RNA_CONTACT_RNA_LONG_RANGE_CONTACT_RNA_SASA-1gidA-_1760_15471_1
Workunit 74015516
Created 4 Jun 2007 5:54:29 UTC
Sent 4 Jun 2007 5:55:03 UTC
Received 4 Jun 2007 23:25:09 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 510574
Report deadline 14 Jun 2007 5:55:03 UTC
CPU time 8.0625
stderr out

<core_client_version>5.8.16</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# cpu_run_time_pref: 14400
# random seed: 1669100
trouble finding jump_templates_RNA_basepairs_v2.dat
ERROR:: Exit from: .read_paths.cc line: 360

</stderr_txt>
]]>


Validate state Invalid
Claimed credit 0.0242118324118417
Granted credit 0
application version 5.68

ID: 41835 · Rating: 0
turbo_lag
Joined: 30 May 07
Posts: 1
Credit: 0
RAC: 0
Message 41842 - Posted: 5 Jun 2007, 6:40:16 UTC

I recently updated BOINC (5.8.16 on Win2k) and joined Rosetta. All of my work is marked as client error. The message logs show checksum errors on some text files for every download session:

6/4/2007 11:00:14 PM|rosetta@home|[file_xfer] Started download of file bblum_lars_description2.txt
6/4/2007 11:00:14 PM|rosetta@home|[file_xfer] Started download of file 1ogw_.fasta.gz
6/4/2007 11:00:15 PM|rosetta@home|Incomplete read of 437.000000 < 5KB for bblum_lars_description2.txt - truncating
6/4/2007 11:00:15 PM|rosetta@home|[file_xfer] Finished download of file bblum_lars_description2.txt
6/4/2007 11:00:15 PM|rosetta@home|[file_xfer] Throughput 1755 bytes/sec
6/4/2007 11:00:15 PM|rosetta@home|[file_xfer] Finished download of file 1ogw_.fasta.gz
6/4/2007 11:00:15 PM|rosetta@home|[file_xfer] Throughput 543 bytes/sec
6/4/2007 11:00:15 PM|rosetta@home|[file_xfer] Started download of file 1ogw_.psipred_ss2.gz
6/4/2007 11:00:15 PM|rosetta@home|[file_xfer] Started download of file cc1ogw_03_05.200_v1_3.gz
6/4/2007 11:00:15 PM|rosetta@home|[error] Checksum or signature error for bblum_lars_description2.txt
6/4/2007 11:00:16 PM|rosetta@home|[file_xfer] Finished download of file 1ogw_.psipred_ss2.gz
6/4/2007 11:00:16 PM|rosetta@home|[file_xfer] Throughput 2535 bytes/sec
6/4/2007 11:00:16 PM|rosetta@home|[file_xfer] Started download of file cc1ogw_09_05.200_v1_3.gz
6/4/2007 11:01:24 PM|rosetta@home|[file_xfer] Finished download of file cc1ogw_03_05.200_v1_3.gz
6/4/2007 11:01:24 PM|rosetta@home|[file_xfer] Throughput 14320 bytes/sec
6/4/2007 11:01:24 PM|rosetta@home|[file_xfer] Started download of file ccfrags200.txt
6/4/2007 11:01:26 PM|rosetta@home|Incomplete read of 1478.000000 < 5KB for ccfrags200.txt - truncating
6/4/2007 11:01:26 PM|rosetta@home|[file_xfer] Finished download of file ccfrags200.txt
6/4/2007 11:01:26 PM|rosetta@home|[file_xfer] Throughput 1132 bytes/sec
6/4/2007 11:01:26 PM|rosetta@home|[file_xfer] Started download of file 1ogw.pdb.gz
6/4/2007 11:01:26 PM|rosetta@home|[error] Checksum or signature error for ccfrags200.txt
6/4/2007 11:01:29 PM|rosetta@home|[file_xfer] Finished download of file 1ogw.pdb.gz
6/4/2007 11:01:29 PM|rosetta@home|[file_xfer] Throughput 3551 bytes/sec
6/4/2007 11:01:29 PM|rosetta@home|[file_xfer] Started download of file 1ogw__LA_barcode06.txt.gz
6/4/2007 11:01:36 PM|rosetta@home|[file_xfer] Finished download of file 1ogw__LA_barcode06.txt.gz
6/4/2007 11:01:36 PM|rosetta@home|[file_xfer] Throughput 6170 bytes/sec
6/4/2007 11:02:14 PM|rosetta@home|[file_xfer] Finished download of file cc1ogw_09_05.200_v1_3.gz
6/4/2007 11:02:14 PM|rosetta@home|[file_xfer] Throughput 18925 bytes/sec

Is this a configuration error on my side? Thanks.
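
To see at a glance which downloads failed verification, the message log can be scanned for the error lines. A minimal sketch, assuming the `date|project|message` layout and the exact error wording shown in the log above; the function name is hypothetical.

```python
import re

# Assumption: error wording exactly as it appears in the log quoted above.
CHECKSUM_RE = re.compile(r"\[error\] Checksum or signature error for (\S+)")


def files_with_checksum_errors(log_lines):
    """Return file names that failed verification, in first-seen order."""
    failed = []
    for line in log_lines:
        parts = line.split("|", 2)  # timestamp | project | message
        if len(parts) != 3:
            continue
        match = CHECKSUM_RE.search(parts[2])
        if match and match.group(1) not in failed:
            failed.append(match.group(1))
    return failed


sample = [
    "6/4/2007 11:00:15 PM|rosetta@home|[file_xfer] Finished download of file 1ogw_.fasta.gz",
    "6/4/2007 11:00:15 PM|rosetta@home|[error] Checksum or signature error for bblum_lars_description2.txt",
    "6/4/2007 11:01:26 PM|rosetta@home|[error] Checksum or signature error for ccfrags200.txt",
]
print(files_with_checksum_errors(sample))
```

In the log above, both files that fail are plain .txt files that were also reported as an "Incomplete read", while the .gz downloads finish without errors.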
ID: 41842 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 41872 - Posted: 5 Jun 2007, 20:02:39 UTC

This 1gidA WU was missing a .dat file and crashed.
ID: 41872 · Rating: 0
Fullhouse07
Joined: 10 Sep 06
Posts: 11
Credit: 14,703,260
RAC: 0
Message 41877 - Posted: 5 Jun 2007, 22:51:00 UTC

to:

Dr. Who Fan,

Nice move, you just gave the WORLD someone's IP address, or in other words his phone number. I hope the person you violated has the ability to block hackers and can keep helping us as a group to continue crunching to reach a cure!

FirePage
ID: 41877 · Rating: -2
MattDavis
Joined: 22 Sep 05
Posts: 206
Credit: 1,377,748
RAC: 0
Message 41878 - Posted: 5 Jun 2007, 22:55:11 UTC - in response to Message 41877.  

Umm, all his sig does is show each reader his own IP and ISP. I see mine, you see yours, etc. etc.
ID: 41878 · Rating: -3
Fullhouse07
Joined: 10 Sep 06
Posts: 11
Credit: 14,703,260
RAC: 0
Message 41880 - Posted: 5 Jun 2007, 23:18:12 UTC - in response to Message 41878.  

Matt, you are right, but I still hope this cruncher is not hacked.
ID: 41880 · Rating: -2
AMD_is_logical
Joined: 20 Dec 05
Posts: 299
Credit: 31,460,681
RAC: 0
Message 41884 - Posted: 6 Jun 2007, 0:21:33 UTC

I have a number of WUs that errored out after a few seconds.

In all cases, the WU was an old one that was re-issued requesting 5.68. It seems that these old WUs weren't tested with the new Rosetta version.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=73901800
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=73894861
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=73971201
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=73987715
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=74072673
ID: 41884 · Rating: 0
Odysseus
Joined: 3 May 07
Posts: 14
Credit: 241,831
RAC: 0
Message 41902 - Posted: 6 Jun 2007, 15:58:41 UTC

Another crash from v5.68 on my G4 Mac, about twenty minutes into 2tif__LARS_ABRELAX_SAVE_ALL_OUT-2tif_-_BARCODE__1775_7010, with exit code 6 (0x6); aside from the crash-dump the output says
SIGBUS: bus error
and finishes with
Exiting...
pure virtual method called
terminate called without an active exception
Caught SIGABRT in graphics thread

ID: 41902 · Rating: 0

©2024 University of Washington
https://www.bakerlab.org