Problems with version 5.90/5.91

Message boards : Number crunching : Problems with version 5.90/5.91

Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 49980 - Posted: 23 Dec 2007, 17:34:06 UTC - in response to Message 49978.  

This one just errored out: 128154725

This is on a Windows XP box. Rosetta asked ZoneAlarm for access to the net. I gave permission and it killed itself.

Tim


Tim, great job on the Rosetta game scores!

Just so you understand, the task had ended abnormally already. This is the reason Rosetta wanted direct access to the internet and ZoneAlarm caught it. The program is trying to report the symbol tables at the time of the failure.

So, it was the failure that caused the internet access; and not the other way around.
Rosetta Moderator: Mod.Sense
ID: 49980 · Rating: 0
Thomas Leibold

Joined: 30 Jul 06
Posts: 55
Credit: 19,627,164
RAC: 0
Message 49984 - Posted: 23 Dec 2007, 19:19:23 UTC - in response to Message 49972.  

There seems to be a problem with the 1zpy__BOINC_TWIST_RINGS_SYMM_FOLD_AND_DOCK-1zpy_-native__2470 jobs. So far, the watchdog has killed 6 out of 7 jobs.


I'm getting watchdog errors for 1zpy_... workunits using 5.91 on Linux as well. As can be seen from the stderr.txt below, the old issues with the Rosetta watchdog segfaulting on Linux are still present in 5.91. This workunit was processed on a dual AMD Quad-Core Opteron 2346HE system running OpenSuSE 10.3 in 64-bit mode. The BOINC client is 5.10.21 (also the 64-bit version).

<core_client_version>5.10.21</core_client_version>
<![CDATA[
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 28800
# random seed: 3497252
**********************************************************************
Rosetta score is stuck or going too long. Watchdog is ending the run!
Stuck at score -10.2275 for 900 seconds
**********************************************************************
GZIP SILENT FILE: ./xx1zpy.out
SIGSEGV: segmentation violation
Stack trace (19 frames):
[0x8d9f877]
[0x8d9a66c]
[0xffffe500]
[0x8a8a0eb]
[0x8d089ac]
[0x8c0f2fa]
[0x8c1166f]
[0x804c7c2]
[0x8a824f1]
[0x8a83ebb]
[0x8935b66]
[0x89378a1]
[0x893b1af]
[0x898a502]
[0x85e96ae]
[0x87289aa]
[0x8728aca]
[0x8e03bc4]
[0x8048111]

Exiting...

</stderr_txt>
]]>
Team Helix
ID: 49984 · Rating: 0
sslickerson
Joined: 14 Oct 05
Posts: 101
Credit: 578,497
RAC: 0
Message 49985 - Posted: 23 Dec 2007, 19:55:20 UTC - in response to Message 49980.  
Last modified: 23 Dec 2007, 19:55:39 UTC

This one just errored out: 128154725

This is on a Windows XP box. Rosetta asked ZoneAlarm for access to the net. I gave permission and it killed itself.

Tim


Tim, great job on the Rosetta game scores!

Just so you understand, the task had ended abnormally already. This is the reason Rosetta wanted direct access to the internet and ZoneAlarm caught it. The program is trying to report the symbol tables at the time of the failure.

So, it was the failure that caused the internet access; and not the other way around.


Got it. Thanks! I'll be back in the game soon, I'm totally addicted!



ID: 49985 · Rating: 0
BitSpit
Joined: 5 Nov 05
Posts: 33
Credit: 4,147,344
RAC: 0
Message 49988 - Posted: 23 Dec 2007, 20:18:42 UTC

Okay, 1zpy__BOINC_TWIST_RINGS_SYMM_FOLD_AND_DOCK-1zpy_-native__2470 problems are NOT limited to Linux. I just had one on a Windows machine get killed by the watchdog:

https://boinc.bakerlab.org/rosetta/result.php?resultid=128341472
ID: 49988 · Rating: 0
Leonzio
Joined: 19 Nov 07
Posts: 8
Credit: 2,731
RAC: 0
Message 49989 - Posted: 23 Dec 2007, 22:09:41 UTC

Linux client (5.10.8).
https://boinc.bakerlab.org/rosetta/result.php?resultid=128149800:
Task ID 128149800
Name 1qx8__BOINC_SYMM_FOLD_AND_DOCK_RELAX-1qx8_-crystal_foldanddock__2468_1693_0
Workunit 116516582
Created 21 Dec 2007 1:19:05 UTC
Sent 21 Dec 2007 1:21:43 UTC
Received 22 Dec 2007 17:05:34 UTC
Server state Over
Outcome Client error
Client state Aborted by user
Exit status -197 (0xffffff3b)
Computer ID 676702
Report deadline 31 Dec 2007 1:21:43 UTC
CPU time 5381.028
stderr out

<core_client_version>5.10.8</core_client_version>
<![CDATA[
<message>
aborted by user
</message>
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800
# random seed: 3679098
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800

</stderr_txt>
]]>

Validate state Invalid
Claimed credit 10.9518809022522
Granted credit 0
application version 5.90


https://boinc.bakerlab.org/rosetta/result.php?resultid=128149799:
Task ID 128149799
Name 1mz9__BOINC_SYMM_FOLD_AND_DOCK_RELAX-1mz9_-crystal_foldanddock__2468_1693_0
Workunit 116516581
Created 21 Dec 2007 1:19:00 UTC
Sent 21 Dec 2007 1:21:43 UTC
Received 22 Dec 2007 16:47:26 UTC
Server state Over
Outcome Client error
Client state Aborted by user
Exit status -197 (0xffffff3b)
Computer ID 676702
Report deadline 31 Dec 2007 1:21:43 UTC
CPU time 19274.27
stderr out

<core_client_version>5.10.8</core_client_version>
<![CDATA[
<message>
aborted by user
</message>
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800
# random seed: 3684098
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800
# random seed: 3684098
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800
# random seed: 3684098
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800
# random seed: 3684098
WARNING! Not sure non-ideal rotamers are compatible with symmetry yet...
WARNING! Not sure non-ideal rotamers are compatible with symmetry yet...
WARNING! Not sure non-ideal rotamers are compatible with symmetry yet...
SIGSEGV: segmentation violation
Stack trace (26 frames):
[0x8d9fccb]
[0x8d9a66c]
[0xffffe500]
[0x8811e4b]
[0x8508656]
[0x8c0f7cc]
[0x804c78c]
[0x86180eb]
[0x876c04e]
[0x876c3d6]
[0x8770393]
[0x8781747]
[0x87857f4]
[0x861cfa1]
[0x87869a9]
[0x804e30e]
[0x8cf5936]
[0x89a6761]
[0x8939c96]
[0x893b3f1]
[0x898a502]
[0x85e96ae]
[0x87289aa]
[0x8728aca]
[0x8e04704]
[0x8048111]

Exiting...
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800
# random seed: 3684098
WARNING! Not sure non-ideal rotamers are compatible with symmetry yet...
WARNING! Not sure non-ideal rotamers are compatible with symmetry yet...
WARNING! Not sure non-ideal rotamers are compatible with symmetry yet...

</stderr_txt>
]]>

Validate state Invalid
Claimed credit 39.2284726111539
Granted credit 0
application version 5.90

But https://boinc.bakerlab.org/rosetta/result.php?resultid=128517422:
Task ID 128517422
Name 1tif__BOINC_ABINITIO_VF-S25-9-S3-3--1tif_-vf__2450_9783_0
Workunit 116854959
Created 22 Dec 2007 17:30:01 UTC
Sent 22 Dec 2007 17:31:08 UTC
Received 23 Dec 2007 13:40:53 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 676702
Report deadline 1 Jan 2008 17:31:08 UTC
CPU time 10562.276101
stderr out

<core_client_version>5.10.8</core_client_version>
<![CDATA[
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800
# random seed: 1484898
======================================================
DONE :: 1 starting structures 10561.3 cpu seconds
This process generated 31 decoys from 31 attempts
======================================================


BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...

</stderr_txt>
]]>

Validate state Valid
Claimed credit 21.4971544312456
Granted credit 20.0868561778462
application version 5.91
ID: 49989 · Rating: 0
BitSpit
Joined: 5 Nov 05
Posts: 33
Credit: 4,147,344
RAC: 0
Message 49997 - Posted: 24 Dec 2007, 13:58:12 UTC

Two more jobs that got stuck and were killed by the watchdog:

1zpy__BOINC_TWIST_RINGS_SYMM_FOLD_AND_DOCK-1zpy_-native__2474_287
1zpy__BOINC_TWIST_RINGS_MORE_SLIDESYMM_FOLD_AND_DOCK-1zpy_-native__2476_200

Every job that I've had killed like that has been a 1zpy with TWIST_RINGS. The combination of the two is causing a problem.
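
The pattern described here (failures confined to 1zpy targets whose protocol name contains TWIST_RINGS) can be checked mechanically against a list of task names. The following is a minimal sketch, not a project tool: the helper name `is_suspect` is hypothetical, and it assumes the target ID is the text before the first double underscore, as in the names quoted in this thread.

```python
def is_suspect(task_name: str) -> bool:
    """Return True for 1zpy tasks whose protocol name includes TWIST_RINGS."""
    # Assumption: the target ID is everything before the first "__".
    target, _, rest = task_name.partition("__")
    return target == "1zpy" and "TWIST_RINGS" in rest


# Task names copied from posts in this thread.
tasks = [
    "1zpy__BOINC_TWIST_RINGS_SYMM_FOLD_AND_DOCK-1zpy_-native__2474_287",
    "1zpy__BOINC_TWIST_RINGS_MORE_SLIDESYMM_FOLD_AND_DOCK-1zpy_-native__2476_200",
    "1tif__BOINC_ABINITIO_VF-S25-9-S3-3--1tif_-vf__2450_9783_0",
]
suspects = [t for t in tasks if is_suspect(t)]
```

Filtering a host's failed results this way would show quickly whether any failure falls outside the suspected combination.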
ID: 49997 · Rating: 0
Joachim
Joined: 26 Nov 06
Posts: 5
Credit: 518,439
RAC: 1,130
Message 49999 - Posted: 24 Dec 2007, 14:06:49 UTC

I got an error message that says my computer does not have enough memory.

I can't find anything on the website about how much memory a computer must have. I think 758 MB should be enough.

2007-12-24 11:42:09 [rosetta@home] Sending scheduler request: To fetch work
2007-12-24 11:42:09 [rosetta@home] Requesting 5435 seconds of new work
2007-12-24 11:42:14 [rosetta@home] Scheduler RPC succeeded [server version 601]
2007-12-24 11:42:14 [rosetta@home] Message from server: No work sent
2007-12-24 11:42:14 [rosetta@home] Message from server: Your computer has 758 MB of memory, and 763 MB is needed
2007-12-24 11:42:14 [rosetta@home] Deferring communication for 1 hr 2 min 1 sec
2007-12-24 11:42:14 [rosetta@home] Reason: no work from project
2007-12-24 12:44:20 [rosetta@home] Sending scheduler request: To fetch work
2007-12-24 12:44:20 [rosetta@home] Requesting 5783 seconds of new work
2007-12-24 12:44:25 [rosetta@home] Scheduler RPC succeeded [server version 601]
2007-12-24 12:44:25 [rosetta@home] Message from server: No work sent
2007-12-24 12:44:25 [rosetta@home] Message from server: Your computer has 758 MB of memory, and 763 MB is needed
2007-12-24 12:44:25 [rosetta@home] Deferring communication for 1 min 0 sec
2007-12-24 12:44:25 [rosetta@home] Reason: no work from project
2007-12-24 12:45:28 [rosetta@home] Fetching scheduler list
2007-12-24 12:45:33 [rosetta@home] Master file download succeeded
2007-12-24 12:45:38 [rosetta@home] Sending scheduler request: To fetch work
2007-12-24 12:45:38 [rosetta@home] Requesting 5791 seconds of new work
2007-12-24 12:45:43 [rosetta@home] Scheduler RPC succeeded [server version 601]
2007-12-24 12:45:43 [rosetta@home] Message from server: Not sending work - last request too recent: 78 sec
2007-12-24 12:45:43 [rosetta@home] Deferring communication for 1 min 0 sec
2007-12-24 12:45:43 [rosetta@home] Reason: no work from project
2007-12-24 12:46:45 [rosetta@home] Sending scheduler request: To fetch work
2007-12-24 12:46:45 [rosetta@home] Requesting 5797 seconds of new work
2007-12-24 12:46:50 [rosetta@home] Scheduler RPC succeeded [server version 601]
2007-12-24 12:46:50 [rosetta@home] Message from server: Not sending work - last request too recent: 67 sec
2007-12-24 12:46:50 [rosetta@home] Deferring communication for 1 min 0 sec
2007-12-24 12:46:50 [rosetta@home] Reason: no work from project
2007-12-24 12:47:55 [rosetta@home] Sending scheduler request: To fetch work
2007-12-24 12:47:55 [rosetta@home] Requesting 5801 seconds of new work
2007-12-24 12:48:00 [rosetta@home] Scheduler RPC succeeded [server version 601]
2007-12-24 12:48:00 [rosetta@home] Message from server: Not sending work - last request too recent: 70 sec
2007-12-24 12:48:00 [rosetta@home] Deferring communication for 1 min 0 sec
2007-12-24 12:48:00 [rosetta@home] Reason: no work from project
2007-12-24 12:49:01 [rosetta@home] Sending scheduler request: To fetch work
2007-12-24 12:49:01 [rosetta@home] Requesting 5808 seconds of new work
2007-12-24 12:49:06 [rosetta@home] Scheduler RPC succeeded [server version 601]
2007-12-24 12:49:06 [rosetta@home] Message from server: Not sending work - last request too recent: 66 sec
2007-12-24 12:49:06 [rosetta@home] Deferring communication for 1 min 0 sec
2007-12-24 12:49:06 [rosetta@home] Reason: no work from project
2007-12-24 12:49:14 [rosetta@home] Starting 1acf__BOINC_ABINITIO_VF-S25-9-S3-3--1acf_-vf__2450_6718_0
2007-12-24 12:49:14 [rosetta@home] Starting task 1acf__BOINC_ABINITIO_VF-S25-9-S3-3--1acf_-vf__2450_6718_0 using rosetta_beta version 589

Joachim
Dinos are not dead. They are alive and well and living in data centers all around you. They speak in tongues and work strange magics with computers. Beware the dino!
ID: 49999 · Rating: 0
netwraith
Joined: 3 Sep 06
Posts: 80
Credit: 13,483,227
RAC: 0
Message 50002 - Posted: 24 Dec 2007, 15:42:53 UTC

--

You don't need to abort the 5.90 runs... Just shut down the client and restart. It will save the results.. Just do it after the normal time limit has expired and it won't restart the tasks...


Looking for a team ??? Join BoincSynergy!!


ID: 50002 · Rating: 0
netwraith
Joined: 3 Sep 06
Posts: 80
Credit: 13,483,227
RAC: 0
Message 50003 - Posted: 24 Dec 2007, 16:03:36 UTC
Last modified: 24 Dec 2007, 16:29:15 UTC

--

*CAUTION* Since I originally posted this I had a few weird aborts. Maybe this should be ignored ... If I figure out exactly what happened I will post the issue *CAUTION*

Another thing... while your client is down....

If you are adept with a text editor, you can edit the client_state.xml and change the 5.90 version references to 5.91.

Just search for '590' It will find lines like:
<version_num>590</version_num>

for these change the 590 to 591.

Then search for 5.90 It will find lines like:
<file_name>rosetta_beta_5.90_i686-pc-linux-gnu</file_name>

change the .90 to .91

One *CAVEAT* ::: the 5.90 search will also find lines like:
<url>https://boinc.bakerlab.org/rosetta/download/rosetta_beta_5.90_i686-pc-linux-gnu</url>

*Just leave these alone*

Restart the client and 5.91 will be substituted for the 5.90... I don't know what would happen should you do this in the middle of a run, but, since the alternative could be aborting the process.....

I am not sure I would do it.
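
The manual edit described in this post could be scripted roughly as follows. This is a sketch only, under the assumptions that the BOINC client is stopped, that client_state.xml has been backed up first, and that the file contains lines shaped exactly like the examples quoted in the post. The function name `bump_rosetta_version` is hypothetical. It works on plain text and leaves `<url>` lines untouched, as the caveat requires.

```python
def bump_rosetta_version(state_text: str) -> str:
    """Rewrite 5.90 references to 5.91 in client_state.xml text.

    Lines containing <url> are copied through unchanged.
    """
    out = []
    for line in state_text.splitlines(keepends=True):
        if "<url>" in line:
            # The download URL must keep pointing at the 5.90 file name.
            out.append(line)
            continue
        # Caution: this would also match a 590 version_num belonging to
        # another project's application, if one happened to exist.
        line = line.replace(
            "<version_num>590</version_num>",
            "<version_num>591</version_num>",
        )
        if "<file_name>" in line:
            line = line.replace("rosetta_beta_5.90_", "rosetta_beta_5.91_")
        out.append(line)
    return "".join(out)


sample = (
    "<version_num>590</version_num>\n"
    "<file_name>rosetta_beta_5.90_i686-pc-linux-gnu</file_name>\n"
    "<url>https://boinc.bakerlab.org/rosetta/download/"
    "rosetta_beta_5.90_i686-pc-linux-gnu</url>\n"
)
result = bump_rosetta_version(sample)
```

Working on a copy of the file and comparing it with the original before restarting the client would make a mistyped edit easier to spot.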

What I would do if I were at Rosetta is re-run all of the Linux 5.90 processes just to be sure that the results are valid. I know about CRC's and checksums, but, I would rather not find out like Intel found out about the bug in the IEEE math processors of yore.... IIIIEEEEEEE!!!!

Of course your mileage may vary...
Looking for a team ??? Join BoincSynergy!!


ID: 50003 · Rating: 0
David Emigh
Joined: 13 Mar 06
Posts: 158
Credit: 417,178
RAC: 0
Message 50005 - Posted: 24 Dec 2007, 16:40:32 UTC
Last modified: 24 Dec 2007, 16:49:38 UTC

Here are three examples of tasks that were killed by the watchdog.

128851634
128634832
128360681

All three were of the "1zpy__BOINC_TWIST_RINGS_SYMM_FOLD_AND_DOCK-1zpy_..." species.

The tasks ran between 1/3 and 1/9 of my usual runtime preference, but apparently completed at least some models/decoys during the time they did run, as they were granted some credit.

This seems to me a case of the watchdog doing its job perfectly, as the jobs were aborted without spoiling useful results already generated.

I include them in the problems thread because they didn't have a "normal" run, and because these tasks seem to be under special scrutiny right now.
Rosie, Rosie, she's our gal,
If she can't do it, no one shall!
ID: 50005 · Rating: 0
BitSpit
Joined: 5 Nov 05
Posts: 33
Credit: 4,147,344
RAC: 0
Message 50006 - Posted: 24 Dec 2007, 18:26:10 UTC
Last modified: 24 Dec 2007, 18:28:33 UTC

Major, major flaws in any 1zpy job with TWIST_RINGS, and only those jobs. Any other 1zpy job runs properly. And before anyone blames my computers, I went through Thomas Leibold's computers and his results are showing the same problems.

1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK-1zpy_-native__2474_4256_0
1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_15109_0
1zpy__BOINC_TWIST_RINGS_SYMM_FOLD_AND_DOCK-1zpy_-native__2474_3990_0
1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_25051_0
1zpy__BOINC_TWIST_RINGS_MORE_SLIDESYMM_FOLD_AND_DOCK-1zpy_-native__2476_4607_0
1zpy__BOINC_TWIST_RINGS_SYMM_FOLD_AND_DOCK-1zpy_-native__2474_4333_0
1zpy__BOINC_TWIST_RINGS_MORE_SLIDESYMM_FOLD_AND_DOCK-1zpy_-native__2476_4804_0

And in particular, 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_4758_0

After it crashed, it was still listed as a running task, still accumulating CPU time but not actually running. It was even listed in the job list as a computation error. Message log:

9:02:00 AM	Starting 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_4758_0
9:02:00 AM	Starting task 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_4758_0 using rosetta_beta version 591
9:02:02 AM	[file_xfer] Started upload of file 1q9a__BOINC_NO_SRL_TORSION_RNA_ABMIN-1q9a_-_2473_4714_0_0
9:02:38 AM	[file_xfer] Finished upload of file 1q9a__BOINC_NO_SRL_TORSION_RNA_ABMIN-1q9a_-_2473_4714_0_0
9:02:38 AM	[file_xfer] Throughput 29612 bytes/sec
9:09:19 AM	Sending scheduler request: To report completed tasks
9:09:19 AM	Reporting 1 tasks
9:09:24 AM	Scheduler RPC succeeded [server version 601]
9:09:24 AM	Deferring communication for 4 min 2 sec
9:09:24 AM	Reason: requested by project
9:47:51 AM	Starting BOINC client version 5.8.11 for i686-pc-linux-gnu
9:47:51 AM	log flags: task, file_xfer, sched_ops
9:47:51 AM	Libraries: libcurl/7.16.0 OpenSSL/0.9.8d zlib/1.2.3
9:47:51 AM	Data directory: /home/armada/nodes/armada5/bin/boinc
9:47:51 AM	Processor: 2 GenuineIntel Intel(R) Core(TM)2 CPU          4400  @ 2.00GHz
9:47:51 AM	Memory: 1011.04 MB physical, 0 bytes virtual
9:47:51 AM	Disk: 70.87 GB total, 64.30 GB free
9:47:51 AM	URL: https://boinc.bakerlab.org/rosetta/; Computer ID: 404492; location: (none); project prefs: default
9:47:51 AM	General prefs: from rosetta@home (last modified 2007-07-21 12:29:21)
9:47:51 AM	Host location: none
9:47:51 AM	General prefs: using your defaults
9:47:51 AM	Restarting task 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK-1zpy_-native__2474_3756_0 using rosetta_beta version 591
9:47:51 AM	Restarting task 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_4758_0 using rosetta_beta version 591
11:07:57 AM	Aborting task 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_4758_0: exceeded disk limit: 129.67MB > 95.37MB
11:07:57 AM	Deferring communication for 1 min 0 sec
11:07:57 AM	Reason: Unrecoverable error for result 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_4758_0 (Maximum disk usage exceeded)
11:08:02 AM	Computation for task 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_4758_0 finished
11:08:02 AM	Output file 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_4758_0_0 for task 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_4758_0 absent
11:08:03 AM	[error] Process 1936 not found
11:08:26 AM	Starting 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_4758_0
11:08:26 AM	Starting task 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_4758_0 using rosetta_beta version 591
11:43:11 AM	Starting BOINC client version 5.8.11 for i686-pc-linux-gnu
11:43:11 AM	log flags: task, file_xfer, sched_ops
11:43:11 AM	Libraries: libcurl/7.16.0 OpenSSL/0.9.8d zlib/1.2.3
11:43:11 AM	Data directory: /home/armada/nodes/armada5/bin/boinc
11:43:11 AM	[error] State file error: result 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_4758_0 is in wrong state
11:43:11 AM	Processor: 2 GenuineIntel Intel(R) Core(TM)2 CPU          4400  @ 2.00GHz
11:43:11 AM	Memory: 1011.04 MB physical, 0 bytes virtual
11:43:11 AM	Disk: 70.87 GB total, 64.30 GB free
11:43:11 AM	URL: https://boinc.bakerlab.org/rosetta/; Computer ID: 404492; location: (none); project prefs: default
11:43:11 AM	General prefs: from rosetta@home (last modified 2007-07-21 12:29:21)
11:43:11 AM	Host location: none
11:43:11 AM	General prefs: using your defaults
11:43:11 AM	Restarting task 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK-1zpy_-native__2474_3756_0 using rosetta_beta version 591
11:43:12 AM	Starting 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_16554_0
11:43:12 AM	Starting task 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_16554_0 using rosetta_beta version 591
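
The two figures in the "exceeded disk limit" message are easier to interpret once converted. The sketch below assumes the client reports sizes in units of 2^20 bytes and that the limit was configured as a round 100,000,000 bytes. Both points are inferences from the 95.37MB figure, not something the log states.

```python
MIB = 2 ** 20  # bytes per binary megabyte (assumed reporting unit)


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to binary megabytes."""
    return n_bytes / MIB


# A round 100,000,000-byte limit matches the figure in the log.
limit_mib = round(to_mib(100_000_000), 2)  # 95.37

# How far the task's disk usage overshot the limit, from the log's figures.
overshoot_mib = round(129.67 - 95.37, 2)
```

On that reading, the task had grown to roughly a third more than its allowance before the client aborted it.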
ID: 50006 · Rating: 0
netwraith
Joined: 3 Sep 06
Posts: 80
Credit: 13,483,227
RAC: 0
Message 50007 - Posted: 24 Dec 2007, 18:34:37 UTC - in response to Message 50003.  
Last modified: 24 Dec 2007, 18:35:06 UTC

Don't worry about the *CAUTION* I fat fingered an edit... Mea Culpa...

--
*CAUTION* Since I originally posted this I had a few weird aborts. Maybe this should be ignored ... If I figure out exactly what happened I will post the issue *CAUTION*

Another thing... while your client is down....

If you are adept with a text editor, you can edit the client_state.xml and change the 5.90 version references to 5.91.

Just search for '590' It will find lines like:
<version_num>590</version_num>

for these change the 590 to 591.

Then search for 5.90 It will find lines like:
<file_name>rosetta_beta_5.90_i686-pc-linux-gnu</file_name>

change the .90 to .91

One *CAVEAT* ::: the 5.90 search will also find lines like:
<url>https://boinc.bakerlab.org/rosetta/download/rosetta_beta_5.90_i686-pc-linux-gnu</url>

*Just leave these alone*

Restart the client and 5.91 will be substituted for the 5.90... I don't know what would happen should you do this in the middle of a run, but, since the alternative could be aborting the process.....

I am not sure I would do it.

What I would do if I were at Rosetta is re-run all of the Linux 5.90 processes just to be sure that the results are valid. I know about CRC's and checksums, but, I would rather not find out like Intel found out about the bug in the IEEE math processors of yore.... IIIIEEEEEEE!!!!

Of course your mileage may vary...

Looking for a team ??? Join BoincSynergy!!


ID: 50007 · Rating: 0
stewjack

Joined: 23 Apr 06
Posts: 39
Credit: 95,871
RAC: 0
Message 50009 - Posted: 24 Dec 2007, 19:12:03 UTC

Joachim, in message 49999, mentioned a problem similar to mine.


I have a single core CPU, and I have just started getting those limited memory errors when attempting to request new work from Rosetta.

---------

12/24/2007 12:07:28 PM|rosetta@home|Message from server: No work sent
12/24/2007 12:07:28 PM|rosetta@home|Message from server: Your preferences limit memory usage to 691 MB, and 763 MB is needed

--------

For whatever it is worth my total system memory is 768 MB

Jack



ID: 50009 · Rating: 0
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 50010 - Posted: 24 Dec 2007, 20:05:54 UTC - in response to Message 50009.  
Last modified: 24 Dec 2007, 20:21:40 UTC

I have a single core CPU, and I have just started getting those limited memory errors when attempting to request new work from Rosetta.

---------

12/24/2007 12:07:28 PM|rosetta@home|Message from server: No work sent
12/24/2007 12:07:28 PM|rosetta@home|Message from server: Your preferences limit memory usage to 691 MB, and 763 MB is needed

--------

For whatever it is worth my total system memory is 768 MB

Jack



Jack, you have 768 MB of "installed" memory. This system requirement refers to "usable" memory (memory not reserved or taken by other things like the operating system). My best guess is that the new WUs (1zpy BOINC Twist Rings Twist Angle Symm Fold and Dock Relax 1zpy 2477 ?) require more memory than you, I, or most of the "normal" crunchers out there have (by most, I mean I think the vast majority of hosts connected to Rosetta have 512 MB or less, although that percentage will lessen over time), and they don't have any other work queued up for us to participate in. So in effect they're saying "we don't need you at the moment, please come back when we do". One way to stay productive would be to have a cache as large as humanly (computerly???) possible to carry us through these times. Everyone affected could keep a close eye on the fora and, the first time this is mentioned, race to increase your "cpu run time pref" so you can also keep working through the period in which they have none to offer.

At least that's my guess.
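
The two server messages quoted in this thread are consistent with a simple two-stage check, sketched below. This is an illustration, not the scheduler's code: the function name and the order of the checks are assumptions, and the 90% preference is inferred from Jack's numbers (691 MB out of 768 MB), not stated anywhere in the thread.

```python
from typing import Optional


def memory_verdict(physical_mb: float, needed_mb: int,
                   pref_fraction: float = 1.0) -> Optional[str]:
    """Return the refusal message a host would see, or None if it qualifies.

    pref_fraction is the share of physical memory the user's preferences
    allow BOINC to use (hypothetical parameter name).
    """
    if physical_mb < needed_mb:
        return (f"Your computer has {int(physical_mb)} MB of memory, "
                f"and {needed_mb} MB is needed")
    allowed_mb = int(physical_mb * pref_fraction)
    if allowed_mb < needed_mb:
        return (f"Your preferences limit memory usage to {allowed_mb} MB, "
                f"and {needed_mb} MB is needed")
    return None


joachim = memory_verdict(758, 763)            # too little physical memory
jack = memory_verdict(768, 763, 0.9)          # enough memory, capped by preference
```

Under these assumptions Jack's host would qualify if the memory preference were raised far enough above 90%, while Joachim's could not qualify at any setting.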
ID: 50010 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 50011 - Posted: 24 Dec 2007, 20:25:24 UTC

I've got 512 MB and have a few twist rings in the queue.
But now it is saying this:
rosetta@home|Message from server: No work sent
rosetta@home|Message from server: Your computer has 511 MB of memory, and 763 MB is needed

Luckily I have 7.5 days of work in the queue, so I can wait a bit.
ID: 50011 · Rating: 0
stewjack

Joined: 23 Apr 06
Posts: 39
Credit: 95,871
RAC: 0
Message 50012 - Posted: 24 Dec 2007, 21:04:35 UTC - in response to Message 50010.  


My best guess is that the new wus ... are requiring more memory than you/I/MOST (by most, I mean I think the vast majority of hosts connected to rosetta have 512 MB or less, although that percentage will lessen over time) of the "normal" crunchers out there, and they don't have any other work queued up for us to participate on. So in effect they're saying "we don't need you at the moment, please come back when we do". ....At least that's my guess.


Now that you mention it, I think that I remember this problem being discussed on this board.

I'm OK for 2 or 3 days at my normal Rosetta 50% duty cycle.

Thanks for the reply -
Jack
ID: 50012 · Rating: 0
BitSpit
Joined: 5 Nov 05
Posts: 33
Credit: 4,147,344
RAC: 0
Message 50013 - Posted: 24 Dec 2007, 21:12:52 UTC

The 1zpy__BOINC_TWIST_RINGS jobs don't even need 763MB. They only use around 122MB.

I hate to say it but I'm done with Rosetta. I've run out of patience. The whole 5.90 thing got me close but the currently flawed jobs have pushed me too far. All the 1zpy__BOINC_TWIST_RINGS are fatally flawed for Linux and these are the ONLY jobs being sent out. I should know. I burned through 70 of them with the abort command. I just can't take the declining quality of the project anymore.
ID: 50013 · Rating: 0
AMD_is_logical

Joined: 20 Dec 05
Posts: 299
Credit: 31,460,681
RAC: 0
Message 50014 - Posted: 24 Dec 2007, 21:18:13 UTC

These two were ended by the Watchdog because they went 900 seconds with no progress:
https://boinc.bakerlab.org/rosetta/result.php?resultid=128627315
https://boinc.bakerlab.org/rosetta/result.php?resultid=128614343

Note that both had segmentation violations after the Watchdog tried to shut them down.
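
The "stuck for 900 seconds" behaviour reported in these results can be pictured as a stall detector along the following lines. This is a hypothetical sketch, not Rosetta's actual watchdog: the class name, the exact-equality comparison of scores, and the polling interface are all assumptions. Only the 900 second threshold comes from the stderr output.

```python
class StallWatchdog:
    """Flag a run whose score has not changed for timeout_s seconds."""

    def __init__(self, timeout_s: float = 900.0):
        self.timeout_s = timeout_s
        self.last_score = None
        self.last_change = None

    def observe(self, score: float, now: float) -> bool:
        """Record a score sample taken at time `now` (in seconds).

        Returns True when the score has been unchanged for at least
        timeout_s seconds, meaning the run should be ended.
        """
        if self.last_score is None or score != self.last_score:
            # Progress was made (or this is the first sample): reset the clock.
            self.last_score = score
            self.last_change = now
            return False
        return (now - self.last_change) >= self.timeout_s


dog = StallWatchdog()
samples = [
    dog.observe(-10.2275, 0),
    dog.observe(-10.2275, 899),
    dog.observe(-10.2275, 900),
]
```

A detector of this kind only decides when to stop. The segmentation violations reported above happen afterwards, during shutdown, so they would sit outside this logic.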

I also have 512MB machines going idle because they can't get work.
ID: 50014 · Rating: 0
Vid Vidmar*
Joined: 17 Sep 05
Posts: 1
Credit: 75,130
RAC: 0
Message 50015 - Posted: 24 Dec 2007, 21:34:28 UTC

And here is mine, on the Windows platform.

These WUs really seem to be broken. As for version 5.90, it is doing its job on other WUs just fine.

Greetings,
ID: 50015 · Rating: 0
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 50016 - Posted: 24 Dec 2007, 23:16:24 UTC
Last modified: 25 Dec 2007, 0:01:09 UTC

There are many kinds of that similar WU. These new ones are "twist rings, twist angle, symm fold and dock relax .... 2477". The emboldened parts are the NEW parts. The older ones were 2470 as well, and those ran well on my lower-memory systems.

1zpy__BOINC_TWIST_RINGS_SYMM_FOLD_AND_DOCK-1zpy_-native__2474_?_0
1zpy__BOINC_TWIST_RINGS_MORE_SLIDESYMM_FOLD_AND_DOCK-1zpy_-native__2476_?_0
1zpy__BOINC_TWIST_ANGLE_SYMM_FOLD_AND_DOCK-1zpy_-native__2474_?_0
1zpy__BOINC_TWIST_ANGLE_MORE_SLIDESYMM_FOLD_AND_DOCK-1zpy_-native__2476_?_0
1zpy__BOINC_LESSCYCLES_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK-1zpy_-native__2471_?_0
1zpy__BOINC_LESSCYCLES_TWIST_RINGS_SYMM_FOLD_AND_DOCK-1zpy_-native__2471_?_0
1g2z__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1g2z_-crystal_foldanddock__2469_?_0

Note: the last one is different from the new one because of the "crystal_foldanddock" part and the 1g2z target.
ID: 50016 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org