minirosetta v1.15 bug thread

Message boards : Number crunching : minirosetta v1.15 bug thread

James Thompson
Joined: 13 Oct 05
Posts: 46
Credit: 186,109
RAC: 0
Message 52672 - Posted: 23 Apr 2008, 22:43:17 UTC

Workunits for minirosetta v1.15 are going to be sent out in slowly increasing batch sizes over the next two days. Please report application bugs in this thread.

Fernik
Joined: 21 Dec 06
Posts: 1
Credit: 17,454
RAC: 0
Message 52677 - Posted: 24 Apr 2008, 9:22:22 UTC - in response to Message 52672.  

My NOD32 antivirus flags minirosetta 1.15 as a Win32 trojan.

BarryAZ
Joined: 27 Dec 05
Posts: 153
Credit: 30,843,285
RAC: 0
Message 52697 - Posted: 25 Apr 2008, 6:32:43 UTC - in response to Message 52677.  

Right, this has been an ongoing issue with minirosetta. The Rosetta folks have not gotten any help on this from the ESET folks.


My NOD32 antivirus flags minirosetta 1.15 as a Win32 trojan.



BarryAZ
Joined: 27 Dec 05
Posts: 153
Credit: 30,843,285
RAC: 0
Message 52698 - Posted: 25 Apr 2008, 6:37:44 UTC - in response to Message 52677.  

https://boinc.bakerlab.org/rosetta/forum_thread.php?id=4017

Note -- I believe you need to be on version 3.0 of NOD32 for this to work.

My NOD32 antivirus flags minirosetta 1.15 as a Win32 trojan.



Evan
Joined: 23 Dec 05
Posts: 268
Credit: 402,585
RAC: 0
Message 52709 - Posted: 25 Apr 2008, 14:31:59 UTC

I had a validation error with this one:

144715592

It was stuck at about 2:53 hours with no graphics available.
I had to reboot the computer (for other reasons), after which it started again at 2:08 hours and ended there.

<core_client_version>5.10.20</core_client_version>
<![CDATA[
<stderr_txt>
# cpu_run_time_pref: 14400


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x005BD07D write attempt to address 0x4304D066

Engaging BOINC Windows Runtime Debugger...



********************


BOINC Windows Runtime Debugger Version 6.1.16


Dump Timestamp : 04/25/08 11:12:48
# cpu_run_time_pref: 14400
======================================================
DONE :: 1 starting structures 7702.44 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================

BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
called boinc_finish


Quidgydog
Joined: 28 Sep 06
Posts: 3
Credit: 499,462
RAC: 0
Message 52736 - Posted: 26 Apr 2008, 13:24:28 UTC
Last modified: 26 Apr 2008, 13:29:09 UTC

minirosetta workunits (4 different ones attempted) are not running on one of my systems. The process starts, but CPU time does not start counting and there is no progress despite leaving it for a long period of time. No errors, no exceptions, it just doesn't run.

Running Core2Quad Q6600, Windows Server 2003 R2.

Workunits running fine on my other comps with XP and Vista.

David Emigh
Joined: 13 Mar 06
Posts: 158
Credit: 417,178
RAC: 0
Message 52737 - Posted: 26 Apr 2008, 14:25:56 UTC
Last modified: 26 Apr 2008, 14:28:18 UTC

Two "Compute Errors" to report, both with large and detailed debugger messages.

resultid=158425798
resultid=158582229

Two different computers had the errors linked above. Both computers had successfully run 1.15 tasks for RALPH@home.
Rosie, Rosie, she's our gal,
If she can't do it, no one shall!

Gavin Shaw
Joined: 1 Feb 07
Posts: 10
Credit: 506,456
RAC: 0
Message 52744 - Posted: 26 Apr 2008, 23:24:39 UTC

Okay. Ran my first Rosetta Mini unit overnight while I was asleep. Woke up and found it had a compute error.

From the message log in Boinc this was given as the reason:

27/04/2008 7:51:52 AM|rosetta@home|Output file 1c8cA_BOINC_ABINITIO_IGNORE_THE_REST-S25-9-S3-3--1c8cA-_3092_209_0_0 for task 1c8cA_BOINC_ABINITIO_IGNORE_THE_REST-S25-9-S3-3--1c8cA-_3092_209_0 absent

I have reported it back and received no credit (not that I expected otherwise). It is task 158393978.

In short the first lines say the following:

Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x005BD4A8 write attempt to address 0x00000008

Engaging BOINC Windows Runtime Debugger...

Memory addressing problem?

Hope this helps...
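
On the "Memory addressing problem?" question: the Reason line itself records the faulting instruction address, whether the access was a read or a write, and the target address. A write to 0x00000008 is the kind of address that often shows up when code dereferences a null pointer plus a small offset, though that is only a guess from the address alone. A minimal Python sketch for pulling those fields out of stderr follows; `parse_reason` and the 64 KiB `NEAR_NULL_LIMIT` cutoff are assumptions made for illustration, not part of BOINC or minirosetta.

```python
import re

# Matches the "Reason:" line as it appears in the stderr excerpts in this thread.
REASON_RE = re.compile(
    r"Reason: (?P<kind>.+?) \((?P<code>0x[0-9A-Fa-f]+)\) at address "
    r"(?P<pc>0x[0-9A-Fa-f]+) (?P<op>read|write) attempt to address "
    r"(?P<target>0x[0-9A-Fa-f]+)"
)

# Assumption: treat any target below 64 KiB as a likely "null plus offset" access.
NEAR_NULL_LIMIT = 0x10000


def parse_reason(line):
    """Return the fields of a 'Reason:' line as a dict, or None if it does not match."""
    match = REASON_RE.search(line)
    if match is None:
        return None
    target = int(match.group("target"), 16)
    return {
        "kind": match.group("kind"),
        "code": int(match.group("code"), 16),
        "pc": int(match.group("pc"), 16),
        "op": match.group("op"),
        "target": target,
        "near_null": target < NEAR_NULL_LIMIT,
    }


if __name__ == "__main__":
    sample = ("Reason: Access Violation (0xc0000005) at address 0x005BD4A8 "
              "write attempt to address 0x00000008")
    print(parse_reason(sample))
```

With the line from this task, the target address is flagged as near-null; the 0x4304D066 target in the earlier stderr dump in this thread is not, so the two crashes may not share a cause.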

Never surrender and never give up. In the darkest hour there is always hope.


P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 52745 - Posted: 27 Apr 2008, 0:12:12 UTC
Last modified: 27 Apr 2008, 1:00:07 UTC

I just had this one die in a screaming heap. The same BOINC messages were repeated many times (40+). Never had this before. Not good.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=145104931

https://boinc.bakerlab.org/rosetta/result.php?resultid=158863990


4/27/2008 9:54:05 AM|rosetta@home|Starting task 2acy__BOINC_ABINITIO_IGNORE_THE_REST-S25-10-S3-11--2acy_-_3105_2_0 using minirosetta version 115

4/27/2008 9:54:06 AM|rosetta@home|Task 2acy__BOINC_ABINITIO_IGNORE_THE_REST-S25-10-S3-11--2acy_-_3105_2_0 exited with a DLL initialization error.

4/27/2008 9:54:06 AM|rosetta@home|If this happens repeatedly you may need to reboot your computer.

EDIT// Is there any way to find out which DLLs are missing, and whether they
can be installed manually to fix this?//
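
To check how often the same message really repeats, the pipe-delimited lines from the BOINC message log can be tallied. A rough Python sketch, assuming every line has the form `timestamp|project|message` as in the excerpt above; `count_messages` is a hypothetical helper, not a BOINC tool, and it does not identify which DLL failed to initialize.

```python
from collections import Counter


def count_messages(log_lines):
    """Count how often each (project, message) pair occurs in BOINC log lines.

    Assumes 'timestamp|project|message'; lines without two '|' separators
    are skipped rather than guessed at.
    """
    counts = Counter()
    for line in log_lines:
        parts = line.strip().split("|", 2)
        if len(parts) != 3:
            continue
        _timestamp, project, message = parts
        counts[(project, message)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "4/27/2008 9:54:06 AM|rosetta@home|Task 2acy__BOINC_ABINITIO_IGNORE_THE_REST-S25-10-S3-11--2acy_-_3105_2_0 exited with a DLL initialization error.",
        "4/27/2008 9:54:06 AM|rosetta@home|If this happens repeatedly you may need to reboot your computer.",
    ]
    for (project, message), n in count_messages(sample).most_common():
        print(n, project, message)
```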

pete.

Thomas Leibold
Joined: 30 Jul 06
Posts: 55
Credit: 19,627,164
RAC: 0
Message 52753 - Posted: 27 Apr 2008, 6:41:48 UTC

What is wrong with the validator?
Workunits 144724221, 144734937, 144747594 all apparently completed normally (around the specified runtime and without any errors), but got marked invalid and received no credit.
Two of those workunits were completed successfully by other users (though with shorter runtimes).

Team Helix

Quidgydog
Joined: 28 Sep 06
Posts: 3
Credit: 499,462
RAC: 0
Message 52755 - Posted: 27 Apr 2008, 11:44:47 UTC

stderr output from previously mentioned problem . . . .


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C82A714 read attempt to address 0x00E9DD4D

Engaging BOINC Windows Runtime Debugger...



v149907
Joined: 11 Apr 08
Posts: 2
Credit: 10,729,679
RAC: 3,164
Message 52757 - Posted: 27 Apr 2008, 17:38:54 UTC - in response to Message 52736.  
Last modified: 27 Apr 2008, 17:48:36 UTC

minirosetta workunits (4 different ones attempted) are not running on one of my systems. The process starts, but CPU time does not start counting and there is no progress despite leaving it for a long period of time. No errors, no exceptions, it just doesn't run.

Running Core2Quad Q6600, Windows Server 2003 R2.

Workunits running fine on my other comps with XP and Vista.



Is there any way (other than detaching from the project) to keep mini-rosetta tasks off a system? Status says running, but there is no progress, no errors, no exceptions... I just see decreasing RAC while the CPUs aren't doing anything. Thanks

v149907
Joined: 11 Apr 08
Posts: 2
Credit: 10,729,679
RAC: 3,164
Message 52758 - Posted: 27 Apr 2008, 17:43:01 UTC - in response to Message 52757.  
Last modified: 27 Apr 2008, 17:49:26 UTC

minirosetta workunits (4 different ones attempted) are not running on one of my systems. The process starts, but CPU time does not start counting and there is no progress despite leaving it for a long period of time. No errors, no exceptions, it just doesn't run.

Running Core2Quad Q6600, Windows Server 2003 R2.

Workunits running fine on my other comps with XP and Vista.




I have a similar problem on one of my systems with exactly the same symptoms. I have had to abort every mini-rosetta task received...

AuthenticAMD
Dual-Core AMD Opteron(tm) Processor 2218 [x86 Family 15 Model 65 Stepping 2]
Number of CPUs 4
Operating System Microsoft Windows XP
Professional Edition, Service Pack 2, (05.01.2600.00)
Memory 3055.35 MB

Task ID   | Work unit ID | Sent                     | Time reported or deadline | Server state | Outcome      | Client state    | CPU time (sec) | Claimed credit | Granted credit
158848050 | 145089669    | 27 Apr 2008 1:21:16 UTC  | 27 Apr 2008 12:12:56 UTC  | Over         | Client error | Aborted by user | 0.00           | 0.00           | ---
158836624 | 145078554    | 26 Apr 2008 19:33:58 UTC | 27 Apr 2008 12:12:56 UTC  | Over         | Client error | Aborted by user | 0.00           | 0.00           | ---
158836584 | 145078487    | 26 Apr 2008 19:33:58 UTC | 27 Apr 2008 12:12:56 UTC  | Over         | Client error | Aborted by user | 0.00           | 0.00           | ---
158836531 | 145078395    | 26 Apr 2008 19:33:58 UTC | 27 Apr 2008 0:06:36 UTC   | Over         | Client error | Aborted by user | 0.00           | 0.00           | ---
158602197 | 144860512    | 25 Apr 2008 21:30:39 UTC | 26 Apr 2008 18:11:58 UTC  | Over         | Client error | Aborted by user | 0.00           | 0.00           | ---
158602195 | 144860508    | 25 Apr 2008 21:30:39 UTC | 26 Apr 2008 18:11:58 UTC  | Over         | Client error | Aborted by user | 0.00           | 0.00           | ---
158602193 | 144860504    | 25 Apr 2008 21:30:39 UTC | 27 Apr 2008 0:06:36 UTC   | Over         | Client error | Aborted by user | 0.00           | 0.00           | ---
158602191 | 144860500    | 25 Apr 2008 21:30:39 UTC | 27 Apr 2008 0:06:36 UTC   | Over         | Client error | Aborted by user | 0.00           | 0.00           | ---
158602189 | 144860497    | 25 Apr 2008 21:30:39 UTC | 27 Apr 2008 0:06:36 UTC   | Over         | Client error | Aborted by user | 0.00           | 0.00           | ---
158602187 | 144860493    | 25 Apr 2008 21:30:39 UTC | 27 Apr 2008 0:06:36 UTC   | Over         | Client error | Aborted by user | 0.00           | 0.00           | ---

Gavin Shaw
Joined: 1 Feb 07
Posts: 10
Credit: 506,456
RAC: 0
Message 52763 - Posted: 28 Apr 2008, 0:21:31 UTC

Ran a second Rosetta Mini unit. It died after about 7 hours (target runtime is 8 hours).

Boinc output has the following:

28/04/2008 4:51:14 AM|rosetta@home|Computation for task 1cei__BOINC_ABINITIO_IGNORE_THE_REST-S25-13-S3-11--1cei_-_3105_1_0 finished
28/04/2008 4:51:14 AM|rosetta@home|Output file 1cei__BOINC_ABINITIO_IGNORE_THE_REST-S25-13-S3-11--1cei_-_3105_1_0_0 for task 1cei__BOINC_ABINITIO_IGNORE_THE_REST-S25-13-S3-11--1cei_-_3105_1_0 absent

Task 158649586

Looks like a similar error to my previous.

Never surrender and never give up. In the darkest hour there is always hope.


glaesum
Joined: 16 Oct 06
Posts: 21
Credit: 508,632
RAC: 0
Message 52774 - Posted: 28 Apr 2008, 16:38:00 UTC
Last modified: 28 Apr 2008, 16:57:38 UTC

I'm getting the same type of failure as Peter Leman (PM sent) using Win98; the tasks don't even start:

the "stderr out" result report reads like this -

<core_client_version>5.10.30</core_client_version>
<![CDATA[
<message>
too many normally harmless exit(s)
</message>
]]>

the two tasks failed so far are:
wu145397531
wu145454437

(mini v.1.07 worked ok whilst mini v.1.09 did not, see (v.1.09 message) for slightly different stderr out report)

if anyone succeeds with win98 on these tasks please report happiness!!

[BAT] MaDr
Joined: 30 Nov 05
Posts: 1
Credit: 1,383,050
RAC: 0
Message 52783 - Posted: 29 Apr 2008, 6:16:05 UTC - in response to Message 52736.  

minirosetta workunits (4 different ones attempted) are not running on one of my systems. The process starts, but CPU time does not start counting and there is no progress despite leaving it for a long period of time. No errors, no exceptions, it just doesn't run.


Havin' the same problem on a Dell 2650 running Windows 2003.

David Emigh
Joined: 13 Mar 06
Posts: 158
Credit: 417,178
RAC: 0
Message 52789 - Posted: 29 Apr 2008, 14:17:33 UTC

Two more errors to report, both include large and detailed debugger reports.

resultid=159151381
resultid=159146976

An observation:

In every case (these two, and the preceding two reported earlier in this thread) my "wingman" ran these tasks successfully. But in all four cases, my wingman's runtime preference was apparently much shorter than my own.

Coincidence? I don't know. The sample size is small, but the correlation is perfect.

I ran many successful mini 1.15 workunits on RALPH, where my runtime preference is set to the minimum. So far, I am 0/4 with mini 1.15 workunits on Rosie, where my runtime preference is set to the maximum.

I would be curious to know if this relationship has been observed by any other crunchers.
Rosie, Rosie, she's our gal,
If she can't do it, no one shall!

Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 52790 - Posted: 29 Apr 2008, 14:22:31 UTC

This 1tif task shows peak memory usage of 357M on WinXP. It's 7 hrs into an 8 hr runtime preference.

Is that one a "high memory" task? Or is that higher than expected for Mini?
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/

David Emigh
Joined: 13 Mar 06
Posts: 158
Credit: 417,178
RAC: 0
Message 52794 - Posted: 30 Apr 2008, 3:33:29 UTC

I am 0/5 now on mini 1.15 workunits with these two computers:

hostid=623950
hostid=663412

Both have successfully completed mini 1.15 workunits on RALPH.

I am waiting to see how my wingman does with failure number five before taking any further action.

My wingmen, on the four tasks that were reported successful after my failures, were two Linux boxes, one WinXP box, and a Darwin box. One thing they all had in common was run times of less than 11,000 seconds, so they are probably set for a 3 hour runtime preference.

Have any other users had problems with mini 1.15 that seem related to long runtime preferences?
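
For anyone comparing run times against runtime preferences, the seconds-to-hours arithmetic is easy to check. A small Python sketch; `format_hms` and `pref_seconds` are hypothetical helper names used only for this illustration.

```python
def format_hms(seconds):
    """Format a duration in seconds as h:mm:ss (fractions of a second dropped)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def pref_seconds(hours):
    """Convert a runtime preference in hours to seconds."""
    return hours * 3600


print(format_hms(11000))   # 3:03:20
print(pref_seconds(3))     # 10800
print(format_hms(14400))   # 4:00:00
```

So 11,000 seconds is just over a 3 hour (10,800 second) preference, and the `cpu_run_time_pref: 14400` shown in the stderr output earlier in this thread corresponds to a 4 hour preference.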
Rosie, Rosie, she's our gal,
If she can't do it, no one shall!

James Thompson
Joined: 13 Oct 05
Posts: 46
Credit: 186,109
RAC: 0
Message 52799 - Posted: 30 Apr 2008, 7:06:37 UTC - in response to Message 52794.  

Hi everyone,

I just wanted to post again and let you know that we're in the process of debugging minirosetta. Thank you all for your input; we're taking the errors from this application very seriously.

We're enlisting the help of Rom Walton, one of the BOINC developers, to help us debug some of the trickier problems with minirosetta v1.15, so expect a new release up on Ralph very soon. Rom is a very talented programmer and has helped us a great deal in the past with the rosetta_beta app. We hope to have a new version of minirosetta (v1.16) on Ralph by tomorrow that should address some of the problems people have been having.

We've also fixed the problem with validating the results from some of our minirosetta test jobs, so please let us know if that happens again. The validation problem was a result of trying some new protocols for the next CASP, which we'll describe in detail in the Science threads as we apply these methods to CASP8 targets.

This is all very exciting for us, and thank you for crunching. CASP8 starts on Monday, and I'm very much looking forward to it. Cheers,

James

©2024 University of Washington
https://www.bakerlab.org