Message boards : Number crunching : Lots of workunit failures...
Author | Message |
---|---|
Jack Shaftoe Joined: 30 Apr 06 Posts: 115 Credit: 1,307,916 RAC: 0 |
Been attached for all of about 24 hours and already 3 failed workunits. Frustrating. https://boinc.bakerlab.org/rosetta/result.php?resultid=164381455 <core_client_version>5.10.45</core_client_version> <![CDATA[ <message> The system cannot find the path specified. (0x3) - exit code 3 (0x3) </message> <stderr_txt> # cpu_run_time_pref: 86400 </stderr_txt> ]]> |
Jack Shaftoe Joined: 30 Apr 06 Posts: 115 Credit: 1,307,916 RAC: 0 |
https://boinc.bakerlab.org/rosetta/result.php?resultid=163455869 <core_client_version>5.10.45</core_client_version> <![CDATA[ <message> - exit code -1073741819 (0xc0000005) </message> <stderr_txt> # cpu_run_time_pref: 86400 Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x005C3030 write attempt to address 0x00000004 Engaging BOINC Windows Runtime Debugger... ******************** BOINC Windows Runtime Debugger Version 6.3.0 Dump Timestamp : 05/16/08 15:01:41 LoadLibraryA( dbghelp.dll ): GetLastError = 8 *** Dump of the Process Statistics: *** - I/O Operations Counters - Read: 28146, Write: 0, Other 6765 - I/O Transfers Counters - Read: 0, Write: 38294, Other 0 - Paged Pool Usage - QuotaPagedPoolUsage: 44156, QuotaPeakPagedPoolUsage: 44156 QuotaNonPagedPoolUsage: 5688, QuotaPeakNonPagedPoolUsage: 5688 - Virtual Memory Usage - VirtualSize: 2144301056, PeakVirtualSize: 2144301056 - Pagefile Usage - PagefileUsage: 1021972480, PeakPagefileUsage: 1029021696 - Working Set Size - WorkingSetSize: 1023954944, PeakWorkingSetSize: 1031032832, PageFaultCount: 23105986 *** Dump of thread ID 1876 (state: Waiting): *** - Information - Status: Wait Reason: UserRequest, , Kernel Time: 1477656192.000000, User Time: 294476873728.000000, Wait Time: 16713152.000000 - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x005C3030 write attempt to address 0x00000004 *** Dump of thread ID 3036 (state: Waiting): *** - Information - Status: Wait Reason: ExecutionDelay, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 16713148.000000 *** Dump of thread ID 900 (state: Waiting): *** - Information - Status: Wait Reason: ExecutionDelay, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 16713100.000000 *** Debug Message Dump **** *** Foreground Window Data *** Window Name : Window Class : Window Process ID: 0 Window Thread ID : 0 Exiting... </stderr_txt> ]]> |
Jack Shaftoe Joined: 30 Apr 06 Posts: 115 Credit: 1,307,916 RAC: 0 |
https://boinc.bakerlab.org/rosetta/result.php?resultid=163455882 <core_client_version>5.10.45</core_client_version> <![CDATA[ <message> The system cannot find the path specified. (0x3) - exit code 3 (0x3) </message> <stderr_txt> # cpu_run_time_pref: 86400 </stderr_txt> ]]> The frustrating thing about these path errors is that they open a C++ error window, and the workunit just keeps using CPU until you hit OK. One of them failed at around 4 am and didn't stop until I checked the machine 10 minutes ago. I hit OK, the workunit failed, and then another one started. Grrr.... If this continues, I'm going to have to go back to my other projects. |
Jack Shaftoe Joined: 30 Apr 06 Posts: 115 Credit: 1,307,916 RAC: 0 |
I think I found a solution. I dropped my runtime from 24 hours down to 4 hours and didn't get a single failure last night. If things remain stable after a few days, I will increase it to 6 hours. |
David Emigh Joined: 13 Mar 06 Posts: 158 Credit: 417,178 RAC: 0 |
I also have discovered the workaround of decreasing runtime. Rosie, Rosie, she's our gal, If she can't do it, no one shall! |
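
For anyone cross-checking the figures quoted in the logs in this thread, here is a minimal Python sketch. It shows that the signed exit code -1073741819 and the hex value 0xc0000005 are the same 32-bit number, and that a cpu_run_time_pref of 86400 seconds corresponds to the 24-hour runtime mentioned in the posts. The function names are hypothetical helpers written for this illustration; they are not part of BOINC or Rosetta.

```python
def exit_code_to_hex(code: int) -> str:
    """Render a signed 32-bit exit code as the unsigned hex form shown in the log."""
    # Masking with 0xFFFFFFFF reinterprets a negative value as unsigned 32-bit.
    return f"0x{code & 0xFFFFFFFF:08x}"


def runtime_pref_hours(seconds: int) -> float:
    """Convert a cpu_run_time_pref value (seconds) to hours."""
    return seconds / 3600


def hours_to_runtime_pref(hours: int) -> int:
    """Convert a target runtime in hours to seconds."""
    return hours * 3600


if __name__ == "__main__":
    # Exit code from the access-violation result in this thread.
    print(exit_code_to_hex(-1073741819))  # 0xc0000005
    # Runtime preference reported in stderr for all three failures.
    print(runtime_pref_hours(86400))      # 24.0
    # The reduced runtimes mentioned in the thread, in seconds.
    print(hours_to_runtime_pref(4))       # 14400
    print(hours_to_runtime_pref(6))       # 21600
```

This only decodes the numbers; it does not say anything about why the workunits failed.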
©2024 University of Washington
https://www.bakerlab.org