Message boards : Number crunching : All FFD_ units ending with Validate error
Author | Message |
---|---|
Trotador Joined: 30 May 09 Posts: 108 Credit: 291,214,977 RAC: 0 |
Hi. It seems that every FFD_ WU issued in the last two or three days is ending in a validation error on my computer. Is anyone else seeing this? |
wyxchari Joined: 27 Nov 14 Posts: 11 Credit: 85,318 RAC: 0 |
It is a bug. They are fixing it. Pause Rosetta for a few days and try again, and let your computer work on other projects in the meantime. That is what I have done. 756338212 685529164 5 Sep 2015 18:03:44 UTC 5 Sep 2015 18:42:32 UTC Over Client error Compute error 0.00 0.00 --- 756310232 685503587 5 Sep 2015 16:53:59 UTC 5 Sep 2015 16:58:04 UTC Over Client error Compute error 0.00 0.00 --- 756307390 685500806 5 Sep 2015 17:08:01 UTC 5 Sep 2015 17:12:11 UTC Over Client error Compute error 0.00 0.00 --- 756247891 685454889 5 Sep 2015 9:29:07 UTC 5 Sep 2015 16:53:59 UTC Over Client error Compute error 0.00 0.00 --- 756247277 685454435 5 Sep 2015 9:25:01 UTC 5 Sep 2015 9:29:07 UTC Over Client error Compute error 0.00 0.00 --- 756242469 685450844 5 Sep 2015 8:53:27 UTC 5 Sep 2015 9:25:01 UTC Over Client error Compute error 0.00 0.00 --- 756227466 685409102 5 Sep 2015 7:15:32 UTC 5 Sep 2015 8:49:20 UTC Over Client error Compute error 0.00 0.00 --- 756225269 685438074 5 Sep 2015 7:02:02 UTC 5 Sep 2015 7:03:12 UTC Over Client error Compute error 0.00 0.00 --- 756222663 685435942 5 Sep 2015 6:50:29 UTC 5 Sep 2015 7:02:02 UTC Over Client error Compute error 0.00 0.00 --- 756221134 685434760 5 Sep 2015 6:40:17 UTC 5 Sep 2015 6:46:20 UTC Over Client error Compute error 0.00 0.00 --- 756215685 685430302 5 Sep 2015 6:46:20 UTC 5 Sep 2015 6:50:29 UTC Over Client error Compute error 0.00 0.00 --- 756215055 685429746 5 Sep 2015 7:07:17 UTC 5 Sep 2015 7:11:24 UTC Over Client error Compute error 0.00 0.00 --- 756214923 685429614 5 Sep 2015 7:11:24 UTC 5 Sep 2015 7:15:32 UTC Over Client error Compute error 0.00 0.00 --- 756148171 685380906 4 Sep 2015 21:55:44 UTC 5 Sep 2015 6:40:17 UTC Over Client error Compute error 0.00 0.00 --- 756147522 685380434 4 Sep 2015 21:51:38 UTC 4 Sep 2015 21:55:44 UTC Over Client error Compute error 0.00 0.00 --- 756146726 685379842 4 Sep 2015 21:43:22 UTC 4 Sep 2015 21:47:31 UTC Over Client error Compute error 0.00 0.00 --- 756146379 685379499 4 Sep 2015 21:47:31 UTC 4 Sep 2015 21:51:38 UTC Over Client error Compute error 0.00 0.00 --- 756142269 685191678 4 Sep 2015 21:11:45 UTC 4 Sep 2015 21:43:22 UTC Over Client error Compute error 0.00 0.00 |
Trotador Joined: 30 May 09 Posts: 108 Credit: 291,214,977 RAC: 0 |
Yours could be another type of error, one that makes the WUs abort right at the beginning. FFD_ units are mostly failing at validation, which is worse, since they are crunched and sent back to the server only to fail there. I'm seeing that all wingmen are also getting Validate errors for these WUs. |
wyxchari Joined: 27 Nov 14 Posts: 11 Credit: 85,318 RAC: 0 |
Sep 3, 2015. The minirosetta application has been updated to 3.62. After upgrading to Rosetta 3.62, from the 4th on: Task ID: 755987746; Name: rb_09_01_58658_103423_ab_stage0_h001___robetta_IGNORE_THE_REST_11_18_302525_12_0; Workunit: 685246348; Created: 4 Sep 2015 6:18:57 UTC; Sent: 4 Sep 2015 10:13:57 UTC; Received: 4 Sep 2015 11:42:49 UTC; Server state: Over; Outcome: Client error; Client state: Compute error; Exit status: -1073741795 (0xffffffffc000001d); Computer ID: 2201387; Report deadline: 18 Sep 2015 10:13:57 UTC; CPU time: 0; stderr out: <core_client_version>7.4.42</core_client_version> <![CDATA[ <message> (unknown error) - exit code -1073741795 (0xc000001d) </message> ]]>; Validate state: Invalid; Claimed credit: 0; Granted credit: 0; application version: 3.62 |
Timo Joined: 9 Jan 12 Posts: 185 Credit: 45,649,459 RAC: 0 |
There are FFD__ jobs like FFD__xxxxx_insulinxxxx that are completing and validating properly, but none of the FFH__xxxxx_abinitoDocking jobs are. |
Sid Celery Joined: 11 Feb 08 Posts: 2125 Credit: 41,249,734 RAC: 9,368 |
"There are FFD__ jobs like FFD__xxxxx_insulinxxxx that are completing and validating properly, but none of the FFH__xxxxx_abinitoDocking jobs are." Well spotted. Unfortunately I don't have any of those in my own task queue. |
dkester788 Joined: 22 Oct 14 Posts: 2 Credit: 1,285,173 RAC: 0 |
I'm having the same issue with the FFH__xxxxx_abinitoDocking WUs. The jobs run to completion but fail validation when done. It sounds like they're working on the problem but I haven't seen any of the above WUs today. Good Luck! |
Trotador Joined: 30 May 09 Posts: 108 Credit: 291,214,977 RAC: 0 |
Still having the Validate error with these units |
Timo Joined: 9 Jan 12 Posts: 185 Credit: 45,649,459 RAC: 0 |
We've just had a long weekend in North America so I suspect nothing has changed to remedy this just yet - let's see where we stand by end of day tomorrow. In the meantime, I'm continuing to abort these FFH__xxxxx_abinitoDocking jobs. |
Chilean Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
|
krypton Volunteer moderator Project developer Project scientist Joined: 16 Nov 11 Posts: 108 Credit: 2,164,309 RAC: 0 |
It appears to be an issue with the latest rosetta build (which is being automatically updated on your end, once existing jobs have run). The format of the output changed for the protein-protein docking jobs and the validation script that we use on the server is expecting the old format. We have to either change the validation script or update rosetta output format. We are working on it now! Thanks for the feedback! |
krypton Volunteer moderator Project developer Project scientist Joined: 16 Nov 11 Posts: 108 Credit: 2,164,309 RAC: 0 |
We've killed all new FFD_* jobs until the error is fixed. Sorry for the trouble. We'll work on preventing this from happening in the future. |
Timo Joined: 9 Jan 12 Posts: 185 Credit: 45,649,459 RAC: 0 |
"We've killed all new FFD_* jobs until the error is fixed." Thanks for the update, Sergey! I spent the day today at work debugging some rather complex code that recently underwent some major refactoring and broke one of our use cases that isn't tested for very thoroughly - akin to finding a needle in a haystack - so I totally understand what you all are going through! Cheers man! |
Trotador Joined: 30 May 09 Posts: 108 Credit: 291,214,977 RAC: 0 |
"It appears to be an issue with the latest rosetta build (which is being automatically updated on your end, once existing jobs have run). The format of the output changed for the protein-protein docking jobs and the validation script that we use on the server is expecting the old format. We have to either change the validation script or update rosetta output format." Good to know you have found the root cause of the problem. I still have tens of units with Validate error and no credit from previous days; the correcting script is going to have a lot of work/fun :). Thanks! |
[VENETO] boboviz Joined: 1 Dec 05 Posts: 1994 Credit: 9,633,537 RAC: 7,232 |
"We've killed all new FFD_* jobs until the error is fixed." Strange. I continue to download FFD_* jobs. |
Trotador Joined: 30 May 09 Posts: 108 Credit: 291,214,977 RAC: 0 |
"We've killed all new FFD_* jobs until the error is fixed." +1. I've just checked, and they continue to give Validate errors. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
"We've killed all new FFD_* jobs until the error is fixed." Often they basically halt the generation of new jobs for such problems, and any existing ones have to churn through to get fully purged. Or there are retries of existing tasks that were previously generated, and they are unable to cancel those. But the number left should be small. Rosetta Moderator: Mod.Sense |
Trotador Joined: 30 May 09 Posts: 108 Credit: 291,214,977 RAC: 0 |
Is it planned to run the routine to give credits to the units with Validate error? thanks |
krypton Volunteer moderator Project developer Project scientist Joined: 16 Nov 11 Posts: 108 Credit: 2,164,309 RAC: 0 |
The invalid jobs should eventually get credit. The error(s) from yesterday were my fault. Sorry about that! We have a two-step queue system. I killed all the FFD_* jobs in queue 1, but there were still some in queue 2. I've now launched the command to kill the jobs in queue 2 as well. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
"Is it planned to run the routine to give credits to the units with Validate error?" If you check the task details, you will see the granted credit. For whatever reason, tasks given credit for errors do not show the granted credit on the task summary page. Rosetta Moderator: Mod.Sense |
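
For reference, the exit status in wyxchari's task details, -1073741795 (0xc000001d), is the signed decimal rendering of a Windows NTSTATUS value. Below is a minimal Python sketch of that conversion, assuming the exit code is a 32-bit value. The helper names and the small lookup table are illustrative only and are not part of BOINC or Rosetta.

```python
# Sketch: reinterpret a signed exit code as the unsigned hex value shown in the log.
# Helper names and the lookup table are hypothetical, for illustration only.

def to_unsigned(code: int, bits: int = 32) -> int:
    """Reinterpret a signed integer as unsigned by masking to the given width."""
    return code & ((1 << bits) - 1)

# Deliberately tiny table; the names come from Microsoft's published NTSTATUS values.
KNOWN_NTSTATUS = {
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}

def describe_exit_code(code: int) -> str:
    """Format a signed exit code as 32-bit hex plus a name, if the table knows it."""
    unsigned = to_unsigned(code)
    name = KNOWN_NTSTATUS.get(unsigned, "unknown")
    return f"0x{unsigned:08x} ({name})"

print(describe_exit_code(-1073741795))  # prints: 0xc000001d (STATUS_ILLEGAL_INSTRUCTION)
```

Masking to 64 bits instead (`to_unsigned(-1073741795, 64)`) gives the sign-extended form 0xffffffffc000001d that appears on the "Exit status" line. An illegal-instruction status suggests the application crashed on the client at startup, which would fit Trotador's remark that those failures are a different problem from the server-side Validate errors.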
©2024 University of Washington
https://www.bakerlab.org