11_18_14... WUs progressing very slowly and failing

Message boards : Number crunching : 11_18_14... WUs progressing very slowly and failing

Trotador

Joined: 30 May 09
Posts: 108
Credit: 291,214,977
RAC: 0
Message 77666 - Posted: 19 Nov 2014, 20:30:25 UTC

Hi

I have many of these tasks across several machines in a semi-stuck state at below 2% progress after many hours. Checking my log, they seem to eventually end in a compute error.

Has anyone completed any of them?
BarryAZ

Joined: 27 Dec 05
Posts: 153
Credit: 30,843,285
RAC: 0
Message 77672 - Posted: 20 Nov 2014, 17:25:42 UTC - in response to Message 77666.  

I've noticed a few of these as well, on multiple computers. I was looking in the forum here for other reports.

So you do have company here.



Bobby Langan

Joined: 17 Nov 13
Posts: 2
Credit: 674,127
RAC: 0
Message 77673 - Posted: 20 Nov 2014, 20:25:42 UTC - in response to Message 77672.  

Hello! Are you still seeing these failures, and if so, do you have ID numbers for these WUs? I noticed an error in my submissions from 11/18 yesterday morning (11/19/14) and removed those batches immediately. Your cases must have started before my deletion, and I apologize for that. It has been fixed as of yesterday.

Thank you for alerting us!

BarryAZ

Joined: 27 Dec 05
Posts: 153
Credit: 30,843,285
RAC: 0
Message 77674 - Posted: 20 Nov 2014, 21:55:35 UTC

Bobby, thanks for the reply -- I tend to delete them as I encounter them.

Just got another one a few minutes ago. Here's a list (incomplete):

701367564  634988589  19 Nov 2014 21:34:48 UTC  20 Nov 2014 21:44:45 UTC  Over  Client error  Compute error  398.10  3.20
701433090  635047947  20 Nov 2014  8:18:17 UTC  20 Nov 2014  8:22:25 UTC  Over  Client error  Compute error    0.00  0.00
701445816  635059453  20 Nov 2014 10:10:29 UTC  20 Nov 2014 10:20:49 UTC  Over  Client error  Compute error    0.00  0.00
701313812  634941210  19 Nov 2014 14:07:47 UTC  20 Nov 2014 17:23:27 UTC  Over  Client error  Compute error  549.17  3.03
701287990  634867334  19 Nov 2014 10:11:48 UTC  20 Nov 2014 17:34:40 UTC  Over  Client error  Compute error  701.05  5.64
701273671  634836099  19 Nov 2014  8:07:41 UTC  20 Nov 2014  8:18:17 UTC  Over  Client error  Compute error    0.00  0.00  ---
701272728  634905359  19 Nov 2014  8:03:03 UTC  19 Nov 2014  8:14:34 UTC  Over  Client error  Compute error    0.00  0.00
701226235  634862784  19 Nov 2014  0:57:24 UTC  20 Nov 2014  4:25:15 UTC  Over  Client error  Compute error  321.42  1.81

Barry
Bobby Langan

Joined: 17 Nov 13
Posts: 2
Credit: 674,127
RAC: 0
Message 77675 - Posted: 20 Nov 2014, 22:54:16 UTC - in response to Message 77674.  

Okay, I worked with one of the admins and the issue should be resolved. The fix required deleting the jobs from the database (done now) rather than deleting just the queued jobs (what I did yesterday). I am a new graduate student in the program, so thank you for your patience and for your help with my work in the lab!

And again, if you have further issues with this, I will forward them to my advisors and get an answer ASAP.
Trotador

Joined: 30 May 09
Posts: 108
Credit: 291,214,977
RAC: 0
Message 77677 - Posted: 21 Nov 2014, 5:20:27 UTC

Yes, those units disappeared, but I aborted a good number of them. No problem, this is research.

Now, the ones that seem not to be progressing properly and are giving compute errors are those starting with 141112.6.2layer_..




©2024 University of Washington
https://www.bakerlab.org