Most of my Granted credit is lower than Claimed

Message boards : Number crunching : Most of my Granted credit is lower than Claimed

Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 47661 - Posted: 12 Oct 2007, 23:36:14 UTC

I've just reported those results to the Project Team to scrutinize. So I guess there's just enough transparency to track down the ummm... "funny" results reported by that host. And yes, I should think THAT would be a large factor on credit for the specific batches of tasks that had these huge claims reported.

You see, that machine is claiming these models are very easy to produce, so its results would skew the average. In comparison, the rest of us are having more trouble completing a model than the benchmarks would predict, and so we're getting less credit than we would if these excessive claims weren't reported.
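The actual credit code is not shown anywhere in this thread, so the following is only a rough sketch of the effect described above. It assumes that granted credit is the average claimed credit per model for a batch, multiplied by the number of models a task produced. The function name and all of the numbers are made up for illustration.

```python
def average_claim_per_model(reports):
    """Average claimed credit per model over a list of reports.

    reports: list of (claimed_credit, models_completed) tuples.
    This averaging rule is an assumption, not the project's real code.
    """
    per_model = [claimed / models for claimed, models in reports if models > 0]
    return sum(per_model) / len(per_model)


# Three ordinary hosts, each claiming 2 credits per model (made-up numbers).
normal = [(20.0, 10), (22.0, 11), (18.0, 9)]
# One host reporting an implausibly large number of models for a similar claim,
# i.e. claiming the models are very easy to produce.
outlier = (20.0, 400)

avg_without = average_claim_per_model(normal)
avg_with = average_claim_per_model(normal + [outlier])

# Credit granted to a task that produced 10 models, under each average.
granted_without = 10 * avg_without
granted_with = 10 * avg_with
print(granted_without, granted_with)
```

With these invented numbers, the single odd report pulls the per-model average down, so every other task in the batch is granted less than it would have been without it.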
Rosetta Moderator: Mod.Sense
ID: 47661 · Rating: 0
zombie67 [MM]
Joined: 11 Feb 06
Posts: 316
Credit: 6,621,003
RAC: 0
Message 47664 - Posted: 13 Oct 2007, 0:28:35 UTC - in response to Message 47661.  

I've just reported those results to the Project Team to scrutinize. So I guess there's just enough transparency to track down the ummm... "funny" results reported by that host. And yes, I should think THAT would be a large factor on credit for the specific batches of tasks that had these huge claims reported.

You see, that machine is claiming these models are very easy to produce, so its results would skew the average. In comparison, the rest of us are having more trouble completing a model than the benchmarks would predict, and so we're getting less credit than we would if these excessive claims weren't reported.

Wrong thread?
Reno, NV
Team: SETI.USA
ID: 47664 · Rating: 0
dcdc
Joined: 3 Nov 05
Posts: 1832
Credit: 119,821,902
RAC: 15,180
Message 47665 - Posted: 13 Oct 2007, 0:43:33 UTC - in response to Message 47660.  

These WUs should make interesting blips in the graph.

Where are you getting access to all the results on one screen?

I haven't - I'm using Excel VBA to trawl through all the URLs using a big fat version of this:

Do
    Download result x data
    Store useful details of x
    x = x + 1
Loop

then doing the same for each of the HostIDs to get the computer info etc. I'll put the spreadsheets up once I've finished it and let it run.
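For anyone who would rather not use VBA, the loop above could be sketched in Python roughly as follows. This is not Danny's code. The `fetch` and `parse` callables are placeholders that the caller has to supply, because the thread gives neither the result page URLs nor their layout, and the pause between requests is an added assumption to keep server load down.

```python
import time


def crawl_results(first_id, last_id, fetch, parse, delay_seconds=1.0):
    """Walk result IDs first_id..last_id and collect the parsed details of each.

    fetch(result_id) -> page text, or None if the page is missing.
    parse(page_text) -> dict of the fields worth keeping.
    Both are hypothetical helpers supplied by the caller.
    """
    rows = []
    for result_id in range(first_id, last_id + 1):
        page = fetch(result_id)          # "Download result x data"
        if page is not None:
            row = parse(page)            # "Store useful details of x"
            row["result_id"] = result_id
            rows.append(row)
        time.sleep(delay_seconds)        # pause between requests
    return rows
```

Passing the download step in as a function also makes it easy to try the loop against saved pages before pointing it at a live server.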

Danny
ID: 47665 · Rating: 0
Angus
Joined: 17 Sep 05
Posts: 412
Credit: 321,053
RAC: 0
Message 47669 - Posted: 13 Oct 2007, 2:44:13 UTC - in response to Message 47661.  

I've just reported those results to the Project Team to scrutinize. So I guess there's just enough transparency to track down the ummm... "funny" results reported by that host. And yes, I should think THAT would be a large factor on credit for the specific batches of tasks that had these huge claims reported.

You see, that machine is claiming these models are very easy to produce, so its results would skew the average. In comparison, the rest of us are having more trouble completing a model than the benchmarks would predict, and so we're getting less credit than we would if these excessive claims weren't reported.


And now that those potentially bad credit awards are in the average, the average for the rest of that WU run will be skewed.

I would like to see the actual credit awarding code section to see how this mysterious average is REALLY being calculated.

In the interest of transparency, how about making it public? It isn't part of the Rosetta scientific app so there shouldn't be any license issues involved.

Proudly Banned from Predictator@Home and now Cosmology@home as well. Added SETI to the list today. Temporary ban only - so need to work harder :)



"You can't fix stupid" (Ron White)
ID: 47669 · Rating: -1
Jmarks
Joined: 16 Jul 07
Posts: 132
Credit: 98,025
RAC: 0
Message 47676 - Posted: 13 Oct 2007, 11:47:21 UTC

You could either kick out the Rosetta 5.80 version and use the other Rosetta versions, which would give you the most realistic answers, or you could look for 2168 in the WUs; those seem to be the ones over-reporting.
Jmarks
ID: 47676 · Rating: 0
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 47682 - Posted: 13 Oct 2007, 14:53:52 UTC

There is a very lengthy thread on Ralph from when the new credit system was released. feet1st offered an explanation, and David Kim later described it as "...great and right on".

The only point I'm not clear on myself is #4, as to whether Rosetta's credit average rolls as results come in, or whether it is fixed based on the results from Ralph.

And you may want to review #6 with reference to the "blips" just observed.
Rosetta Moderator: Mod.Sense
ID: 47682 · Rating: 0
FluffyChicken
Joined: 1 Nov 05
Posts: 1260
Credit: 369,635
RAC: 0
Message 47688 - Posted: 13 Oct 2007, 16:22:13 UTC

At the time when it moved over to here it was decided to just take results from here and keep it rolling.

While you say you can get a sample, that does not actually achieve the graph of how things panned out. A sample could easily miss the first few in the group.
You have to collect them all to achieve that, or at least everything once the first few are found (which is very hard to do, as you'd need to have them all to know which was the first).
All this adds considerable load to the Rosetta servers, downloading one page at a time to get the data, when Rosetta could do it themselves with considerably less overhead.

Why do you think results are exported for large collections and RPCs were created to extract specific information?

P.S. Excel is not good for this; you'll need to use Access to collect the data if you're using Office.
Team mauisun.org
ID: 47688 · Rating: 0
dcdc
Joined: 3 Nov 05
Posts: 1832
Credit: 119,821,902
RAC: 15,180
Message 47691 - Posted: 13 Oct 2007, 17:34:39 UTC - in response to Message 47688.  

At the time when it moved over to here it was decided to just take results from here and keep it rolling.

While you say you can get a sample, that does not actually achieve the graph of how things panned out. A sample could easily miss the first few in the group.
You have to collect them all to achieve that, or at least everything once the first few are found (which is very hard to do, as you'd need to have them all to know which was the first).
All this adds considerable load to the Rosetta servers, downloading one page at a time to get the data, when Rosetta could do it themselves with considerably less overhead.

Why do you think results are exported for large collections and RPCs were created to extract specific information?

P.S. Excel is not good for this; you'll need to use Access to collect the data if you're using Office.


Yeah, I thought about Access, but Excel is easier, and yes, it is inefficient - a Bakerlab table with the info in it would be much easier and more efficient. I wasn't really thinking about the evolution of the granted credit calcs when I started though - more about how the different CPU families get credited, as per the OP.

It'd be nice to be able to have a look at how the averages develop though...

ID: 47691 · Rating: 0
dcdc
Joined: 3 Nov 05
Posts: 1832
Credit: 119,821,902
RAC: 15,180
Message 47734 - Posted: 14 Oct 2007, 21:18:57 UTC

Here are some results if anyone wants to play with them... the first sheet is the results and the second is the hosts that they came from:

http://www.extremedc.net/danny/Rosetta/RosettaResults-V1P1.htm
http://www.extremedc.net/danny/Rosetta/RosettaResults-V1P2.htm

If this is useful to anyone then I can get a bigger sample at the click of a button, so just let me know. I don't want to put the spreadsheet up though, because if more than one person runs it, it's unnecessary load on the R@H server.

Danny
ID: 47734 · Rating: 0
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 47760 - Posted: 15 Oct 2007, 15:05:45 UTC

dc, nice work! Could you also create pages with the data in comma-delimited form? Then others could easily import and massage or graph it as they wish.
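As a rough illustration of the comma-delimited export, here is a small Python sketch using the standard `csv` module. The column names and the sample row are invented for the example and are not taken from Danny's spreadsheet.

```python
import csv
import io


def rows_to_csv(rows, columns):
    """Render a list of dicts as comma-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical column names and values, for illustration only.
rows = [{"result_id": 1, "host_id": 42, "claimed": 20.5, "granted": 18.0}]
csv_text = rows_to_csv(rows, ["result_id", "host_id", "claimed", "granted"])
print(csv_text)
```

Text in this form can be imported directly by Excel, Access, or most graphing tools.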

While you are exercising your coding and crawling skills... might I suggest a new tool?

Allow me to enter either my user ID (if my computers are not hidden) or a list of host IDs, and go out and retrieve a list of all WUs that have NOT completed yet. Sort the list by WU name, then host ID, and provide a link for me to click, which goes to the host's page. Since it is my host, I'll be signed on to Rosetta and it will show me the host name, so I can find it.

The idea being that when specific tasks are identified which I wish to abort from my systems, this tool would make it much easier to find all of my affected machines.

...might be helpful to list OS type after host ID as well, in case the trouble only pertains to one platform.

Since you'll only be running over a single user's machines, it wouldn't cause a huge number of hits to the servers. Might want to limit the list of host IDs to 100 or so.
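The filtering and sorting part of that tool might look something like the Python sketch below. It assumes the task records have already been retrieved somehow; the dictionary keys (`wu_name`, `host_id`, `os`, `done`) are hypothetical, and only the sort order and the limit of 100 hosts come from the description above.

```python
MAX_HOSTS = 100  # the "100 or so" limit suggested above


def unfinished_tasks(tasks, host_ids):
    """Return not-yet-completed tasks for the given hosts.

    tasks: iterable of dicts with hypothetical keys
           wu_name, host_id, os, done.
    The result is sorted by WU name, then host ID.
    """
    if len(host_ids) > MAX_HOSTS:
        raise ValueError(f"at most {MAX_HOSTS} host IDs allowed")
    wanted = set(host_ids)
    pending = [t for t in tasks if t["host_id"] in wanted and not t["done"]]
    return sorted(pending, key=lambda t: (t["wu_name"], t["host_id"]))
```

Each returned record still carries its `os` value, so the OS type can be listed after the host ID when the rows are displayed.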

If you do not have a way to serve the dynamic content but can code it in PHP, I have a server where I can host it for you (us).
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
ID: 47760 · Rating: 0
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 47761 - Posted: 15 Oct 2007, 15:10:25 UTC

The one figure your spreadsheet needs is a column for credit per decoy.
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
ID: 47761 · Rating: 0
dcdc
Joined: 3 Nov 05
Posts: 1832
Credit: 119,821,902
RAC: 15,180
Message 47762 - Posted: 15 Oct 2007, 16:32:30 UTC

CSV and credit per decoy are no problem, but PHP isn't on my CV yet!

I think it'd be a useful tool though. I can do it (the VBA is pretty much all already written), but I'd have to enter the hostID manually each time and run it...

Just had a thought - rather than pulling the hosts info from the web I could just pull it from the hosts.gz stats file. That'd save nearly half of the lookups on the san... if I can work out how to pull data from the hosts file - it's XML :o
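Reading a gzip-compressed XML file like that is fairly painless in Python, as the sketch below tries to show. The element and field names (`host`, `id`, `p_vendor`, `p_model`, `os_name`) are assumptions about the layout of the stats export and should be checked against the real file before relying on them.

```python
import gzip
import xml.etree.ElementTree as ET


def read_hosts(source, fields=("id", "p_vendor", "p_model", "os_name")):
    """Yield one dict per <host> element in a gzip-compressed XML export.

    source: a file path or a binary file object.
    The tag and field names are assumed, not confirmed by this thread.
    """
    with gzip.open(source, "rb") as stream:
        # iterparse streams the file, so it never has to fit in memory at once.
        for _event, elem in ET.iterparse(stream, events=("end",)):
            if elem.tag == "host":
                yield {name: elem.findtext(name) for name in fields}
                elem.clear()  # drop the element's children once they are read
```

Fields missing from a host record come back as `None`, which makes gaps easy to spot after importing the rows into a spreadsheet or database.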
ID: 47762 · Rating: 0
dcdc
Joined: 3 Nov 05
Posts: 1832
Credit: 119,821,902
RAC: 15,180
Message 47822 - Posted: 17 Oct 2007, 21:59:09 UTC - in response to Message 47688.  

Why do you think results are exported for large collections and RPCs were created to extract specific information?

P.S. Excel is not good for this; you'll need to use Access to collect the data if you're using Office.

I've imported the hosts and users tables into Access. Is it possible/easy enough to do the web crawling of the Results straight into Access, or do I need to leave that bit in Excel? Or is there a way to do it without trawling through all the pages?
ID: 47822 · Rating: 0
FluffyChicken
Joined: 1 Nov 05
Posts: 1260
Credit: 369,635
RAC: 0
Message 47878 - Posted: 20 Oct 2007, 8:13:32 UTC - in response to Message 47822.  

Why do you think results are exported for large collections and RPCs were created to extract specific information?

P.S. Excel is not good for this; you'll need to use Access to collect the data if you're using Office.

I've imported the hosts and users tables into Access. Is it possible/easy enough to do the web crawling of the Results straight into Access, or do I need to leave that bit in Excel? Or is there a way to do it without trawling through all the pages?

Assuming you use VBA to do the collection, Access uses VBA as well, so it shouldn't be too difficult to move across. As for direct RPC calls to collect data (though I don't think you can collect what you need), look at TeamDoc's VBA code; it is mostly (last time I looked) Excel-independent.
Team mauisun.org
ID: 47878 · Rating: 0