GPU use... again

mikey
Joined: 5 Jan 06
Posts: 1896
Credit: 9,863,522
RAC: 35,832
Message 69750 - Posted: 7 Mar 2011, 10:50:53 UTC - in response to Message 69749.  

> It's actually already written in the posts above. That all cores work on the same thing doesn't change the fact that each core still needs its own thread, and each thread needs some additional memory of its own, since it should be doing something different from the threads on the other cores. And while on a supercomputer each core has a gigabyte or so for its own use, one GPU has 2 GB for... what do the current top models have, 1000 cores?

> Secondly, as I said above, I can't imagine they were running the same WUs on that supercomputer as we get; surely it was something far more complex. And more complex things can usually be parallelized better, but on the other hand they need more RAM.

> So I don't think there is any point in comparing a supercomputer with a GPU; those are two different worlds. If it were possible to replace a supercomputer with a few GPUs, they would all have been shut down by now. So just because something runs on a supercomputer doesn't mean it will necessarily run on a GPU.


There is already a project using multiple CPUs, or even GPUs, on one workunit, but it is not faster or more efficient overall. BOINC is a framework designed for a single-core CPU that has been expanded to work with multi-core CPUs and now GPUs. But since it was originally written for single-core PCs, and a lot of people still use it that way, it is limited in its ability to work like a supercomputer. Nor should it emulate one, IMO. I don't have a supercomputer; I currently have 13 PCs, so it is better the way it is for me and for most of us users.

A supercomputer can usually do one thing very fast. I saw 4.7 MHz XT PCs back in the day, overclocked and sitting in liquid nitrogen running at 100 MHz, adding 1+x a million times! It could add 1 to a number faster than anything else in existence at the time, but it couldn't do anything else without crashing. Supercomputers are not PCs and don't work the same way either; they can handle what they are designed to handle and not much else. I once saw a Cray supercomputer on sale for 10 grand, delivered to your site! I toyed with the idea of getting it and making it crunch, but after some discussions I realized it wouldn't be any faster than my local PC, as it wasn't designed for that kind of calculation!
dcdc
Joined: 3 Nov 05
Posts: 1832
Credit: 119,916,259
RAC: 2,809
Message 69753 - Posted: 7 Mar 2011, 15:19:49 UTC - in response to Message 69749.  
Last modified: 7 Mar 2011, 15:24:55 UTC

> It's actually already written in the posts above. That all cores work on the same thing doesn't change the fact that each core still needs its own thread, and each thread needs some additional memory of its own, since it should be doing something different from the threads on the other cores. And while on a supercomputer each core has a gigabyte or so for its own use, one GPU has 2 GB for... what do the current top models have, 1000 cores?

Each core might need its own thread, but I'm fairly sure all cores can work on the same shared memory - you'd need memory for the model (say 500 MB) plus working memory for each thread to use (say 512 KB per GPU core - a complete guess, but that's the current limit on new Nvidia cards). Have a look at GPUGrid, which is doing exactly what we're talking about on GPUs with less than 1 GB of memory.
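
To make that concrete, here is a minimal CUDA sketch of that memory layout, with the model allocated once in global memory and shared by every thread, while each thread writes only to a small private scratch slice. All names and sizes are illustrative guesses, not anything from Rosetta's or GPUGrid's actual code.

    // One shared model, one small private scratch slice per thread (sizes illustrative).
    #include <cuda_runtime.h>
    #include <cstdio>

    #define SCRATCH_FLOATS 1024   // ~4 KB of private working memory per thread

    __global__ void score_kernel(const float *model, size_t model_len,
                                 float *scratch, float *out)
    {
        int tid = blockIdx.x * blockDim.x + threadIdx.x;
        int nthreads = gridDim.x * blockDim.x;
        float *my_scratch = scratch + (size_t)tid * SCRATCH_FLOATS;  // this thread's slice

        // Toy "work": every thread reads the one shared model, writing
        // only into its own scratch slice.
        float acc = 0.0f;
        for (size_t i = tid; i < model_len; i += nthreads) {
            my_scratch[i % SCRATCH_FLOATS] = model[i];
            acc += my_scratch[i % SCRATCH_FLOATS];
        }
        out[tid] = acc;
    }

    int main()
    {
        const size_t model_len = 500u * 1024u * 1024u / sizeof(float);  // one ~500 MB model
        const int blocks = 64, threads = 256;           // 16384 threads share that model
        const int n = blocks * threads;

        float *model, *scratch, *out;
        cudaMalloc(&model,   model_len * sizeof(float));
        cudaMalloc(&scratch, (size_t)n * SCRATCH_FLOATS * sizeof(float));  // ~64 MB in total
        cudaMalloc(&out,     n * sizeof(float));
        cudaMemset(model, 0, model_len * sizeof(float));

        score_kernel<<<blocks, threads>>>(model, model_len, scratch, out);
        cudaDeviceSynchronize();
        printf("memory cost: one model plus per-thread scratch, not one model per thread\n");

        cudaFree(model); cudaFree(scratch); cudaFree(out);
        return 0;
    }

The point of the sketch is the allocation pattern: the 500 MB model is paid for once, and the per-thread cost is only the scratch slice, which is why thousands of threads can fit in under 1 GB.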

> Secondly, as I said above, I can't imagine they were running the same WUs on that supercomputer as we get; surely it was something far more complex. And more complex things can usually be parallelized better, but on the other hand they need more RAM.
Not necessarily, but I'd be interested to hear from someone on the project team.


> So I don't think there is any point in comparing a supercomputer with a GPU; those are two different worlds. If it were possible to replace a supercomputer with a few GPUs, they would all have been shut down by now. So just because something runs on a supercomputer doesn't mean it will necessarily run on a GPU.

I wasn't comparing a supercomputer with a GPU - I was saying that I expect the version of Rosetta they used on the supercomputer was capable of parallel computation on a single task, rather than one task per core as it runs under BOINC, and if the software is capable of parallel computation then that might be applicable to GPUs too.
robertmiles
Joined: 16 Jun 08
Posts: 1235
Credit: 14,372,156
RAC: 1,319
Message 69805 - Posted: 13 Mar 2011, 4:34:06 UTC - in response to Message 69738.  

> The point is that one task running on all of the cores in parallel surely needs a lot of RAM, and I'm pretty sure they wouldn't run it on a supercomputer if it could be done on a few of our GPUs.


> Not sure I follow; if it's running all cores in parallel on the same model, then the model only needs to be held in memory once (on a GPU at least; on a supercomputer I imagine the networking overheads mean they actually store a local copy in RAM for each blade?). If you're running different tasks on each core, then you need enough RAM for one model per core...


One copy of the PROGRAM should be enough, but unless you do a major rewrite of the program, you'll need as many copies of the DATASPACE memory as there are GPU cores in use. I suspect that most of the 500+ MB the current program requires is dataspace, and therefore you'd need over 500 MB of graphics memory for each GPU core in use. My GTS 450 GPU board has only 1 GB of memory and could therefore use only one GPU core for the calculations, though it could possibly make some use of a few more to access memory not directly reachable from that core.

I'd guess that a GPU board with 2 GB of graphics memory could actually use three GPU cores for calculations at once - still not much of a speedup over the current CPU workunits.
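
As a back-of-envelope check of those numbers (the 500 MB per-core dataspace is the estimate above; the ~100 MB for the shared program copy is purely my assumption):

    // Hypothetical arithmetic only: sizes are this thread's guesses, not measurements.
    #include <cstdio>

    int main()
    {
        const double dataspace_mb = 500.0;   // per-core dataspace (estimate from above)
        const double program_mb   = 100.0;   // one shared program copy (assumed size)
        const double cards_mb[]   = {1024.0, 2048.0};

        for (double card : cards_mb) {
            int cores = (int)((card - program_mb) / dataspace_mb);
            printf("%4.0f MB card -> roughly %d concurrent dataspace copies\n", card, cores);
        }
        return 0;
    }
    // Prints 1 copy for a 1 GB card and 3 for a 2 GB card, matching the guesses above.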

It would need a change to make sure these GPUs get only workunits with many independent decoy starting points, though, instead of the type that needs the results of the previous decoy before starting the next one.
dcdc
Joined: 3 Nov 05
Posts: 1832
Credit: 119,916,259
RAC: 2,809
Message 69808 - Posted: 13 Mar 2011, 10:15:48 UTC - in response to Message 69805.  

> One copy of the PROGRAM should be enough, but unless you do a major rewrite of the program, you'll need as many copies of the DATASPACE memory as there are GPU cores in use.


I already said that work might already have been done:

> I expect that the version of Rosetta they used on the supercomputer was capable of parallel computation on a single task, rather than one task per core as it runs under BOINC, and if the software is capable of parallel computation then that might be applicable to GPUs too.


I assume (without knowing any details about Rosie) that to work in parallel it would change the positions of many amino acids at once (I don't know whether they're moved at random or in the direction of lower energy - i.e., does Rosetta calculate which direction lowers the energy for each AA?) and then recalculate the total energy state.
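
Purely as speculation about what that could look like on a GPU, here is a toy CUDA sketch: one thread per residue applies a small random move, then a block-wide reduction sums a stand-in pairwise energy into a single total. The move set, score function, and every constant here are invented for illustration; Rosetta's real ones are certainly different.

    // Speculative sketch: move every residue at once, then recompute total energy.
    #include <cuda_runtime.h>
    #include <curand_kernel.h>
    #include <cstdio>

    #define N_RES 256   // toy protein size (power of two for the reduction)

    __global__ void move_and_score(float3 *pos, float *energy, unsigned seed)
    {
        int i = threadIdx.x;                  // one thread per residue
        curandState rng;
        curand_init(seed, i, 0, &rng);

        // Random move: every residue shifts a little, all in parallel.
        pos[i].x += 0.1f * (curand_uniform(&rng) - 0.5f);
        pos[i].y += 0.1f * (curand_uniform(&rng) - 0.5f);
        pos[i].z += 0.1f * (curand_uniform(&rng) - 0.5f);
        __syncthreads();

        // Toy pairwise energy: sum of inverse distances to every other residue.
        float e = 0.0f;
        for (int j = 0; j < N_RES; ++j) {
            if (j == i) continue;
            float dx = pos[i].x - pos[j].x;
            float dy = pos[i].y - pos[j].y;
            float dz = pos[i].z - pos[j].z;
            e += rsqrtf(dx * dx + dy * dy + dz * dz + 1e-6f);
        }

        // Block-wide reduction to a single total energy.
        __shared__ float partial[N_RES];
        partial[i] = e;
        __syncthreads();
        for (int s = N_RES / 2; s > 0; s >>= 1) {
            if (i < s) partial[i] += partial[i + s];
            __syncthreads();
        }
        if (i == 0) *energy = 0.5f * partial[0];   // each pair was counted twice
    }

    int main()
    {
        float3 *pos; float *energy;
        cudaMalloc(&pos, N_RES * sizeof(float3));
        cudaMalloc(&energy, sizeof(float));
        cudaMemset(pos, 0, N_RES * sizeof(float3));   // degenerate start, just a demo

        move_and_score<<<1, N_RES>>>(pos, energy, 1234u);
        float e;
        cudaMemcpy(&e, energy, sizeof(float), cudaMemcpyDeviceToHost);
        printf("total energy after one parallel move: %f\n", e);

        cudaFree(pos); cudaFree(energy);
        return 0;
    }

Whether a move like this is accepted or rejected (the Monte Carlo step) would still be a single serial decision per iteration; the parallelism is in moving and rescoring all residues at once.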