Re: Project Update

tazzduke · October 21, 2017, 06:54:14 PM

Greetings All

Yeah, I did strike it lucky with that one, but not with the others. Now I just can't seem to get anything, so I have left it open to see if I can snag a task or two. I just want to get to 1 million anyway.

I would be one of the many who cannot participate if I hadn't done my long overdue rebuild.

But I am kicking myself for not getting the 1700 instead of the 1400, I could have had 16 cores instead of 8 cores. Oh well, bit late now.

Well anyways, I am still crunching WCG for the time being.

kashi · October 21, 2017, 07:45:00 PM

Yeh, I couldn't get any for a day or so too. Well, did get 1. Then got heaps. Will abort those later that will not finish before deadline, that will release some for others. Just testing some v0.26 tasks first to see if they credit the same as v0.25.

Probably only a few people have any tasks, and those that do would have tried to maximise their cache in case future tasks grant less. It's unfair really but happens commonly with limited batches of tasks available when new projects are under development.

Wouldn't matter if credit rate was consistent, anyone could easily catch up later. But if credit rate drops later then those who got lots of early credit monsters are unable to be overtaken easily or at all. With limited task availability, admin should have put a limit on number of tasks per GPU allowed in cache but he's been busy trying to get checkpoints working instead, plus faffing around with Linux compile questions.

Yep, I've still got both E5-2670 CPUs fully on WCG. Finished up with MCM for now and concentrating on next FA@H2 and HS TB badges.

tazzduke · October 21, 2017, 08:59:25 PM

Hey I got one lol

JugNut · October 22, 2017, 02:44:02 AM

Now I'm not opposed to a good day's credit, but yesterday I received over 20mil for just that one day, and even for me that's a tad excessive. That kind of credit is unsustainable. I mean, the only other project where I could get a similar total would be Collatz, and only then by heavily tuning the app. But the Collatz project uses a tiny math app that's been worked on and optimised year after year by a skilled programmer, so in some ways I can understand why they give the credit they do. That's not the case for Drug Discovery; in fact their app is the total opposite.

The thing is, around the 250k mark for a 7 or 8 hour work unit would have been quite acceptable, and is somewhere around what GPUGrid gives in credit. But because the Drug Discovery app is so inefficient, you can run 3, 4 or more instances at once, which of course means you receive 3, 4 or more times the credit you normally would.

Normally I despise people who complain about getting good credit, thereby ruining everyone else's fun when it's eventually changed. That's why I won't be mentioning it on Drug Discovery's website, but I still think it needs changing as it is more than just a tad extreme. Regardless of what I think, it's usually not long before a killjoy or two does start complaining, and that may be the end of that. So if you're looking for a quick top up on credit, well, now's the time as it might not be there in the future.

But all in all it's great to have another Bio project that uses GPU's.

What do you guys think? Too much? Or do you think there should be no limits on credit at all?

Anyway for now let the credit flow & the drugs be discovered.. LOL

Crunch-on..

kashi · October 22, 2017, 05:46:53 AM

Well high credit in itself is not a problem so much if it is a consistent rate, there are sufficient tasks available for all and application is reasonably efficient and not overly CPU bound.

Then many people get the chance to enjoy a project with high credit and the project itself may interest them as well. I certainly enjoy a project more when credit rate is high rather than stingy. Although credit's not the only criterion for choice, rather employ my GPU on GPUGrid or DrugDiscovery than on Collatz, regardless of credit rate.

However when the rate repeatedly varies excessively from lowish to very high, depending on when you downloaded a batch then it is unfair.

When documentation is poor and doesn't mention that a recent GPU driver version is required that is also unfair.

Most importantly when available tasks are limited and no restriction is placed on cache size, that is the unkindest cut of all. It means some like me and you can gallop up the charts with our fat caches whilst others suffer a task drought and are unable to join in the credit party.

There's a few problems causing this and they've happened a number of times before.

Firstly, the use of a credit system that varies the amount of credit granted proportionally according to runtime. This on a GPU project that has the second problem is a proven recipe for credit mayhem.

Secondly, as already mentioned, a GPU application that is extremely inefficient and/or CPU bound. This causes the running of multiple concurrent GPU tasks to try and use the GPU more efficiently. Not only does this greatly multiply any very high credit rate batch tasks but it also uses up all your CPU cores so prevents other projects being run.

Thirdly the lack of a cache size limit on a new project with limited tasks available causes a task feast for a few and a task famine for many. Cache limit is easy to implement through the BOINC software and should be done as a priority when task availability is limited.
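To put rough numbers on the first two problems: under a purely runtime-proportional scheme, every concurrent task accrues paid runtime for the whole day, so daily credit scales with the number of tasks running at once rather than with the work actually finished. The sketch below is only an illustration of that effect; the credit rate is a made-up figure and the project's real credit formula is not known here.

```python
MINUTES_PER_DAY = 24 * 60


def daily_credit(credit_per_minute, concurrent_tasks):
    """Daily credit when each task is paid in proportion to its own runtime.

    credit_per_minute is a hypothetical rate, not the project's real one.
    """
    # Every concurrent task clocks up runtime for the full day, so the
    # total paid runtime is concurrent_tasks * MINUTES_PER_DAY no matter
    # how slowly each individual task progresses.
    return credit_per_minute * MINUTES_PER_DAY * concurrent_tasks


base = daily_credit(credit_per_minute=10, concurrent_tasks=1)
for n in (1, 2, 3, 4):
    multiple = daily_credit(10, n) / base
    print(f"{n} concurrent task(s): {multiple:.0f}x the single-task daily credit")
```

Fixed credit per task removes this multiplier, because running more tasks at once then only pays more if more tasks actually finish.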

Admins who are struggling just to get new applications to work without crashing and to checkpoint properly may not have sufficient working knowledge of various BOINC credit systems to avoid the greatly increased potential of runtime based credit systems going totally haywire when used for GPU applications.

Sometimes think they're a bit lacking in GPU application development skills. May be whizzes at compiling Linux applications and other programming, but perhaps have little experience and knowledge of how to optimise GPU CUDA or OpenCL code for a range of modern GPUs. If they can get their GPU applications to work without errors and checkpoint then that's it, job done. Always felt POEM was like that.

Yes good to have another Bio project for GPUs; even though I've currently got an Nvidia myself, still unfortunate it's not AMD GPU too though.

I've copped it many times on different projects when I've missed the boat on getting any early gigantor credit tasks, so other than a slight twinge I'm not going to be embarrassed by roaring up the charts because I managed to snag a cache of big 'uns. Bonanza will probably end soon enough so I'll just enjoy this tasty credit boost while it lasts.

tazzduke · October 22, 2017, 12:00:56 PM

Greetings All

Just got through 3 tasks and well yep credit is all over the place lol.

But then I forgot to apply an app_config file so I could run 2 tasks at a time and just got 9 tasks, so I will have to wait till they go through and then try again.

I am running a GTX 960 though, so am not sure if it's even worthwhile.

Cheers.

chooka03 · October 22, 2017, 03:20:05 PM

Meanwhile those of us with AMD cards just watch on

I'm not worried though. Plenty of other things to crunch :)

kashi · October 22, 2017, 03:41:26 PM

Yes, looks like credit is on the way down again. My last task dropped by 81% compared to rate of my recent previous tasks. Never got any of the super dooper 500,000+ credit tasks, just the high 170,000 to 220,000 ones. At that 81% lower rate, daily yield would be similar to GPUGrid, but GPUGrid doesn't use all my CPU cores or make computer run as hot. See what happens with the current 4 tasks in progress, if they're same as last one then I'll sing my song and I'll be gone*.

Be a relief in some ways as I'll be able to add this computer back with other box's WCG badge hunting efforts. Those Sapphire badges and higher take a power of processing even with many cores a crunchin'.

At least our DrugDiscovery million badges are prettier than the blurry 100,000 one. Plus the team is in 2nd place too.

Be good if the application can be optimised in the future to be more efficient and not require hogging all your CPU cores to use your GPU efficiently. I mean you can already optimise it greatly with use of nt parameter but that still uses all your CPU cores.

Maybe wishful thinking, but would be nice to have another GPU Bio project available that was well behaved and stable. Fixed credit and a cache limit of 1 or 2 tasks per GPU, like GPUGrid, can fix the credit problems and then another nifty project choice for Nvidia owners is available. Everyone has their preferences but I prefer to use my GPU searching for curative drugs rather than aliens. No AMD GPU support is a continuing unfortunate theme of course, as it restricts GPU purchasing choice, reduces potential project participation and prevents project being AA target.

The pity about not properly taking care of credit considerations with a new project is that once the admins wake up and finally do something about it, they often overreact and go to the stingy dark side. This means that the oodles of credit amassed by the early adopters cannot be easily overtaken which reduces incentive for competitive crunchers in the future. Even when it is necessary and justified, a sudden massive reduction in credit rate always leaves a bad taste, haha.

*Haha, remember Spectrum:
Someday I'll have money
Money isn't easy come by
By the time it's come by I'll be gone
I'll sing my song and I'll be gone

tazzduke · October 22, 2017, 04:17:03 PM

Afternoon, figured it out with the app_config file, but once these are done am fully back onto WCG.

Yeah I remember Spectrum lol.

It's a motto around here sometimes lol.
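For anyone wanting to try the same thing, here is a minimal sketch of generating an app_config.xml that asks BOINC to run two tasks per GPU. The `gpu_usage` and `cpu_usage` elements are the standard BOINC ones, but the application name below is a placeholder (assumption): the real short name has to be taken from the project's application list or from client_state.xml.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, tasks_per_gpu, cpus_per_task):
    """Return app_config.xml text for running several tasks on one GPU."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # gpu_usage is the fraction of a GPU each task reserves,
    # so 0.5 lets BOINC schedule two tasks per GPU.
    ET.SubElement(gpu, "gpu_usage").text = str(1 / tasks_per_gpu)
    # cpu_usage is the number of CPUs BOINC budgets for each GPU task.
    ET.SubElement(gpu, "cpu_usage").text = str(cpus_per_task)
    return ET.tostring(root, encoding="unicode")


# "example_gpu_app" is a hypothetical name used only for illustration.
xml_text = build_app_config("example_gpu_app", tasks_per_gpu=2, cpus_per_task=1)
print(xml_text)
```

The output would be saved as app_config.xml in the project's folder under the BOINC data directory, and the client then told to re-read its config files.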

tazzduke · October 22, 2017, 10:44:40 PM

Oh my the app_config file is not behaving like it should.

Oh I got me an error on job, oh bother.

That's it, shutting down the GPU, running all cores on WCG for a few days while the temps are down outside.

I thought I might be able to get a blue badge on one of the subprojects but will see about that lol.

Cheers and have a good night.

kashi · October 23, 2017, 01:58:08 AM

Tarnation, how annoying!

Just noticed those 9 tasks you downloaded were Gromacs_v2 tasks. That's a new application just released yesterday. As far as I can see, all Gromacs_v2 tasks completed are failing on upload same as yours with "file_xfer_error" messages. Small consolation I know, but it's not just you.

Have set this GPU equipped box to cleaning up some WCG tasks in my cache for the next 11 hours or so, then may try a few more DrugDiscovery GPU v0.26 tasks. After the credit drop, was fiddling around trying out different nt values with 2 tasks and the last task I completed doubled the credit rate per task again from its lowered rate of 30-40K back to 89K.

Not sure if it's actual CreditNew at work with these large credit fluctuations or the admin repeatedly trying and failing to fix the credit rate so it is reasonable for all. Either way, credit stability and consistency is probably impossible with such an inefficient GPU application. Means this messiness will continue until he either increases the efficiency and/or uses fixed credit.

Just for interest, a single task with nt 4 took 102 minutes to complete and a task with nt 6 took 79 minutes. Good thing is the v0.26 application now multiplies the actual runtime by the nt value and reports it as elapsed time. Previously actual elapsed time was reported and used for credit calculation, so you were penalised much more heavily then for increased efficiency.

For comparison running 4 concurrent tasks with default nt 1, each task completes in 306 to 387 minutes. Remembering application is heavily CPU bound so GPU load and task runtime varies a fair bit depending on whether spare CPU cores are used for CPU projects or left free. Think I was running 3 WCG tasks with the nt4 task and no CPU tasks with nt 6.
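The runtimes quoted above can be turned into a rough tasks-per-day comparison. This is only back-of-envelope arithmetic on the figures in this post; the runs had different background CPU loads (3 WCG tasks alongside nt 4, none alongside nt 6), so the configurations are not strictly like for like. The reported elapsed time assumes the v0.26 behaviour described above, actual runtime multiplied by nt.

```python
MINUTES_PER_DAY = 24 * 60


def tasks_per_day(runtime_minutes, concurrent=1):
    """Tasks finished per day if the GPU is kept busy around the clock."""
    return concurrent * MINUTES_PER_DAY / runtime_minutes


def reported_elapsed(runtime_minutes, nt):
    """Elapsed time as reported by v0.26: actual runtime multiplied by nt."""
    return runtime_minutes * nt


# (label, actual runtime in minutes, concurrent tasks, nt value)
configs = [
    ("1 task, nt 4", 102, 1, 4),
    ("1 task, nt 6", 79, 1, 6),
    ("4 tasks, nt 1 (fastest)", 306, 4, 1),
    ("4 tasks, nt 1 (slowest)", 387, 4, 1),
]

for label, runtime, concurrent, nt in configs:
    print(
        f"{label}: {tasks_per_day(runtime, concurrent):.1f} tasks/day, "
        f"{reported_elapsed(runtime, nt)} min reported per task"
    )
```

On these figures a single nt 6 task lands inside the range of four concurrent nt 1 tasks for throughput, roughly 18 tasks a day against 15 to 19.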

CPU cores really heat up running a single task nt 4 and nt 6, even with no CPU tasks running. So GPU application is actually heavily using CPU resources and not just that weird Nvidia bug/feature with GPU applications where a CPU core is reported as being used, but actual CPU usage is very low.

JugNut · October 23, 2017, 03:07:19 PM

@tazzduke: Eeek.. that's terrible, at least they seem to have canceled all the rest of them for now. So far I haven't received any of those so they must be a work in progress.

@kashi: You know, I must be getting slow in my old age. I'm not sure why it didn't dawn on me before, but the reason these tasks run much the same on my GTX 970s as they do on the pair of 1080s is that they're totally CPU bound. We had already come to this conclusion yesterday, but what I hadn't taken into account was the part AVX plays in all this. Since the tasks use AVX, and by nature AVX tries to use a full core and gets no advantage from hyperthreading, the box with the pair of 1080s in it should have 8 full cores to feed the 8 GPU tasks, which it does not.
I should only be running at most 3 tasks concurrently on each GPU. Why? Because that box only has a 6 core 12 thread CPU in it, and not only have I been running 4 GPU tasks concurrently on each GPU, I've also been running the extra WCG tasks to boot. That can only mean lots of contention. In other words there are 8 GPU AVX tasks fighting for 6 cores, and those cores already have WCG tasks on them, which only makes matters worse. I presume that's why the CPU is working so hard and getting so hot too.

As we've talked about before, technically the best way to run these GPU AVX apps would be to disable any other CPU based WUs, reboot the PC, disable hyperthreading in the BIOS and then only run as many tasks as you have actual cores, just like you might do with any other AVX CPU app, such as LLR tasks from PrimeGrid.
Of course that's a pain and the few times I have ever tried it there wasn't a huge advantage anyway, but theoretically it should work.

On the other side, my GTX 970s have an 8 core 16 thread CPU at their disposal, which means even though the GPU is slower the app can still run close to its full speed, whereas the 1080s can't (even after disabling WCG work).

I can't say for sure, but I doubt CreditNew is being used with these, as CreditNew seems to be a means to stabilize credit and even reduce it year on year as hardware power increases, so I just find it hard to imagine CreditNew giving such huge credits at any time. I'm sure there'd be a way of finding out, but I have no idea how.

Also I think the "nt" switch is not such a good solution as it first sounds, because most people still only have limited "actual" cores. If you run 4 GPU work units you'll need four real cores. So in that scenario setting anything other than nt 1 should not work well at all, especially if you use the most common of all CPUs, the Intel i5 4 core or the i7 4 core 8 threader, as both only have 4 real cores. I suppose if you set nt 2 and only ran 2 GPU tasks concurrently that might work out with a 4 core chip?

Of course this is just theoretical and only actual testing with this in mind will find the best sweet spot.
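The core budgeting above can be sketched as a quick calculation. It assumes, as in the reasoning in this post, that every task thread wants a whole physical core and gets nothing from hyperthreading; that is a simplification, and the function names are made up for illustration.

```python
def max_concurrent_tasks(physical_cores, nt=1):
    """Most GPU tasks that fit if every thread needs its own physical core."""
    return physical_cores // nt


def core_demand_ratio(physical_cores, gpu_tasks, nt=1, other_cpu_tasks=0):
    """Threads wanting a full core divided by the physical cores available.

    Anything above 1.0 means the threads are fighting over cores.
    """
    return (gpu_tasks * nt + other_cpu_tasks) / physical_cores


# 6 core box with two 1080s running 4 tasks each at nt 1: oversubscribed
# even before any WCG tasks are counted.
print(core_demand_ratio(physical_cores=6, gpu_tasks=8))

# 8 core box: 8 GPU tasks at nt 1 fit exactly.
print(core_demand_ratio(physical_cores=8, gpu_tasks=8))

# 4 core i5 or i7 with nt 2: room for 2 concurrent GPU tasks.
print(max_concurrent_tasks(physical_cores=4, nt=2))
```

Only real test runs will show where the sweet spot is, but a ratio above 1.0 is a reasonable warning sign before starting.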

Anyhoo, I thought I'd share my epiphany even though, now I think of it, it's quite obvious.

Mmm, I wonder how your 970 would go in the Asus server box? For all intents and purposes you'd have plenty of cores to feed the GPU whatever it needed. Might be a hassle to implement but I wonder just how far you could load it up? Just a thought...

Crunch-on....

PS: 28mil in credit yesterday, which rocketed me into first place!! I've never been this lucky before; I'm not sure if I should be elated or embarrassed. I wonder how long it will be before they reverse the over crediting? Oh well, easy come easy go...

kashi · October 24, 2017, 01:19:58 AM

Congratulations on first place and achieving the 50 million DNA helix badge.

Ah yes, it had occurred to me to run a single task using nt 32 parameter for fun in dual box, if it had a GPU installed. Think it may not work too well though because of how the 2 separate CPUs interact with the memory and other resources. Kind of inter-socket contention issues. Sometimes you can partly get around these issues causing slowdown of multithreaded CPU programs by running 2 program instances and allocating program threads/cores to a particular CPU using Task Scheduler or Process Lasso. However for GPU applications I think the lanes on one PCIe slot map to one CPU and another slot maps to other CPU. So you'd perhaps need 2 GPUs to utilise both CPUs.

Didn't think of turning off hyperthreading though, haven't wanted to use all 8 CPU "cores" on Skylake box because it's running warm enough just using 6 (X3 tasks with nt 2). But it may run a fair bit cooler running all 4 cores with hyperthreading off, might try it tomorrow.

However as you said when it comes to GPU applications rather than solely CPU applications, sometimes turning hyperthreading off gives little or no advantage. Plus without the elapsed time multiplier effect of running X4 tasks with nt 2 like you can with hyperthreading on, it may end up being more efficient by completing more tasks per day but actually get less daily credit. That's always irritating when a faulty credit scheme promotes inefficiency.

Gromacs is complaining of the inefficiency of not using -pin on (and -pinoffset for multiple jobs). Could try fixing threads to cores I suppose, haven't used Process Lasso for a while. Just tried 1 task with nt 7 and it completed in 71 minutes, now have cleaned heatsink of CPU a little plus also filters on case and am running 1 task at nt 8. GPU and CPU temps are warmish but acceptable, slightly less than when running 4 concurrent at nt 1. Afterburner GPU usage % graph line is very smooth and steady at 64% to 66%. Task Manager Details tab shows gmx.exe using 92% to 94% of CPU.

Probably will only be a tiny bit faster if at all than nt 7, but runtime will be multiplied by 8 instead of by 7. Think I'll leave it on that overnight as it seems stable. Dual box is crunching away on WCG without complaint.

Actually CreditNew has a history of wild and erratic spiking and dipping when being used with GPU applications. It has no effective mechanism to consistently compensate for the large differences in Runtime when used with inefficient GPU applications and the resulting concurrent type processing. Can't remember which projects, but quite a few had big trouble with CreditNew and had to hurriedly introduce fixed credit. Think that included either DNETC@HOME or Moo! Wrapper (or both!).

Even with CPU applications it can muck up. Aqua@Home had so much recurring strife with the incompatibility of CreditNew with their multithreaded CPU application causing massive credit spikes that they gave up and left BOINC completely. Plus CreditNew has a comparison feature that is supposed to compare with other projects and adjust the credit rate accordingly. When CreditNew was first used on WCG they were alarmed and dismayed that the credit rate was automatically increased to a reasonable rate and hurriedly disabled that feature as they are totally wedded to the unwholesome, inconsistent, illogical, unfair "reduce towards zero" credit philosophy.

So much so, that on WCG a recent architecture CPU will often have a credit rate that's similar or even less than a CPU architecture from years ago, even though efficiency of more recent CPU architectures has increased greatly. They hate the whole idea of credit at WCG which is why they invented their colourful Runtime badges to ensure people with old, inefficient "boat anchor" CPUs could pretend they were doing a useful amount of work instead of basically just wasting power.

Gee I'm twitter and bisted sometimes. Never miss an opportunity to bag WCG's "war on credit" stinginess even though I often turn my bedroom into a bath of fire running 3 CPUs/35 cores on their projects to help cure diseases. Anyway, have a few Emeralds pending and will probably go for Zika Sapphire next, seeing as can't get many HS TB tasks at all. Maybe I too need to employ "old batchie" at the 3rd and 33rd minute every hour, haha.

JugNut · October 24, 2017, 09:35:17 AM

Hey thanks kashi,
I suppose it had to happen eventually, but my credit output is now down by well over half what it was the day before. That hasn't happened across the board though, as I've noticed there are now others getting the same credit I was receiving, and they are now racing up the ranks behind me. The funny thing is that the one box that received the lion's share of the credits performed quite poorly in the first place.
I wish I knew which setting controlled the credit boost though. It would be awesome if everyone in the team could receive the same huge credits for a time and that way the team could get a massive head start on this project. But sadly I have no idea what caused the spike as it was probably just a bit of luck. But you can bet your boots I looked into it anyway. LOL

Oh! And I'm sure you're right about CreditNew, kashi, but with my inbuilt bias against it I just couldn't imagine it ever giving something good to anyone at any time. Plus I had never done either of those projects you mentioned, so happily or sadly, depending on your point of view, I missed out on those particular rounds of credit craziness. Thanks for the info.

Anyway time to prepare for another fun day at the doctors. Oh well C'est la vie

Crunch em if you got em..

kashi · October 24, 2017, 08:32:10 PM

Yes you're quite right, CreditNew as used in CPU projects almost always causes very low credit because it applies the horrid, illogical "reduce towards zero" concept whereby newer faster computers are automatically crippled credit wise and awarded credits within the usual despised stingy range.

However admins who are uninformed or misguided enough to ignore repeated history and try and use CreditNew for GPU applications may often be the same type who panic when credit rate spikes ridiculously high. Then they may repeatedly manually adjust the task parameters relating to credit calculation to try and restore stability. The CreditNew "smarts" then "fights back" and the yoyo continues.

Wouldn't feel guilty over winning the occasional credit lottery. The oodles of years of processing time we've donated to projects where credit is unjustifiably low and a poor recompense for the many thousands of bucks we've spent on crunching computers, expensive GPUs, power and/or solar installations more than balances it out. You wouldn't have any colourful WCG badges at all if you lived by credit alone. But despite the holier than thou lamentations of the anti-credit whingers, avid Cruncherman does not live by badges alone either, so just enjoy it as a random bonus.

Don't think I got much advantage from those 3 projects I mentioned. Aqua@Home was a lucky dip as only a small number of contributors quickly gobbled up all the task batches where the credit was huge. Remember being a bit disappointed that I missed out. Can't remember for sure but I think they removed/reduced some credit also when it became too excessive. Think one or both of the GPU projects quickly and wisely moved to fixed credits, so again don't remember getting any bonanza there either. Also I was possibly focussing on MilkyWay@Home on GPU back then, so just observing for interest.

Back to DrugDiscovery, yes the team is building up a handy total. Not going to do any more testing of different task number and nt combinations. Getting very warm in this room today and GPU runs cooler running only a single task. Possibly could get more doing multiple concurrent but although rate continues to gradually drop, single task daily yield is currently still "quite generous", mwuhahaha.

BOINC-AUSTRALIA FORUM