
Re: Project Update

Started by Tixx, May 07, 2009, 09:30:47 AM


JugNut

#45
Not to worry kashi, at least now you get to test out the checkpointing ;)



EDIT: Mmm, I spoke too soon, it looks like I'll soon get that opportunity as well.  You really have to keep your eye out for those short-deadlined resends.  Now if I see one that hasn't yet jumped to lightspeed I abort it straight away.

Oh and what about those uploads & downloads?  They're huge!!  With all the ancillary files the uploads get up around 36-39 MB each and the downloads aren't far off either. Lucky most are long runners or it would be a big hit to those without unlimited internet plans.


kashi

Haha, yes resume from checkpoint is still not working in v0.25, that's why I exclaimed "Bah!". Have set No new tasks and reduced my cache setting to hopefully avoid further annoying swap outs. Haven't tried checkpointing on v0.26, still have oodles of v0.25 left; the first 2 I completed paid exactly the same credit rate as the v0.24 tasks I completed.

Credit scales exactly with runtime for the last 3 tasks completed, so I've reverted to nt 1 from nt 3 and am now doing 4 concurrent nt 1 rather than 2 concurrent nt 3. Tasks take more than twice as long without nt 3, so 4 x nt 1 is probably about 10% less efficient, but total credit yield should more than double due to runtime scaling.

GPU load is flicking between 68% and 76%, could possibly go 5 concurrent for a little extra but 4 will hopefully do nicely enough. If credit rate scaling remains the same as the last 3 tasks, the daily total will be huge at 4 concurrent.
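For anyone wanting to reproduce this multi-task setup, the usual BOINC mechanism is a per-project app_config.xml in the project's folder under the BOINC data directory. A minimal sketch for 4 tasks per GPU with one CPU core reserved each; note the `<name>` value here is a placeholder, the project's real application name is listed in client_state.xml:

```xml
<!-- app_config.xml - place in the project's directory under the BOINC data dir.
     gpu_usage 0.25 lets the client schedule 4 tasks per GPU;
     cpu_usage reserves one CPU core per task.
     The <name> below is hypothetical - check client_state.xml for the real app name. -->
<app_config>
  <app>
    <name>drugdiscovery_gpu</name>
    <gpu_versions>
      <gpu_usage>0.25</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

After saving the file, use Options → Read config files in BOINC Manager (or restart the client) for it to take effect.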

JugNut

#47
Yep, that's what I ended up with, 4 concurrent on most of my cards except for a single 980 Ti that I'm testing 5 at a time on.  If I had two 980 Tis that would be 10 cores I'd have to keep free.  I'm thinking about stopping CPU work altogether as these guys seem to like as much CPU time as possible. At the moment nt 1 is working out nicely so I'm leaving it as is for now.

Also I've noticed on the 5960X the CPU is at maximum power draw, something I've never seen before. For some reason these tasks are giving my PCs a good workout, as the fans on my boxes are screaming like an army of angry banshees.



EDIT: Jeepers!!! If you thought those last WUs credited well then have a look at these... Link..

At this rate I'll have a billion credit before the day's out.  :rofl: 
Ok that's a slight exaggeration on my part, but boy are they crediting well, to say the least.


EDIT2: With the extra push of that last batch of big-crediting jobs I jumped into fourth place...   :dance: 

Sing it loud..... happy days are here again, the skies above are clear again, let us sing a song of cheer again, happy days are here again da da da da da dar.... (it's an old war song my dad used to sing all the time) Song Link...

EDIT3: Going up...  now in 3rd place..

kashi

#48
Yep, you've really hit the jackpot on that machine.  :cheer1:

It's gotta be some form of CreditNew hasn't it, surely nothing else is so wildly, erratically inconsistent as to give hugely different credit rates to different computers with similar hardware specs for the same batch of tasks. You don't even need one of the fastest cards either; Tazzduke had a 622K task on his GTX 960 a few days ago.

My last 4 tasks completed and credited at the same hourly rate as the 3 before them. So if I multiply that by 4 concurrent, then by 24 hours, it should yield around 3.3 million, woohoo.
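As a quick sanity check of that projection, the arithmetic works out as below; the per-task hourly rate is an inferred figure chosen to match the quoted total, not a number posted in the thread:

```python
# Back-of-envelope daily credit projection.
# rate_per_task_per_hour is an assumption inferred from the ~3.3 million/day
# figure mentioned above; the actual per-task rate wasn't posted.
rate_per_task_per_hour = 34_375   # credits per task-hour (assumed)
concurrent_tasks = 4
hours_per_day = 24

daily_yield = rate_per_task_per_hour * concurrent_tasks * hours_per_day
print(f"{daily_yield:,}")  # 3,300,000
```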

Know that song very well, my mum used to sing it sometimes, she sang it in a concert at Sydney Girls' High about 1941, which she won. Don't remember hearing that version you posted before, can remember a female version from when I was young, probably Patti Page. Then later on one by Max Bygraves. Slow version was Barbra Streisand's "signature song" during the early part of her career and was on her first LP.

Yeh, it's fun roaring up the charts, ain't it. I've zoomed up 130 places in the last 3 days, haha.

Edit: Ah, didn't notice, statseb just informed me of new badge. Oooh, giant, blue and shiny. :crazy

tazzduke

Greetings All

Yeah, I did strike it lucky with that one, but the others not so much; now I just can't seem to get anything, so I have left it open to see if I can snag a task or two.  I just want to get to 1 million anyway.

I would be one of many who couldn't participate if I hadn't done my long overdue rebuild.

But I am kicking myself for not getting the 1700 instead of the 1400, I could have had 16 cores instead of 8 cores.  Oh well, bit late now.

Well anyways, I am still crunching WCG for the time being.



 AA 24 - 53 participant

kashi

Yeh, I couldn't get any for a day or so too. Well, did get 1. Then got heaps. Will abort later those that won't finish before deadline; that will release some for others. Just testing some v0.26 tasks first to see if they credit the same as v0.25.

Probably only a few people have any tasks, and those that do would have tried to maximise their cache in case future tasks grant less. It's unfair really but happens commonly with limited batches of tasks available when new projects are under development.

Wouldn't matter if the credit rate was consistent, anyone could easily catch up later. But if the credit rate drops later then those who got lots of the early credit monsters can't be overtaken easily, or at all. With limited task availability, the admin should have put a limit on the number of tasks per GPU allowed in a cache, but he's been busy trying to get checkpoints working instead, plus faffing around with Linux compile questions.

Yep, I've still got both E5-2670 CPUs fully on WCG. Finished up with MCM for now and concentrating on next FA@H2 and HS TB badges.

tazzduke




 AA 24 - 53 participant

JugNut

#52
Now I'm not opposed to a good day's credit, but yesterday I received over 20 mil for just that one day; even for me that's a tad excessive.  That kind of credit is unsustainable. I mean, the only other project where I could get a similar total would be Collatz, and only then by heavily tuning the app.  But the Collatz project uses a tiny maths app that's been worked on and optimised year after year by a skilled programmer, so in some ways I can understand why they give the credit they do.  That's not the case for Drug Discovery; in fact their app is the total opposite.

The thing is, at around the 250K mark for a 7 or 8 hour work unit the credit would have been quite acceptable, and is somewhere around what GPUGrid gives. But because the DrugDiscovery app is so inefficient you can run 3, 4 or more instances at once, which of course means you receive 3, 4 or more times the credit you normally would.

Normally I despise people who complain about getting good credit, thereby ruining everyone else's fun when it's eventually changed; that's why I won't be mentioning it on DrugDiscovery's website. But I still think it needs changing, as it is more than just a tad extreme. Regardless of what I think, it's usually not long before a killjoy or two does start complaining, and that may be the end of that.  So if you're looking for a quick top-up on credit, well now's the time, as it might not be there in the future.   

But all in all it's great to have another Bio project that uses GPU's.

What do you guys think? Too much? Or do you think there should be no limits on credit at all?

Anyway for now let the credit flow & the drugs be discovered.. LOL 

Crunch-on..

kashi

#53
Well, high credit in itself is not so much a problem if the rate is consistent, there are sufficient tasks available for all, and the application is reasonably efficient and not overly CPU bound.

Then many people get the chance to enjoy a project with high credit, and the project itself may interest them as well. I certainly enjoy a project more when the credit rate is high rather than stingy. Although credit's not the only criterion for choice; I'd rather employ my GPU on GPUGrid or DrugDiscovery than on Collatz, regardless of credit rate.

However, when the rate repeatedly varies excessively from lowish to very high depending on when you downloaded a batch, then it is unfair.

When documentation is poor and doesn't mention that a recent GPU driver version is required, that is also unfair.

Most importantly when available tasks are limited and no restriction is placed on cache size, that is the unkindest cut of all. It means some like me and you can gallop up the charts with our fat caches whilst others suffer a task drought and are unable to join in the credit party.

There's a few problems causing this and they've happened a number of times before.

Firstly, the use of a credit system that varies the amount of credit granted proportionally according to runtime. This on a GPU project that has the second problem is a proven recipe for credit mayhem.

Secondly, as already mentioned, a GPU application that is extremely inefficient and/or CPU bound. This causes the running of multiple concurrent GPU tasks to try and use the GPU more efficiently. Not only does this greatly multiply any very high credit rate batch tasks but it also uses up all your CPU cores so prevents other projects being run.

Thirdly the lack of a cache size limit on a new project with limited tasks available causes a task feast for a few and a task famine for many. Cache limit is easy to implement through the BOINC software and should be done as a priority when task availability is limited.
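For reference, the cache limit described above is a standard BOINC scheduler setting: the project admin adds per-host limits to the server's config.xml. A sketch with illustrative values (2 in-progress tasks per CPU and per GPU); the numbers are assumptions, not from this project:

```xml
<!-- Fragment of a BOINC project's server-side config.xml.
     Caps the number of unfinished tasks a host may hold at once.
     Values are illustrative only. -->
<config>
  <!-- max in-progress tasks per CPU -->
  <max_wus_in_progress>2</max_wus_in_progress>
  <!-- max in-progress tasks per GPU -->
  <max_wus_in_progress_gpu>2</max_wus_in_progress_gpu>
</config>
```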

Admins who are struggling just to get new applications to work without crashing and to checkpoint properly may not have sufficient working knowledge of various BOINC credit systems to avoid the greatly increased potential of runtime based credit systems going totally haywire when used for GPU applications.

Sometimes think they're a bit lacking in GPU application development skills. May be whizzes at compiling Linux applications and other programming, but perhaps have little experience and knowledge of how to optimise GPU CUDA or OpenCL code for a range of modern GPUs. If they can get their GPU applications to work without errors and checkpoint then that's it, job done. Always felt POEM was like that.

Yes, good to have another Bio project for GPUs; even though I've currently got an Nvidia myself, it's still unfortunate there's no AMD GPU support too.

I've copped it many times on different projects when I've missed the boat on getting any early gigantor credit tasks, so other than a slight twinge I'm not going to be embarrassed by roaring up the charts because I managed to snag a cache of big 'uns. Bonanza will probably end soon enough so I'll just enjoy this tasty credit boost while it lasts.

tazzduke

Greetings All

Just got through 3 tasks and well yep credit is all over the place lol.

But then I forgot to apply an app_info file so I could run 2 tasks at a time and just got 9 tasks, so I will have to wait till they go through and then try again.

I am running a GTX 960 though, so am not sure if it's even worthwhile.

Cheers.



 AA 24 - 53 participant

chooka03

Meanwhile those of us with AMD cards just watch on  :thumbdown:

I'm not worried though. Plenty of other things to crunch :)

kashi

Yes, looks like credit is on the way down again. My last task dropped by 81% compared to rate of my recent previous tasks. Never got any of the super dooper 500,000+ credit tasks, just the high 170,000 to 220,000 ones. At that 81% lower rate, daily yield would be similar to GPUGrid, but GPUGrid doesn't use all my CPU cores or make computer run as hot. See what happens with the current 4 tasks in progress, if they're same as last one then I'll sing my song and I'll be gone*.

Be a relief in some ways, as I'll be able to add this computer back to the other boxes' WCG badge-hunting efforts. Those Sapphire badges and higher take a power of processing, even with many cores a-crunchin'.

At least our DrugDiscovery million badges are prettier than the blurry 100,000 one. Plus the team is in 2nd place too.

Be good if the application can be optimised in the future to be more efficient and not require hogging all your CPU cores to use your GPU efficiently. I mean you can already optimise it greatly with use of nt parameter but that still uses all your CPU cores.

Maybe wishful thinking, but it would be nice to have another GPU Bio project available that was well behaved and stable. Fixed credit and a cache limit of 1 or 2 tasks per GPU, like GPUGrid has, could fix the credit problems, and then another nifty project choice for Nvidia owners would be available. Everyone has their preferences but I prefer to use my GPU searching for curative drugs rather than aliens. No AMD GPU support is a continuing unfortunate theme of course, as it restricts GPU purchasing choice, reduces potential project participation and prevents the project being an AA target.

The pity about not properly taking care of credit considerations with a new project is that once the admins wake up and finally do something about it, they often overreact and go to the stingy dark side. This means the oodles of credit amassed by the early adopters cannot be easily overtaken, which reduces the incentive for competitive crunchers in the future. Even when it is necessary and justified, a sudden massive reduction in credit rate always leaves a bad taste, haha. 

*Haha, remember Spectrum:
Someday I'll have money
Money isn't easy come by
By the time it's come by I'll be gone
I'll sing my song and I'll be gone

tazzduke

Afternoon, figured it out with the app_config file, but once these are done am fully back onto WCG.

Yeah I remember Spectrum lol.

It's a motto around here sometimes lol.



 AA 24 - 53 participant

tazzduke

Oh my the app_config file is not behaving like it should.

Oh, I got me an error on a job, oh bother.

That's it, shutting down the GPU, running all cores on WCG for a few days while the temps are down outside.

I thought I might be able to get a blue badge on one of the subprojects, but will see about that lol.

Cheers and have a good night.



 AA 24 - 53 participant

kashi

#59
Tarnation, how annoying!  :boom:

Just noticed those 9 tasks you downloaded were Gromacs_v2 tasks. That's a new application just released yesterday.  As far as I can see, all Gromacs_v2 tasks completed are failing on upload same as yours with "file_xfer_error" messages. Small consolation I know, but it's not just you.

Have set this GPU-equipped box to cleaning up some WCG tasks in my cache for the next 11 hours or so, then may try a few more DrugDiscovery GPU v0.26 tasks. After the credit drop, I was fiddling around trying out different nt values with 2 tasks, and the last task I completed doubled the credit rate per task again, from its lowered rate of 30-40K back to 89K.

Not sure if it's actual CreditNew at work with these large credit fluctuations or the admin repeatedly trying and failing to fix the credit rate so it is reasonable for all. Either way, credit stability and consistency is probably impossible with such an inefficient GPU application. Means this messiness will continue until he either increases the efficiency and/or uses fixed credit.

Just for interest, a single task with nt 4 took 102 minutes to complete and a task with nt 6 took 79 minutes. The good thing is the v0.26 application now multiplies the actual runtime by the nt value and reports it as elapsed time. Previously the actual elapsed time was reported and used for credit calculation, so you were penalised much more heavily for increased efficiency.

For comparison, running 4 concurrent tasks with the default nt 1, each task completes in 306 to 387 minutes. Remember the application is heavily CPU bound, so GPU load and task runtime vary a fair bit depending on whether spare CPU cores are used for CPU projects or left free. Think I was running 3 WCG tasks alongside the nt 4 task, and no CPU tasks with nt 6.  
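Putting those quoted runtimes side by side gives a rough throughput comparison; the nt 1 figure below uses the midpoint of the 306-387 minute range, and the spread from mixed CPU loads means these are ballpark numbers only:

```python
# Rough tasks-per-hour throughput from the runtimes quoted above.
configs = {
    "nt 4, 1 task at a time":  (1, 102.0),   # (concurrent tasks, minutes each)
    "nt 6, 1 task at a time":  (1, 79.0),
    "nt 1, 4 tasks at a time": (4, 346.5),   # midpoint of the 306-387 min range
}
for name, (n_tasks, minutes) in configs.items():
    throughput = n_tasks * 60 / minutes     # completed tasks per hour
    print(f"{name}: {throughput:.2f} tasks/hour")
```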

CPU cores really heat up running a single task at nt 4 or nt 6, even with no CPU tasks running. So the GPU application is actually heavily using CPU resources, and it's not just that weird Nvidia bug/feature with GPU applications where a CPU core is reported as being used but actual CPU usage is very low.