Testing of BOINC version of theSkyNet

Started by Dingo, August 04, 2012, 12:27:05 AM


kashi

Yeah, you would have had about 15-20K for valid tasks by now if the current rate had applied earlier. You probably had invalids too, which isn't happening now thanks to all the testing and refinement.
So great work on the testing Dingo, thank you.  :congrats    :congrats

Yes I can see the team without logging in JugNut. You should sprint past me soon, but not quite yet. ;D

Mike Mitchell

I just joined the project, and added the team, got one work unit. I'll run that and see what it's like before making plans. I'm only the 300th project member.

Hmm, I have an idea for the next AA.  :wink
AA's > 1-Malaria 2-Tanpaku 3-Riesl Siev 4-Seti 5-ABC 6-Einstein 7-WCG 8-Seti 9-QMC 10-WCG 11-Cosmo 12-ABC 13-MilkyWay 14-3x+1 15-Rosetta 16-ABC 17-MilkyWay 18-Einstein 19-WCG 20-WCG 21-Poem 22-Rosetta 23-Docking 24-Spinhenge 25-Alternate 26-Simap 27-Alternate 28-Constellation 29-WCG 30-Edges 31-Alternate 32-Pogs 33-WCG 34-Seti 35-Pogs 36-Poem 37-Pogs 38-Asteroids 39-Pogs 40-Simap 41-Pogs 42-Seti


Dataman

I got 4 but big problems ... NO CHECKPOINTS????? I did my normal weekly PM on that machine and the tasks restarted at 0 time. No can do on 14-15 hour jobs. Anyone getting checkpoints?  :hbang:


LawryB

Checkpoints?  Was crunching 6 WUs that were between 4 and 75 percent complete.  
Had to power down for a couple of hours and when I restarted BOINC all the work units showed zero percent complete.
Now the weird bit -  After a couple of minutes four of the WUs shot back up to the percent complete stage they were at when I turned the machine off.  The other two did not.

I have to do the same thing later today.  I will take much more notice of the result so I can get onto the forum for answers if needs be.

EDIT:  A visit to the project forum shows problems with Windows, while Linux and Macs work OK.  It is under investigation at this time.


JugNut

Thanks for the update Lawry.  I've had similar problems as well.  As I may have mentioned before, I have two i7's: one is 100% stable & one is at best 85% stable & much too much of a risk on 12hr+ non-checkpointing WU's.  I've stopped crunching on it & have other work that needs crunching also.  So while I'll keep a few cores with the light on, most of my effort will have to resume in a few days' time (hopefully when the checkpointing issue is fixed).

Cheers ..JN..


 - Participated in AA's 27 - 55 & Team Challenge # 1.
My team (Boinc@Australia) stat's
My personal stat's


     Crunching today for a better tomorrow...

Dataman

Yep, definitely not checkpointing on Windows. Have set it to no new tasks until I see something in the forum.

HOT here and not much crunching going on. 106F

:aus1: :cheers: :US


Dingo

Sorry guys I thought it was fixed for all Operating Systems.   :boom:


Radioactive@home graph
Have a look at the BOINC@AUSTRALIA Facebook Page and join and also the Twitter Page.

Proud Founder and member of BOINC@AUSTRALIA

My Luck Prime 1,056,356 digits.
Have a look at my  Web Cam of Parliament House Ottawa, CANADA

Dataman

No harm no foul, Dingo. I'll keep watching.  :bloodshot


kashi

I had one task restart from a checkpoint successfully. "Leave applications in memory while suspended" was turned off when my preferences accidentally got reset. Have not tried exiting BOINC or rebooting though. Restart at fit 24 is shown in Stderr output:
"04:33:47 (4296): wrapper: running fit_sed (24 filters.dat observations.dat)
wrapper: starting
05:09:47 (2548): wrapper: running fit_sed (24 filters.dat observations.dat)"

After the checkpoint restart, Elapsed time as shown in BOINC Manager reset to zero or one  "fit" unit's runtime but percentage completed was correct. This mucked up "Remaining (estimated)" time too. This is all to do with how the wrapper mechanism interacts with BOINC.

When the task's Elapsed time reset I was concerned that the checkpoint hadn't worked but task completed without any trouble. 

All the tasks I have looked at are successfully writing checkpoint files, as seen in the wrapper_checkpoint.txt file. Because this application uses a wrapper, "CPU time at last checkpoint" does not show anything in BOINC Manager task properties. It will only show something there (one fit duration) if the application has actually restarted from a checkpoint.

At least that's how a restart was shown on this Windows 7 computer. Other operating systems may behave and/or report things differently.

LawryB

All cool Dingo, gives me a chance to clear my cache/s.


kashi

What I meant to imply by my previous post is that due to the way it is reporting some people may think that checkpointing is not working when it is. For example, Dataman's 4 aborted tasks show 3 fits successfully completed and 2 restarts on fit 4, although what was shown to him by BOINC Manager made him think those tasks had restarted from the beginning.

Wait for one fit duration (often about 2% of total task time) then have a look in your stderr.txt file for that task in your BOINC slots folder if you think checkpointing hasn't worked.
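If you would rather not read through stderr.txt by eye, here is a rough Python sketch of the same check. It only assumes the line format shown in the Stderr output quoted earlier in this thread ("wrapper: running fit_sed (24 filters.dat observations.dat)"); the function names are made up for illustration and are not part of BOINC or the POGS application. A fit number that shows up more than once means the wrapper started that fit again, which is what a restart from a checkpoint looks like.

```python
import re
from collections import Counter

# Matches lines such as:
# 04:33:47 (4296): wrapper: running fit_sed (24 filters.dat observations.dat)
# The format is taken from the stderr excerpt in this thread (an assumption
# that other tasks log the same way).
RUN_LINE = re.compile(r"wrapper: running fit_sed \((\d+)\s")


def fit_starts(stderr_text):
    """Return the fit numbers in the order the wrapper started them."""
    return [int(m.group(1)) for m in RUN_LINE.finditer(stderr_text)]


def restarted_fits(stderr_text):
    """Map each fit started more than once to its number of extra starts."""
    counts = Counter(fit_starts(stderr_text))
    return {fit: n - 1 for fit, n in counts.items() if n > 1}


# The excerpt posted earlier in the thread.
sample = """04:33:47 (4296): wrapper: running fit_sed (24 filters.dat observations.dat)
wrapper: starting
05:09:47 (2548): wrapper: running fit_sed (24 filters.dat observations.dat)"""

print(fit_starts(sample))      # [24, 24]
print(restarted_fits(sample))  # {24: 1}
```

To use it on a real task, read the stderr.txt from that task's slot folder into a string and pass it to restarted_fits. If the result only lists the fit that was running when BOINC stopped, the checkpoint worked even though BOINC Manager showed the elapsed time going back to zero.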

JugNut

Does anyone use BoincTasks? It tracks checkpoints in real time & also CPU usage. (It may have to be activated in settings?)

It paints a strange picture, checkpoints are always in the red (overtime) & CPU usage shows at between 2 & 15% also in the red? What the Fred does that mean?

Anyway it's all good Dingo just minor teething probs, we'll crunch up a storm when all the small bugs are fixed.. :wink

Cheers ..JN..



kashi

If BoincTasks is based on the same information as BOINC Manager, it will have the same reporting limitations with these POGS wrapper tasks. In other words it will not display checkpoints or CPU time correctly, the same as BOINC Manager. With these wrapper tasks on Windows the CPU time resets at the end of each fit, that's why most completed Windows tasks show zero CPU time reported on the POGS website after the concatenation processing. For the same reason, during the task processing the BOINC Manager CPU time in Properties shows the time since the last fit was started, which is also the time since the last checkpoint.

LawryB


Just turned on my PC and hello hello, ALL 9 POGS WUs restarted correctly except for the time remaining figures (as explained in Kashi's earlier post).
Checked all the files as described on the forum and they are all there and contain the right information.  I guess I got caught out by the BOINC display as well.

When is the next AA??


Dingo

The next AA starts on the 1st September.  I have sent a message to Kevin at theSkyNet to see if he thinks it can handle an AA.

Quote

Hi Kevin,

Doing a great job with the project. The only complaint at present from my members is that checkpointing is not working on Windows PCs.

Four times a year the Team has an Aussie Assault (AA), where the team picks a project and we get as many Team members as possible to work on it at one time. This brings the Team together, and we usually pick a project where we need a boost in the standings.

The next AA is due in September and I am thinking that I will propose this project, as it is the first Australian project. The team has over 4000 members but only about 1000 usually join, but because the project is Australian there may be more.

Do you think your project can handle the work load, and will there be enough work available between 01 - 14 September 2012?

Cheers
Dingo


I just got his reply. It looks like the project may be ready, but he doesn't know yet, especially if he needs to write his own wrapper.
Quote
I think I'll have to ditch the wrapper and write my own :-(

I've just got 216 new galaxies - so maybe....

We can test it over the next two weeks

I will put it up as an option and if it is not ready we can do the AA on the alternate.

