Message boards : Number crunching : No work?

Swordfish
Joined: 1 Mar 08
Posts: 2
Credit: 1,018,061
RAC: 0
Message 1058 - Posted: 10 Oct 2008, 17:01:41 UTC

I've got four machines sitting idle with no work. I went ahead and added a second project so this doesn't happen, but is there any reason why we've run out of work? This seems to happen to my machines a lot.

petros
Joined: 22 Nov 08
Posts: 11
Credit: 4,186
RAC: 0
Message 1115 - Posted: 26 Nov 2008, 1:09:21 UTC

My machine is slowly running out of work too; hope new ones are comin' soon.
____________

Tom
Volunteer moderator
Joined: 23 Jun 08
Posts: 490
Credit: 238,767
RAC: 0
Message 1126 - Posted: 30 Nov 2008, 20:21:00 UTC - in response to Message 1115.

The current job we are running is finishing up. Most work units have been allocated, so even though the job is only 75% complete no more WU's will be sent out to be computed. New research is on the way!

GrumpyGuy
Joined: 3 Nov 08
Posts: 1
Credit: 133,850
RAC: 0
Message 1178 - Posted: 1 Jan 2009, 20:36:24 UTC

Is it a holiday thing, or are there just no new work units? PS: Happy New Year :D

Paul D. Buck
Joined: 13 Apr 08
Posts: 14
Credit: 57,599
RAC: 0
Message 1179 - Posted: 10 Jan 2009, 19:26:42 UTC

The site says it has work ...

So, why can't my Mac Pro get any? I ask for 16K seconds and get 0 tasks ... are you issuing work only to Windows machines?

I did a project reset, whereupon I was told that activation.png and sideview.png don't exist ... trying on my XP machine, same missing files ...

Sadly, it wants no work at the moment ... so I can't tell you if it is an OS thing or a global thing ...
____________

laurenu2
Joined: 26 Dec 08
Posts: 10
Credit: 5,485,742
RAC: 0
Message 1181 - Posted: 11 Jan 2009, 7:07:13 UTC - in response to Message 1179.

I get the same thing on an XP OS, so you're not alone.
Something is misdirected on the new server.

Paul D. Buck
Joined: 13 Apr 08
Posts: 14
Credit: 57,599
RAC: 0
Message 1182 - Posted: 11 Jan 2009, 8:19:18 UTC - in response to Message 1181.

I get the same thing on an XP OS, so you're not alone.
Something is misdirected on the new server.


Well, I should have gotten something on at least one of my machines, but nada ...

Thanks for the note ... nice to know I am not alone ... though I can see by the percentage done of the task (which hasn't changed) that work is not being issued.

You would think that SOMEONE would be watching to see that work was coming out ...
____________

Paul D. Buck
Joined: 13 Apr 08
Posts: 14
Credit: 57,599
RAC: 0
Message 1183 - Posted: 12 Jan 2009, 2:11:49 UTC

Um, I don't know if any project types pass through ...

But, some of us can't get work ...

I don't know why ... but I tried project reset and now a detach and attach ... I know someone is getting work ... but some of us are not so lucky ... really, I would like to help ... but cannot if you won't issue work to my computers ....
____________

Viking69
Joined: 1 Feb 08
Posts: 17
Credit: 32,165
RAC: 0
Message 1184 - Posted: 12 Jan 2009, 3:38:16 UTC - in response to Message 1126.

The current job we are running is finishing up. Most work units have been allocated, so even though the job is only 75% complete no more WU's will be sent out to be computed. New research is on the way!



OK, that was from November '08. You state that you have new servers. The server status pages show no new work available. The homepage shows that a project is not yet complete.

What's UP?
____________
What is this Mind control....?

YES, I WILL CRUNCH WU's>>>>......

laurenu2
Joined: 26 Dec 08
Posts: 10
Credit: 5,485,742
RAC: 0
Message 1185 - Posted: 12 Jan 2009, 7:05:14 UTC

Well, now I am getting downloads, BUT all the WUs are logged as failed downloads.

Lots of bugs to work on this week.

Paul D. Buck
Joined: 13 Apr 08
Posts: 14
Credit: 57,599
RAC: 0
Message 1186 - Posted: 12 Jan 2009, 12:24:02 UTC - in response to Message 1185.

Well, now I am getting downloads, BUT all the WUs are logged as failed downloads.

Lots of bugs to work on this week.


At least you get assigned tasks ...

I get 0 tasks all systems ...

Though when I looked last night, the current sub-project was only 10-13% complete and there was no work pending ...

I think Viking69 had a good question ...
____________

Tom
Volunteer moderator
Joined: 23 Jun 08
Posts: 490
Credit: 238,767
RAC: 0
Message 1187 - Posted: 12 Jan 2009, 16:57:16 UTC
Last modified: 12 Jan 2009, 17:21:28 UTC

Hey crunchers,

I understand your concern with the current job status, and I'll try to explain the situation and why you're getting the messages you're getting.

Basically, for our research we use an algorithm that allows us to (hopefully) cut the number of computations in half. We sample a parameter space, do some computations, and if the parameter space contains all the information we need, we're done. If the parameter space does not contain all the information we need, however, we have to order new parameter sets and do more computations. Thus, instead of sending out all parameter points at once, we do them in chunks. That is why there can be 90% of the parameter space left to be examined and no work to be handed out. Sometimes we only send out about 5% of the total parameter space and we have to wait for the results to come back to do more.

For this particular job (SASTNM_9_6), we are sending out about 200 parameter sets per WU because the computations are relatively small and can be done really, really fast. If we start doing fewer than 200 parameter sets per WU, there is a lot of overhead between uploading and downloading. Thus, it becomes a tradeoff: overhead vs. fairness. With a high number of parameter sets per WU, fewer volunteers get numbers to crunch (until a larger space is examined).

To remedy this I will reduce the number of parameter sets per WU to 50 and see how well they are allocated amongst the volunteers and what type of overhead we suffer from it. If the problem still persists or if my explanation wasn't clear, please post to this forum or send me a message and I will respond accordingly.
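
The overhead vs. fairness tradeoff described above can be illustrated with a little arithmetic. This is only a rough sketch, not the project's actual work generator: the chunk size of 10,000 parameter sets is a made-up number, and the function names are hypothetical. It simply shows that packing fewer parameter sets into each WU produces more WUs from the same chunk, so more volunteers can get one, at the cost of more upload/download round trips.

```python
import math


def work_units_needed(points_in_chunk: int, sets_per_wu: int) -> int:
    """Number of work units needed to cover one chunk of parameter sets."""
    if sets_per_wu <= 0:
        raise ValueError("sets_per_wu must be positive")
    # The last WU may be only partly full, so round up.
    return math.ceil(points_in_chunk / sets_per_wu)


def compare_packings(points_in_chunk: int, per_wu_options: list[int]) -> dict[int, int]:
    """Map each packing option (sets per WU) to the resulting WU count."""
    return {k: work_units_needed(points_in_chunk, k) for k in per_wu_options}


# Hypothetical chunk size; the thread does not state the real number.
chunk = 10_000
for sets_per_wu, wus in compare_packings(chunk, [200, 50]).items():
    # Each WU costs one download and one upload, so the WU count is
    # also a rough proxy for transfer overhead.
    print(f"{sets_per_wu} sets/WU -> {wus} work units")
```

With these assumed numbers, going from 200 to 50 parameter sets per WU quadruples the number of WUs available from the same chunk, and quadruples the number of transfers along with it.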

Paul D. Buck
Joined: 13 Apr 08
Posts: 14
Credit: 57,599
RAC: 0
Message 1189 - Posted: 12 Jan 2009, 17:33:47 UTC - in response to Message 1187.
Last modified: 12 Jan 2009, 17:39:55 UTC

Hey crunchers,

I understand your concern with the current job status, and I'll try to explain the situation and why you're getting the messages you're getting.

Basically, for our research we use an algorithm that allows us to (hopefully) cut the number of computations in half. We sample a parameter space, do some computations, and if the parameter space contains all the information we need, we're done. If the parameter space does not contain all the information we need, however, we have to order new parameter sets and do more computations. Thus, instead of sending out all parameter points at once, we do them in chunks. That is why there can be 90% of the parameter space left to be examined and no work to be handed out. Sometimes we only send out about 5% of the total parameter space and we have to wait for the results to come back to do more.

For this particular job (SASTNM_9_6), we are sending out about 200 parameter sets per WU because the computations are relatively small and can be done really, really fast. If we start doing fewer than 200 parameter sets per WU, there is a lot of overhead between uploading and downloading. Thus, it becomes a tradeoff: overhead vs. fairness. With a high number of parameter sets per WU, fewer volunteers get numbers to crunch (until a larger space is examined).

To remedy this I will reduce the number of parameter sets per WU to 50 and see how well they are allocated amongst the volunteers and what type of overhead we suffer from it. If the problem still persists or if my explanation wasn't clear, please post to this forum or send me a message and I will respond accordingly.


Tom,

Thanks for the explanation ...

But, from the way I am understanding it ... well ... you don't have much need for us at all.

Perhaps I am mistaken ... but if the computations can be done really fast ... then why use BOINC?

In the past, we had a relatively continual flow of work and this sudden drought just has us scratching our heads. It is not unheard of, just odd happening so suddenly.

I guess that the news on the front page kinda got us thinking that there was a good amount of work to be done ... Just a thought ... so we don't get into a dither, consider doing what SIMAP does ... they clearly state the size of the batch so that those of us that are rabid about supporting a particular project can tell if the flow of work is expectedly small or whatever ...

EX:

The new batch of work named "New batch of work" will have an initial run of 250 tasks and if the parameter space is not filled we will follow with a larger batch of 26,231,596 tasks in a couple of days.

With that type of announcement, we can judge if there is a reason for the queues to be empty, or whatever ... what had us in a lather is that there was an announcement of work ... then apparently, no work ... I know *I* was excited because I just ramped up my shares for Malaria Control and MM-Beta in anticipation of throwing you more support just in time for both projects to seemingly crater on me ...

Not that it matters in the global scheme of things, but like Rosetta (who can't seem to create an application that runs correctly) you guys just fell back down the priority list as I turn attention to other projects ...

{edit}
Well, there is work, can't get it because of errors though ...
{/edit}
____________

Tom
Volunteer moderator
Joined: 23 Jun 08
Posts: 490
Credit: 238,767
RAC: 0
Message 1191 - Posted: 12 Jan 2009, 17:45:21 UTC - in response to Message 1189.

Thanks for the suggestion. I will definitely make it clearer during our next job submission whether there will be lapses in work. However, we are currently working on a new algorithm to make crunching as smooth and constant as possible, and it should be finished sometime this week.

As for the computation errors, I'll examine that right away.

Tom
Volunteer moderator
Joined: 23 Jun 08
Posts: 490
Credit: 238,767
RAC: 0
Message 1192 - Posted: 12 Jan 2009, 18:19:31 UTC
Last modified: 12 Jan 2009, 20:12:28 UTC

The errors seem to be coming from a versioning issue.

The application versions in our database did not match the applications in our file system, so when BOINC tried to download them it couldn't find the right location/directory. I updated the correct versions, so the mind modeling applications should be downloaded properly now.
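
For anyone curious what this kind of mismatch looks like, here is a minimal sketch of a consistency check. It is purely illustrative and is not BOINC's real schema or tooling: the `(app_name, version)` pairs, the app name `mm_app`, and the function name are all assumptions made up for the example.

```python
def find_version_mismatches(db_versions, fs_versions):
    """Compare (app_name, version) pairs from a database with those on disk.

    Returns two sorted lists: pairs registered in the database but missing
    on disk, and pairs present on disk but not registered in the database.
    """
    db = set(db_versions)
    fs = set(fs_versions)
    return sorted(db - fs), sorted(fs - db)


# Hypothetical data; the app name and version numbers are invented.
db = [("mm_app", 101), ("mm_app", 102)]
fs = [("mm_app", 101), ("mm_app", 103)]

missing, unregistered = find_version_mismatches(db, fs)
print("In database but not on disk:", missing)
print("On disk but not in database:", unregistered)
```

A version that is registered but missing on disk is the case that would produce "file not found" style download failures on the client side.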

(Note: This does not fix the Vista machine computation errors)

Paul D. Buck
Joined: 13 Apr 08
Posts: 14
Credit: 57,599
RAC: 0
Message 1197 - Posted: 12 Jan 2009, 23:09:15 UTC - in response to Message 1192.
Last modified: 12 Jan 2009, 23:21:57 UTC

The errors seem to be coming from a versioning issue.


Um, can't test the executable issue though I did a project reset, now I get:


Mon Jan 12 15:08:34 2009|MindModeling@Beta|Started download of mm_ProjectIcon_BETA_01.png
Mon Jan 12 15:08:34 2009|MindModeling@Beta|Started download of activation.png
Mon Jan 12 15:08:35 2009|MindModeling@Beta|Giving up on download of mm_ProjectIcon_BETA_01.png: file not found


and:


Mon Jan 12 15:08:35 2009|MindModeling@Beta|Giving up on download of activation.png: file not found
Mon Jan 12 15:08:35 2009|MindModeling@Beta|Started download of sideview.png

Mon Jan 12 15:08:36 2009|MindModeling@Beta|Giving up on download of sideview.png: file not found

but as there is no work it does not try to download a specific executable... Even more interesting to me now is the fact that if I include the last line in the quote I lose the last of the post when previewing or posting ...

Not sure why ... but I find it fascinating ...

Sigh...

{edit}
I tried several other forums on other projects and none of them seem to have the problem. I noticed that my post in the mac questions area was likewise truncated with the last line ... so, I don't know what is happening, or why ... but ... now you know ...
{/edit}
____________

Paul D. Buck
Joined: 13 Apr 08
Posts: 14
Credit: 57,599
RAC: 0
Message 1198 - Posted: 13 Jan 2009, 4:39:04 UTC

Just got two tasks ... now to wait and see if they run ...
____________

Tom
Volunteer moderator
Joined: 23 Jun 08
Posts: 490
Credit: 238,767
RAC: 0
Message 1212 - Posted: 15 Jan 2009, 17:05:28 UTC
Last modified: 15 Jan 2009, 19:30:39 UTC

Hi Paul,

Thanks for all the information! You should be crunching far more often now that I've tweaked the config file to include fewer parameter sets per workunit. As for the quote/truncation problem... I have no idea. Unfortunately, it'll have to wait while I complete some of my more daunting tasks, haha.

Keep the comments comin'.

Tom

Speedy
Joined: 23 Mar 08
Posts: 8
Credit: 33,206
RAC: 0
Message 1216 - Posted: 18 Jan 2009, 7:47:38 UTC

Are work units at an all-time low, or is there just a problem creating them at present? As of 18 Jan 2009, 7:22:47 UTC: 0 unsent tasks and 880 tasks in progress. All servers are green.

Speedy

Tom
Volunteer moderator
Joined: 23 Jun 08
Posts: 490
Credit: 238,767
RAC: 0
Message 1217 - Posted: 18 Jan 2009, 19:52:53 UTC

We're in the last 2,000 points (about 200 WUs) in a batch, and we're waiting on returns. I am not at a computer where I can SSH into our server and make some adjustments, but I'll try to send out more WUs this afternoon. There is a scheduling issue with our current work distribution algorithm, and I'm currently trying to work around it. Sorry for the 0 new tasks.


Copyright © 2019 MindModeling.org