Message boards : News : Most ambitious job yet - Lots of work to be done

Author Message
Profile Jack.Harris
Project administrator
Project developer
Project scientist
Joined: 24 Apr 07
Posts: 507
Credit: 761,261
RAC: 0
Message 2127 - Posted: 30 Aug 2012, 1:05:02 UTC
Last modified: 30 Aug 2012, 1:15:16 UTC

'Change Signal Full Mesh with Fatigue 3' was just launched. It consists of over a million workunits studying the effects of a fatigued cognitive system.
At current throughput it will get done in ....... February
o_O
-Jack
____________
MindModeling@Home is fun

Profile Nosferatu*
Joined: 11 Aug 11
Posts: 8
Credit: 1,488,080
RAC: 0
Message 2128 - Posted: 30 Aug 2012, 3:13:29 UTC - in response to Message 2127.

Does this mean that MindModeling will be running continuously until about February?

Profile Jack.Harris
Project administrator
Project developer
Project scientist
Joined: 24 Apr 07
Posts: 507
Credit: 761,261
RAC: 0
Message 2129 - Posted: 30 Aug 2012, 12:37:45 UTC - in response to Message 2128.

Right now there are months of work ready to process.

The process responsible for packaging workunits has millions of workunits ready to create and will keep the BOINC queue at around 8000 for a long time.

Whether or not it turns out to be February or sooner is of course dependent on the crunch level.

But yes, there does appear to be a large amount of work to do.
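
As a rough illustration of how a packaging process can hold a queue near a target level, here is a minimal sketch. The function name, its parameters, and the exact target of 8000 are assumptions made for the example; this is not the project's actual packaging code.

```python
def workunits_to_create(queued: int, pending: int, target: int = 8000) -> int:
    """Return how many new workunits to package on this pass.

    queued  -- workunits currently waiting in the queue
    pending -- workunits that could still be created for the job
    target  -- assumed queue level to hold (the post says "8000ish")
    """
    shortfall = max(0, target - queued)  # never negative if the queue is full
    return min(shortfall, pending)       # cannot create more than remain


if __name__ == "__main__":
    print(workunits_to_create(queued=7500, pending=1_000_000))
```

Run on a timer, a rule like this only tops the queue up as volunteers drain it, which is why the queue size alone says little about how much work is left.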





____________
MindModeling@Home is fun

zombie67 [MM]
Volunteer tester
Joined: 25 Jan 08
Posts: 86
Credit: 2,908,280
RAC: 56
Message 2130 - Posted: 30 Aug 2012, 12:49:16 UTC - in response to Message 2129.

This is great news! About how long will these tasks run? Has the task-never-ends bug been solved yet?
____________
Dublin, California
Team: SETI.USA

Gary Wilson
Joined: 25 Nov 08
Posts: 56
Credit: 2,798,456
RAC: 2,877
Message 2131 - Posted: 30 Aug 2012, 13:14:05 UTC - in response to Message 2130.

We'll keep crunching.

One question though. Can you change the # of tasks per machine to at least 2 per core? Right now it appears to be 1 per core, which makes it harder to work multiple projects on one machine. When a core finishes its task, there isn't another work unit yet (starts downloading it), so the other project starts up while a new unit is downloaded. It then doesn't switch back to your project until boincmgr decides to, so the resource share # I set doesn't work as well in terms of hinting to boincmgr which project should get priority.

Thanks and keep the tasks coming!

Profile Jack.Harris
Project administrator
Project developer
Project scientist
Joined: 24 Apr 07
Posts: 507
Credit: 761,261
RAC: 0
Message 2132 - Posted: 30 Aug 2012, 14:07:59 UTC - in response to Message 2130.

zombie,

Never-ending tasks have been fixed!

The current application dynamically reconfigures the amount of work done by a workunit to fit the capabilities of the processing machine and the individual runtimes of the simulations within that workunit.

There are also mechanisms in the app for determining when enough simulations are complete and too much runtime has passed for a workunit.


The default settings ensure that a workunit should take no more than a few minutes longer than 1 hour on average, and no workunit should ever take longer than 2 hours.

Backend server processes then identify the simulations not completed within a given workunit and redistribute those simulations appropriately.
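
The stop rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class, the field names, and the `min_completed` threshold are hypothetical, and only the 1 hour target and 2 hour cap come from the post.

```python
from dataclasses import dataclass

# Illustrative constants taken from the post; everything else is assumed.
TARGET_SECONDS = 60 * 60          # aim for roughly 1 hour per workunit
HARD_LIMIT_SECONDS = 2 * 60 * 60  # no workunit should run longer than 2 hours


@dataclass
class WorkunitState:
    completed: int      # simulations finished so far
    planned: int        # simulations assigned to this workunit
    elapsed: float      # runtime so far, in seconds
    min_completed: int  # hypothetical "enough simulations" threshold


def should_stop(state: WorkunitState) -> bool:
    """Decide whether the workunit should stop and report its results."""
    if state.completed >= state.planned:
        return True  # everything assigned has been run
    if state.elapsed >= HARD_LIMIT_SECONDS:
        return True  # hard cap, regardless of progress
    # Enough simulations are done and the target runtime has passed.
    return (state.completed >= state.min_completed
            and state.elapsed >= TARGET_SECONDS)


def unfinished(state: WorkunitState) -> int:
    """Simulations the server would need to redistribute."""
    return max(0, state.planned - state.completed)
```

A workunit that stops early simply reports fewer simulations, and the count from `unfinished` is what the backend would hand out again.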

____________
MindModeling@Home is fun

Profile Nosferatu*
Joined: 11 Aug 11
Posts: 8
Credit: 1,488,080
RAC: 0
Message 2133 - Posted: 30 Aug 2012, 14:08:16 UTC - in response to Message 2131.

I also agree that 2 work units per core would be very helpful, as I do not have a fast download speed. This would allow my PCs to have the next WU ready to process, so no delay in work would occur.

Profile Jack.Harris
Project administrator
Project developer
Project scientist
Joined: 24 Apr 07
Posts: 507
Credit: 761,261
RAC: 0
Message 2134 - Posted: 30 Aug 2012, 14:11:08 UTC - in response to Message 2131.
Last modified: 30 Aug 2012, 14:11:49 UTC

Gary, Nosferatu*,

That is a great suggestion.
I just made the change.

Jack
____________
MindModeling@Home is fun

Gary Wilson
Joined: 25 Nov 08
Posts: 56
Credit: 2,798,456
RAC: 2,877
Message 2135 - Posted: 30 Aug 2012, 14:31:59 UTC - in response to Message 2134.

Thanks!

zombie67 [MM]
Volunteer tester
Joined: 25 Jan 08
Posts: 86
Credit: 2,908,280
RAC: 56
Message 2136 - Posted: 30 Aug 2012, 21:08:26 UTC - in response to Message 2132.
Last modified: 30 Aug 2012, 21:21:54 UTC

All great news! Thanks!

Another question/suggestion: Would it be possible to have a due date further out than one day? Or is this one of those projects where tasks are iterative, each being built on a recently returned task?
____________
Dublin, California
Team: SETI.USA

Profile Jack.Harris
Project administrator
Project developer
Project scientist
Joined: 24 Apr 07
Posts: 507
Credit: 761,261
RAC: 0
Message 2137 - Posted: 30 Aug 2012, 23:28:05 UTC - in response to Message 2136.
Last modified: 30 Aug 2012, 23:29:32 UTC

The current large job is not iterative (though other jobs can be).
However, some of the other jobs in the system are quite small and get through the system in less than 2 days.

I will see if we can set the due date dynamically based on how big the job is -- larger jobs that are not running iteratively get longer due dates; short jobs and iterative work get shorter due dates.

Actually, perhaps a better solution is to base the due date not on the job size but on the size of what's left to process for a job.

Thanks for the suggestion -- properly setting due dates will minimize the redundancy that pops up in the system from unreturned work and thereby maximize the throughput of the system.
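
One way to picture a due date that depends on the work remaining is the sketch below. The function, the 1 to 7 day bounds, and the assumed throughput of 50,000 workunits per day are all hypothetical values chosen for illustration, not project settings.

```python
from datetime import timedelta


def due_date_window(remaining_workunits: int,
                    iterative: bool,
                    min_days: float = 1,
                    max_days: float = 7,
                    workunits_per_day: int = 50_000) -> timedelta:
    """Pick a deadline window from the amount of work left in a job.

    Iterative jobs always get the shortest window, because later work
    depends on recently returned results. For other jobs the window
    grows with the estimated days of work remaining, clamped to
    [min_days, max_days]. All defaults are assumed values.
    """
    if iterative:
        return timedelta(days=min_days)
    days_left = remaining_workunits / workunits_per_day
    return timedelta(days=max(min_days, min(max_days, days_left)))


if __name__ == "__main__":
    print(due_date_window(1_000_000, iterative=False))
    print(due_date_window(10_000, iterative=False))
```

Because the window shrinks as a job nears completion, stragglers near the end are reissued sooner, while early workunits in a large job are given more slack.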

--Jack
____________
MindModeling@Home is fun

Profile Nosferatu*
Joined: 11 Aug 11
Posts: 8
Credit: 1,488,080
RAC: 0
Message 2138 - Posted: 31 Aug 2012, 1:19:50 UTC - in response to Message 2137.

Thanks for the change to 2 wu per core.
Nosferatu*

Mumps [MM]
Joined: 12 Apr 10
Posts: 2
Credit: 16,206,522
RAC: 289
Message 2175 - Posted: 19 Sep 2012, 3:48:48 UTC

Is there any way the project could list the number of WUs yet to be processed for any given job, so we can see how much is left beyond what's already staged or the percentage on the Home page?

Darkknight900
Joined: 15 Sep 12
Posts: 16
Credit: 160,741
RAC: 0
Message 2184 - Posted: 21 Sep 2012, 22:10:02 UTC - in response to Message 2175.

I asked myself the same question! It would be just great to know!

By the way, I sometimes get the bug with the WU that never finishes. I just read that this was resolved last month... I have noticed it about once per 30-40 WUs, not every time, but often after a restart or suspension of the task.

Profile Nosferatu*
Joined: 11 Aug 11
Posts: 8
Credit: 1,488,080
RAC: 0
Message 2185 - Posted: 22 Sep 2012, 5:22:52 UTC

I also seem to run into the never-ending tasks. It appears to happen on some PCs more than others. I have woken up in the morning to find my 8-core 8150 crunching on the same tasks for up to 8 or 9 hours. The only thing that seems to end the tasks is to log off and back on again; after a brief time (1 to 2 minutes) the task is returned and credit is granted. When you look at the completed tasks, they show some ridiculous amounts of processing time.

Nosferatu*

Profile Jack.Harris
Project administrator
Project developer
Project scientist
Joined: 24 Apr 07
Posts: 507
Credit: 761,261
RAC: 0
Message 2193 - Posted: 24 Sep 2012, 2:57:10 UTC - in response to Message 2185.

Mumps -- (WUs left)
It is hard to say exactly how many WUs are left for the currently running job. WUs are created iteratively.
We know how many individual simulations have been run out of the total set, so we should be able to estimate how many WUs that should be, but right now we do not have a good method for calculating that dynamically. We'll look into this some more -- I suppose "lots" isn't a very good answer for now :)
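
For illustration, the estimate described above amounts to dividing the simulations still to run by the average number of simulations a workunit carries. The sketch below assumes that average is known, which the post says is not yet the case for the real system; the function and parameter names are hypothetical.

```python
import math


def estimate_workunits_left(total_sims: int,
                            completed_sims: int,
                            avg_sims_per_workunit: float) -> int:
    """Rough count of workunits still needed to finish a job.

    avg_sims_per_workunit is an assumed input; because workunit sizes
    are adjusted dynamically, it would only be an average.
    """
    if avg_sims_per_workunit <= 0:
        raise ValueError("avg_sims_per_workunit must be positive")
    remaining = max(0, total_sims - completed_sims)
    return math.ceil(remaining / avg_sims_per_workunit)
```

Since workunit sizes change with host speed and simulation runtime, any such figure would be an estimate that drifts as the job progresses.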

Darkknight900--
Nosferatu*--
I am very troubled to hear that you have recently experienced a never-ending WU. I am not exactly sure how this could have happened. Our last fix (version 180+) definitely removed a lot of problems resulting from divergent simulations (e.g., infinite loops). Can you give me a result link if it happens again? Thanks.


--Jack
____________
MindModeling@Home is fun

Profile Jack.Harris
Project administrator
Project developer
Project scientist
Joined: 24 Apr 07
Posts: 507
Credit: 761,261
RAC: 0
Message 2196 - Posted: 24 Sep 2012, 17:16:07 UTC - in response to Message 2193.

Mumps --
Right now 1.51 million workunits have been queued up for the 'Change Signal Full Mesh with Fatigue 5' job.

____________
MindModeling@Home is fun

Profile Nosferatu*
Joined: 11 Aug 11
Posts: 8
Credit: 1,488,080
RAC: 0
Message 2217 - Posted: 28 Sep 2012, 1:00:40 UTC - in response to Message 2193.
Last modified: 28 Sep 2012, 1:06:36 UTC

I hope this is what you are asking for. I will supply several examples. All tasks have been aborted. I can supply some that complete if I log off and back on again the next time it happens if you need them.

http://mindmodeling.org/beta/result.php?resultid=2711949
http://mindmodeling.org/beta/result.php?resultid=2785376
http://mindmodeling.org/beta/result.php?resultid=2759654

If you need more information please let me know.
Nosferatu*

Profile Nosferatu*
Joined: 11 Aug 11
Posts: 8
Credit: 1,488,080
RAC: 0
Message 2218 - Posted: 28 Sep 2012, 11:17:01 UTC

@ Jack.Harris. Please note I don't seem to have this problem with Python tasks.

Nosferatu*

Profile Jack.Harris
Project administrator
Project developer
Project scientist
Joined: 24 Apr 07
Posts: 507
Credit: 761,261
RAC: 0
Message 2219 - Posted: 28 Sep 2012, 12:05:25 UTC - in response to Message 2217.

It appears that the final simulation was not completing gracefully and its output was truncated.
Each result looked like it ran a little over 2 hours (but < 2.5 hours) -- our watchdog built into the wrapper would (should) have done a hard kill of those processes at 2.5 hours and then sent back all the good results.

Thanks for sending us this information -- we will look into why the final simulation did not terminate properly and into reducing the time the watchdog waits before stepping in and cleaning up.
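
A watchdog of the kind mentioned above can be sketched with Python's standard `subprocess` module. This is a generic illustration, not the project's wrapper: the function name is hypothetical, and the 2.5 hour limit would be passed in as `limit_seconds`.

```python
import subprocess
import sys


def run_with_watchdog(cmd, limit_seconds):
    """Run cmd and hard-kill it if it exceeds limit_seconds.

    Returns (returncode, timed_out). A real wrapper would then collect
    whatever good results the simulation had already written.
    """
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=limit_seconds)
        return proc.returncode, False
    except subprocess.TimeoutExpired:
        proc.kill()  # hard kill, no graceful shutdown
        proc.wait()  # reap the process so it does not linger
        return proc.returncode, True


if __name__ == "__main__":
    # Stand-in for a simulation that never terminates on its own.
    hung = [sys.executable, "-c", "import time; time.sleep(60)"]
    print(run_with_watchdog(hung, limit_seconds=1))
```

Shortening `limit_seconds` is the knob that corresponds to reducing how long the watchdog waits before stepping in.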
Jack


____________
MindModeling@Home is fun

Copyright © 2017 MindModeling.org