Message boards : Number crunching : The Deadline is too short !!!

MaynardVizzutti
Joined: 4 Apr 15
Posts: 3
Credit: 135,322
RAC: 0
Message 3745 - Posted: 21 May 2016, 3:13:07 UTC - in response to Message 3744.
Last modified: 21 May 2016, 3:13:07 UTC

Normally, this is only a minor annoyance for me, but there have been some jobs in the past that use more than 1GB per core and other projects' work units have failed with an "out of memory" error when switching. This can certainly be blamed on the other project serving up faulty code, but then, I don't have much control of that either.

The end result has been 50,000 or 60,000 CPU seconds wasted on failed tasks on more than one occasion.

The periods of no work let the long term project debt build up and then a new batch plays unfairly with short deadlines, sometimes wrecking other work units. This effect is much too large for a project I have configured to get only 4.5 percent of my spare cycles.

If it were not for the deadlines, fewer processes would run at one time, and the memory limitations would not be an issue, but this project behavior doesn't work well with my configuration. I don't think it's reasonable to adjust what works perfectly with ten other projects to try to satisfy this one.

I will be reluctantly disconnecting from this project and checking back in a few months' time to see if the project's basic approach has changed.

Brandon
Project administrator
Project developer
Project tester
Joined: 5 Jan 15
Posts: 267
Credit: 1,456,117
RAC: 0
Message 3747 - Posted: 23 May 2016, 12:57:38 UTC
Last modified: 23 May 2016, 12:57:38 UTC

Hello Crackenback,

On Friday I modified the job so it should not be sending out the shorter work units anymore. If you are still having issues please let me know.

Thanks for your support and happy crunching,

Brandon

Jacob Klein
Joined: 9 Oct 09
Posts: 37
Credit: 333,573
RAC: 0
Message 3748 - Posted: 23 May 2016, 15:12:26 UTC - in response to Message 3747.
Last modified: 23 May 2016, 15:12:26 UTC

What is the minimum deadline that we can expect for all future jobs for this application?

The answer to that question will determine whether I can allow new tasks or not.

Crackenback
Joined: 28 Aug 09
Posts: 3
Credit: 265,703
RAC: 0
Message 3749 - Posted: 23 May 2016, 21:23:56 UTC - in response to Message 3747.
Last modified: 23 May 2016, 21:23:56 UTC

Hi Brandon,
Unfortunately, no, the WUs are still short-dated. On the PCs where I am running only MM, they have same-day deadlines.
Cheers.

Brandon
Project administrator
Project developer
Project tester
Joined: 5 Jan 15
Posts: 267
Credit: 1,456,117
RAC: 0
Message 3750 - Posted: 24 May 2016, 13:17:09 UTC
Last modified: 24 May 2016, 13:18:22 UTC

Thanks for letting me know. I fixed it again; however, some of the work units already in the queue may still have a shorter deadline.

Thanks for your support and happy crunching,

Brandon

Gunde
Joined: 8 Feb 15
Posts: 26
Credit: 7,169,421
RAC: 0
Message 3753 - Posted: 24 May 2016, 14:35:41 UTC
Last modified: 24 May 2016, 14:56:46 UTC

The deadline is sometimes very short for some tasks. There are now shorter tasks, and those help to meet the deadline. I can run the tasks as they are now.

I have been watching my hosts for the last few days, and there are other issues besides the short deadline.
Three times, some of my hosts were refused when they tried to report tasks that had finished and uploaded. For the most part they report instantly after upload. When a report is refused, the client gets a countdown to the next update (24 hours). Tasks still in the manager are not reported in the meantime, and an uploaded task is not counted as reported at its upload time. Those tasks end up as "Completed, too late to validate".

Around 2-5% may be too late because of the deadline itself, but when they don't report I lose around 10-20% of tasks. One day there were 120 tasks that had been waiting 12 hours before I manually updated the project to get them reported.
All of those tasks got "Completed, too late to validate".

I do not want to force immediate reporting in app_config when the hosts already do it in most cases, but could the wait after a rejected update be shortened? A 24-hour wait will make any task too late to validate.

I think this is why most hosts end up too late. It is hard to notice if you don't watch the manager, because the WUs are cleared out quickly and show the same status as tasks that did not finish before the deadline.

This has been an issue for other projects too, not only MindModeling@Home, but I do not know whether they solved it for upload and report.

I will keep running, but it's a big waste of compute power when a task is sent out to another host even though my result is valid, and the extra copy serves no purpose when "minimum quorum 1" and "initial replication 1" are set for those WUs.

Is there a problem running this batch cross-platform? If possible, it would be great if we could use the Linux app too when there is a big batch of WUs:
Native Python v2.7 Application (Cross Platform)

Keep up the good work. I hope WE can solve this before people stop contributing because of the short deadlines.
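
The reporting problem described in this post can be shown with a minimal Python sketch. It assumes the server only compares the time a result is reported against the deadline; this is a simplification, not the project's actual validator logic, and the function name, status strings, and timestamps are all hypothetical.

```python
from datetime import datetime, timedelta


def validation_status(received, deadline_hours, finished, reported):
    """Classify a task by when its result reaches the server (simplified model)."""
    deadline = received + timedelta(hours=deadline_hours)
    if reported <= deadline:
        return "on time"
    if finished <= deadline:
        return "computed on time, reported too late"
    return "computed too late"


# Hypothetical task: received at 08:00 UTC, finished 20 minutes later.
received = datetime(2016, 5, 24, 8, 0)
finished = received + timedelta(minutes=20)

# Reported right after upload: well inside a 6-hour deadline.
print(validation_status(received, 6, finished, finished))
# Report deferred by 24 hours: the same result misses the 6-hour deadline.
print(validation_status(received, 6, finished, finished + timedelta(hours=24)))
# With a 5-day (120-hour) deadline the same deferral would be harmless.
print(validation_status(received, 120, finished, finished + timedelta(hours=24)))
```

Under these assumptions, a deferred report only costs a task when the deferral is longer than the slack left before the deadline, which is why a 24-hour deferral is fatal for tasks with deadlines of a few hours.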

Gunde
Joined: 8 Feb 15
Posts: 26
Credit: 7,169,421
RAC: 0
Message 3755 - Posted: 25 May 2016, 14:00:47 UTC
Last modified: 25 May 2016, 14:09:40 UTC

https://mindmodeling.org//results.php?hostid=80519&offset=0&show_names=0&state=5&appid=

BOINC Manager says "connection deferred", with 12 hours until the next update.
When I do a manual update, the tasks end up as invalid.
Another 30 tasks "Completed, too late to validate".

I found that this host was set to "Network activity based on preference"; it should work better if I set it to always.

This is a big waste.

Brandon
Project administrator
Project developer
Project tester
Joined: 5 Jan 15
Posts: 267
Credit: 1,456,117
RAC: 0
Message 3757 - Posted: 25 May 2016, 18:11:01 UTC
Last modified: 25 May 2016, 18:11:01 UTC

Hello 47an,

It is possible you have already tried this, but have you tried resetting the project on that host?

Other possible reasons for the hang are that our server was really busy at the time, or network issues on either side.

Also, I extended the deadline for the work units yesterday, so there may have been some shorter ones in the queue yet to get out, but other than that the rest should have a longer deadline now.

If you still are having issues please let me know.

Thanks for supporting us and happy crunching,

Brandon

Steve Hawker*
Joined: 30 Oct 12
Posts: 6
Credit: 109,889
RAC: 0
Message 3775 - Posted: 30 Jun 2016, 0:59:53 UTC - in response to Message 3757.
Last modified: 30 Jun 2016, 0:59:53 UTC

Also, I extended the deadline for the work units yesterday, so there may have been some shorter ones in the queue yet to get out, but other than that the rest should have a longer deadline now.

Brandon

Just had a task with a SEVEN hour deadline:

https://mindmodeling.org/workunit.php?wuid=24792857

Mine's the one at the top. You'll also see some arrived too late.

Please can we have a standard 5 day deadline for all tasks?

Brandon
Project administrator
Project developer
Project tester
Joined: 5 Jan 15
Posts: 267
Credit: 1,456,117
RAC: 0
Message 3786 - Posted: 4 Jul 2016, 4:13:02 UTC
Last modified: 4 Jul 2016, 4:13:02 UTC

They should all be at 5 day tasks now.

If you have any more issues please let me know.

Brandon

Steve Hawker*
Joined: 30 Oct 12
Posts: 6
Credit: 109,889
RAC: 0
Message 3816 - Posted: 13 Jul 2016, 23:16:06 UTC - in response to Message 3786.
Last modified: 13 Jul 2016, 23:16:06 UTC

They should all be at 5 day tasks now.

If you have any more issues please let me know.

Brandon

Native Python v2.7 Application (Linux Only) back to 7 hour deadline

Steve Hawker*
Joined: 30 Oct 12
Posts: 6
Credit: 109,889
RAC: 0
Message 3839 - Posted: 22 Jul 2016, 20:00:02 UTC - in response to Message 3564.
Last modified: 22 Jul 2016, 20:00:02 UTC

Native Java 1.7 Application - Full Node (Windows Only) 0 1042 0.23 (0.03 - 0.68) 102

I've got a few of these. They each came in with 6 hour deadlines and not the five day deadline you suggested was standard.

Derion
Joined: 22 Nov 15
Posts: 30
Credit: 1,208,499
RAC: 0
Message 3859 - Posted: 16 Aug 2016, 15:08:13 UTC
Last modified: 16 Aug 2016, 15:08:13 UTC

Deadline back to 6 hours

Brandon
Project administrator
Project developer
Project tester
Joined: 5 Jan 15
Posts: 267
Credit: 1,456,117
RAC: 0
Message 3864 - Posted: 17 Aug 2016, 17:39:59 UTC
Last modified: 17 Aug 2016, 17:39:59 UTC

We identified the bug and fixed it. You should see new workunits return back to a normal deadline.

Thanks for supporting us and happy crunching,

Brandon

wolfe53
Joined: 23 Jul 16
Posts: 2
Credit: 5,328
RAC: 0
Message 3885 - Posted: 27 Aug 2016, 10:23:39 UTC - in response to Message 3864.
Last modified: 27 Aug 2016, 10:23:39 UTC

We identified the bug and fixed it. You should see new workunits return back to a normal deadline.

Brandon


Just for our information:
What is the normal deadline period?

I just joined this project, and all the WUs I get are due only a few hours (5) from the time of download. As has been pointed out in several places in the Forums here, this does not play well when working with multiple BOINC based projects. I cannot believe that all the WUs are "at the beginning of a modelling run" or need such a short turnaround time. None of the other projects I am familiar with are using such short times.

If these short times are the standard policy of mind modelling, then I will have to stop allowing this project to send new jobs (unless I am feeling masochistic.)

AEM74
Joined: 5 Mar 14
Posts: 17
Credit: 286,834
RAC: 0
Message 3886 - Posted: 27 Aug 2016, 13:53:23 UTC - in response to Message 3885.
Last modified: 27 Aug 2016, 13:54:07 UTC


Just for our information:
What is the normal deadline period?

I just joined this project, and all the WUs I get are due only a few hours (5) from the time of download. As has been pointed out in several places in the Forums here, this does not play well when working with multiple BOINC based projects. I cannot believe that all the WUs are "at the beginning of a modelling run" or need such a short turnaround time. None of the other projects I am familiar with are using such short times.

If these short times are the standard policy of mind modelling, then I will have to stop allowing this project to send new jobs (unless I am feeling masochistic.)

I've seen 5 days and I've seen a few hours for WUs. I would say both happen quite often, but if you have a strong CPU, the few-hour WUs can be crunched faster than the 5-day ones. For me on a 4790K, the few-hour WUs take anywhere from 5-15 minutes each.

Jacob Klein
Joined: 9 Oct 09
Posts: 37
Credit: 333,573
RAC: 0
Message 3888 - Posted: 27 Aug 2016, 20:02:19 UTC - in response to Message 3886.
Last modified: 27 Aug 2016, 20:02:19 UTC

This is not about how long they take.

This is about having ridiculous deadlines that interfere with the BOINC scheduler, which can end up putting MindModeling tasks as high-priority (scheduled in front of everything else), and leaving GPUs idle as part of the fallout.

I have set "No New Tasks" for this project. The admins keep making the same mistakes, and there is no sense of "normal".

Would love to help, but not if it leaves my GPUs idle. Please fix, permanently, and define your fix, so we can call you out when it goes wrong.

AEM74
Joined: 5 Mar 14
Posts: 17
Credit: 286,834
RAC: 0
Message 3889 - Posted: 28 Aug 2016, 0:05:23 UTC
Last modified: 28 Aug 2016, 0:05:23 UTC

MM doesn't interfere with my GPU tasks, only with my CPU ones and even then, it won't give me any because I have other projects that have wu's. I have to suspend one project in order to get any tasks.

wolfe53
Joined: 23 Jul 16
Posts: 2
Credit: 5,328
RAC: 0
Message 3890 - Posted: 28 Aug 2016, 5:38:22 UTC - in response to Message 3889.
Last modified: 28 Aug 2016, 5:38:22 UTC

That is the point.

The 5 hour deadline time causes the BOINC scheduler to preempt ALL the other tasks from other projects, and fills up all the allocated CPUs with the MM workunits. It is not that the units take all that long to process (usually no more than 15 minutes each here) but that they deliberately hog all the CPU slots and delay other projects' WUs that may be due just a little after the MM stuff hits.

Also, if you watch the "remaining time" on MM WUs, they often dynamically increase their remaining time by minutes at a time during the run of a phase.
The memory footprint of the MM WUs is also quite large on the currently active method -- the "R" statistics package is not small and likes to have lots of memory.

To make the point again: MindModeling does not "play nice" with other projects; just like a schoolyard bully.
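
The preemption effect described above can be sketched in a few lines of Python. This is a deliberately simplified model, not the BOINC client's actual scheduler: it assumes that any task projected to miss its deadline under the project's normal resource share is moved to the front of the queue in earliest-deadline-first order. The `Task` class, the `pick_running` function, and the 4-core host are hypothetical; the 4.5% share and the 15-minute run time are taken from earlier posts in this thread.

```python
from dataclasses import dataclass


@dataclass
class Task:
    project: str
    hours_left: float   # estimated CPU hours still needed
    deadline_in: float  # hours until the deadline


def pick_running(tasks, cores, share):
    """Return the tasks that get a core, urgent ones (earliest deadline) first."""

    def projected_finish(task):
        # Under normal sharing, a project gets share[project] of the cores,
        # split evenly among its own queued tasks (at most one core per task).
        queued = sum(1 for t in tasks if t.project == task.project)
        cores_per_task = min(1.0, share[task.project] * cores / queued)
        return task.hours_left / cores_per_task

    urgent = sorted(
        (t for t in tasks if projected_finish(t) > t.deadline_in),
        key=lambda t: t.deadline_in,
    )
    relaxed = [t for t in tasks if t not in urgent]
    return (urgent + relaxed)[:cores]


share = {"Other": 0.955, "MM": 0.045}
other = [Task("Other", 10.0, 240.0) for _ in range(4)]

# Eight 15-minute MM tasks with a 5-hour deadline: at a 4.5% share they are
# all projected to miss it, so they take over all four cores.
short = other + [Task("MM", 0.25, 5.0) for _ in range(8)]
print([t.project for t in pick_running(short, 4, share)])

# The same tasks with a 5-day (120-hour) deadline: nothing is urgent, so the
# other project's tasks keep their cores.
long = other + [Task("MM", 0.25, 120.0) for _ in range(8)]
print([t.project for t in pick_running(long, 4, share)])
```

In this model the short tasks are not the problem in themselves; the deadline is shorter than the time a small resource share would need to finish them, and that is what pushes them ahead of everything else.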

mikey
Joined: 11 Jun 16
Posts: 10
Credit: 517,625
RAC: 0
Message 3891 - Posted: 28 Aug 2016, 15:24:34 UTC - in response to Message 3890.
Last modified: 28 Aug 2016, 15:24:34 UTC

That is the point.

The 5 hour deadline time causes the BOINC scheduler to preempt ALL the other tasks from other projects, and fills up all the allocated CPUs with the MM workunits. It is not that the units take all that long to process (usually no more than 15 minutes each here) but that they deliberately hog all the CPU slots and delay other projects' WUs that may be due just a little after the MM stuff hits.

Also, if you watch the "remaining time" on MM WUs, they often dynamically increase their remaining time by minutes at a time during the run of a phase.
The memory footprint of the MM WUs is also quite large on the currently active method -- the "R" statistics package is not small and likes to have lots of memory.

To make the point again: MindModeling does not "play nice" with other projects; just like a schoolyard bully.


That's MUCH better than the message I got this morning saying that it WON'T work on my 64-bit Ubuntu Linux PC, DESPITE the fact that these are my stats for that PC:

State: All (55) · In progress (0) · Validation pending (0) · Validation inconclusive (0) · Valid (55) · Invalid (0) · Error (0)

I have suspended all further MM work on that pc and MM is back to being a project with a zero resource share on all my Windows pc's!!



Copyright © 2019 MindModeling.org