Only Errors !!!

Bommer
Joined: 18 Apr 09
Posts: 5
Credit: 61,215
RAC: 0
Message 1324 - Posted: 29 Apr 2009, 10:41:18 UTC

Hello

On my Q6600 running Windows Vista 64-bit, I get only errors.

<core_client_version>6.6.20</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
ACTR: boinc_init_options complete
ACTR: boinc_get_init_data(actr_aid) complete
ACTR: Trace 1
ACTR: Trace 2
ACTR: Trace 3
ACTR: Trace 4
ACTR: Trace 5
ACTR: Trace 6
ACTR: Trace 8
ACTR: Trace 9
ACTR: Trace 10 -- Lisp Running
ACTR: Trace 11 -- Watchdog Running (if Win32)
app error running lisp: 0xc0000005
called boinc_finish

</stderr_txt>
]]>



What could be the reason?

Greets Bommer (Team Rechenkraft.net)
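
A note on the exit code in the log above: the client prints the exit status as a signed 32-bit integer, so -1073741819 and 0xc0000005 are the same value. On Windows, 0xC0000005 is the NTSTATUS code for an access violation (STATUS_ACCESS_VIOLATION), which suggests the Lisp process crashed on an invalid memory access rather than exiting on its own. A minimal Python sketch of the conversion follows; the helper names and the one-entry lookup table are hypothetical and not part of BOINC.

```python
def to_unsigned32(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as its unsigned value."""
    return exit_code & 0xFFFFFFFF


# Illustrative lookup table only; it is not an exhaustive list of status codes.
KNOWN_STATUS = {0xC0000005: "STATUS_ACCESS_VIOLATION"}


def describe(exit_code: int) -> str:
    """Format an exit code as hex, with a name if the value is in the table."""
    value = to_unsigned32(exit_code)
    name = KNOWN_STATUS.get(value, "unknown")
    return f"0x{value:08x} ({name})"


print(describe(-1073741819))  # prints "0xc0000005 (STATUS_ACCESS_VIOLATION)"
```

The same conversion applies to any negative exit code the client reports on Windows.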

Tom
Volunteer moderator
Project administrator
Project developer
Joined: 23 Jun 08
Posts: 303
Credit: 105,388
RAC: 0
Message 1326 - Posted: 29 Apr 2009, 16:24:45 UTC

Over the last couple of weeks our servers weren't delivering our apps properly, including our Lisp app (Steel Bank Common Lisp, SBCL). The errors you are getting come from running our Lisp application, so try detaching and reattaching the project to see whether the Lisp app then downloads correctly. Then let me know if you still get these errors.

Bommer
Joined: 18 Apr 09
Posts: 5
Credit: 61,215
RAC: 0
Message 1338 - Posted: 2 May 2009, 13:28:19 UTC

Hello

I have detached and reattached the project, and the error is the same.

Here is the log.

<core_client_version>6.6.20</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
ACTR: boinc_init_options complete
ACTR: boinc_get_init_data(actr_aid) complete
ACTR: Trace 1
ACTR: Trace 2
ACTR: Trace 3
ACTR: Trace 4
ACTR: Trace 5
ACTR: Trace 6
ACTR: Trace 8
ACTR: Trace 9
ACTR: Trace 10 -- Lisp Running
ACTR: Trace 11 -- Watchdog Running (if Win32)
app error running lisp: 0xc0000005
called boinc_finish

</stderr_txt>
]]>


And now?

Greets Bommer

Jonathan Brier
Joined: 1 Feb 08
Posts: 1
Credit: 62,940
RAC: 123
Message 1346 - Posted: 4 May 2009, 16:21:36 UTC

I too am getting error messages. Looking back, I have been getting this error since January 19th, when I started again, but things were fine for the most part before December 11th. Anything run with client 6.4.5 or later has just errored out.

What I have noticed is that when a task starts, it begins at 0.000% but promptly goes to -100.000% and errors out.

Current setup: Windows XP Service Pack 3, BOINC version 6.6.20, using the GridRepublic account manager. Computer ID: 9817

I tried detaching and reattaching; still the same behavior.

<core_client_version>6.6.20</core_client_version>
<![CDATA[
<message>
Error performing inpage operation. (0x3e7) - exit code 999 (0x3e7)
</message>
<stderr_txt>
ACTR: boinc_init_options complete
ACTR: boinc_get_init_data(actr_aid) complete
ACTR: Trace 1
ACTR: Trace 2
ACTR: Trace 3
ACTR: Trace 4
ACTR: Trace 5
ACTR: Trace 6
ACTR: Trace 8
ACTR: Trace 9
ACTR: Trace 10 -- Lisp Running
ACTR: Trace 11 -- Watchdog Running (if Win32)
ACTR: Complete after 0
ACTR: Trace 12 -- Calling Finish
called boinc_finish

</stderr_txt>
]]>

Olli Lukkarinen
Joined: 9 May 09
Posts: 1
Credit: 352
RAC: 0
Message 1370 - Posted: 11 May 2009, 18:38:19 UTC

I have this exact same problem. Every single WU fails. I'm running Vista x64 SP1, BOINC 6.6.20. I've tried resetting the project, but it didn't help.

Tom
Volunteer moderator
Project administrator
Project developer
Joined: 23 Jun 08
Posts: 303
Credit: 105,388
RAC: 0
Message 1382 - Posted: 18 May 2009, 18:44:09 UTC - in response to Message 1370.

Try setting your configuration so that only one or two workunits are running in parallel. There may be a problem with running multiple iterations of the SBCL program concurrently in Vista.

kevint
Joined: 22 Mar 08
Posts: 8
Credit: 533,460
RAC: 0
Message 1384 - Posted: 19 May 2009, 1:52:29 UTC - in response to Message 1382.

Try setting your configuration so that only one or two workunits are running in parallel. There may be a problem with running multiple iterations of the SBCL program concurrently in Vista.



And how do you propose to do this? I have asked the BOINC developers to allow this option for several years, and it seems they are clueless about how to achieve it. It really should not be hard to do.

Once work is cached, BOINC decides when to run the work and how many WUs of a particular project to run concurrently, based on the number of cores, debt, and priority. But there is no way to tell BOINC to run project x on core 0, project y on core 1, project z on core 2, etc.

The only way to limit it currently is to set BOINC to run on a single core, leaving the others unproductive.
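
The global limit described above can be sketched as follows. This assumes a BOINC client that reads a `global_prefs_override.xml` file from its data directory and honors the `<max_ncpus_pct>` preference; check the documentation for your client version before relying on it. The helper name is hypothetical. Note that this caps the total number of cores BOINC uses, not the number of tasks from a single project.

```python
import xml.etree.ElementTree as ET


def cpu_limit_override(cores_to_use: int, total_cores: int) -> str:
    """Build an override fragment that limits BOINC to a share of the cores.

    Assumption: the client honors <max_ncpus_pct> inside
    <global_preferences> in global_prefs_override.xml.
    """
    if not 1 <= cores_to_use <= total_cores:
        raise ValueError("cores_to_use must be between 1 and total_cores")
    pct = 100.0 * cores_to_use / total_cores
    root = ET.Element("global_preferences")
    ET.SubElement(root, "max_ncpus_pct").text = f"{pct:.1f}"
    return ET.tostring(root, encoding="unicode")


# One core out of four, e.g. on a quad-core machine such as the Q6600.
print(cpu_limit_override(1, 4))
```

The printed fragment would be saved as `global_prefs_override.xml`, after which the client has to re-read its preferences or be restarted.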


Tom
Volunteer moderator
Project administrator
Project developer
Joined: 23 Jun 08
Posts: 303
Credit: 105,388
RAC: 0
Message 1388 - Posted: 21 May 2009, 16:34:23 UTC - in response to Message 1384.

Try reducing how much memory your BOINC client uses when running a job. If that works, we'll at least know it's a problem with running multiple iterations of Lisp on some Vista clients.

kevint
Joined: 22 Mar 08
Posts: 8
Credit: 533,460
RAC: 0
Message 1396 - Posted: 15 Jun 2009, 17:38:12 UTC
Last modified: 15 Jun 2009, 17:39:09 UTC

Every WU is crashing on every machine.

It does not seem to matter whether it is XP32, XP64, or Vista.

App 3.45

And I got all excited to see more work :(

Gary Wilson
Joined: 25 Nov 08
Posts: 50
Credit: 883,268
RAC: 289
Message 1419 - Posted: 27 Jun 2009, 0:36:50 UTC

The new DSST_6-3 work units seem to get stuck after about 21 minutes (on my 1.86 GHz E4300), and then CPU utilization drops to zero. This is on 32-bit Ubuntu Linux. System Monitor shows the process sleeping in futex_wait. I have to abort them once they stop running. Not sure if all of them will do this. The "To completion" time keeps increasing, but the CPU time stops increasing. It looks like some kind of deadlock.
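
The symptom described here, wall-clock time advancing while CPU time stays flat, can be checked mechanically. The following is a minimal sketch and not part of BOINC or the project's application: it takes (wall-clock seconds, cumulative CPU seconds) samples for a task and reports a stall when CPU time has barely advanced over a chosen window. The function name, the sample format, and the default thresholds are assumptions for illustration.

```python
from typing import Sequence, Tuple

# One observation of a task: (wall-clock seconds, cumulative CPU seconds).
Sample = Tuple[float, float]


def is_stalled(samples: Sequence[Sample],
               window: float = 300.0,
               min_cpu_progress: float = 1.0) -> bool:
    """Return True if CPU time grew by less than min_cpu_progress over
    at least `window` seconds of wall-clock time ending at the last sample."""
    if len(samples) < 2:
        return False
    last_wall, last_cpu = samples[-1]
    # Walk backwards to the newest sample that is at least `window` old.
    for wall, cpu in reversed(samples[:-1]):
        if last_wall - wall >= window:
            return (last_cpu - cpu) < min_cpu_progress
    return False  # not enough history yet to decide


# A task that stops accumulating CPU time after about 21 minutes.
stuck = [(0, 0), (1260, 1258), (1560, 1258), (1860, 1258)]
healthy = [(0, 0), (300, 295), (600, 590)]
print(is_stalled(stuck), is_stalled(healthy))
```

A check like this only flags the stall; it cannot tell a deadlock from a task that is legitimately waiting on I/O.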

taurec
Joined: 27 Jun 09
Posts: 1
Credit: 57,275
RAC: 0
Message 1427 - Posted: 28 Jun 2009, 9:12:02 UTC - in response to Message 1419.
Last modified: 28 Jun 2009, 9:13:35 UTC

Same here, using sidux 64-bit and BOINC Manager 6.4.5.

The processing time stops at 21:38 minutes while the time to completion keeps increasing.

No activity is shown in the system monitor.

Greetings from Germany
Robert

Tom
Volunteer moderator
Project administrator
Project developer
Joined: 23 Jun 08
Posts: 303
Credit: 105,388
RAC: 0
Message 1428 - Posted: 28 Jun 2009, 15:17:59 UTC

Sorry for the errors. We've had to delete that model from our job queue, because it is proving difficult to run on our volunteers' machines. If your computer has already downloaded workunits with the "DSST_6" prefix, abort them immediately.

We should have a new, working model by the end of this week.

Gary Wilson
Joined: 25 Nov 08
Posts: 50
Credit: 883,268
RAC: 289
Message 1429 - Posted: 28 Jun 2009, 15:18:40 UTC - in response to Message 1427.

And it's not just the Linux/Unix WUs. All Windows WUs are finishing with errors. I've set the project to "No new tasks" until someone can confirm the problems have been fixed.

Gary Wilson
Joined: 25 Nov 08
Posts: 50
Credit: 883,268
RAC: 289
Message 1430 - Posted: 28 Jun 2009, 15:20:46 UTC - in response to Message 1429.

Must have been typing when you were! The server status shows 24k results ready to send. Are all of those going to be aborted? Just wondering when I can let in new WUs.

Thanks!

Tom
Volunteer moderator
Project administrator
Project developer
Joined: 23 Jun 08
Posts: 303
Credit: 105,388
RAC: 0
Message 1432 - Posted: 28 Jun 2009, 15:35:25 UTC - in response to Message 1430.

Whoa! Thanks for bringing the "24k workunits" to my attention. The DSST_6 model was aborted last Friday, but it appears that the workunits were not destroyed along with it.

Yes, the data for this model will not be collected, and as such, all workunits should be aborted immediately on client machines. I'm going to have to do some immediate database purges.

Thanks Gary!

Gary Wilson
Joined: 25 Nov 08
Posts: 50
Credit: 883,268
RAC: 289
Message 1448 - Posted: 28 Jul 2009, 11:40:35 UTC - in response to Message 1432.

I noticed a lot of new work. However, on 32-bit Ubuntu, each task only gets to 38.999% and then CPU time stops incrementing. I'm having to abort these, as I have no idea when they would give up on their own. They ran for only a few minutes (to 38.999%) and then showed the time to completion as 7+ hours and growing.

Tom
Volunteer moderator
Project administrator
Project developer
Joined: 23 Jun 08
Posts: 303
Credit: 105,388
RAC: 0
Message 1449 - Posted: 29 Jul 2009, 14:48:41 UTC - in response to Message 1448.

Hi Gary,

We've experienced similar problems for this job across the board. Our current belief is that the model reaches a memory restriction within our lisp environment (SBCL) and errors out. We'll resubmit the model once the problem has been fixed.

Steve Dodd
Joined: 1 Feb 08
Posts: 22
Credit: 249,902
RAC: 35
Message 1450 - Posted: 29 Jul 2009, 20:50:54 UTC
Last modified: 29 Jul 2009, 20:51:15 UTC

So Tom,
I'm getting errors on all WUs on my Vista laptop (2 cores). Do you want us to abort all of these WUs too?


Copyright © 2013 MindModeling.org