Posts by Jack.Harris |
|
1)
Message boards :
Number crunching :
BitDefender claims you're infected with MIDAS3
(Message 2495)
Posted 88 days ago by Jack.Harris
Thanks Ivanst for the information. We'll look into the MIDAS3 signature to see what could be causing the false positives. Weird! Jack |
|
2)
Message boards :
Science :
LSF Integration
(Message 2481)
Posted 97 days ago by Jack.Harris
This is probably too old to matter, but if anyone wants a copy of these scripts, PM me. We are integrated with the LSF, PBS and LL batch systems. The scripts submit BOINC instances for a given amount of time and then send commands to abort any workunits that have not finished a little before the time expires. The scripts then reschedule more jobs if they had to abort anything. Jack |
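The abort-and-reschedule cycle described in the post above can be modelled roughly as follows. This is only a toy simulation in Python, not the actual scripts: the function names, the sequential execution model, the `margin_s` safety margin and the `max_slots` cap are all assumptions made for illustration, and no real LSF, PBS or LL commands are issued.

```python
def run_slot(durations, walltime_s, margin_s):
    """Simulate one batch slot that runs workunits one after another.

    Returns (finished, carried_over). Anything that would not be finished
    by the abort point (walltime_s - margin_s) is carried over.
    """
    deadline = walltime_s - margin_s  # abort a little before the slot expires
    elapsed = 0.0
    for i, duration in enumerate(durations):
        if elapsed + duration > deadline:
            # This workunit would still be running at the abort point.
            return list(durations[:i]), list(durations[i:])
        elapsed += duration
    return list(durations), []


def schedule(durations, walltime_s, margin_s, max_slots=10):
    """Keep submitting slots until nothing is pending or max_slots is hit.

    Returns (slots_used, still_pending). max_slots is a hypothetical guard
    against workunits that can never fit into a single slot.
    """
    pending = list(durations)
    slots_used = 0
    while pending and slots_used < max_slots:
        _finished, pending = run_slot(pending, walltime_s, margin_s)
        slots_used += 1
    return slots_used, pending
```

For example, `schedule([30, 30, 30, 30], walltime_s=100, margin_s=10)` needs two slots in this model, because the fourth workunit cannot finish before the 90 second abort point of the first slot.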
|
3)
Message boards :
Number crunching :
cant change password
(Message 2364)
Posted 216 days ago by Jack.Harris
PM sent. We are planning on moving from 'server_stable' to the 'trunk' version of the BOINC server code in the near future. Hopefully this will resolve the issue with accounts created via BOINC-Wide teams. -Jack |
|
4)
Message boards :
News :
New 1.89 Windows App Released
(Message 2281)
Posted 221 days ago by Jack.Harris
I recommend resetting the project /twice/. That seems to help when this issue crops up. |
|
5)
Message boards :
News :
New 1.89 Windows App Released
(Message 2278)
Posted 221 days ago by Jack.Harris
The 1.89 application actually shares a lot of files with the 1.88 application, to minimize the bandwidth needed to deploy the new application. The only file that is actually different is the wrapper executable (mm_wrapper_1.14_windows_intelx86.exe; the old version is 1.13). --Jack |
|
6)
Message boards :
News :
L'Alliance Francophone takes first position and is ready to show their real strength
(Message 2276)
Posted 222 days ago by Jack.Harris
And thank you -- your crunching power is already being missed. |
|
7)
Message boards :
News :
New 1.89 Windows App Released
(Message 2226)
Posted 226 days ago by Jack.Harris
A new 1.89 Windows application was released today. It fixes a bug that could cause a workunit to hang at 100% complete for up to 2 hours on some Windows systems before finishing. This improved application has already produced a 25-30% average performance improvement compared to the 1.88 application. Also, congratulations to tempo for being the first cruncher to accumulate over 3 million credit on the MindModeling project. Excellent! --Jack |
|
8)
Message boards :
Number crunching :
60+ hours... is this normal?
(Message 2225)
Posted 226 days ago by Jack.Harris
The new 1.89 Windows app released today improves the watchdog feature inside our application. It removes the issue where a job gets to 100% and hangs until the 2.5 hour mark is reached. Programs that have completed 100% of the simulations in the workunit are now forcefully terminated after 1 minute if they do not exit gracefully on their own. We have already seen a 25% improvement on average across Windows hosts from 1.88 to 1.89. --Jack |
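The "wait for a graceful exit, then force termination" behaviour described in the post above can be sketched in Python with the standard `subprocess` module. This is an illustration only and not the project's wrapper code: the function name `finish_or_kill` and the `grace_s` parameter are assumptions, with the 1 minute grace period from the post used as the default.

```python
import subprocess
import sys


def finish_or_kill(proc, grace_s=60.0):
    """Give a finished-but-lingering process grace_s seconds to exit.

    Returns True if the process exited on its own, False if it had to
    be killed after the grace period ran out.
    """
    try:
        proc.wait(timeout=grace_s)  # graceful exit within the grace period
        return True
    except subprocess.TimeoutExpired:
        proc.kill()   # hard kill, as a watchdog would do
        proc.wait()   # reap the process so it does not linger
        return False


if __name__ == "__main__":
    # Hypothetical child that never exits by itself within the grace period.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    print("exited gracefully:", finish_or_kill(child, grace_s=1.0))
```

A short grace period keeps a hung process from holding the workunit open, at the cost of cutting off a process that was merely slow to shut down.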
|
9)
Message boards :
News :
Congratulations are in order
(Message 2224)
Posted 234 days ago by Jack.Harris
hl just joined the 1M total credit club! Furthermore, tempo has been crunching away and has become only the second person in MM history to pass the 2M total credit mark. Great job and thanks for the help! --Jack |
|
10)
Message boards :
News :
Record Setting Week
(Message 2222)
Posted 236 days ago by Jack.Harris
This week recorded our 7 best crunching days in MM history. In that time there were 9 completed cognitive model explorations, and our latest version of the very large 'Change Signal' job is now 25% complete.
Milestones:
- tempo became the next member of the 1M total credit club
- GATE became the first 2M credit cruncher for the project
- L'Alliance Francophone has become the #1 team in terms of both RAC and Total Credit
- Project passed the 4 TFlop mark
Thanks Everyone -- Let's keep crunching! -Jack |
|
11)
Message boards :
Number crunching :
Compute errors
(Message 2221)
Posted 236 days ago by Jack.Harris
Details about the Windows error for our CCL application on some Windows boxes:
> Error of type CCL::SIMPLE-FILE-ERROR: File exists : "C:/ProgramData/BOINC/slots/"
> While executing: CREATE-DIRECTORY, in process listener(1).
app exit status: 0xffffffff
22:49:20 (3684): called boinc_finish
The problem seems to manifest itself if the host is running Deep Freeze. The current working theory is that Deep Freeze uses a minifilter driver to produce the equivalent of a UNIONFS/RAM-overlay file system in Windows. Unfortunately, this interception of IO operations interacts badly with our Clozure Common Lisp application when it attempts to access the filesystem. Now that we have a working theory we will try to reproduce, isolate and fix this issue. |
|
12)
Message boards :
Number crunching :
Compute errors
(Message 2220)
Posted 238 days ago by Jack.Harris
1) We would be very interested in working directly with anyone experiencing the first issue.
> Error of type CCL::SIMPLE-FILE-ERROR: File exists : "C:/ProgramData/BOINC/slots/"
We have not been able to reproduce the issue on any of our systems. Details about the system configurations where this issue occurs may be helpful. Offline testing of potential test environments directly with us would also be helpful (PM me if anyone is interested in helping us in that way). Our current ideas about the issue revolve around latency in writing data to the drive (perhaps the drives are SAN disks) -- in that case some sleep logic may help -- or the issue may be permissions related. More research on this issue is required.
2) This issue is caused by the fact that the system has a kernel that is too old to run the binary. You may be able to fake a system into working, since the actual function call is to a very basic function, 'strcat':
cd /lib/x86_64-linux-gnu/
ln -s /lib/x86_64-linux-gnu/libc.so.[whatever number you have -- 4? 5?] libc.so.6
Hope this helps someone, Jack |
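The "sleep logic" idea from item 1 of the post above might look something like the sketch below. It is written in Python purely for illustration (the real application is Clozure Common Lisp), and the function name, the retry count, the delay and the `example_dir` path are all hypothetical. It treats an already existing directory as success and retries other filesystem errors after a short pause.

```python
import os
import tempfile
import time


def ensure_directory(path, attempts=5, delay_s=0.2):
    """Create path, tolerating 'already exists' and retrying other errors.

    Returns the attempt number that succeeded. attempts and delay_s are
    hypothetical tuning values, not settings from the real application.
    """
    for attempt in range(1, attempts + 1):
        try:
            # exist_ok=True makes an existing directory a non-error.
            os.makedirs(path, exist_ok=True)
            return attempt
        except OSError:
            if attempt == attempts:
                raise  # out of retries: surface the original error
            time.sleep(delay_s)  # give a slow or filtered filesystem time


def demo():
    """Create the same (hypothetical) directory twice in a temp location."""
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "example_dir", "0")
        first = ensure_directory(target)
        second = ensure_directory(target)  # already exists, still succeeds
        return first, second, os.path.isdir(target)
```

Whether a retry like this would actually help depends on the real cause, which the post leaves open.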
|
13)
Message boards :
News :
Most ambitious job yet - Lots of work to be done
(Message 2219)
Posted 238 days ago by Jack.Harris
It appears that the final simulation was not completing gracefully and the output was truncated. Each result looked like it ran a little over 2 hours (but < 2.5 hours) -- our watchdog built into the wrapper would (should) have done a hard kill of those processes at 2.5 hours and then sent back all the good results. Thanks for sending us this information -- we will look into why the final simulation did not terminate properly and into reducing the time the watchdog waits before stepping in and cleaning up. Jack |
|
14)
Message boards :
News :
Great last few days!
(Message 2211)
Posted 241 days ago by Jack.Harris
We are running n simulations per WU. Therefore after each simulation the percent done moves up by 100/n percent. Unfortunately, this is only a good heuristic when each simulation has nearly equal runtime. The particular model that is using Python right now has a lot of runtime variability between different simulations. This will not be the case for all jobs using the Python 2.7 app. Sorry for the lack of fidelity in the percent done estimation. Jack |
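A small sketch of why the count-based estimate from the post above drifts when runtimes vary. The function names and the example runtimes are made up for illustration. The first function is the heuristic from the post (completed simulations over n); the second is what the fraction would be if per-simulation runtimes were known in advance, which they normally are not.

```python
def count_based_fraction(completed, total):
    """Heuristic from the post: each finished simulation adds 1/total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return completed / total


def time_based_fraction(runtimes, completed):
    """Fraction of total runtime already spent, given known runtimes."""
    total_time = sum(runtimes)
    if total_time <= 0:
        raise ValueError("runtimes must sum to a positive value")
    return sum(runtimes[:completed]) / total_time


if __name__ == "__main__":
    # Hypothetical workunit: three quick simulations and one slow one.
    runtimes = [1, 1, 1, 9]
    print(count_based_fraction(3, len(runtimes)))  # reported progress
    print(time_based_fraction(runtimes, 3))        # share of runtime done
```

With these made-up runtimes the client would show 75% done while only a quarter of the runtime has elapsed, which matches the kind of mismatch described above.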
|
15)
Message boards :
News :
Great last few days!
(Message 2209)
Posted 241 days ago by Jack.Harris
Can't take the credit -- tempo suggested it to me in a PM :) Cheers, Jack |
|
16)
Message boards :
News :
L'Alliance Francophone takes first position and is ready to show their real strength
(Message 2199)
Posted 242 days ago by Jack.Harris
Just tested out two download mirrors from my house:
-- Wright State: 1.07M/s
-- University of Dayton: 903K/s
The bandwidth seems pretty good right now. I have seen the 1.96Kb/s issue before on some of our local resources. I have seen either of these fix it:
- close BOINC, open client_state.xml, and edit bw_up and bw_down (make them larger); see http://boinc.berkeley.edu/dev/forum_thread.php?id=5437
- reset the project twice
If anyone sees this again, please let me know which download site is causing the issue and we'll try to remedy it -- see info below.
Thanks for the info and hope this helps. --Jack
*******************************************************************************
** Wright State **
*******************************************************************************
[jackharris@mac:~]$ wget http://wsu.mindmodeling.org/download/mm_wrapper_graphics_1.09_windows_intelx86.exe
--2012-09-24 17:54:01-- http://wsu.mindmodeling.org/download/mm_wrapper_graphics_1.09_windows_intelx86.exe
Resolving wsu.mindmodeling.org (wsu.mindmodeling.org)... 130.108.5.67
Connecting to wsu.mindmodeling.org (wsu.mindmodeling.org)|130.108.5.67|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6647808 (6.3M) [application/x-ms-dos-executable]
Saving to: “mm_wrapper_graphics_1.09_windows_intelx86.exe”
100%[============================================================================================================================>] 6,647,808 1.07M/s in 6.4s
2012-09-24 17:54:08 (1020 KB/s) - “mm_wrapper_graphics_1.09_windows_intelx86.exe” saved [6647808/6647808]
*******************************************************************************
** University of Dayton **
*******************************************************************************
[jackharris@mac:~]$ wget http://mindmodeling.org/download/mm_wrapper_graphics_1.09_windows_intelx86.exe
--2012-09-24 17:54:14-- http://mindmodeling.org/download/mm_wrapper_graphics_1.09_windows_intelx86.exe
Resolving mindmodeling.org (mindmodeling.org)... 131.238.103.152
Connecting to mindmodeling.org (mindmodeling.org)|131.238.103.152|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6647808 (6.3M) [application/x-ms-dos-executable]
Saving to: “mm_wrapper_graphics_1.09_windows_intelx86.exe.1”
100%[============================================================================================================================>] 6,647,808 903K/s in 9.1s
2012-09-24 17:54:23 (717 KB/s) - “mm_wrapper_graphics_1.09_windows_intelx86.exe.1” saved [6647808/6647808] |
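As a quick cross-check of the averages in the wget output above, the file size and elapsed times can be turned back into a rate. The helper name below is made up, and the sketch assumes wget's "KB/s" means 1024 bytes per second. The results come out a little lower than the reported figures, which would be consistent with the displayed elapsed times being rounded to one decimal place.

```python
def kib_per_second(size_bytes, seconds):
    """Average transfer rate in units of 1024 bytes per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_bytes / 1024 / seconds


if __name__ == "__main__":
    size_bytes = 6647808  # "Length" line from the wget output
    for mirror, seconds in (("Wright State", 6.4), ("University of Dayton", 9.1)):
        rate = kib_per_second(size_bytes, seconds)
        print(f"{mirror}: about {rate:.0f} KB/s")
```

Both mirrors delivered the 6.3M file in under ten seconds, so neither comes anywhere near the 1.96Kb/s problem rate.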
|
17)
Message boards :
News :
Most ambitious job yet - Lots of work to be done
(Message 2196)
Posted 242 days ago by Jack.Harris
Mumps -- Right now 1.51 million workunits have been queued up for the 'Change Signal Full Mesh with Fatigue 5' job. |
|
18)
Message boards :
Number crunching :
Message from server: Invalid or missing account key. To fix, remove and add this project
(Message 2194)
Posted 243 days ago by Jack.Harris
This is very odd. I only see this problem personally when I use scripting to automagically move around account_keys in my Linux configurations. This would occur if I:
1. attached and started crunching
2. moved a key
3. restarted BOINC
4. finished crunching
However, I don't know why in your case the account xml went missing. Did reattaching fix everything for you? --Jack |
|
19)
Message boards :
News :
Most ambitious job yet - Lots of work to be done
(Message 2193)
Posted 243 days ago by Jack.Harris
Mumps -- (WUs left) It is hard to say exactly how many WUs are left for the currently running job. WUs are created iteratively. We know how many individual simulations have been run out of the total set, so we should be able to estimate how many WUs that should be, but right now we do not have a good method for calculating that dynamically. We'll look into this some more -- I suppose "lots" isn't a very good answer for now :)
Darkknight900-- Nosferatu*-- I am very troubled to hear that you have recently experienced a never-ending WU. I am not exactly sure how this could have happened. Our last fix (version 180+) definitely removed a lot of problems resulting from divergent simulations (e.g., infinite loops). Can you give me a result link if it happens again? Thanks. --Jack |
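The rough estimate mentioned in the reply to Mumps above could be computed along these lines. This is a sketch under a strong assumption that the post does not confirm, namely that every WU holds a fixed number of simulations; the function name and all numbers in the usage example are hypothetical.

```python
def estimate_wus_left(total_sims, done_sims, sims_per_wu):
    """Estimate remaining workunits from the remaining simulation count.

    Assumes a fixed number of simulations per WU (an assumption for this
    sketch). Rounds up, since a partly filled WU is still a WU.
    """
    if sims_per_wu <= 0:
        raise ValueError("sims_per_wu must be positive")
    remaining = max(total_sims - done_sims, 0)
    return -(-remaining // sims_per_wu)  # integer ceiling division


if __name__ == "__main__":
    # Made-up numbers, only to show the arithmetic.
    print(estimate_wus_left(total_sims=1000, done_sims=250, sims_per_wu=40))
```

Because WUs are created iteratively, an estimate like this could only ever be approximate while a job is still being generated.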
|
20)
Message boards :
News :
L'Alliance Francophone takes first position and is ready to show their real strength
(Message 2192)
Posted 243 days ago by Jack.Harris
toTOW- Tom and I are actually planning to post a new multicore creatework process early next week. We noticed that our server-side creatework process couldn't keep up when the DSST job came online on Friday. Change-signal is holding strong and our server configuration still has some room for more growth .... for now :) Thanks all for the crunching support -Jack |