Message boards : Number crunching : Latest MM WU's Not Obeying Use at Most CPU Time

Author Message
exalpha
Joined: 23 Mar 14
Posts: 23
Credit: 505,507
RAC: 0
Message 3843 - Posted: 30 Jul 2016, 2:03:31 UTC
Last modified: 30 Jul 2016, 2:03:31 UTC

I have "Use at most 66.66% of CPU time" set in the BOINC computing preferences. World Community Grid (WCG) projects obey that setting, but Mind Modeling (MM) does not. My local BOINC preferences override the preferences set at WCG and MM. The CPU temperature on my desktop gets too high when most of the 8 cores are running MM WUs; when most of the cores are running WCG WUs, the temperature stays in an acceptable range. Nothing in the log gives any indication of what is going on. My AMD FX 8320 runs at about 62 °C continuously with MM WUs, and AMD recommends that it not run at or above that temperature for long periods.
____________
xalpha

exalpha
Joined: 23 Mar 14
Posts: 23
Credit: 505,507
RAC: 0
Message 3844 - Posted: 30 Jul 2016, 4:31:48 UTC
Last modified: 30 Jul 2016, 4:31:48 UTC

I am using Linux Mint 18
____________
xalpha

Profile Brandon
Project administrator
Project developer
Project tester
Joined: 5 Jan 15
Posts: 247
Credit: 1,456,117
RAC: 0
Message 3846 - Posted: 1 Aug 2016, 21:00:21 UTC
Last modified: 1 Aug 2016, 21:00:21 UTC

Hello xalpha,

The way the "use at most % of CPU time" feature works is that it runs the task for the percentage of time you set it to and then suspends it for the remaining percentage of time.

http://boinc.berkeley.edu/dev/forum_thread.php?id=7881

So with your setting of 66.66% you would see something like 2.67 seconds of running followed by a 1.33-second pause. You can watch this with the top command, e.g.:

top -d 0.2

That command runs top with the display refreshed every 0.2 seconds (five times a second), so that you can watch the BOINC processes change state. The S (process state) column should show either R, indicating that the process is running, or T, indicating that it has been stopped.
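For anyone who would rather script this check than watch top, here is a rough Python sketch. It assumes a Linux system with /proc mounted and reads the one-letter state from /proc/<pid>/stat, which is the same letter top shows in its S column. The function names and the "mm" name filter are hypothetical, chosen only for illustration; they are not part of BOINC.

```python
import os

# One-letter process states as listed in proc(5); only the common ones.
STATE_NAMES = {
    "R": "running",
    "S": "sleeping",
    "D": "disk sleep",
    "T": "stopped",
    "Z": "zombie",
}


def parse_stat(stat_line):
    """Return (command name, state letter) from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST closing parenthesis.
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return comm, state


def list_states(name_filter):
    """Yield (pid, comm, state) for processes whose name contains name_filter."""
    try:
        entries = os.listdir("/proc")
    except OSError:
        return  # no /proc here (not Linux), nothing to report
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as fh:
                comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or is unreadable; skip it
        if name_filter in comm:
            yield int(entry), comm, state


if __name__ == "__main__":
    # "mm" is a hypothetical filter; use whatever name top shows for the tasks.
    for pid, comm, state in list_states("mm"):
        print(pid, comm, state, STATE_NAMES.get(state, "other"))
```

Running it a few times in a row (or in a loop with a short sleep) should show the task processes moving between states; T is the state that corresponds to a stopped process.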

We tested % CPU time on our local Linux test box running BOINC client 7.4.42 and the % CPU time is functioning normally.

I would recommend either setting the CPU time to a lower value, like 50%, or reducing the number of cores you allow to run at the same time, and then adjusting from there based on the temperatures you observe.
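To put numbers on the run/pause split described above, here is a small Python sketch. It assumes a fixed 4-second throttle period, which is only inferred from the 2.67 s + 1.33 s figures in this post; the real client may use a different interval. The function name duty_cycle is hypothetical.

```python
def duty_cycle(percent, period_s=4.0):
    """Split an assumed throttle period into (run seconds, pause seconds)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    run_s = period_s * percent / 100.0
    return run_s, period_s - run_s


if __name__ == "__main__":
    # Compare the original setting with the lower values suggested in the thread.
    for pct in (66.66, 50.0, 33.33):
        run_s, pause_s = duty_cycle(pct)
        print(f"{pct:6.2f}% -> run {run_s:.2f} s, pause {pause_s:.2f} s")
```

Under that assumption, 66.66% works out to roughly 2.67 s running and 1.33 s paused, matching the figures above. Note that while a task is in its run phase it still uses a full core, so the setting lowers the average load rather than the peak.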

Thanks for your support and happy crunching,

Brandon

exalpha
Joined: 23 Mar 14
Posts: 23
Credit: 505,507
RAC: 0
Message 3847 - Posted: 2 Aug 2016, 1:06:30 UTC - in response to Message 3846.
Last modified: 2 Aug 2016, 1:06:30 UTC

I am running BOINC 7.2.42 x64. I don't know whether that version might be the reason MM is not obeying the % of CPU time setting. I looked for a later version but could not find one for Linux. There must be one, because you tested on a later version. Where do I go to get that?
I have a hardware sensor program called Psensor that shows % of CPU use and temperature. When all the running work units are MM, it shows 100% CPU usage and a temperature of 62 °C. When only WCG projects are running, it shows about the same % of CPU time as set in Computing Preferences and a temperature of about 55 °C.
____________
xalpha

Profile Nathan Neitman
Project administrator
Project developer
Project tester
Joined: 23 Jan 15
Posts: 5
Credit: 204,156
RAC: 0
Message 3848 - Posted: 2 Aug 2016, 14:54:26 UTC - in response to Message 3847.
Last modified: 2 Aug 2016, 14:54:26 UTC

We made a mistake when we told you the BOINC version: our test box is actually running the same version as yours (7.2.42 x64). So the issue does not appear to be the BOINC client version.

To rehash Brandon's previous post: BOINC does not throttle how hard your CPU works, the way you would throttle a car's engine by how hard you press the gas pedal. Instead it limits usage by restricting how long the BOINC process runs (or, to stay with the example, how long you keep the pedal pressed).

A possible case for why you're seeing the 100% CPU usage via Psensor is that pretty much all monitoring software works via snapshots of utilization. The utilization amount is decided by your OS based on how many operations it has queued up (really how busy it is going to be). At that moment of the snapshot you may see that BOINC could be well over your desired CPU usage, but if you have a graph over time instead of the snapshot you will see that the usage should be close to what you set (allowing some variance because the snapshots could potentially have an interval that falls in line with when the CPU is busy with BOINC processing).
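The sampling effect described above can be shown with a toy model in Python. It assumes a task that runs for 2.67 s out of every 4 s (the figures from earlier in this thread) and a monitor that takes instantaneous busy/idle snapshots at a fixed interval. This is only a sketch of the idea, not a description of how Psensor or any real monitor samples, and all names in it are hypothetical.

```python
def is_running(t_ms, run_ms=2670, period_ms=4000):
    """True if the throttled task is in its run phase at time t_ms."""
    return (t_ms % period_ms) < run_ms


def snapshot_readings(interval_ms, n_samples):
    """Instantaneous busy (1) or idle (0) readings taken every interval_ms."""
    return [1 if is_running(i * interval_ms) else 0 for i in range(n_samples)]


def average_utilization(readings):
    """Fraction of snapshots in which the task was busy."""
    return sum(readings) / len(readings)


if __name__ == "__main__":
    # Snapshots that fall in step with the 4 s cycle always land in the
    # run phase, so every reading says "busy".
    aligned = snapshot_readings(interval_ms=4000, n_samples=100)
    # Snapshots every 0.1 s cover the whole cycle, so the average
    # approaches the configured share of CPU time.
    fine = snapshot_readings(interval_ms=100, n_samples=4000)
    print("aligned sampling:", average_utilization(aligned))
    print("fine sampling:   ", average_utilization(fine))
```

If a graph sampled well below the cycle length still shows near 100% over a long period, the snapshot effect alone would not explain it.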

A possible solution would be for you to dial back your CPU time setting to something like 40% or so, and seeing if that produces more favorable temperatures. If not, you should dial it back further.

P.S. Thank you for making me aware of Psensor, it looks really cool! :)

Cheers and happy crunching,
Nathan

exalpha
Joined: 23 Mar 14
Posts: 23
Credit: 505,507
RAC: 0
Message 3849 - Posted: 4 Aug 2016, 22:03:14 UTC
Last modified: 4 Aug 2016, 22:13:26 UTC

I did dial back my computing preferences. All the apps running in Boinc were MM. This is a screenshot:

This is a screenshot of CPU usage and temperature from Psensor.

As can be seen, the temperature is still high at 58 °C. The room where the computer is does not have a heater on; it is about 10 °C inside. That should help keep the temperature down, but it is still high. When only World Community Grid apps are running, Psensor shows about 55 °C with "Use at most 66% CPU time" and a room temperature of about 18 °C.
To open the screenshots you might have to right-click and open them in a new window or tab. The indicator to look at is "CPU Socket Temperature", and there are also graphs.
____________
xalpha

exalpha
Joined: 23 Mar 14
Posts: 23
Credit: 505,507
RAC: 0
Message 3850 - Posted: 5 Aug 2016, 5:59:58 UTC - in response to Message 3848.
Last modified: 5 Aug 2016, 6:03:47 UTC

A possible case for why you're seeing the 100% CPU usage via Psensor is that pretty much all monitoring software works via snapshots of utilization. The utilization amount is decided by your OS based on how many operations it has queued up (really how busy it is going to be). At that moment of the snapshot you may see that BOINC could be well over your desired CPU usage, but if you have a graph over time instead of the snapshot you will see that the usage should be close to what you set (allowing some variance because the snapshots could potentially have an interval that falls in line with when the CPU is busy with BOINC processing).

If you look at the light brown line in the Psensor graph, it shows near 100% usage over time, even though "Use at most" is now set to 33.33%, down from 66.66%.

A possible solution would be for you to dial back your CPU time setting to something like 40% or so, and seeing if that produces more favorable temperatures. If not, you should dial it back further.

I did dial it back down to 33.33%. Even that makes no difference.

exalpha
Joined: 23 Mar 14
Posts: 23
Credit: 505,507
RAC: 0
Message 3851 - Posted: 5 Aug 2016, 11:42:26 UTC
Last modified: 5 Aug 2016, 11:42:26 UTC

Another thing I have found: the WUs are not suspending according to the hours set for computing in the computing preferences. Even after the time to suspend, there is a lot of activity. This is shown in the Processes tab of System Monitor. Here is a screenshot:

____________
xalpha

Profile Brandon
Project administrator
Project developer
Project tester
Joined: 5 Jan 15
Posts: 247
Credit: 1,456,117
RAC: 0
Message 3852 - Posted: 5 Aug 2016, 15:47:03 UTC
Last modified: 5 Aug 2016, 15:47:03 UTC

Hello exalpha,

To see whether your BOINC client is functioning normally, can you please run this command:


top -d 0.2


That command runs top with the display refreshed every 0.2 seconds (five times a second), so that you can watch the BOINC processes change state. The S (process state) column should show either R, indicating that the process is running, or T, indicating that it has been stopped.


This way you will be able to see whether the task is being suspended or not.

Also, I will run tests using the restricted run time to make sure the work units are suspended properly.

Thanks for supporting us and happy crunching,

Brandon

Profile Brandon
Project administrator
Project developer
Project tester
Joined: 5 Jan 15
Posts: 247
Credit: 1,456,117
RAC: 0
Message 3853 - Posted: 5 Aug 2016, 15:51:04 UTC
Last modified: 5 Aug 2016, 15:51:04 UTC

Also,

Make sure that the option Run based on preferences has been selected in the BOINC Manager advanced view's Activity menu.

Best,

Brandon

exalpha
Joined: 23 Mar 14
Posts: 23
Credit: 505,507
RAC: 0
Message 3854 - Posted: 5 Aug 2016, 21:06:50 UTC
Last modified: 5 Aug 2016, 21:12:42 UTC

Top : top -d 0.2
Here is a screenshot of top running with this command, taken while I had the projects suspended on the Projects tab. In the screenshot boincmgr shows status S, but periodically the status goes to R.

Here is top showing mm processes running when they are supposed to be suspended. The mm processes always alternate between S and R status. The mm_wrapper processes show S, while the processes doing the actual mm work show R.

In activity menu I made sure "Run based on preferences" was selected.
____________
xalpha

Profile Brandon
Project administrator
Project developer
Project tester
Joined: 5 Jan 15
Posts: 247
Credit: 1,456,117
RAC: 0
Message 3856 - Posted: 8 Aug 2016, 14:44:16 UTC
Last modified: 8 Aug 2016, 14:44:16 UTC

Great, so if you are seeing the toggling of S and R that means the % CPU Time is functioning on your machine.

I am still looking into the suspension of tasks. I will let you know if I find anything interesting.

Profile Brandon
Project administrator
Project developer
Project tester
Joined: 5 Jan 15
Posts: 247
Credit: 1,456,117
RAC: 0
Message 3857 - Posted: 8 Aug 2016, 14:50:02 UTC
Last modified: 8 Aug 2016, 14:50:02 UTC

I just finished testing task suspension on our Linux test box, which runs the same BOINC client as yours, and task suspension is functioning normally.

I'll look at other BOINC forums to see whether anyone has had your issue before, and whether I can find another solution for you.

exalpha
Joined: 23 Mar 14
Posts: 23
Credit: 505,507
RAC: 0
Message 3858 - Posted: 9 Aug 2016, 11:33:07 UTC - in response to Message 3857.
Last modified: 9 Aug 2016, 11:33:07 UTC

I tried a complete reinstall of BOINC. Normally when I install BOINC I install it over the top of the existing installation, so as not to lose any work. This time I deleted the whole BOINC folder (located in my /home/user_name/) and then reattached all projects. That did not help. I think this problem might have started when I upgraded from Linux Mint 17.3 Cinnamon to Mint 18 Cinnamon.
____________
xalpha

Copyright © 2018 MindModeling.org