
[ale] fascinating data on temperature, including ATI / AMD Radeon gpu



One of the fun parts of temperature monitoring is that the sensors must be
calibrated. Most chips "know" their scale factors, but some are off a bit, so
the driver has to apply a correction. On a Linux system, you can feed a set of
scale-factor parameters to lm_sensors at startup. Tyan used to publish the
lm_sensors settings they had tested for best accuracy on their boards. I'm not
sure whether other makers do.
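
As a rough illustration of what such a scale-factor setting looks like: in
current lm-sensors the correction is written as a "compute" statement in
/etc/sensors3.conf or a file under /etc/sensors.d/. The sketch below only
prints such a statement for a linear correction. The function names, the chip
pattern "example-isa-*", and the numbers 1.05 and 2 are made up for
illustration; real values have to come from your board vendor or your own
measurements.

```shell
#!/bin/sh
# Print an lm-sensors "compute" statement for a linear correction:
#   corrected = raw * scale + offset
# The second expression is the inverse mapping (corrected back to raw).
# Arguments: feature name, scale, offset. Assumes a non-negative offset.
make_compute_line() {
    feature=$1
    scale=$2
    offset=$3
    printf 'compute %s @*%s+%s, (@-%s)/%s\n' \
        "$feature" "$scale" "$offset" "$offset" "$scale"
}

# Print a complete chip section. The chip pattern is a placeholder; the
# real chip name is whatever "sensors" reports for your hardware.
make_chip_section() {
    printf 'chip "%s"\n' "$1"
    printf '    '
    make_compute_line "$2" "$3" "$4"
}

# Example (hypothetical values): make_chip_section "example-isa-*" temp1 1.05 2
```

The printed text would be saved as a file under /etc/sensors.d/ and is picked
up the next time "sensors" runs.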


On Sun, Apr 21, 2013 at 12:38 AM, Ron Frazier (ALE) <
atllinuxenthinfo at techstarship.com> wrote:

> Hi all,
>
> The topic of monitoring temperatures in a PC comes up here periodically.
>  As I mentioned in other threads, I've been working with graphics cards on
> a Mint installation for cryptocurrency computations.  As you may know from
> my previous posts, I've always wanted to keep an eye on the status of my
> systems.  In the process of working with this project, I've discovered a
> number of interesting pieces of information that I thought I'd share.
>
> Take a look at this image:
>
> https://dl.dropboxusercontent.com/u/9879631/sensors-sample1.png
>
> This shows a part of my screen on my Mint system.  Note my Gnome panel at
> the top with a temperature monitor on it.  This is the hardware monitor
> widget that is available in Gnome.  However, when I installed the ATI / AMD
> graphics drivers, the sensor system was no longer able to monitor the cpu.
>  After a bit of googling, I was directed to lm-sensors.  Many of you are
> already aware of that.  I tried this command.
>
> --> sudo apt-get install lm-sensors
>
> I found that it was already installed.
>
> I then found and issued these two commands to reinitialize the system.
>
> --> sudo sensors-detect
>
> I accepted the defaults here then told it to save the changes.
>
> --> sudo service module-init-tools start
>
> I think that allowed the changes to take effect without a reboot.
>
> This allowed the sensor system to work again, and my panel widgets to read
> both the cpu temperature and the hard drive temperatures as shown in the
> image.
>
> You can use this command to read the sensors once in a terminal window.
>
> --> sensors
>
> This command will read the sensors every few seconds and display the
> results continuously.
>
> --> watch sensors
>
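If you want the numbers in a script rather than on screen, the kernel also
exposes the same readings as plain files. This is a minimal sketch, assuming
the temp*_input files sit directly under each hwmonN directory (on some older
kernels they are one level down, under hwmonN/device). The function name is
made up for illustration.

```shell
#!/bin/sh
# Print every temperature input found under a hwmon tree, in degrees C.
# The kernel reports these values as integer millidegrees Celsius.
# $1 optionally overrides the root directory (default /sys/class/hwmon).
read_hwmon_temps() {
    root=${1:-/sys/class/hwmon}
    for f in "$root"/hwmon*/temp*_input; do
        [ -r "$f" ] || continue      # skip when the glob matches nothing
        milli=$(cat "$f")
        printf '%s ' "$f"
        # Convert millidegrees to degrees with one decimal place.
        awk -v m="$milli" 'BEGIN { printf "%.1f\n", m / 1000 }'
    done
}

# Usage: read_hwmon_temps
```
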
> I searched for a while for a utility to read the gpu temperatures and
> came up empty.  Then I discovered that it's built into the ATI / AMD
> driver.  I don't know how to do this with nvidia cards.
>
> The following command will read the clock speed and load on the first gpu.
>
> --> aticonfig --adapter=0 --od-getclocks
>
> The following command will read and display the results continuously.
>
> --> watch aticonfig --adapter=0 --od-getclocks
>
> The following command will read the temperature of the first gpu.
>
> --> aticonfig --adapter=0 --odgt
>
> The following command will read and display the results continuously.
>
> --> watch aticonfig --adapter=0 --odgt
>
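To use the temperature in a script, the number has to be pulled out of the
aticonfig text. A minimal sketch follows, assuming the driver prints a line
of the form "Sensor 0: Temperature - 73.00 C". That format is an assumption
and may differ between driver versions, so check your own output first. The
function name parse_odgt is made up.

```shell
#!/bin/sh
# Extract the numeric temperature from "aticonfig --odgt" style text on stdin.
# Assumes a line such as "Sensor 0: Temperature - 73.00 C": the value is
# taken to be the field right after the lone "-" on the Temperature line.
parse_odgt() {
    awk '/Temperature/ {
        for (i = 1; i <= NF; i++)
            if ($i == "-") { print $(i + 1); exit }
    }'
}

# Usage (not run here): aticonfig --adapter=0 --odgt | parse_odgt
```
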
> Once I found this out, I modified my mining program to add a temperature
> status window for each gpu so I could keep an eye on the temperature.  This
> script file shows how I did it.
>
> https://dl.dropboxusercontent.com/u/9879631/start-miners
>
> If you look at these images, I also discovered something very interesting.
>  The first one is the same as the one mentioned above, including the
> temperature readings of the GPU's on my Mint machine.  The second is an
> image of the temperature readings of the GPU's on my Windows machine.
>
> https://dl.dropboxusercontent.com/u/9879631/sensors-sample1.png
> https://dl.dropboxusercontent.com/u/9879631/sensors-sample2.png
>
> All the gpu's are being run at close to 100% load, and the cases of both
> computers are well ventilated with multiple fans.
>
> Look at the Miner 1 temperature window in image 1.  This is an MSI 7850
> gpu running in the Mint machine.  It's running at 73 deg C.
>
> Now, look at the right hand window in image 2.  This is an IDENTICAL MSI
> 7850 gpu running in the Windows machine.  It's running at 62 deg C.
>
> Like I said, they're identical cards running in almost identical
> conditions.  So why is one running 11 degrees hotter than the other?
>
> This was puzzling me for a while but I think I've figured it out.
>
> In the Linux machine, the MSI card is the TOP one in the chassis.  That
> means its intake fan is right next to the 2nd gpu, with only about 1/8" of
> space between.  So, its airflow is very restricted.  That's the card
> that's running hotter.
>
> In the Windows machine, the MSI card is the SECOND one in the chassis.  It
> has several inches of air gap to the next object.  It's the one that is
> running cooler.
>
> Now look at each image and compare the readings for each card within the
> same computer.
>
> In image 1, the Mint machine, Miner 1, the top card, is at 73 deg C.
>  Miner 2, the bottom card, is at 57 deg C.
>
> In image 2, the Windows machine, the left window is an Asus 7850 card, and
> is the top card.  It's at 75 deg C.  The right window, the MSI card, is in
> the bottom slot.  It's running at 62 deg C.
>
> So, in one case, the top card is running 16 degrees hotter.  In the other
> case, the top card is running 13 degrees hotter.
>
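The differences quoted above can be checked with a few lines of shell
arithmetic, using only the readings given in the message. The helper name
delta is made up.

```shell
#!/bin/sh
# Difference between two whole-degree readings (first minus second).
delta() { echo $(( $1 - $2 )); }

echo "Mint, top vs bottom card:    $(delta 73 57) deg C"
echo "Windows, top vs bottom card: $(delta 75 62) deg C"
echo "MSI card, Mint vs Windows:   $(delta 73 62) deg C"
```
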
> Based on this, I am convinced that any gpu or other card with its own fan
> on the side will run substantially hotter than its baseline temperature if
> it's next to another card.
>
> I'm not quite sure what to do about it.  I think 75 deg C is OK, but not
> great.  For what it's worth, I think my AMD cpu's are rated at about 67 deg
> C.  Apparently, the gpu's have more tolerance.  You can see in image 2 that
> the fans on the gpu's in the Windows system are only running at about 40%
> of their max, assuming that GPU-Z is reading them right.  So, maybe the
> card is not too unhappy.  But, it may mean the card would be pushed over
> its thermal limits much faster if a case fan fails, or if the room ambient
> temperature rises too much.
>
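One way to guard against a failed fan or a hot room is to compare each
reading against a limit and raise a warning. This is a minimal sketch: the
function name temp_status is made up, and the limit of 80 in the usage
comment is a hypothetical number, not a vendor rating for these cards.

```shell
#!/bin/sh
# Classify a temperature reading against a warning limit.
# $1 = reading in degrees C, $2 = limit in degrees C (decimals allowed).
# Prints WARN when the reading is at or above the limit, otherwise ok.
temp_status() {
    awk -v t="$1" -v limit="$2" 'BEGIN {
        if (t + 0 >= limit + 0) print "WARN"; else print "ok"
    }'
}

# Usage sketch (not run here). It assumes the Temperature line ends in
# "<value> C", which may not hold for every driver version:
#   t=$(aticonfig --adapter=0 --odgt | awk '/Temperature/ { print $(NF - 1) }')
#   [ "$(temp_status "$t" 80)" = "WARN" ] && echo "GPU 0 is at $t C" >&2
```

Run from cron or a loop with sleep, this gives a crude alarm without having
to watch the status windows.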
> Anyway, I found this fascinating.  I guess I'll just have to keep a close
> eye on any PCI-E cards with fans which are jammed up against other cards.
>
> PS I think I was monitoring the wrong temperature for CPU on my desktop
> machine for years.  The MSI motherboards have a 2 digit led display on the
> board which monitors post codes and then temperature once the machine is
> running.  I was monitoring the sensor that matched that reading.  When I
> ran the AMD Overdrive utility, it came up with a different, lower, number
> for CPU temperature, so I started monitoring that instead.  I no longer
> know exactly which temperature the motherboard display is monitoring.
>
> PPS I took some of the text in this email from the Linux machine to the
> Windows machine to write the email.  When I tried to open it up in notepad,
> I just got one long line of text with no breaks, since Windows has
> different line breaks.  However, I found out that I could open it in
> Wordpad and it worked OK.  Then, I could copy it into this email.
>
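The line-break problem can also be fixed on the Linux side before the file
is moved: Linux ends lines with LF, while Notepad of that era expected CRLF.
The sketch below is a small awk filter that does the conversion; tools such
as unix2dos do the same job. The function name and the file names in the
usage comment are made up.

```shell
#!/bin/sh
# Convert Unix (LF) line endings on stdin to Windows (CRLF) on stdout.
# A carriage return already at the end of a line is stripped first, so
# running the filter twice does not double them up.
lf_to_crlf() {
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}

# Usage (not run here): lf_to_crlf < notes.txt > notes-windows.txt
```
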
> Let me know what your experiences have been monitoring and controlling
> temperature.
>
> Hope this is helpful.
>
> Sincerely,
>
> Ron
>
>
> --
>
> (PS - If you email me and don't get a quick response, you might want to
> call on the phone.  I get about 300 emails per day from alternate energy
> mailing lists and such.  I don't always see new email messages very
> quickly.)
>
> Ron Frazier
> 770-205-9422 (O)   Leave a message.
> linuxdude AT techstarship.com
> Litecoin: LZzAJu9rZEWzALxDhAHnWLRvybVAVgwTh3
> Bitcoin: 15s3aLVsxm8EuQvT8gUDw3RWqvuY9hPGUU
>
> _______________________________________________
> Ale mailing list
> Ale at ale.org
> http://mail.ale.org/mailman/listinfo/ale
> See JOBS, ANNOUNCE and SCHOOLS lists at
> http://mail.ale.org/mailman/listinfo
>



-- 
James P. Kinney III
Every time you stop a school, you will have to build a jail. What you gain
at one end you lose at the other. It's like feeding a dog on his own tail.
It won't fatten the dog.
- Speech 11/23/1900 Mark Twain
http://electjimkinney.org
http://heretothereideas.blogspot.com/