
[ale] fascinating data on temperature, including ATI / AMD Radeon gpu

Hi all,

The topic of monitoring temperatures in a PC comes up here 
periodically.  As I mentioned in other threads, I've been working with 
graphics cards on a Mint installation for cryptocurrency computations.  
As you may know from my previous posts, I've always wanted to keep an 
eye on the status of my systems.  In the process of working with this 
project, I've discovered a number of interesting pieces of information 
that I thought I'd share.

Take a look at this image:


This shows a part of my screen on my Mint system.  Note my Gnome panel 
at the top with a temperature monitor on it.  This is the hardware 
monitor widget that is available in Gnome.  However, when I installed 
the ATI / AMD graphics drivers, the sensor system was no longer able to 
monitor the cpu.  After a bit of googling, I was directed to 
lm-sensors.  Many of you are already aware of that.  I tried this command.

--> sudo apt-get install lm-sensors

I found that it was already installed.

I then found and issued these two commands to reinitialize the system.

--> sudo sensors-detect

I accepted the defaults here then told it to save the changes.

--> sudo service module-init-tools start

I think that allowed the changes to take effect without a reboot.

This allowed the sensor system to work again, and my panel widgets to 
read both the cpu temperature and the hard drive temperatures as shown 
in the image.

You can use this command to read the sensors once in a terminal window.

--> sensors

This command will rerun sensors every 2 seconds (the default interval 
for watch) and display the results continuously.

--> watch sensors

I searched for a while for a utility to read the gpu temperatures and 
found nothing.  Then I discovered that it's built into the ATI / AMD 
driver.  I don't know how to do this with NVIDIA cards.

The following command will read the clock speed and load on the first gpu.

--> aticonfig --adapter=0 --od-getclocks

The following command will read and display the results continuously.

--> watch aticonfig --adapter=0 --od-getclocks

The following command will read the temperature of the first gpu.

--> aticonfig --adapter=0 --odgt

The following command will read and display the results continuously.

--> watch aticonfig --adapter=0 --odgt
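
If you want just the number (for logging or a script) rather than the 
full output, something like the following might work.  This is only a 
sketch.  It ASSUMES the --odgt output contains a line of the form 
"Sensor 0: Temperature - 73.00 C"; check what your driver version 
actually prints.  The function name gpu_temp_from_odgt is made up for 
this example.

```shell
# Extract the numeric temperature from `aticonfig --odgt` output on stdin.
# ASSUMPTION: the output contains a line like
#   "Sensor 0: Temperature - 73.00 C"
# If your driver prints a different format, adjust the pattern.
gpu_temp_from_odgt() {
    sed -n 's/.*Temperature - \([0-9][0-9.]*\) C.*/\1/p'
}

# Usage (on a machine with the driver installed):
#   aticonfig --adapter=0 --odgt | gpu_temp_from_odgt
```

Nothing is printed if no matching line is found, so a script can treat 
empty output as "could not read the sensor".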

Once I found this out, I modified my mining program to add a temperature 
status window for each gpu so I could keep an eye on the temperature.  
This script file shows how I did it.
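
For anyone who wants to try something similar, here is one possible 
way to open a temperature window per gpu.  This is NOT the script 
mentioned above, just a minimal sketch.  It assumes xterm is installed 
and that the adapters are numbered 0, 1, and so on.  The function names 
and the 5 second interval are my own choices for illustration.

```shell
# Print the command that shows one GPU's temperature continuously.
# $1 = adapter number, $2 = refresh interval in seconds (default 5).
gpu_temp_watch_cmd() {
    adapter=$1
    interval=${2:-5}
    printf 'watch -n %s aticonfig --adapter=%s --odgt\n' "$interval" "$adapter"
}

# Open one xterm window per adapter number given as an argument.
# ASSUMPTION: xterm is available; any terminal emulator with an
# "execute this command" option could be substituted.
open_gpu_temp_windows() {
    for adapter in "$@"; do
        xterm -T "GPU $adapter temperature" \
              -e sh -c "$(gpu_temp_watch_cmd "$adapter")" &
    done
}

# Usage: open_gpu_temp_windows 0 1
```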


If you look at these images, you'll see something else very 
interesting that I discovered.  The first one is the same as the one 
mentioned above, including the temperature readings of the GPUs on my 
Mint machine.  The second is an image of the temperature readings of 
the GPUs on my Windows machine.


All the gpus are being run at close to 100% load, and the cases of both 
computers are well ventilated with multiple fans.

Look at the Miner 1 temperature window in image 1.  This is an MSI 7850 
gpu running in the Mint machine.  It's running at 73 deg C.

Now, look at the right hand window in image 2.  This is an IDENTICAL MSI 
7850 gpu running in the Windows machine.  It's running at 62 deg C.

Like I said, they're identical cards running in almost identical 
conditions.  So why is one running 11 degrees hotter than the other?

This was puzzling me for a while but I think I've figured it out.

In the Linux machine, the MSI card is the TOP one in the chassis.  
That means its intake fan is right next to the 2nd gpu, with only about 
1/8" of space between.  So, its airflow is very restricted.  That's 
the card that's running hotter.

In the Windows machine, the MSI card is the SECOND one in the chassis.  
It has several inches of air gap to the next object.  It's the one that 
is running cooler.

Now look at each image and compare the readings for each card within the 
same computer.

In image 1, the Mint machine, Miner 1, the top card, is at 73 deg C.  
Miner 2, the bottom card, is at 57 deg C.

In image 2, the Windows machine, the left window is an Asus 7850 card, 
and is the top card.  It's at 75 deg C.  The right window, the MSI card, 
is in the bottom slot.  It's running at 62 deg C.

So, in one case, the top card is running 16 degrees hotter.  In the 
other case, the top card is running 13 degrees hotter.
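
To make the arithmetic explicit, here are the differences worked out 
from the readings quoted above.  The helper name temp_delta is just 
something made up for this example.

```shell
# Difference in whole degrees C between two readings ($1 minus $2).
temp_delta() {
    echo $(( $1 - $2 ))
}

temp_delta 73 57   # Mint machine: top card (MSI) vs bottom card
temp_delta 75 62   # Windows machine: top card (Asus) vs bottom card (MSI)
temp_delta 73 62   # MSI 7850 in top slot (Mint) vs bottom slot (Windows)
```

These print 16, 13, and 11 respectively.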

Based on this, I am convinced that any gpu or other card with its own 
fan on the side will run substantially hotter than its baseline 
temperature if it's next to another card.

I'm not quite sure what to do about it.  I think 75 deg C is OK, but not 
great.  For what it's worth, I think my AMD cpus are rated at about 67 
deg C.  Apparently, the gpus have more tolerance.  You can see in image 
2 that the fans on the gpus in the Windows system are only running at 
about 40% of their max, assuming that GPU-Z is reading them right.  So, 
maybe the card is not too unhappy.  But, it may mean the card would be 
pushed over its thermal limits much faster if a case fan fails, or if 
the room ambient temperature rises too much.
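
One way to guard against that would be a simple check that warns when 
a reading crosses a limit you choose.  The sketch below is only an 
illustration.  The function name temp_over_limit is made up, and the 
85 deg C figure in the usage line is a hypothetical threshold, NOT a 
published limit for these cards.

```shell
# Return success (exit status 0) if the reading is at or above the limit.
# $1 = temperature reading, e.g. "73.00" or "73"
# $2 = limit in whole degrees C
# The reading is truncated to whole degrees before comparing.
temp_over_limit() {
    reading=${1%%.*}
    limit=$2
    [ "$reading" -ge "$limit" ]
}

# Usage, with a HYPOTHETICAL limit of 85 deg C:
#   if temp_over_limit 86.50 85; then echo "WARNING: GPU is hot"; fi
```

Because the reading is truncated rather than rounded, 84.99 counts as 
84 and does not trigger a limit of 85.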

Anyway, I found this fascinating.  I guess I'll just have to keep a 
close eye on any PCI-E cards with fans which are jammed up against other 
cards.

PS I think I was monitoring the wrong temperature for CPU on my desktop 
machine for years.  The MSI motherboards have a 2-digit LED display on 
the board which shows POST codes and then temperature once the 
machine is running.  I was monitoring the sensor that matched that 
reading.  When I ran the AMD Overdrive utility, it came up with a 
different, lower, number for CPU temperature, so I started monitoring 
that instead.  I no longer know exactly which temperature the 
motherboard display is monitoring.

PPS I took some of the text in this email from the Linux machine to the 
Windows machine to write the email.  When I tried to open it up in 
Notepad, I just got one long line of text with no breaks, since Windows 
uses different line endings (CRLF) than Linux (LF).  However, I found 
out that I could open it in WordPad and it worked OK.  Then, I could 
copy it into this email.
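
Another option would be to convert the line endings on the Linux side 
before moving the file.  Here is a minimal sketch using awk.  The 
function name lf_to_crlf and the file names in the usage line are 
hypothetical.  It assumes the input uses plain Unix (LF) endings; 
running it on a file that already has CRLF endings would double up the 
carriage returns.

```shell
# Convert Unix line endings (LF) to Windows line endings (CRLF).
# Reads stdin, writes stdout.
# ASSUMPTION: the input does not already use CRLF line endings.
lf_to_crlf() {
    awk '{ printf "%s\r\n", $0 }'
}

# Usage (file names are placeholders):
#   lf_to_crlf < notes.txt > notes-windows.txt
```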

Let me know what your experiences have been monitoring and controlling 

Hope this is helpful.




(PS - If you email me and don't get a quick response, you might want to
call on the phone.  I get about 300 emails per day from alternate energy
mailing lists and such.  I don't always see new email messages very quickly.)

Ron Frazier
770-205-9422 (O)   Leave a message.
linuxdude AT techstarship.com
Litecoin: LZzAJu9rZEWzALxDhAHnWLRvybVAVgwTh3
Bitcoin: 15s3aLVsxm8EuQvT8gUDw3RWqvuY9hPGUU