
20030616: did something change at ULM? (cont.)



Adam,

>Date: Mon, 16 Jun 2003 15:54:48 -0500
>From: address@hidden
>Organization: University of Louisiana at Monroe
>To: Steve Emmerson <address@hidden>
>Subject: Re: 20030616: did something change at ULM? (cont.) 

The above message contained the following:

> ok, At 6:00pm CST I am going to take the system down(once i made sure
> you guys are off) and check the CPU's.  Hey, it worth a shot.  Maybe
> over the last 2 years the heatsinks on the two CPU's has warped some
> and is not making as good contact as before.  It should be back up by
> 6:15 or 6:30 at the latest.  I will restart the ldm at that time if it
> is ok with you and see how it works over the night.

Hold off.  We activated some loadable kernel modules that allow you to
monitor CPU temperatures (and other things).  You can check them
anytime you want, e.g.,

    $ /usr/bin/sensors
    adm1021-i2c-0-18
    Adapter: SMBus I801 adapter at ccd0
    Algorithm: Non-I2C SMBus adapter
    temp:        +41°C  (min =  +20°C, max =  +60°C)
    remote_temp:
                 +32°C  (min =  +20°C, max =  +60°C)
    die_code:    4

    adm1021-i2c-0-4c
    Adapter: SMBus I801 adapter at ccd0
    Algorithm: Non-I2C SMBus adapter
    temp:        +42°C  (min =  +20°C, max =  +60°C)
    remote_temp:
                 +33°C  (min =  +20°C, max =  +60°C)
    die_code:    4

    ds1780-i2c-0-2c
    Adapter: SMBus I801 adapter at ccd0
    Algorithm: Non-I2C SMBus adapter
    2.5V:      +2.51 V  (min =  +2.22 V, max =  +2.72 V)   
    Vccp1:     +1.70 V  (min =  +2.40 V, max =  +2.93 V)   ALARM
    3.3V:      +1.49 V  (min =  +2.93 V, max =  +3.59 V)   ALARM
    5V:        +5.02 V  (min =  +4.45 V, max =  +5.44 V)   
    12V:      +11.93 V  (min = +10.68 V, max = +13.06 V)   
    Vccp2:     +1.68 V  (min =  +2.40 V, max =  +2.93 V)   ALARM
    fan1:        0 RPM  (min = 3000 RPM, div = 2)          ALARM
    fan2:        0 RPM  (min = 3000 RPM, div = 2)          ALARM
    temp:      +37.5°C  (limit =  +60°C, hysteresis =  +50°C) 
    vid:      +1.70 V
    alarms:   Chassis intrusion detection                  ALARM

    ds1780-i2c-0-2d
    Adapter: SMBus I801 adapter at ccd0
    Algorithm: Non-I2C SMBus adapter
    2.5V:      +1.25 V  (min =  +2.22 V, max =  +2.72 V)   ALARM
    Vccp1:     +1.47 V  (min =  +2.40 V, max =  +2.93 V)   ALARM
    3.3V:      +3.33 V  (min =  +2.93 V, max =  +3.59 V)   
    5V:        +3.33 V  (min =  +4.45 V, max =  +5.44 V)   ALARM
    12V:      +11.93 V  (min = +10.68 V, max = +13.06 V)   
    Vccp2:     +1.78 V  (min =  +2.40 V, max =  +2.93 V)   ALARM
    fan1:        0 RPM  (min = 3000 RPM, div = 2)          ALARM
    fan2:        0 RPM  (min = 3000 RPM, div = 2)          ALARM
    temp:      +46.5°C  (limit =  +60°C, hysteresis =  +50°C) 
    vid:      +1.95 V
    alarms:   Chassis intrusion detection                  ALARM

    eeprom-i2c-0-50
    Adapter: SMBus I801 adapter at ccd0
    Algorithm: Non-I2C SMBus adapter

    eeprom-i2c-0-51
    Adapter: SMBus I801 adapter at ccd0
    Algorithm: Non-I2C SMBus adapter

    eeprom-i2c-0-52
    Adapter: SMBus I801 adapter at ccd0
    Algorithm: Non-I2C SMBus adapter

    eeprom-i2c-0-53
    Adapter: SMBus I801 adapter at ccd0
    Algorithm: Non-I2C SMBus adapter

The temperature of your 2nd CPU (46.5 C) is getting up there relative
to the 60 C limit, but both CPUs are currently within operational
parameters.

If and when you take the machine down, you should check your BIOS
settings to see at what temperature the machine will shut off.

We're going to set up a crontab(1) entry for user "ldm" to run
/usr/bin/sensors every 5 minutes and append the relevant output, along
with a timestamp, to the log file ~/logs/sensors.log.
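
For illustration only, here is a minimal sketch of what such a job
could look like.  It assumes that "relevant output" means just the
temperature lines.  The script name and path (log_sensors.sh under
~/bin), the function names, the "--run" argument, and the timestamp
format are all hypothetical; the "*/5" step syntax assumes a
Vixie-style cron.

```shell
#!/bin/sh
# log_sensors.sh (hypothetical name): append temperature readings from
# sensors(1) to a log file, one timestamped record per reading.
#
# Possible crontab entry for user "ldm" (path is an assumption):
#   */5 * * * * /bin/sh $HOME/bin/log_sensors.sh --run

# Read sensors(1) output on stdin and print "<timestamp> <chip> <label> <value>"
# for each temperature line.  $1 is the timestamp to use.
filter_temps() {
    awk -v ts="$1" '
        /-i2c-/          { chip = $1; next }    # chip header, e.g. adm1021-i2c-0-18
        /^ *[a-z_]+: *$/ { label = $1; next }   # label alone; its value is on the next line
        /C +\((min|limit) =/ {                  # temperature lines only (voltages end in "V")
            if ($1 ~ /:$/) { label = $1; value = $2 } else { value = $1 }
            print ts, chip, label, value
        }'
}

# Take one reading and append it to the log.
log_sensors() {
    mkdir -p "$HOME/logs"
    /usr/bin/sensors | filter_temps "$(date '+%Y-%m-%dT%H:%M:%S')" \
        >> "$HOME/logs/sensors.log"
}

# Only touch the hardware when asked to, so the functions can be tested alone.
if [ "${1:-}" = "--run" ]; then
    log_sensors
fi
```

Keeping the chip name on every record makes it easy to grep the log
for a single CPU and watch its temperature over the night.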

Regards,
Steve Emmerson