Troubleshooting – How to Tweak Your PC to Unleash Its Power

Chapter 10: Troubleshooting

Troubleshooting Overview

The need for system troubleshooting arises when an overclocking attempt is not successful. In the world of overclocking, the word successful is subjectively defined. Success, or lack thereof, is usually measured by whether or not the system boots and whether or not it remains stable over time. Symptoms of system instability include software applications that freeze, crash (with or without an error message), or otherwise produce unexpected and unwelcome results. Some users are content with a system that crashes once a day. Others require complete system stability for weeks at a time before they are satisfied. Whatever “success” means to you, it is possible to turn overclocking failures into successes by understanding what went wrong and taking the right steps to correct problems.

Overclocking can create system stability problems that would not occur at default settings. Deviations from normal (or default) temperatures, voltages, and operating frequencies of various system components can lead to instability, or even failure, after prolonged operation. To correct such problems, or to fix a system that will not boot after an overclocking attempt, it may be necessary to restore the system to its default configuration, or at least to a more stable configuration. Ways to do this include clearing the computer’s CMOS data via a motherboard jumper, changing motherboard dipswitches or jumpers back to defaults, or reflashing the BIOS to force default settings. The risk of device failure is minimal if you exercise care and patience during overclocking. System instability is the larger concern. Nevertheless, failure or damage to hardware is a possibility. Troubleshooting can shed light on the causes of instability and help you get the kinks out of any overclocking attempt.

Proper Cooling and Thermal Monitoring

The main cause of system instability or damage that arises from overclocking is poor cooling, because increased operating speeds produce higher thermal loads. The excess heat must be adequately dissipated, usually by means of a heatsink combined with a forced-air fan cooler. To know whether or not your system is being cooled sufficiently, you must monitor its operating temperatures.

Most quality motherboards permit onboard temperature monitoring through a variety of sensors mounted in strategic locations. The most valuable sensor point is underneath the base of the processor inside the socket interface (if there is one) or adjacent to the processor on slot-interface boards. Some architectures offer an additional layer of security by embedding a thermal sensing circuit in the processor’s die substrate layers. A chipset or third-party circuit then actively monitors the hardware, passing temperatures and other relevant data to the CMOS for analysis and display on the BIOS Setup’s user interface. Software applications can also extract temperature data from the sensor circuitry and move it directly to the operating system for real-time monitoring.

Most motherboards that offer thermal monitoring also support user-defined temperature limits to safeguard against damage to the processor and other components. Suggested maximums vary with processor designs. Exceeding temperature limits may trigger a warning or cause the system to shut down. Although stability problems can occur even at low temperatures, you should avoid exceeding 60° Celsius when overclocking any current generation processor.

Maximum recommended operating temperatures for a variety of processors can be found in Chapters 5 and 6. Just as you should establish a performance baseline when benchmarking your system, you should also establish a processor or system case temperature baseline before overclocking. Of course, you can be quite confident that processor and system case temperatures will increase when your processor is overclocked. Instability will not necessarily follow increased temperature, but it is important to know where your system started out. Such knowledge will make troubleshooting easier down the road.
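The baseline idea above can be sketched in a few lines of code. This is purely illustrative: the sensor values would come from your BIOS Setup screen or a monitoring utility, and the function names and sample figures here are hypothetical.

```python
# Hypothetical sketch: compare a recorded temperature baseline against
# readings taken after overclocking. Values are entered by hand from
# whatever monitoring tool the motherboard provides.

def temperature_delta(baseline, current):
    """Return the per-sensor temperature change in degrees Celsius."""
    return {name: current[name] - baseline[name]
            for name in baseline if name in current}

def over_limit(current, limit_c=60.0):
    """List sensors exceeding the suggested 60-degree ceiling."""
    return [name for name, temp in current.items() if temp > limit_c]

baseline = {"cpu": 38.0, "case": 27.0}   # readings at default clock speed
current  = {"cpu": 52.5, "case": 31.0}   # readings after overclocking

print(temperature_delta(baseline, current))  # {'cpu': 14.5, 'case': 4.0}
print(over_limit(current))                   # [] -- still under the ceiling
```

Keeping a log like this across overclocking attempts gives you the history the surrounding text recommends: you can see exactly how far each speed increase pushed the thermal load.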

Overclocking tends to be a trial-and-error process. The more data you can gather about your system, the better. If an overclocking attempt is successful, in the sense that the system remains stable, record the processor's temperature during that attempt. If you later overclock your processor to a higher speed and instability results, you'll know what temperature to target, via improved cooling, to restore stability.

Processor Voltage

An increase in processor speed sometimes requires an increase in processor voltage to improve stability or to make a system boot at all. Unfortunately, increasing voltage is perhaps the most dangerous aspect of overclocking. Next to excessive temperatures, it is the quickest way to kill your processor. Increasing the voltage improves a processor’s ability to operate at higher speeds, but take care: increase voltage levels by tiny increments and monitor every aspect of system operation— temperature, performance, and stability—after each increase. High voltage means a greater likelihood of electromigration, which can destroy a processor’s circuit pathways. Although rare, electromigration is impossible to measure or predict. Unless your processor is expendable, the best rule of thumb is to deviate from the processor’s default voltage as little as possible. Slight changes in voltage are usually well tolerated, whereas large changes are more troublesome. As with all overclocking attempts, quality cooling should be in place before you raise processor voltage.

Bus Overclocking: Drives

Increased stress on system components, due to overclocking via the front-side bus, often generates instability. Components built with low-quality manufacturing or design standards are most likely to suffer, though even the best components have limits. Hard drives, video cards, and memory are the most susceptible to instability introduced by overclocking.

Overclocking the front-side bus affects the operating speeds of all system buses, as discussed in Chapter 4. Each bus rate is affected differently, according to the front-side bus speed. The exact deviation in operating speed depends on the chipset used in each motherboard's specific design. In general, all system bus speeds increase as front-side bus rates increase, although most x86 architectures employ multiplier division to derive bus speeds. Devices attached to each of the various buses must be able to sustain operation at the extended speeds introduced through the overclocking process.
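Multiplier division can be pictured with a short calculation. The divider values below are typical examples (a 1/3 divider for PCI and 2/3 for AGP at a 100-MHz front-side bus), not the settings of any specific chipset; many chipsets switch to different dividers at higher front-side bus rates.

```python
# Illustrative sketch of multiplier division: peripheral bus rates are
# derived from the front-side bus through fixed dividers. The divider
# values are typical examples, not taken from a specific chipset.

FSB_DIVIDERS = {
    "pci": 1 / 3,   # yields 33 MHz at a 100-MHz front-side bus
    "agp": 2 / 3,   # yields 66 MHz at a 100-MHz front-side bus
}

def derived_bus_speeds(fsb_mhz):
    """Compute each peripheral bus rate (MHz) from the front-side bus."""
    return {bus: round(fsb_mhz * div, 1) for bus, div in FSB_DIVIDERS.items()}

print(derived_bus_speeds(100))  # {'pci': 33.3, 'agp': 66.7}
print(derived_bus_speeds(133))  # {'pci': 44.3, 'agp': 88.7}
```

The second line shows why front-side bus overclocking ripples outward: with the dividers fixed, a 133-MHz front-side rate pushes PCI well past its 33-MHz specification, which is exactly the stress on drives and video cards discussed below.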

Drive arrays are the most susceptible to problems in an overclocked system. Loss of data integrity is a real concern. PCI bus speeds that exceed the normal specification of 33 MHz can cause problems, especially above 40 MHz. Drives based on the IDE standard are most likely to suffer data corruption, while SCSI-based devices are less worrisome.

Decreasing the drive controller’s transfer rates through the BIOS Setup interface will often prevent data corruption problems. You’ll have to experiment to find optimum stability, but most drive problems cease when the controller’s transfer rate is lowered by one or two levels. For example, ATA/66 DMA 4 could be taken down to ATA/33 DMA 2. While you can lower drive transfer rates without devastating performance, you should never disable direct memory access (DMA) transfers in current generation systems. DMA bypasses the processor during data transfers; this valuable feature should be enabled at all times.
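The step-down described above (ATA/66 DMA mode 4 to ATA/33 DMA mode 2) can be seen in the standard Ultra DMA mode table. The table values are the specified maximum burst rates; the helper function is a hypothetical illustration of lowering the mode by one or two levels.

```python
# Standard Ultra DMA transfer modes and their maximum burst rates (MB/s).
# Stepping the controller down a level or two trades peak throughput for
# signal margin on an overclocked PCI bus.

UDMA_MODES = {0: 16.7, 1: 25.0, 2: 33.3, 3: 44.4, 4: 66.7, 5: 100.0}

def step_down(mode, levels=2):
    """Lower a UDMA mode by the given number of levels (ATA/66 -> ATA/33)."""
    return max(0, mode - levels)

print(step_down(4))              # 2 -- ATA/66 DMA 4 taken down to DMA 2
print(UDMA_MODES[step_down(4)])  # 33.3 MB/s maximum burst rate
```

Note that these are burst figures; sustained throughput from the drive mechanism is usually far lower, which is why a one- or two-level reduction often costs little real-world performance.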

Examine any changes in performance that result when drive transfer rates are lowered. System performance could be degraded due to decreases in drive data throughput. Bandwidth gained through overclocking the bus rate may be negated by reduced signaling rates. You must maintain a balance between maximum operating frequency and available bandwidth. A quality benchmarking utility, such as SiSoft Sandra, will help you assess drive bandwidth as you make overclocking decisions.

Bus Overclocking: Graphics Accelerators

Graphics accelerators can be very susceptible to bus overclocking, especially the older generations of AGP video cards. Based on a 66-MHz PCI bus design, AGP cards store graphical texture data in the system’s main memory. This improves 3D rendering by speeding up access to more graphical information than can be stored in the video card’s onboard memory. Subsequent revisions have extended the AGP standard to include an advanced data signaling technique, which transfers information at up to eight times the rate of the original specification.

The latest generation of video cards can usually sustain AGP speeds approaching 90 MHz, though earlier models often fail above 75 MHz. Lowering the effective data transfer rate can neutralize the stress of extended speeds. AGP 4x cards must often be lowered to 2x for successful operation. In addition, disabling side-band transfers and fast-write capabilities can limit the effects of bus overclocking on stability in the AGP subsystem. AGP bus configuration can usually be altered from within a system’s BIOS interface, though some video cards feature onboard jumpers.

Most games and other 3D applications will see only mild performance losses from lowered transfer rates or disabled advanced AGP features. The latest graphics accelerators feature 64 to 128 megabytes of onboard memory, so the need to perform texturing operations in the system's main memory is reduced. Even the most advanced 3D games rarely demand more than 32 MB of graphics memory at display resolutions below 1024 x 768. Computer-aided design applications suffer the worst performance degradation, but professionals who use such software rarely resort to overclocking. They prize data integrity and system stability, and tend to obtain speed by upgrading rather than tweaking their systems.

Bus Overclocking: Memory

The memory subsystem is often the first to lose stability during front-side bus overclocking. The quality of the memory itself is important in determining overclocking potential. Generic memory should be avoided: the memory chip manufacturer often has no control over the production processes used to build generic modules. Small manufacturers purchase memory chips from brand-name corporations and mount them on low-quality printed circuit boards. These boards often fail quality testing at overclocked speeds because they are designed to meet only minimum quality standards.

Quality memory modules undergo production and testing within the same manufacturing environment. Large manufacturers, such as Micron’s Crucial Technology retail sales division, offer superior quality testing. Each Crucial memory module must meet the same stringent standards as Micron’s OEM modules. This high degree of vigilance usually produces an excellent chip.

Some vendors take advantage of the confusion surrounding memory quality. For example, Micron provides memory chips for sale to other manufacturers for assembly on a generic printed circuit board. Some vendors receive the generic memory, then attempt to sell it as a brand-name product because each individual chip is tagged with Micron's manufacturing and model codes. Users expect a quality product, but instead they receive a generic part that may not meet Micron quality assurance standards. Purchasing memory directly from large-scale manufacturers is the best way to avoid low-quality components, though some smaller manufacturers do have good testing procedures and warranties.

Type and performance ratings are the two most important things to consider when evaluating whether or not a memory module is suitable for an overclocked configuration. Better-quality PC-133 SDRAM and PC-2100 DDR memory modules can operate up to 33 MHz beyond their factory ratings with good stability. The high- frequency RAMBUS designs can often go 100 MHz beyond their rated speed. Older fast-page and extended data-out technologies are not as scalable, with overclocking potential less than 33 MHz. As with processors, the maximum overclocked speed depends on a number of design, production, and testing factors.

You may need to tweak memory timing values when overclocking the memory bus. Most motherboards allow memory timing rates to be defined in the BIOS Setup interface. While you must use trial and error to determine the best timing pattern, setting CAS (column address strobe) latency can be a valuable overclocking tool. CAS latency governs the delay, in memory bus cycles, between a read request and the moment data becomes available.

Most quality memory will feature a CAS latency of 2 for SDRAM, or up to 2.5 for DDR memory modules. Raising this value to 3 will often let memory modules that would otherwise fail at overclocked bus rates operate without problems. As latency increases, bandwidth decreases; however, the performance loss is often negligible, because overclocking the memory bus delivers additional bandwidth that offsets the added latency. Benchmark testing is required to determine the proper relationship among timing, latency, and operating frequency.
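A back-of-envelope calculation shows why the latency penalty is often small. The figures below are illustrative, comparing PC-100 SDRAM at CAS 2 against the same module overclocked to 133 MHz at CAS 3.

```python
# Back-of-envelope sketch: raising CAS latency costs access time, but a
# higher memory clock largely offsets it. Figures are illustrative only.

def cas_latency_ns(cas_cycles, bus_mhz):
    """Absolute CAS delay in nanoseconds at a given memory bus rate."""
    return cas_cycles / bus_mhz * 1000

# PC-100 at CAS 2 versus the same module overclocked to 133 MHz at CAS 3:
print(round(cas_latency_ns(2, 100), 1))  # 20.0 ns
print(round(cas_latency_ns(3, 133), 1))  # 22.6 ns
```

The absolute delay grows by only a couple of nanoseconds, while the raw bus bandwidth rises by a third, which is the trade-off the text describes; only benchmarking reveals how it nets out for a given workload.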

Many motherboards, especially those featuring non-Intel chipsets, give users the ability to specify custom bus rates correlated to the front-side bus. This can prove invaluable when overclocking the front-side bus. Most chipsets feature a fixed multiplier range for PCI, AGP, and other interconnect bus operations, but many chipsets skew the memory bus rate through an additive or subtractive process. The skew value is often derived from the rate of the PCI bus, which is 33 MHz at default clock operation.

Manipulating the memory bus is an important aspect of overclocking. Consider the Intel Pentium III e series, for example. This processor operates with a 100-MHz front-side bus. Many non-Intel motherboards allow the memory bus to be offset from the front-side bus by 33 MHz, so high-performance PC-133 SDRAM in asynchronous mode can replace the original synchronous default. The P3e processor can then be overclocked to a front-side bus rate of 133 MHz. If the user already has PC-133 memory, the memory bus can be configured for synchronous operation. With lower-quality PC-100 memory, however, the user can opt for an asynchronous 100-MHz memory bus (133 MHz minus 33 MHz), improving stability while still allowing the processor to reach the 133-MHz front-side rate.

The P3e system is only one example. Most chipsets support the skewing of memory bus rates, not only for overclocking but also to improve performance in other ways. In the P3e example with PC-100 SDRAM, assume the user wants to upgrade to PC-166 technology. After swapping out the old memory, he or she can use the additive asynchronous mode to maximize memory bandwidth: the front-side bus retains its 133-MHz overclocked rate, while the memory bus is skewed up to achieve 166 MHz.
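The additive and subtractive skew arithmetic from the Pentium III example can be summarized in one small function. The fixed ±33-MHz offset mirrors the default PCI rate, as the text notes; the function name is hypothetical.

```python
# Sketch of additive/subtractive memory-bus skew relative to the
# front-side bus, as in the Pentium III example above. The 33-MHz
# offset mirrors the default PCI bus rate.

def memory_bus(fsb_mhz, skew_mhz=0):
    """Memory bus rate given a front-side bus rate and an optional skew."""
    return fsb_mhz + skew_mhz

print(memory_bus(133, -33))  # 100 -- asynchronous mode for PC-100 memory
print(memory_bus(133))       # 133 -- synchronous PC-133 operation
print(memory_bus(133, +33))  # 166 -- additive mode with PC-166 memory
```

All three configurations keep the processor at the overclocked 133-MHz front-side rate; only the memory bus moves, which is what makes skewing useful for matching whatever memory grade is installed.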

Sadly, most Intel chipsets do not allow skewing the memory bus through asynchronous operation, though nearly all non-Intel chipsets do. Remember that lowering operating speeds for system buses, especially the memory bus, may also reduce performance, even when the processor or front-side bus is overclocked.

Resetting the BIOS

Unresponsive or unstable systems can often be rescued by resetting the BIOS firmware to its default configuration. The BIOS code itself resides in an EEPROM (electrically erasable programmable read-only memory) flash chip, while system configuration data is held in battery-backed CMOS memory. The method for resetting the BIOS to default values differs with the motherboard model, but the basic premise remains the same: the electrical supply from the motherboard's onboard battery to the CMOS memory must be removed or shorted to clear the stored settings.

Most motherboards feature a CMOS-clear jumper, while others require the user to hold a key during boot to reset BIOS values. If specific instructions are not available, simply removing the CMOS battery for 15 minutes will reset most motherboards. If that isn't successful, shorting the positive and negative posts of the battery interface socket with a wire jumper or paper clip may speed up erasure.

If the battery is soldered to the motherboard, it may be possible to drain it by attaching a resistor, with a resistance under 100 ohms, between the positive and negative battery poles. Be careful, though: if the motherboard doesn't support battery recharging, this will render the system useless. A new battery will have to be soldered in place of the drained one.

Flash updating of the system BIOS can often restore the default configuration. The command to force such an update varies by vendor, but most boards featuring popular Award or AMI BIOS routines can be forced to reinstate original CMOS values with a flash and reboot. Upgrading the BIOS can also prove beneficial, as the latest code usually includes compatibility and performance updates.

All user-defined settings are lost once the configuration is reset to factory defaults. Retain a hard-copy version of all BIOS, jumper, and dipswitch settings, as configured when the system is operating in a stable manner. The task of writing down each setting can be tedious, but this extra work ensures your ability to restore the system if it should fail. If you lack a backup list, most motherboard user manuals recommend BIOS settings for common configurations, and most BIOS Setup interfaces contain a Restore to BIOS Defaults option. Reset all motherboard jumpers and dipswitches to their default or recommended values before booting a reset system.

Hardware Failure and Warranties

Device failure is a significant concern. As noted several times throughout this text, only quality components should be used in an overclocked system. If your system is factory built, you may not have the best components, but you can still take care to cool and configure your system properly. Remember that overclocking can void product warranties. Multiplier overclocking voids processor warranties, while front-side bus overclocking voids agreements for nearly all system components. Trying to get product replacements or refunds for damages incurred during overclocking is unethical if not illegal. Deceitful tactics increase hardware costs for everyone and have compelled some companies to implement anti-overclocking technologies.
