Intel 13th & 14th Gen Instability Issues Caused By Buggy Microcode, eTVB Fix Issued In New BIOS With "0x125" Microcode
Intel has yet to issue the public statement it promised a few months back on the serious instability affecting its high-end 13th and 14th Gen CPUs, but it looks like Igor's Lab has obtained internal documents (under NDA) which spill the beans on what has been causing these issues from the start.
The first reports of Intel's 13th and 14th Gen instability issues date back more than a year and can be found across various forums and Steam's per-game discussion pages, where gamers started running into the problem regularly. The issue came into the limelight earlier this year as more and more people were affected.
The issues were so bad that gamers were returning their entire PCs and getting new ones with AMD Ryzen CPUs instead. A workaround arrived after Intel forced its board partners to ship "Intel Default Settings" as the default power limit options, but this caused a severe drop in performance, which meant that reviewers and tech outlets had to redo their entire 13th & 14th Gen reviews using settings that were stable and didn't cause any problems for end users.
In the documents, Intel states that the root cause is an incorrect value within the microcode algorithm associated with the eTVB (Enhanced Thermal Velocity Boost) feature that comes with 13th and 14th Gen unlocked CPUs. Running at increased frequency and the correspondingly high voltage while the chip is at a high temperature can reduce the processor's reliability, which is more or less saying that your CPU will degrade over time. Given that Intel's 13th Gen CPUs have been out for over a year, most if not all processors that have had this issue are now suspected of severe degradation.
"Root Cause: Root cause is an incorrect value in a microcode algorithm associated with the eTVB feature.
Implication: Increased frequency and corresponding voltage at high temperature may reduce processor reliability.
Observed: Found internally.
Impacted platforms: Raptor Lake S, Raptor Lake Refresh S (CPUID 0xB0671)
via Igor's Lab"
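To make the failure mode more concrete, here is a minimal Python sketch of a temperature-gated boost decision. It is purely illustrative: the documents only say that "an incorrect value" is involved, so modeling the bug as a wrong temperature threshold is an assumption, and the `BoostPolicy` class, the `allowed_frequency_mhz` function, and every number below are hypothetical rather than Intel's real values or algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoostPolicy:
    """Hypothetical model of a temperature-gated boost (not Intel's algorithm)."""
    base_turbo_mhz: int      # assumed turbo ceiling when the extra boost is denied
    etvb_bonus_mhz: int      # assumed extra frequency granted while the chip is cool
    temp_threshold_c: float  # the extra boost is only allowed below this temperature


def allowed_frequency_mhz(policy: BoostPolicy, core_temp_c: float) -> int:
    """Return the highest frequency the policy permits at the given temperature."""
    if core_temp_c < policy.temp_threshold_c:
        return policy.base_turbo_mhz + policy.etvb_bonus_mhz
    return policy.base_turbo_mhz


# Illustrative numbers only: the same policy with an intended and a faulty threshold.
intended = BoostPolicy(base_turbo_mhz=5800, etvb_bonus_mhz=200, temp_threshold_c=70.0)
faulty = BoostPolicy(base_turbo_mhz=5800, etvb_bonus_mhz=200, temp_threshold_c=100.0)

hot_core_temp_c = 85.0
print("intended:", allowed_frequency_mhz(intended, hot_core_temp_c), "MHz")
print("faulty:  ", allowed_frequency_mhz(faulty, hot_core_temp_c), "MHz")
```

With a single wrong value in this toy model, a hot core is still granted the higher frequency (and, on real hardware, the higher voltage that goes with it), which is the kind of behavior the documents describe.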
It's easy to tell that a CPU has degraded: games crash, you get frequent BSODs, or your PC fails to start. These are just a few indicators, but in my case the chip started to face serious problems once summer arrived. The higher ambient temperatures, on top of the heat the CPU was already producing, accelerated the degradation process. So the only options are to underclock/undervolt the chip or stick with the default power limits, which reduce its capability by pushing the limit down from 253W to 125W at the baseline.
So what's the resolution to this issue? Well, a new BIOS will soon be rolled out with the necessary microcode, version 0x125 or later. Intel will ask customers to update the BIOS of their PCs by 7/19/2024. It is not known whether the warranty will be voided if users don't update their BIOS, or whether Intel will offer an extended warranty to users whose chips have already degraded to some extent. Following is the full statement:
"Failure Analysis (FA) of 13th and 14th Generation K SKU processors indicates a shift in minimum operating voltage on affected processors resulting from cumulative exposure to elevated core voltages. Intel® analysis has determined a confirmed contributing factor for this issue is elevated voltage input to the processor due to previous BIOS settings which allow the processor to operate at turbo frequencies and voltages even while the processor is at a high temperature. Previous generations of Intel® K SKU processors were less sensitive to these type of settings due to lower default operating voltage and frequency.
Intel® requests all customers to update BIOS to microcode 0x125 or later by 7/19/2024.
This microcode includes an eTVB fix for an issue which may allow the processor to enter a higher performance state even when the processor temperature has exceeded eTVB thresholds.
via Igor's Lab"
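For readers who want to check whether a machine already runs microcode 0x125 or later, here is a minimal Python sketch. It assumes a Linux system, where `/proc/cpuinfo` reports a `microcode` field as a hexadecimal revision, and it treats "or later" as a simple numeric comparison. The function names `parse_microcode_revision` and `has_etvb_fix` are made up for this example, and the comparison only makes sense on the affected Raptor Lake parts.

```python
import re
from typing import Optional

# Minimum revision named in the quoted Intel statement.
REQUIRED_MICROCODE = 0x125


def parse_microcode_revision(cpuinfo_text: str) -> Optional[int]:
    """Return the first microcode revision found in /proc/cpuinfo-style text."""
    match = re.search(
        r"^microcode\s*:\s*(0x[0-9a-fA-F]+)", cpuinfo_text, re.MULTILINE
    )
    return int(match.group(1), 16) if match else None


def has_etvb_fix(cpuinfo_text: str, required: int = REQUIRED_MICROCODE) -> bool:
    """True if the reported revision is at least the required one."""
    revision = parse_microcode_revision(cpuinfo_text)
    return revision is not None and revision >= required


# Sample input in the /proc/cpuinfo format; on a real Linux system you would
# pass open("/proc/cpuinfo").read() instead.
sample = "model name\t: example CPU\nmicrocode\t: 0x123\n"
print("eTVB fix present:", has_etvb_fix(sample))
```

On Windows the revision is not exposed this way, so the BIOS release notes from the board vendor remain the simplest place to confirm which microcode a given BIOS version ships.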