38 lines
		
	
	
		
			1.4 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
			38 lines
		
	
	
		
			1.4 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
What:		/sys/devices/system/machinecheck/machinecheckX/tolerant
 | 
						|
Contact:	Borislav Petkov <bp@suse.de>
 | 
						|
Date:		Dec, 2021
 | 
						|
Description:
 | 
						|
		Unused and obsolete after the advent of recoverable machine
 | 
						|
		checks (see last sentence below) and those are present since
 | 
						|
		2010 (Nehalem).
 | 
						|
 | 
						|
		Original description:
 | 
						|
 | 
						|
		The entries appear for each CPU, but they are truly shared
 | 
						|
		between all CPUs.
 | 
						|
 | 
						|
		Tolerance level. When a machine check exception occurs for a
 | 
						|
		non corrected machine check the kernel can take different
 | 
						|
		actions.
 | 
						|
 | 
						|
		Since machine check exceptions can happen any time it is
 | 
						|
		sometimes risky for the kernel to kill a process because it
 | 
						|
		defies normal kernel locking rules. The tolerance level
 | 
						|
		configures how hard the kernel tries to recover even at some
 | 
						|
		risk of	deadlock. Higher tolerant values trade potentially
 | 
						|
		better uptime with the risk of a crash or even corruption
 | 
						|
		(for tolerant >= 3).
 | 
						|
 | 
						|
		==  ===========================================================
 | 
						|
		 0  always panic on uncorrected errors, log corrected errors
 | 
						|
		 1  panic or SIGBUS on uncorrected errors, log corrected errors
 | 
						|
		 2  SIGBUS or log uncorrected errors, log corrected errors
 | 
						|
		 3  never panic or SIGBUS, log all errors (for testing only)
 | 
						|
		==  ===========================================================
 | 
						|
 | 
						|
		Default: 1
 | 
						|
 | 
						|
		Note this only makes a difference if the CPU allows recovery
 | 
						|
		from a machine check exception. Current x86 CPUs generally
 | 
						|
		do not.
 |