| Old | New | Line |
|---|---|---|
| .. | .. | |
| 1 | | -========================== |
| 2 | 1 | PCIe Device AER statistics |
| 3 | | -========================== |
| | 2 | +-------------------------- |
| | 3 | + |
| 4 | 4 | These attributes show up under all the devices that are AER capable. These |
| 5 | 5 | statistical counters indicate the errors "as seen/reported by the device". |
| 6 | 6 | Note that this may mean that if an endpoint is causing problems, the AER |
| .. | .. | |
| 9 | 9 | problematic endpoint itself (which may report all counters as 0 as it never |
| 10 | 10 | saw any problems). |
| 11 | 11 | |
| 12 | | -Where: /sys/bus/pci/devices/<dev>/aer_dev_correctable |
| | 12 | +What: /sys/bus/pci/devices/<dev>/aer_dev_correctable |
| 13 | 13 | Date: July 2018 |
| 14 | | -Kernel Version: 4.19.0 |
| | 14 | +KernelVersion: 4.19.0 |
| 15 | 15 | Contact: linux-pci@vger.kernel.org, rajatja@google.com |
| 16 | 16 | Description: List of correctable errors seen and reported by this |
| 17 | 17 | PCI device using ERR_COR. Note that since multiple errors may |
| 18 | 18 | be reported using a single ERR_COR message, thus |
| 19 | 19 | TOTAL_ERR_COR at the end of the file may not match the actual |
| 20 | | - total of all the errors in the file. Sample output: |
| 21 | | -------------------------------------------------------------------------- |
| 22 | | -localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_correctable |
| 23 | | -Receiver Error 2 |
| 24 | | -Bad TLP 0 |
| 25 | | -Bad DLLP 0 |
| 26 | | -RELAY_NUM Rollover 0 |
| 27 | | -Replay Timer Timeout 0 |
| 28 | | -Advisory Non-Fatal 0 |
| 29 | | -Corrected Internal Error 0 |
| 30 | | -Header Log Overflow 0 |
| 31 | | -TOTAL_ERR_COR 2 |
| 32 | | -------------------------------------------------------------------------- |
| | 20 | + total of all the errors in the file. Sample output:: |
| 33 | 21 | |
| 34 | | -Where: /sys/bus/pci/devices/<dev>/aer_dev_fatal |
| | 22 | + localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_correctable |
| | 23 | + Receiver Error 2 |
| | 24 | + Bad TLP 0 |
| | 25 | + Bad DLLP 0 |
| | 26 | + RELAY_NUM Rollover 0 |
| | 27 | + Replay Timer Timeout 0 |
| | 28 | + Advisory Non-Fatal 0 |
| | 29 | + Corrected Internal Error 0 |
| | 30 | + Header Log Overflow 0 |
| | 31 | + TOTAL_ERR_COR 2 |
| | 32 | + |
| | 33 | +What: /sys/bus/pci/devices/<dev>/aer_dev_fatal |
| 35 | 34 | Date: July 2018 |
| 36 | | -Kernel Version: 4.19.0 |
| | 35 | +KernelVersion: 4.19.0 |
| 37 | 36 | Contact: linux-pci@vger.kernel.org, rajatja@google.com |
| 38 | 37 | Description: List of uncorrectable fatal errors seen and reported by this |
| 39 | 38 | PCI device using ERR_FATAL. Note that since multiple errors may |
| 40 | 39 | be reported using a single ERR_FATAL message, thus |
| 41 | 40 | TOTAL_ERR_FATAL at the end of the file may not match the actual |
| 42 | | - total of all the errors in the file. Sample output: |
| 43 | | -------------------------------------------------------------------------- |
| 44 | | -localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_fatal |
| 45 | | -Undefined 0 |
| 46 | | -Data Link Protocol 0 |
| 47 | | -Surprise Down Error 0 |
| 48 | | -Poisoned TLP 0 |
| 49 | | -Flow Control Protocol 0 |
| 50 | | -Completion Timeout 0 |
| 51 | | -Completer Abort 0 |
| 52 | | -Unexpected Completion 0 |
| 53 | | -Receiver Overflow 0 |
| 54 | | -Malformed TLP 0 |
| 55 | | -ECRC 0 |
| 56 | | -Unsupported Request 0 |
| 57 | | -ACS Violation 0 |
| 58 | | -Uncorrectable Internal Error 0 |
| 59 | | -MC Blocked TLP 0 |
| 60 | | -AtomicOp Egress Blocked 0 |
| 61 | | -TLP Prefix Blocked Error 0 |
| 62 | | -TOTAL_ERR_FATAL 0 |
| 63 | | -------------------------------------------------------------------------- |
| | 41 | + total of all the errors in the file. Sample output:: |
| 64 | 42 | |
| 65 | | -Where: /sys/bus/pci/devices/<dev>/aer_dev_nonfatal |
| | 43 | + localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_fatal |
| | 44 | + Undefined 0 |
| | 45 | + Data Link Protocol 0 |
| | 46 | + Surprise Down Error 0 |
| | 47 | + Poisoned TLP 0 |
| | 48 | + Flow Control Protocol 0 |
| | 49 | + Completion Timeout 0 |
| | 50 | + Completer Abort 0 |
| | 51 | + Unexpected Completion 0 |
| | 52 | + Receiver Overflow 0 |
| | 53 | + Malformed TLP 0 |
| | 54 | + ECRC 0 |
| | 55 | + Unsupported Request 0 |
| | 56 | + ACS Violation 0 |
| | 57 | + Uncorrectable Internal Error 0 |
| | 58 | + MC Blocked TLP 0 |
| | 59 | + AtomicOp Egress Blocked 0 |
| | 60 | + TLP Prefix Blocked Error 0 |
| | 61 | + TOTAL_ERR_FATAL 0 |
| | 62 | + |
| | 63 | +What: /sys/bus/pci/devices/<dev>/aer_dev_nonfatal |
| 66 | 64 | Date: July 2018 |
| 67 | | -Kernel Version: 4.19.0 |
| | 65 | +KernelVersion: 4.19.0 |
| 68 | 66 | Contact: linux-pci@vger.kernel.org, rajatja@google.com |
| 69 | 67 | Description: List of uncorrectable nonfatal errors seen and reported by this |
| 70 | 68 | PCI device using ERR_NONFATAL. Note that since multiple errors |
| 71 | 69 | may be reported using a single ERR_FATAL message, thus |
| 72 | 70 | TOTAL_ERR_NONFATAL at the end of the file may not match the |
| 73 | | - actual total of all the errors in the file. Sample output: |
| 74 | | -------------------------------------------------------------------------- |
| 75 | | -localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_nonfatal |
| 76 | | -Undefined 0 |
| 77 | | -Data Link Protocol 0 |
| 78 | | -Surprise Down Error 0 |
| 79 | | -Poisoned TLP 0 |
| 80 | | -Flow Control Protocol 0 |
| 81 | | -Completion Timeout 0 |
| 82 | | -Completer Abort 0 |
| 83 | | -Unexpected Completion 0 |
| 84 | | -Receiver Overflow 0 |
| 85 | | -Malformed TLP 0 |
| 86 | | -ECRC 0 |
| 87 | | -Unsupported Request 0 |
| 88 | | -ACS Violation 0 |
| 89 | | -Uncorrectable Internal Error 0 |
| 90 | | -MC Blocked TLP 0 |
| 91 | | -AtomicOp Egress Blocked 0 |
| 92 | | -TLP Prefix Blocked Error 0 |
| 93 | | -TOTAL_ERR_NONFATAL 0 |
| 94 | | -------------------------------------------------------------------------- |
| | 71 | + actual total of all the errors in the file. Sample output:: |
| 95 | 72 | |
| 96 | | -============================ |
| | 73 | + localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_nonfatal |
| | 74 | + Undefined 0 |
| | 75 | + Data Link Protocol 0 |
| | 76 | + Surprise Down Error 0 |
| | 77 | + Poisoned TLP 0 |
| | 78 | + Flow Control Protocol 0 |
| | 79 | + Completion Timeout 0 |
| | 80 | + Completer Abort 0 |
| | 81 | + Unexpected Completion 0 |
| | 82 | + Receiver Overflow 0 |
| | 83 | + Malformed TLP 0 |
| | 84 | + ECRC 0 |
| | 85 | + Unsupported Request 0 |
| | 86 | + ACS Violation 0 |
| | 87 | + Uncorrectable Internal Error 0 |
| | 88 | + MC Blocked TLP 0 |
| | 89 | + AtomicOp Egress Blocked 0 |
| | 90 | + TLP Prefix Blocked Error 0 |
| | 91 | + TOTAL_ERR_NONFATAL 0 |
| | 92 | + |
| 97 | 93 | PCIe Rootport AER statistics |
| 98 | | -============================ |
| | 94 | +---------------------------- |
| | 95 | + |
| 99 | 96 | These attributes show up under only the rootports (or root complex event |
| 100 | 97 | collectors) that are AER capable. These indicate the number of error messages as |
| 101 | 98 | "reported to" the rootport. Please note that the rootports also transmit |
| .. | .. | |
| 103 | 100 | device, so these counters include them and are thus cumulative of all the error |
| 104 | 101 | messages on the PCI hierarchy originating at that root port. |
| 105 | 102 | |
| 106 | | -Where: /sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_cor |
| | 103 | +What: /sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_cor |
| 107 | 104 | Date: July 2018 |
| 108 | | -Kernel Version: 4.19.0 |
| | 105 | +KernelVersion: 4.19.0 |
| 109 | 106 | Contact: linux-pci@vger.kernel.org, rajatja@google.com |
| 110 | 107 | Description: Total number of ERR_COR messages reported to rootport. |
| 111 | 108 | |
| 112 | | -Where: /sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_fatal |
| | 109 | +What: /sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_fatal |
| 113 | 110 | Date: July 2018 |
| 114 | | -Kernel Version: 4.19.0 |
| | 111 | +KernelVersion: 4.19.0 |
| 115 | 112 | Contact: linux-pci@vger.kernel.org, rajatja@google.com |
| 116 | 113 | Description: Total number of ERR_FATAL messages reported to rootport. |
| 117 | 114 | |
| 118 | | -Where: /sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_nonfatal |
| | 115 | +What: /sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_nonfatal |
| 119 | 116 | Date: July 2018 |
| 120 | | -Kernel Version: 4.19.0 |
| | 117 | +KernelVersion: 4.19.0 |
| 121 | 118 | Contact: linux-pci@vger.kernel.org, rajatja@google.com |
| 122 | 119 | Description: Total number of ERR_NONFATAL messages reported to rootport. |
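
For readers who want to consume these attributes from a script, the following is a minimal sketch of parsing the per-device counter files. It assumes each line has the layout shown in the sample output above, that is, an error name (which may contain spaces) followed by whitespace and an integer count. The helper names `parse_aer_stats` and `read_aer_stats` are hypothetical and not part of the kernel or of this patch; the default sysfs root is the one given in the `What:` fields.

```python
from pathlib import Path

# Sample text copied from the aer_dev_correctable example in the patch.
SAMPLE = """\
Receiver Error 2
Bad TLP 0
Bad DLLP 0
RELAY_NUM Rollover 0
Replay Timer Timeout 0
Advisory Non-Fatal 0
Corrected Internal Error 0
Header Log Overflow 0
TOTAL_ERR_COR 2
"""


def parse_aer_stats(text):
    """Map each counter name to its integer value.

    Assumption: the count is the last whitespace-separated token on the
    line and everything before it is the counter name.
    """
    counters = {}
    for line in text.splitlines():
        parts = line.strip().rsplit(None, 1)
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        name, value = parts
        counters[name] = int(value)
    return counters


def read_aer_stats(device, attribute, sysfs_root="/sys/bus/pci/devices"):
    """Read and parse one attribute, e.g. 'aer_dev_correctable'.

    'device' is a PCI address such as '0000:00:1c.0' (hypothetical
    example value; the attribute only exists on AER-capable devices).
    """
    return parse_aer_stats((Path(sysfs_root) / device / attribute).read_text())


if __name__ == "__main__":
    for name, count in parse_aer_stats(SAMPLE).items():
        print(f"{name}: {count}")
```

Because one ERR_COR message may carry several errors, a script built on this should not expect the `TOTAL_ERR_COR` entry to equal the sum of the other entries.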