Troubleshooting a dual kernel configuration =========================================== This page is a troubleshooting guide enumerating known issues with dual kernel Xenomai configurations. [TIP] If running any release from the Xenomai 2 series, or a Xenomai 3 release using the *Cobalt* real-time core, then you are using a dual kernel configuration, and this document was meant for you. Xenomai 3 over the *Mercury* core stands for a single kernel configuration instead, for which you can find specific link:troubleshooting-a-single-kernel-configuration/[troubleshooting information here]. == Kernel-related issues [[kconf]] === Common kernel configuration issues When configuring the Linux kernel, some options should be avoided. CONFIG_CPU_FREQ:: This allows the CPU frequency to be modulated with workload, but many CPUs change the TSC counting frequency also, which makes it useless for accurate timing when the CPU clock can change. Also some CPUs can take several milliseconds to ramp up to full speed. CONFIG_CPU_IDLE:: Allows the CPU to enter deep sleep states, increasing the time it takes to get out of these sleep states, hence the latency of an idle system. Also, on some CPU, entering these deep sleep states causes the timers used by Xenomai to stop functioning. CONFIG_KGDB:: This option should not be enabled, except with x86. CONFIG_CONTEXT_TRACKING_FORCE:: This option which appeared in kernel 3.8 is forced off by I-pipe patches since 3.14 onward, as it is incompatible with interrupt pipelining, and has no upside for regular users. However, you have to manually disable it for older kernels when present. Common effects observed with this feature enabled include RCU-related kernel warnings during real-time activities, and pathologically high latencies. === Kernel hangs after "Uncompressing Linux... done, booting the kernel." This means that the kernel crashes before the console is enabled. You should enable the +CONFIG_EARLY_PRINTK+ option. For some architectures (x86, arm), enabling this option also requires passing the +earlyprintk+ parameter on the kernel command line. See 'Documentation/kernel-parameters.txt' for possible values. For the ARM architecture, you have to enable +CONFIG_DEBUG_KERNEL+ and +CONFIG_DEBUG_LL+ in order to be able to enable +CONFIG_EARLY_PRINTK+. === Kernel OOPSes Please make sure to check the <> section first. If nothing seems wrong there, try capturing the OOPS information using a _serial console_ or _netconsole_, then post it to the mailto:xenomai@xenomai.org[xenomai mailing list], along with the kernel configuration file (aka `.config`) matching the kernel build. === Kernel boots but does not print any message Your distribution may be configured to pass the +quiet+ option on the kernel command line. In this case, the kernel does not print all the log messages, however, they are still available using the +dmesg+ command. [[kerror]] === Kernel log displays Xenomai or I-pipe error messages [[no-timer]] ==== I-pipe: could not find timer for cpu #N The most probable reason is that no hardware timer chip is available for Xenomai timing operations. Check that you did not enable some of the conflicting options listed in the <> section. With AMD x86_64 CPUs:: You will most likely also see the following message: -------------------------------------------- I-pipe: cannot use LAPIC as a tick device I-pipe: disable C1E power state in your BIOS -------------------------------------------- The interrupt pipeline outputs this message if C1E option is enabled in the BIOS. To fix this issue, disable C1E support in the BIOS. In some Award BIOS this option is located in the +Advanced BIOS Features->+ menu (+AMD C1E Support+). [WARNING] Disabling the +AMD K8 Cool&Quiet+ feature in the BIOS will *NOT* solve this problem. With other CPU architectures:: The interrupt pipeline implementation may lack a registration for a hardware timer available to Xenomai timing operations (e.g. a call to +ipipe_timer_register()+). If you are working on porting the interrupt pipeline to some ARM SoC, you may want to have a look at this link:porting-xenomai-to-a-new-arm-soc/#The_general_case[detailed information]. [[SMI]] ==== SMI-enabled chipset found, but SMI workaround disabled You may have an issue with System Management Interrupts on your x86 platform. You may want to look at link:dealing-with-x86-smi-troubles/[this document]. === Xenomai and Linux devices share the same IRQ vector This x86-specific issue might still happen on legacy hardware with no MSI support. See link:what-if-xenomai-and-linux-devices-share-the-same-IRQ[this article] from the Knowledge Base. === Kernel issues specific to the Xenomai 2.x series ==== system init failed, code -19 See <>. ==== Local APIC absent or disabled! The Xenomai 2.x _nucleus_ issues this warning if the kernel configuration enables the local APIC support (+CONFIG_X86_LOCAL_APIC+), but the processor status gathered at boot time by the kernel says that no local APIC support is available. There are two options for fixing this issue: * either your CPU really has _no_ local APIC hardware, in which case you need to rebuild a kernel with LAPIC support disabled. * or it does have a local APIC but the kernel boot parameters did not specify to activate it using the _lapic_ option. The latter is required since 2.6.9-rc4 for boxes which APIC hardware is disabled by default by the BIOS. You may want to look at the file 'Documentation/kernel-parameters.txt' from the Linux source tree, for more information about this parameter. == Application-level issues [[vsyscall]] === --enable-x86-sep needs NPTL and Linux 2.6.x or higher or, === --enable-x86-vsyscall requires NPTL ... This message may happen when starting a Xenomai 2.x or 3.x application respectively. On the x86 architecture, the configure script option mentioned allows Xenomai to use the _vsyscall_ mechanism for issuing system calls, based on the most efficient method determined by the kernel for the current system. This mechanism is only available from NPTL-enabled glibc releases. Turn off this feature for other libc flavours. === Cobalt core not enabled in kernel As mentioned in the message, the target kernel is lacking Cobalt support. See link:installing-xenomai-3-x/#Installing_the_Cobalt_core[this document] for detailed information about installing Cobalt. === binding failed: Function not implemented Another symptom of the previous issue, i.e. the Cobalt core is not enabled in the target kernel. === binding failed: Operation not permitted This is the result of an attempt to run a Xenomai application as an unprivileged user, which fails because invoking Xenomai services requires +CAP_SYS_NICE+. However, you may allow a specific group of users to access Xenomai services, by following the instructions on link:running-a-Xenomai-application-as-a-regular-user[this page]. === incompatible ABI revision level Same as below: === ABI mismatch The ABI concerned by this message is the system call binary interface between the Xenomai libraries and the real-time kernel services it invokes (e.g. +libcobalt+ and the Cobalt kernel with Xenomai 3.x). This ABI may evolve over time, only between major Xenomai releases or testing candidate releases (i.e. -rc series) though. When this happens, the ABI level required by the application linked against Xenomai libraries may not match the ABI exposed by the Xenomai co-kernel implementation on the target machine, which is the situation this message reports. To fix this issue, just make sure to rebuild both the Xenomai kernel support and the user-space binaries for your target system. If however you did install the appropriate Xenomai binaries on your target system, chances are that stale files from a previous Xenomai installation still exist on your system, causing the mismatch. Each major Xenomai release (e.g. 2.1.x, 2.2.x ... 2.6.x, 3.0.x ...) defines such kernel/user ABI, which remains stable across minor update releases (e.g. 2.6.0 -> 2.6.4). This guarantee makes partial updates possible with production systems (i.e. kernel and/or user support). For instance, any application built over the Xenomai 2.6.0 binaries can run over a Xenomai 2.6.4 kernel support, and conversely. [TIP] Debian-based distributions (notably Ubuntu) may ship with pre-installed Xenomai libraries. Make sure that these files don't get in the way if you plan to install a more recent Xenomai kernel support. === : not found Although the program in question may be present, this message may happen on ARM platforms when a mismatch exists between the kernel and user library configurations with respect to EABI support. Typically, if user libraries are compiled with a toolchain generating OABI code, the result won't run over a kernel not enabling the +CONFIG_OABI_COMPAT+ option. Conversely, the product of a compilation with an EABI toolchain won't run on a kernel not enabling the +CONFIG_AEABI+ option. === incompatible feature set When a Xenomai application starts, the set of core features it requires is compared to the feature set the kernel provides. This message denotes a mismatch between both sets, which can be solved by fixing the kernel and/or user build configuration. Further details are available from link:installing-xenomai-3-x[this page] for Xenomai 3, and link:installing-xenomai-2-x[this page] for Xenomai 2. ==== feature mismatch: missing="smp/nosmp" On SMP-capable architectures, both kernel and user-space components (i.e. Xenomai libraries) must be compiled with the same setting with respect to SMP support. SMP support in the kernel is controlled via the +CONFIG_SMP+ option. The +--enable-smp+ configuration switch enables this feature for the Xenomai libraries (conversely, +--disable-smp+ disables it). [CAUTION] Using Xenomai libraries built for a single-processor configuration (i.e. +--disable-smp+) over a SMP kernel (i.e. +CONFIG_SMP=y+) is *NOT* valid. On the other hand, using Xenomai libraries built with SMP support enabled over a single-processor kernel is fine. === Application-level issues specific to the Xenomai 2.x series The following feature mismatches can be detected with the 2.x series: ==== feature mismatch: missing="kuser_tsc" See the <> section. [NOTE] This issue does not affect Xenomai 3.x as the latter requires modern I-pipe series which must provide _KUSER_TSC_ support on the ARM architecture. ==== feature mismatch: missing="sep" This error is specific to the x86 architecture on Xenomai 2.x, for pre-Pentium CPUs which do not provide the _sysenter/sysexit_ instruction pair. See <>. [NOTE] This issue does not affect Xenomai 3.x as the latter does not support pre-Pentium systems in the first place. ==== feature mismatch: missing="tsc" This error is specific to the x86 architecture on Xenomai 2.x, for pre-Pentium CPUs which do not provide the _rdtsc_ instruction. In this particular case, +--enable-x86-tsc+ cannot be mentioned in the configuration options for building the user libraries, since the processor does not support this feature. The rule of thumb is to pick the *exact* processor for your x86 platform when configuring the kernel, at the very least the most specific model which is close to the target CPU, not a generic placeholder such as _i586_, for which _rdtsc_ is not available. If your processor does not provide the _rdtsc_ instruction, you have to pass +--disable-x86-tsc+ option to the configure script for building the user librairies. In this case, Xenomai will provide a (much slower) emulation of the hardware TSC. [NOTE] This issue does not affect Xenomai 3.x as the latter does not support pre-Pentium systems in the first place. [[arm-tsc]] ==== ARM tsc emulation issues In order to allow applications to measure short durations with as little overhead as possible, Xenomai uses a 64 bits high resolution counter. On x86, the counter used for this purpose is the time-stamp counter readable by the dedicated _rdtsc_ instruction. ARM processors generally do not have a 64 bits high resolution counter available in user-space, so this counter is emulated by reading whatever high resolution counter is available on the processor, and used as clock source in kernel-space, and extend it to 64 bits by using data shared with the kernel. If Xenomai libraries are compiled without emulated tsc support, system calls are used, which have a much higher overhead than the emulated tsc code. In recent versions of the I-pipe patch, SOCs generally select the +CONFIG_IPIPE_ARM_KUSER_TSC+ option, which means that the code for reading this counter is provided by the kernel at a predetermined address (in the vector page, a page which is mapped at the same address in every process) and is the code used if you do not pass the +--enable-arm-tsc+ or +--disable-arm-tsc+ option to configure, or pass +--enable-arm-tsc=kuser+. This default should be fine with recent patches and most ARM SOCs. However, if you see the following message: ------------------------------------------------------------------------------- incompatible feature set (userland requires "kuser_tsc...", kernel provides..., missing="kuser_tsc") ------------------------------------------------------------------------------- It means that you are either using an old patch, or that the SOC you are using does not select the +CONFIG_IPIPE_ARM_KUSER_TSC+ option. So you should resort to what Xenomai did before branch 2.6: select the tsc emulation code when compiling Xenomai user-space support by using the +--enable-arm-tsc+ option. The parameter passed to this option is the name of the SOC or SOC family for which you are compiling Xenomai. Typing: ------------------------------------------------------------------------------- /patch/to/xenomai/configure --help ------------------------------------------------------------------------------- will return the list of valid values for this option. If after having enabled this option and recompiled, you see the following message when starting the latency test: ------------------------------------------------------------------------------- kernel/user tsc emulation mismatch ------------------------------------------------------------------------------- or ------------------------------------------------------------------------------- Hardware tsc is not a fast wrapping one ------------------------------------------------------------------------------- It means that you selected the wrong SOC or SOC family, reconfigure Xenomai user-space support by passing the right parameter to +--enable-arm-tsc+ and recompile. The following message: ------------------------------------------------------------------------------- Your board/configuration does not allow tsc emulation ------------------------------------------------------------------------------- means that the kernel-space support for the SOC you are using does not provide support for tsc emulation in user-space. In that case, you should recompile Xenomai user-space support passing the +--disable-arm-tsc+ option. ==== hardware tsc is not a fast wrapping one or, ==== kernel/user tsc emulation mismatch or, ==== board/configuration does not allow tsc emulation See the <> section. ==== native skin or CONFIG_XENO_OPT_PERVASIVE disabled Possible reasons for this error are: * you booted a kernel without Xenomai or I-pipe support, a kernel with I-pipe and Xenomai support should have a '/proc/ipipe/version' and '/proc/xenomai/version' files; * the kernel you booted does not have the +CONFIG_XENO_SKIN_NATIVE+ and +CONFIG_XENO_OPT_PERVASIVE+ options enabled; * Xenomai failed to start, check the <> section; * you are trying to run Xenomai user-space support compiled for x86_32 on an x86_64 kernel. ==== "warning: is deprecated" while compiling kernel code Where is a thread creation service, one of: * +cre_tsk+ * +pthread_create+ * +rt_task_create+ * +sc_tecreate+ or +sc_tcreate+ * +taskSpawn+ or +taskInit+ * +t_create+ Starting with Xenomai 3, APIs are not usable from kernel modules anymore, at the notable exception of the RTDM device driver API, which by essence must be used from kernel space for writing real-time device drivers. Those warnings are there to remind you that application code should run in user-space context instead, so that it can be ported to Xenomai 3. You may switch those warnings off by enabling the +CONFIG_XENO_OPT_NOWARN_DEPRECATED+ option in your kernel configuration, but nevertheless, you have been *WARNED*. ==== a Xenomai system call fails with code -38 (ENOSYS) Possible reasons for this error are: * you booted a kernel without Xenomai or I-pipe support, a kernel with I-pipe and Xenomai support should have a '/proc/ipipe/version' and '/proc/xenomai/version' files; * the kernel you booted does not have the +CONFIG_XENO_SKIN_*+ option enabled for the skin you use, or +CONFIG_XENO_OPT_PERVASIVE+ is disabled; * Xenomai failed to start, check the <> section; * you are trying to run Xenomai user-space support compiled for x86_32 on an x86_64 kernel. ==== the application overconsumes system memory Your user-space application unexpectedly commits a lot of virtual memory, as reported by "+top+" or '/proc//maps'. Sometimes OOM situations may even appear during runtime on systems with limited memory. The reason is that Xenomai threads are underlaid by regular POSIX threads, for which a large default amount of stack space memory is commonly reserved by the POSIX threading library (8MiB per thread by the _glibc_). Therefore, the kernel will commit as much as _8MiB{nbsp}*{nbsp}nr_threads_ bytes to RAM space for the application, as a side-effect of calling the +mlockall()+ service to lock the process memory, as Xenomai requires. This behaviour can be controlled in two ways: - via the _stacksize_ parameter passed to the various thread creation routines, or +pthread_attr_setstacksize()+ directly when using the POSIX API. - by setting a lower user-limit for the initial stack allocation from the application's parent shell which all threads from the child process inherit, as illustrated below: --------------------------------------------------------------------- ulimit -s --------------------------------------------------------------------- ==== freeze or machine lockup Possible reasons may be: - Stack space overflow issue now biting some real-time kernel thread? - Spurious delay/timeout values computed by the application (specifically: too short). - A case of freeze is a system call called in a loop which fails without its return value being properly checked. On x86, whenever the nucleus watchdog does not trigger, you may want to try disabling CONFIG_X86_UP_IOAPIC while keeping CONFIG_X86_UP_APIC, and arm the kernel NMI watchdog on the LAPIC (nmi_watchdog=2). You may be lucky and have a backtrace after the freeze. Maybe enabling all the nucleus debug options would catch something too. == Issues when running Xenomai test programs [[latency]] === Issues when running the _latency_ test The first test to run to see if Xenomai is running correctly on your platform is the latency test. The following sections describe the usual reasons for this test not to run correctly. ==== failed to open benchmark device You have launched +latency -t 1+ or +latency -t 2+ which both require the kernel to have been configured with the +CONFIG_XENO_DRIVERS_TIMERBENCH+ option. ==== the _latency_ test hangs The most common reason for this issues is a too short period passed with the +-p+ option, try increasing the period. If you enable the watchdog (option +CONFIG_XENO_OPT_WATCHDOG+, in your kernel configuration), you should see the <> message. [[short-period]] ==== watchdog triggered (period too short?) The built-in Xenomai watchdog has stopped the _latency_ test because it was using all the CPU in pure real-time mode (aka _primary mode_). This is likely due to a too short period. Run the _latency_ test again, passing a longer period using the +-p+ option this time. ==== the _latency_ test shows high latencies The _latency_ test runs, but you are seeing high latencies. * make sure that you carefully followed the <>. * if running on a Raspberry Pi SBC, make sure you don't hit a firmware issue, see https://github.com/raspberrypi/firmware/issues/497. * if running on a x86 platform, make sure that you do not have an issue with SMIs, see the <>. * if running on a x86 platform with a _legacy USB_ switch available from the BIOS configuration, try disabling it. * if you do not have this option at BIOS configuration level, it does not necessarily mean that there is no support for it, thus no potential for high latencies; this support might just be forcibly enabled at boot time. To solve this, in case your machine has some USB controller hardware, make sure to enable the corresponding host controller driver support in your kernel configuration. For instance, UHCI-compliant hardware needs +CONFIG_USB_UHCI_HCD+. As part of its init chores, the driver should reset the host controller properly, kicking out the BIOS off the concerned hardware, and deactivate the USB legacy mode if set in the same move. * if you observe high latencies while running X-window, try disabling hardware acceleration in the X-window server file. With recent versions of X-window, try using the 'fbdev' driver. Install it (Debian package named 'xserver-xorg-video-fbdev' for instance), then modifiy the +Device+ section to use this driver in '/etc/X11/xorg.conf', as in: ------------------------------------------------------------------------------- Section "Device" Identifier "Card0" Driver "fbdev" EndSection ------------------------------------------------------------------------------- With olders versions of X-window, keep the existing driver, but add the following line to the +Device+ section: ------------------------------------------------------------------------------- Option "NoAccel" ------------------------------------------------------------------------------- === Issues when running the _switchtest_ program ==== pthread_create: Resource temporarily unavailable The switchtest test creates many kernel threads, an operation which consumes memory taken from internal pools managed by the Xenomai real-time core. Xenomai 2.x and 3.x series require +CONFIG_XENO_OPT_SYS_HEAPSZ+ to be large enough in the kernel configuration settings, to cope with the allocation requests. Xenomai 2.x may also require to increase the +CONFIG_XENO_OPT_SYS_STACKPOOLSZ+ setting.