Linux SMP FAQ
David Mentré, David.Mentre@irisa.fr
v0.22, 17 April 1998

This FAQ reviews the main issues (and, I hope, solutions) related to SMP
configuration under Linux.
______________________________________________________________________

Table of Contents

1. Introduction

2. Questions related to any architectures
   2.1 Kernel side
   2.2 User side

3. Intel architecture specific questions
   3.1 Why it doesn't work on my machine?
   3.2 Possible causes of crash
   3.3 Motherboard specific information
      3.3.1 Motherboards with known problems
      3.3.2 Motherboards with no known problems

4. Useful pointers
   4.1 Various
   4.2 SMP specific patches

5. Glossary

6. List of contributors
______________________________________________________________________

1. Introduction

Linux can work on SMP (Symmetric Multi-Processors) machines. SMP
support started with the 2.0 family and has been improved in the 2.1
(future 2.2) series.

FAQ maintained by David Mentré (David.Mentre@irisa.fr). The latest
edition of this FAQ can be found at
http://www.irisa.fr/prive/mentre/smp-faq/.

If you want to contribute to this FAQ, I would prefer a diff against
the SGML version
<http://www.irisa.fr/prive/mentre/smp-faq/smp-faq.sgml> of this
document, but any remarks (in plain text) will be greatly appreciated.

This FAQ is an improvement of a first draft made by Chris Pirih.

All information contained in this FAQ is provided "as is." All
warranties, expressed, implied or statutory, concerning the accuracy
of the information or the suitability for any particular use are
hereby specifically disclaimed. While every effort has been taken to
ensure the accuracy of the information contained in this FAQ, the
authors assume no responsibility for errors or omissions, or for
damages resulting from the use of the information contained herein.

2. Questions related to any architectures

2.1. Kernel side

  1.
     Does Linux support multi-threading? If I start two or more
     processes, will they be distributed among the available CPUs?

     Yes.

  2. What kind of architectures are supported in SMP?

     From Alan Cox: SMP is supported in 2.0 on the hypersparc (SS20,
     etc.) systems and Intel 486, Pentium or higher machines which are
     Intel MP1.1/1.4 compliant. SMP support for UltraSparc,
     SparcServer, Alpha and PowerPC machines is in progress in 2.1.x.

     From Ralf Bächle: MIPS, m68k and ARM do not support SMP; the
     latter two probably won't ever. That said, I'm going to hack on
     MIPS-SMP as soon as I get an SMP box ...

  3. How do I make a Linux SMP kernel?

     Uncomment the SMP=1 line in the main Makefile
     (/usr/src/linux/Makefile).

     AND enable "RTC support" (from Robert G. Brown). Note that
     enabling RTC support does not, as far as I know, prevent clock
     drift, but according to a discussion Robert G. Brown remembers
     from a year or so ago, it can prevent a lockup when the clock is
     read at boot time.

     AND do NOT enable APM! APM and SMP are not compatible, and your
     system will almost certainly (or at least probably ;)) crash
     during boot if APM is enabled (Jakob Oestergaard). Alan Cox
     confirms this: 2.1.x turns APM off for SMP boxes. Basically APM
     is undefined in the presence of SMP systems, and anything could
     occur.

     You must rebuild your whole kernel and all kernel modules when
     changing to and from SMP mode. Remember to make modules and make
     modules_install (from Alan Cox).

  4. How do I make a Linux non-SMP kernel?

     Comment out the SMP=1 line in the Makefile (do not set SMP to 0).

     You must rebuild your whole kernel and all kernel modules when
     changing to and from SMP mode. Remember to make modules and make
     modules_install.

  5. How can I tell if it worked?
     cat /proc/cpuinfo

     Typical output (dual PentiumII):

     ______________________________________________________________________
     processor       : 0
     cpu             : 686
     model           : 3
     vendor_id       : GenuineIntel
     stepping        : 3
     fdiv_bug        : no
     hlt_bug         : no
     fpu             : yes
     fpu_exception   : yes
     cpuid           : yes
     wp              : yes
     flags           : fpu vme de pse tsc msr pae mce cx8 apic 11 mtrr pge mca cmov mmx
     bogomips        : 267.06

     processor       : 1
     cpu             : 686
     model           : 3
     vendor_id       : GenuineIntel
     stepping        : 3
     fdiv_bug        : no
     hlt_bug         : no
     fpu             : yes
     fpu_exception   : yes
     cpuid           : yes
     wp              : yes
     flags           : fpu vme de pse tsc msr pae mce cx8 apic 11 mtrr pge mca cmov mmx
     bogomips        : 267.06
     ______________________________________________________________________

  6. What is the status of converting the kernel toward finer grained
     locking and multithreading?

     2.1.x has signal handling, interrupts and some I/O stuff fine
     grain locked. The rest is gradually migrating. All the scheduling
     is SMP safe.

  7. Does Linux SMP support processor affinity?

     No and Yes. There is no way to force a process onto specific
     CPUs, but the Linux scheduler has a processor bias for each
     process, which tends to keep processes tied to a specific CPU.

2.2. User side

  1. Do I really need SMP?

     If you have to ask, you probably don't. :)

  2. How does one display multiple cpu performance?

     Thanks to Samuel S. Chessman, here are some useful utilities:

     Character based:
        http://www.cs.inf.ethz.ch/~rauch/procps.html

        Basically, it's procps v1.12.2 (top, ps, et al.) and some
        patches to support SMP.

     Graphic:
        xosview-1.5.1 supports SMP. Kernels from 2.1.85 onward have
        per-CPU cpuX entries in /proc/stat.

        The official homepage for xosview is:
        http://lore.ece.utexas.edu/~bgrayson/xosview.html

        The various Forissier kernel patches are at:
        http://www-isia.cma.fr/~forissie/smp_kernel_patch/

  3.
     How can I program to use two (or more) CPUs?

     Use a kernel-thread library. A good library is the pthread
     library made by Xavier Leroy
     <http://pauillac.inria.fr/~xleroy/linuxthreads/>. LinuxThreads is
     now integrated with glibc2 (aka libc6).

     From Jakob Oestergaard: Also consider using MPI. It's the
     industry standard message passing interface. It doesn't give you
     shared memory like threads, but it allows you to use your program
     in a cluster too.

  4. What has changed in the threads packages, LinuxThreads, etc.?

     Glibc is the big change. glibc is threadsafe and includes
     linuxthreads Posix.4 threads by default. Real time signals are
     also in glibc so POSIX AIO should also be in glibc2.1 (I hope).

  5. How can I enable more than 1 process for my kernel compile?

     Use:

     ___________________________________________________________________
     # make [modules|zImage|bzImage] MAKE="make -jX"
     where X=max number of processes.
     WARNING: This won't work for "make dep".
     ___________________________________________________________________

     With a 2.1.x like kernel, see also the file
     /usr/src/linux/Documentation/smp for specific instructions.

     BTW, since running multiple compilers allows a machine with
     sufficient memory to use the otherwise wasted CPU time during
     I/O caused delays, make MAKE="make -j 2" -j 2 actually even helps
     on uniprocessor boxes (from Ralf Bächle).

  6. Why is the time given by the time command wrong? (from Joel
     Marchand)

     In the 2.0 series, the result given by the time command is wrong.
     The sum user+system is right *but* the split between user and
     system time is wrong. This bug is corrected in the 2.1 series.

  7. How will my application perform under SMP?
     Look at SMP Performance of Linux
     <http://www.interlog.com/~mackin/linux-smp.html>, which gives
     useful hints on how to benchmark a specific machine (from a post
     made by Cameron MacKinnon).

  8. Where can I find more information about parallel programming?

     Look at the Linux Parallel Processing HOWTO
     <http://yara.ecn.purdue.edu/~pplinux/PPHOWTO/pphowto.html>

     Lots of useful information can be found at Parallel Processing
     using Linux <http://yara.ecn.purdue.edu/~pplinux/>

3. Intel architecture specific questions

3.1. Why it doesn't work on my machine?

  1. Can I use my Cyrix/AMD/non-Intel CPU in SMP?

     Short answer: no.

     Long answer: Intel claims ownership of the APIC SMP scheme, and
     unless a company licenses it from Intel it may not use it. There
     are currently no companies that have done so. (This of course can
     change in the future.) FYI: both Cyrix and AMD support the
     non-proprietary OpenPIC SMP standard, but currently there are no
     motherboards that use it.

  2. Why doesn't my old Compaq work?

     Put it into MP1.1/1.4 compliant mode.

  3. Why doesn't my ALR work?

     From Robert Hyatt: ALR Revolution quad-6 seems quite safe, while
     some older Revolution quad machines without P6 processors seem
     "iffy"...

  4. Why does SMP go so slowly? or Why does one CPU show a very low
     bogomips value while the first one is normal?

     From Alan Cox: If one of your CPUs is reporting a very low
     bogomips value, the cache is not enabled on it. Your vendor
     probably provides a buggy BIOS. Get the patch to work around this
     or, better yet, send it back and buy a board from a competent
     supplier.

  5.
     I've heard IBM machines have problems

     Some IBM machines have the MP1.4 BIOS block in the EBDA, which is
     allowed but is not supported by kernels before 2.1.80. Please
     update to a kernel that supports it.

     There is an old 486SLC based IBM SMP box. Linux/SMP requires
     hardware FPU support.

  6. Is there any advantage of Intel MP 1.4 over 1.1 specification?

     Nope (according to Alan :) ), 1.4 is just a stricter spec of 1.1.

  7. Why does the clock drift so rapidly when I run Linux SMP?

     This is a known problem with IRQ handling and long kernel locks
     in the 2.0 series kernels. Consider upgrading to a later 2.1
     kernel (not guaranteed to work).

     From Jakob Oestergaard: Or, consider running xntpd. That should
     keep your clock right on time. (I think that I've heard that
     enabling RTC in the kernel also fixes the clock drift. It works
     for me! But I'm not sure whether that's general or I'm just being
     lucky.)

  8. Why are my CPUs numbered 0 and 2 instead of 0 and 1 (or some
     other odd numbering)?

     The CPU number is assigned by the motherboard manufacturer and
     doesn't mean anything. Ignore it.

  9. My SMP system is locking up all the time. Black screen, nothing
     in the logs. Help!

     If you're running a 2.0 kernel, consider upgrading to later
     2.0.32+ kernels or apply Leonard Zubkoff's deadlock patch. If you
     still have deadlocks, apply Ingo Molnar's deadlock detection
     patch and post the results (against your System.map) to linux-smp
     or linux-kernel. You might also consider running a 2.1 kernel.

3.2. Possible causes of crash

You'll find in this section some possible reasons for a crash of an
SMP machine (credits are due to Jakob Oestergaard for this part). As
far as I (David) know, these problems are Intel specific.
  o  Cooling problems

     From Ralf Bächle: [Related to case size and fans] It's important
     that the air is flowing. It of course can't where cables etc. are
     preventing this, like in too small cases. On the other hand, I've
     seen oversized cases causing big problems. There are some tower
     cases on the market that actually are worse for cooling than
     desktops. In short, the right thing is thinking about
     aerodynamics in the case. Extra cases for hot peripherals are
     useful as well.

  o  Bad memory

     Don't buy too cheap RAM and don't use mixed RAM modules on a
     motherboard that is picky about it. Tyan motherboards especially
     are known to be picky about RAM speed.

  o  Bad combination of different stepping CPUs

     Check /proc/cpuinfo to see that your CPUs are the same stepping.

  o  You are running 2.0.33, aren't you?

     If you run 2.0.31 or 2.1.xx you can't be sure that SMP is stable.
     2.0.33 is the right kernel for a production system. 2.1.xx
     kernels perform better, but they are development releases and
     should NOT be considered stable!

  o  If your system is unstable, then DON'T overclock it!

     ...and even if it is stable, DON'T overclock.

     From Ralf Bächle: Overclocking causes very subtle problems. I
     have a nice example: one of my overclocked old machines
     miscomputes a couple of pixels of a 640 x 400 fractal. The
     problem is only visible when comparing them using tools. So
     better say never, nunca, jamais, niemals overclock.

  o  2.0.x kernel and fast ethernet (from Robert G. Brown)

     2.0.X kernels on high performance fast ethernet systems have
     significant (and known) problems with a race/deadlock condition
     in the networking interrupt handler. The solution is to get the
     latest 100BT development drivers from CESDIS (ones that define
     SMPCHECK).
  o  A bug in the 440FX chipset (from Emil Briggs)

     If you had a system using the 440FX chipset then your problem
     with the lockups was possibly due to a documented erratum in the
     chipset. Here is a reference:

     References: Intel 440FX PCIset 82441FX (PMC) and 82442FX (DBX)
     Specification Update. pg. 13
     http://www.intel.com/design/pcisets/specupdt/297654.htm

     The problem can be fixed with a BIOS workaround (or a kernel
     patch), and in fact David Wragg wrote a patch that's included
     with Richard Gooch's MTRR patch. For more information and a fix
     look here:
     http://nemo.physics.ncsu.edu/~briggs/vfix.html

Some hardware is also known to cause problems. This includes:

  o  Adaptec SCSI controllers

     Don't buy them, Adaptec is unsupportive to the Linux developers.
     This is not an SMP problem, but a general high-performance Linux
     problem. It also seems that the aic7xxx driver is broken under
     SMP (from Robert Hyatt).

     (from Doug Ledford, author of the Adaptec driver) Just a quick
     note, the 5.0.11 version of my driver for 2.0.33 is the one I
     [Doug] personally recommend for SMP and/or PII systems. It's what
     I use here on a PII/266 dual system, although I'm running 2.1.92
     right now instead of 2.0.33. Second note, the patch will not go
     into 2.0.34-pre6 cleanly, but can be used, and it has not been
     submitted for any of the 34pre kernels because I don't think it's
     had enough testing yet.

  o  3Com 3c905 cards

     Some work, some don't. Try disabling busmastering if your system
     is unstable.

3.3. Motherboard specific information

Some more specific information can be found with the survey of SMP
motherboards <http://styx.phy.vanderbilt.edu/smp/mainboards.html>

3.3.1.
Motherboards with known problems

  o  Gigabyte

     Solution: BIOS upgrade

  o  SuperMicro

     Solution: BIOS upgrade

  o  EPoX KP6-LS (Christopher Allen Wing, 16 March 1998)

     It appears to have the same BIOS related BogoMIPS problem as
     other motherboards (one CPU only gives about 3 BogoMIPS, the
     other gives the full amount). All 2.0.x kernels lock up soon
     after booting; late 2.1.x kernels run slowly but don't seem to
     lock up.

     There is no BIOS upgrade available (yet). I wrote the
     manufacturer but have not received a reply.

  o  Tyan

     Tyan motherboards are known to be picky about RAM speed (Jakob
     Oestergaard).

     From Doug Ledford about the onboard aic-7895 SCSI controller (for
     which he wrote the driver): "BTW, make sure you have at least
     BIOS version 1.16 on that Tyan motherboard. The 1.15 and below
     BIOS versions have a bug relating to IRQ allocation for the 7895
     SCSI controller" (submitted by Szakacsits Szabolcs).

  o  GA686DLX (Andrew Crane)

     Same BIOS related BogoMIPS problem as other motherboards.

     Solution from Alan Cox: Congratulations, send the bill for your
     hair damage to the supplier. You have yet another SMP box with
     faulty BIOS. There is a patch for 2.0.x on www.uk.linux.org and
     there are people working on generic MTRR handling for 2.1.x.

  o  MS-6114

     More details for this motherboard at
     <http://www.msi.com.tw/product/6114/6114.htm>

     Solution: BIOS upgrade

     Somebody experienced solid hangs (nothing in the log files) under
     constant load of about 5 running processes within less than 12
     hours with AMI BIOS v1.1. v1.4b3 runs without problems.

3.3.2. Motherboards with no known problems

  o  AIR P6NDP and P6NDI (Leonard N. Zubkoff)

     My primary production machine is based on an AIR P6NDP and one of
     my test machines uses a P6NDI. Both seem to be fine motherboards
     in my experience.
     The P6NDI BIOS is a little conservative in its programming of the
     Natoma chipset for 50ns EDO, but a minor tweak to one register in
     rc.local took care of that.

  o  AIR 54CDP (Chris Mauritz)

     You can also list the following motherboard as working with no
     problems:

     AIR 54CDP motherboard / EISA/PCI / onboard aic7870 / dual P120 /
     Redhat 5.0 (2.0.32 and 2.0.33 kernels)

  o  HP XU 6/200 (Jean-Francois Micouleau)

     Works with 2.0 and 2.1 kernels. Some problems under high network
     load with 2.0.x kernel. Works under 2.1.78 with Ingo Molnar's
     IO-APIC patch.

  o  Elitegroup P6FX2-A (Benedikt Heinen)

     Had this mainboard running with ONE PPro on it for several
     months, and for about a year now it has been running without
     problems with TWO PPro 200MHz. The only crashes this machine ever
     experienced were before Leonard Zubkoff's deadlock-patches for
     Linux 2.0.30... ;)

     Elitegroup P6FX2-A / ISA/PCI / Dual PPro200 / Debian "hamm"

4. Useful pointers

4.1. Various

  o  Parallel Processing using Linux
     <http://yara.ecn.purdue.edu/~pplinux/>

  o  Linux Parallel Processing HOWTO
     <http://yara.ecn.purdue.edu/~pplinux/PPHOWTO/pphowto.html>

  o  (outdated) Linux SMP home page
     <http://www.uk.linux.org/SMP/title.html>

  o  linux-smp mailing list

     To subscribe, send subscribe linux-smp in the message body to
     majordomo@vger.rutgers.edu

     To unsubscribe, send unsubscribe linux-smp in the message body to
     majordomo@vger.rutgers.edu

  o  pthread library made by Xavier Leroy
     <http://pauillac.inria.fr/~xleroy/linuxthreads/>

  o  Linux SMP archives <http://www.linuxhq.com/lnxlists/linux-smp/>

  o  Survey of SMP motherboards
     <http://styx.phy.vanderbilt.edu/smp/mainboards.html>

  o  procps <http://www.cs.inf.ethz.ch/~rauch/procps.html>

  o  xosview <http://lore.ece.utexas.edu/~bgrayson/xosview.html>

  o  Pentium Pro Optimized BLAS and FFTs for Intel Linux
     <http://www.cs.utk.edu/~ghenry/distrib/>

  o  SMP Performance of Linux
     <http://www.interlog.com/~mackin/linux-smp.html>

  o  Multithreaded programs on linux
     <http://www.e.kth.se/~e94_bek/mthread.html>

4.2. SMP specific patches

  o  Forissier kernel patches
     <http://www-isia.cma.fr/~forissie/smp_kernel_patch/>

  o  Patch for a bug in the 440FX chipset
     <http://nemo.physics.ncsu.edu/~briggs/vfix.html>

  o  MTRR patch (latest version: 1.9)
     <http://www.atnf.csiro.au/~rgooch/kernel-patches.html>

5. Glossary

  o  SMP
     Symmetric Multi-Processors

  o  APIC
     Advanced Programmable Interrupt Controller

  o  thread
     A thread is a processor activity in a process. The same process
     can have multiple threads. Those threads share the process
     address space and can therefore share data.

  o  pthread
     Posix thread, threads defined by the Posix standard.

6. List of contributors

Many thanks to those who help me to maintain this FAQ.

  o  Emil Briggs
  o  Robert G. Brown
  o  Samuel S. Chessman
  o  Alan Cox
  o  Andrew Crane
  o  Jocelyne Erhel
  o  Byron Faber
  o  Benedikt Heinen
  o  Robert Hyatt
  o  Tony Kocurko
  o  Doug Ledford
  o  Cameron MacKinnon
  o  Joel Marchand
  o  Chris Mauritz
  o  Jean-Francois Micouleau
  o  Jakob Oestergaard
  o  Jean-Michel Rouet
  o  Ralf Bächle
  o  Sumit Roy
  o  Szakacsits Szabolcs
  o  El Warren
  o  Christopher Allen Wing
  o  Leonard N. Zubkoff
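
The two /proc/cpuinfo checks this FAQ recommends (confirming that the
kernel found more than one processor, and confirming that all CPUs
report the same stepping) can be automated with a few lines of script.
What follows is only a minimal sketch, not part of any tool named in
this FAQ: the function names parse_cpuinfo and smp_report are
hypothetical, and the code assumes the "processor" and "stepping"
field names shown in the sample output of this FAQ, which other
architectures or kernel versions may not provide.

```python
def parse_cpuinfo(text):
    """Split /proc/cpuinfo-style text into one dict per processor entry.

    Assumption: every entry starts with a "processor" field, as in the
    sample output quoted in this FAQ.
    """
    cpus = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines carry no field
        key, value = key.strip(), value.strip()
        if key == "processor":
            cpus.append({})  # a "processor" field opens a new entry
        if cpus:
            cpus[-1][key] = value
    return cpus


def smp_report(text):
    """Summarise processor count and stepping consistency (hypothetical helper)."""
    cpus = parse_cpuinfo(text)
    steppings = sorted({cpu.get("stepping", "unknown") for cpu in cpus})
    return {
        "processors": len(cpus),
        "smp_active": len(cpus) > 1,
        "steppings": steppings,
        "same_stepping": len(steppings) <= 1,
    }


if __name__ == "__main__":
    # Read the live file only when run as a script on a Linux machine.
    with open("/proc/cpuinfo") as f:
        print(smp_report(f.read()))
```

If "smp_active" is false on a machine with two CPUs installed, the
kernel was probably not built with SMP enabled; if "same_stepping" is
false, see the note on mixed stepping CPUs in section 3.2.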