Discussion:
guile 2.2.3 crashing on osx 10.11?
Dan Kegel
2017-12-30 22:32:40 UTC
Hi all! Happily building and using guile on osx 10.12, 10.13, and Ubuntu 16.04.

osx 10.11, though, crashes when I just evaluate (display (version)),
or sometimes while building.

Happens whether I build it myself, or use brew. Happens on more than
one machine, too.

This is with xcode 7.1:
$ cc --version
Apple LLVM version 7.0.0 (clang-700.1.76)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

I think it also happens with xcode 7.3.1:
$ cc --version
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode7.3.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

Haven't tried the latest, xcode 8.2.1.

For instance:

$ brew install guile
$ guile
...
scheme@(guile-user)> (display (version))
Illegal instruction: 4

Here's another: building from source fails early with

Making all in bootstrap
GUILE_AUTO_COMPILE=0 \
../meta/build-env \
guild compile --target="x86_64-apple-darwin15.6.0" \
-O1 \
-L "/Users/buildbot/src/yobuild/recipes/guile/btmp/guile-2.2.3/module" \
-L "/Users/buildbot/src/yobuild/recipes/guile/btmp/guile-2.2.3/guile-readline" \
-o "ice-9/eval.go" "../module/ice-9/eval.scm"
make[2]: *** [ice-9/eval.go] Illegal instruction: 4
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2

Anyone run into this before?

- Dan
Matt Wette
2017-12-30 23:31:23 UTC
Hi Dan,

I have not seen that on macOS before, but previously ran into other issues. This may help to chase it down:

Build with:
CFLAGS=-g LDFLAGS=-g ./configure --disable-shared --prefix=/opt/local

in meta/gdb-uninstalled-guile, change:
gdb --args ${top_builddir}/libguile/guile "$@"
to
lldb -- ${top_builddir}/libguile/guile "$@"

and, IIRC, run meta/gdb-uninstalled-guile
Dan Kegel
2018-01-01 01:53:59 UTC
Thanks. Also had to do
sudo /usr/sbin/DevToolsSecurity --enable

Here's a backtrace:

* thread #1: tid = 0x1628f8, 0x00000001003e95be libgmp.10.dylib`__gmpn_mul_1 + 94, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
frame #0: 0x00000001003e95be libgmp.10.dylib`__gmpn_mul_1 + 94
libgmp.10.dylib`__gmpn_mul_1:
-> 0x1003e95be <+94>: mulxq (%rsi), %rbx, %rax

Guess what? This machine is an i7-3720QM (in a MacBookPro9,1), which
doesn't support MULX. (It's a 2012 Ivy Bridge, the generation just
before Haswell.)

$ gobjdump -d libgmp.dylib | grep mulx
confirms the presence of the mulx instruction.
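
That grep can be turned into a small check that covers the whole BMI2 group rather than only mulx. This is a minimal sketch, assuming the disassembly is AT&T-syntax text such as gobjdump -d or lldb prints; the function name find_bmi2_mnemonics is made up for illustration.

```python
import re

# The BMI2 instruction group; an optional AT&T size suffix (l or q) is allowed,
# so "mulxq" from lldb and "mulx" from objdump both match.
BMI2_RE = re.compile(
    r"\b(bzhi|mulx|pdep|pext|rorx|sarx|shlx|shrx)[lq]?\b", re.IGNORECASE
)


def find_bmi2_mnemonics(disassembly):
    """Return the set of BMI2 base mnemonics found in disassembly text."""
    return {m.group(1).lower() for m in BMI2_RE.finditer(disassembly)}


# The faulting line from the backtrace above.
sample = "-> 0x1003e95be <+94>: mulxq (%rsi), %rbx, %rax"
print(find_bmi2_mnemonics(sample))
```

An empty result means none of these mnemonics appeared in the text that was scanned. It is a text match, so it says nothing about whether the code path is ever reached at runtime.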

So my gmp was built wrong for this machine. (There was a related
bugfix for low-end cpus in gmp 6.1.1, but I've got 6.1.2, and no
low-end cpus.)

Bit of a mystery, then, but nothing to do with guile.
- Dan
Dan Kegel
2018-01-01 18:25:46 UTC
So, for completeness, here's why guile was crashing instantly in gmp
on some machines for me.

If you build and run everything on the same machine, none of this is
likely to affect you; this is only for the case where each library is
built by a random machine from a buildbot pool.

I had been paying attention to the AVX1.0 divide across our buildbot fleet,
and arranged to segregate MacPro5,1 machines off in their own pool,
to avoid crashes from vxorps (an AVX instruction) in ImageMagick;
here's the output of
sysctl -n hw.model machdep.cpu.brand_string machdep.cpu.features
on our machines, with boring bits removed:

MacPro5,1       E5620      SMX SSE4.2
MacBookPro8,2   i7-2720QM  SMX SSE4.2 x2APIC XSAVE OSXSAVE TSCTMR AVX1.0
Macmini6,2      i7-3615QM  SSE4.2 x2APIC XSAVE OSXSAVE TSCTMR AVX1.0 RDRAND F16C
MacBookPro9,1   i7-3720QM  SMX SSE4.2 x2APIC XSAVE OSXSAVE TSCTMR AVX1.0 RDRAND F16C
MacBookPro11,2  i7-4870HQ  SMX FMA SSE4.2 x2APIC MOVBE XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C
MacBookPro11,2  i7-4960HQ  SMX FMA SSE4.2 x2APIC MOVBE XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C
MacBookPro11,5  i7-4980HQ  SMX FMA SSE4.2 x2APIC MOVBE XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C

MULX is a nice Intel instruction that was introduced in 2013 in
non-low-end Haswell chips.
https://software.intel.com/sites/default/files/m/f/7/c/36945 says one
detects it like this:
CPUID.(EAX=07H, ECX=0H):EBX.BMI2[bit 8]: if 1 indicates the processor
supports the second group of advanced bit manipulation extensions
(BZHI, MULX, PDEP, PEXT, RORX, SARX, SHLX, SHRX);
http://publicclu2.blogspot.com/2013/05/flags-in-x86-linuxs-proccpuinfo.html
clarifies that, on Linux, /proc/cpuinfo will contain the string BMI2
if MULX is present. This evidently is also true of sysctl -n
machdep.cpu.leaf7_features on mac, which says:

MacPro5,1       E5620
MacBookPro8,2   i7-2720QM
Macmini6,2      i7-3615QM  SMEP ERMS RDWRFSGS
MacBookPro9,1   i7-3720QM  SMEP ERMS RDWRFSGS
MacBookPro11,2  i7-4870HQ  SMEP ERMS RDWRFSGS TSC_THREAD_OFFSET BMI1 HLE AVX2 BMI2 INVPCID RTM
MacBookPro11,2  i7-4960HQ  SMEP ERMS RDWRFSGS TSC_THREAD_OFFSET BMI1 HLE AVX2 BMI2 INVPCID RTM FPU_CSDS
MacBookPro11,5  i7-4980HQ  SMEP ERMS RDWRFSGS TSC_THREAD_OFFSET BMI1 AVX2 BMI2 INVPCID FPU_CSDS

which means I have three basic groups:
1) no AVX: (MacPro5,1; circa 2010)
2) AVX but no BMI2 (circa 2011-2012)
3) AVX and BMI2 (MacBookPro11,2, 11,5; circa 2013-2015)
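
The grouping can be written down as a small classifier over the two sysctl strings. This is only a sketch: the pool names ("no-avx", "avx-only", "avx-bmi2") and both function names are made up for illustration, and the inputs are assumed to be the space-separated output of machdep.cpu.features and machdep.cpu.leaf7_features as listed above.

```python
BMI2_BIT = 8  # CPUID.(EAX=07H, ECX=0H):EBX bit 8, per the Intel text quoted above


def ebx_has_bmi2(ebx):
    """True if a raw CPUID leaf 7 EBX value has the BMI2 bit set."""
    return bool((ebx >> BMI2_BIT) & 1)


def classify_pool(features, leaf7_features):
    """Map the two sysctl feature strings to one of three hypothetical pools."""
    has_avx = "AVX1.0" in features.split()
    has_bmi2 = "BMI2" in leaf7_features.split()
    if not has_avx:
        return "no-avx"
    return "avx-bmi2" if has_bmi2 else "avx-only"


# One machine from each group, copied from the tables above.
fleet = {
    "MacPro5,1": ("SMX SSE4.2", ""),
    "MacBookPro9,1": (
        "SMX SSE4.2 x2APIC XSAVE OSXSAVE TSCTMR AVX1.0 RDRAND F16C",
        "SMEP ERMS RDWRFSGS",
    ),
    "MacBookPro11,5": (
        "SMX FMA SSE4.2 x2APIC MOVBE XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C",
        "SMEP ERMS RDWRFSGS TSC_THREAD_OFFSET BMI1 AVX2 BMI2 INVPCID FPU_CSDS",
    ),
}

for model, (features, leaf7) in fleet.items():
    print(model, classify_pool(features, leaf7))
```

Matching on whole tokens rather than substrings keeps BMI1 from being mistaken for BMI2, and AVX2 in the leaf7 list from being mistaken for AVX1.0.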

I'm sure other people will have different needs, but for me,
segregating shared build machines into those three pools -- and/or
sticking with two pools and disabling use of MULX in gmp -- should
avoid the crashes I saw due to ImageMagick and GMP CPU-specific
instruction assumptions.

It'd be nice if GMP and ImageMagick were more agile about CPU feature
detection, and did (more of) it at runtime, but that's life.
- Dan