Alexander Leidinger
2024-03-29 15:52:55 UTC
Hi,
Sources from 2024-03-11 work. Sources from 2024-03-25 and from today
don't work (see below for the issue). As the monthly stabilisation pass
didn't find obvious issues, it is something related to my setup:
- not a generic kernel
- very modular kernel (as much as possible as a module)
- bind_now (a build without it fails too, tested with clean /usr/obj)
- ccache (a build without it fails too, tested with clean /usr/obj)
- kernel retpoline (build without it in progress)
- userland retpoline (build without it in progress)
- kernel built with WITH_CTF / DDB_CTF (next one to test if it isn't
retpoline)
- -fno-builtin
- CPUFLAGS=native (except for stuff in /usr/src/sys/boot)
- malloc production
- COPTFLAGS= -O2 -pipe
The issue is that kernel modules load OK from the loader, but once the
kernel starts init, any module fails to load (e.g. via autodetection of
hardware or rc.conf kld_list) with the message that the kernel and
module versions are out of sync, and the module refuses to load.
I tried the workaround of loading the modules from the loader, which
works, but then I can't log in remotely as ssh fails to allocate a pty.
When loading modules via the loader, I can see messages about missing
CTF info when the nvidia modules (from ports = not yet rebuilt = in
/boot/modules/...ko instead of /boot/kernel/...ko) try to get
initialised... and it looks like they fail to initialise because of
this missing CTF stuff (I'm back on the previous boot env to be able to
log in remotely and send mails, so I don't have a copy of the failure
message at hand).
I assume the missing CTF stuff is due to the CTF-based pretty printing
(https://cgit.freebsd.org/src/commit/?id=c21bc6f3c2425de74141bfee07b609bf65b5a6b3).
Is this supposed to make modules which are compiled without CTF data
fail to load? Shouldn't this work gracefully (e.g. spit out a warning
that pretty printing is not available for module X and still have the
module working)?
Next steps:
- try a world without retpoline (bind_now and ccache active)
- try a kernel without CTF (bind_now, ccache, retpoline active)
- try a world without bind_now, retpoline, CTF, CPUFLAGS, COPTFLAGS
If anyone has an idea how to debug this in some other way...
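To keep track of which builds are still open, the remaining
one-option-off builds could be enumerated mechanically. The following is
only a minimal sketch: the option labels are hypothetical shorthand for
the setup items listed above (they are not real make.conf or src.conf
knobs), the function name is made up, and the two kernel-layout items
(non-generic, very modular) are left out because they are not simple
toggles.

```python
# Hypothetical labels for the setup items from this mail; NOT real build knobs.
OPTIONS = [
    "bind_now",
    "ccache",
    "kernel_retpoline",
    "userland_retpoline",
    "ctf",
    "no_builtin",
    "cpuflags_native",
    "malloc_production",
    "coptflags",
]

# Per the mail: builds without these two still fail, so they are ruled out.
STILL_FAILS_WITHOUT = {"bind_now", "ccache"}


def one_off_plan(options, cleared):
    """Return one build configuration per untested option, each disabling
    exactly that option and keeping all others enabled."""
    plan = []
    for opt in options:
        if opt in cleared:
            continue  # already tested, not the culprit on its own
        enabled = [o for o in options if o != opt]
        plan.append({"disable": opt, "enabled": enabled})
    return plan


if __name__ == "__main__":
    for step in one_off_plan(OPTIONS, STILL_FAILS_WITHOUT):
        print(f"build without {step['disable']}: keep {', '.join(step['enabled'])}")
```

This only covers the case where a single option is responsible; if the
failure needs a combination of options, the last step in the list above
(everything off at once) is the more useful starting point.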
Bye,
Alexander.
--
http://www.Leidinger.net ***@Leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org ***@FreeBSD.org : PGP 0x8F31830F9F2772BF