- Move a bit of the shared logic of secondary bootstrapping into a new
function, lk_secondary_cpu_entry_early() which sets the current cpu
pointer before calling the first half of the secondary LK_INIT
routines.
- Create the per cpu idle threads on the main cpu instead of the
secondary as they come up.
- Tweak all of the SMP capable architectures to use this new path.
- Move the top level mp routines into a separate file top/mp.c
- A bit more correctly ifdef out more SMP code.
- Make the secondary entry point be logically separate function, though
declared in the same file.
- Add a trick where the kernel base + 4 is the secondary entry point.
Not really useful except makes it easy to compute the offset
elsewhere.
- Changed the entry point to arm64_reset and move _start to the linker
script, which is what most other arches do.
- While was in the linker script, make sure the text segment is aligned
on MAXPAGESIZE, though doesn't make any real difference currently.
- Generally clean up the assembly in start.S with newer macros from
Fuchsia, and avoid using ldr X, =value as much as possible.
- Fix and make sure arm64 can build and run with WITH_SMP set to false.
Add a new no-smp project to test this.
Note this will likely break systems where all of the cpus enter the
kernel simultaneously, which we can fix if that becomes an issue.
Secondary code now completely assumes the cpu number is passed in x0.
This can be emulated with platform specific trampoline code if it needs
to that then just directs into the the secondary entry point, instead of
trying to make the arch code have to deal with all cases.
- Add a percpu structure for each cpu, akin to x86-64 and riscv. Pointed
to by x18, which is now reserved for this in the kernel. Tweaked
exception and context switch routines to leave x18 alone.
- Remove the cpu-trapping spinlock logic that is unused in mainline,
probably. (Can add a new version of it back if it's necessary).
- Switch fdtwalk helper to using the newer, cleaner way of initializing
secondaries using the PSCI CPU_ON argument that should be pretty
standard on modern implementations. (Possibly an issue with old
firmware).
- Remove the notion of computing the cpu ID from the Affinity levels,
which doesn't really work properly on modern ARM CPUs which more or
less abandoned the logical meaning of AFFn.
More definitively set up each cpu's SCTLR_EL1 instead of relying on any
default values being present. Also set all RES1 values to 1 according to
what is useful at the moment, generally giving the maximum amount of
priviledges EL1 and EL0.
The test inside riscv/rules.mk was assuming gcc and that the CC variable
isn't passed in from the user. This is not a very clean solution and
acts like a bandaid over the problem. Added some todos for a potential
solution.
No real functional change except how the smaller ARCH_DEFAULT_PAGE_SIZE
is now computed and set in defines.h instead of rules.mk for arch/arm to
be consistent with the other arch that has a large/small build (riscv).
Clean up make targets list-arch and list-toolchain to be much faster and
work without needing to invoke the archtecture's arch rules.mk. This
should make it work on machines that do not have that particular
toolchain in the path.
This is setting up for using it in the github action script.
A large pile of changes to the PC platform and x86 architecture that
facilitate SMP support. Tested in both 64 and 32bit on qemu and real
hardware all the way back through i486.
Currently only supports both set at the same time, but rearrange their
order such that theoretically both are pretty independent.
Somewhat untested in that there's no configuration for !KERNEL_VM and
WITH_SMP, but at least allows that config to be tested.
When dropping from EL2 (or EL3), load vmpidr_el2 and vpidr_el2 with the
correct values to make sure EL1 sees the 'real' mpidr_el1 and midr_el1.
Though in most cases they're already configured by whatever firmware ran
before, there's no actual guarantee that it is, and it may be full of
random garbage.
Sadly this doesn't really work in all situations and only happens to
work with gcc + binutils for 32bit accesses, presumably because gnu as
replaces a literal 0 with wzr.
Clang doesn't understand it at all.
This reverts commit 6c14941dec.
To work properly with some hypervisors on various architectures (ARM,
ARM64, x86), add global routines to allow access to MMIO registers via
architecturally defined accessors.
Add accessors for ARM, ARM64, and x86-32/64. Have the other arches
default to just using whatever the compiler emits.
Will need to generally move things off the legacy REG*() accessors
since they're really not safe going forward with what compilers emit.
Precisely set bits [55:22] of the vaddress in bits [43:0] for the vae1is
and vaee1is TLBI commands.
On platforms where FEAT_TLL is implemented, bits [47:44] of the command
accept a TTL parameter which can optionally be set to hint the
translation table level containing the address being invalidated.
Implementations aren't architecturally required to perform the
invalidation if the hint is incorrect however. Invalidations may
therefore fail with the current implementation if the vaddress has bits
set in [58:55].
This is notably an issue on ARM fastmodels which doesn't perform the
invalidation when the TTL parameter is incorrect.
Clang does not accept this .if condition since phys_offset is a register
alias and not an absolute expression. We can keep these two instructions
here if the argument is zero since the result will be the same.
Additionally, this macro is only called once and always passes a non-zero
argument. If more calls are added in the future and avoiding these two
instructions just before a loop is really important, we could use
`.ifnc \phys_offset,0` instead, but that looks rather obscure to me.
Even though we only need 32 bits here, clang warns that we should be
using a "w" register in the inline assembly (which is not legal with
mrs/msr). Silence the warning by declaring the value as unsigned long.
(cherry picked from commit a0de5d88dfc67b3ba34c0455b1619e12e6cfccae)
Trim the arch mmu unit tests accordingly.
Should probably switch this to a #define, but it's possible some of
these queries could be dynamically detected (XN for example). May
revisit at some point.
Allow asking the arch layer if it supports NX pages or NS pages.
Have the arch mmu test code test accordingly.
Also tweak the tests to pass on arm32 mmu, which does not precisely
match the return semantics of the rest of the mmu routines on map/unmap.
Change the early startup code to set TCR_EL1.IPS to
ID_AA64MMFR0_EL1.PARange if it has a defined value (the currently
defined values have the same meanings), but use 48-bit PAs if 52-bit
PAs are supported because 52-bit PAs have a different translation
table format that we don't support. Stash the computed TCR_EL1 in a
variable and use it in the context switch code.
Have the arch define additional compiler flags to explicit support or
not support a floating point unit.
Add ability for modules to per file or for the whole module mark code
as needing floating point support.
Add default flags for arm64, riscv, and x86 toolchains.
Needed because gcc 12 is getting much more aggressive about using vector
instructions for non float code, so getting away with avoiding it was
no longer working.
Still not perfect: printf code is being compiled with float, so it's
possible to use floating point instructions inside core kernel or
interrupt handling code if a printf is used.
Possibly will have problems on architectures where mixing float and non
float code at the linker generates issues, but so far seems to be okay.
I noticed that LK failed to boot on systems that do not support 64KB
page sizes (e.g. Linux KVM guest on Apple M1) because the trampoline
translation table used a compile-time hardcoded 64KB page size.
Instead of trying to make the trampoline translation table code
look for a supported page size at runtime, I realized that it should
be possible to remove the trampoline translation table entirely by
replacing it with a VBAR that branches to the instruction following
the MMU enable. That's what this patch does.
Add cache clean + invalidate on the page tables that get modified during
startup before the MMU is enabled. Without this, if these memory regions
were present in cache before LK started, the CPU will see the stale
cached values as soon as the MMU is enabled. Invalidating these forces
the CPU to fetch the correct values from memory after the MMU is enabled.
If we were booted at EL2 (e.g. when passing -machine
virt,virtualization=on), we need to use SMC instead of HVC for PSCI
calls. Change psci_call() to do this and add a flag to do-qemuarm to
allow testing this scenario.
It is possible for early initialization functions such as lk_main()
to contain NEON instructions because we don't build the kernel with
-mgeneral-regs-only. As a result we can end up taking an FPU exception
before we are ready to handle it.
We didn't have this problem when starting at a higher exception level
than EL1 because we turned off FPU traps in arm64_elX_to_el1(). But we
neglected to do so when starting at EL1. Fix the problem by moving the
CPACR_EL1 manipulation out of arm64_elX_to_el1() and into arm_reset().
Much of the start.S path avoids using these registers up until now to
avoid trashing any state, but its getting fairly difficult and error
prone to keep this up. Save the args as soon as its known that its the
boot cpu in a temporary place prior to calling lk_main. Wastes 32 bytes
of memory but should be more solid.
It's called immediately upon entering the kernel entry vector, prior
to knowing if it's the boot cpu or needing to save any boot arguments,
so avoid using these registers
Previously would only set both UXN and PXN for no execute pages, but for
pages not marked no execute, neither bit was set. Change to mask out the
other privilege mode.
Previously if they couldn't find the toolchain they would full stop the
build. Change to print a warning and then go with the default prefix.
Hopefully this doesn't break anyone downstream but it's helpful for the
CI builder which wants to read from the build system which toolchain to
grab prior to having it in the path.
When a crash is because of a BRK instruction, print that instead of
the default "unhandled synchronous exception".
Bug: 179516283
Change-Id: I9667d7157d24a79e2b2ceb7ef283ebc2b09398d0
1) Decode FSC and dump more human readable status
2) Add support of stack unwinding as referred from
arm64 procedure call standard and frame pointer usage.
3) Compiler options for not omitting frame pointer
are enabled to ensure usage of frame pointers even
with higher optimization levels enabled.
Signed-off-by: vannapurve <vannapurve@google.com>