The pendsv asm handler has been pushing 9 words onto the stack prior to
calling into C code. This violates the ABI, which requires 8-byte stack
alignment at call boundaries. It has worked mostly fine and thus hasn't been
caught before. Add an extra bump of the stack to align it after pushing the
registers.
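A minimal sketch of the shape of the fix, assuming a typical Cortex-M PendSV
sequence (the real handler saves more state and the symbol names here are
invented):

    /* hypothetical illustration of the alignment fix, not the actual handler */
    void pendsv_c_part(void);   /* placeholder for the C side */

    __attribute__((naked)) void pendsv_sketch(void) {
        __asm__ volatile(
            "push  {r4-r11, lr}   \n"  /* 9 words: sp is now only 4-byte aligned */
            "sub   sp, #4         \n"  /* extra bump to restore 8-byte alignment */
            "bl    pendsv_c_part  \n"  /* AAPCS requires 8-byte sp at this call */
            "add   sp, #4         \n"
            "pop   {r4-r11, pc}   \n");
    }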
Release the thread lock before context switching to a thread that was
preempted and thus not holding the thread lock. Add a few asserts to
make sure this invariant is maintained in the context switch and PENDSV
handler.
This has never mattered before because the thread lock (and other spinlocks)
was not being tested for validity on Cortex-M systems, which are by
definition single processor. After adding some code to test the spinlocks'
values, this discrepancy was uncovered.
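The asserts are along these lines (sketch only; the exact checks and where
they sit in the scheduler and the exception path differ):

    /* illustrative, assuming LK's DEBUG_ASSERT, spin_lock_held() and the
     * global thread_lock; not the literal hunk */
    static void context_switch_invariants(void) {
        DEBUG_ASSERT(arch_ints_disabled());
        DEBUG_ASSERT(spin_lock_held(&thread_lock));
    }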
arch_mmu_map was failing hard because the identity mapping does not fall
within vmm_get_kernel_aspace(). This creates a new aspace covering the
loader, so it can be identity mapped.

Linux is also unable to use the FPU if lazy FPU context switching had turned
it off prior to the chainload, so arm_fpu_set_enable() is used to turn it
back on.
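Roughly, the new path looks like the following; the vmm/arch_mmu signatures
and the arch_aspace field name here are approximations and may not match the
tree exactly:

    /* sketch only: give the loader its own aspace with an identity mapping,
     * then re-enable the FPU before jumping into linux */
    static void prepare_chainload(paddr_t loader_paddr, size_t loader_len) {
        vmm_aspace_t *aspace;
        vmm_create_aspace(&aspace, "chainload", 0);

        /* identity map the loader so execution survives the aspace switch */
        arch_mmu_map(&aspace->arch_aspace, loader_paddr, loader_paddr,
                     loader_len / PAGE_SIZE, 0);

        /* undo the lazy fpu disable so linux can use the fpu immediately */
        arm_fpu_set_enable(true);
    }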
Previously, if the toolchain couldn't be found, the build would stop
entirely. Change this to print a warning and then go with the default prefix.
Hopefully this doesn't break anyone downstream, but it's helpful for the CI
builder, which wants to read from the build system which toolchain to grab
prior to having it in the path.
It's not ideal, since there's no relaxation done in the linker, so we have to
assume the branch target is more than 16 bits away and do what the compiler
usually does: emit a full 32-bit jsr.
Mostly rewrote it to be cleaner and more obvious about what it's doing, but
it turns out the real problem was a lack of "memory" clobbers and/or
volatile. In one of the test cases the compiler was rearranging the
arch_ints_disabled() check.
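For reference, the failure mode is the classic one below (names and the
Cortex-M instructions are illustrative, not the actual LK routines):

    #include <stdbool.h>
    #include <stdint.h>

    /* without the "memory" clobber the compiler may hoist or sink memory
     * accesses across the point where interrupts get disabled, which is
     * exactly the reordering seen around the arch_ints_disabled() check */
    static inline void ints_disable_sketch(void) {
        __asm__ volatile("cpsid i" ::: "memory");
    }

    static inline bool ints_disabled_sketch(void) {
        uint32_t primask;
        __asm__ volatile("mrs %0, primask" : "=r"(primask));
        return primask & 1;
    }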
Uses the QEMU virt machine for 68k defined in qemu 6.0+.
Basic support that boots, prints to the console, takes input from the
console, and context switches.
TODO: interrupt support, timer support.
The issue was found on an AMD machine when running lk with qemu kvm: it
can't boot if kvm acceleration is enabled in qemu.

According to the Intel system programming guide, Chapter 4 "Paging", the G
bit is ignored if the page table entry is a non-leaf entry. However,
according to the AMD programmer's manual, Volume 2, Chapter 5.3 "Long-Mode
Page Translation", the G bit of non-leaf page table entries must be zero.

The patch sets the G bit of inner page table entries to zero so that it
works on both Intel and AMD CPUs.
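A sketch of the rule being applied (bit position per the x86 paging format;
the helper and constant names are made up):

    #include <stdint.h>

    #define X86_PG_G   (1ull << 8)   /* Global: only meaningful on leaf entries */

    /* inner (non-leaf) entries keep G clear so AMD long mode accepts them;
     * Intel ignores the bit on non-leaf entries either way */
    static uint64_t make_inner_pte(uint64_t table_paddr, uint64_t flags) {
        return (table_paddr & ~0xfffull) | (flags & ~X86_PG_G);
    }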
Add build system support for at least being aware of the FPU on the
architecture, without yet building code to use it.
At the moment it only sets up the FPU into the Initial state prior to
entering user space and then ignores it.
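What "Initial state" amounts to is roughly the following; the csr accessor
and macro names are assumptions in the style of LK's riscv code:

    /* set sstatus.FS (bits 14:13) to 01 = Initial before entering user space */
    #define SSTATUS_FS_MASK     (3ul << 13)
    #define SSTATUS_FS_INITIAL  (1ul << 13)

    static inline void fpu_init_state_sketch(void) {
        unsigned long s = riscv_csr_read(sstatus);   /* assumed csr accessor */
        s = (s & ~SSTATUS_FS_MASK) | SSTATUS_FS_INITIAL;
        riscv_csr_write(sstatus, s);
    }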
For user space support, the sscratch register cannot hold the pointer to the
current cpu, as convenient as that is.
Change the logic to use the tp register (x4) to point to the percpu
structure, and dereference the current thread from it directly.
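The shape of the new scheme, approximately (struct layout and names are
illustrative):

    struct thread;

    struct riscv_percpu {
        struct thread *curr_thread;
        /* ... other per cpu state ... */
    };

    /* tp (x4) always points at the local cpu's percpu structure */
    static inline struct riscv_percpu *riscv_get_percpu(void) {
        struct riscv_percpu *p;
        __asm__("mv %0, tp" : "=r"(p));
        return p;
    }

    static inline struct thread *get_current_thread_sketch(void) {
        return riscv_get_percpu()->curr_thread;
    }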
Create a user space address space, map some pages, query the pages, context
switch to the new aspace, and access the pages. A basic test that the aspace
abstraction is working.
Will generate errors on some of the arches that don't fully implement all of
this, but not a crash.
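In rough C the test walks through this sequence; the vmm call signatures
below are approximations and may not match the tree:

    #include <string.h>

    static void aspace_smoke_test(void) {
        vmm_aspace_t *as;
        void *ptr = NULL;

        vmm_create_aspace(&as, "test aspace", 0);                   /* new user aspace */
        vmm_alloc(as, "test pages", 4 * PAGE_SIZE, &ptr, 0, 0, 0);  /* map some pages */

        /* (the real test also queries the translation back via arch_mmu_query) */
        vmm_set_active_aspace(as);        /* context switch to the new aspace */
        memset(ptr, 0x99, PAGE_SIZE);     /* touch the mapping */
        vmm_set_active_aspace(NULL);      /* back to the kernel aspace */

        vmm_free_aspace(as);
    }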
Kind of wasteful, but much simpler than having to manually sync every time something changes
in the kernel aspace. I think riscv machines with an mmu can waste 1MB of page tables up front.
Can revisit later if needed.
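For a sense of scale (my arithmetic, assuming Sv39): the kernel half of the
address space is covered by 256 top-level entries, and preallocating one 4KB
second-level table for each of them is 256 * 4KB = 1MB, which is where the
estimate above comes from.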
When a crash is because of a BRK instruction, print that instead of
the default "unhandled synchronous exception".
Bug: 179516283
Change-Id: I9667d7157d24a79e2b2ceb7ef283ebc2b09398d0
Up until now the bottom part of ram has been identity mapped, left over from
initial bootstrapping. Set up two top level page tables: one with the
identity map and one without. Once the kernel starts, switch to the second,
but keep the former around for bootstrapping secondary cpus.
Start adding support for user address spaces, currently mostly untested.
Still have to solve the problem of keeping the kernel parts of the page tables
in sync. Will probably preallocate all of the ones needed.
Move the spinlock routines out of line, since the body is relatively
large and to keep the disassembly clean. Have spinlocks
store the holder cpu + 1 instead of just 1. Add an appropriate
barrier to the release.
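In outline the new scheme is the following sketch, with gcc builtins standing
in for the actual arm implementation and LK's arch_curr_cpu_num() assumed as
the cpu-number accessor:

    #include <stdbool.h>

    typedef unsigned long spin_lock_sketch_t;   /* 0 = unlocked */

    static void spin_lock_sketch(spin_lock_sketch_t *lock) {
        unsigned long expected, val = arch_curr_cpu_num() + 1;  /* holder cpu + 1 */
        do {
            expected = 0;
        } while (!__atomic_compare_exchange_n(lock, &expected, val, false,
                                              __ATOMIC_ACQUIRE, __ATOMIC_RELAXED));
    }

    static void spin_unlock_sketch(spin_lock_sketch_t *lock) {
        __atomic_store_n(lock, 0, __ATOMIC_RELEASE);  /* barrier before release */
    }

    static bool spin_lock_held_sketch(const spin_lock_sketch_t *lock) {
        return *lock == arch_curr_cpu_num() + 1;
    }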
If the cpu is always in thumb mode there's really no reason to pass
this switch, and it can and does foul up libgcc selection.
It's possible it can be removed entirely since the build system doesn't
really support anything prior to armv7 or armv6, where thumb interwork
became implicit. Unclear if it'll cause linking issues to not have it
set, however.
This provides a way for a platform or target to insert code or
other secret sauce in front of the vector table for targets that
need a second stage loader prepended or something like that.
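For example, a target could drop a stub into a dedicated section that the
linker script then places in front of the vectors (the section name below is
made up):

    /* hypothetical second stage loader blob placed ahead of the vector table */
    __attribute__((section(".text.vectab_prefix"), used))
    static const unsigned char preloader_stub[16] = {
        0x00,   /* loader image or boot header bytes would go here */
    };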
Instead of compiling each .c or .cpp or .S file into an equivalent .o file,
map it to a file with .c.o or .cpp.o extension.
i.e.,
foo.c -> foo.c.o
bar.cpp -> bar.cpp.o
The reason for this is that if you change the suffix of a file, the build
will automatically pick it up and recompile it.
- Add support for probing SBI extensions
- Switch to newer versions if present
- Add the HSM extension, which allows proper secondary cpu bootstrap
- Add support for secondary bootup via HSM (see the sketch below)
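A sketch of the probing side, per the SBI spec; the ecall wrapper and return
struct are placeholders for whatever the riscv code actually uses:

    #include <stdbool.h>

    struct sbi_ret { long error; long value; };
    struct sbi_ret sbi_call(unsigned long ext, unsigned long fn, ...);  /* placeholder ecall wrapper */

    #define SBI_EXT_BASE        0x10
    #define SBI_BASE_PROBE_EXT  3
    #define SBI_EXT_HSM         0x48534d    /* "HSM": hart state management */
    #define SBI_HSM_HART_START  0

    static bool sbi_ext_present(long extid) {
        struct sbi_ret r = sbi_call(SBI_EXT_BASE, SBI_BASE_PROBE_EXT, extid);
        return r.error == 0 && r.value != 0;
    }

    /* start a secondary hart through HSM if the extension probed present */
    static long hart_start(unsigned long hartid, unsigned long entry, unsigned long arg) {
        return sbi_call(SBI_EXT_HSM, SBI_HSM_HART_START, hartid, entry, arg).error;
    }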
Turns out that in some cases we can't really rely on a particular boot cpu id,
so go ahead and simply assign cpu numbers dynamically by having each cpu
increment an atomic counter at the very top of start.S. The first one in gets
the worm.
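Conceptually the assignment at the top of start.S is just the following (a C
rendering of what the assembly does; the real code is a few instructions run
before the stack is even set up):

    static volatile unsigned int next_cpu_num;

    /* every cpu that arrives atomically takes the next number; whoever
     * gets 0 proceeds as the boot cpu */
    static unsigned int assign_cpu_num(void) {
        return __atomic_fetch_add(&next_cpu_num, 1, __ATOMIC_RELAXED);
    }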
In a two segment binary we were previously using MAXPAGESIZE set to 4
to cram the sections in very tight for embedded. With this toolchain that
seemed to cause the linker to get confused and sometimes stuff an extra 4
bytes into the output file, thus misaligning the data segment.
It's possible it's still a bug on my side rather than in the linker, but
setting max_page_size to 8 seems to work around it for now. Possibly there's
some implicit 64-bit alignment slipped into a stage somewhere in binutils
that's causing it to get confused. Either way, 8-byte alignment is no large
loss here.
Previous to now it had always relied on a custom patched gcc
and a custom sim. In the interim since the initial port went in
some time in 2015, GCC and QEMU have both officially picked up support
for the architecture and the machine that was emulated in the previous
emulator.
Using gcc 10.2, fix up the build and get it basically working. Timers
seem to not be working right, but it's probably fairly easy to fix.
1) Decode the FSC and dump a more human readable status
2) Add support for stack unwinding based on the arm64 procedure call
standard and its frame pointer usage (see the sketch below)
3) Enable compiler options for not omitting the frame pointer, to ensure
frame pointers are used even with higher optimization levels enabled
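The unwinder boils down to walking the AAPCS64 frame records (sketch only;
the real code validates the pointers and symbolizes the addresses):

    #include <stdint.h>
    #include <stdio.h>

    /* with -fno-omit-frame-pointer each frame stores {caller fp, lr} at x29 */
    struct frame_record {
        uint64_t fp;
        uint64_t lr;
    };

    static void backtrace_sketch(uint64_t fp) {
        for (int depth = 0; fp != 0 && depth < 32; depth++) {
            const struct frame_record *fr = (const struct frame_record *)(uintptr_t)fp;
            printf("frame %d: lr %#llx\n", depth, (unsigned long long)fr->lr);
            fp = fr->fp;   /* follow the chain of saved frame pointers */
        }
    }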
Signed-off-by: vannapurve <vannapurve@google.com>
All shipping ARMv7-R processors include VFPv3-D16, a subset of VFPv3
with only 16 double-precision registers. The first -R profile CPU
with a complete VFP or NEON implementation is the Cortex-R52,
implementing ARMv8-R.
LK's context switch code had a dynamic check and would only save
d16-d31 if present, but gcc/gas will not assemble code that includes
references to d16-d31 when mcpu=cortex-r4f or other v7-R CPUs.
Ideally we'd key the #if off of __TARGET_FPU_VFPV3_D16, but that
only appears to be defined by the ARM compilers, not gcc. Use
V7-R as the key instead.
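The resulting conditional is roughly the following (macro spellings are
approximate; the point is keying on the v7-R profile rather than a runtime
check):

    /* v7-R parts only implement d0-d15, and gas refuses d16-d31 for them anyway */
    #if ARM_ISA_ARMV7R
    #define FPU_D_REGS 16   /* VFPv3-D16: save/restore d0-d15 only */
    #else
    #define FPU_D_REGS 32   /* full VFPv3/NEON register file */
    #endif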