- Add a percpu structure for each cpu, akin to x86-64 and riscv, pointed
to by x18, which is now reserved for this in the kernel. Tweaked
exception and context switch routines to leave x18 alone (sketched after
this list).
- Remove the cpu-trapping spinlock logic, which is probably unused in
mainline. (Can add a new version of it back if it turns out to be
necessary.)
- Switch the fdtwalk helper to the newer, cleaner way of initializing
secondaries via the PSCI CPU_ON call, which should be fairly standard on
modern implementations and is also sketched below. (Possibly an issue
with old firmware.)
- Remove the notion of computing the cpu ID from the Affinity levels,
which doesn't really work properly on modern ARM CPUs, since they have
more or less abandoned the logical meaning of AFFn.
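A minimal sketch of the two new pieces, using assumed names
(arm64_percpu, arm64_get_percpu and psci_cpu_on are illustrative, not
necessarily the actual identifiers): reading the per-cpu pointer out of
the reserved x18 register, and starting a secondary through the PSCI
CPU_ON call. The SMC conduit is assumed here; some firmware expects HVC
instead.

```c
#include <stdint.h>

// Hypothetical per-cpu record; the real structure carries whatever
// per-cpu state the kernel needs (cpu number, current thread, stats, ...).
struct arm64_percpu {
    uint32_t cpu_num;
    void *current_thread;
};

// x18 is reserved for the kernel (e.g. built with -ffixed-x18), so the
// per-cpu structure is a single register move away.
static inline struct arm64_percpu *arm64_get_percpu(void) {
    struct arm64_percpu *p;
    __asm__("mov %0, x18" : "=r"(p));
    return p;
}

// PSCI CPU_ON, SMC64 function id per the PSCI spec: x0 = function id,
// x1 = target MPIDR, x2 = physical entry point, x3 = opaque context.
#define PSCI64_CPU_ON 0xC4000003UL

static int64_t psci_cpu_on(uint64_t target_mpidr, uint64_t entry_pa,
                           uint64_t context) {
    register uint64_t x0 __asm__("x0") = PSCI64_CPU_ON;
    register uint64_t x1 __asm__("x1") = target_mpidr;
    register uint64_t x2 __asm__("x2") = entry_pa;
    register uint64_t x3 __asm__("x3") = context;
    __asm__ volatile("smc #0"
                     : "+r"(x0)
                     : "r"(x1), "r"(x2), "r"(x3)
                     : "memory");
    return (int64_t)x0;  // 0 on success, negative PSCI error code otherwise
}
```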
More definitively set up each cpu's SCTLR_EL1 instead of relying on any
default values being present. Also set all RES1 bits to 1 and configure
the rest according to what is useful at the moment, generally giving EL1
and EL0 the maximum amount of privileges.
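A minimal sketch of the idea, with assumed names (the bit defines and
arm64_write_sctlr_el1 are illustrative): build the SCTLR_EL1 value from
scratch rather than modifying whatever the boot environment left behind.
The RES1 mask shown is the baseline ARMv8.0 one; later revisions assign
meanings to some of those bits.

```c
#include <stdint.h>

// A few of the architecturally defined SCTLR_EL1 fields; the real code
// covers many more.
#define SCTLR_EL1_M     (1UL << 0)    // MMU enable
#define SCTLR_EL1_C     (1UL << 2)    // data cache enable
#define SCTLR_EL1_I     (1UL << 12)   // instruction cache enable
#define SCTLR_EL1_RES1  0x30D00800UL  // bits 11, 20, 22, 23, 28, 29 (ARMv8.0)

static inline void arm64_write_sctlr_el1(uint64_t val) {
    __asm__ volatile("msr sctlr_el1, %0\n"
                     "isb" :: "r"(val) : "memory");
}

static void arm64_cpu_early_init(void) {
    // Construct the register explicitly; the MMU enable bit gets set
    // separately once the translation tables are in place.
    arm64_write_sctlr_el1(SCTLR_EL1_RES1 | SCTLR_EL1_I | SCTLR_EL1_C);
}
```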
This will generally turn off more FPU codegen, even if it's using a
software fallback, unless the project/target/platform selects a cpu that
has FPU support. This also turns off a few blocks of test code and the
upcoming floating point printf when FPU support is not present on the arch.
This may break projects that were compiling for, say, cortex-m0 but
expected FPU code to be present. If so, it should be pretty easy to
override, but not going to add that yet unless it's necessary.
- The x86 cpuid feature list dump was using the wrong array and walking
off the end of it.
- GICv2 code had a left shift of a signed integer by up to 31 bits; the
value being shifted needs to be unsigned (see the sketch after this
list).
- PLIC code had the same issue as the GIC code.
- fdtwalker code should use a bytewise accessor helper function for
reading large integers out of an unaligned FDT.
- PCI BIOS32 search code could do a 32-bit unaligned read of a string;
switch to using memcmp.
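A minimal sketch of the two fixes flagged above, with hypothetical helper
names: masks that can reach bit 31 need an unsigned constant, and
multi-byte FDT cells should be assembled a byte at a time since the blob
makes no alignment guarantees.

```c
#include <stddef.h>
#include <stdint.h>

// 1 << 31 on a 32-bit int is undefined behavior; shift an unsigned
// constant instead so the full register width is usable.
static inline uint32_t bit_mask(uint32_t bit) {
    return 1U << bit;
}

// FDT cells are big-endian and may land at any byte offset, so read them
// bytewise rather than dereferencing a wider pointer.
static inline uint64_t fdt_read_be(const uint8_t *p, size_t len) {
    uint64_t val = 0;
    for (size_t i = 0; i < len; i++) {
        val = (val << 8) | p[i];
    }
    return val;
}
```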
The test inside riscv/rules.mk was assuming gcc and that the CC variable
isn't passed in by the user. This is not a very clean solution and acts
as a bandaid over the problem. Added some TODOs for a potential
solution.
Most of the functions for this were declared in the top level lk/ include
space, so go ahead and move them there.
A few exceptions:
- Moved spin() over to platform/time.h and platform/time.c since the
function more logically belongs in platform/time.h. Any users of
spin() will need to update their includes to use platform/time.h
instead.
- Renamed spin_cycles() to arm_cm_spin_cycles() and moved it into
arm/cm.h since it is currently defined in arch/arm-m and only used by
targets that are implicitly arm-m.
Doesn't do much yet, but provides the detection path for it and the
ability to hold initialized state. The higher level platform code is
going to need to use it directly, so this mostly just provides an API for
access to it.
Moved ACPI sniffing back to just after the VM is initialized instead of
all the way into platform_init(). This should help ensure that all
drivers that come up afterwards will have the ioapics discovered, in case
future development tries to enable and use them, kicking the machine out
of virtual wire mode.
I have a Bulldozer machine here that curiously starts the APIC IDs for
the cpus at 16 and counts up.
This is a problem since the current code assumes that the boot cpu is 0,
and would try to start itself (apic id 16) later because it thought it
was the first secondary. Fix this by re-reading the APIC id on the boot
cpu and patching the percpu structure a bit into boot. Kind of a hack,
but it avoids having to detect the APIC, find the type of ID to read, etc.
This also means that, practically speaking, the system uses the full
32bit APIC IDs if that feature is present, since the local apic id is now
read entirely from the local apic itself, as it should be.
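Roughly what the fix looks like, with assumed names (lapic_regs and
x86_percpu are illustrative, not the actual identifiers): once the local
APIC is mapped, read the boot cpu's real ID and patch it into the boot
cpu's percpu entry instead of assuming it is 0.

```c
#include <stdint.h>

#define LAPIC_ID_REG 0x20   // xAPIC ID register; the ID lives in bits 31:24

// Assumed to point at the local APIC MMIO mapping.
static volatile uint8_t *lapic_regs;

static uint32_t read_local_apic_id(void) {
    uint32_t reg = *(volatile uint32_t *)(lapic_regs + LAPIC_ID_REG);
    return reg >> 24;  // in x2apic mode the full 32-bit ID comes from MSR 0x802
}

struct x86_percpu {
    uint32_t apic_id;
    uint32_t cpu_num;
};

static struct x86_percpu percpu[16];

// Called a bit into boot, after the local APIC is reachable.
static void fix_boot_cpu_apic_id(void) {
    percpu[0].apic_id = read_local_apic_id();
}
```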
Fixes #475
No real functional change, except that the smaller ARCH_DEFAULT_PAGE_SIZE
is now computed and set in defines.h instead of rules.mk for arch/arm, to
be consistent with the other arch that has a large/small build (riscv).
Clean up the make targets list-arch and list-toolchain to be much faster
and to work without needing to invoke the architecture's arch rules.mk.
This should make them work on machines that do not have that particular
toolchain in the path.
This is setting up for using it in the github action script.
Require the platform or arch to provide it. The default was pretty
obsolete, assuming a 32bit arch with a split 50/50 mapping. May as well
make it a forced arch thing so that it's known to be correct.
Had to define it for OR1K; it was already defined for all the other
(known) architectures.
For newer SMP and modern cpus, just doing a full fpu context switch
is simpler and cheaper than dealing with the complexity of lazy fpu
context switch logic.
Plus this old logic was broken in the recent SMP-ification of the x86
code base. A functional version of this would be far more complex.
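A minimal sketch of the eager approach, assuming FXSAVE-capable hardware
and illustrative names: every context switch saves the outgoing thread's
FPU/SSE state and restores the incoming one's, with no CR0.TS
trap-on-first-use dance.

```c
#include <stdint.h>

// 512-byte FXSAVE image; the instruction requires 16-byte alignment.
struct x86_fpu_state {
    uint8_t fxsave_area[512] __attribute__((aligned(16)));
};

// Unconditionally save and restore on every switch.
static void x86_fpu_context_switch(struct x86_fpu_state *oldstate,
                                   struct x86_fpu_state *newstate) {
    __asm__ volatile("fxsave %0" : "=m"(oldstate->fxsave_area));
    __asm__ volatile("fxrstor %0" :: "m"(newstate->fxsave_area));
}
```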
A large pile of changes to the PC platform and x86 architecture that
facilitate SMP support. Tested in both 64 and 32bit on qemu and real
hardware all the way back through i486.
In legacy builds it's possible to boot on a cpu that doesn't appear to
have CR4 implemented (an Am586 to be precise), but there are no features
that need to be set there, so it seems this is architecturally okay (see
the sketch below).
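A sketch of how that can be handled, with illustrative names: accumulate
the CR4 bits the detected features want (CR4.PGE, from the list below,
being one example) and only touch the register when there is something
to set.

```c
#include <stdint.h>

#define CR4_PGE (1UL << 7)   // global page enable

static inline unsigned long x86_get_cr4(void) {
    unsigned long val;
    __asm__ volatile("mov %%cr4, %0" : "=r"(val));
    return val;
}

static inline void x86_set_cr4(unsigned long val) {
    __asm__ volatile("mov %0, %%cr4" :: "r"(val));
}

static void x86_init_cr4(uint32_t cpuid_edx_features) {
    unsigned long bits = 0;
    if (cpuid_edx_features & (1U << 13)) {   // CPUID.01h:EDX bit 13 = PGE
        bits |= CR4_PGE;
    }
    // Don't touch CR4 at all if there's nothing to set; very old parts
    // may not implement the register.
    if (bits != 0) {
        x86_set_cr4(x86_get_cr4() | bits);
    }
}
```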
32 and 64 bit:
- For now SMAP causes the mmu unit tests to fail, so disable.
- Make sure CR4.PGE is set if present.
- Make sure the rest of the system knows that user aspaces are available
on 32bit.
- Skip cpus in the MADT table that are not enabled.
- Bump the supported cpu count to 16.
- Move the spurious interrupt vector to 0xff since it needs to end in
0xf on <=P6.
- Fix an assert in local apic code when not using x2apic and starting
secondaries.
- Follow the spec a bit closer and wait up to a second for each
secondary core to start.
- Move the local apic driver to arch/x86.
- Add routines to send IPIs between cpus.
Something is unstable at the moment and the system crashes after a while
with random corruptions when using SMP.
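A minimal sketch of the IPI path through the xAPIC ICR, with assumed
names (lapic_regs is illustrative): write the destination APIC ID to the
high half of the ICR, then the vector to the low half, which triggers
the send; then poll the delivery status bit.

```c
#include <stdint.h>

#define LAPIC_ICR_LOW         0x300
#define LAPIC_ICR_HIGH        0x310
#define ICR_DELIVERY_PENDING  (1U << 12)

// Assumed to point at the local APIC MMIO mapping.
static volatile uint8_t *lapic_regs;

static void lapic_send_ipi(uint32_t dest_apic_id, uint8_t vector) {
    volatile uint32_t *icr_low =
        (volatile uint32_t *)(lapic_regs + LAPIC_ICR_LOW);
    volatile uint32_t *icr_high =
        (volatile uint32_t *)(lapic_regs + LAPIC_ICR_HIGH);

    *icr_high = dest_apic_id << 24;   // destination APIC ID in bits 31:24
    *icr_low = vector;                // fixed delivery mode, physical dest

    // wait until the APIC reports the IPI has been sent
    while (*icr_low & ICR_DELIVERY_PENDING)
        ;
}
```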
Currently only supports having both options set at the same time, but
rearrange their initialization order such that theoretically they are
pretty independent. Somewhat untested, in that there's no configuration
for !KERNEL_VM with WITH_SMP, but this at least allows that config to be
tested.
This triggered an out-of-range shift count warning when
building for RV32 (calling the macro with `ticks >> 32`),
since the truncation happened before the shift.
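A minimal sketch of the fix pattern (names are illustrative): keep the
value 64-bit while shifting and only narrow it afterwards, so an RV32
build never sees a 32-bit shift by 32.

```c
#include <stdint.h>

static inline uint32_t ticks_high_word(uint64_t ticks) {
    // shift first, then truncate; not (uint32_t)ticks >> 32, which
    // truncates first and triggers the shift-count warning on RV32
    return (uint32_t)(ticks >> 32);
}
```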
Rearrange some of the cpu initialization code to be runnable on each cpu
as they come up. Complete the 64bit bootstrap mechanism and call into C
code.
Makes it as far as trying to reschedule via an IPI. Need to implement
local apic based IPI mechanism.
It's getting too hard to maintain a single layout that works with both,
so go ahead and split it. Also redo the layout so it should be usable
with user space and syscall and sysenter instructions from either mode.
I hadn't noticed this before, but you can directly reference a global
variable in a load/store in assembly, which collapses an lla + ld/sd
sequence into a 2 instruction pair instead of 3, thanks to the 12 bit
offset available in the load/store.
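A small illustration with a hypothetical symbol name: the assembler folds
the %pcrel_lo relocation straight into the load, so referencing the
global directly assembles to auipc + ld rather than the three-instruction
lla (auipc + addi) followed by a separate ld.

```c
#include <stdint.h>

uint64_t some_global = 42;   // hypothetical global, for illustration

uint64_t read_global_directly(void) {
    uint64_t val;
    // "ld rd, symbol" is a pseudo-instruction that expands to:
    //   auipc rd, %pcrel_hi(some_global)
    //   ld    rd, %pcrel_lo(.)(rd)
    __asm__("ld %0, some_global" : "=r"(val));
    return val;
}
```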