To work properly with some hypervisors on various architectures (ARM,
ARM64, x86), add global routines to allow access to MMIO registers via
architecturally defined accessors.
Add accessors for ARM, ARM64, and x86-32/64. Have the other arches
default to just using whatever the compiler emits.
Will need to generally move things off the legacy REG*() accessors
since they're really not safe going forward with what compilers emit.
For both 32 and 64bit x86, have each of the exception stubs which push a
few words and branch to the common isr routine be simply 16 byte aligned
to make it easy to calculate the offset from the main isr table. This
cleans up some complexity that was actually broken for interrupts >= 0x80.
Also:
-Switch alignment directives to .balign
-Expand the x86-32 exception table to a full 256
-Remove an extraneous define
-Make sure the IDT is 8 or 16 byte aligned
-Use END_DATA and END_FUNCTION in the exception and gdt asm files
The current code results in
`error: invalid reassignment of non-absolute variable 'isr_stub_start'`.
Use a numbered label instead (as that can be reassigned) and reference
the last occurrence using the b suffix.
Move the mmu_initial_mapping from platform into arch/x86. For this
architecture the default mappings are basically hard coded in arch/x86
anyway, so move ownership of this data there, closer to where it's
actually initialized.
Update the 32 and 64bit paging code to properly use the paddr_to_kvaddr
and vice versa routines.
Update to allocate page tables directly from the pmm instead of the heap
(on 32bit code).
Clean up the x86.h file a bit with how constants are defined
Switch rdtsc to builtin
Added all of the known bits for the main CR registers.
Move the invlpg macro over to the common header.
Update comments in start.S
Trim the arch mmu unit tests accordingly.
Should probably switch this to a #define, but it's possible some of
these queries could be dynamically detected (XN for example). May
revisit at some point.
-Large cleanup of both the 32 and 64bit mmu code. Still the same general
flow but tighten up usage of mixing physical addresses with virtual
addresses.
-Remove the PAE code path from 32bit mmu, which was unused.
Allow asking the arch layer if it supports NX pages or NS pages.
Have the arch mmu test code test accordingly.
Also tweak the tests to pass on arm32 mmu, which does not precisely
match the return semantics of the rest of the mmu routines on map/unmap.
Some of the 16 bit values the cpu pushes on the stack are aligned to
32bit offsets but actually only 16 bits were pushed. Make sure printf
masks off the top 16 bits when printing these fields out.
Save the model/family information in a new structure.
Move the printing portion of the detection to the arch_init runtime
so it gets a chance to print after the uart is initialized.
Turns out all of the in and rep ins/outs instruction macros have been wrong
since they were first added.
-Make sure they clobber memory
-Make sure edi/esi/ecx registers are marked as both read and written by
the instruction.
The latter was causing a codegen problem in the ide driver where the
pointer was pushed forward by a rep ins but the compiler didn't know the
register was modified.
Parse up to 16 pmm arenas from the multiboot memory data structure. Roll
the 32bit code to properly trim at 1GB as before, but using new logic.
Remove conditional checks on WITH_KERNEL_VM in x86 code, which only
really compiles with the mmu and the vm on.
-march=x86-64-v2 is not supported on old compilers.
Can fix in the future, but for the moment may as well just drop -v2
since it's not really being used.
Have the arch define additional compiler flags to explicit support or
not support a floating point unit.
Add ability for modules to per file or for the whole module mark code
as needing floating point support.
Add default flags for arm64, riscv, and x86 toolchains.
Needed because gcc 12 is getting much more aggressive about using vector
instructions for non float code, so getting away with avoiding it was
no longer working.
Still not perfect: printf code is being compiled with float, so it's
possible to use floating point instructions inside core kernel or
interrupt handling code if a printf is used.
Possibly will have problems on architectures where mixing float and non
float code at the linker generates issues, but so far seems to be okay.
Previously if they couldn't find the toolchain they would full stop the
build. Change to print a warning and then go with the default prefix.
Hopefully this doesn't break anyone downstream but it's helpful for the
CI builder which wants to read from the build system which toolchain to
grab prior to having it in the path.
The issue was found on AMD machine when run lk with qemu kvm, it
can't boot if kvm hardware is enable in qemu.
According to Intel system programming guild Chapter 4 "Paging",
if the page table entry is non-leaf entry, then the G bit will be
ignored.
However, According to AMD programmer mannul Volume 2, Chapter 5.3
"Long-Mode Page Translation", the non-leaf page table entry G bit
must be zero.
The patch sets inner page table entry G bit to zero so that it
works on both Intel and AMD CPU
Mostly deduplicating x86-32 and x86-64 code since virtually all of it
can be shared.
Fixed up some cpuid usage which was not properly marking registers as
clobber.
This lets some arches return a 64bit counter.
As a result of fixing this, removed -Wno-format switch in the test app
which caused the need to fix a lot of printfs.
Easy to do except for the legacy compile case for i386, in which case we
have to start defining fallthrough atomic routines that the compiler
will call.
At the moment only implement __atomic_fetch_add_4 since its the only one
in use.
Previously, was relying on a regular definition with the arch_ops.h code
overriding it with a static inline. This has been annoying for some
years since it forces the declarations to be in order. Change it to
simple declare an inline wrapper around an arch_ routine that does
whatever it needs to do.
Set -fno-builtin to keep the compiler from generating load/stores using
sse outside of floating point code. Not ideal for a lot of reasons but
it's difficult to segregate kernel code and user code such that it only
generates SSE instructions there.
Will probably need to do some work to let certain flags be set per
module, and then have only some of the modules be marked as user vs
kernel.