I used to work closely with the Android team at Unity, and in my experience, shifting large native codebases to a new page size often uncovers subtle runtime assumptions beyond just replacing hardcoded constants like PAGE_SIZE. I'm optimistic Google's tooling will help a lot, but I'm curious how effectively it catches the more nuanced compatibility issues like custom allocators or memory pooling tuned for 4K boundaries.
Someone I collaborate with has been having all sorts of fun with a 4k->64k page transition for stuff running on Arm. Part of the fun has been discovering memory leaks that really weren't noticeable or a big deal at 4k, but now that each page is 16x larger, they suddenly become noticeable and can even cause problems.
Could they find those by setting page size to some absurdly large value like 1MB?
Tangent, but 1MB pages aren't that absurd really. x86-64 has hardware support for 4KB, 2MB, and 1GB page sizes (because each level of the page map cuts 9 bits from the virtual address). Luckily it supports all 3 mixed together, so normally you just keep most of your data in 4KB pages and use 2MB/1GB occasionally. But from my understanding nothing prevents you from forcing 2MB on all userspace code, even though the Linux kernel doesn't support it.
A lot of software won't work if you do that. Many JITs and memory allocators have opinions on page size. Also tagged pointers are very common.
Memory page size should be transparent to tagged pointers (any pointers, really); I don't see how they can be affected. You have an object at address 0xAB0BA: does the size of the underlying page matter?
It can be an issue of behavior; for example, Redis recommended disabling transparent huge page support in Linux because of (among other things?) copy-on-write memory page behaviors, and still does if you're going to persist data to disk.
1. You have a redis instance with e.g. 1GB of mapped memory in one 1GB huge page
2. Redis forks a copy of itself when it tries to persist data to disk so it can avoid having to lock the entire dataset for writes
3. Either process then modifies some of the data anywhere in that 1GB (in practice mostly the parent, which keeps serving writes)
4. The OS has to now allocate a new 1GB page and copy the entire data set over
5. Oops, we're under memory pressure! Better page out 1GB of data to the paging file, or flush 1GB of data from the filesystem cache, so that I can allocate this 1GB page for the next 200ms.
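A minimal sketch of that copy-on-write effect (a hypothetical standalone C program, not Redis itself; assumes a Linux system, and madvise only hints that transparent huge pages should back the region). Touching a single byte after fork() forces the kernel to duplicate the whole page backing that byte:

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        size_t len = 1UL << 30;  /* 1 GB region, as in the scenario above */

        /* Anonymous private mapping; ask the kernel to back it with huge pages. */
        char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED) { perror("mmap"); return 1; }
        madvise(buf, len, MADV_HUGEPAGE);  /* hint only; may be ignored */

        memset(buf, 0xAB, len);  /* fault everything in before forking */

        pid_t pid = fork();
        if (pid == 0) {
            /* Child: one tiny write triggers copy-on-write of the whole
               page backing this byte - 4 KB normally, the entire huge
               page if THP kicked in. */
            buf[12345] = 0xCD;
            _exit(0);
        }
        waitpid(pid, NULL, 0);
        munmap(buf, len);
        return 0;
    }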
You could imagine how memory allocators that try to be intelligent about what they're allocating and how much in order to optimize performance might care; when a custom allocator is trying to allocate many small pages and keep them in a pool so it can re-use them without having to request new pages from the OS, getting 100x 2M pages instead of 100x 4k pages is a colossal waste of memory and (potentially) performance.
It's not necessarily that the allocators will break or behave in weird, incorrect ways (they may) but often that the allocators will say "I can't work under these conditions!" (or will work but sub-optimally).
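As a hedged illustration of the pooling concern (a hypothetical allocator fragment, not any particular real allocator): a pool that derives its chunk size from the runtime page size at least keeps its bookkeeping honest when the kernel's page size changes underneath it, whereas a baked-in 4096 silently hands out chunks that are really 16 KB each:

    #include <stddef.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Hypothetical pool: grabs one page at a time from the OS and
       hands it out as a fixed-size chunk. */
    struct page_pool {
        size_t chunk_size;   /* derived from the real page size */
    };

    static void pool_init(struct page_pool *p) {
        /* Ask the OS instead of assuming 4096; on a 16 KB kernel this
           returns 16384 and the pool's accounting stays correct. */
        p->chunk_size = (size_t)sysconf(_SC_PAGESIZE);
    }

    static void *pool_grab_chunk(struct page_pool *p) {
        void *chunk = mmap(NULL, p->chunk_size, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return chunk == MAP_FAILED ? NULL : chunk;
    }

    int main(void) {
        struct page_pool pool;
        pool_init(&pool);
        void *c = pool_grab_chunk(&pool);
        printf("chunk of %zu bytes at %p\n", pool.chunk_size, c);
        return 0;
    }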
Also quite a lot of kernel drivers allocate whole pages (sometimes the device being driven requires it).
This just provides yet another example of why tagged pointers are a terrible idea and shouldn't be used. Someday, more of the address space will get used and your software will break.
I can understand Google's desire for devs to recompile their apps, but I don't see the need to dump old apps from the app store... who cares if an old app that works wastes 12k if it only needs a single 4k page?
Google already dumps old apps from the store for no reason whatsoever:
https://android-developers.googleblog.com/2022/04/expanding-...
You have to update an application every year, even if it is just a meaningless version bump. Otherwise it will be removed after 2 years. Despite Google saying that this policy is required to ensure user security, several recent Android releases didn't have any corresponding major security changes.
I am not familiar with Android, but Linux ELF binaries that specify 4KB alignment will not work on systems with 16KB page sizes, since the ELF interpreter will refuse to load them. This hit me recently when trying to run a 32-bit binary on a Linux ARM system that had 16KB size pages, since the 32-bit OpenSSL libraries specified 4KB alignment. Presumably, this was done for maximizing entropy available to ASLR, but it breaks the binaries when the page size increases.
In any case, I assume that there is something similar affecting Android.
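For what it's worth, the alignment the loader objects to is visible in the ELF program headers (readelf -lW lib.so shows it as the Align column of the LOAD segments). Here is a rough, hypothetical C sketch doing the same check for a 64-bit library:

    #include <elf.h>
    #include <stdio.h>
    #include <string.h>

    /* Print the alignment of each PT_LOAD segment of a 64-bit ELF file.
       A library built for 4 KB pages shows 0x1000 here and can be
       refused by the loader on a 16 KB kernel. */
    int main(int argc, char **argv) {
        if (argc != 2) { fprintf(stderr, "usage: %s lib.so\n", argv[0]); return 1; }

        FILE *f = fopen(argv[1], "rb");
        if (!f) { perror("fopen"); return 1; }

        Elf64_Ehdr eh;
        if (fread(&eh, sizeof eh, 1, f) != 1) { perror("fread"); return 1; }
        if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
            eh.e_ident[EI_CLASS] != ELFCLASS64) {
            fprintf(stderr, "not a 64-bit ELF\n");
            return 1;
        }

        for (int i = 0; i < eh.e_phnum; i++) {
            Elf64_Phdr ph;
            fseek(f, (long)(eh.e_phoff + (long)i * eh.e_phentsize), SEEK_SET);
            if (fread(&ph, sizeof ph, 1, f) != 1) { perror("fread"); return 1; }
            if (ph.p_type == PT_LOAD)
                printf("LOAD segment %d: align 0x%llx\n",
                       i, (unsigned long long)ph.p_align);
        }
        fclose(f);
        return 0;
    }

If the alignment comes out as 0x1000, the usual fix is to relink with the linker's max-page-size option (e.g. -Wl,-z,max-page-size=16384, which, as I understand it, newer Android NDKs set by default), so the same binary stays loadable on both 4 KB and 16 KB kernels.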
As a user not involved in android or linux development: I don't care. Fix it. You just don't break the entire ecosystem of unmaintained apps for a 3% performance improvement.
We maintained win32-x86 executable compatibility for decades. Keeping things working might require some sort of emulation layer, and it might impact performance substantially, and that's fine. I can accept that.
"Everything just stops working" is not an option for a real operating system. I don't expect to put my workshop tools away and wake up in the morning to find the toolchest manufacturer sent them to the landfill because they didn't efficiently fit their new drawers.
One of the areas where Android is common but where I couldn't possibly recommend it is home automation. Your light switches are 50-year purchases. Odds that app-based light switches are still working in five years are 50/50... compound odds over anything longer are minuscule.
I think most apps are written in Java, and according to the blog post will not be affected.
It's only the apps written in C++ that need to be recompiled, and those are probably large games and heavily performance-critical apps.
Is anything industrial going to be built on Android? There are no ATMs, no manufacturing CNC machines, etc. One might say everything that runs on Android is throwaway. It is only recently that Samsung and Google started to aim at 7-year life spans. At 7 years for an industrial piece of equipment, I may not have even paid it off yet; then again, is the software on these things even updated?
Point of sale systems, like Toast. Media systems on airplanes. Infotainment systems in cars.
In fact, NCR does sell an Android-based ATM solution. [1]
Android is actually used somewhat widely in embedded systems that need to provide a nice GUI to the user.
[1]: https://www.zdnet.com/article/ncr-launches-kalpana-an-androi...
Fortunately, like a truly embedded system, those are usually completely independent of Google or its app store.
Page size impacts page permissions; it's not a matter of wasting 12k, it's that with 4kb pages you're allowed to have a consecutive 8kb region with different permissions. 16kb pages can't do that without segfaulting every time memory is used "wrong", and trying to fix that up transparently would be a nightmare.
That's a valid point, but isn't memory protection the only common user-visible effect of changed page sizes? It would seem most apps which do not use write-protected memory would be unaffected.
I think the most immediate problem would be ELF segments that aren't 16kb aligned. Code will abut data, you can't add a gap without breaking offsets inside the ELF, and you'll induce a segfault on every write to a global at the start of the writable segment, or when executing code at the end of the code segment.
A less safe option would be for permissions to be a union in that region, as code rarely depends on a permission being absent. That would be quite the security hole though.
> 16kb pages can't do that without segfaulting every time memory is used "wrong", and trying to fix that up transparently would be a nightmare.
I would naively imagine the kernel could trap that and remap on the fly, at the tiny cost of murdering performance. Is that untrue, or is the perf so bad that it's not worth it?
Anything which involves the kernel tracking permissions at 4k granularity despite using larger pages is just going to be worse in every way than using 4k pages.
It depends on how much of a program actually triggers the failure case, so you can't answer in the abstract.
In the worst case, ~every memory access causes the kernel to need to fix it, causing every memory access to be several orders of magnitude worse (e.g. a hot cache hit vs trapping into kernel, wiping caches, at the very least hundreds more accesses).
EDIT: I see you suggested remapping the page permissions. Maybe that helps! But maybe it adds the cost of the remapping onto the worst case, e.g. the first 4kb are instructions that write into the second 4kb.
> I can understand the desire for google to want devs to recompile their apps, but I don't see the need to dump old apps from the app store
It could be that Google explicitly wants to dump un(der) maintained apps. Sure some might be clean and basic utilities that will work till the end of time, but many are probably abandonware or crummy demos and hello-world apps that look old and dated. They went on a whole purge recently already.
The App Store market is changing as Google/Apple grapple with efforts to end their monopolies. Maybe they're seeing that change and trying to use their dying advantage to frame themselves as the curated and reputable stores with high-quality, maintained, up-to-date apps. When distribution is available from other places, they can choose their customers.
They already force apps to update to new APIs every couple of years; it was the only way to stop developers from continuing to use deprecated stuff.
Apps that expect 4K pages will try to enforce memory protections at that granularity.
"but I don't see the need to dump old apps from the app store"
They literally removed almost 50% of all Android apps earlier this year; they're clearly devoted to quality and security.
What if you have a data structure that straddles the 4k boundary?
> This table describes who needs to transition and recompile their apps
information design is my passion
Table added as image, alt text vaguely explains table contents and has a spelling error. Great.
From my noobish standpoint, it feels like most code shouldn't care what the page size is? Why does it need to be recompiled?
What typically tends to break when changing it?
Performance, safety and IO critical code must care, because the page size affects TLB caching and is the finest granularity for security flags such as read-only, no execute, etc. which are critical for e.g. guard pages.
If your code created two guard pages sandwiching a security-critical page to make sure that under/overruns caused a page fault and crashed, and that code assumed the boundary was at 4KiB when it is really now at 16KiB, buffer overruns will no longer get caught.
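A minimal sketch of that guard-page pattern (hypothetical, assuming POSIX mmap/mprotect), written the page-size-aware way; a version that hard-codes 4096 on a 16 KiB kernel would either get EINVAL from mprotect() for the unaligned address or leave the "guard" inside the same page as the data it is meant to protect:

    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        /* Use the real page size; hard-coding 4096 here is exactly
           the assumption that breaks on a 16 KB kernel. */
        size_t pg = (size_t)sysconf(_SC_PAGESIZE);

        /* layout: guard | data | guard */
        char *region = mmap(NULL, 3 * pg, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (region == MAP_FAILED) { perror("mmap"); return 1; }

        /* Revoke access to the first and last page so any under/overrun
           out of the middle page faults immediately. */
        if (mprotect(region, pg, PROT_NONE) != 0 ||
            mprotect(region + 2 * pg, pg, PROT_NONE) != 0) {
            perror("mprotect");
            return 1;
        }

        char *data = region + pg;        /* the usable, guarded page */
        data[0] = 42;                    /* fine */
        /* data[pg] = 42;  <- would hit the upper guard page and crash */
        printf("guarded page at %p (page size %zu)\n", (void *)data, pg);
        return 0;
    }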
Further, code that assumed an address was on a page boundary, e.g. for performance reasons, will now have only a 25% chance of that actually being true.
It also means that MMIO physical pages that were expected to be contained within a 4KiB page such that when mapped into a sensitive user space driver context, neighboring MMIO control blocks wouldn't be touched, might be affected too since you'll get up to 3 neighboring blocks in either direction. This probably doesn't happen so often, I don't know Android internals much, but still something to consider.
This is in large part because PAGE_SIZE in a lot of C code is a macro or constant, rather than something populated at runtime depending on the system the code is running on - something I've always felt is a bit problematic.
That being said, code that hard-codes PAGE_SIZE often won't run anyway if it uses e.g. mmap(), because the kernel validates alignment against the real page size and will error on a mismatch.
This is going to wreak general havoc for a while no matter how you spin it.
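A tiny hedged sketch of that compile-time vs. runtime split (hypothetical program; the PAGE_SIZE macro here stands in for whatever header baked the constant into the build):

    #include <stdio.h>
    #include <unistd.h>

    /* If a header somewhere does this, the value is frozen at build time: */
    #ifndef PAGE_SIZE
    #define PAGE_SIZE 4096   /* assumption baked into the binary */
    #endif

    int main(void) {
        long runtime = sysconf(_SC_PAGESIZE);   /* what the kernel really uses */
        printf("compile-time PAGE_SIZE = %d, runtime page size = %ld\n",
               PAGE_SIZE, runtime);
        /* On a 16 KB kernel these disagree, and any alignment math
           done with the macro is silently wrong. */
        return 0;
    }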
Because the final ELF binary is linked to contain page-aligned segments. Segments define how the binary should be loaded into memory and what permissions they require.
If you have a 4KB segment that is marked Read-Write followed immediately by a Read-Execute one, naively loading it with a larger page size will open a can of security issues.
Moreover, many platform data structures, like the Global Offset Table (GOT) of a dynamic executable, use addresses. You cannot simply bump things around.
On top of that, libraries like the C++ standard library (or Abseil from Google) rely on the page size to optimize data structures like hash maps (e.g. unordered_map).
Typically low-level code and some manual fiddling with memory that assumes the page size.
Everything's OK until some obscure library suddenly segfaults without any useful error.
Mostly for I/O, e.g. mmap requires the file offset to be a multiple of the page size.
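A small sketch of that constraint (hypothetical; assumes a file named data.bin that is larger than the offset being mapped): the offset handed to mmap has to be rounded against the page size the kernel actually uses, not a hard-coded 4096:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("data.bin", O_RDONLY);      /* hypothetical input file */
        if (fd < 0) { perror("open"); return 1; }

        long pg = sysconf(_SC_PAGESIZE);
        off_t wanted = 20000;                     /* byte we care about */
        off_t aligned = wanted - (wanted % pg);   /* round down to a page */

        /* Map one page starting at the aligned offset, then index in. */
        char *p = mmap(NULL, (size_t)pg, PROT_READ, MAP_PRIVATE, fd, aligned);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        printf("byte at offset %lld is %d\n",
               (long long)wanted, p[wanted - aligned]);

        munmap(p, (size_t)pg);
        close(fd);
        return 0;
    }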
Off the top of my head:
If you rely on being able to do things like mark a range of memory as read-only or executable, you now have to care about page sizes. If your code is still assuming 4KB pages you may try to change the protection of a subset of a page and it will either fail to do what you want or change way too much. In both cases weird failures will result.
It also can have performance consequences. For example, if before you were making a lot of 3.5KB allocations using mmap, the wastage involved in allocating a 4KB page for each one might not have been too bad. But now those 3.5KB allocations will eat a whole 16KB page, making your app waste a lot of memory. Ideally most applications aren't using mmap directly for this sort of thing though. I could imagine it making life harder for the authors of JIT compilers.
Some algorithms also take advantage of the page size to do addressing tricks. For example, if you know your page size is 4KB, the addresses '3' and '4091' both can be known to have the same protection flags applied (R/W/X) and be the same kind of memory (mmap'd file on disk, shared memory segment, mapped memory from a GPU, etc.) This would allow any tables tracking information like that to only have 4KB granularity and make the tables much smaller. So that sort of trick needs to know the page size too.
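The addressing trick in question, sketched (assuming only that the page size is a power of two, which it is on these platforms): masking off the low bits of an address yields its page base, so any two addresses in the same page collapse to a single table key:

    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        uintptr_t page_size = (uintptr_t)sysconf(_SC_PAGESIZE);
        uintptr_t page_mask = ~(page_size - 1);   /* valid for power-of-two sizes */

        uintptr_t a = 3, b = 4091;
        /* With 4 KB pages both land in page 0; a table keyed on the page
           base needs one entry per page instead of one per address. */
        printf("page base of %#lx is %#lx\n",
               (unsigned long)a, (unsigned long)(a & page_mask));
        printf("page base of %#lx is %#lx\n",
               (unsigned long)b, (unsigned long)(b & page_mask));
        printf("same page? %s\n",
               (a & page_mask) == (b & page_mask) ? "yes" : "no");
        return 0;
    }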
Most code shouldn't, but you don't know what the library you're using is doing behind the scenes. As for the few pieces of code that do care: if a lot of people use them as a dependency, that could get real messy real fast.
Weird. AFAIK 4K and 64K were the common ARM64 page sizes, and 16K was the odd "think different" one that Apple uses. No mention of 16K in the Linux kernel docs:
https://www.kernel.org/doc/html/next/arm64/memory.html
64K starts to be a little too wasteful. It is a small performance gain as you'd expect, but less granularity means significantly more wasted memory
On a phone with limited RAM, this starts to be a bad tradeoff quickly. 16K is a reasonable jump from the venerable 4K page size.
The 64-bit kernel shipped for the Raspberry Pi 5 uses 16KB pages.
AMD also switched to 16K (4 x 4K), down from 8, in Zen 1 for their PTE Coalescing system, which is effectively run-length-style compression of page table entries with sequential addresses into one TLB slot.
16K is the weird one in practice, but ARM says they are implementation-defined.
https://developer.arm.com/documentation/101811/0104/Translat...
If you're making the migration at all, you really ought to be going for fully variable page sizes, otherwise 5 years from now there'll be a 64K page size CPU and suddenly everyone has to recompile everything again and there is another compatibility wall...
Is there such a thing? Page size gets baked into things like executable layouts, plus any place that uses the PAGE_SIZE constant (instead of sysconf(_SC_PAGESIZE)).
Indeed it would take redesigning a bunch of things to make runtime variable page size an option.
4 KiB page sizes have been used since the 1960's. More memory doesn't necessarily mean that larger pages are beneficial. Maybe 16 KiB is better for Android? Maybe. There really is no clear consensus on what the optimal page size for modern architectures should be.
> Starting November 1st, 2025, all new apps and app updates that use native C/C++ code targeting Android 15+ devices submitted to Google Play must support 16 KB page sizes.
I realize that most apps wouldn’t need to make changes and that a recompilation would suffice, but is this time frame enough for the apps that do need code changes?
They've mentioned this requirement before; the last HN post I see is from early May.
They only added support in Android 15, in August 2024. https://android-developers.googleblog.com/2024/08/adding-16-...
I don't know what "targeting Android 15+" means specifically. Does that include anything with a lower API level?
> I don't know what "targeting Android 15+" means specifically. Does that include anything with a lower API level?
- On Android, apps are built with targetSdkVersion set to the API version your app is compiled for and tested against, but you can set a lower minSdkVersion to the lowest device API version your app will run on.
- On devices with an API level newer than targetSdkVersion, the OS looks at your app's targetSdkVersion and disables behaviours newer than what your app is targeting. So the app should run well on newer devices.
- On devices with API level older than targetSdkVersion, but newer than (or same as) minSdkVersion, your own app is responsible for detecting missing APIs before trying to use them, and adapting itself to the older environment.
- On devices with API level older than minSdkVersion, your app will not be run. This ensures the user gets a clear failure, rather than unpredictable crashes or unexpected behaviour due to missing APIs the app tries to call.
So, in principle, it's possible to build an app which targets the most recent Android 15, while being capable of running on all versions of Android back to version 1. Apps linked for 16 kiB page-alignment should run on older devices that use 4 kiB pages too.
The Google Play Store enforces that targetSdkVersion is fairly close to the latest Android version. But it doesn't place requirements on minSdkVersion.
Android apps have a flag in their manifest which tells the OS "this app was built with Android X (API level X) in mind".
This allows the OS to selectively enable backwards compatibility and change certain behaviors (e.g. selectively enforce new permissions so old apps aren't broken).
Play Store requires apps to target new OSes and port APIs within certain time of an OS launching (usually ~2 years).
This prevents apps from targeting older OSes to dodge new security and privacy enhancements (e.g. asking for permission to show notifications, asking for permission to access the microphone, being allowed to show a fullscreen popup ad, etc. Those restrictions were all gated behind the target check.)
If you yourself have native code you're trying to build, it only requires bumping the NDK (which is automatically bumped when you upgrade the Android Gradle plugin), so that's mostly an automatic step (provided you're not stuck on old AGP7 build scripts).
If you depend on a package that uses a native library, you wait for them to update. Or you fork, bump AGP and rebuild.
It's a very minor change, unless you depend on unmaintained code.
It's kind of too bad Linux doesn't just support multiple base page sizes.
"offering improved performance gains" is a pleonasm. "offering performance gains" works just fine.
From the experience implementing 64K page sizes on aarch64 in Fedora & RHEL, this is not going to be a simple transition. All sorts of things will break in subtle, strange and interesting ways. Good luck to the Android team :-)
I think you meant "Android developers" that are forced to switch their apps.
This is dumb. The abstraction is at the wrong level.
Applications should assume the page size is 1 byte. One should be able to map, protect, etc memory ranges down to byte granularity - which is the granularity of everything else in computers. One fewer thing for programmers to worry about. History has shown that performance hacks with ongoing complexity tend not to survive (eg. interlaced video).
At the hardware level, rather than picking a certain number of bits of the address as the page size, you have multiple page tables, and multiple TLB caches - eg. one for 1 megabyte pages, one for 4 kilobyte pages, and one for individual byte pages. The hardware will simultaneously check all the tables (parallelism is cheap in hardware!).
The benefit of this is that, assuming the vast majority of bytes in a process address space are made of large mappings, you can fit far more mappings in the (divided up) TLB - which results in better performance too, whilst still being able to do precise byte-level protections.
The OS is the only place where there is complexity - it has to find a way to fit the mappings the application wants into what the hardware can do (i.e. 123456 bytes might become 30 four-kilobyte pages plus 576 one-byte pages).
Your response to a change that's motivated by performance improvements is to suggest switching to a scheme that'll have catastrophically worse performance?
It would likely have better performance for similar power and silicon area, because a hierarchical TLB will have a higher hit rate for the same number of transistors.
If you're going to go that far, you might as well move malloc() into hardware and start using ARM-style secure tagged pointers. Then finally C users can be free of memory allocation bugs.
Transistors aren't free (as in power consumptions, thermal etc), and wasting them on implementing 1 byte granularity TLBs would probably be a hard sell, even if assuming everything can indeed be done in parallel.
Dozens of years of kernel building, dozens of OSes, dozens of physical architectures, all having settled on minimum 4KB pages being a right balance between performance and memory usage, wiped away by a single offhand comment with no knowledge about the situation. Now that's HN.
Just the sheer TLB memory usage and performance implication of doing single byte pages would send CPU performance back to the stone age.
Completely false. The 4 KiB page size came from a machine with a total of 512 KiB (1962 Atlas, 3072B pages, 96k 48b words). It hasn’t scaled at all for inertia reasons and it has real and measurable costs. 64 KiB would have been the better choice IMO, but 16 is better than 4.
Hence the "minimum" part. The thread is literally about Android being compiled for 16KB pages, CPU support for larger pages has grown, easily up to 4MB for most consumer CPUs.
Going down _lower_ than 4KB is purely a waste of memory and performance.
My proposed design has many page sizes - nothing stops a software developer making all mappings multiples of 4kb and not using the byte sized pages.
My example was 1mb, 4kb and 1 byte pages - but a real design would probably use every power of two, or every even power of two to get best use of the TLB space.
It hasn't been done before because of a chicken and egg problem. CPU designers don't build it because no OS has the ability to use it, and no OS uses it because no CPU supports it. It would be a substantial amount of work for both parties.
> One should be able to map, protect, etc memory ranges down to byte granularity - which is the granularity of everything else in computers.
But you can do this; you simply have to pay the cost of using PAGE_SIZE of memory per byte you want to protect?
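That trade-off, made concrete in a small hedged sketch (hypothetical, POSIX mmap/mprotect; the classic "electric fence" idea): to fault on the very first byte past an object, push it against the end of one page and make the next page inaccessible, paying at least a page, 16 KB on the new kernels, per protected boundary:

    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* "Electric fence" style allocation: one usable page followed by an
       inaccessible one, with the object pushed against the boundary. */
    static void *alloc_with_fence(size_t n) {
        size_t pg = (size_t)sysconf(_SC_PAGESIZE);
        if (n > pg) return NULL;                     /* keep the sketch simple */

        char *base = mmap(NULL, 2 * pg, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED) return NULL;

        mprotect(base + pg, pg, PROT_NONE);          /* the fence */
        return base + pg - n;                        /* object ends at the fence */
    }

    int main(void) {
        char *obj = alloc_with_fence(1);             /* "protect" a single byte */
        obj[0] = 'x';                                /* fine */
        /* obj[1] = 'y';  <- one byte past the object: faults immediately */
        printf("1 byte guarded at the cost of one usable page plus one fence page (%ld each)\n",
               sysconf(_SC_PAGESIZE));
        return 0;
    }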