DisARMing Code
System-level programming, debugging and reverse engineering on AArch64 platforms

An ARM assembly primer

Introduction to Assembly

Instruction Format

Operand types

Variants and aliases

Binary

Negative numbers
Integer Overflows

Hexadecimal representation

Byte Ordering

Registers

General Purpose Registers
Floating Point Registers
Special Registers
System Registers
The ARM Procedure Call Standard (PCS)

PSTATE (Process State)

Conditionals

A Working Vocabulary

Data Processing Instructions

PC-Relative Addressing
Immediate to Register Moves
Integer Arithmetic Instructions
Logical Operations
Bit-Level Operations
Extensions
Shifts
Rotations
Extractions
General Bitfield operations
Conditional Comparisons
Conditional Select

Load/Store Instructions

Variants
Load/Store Addressing Modes
LDRSW (register) and C-switch statements
Load/Store-Exclusive
Additional Load/Store Instructions

Branches, Exceptions & System Instructions

Branches
Exceptions
System Instructions
Barriers
Processor HINTs

Floating Point Operations

Floating Point Formats

SIMD instructions

SIMD use as a method for compiler optimizations

Scalable Vector Extensions (SVE) instructions

SVE Register sets
SVE Instructions

Scalable Matrix Extensions (SME/SME2) instructions

ARM chipset features

ARMv8.x features

ARMv8.1
ARMv8.2
ARMv8.3
ARMv8.4
ARMv8.5/9.0 (2018)
ARMv8.6/9.1 (2019)
ARMv8.7/9.2 (2020)
ARMv8.8/9.3 (2021)
ARMv8.9/9.4 (2022)

ARMv9.x

ARMv9.1 (2019)
ARMv9.2 (2020)
ARMv9.3 (2021)
ARMv9.4 (2022)
ARMv9.5 (2023)
ARMv9.6 (2024)

Determining features at runtime

Linux
Darwin

Review Questions

Compilation & Linking

Preprocessing

Macros

Constant definitions
Built-in constants
Interpolating variable names
Compile-time functions
Static Assertions

Conditional compilation

Compiler Builtins

Syntactic Sugar

Pragmas

Code Generation/Optimization

Optimization

Optimization Levels

Compiler attributes

Inline Assembly

GCC: Profiling (Linux)

Debugging Information

Linux: GNU gcc debug Levels
Darwin: clang and .dSYM
Debug Information Sections
DWARF
ULEB128

Linking

Static linking

Library Archives
TBD files (Darwin clang(1))

Dynamic Linking

The runtime linker
Dynamic Libraries
Load Time Linking
Run Time Linking
Imports & Stubs
Darwin: Mach-O Plugins
Darwin 23+: Mergeable Libraries

Relocations

Linker Environment Variables

Linux: LD_* environment variables
Darwin: DYLD_* Environment Variables

Linker Security

Linux/Android: AT_SECURE
Darwin: DYLD security

Review Questions

Binary Formats

Windows/EFI: Portable Executable (PE32+)

.reloc

Linux: Executable and Linkable Format (ELF)

The ELF header

The Program Headers

The Section Headers

Section Types

The Dynamic Section (.dynamic, PT_DYNAMIC, SHT_DYNAMIC)
String Tables (.dynstr/.shstrtab/.strtab, SHT_STRTAB)
Symbol Tables (.dynsym/.symtab, DT_SYMTAB, SHT_DYNSYM/SHT_SYMTAB)
Note sections (.note*, PT_NOTE, SHT_NOTE)
The Procedure Linkage Table (.plt)
The Global Offset Table (.got and .got.plt)

Relocations

ELF: SHT_REL[A]
Android: SHT_ANDROID_REL[A]
RELR

Darwin: Mach-O

Prelude: Fat Files

The fat header
Architecture selection

The Mach-O Header

Mach-O file types
Flags

Load Commands

Segments and Sections

Mach-O Segments
Mach-O Sections
Custom segments and sections

The standard segments

The __TEXT segment
The __DATA_CONST segment
The __DATA segment
The __LINKEDIT segment

Darwin 20: Chained Fixups

Pointer formats
Pointer Chaining
Handling imports

The DYLD Shared Cache

The Shared Cache Header
Split Caches (Darwin 20)
Dylibs in the cache
LocalSymbols
Atlas
Shared Cache Resliding (Darwin 20+)

DYLD Introspection APIs

Environment variables
DYLD Kdebug Codes
dyld_all_image_infos
Cache introspection APIs

Review Questions

The Process Lifecycle

Prerequisite: System Calls

System Call Internals (Part I)

Linux & Android: SVC #0
Darwin: SVC #128

Bypassing system call wrappers

Using the generic syscall interface
Using inline assembly

The UNIX standard system calls

Process Lifecycle stages

Birth: The fork(2)/execve(2) model

Birth (II): The posix_spawn(2) model

POSIX File Actions
POSIX spawn attributes

Lifetime: Process Identifiers

Darwin Extensions

Death

Un-Death: Zombies

Linux: The /proc/ filesystem

Resource utilization

POSIX resource management

POSIX resource limits
POSIX resource usage counters
Extended resource usage counters

Darwin Extensions

task_info(2!)
Ledgers
Darwin: Coalitions

Linux Extensions

Linux: Control Groups (cgroups)
Linux: Namespaces

Process-Level Security

Credentials

Darwin: Personae

Linux Security Features

Capabilities
SELinux
SECCOMP-BPF

Darwin Security Features

The Sandbox
Entitlements
Restricting Debugging

Review Questions

Memory - I - The System View

Paging

Pages

Page Lifecycle

Swapping

Page Size

Memory APIs

POSIX Memory Management APIs

mmap(2)/munmap(2)
madvise(2)
mprotect(2)
mincore(2)
m[un]lock[all](2)
msync(2)/fsync(2)/sync(2)

Darwin specific memory extensions

The mach_vm/vm_map MIG subsystems
VM Tags
Purgeable Memory

Memory Pressure

Linux: OOM
Android: Low Memory Killer Daemon (/system/bin/lmkd)
Darwin: MemoryStatus (macOS) and Jetsam (elsewhere)

System-level Memory Metrics

Linux: /proc/meminfo

High-Level memory statistics
Cached memory statistics
Memory Activity Statistics
Unevictable/Locked Memory Statistics
Swap/Compressed RAM Statistics
Kernel Memory Statistics
Kernel Same-page Merging (KSM)

Darwin-specific metrics

memory_pressure(1)
vm_stat(1)

Review Questions

Memory - II - The Process View

The Stack

The initial stack

Linux: The ELF Auxiliary Vector
Darwin: The apple[] argument

The Frame

Prolog and Epilog

Dynamic stack allocations with alloca(3)

Backtraces

UN*X: execinfo.h and backtrace_* APIs
Darwin: Stack APIs

The Heap

Linux: glibc ptmalloc2

Nomenclature
Chunks
Heaps
Arenas
Allocation bins and fastbins
The tcache
Putting it all together: the flow of malloc(3) and free(3)
Customizing ptmalloc
mallinfo(3)

Android: Scudo

Nomenclature
Regions
Chunks
Transfer Batches
Local Cache
Errors
Configurable Options

Darwin: libsystem_malloc.dylib

Feature Flags
Typed Memory Operations

Memory Bugs

Memory Leaks*

Stack Corruptions

Heap Corruptions

Heap Overflows
Double Free
Use After Free (UaF)

Exploit Mitigation Strategies

Stack Canaries

ARMv8.3 Pointer Authentication Codes (PAC)

Additional PAC features
Pointer Signing Instructions
Operating System Support

ARMv8.5 Memory Tagging Extensions (MTE)

Additional MTE features
MTE Tag Handling instructions
MTE Pointer Arithmetic
Operating System Support
MTE Implementations in memory allocators

ARMv9: Guarded Control Stack FEAT_GCS

Process-level Memory Analysis

Process Memory Layout

Linux
Android
Darwin

Heap Introspection

Linux
Android: libmemunreachable
Darwin
Cross Platform: memento(j)

Review Questions

Multithreading

Multithreading

Risks

The POSIX Thread APIs

Thread Lifecycle calls

Thread Creation
Thread Exit
Thread Cancelation

Thread Specific Data (TSD) and Thread Local Storage (TLS)

CPU-level support: TPIDR_ELx and TPIDRRO_EL0
Library support: Thread Local Storage
Language level support: The __thread keyword
Compiler support: TLS Models
Compiler Support: Emulated TLS
Binary Format Support

Thread Identifiers & Names

Linux thread identifiers
Darwin thread identifiers
Non-POSIX extension: Setting Thread Names

pthread_t implementations

Linux: glibc's pthread_t
Android: Bionic's pthread_internal_t
Darwin: The pthread_s object

Concurrency

Prerequisite: Atomic access

Memory Ordering

example: Bionic's pthread_key_delete

Synchronization Primitives

pthread primitives
Linux: futex(7)es (Fast User-space Mutexes)
Darwin: ulocks
Darwin: Mach Event Links

Transactional Memory Extensions (ARMv9 FEAT_TME)

Working with transactions

Thread Safety Analysis

Scheduling

Thread Priorities (nice(2) guys finish last)

Linux scheduling APIs

Core Affinity

Darwin: Core Affinity

Darwin: Process policy (#323) and I/O policies

Micro and Nano Delays

Data Independent Timing

Review Questions

I/O & IPC

File I/O

File I/O related system calls

Darwin: Guarded File Descriptors
Directory Entry Manipulation
Links

File metadata related system calls

File Permissions
File Attributes (Darwin)
File Extended Attributes
File Control

Filesystem related system calls

The security perspective

Symbolic links
File descriptor leaks

UN*X IPC

Shared Memory

Shared memory using open(2)/mmap(2)
POSIX Shared Memory (shm_open(2/3)/shm_unlink(2/3))
System V Shared Memory (shmat(2)/shmget(2)/shmdt(2))
Linux: memfd_create(2)

UN*X: Pipes

UN*X: System V IPC

Semaphores
Message Queues

UN*X: Sockets

Socket APIs
Named, Anonymous and (in Linux) Abstract Sockets
Connection Oriented vs. Connectionless
Darwin: System sockets (AF_SYSTEM)
Monitoring Socket Activity
Finding peers
Packet Capture

Socket-level security

Non-Blocking I/O

Multiplexed Synchronous I/O

The standard model: select(2)/poll(2)
Linux extensions: epoll(7)

Asynchronous I/O

Linux: Signal driven I/O (F_GETSIG/F_SETSIG)

Extensions

Higher Privilege Levels

Darwin IPC

Mach Messages

Mach Messaging APIs
Mach Message Format
MIG

XPC

XPC APIs
Hooking/Tracing XPC Messages

Darwin IPC Security

Android: Binder

Abstraction: The Android Interface Definition Language

Binder System Call Interface

Tracing Binder Activity

Binder IPC Security

Review Questions

System-Wide Tracing, Profiling & Auditing

Profiling in Linux

Linux Event Tracing

The task event tracer
The raw_syscalls event tracer
The ftrace facility
The trace_marker

File Activity Monitoring

Linux: fanotify
/proc/pid/fd and /proc/pid/fdinfo
tracefs event tracing and FS provides

perf

Profiling in Darwin

System/Process level: kdebug

Consumer API
kdebug clients
Producer API
kdebug codes

Process Level: proc_info

Process Level: microstackshot

Process Level: stack_snapshot_with_config (#491)

Darwin: sysdiagnose(1)

System-wide: BSM auditing

File Activity Monitoring

Darwin: FSEvents

Network profiling

ntstat

Auditing

Linux: auditd

Darwin: Endpoint Security Framework

Event Types

Extending the system

Darwin: DTrace

Linux/Android: eBPF

User Statically Defined Tracing

Review Questions

Hooking & Injecting

Load time hooking

Windows: Using the registry

AppInit_DLLs
Windows 10: AppCertDLLs

LD_PRELOAD (Linux/Android)/ DYLD_INSERT_LIBRARIES (Darwin)

Code in constructors

Overriding dependencies (hooking)

Darwin: Interposing Functions

Debugger Interfaces

Prerequisite: Attaching to a target

Linux
Darwin

Accessing Process Memory

Linux
Darwin

Controlling Foreign Threads

Linux: PTRACE_[GET/SET]REGS and PTRACE_[GET/SET]REGSET
Darwin: thread_get_state(…)/thread_set_state(…)

Exception Handling

Linux
Darwin

Debug proxying and relaying

*OS lldb

Runtime Hooking

System Call Tracing

Linux/Android
Darwin

Hooking external functions (via import tables)

Hooking both internal and external functions (via breakpoints)

Putting it all together: Code Injection

Linux/Android code injection

Darwin Code Injection

In memory linking

Darwin: NSCreateObjectFileImageFromMemory and friends

Review Questions

Notes

Runtimes & Higher-Level Languages

Darwin: Objective-C

The Mach-O perspective

__TEXT sections
__DATA_CONST sections
__DATA sections

Name Mangling

Calling Convention

Reversing Objective-C

The Class Menagerie
Reversing code
Targeted runtime object corruption
Swizzling
Tracing the runtime

Darwin: Swift

The Mach-O Perspective

__DATA_CONST.__objc_imageinfo
__TEXT.__swift5_typeref
__TEXT.__swift5_types
__TEXT.__swift5_assocty
__TEXT.__swift5_builtin
__TEXT.__swift5_protos
__TEXT.__swift5_proto
__TEXT.__swift5_fieldmd
__TEXT.__swift5_replace
__TEXT.__swift5_capture
__TEXT.__swift5_reflstr
__TEXT.__swift5_mpenum
__TEXT.__constg_swiftt and __TEXT.__constg_swiftm

Name Mangling

Calling convention

swift-inspect

Compiled Python (.pyc)

Android: ART

Prelude: Why Dalvik Matters

Locating ART (and framework) objects in memory

ART Object Structure
Enumerating class definitions
Reflecting objects

Review Questions

Exceptions, Crashes and other Fatalities

Exception Handling

UN*X Signal Handling

Mach Exception Types
Traditional signal handling methods
Signal-handling system calls
sigaction(2)
Linux: signalfd(2)

Long jumps

Stack Unwinding

ELF/Mach-O: .eh_frame
ELF: .eh_frame_hdr
Mach-O: __TEXT.__unwind_info

.gcc_except_table/__TEXT.__gcc_except_tab

Core Dumps

Prerequisite: The RLIMIT_CORE Resource Limit

Linux (ELF) cores

/proc/pid/coredump_filter
Core dump filename control
Android: Re-enabling cores

Darwin: Mach-O cores

Additional Restrictions
Core dump filename control

Darwin: Corpses

Corpse MIG calls
Corpse information elements

Live core dumps

Darwin/Linux: gcore(1)
procexp(j) and memento(j)

Crash Reporting

Crash Reporting in Darwin

os_fault_with_payload
Darwin: Spindump
Darwin's ReportCrash
Assisted Suicide
Crash report format
iOS: Moving crash reports
SubmitDiagInfo

Android: Tombstones

Review Questions

Beyond User Mode

Exception Levels

ELx

Determining the current EL

The Exception Vector

Exception Classes
SPSR_ELx
Taking and Servicing an Exception

Darwin: Guarded Mode

Kernel Mode

VBAR_EL1

System Call Internals (Part II)

Linux
Darwin
System calls from kernel mode

Virtual Memory Management

Linux: Slab, Slub and Slob allocators
XNU: The zone allocator

Physical Memory Management

Translation Lookaside Buffer
The AT command
Address Spaces
Page Table Control Registers
TCR_ELx
Operating System Support

Kernel Hardening measures

PAN
KASLR
Darwin: __ARM_KERNEL_PROTECT__
Darwin: KTRR
Darwin: SPTM

Panics

Linux: Panic data through pstorefs
Kernel core dumps
Darwin: Panics and .ips files

Hypervisors

Setting up a Hypervisor

Hypervisor Traps

HCR_EL2
Fine Grained Traps

Nested Virtualization (FEAT_NV and FEAT_NV2)

Paravirtualization

Virtualization Host Extensions (FEAT_VHE)

EL2 code for EL1 monitoring and attestation

Secure Monitors

The Secure World

EL3 registers

Monitoring and Attestation

ARMv9.2 Realms (FEAT_RME)

Implementation

Review Questions

Reverse Engineering

Workflow

Identifying External Dependencies

Identifying Pointers in Data

Pointers to Text
Pointers to Data

Identifying functions and code blocks

References and Cross-References

Function Call Arguments

Following strings
Following non-string arguments
Variable Arguments Functions

Identifying Local Variables

Local (automatic) Structures

Identifying Globals

Symbolication

Obfuscation

Code Encryption
String Encryption
Code Obfuscation
Anti-Debugging techniques

Case Studies

Case Study: The Linux kernel on ARM

Precursor step: Extracting the Linux kernel from an Android boot image
Locating /proc/kallsyms
The sys_call_table

Case Study: The XNU Kernel

Precursor step: Extracting the kernel from the kernelcache
Locating sysent
Locating the mach_trap_table

Case Study: Darwin - AppleMobileFileIntegrity.kext

Case Study: Darwin - Sandbox.kext

Case Study: Darwin - MIG handlers

Blocks

SMC/HVC/SVC

Review Questions

disarm(j) - the missing manual page

Basic Information

Header/Format

File format agnostic information
File format specific information

Segmenting/Sectioning

File format agnostic segmentation
File format specific segmentation

Operating on regions, or other distinct portions of a file.

Dumping

Data Inspection

Classic (hexdump -C) Dumping

Smart Dumping

Disassembly

The core

Common workflows

Example: Cursory (no flow control) decompilation:
Example: ...

Program Analysis

Companion Files

Naming convention
File format

Matchers

Argument Matchers
Region Matchers
Opcode Matchers

Locating Strings

Definition context
Use context

References

Code (function) references
Data (global) references

Gadgets

jtrace(j) - the missing manual page

Basic Usage (strace(1) compatibility)

Enhancements (and jtrace(j)-specific options)

Color

Marks

Freeze/Thaw

Plugins