Abusing ELF Files' forgotten fields
Small journey to ELF header unused fields
Through my various experiments with ELF files, I noticed that the Linux loader ignores some fields in the ELF Header structure. This gave me the idea of exploiting that behavior by writing arbitrary bytes into those fields to confuse malware researchers and their tools without affecting the functionality of the original code. The result is a specialized utility and a library. They let you add a check to your program so that it refuses to launch if these insignificant bytes are altered. Analysts are thus left with two options: either the program can be started but cannot be analysed, or it can be analysed but cannot be started.
ELF File Loading Process
First, we need to understand exactly how Linux loads ELF files into memory and which kernel functions are used for this. For this, we’ll use the perf utility and collect a trace of all functions during the execution of a test file, including those that load the file.
Let’s create a simple file and start tracing:
To view the trace results, enter: perf script -i perf.data
We see a clear order of function calls for loading and starting the file:
- flush_signal_handlers - clearing signal handlers of the old process
- begin_new_exec - preparing a new execution context
- load_elf_binary - parsing ELF, loading segments, configuring memory
- bprm_execve - working with the linux_binprm structure (arguments, environment)
- do_execveat_common.isra.0 - general execution logic
- __x64_sys_execve - system call handler for x86_64
- x64_sys_call → do_syscall_64 → entry_SYSCALL_64 - system call dispatcher and hardware entry
Thus, our attention should focus on the load_elf_binary function.
ELF File Architecture
An ELF file is the basic executable file in the Linux OS. It contains all the information needed by the OS to load and run the file. This information is structured as follows:
- ELF Header
- Segments Headers or Program Headers
- Section Headers
The ELF Header is mostly useless (as we’ll see later) and contains basic file information. It and other headers can be viewed using the readelf utility:
~$ readelf -h sample
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x411710
Start of program headers: 64 (bytes into file)
Start of section headers: 498224 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 8
Size of section headers: 64 (bytes)
Number of section headers: 30
Section header string table index: 27
For simplicity, we’ll use the field names from the readelf output here and later, i.e., instead of e_ident we’ll say Magic.
Here’s a brief description of the purpose of the fields:
- Magic - ELF file identifier
- Class - program bitness
- Data - program byte order
- Version - ELF format version
- OS/ABI - target operating system ABI
- ABI Version - specific ABI version
- Padding bytes - 7 zero bytes
- Type - program type (e.g., executable or library)
- Machine - processor type for which the program is intended
- Version - ELF format version (second time, normally equal to the value above)
- Entry point address - program entry point
- Start of program headers - offset to Program Headers
- Start of section headers - offset to Section Headers
- Flags - flags allowing the program to specify additional information needed by the OS
- Size of this header - size of the ELF header
- Size of program headers - size of one Program Header entry
- Number of program headers - number of segments in Program Headers
- Size of section headers - size of one Section Header entry
- Number of section headers - number of sections in Section Headers
- Section header string table index - index of the section containing all section names as a single string
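To make the layout concrete, here is a minimal Python sketch that decodes these fields from the first 64 bytes of a file. It assumes a little-endian ELF64 header; the helper name `parse_elf64_header` and the dictionary keys are my own, and the sample bytes are rebuilt from the readelf output above rather than read from a real file.

```python
import struct

# ELF64 header layout (little-endian): 16-byte e_ident followed by
# e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
# e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")

FIELD_NAMES = (
    "e_ident", "e_type", "e_machine", "e_version", "e_entry", "e_phoff",
    "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
    "e_shentsize", "e_shnum", "e_shstrndx",
)


def parse_elf64_header(data: bytes) -> dict:
    """Decode the first 64 bytes of a little-endian ELF64 file (hypothetical helper)."""
    if len(data) < ELF64_HEADER.size:
        raise ValueError("need at least 64 bytes")
    fields = dict(zip(FIELD_NAMES, ELF64_HEADER.unpack_from(data)))
    ident = fields["e_ident"]
    # Split e_ident into the sub-fields that readelf prints separately.
    fields.update(
        magic=ident[:4],
        ei_class=ident[4],
        ei_data=ident[5],
        ei_version=ident[6],
        ei_osabi=ident[7],
        ei_abiversion=ident[8],
        ei_pad=ident[9:16],
    )
    return fields


# Header rebuilt by hand from the readelf output shown above.
sample_header = ELF64_HEADER.pack(
    bytes.fromhex("7f454c46020101000000000000000000"),
    2, 62, 1, 0x411710, 64, 498224, 0, 64, 56, 8, 64, 30, 27,
)
header = parse_elf64_header(sample_header)
print(hex(header["e_entry"]), header["e_shnum"], header["e_shstrndx"])
```

For a real file, the same function can be fed `open(path, "rb").read(64)`.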
Program Headers tell the loader how to efficiently transfer the ELF binary into memory. Section Headers provide a logical breakdown of the ELF file. These two headers are not of interest to us here - modifying them requires much more extensive file manipulation, and all their fields are actively used by the loader and interpreter.
ELF File Loader Operation
The loader source code is available on GitHub.
The file first undergoes checks in this order:
- Checking the Magic field: must always be 0x7fELF
- Checking the Type field: the file must be either executable (ET_EXEC) or a dynamically shared library (ET_DYN)
- Checking program architecture: generally, the Machine field must equal one of the values from this file. However, in some cases additional conditions apply:
  - For ARCOMPACT and ARCV2: the third byte in the Flags field must not equal 3 or 4
  - For PARISC and RISCV: the Class field must equal 1 (ELF32) or 2 (ELF64)
  - For ARM:
    - Both words of the Entry point address field are not even
    - The high byte of the Flags field is not 0 - an unknown ABI format is used:
      - If the third bit of the Flags field is set, the processor must support 26-bit mode
      - If the 10th or 9th bit of the Flags field is set, the processor must support VFP
  - For ARM and XTENSA, check that the OS/ABI field equals 65
- Checking the file driver: the file driver must use one implementation of the mmap API
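The header-related checks above can be sketched in Python for the x86-64 case only. This is a simplified model, not the kernel code: the function name `passes_basic_checks` is hypothetical, the per-architecture special cases are left out, and the mmap check is skipped because it concerns the file, not the header bytes.

```python
import struct

ET_EXEC, ET_DYN = 2, 3   # e_type values accepted by the loader
EM_X86_64 = 62           # e_machine value for x86-64


def passes_basic_checks(header: bytes) -> bool:
    """Model of the early header checks on an x86-64 kernel (sketch only)."""
    if len(header) < 64 or header[:4] != b"\x7fELF":
        return False
    # e_type and e_machine are two 16-bit fields right after e_ident.
    e_type, e_machine = struct.unpack_from("<HH", header, 16)
    if e_type not in (ET_EXEC, ET_DYN):
        return False
    return e_machine == EM_X86_64


# A minimal header that satisfies the checks.
good = bytearray(64)
good[:7] = b"\x7fELF\x02\x01\x01"
struct.pack_into("<HH", good, 16, ET_EXEC, EM_X86_64)

# Same header with garbage in e_ident[4..15] and e_version.
scrambled = bytearray(good)
scrambled[4:16] = b"\xaa" * 12
scrambled[20:24] = b"\xaa" * 4

print(passes_basic_checks(bytes(good)), passes_basic_checks(bytes(scrambled)))
```

Both calls return `True`: none of the checks modeled here looks at the scrambled bytes.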
Next follows a lengthy process of handling the rest of the ELF file content: parsing and loading sections and segments. Throughout this process, the Flags field is used multiple times, so we cannot modify it freely. Other fields of the ELF header are used by the program interpreter, which rules out modifying them freely as well.
Thus, almost no checks are performed on the ELF header content. ELF Header fields not checked (ignored) by the loader and interpreter:
- Data field
- Class field is ignored in most architectures
- Version field (both variants)
- Padding bytes field
- OS/ABI field is ignored in most architectures
- ABI version field is ignored in most architectures
- Size of this header field
- Size of section headers field
- Number of section headers field
- Section header string table index field
In the Linux headers, these fields are named, respectively:
- e_ident[EI_DATA]
- e_ident[EI_CLASS]
- e_ident[EI_VERSION] and e_version
- e_ident[EI_PAD]
- e_ident[EI_OSABI]
- e_ident[EI_ABIVERSION]
- e_ehsize
- e_shentsize
- e_shnum
- e_shstrndx
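As a quick sanity check, the sketch below tabulates where these fields sit in a 64-byte ELF64 header and how much room they offer. The offsets follow the standard ELF64 layout; the table name `IGNORED_FIELDS_ELF64` is my own.

```python
# (name, offset, size in bytes) of each ignored field in an ELF64 header.
IGNORED_FIELDS_ELF64 = [
    ("e_ident[EI_CLASS]", 4, 1),
    ("e_ident[EI_DATA]", 5, 1),
    ("e_ident[EI_VERSION]", 6, 1),
    ("e_ident[EI_OSABI]", 7, 1),
    ("e_ident[EI_ABIVERSION]", 8, 1),
    ("e_ident[EI_PAD]", 9, 7),
    ("e_version", 20, 4),
    ("e_ehsize", 52, 2),
    ("e_shentsize", 58, 2),
    ("e_shnum", 60, 2),
    ("e_shstrndx", 62, 2),
]

total = sum(size for _, _, size in IGNORED_FIELDS_ELF64)

# Merge adjacent fields into contiguous [offset, length] runs.
runs = []
for _, offset, size in IGNORED_FIELDS_ELF64:
    if runs and runs[-1][0] + runs[-1][1] == offset:
        runs[-1][1] += size
    else:
        runs.append([offset, size])

print(total)  # 24
print(runs)   # [[4, 12], [20, 4], [52, 2], [58, 6]]
```

So the free space adds up to 24 bytes, split into four separate runs inside the header.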
Exploitation Concept and Implementation
This feature of the ELF header can be easily exploited: if these fields are not checked, they can store ANY data. The disadvantages of the technique are that this data can occupy at most 24 bytes in total and that it is not contiguous in the file. These fields can store:
- Some key for decrypting file content, C2 address, or other similar artifact
- Any information whose integrity will be checked by the program
In this article, I would like to focus on the second idea. Big thx for it to @while_not_False!
By writing random data into these fields, we can confuse malware analysts and their analysis tools that are incorrectly configured for specific bytes in these fields. At the same time, the program will continue to execute successfully. If these bytes are modified by an analyst, the program will stop executing because it checks the integrity of its code. Thus, the analyst finds themselves in a difficult position: they must rely either only on static or only on dynamic code analysis.
Of course, one can always find and patch the integrity check function, but various obfuscation mechanisms should make that harder.
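A minimal Python sketch of the storage idea follows, assuming a 64-byte ELF64 header. The helpers `embed` and `extract`, the zero padding, and the order in which the runs are filled are my own assumptions, not necessarily what the elfields utility does.

```python
# Contiguous (offset, length) runs of ignored bytes in an ELF64 header.
IGNORED_RUNS_ELF64 = [(4, 12), (20, 4), (52, 2), (58, 6)]
CAPACITY = sum(length for _, length in IGNORED_RUNS_ELF64)  # 24 bytes


def embed(header: bytes, payload: bytes) -> bytes:
    """Scatter payload (zero-padded to 24 bytes) across the ignored runs."""
    if len(header) < 64:
        raise ValueError("need a full 64-byte ELF64 header")
    if len(payload) > CAPACITY:
        raise ValueError("payload does not fit into the ignored fields")
    padded = payload.ljust(CAPACITY, b"\x00")
    out = bytearray(header)
    pos = 0
    for offset, length in IGNORED_RUNS_ELF64:
        out[offset:offset + length] = padded[pos:pos + length]
        pos += length
    return bytes(out)


def extract(header: bytes) -> bytes:
    """Collect the 24 ignored bytes back into one contiguous buffer."""
    return b"".join(header[o:o + n] for o, n in IGNORED_RUNS_ELF64)


# In-memory demo on a dummy header; a real tool would patch the file itself.
header = bytes.fromhex("7f454c46020101000000000000000000") + bytes(48)
patched = embed(header, b"someMalwareKey")
print(extract(patched).rstrip(b"\x00"))  # b'someMalwareKey'
```

Fields the loader does use, such as the magic, Type, Machine and Entry point address, are left untouched by `embed`.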
Implementation
Viewer
First, a utility called elfields was written to modify and view the interesting fields in an ELF file.
Example output of these fields on a normal program:
The utility also allows calculating the hash of these fields, which will be useful to us later:
Now let’s change some fields of this file using the utility. In this case, I wrote the string “someMalwareKey” into the fields:
But the file still launches!
Let’s try modifying something more complex than a “Hello World!” program. I chose the ls utility. As we can see, it also launches without issues:
But the most interesting part is the reaction of various malware analysis tools.
No disassembler can parse our modified program (I used the dogbolt project):
Libraries for analyzing ELF files like pyelftools also crash:
Tools for viewing files like Detect It Easy cannot parse the file:
IDA 9.2 correctly parsed the file magic:
But it crashes during further processing:
GDB also cannot start debugging the file:
Checker
A header file elfields.h was implemented that handles comparing the current hash of unused ELF file fields with one pre-recorded by the developer. This functionality allows checking the file before running the main code - if the file was modified by a malware analyst and the field hash changed, then execution must be terminated. Moreover, implementing through a header file leaves the choice of method for storing the comparison hash, the location for calling the key comparison function, and the handling of the function execution result up to the developer. This makes the file’s usage options more flexible.
Below is an example of this header file in action:
- Compile the program with the -lssl -lcrypto options for hash function support
- Calculate the hash of unused fields - it does not change regardless of program content
- Call the hash check function inside the program, for example before starting the main program, and pass it the obtained hash as an argument
- Try changing some field of the program - imagine we’re an analyst trying to fix the program header
- Now the program won’t launch!
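The idea behind the check can be sketched in Python. This is not the elfields.h code: the hash function (SHA-256 here), the field order, and the names `fields_hash` and `verify` are assumptions made for illustration.

```python
import hashlib

# Contiguous (offset, length) runs of ignored bytes in an ELF64 header.
IGNORED_RUNS_ELF64 = [(4, 12), (20, 4), (52, 2), (58, 6)]


def fields_hash(header: bytes) -> str:
    """Hash only the 24 ignored header bytes (SHA-256 is an assumption)."""
    if len(header) < 64:
        raise ValueError("need a full 64-byte ELF64 header")
    digest = hashlib.sha256()
    for offset, length in IGNORED_RUNS_ELF64:
        digest.update(header[offset:offset + length])
    return digest.hexdigest()


def verify(header: bytes, expected_hash: str) -> bool:
    """Return True when the ignored fields still match the recorded hash."""
    return fields_hash(header) == expected_hash


# Dummy header carrying arbitrary bytes in e_ident[4..15].
header = bytearray(64)
header[:16] = b"\x7fELF" + b"someMalwareK"
expected = fields_hash(bytes(header))

# Changing a field outside the hashed runs (e_entry) keeps the hash stable.
moved = bytearray(header)
moved[24:32] = (0x411710).to_bytes(8, "little")

# An analyst "repairing" EI_CLASS changes the hash.
repaired = bytearray(header)
repaired[4] = 2

print(verify(bytes(moved), expected), verify(bytes(repaired), expected))  # True False
```

A real program would read its own header at startup (for example from /proc/self/exe) and exit when `verify` fails.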
The complete implementation, including the elfields utility and elfields.h library, is available on GitHub for further research and development.
Detection
From the Linux architecture we know all valid values of these unused fields. For example, the EI_CLASS field can be 0x00, 0x01 or 0x02. Based on that, we can easily write a simple YARA rule to detect “strange” ELF files:
rule Invalid_ELF_Header_Fields
{
    meta:
        description = "Detects ELF files with invalid header field values"
        author = "kyrr1s"
    strings:
        $elf_magic = { 7F 45 4C 46 }
        $valid_class_00 = { 00 }
        $valid_class_01 = { 01 }
        $valid_class_02 = { 02 }
        $valid_data_00 = { 00 }
        $valid_data_01 = { 01 }
        $valid_data_02 = { 02 }
        $valid_version_00 = { 00 }
        $valid_version_01 = { 01 }
        $zero_byte = { 00 }
        $valid_osabi_00 = { 00 } // SYSV
        $valid_osabi_01 = { 01 } // HPUX
        $valid_osabi_02 = { 02 } // NetBSD
        $valid_osabi_03 = { 03 } // Linux
        $valid_osabi_04 = { 04 } // GNUHurd
        $valid_osabi_06 = { 06 } // Solaris
        $valid_osabi_07 = { 07 } // AIX
        $valid_osabi_08 = { 08 } // IRIX
        $valid_osabi_09 = { 09 } // FreeBSD
        $valid_osabi_0a = { 0A } // Tru64
        $valid_osabi_0b = { 0B } // NovellModesto
        $valid_osabi_0c = { 0C } // OpenBSD
        $valid_osabi_0d = { 0D } // OpenVMS
        $valid_osabi_0e = { 0E } // NonStopKernel
        $valid_osabi_0f = { 0F } // AROS
        $valid_osabi_10 = { 10 } // FenixOS
        $valid_osabi_11 = { 11 } // CloudABI
        $valid_osabi_12 = { 12 } // OpenVOS
        $valid_osabi_40 = { 40 } // ARM_EABI
        $valid_osabi_ff = { FF } // STANDALONE
        $valid_shentsize_32 = { 28 00 }
        $valid_shentsize_64 = { 40 00 }
    condition:
        $elf_magic at 0
        and
        (
            not ($valid_class_00 at 4 or $valid_class_01 at 4 or $valid_class_02 at 4)
            or not ($valid_data_00 at 5 or $valid_data_01 at 5 or $valid_data_02 at 5)
            or not ($valid_version_00 at 6 or $valid_version_01 at 6)
            or not ($valid_osabi_00 at 7 or $valid_osabi_01 at 7 or $valid_osabi_02 at 7 or
                $valid_osabi_03 at 7 or $valid_osabi_04 at 7 or $valid_osabi_06 at 7 or
                $valid_osabi_07 at 7 or $valid_osabi_08 at 7 or $valid_osabi_09 at 7 or
                $valid_osabi_0a at 7 or $valid_osabi_0b at 7 or $valid_osabi_0c at 7 or
                $valid_osabi_0d at 7 or $valid_osabi_0e at 7 or $valid_osabi_0f at 7 or
                $valid_osabi_10 at 7 or $valid_osabi_11 at 7 or $valid_osabi_12 at 7 or
                $valid_osabi_40 at 7 or $valid_osabi_ff at 7)
            or not ($zero_byte at 8 and $zero_byte at 9 and $zero_byte at 10 and $zero_byte at 11
                and $zero_byte at 12 and $zero_byte at 13 and $zero_byte at 14 and $zero_byte at 15)
            // e_shentsize sits at offset 46 in an ELF32 header and at offset 58 in an ELF64 header
            or ($valid_class_01 at 4 and not $valid_shentsize_32 at 46)
            or ($valid_class_02 at 4 and not $valid_shentsize_64 at 58)
        )
}
Use this YARA rule only for hunting and not in production, because all of these checks are really heavy.
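For quick triage outside YARA, the same kind of checks can be written as direct byte comparisons at fixed offsets. The Python sketch below is my own approximation, not a translation produced by any tool: the function name `has_invalid_header_fields` is hypothetical, it assumes little-endian files, and it uses the standard e_shentsize offsets (46 for ELF32, 58 for ELF64).

```python
# OS/ABI values accepted by the rule: 0x00-0x04, 0x06-0x12, 0x40 and 0xFF.
VALID_OSABI = set(range(0x00, 0x05)) | set(range(0x06, 0x13)) | {0x40, 0xFF}


def has_invalid_header_fields(data: bytes) -> bool:
    """Return True for an ELF file whose ignored header fields look abnormal."""
    if len(data) < 64 or data[:4] != b"\x7fELF":
        return False                      # not an ELF file (or too short)
    ei_class = data[4]
    if ei_class not in (0, 1, 2):
        return True
    if data[5] not in (0, 1, 2):          # EI_DATA
        return True
    if data[6] not in (0, 1):             # EI_VERSION
        return True
    if data[7] not in VALID_OSABI:        # EI_OSABI
        return True
    if any(data[8:16]):                   # EI_ABIVERSION and padding must be zero
        return True
    # e_shentsize: 0x28 for ELF32 (offset 46), 0x40 for ELF64 (offset 58).
    if ei_class == 1 and data[46:48] != b"\x28\x00":
        return True
    if ei_class == 2 and data[58:60] != b"\x40\x00":
        return True
    return False


# Dummy ELF64 header with sane values, then the same header with text in e_ident.
clean = bytearray(64)
clean[:16] = bytes.fromhex("7f454c46020101000000000000000000")
clean[58:60] = b"\x40\x00"
scrambled = bytearray(clean)
scrambled[4:16] = b"someMalwareK"

print(has_invalid_header_fields(bytes(clean)), has_invalid_header_fields(bytes(scrambled)))  # False True
```

Because it only reads 64 bytes per file, this check is cheap enough to run over a whole directory tree before handing suspicious files to heavier tooling.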
Results:
Conclusion and Future Work
The research presented demonstrates a practical exploitation of a significant oversight in Linux ELF file loading mechanisms. By targeting the 24 bytes of ignored fields in the ELF header, we’ve created a technique that:
- Effectively disrupts static analysis tools without affecting program execution
- Forces analysts to choose between static and dynamic analysis approaches
- Provides built-in tamper detection through hash verification
- Maintains full program functionality while confusing analysis tools
The implications are significant for both offensive and defensive security applications. For malware developers, this provides an additional layer of evasion that’s difficult to detect and counter. For defenders and tool developers, it highlights a critical area where ELF parsers need improvement.
In the future I would like to do similar research on PE files.
(This is my first research so if there are some mistakes, dm me in tg)
















