Executable and Linkable Format


The journey from C source code to an executable file follows four steps:
1. Pre-processing
2. Compilation
3. Assembly
4. Linking

At the end of the journey you get an executable that, when run, produces output such as “Hello world”.

Pre-processing
This is the first stage that source code goes through. The following tasks are done in this stage:
1. Macro substitution
2. Comments are stripped off
3. Expansion of the included files
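As a minimal sketch of these three tasks, assume a hypothetical file area.c like the one below (the macro and function names are made up for illustration). Running `gcc -E area.c` prints the pre-processed text: the `#include` line is replaced by the contents of the header, the comment is gone, and `SQUARE(side)` has become `((side) * (side))`.

```c
#include <stddef.h>   /* expanded in place by the pre-processor */

/* Function-like macro: substituted textually before compilation. */
#define SQUARE(x) ((x) * (x))

/* This comment is stripped during pre-processing. */
size_t square_area(size_t side)
{
    return SQUARE(side);   /* becomes ((side) * (side)) */
}
```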

Compilation
This stage takes the pre-processed output of the previous stage as input and generates an output file (filename.s) containing assembly-level instructions.

Assembly
At this stage the “filename.s” file is taken as input and an intermediate file, “filename.o”, is produced. This file is also known as the object file.
It is produced by the assembler, which converts a ‘.s’ file of assembly instructions into a ‘.o’ object file containing machine-level instructions. Only the existing code is converted into machine language at this stage; function calls such as printf() are not yet resolved.
Executable and Linkable Format (ELF) is the format of the machine-level object files and executables that gcc produces on Linux. Before ELF, a format known as a.out was used. ELF is generally considered the more sophisticated of the two.

Linking
This is the final stage, in which function calls are linked with their definitions. As discussed earlier, until this stage gcc does not know where functions like printf() are defined, so each such call is represented by a placeholder. At link time the definition of printf() is resolved and the actual address of the function is plugged in.
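The placeholder can be observed with a small sketch. Assume a hypothetical file greet.c containing the code below (the function name is illustrative). After `gcc -c greet.c`, running `nm greet.o` would typically list `greet` as a defined symbol (marked `T`) and `printf` as undefined (marked `U`), because the object file only records that a definition is still needed.

```c
#include <stdio.h>

/* Calls printf(), whose definition lives in the C library.
   In the object file the call is only a reference to be resolved later. */
int greet(const char *name)
{
    /* Returns the number of characters printed, as reported by printf(). */
    return printf("Hello, %s\n", name);
}
```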
Now we know the stages that source code goes through before it becomes an executable, and where ELF comes into the picture.



Linux ELF Object File Format
ELF is the standard file format for object files on Linux. First published in the System V Application Binary Interface specification, and later in the Tool Interface Standard, it was quickly accepted among different vendors of Unix systems.

ELF supports:
  • Different processors
  • Different data encodings
  • Different classes of machines
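These three properties are recorded near the start of every ELF file. The sketch below assumes the Linux `<elf.h>` header and uses illustrative function names; it maps the class and data-encoding bytes of the `e_ident` array, and a few common `e_machine` values, to readable descriptions.

```c
#include <elf.h>      /* EI_CLASS, EI_DATA, ELFCLASS*, ELFDATA2*, EM_* */
#include <stdint.h>

/* Machine class: word size stored in e_ident[EI_CLASS]; 0 if unrecognized. */
int elf_word_bits(const unsigned char *e_ident)
{
    switch (e_ident[EI_CLASS]) {
    case ELFCLASS32: return 32;
    case ELFCLASS64: return 64;
    default:         return 0;
    }
}

/* Data encoding: byte order stored in e_ident[EI_DATA]. */
const char *elf_byte_order(const unsigned char *e_ident)
{
    switch (e_ident[EI_DATA]) {
    case ELFDATA2LSB: return "little-endian";
    case ELFDATA2MSB: return "big-endian";
    default:          return "unknown";
    }
}

/* Processor: a small, deliberately incomplete selection of e_machine values. */
const char *elf_machine_name(uint16_t e_machine)
{
    switch (e_machine) {
    case EM_386:     return "x86";
    case EM_X86_64:  return "x86-64";
    case EM_ARM:     return "ARM";
    case EM_AARCH64: return "AArch64";
    default:         return "other";
    }
}
```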
An ELF file can be any of the following types:

Relocatable file
This type of object file contains data and code that can be linked together with other relocatable files to produce an executable binary or a shared object file.
Shared object file
The dynamic linker combines this type of object file with the executable and/or other shared object files to create a complete process image.
Executable file
This type of object file holds a program that is ready to be executed.
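The file type is stored in the `e_type` field of the ELF header. A minimal sketch, assuming the constants from the Linux `<elf.h>` header and an illustrative function name, maps that field to the three types described above:

```c
#include <elf.h>      /* ET_REL, ET_EXEC, ET_DYN */
#include <stdint.h>

/* Translate the e_type field of an ELF header into a readable name. */
const char *elf_type_name(uint16_t e_type)
{
    switch (e_type) {
    case ET_REL:  return "relocatable file";
    case ET_EXEC: return "executable file";
    case ET_DYN:  return "shared object file";
    default:      return "other";
    }
}
```

Note that position-independent executables also report `ET_DYN`, so this field alone does not always distinguish a shared library from a program.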
An ELF file begins with an ELF header that describes the overall organization of the file. Relocatable files are organized into sections, which the linker works with, whereas executable files are composed of segments, which the loader works with; shared object files are used by both and so carry both views. The header therefore describes the file according to the type of the object file.
In executable files the ELF header is followed by a program header table, which tells the system how to create the process image. The program header table is essential for executable files, but for relocatable files it is optional.
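As a rough sketch of how a tool might check these header properties, the helpers below work on a 64-bit ELF header already read into memory. They assume the Linux `<elf.h>` definitions; the function names are illustrative, not part of any library.

```c
#include <elf.h>      /* Elf64_Ehdr, ELFMAG, SELFMAG */
#include <string.h>

/* An ELF file starts with the four magic bytes 0x7f 'E' 'L' 'F'. */
int elf_has_magic(const unsigned char *e_ident)
{
    return memcmp(e_ident, ELFMAG, SELFMAG) == 0;
}

/* A program header table is present when the header records both
   its file offset (e_phoff) and a non-zero entry count (e_phnum). */
int elf64_has_program_headers(const Elf64_Ehdr *header)
{
    return header->e_phoff != 0 && header->e_phnum != 0;
}
```

For a relocatable file produced by `gcc -c`, the second check would normally be false, while for a linked executable it would be true.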

ELF object file format
  1. ELF header
    • Magic number, type (.o, exec, .so), machine, byte ordering, etc.
  2. Program header table
    • Page size, virtual addresses for memory segments (sections), segment sizes
  3. .text section
    • Code
  4. .data section
    • Initialized (static) data
  5. .bss section
    • Uninitialized (static) data
    • “Block Started by Symbol”
    • “Better Save Space”
    • Has a section header but occupies no space
  6. .symtab section
    • Symbol table
    • Procedure and static variable names
    • Section names and locations
  7. .rel.text section
    • Relocation info for the .text section
    • Addresses of instructions that will need to be modified in the executable
    • Instructions for modifying
  8. .rel.data section
    • Relocation info for the .data section
    • Addresses of pointer data that will need to be modified in the merged executable
  9. .debug section
    • Info for symbolic debugging (gcc -g)
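To relate the .text, .data and .bss entries in this list to C source, here is a small sketch with illustrative names. The placements in the comments are what gcc typically does on Linux and can vary with compiler options; running the binutils `size` tool on the resulting object file shows the size of each of the three areas.

```c
/* Initialized static-storage data: typically placed in .data,
   because the value 42 must be stored in the file. */
static int start_value = 42;

/* Uninitialized static-storage data: typically placed in .bss,
   which takes no space in the file and starts out as zero. */
static int call_count;

/* The machine code for this function goes into .text. */
int next_value(void)
{
    return start_value + call_count++;
}
```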
Difference between a.out and ELF
  • The header of an a.out file (struct exec, defined in /usr/include/linux/a.out.h) contains limited information. It allows only a fixed set of sections to exist and does not directly support any additional sections.
  • It contains only the sizes of the various sections and does not directly specify the offsets within the file where each section starts. The linker and the kernel loader therefore rely on an unwritten understanding of where the various sections start within a file.
  • There is no built-in shared library support: a.out was developed before shared library technology existed, so implementations of shared libraries based on a.out must abuse and misuse some of the existing sections to accomplish the required tasks.
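The second point can be illustrated with a deliberately simplified sketch. Both structs below are hypothetical toy types, not the real `struct exec` or the real ELF section header, and the layout convention is an assumption made for the example. With sizes only, every tool has to agree on how offsets are derived; with explicit offsets (as ELF does through fields such as `sh_offset` and `sh_size`), the file describes itself.

```c
#include <stddef.h>

/* Hypothetical a.out-style header: records sizes only. */
struct toy_sizes_header {
    unsigned long text_size;
    unsigned long data_size;
};

/* Convention assumed for this sketch: text directly follows the
   header and data directly follows text. Nothing in the file says so. */
unsigned long toy_text_offset(const struct toy_sizes_header *h)
{
    (void)h;
    return sizeof *h;
}

unsigned long toy_data_offset(const struct toy_sizes_header *h)
{
    return toy_text_offset(h) + h->text_size;
}

/* Hypothetical ELF-style entry: the offset is stored explicitly,
   so sections can appear anywhere and new ones can be added. */
struct toy_section_entry {
    unsigned long offset;
    unsigned long size;
};

unsigned long toy_section_end(const struct toy_section_entry *s)
{
    return s->offset + s->size;
}
```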




Harsha Jayamanna
