
==============================
LLVM Language Reference Manual
==============================

.. contents::
   :local:
   :depth: 3

Abstract
========

This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.

Introduction
============

The LLVM code representation is designed to be used in three different
forms: as an in-memory compiler IR, as an on-disk bitcode representation
(suitable for fast loading by a Just-In-Time compiler), and as a human
readable assembly language representation. This allows LLVM to provide a
powerful intermediate representation for efficient compiler
transformations and analysis, while providing a natural means to debug
and visualize the transformations. The three different forms of LLVM are
all equivalent. This document describes the human readable
representation and notation.

The LLVM representation aims to be light-weight and low-level while
being expressive, typed, and extensible at the same time. It aims to be
a "universal IR" of sorts, by being at a low enough level that
high-level ideas may be cleanly mapped to it (similar to how
microprocessors are "universal IRs", allowing many source languages to
be mapped to them). By providing type information, LLVM can be used as
the target of optimizations: for example, through pointer analysis, it
can be proven that a C automatic variable is never accessed outside of
the current function, allowing it to be promoted to a simple SSA value
instead of a memory location.

.. _wellformed:

Well-Formedness
---------------

It is important to note that this document describes 'well formed' LLVM
assembly language. There is a difference between what the parser accepts
and what is considered 'well formed'. For example, the following
instruction is syntactically okay, but not well formed:

.. code-block:: llvm

    %x = add i32 1, %x

because the definition of ``%x`` does not dominate all of its uses. The
LLVM infrastructure provides a verification pass that may be used to
verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer
before it outputs bitcode. The violations pointed out by the verifier
pass indicate bugs in transformation passes or input to the parser.
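
As a contrast to the malformed instruction above, the following minimal
sketch shows a form that the verifier should accept, because every use
is dominated by its definition. The function name ``@increment`` and the
value names are illustrative only.

.. code-block:: llvm

    define i32 @increment(i32 %x) {
    entry:
      %y = add i32 1, %x        ; %x is an argument, so it dominates this use
      ret i32 %y
    }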

.. _identifiers:

Identifiers
===========

LLVM identifiers come in two basic types: global and local. Global
identifiers (functions, global variables) begin with the ``'@'``
character. Local identifiers (register names, types) begin with the
``'%'`` character. Additionally, there are three different formats for
identifiers, for different purposes:

#. Named values are represented as a string of characters with their
   prefix. For example, ``%foo``, ``@DivisionByZero``,
   ``%a.really.long.identifier``. The actual regular expression used is
   '``[%@][a-zA-Z$._][a-zA-Z$._0-9]*``'. Identifiers which require other
   characters in their names can be surrounded with quotes. Special
   characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
   code for the character in hexadecimal. In this way, any character can
   be used in a named value, even quotes themselves.
#. Unnamed values are represented as an unsigned numeric value with
   their prefix. For example, ``%12``, ``@2``, ``%44``.
#. Constants, which are described in the section Constants_ below.
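
As a small illustration of the formats listed above, the following
sketch declares globals with a plain name, a quoted name, a name
containing an escaped quote character, and no name at all. All of the
names are made up for the example.

.. code-block:: llvm

    @plain.name = global i32 0
    @"name with spaces" = global i32 1    ; quoted: uses characters outside the regex
    @"say \22hi\22" = global i32 2        ; \22 is the hexadecimal ASCII code for '"'
    @0 = global i32 3                     ; unnamed global; 'i32 3' is a constant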

LLVM requires that values start with a prefix for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.

Reserved words in LLVM are very similar to reserved words in other
languages. There are keywords for different opcodes ('``add``',
'``bitcast``', '``ret``', etc...), for primitive type names ('``void``',
'``i32``', etc...), and others. These reserved words cannot conflict
with variable names, because none of them start with a prefix character
(``'%'`` or ``'@'``).

Here is an example of LLVM code to multiply the integer variable
'``%X``' by 8:

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields {i32}:%0
    %1 = add i32 %0, %0           ; yields {i32}:%1
    %result = add i32 %1, %1

This last way of multiplying ``%X`` by 8 illustrates several important
lexical features of LLVM:

#. Comments are delimited with a '``;``' and go until the end of line.
#. Unnamed temporaries are created when the result of a computation is
   not assigned to a named value.
#. Unnamed temporaries are numbered sequentially.

It also shows a convention that we follow in this document. When
demonstrating instructions, we will follow an instruction with a comment
that defines the type and name of the value produced.

High Level Structure
====================

Module Structure
----------------

LLVM programs are composed of ``Module``\ s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:

.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(i8* nocapture) nounwind

    ; Definition of main function
    define i32 @main() {   ; i32()*
      ; Convert [13 x i8]* to i8*...
      %cast210 = getelementptr [13 x i8]* @.str, i64 0, i64 0

      ; Call puts function to write out the string to stdout.
      call i32 @puts(i8* %cast210)
      ret i32 0
    }

    ; Named metadata
    !1 = metadata !{i32 42}
    !foo = !{!1, null}

This example is made up of a :ref:`global variable <globalvars>` named
"``.str``", an external declaration of the "``puts``" function, a
:ref:`function definition <functionstructure>` for "``main``" and
:ref:`named metadata <namedmetadatastructure>` "``foo``".

In general, a module is made up of a list of global values (where both
functions and global variables are global values). Global values are
represented by a pointer to a memory location (in this case, a pointer
to an array of char, and a pointer to a function), and have one of the
following :ref:`linkage types <linkage>`.

.. _linkage:

Linkage Types
-------------

All Global Variables and Functions have one of the following types of
linkage:

``private``
    Global values with "``private``" linkage are only directly
    accessible by objects in the current module. In particular, linking
    code into a module with a private global value may cause the
    private to be renamed as necessary to avoid collisions. Because the
    symbol is private to the module, all references can be updated. A
    private symbol does not show up in any symbol table in the object
    file.
``linker_private``
    Similar to ``private``, but the symbol is passed through the
    assembler and evaluated by the linker. Unlike normal strong symbols,
    they are removed by the linker from the final linked image
    (executable or dynamic library).
``linker_private_weak``
    Similar to "``linker_private``", but the symbol is weak. Note that
    ``linker_private_weak`` symbols are subject to coalescing by the
    linker. The symbols are removed by the linker from the final linked
    image (executable or dynamic library).
``internal``
    Similar to private, but the value shows as a local symbol
    (``STB_LOCAL`` in the case of ELF) in the object file. This
    corresponds to the notion of the '``static``' keyword in C.
``available_externally``
    Globals with "``available_externally``" linkage are never emitted
    into the object file corresponding to the LLVM module. They exist to
    allow inlining and other optimizations to take place given knowledge
    of the definition of the global, which is known to be somewhere
    outside the module. Globals with ``available_externally`` linkage
    are allowed to be discarded at will, and are otherwise the same as
    ``linkonce_odr``. This linkage type is only allowed on definitions,
    not declarations.
``linkonce``
    Globals with "``linkonce``" linkage are merged with other globals of
    the same name when linkage occurs. This can be used to implement
    some forms of inline functions, templates, or other code which must
    be generated in each translation unit that uses it, but where the
    body may be overridden with a more definitive definition later.
    Unreferenced ``linkonce`` globals are allowed to be discarded. Note
    that ``linkonce`` linkage does not actually allow the optimizer to
    inline the body of this function into callers because it doesn't
    know if this definition of the function is the definitive definition
    within the program or whether it will be overridden by a stronger
    definition. To enable inlining and other optimizations, use
    "``linkonce_odr``" linkage.
``weak``
    "``weak``" linkage has the same merging semantics as ``linkonce``
    linkage, except that unreferenced globals with ``weak`` linkage may
    not be discarded. This is used for globals that are declared "weak"
    in C source code.
``common``
    "``common``" linkage is most similar to "``weak``" linkage, but they
    are used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak`` symbols, and they may not be deleted if
    unreferenced. ``common`` symbols may not have an explicit section,
    must have a zero initializer, and may not be marked
    ':ref:`constant <globalvars>`'. Functions and aliases may not have
    common linkage.

.. _linkage_appending:

``appending``
    "``appending``" linkage may only be applied to global variables of
    pointer to array type. When two global variables with appending
    linkage are linked together, the two global arrays are appended
    together. This is the LLVM, typesafe, equivalent of having the
    system linker append together "sections" with identical names when
    .o files are linked.
``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked; if not linked, the symbol becomes null
    instead of being an undefined reference.
``linkonce_odr``, ``weak_odr``
    Some languages allow differing globals to be merged, such as two
    functions with different semantics. Other languages, such as
    ``C++``, ensure that only equivalent globals are ever merged (the
    "one definition rule" --- "ODR").  Such languages can use the
    ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the
    global will only be merged with equivalent globals. These linkage
    types are otherwise the same as their non-``odr`` versions.
``linkonce_odr_auto_hide``
    Similar to "``linkonce_odr``", but nothing in the translation unit
    takes the address of this definition. For instance, functions that
    had an inline definition, but the compiler decided not to inline it.
    ``linkonce_odr_auto_hide`` may have only ``default`` visibility. The
    symbols are removed by the linker from the final linked image
    (executable or dynamic library).
``external``
    If none of the above identifiers are used, the global is externally
    visible, meaning that it participates in linkage and can be used to
    resolve external symbol references.

The next two types of linkage target the Microsoft Windows platform
only. They are designed to support importing (exporting) symbols from
(to) DLLs (Dynamic Link Libraries).

``dllimport``
    "``dllimport``" linkage causes the compiler to reference a function
    or variable via a global pointer to a pointer that is set up by the
    DLL exporting the symbol. On Microsoft Windows targets, the pointer
    name is formed by combining ``__imp_`` and the function or variable
    name.
``dllexport``
    "``dllexport``" linkage causes the compiler to provide a global
    pointer to a pointer in a DLL, so that it can be referenced with the
    ``dllimport`` attribute. On Microsoft Windows targets, the pointer
    name is formed by combining ``__imp_`` and the function or variable
    name.

For example, since the "``.str``" variable is defined to be private, if
another module defined a "``.str``" variable and was linked with this
one, one of the two would be renamed, preventing a collision. Since
"``main``" and "``puts``" are external (i.e., lacking any linkage
declarations), they are accessible outside of the current module.

It is illegal for a function *declaration* to have any linkage type
other than ``external``, ``dllimport`` or ``extern_weak``.

Aliases can have only ``external``, ``internal``, ``weak`` or
``weak_odr`` linkages.
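
The linkage keyword is written directly after ``define``, ``declare``
or the ``=`` of a global variable. The following sketch shows several
of the linkage types described above. All global and function names are
hypothetical and chosen only for illustration.

.. code-block:: llvm

    @.msg = private constant [3 x i8] c"ok\00"     ; never appears in a symbol table
    @counter = internal global i32 0               ; like a C 'static' variable
    @table = linkonce_odr constant [2 x i32] [i32 1, i32 2]
    @tentative = common global i32 0               ; C tentative definition "int X;"
    @optional = extern_weak global i32             ; becomes null if never linked

    define weak void @hook() {
      ret void
    }

    define available_externally i32 @one() {
      ret i32 1
    }

    declare extern_weak void @maybe_present()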

.. _callingconv:

Calling Conventions
-------------------

LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:

"``ccc``" - The C calling convention
    This calling convention (the default if no other calling convention
    is specified) matches the target C calling conventions. This calling
    convention supports varargs function calls and tolerates some
    mismatch in the declared prototype and implemented declaration of
    the function (as does normal C).
"``fastcc``" - The fast calling convention
    This calling convention attempts to make calls as fast as possible
    (e.g. by passing things in registers). This calling convention
    allows the target to use whatever tricks it wants to produce fast
    code for the target, without having to conform to an externally
    specified ABI (Application Binary Interface). `Tail calls can only
    be optimized when this, the GHC or the HiPE convention is
    used. <CodeGenerator.html#id80>`_ This calling convention does not
    support varargs and requires the prototype of all callees to exactly
    match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition.
"``cc 10``" - GHC convention
    This calling convention has been implemented specifically for use by
    the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
    It passes everything in registers, going to extremes to achieve this
    by disabling callee save registers. This calling convention should
    not be used lightly but only for specific situations such as an
    alternative to the *register pinning* performance technique often
    used when implementing functional programming languages. At the
    moment only X86 supports this convention and it has the following
    limitations:

    -  On *X86-32* only supports up to 4 bit type parameters. No
       floating point types are supported.
    -  On *X86-64* only supports up to 10 bit type parameters and 6
       floating point parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and callee use it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_. It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
"``cc <n>``" - Numbered convention
    Any calling convention may be specified by number, allowing
    target-specific calling conventions to be used. Target specific
    calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.
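
As a minimal sketch of how a calling convention is written, the example
below marks both a function definition and its call site with
``fastcc``, so that caller and callee agree. The function names are
hypothetical.

.. code-block:: llvm

    define fastcc i32 @add_one(i32 %x) {
      %r = add i32 %x, 1
      ret i32 %r
    }

    define i32 @caller() {
      ; The call site must name the same convention as the callee.
      %v = call fastcc i32 @add_one(i32 41)
      ret i32 %v
    }

    ; Conventions may also be given on declarations, by keyword or by number.
    declare coldcc void @report_error()
    declare cc 10 void @ghc_style(i64)

Omitting the convention on either side selects ``ccc``, which would not
match a ``fastcc`` callee and would therefore make the call undefined.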

.. _visibility:

Visibility Styles
-----------------

All Global Variables and Functions have one of the following visibility
styles:

"``default``" - Default style
    On targets that use the ELF object file format, default visibility
    means that the declaration is visible to other modules and, in
    shared libraries, means that the declared entity may be overridden.
    On Darwin, default visibility means that the declaration is visible
    to other modules. Default visibility corresponds to "external
    linkage" in the language.
"``hidden``" - Hidden style
    Two declarations of an object with hidden visibility refer to the
    same object if they are in the same shared object. Usually, hidden
    visibility indicates that the symbol will not be placed into the
    dynamic symbol table, so no other module (executable or shared
    library) can reference it directly.
"``protected``" - Protected style
    On ELF, protected visibility indicates that the symbol will be
    placed in the dynamic symbol table, but that references within the
    defining module will bind to the local symbol. That is, the symbol
    cannot be overridden by another module.
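
The visibility keyword follows the optional linkage type. The sketch
below uses hypothetical names to show each style on a global variable,
a definition and a declaration.

.. code-block:: llvm

    @state = hidden global i32 0          ; not placed in the dynamic symbol table

    define protected void @api() {        ; exported, but cannot be overridden
      ret void
    }

    define default void @plugin_entry() { ; same as writing no visibility at all
      ret void
    }

    declare hidden void @helper()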

Named Types
-----------

LLVM IR allows you to specify name aliases for certain types. This can
make it easier to read the IR and make the IR more condensed
(particularly when recursive types are involved). An example of a name
specification is:

.. code-block:: llvm

    %mytype = type { %mytype*, i32 }

You may give a name to any :ref:`type <typesystem>` except
":ref:`void <t_void>`". Type name aliases may be used anywhere a type is
expected with the syntax ``%mytype``.

Note that type names are aliases for the structural type that they
indicate, and that you can therefore specify multiple names for the same
type. This often leads to confusing behavior when dumping out a .ll
file. Since LLVM IR uses structural typing, the name is not part of the
type. When printing out LLVM IR, the printer will pick *one name* to
render all types of a particular shape. This means that if you have code
where two different source types end up having the same LLVM type, the
dumper will sometimes print the "wrong" or unexpected type. This is
an important design point and isn't going to change.
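
As a brief sketch, a named type can be used wherever a type is
expected, including in its own definition and in global initializers.
The names ``%list``, ``@head`` and ``@head.next`` are illustrative.

.. code-block:: llvm

    %list = type { %list*, i32 }                    ; recursive named type
    @head = global %list { %list* null, i32 7 }     ; a one-element list
    @head.next = global %list* @head                ; pointer to the named type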

.. _globalvars:

Global Variables
----------------

Global variables define regions of memory allocated at compilation time
instead of run-time. Global variables may optionally be initialized, may
have an explicit section to be placed in, and may have an optional
explicit alignment specified.

A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.

The models correspond to the ELF TLS models; see `ELF Handling For
Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
more information on the circumstances under which the different models
may be used. The target may choose a different TLS model if the
specified model is not supported, or if a better choice of model can be
made.

A variable may be defined as a global ``constant``, which indicates that
the contents of the variable will **never** be modified (enabling better
optimization, allowing the global data to be placed in the read-only
section of an executable, etc). Note that variables that need runtime
initialization cannot be marked ``constant`` as there is a store to the
variable.

LLVM explicitly allows *declarations* of global variables to be marked
constant, even if the final definition of the global is not. This
capability can be used to enable slightly better optimization of the
program, but requires the language definition to guarantee that
optimizations based on the 'constantness' are valid for the translation
units that do not include the definition.

As SSA values, global variables define pointer values that are in scope
in (i.e. they dominate) all basic blocks in the program. Global
variables always define a pointer to their "content" type because they
describe a region of memory, and all memory objects in LLVM are accessed
through pointers.

Global variables can be marked with ``unnamed_addr`` which indicates
that the address is not significant, only the content. Constants marked
like this can be merged with other constants if they have the same
initializer. Note that a constant with significant address *can* be
merged with an ``unnamed_addr`` constant, the result being a constant
whose address is significant.

A global variable may be declared to reside in a target-specific
numbered address space. For targets that support them, address spaces
may affect how optimizations are performed and/or what target
instructions are used to access the variable. The default address space
is zero. The address space qualifier must precede any other attributes.

LLVM allows an explicit section to be specified for globals. If the
target supports it, it will emit globals to the section specified.

By default, global initializers are optimized by assuming that global
variables defined within the module are not modified from their
initial values before the start of the global initializer.  This is
true even for variables potentially accessible from outside the
module, including those with external linkage or appearing in
``@llvm.used``. This assumption may be suppressed by marking the
variable with ``externally_initialized``.

An explicit alignment may be specified for a global, which must be a
power of 2. If not present, or if the alignment is set to zero, the
alignment of the global is set by the target to whatever it finds
convenient. If an explicit alignment is specified, the global is forced
to have exactly that alignment. Targets and optimizers are not allowed
to over-align the global if the global has an assigned section. In this
case, the extra alignment could be observable: for example, code could
assume that the globals are densely packed in their section and try to
iterate over them as an array; alignment padding would break this
iteration.

For example, the following defines a global in a numbered address space
with an initializer, section, and alignment:

.. code-block:: llvm

    @G = addrspace(5) constant float 1.0, section "foo", align 4

The following example defines a thread-local global with the
``initialexec`` TLS model:

.. code-block:: llvm

    @G = thread_local(initialexec) global i32 0, align 4
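
A few more sketches of the markers discussed in this section follow.
The global names are hypothetical, and each line is an independent
example.

.. code-block:: llvm

    ; Only the contents matter, so this may be merged with an identical constant.
    @banner = private unnamed_addr constant [3 x i8] c"hi\00"

    ; The optimizer must not assume this still holds 0 when initializers run.
    @patched = externally_initialized global i32 0

    ; A declaration may be marked constant even though it has no initializer here.
    @limit = external constant i32

    ; Thread-local variable without an explicit TLS model.
    @per_thread = thread_local global i32 0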

.. _functionstructure:

Functions
---------

LLVM function definitions consist of the "``define``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`visibility
style <visibility>`, an optional :ref:`calling convention <callingconv>`,
an optional ``unnamed_addr`` attribute, a return type, an optional
:ref:`parameter attribute <paramattrs>` for the return type, a function
name, a (possibly empty) argument list (each with optional :ref:`parameter
attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
an optional section, an optional alignment, an optional :ref:`garbage
collector name <gc>`, an opening curly brace, a list of basic blocks,
and a closing curly brace.

LLVM function declarations consist of the "``declare``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`visibility
style <visibility>`, an optional :ref:`calling convention <callingconv>`,
an optional ``unnamed_addr`` attribute, a return type, an optional
:ref:`parameter attribute <paramattrs>` for the return type, a function
name, a possibly empty list of arguments, an optional alignment, and an
optional :ref:`garbage collector name <gc>`.

A function definition contains a list of basic blocks, forming the CFG
(Control Flow Graph) for the function. Each basic block may optionally
start with a label (giving the basic block a symbol table entry),
contains a list of instructions, and ends with a
:ref:`terminator <terminators>` instruction (such as a branch or function
return).

The first basic block in a function is special in two ways: it is
immediately executed on entrance to the function, and it is not allowed
to have predecessor basic blocks (i.e. there cannot be any branches to
the entry block of a function). Because the block can have no
predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.
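
The following sketch illustrates these rules with a hypothetical
``@max`` function. Each block ends in a terminator, nothing branches to
``entry``, and the PHI node appears only in a block that has
predecessors.

.. code-block:: llvm

    define i32 @max(i32 %a, i32 %b) {
    entry:                                ; entry block: no predecessors, no PHI nodes
      %cmp = icmp sgt i32 %a, %b
      br i1 %cmp, label %take_a, label %done
    take_a:
      br label %done
    done:                                 ; reached from both %entry and %take_a
      %r = phi i32 [ %a, %take_a ], [ %b, %entry ]
      ret i32 %r
    }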

LLVM allows an explicit section to be specified for functions. If the
target supports it, it will emit functions to the section specified.

An explicit alignment may be specified for a function. If not present,
or if the alignment is set to zero, the alignment of the function is set
by the target to whatever it finds convenient. If an explicit alignment
is specified, the function is forced to have at least that much
alignment. All alignments must be a power of 2.

If the ``unnamed_addr`` attribute is given, the address is known not to
be significant and two identical functions can be merged.

Syntax::

    define [linkage] [visibility]
           [cconv] [ret attrs]
           <ResultType> @<FunctionName> ([argument list])
           [fn Attrs] [section "name"] [align N]
           [gc] { ... }
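
As a sketch of how the optional parts line up with the syntax above,
the definition below uses a linkage type, a calling convention, a
return attribute, function attributes, a section and an alignment. The
function name and the section name ``.text.hot`` are assumptions made
for the example.

.. code-block:: llvm

    define internal fastcc zeroext i8 @low_byte(i32 %x)
           nounwind readnone section ".text.hot" align 16 {
    entry:
      %b = trunc i32 %x to i8
      ret i8 %b
    }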

Aliases
-------

Aliases act as a "second name" for the aliasee value (which can be
either a function, a global variable, another alias or a bitcast of a
global value). Aliases may have an optional :ref:`linkage type <linkage>`,
and an optional :ref:`visibility style <visibility>`.

Syntax::

    @<Name> = alias [Linkage] [Visibility] <AliaseeTy> @<Aliasee>
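
A minimal sketch of the syntax above, using hypothetical names, gives
one function two additional names:

.. code-block:: llvm

    define void @impl() {
      ret void
    }

    ; The aliasee type is the pointer type of the aliased value.
    @public_name = alias void ()* @impl
    @weak_name = alias weak void ()* @impl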

.. _namedmetadatastructure:

Named Metadata
--------------

Named metadata is a collection of metadata. :ref:`Metadata
nodes <metadata>` (but not metadata strings) are the only valid
operands for a named metadata.

Syntax::

    ; Some unnamed metadata nodes, which are referenced by the named metadata.
    !0 = metadata !{metadata !"zero"}
    !1 = metadata !{metadata !"one"}
    !2 = metadata !{metadata !"two"}
    ; A named metadata.
    !name = !{!0, !1, !2}

.. _paramattrs:

Parameter Attributes
--------------------

The return type and each parameter of a function type may have a set of
*parameter attributes* associated with them. Parameter attributes are
used to communicate additional information about the result or
parameters of a function. Parameter attributes are considered to be part
of the function, not of the function type, so functions with different
parameter attributes can have the same function type.

Parameter attributes are simple keywords that follow the type specified.
If multiple parameter attributes are needed, they are space separated.
For example:

.. code-block:: llvm

    declare i32 @printf(i8* noalias nocapture, ...)
    declare i32 @atoi(i8 zeroext)
    declare signext i8 @returns_signed_char()

Note that any attributes for the function result (``nounwind``,
``readonly``) come immediately after the argument list.

Currently, only the following parameter attributes are defined:

``zeroext``
    This indicates to the code generator that the parameter or return
    value should be zero-extended to the extent required by the target's
    ABI (which is usually 32 bits, but is 8 bits for an i1 on x86-64) by
    the caller (for a parameter) or the callee (for a return value).
``signext``
    This indicates to the code generator that the parameter or return
    value should be sign-extended to the extent required by the target's
    ABI (which is usually 32 bits) by the caller (for a parameter) or
    the callee (for a return value).
``inreg``
    This indicates that this parameter or return value should be treated
    in a special target-dependent fashion while emitting code for
    a function call or return (usually, by putting it in a register as
    opposed to memory, though some targets use it to distinguish between
    two different kinds of registers). Use of this attribute is
    target-specific.
``byval``
    This indicates that the pointer parameter should really be passed by
    value to the function. The attribute implies that a hidden copy of
    the pointee is made between the caller and the callee, so the callee
    is unable to modify the value in the caller. This attribute is only
    valid on LLVM pointer arguments. It is generally used to pass
    structs and arrays by value, but is also valid on pointers to
    scalars. The copy is considered to belong to the caller, not the
    callee (for example, ``readonly`` functions should not write to
    ``byval`` parameters). This is not a valid attribute for return
    values.

    The byval attribute also supports specifying an alignment with the
    align attribute. It indicates the alignment of the stack slot to
    form and the known alignment of the pointer specified to the call
    site. If the alignment is not specified, then the code generator
    makes a target-specific assumption.

``sret``
    This indicates that the pointer parameter specifies the address of a
    structure that is the return value of the function in the source
    program. This pointer must be guaranteed by the caller to be valid:
    loads and stores to the structure may be assumed by the callee
    not to trap and to be properly aligned. This may only be applied to
    the first parameter. This is not a valid attribute for return
    values.
``noalias``
    This indicates that pointer values :ref:`based <pointeraliasing>` on
    the argument or return value do not alias pointer values which are
    not *based* on it, ignoring certain "irrelevant" dependencies. For a
    call to the parent function, dependencies between memory references
    from before or after the call and from those during the call are
    "irrelevant" to the ``noalias`` keyword for the arguments and return
    value used in that call. The caller shares the responsibility with
    the callee for ensuring that these requirements are met. For further
    details, please see the discussion of the NoAlias response in `alias
    analysis <AliasAnalysis.html#MustMayNo>`_.

    Note that this definition of ``noalias`` is intentionally similar
    to the definition of ``restrict`` in C99 for function arguments,
    though it is slightly weaker.

    For function return values, C99's ``restrict`` is not meaningful,
    while LLVM's ``noalias`` is.
``nocapture``
    This indicates that the callee does not make any copies of the
    pointer that outlive the callee itself. This is not a valid
    attribute for return values.

.. _nest:

``nest``
    This indicates that the pointer parameter can be excised using the
    :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
    attribute for return values.
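
As a hedged illustration of the parameter attributes above, the following
sketch declares a few hypothetical functions. The names ``@consume``,
``@make_point``, ``@copy``, ``@my_alloc`` and ``%struct.Point`` are made up
for this example and are not part of LLVM. It assumes the typed-pointer
syntax used elsewhere in this document:

.. code-block:: llvm

    %struct.Point = type { i32, i32 }

    ; The callee receives a hidden copy of the pointee, placed in a stack
    ; slot with 8-byte alignment.
    declare void @consume(%struct.Point* byval align 8)

    ; The first parameter points to storage for the source-level return value.
    declare void @make_point(%struct.Point* sret, i32, i32)

    ; The two pointer parameters are promised not to alias each other, and
    ; neither pointer is retained after the call returns.
    declare void @copy(i8* noalias nocapture, i8* noalias nocapture, i32)

    ; A noalias return value, as an allocator-like function might declare.
    declare noalias i8* @my_alloc(i32)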

.. _gc:

Garbage Collector Names
-----------------------

Each function may specify a garbage collector name, which is simply a
string:

.. code-block:: llvm

    define void @f() gc "name" { ... }

The compiler declares the supported values of *name*. Specifying a
collector will cause the compiler to alter its output in order to
support the named garbage collection algorithm.
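
For instance, a function compiled for a shadow-stack collector might be
written as sketched below. The name ``"shadow-stack"`` is assumed here to be
one of the collectors provided by the compiler in use, and ``@uses_gc`` is a
hypothetical function name:

.. code-block:: llvm

    ; Request code generation compatible with the named collector.
    define void @uses_gc() gc "shadow-stack" {
    entry:
      ret void
    }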

.. _fnattrs:

Function Attributes
-------------------

Function attributes are set to communicate additional information about
a function. Function attributes are considered to be part of the
function, not of the function type, so functions with different function
attributes can have the same function type.

Function attributes are simple keywords that follow the type specified.
If multiple attributes are needed, they are space separated. For
example:

.. code-block:: llvm

    define void @f() noinline { ... }
    define void @f() alwaysinline { ... }
    define void @f() alwaysinline optsize { ... }
    define void @f() optsize { ... }

``address_safety``
    This attribute indicates that the address safety analysis is enabled
    for this function.
``alignstack(<n>)``
    This attribute indicates that, when emitting the prologue and
    epilogue, the backend should forcibly align the stack pointer.
    Specify the desired alignment, which must be a power of two, in
    parentheses.
``alwaysinline``
    This attribute indicates that the inliner should attempt to inline
    this function into callers whenever possible, ignoring any active
    inlining size threshold for this caller.
``nonlazybind``
    This attribute suppresses lazy symbol binding for the function. This
    may make calls to the function faster, at the cost of extra program
    startup time if the function is not called during program startup.
``inlinehint``
    This attribute indicates that the source code contained a hint that
    inlining this function is desirable (such as the "inline" keyword in
    C/C++). It is just a hint; it imposes no requirements on the
    inliner.
``naked``
    This attribute disables prologue / epilogue emission for the
    function. This can have very system-specific consequences.
``noimplicitfloat``
    This attribute disables implicit floating point instructions.
``noinline``
    This attribute indicates that the inliner should never inline this
    function in any situation. This attribute may not be used together
    with the ``alwaysinline`` attribute.
``noredzone``
    This attribute indicates that the code generator should not use a
    red zone, even if the target-specific ABI normally permits it.
``noreturn``
    This function attribute indicates that the function never returns
    normally. This produces undefined behavior at runtime if the
    function ever does dynamically return.
``nounwind``
    This function attribute indicates that the function never returns
    with an unwind or exceptional control flow. If the function does
    unwind, its runtime behavior is undefined.
``optsize``
    This attribute suggests that optimization passes and code generator
    passes make choices that keep the code size of this function low,
    and otherwise do optimizations specifically to reduce code size.
``readnone``
    This attribute indicates that the function computes its result (or
    decides to unwind an exception) based strictly on its arguments,
    without dereferencing any pointer arguments or otherwise accessing
    any mutable state (e.g. memory, control registers, etc.) visible to
    caller functions. It does not write through any pointer arguments
    (including ``byval`` arguments) and never changes any state visible
    to callers. This means that it cannot unwind exceptions by calling
    the ``C++`` exception throwing methods.
``readonly``
    This attribute indicates that the function does not write through
    any pointer arguments (including ``byval`` arguments) or otherwise
    modify any state (e.g. memory, control registers, etc.) visible to
    caller functions. It may dereference pointer arguments and read
    state that may be set in the caller. A readonly function always
    returns the same value (or unwinds an exception identically) when
    called with the same set of arguments and global state. It cannot
    unwind an exception by calling the ``C++`` exception throwing
    methods.
``returns_twice``
    This attribute indicates that this function can return twice. The C
    ``setjmp`` is an example of such a function. The compiler disables
    some optimizations (like tail calls) in the caller of these
    functions.
``ssp``
    This attribute indicates that the function should emit a stack
    smashing protector. It is in the form of a "canary" --- a random value
    placed on the stack before the local variables that's checked upon
    return from the function to see if it has been overwritten. A
    heuristic is used to determine if a function needs stack protectors
    or not. The heuristic used will enable protectors for functions with:

    - Character arrays larger than ``ssp-buffer-size`` (default 8).
    - Aggregates containing character arrays larger than ``ssp-buffer-size``.
    - Calls to alloca() with variable sizes or constant sizes greater than
      ``ssp-buffer-size``.

    If a function that has an ``ssp`` attribute is inlined into a
    function that doesn't have an ``ssp`` attribute, then the resulting
    function will have an ``ssp`` attribute.
``sspreq``
    This attribute indicates that the function should *always* emit a
    stack smashing protector. This overrides the ``ssp`` function
    attribute.

    If a function that has an ``sspreq`` attribute is inlined into a
    function that doesn't have an ``sspreq`` attribute or which has an
    ``ssp`` or ``sspstrong`` attribute, then the resulting function will have
    an ``sspreq`` attribute.
``sspstrong``
    This attribute indicates that the function should emit a stack smashing
    protector. This attribute causes a strong heuristic to be used when
    determining if a function needs stack protectors.  The strong heuristic
    will enable protectors for functions with:

    - Arrays of any size and type.
    - Aggregates containing an array of any size and type.
    - Calls to alloca().
    - Local variables that have had their address taken.

    This overrides the ``ssp`` function attribute.

    If a function that has an ``sspstrong`` attribute is inlined into a
    function that doesn't have an ``sspstrong`` attribute, then the
    resulting function will have an ``sspstrong`` attribute.
``uwtable``
    This attribute indicates that the ABI being targeted requires that
    an unwind table entry be produced for this function even if we can
    show that no exceptions pass through it. This is normally the case for
    the ELF x86-64 ABI, but it can be disabled for some compilation
    units.
``noduplicate``
    This attribute indicates that calls to the function cannot be
    duplicated. A call to a ``noduplicate`` function may be moved
    within its parent function, but may not be duplicated within
    its parent function.

    A function containing a ``noduplicate`` call may still
    be an inlining candidate, provided that the call is not
    duplicated by inlining. That implies that the function has
    internal linkage and only has one call site, so the original
    call is dead after inlining.
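
The sketch below combines several of the attributes described above. All
function names are hypothetical and chosen only for illustration, and the
instruction syntax assumes the typed-pointer form used in this document:

.. code-block:: llvm

    ; Never returns normally and never unwinds.
    declare void @fatal_error() noreturn nounwind

    ; Depends only on its argument; touches no memory visible to callers.
    define i32 @square(i32 %x) nounwind readnone {
    entry:
      %r = mul i32 %x, %x
      ret i32 %r
    }

    ; Reads through its pointer argument but writes nothing visible to callers.
    define i32 @first(i32* nocapture %p) nounwind readonly {
    entry:
      %v = load i32* %p
      ret i32 %v
    }

    ; The 16-byte character array exceeds the default ssp-buffer-size of 8,
    ; so the ssp heuristic would be expected to add a stack protector here.
    define void @fill() ssp {
    entry:
      %buf = alloca [16 x i8]
      %elt = getelementptr [16 x i8]* %buf, i32 0, i32 0
      store i8 0, i8* %elt
      ret void
    }

Because ``noinline`` and ``alwaysinline`` may not be combined, a function
would carry at most one of the two.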

.. _moduleasm:

Module-Level Inline Assembly
----------------------------

Modules may contain "module-level inline asm" blocks, which correspond
to the GCC "file scope inline asm" blocks. These blocks are internally
concatenated by LLVM and treated as a single unit, but may be separated
in the ``.ll`` file if desired. The syntax is very simple:

.. code-block:: llvm

    module asm "inline asm code goes here"
    module asm "more can go here"

The strings can contain any character by escaping non-printable
characters. The escape sequence used is simply "\\xx" where "xx" is the
two-digit hex code for the character.

The inline asm code is simply printed to the machine code .s file when
assembly code is generated.
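
As a small, hedged example of the escaping rule, the block below embeds a
newline (``\0A``) and a tab (``\09``) in the assembly string. The label and
the ``nop`` mnemonic are placeholders and would need to be valid for the
actual target assembler:

.. code-block:: llvm

    ; Emits "my_label:", a newline, a tab, and then "nop".
    module asm "my_label:\0A\09nop"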

Data Layout
-----------

A module may specify a target-specific data layout string that specifies
how data is to be laid out in memory. The syntax for the data layout is
simply:

.. code-block:: llvm

    target datalayout = "layout specification"

The *layout specification* consists of a list of specifications
separated by the minus sign character ('-'). Each specification starts
with a letter and may include other information after the letter to
define some aspect of the data layout. The specifications accepted are
as follows:

``E``
    Specifies that the target lays out data in big-endian form. That is,
    the bits with the most significance have the lowest address
    location.
``e``
    Specifies that the target lays out data in little-endian form. That
    is, the bits with the least significance have the lowest address
    location.
``S<size>``
    Specifies the natural alignment of the stack in bits. Alignment
    promotion of stack variables is limited to the natural stack
    alignment to avoid dynamic stack realignment. The stack alignment
    must be a multiple of 8 bits. If omitted, the natural stack
    alignment defaults to "unspecified", which does not prevent any
    alignment promotions.
``p[n]:<size>:<abi>:<pref>``
    This specifies the *size* of a pointer and its ``<abi>`` and
    ``<pref>``\erred alignments for address space ``n``. All sizes are in
    bits. Specifying the ``<pref>`` alignment is optional. If omitted, the
    preceding ``:`` should be omitted too. The address space, ``n``, is
    optional, and if not specified, denotes the default address space 0.
    The value of ``n`` must be in the range [1,2^23).
``i<size>:<abi>:<pref>``
    This specifies the alignment for an integer type of a given bit
    ``<size>``. The value of ``<size>`` must be in the range [1,2^23).
``v<size>:<abi>:<pref>``
    This specifies the alignment for a vector type of a given bit
    ``<size>``.
``f<size>:<abi>:<pref>``
    This specifies the alignment for a floating point type of a given bit
    ``<size>``. Only values of ``<size>`` that are supported by the target
    will work. 32 (float) and 64 (double) are supported on all targets; 80
    or 128 (different flavors of long double) are also supported on some
    targets.
``a<size>:<abi>:<pref>``
    This specifies the alignment for an aggregate type of a given bit
    ``<size>``.
``s<size>:<abi>:<pref>``
    This specifies the alignment for a stack object of a given bit
    ``<size>``.
``n<size1>:<size2>:<size3>...``
    This specifies a set of native integer widths for the target CPU in
    bits. For example, it might contain ``n32`` for 32-bit PowerPC,
    ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
    this set are considered to support most general arithmetic operations
    efficiently.
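
As an illustration, a layout string of the following shape might be used for
a little-endian 64-bit target. The particular values are an assumption made
for this example, not a statement about the layout any specific target
requires:

.. code-block:: llvm

    ; little-endian; 64-bit pointers; i64 and double 64-bit aligned;
    ; native integer widths 8, 16, 32 and 64; 128-bit natural stack alignment
    target datalayout = "e-p:64:64:64-i64:64:64-f64:64:64-n8:16:32:64-S128"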

When constructing the data layout for a given target, LLVM starts with a
default set of specifications which are then (possibly) overridden by
the specifications in the ``datalayout`` keyword. The default
specifications are given in this list:

-  ``E`` - big endian
-  ``p:64:64:64`` - 64-bit pointers with 64-bit alignment
-  ``S0`` - natural stack alignment is unspecified
-  ``i1:8:8`` - i1 is 8-bit (byte) aligned
-  ``i8:8:8`` - i8 is 8-bit (byte) aligned
-  ``i16:16:16`` - i16 is 16-bit aligned
-  ``i32:32:32`` - i32 is 32-bit aligned
-  ``i64:32:64`` - i64 has ABI alignment of 32 bits but preferred
   alignment of 64 bits
-  ``f16:16:16`` - half is 16-bit aligned
-  ``f32:32:32`` - float is 32-bit aligned
-  ``f64:64:64`` - double is 64-bit aligned
-  ``f128:128:128`` - quad is 128-bit aligned
-  ``v64:64:64`` - 64-bit vector is 64-bit aligned
-  ``v128:128:128`` - 128-bit vector is 128-bit aligned
-  ``a0:0:64`` - aggregates are 64-bit aligned
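
For example, assuming a target that differs from the defaults only in
endianness and pointer size, a short string such as the following would be
enough. Every specification it does not mention keeps its default from the
list above:

.. code-block:: llvm

    ; Overrides E and p:64:64:64; i32, f64, v128, etc. keep their defaults.
    target datalayout = "e-p:32:32:32"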

When LLVM is determining the alignment for a given type, it uses the
following rules:

#. If the type sought is an exact match for one of the specifications,