
#. If no match is found, and the type sought is a vector type, then the
   largest vector type that is smaller than the sought vector type will
   be used as a fallback. This happens because ``<128 x double>`` can be
   implemented in terms of 64 ``<2 x double>``, for example.

The function of the data layout string may not be what you expect.
Notably, this is not a specification from the frontend of what alignment
the code generator should use.

Instead, if specified, the target data layout is required to match what
the ultimate *code generator* expects. This string is used by the
mid-level optimizers to improve code, and this only works if it matches
what the ultimate code generator uses. If you would like to generate IR
that does not embed this target-specific detail into the IR, then you
don't have to specify the string. This will disable some optimizations
that require precise layout information, but this also prevents those
optimizations from introducing target specificity into the IR.

.. _pointeraliasing:

Pointer Aliasing Rules
----------------------

Any memory access must be done through a pointer value associated with
an address range of the memory access; otherwise the behavior is
undefined. Pointer values are associated with address ranges according
to the following rules:

-  A pointer value is associated with the addresses associated with any
   value it is *based* on.
-  An address of a global variable is associated with the address range
   of the variable's storage.
-  The result value of an allocation instruction is associated with the
   address range of the allocated storage.
-  A null pointer in the default address-space is associated with no
   address.
-  An integer constant other than zero or a pointer value returned from
   a function not defined within LLVM may be associated with address
   ranges allocated through mechanisms other than those provided by
   LLVM. Such ranges shall not overlap with any ranges of addresses
   allocated by mechanisms provided by LLVM.

A pointer value is *based* on another pointer value according to the
following rules:

-  A pointer value formed from a ``getelementptr`` operation is *based*
   on the first operand of the ``getelementptr``.
-  The result value of a ``bitcast`` is *based* on the operand of the
   ``bitcast``.
-  A pointer value formed by an ``inttoptr`` is *based* on all pointer
   values that contribute (directly or indirectly) to the computation of
   the pointer's value.
-  The "*based* on" relationship is transitive.

Note that this definition of *"based"* is intentionally similar to the
definition of *"based"* in C99, though it is slightly weaker.
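
As a minimal sketch of the rules above, the hypothetical function below
(the names ``@table``, ``%p`` and ``%q`` are illustrative only) annotates
which pointer each value is *based* on. It uses the typed-pointer syntax
found elsewhere in this document.

.. code-block:: llvm

   @table = global [4 x i32] zeroinitializer

   define i32 @based_on_example(i32* %p) {
   entry:
     ; %elt is based on @table (getelementptr rule).
     %elt = getelementptr [4 x i32]* @table, i32 0, i32 1
     ; %raw is based on %elt (bitcast rule), and so on @table by transitivity.
     %raw = bitcast i32* %elt to i8*
     ; %q is based on %p, because %p contributes to the integer it is formed from.
     %int = ptrtoint i32* %p to i64
     %adj = add i64 %int, 4
     %q = inttoptr i64 %adj to i32*
     ; This access is only defined if %q is associated with the accessed range.
     %v = load i32* %q
     ret i32 %v
   }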

LLVM IR does not associate types with memory. The result type of a
``load`` merely indicates the size and alignment of the memory from
which to load, as well as the interpretation of the value. The first
operand type of a ``store`` similarly only indicates the size and
alignment of the store.

Consequently, type-based alias analysis, aka TBAA, aka
``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
:ref:`Metadata <metadata>` may be used to encode additional information
which specialized optimization passes may use to implement type-based
alias analysis.
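
Because memory carries no type, the same bytes may be written with one
type and read back with another. A small sketch, assuming a hypothetical
helper named ``@float_bits``:

.. code-block:: llvm

   define i32 @float_bits(float %f) {
     %slot = alloca float
     ; The store writes four bytes; the operand type only fixes size and alignment.
     store float %f, float* %slot
     ; Reading the same bytes as an i32 reinterprets them as an integer.
     %as_int = bitcast float* %slot to i32*
     %bits = load i32* %as_int
     ret i32 %bits
   }

Without additional metadata, an optimizer cannot conclude from the
differing types alone that the ``store`` and the ``load`` do not alias.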

.. _volatile:

Volatile Memory Accesses
------------------------

Certain memory accesses, such as :ref:`load <i_load>`'s,
:ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be
marked ``volatile``. The optimizers must not change the number of
volatile operations or change their order of execution relative to other
volatile operations. The optimizers *may* change the order of volatile
operations relative to non-volatile operations. This is not Java's
"volatile" and has no cross-thread synchronization behavior.

IR-level volatile loads and stores cannot safely be optimized into
``llvm.memcpy`` or ``llvm.memmove`` intrinsics even when those intrinsics are
flagged volatile. Likewise, the backend should never split or merge
target-legal volatile load/store instructions.

.. admonition:: Rationale

 Platforms may rely on volatile loads and stores of natively supported
 data width to be executed as a single instruction. For example, in C
 this holds for an l-value of volatile primitive type with native
 hardware support, but not necessarily for aggregate types. The
 frontend upholds these expectations, which are intentionally
 unspecified in the IR. The rules above ensure that IR transformations
 do not violate the frontend's contract with the language.
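
The following sketch illustrates the ordering rules, assuming a
hypothetical device register passed in as ``%reg`` and an ordinary
memory location ``%scratch``:

.. code-block:: llvm

   define i32 @read_status_twice(i32* %reg, i32* %scratch) {
     ; Both volatile loads must be kept, and they must stay in this order.
     %first = load volatile i32* %reg
     ; The volatile rules alone do not pin this non-volatile store in place
     ; relative to the volatile loads around it.
     store i32 0, i32* %scratch
     %second = load volatile i32* %reg
     %sum = add i32 %first, %second
     ret i32 %sum
   }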

.. _memmodel:

Memory Model for Concurrent Operations
--------------------------------------

The LLVM IR does not define any way to start parallel threads of
execution or to register signal handlers. Nonetheless, there are
platform-specific ways to create them, and we define LLVM IR's behavior
in their presence. This model is inspired by the C++0x memory model.

For a more informal introduction to this model, see the :doc:`Atomics`.

We define a *happens-before* partial order as the least partial order
that

-  Is a superset of single-thread program order, and
-  When ``a`` *synchronizes-with* ``b``, includes an edge from ``a`` to
   ``b``. *Synchronizes-with* pairs are introduced by platform-specific
   techniques, like pthread locks, thread creation, thread joining,
   etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
   Constraints <ordering>`).

Note that program order does not introduce *happens-before* edges
between a thread and signals executing inside that thread.

Every (defined) read operation (load instructions, memcpy, atomic
loads/read-modify-writes, etc.) R reads a series of bytes written by
(defined) write operations (store instructions, atomic
stores/read-modify-writes, memcpy, etc.). For the purposes of this
section, initialized globals are considered to have a write of the
initializer which is atomic and happens before any other read or write
of the memory in question. For each byte of a read R, R\ :sub:`byte`
may see any write to the same byte, except:

-  If write\ :sub:`1`  happens before write\ :sub:`2`, and
   write\ :sub:`2` happens before R\ :sub:`byte`, then
   R\ :sub:`byte` does not see write\ :sub:`1`.
-  If R\ :sub:`byte` happens before write\ :sub:`3`, then
   R\ :sub:`byte` does not see write\ :sub:`3`.

Given that definition, R\ :sub:`byte` is defined as follows:

-  If R is volatile, the result is target-dependent. (Volatile is
   supposed to give guarantees which can support ``sig_atomic_t`` in
   C/C++, and may be used for accesses to addresses which do not behave
   like normal memory. It does not generally provide cross-thread
   synchronization.)
-  Otherwise, if there is no write to the same byte that happens before
   R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
-  Otherwise, if R\ :sub:`byte` may see exactly one write,
   R\ :sub:`byte` returns the value written by that write.
-  Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
   see are atomic, it chooses one of the values written. See the :ref:`Atomic
   Memory Ordering Constraints <ordering>` section for additional
   constraints on how the choice is made.
-  Otherwise R\ :sub:`byte` returns ``undef``.

R returns the value composed of the series of bytes it read. This
implies that some bytes within the value may be ``undef`` **without**
the entire value being ``undef``. Note that this only defines the
semantics of the operation; it doesn't mean that targets will emit more
than one instruction to read the series of bytes.

Note that in cases where none of the atomic intrinsics are used, this
model places only one restriction on IR transformations on top of what
is required for single-threaded execution: introducing a store to a byte
which might not otherwise be stored is not allowed in general.
(Specifically, in the case where another thread might write to and read
from an address, introducing a store can change a load that may see
exactly one write into a load that may see multiple writes.)
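
To make the restriction concrete, here is a sketch with two hypothetical
functions. The first stores to ``%p`` only when ``%c`` is true. The
second is what a transformation that makes the store unconditional would
produce. It is equivalent in a single thread, but not allowed in general
under this model.

.. code-block:: llvm

   ; Original: the store happens only on the %then path.
   define void @maybe_store(i1 %c, i32* %p) {
   entry:
     br i1 %c, label %then, label %done
   then:
     store i32 1, i32* %p
     br label %done
   done:
     ret void
   }

   ; Not a valid replacement: when %c is false this introduces a store to
   ; bytes that were not otherwise stored, which another thread may observe.
   define void @maybe_store_speculated(i1 %c, i32* %p) {
   entry:
     %old = load i32* %p
     %new = select i1 %c, i32 1, i32 %old
     store i32 %new, i32* %p
     ret void
   }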

.. _ordering:

Atomic Memory Ordering Constraints
----------------------------------

Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
:ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
:ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
an ordering parameter that determines which other atomic instructions on
the same address they *synchronize with*. These semantics are borrowed
from Java and C++0x, but are somewhat more colloquial. If these
descriptions aren't precise enough, check those specs (see spec
references in the :doc:`atomics guide <Atomics>`).
:ref:`fence <i_fence>` instructions treat these orderings somewhat
differently since they don't take an address. See that instruction's
documentation for details.

For a simpler introduction to the ordering constraints, see the
:doc:`Atomics`.

``unordered``
    The set of values that can be read is governed by the happens-before
    partial order. A value cannot be read unless some operation wrote
    it. This is intended to provide a guarantee strong enough to model
    Java's non-volatile shared variables. This ordering cannot be
    specified for read-modify-write operations; it is not strong enough
    to make them atomic in any interesting way.
``monotonic``
    In addition to the guarantees of ``unordered``, there is a single
    total order for modifications by ``monotonic`` operations on each
    address. All modification orders must be compatible with the
    happens-before order. There is no guarantee that the modification
    orders can be combined to a global total order for the whole program
    (and this often will not be possible). The read in an atomic
    read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
    :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
    order immediately before the value it writes. If one atomic read
    happens before another atomic read of the same address, the later
    read must see the same value or a later value in the address's
    modification order. This disallows reordering of ``monotonic`` (or
    stronger) operations on the same address. If an address is written
    ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
    read that address repeatedly, the other threads must eventually see
    the write. This corresponds to the C++0x/C1x
    ``memory_order_relaxed``.
``acquire``
    In addition to the guarantees of ``monotonic``, a
    *synchronizes-with* edge may be formed with a ``release`` operation.
    This is intended to model C++'s ``memory_order_acquire``.
``release``
    In addition to the guarantees of ``monotonic``, if this operation
    writes a value which is subsequently read by an ``acquire``
    operation, it *synchronizes-with* that operation. (This isn't a
    complete description; see the C++0x definition of a release
    sequence.) This corresponds to the C++0x/C1x
    ``memory_order_release``.
``acq_rel`` (acquire+release)
    Acts as both an ``acquire`` and ``release`` operation on its
    address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
``seq_cst`` (sequentially consistent)
    In addition to the guarantees of ``acq_rel`` (``acquire`` for an
    operation which only reads, ``release`` for an operation which only
    writes), there is a global total order on all
    sequentially-consistent operations on all addresses, which is
    consistent with the *happens-before* partial order and with the
    modification orders of all the affected addresses. Each
    sequentially-consistent read sees the last preceding write to the
    same address in this global order. This corresponds to the C++0x/C1x
    ``memory_order_seq_cst`` and Java volatile.
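
The ``release`` and ``acquire`` orderings are commonly paired to publish
data from one thread to another. The sketch below assumes two
hypothetical globals, ``@data`` and ``@flag``, and two functions that run
on different threads:

.. code-block:: llvm

   @data = global i32 0
   @flag = global i32 0

   define void @producer() {
     ; Ordinary store, published by the release store that follows it.
     store i32 42, i32* @data
     store atomic i32 1, i32* @flag release, align 4
     ret void
   }

   define i32 @consumer() {
   entry:
     br label %spin
   spin:
     ; Once this acquire load reads the value written by the release store,
     ; the two operations synchronize-with each other.
     %f = load atomic i32* @flag acquire, align 4
     %ready = icmp eq i32 %f, 1
     br i1 %ready, label %done, label %spin
   done:
     ; The store to @data happens before this load, so it sees 42.
     %v = load i32* @data
     ret i32 %v
   }

If both atomic operations were ``monotonic`` instead, no
*synchronizes-with* edge would be formed and the final load could return
``undef``.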

.. _singlethread:

If an atomic operation is marked ``singlethread``, it only *synchronizes
with* or participates in modification and seq\_cst total orderings with
other operations running in the same thread (for example, in signal
handlers).

.. _fastmath:

Fast-Math Flags
---------------

LLVM IR floating-point binary ops (:ref:`fadd <i_fadd>`,
:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
:ref:`frem <i_frem>`) have the following flags that can be set to enable
otherwise unsafe floating point operations:

``nnan``
   No NaNs - Allow optimizations to assume the arguments and result are not
   NaN. Such optimizations are required to retain defined behavior over
   NaNs, but the value of the result is undefined.

``ninf``
   No Infs - Allow optimizations to assume the arguments and result are not
   +/-Inf. Such optimizations are required to retain defined behavior over
   +/-Inf, but the value of the result is undefined.

``nsz``
   No Signed Zeros - Allow optimizations to treat the sign of a zero
   argument or result as insignificant.

``arcp``
   Allow Reciprocal - Allow optimizations to use the reciprocal of an
   argument rather than perform division.

``fast``
   Fast - Allow algebraically equivalent transformations that may
   dramatically change results in floating point (e.g. reassociate). This
   flag implies all the others.
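
The flags are written between the opcode and the type. A brief sketch,
using a hypothetical function name:

.. code-block:: llvm

   define float @scaled_sum(float %a, float %b, float %c) {
     ; 'fast' implies nnan, ninf, nsz and arcp, and permits reassociation.
     %t = fadd fast float %a, %b
     ; Several individual flags may be combined on one instruction.
     %u = fmul nnan nsz float %t, %c
     ; 'arcp' permits rewriting the division as a multiply by a reciprocal.
     %r = fdiv arcp float %u, %b
     ret float %r
   }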

.. _typesystem:

Type System
===========

The LLVM type system is one of the most important features of the
intermediate representation. Being typed enables a number of
optimizations to be performed on the intermediate representation
directly, without having to do extra analyses on the side before the
transformation. A strong type system makes it easier to read the
generated code and enables novel analyses and transformations that are
not feasible to perform on normal three address code representations.

Type Classifications
--------------------

The types fall into a few useful classifications:


.. list-table::
   :header-rows: 1

   * - Classification
     - Types

   * - :ref:`integer <t_integer>`
     - ``i1``, ``i2``, ``i3``, ... ``i8``, ... ``i16``, ... ``i32``, ...
       ``i64``, ...

   * - :ref:`floating point <t_floating>`
     - ``half``, ``float``, ``double``, ``x86_fp80``, ``fp128``,
       ``ppc_fp128``


   * - first class

       .. _t_firstclass:

     - :ref:`integer <t_integer>`, :ref:`floating point <t_floating>`,
       :ref:`pointer <t_pointer>`, :ref:`vector <t_vector>`,
       :ref:`structure <t_struct>`, :ref:`array <t_array>`,
       :ref:`label <t_label>`, :ref:`metadata <t_metadata>`.

   * - :ref:`primitive <t_primitive>`
     - :ref:`label <t_label>`,
       :ref:`void <t_void>`,
       :ref:`integer <t_integer>`,
       :ref:`floating point <t_floating>`,
       :ref:`x86mmx <t_x86mmx>`,
       :ref:`metadata <t_metadata>`.

   * - :ref:`derived <t_derived>`
     - :ref:`array <t_array>`,
       :ref:`function <t_function>`,
       :ref:`pointer <t_pointer>`,
       :ref:`structure <t_struct>`,
       :ref:`vector <t_vector>`,
       :ref:`opaque <t_opaque>`.

The :ref:`first class <t_firstclass>` types are perhaps the most important.
Values of these types are the only ones which can be produced by
instructions.

.. _t_primitive:

Primitive Types
---------------

The primitive types are the fundamental building blocks of the LLVM
system.

.. _t_integer:

Integer Type
^^^^^^^^^^^^

Overview:
"""""""""

The integer type is a very simple type that specifies an arbitrary bit
width for the integer type desired. Any bit width from 1 bit to
2\ :sup:`23`\ -1 (about 8 million) can be specified.

Syntax:
"""""""

::

      iN

The number of bits the integer will occupy is specified by the ``N``
value.

Examples:
"""""""""

+----------------+------------------------------------------------+
| ``i1``         | a single-bit integer.                          |
+----------------+------------------------------------------------+
| ``i32``        | a 32-bit integer.                              |
+----------------+------------------------------------------------+
| ``i1942652``   | a really big integer of over 1 million bits.   |
+----------------+------------------------------------------------+
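
Integer widths need not be powers of two. As a small sketch, the
hypothetical function below operates on a 24-bit integer:

.. code-block:: llvm

   define i1 @is_odd(i24 %x) {
     ; Truncation keeps the low bit, which is 1 exactly when %x is odd.
     %low = trunc i24 %x to i1
     ret i1 %low
   }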

.. _t_floating:

Floating Point Types
^^^^^^^^^^^^^^^^^^^^

.. list-table::
   :header-rows: 1

   * - Type
     - Description

   * - ``half``
     - 16-bit floating point value

   * - ``float``
     - 32-bit floating point value

   * - ``double``
     - 64-bit floating point value

   * - ``fp128``
     - 128-bit floating point value (112-bit mantissa)

   * - ``x86_fp80``
     -  80-bit floating point value (X87)

   * - ``ppc_fp128``
     - 128-bit floating point value (two 64-bits)

.. _t_x86mmx:

X86mmx Type
^^^^^^^^^^^

Overview:
"""""""""

The x86mmx type represents a value held in an MMX register on an x86
machine. The operations allowed on it are quite limited: parameters and
return values, load and store, and bitcast. User-specified MMX
instructions are represented as intrinsic or asm calls with arguments
and/or results of this type. There are no arrays, vectors or constants
of this type.

Syntax:
"""""""

::

      x86mmx

.. _t_void:

Void Type
^^^^^^^^^

Overview:
"""""""""

The void type does not represent any value and has no size.

Syntax:
"""""""

::

      void

.. _t_label:

Label Type
^^^^^^^^^^

Overview:
"""""""""

The label type represents code labels.

Syntax:
"""""""

::

      label

.. _t_metadata:

Metadata Type
^^^^^^^^^^^^^

Overview:
"""""""""

The metadata type represents embedded metadata. No derived types may be
created from metadata except for :ref:`function <t_function>` arguments.

Syntax:
"""""""

::

      metadata

.. _t_derived:

Derived Types
-------------

The real power in LLVM comes from the derived types in the system. This
is what allows a programmer to represent arrays, functions, pointers,
and other useful types. Each of these types contains one or more element
types which may be a primitive type, or another derived type. For
example, it is possible to have a two dimensional array, using an array
as the element type of another array.

.. _t_aggregate:

Aggregate Types
^^^^^^^^^^^^^^^

Aggregate Types are a subset of derived types that can contain multiple
member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
aggregate types. :ref:`Vectors <t_vector>` are not considered to be
aggregate types.

.. _t_array:

Array Type
^^^^^^^^^^

Overview:
"""""""""

The array type is a very simple derived type that arranges elements
sequentially in memory. The array type requires a size (number of
elements) and an underlying data type.

Syntax:
"""""""

::

      [<# elements> x <elementtype>]

The number of elements is a constant integer value; ``elementtype`` may
be any type with a size.

Examples:
"""""""""

+------------------+--------------------------------------+
| ``[40 x i32]``   | Array of 40 32-bit integer values.   |
+------------------+--------------------------------------+
| ``[41 x i32]``   | Array of 41 32-bit integer values.   |
+------------------+--------------------------------------+
| ``[4 x i8]``     | Array of 4 8-bit integer values.     |
+------------------+--------------------------------------+

Here are some examples of multidimensional arrays:

+-----------------------------+----------------------------------------------------------+
| ``[3 x [4 x i32]]``         | 3x4 array of 32-bit integer values.                      |
+-----------------------------+----------------------------------------------------------+
| ``[12 x [10 x float]]``     | 12x10 array of single precision floating point values.   |
+-----------------------------+----------------------------------------------------------+
| ``[2 x [3 x [4 x i16]]]``   | 2x3x4 array of 16-bit integer values.                    |
+-----------------------------+----------------------------------------------------------+

There is no restriction on indexing beyond the end of the array implied
by a static type (though there are restrictions on indexing beyond the
bounds of an allocated object in some cases). This means that
single-dimension 'variable sized array' addressing can be implemented in
LLVM with a zero length array type. An implementation of 'Pascal-style
arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
example.
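
A sketch of such addressing, assuming a hypothetical type name
``%pascal_array`` whose first field holds the element count:

.. code-block:: llvm

   %pascal_array = type { i32, [0 x float] }

   define float @get_element(%pascal_array* %arr, i32 %idx) {
     ; Step into field 1 (the zero length array), then index past its
     ; static bound to reach element %idx.
     %slot = getelementptr %pascal_array* %arr, i32 0, i32 1, i32 %idx
     %val = load float* %slot
     ret float %val
   }

The caller remains responsible for ensuring that element ``%idx`` lies
within the storage that was actually allocated.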

.. _t_function:

Function Type
^^^^^^^^^^^^^

Overview:
"""""""""

The function type can be thought of as a function signature. It consists
of a return type and a list of formal parameter types. The return type
of a function type is a first class type or a void type.

Syntax:
"""""""

::

      <returntype> (<parameter list>)

...where '``<parameter list>``' is a comma-separated list of type
specifiers. Optionally, the parameter list may include a type ``...``,
which indicates that the function takes a variable number of arguments.
Variable argument functions can access their arguments with the
:ref:`variable argument handling intrinsic <int_varargs>` functions.
'``<returntype>``' is any type except :ref:`label <t_label>`.

Examples:
"""""""""

+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``i32 (i32)``                   | function taking an ``i32``, returning an ``i32``                                                                                                                    |
+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``float (i16, i32 *) *``        | :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and a :ref:`pointer <t_pointer>` to ``i32``, returning ``float``.                                    |
+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``i32 (i8*, ...)``              | A vararg function that takes at least one :ref:`pointer <t_pointer>` to ``i8`` (char in C), which returns an integer. This is the signature for ``printf`` in LLVM. |
+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``{i32, i32} (i32)``            | A function taking an ``i32``, returning a :ref:`structure <t_struct>` containing two ``i32`` values                                                                 |
+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
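
As a sketch of how a vararg function type appears in practice, the
hypothetical function below calls ``printf`` through the signature shown
in the table. The format string ``@.fmt`` is an illustrative global.

.. code-block:: llvm

   @.fmt = private unnamed_addr constant [4 x i8] c"%d\0A\00"

   declare i32 @printf(i8*, ...)

   define i32 @print_int(i32 %n) {
     %fmt = getelementptr [4 x i8]* @.fmt, i32 0, i32 0
     ; The callee type is spelled out as a pointer to the vararg function type.
     %ret = call i32 (i8*, ...)* @printf(i8* %fmt, i32 %n)
     ret i32 %ret
   }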

.. _t_struct:

Structure Type
^^^^^^^^^^^^^^

Overview:
"""""""""

The structure type is used to represent a collection of data members
together in memory. The elements of a structure may be any type that has
a size.

Structures in memory are accessed using '``load``' and '``store``' by
getting a pointer to a field with the '``getelementptr``' instruction.
Structures in registers are accessed using the '``extractvalue``' and
'``insertvalue``' instructions.

Structures may optionally be "packed" structures, which indicate that
the alignment of the struct is one byte, and that there is no padding
between the elements. In non-packed structs, padding between field types
is inserted as defined by the DataLayout string in the module, which is
required to match what the underlying code generator expects.

Structures can either be "literal" or "identified". A literal structure
is defined inline with other types (e.g. ``{i32, i32}*``) whereas
identified types are always defined at the top level with a name.
Literal types are uniqued by their contents and can never be recursive
or opaque since there is no way to write one. Identified types can be
recursive, can be opaqued, and are never uniqued.

Syntax:
"""""""

::

      %T1 = type { <type list> }     ; Identified normal struct type
      %T2 = type <{ <type list> }>   ; Identified packed struct type

Examples:
"""""""""

+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``{ i32, i32, i32 }``        | A triple of three ``i32`` values                                                                                                                                                      |
+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``{ float, i32 (i32) * }``   | A pair, where the first element is a ``float`` and the second element is a :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32``, returning an ``i32``.  |
+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``<{ i8, i32 }>``            | A packed struct known to be 5 bytes in size.                                                                                                                                          |
+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
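
The two access styles described in the overview can be sketched as
follows. The type and function names here are hypothetical.

.. code-block:: llvm

   %pair = type { i32, float }
   ; Identified types may be recursive.
   %node = type { i32, %node* }

   ; In memory: compute the field address, then load from it.
   define float @second_in_memory(%pair* %p) {
     %field = getelementptr %pair* %p, i32 0, i32 1
     %v = load float* %field
     ret float %v
   }

   ; In registers: read and replace a field of a struct value.
   define %pair @bump_first(%pair %agg, i32 %x) {
     %old = extractvalue %pair %agg, 0
     %sum = add i32 %old, %x
     %new = insertvalue %pair %agg, i32 %sum, 0
     ret %pair %new
   }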

.. _t_opaque:

Opaque Structure Types
^^^^^^^^^^^^^^^^^^^^^^

Overview:
"""""""""

Opaque structure types are used to represent named structure types that
do not have a body specified. This corresponds (for example) to the C
notion of a forward declared structure.

Syntax:
"""""""

::

      %X = type opaque
      %52 = type opaque

Examples:
"""""""""

+--------------+-------------------+
| ``opaque``   | An opaque type.   |
+--------------+-------------------+
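
An opaque type has no size, so values of it cannot be loaded or stored,
but pointers to it can be passed around freely. A sketch with a
hypothetical handle type and hypothetical external functions:

.. code-block:: llvm

   %handle = type opaque

   declare %handle* @open_handle()
   declare void @close_handle(%handle*)

   define void @use_handle() {
     %h = call %handle* @open_handle()
     call void @close_handle(%handle* %h)
     ret void
   }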

.. _t_pointer:

Pointer Type
^^^^^^^^^^^^

Overview:
"""""""""

The pointer type is used to specify memory locations. Pointers are
commonly used to reference objects in memory.

Pointer types may have an optional address space attribute defining the
numbered address space where the pointed-to object resides. The default
address space is number zero. The semantics of non-zero address spaces
are target-specific.

Note that LLVM does not permit pointers to void (``void*``) nor does it
permit pointers to labels (``label*``). Use ``i8*`` instead.

Syntax:
"""""""

::

      <type> *

Examples:
"""""""""

+-------------------------+--------------------------------------------------------------------------------------------------------------+
| ``[4 x i32]*``          | A :ref:`pointer <t_pointer>` to :ref:`array <t_array>` of four ``i32`` values.                               |
+-------------------------+--------------------------------------------------------------------------------------------------------------+
| ``i32 (i32*) *``        | A :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32*``, returning an ``i32``. |
+-------------------------+--------------------------------------------------------------------------------------------------------------+
| ``i32 addrspace(5)*``   | A :ref:`pointer <t_pointer>` to an ``i32`` value that resides in address space #5.                           |
+-------------------------+--------------------------------------------------------------------------------------------------------------+
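
The sketch below shows ``i8*`` standing in for a generic pointer, and a
load through a pointer in a non-default address space. The ``malloc``
declaration assumes a target whose ``size_t`` is 64 bits wide, and the
function names are hypothetical.

.. code-block:: llvm

   declare i8* @malloc(i64)

   define i32* @new_i32() {
     ; i8* plays the role of C's void*; bitcast gives it a pointee type.
     %raw = call i8* @malloc(i64 4)
     %typed = bitcast i8* %raw to i32*
     ret i32* %typed
   }

   define i32 @load_from_space5(i32 addrspace(5)* %p) {
     ; The address space is part of the pointer type.
     %v = load i32 addrspace(5)* %p
     ret i32 %v
   }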

.. _t_vector:

Vector Type
^^^^^^^^^^^

Overview:
"""""""""

A vector type is a simple derived type that represents a vector of
elements. Vector types are used when multiple primitive data are
operated on in parallel using a single instruction (SIMD). A vector type
requires a size (number of elements) and an underlying primitive data
type. Vector types are considered :ref:`first class <t_firstclass>`.

Syntax:
"""""""

::

      < <# elements> x <elementtype> >

The number of elements is a constant integer value larger than 0;
``elementtype`` may be any integer or floating point type, or a pointer to
these types. Vectors of size zero are not allowed.

Examples:
"""""""""

+-------------------+--------------------------------------------------+
| ``<4 x i32>``     | Vector of 4 32-bit integer values.               |
+-------------------+--------------------------------------------------+
| ``<8 x float>``   | Vector of 8 32-bit floating-point values.        |
+-------------------+--------------------------------------------------+
| ``<2 x i64>``     | Vector of 2 64-bit integer values.               |
+-------------------+--------------------------------------------------+
| ``<4 x i64*>``    | Vector of 4 pointers to 64-bit integer values.   |
+-------------------+--------------------------------------------------+
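
Ordinary arithmetic instructions operate on vectors element by element.
A short sketch, using a hypothetical function name:

.. code-block:: llvm

   define i32 @sum_first_lane(<4 x i32> %a, <4 x i32> %b) {
     ; One add instruction performs four 32-bit additions.
     %sum = add <4 x i32> %a, %b
     ; Pull a single element back out as a scalar.
     %lane0 = extractelement <4 x i32> %sum, i32 0
     ret i32 %lane0
   }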

Constants
=========

LLVM has several different basic types of constants. This section
describes them all and their syntax.

Simple Constants
----------------

**Boolean constants**
    The two strings '``true``' and '``false``' are both valid constants
    of the ``i1`` type.
**Integer constants**
    Standard integers (such as '4') are constants of the
    :ref:`integer <t_integer>` type. Negative numbers may be used with
    integer types.
**Floating point constants**
    Floating point constants use standard decimal notation (e.g.
    123.421), exponential notation (e.g. 1.23421e+2), or a more precise
    hexadecimal notation (see below). The assembler requires the exact
    decimal value of a floating-point constant. For example, the
    assembler accepts 1.25 but rejects 1.3 because 1.3 has no finite
    binary representation. Floating point constants must have a
    :ref:`floating point <t_floating>` type.
**Null pointer constants**
    The identifier '``null``' is recognized as a null pointer constant
    and must be of :ref:`pointer type <t_pointer>`.

The one non-intuitive notation for constants is the hexadecimal form of
floating point constants. For example, the form
'``double 0x432ff973cafa8000``' is equivalent to (but harder to read
than) '``double 4.5e+15``'. The only time hexadecimal floating point
constants are required (and the only time that they are generated by the
disassembler) is when a floating point constant must be emitted but it
cannot be represented as a decimal floating point number in a reasonable
number of digits. For example, NaN's, infinities, and other special
values are represented in their IEEE hexadecimal format so that assembly
and disassembly do not cause any bits to change in the constants.

When using the hexadecimal form, constants of types half, float, and
double are represented using the 16-digit form shown above (which
matches the IEEE 754 representation for double); half and float values
must, however, be exactly representable as IEEE 754 half and single
precision, respectively. Hexadecimal format is always used for long
double, and there are three forms of long double. The 80-bit format used
by x86 is represented as ``0xK`` followed by 20 hexadecimal digits. The
128-bit format used by PowerPC (two adjacent doubles) is represented by
``0xM`` followed by 32 hexadecimal digits. The IEEE 128-bit format is
represented by ``0xL`` followed by 32 hexadecimal digits; no currently
supported target uses this format. Long doubles will only work if they
match the long double format on your target. The IEEE 16-bit format
(half precision) is represented by ``0xH`` followed by 4 hexadecimal
digits. All hexadecimal formats are big-endian (sign bit at the left).
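
A few constants written in these forms are sketched below. The global
names are hypothetical, and each comment gives the value encoded by the
bit pattern.

.. code-block:: llvm

   @big      = global double 0x432FF973CAFA8000        ; 4.5e+15
   @inf      = global double 0x7FF0000000000000        ; +infinity
   @nan      = global double 0x7FF8000000000000        ; a quiet NaN
   @one_f    = global float 0x3FF0000000000000         ; 1.0, in the 16-digit double form
   @one_h    = global half 0xH3C00                     ; 1.0
   @one_x87  = global x86_fp80 0xK3FFF8000000000000000 ; 1.0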

There are no constants of type x86mmx.

Complex Constants
-----------------

Complex constants are a (potentially recursive) combination of simple
constants and smaller complex constants.

**Structure constants**
    Structure constants are represented with notation similar to
    structure type definitions (a comma separated list of elements,
    surrounded by braces (``{}``)). For example:
    "``{ i32 4, float 17.0, i32* @G }``", where "``@G``" is declared as
    "``@G = external global i32``". Structure constants must have
    :ref:`structure type <t_struct>`, and the number and types of elements
    must match those specified by the type.
**Array constants**
    Array constants are represented with notation similar to array type
    definitions (a comma separated list of elements, surrounded by
    square brackets (``[]``)). For example:
    "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
    :ref:`array type <t_array>`, and the number and types of elements must
    match those specified by the type.
**Vector constants**
    Vector constants are represented with notation similar to vector
    type definitions (a comma separated list of elements, surrounded by
    angle brackets (``<>``)). For example:
    "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
    must have :ref:`vector type <t_vector>`, and the number and types of
    elements must match those specified by the type.
**Zero initialization**
    The string '``zeroinitializer``' can be used to zero initialize a
    value of *any* type, including scalar and
    :ref:`aggregate <t_aggregate>` types. This is often used to avoid
    having to print large zero initializers (e.g. for large arrays) and
    is always exactly equivalent to using explicit zero initializers.
**Metadata node**
    A metadata node is a structure-like constant with :ref:`metadata
    type <t_metadata>`. For example:
    "``metadata !{ i32 0, metadata !"test" }``". Unlike other
    constants that are meant to be interpreted as part of the
    instruction stream, metadata is a place to attach additional
    information such as debug info.
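
The aggregate notations above can be combined in a single module. The
following is a minimal sketch that reuses the examples from the list;
all global names other than ``@G`` are hypothetical.

.. code-block:: llvm

    @G = external global i32

    ; Structure, array, and vector constants as global initializers.
    @S = global { i32, float, i32* } { i32 4, float 17.0, i32* @G }
    @A = global [3 x i32] [ i32 42, i32 11, i32 74 ]
    @V = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >

    ; Complex constants may nest: a structure containing an array.
    @N = global { i32, [2 x i32] } { i32 1, [2 x i32] [ i32 2, i32 3 ] }

    ; These two initializers are exactly equivalent.
    @Z1 = global [3 x i32] zeroinitializer
    @Z2 = global [3 x i32] [ i32 0, i32 0, i32 0 ]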

Global Variable and Function Addresses
--------------------------------------

The addresses of :ref:`global variables <globalvars>` and
:ref:`functions <functionstructure>` are always implicitly valid
(link-time) constants. These constants are explicitly referenced when
the :ref:`identifier for the global <identifiers>` is used and always have
:ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
file:

.. code-block:: llvm

    @X = global i32 17
    @Y = global i32 42
    @Z = global [2 x i32*] [ i32* @X, i32* @Y ]
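
The same holds for function addresses. As a hedged sketch (the names
``@callee``, ``@FP``, and ``@Table`` are hypothetical and used only for
illustration), the address of a declared function can initialize a
global of the matching pointer type:

.. code-block:: llvm

    declare i32 @callee(i32)

    ; A global holding the address of a function.
    @FP = global i32 (i32)* @callee

    ; An array of function pointers built from the same address.
    @Table = global [2 x i32 (i32)*] [ i32 (i32)* @callee, i32 (i32)* @callee ]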

.. _undefvalues:

Undefined Values
----------------

The string '``undef``' can be used anywhere a constant is expected, and
indicates that the user of the value may receive an unspecified
bit-pattern. Undefined values may be of any type other than '``label``'
or '``void``'.

Undefined values are useful because they indicate to the compiler that
the program is well defined no matter what value is used. This gives the
compiler more freedom to optimize. Here are some examples of
(potentially surprising) transformations that are valid (in pseudo IR):

.. code-block:: llvm

      %A = add %X, undef
      %B = sub %X, undef
      %C = xor %X, undef
    Safe:
      %A = undef
      %B = undef
      %C = undef

This is safe because all of the output bits are affected by the undef
bits. Any output bit can have a zero or one depending on the input bits.
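
Written as fully typed IR rather than pseudo IR, the ``add`` case might
look like the sketch below. The function names are hypothetical, and the
second function only illustrates a result an optimizer is permitted to
produce; it is not a claim about what any particular pass emits.

.. code-block:: llvm

    define i32 @add_undef(i32 %X) {
      %A = add i32 %X, undef
      ret i32 %A
    }

    ; A permitted rewrite: every output bit depends on the undef operand.
    define i32 @add_undef.folded(i32 %X) {
      ret i32 undef
    }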

.. code-block:: llvm

      %A = or %X, undef
      %B = and %X, undef
    Safe:
      %A = -1
      %B = 0
    Unsafe:
      %A = undef
      %B = undef

These logical operations have bits that are not always affected by the
input. For example, if ``%X`` has a zero bit, then the output of the
'``and``' operation will always be a zero for that bit, no matter what
the corresponding bit from the '``undef``' is. As such, it is unsafe to
optimize or assume that the result of the '``and``' is '``undef``'.
However, it is safe to assume that all bits of the '``undef``' could be
0, and optimize the '``and``' to 0. Likewise, it is safe to assume that
all the bits of the '``undef``' operand to the '``or``' could be set,
allowing the '``or``' to be folded to -1.
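
As a typed sketch of the same reasoning (function names are
hypothetical), the '``and``' and '``or``' cases and the constants they
may be folded to could be written as:

.. code-block:: llvm

    define i32 @and_undef(i32 %X) {
      %B = and i32 %X, undef
      ret i32 %B
    }

    ; Permitted: assume every bit of the undef operand is 0.
    define i32 @and_undef.folded(i32 %X) {
      ret i32 0
    }

    define i32 @or_undef(i32 %X) {
      %A = or i32 %X, undef
      ret i32 %A
    }

    ; Permitted: assume every bit of the undef operand is 1.
    define i32 @or_undef.folded(i32 %X) {
      ret i32 -1
    }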

.. code-block:: llvm

      %A = select undef, %X, %Y
      %B = select undef, 42, %Y
      %C = select %X, %Y, undef
    Safe:
      %A = %X     (or %Y)
      %B = 42     (or %Y)
      %C = %Y
    Unsafe:
      %A = undef
      %B = undef
      %C = undef

This set of examples shows that undefined '``select``' (and conditional
branch) conditions can go *either way*, but they have to come from one
of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were
both known to have a clear low bit, then ``%A`` would have to have a
cleared low bit. However, in the ``%C`` example, the optimizer is
allowed to assume that the '``undef``' operand could be the same as
``%Y``, allowing the whole '``select``' to be eliminated.
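
The two situations can be sketched in typed IR as follows. The function
names are hypothetical, and the ``.folded`` variants show one permitted
outcome each rather than the only one.

.. code-block:: llvm

    ; Undefined condition: the result must still be %X or %Y.
    define i32 @select_undef_cond(i32 %X, i32 %Y) {
      %A = select i1 undef, i32 %X, i32 %Y
      ret i32 %A
    }

    define i32 @select_undef_cond.folded(i32 %X, i32 %Y) {
      ret i32 %X
    }

    ; Undefined operand: it may be assumed to equal %Y.
    define i32 @select_undef_arm(i1 %C, i32 %Y) {
      %R = select i1 %C, i32 %Y, i32 undef
      ret i32 %R
    }

    define i32 @select_undef_arm.folded(i1 %C, i32 %Y) {
      ret i32 %Y
    }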

.. code-block:: llvm

      %A = xor undef, undef

      %B = undef
      %C = xor %B, %B

      %D = undef
      %E = icmp slt %D, 4
      %F = icmp sge %D, 4

    Safe:
      %A = undef
      %B = undef
      %C = undef
      %D = undef
      %E = undef
      %F = undef

This example points out that two '``undef``' operands are not
necessarily the same. This can surprise people who assume that
"``X^X``" is always zero even if ``X`` is undefined, although it
matches C semantics. This isn't true for a number of reasons, but the
short answer is that an '``undef``' "variable" can arbitrarily change
its value over its "live range". This is true because the variable
doesn't actually *have a live range*. Instead, the value is logically
read from arbitrary registers that happen to be around when needed, so
the value is not necessarily consistent over time. In fact, ``%A`` and
``%C`` need to have the same semantics or the core LLVM "replace all
uses with" concept would not hold.

.. code-block:: llvm

      %A = fdiv undef, %X
      %B = fdiv %X, undef
    Safe:
      %A = undef
      %B: unreachable

These examples show the crucial difference between an *undefined value*
and *undefined behavior*. An undefined value (like '``undef``') is
allowed to have an arbitrary bit-pattern. This means that the ``%A``
operation can be constant folded to '``undef``', because the '``undef``'
could be an SNaN, and ``fdiv`` is not (currently) defined on SNaNs.
However, in the second example, we can make a more aggressive
assumption: because the ``undef`` is allowed to be an arbitrary value,
we are allowed to assume that it could be zero. Since a divide by zero
has *undefined behavior*, we are allowed to assume that the operation
does not execute at all. This allows us to delete the divide and all
code after it. Because the undefined operation "can't happen", the
optimizer can assume that it occurs in dead code.

.. code-block:: llvm

    a:  store undef -> %X
    b:  store %X -> undef
    Safe:
    a: <deleted>
    b: unreachable

These examples reiterate the ``fdiv`` example: a store *of* an undefined
value can be assumed to not have any effect; we can assume that the
value is overwritten with bits that happen to match what was already
there. However, a store *to* an undefined location could clobber
arbitrary memory; therefore, it has undefined behavior.
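
In typed IR, the two stores and the rewrites the text describes might be
sketched as below. The function names are hypothetical, and the
``.folded`` variants illustrate what an optimizer is allowed to do, not
what it is guaranteed to do.

.. code-block:: llvm

    ; A store *of* undef: may be deleted.
    define void @store_of_undef(i32* %P) {
      store i32 undef, i32* %P
      ret void
    }

    define void @store_of_undef.folded(i32* %P) {
      ret void
    }

    ; A store *to* undef: undefined behavior, so the code is unreachable.
    define void @store_to_undef(i32 %X) {
      store i32 %X, i32* undef
      ret void
    }

    define void @store_to_undef.folded(i32 %X) {
      unreachable
    }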

.. _poisonvalues: