<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
                      "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>LLVM Assembly Language Reference Manual</title>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <meta name="author" content="Chris Lattner">
  <meta name="description" 
  content="LLVM Assembly Language Reference Manual.">
  <link rel="stylesheet" href="llvm.css" type="text/css">
</head>
<div class="doc_title"> LLVM Language Reference Manual </div>
<ol>
  <li><a href="#abstract">Abstract</a></li>
  <li><a href="#introduction">Introduction</a></li>
  <li><a href="#identifiers">Identifiers</a></li>
  <li><a href="#highlevel">High Level Structure</a>
    <ol>
      <li><a href="#modulestructure">Module Structure</a></li>
      <li><a href="#linkage">Linkage Types</a></li>
      <li><a href="#globalvars">Global Variables</a></li>
      <li><a href="#functionstructure">Function Structure</a></li>
    </ol>
  </li>
  <li><a href="#typesystem">Type System</a>
    <ol>
      <li><a href="#t_primitive">Primitive Types</a> 	
        <ol>
          <li><a href="#t_classifications">Type Classifications</a></li>
        </ol>
      </li>
      <li><a href="#t_derived">Derived Types</a>
        <ol>
          <li><a href="#t_array">Array Type</a></li>
          <li><a href="#t_function">Function Type</a></li>
          <li><a href="#t_pointer">Pointer Type</a></li>
          <li><a href="#t_struct">Structure Type</a></li>
          <li><a href="#t_packed">Packed Type</a></li>
        </ol>
      </li>
    </ol>
  </li>
  <li><a href="#constants">Constants</a>
    <ol>
      <li><a href="#simpleconstants">Simple Constants</a></li>
      <li><a href="#aggregateconstants">Aggregate Constants</a></li>
      <li><a href="#globalconstants">Global Variable and Function Addresses</a></li>
      <li><a href="#undefvalues">Undefined Values</a></li>
      <li><a href="#constantexprs">Constant Expressions</a></li>
    </ol>
  </li>
  <li><a href="#instref">Instruction Reference</a>
    <ol>
      <li><a href="#terminators">Terminator Instructions</a>
        <ol>
          <li><a href="#i_ret">'<tt>ret</tt>' Instruction</a></li>
          <li><a href="#i_br">'<tt>br</tt>' Instruction</a></li>
          <li><a href="#i_switch">'<tt>switch</tt>' Instruction</a></li>
          <li><a href="#i_invoke">'<tt>invoke</tt>' Instruction</a></li>
          <li><a href="#i_unwind">'<tt>unwind</tt>'  Instruction</a></li>
          <li><a href="#i_unreachable">'<tt>unreachable</tt>' Instruction</a></li>
        </ol>
      </li>
      <li><a href="#binaryops">Binary Operations</a>
        <ol>
          <li><a href="#i_add">'<tt>add</tt>' Instruction</a></li>
          <li><a href="#i_sub">'<tt>sub</tt>' Instruction</a></li>
          <li><a href="#i_mul">'<tt>mul</tt>' Instruction</a></li>
          <li><a href="#i_div">'<tt>div</tt>' Instruction</a></li>
          <li><a href="#i_rem">'<tt>rem</tt>' Instruction</a></li>
          <li><a href="#i_setcc">'<tt>set<i>cc</i></tt>' Instructions</a></li>
        </ol>
      </li>
      <li><a href="#bitwiseops">Bitwise Binary Operations</a>
        <ol>
          <li><a href="#i_and">'<tt>and</tt>' Instruction</a></li>
          <li><a href="#i_or">'<tt>or</tt>'  Instruction</a></li>
          <li><a href="#i_xor">'<tt>xor</tt>' Instruction</a></li>
          <li><a href="#i_shl">'<tt>shl</tt>' Instruction</a></li>
          <li><a href="#i_shr">'<tt>shr</tt>' Instruction</a></li>
        </ol>
      </li>
      <li><a href="#memoryops">Memory Access Operations</a>
        <ol>
          <li><a href="#i_malloc">'<tt>malloc</tt>'   Instruction</a></li>
          <li><a href="#i_free">'<tt>free</tt>'     Instruction</a></li>
          <li><a href="#i_alloca">'<tt>alloca</tt>'   Instruction</a></li>
	 <li><a href="#i_load">'<tt>load</tt>'     Instruction</a></li>
	 <li><a href="#i_store">'<tt>store</tt>'    Instruction</a></li>
	 <li><a href="#i_getelementptr">'<tt>getelementptr</tt>' Instruction</a></li>
        </ol>
      </li>
      <li><a href="#otherops">Other Operations</a>
        <ol>
          <li><a href="#i_phi">'<tt>phi</tt>'   Instruction</a></li>
          <li><a href="#i_cast">'<tt>cast .. to</tt>' Instruction</a></li>
          <li><a href="#i_select">'<tt>select</tt>' Instruction</a></li>
          <li><a href="#i_call">'<tt>call</tt>'  Instruction</a></li>
          <li><a href="#i_vanext">'<tt>vanext</tt>' Instruction</a></li>
          <li><a href="#i_vaarg">'<tt>vaarg</tt>'  Instruction</a></li>
        </ol>
      </li>
    </ol>
  </li>
  <li><a href="#intrinsics">Intrinsic Functions</a>
    <ol>
      <li><a href="#int_varargs">Variable Argument Handling Intrinsics</a>
        <ol>
          <li><a href="#i_va_start">'<tt>llvm.va_start</tt>' Intrinsic</a></li>
          <li><a href="#i_va_end">'<tt>llvm.va_end</tt>'   Intrinsic</a></li>
          <li><a href="#i_va_copy">'<tt>llvm.va_copy</tt>'  Intrinsic</a></li>
        </ol>
      </li>
      <li><a href="#int_gc">Accurate Garbage Collection Intrinsics</a>
        <ol>
          <li><a href="#i_gcroot">'<tt>llvm.gcroot</tt>' Intrinsic</a></li>
          <li><a href="#i_gcread">'<tt>llvm.gcread</tt>' Intrinsic</a></li>
          <li><a href="#i_gcwrite">'<tt>llvm.gcwrite</tt>' Intrinsic</a></li>
        </ol>
      </li>
      <li><a href="#int_codegen">Code Generator Intrinsics</a>
        <ol>
          <li><a href="#i_returnaddress">'<tt>llvm.returnaddress</tt>' Intrinsic</a></li>
          <li><a href="#i_frameaddress">'<tt>llvm.frameaddress</tt>'   Intrinsic</a></li>
        </ol>
      </li>
      <li><a href="#int_os">Operating System Intrinsics</a>
        <ol>
          <li><a href="#i_readport">'<tt>llvm.readport</tt>' Intrinsic</a></li>
          <li><a href="#i_writeport">'<tt>llvm.writeport</tt>' Intrinsic</a></li>
          <li><a href="#i_readio">'<tt>llvm.readio</tt>'   Intrinsic</a></li>
          <li><a href="#i_writeio">'<tt>llvm.writeio</tt>'   Intrinsic</a></li>
        </ol>
      </li>
      <li><a href="#int_libc">Standard C Library Intrinsics</a>
        <ol>
          <li><a href="#i_memcpy">'<tt>llvm.memcpy</tt>' Intrinsic</a></li>
          <li><a href="#i_memmove">'<tt>llvm.memmove</tt>' Intrinsic</a></li>
          <li><a href="#i_memset">'<tt>llvm.memset</tt>' Intrinsic</a></li>
          <li><a href="#i_isunordered">'<tt>llvm.isunordered</tt>' Intrinsic</a></li>
        </ol>
      </li>
      <li><a href="#int_debugger">Debugger intrinsics</a></li>
    </ol>
  </li>
</ol>

<div class="doc_author">
  <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a>
            and <a href="mailto:vadve@cs.uiuc.edu">Vikram Adve</a></p>
<!-- *********************************************************************** -->
<div class="doc_section"> <a name="abstract">Abstract </a></div>
<!-- *********************************************************************** -->
<p>This document is a reference manual for the LLVM assembly language. 
LLVM is an SSA-based representation that provides type safety,
low-level operations, flexibility, and the capability of representing
'all' high-level languages cleanly.  It is the common code
representation used throughout all phases of the LLVM compilation
strategy.</p>
<!-- *********************************************************************** -->
<div class="doc_section"> <a name="introduction">Introduction</a> </div>
<!-- *********************************************************************** -->
<p>The LLVM code representation is designed to be used in three
different forms: as an in-memory compiler IR, as an on-disk bytecode
representation (suitable for fast loading by a Just-In-Time compiler),
and as a human readable assembly language representation.  This allows
LLVM to provide a powerful intermediate representation for efficient
compiler transformations and analysis, while providing a natural means
to debug and visualize the transformations.  The three different forms
of LLVM are all equivalent.  This document describes the human readable
representation and notation.</p>
<p>The LLVM representation aims to be lightweight and low-level
while being expressive, typed, and extensible at the same time.  It
aims to be a "universal IR" of sorts, by being at a low enough level
that high-level ideas may be cleanly mapped to it (similar to how
microprocessors are "universal IRs", allowing many source languages to
be mapped to them).  By providing type information, LLVM can be used as
the target of optimizations: for example, through pointer analysis, it
can be proven that a C automatic variable is never accessed outside of
the current function, allowing it to be promoted to a simple SSA
value instead of a memory location.</p>
<!-- _______________________________________________________________________ -->
<div class="doc_subsubsection"> <a name="wellformed">Well-Formedness</a> </div>
<p>It is important to note that this document describes 'well formed'
LLVM assembly language.  There is a difference between what the parser
accepts and what is considered 'well formed'.  For example, the
following instruction is syntactically okay, but not well formed:</p>

<pre>
  %x = <a href="#i_add">add</a> int 1, %x
</pre>

<p>...because the definition of <tt>%x</tt> does not dominate all of
its uses. The LLVM infrastructure provides a verification pass that may
be used to verify that an LLVM module is well formed.  This pass is
automatically run by the parser after parsing input assembly, and by
the optimizer before it outputs bytecode.  The violations pointed out
by the verifier pass indicate bugs in transformation passes or input to
the parser.</p>
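<p>The dominance rule can be illustrated with a minimal Python sketch.  It is
not the LLVM verifier: it models only a single straight-line block, where a
definition dominates a use exactly when it appears on an earlier line, and it
ignores function arguments and globals.  The function name
<tt>undefined_uses</tt> is hypothetical.</p>

```python
import re

def undefined_uses(lines):
    """Return (line, name) pairs where a %name is used before it is defined.

    Assumption: one straight-line block, so "dominates" means "defined on an
    earlier line". Function arguments and globals are not modeled.
    """
    defined = set()
    problems = []
    for line in lines:
        target, sep, rhs = line.partition("=")
        if not sep:
            # No result value on this line; the whole line is operands.
            target, rhs = "", line
        for name in re.findall(r"%[a-zA-Z$._0-9]+", rhs):
            if name not in defined:
                problems.append((line.strip(), name))
        if target.strip():
            # The result only becomes available after the instruction.
            defined.add(target.strip())
    return problems
```

<p>Under these assumptions, the instruction shown above is reported because
<tt>%x</tt> is read before any definition of it has been seen.</p>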
<!-- Describe the typesetting conventions here. --> </div>
<!-- *********************************************************************** -->
<div class="doc_section"> <a name="identifiers">Identifiers</a> </div>
<!-- *********************************************************************** -->
<p>LLVM uses three different forms of identifiers, for different
purposes:</p>
<ol>
  <li>Named values are represented as a string of characters with a '%' prefix.
  For example, %foo, %DivisionByZero, %a.really.long.identifier.  The actual
  regular expression used is '<tt>%[a-zA-Z$._][a-zA-Z$._0-9]*</tt>'.
  Identifiers which require other characters in their names can be surrounded
  with quotes.  In this way, anything except a <tt>"</tt> character can be used
  in a name.</li>

  <li>Unnamed values are represented as an unsigned numeric value with a '%'
  prefix.  For example, %12, %2, %44.</li>

  <li>Constants, which are described in a <a href="#constants">section about
  constants</a>, below.</li>
</ol>

<p>LLVM requires that values start with a '%' sign for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set of
reserved words may be expanded in the future without penalty.  Additionally,
unnamed identifiers allow a compiler to quickly come up with a temporary
variable without having to avoid symbol table conflicts.</p>
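<p>As an illustration, the two '%'-prefixed identifier forms can be told apart
with the regular expression quoted above.  This is a sketch in Python, not the
LLVM lexer; the helper name <tt>classify</tt> is hypothetical, and quoted names
are not handled.</p>

```python
import re

# Named values: the regular expression given in the text.
NAMED = re.compile(r"%[a-zA-Z$._][a-zA-Z$._0-9]*")
# Unnamed values: an unsigned number with a '%' prefix.
UNNAMED = re.compile(r"%[0-9]+")

def classify(token):
    """Classify a token as 'named', 'unnamed', or 'other'."""
    if NAMED.fullmatch(token):
        return "named"
    if UNNAMED.fullmatch(token):
        return "unnamed"
    return "other"
```

<p>Reserved words such as <tt>add</tt> fall into the "other" category because
they lack the '%' prefix, which is why they cannot clash with value names.</p>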

<p>Reserved words in LLVM are very similar to reserved words in other
languages. There are keywords for different opcodes ('<tt><a
href="#i_add">add</a></tt>', '<tt><a href="#i_cast">cast</a></tt>', '<tt><a
href="#i_ret">ret</a></tt>', etc...), for primitive type names ('<tt><a
href="#t_void">void</a></tt>', '<tt><a href="#t_uint">uint</a></tt>', etc...),
and others.  These reserved words cannot conflict with variable names, because
none of them start with a '%' character.</p>

<p>Here are three ways to write LLVM code that multiplies the integer variable
'<tt>%X</tt>' by 8.</p>

<p>With a single multiply:</p>

<pre>
  %result = <a href="#i_mul">mul</a> uint %X, 8
</pre>

<p>With a shift:</p>

<pre>
  %result = <a href="#i_shl">shl</a> uint %X, ubyte 3
</pre>

<p>With repeated addition:</p>

<pre>
  <a href="#i_add">add</a> uint %X, %X           <i>; yields {uint}:%0</i>
  <a href="#i_add">add</a> uint %0, %0           <i>; yields {uint}:%1</i>
  %result = <a href="#i_add">add</a> uint %1, %1
</pre>

<p>This last way of multiplying <tt>%X</tt> by 8 illustrates several
important lexical features of LLVM:</p>
<ol>

  <li>Comments are delimited with a '<tt>;</tt>' and go until the end of
  line.</li>

  <li>Unnamed temporaries are created when the result of a computation is not
  assigned to a named value.</li>

  <li>Unnamed temporaries are numbered sequentially.</li>

</ol>

<p>...and it also shows a convention that we follow in this document.  When
demonstrating instructions, we will follow an instruction with a comment that
defines the type and name of the value produced.  Comments are shown in italic
text.</p>
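<p>To see that the three code fragments above compute the same value, here is
a small Python model of them.  It assumes that <tt>uint</tt> arithmetic wraps
around modulo 2<sup>32</sup>, which this section does not state; the function
names are hypothetical.</p>

```python
MASK = 0xFFFFFFFF  # assumption: uint results wrap to 32 bits

def mul8(x):
    """Models: %result = mul uint %X, 8"""
    return (x * 8) & MASK

def shl3(x):
    """Models: %result = shl uint %X, ubyte 3"""
    return (x << 3) & MASK

def add3(x):
    """Models the three chained add instructions."""
    t0 = (x + x) & MASK        # unnamed temporary %0
    t1 = (t0 + t0) & MASK      # unnamed temporary %1
    return (t1 + t1) & MASK    # %result
```

<p>Each doubling in <tt>add3</tt> corresponds to one unnamed temporary, which is
why the temporaries are numbered <tt>%0</tt> and <tt>%1</tt> in order.</p>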


<!-- *********************************************************************** -->
<div class="doc_section"> <a name="highlevel">High Level Structure</a> </div>
<!-- *********************************************************************** -->

<!-- ======================================================================= -->
<div class="doc_subsection"> <a name="modulestructure">Module Structure</a>
</div>

<div class="doc_text">

<p>LLVM programs are composed of "Module"s, each of which is a
translation unit of the input programs.  Each module consists of
functions, global variables, and symbol table entries.  Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:</p>

<pre><i>; Declare the string constant as a global constant...</i>
<a href="#identifiers">%.LC0</a> = <a href="#linkage_internal">internal</a> <a
 href="#globalvars">constant</a> <a href="#t_array">[13 x sbyte]</a> c"hello world\0A\00"          <i>; [13 x sbyte]*</i>

<i>; External declaration of the puts function</i>
<a href="#functionstructure">declare</a> int %puts(sbyte*)                                            <i>; int(sbyte*)* </i>

<i>; Definition of main function</i>
int %main() {                                                        <i>; int()* </i>
        <i>; Convert [13x sbyte]* to sbyte *...</i>
        %cast210 = <a
 href="#i_getelementptr">getelementptr</a> [13 x sbyte]* %.LC0, long 0, long 0 <i>; sbyte*</i>

        <i>; Call puts function to write out the string to stdout...</i>
        <a
 href="#i_call">call</a> int %puts(sbyte* %cast210)                              <i>; int</i>
        <a
 href="#i_ret">ret</a> int 0<br>}<br></pre>

<p>This example is made up of a <a href="#globalvars">global variable</a>
named "<tt>.LC0</tt>", an external declaration of the "<tt>puts</tt>"
function, and a <a href="#functionstructure">function definition</a>
for "<tt>main</tt>".</p>

<p>In general, a module is made up of a list of global values,
where both functions and global variables are global values.  Global values are
represented by a pointer to a memory location (in this case, a pointer to an
array of char, and a pointer to a function), and have one of the following <a
href="#linkage">linkage types</a>.</p>

</div>

<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="linkage">Linkage Types</a>
</div>
<div class="doc_text">

<p>
All Global Variables and Functions have one of the following types of linkage:
</p>

<dl>

  <dt><tt><b><a name="linkage_internal">internal</a></b></tt>: </dt>

  <dd>Global values with internal linkage are only directly accessible by
  objects in the current module.  In particular, linking code into a module with
  an internal global value may cause the internal to be renamed as necessary to
  avoid collisions.  Because the symbol is internal to the module, all
  references can be updated.  This corresponds to the notion of the
  '<tt>static</tt>' keyword in C, or the idea of "anonymous namespaces" in C++.
  </dd>

  <dt><tt><b><a name="linkage_linkonce">linkonce</a></b></tt>: </dt>

  <dd>"<tt>linkonce</tt>" linkage is similar to <tt>internal</tt> linkage, with
  the twist that linking together two modules defining the same
  <tt>linkonce</tt> globals will cause one of the globals to be discarded.  This
  is typically used to implement inline functions.  Unreferenced
  <tt>linkonce</tt> globals are allowed to be discarded.
  </dd>

  <dt><tt><b><a name="linkage_weak">weak</a></b></tt>: </dt>

  <dd>"<tt>weak</tt>" linkage is exactly the same as <tt>linkonce</tt> linkage,
  except that unreferenced <tt>weak</tt> globals may not be discarded.  This is
  used to implement constructs in C such as "<tt>int X;</tt>" at global scope.
  </dd>

  <dt><tt><b><a name="linkage_appending">appending</a></b></tt>: </dt>

  <dd>"<tt>appending</tt>" linkage may only be applied to global variables of
  pointer to array type.  When two global variables with appending linkage are
  linked together, the two global arrays are appended together.  This is the
  LLVM, typesafe, equivalent of having the system linker append together
  "sections" with identical names when .o files are linked.
  </dd>

  <dt><tt><b><a name="linkage_external">externally visible</a></b></tt>:</dt>

  <dd>If none of the above identifiers are used, the global is externally
  visible, meaning that it participates in linkage and can be used to resolve
  external symbol references.
  </dd>
</dl>

<p>For example, since the "<tt>.LC0</tt>"
variable is defined to be internal, if another module defined a "<tt>.LC0</tt>"
variable and was linked with this one, one of the two would be renamed,
preventing a collision.  Since "<tt>main</tt>" and "<tt>puts</tt>" are
external (i.e., lacking any linkage declarations), they are accessible
outside of the current module.  It is illegal for a function <i>declaration</i>
to have any linkage type other than "externally visible".</p>
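<p>The effect of each linkage type on a name collision can be sketched with a
toy model in Python.  This is not the LLVM linker.  It assumes that colliding
symbols carry the same linkage, it invents a <tt>.N</tt> suffix as the renaming
scheme, and it treats two externally visible definitions of one name as an
error, which this section does not specify.  The function name <tt>link</tt>
and the table layout are hypothetical.</p>

```python
def link(dest, src):
    """Merge two symbol tables that map name -> (linkage, value).

    Assumption: colliding symbols carry the same linkage type.
    """
    merged = dict(dest)
    for name, (linkage, value) in src.items():
        if name not in merged:
            merged[name] = (linkage, value)
        elif linkage == "internal":
            # Rename the incoming internal symbol to avoid the collision.
            # The ".N" suffix is a made-up renaming scheme.
            suffix = 1
            while f"{name}.{suffix}" in merged:
                suffix += 1
            merged[f"{name}.{suffix}"] = (linkage, value)
        elif linkage in ("linkonce", "weak"):
            # One of the two definitions is discarded; keep the existing one.
            pass
        elif linkage == "appending":
            # The two global arrays are appended together.
            merged[name] = (linkage, merged[name][1] + value)
        else:
            # Assumption: duplicate externally visible definitions are an error.
            raise ValueError(f"duplicate external symbol {name}")
    return merged
```

<p>Linking two modules that each define an internal <tt>.LC0</tt> leaves both
definitions in the result under different names, while two <tt>appending</tt>
arrays become one longer array.</p>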
</div>

<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="globalvars">Global Variables</a>
</div>

<div class="doc_text">

<p>Global variables define regions of memory allocated at compilation time
instead of run-time.  Global variables may optionally be initialized.  A
variable may be defined as a global "constant", which indicates that the
contents of the variable will <b>never</b> be modified (enabling better
optimization, allowing the global data to be placed in the read-only section of
an executable, etc).  Note that variables that need runtime initialization
cannot be marked "constant", as there is a store to the variable.</p>

<p>
LLVM explicitly allows <em>declarations</em> of global variables to be marked
constant, even if the final definition of the global is not.  This capability
can be used to enable slightly better optimization of the program, but requires
the language definition to guarantee that optimizations based on the
'constantness' are valid for the translation units that do not include the
definition.
</p>

<p>As SSA values, global variables define pointer values that are in
scope in (i.e. they dominate) all basic blocks in the program.  Global
variables always define a pointer to their "content" type because they
describe a region of memory, and all memory objects in LLVM are
accessed through pointers.</p>

</div>


<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="functionstructure">Functions</a>
</div>

<div class="doc_text">

<p>LLVM function definitions are composed of a (possibly empty) argument list,
an opening curly brace, a list of basic blocks, and a closing curly brace.  LLVM
function declarations are defined with the "<tt>declare</tt>" keyword, a
function name, and a function signature.</p>

<p>A function definition contains a list of basic blocks, forming the CFG for
the function.  Each basic block may optionally start with a label (giving the
basic block a symbol table entry), contains a list of instructions, and ends
with a <a href="#terminators">terminator</a> instruction (such as a branch or
function return).</p>

<p>The first basic block in a function is special in two ways: it is immediately
executed on entrance to the function, and it is not allowed to have predecessor
basic blocks (i.e. there cannot be any branches to the entry block of a
function).  Because the block can have no predecessors, it also cannot have any
<a href="#i_phi">PHI nodes</a>.</p>

<p>LLVM functions are identified by their name and type signature.  Hence, two
functions with the same name but different parameter lists or return values are
considered different functions, and LLVM will resolve references to each
appropriately.</p>
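<p>One way to picture identification by name and signature is a table keyed on
both.  The following Python sketch is only an illustration, not how LLVM stores
functions; <tt>declare</tt> and <tt>resolve</tt> are hypothetical helpers.</p>

```python
def declare(table, name, return_type, param_types):
    """Record a function under its (name, return type, parameters) key."""
    key = (name, return_type, tuple(param_types))
    table[key] = f"{return_type} %{name}({', '.join(param_types)})"
    return key

def resolve(table, name, return_type, arg_types):
    """Look up a reference; returns None when no function matches exactly."""
    return table.get((name, return_type, tuple(arg_types)))
```

<p>Two entries named <tt>puts</tt> with different parameter lists occupy
different keys, so a reference is matched against the one whose full signature
agrees.</p>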

</div>



<!-- *********************************************************************** -->
<div class="doc_section"> <a name="typesystem">Type System</a> </div>
<!-- *********************************************************************** -->
<p>The LLVM type system is one of the most important features of the
intermediate representation.  Being typed enables a number of
optimizations to be performed on the IR directly, without having to do
extra analyses on the side before the transformation.  A strong type
system makes it easier to read the generated code and enables novel
analyses and transformations that are not feasible to perform on normal
three address code representations.</p>
<!-- ======================================================================= -->
<div class="doc_subsection"> <a name="t_primitive">Primitive Types</a> </div>
<p>The primitive types are the fundamental building blocks of the LLVM
system. The current set of primitive types is as follows:</p>
<table class="layout">
  <tr class="layout">
    <td class="left">
      <table>
        <tbody>
        <tr><th>Type</th><th>Description</th></tr>
        <tr><td><tt>void</tt></td><td>No value</td></tr>
        <tr><td><tt>ubyte</tt></td><td>Unsigned 8 bit value</td></tr>
        <tr><td><tt>ushort</tt></td><td>Unsigned 16 bit value</td></tr>
        <tr><td><tt>uint</tt></td><td>Unsigned 32 bit value</td></tr>
        <tr><td><tt>ulong</tt></td><td>Unsigned 64 bit value</td></tr>
        <tr><td><tt>float</tt></td><td>32 bit floating point value</td></tr>
        <tr><td><tt>label</tt></td><td>Branch destination</td></tr>
        </tbody>
      </table>
    </td>
    <td class="right">
      <table>
        <tbody>
          <tr><th>Type</th><th>Description</th></tr>
          <tr><td><tt>bool</tt></td><td>True or False value</td></tr>
          <tr><td><tt>sbyte</tt></td><td>Signed 8 bit value</td></tr>
          <tr><td><tt>short</tt></td><td>Signed 16 bit value</td></tr>
          <tr><td><tt>int</tt></td><td>Signed 32 bit value</td></tr>
          <tr><td><tt>long</tt></td><td>Signed 64 bit value</td></tr>
          <tr><td><tt>double</tt></td><td>64 bit floating point value</td></tr>
        </tbody>
      </table>
    </td>
  </tr>
</table>
<!-- _______________________________________________________________________ -->
<div class="doc_subsubsection"> <a name="t_classifications">Type
Classifications</a> </div>
<p>These different primitive types fall into a few useful
classifications:</p>

<table border="1" cellspacing="0" cellpadding="4">
  <tbody>
    <tr><th>Classification</th><th>Types</th></tr>
    <tr>
      <td><a name="t_signed">signed</a></td>
      <td><tt>sbyte, short, int, long, float, double</tt></td>
    </tr>
    <tr>
      <td><a name="t_unsigned">unsigned</a></td>
      <td><tt>ubyte, ushort, uint, ulong</tt></td>
    </tr>
    <tr>
      <td><a name="t_integer">integer</a></td>
      <td><tt>ubyte, sbyte, ushort, short, uint, int, ulong, long</tt></td>
    </tr>
    <tr>
      <td><a name="t_integral">integral</a></td>
      <td><tt>bool, ubyte, sbyte, ushort, short, uint, int, ulong, long</tt>
      </td>
    </tr>
    <tr>
      <td><a name="t_floating">floating point</a></td>
      <td><tt>float, double</tt></td>
    </tr>
    <tr>
      <td><a name="t_firstclass">first class</a></td>
      <td><tt>bool, ubyte, sbyte, ushort, short, uint, int, ulong, long,<br> 
      float, double, <a href="#t_pointer">pointer</a>, 
      <a href="#t_packed">packed</a></tt></td>
    </tr>
  </tbody>
</table>
<p>The <a href="#t_firstclass">first class</a> types are perhaps the
most important.  Values of these types are the only ones which can be
produced by instructions, passed as arguments, or used as operands to
instructions.  This means that all structures and arrays must be
manipulated either by pointer or by component.</p>
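<p>The classification table can be restated as sets, which makes the
relationships between the rows easy to check.  This Python sketch simply
transcribes the table; the strings <tt>"pointer"</tt> and <tt>"packed"</tt>
stand in for whole families of derived types, and <tt>is_first_class</tt> is a
hypothetical helper.</p>

```python
SIGNED = {"sbyte", "short", "int", "long", "float", "double"}
UNSIGNED = {"ubyte", "ushort", "uint", "ulong"}
INTEGER = {"ubyte", "sbyte", "ushort", "short", "uint", "int", "ulong", "long"}
INTEGRAL = INTEGER | {"bool"}
FLOATING = {"float", "double"}
# "pointer" and "packed" are placeholders for any pointer or packed type.
FIRST_CLASS = INTEGRAL | FLOATING | {"pointer", "packed"}

def is_first_class(type_name):
    """True if values of this type can be produced or consumed by instructions."""
    return type_name in FIRST_CLASS
```

<p>Note that <tt>void</tt> and <tt>label</tt> are primitive types but appear in
none of the classifications, so neither is first class.</p>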
<!-- ======================================================================= -->
<div class="doc_subsection"> <a name="t_derived">Derived Types</a> </div>
<p>The real power in LLVM comes from the derived types in the system. 
This is what allows a programmer to represent arrays, functions,
pointers, and other useful types.  Note that these derived types may be
recursive: for example, it is possible to have a two-dimensional array.</p>
<!-- _______________________________________________________________________ -->
<div class="doc_subsubsection"> <a name="t_array">Array Type</a> </div>
<h5>Overview:</h5>
<p>The array type is a very simple derived type that arranges elements
sequentially in memory.  The array type requires a size (number of
elements) and an underlying data type.</p>

<h5>Syntax:</h5>
<pre>
  [&lt;# elements&gt; x &lt;elementtype&gt;]
</pre>

<p>The number of elements is a constant integer value; <tt>elementtype</tt> may
be any type with a size.</p>
<h5>Examples:</h5>
<table class="layout">
  <tr class="layout">
    <td class="left">
      <tt>[40 x int ]</tt><br/>
      <tt>[41 x int ]</tt><br/>
      <tt>[40 x uint]</tt><br/>
    </td>
    <td class="left">
      Array of 40 integer values.<br/>
      Array of 41 integer values.<br/>
      Array of 40 unsigned integer values.<br/>
    </td>
  </tr>
</table>
<p>Here are some examples of multidimensional arrays:</p>
<table class="layout">
  <tr class="layout">
    <td class="left">
      <tt>[3 x [4 x int]]</tt><br/>
      <tt>[12 x [10 x float]]</tt><br/>
      <tt>[2 x [3 x [4 x uint]]]</tt><br/>
    </td>
    <td class="left">
      3x4 array of integer values.<br/>
      12x10 array of single precision floating point values.<br/>
      2x3x4 array of unsigned integer values.<br/>
    </td>
  </tr>
</table>
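<p>Because array types nest, the shape of a multidimensional array can be
recovered by peeling off one <tt>[N x ...]</tt> layer at a time.  The Python
sketch below does this for primitive element types.  The byte sizes are derived
from the bit widths in the primitive type table and assume no padding between
elements, which is an assumption rather than something this section states;
<tt>parse_array</tt> and <tt>array_bytes</tt> are hypothetical names.</p>

```python
import math
import re

# Byte sizes taken from the bit widths in the primitive type table.
SIZES = {"sbyte": 1, "ubyte": 1, "short": 2, "ushort": 2, "int": 4,
         "uint": 4, "long": 8, "ulong": 8, "float": 4, "double": 8}

def parse_array(text):
    """Return (dimensions, element type) for a type like '[3 x [4 x int]]'."""
    dims = []
    text = text.strip()
    while text.startswith("["):
        match = re.fullmatch(r"\[\s*(\d+)\s*x\s*(.+?)\s*\]", text)
        if match is None:
            raise ValueError(f"malformed array type: {text}")
        dims.append(int(match.group(1)))
        text = match.group(2).strip()   # continue with the element type
    return dims, text

def array_bytes(text):
    """Total size in bytes, assuming elements are laid out without padding."""
    dims, element = parse_array(text)
    return math.prod(dims) * SIZES[element]
```

<p>For instance, <tt>[3 x [4 x int]]</tt> parses to the dimensions 3 and 4 with
element type <tt>int</tt>, giving 12 elements in total.</p>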
<!-- _______________________________________________________________________ -->
<div class="doc_subsubsection"> <a name="t_function">Function Type</a> </div>
<h5>Overview:</h5>
<p>The function type can be thought of as a function signature.  It
consists of a return type and a list of formal parameter types. 
Function types are usually used to build virtual function tables
(which are structures of pointers to functions), for indirect function
calls, and when defining a function.</p>
<p>
The return type of a function type cannot be an aggregate type.
</p>
<h5>Syntax:</h5>
<pre>  &lt;returntype&gt; (&lt;parameter list&gt;)<br></pre>
<p>Where '<tt>&lt;parameter list&gt;</tt>' is a comma-separated list of type
specifiers.  Optionally, the parameter list may include a type <tt>...</tt>,
which indicates that the function takes a variable number of arguments.
Variable argument functions can access their arguments with the <a
 href="#int_varargs">variable argument handling intrinsic</a> functions.</p>
<h5>Examples:</h5>
<table class="layout">
  <tr class="layout">
    <td class="left">
      <tt>int (int)</tt> <br/>
      <tt>float (int, int *) *</tt><br/>
      <tt>int (sbyte *, ...)</tt><br/>
    </td>
    <td class="left">
      function taking an <tt>int</tt>, returning an <tt>int</tt><br/>
      <a href="#t_pointer">Pointer</a> to a function that takes an
      <tt>int</tt> and a <a href="#t_pointer">pointer</a> to <tt>int</tt>,
      returning <tt>float</tt>.<br/>
      A vararg function that takes at least one <a href="#t_pointer">pointer</a> 
      to <tt>sbyte</tt> (signed char in C), which returns an integer.  This is 
      the signature for <tt>printf</tt> in LLVM.<br/>
    </td>
  </tr>
</table>
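<p>As a minimal sketch of how these types appear in a module (the names
<tt>%square</tt> and <tt>%Callback</tt> are chosen purely for illustration), the
following declares a vararg function, defines a function of type
<tt>int (int)</tt>, and stores its address in a global of type
<tt>int (int)*</tt>:</p>
<pre>
  declare int %printf(sbyte*, ...)       <i>; function type: int (sbyte*, ...)</i>

  int %square(int %x) {                  <i>; function type: int (int)</i>
          %r = mul int %x, %x
          ret int %r
  }

  %Callback = global int (int)* %square  <i>; pointer to a function of type int (int)</i>
</pre>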
<!-- _______________________________________________________________________ -->
<div class="doc_subsubsection"> <a name="t_struct">Structure Type</a> </div>
<h5>Overview:</h5>
<p>The structure type is used to represent a collection of data members
together in memory.  The packing of the field types is defined to match
the ABI of the underlying processor.  The elements of a structure may
be any type that has a size.</p>
<p>Structures are accessed using '<tt><a href="#i_load">load</a></tt>'
and '<tt><a href="#i_store">store</a></tt>' by getting a pointer to a
field with the '<tt><a href="#i_getelementptr">getelementptr</a></tt>'
instruction.</p>
<h5>Syntax:</h5>
<pre>  { &lt;type list&gt; }<br></pre>
<h5>Examples:</h5>
<table class="layout">
  <tr class="layout">
    <td class="left">
      <tt>{ int, int, int }</tt><br/>
      <tt>{ float, int (int) * }</tt><br/>
    </td>
    <td class="left">
      A triple of <tt>int</tt> values.<br/>
      A pair, where the first element is a <tt>float</tt> and the second element 
      is a <a href="#t_pointer">pointer</a> to a <a href="#t_function">function</a> 
      that takes an <tt>int</tt>, returning an <tt>int</tt>.<br/>
    </td>
  </tr>
</table>
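<p>The access pattern described above can be sketched as follows, assuming a
hypothetical global <tt>%Pair</tt> and function <tt>%getSecond</tt>.  The index
types shown (<tt>int</tt> for the pointer index, <tt>uint</tt> for the field
index) are an assumption here; the
'<tt><a href="#i_getelementptr">getelementptr</a></tt>' section gives the exact
rules:</p>
<pre>
  %Pair = global { float, int } { float 17.0, int 4 }

  int %getSecond() {
          <i>; Compute the address of the second field (the int)</i>
          %p = getelementptr { float, int }* %Pair, int 0, uint 1
          %v = load int* %p
          ret int %v
  }
</pre>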
<!-- _______________________________________________________________________ -->
<div class="doc_subsubsection"> <a name="t_pointer">Pointer Type</a> </div>
<p>As in many languages, the pointer type represents a pointer or
reference to another object, which must live in memory.</p>
<pre>  &lt;type&gt; *<br></pre>
<table class="layout">
  <tr class="layout">
    <td class="left">
      <tt>[4 x int]*</tt><br/>
      <tt>int (int *) *</tt><br/>
    </td>
    <td class="left">
      A <a href="#t_pointer">pointer</a> to <a href="#t_array">array</a> of
      four <tt>int</tt> values<br/>
      A <a href="#t_pointer">pointer</a> to a <a
      href="#t_function">function</a> that takes an <tt>int*</tt>, returning an
      <tt>int</tt>.<br/>
    </td>
  </tr>
</table>
<!-- _______________________________________________________________________ -->
<div class="doc_subsubsection"> <a name="t_packed">Packed Type</a> </div>
<h5>Overview:</h5>
<p>A packed type is a simple derived type that represents a vector
of elements.  Packed types are used when multiple primitive data 
are operated on in parallel using a single instruction (SIMD). 
A packed type requires a size (number of
elements) and an underlying primitive data type.  Packed types are
considered <a href="#t_firstclass">first class</a>.</p>
<h5>Syntax:</h5>
<pre>  &lt; &lt;# elements&gt; x &lt;elementtype&gt; &gt;<br></pre>
<p>The number of elements is a constant integer value; <tt>elementtype</tt> may
be any integral or floating point type.</p>
<h5>Examples:</h5>
<table class="layout">
  <tr class="layout">
    <td class="left">
      <tt>&lt;4 x int&gt;</tt><br/>
      <tt>&lt;8 x float&gt;</tt><br/>
      <tt>&lt;2 x uint&gt;</tt><br/>
    </td>
    <td class="left">
      Packed vector of 4 integer values.<br/>
      Packed vector of 8 floating-point values.<br/>
      Packed vector of 2 unsigned integer values.<br/>
    </td>
  </tr>
</table>
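<p>As a hedged sketch of the SIMD use case, the hypothetical function
<tt>%vadd</tt> below adds two packed values element by element.  It assumes
that '<tt>add</tt>' accepts packed operands and that packed values can be
loaded and stored like other first class values:</p>
<pre>
  void %vadd(&lt;4 x float&gt;* %pa, &lt;4 x float&gt;* %pb, &lt;4 x float&gt;* %pr) {
          %a = load &lt;4 x float&gt;* %pa
          %b = load &lt;4 x float&gt;* %pb
          %s = add &lt;4 x float&gt; %a, %b          <i>; four additions in one instruction</i>
          store &lt;4 x float&gt; %s, &lt;4 x float&gt;* %pr
          ret void
  }
</pre>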
<!-- *********************************************************************** -->
<div class="doc_section"> <a name="constants">Constants</a> </div>
<!-- *********************************************************************** -->

<div class="doc_text">

<p>LLVM has several different basic types of constants.  This section describes
them all and their syntax.</p>

</div>

<!-- ======================================================================= -->
<div class="doc_subsection"><a name="simpleconstants">Simple Constants</a></div>

<div class="doc_text">

<dl>
  <dt><b>Boolean constants</b></dt>

  <dd>The two strings '<tt>true</tt>' and '<tt>false</tt>' are both valid
  constants of the <tt><a href="#t_primitive">bool</a></tt> type.
  </dd>

  <dt><b>Integer constants</b></dt>

  <dd>Standard integers (such as '4') are constants of the <a
  href="#t_integer">integer</a> type.  Negative numbers may be used with signed
  integer types.
  </dd>

  <dt><b>Floating point constants</b></dt>

  <dd>Floating point constants use standard decimal notation (e.g. 123.421),
  exponential notation (e.g. 1.23421e+2), or an optional, bit-exact hexadecimal
  notation (see below).  Floating point constants must have a <a
  href="#t_floating">floating point</a> type. </dd>

  <dt><b>Null pointer constants</b></dt>

  <dd>The identifier '<tt>null</tt>' is recognized as a null pointer constant
  and must be of <a href="#t_pointer">pointer type</a>.</dd>

</dl>

<p>The one non-intuitive notation for constants is the optional hexadecimal form
of floating point constants.  For example, the form '<tt>double
0x432ff973cafa8000</tt>' is equivalent to (but harder to read than) '<tt>double
4.5e+15</tt>'.  The only time hexadecimal floating point constants are required
(and the only time that they are generated by the disassembler) is when a 
floating point constant must be emitted but it cannot be represented as a 
decimal floating point number.  For example, NaNs, infinities, and other 
special values are represented in their IEEE hexadecimal format so that 
assembly and disassembly do not cause any bits to change in the constants.</p>
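
<p>For illustration, the simple constants above might be used as global
initializers as follows (the global names are hypothetical):</p>

<pre>
  %Flag  = global bool true
  %Count = global int -4                      <i>; negative value, signed type</i>
  %Ratio = global double 1.23421e+2           <i>; exponential notation</i>
  %Big   = global double 0x432ff973cafa8000   <i>; hexadecimal form of 4.5e+15</i>
  %Ptr   = global int* null
</pre>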

</div>

<!-- ======================================================================= -->
<div class="doc_subsection"><a name="aggregateconstants">Aggregate Constants</a>
</div>

<div class="doc_text">

<dl>
  <dt><b>Structure constants</b></dt>

  <dd>Structure constants are represented with notation similar to structure
  type definitions (a comma separated list of elements, surrounded by braces
  (<tt>{}</tt>)).  For example: "<tt>{ int 4, float 17.0 }</tt>".  Structure
  constants must have <a href="#t_struct">structure type</a>, and the number and
  types of elements must match those specified by the type.
  </dd>

  <dt><b>Array constants</b></dt>

  <dd>Array constants are represented with notation similar to array type
  definitions (a comma separated list of elements, surrounded by square brackets
  (<tt>[]</tt>)).  For example: "<tt>[ int 42, int 11, int 74 ]</tt>".  Array
  constants must have <a href="#t_array">array type</a>, and the number and
  types of elements must match those specified by the type.
  </dd>

  <dt><b>Packed constants</b></dt>

  <dd>Packed constants are represented with notation similar to packed type
  definitions (a comma separated list of elements, surrounded by
  less-than/greater-than signs (<tt>&lt;&gt;</tt>)).  For example: "<tt>&lt; int 42,
  int 11, int 74, int 100 &gt;</tt>".  Packed constants must have <a
  href="#t_packed">packed type</a>, and the number and types of elements must
  match those specified by the type.
  </dd>

  <dt><b>Zero initialization</b></dt>

  <dd>The string '<tt>zeroinitializer</tt>' can be used to zero initialize a
  value of <em>any</em> type, including scalar and aggregate types.
  This is often used to avoid having to print large zero initializers (e.g. for
  large arrays), and is always exactly equivalent to using explicit zero
  initializers.
  </dd>
</dl>
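
<p>A short sketch showing each aggregate form as a global initializer (the
global names are hypothetical; the element values are taken from the examples
above):</p>

<pre>
  %S = global { int, float } { int 4, float 17.0 }
  %A = global [3 x int] [ int 42, int 11, int 74 ]
  %V = global &lt;4 x int&gt; &lt; int 42, int 11, int 74, int 100 &gt;
  %Z = global [1024 x int] zeroinitializer    <i>; same as 1024 explicit zeros</i>
</pre>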

</div>

<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="globalconstants">Global Variable and Function Addresses</a>
</div>

<div class="doc_text">

<p>The addresses of <a href="#globalvars">global variables</a> and <a
href="#functionstructure">functions</a> are always implicitly valid (link-time)
constants.  These constants are explicitly referenced when the <a
href="#identifiers">identifier for the global</a> is used and always have <a
href="#t_pointer">pointer</a> type. For example, the following is a legal LLVM
file:</p>

<pre>
  %X = global int 17
  %Y = global int 42
  %Z = global [2 x int*] [ int* %X, int* %Y ]
</pre>

</div>

<!-- ======================================================================= -->
<div class="doc_subsection"><a name="undefvalues">Undefined Values</a></div>
  <p>The string '<tt>undef</tt>' is recognized as a type-less constant that has 
  no specific value.  Undefined values may be of any type and may be used anywhere 
  a constant is permitted.</p>
  <p>Undefined values indicate to the compiler that the program is well defined
  no matter what value is used, giving the compiler more freedom to optimize.
  </p>
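  <p>For example, a global whose initial contents do not matter could be
  written as follows (the name <tt>%Scratch</tt> is hypothetical):</p>
<pre>
  %Scratch = global int undef
</pre>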
</div>

<!-- ======================================================================= -->
<div class="doc_subsection"><a name="constantexprs">Constant Expressions</a>
</div>

<div class="doc_text">

<p>Constant expressions are used to allow expressions involving other constants
to be used as constants.  Constant expressions may be of any <a
href="#t_firstclass">first class</a> type, and may involve any LLVM operation
that does not have side effects (e.g. load and call are not supported).  The
following is the syntax for constant expressions:</p>

<dl>
  <dt><b><tt>cast ( CST to TYPE )</tt></b></dt>

  <dd>Cast a constant to another type.</dd>

  <dt><b><tt>getelementptr ( CSTPTR, IDX0, IDX1, ... )</tt></b></dt>

  <dd>Perform the <a href="#i_getelementptr">getelementptr operation</a> on
  constants.  As with the <a href="#i_getelementptr">getelementptr</a>
  instruction, the index list may have zero or more indexes, which are required
  to make sense for the type of "CSTPTR".</dd>

  <dt><b><tt>OPCODE ( LHS, RHS )</tt></b></dt>

  <dd>Perform the specified operation on the LHS and RHS constants. OPCODE may 
  be any of the <a href="#binaryops">binary</a> or <a href="#bitwiseops">bitwise
  binary</a> operations.  The constraints on operands are the same as those for
  the corresponding instruction (e.g. no bitwise operations on floating point
  are allowed).</dd>
</dl>
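
<p>The three forms can be sketched as global initializers as shown below.  The
global names are hypothetical, and the <tt>int</tt> index types passed to
<tt>getelementptr</tt> are an assumption; the
'<tt><a href="#i_getelementptr">getelementptr</a></tt>' section gives the exact
rules:</p>

<pre>
  %X      = global int 17
  %A      = global [2 x int] [ int 1, int 2 ]

  %XAddr  = global long cast (int* %X to long)                      <i>; cast</i>
  %Second = global int* getelementptr ([2 x int]* %A, int 0, int 1) <i>; address of %A[1]</i>
  %Sum    = global int add (int 40, int 2)                          <i>; binary operation</i>
</pre>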
</div>
<!-- *********************************************************************** -->
<div class="doc_section"> <a name="instref">Instruction Reference</a> </div>
<!-- *********************************************************************** -->
<p>The LLVM instruction set consists of several different
classifications of instructions: <a href="#terminators">terminator
instructions</a>, <a href="#binaryops">binary instructions</a>, <a
 href="#memoryops">memory instructions</a>, and <a href="#otherops">other
instructions</a>.</p>
<!-- ======================================================================= -->
<div class="doc_subsection"> <a name="terminators">Terminator
Instructions</a> </div>
<p>As mentioned <a href="#functionstructure">previously</a>, every
basic block in a program ends with a "Terminator" instruction, which
indicates which block should be executed after the current block is
finished. These terminator instructions typically yield a '<tt>void</tt>'
value: they produce control flow, not values (the one exception being
the '<a href="#i_invoke"><tt>invoke</tt></a>' instruction).</p>
<p>There are six different terminator instructions: the '<a
 href="#i_ret"><tt>ret</tt></a>' instruction, the '<a href="#i_br"><tt>br</tt></a>'
instruction, the '<a href="#i_switch"><tt>switch</tt></a>' instruction,
the '<a href="#i_invoke"><tt>invoke</tt></a>' instruction, the '<a
 href="#i_unwind"><tt>unwind</tt></a>' instruction, and the '<a
 href="#i_unreachable"><tt>unreachable</tt></a>' instruction.</p>
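<p>As an illustrative sketch, every basic block in the hypothetical function
<tt>%abs</tt> below ends with a terminator: a conditional '<tt>br</tt>' in the
first block and a '<tt>ret</tt>' in each of the other two.  The
'<tt>setlt</tt>' comparison is assumed to be available as one of the binary
instructions:</p>
<pre>
  int %abs(int %x) {
  entry:
          %isneg = setlt int %x, 0
          br bool %isneg, label %negate, label %done   <i>; terminator</i>
  negate:
          %n = sub int 0, %x
          ret int %n                                   <i>; terminator</i>
  done:
          ret int %x                                   <i>; terminator</i>
  }
</pre>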
<!-- _______________________________________________________________________ -->
<div class="doc_subsubsection"> <a name="i_ret">'<tt>ret</tt>'
Instruction</a> </div>
<h5>Syntax:</h5>
<pre>  ret &lt;type&gt; &lt;value&gt;       <i>; Return a value from a non-void function</i>
  ret void                 <i>; Return from void function</i>
</pre>
<h5>Overview:</h5>
<p>The '<tt>ret</tt>' instruction is used to return control flow (and a
value) from a function, back to the caller.</p>
<p>There are two forms of the '<tt>ret</tt>' instruction: one that
returns a value and then causes control flow, and one that just causes
control flow to occur.</p>
<h5>Arguments:</h5>