<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html><head><title>LLVM Assembly Language Reference Manual</title></head>
<body bgcolor=white>

<table width="100%" bgcolor="#330077" border=0 cellpadding=4 cellspacing=0>
<tr><td>&nbsp; <font size=+5 color="#EEEEFF" face="Georgia,Palatino,Times,Roman"><b>LLVM Language Reference Manual</b></font></td>
</tr></table>

<ol>
  <li><a href="#abstract">Abstract</a>
  <li><a href="#introduction">Introduction</a>
  <li><a href="#identifiers">Identifiers</a>
  <li><a href="#typesystem">Type System</a>
    <ol>
      <li><a href="#t_primitive">Primitive Types</a>
	<ol>
          <li><a href="#t_classifications">Type Classifications</a>
        </ol>
      <li><a href="#t_derived">Derived Types</a>
        <ol>
          <li><a href="#t_array"  >Array Type</a>
          <li><a href="#t_function">Function Type</a>
          <li><a href="#t_pointer">Pointer Type</a>
          <li><a href="#t_struct" >Structure Type</a>
          <!-- <li><a href="#t_packed" >Packed Type</a> -->
        </ol>
    </ol>
  <li><a href="#highlevel">High Level Structure</a>
    <ol>
      <li><a href="#modulestructure">Module Structure</a>
      <li><a href="#globalvars">Global Variables</a>
      <li><a href="#functionstructure">Function Structure</a>
    </ol>
  <li><a href="#instref">Instruction Reference</a>
    <ol>
      <li><a href="#terminators">Terminator Instructions</a>
        <ol>
          <li><a href="#i_ret"   >'<tt>ret</tt>' Instruction</a>
          <li><a href="#i_br"    >'<tt>br</tt>' Instruction</a>
          <li><a href="#i_switch">'<tt>switch</tt>' Instruction</a>
          <li><a href="#i_invoke">'<tt>invoke</tt>' Instruction</a>
          <li><a href="#i_unwind"  >'<tt>unwind</tt>'  Instruction</a>
        </ol>
      <li><a href="#binaryops">Binary Operations</a>
        <ol>
          <li><a href="#i_add"  >'<tt>add</tt>' Instruction</a>
          <li><a href="#i_sub"  >'<tt>sub</tt>' Instruction</a>
          <li><a href="#i_mul"  >'<tt>mul</tt>' Instruction</a>
          <li><a href="#i_div"  >'<tt>div</tt>' Instruction</a>
          <li><a href="#i_rem"  >'<tt>rem</tt>' Instruction</a>
          <li><a href="#i_setcc">'<tt>set<i>cc</i></tt>' Instructions</a>
        </ol>
      <li><a href="#bitwiseops">Bitwise Binary Operations</a>
        <ol>
          <li><a href="#i_and">'<tt>and</tt>' Instruction</a>
          <li><a href="#i_or" >'<tt>or</tt>'  Instruction</a>
          <li><a href="#i_xor">'<tt>xor</tt>' Instruction</a>
          <li><a href="#i_shl">'<tt>shl</tt>' Instruction</a>
          <li><a href="#i_shr">'<tt>shr</tt>' Instruction</a>
        </ol>
      <li><a href="#memoryops">Memory Access Operations</a>
        <ol>
          <li><a href="#i_malloc"  >'<tt>malloc</tt>'   Instruction</a>
          <li><a href="#i_free"    >'<tt>free</tt>'     Instruction</a>
          <li><a href="#i_alloca"  >'<tt>alloca</tt>'   Instruction</a>
	  <li><a href="#i_load"    >'<tt>load</tt>'     Instruction</a>
	  <li><a href="#i_store"   >'<tt>store</tt>'    Instruction</a>
	  <li><a href="#i_getelementptr">'<tt>getelementptr</tt>' Instruction</a>
        </ol>
      <li><a href="#otherops">Other Operations</a>
        <ol>
          <li><a href="#i_phi"  >'<tt>phi</tt>'   Instruction</a>
          <li><a href="#i_cast">'<tt>cast .. to</tt>' Instruction</a>
          <li><a href="#i_call" >'<tt>call</tt>'  Instruction</a>
          <li><a href="#i_vanext">'<tt>vanext</tt>' Instruction</a>
          <li><a href="#i_vaarg" >'<tt>vaarg</tt>'  Instruction</a>
        </ol>
    </ol>
  <li><a href="#intrinsics">Intrinsic Functions</a>
  <ol>
    <li><a href="#int_varargs">Variable Argument Handling Intrinsics</a>
    <ol>
      <li><a href="#i_va_start">'<tt>llvm.va_start</tt>' Intrinsic</a>
      <li><a href="#i_va_end"  >'<tt>llvm.va_end</tt>'   Intrinsic</a>
      <li><a href="#i_va_copy" >'<tt>llvm.va_copy</tt>'  Intrinsic</a>
    </ol>
  </ol>

  <p><b>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a> and <A href="mailto:vadve@cs.uiuc.edu">Vikram Adve</a></b><p>


</ol>


<!-- *********************************************************************** -->
<p><table width="100%" bgcolor="#330077" border=0 cellpadding=4 cellspacing=0>
<tr><td align=center><font color="#EEEEFF" size=+2 face="Georgia,Palatino"><b>
<a name="abstract">Abstract
</b></font></td></tr></table><ul>
<!-- *********************************************************************** -->

<blockquote>
  This document is a reference manual for the LLVM assembly language.  LLVM is
  an SSA-based representation that provides type safety, low-level operations,
  flexibility, and the capability of representing 'all' high-level languages
  cleanly.  It is the common code representation used throughout all phases of
  the LLVM compilation strategy.
</blockquote>




<!-- *********************************************************************** -->
</ul><table width="100%" bgcolor="#330077" border=0 cellpadding=4 cellspacing=0>
<tr><td align=center><font color="#EEEEFF" size=+2 face="Georgia,Palatino"><b>
<a name="introduction">Introduction
</b></font></td></tr></table><ul>
<!-- *********************************************************************** -->

The LLVM code representation is designed to be used in three different forms: as
an in-memory compiler IR, as an on-disk bytecode representation (suitable for
fast loading by a Just-In-Time compiler), and as a human readable assembly
language representation.  This allows LLVM to provide a powerful intermediate
representation for efficient compiler transformations and analysis, while
providing a natural means to debug and visualize the transformations.  The three
different forms of LLVM are all equivalent.  This document describes the human
readable representation and notation.<p>

The LLVM representation aims to be light-weight and low-level while being
expressive, typed, and extensible at the same time.  It aims to be a "universal
IR" of sorts, by being at a low enough level that high-level ideas may be
cleanly mapped to it (similar to how microprocessors are "universal IRs",
allowing many source languages to be mapped to them).  By providing type
information, LLVM can be used as the target of optimizations: for example,
through pointer analysis, it can be proven that a C automatic variable is never
accessed outside of the current function, allowing it to be promoted to a
simple SSA value instead of a memory location.<p>

<!-- _______________________________________________________________________ -->
</ul><a name="wellformed"><h4><hr size=0>Well Formedness</h4><ul>

It is important to note that this document describes 'well formed' LLVM assembly
language.  There is a difference between what the parser accepts and what is
considered 'well formed'.  For example, the following instruction is
syntactically okay, but not well formed:<p>

<pre>
  %x = <a href="#i_add">add</a> int 1, %x
</pre>

...because the definition of <tt>%x</tt> does not dominate all of its uses.  The
LLVM infrastructure provides a verification pass that may be used to verify that
an LLVM module is well formed.  This pass is automatically run by the parser
after parsing input assembly, and by the optimizer before it outputs bytecode.
The violations pointed out by the verifier pass indicate bugs in transformation
passes or input to the parser.<p>
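As a rough illustration of the "definition before use" idea, the following
Python sketch checks a single straight-line basic block.  It is not the LLVM
verifier: the function name, the (result, operands) tuple encoding, and the
<tt>live_in</tt> parameter are assumptions made for this example, and real
dominance checking works over the whole control flow graph.<p>

```python
def first_undominated_use(instructions, live_in=()):
    """Return (index, operand) for the first use with no earlier definition.

    `instructions` is a list of (result, operands) pairs describing one
    straight-line basic block (a hypothetical encoding for this sketch).
    `live_in` names values assumed to be defined before the block starts.
    Returns None when every '%' operand is defined before it is used.
    """
    defined = set(live_in)
    for index, (result, operands) in enumerate(instructions):
        for operand in operands:
            # Only '%'-prefixed operands are values; literals like '1' are skipped.
            if operand.startswith('%') and operand not in defined:
                return index, operand
        # The result becomes available only after its own operands are checked.
        if result is not None:
            defined.add(result)
    return None


# The instruction from the text: %x = add int 1, %x
bad_block = [('%x', ['1', '%x'])]
# A block whose operands are all defined earlier (or live on entry).
good_block = [('%y', ['1', '%a']), ('%x', ['%y', '%y'])]

bad_result = first_undominated_use(bad_block)
good_result = first_undominated_use(good_block, live_in=['%a'])
```

Because a result is recorded only after its operands have been checked, the
self-referencing instruction is reported at its own use of <tt>%x</tt>.<p>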

<!-- Describe the typesetting conventions here. -->


<!-- *********************************************************************** -->
</ul><table width="100%" bgcolor="#330077" border=0 cellpadding=4 cellspacing=0>
<tr><td align=center><font color="#EEEEFF" size=+2 face="Georgia,Palatino"><b>
<a name="identifiers">Identifiers
</b></font></td></tr></table><ul>
<!-- *********************************************************************** -->

LLVM uses three different forms of identifiers, for different purposes:<p>

<ol>
<li>Numeric constants are represented as you would expect: 12, -3, 123.421, etc.
Floating point constants have an optional hexadecimal notation.

<li>Named values are represented as a string of characters with a '%' prefix.
For example, %foo, %DivisionByZero, %a.really.long.identifier.  The actual
regular expression used is '<tt>%[a-zA-Z$._][a-zA-Z$._0-9]*</tt>'.  Identifiers
which require other characters in their names can be surrounded with quotes.  In
this way, anything except a <tt>"</tt> character can be used in a name.

<li>Unnamed values are represented as an unsigned numeric value with a '%'
prefix.  For example, %12, %2, %44.
</ol><p>
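The two '%'-prefixed forms can be told apart mechanically.  The Python sketch
below reuses the regular expression quoted above for named values; the pattern
for unnamed values and the <tt>classify</tt> helper are assumptions made for
illustration, and quoted names are deliberately not handled.<p>

```python
import re

# Pattern for named values, copied from the text above.
NAMED = re.compile(r'%[a-zA-Z$._][a-zA-Z$._0-9]*')
# Assumed pattern for unnamed values: '%' followed by an unsigned number.
UNNAMED = re.compile(r'%[0-9]+')


def classify(token):
    """Return 'named', 'unnamed', or None for a candidate identifier.

    This is a hypothetical helper, not part of LLVM.  Quoted names are
    not covered by this sketch.
    """
    if NAMED.fullmatch(token):
        return 'named'
    if UNNAMED.fullmatch(token):
        return 'unnamed'
    return None


examples = ['%foo', '%DivisionByZero', '%a.really.long.identifier', '%12', 'add']
kinds = [classify(token) for token in examples]
```

A reserved word such as <tt>add</tt> has no '%' prefix, so it matches neither
pattern.<p>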

LLVM requires that values start with a '%' sign for two reasons: compilers don't
need to worry about name clashes with reserved words, and the set of reserved
words may be expanded in the future without penalty.  Additionally, unnamed
identifiers allow a compiler to quickly come up with a temporary variable
without having to avoid symbol table conflicts.<p>

Reserved words in LLVM are very similar to reserved words in other languages.
There are keywords for different opcodes ('<tt><a href="#i_add">add</a></tt>',
'<tt><a href="#i_cast">cast</a></tt>', '<tt><a href="#i_ret">ret</a></tt>',
etc...), for primitive type names ('<tt><a href="#t_void">void</a></tt>',
'<tt><a href="#t_uint">uint</a></tt>', etc...), and others.  These reserved
words cannot conflict with variable names, because none of them start with a '%'
character.<p>

Here is an example of LLVM code to multiply the integer variable '<tt>%X</tt>'
by 8:<p>
