


<h5>Semantics:</h5>
Memory is allocated and a pointer to it is returned.  '<tt>alloca</tt>'d memory is automatically released when the method returns.  The '<tt>alloca</tt>' instruction is how variable spills shall be implemented.<p>

<h5>Example:</h5>
<pre>
  %ptr = alloca int                              <i>; yields {int*}:ptr</i>
  %ptr = alloca [int], uint 4                    <i>; yields {[int]*}:ptr</i>
</pre>


<!-- _______________________________________________________________________ -->
</ul><a name="i_load"><h4><hr size=0>'<tt>load</tt>' Instruction</h4><ul>

<h5>Syntax:</h5>
<pre>
  &lt;result&gt; = load &lt;ty&gt;* &lt;pointer&gt;                 <i>; yields {ty}:result</i>
  &lt;result&gt; = load &lt;ty&gt;* &lt;arrayptr&gt;{, uint &lt;idx&gt;}+    <i>; yields {ty}:result</i>
  &lt;result&gt; = load &lt;ty&gt;* &lt;structptr&gt;{, ubyte &lt;idx&gt;}+     <i>; yields field type</i>
</pre>

<h5>Overview:</h5>
The '<tt>load</tt>' instruction is used to read from memory.<p>

<h5>Arguments:</h5>

There are three forms of the '<tt>load</tt>' instruction: one for reading from a general pointer, one for reading from a pointer to an array, and one for reading from a pointer to a structure.<p>

In the first form, '<tt>&lt;ty&gt;*</tt>' must be a pointer to a simple type (a primitive type or another pointer).<p>

In the second form, '<tt>&lt;ty&gt;*</tt>' must be a pointer to an array, and one or more indices are provided to index into the (possibly multidimensional) array.  No bounds checking is performed on array reads.<p>

In the third form, the pointer must point to a (possibly nested) structure.  There shall be one ubyte argument for each level of dereferencing involved.<p>

<h5>Semantics:</h5>
...

<h5>Examples:</h5>
<pre>
  %ptr = <a href="#i_alloca">alloca</a> int                               <i>; yields {int*}:ptr</i>
  <a href="#i_store">store</a> int 3, int* %ptr                          <i>; yields {void}</i>
  %val = load int* %ptr                           <i>; yields {int}:val = int 3</i>

  %array = <a href="#i_malloc">malloc</a> [4 x ubyte]                     <i>; yields {[4 x ubyte]*}:array</i>
  <a href="#i_store">store</a> ubyte 124, [4 x ubyte]* %array, uint 3
  %val   = load [4 x ubyte]* %array, uint 3       <i>; yields {ubyte}:val = ubyte 124</i>
  %val   = load {{int, float}}* %stptr, ubyte 0, ubyte 1   <i>; yields {float}:val</i>
</pre>
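For readers more familiar with C, the three forms correspond roughly to the reads sketched here.  This is only an illustration: the C type and function names are hypothetical, and the mapping of '<tt>int</tt>' to <tt>int32_t</tt> and '<tt>ubyte</tt>' to <tt>uint8_t</tt> is an assumption, not part of this specification.<p>

```c
#include <stdint.h>

/* Hypothetical C stand-ins for the LLVM types used in the examples. */
struct inner { int32_t i; float f; };   /* assumed layout of {int, float}   */
struct outer { struct inner in; };      /* assumed layout of {{int, float}} */

/* First form: read through a pointer to a simple type. */
int32_t load_simple(int32_t *ptr) {
    return *ptr;
}

/* Second form: read one element through a pointer to an array.
   As with 'load', no bounds check is made on idx. */
uint8_t load_array(uint8_t (*array)[4], unsigned idx) {
    return (*array)[idx];
}

/* Third form: one index per level of structure nesting.
   Index 0 selects the inner structure, index 1 selects its float field. */
float load_struct(struct outer *stptr) {
    return stptr->in.f;
}
```

Each function body is a single memory read; the index arguments only select which address is read.<p>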


<!-- _______________________________________________________________________ -->
</ul><a name="i_store"><h4><hr size=0>'<tt>store</tt>' Instruction</h4><ul>

<h5>Syntax:</h5>
<pre>
  store &lt;ty&gt; &lt;value&gt;, &lt;ty&gt;* &lt;pointer&gt;                   <i>; yields {void}</i>
  store &lt;ty&gt; &lt;value&gt;, &lt;ty&gt;* &lt;arrayptr&gt;{, uint &lt;idx&gt;}+   <i>; yields {void}</i>
  store &lt;ty&gt; &lt;value&gt;, &lt;ty&gt;* &lt;structptr&gt;{, ubyte &lt;idx&gt;}+ <i>; yields {void}</i>
</pre>

<h5>Overview:</h5>
The '<tt>store</tt>' instruction is used to write to memory.<p>

<h5>Arguments:</h5>
There are three forms of the '<tt>store</tt>' instruction: one for writing through a general pointer, one for writing through a pointer to a (possibly multidimensional) array, and one for writing to an element of a (potentially nested) structure.<p>

The semantics of this instruction closely match those of the <a href="#i_load">load</a> instruction, except that memory is written to, not read from.<p>

<h5>Semantics:</h5>
...

<h5>Example:</h5>
<pre>
  %ptr = <a href="#i_alloca">alloca</a> int                               <i>; yields {int*}:ptr</i>
  <a href="#i_store">store</a> int 3, int* %ptr                          <i>; yields {void}</i>
  %val = load int* %ptr                           <i>; yields {int}:val = int 3</i>

  %array = <a href="#i_malloc">malloc</a> [4 x ubyte]                     <i>; yields {[4 x ubyte]*}:array</i>
  <a href="#i_store">store</a> ubyte 124, [4 x ubyte]* %array, uint 3
  %val   = load [4 x ubyte]* %array, uint 3       <i>; yields {ubyte}:val = ubyte 124</i>
  %val   = load {{int, float}}* %stptr, ubyte 0, ubyte 1   <i>; yields {float}:val</i>
</pre>


<!-- _______________________________________________________________________ -->
</ul><a name="i_getelementptr"><h4><hr size=0>'<tt>getelementptr</tt>' Instruction</h4><ul>

<h5>Syntax:</h5>
<pre>
  &lt;result&gt; = getelementptr &lt;ty&gt;* &lt;arrayptr&gt;{, uint &lt;idx&gt;}+    <i>; yields {ty*}:result</i>
  &lt;result&gt; = getelementptr &lt;ty&gt;* &lt;structptr&gt;{, ubyte &lt;idx&gt;}+     <i>; yields field type*</i>
</pre>

<h5>Overview:</h5>

'<tt>getelementptr</tt>' performs all of the same work that a '<tt><a href="#i_load">load</a></tt>' instruction does, except for the actual memory fetch.  Instead, '<tt>getelementptr</tt>' simply performs the addressing arithmetic to get to the element in question, and returns the resulting address.  This is useful for indexing into a bimodal structure.<p>

<h5>Arguments:</h5>


<h5>Semantics:</h5>


<h5>Example:</h5>
<pre>
  %aptr = getelementptr {int, [12 x ubyte]}* %sptr, ubyte 1   <i>; yields {[12 x ubyte]*}:aptr</i>
  %ub   = load [12 x ubyte]* %aptr, uint 4                    <i>; yields {ubyte}:ub</i>
</pre>
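The split between address computation and memory fetch can be sketched in C with <tt>offsetof</tt>.  The structure layout, the <tt>bytes12</tt> typedef and the function names are assumptions made for illustration; nothing here is meant as the actual lowering of the instruction.<p>

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed C layout for the LLVM type {int, [12 x ubyte]}. */
struct rec { int32_t i; uint8_t bytes[12]; };

typedef uint8_t bytes12[12];

/* Like 'getelementptr': pure address arithmetic, no memory is read.
   The result points at field 1 of the structure. */
bytes12 *field1_addr(struct rec *sptr) {
    return (bytes12 *)((char *)sptr + offsetof(struct rec, bytes));
}

/* Like the following 'load': the memory fetch happens only here. */
uint8_t load_elem(bytes12 *aptr, unsigned idx) {
    return (*aptr)[idx];
}
```

<tt>field1_addr</tt> never dereferences its argument, which is the property that distinguishes address computation from a load.<p>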


<!-- ======================================================================= -->
</ul><table width="100%" bgcolor="#441188" border=0 cellpadding=4 cellspacing=0><tr><td>&nbsp;</td><td width="100%">&nbsp; <font color="#EEEEFF" face="Georgia,Palatino"><b>
<a name="otherops">Other Operations
</b></font></td></tr></table><ul>

The instructions in this category are the "miscellaneous" instructions that defy better classification.<p>


<!-- _______________________________________________________________________ -->
</ul><a name="i_cast"><h4><hr size=0>'<tt>cast .. to</tt>' Instruction</h4><ul>

<h1>TODO</h1>

<a name="logical_integrals">
  Talk about what is considered true or false for integrals.


<h5>Syntax:</h5>
<pre>
</pre>

<h5>Overview:</h5>


<h5>Arguments:</h5>


<h5>Semantics:</h5>


<h5>Example:</h5>
<pre>
</pre>


<!-- _______________________________________________________________________ -->
</ul><a name="i_call"><h4><hr size=0>'<tt>call</tt>' Instruction</h4><ul>

<h5>Syntax:</h5>
<pre>

</pre>

<h5>Overview:</h5>


<h5>Arguments:</h5>


<h5>Semantics:</h5>


<h5>Example:</h5>
<pre>
  %retval = call int %test(int %argc)
</pre>


<!-- _______________________________________________________________________ -->
</ul><a name="i_icall"><h3><hr size=0>'<tt>icall</tt>' Instruction</h3><ul>

Indirect calls are desperately needed to implement virtual function tables (C++, Java) and function pointers (C, C++, ...).<p>

A new instruction <tt>icall</tt> or similar should be introduced to represent an indirect call.<p>

Example:
<pre>
  %retval = icall int %funcptr(int %arg1)          <i>; yields {int}:%retval</i>
</pre>
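To make the motivation concrete, here is a minimal C sketch of the kind of source construct that needs an indirect call.  The one-slot table and all names in it are hypothetical; a real C++ or Java front end would lay out its tables differently.<p>

```c
/* Two hypothetical callees that share one signature. */
int twice(int x)  { return 2 * x; }
int negate(int x) { return -x; }

/* A one-slot "virtual function table". */
struct vtable { int (*method)(int); };

const struct vtable twice_vtable  = { twice };
const struct vtable negate_vtable = { negate };

/* The callee is not known until run time, so this call cannot be
   expressed as a direct call to a named function. */
int call_indirect(const struct vtable *vt, int arg1) {
    return vt->method(arg1);
}
```

The call inside <tt>call_indirect</tt> goes through a pointer value, which is the situation the proposed instruction is meant to represent.<p>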


<!-- _______________________________________________________________________ -->
</ul><a name="i_phi"><h4><hr size=0>'<tt>phi</tt>' Instruction</h4><ul>

<h5>Syntax:</h5>
<pre>
</pre>

<h5>Overview:</h5>


<h5>Arguments:</h5>


<h5>Semantics:</h5>


<h5>Example:</h5>
<pre>
</pre>


<!-- ======================================================================= -->
</ul><table width="100%" bgcolor="#441188" border=0 cellpadding=4 cellspacing=0><tr><td>&nbsp;</td><td width="100%">&nbsp; <font color="#EEEEFF" face="Georgia,Palatino"><b>
<a name="builtinfunc">Builtin Functions
</b></font></td></tr></table><ul>

<b>Notice:</b> Preliminary idea!<p>

Builtin functions are very similar to normal functions, except they are defined by the implementation.  Invocations of these functions are very similar to method invocations, except that the syntax is a little less verbose.<p>

Builtin functions are useful to implement semi-high level ideas like a '<tt>min</tt>' or '<tt>max</tt>' operation that can have important properties when doing program analysis.  For example:

<ul>
<li>Some optimizations can make use of identities defined over the functions, 
    for example a parallelizing compiler could make use of '<tt>min</tt>' 
    identities to parallelize a loop.
<li>Builtin functions would have polymorphic types, where normal method calls
    may only have a single type.
<li>Builtin functions would be known to not have side effects, simplifying 
    analysis over straight method calls.
<li>The syntax of the builtins is cleaner than the syntax of the 
    '<a href="#i_call"><tt>call</tt></a>' instruction (very minor point).
</ul>
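As an illustration of the '<tt>min</tt>' identity point in the list above, the following C sketch relies on the associativity of <tt>min</tt> (min(a, min(b, c)) equals min(min(a, b), c)) to split a reduction into two independent halves.  The function names are hypothetical, and the two halves are run one after the other here; a parallelizing compiler could schedule them concurrently.<p>

```c
#include <stddef.h>

static int min_int(int a, int b) { return a < b ? a : b; }

/* Sequential reduction over a non-empty array. */
int min_seq(const int *v, size_t n) {
    int m = v[0];
    for (size_t i = 1; i < n; i++)
        m = min_int(m, v[i]);
    return m;
}

/* Same result, computed as two independent partial reductions that are
   combined at the end.  Requires n >= 2 so that both halves are non-empty. */
int min_split(const int *v, size_t n) {
    size_t half = n / 2;
    int lo = min_seq(v, half);
    int hi = min_seq(v + half, n - half);
    return min_int(lo, hi);
}
```

The rewrite from <tt>min_seq</tt> to <tt>min_split</tt> is only valid because the operation is known to be associative and free of side effects, which is the information a builtin would expose.<p>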

Because these invocations are explicit in the representation, the runtime can choose to implement these builtin functions any way that it wants, including:

<ul>
<li>Inlining the code directly into the invocation
<li>Implementing the functions in some sort of Runtime class, and converting the
    invocation to a standard method call.
<li>Implementing the functions in some sort of Runtime class, and performing 
    standard inlining optimizations on it.
</ul>

Note that these builtins do not use quoted identifiers: the name of the builtin effectively becomes an identifier in the language.<p>

Example:
<pre>
  ; Example of a normal method call
  %maximum = call int %maximum(int %arg1, int %arg2)   <i>; yields {int}:%maximum</i>

  ; Examples of potential builtin functions
  %max = max(int %arg1, int %arg2)                     <i>; yields {int}:%max</i>
  %min = min(int %arg1, int %arg2)                     <i>; yields {int}:%min</i>
  %sin = sin(double %arg)                              <i>; yields {double}:%sin</i>
  %cos = cos(double %arg)                              <i>; yields {double}:%cos</i>

  ; Show that builtins are polymorphic, like instructions
  %max = max(float %arg1, float %arg2)                 <i>; yields {float}:%max</i>
  %cos = cos(float %arg)                               <i>; yields {float}:%cos</i>
</pre>

The '<tt>maximum</tt>' vs '<tt>max</tt>' example illustrates the difference in calling semantics between a '<a href="#i_call"><tt>call</tt></a>' instruction and a builtin function invocation.  Notice that the '<tt>maximum</tt>' example assumes that the method is defined local to the caller.<p>


<!-- *********************************************************************** -->
</ul><table width="100%" bgcolor="#330077" border=0 cellpadding=4 cellspacing=0><tr><td align=center><font color="#EEEEFF" size=+2 face="Georgia,Palatino"><b>
<a name="todo">TODO List
</b></font></td></tr></table><ul>
<!-- *********************************************************************** -->

This list of random topics includes things that will <b>need</b> to be addressed before LLVM can be used to implement a Java-like language.  Right now, it is pretty much useless for any language, given the unavailability of structure types.<p>

<!-- _______________________________________________________________________ -->
</ul><a name="synchronization"><h3><hr size=0>Synchronization Instructions</h3><ul>

We will need some type of synchronization instructions to be able to implement Java well.  The way I currently envision doing this is to introduce a '<tt>lock</tt>' type, and then add two operations (either builtins or instructions) to lock and unlock the lock.<p>
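As a rough picture of what a lock type with two operations could look like, here is a C11 sketch built on <tt>atomic_flag</tt>.  The <tt>simple_lock</tt> type and both function names are hypothetical, and this is a plain spin lock: it does not model Java monitor features such as reentrancy or waiting.<p>

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the proposed 'lock' type. */
struct simple_lock { atomic_flag held; };

/* Lock: spin until the flag was previously clear, then own it. */
void lock_acquire(struct simple_lock *l) {
    while (atomic_flag_test_and_set(&l->held)) {
        /* busy-wait until the current holder releases the lock */
    }
}

/* Unlock: clear the flag so another thread can acquire the lock. */
void lock_release(struct simple_lock *l) {
    atomic_flag_clear(&l->held);
}
```

A lock object would be initialized with <tt>ATOMIC_FLAG_INIT</tt> before first use.<p>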


<!-- *********************************************************************** -->
</ul><table width="100%" bgcolor="#330077" border=0 cellpadding=4 cellspacing=0><tr><td align=center><font color="#EEEEFF" size=+2 face="Georgia,Palatino"><b>
<a name="extensions">Possible Extensions
</b></font></td></tr></table><ul>
<!-- *********************************************************************** -->

These extensions are distinct from the TODO list, as they are mostly "interesting" ideas that could be implemented in the future by someone so motivated.  They are not directly required to get <a href="#rw_java">Java</a>-like languages working.<p>

<!-- _______________________________________________________________________ -->
</ul><a name="i_tailcall"><h3><hr size=0>'<tt>tailcall</tt>' Instruction</h3><ul>

This could be useful.  Who knows.  '.net' does it, but is the optimization really worth the extra hassle?  Using strong typing would make this trivial to implement, and a runtime could always fall back to downconverting this to a normal '<a href="#i_call"><tt>call</tt></a>' instruction.<p>
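For context, a tail call is a call that is the last action of a method, so the caller's frame is no longer needed once the callee starts.  The C sketch below, with hypothetical function names, shows a tail-recursive routine next to the loop that a tail call optimization would effectively turn it into.<p>

```c
/* Tail-recursive form: the recursive call is the last thing the
   function does, so its result is returned unchanged. */
unsigned gcd_tail(unsigned a, unsigned b) {
    if (b == 0)
        return a;
    return gcd_tail(b, a % b);
}

/* Loop form: the same computation with the call replaced by a jump,
   so the stack does not grow with the number of steps. */
unsigned gcd_loop(unsigned a, unsigned b) {
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}
```

Both functions compute the same result; only the stack behavior differs, which is why falling back to a normal call remains correct.<p>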


<!-- _______________________________________________________________________ -->
</ul><a name="globalvars"><h3><hr size=0>Global Variables</h3><ul>

In order to represent programs written in languages like C, we need to be able to support variables at the module (global) scope.  Perhaps they should be written outside of the module definition even.  Maybe global functions should be handled like this as well.<p>


<!-- _______________________________________________________________________ -->
</ul><a name="explicitparrellelism"><h3><hr size=0>Explicit Parallelism</h3><ul>

With the rise of massively parallel architectures (like <a href="#rw_ia64">the IA64 architecture</a>, multithreaded CPU cores, and SIMD instruction sets), it is becoming increasingly important to extract as much ILP as possible from a code stream.  It would be interesting to research encoding methods that can explicitly represent this.  One straightforward way to do this would be to introduce a "stop" instruction that is equivalent to the IA64 stop bit.<p>


<!-- *********************************************************************** -->
</ul><table width="100%" bgcolor="#330077" border=0 cellpadding=4 cellspacing=0><tr><td align=center><font color="#EEEEFF" size=+2 face="Georgia,Palatino"><b>
<a name="related">Related Work
</b></font></td></tr></table><ul>
<!-- *********************************************************************** -->


Codesigned virtual machines.<p>

<dl>
<a name="rw_safetsa">
<dt>SafeTSA
<DD>Description here<p>

<a name="rw_java">
<dt><a href="http://www.javasoft.com">Java</a>
<DD>Description here<p>

<a name="rw_net">
<dt><a href="http://www.microsoft.com/net">Microsoft .net</a>
<DD>Description here<p>

<a name="rw_gccrtl">
<dt><a href="http://www.math.umn.edu/systems_guide/gcc-2.95.1/gcc_15.html">GNU RTL Intermediate Representation</a>
<DD>Description here<p>

<a name="rw_ia64">
<dt><a href="http://developer.intel.com/design/ia-64/index.htm">IA64 Architecture &amp; Instruction Set</a>
<DD>Description here<p>

<a name="rw_mmix">
<dt><a href="http://www-cs-faculty.stanford.edu/~knuth/mmix-news.html">MMIX Instruction Set</a>
<DD>Description here<p>

<a name="rw_stroustrup">
<dt><a href="http://www.research.att.com/~bs/devXinterview.html">"Interview With Bjarne Stroustrup"</a>
<DD>This interview influenced the design and thought process behind LLVM in several ways, most notably the way that derived types are written in text format. See the question that starts with "you defined the C declarator syntax as an experiment that failed".<p>
</dl>

<!-- _______________________________________________________________________ -->
</ul><a name="rw_vectorization"><h3><hr size=0>Vectorized Architectures</h3><ul>

<dl>
<a name="rw_intel_simd">
<dt>Intel MMX, MMX2, SSE, SSE2
<DD>Description here<p>

<a name="rw_amd_simd">
<dt><a href="http://www.nondot.org/~sabre/os/H1ChipFeatures/3DNow!TechnologyManual.pdf">AMD 3Dnow!, 3Dnow! 2</a>
<DD>Description here<p>

<a name="rw_sun_simd">
<dt><a href="http://www.nondot.org/~sabre/os/H1ChipFeatures/VISInstructionSetUsersManual.pdf">Sun VIS ISA</a>
<DD>Description here<p>


</dl>

more...

<!-- *********************************************************************** -->
</ul>
<!-- *********************************************************************** -->


<hr>
<font size=-1>
<address><a href="mailto:sabre@nondot.org">Chris Lattner</a></address>
<!-- Created: Tue Jan 23 15:19:28 CST 2001 -->
<!-- hhmts start -->
Last modified: Sun Jul  8 19:25:56 CDT 2001
<!-- hhmts end -->
</font>
</body></html>