Skip to content
  1. Aug 25, 2009
  2. Aug 23, 2009
  3. Aug 22, 2009
  4. Aug 21, 2009
  5. Aug 20, 2009
  6. Aug 19, 2009
  7. Aug 11, 2009
  8. Aug 07, 2009
  9. Aug 06, 2009
  10. Aug 03, 2009
  11. Aug 02, 2009
  12. Aug 01, 2009
  13. Jul 30, 2009
    • Evan Cheng's avatar
      Optimize some common usage patterns of atomic built-ins __sync_add_and_fetch()... · e62288fd
      Evan Cheng authored
      Optimize some common usage patterns of atomic built-ins __sync_add_and_fetch() and __sync_sub_and_fetch. 
      
      When the return value is not used (i.e. only care about the value in the memory), x86 does not have to use add to implement these. Instead, it can use add, sub, inc, dec instructions with the "lock" prefix.
      
      This is currently implemented using a bit of instruction selection trick. The issue is the target independent pattern produces one output and a chain and we want to map it into one that just output a chain. The current trick is to select it into a merge_values with the first definition being an implicit_def. The proper solution is to add new ISD opcodes for the no-output variant. DAG combiner can then transform the node before it gets to target node selection.
      
      Problem #2 is we are adding a whole bunch of x86 atomic instructions when in fact these instructions are identical to the non-lock versions. We need a way to add target specific information to target nodes and have this information carried over to machine instructions. Asm printer (or JIT) can use this information to add the "lock" prefix.
      
      llvm-svn: 77582
      e62288fd
  14. Jul 23, 2009
  15. Jul 14, 2009
  16. Jul 12, 2009
    • Bill Wendling's avatar
      Temporarily revert r75408. It appears to break the Apple-style builds: · 5b76fc03
      Bill Wendling authored
      x86_64-apple-darwin10-gcc -c   -g -O2  -DIN_GCC   -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Wold-style-definition -Wmissing-format-attribute   -mdynamic-no-pic -DHAVE_CONFIG_H -I. -I. -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmgcc42.roots/llvmgcc42~obj/src/gcc -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmgcc42.roots/llvmgcc42~obj/src/gcc/. -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmgcc42.roots/llvmgcc42~obj/src/gcc/../include -I./../intl -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmgcc42.roots/llvmgcc42~obj/src/gcc/../libcpp/include  -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmgcc42.roots/llvmgcc42~obj/src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmCore.roots/llvmCore~dst/Developer/usr/local/include -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmCore.roots/llvmCore~obj/src/include -DENABLE_LLVM -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmCore.roots/llvmCore~dst/Developer/usr/local/include  -D_DEBUG  -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -DLLVM_VERSION_INFO='"9999"' -DBUILD_LLVM_APPLE_STYLE   /Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmgcc42.roots/llvmgcc42~obj/src/gcc/tree-ssa-alias.c -o tree-ssa-alias.o
      /var/tmp//ccJQ2JBT.s:4134:Incorrect register `%rcx' used with `l' suffix
      make[2]: *** [tree-ssa-live.o] Error 1
      make[2]: *** Waiting for unfinished jobs....
      
      llvm-svn: 75412
      5b76fc03
    • Chris Lattner's avatar
      eliminate MOV64r0 in favor of a Pat<> pattern. This is only nontrivial because · 02c4339b
      Chris Lattner authored
      the div lowering code explicitly references it.
      
      llvm-svn: 75408
      02c4339b
    • Chris Lattner's avatar
      fix a bug in my cleanup patch · 48cee9b4
      Chris Lattner authored
      llvm-svn: 75402
      48cee9b4
    • Chris Lattner's avatar
      comment cleanup, reduce nesting. · 4d10f1a6
      Chris Lattner authored
      llvm-svn: 75398
      4d10f1a6
  17. Jul 11, 2009
    • Torok Edwin's avatar
      assert(0) -> LLVM_UNREACHABLE. · 56d06597
      Torok Edwin authored
      Make llvm_unreachable take an optional string, thus moving the cerr<< out of
      line.
      LLVM_UNREACHABLE is now a simple wrapper that makes the message go away for
      NDEBUG builds.
      
      llvm-svn: 75379
      56d06597
  18. Jul 08, 2009
  19. Jun 27, 2009
    • Chris Lattner's avatar
      Reimplement rip-relative addressing in the X86-64 backend. The new · fea81da4
      Chris Lattner authored
      implementation primarily differs from the former in that the asmprinter
      doesn't make a zillion decisions about whether or not something will be
      RIP relative or not.  Instead, those decisions are made by isel lowering
      and propagated through to the asm printer.  To achieve this, we:
      
      1. Represent RIP relative addresses by setting the base of the X86 addr
         mode to X86::RIP.
      2. When ISel Lowering decides that it is safe to use RIP, it lowers to
         X86ISD::WrapperRIP.  When it is unsafe to use RIP, it lowers to
         X86ISD::Wrapper as before.
      3. This removes isRIPRel from X86ISelAddressMode, representing it with
         a basereg of RIP instead.
      4. The addressing mode matching logic in isel is greatly simplified.
      5. The asmprinter is greatly simplified, notably the "NotRIPRel" predicate
         passed through various printoperand routines is gone now.
      6. The various symbol printing routines in asmprinter now no longer infer
         when to emit (%rip), they just print the symbol.
      
      I think this is a big improvement over the previous situation.  It does have
      two small caveats though: 1. I implemented a horrible "no-rip" modifier for
      the inline asm "P" constraint modifier.  This is a short term hack, there is
      a much better, but more involved, solution.  2. I had to xfail an 
      -aggressive-remat testcase because it isn't handling the use of RIP in the
      constant-pool reading instruction.  This specific test is easy to fix without
      -aggressive-remat, which I intend to do next.
      
      llvm-svn: 74372
      fea81da4
  20. Jun 26, 2009
  21. Jun 20, 2009
    • Chris Lattner's avatar
      change TLS_ADDR lowering to lower to a real mem operand, instead of matching as · 7d2b0494
      Chris Lattner authored
      a global with that gets printed with the :mem modifier.  All operands to lea's 
      should be handled with the lea32mem operand kind, and this allows the TLS stuff
      to do this.  There are several better ways to do this, but I went for the minimal
      change since I can't really test this (beyond make check).
      
      This also makes the use of EBX explicit in the operand list in the 32-bit, 
      instead of implicit in the instruction.
      
      llvm-svn: 73834
      7d2b0494
  22. Jun 03, 2009
  23. May 11, 2009
  24. May 08, 2009
  25. Apr 30, 2009
  26. Apr 29, 2009
    • Bill Wendling's avatar
      Second attempt: · 084669a1
      Bill Wendling authored
      Massive check in. This changes the "-fast" flag to "-O#" in llc. If you want to
      use the old behavior, the flag is -O0. This change allows for finer-grained
      control over which optimizations are run at different -O levels.
      
      Most of this work was pretty mechanical. The majority of the fixes came from
      verifying that a "fast" variable wasn't used anymore. The JIT still uses a
      "Fast" flag. I'll change the JIT with a follow-up patch.
      
      llvm-svn: 70343
      084669a1
  27. Apr 28, 2009
  28. Apr 16, 2009
  29. Apr 15, 2009
  30. Apr 13, 2009
    • Dan Gohman's avatar
      Implement x86 h-register extract support. · 57d6bd36
      Dan Gohman authored
       - Add patterns for h-register extract, which avoids a shift and mask,
         and in some cases a temporary register.
       - Add address-mode matching for turning (X>>(8-n))&(255<<n), where
         n is a valid address-mode scale value, into an h-register extract
         and a scaled-offset address.
       - Replace X86's MOV32to32_ and related instructions with the new
         target-independent COPY_TO_SUBREG instruction.
      
      On x86-64 there are complicated constraints on h registers, and
      CodeGen doesn't currently provide a high-level way to express all of them,
      so they are handled with a bunch of special code. This code currently only
      supports extracts where the result is used by a zero-extend or a store,
      though these are fairly common.
      
      These transformations are not always beneficial; since there are only
      4 h registers, they sometimes require extra move instructions, and
      this sometimes increases register pressure because it can force out
      values that would otherwise be in one of those registers. However,
      this appears to be relatively uncommon.
      
      llvm-svn: 68962
      57d6bd36
Loading