Commits · 51afe6397b07ff9a20e918be046e12d6d0e96507 · Roger Ferrer / llvm-epi-0.8

Jun 28, 2012

Whitespace. · 51afe639
Chad Rosier authored Jun 27, 2012
```
llvm-svn: 159300
```
51afe639

The ELF relocation record format is different for N64 · 8ad0c272

Jack Carter authored Jun 27, 2012

which many Mips 64 ABIs use, than for O64, which many
if not all other target ABIs use.

Most architectures have the following 64 bit relocation record format:

  typedef struct
  {
    Elf64_Addr   r_offset; /* Address of reference */
    Elf64_Xword  r_info;   /* Symbol index and type of relocation */
  } Elf64_Rel;

  typedef struct
  {
    Elf64_Addr    r_offset;
    Elf64_Xword   r_info;
    Elf64_Sxword  r_addend;
  } Elf64_Rela;

Whereas N64 has the following format:

  typedef struct
  {
    Elf64_Addr    r_offset;/* Address of reference */
    Elf64_Word  r_sym;     /* Symbol index */
    Elf64_Byte  r_ssym;    /* Special symbol */
    Elf64_Byte  r_type3;   /* Relocation type */
    Elf64_Byte  r_type2;   /* Relocation type */
    Elf64_Byte  r_type;    /* Relocation type */
  } Elf64_Rel;

  typedef struct
  {
    Elf64_Addr    r_offset;/* Address of reference */
    Elf64_Word  r_sym;     /* Symbol index */
    Elf64_Byte  r_ssym;    /* Special symbol */
    Elf64_Byte  r_type3;   /* Relocation type */
    Elf64_Byte  r_type2;   /* Relocation type */
    Elf64_Byte  r_type;    /* Relocation type */
    Elf64_Sxword  r_addend;
  } Elf64_Rela;

The structure is the same size, but the r_info data element 
is now 5 separate elements. Besides the content aspects, 
endian byte reordering will be different for the area with 
each element being endianized separately.

I treat this as generic and continue to pass r_type as
an integer, masking and unmasking the byte-sized N64
values for N64 mode. I've implemented this and it has no
effect on other current targets.
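
The masking and unmasking described above can be sketched roughly as follows. This is only an illustration, not the code from this commit: the struct `N64RelInfo`, the shift constants, and the helper names are hypothetical, and the bit position chosen for each byte inside the packed integer is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical view of the five N64 fields that replace r_info.
struct N64RelInfo {
  uint32_t Sym;  // r_sym
  uint8_t SSym;  // r_ssym
  uint8_t Type3; // r_type3
  uint8_t Type2; // r_type2
  uint8_t Type;  // r_type
};

// Assumed bit positions of each byte inside the packed integer.
constexpr unsigned TypeShift = 0;
constexpr unsigned Type2Shift = 8;
constexpr unsigned Type3Shift = 16;
constexpr unsigned SSymShift = 24;

// Pack the four byte-sized fields into one integer so that generic code can
// keep passing a single "relocation type" value around.
uint32_t packTypes(const N64RelInfo &R) {
  return (uint32_t(R.Type) << TypeShift) | (uint32_t(R.Type2) << Type2Shift) |
         (uint32_t(R.Type3) << Type3Shift) | (uint32_t(R.SSym) << SSymShift);
}

// Recover the byte-sized fields from the packed integer by masking.
N64RelInfo unpackTypes(uint32_t Sym, uint32_t Packed) {
  return {Sym, uint8_t((Packed >> SSymShift) & 0xff),
          uint8_t((Packed >> Type3Shift) & 0xff),
          uint8_t((Packed >> Type2Shift) & 0xff),
          uint8_t((Packed >> TypeShift) & 0xff)};
}

// Emit the 8-byte r_info area field by field. Only r_sym is wider than one
// byte, so it is the only field whose bytes depend on the target endianness.
std::vector<uint8_t> emitInfo(const N64RelInfo &R, bool LittleEndian) {
  std::vector<uint8_t> Out;
  for (unsigned I = 0; I < 4; ++I) {
    unsigned Shift = LittleEndian ? 8 * I : 8 * (3 - I);
    Out.push_back(uint8_t((R.Sym >> Shift) & 0xff));
  }
  Out.push_back(R.SSym);
  Out.push_back(R.Type3);
  Out.push_back(R.Type2);
  Out.push_back(R.Type);
  assert(Out.size() == 8 && "r_info area must stay 8 bytes");
  return Out;
}
```

Emitting each field separately is what makes the byte order differ from a single 64-bit r_info: a little-endian writer reverses only the four bytes of r_sym and leaves the four single-byte fields in place.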

This passes make check.

Jack

llvm-svn: 159299

8ad0c272

Jun 27, 2012

Revert r159136 due to PR13124. · a5886231

Matt Beaumont-Gay authored Jun 27, 2012

Original commit message:

If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it
hidden. Being linkonce_odr guarantees that it is available in every dso that
needs it. Being a constant/function with unnamed_addr guarantees that the
copies don't have to be merged.

llvm-svn: 159272

a5886231

Some reassociate optimizations create new instructions, which they insert just · 514db117

Duncan Sands authored Jun 27, 2012

before the expression root. Any existing operators that are changed to use one
of them need to be moved between it and the expression root, and recursively
for the operators using that one. When I rewrote RewriteExprTree I accidentally
inverted the logic, resulting in the compacting going down from operators to
operands rather than up from operands to the operators using them, oops. Fix
this, resolving PR12963.

llvm-svn: 159265

514db117

Teach assembler to handle capitalised operation values for DSB instructions · 57b7d16e
Richard Barton authored Jun 27, 2012
```
llvm-svn: 159259
```
57b7d16e
Prevent ARM Assembler crashing on unrecognised assembly format for DSB instruction · 4b7558ef
Richard Barton authored Jun 27, 2012
```
llvm-svn: 159257
```
4b7558ef
Silence uninitialized variable warning in MipsISelDAGToDAG.cpp. · d030738b
Akira Hatanaka authored Jun 27, 2012
```
llvm-svn: 159243
```
d030738b
Fix bug in computation of stack size in MipsFrameLowering.cpp. · 62871a34
Akira Hatanaka authored Jun 27, 2012
```
llvm-svn: 159240
```
62871a34
Reduce indentation in function. Rearrange some methods. No functionality change. · 3b70d784
Bill Wendling authored Jun 26, 2012
```
llvm-svn: 159239
```
3b70d784

Revamp how debugging information is emitted for debug info objects. · e02a1f8c

Bill Wendling authored Jun 26, 2012

It's not necessary for each DI class to have its own copy of `print' and
`dump'. Instead, just give DIDescriptor those methods and have it call the
appropriate debugging printing routine based on the type of the debug
information.

llvm-svn: 159237

e02a1f8c

Add a missing check to avoid dereferencing null. No sensible test case possible.... · a7512787
Evan Cheng authored Jun 26, 2012
```
Add a missing check to avoid dereferencing null. No sensible test case possible. Sorry. rdar://11745134

llvm-svn: 159236
```
a7512787

Remove an instcombine transform that (no longer?) makes sense: · 319be53a

Evan Cheng authored Jun 26, 2012

    // C - zext(bool) -> bool ? C - 1 : C
    if (ZExtInst *ZI = dyn_cast<ZExtInst>(Op1))
      if (ZI->getSrcTy()->isIntegerTy(1))
        return SelectInst::Create(ZI->getOperand(0), SubOne(C), C);

This ends up forming sext i1 instructions that codegen to terrible code. e.g.
int blah(_Bool x, _Bool y) {
  return (x - y) + 1;
}
=>
        movzbl  %dil, %eax
        movzbl  %sil, %ecx
        shll    $31, %ecx
        sarl    $31, %ecx
        leal    1(%rax,%rcx), %eax
        ret


Without the rule, llvm now generates:
        movzbl  %sil, %ecx
        movzbl  %dil, %eax
        incl    %eax
        subl    %ecx, %eax
        ret

It also helps with ARM (and pretty much any target that doesn't have a sext i1 :-).

The transformation was done as part of Eli's r75531. He has given the ok to
remove it.
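
For reference, the two forms compute the same value, so removing the rule changes only the generated code, not the result. A small sketch of the equivalence, with hypothetical function names and plain `bool` standing in for `_Bool`:

```cpp
#include <cassert>
#include <cstdint>

// Form the removed rule produced: select between C - 1 and C.
// (C is assumed not to be INT32_MIN, to keep C - 1 free of overflow.)
int32_t viaSelect(int32_t C, bool B) { return B ? C - 1 : C; }

// Plain form: zero-extend the bool and subtract it.
int32_t viaSub(int32_t C, bool B) { return C - int32_t(B); }

// The function from the commit message, written with bool instead of _Bool.
int blah(bool X, bool Y) { return (X - Y) + 1; }

// Check both forms for both values of the bool operand.
bool formsAgree(int32_t C) {
  return viaSelect(C, false) == viaSub(C, false) &&
         viaSelect(C, true) == viaSub(C, true);
}
```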

rdar://11748024

llvm-svn: 159230

319be53a

Jun 26, 2012

Implement getHostCPUName for ARM/linux. This will be used to implement -march=native in clang. · efe40286

Benjamin Kramer authored Jun 26, 2012

The cpuid registers are only available in privileged mode so we don't have
an OS-independent way of implementing this. ARM doesn't provide a list of
processor IDs so the list is somewhat incomplete.

llvm-svn: 159228

efe40286

X86: add GATHER intrinsics (AVX2) in LLVM · a0982041

Manman Ren authored Jun 26, 2012

Support the following intrinsics:
llvm.x86.avx2.gather.d.pd, llvm.x86.avx2.gather.q.pd
llvm.x86.avx2.gather.d.pd.256, llvm.x86.avx2.gather.q.pd.256
llvm.x86.avx2.gather.d.ps, llvm.x86.avx2.gather.q.ps
llvm.x86.avx2.gather.d.ps.256, llvm.x86.avx2.gather.q.ps.256

Modified Disassembler to handle VSIB addressing mode.

llvm-svn: 159221

a0982041

Fix ThreadLocalImpl::getInstance for --disable-threads. · 46785f94
Argyrios Kyrtzidis authored Jun 26, 2012
```
PR13114.

llvm-svn: 159210
```
46785f94

Allow targets to inject passes before the virtual register rewriter. · 59a0d324

Jakob Stoklund Olesen authored Jun 26, 2012

Such passes can be used to tweak the register assignments in a
target-dependent way, for example to avoid write-after-write
dependencies.

llvm-svn: 159209

59a0d324

There are a number of generic inline asm operand modifiers that · 5e69cffe

Jack Carter authored Jun 26, 2012

up to r158925 were handled as processor specific. Making them 
generic and putting tests for these modifiers in the CodeGen/Generic
directory caused a number of targets to fail. 

This commit addresses that problem by having the targets call 
the generic routine for generic modifiers that they don't currently
have explicit code for.

For now only generic print operands 'c' and 'n' are supported.


Affected files:

    test/CodeGen/Generic/asm-large-immediate.ll
    lib/Target/PowerPC/PPCAsmPrinter.cpp
    lib/Target/NVPTX/NVPTXAsmPrinter.cpp
    lib/Target/ARM/ARMAsmPrinter.cpp
    lib/Target/XCore/XCoreAsmPrinter.cpp
    lib/Target/X86/X86AsmPrinter.cpp
    lib/Target/Hexagon/HexagonAsmPrinter.cpp
    lib/Target/CellSPU/SPUAsmPrinter.cpp
    lib/Target/Sparc/SparcAsmPrinter.cpp
    lib/Target/MBlaze/MBlazeAsmPrinter.cpp
    lib/Target/Mips/MipsAsmPrinter.cpp
    
MSP430 isn't represented because it did not even run with
the long existing 'c' modifier and it was not apparent what
needs to be done to get it inline asm ready.

Contributor: Jack Carter
llvm-svn: 159203

5e69cffe

Replacing zero-sized alloca's with a null pointer is too aggressive, instead · 8bc764ae

Duncan Sands authored Jun 26, 2012

merge all zero-sized alloca's into one, fixing c43204g from the Ada ACATS
conformance testsuite. What happened there was that a variable sized object
was being allocated on the stack, "alloca i8, i32 %size". It was then being
passed to another function, which tested that the address was not null (raising
an exception if it was) then manipulated %size bytes in it (load and/or store).
The optimizers cleverly managed to deduce that %size was zero (congratulations
to them, as it isn't at all obvious), which made the alloca zero size, causing
the optimizers to replace it with null, which then caused the check mentioned
above to fail, and the exception to be raised, wrongly. Note that no loads
and stores were actually being done to the alloca (the loop that does them is
executed %size times, i.e. is not executed), only the not-null address check.

llvm-svn: 159202

8bc764ae

Removed unused variable · 863d2d32
Elena Demikhovsky authored Jun 26, 2012
```
llvm-svn: 159197
```
863d2d32
Rename to match other X86_64* names. · 8ed44466
Bill Wendling authored Jun 26, 2012
```
llvm-svn: 159196
```
8ed44466

Shuffle optimization for AVX/AVX2. · 26088d2e

Elena Demikhovsky authored Jun 26, 2012

The current patch optimizes frequently used shuffle patterns and yields the following instruction sequence reduction.
Before:
      vshufps $-35, %xmm1, %xmm0, %xmm2 ## xmm2 = xmm0[1,3],xmm1[1,3]
      vpermilps       $-40, %xmm2, %xmm2 ## xmm2 = xmm2[0,2,1,3]
      vextractf128    $1, %ymm1, %xmm1
      vextractf128    $1, %ymm0, %xmm0
      vshufps $-35, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[1,3],xmm1[1,3]
      vpermilps       $-40, %xmm0, %xmm0 ## xmm0 = xmm0[0,2,1,3]
      vinsertf128     $1, %xmm0, %ymm2, %ymm0
After:
      vshufps $13, %ymm0, %ymm1, %ymm1 ## ymm1 = ymm1[1,3],ymm0[0,0],ymm1[5,7],ymm0[4,4]
      vshufps $13, %ymm0, %ymm0, %ymm0 ## ymm0 = ymm0[1,3,0,0,5,7,4,4]
      vunpcklps       %ymm1, %ymm0, %ymm0 ## ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]

llvm-svn: 159188

26088d2e

Update a bunch of stale comments that dated from when this followed the · 9139f44d

Chandler Carruth authored Jun 26, 2012

very first (and worst) placement algorithm. These should now more
accurately reflect the reality of the pass.

llvm-svn: 159185

9139f44d

Remove some duplicate instructions that exist only to give different... · 94bf0f38

Craig Topper authored Jun 26, 2012

Remove some duplicate instructions that exist only to give different mnemonics for the assembler. Use InstAlias instead.

llvm-svn: 159184

94bf0f38

Enable the new LoopInfo algorithm by default. · fb2ba3e1

Andrew Trick authored Jun 26, 2012

The primary advantage is that loop optimizations will be applied in a
stable order. This helps debugging and unit test creation. It is also
a better overall implementation without pathologically bad performance
on deep functions.

On large functions (llvm-stress --size=200000 | opt -loops)
Before: 0.1263s
After:  0.0225s

On deep functions (after tweaking llvm-stress, thanks Nadav):
Before: 0.2281s
After:  0.0227s

See r158790 for more comments.

The loop tree is now consistently generated in forward order, but loop
passes are applied in reverse order over the program. If we have a
loop optimization that prefers forward order, that can easily be
achieved by adding a different type of LoopPassManager.

llvm-svn: 159183

fb2ba3e1

Remove unnecessary FIXME · fecf9379
Andrew Trick authored Jun 26, 2012
```
llvm-svn: 159182
```
fecf9379

Make sure type is not extended or untyped before creating a constant of the... · 4c6f917d

Evan Cheng authored Jun 26, 2012

Make sure type is not extended or untyped before creating a constant of the type. No test case. Found by inspection.

llvm-svn: 159179

4c6f917d

Make some ugly hacks for inline asm operands which name a specific register a... · bbcd09cc
Eli Friedman authored Jun 25, 2012
```
Make some ugly hacks for inline asm operands which name a specific register a bit more thorough.  PR13196.

llvm-svn: 159176
```
bbcd09cc
revert my previous commit (r159173), since as Eli pointed out, it's perfectly... · 31b54a53
Nuno Lopes authored Jun 25, 2012
```
revert my previous commit (r159173), since as Eli pointed out, it's perfectly ok to mark realloc as noalias

llvm-svn: 159175
```
31b54a53

do not set realloc() as NotAlias, since it can return the same pointer. This... · 75eaa72d

Nuno Lopes authored Jun 25, 2012

do not set realloc() as NotAlias, since it can return the same pointer. This whole thing should be upgraded to use the MemoryBuiltin interface anyway.

llvm-svn: 159173

75eaa72d

Jun 25, 2012

ARM: update peephole optimization. · 606953fb

Manman Ren authored Jun 25, 2012

More condition codes are included when deciding whether to remove cmp after
a sub instruction. Specifically, we extend from GE|LT|GT|LE to 
GE|LT|GT|LE|HS|LS|HI|LO|EQ|NE. If we have "sub a, b; cmp b, a; movhs", we
should be able to replace with "sub a, b; movls".
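
The rewrite from "movhs" to "movls" follows from swapping the compare operands: a condition tested after "cmp b, a" has to be restated for the flags of "sub a, b". A minimal sketch of that mapping, using a hypothetical `CondCode` enum and helper name rather than the backend's own types:

```cpp
#include <cassert>

// Hypothetical condition-code enum covering the codes named above.
enum class CondCode { EQ, NE, HS, LO, HI, LS, GE, LT, GT, LE };

// Given a condition evaluated on the flags of "cmp b, a", return the
// condition that is equivalent on the flags of "cmp a, b" (or "sub a, b").
CondCode swapOperands(CondCode CC) {
  switch (CC) {
  case CondCode::EQ: return CondCode::EQ; // equality is symmetric
  case CondCode::NE: return CondCode::NE;
  case CondCode::HS: return CondCode::LS; // unsigned >= becomes unsigned <=
  case CondCode::LS: return CondCode::HS;
  case CondCode::HI: return CondCode::LO; // unsigned >  becomes unsigned <
  case CondCode::LO: return CondCode::HI;
  case CondCode::GE: return CondCode::LE; // signed >=   becomes signed <=
  case CondCode::LE: return CondCode::GE;
  case CondCode::GT: return CondCode::LT; // signed >    becomes signed <
  case CondCode::LT: return CondCode::GT;
  }
  assert(false && "unhandled condition code");
  return CC;
}
```

Applying the mapping twice returns the original condition, which is a quick sanity check on the table.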

rdar: 11725965
llvm-svn: 159166

606953fb

Fix the objc_autoreleasedReturnValue optimization code to locate · 5f725cd1
Dan Gohman authored Jun 25, 2012
```
the call correctly even in the case where it is an invoke. This
fixes rdar://11714057.

llvm-svn: 159157
```
5f725cd1

Enforce stricter liveness rules for PHIs. · a57fc12e

Jakob Stoklund Olesen authored Jun 25, 2012

Verify that all paths from the entry block to a virtual register read
pass through a def. Enable this check even when MRI->isSSA() is false.

Verify that the live range of a virtual register is live out of all
predecessor blocks, even for PHI-values.

This requires that PHIElimination sometimes insert IMPLICIT_DEF
instructions in predecessor blocks.

llvm-svn: 159150

a57fc12e

Run ProcessImplicitDefs on SSA form where it can be much simpler. · eb495664

Jakob Stoklund Olesen authored Jun 25, 2012

Implicitly defined virtual registers can simply have the <undef> bit set
on all uses, and copies can be turned into implicit defs recursively.

Physical registers are a bit trickier. We handle the common case where a
physreg def is used by a nearby instruction in the same basic block. For
more complicated cases, just leave the IMPLICIT_DEF instruction in.

llvm-svn: 159149

eb495664

improve optimization of invoke instructions: · 07594cba

Nuno Lopes authored Jun 25, 2012

 - simplifycfg:  invoke undef/null -> unreachable
 - instcombine:  invoke new  -> invoke expect(0, 0)  (an arbitrary NOOP intrinsic;  only done if the allocated memory is unused, of course)
 - verifier:  allow invoke of intrinsics  (to make the previous step work)

llvm-svn: 159146

07594cba

check for the NoAlias attribute through CallSite · 9ecc8761
Nuno Lopes authored Jun 25, 2012
```
llvm-svn: 159145
```
9ecc8761

PR13013: ELF Type identification fails for MSB type ELF files. · fc2fb711

Meador Inge authored Jun 25, 2012

Fix 'sys::IdentifyFileType' to work with big and little endian byte orderings
when reading the ELF object file type.
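
As an illustration of the idea (not the patched code itself), an endian-aware read of the ELF type field might look like the sketch below. The function name `readELFType` and the constant names are hypothetical; the offsets and encoding values are taken from the ELF header layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Positions and values from the ELF header layout.
constexpr size_t DataEncodingOffset = 5; // e_ident[EI_DATA]
constexpr uint8_t DataLSB = 1;           // ELFDATA2LSB, little endian
constexpr uint8_t DataMSB = 2;           // ELFDATA2MSB, big endian
constexpr size_t ETypeOffset = 16;       // e_type follows the 16-byte e_ident

// Returns e_type, or -1 if the buffer is too short, lacks the ELF magic,
// or declares an unknown data encoding.
int readELFType(const uint8_t *Buf, size_t Len) {
  if (Len < ETypeOffset + 2)
    return -1;
  if (Buf[0] != 0x7f || Buf[1] != 'E' || Buf[2] != 'L' || Buf[3] != 'F')
    return -1;
  switch (Buf[DataEncodingOffset]) {
  case DataLSB: // low byte first
    return Buf[ETypeOffset] | (Buf[ETypeOffset + 1] << 8);
  case DataMSB: // high byte first
    return (Buf[ETypeOffset] << 8) | Buf[ETypeOffset + 1];
  default:
    return -1;
  }
}
```

Reading only the first byte of the field, or always assuming one byte order, gives the wrong type for files in the other encoding.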

Initial patch by Stefan Hepp.

llvm-svn: 159138

fc2fb711

If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it · 540c3d23

Rafael Espindola authored Jun 25, 2012

hidden. Being linkonce_odr guarantees that it is available in every dso that
needs it. Being a constant/function with unnamed_addr guarantees that the
copies don't have to be merged.

llvm-svn: 159136

540c3d23

The name (and comment describing) of llvm::GetFirstDebuigLocInBasicBlock no... · f0ad3606

Eli Bendersky authored Jun 25, 2012

The name (and comment describing) of llvm::GetFirstDebuigLocInBasicBlock no longer represents what the function does. Therefore, the function is removed and its functionality is folded into the only place in the code-base where it was being used.

llvm-svn: 159133

f0ad3606

Add SSE2 predicate to CVTPS2PD instructions. Doesn't matter much because there... · 357de815

Craig Topper authored Jun 25, 2012

Add SSE2 predicate to CVTPS2PD instructions. Doesn't matter much because there are no patterns in the instruction.

llvm-svn: 159127

357de815

Remove codegen only instruction in favor of one that has the same definition.... · b6eb513c

Craig Topper authored Jun 25, 2012

Remove codegen only instruction in favor of one that has the same definition. Make some pattern operands more explicit about types.

llvm-svn: 159126

b6eb513c