Commits · d9bbdfd3cc9584ae4c2fc1ba5f5699a7ad2d0cf7 · Roger Ferrer / llvm-epi-0.8

Apr 03, 2013

Add 64-bit compare + branch for SPARC v9. · d9bbdfd3

Jakob Stoklund Olesen authored Apr 03, 2013

The same compare instruction is used for 32-bit and 64-bit compares. It
sets two different sets of flags: icc and xcc.

This patch adds a conditional branch instruction using the xcc flags for
64-bit compares.

llvm-svn: 178621

d9bbdfd3
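
The icc/xcc split described above can be sketched as a small model. This is only an illustration, assuming the usual subtract-and-set-flags behaviour; the `CondCodes`, `Flags`, `compare` and `lessThan` names are hypothetical and do not come from the patch.

```cpp
#include <cstdint>

// Hypothetical model of the two condition-code sets; not LLVM or hardware code.
struct CondCodes {
  bool n, z, v, c; // negative, zero, overflow, carry (borrow for a subtract)
};

struct Flags {
  CondCodes icc; // derived from the low 32 bits of the result
  CondCodes xcc; // derived from the full 64-bit result
};

// One compare (a subtract that only sets flags) fills in both sets at once.
Flags compare(uint64_t a, uint64_t b) {
  uint64_t r = a - b;
  Flags f;
  f.xcc.n = (r >> 63) & 1;
  f.xcc.z = (r == 0);
  f.xcc.v = (((a ^ b) & (a ^ r)) >> 63) & 1;
  f.xcc.c = (a < b);

  uint32_t a32 = (uint32_t)a, b32 = (uint32_t)b, r32 = (uint32_t)r;
  f.icc.n = (r32 >> 31) & 1;
  f.icc.z = (r32 == 0);
  f.icc.v = (((a32 ^ b32) & (a32 ^ r32)) >> 31) & 1;
  f.icc.c = (a32 < b32);
  return f;
}

// Signed "less than" as a conditional branch would test it: N xor V.
bool lessThan(const CondCodes &cc) { return cc.n != cc.v; }
```

With `a = 0x100000000` and `b = 1`, the xcc view says `a` is not less than `b`, while the icc view (which only sees `0` and `1`) says it is. That is why a 64-bit compare needs a branch that reads xcc.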

Remove some unsupported-feature comments from PPC.td · b00fc876
Hal Finkel authored Apr 03, 2013
```
These refer to the reciprocal estimate support recently committed.

llvm-svn: 178618
```
b00fc876

Use PPC reciprocal estimates with Newton iteration in fast-math mode · 2e103310

Hal Finkel authored Apr 03, 2013

When unsafe FP math operations are enabled, we can use the fre[s] and
frsqrte[s] instructions, which generate reciprocal (sqrt) estimates, together
with some Newton iteration, in order to quickly generate floating-point
division and sqrt results. All of these instructions are separately optional,
and so each has its own feature flag (except for the Altivec instructions,
which are covered under the existing Altivec flag). Doing this is not only
faster than using the IEEE-compliant fdiv/fsqrt instructions, but allows these
computations to be pipelined with other computations in order to hide their
overall latency.

I've also added a couple of fnmsub patterns which turned out to be
missing (but are necessary for good code generation of the Newton iterations).
Altivec needs a similar fix, but that will probably be more complicated because
fneg is expanded for Altivec's v4f32.

llvm-svn: 178617

2e103310
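
The estimate-plus-Newton scheme can be sketched in plain C++. This is a minimal illustration, not the code the patch generates: `roughReciprocal` is a hypothetical software stand-in for a hardware estimate, and the step count is an assumption chosen for this rough starting guess.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a hardware reciprocal estimate. It uses a linear
// approximation of 1/m on [0.5, 1), which is accurate to about 1 part in 17.
double roughReciprocal(double b) {
  assert(b > 0.0 && "sketch handles positive, finite inputs only");
  int exp;
  double m = std::frexp(b, &exp); // b = m * 2^exp with 0.5 <= m < 1
  double est = 48.0 / 17.0 - (32.0 / 17.0) * m;
  return std::ldexp(est, -exp);
}

// One Newton step for f(x) = 1/x - b. The residual 1 - b*x has the shape of a
// negated multiply-subtract, which is where fnmsub-style patterns matter.
double refineReciprocal(double b, double x) {
  double r = std::fma(-b, x, 1.0); // 1 - b*x
  return std::fma(x, r, x);        // x + x*(1 - b*x)
}

// Division as multiplication by a refined reciprocal (not IEEE-compliant).
double fastDivide(double a, double b, int steps = 4) {
  double x = roughReciprocal(b);
  for (int i = 0; i < steps; ++i)
    x = refineReciprocal(b, x);
  return a * x;
}
```

Each step roughly squares the relative error, so a better initial estimate needs fewer steps. The result is close to, but not guaranteed to match, a correctly rounded division, which is why this only applies when unsafe FP math is enabled.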

Fix the fde encoding used by mips to match gas. · b9b7ae0c

Rafael Espindola authored Apr 03, 2013

This finally fixes the encoding. The patch also:
* Removes eh-frame.ll. It was an unnecessary .ll to .o test that was checking
  the wrong value.
* Merges fde-reloc.s and eh-frame.s into a single test, since the only difference
  was the run lines.
* Stops blindly testing the content of the entire .eh_frame section. Doing so
  makes it hard for anyone actually fixing a bug and hitting a difference in a
  binary blob. Instead, use a CHECK for each field and document what is being checked.

llvm-svn: 178615

b9b7ae0c

Rolling back the AVX support patch due to breaking a gcc 4.6 build bot that... · 9c0f0af5

Aaron Ballman authored Apr 03, 2013

Rolling back the AVX support patch due to breaking a gcc 4.6 build bot that doesn't understand the xgetbv instruction for some reason.  Will revisit when time permits.

llvm-svn: 178614

9c0f0af5

Remove an optimization where we were changing an objc_autorelease into an... · b8c88365

Michael Gottesman authored Apr 03, 2013

Remove an optimization where we were changing an objc_autorelease into an objc_autoreleaseReturnValue.

The semantics of ARC implies that a pointer passed into an objc_autorelease
must live until some point (potentially down the stack) where an
autorelease pool is popped. On the other hand, an
objc_autoreleaseReturnValue just signifies that the object must live
until the end of the given function at least.

Thus objc_autorelease is stronger than objc_autoreleaseReturnValue in
terms of the semantics of ARC* implying that performing the given
strength reduction without any knowledge of how this relates to
the autorelease pool pop that is further up the stack violates the
semantics of ARC.

*Even though objc_autoreleaseReturnValue is more computationally expensive
if you know that no RV optimization will occur.

llvm-svn: 178612

b8c88365

Improved comment. No functionality change. · 62424391
Michael Gottesman authored Apr 03, 2013
```
llvm-svn: 178605
```
62424391
Attempting to fix the build on older GCC versions. · 56be6ba5
Aaron Ballman authored Apr 03, 2013
```
llvm-svn: 178604
```
56be6ba5
This patch addresses PR15351 by explicitly checking for AVX support · 6bc0dfc7
Aaron Ballman authored Apr 03, 2013
```
when getting the host processor information.

llvm-svn: 178598
```
6bc0dfc7
Formatting. · e2fbc67e
Eric Christopher authored Apr 02, 2013
```
llvm-svn: 178589
```
e2fbc67e

[mips] Small update to the implementation of eh.return for Mips. · 023c678a

Akira Hatanaka authored Apr 02, 2013

This patch initializes t9 to the handler address, but only if the relocation
model is pic. This handles the case where the handler to which eh.return jumps
points to the start of the function.

Patch by Sasa Stankovic.

llvm-svn: 178588

023c678a

Support and test template arguments for unions. · 6476f908
Eric Christopher authored Apr 02, 2013
```
llvm-svn: 178586
```
6476f908
Reformat arguments. · 17dd8f07
Eric Christopher authored Apr 02, 2013
```
llvm-svn: 178585
```
17dd8f07

[mips] Expand pseudo multiply/divide instructions in MipsCodeEmitter.cpp. · 2ffc5734

Akira Hatanaka authored Apr 02, 2013

This patch fixes the following two tests which have been failing on
llvm-mips-linux builder since r178403:

LLVM :: Analysis/Profiling/load-branch-weights-ifs.ll
LLVM :: Analysis/Profiling/load-branch-weights-loops.ll

llvm-svn: 178584

2ffc5734

Allow MachineTraceMetrics to be used when the model has no resources. · aeb69a54
Jakob Stoklund Olesen authored Apr 02, 2013
```
It is still possible to extract information from itineraries, for
example.

llvm-svn: 178582
```
aeb69a54

Apr 02, 2013

[ms-inline asm] Add support for parsing variables with namespace alias · 8a24466f

Chad Rosier authored Apr 02, 2013

qualifiers.

This patch only adds support for parsing these identifiers in the
X86AsmParser.  The front-end interface isn't capable of looking up
these identifiers at this point in time.  The end result is the
compiler now errors during object file emission, rather than at
parse time.  Test case coming shortly.
Part of rdar://13499009 and PR13340

llvm-svn: 178566

8a24466f

Fix PR15630: Replace faulty stdcx. with stwcx. · 3581cd4b

Bill Schmidt authored Apr 02, 2013

When doing a partword atomic operation, a lwarx was being paired with
a stdcx. instead of a stwcx. when compiling for a 64-bit target.  The
target has nothing to do with it in this case; we always need a stwcx.

Thanks to Kai Nacke for reporting the problem.

llvm-svn: 178559

3581cd4b

Don't attempt MTM heuristics without a scheduling model present. · 8fbfc591
Jakob Stoklund Olesen authored Apr 02, 2013
```
This should fix the PPC buildbots.

llvm-svn: 178558
```
8fbfc591

Count processor resources individually in MachineTraceMetrics. · 3ca14772

Jakob Stoklund Olesen authored Apr 02, 2013

The new instruction scheduling models provide information about the
number of cycles consumed on each processor resource. This makes it
possible to estimate ILP more accurately than simply counting
instructions / issue width.

The functions getResourceDepth() and getResourceLength() now identify
the limiting processor resource, and return a cycle count based on that.

This gives more precise resource information, particularly in traces
that use one resource a lot more than others.

llvm-svn: 178553

3ca14772
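
The difference between the two estimates can be sketched as follows. This is a simplified illustration under stated assumptions: `TraceSummary`, `issueBound` and `resourceBound` are hypothetical names, not the MachineTraceMetrics API, and taking the maximum of the two bounds is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical summary of a trace; not the MachineTraceMetrics data structures.
struct TraceSummary {
  unsigned numInstrs;                   // instructions in the trace
  unsigned issueWidth;                  // instructions issued per cycle
  std::vector<unsigned> resourceCycles; // cycles consumed on each resource
};

// Simple estimate: instructions / issue width, rounded up.
unsigned issueBound(const TraceSummary &t) {
  assert(t.issueWidth > 0 && "issue width must be non-zero");
  return (t.numInstrs + t.issueWidth - 1) / t.issueWidth;
}

// Estimate driven by the limiting resource: the busiest resource sets a floor
// on the cycle count, however wide the issue stage is.
unsigned resourceBound(const TraceSummary &t) {
  unsigned worst = 0;
  for (unsigned cycles : t.resourceCycles)
    worst = std::max(worst, cycles);
  return std::max(worst, issueBound(t));
}
```

For a trace of 8 instructions on a 4-wide machine, the simple estimate is 2 cycles. If one resource is busy for 10 cycles, the resource-based estimate is 10.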

[fast-isel] Use the correct API to disable FastLowerArguments for Win64. · 7925d280
Chad Rosier authored Apr 02, 2013
```
llvm-svn: 178549
```
7925d280

DAGCombiner: Merge store/loads when we have extload/truncstores · d6c6e868

Arnold Schwaighofer authored Apr 02, 2013

This helps on architectures where i8 and i16 are not legal types but byte and
short loads/stores are available, allowing us to merge copies like the one below on ARM.

copy(char *a, char *b, int n) {
 do {
   int t0 = a[0];
   int t1 = a[1];
   b[0] = t0;
   b[1] = t1;

radar://13536387

llvm-svn: 178546

d6c6e868
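
The effect of the merge on one loop iteration can be sketched in source form. This is an illustration only: the function names are hypothetical, and the rest of the loop (which the message above cuts off) is deliberately not reconstructed.

```cpp
#include <cstdint>
#include <cstring>

// Unmerged form: two byte loads widened to int (extloads), followed by two
// narrowing byte stores (truncstores).
void copyPair(const char *a, char *b) {
  int t0 = a[0];
  int t1 = a[1];
  b[0] = (char)t0;
  b[1] = (char)t1;
}

// Merged form: one 16-bit load and one 16-bit store. memcpy is used so the
// access stays well-defined for unaligned pointers; compilers typically lower
// it to a single halfword load/store where the target allows that.
void copyPairMerged(const char *a, char *b) {
  uint16_t t;
  std::memcpy(&t, a, sizeof t);
  std::memcpy(b, &t, sizeof t);
}
```

Both functions read the two source bytes before writing anything, so they produce the same bytes in `b`.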

[NVPTX] Fix a few style issues in NVVMReflect · a922c7e9
Justin Holewinski authored Apr 02, 2013
```
llvm-svn: 178536
```
a922c7e9

Use a worklist to avoid a sneaky iterator invalidation. · 88d06c3b

Bill Wendling authored Apr 02, 2013

The iterator could be invalidated when it's recursively deleting a whole bunch
of constant expressions in a constant initializer.

Note: This was only reproducible if `opt' was run on a `.bc' file. If `opt' was
run on a `.ll' file, it wouldn't crash. This is why the test first pushes the
`.ll' file through `llvm-as' before feeding it to `opt'.

PR15440

llvm-svn: 178531

88d06c3b
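
The worklist idea can be sketched on a toy use graph. This is a generic illustration of the pattern, assuming a hypothetical `Graph` type that maps each id to the ids it uses; it is not LLVM's Constant or Use machinery, nor the code in the patch.

```cpp
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-in for constants and their operands.
using Graph = std::map<int, std::set<int>>;

bool hasUsers(const Graph &g, int id) {
  for (const auto &entry : g)
    if (entry.second.count(id))
      return true;
  return false;
}

// Removes `root`, then any operand that is left without users. Nodes are only
// erased after being popped from the worklist, so no iterator over `g` is
// held across an erase.
unsigned removeDead(Graph &g, int root) {
  std::vector<int> worklist{root};
  unsigned removed = 0;
  while (!worklist.empty()) {
    int id = worklist.back();
    worklist.pop_back();
    auto it = g.find(id);
    if (it == g.end() || hasUsers(g, id))
      continue; // already gone, or still in use
    std::set<int> operands = it->second; // copy before erasing the node
    g.erase(it);
    ++removed;
    for (int op : operands)
      worklist.push_back(op);
  }
  return removed;
}
```

The recursive alternative would erase nodes while a caller further up the stack is still iterating over the same container, which is the kind of invalidation the commit describes.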

Add 64-bit load and store instructions. · 8eabc3ff
Jakob Stoklund Olesen authored Apr 02, 2013
```
There are only a few new instructions, the rest is handled with patterns.

llvm-svn: 178528
```
8eabc3ff

Basic 64-bit ALU operations. · 917e07f0

Jakob Stoklund Olesen authored Apr 02, 2013

SPARC v9 extends all ALU instructions to 64 bits, so we simply need to
add patterns to use them for both i32 and i64 values.

llvm-svn: 178527

917e07f0

Materialize 64-bit immediates. · bddb20ee

Jakob Stoklund Olesen authored Apr 02, 2013

The last resort pattern produces 6 instructions, and there are still
opportunities for materializing some immediates in fewer instructions.

llvm-svn: 178526

bddb20ee
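
One plausible six-instruction sequence can be modelled as below. The message does not spell out the pattern, so the exact sequence here (two sethi/or pairs, a 64-bit shift and a final or) is an assumption, and the helper names merely echo the mnemonics.

```cpp
#include <cstdint>

// Illustrative models of the instructions involved; not LLVM code.
uint64_t sethi(uint32_t imm22) { return (uint64_t)(imm22 & 0x3FFFFF) << 10; }
uint64_t orImm(uint64_t r, uint32_t imm10) { return r | (imm10 & 0x3FF); }
uint64_t sllx(uint64_t r, unsigned sh) { return r << sh; }

// Builds an arbitrary 64-bit value from four immediate fields.
uint64_t materialize(uint64_t v) {
  uint32_t hi = (uint32_t)(v >> 32);
  uint32_t lo = (uint32_t)v;
  uint64_t upper = sethi(hi >> 10);  // 1: bits 63..42 (before the shift)
  upper = orImm(upper, hi & 0x3FF);  // 2: bits 41..32 (before the shift)
  upper = sllx(upper, 32);           // 3: move into the upper half
  uint64_t lower = sethi(lo >> 10);  // 4: bits 31..10
  lower = orImm(lower, lo & 0x3FF);  // 5: bits 9..0
  return upper | lower;              // 6: combine the halves
}
```

Values with many zero fields would not need all six steps, which is the kind of opportunity the message mentions.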

Add 64-bit shift instructions. · c1d1a481

Jakob Stoklund Olesen authored Apr 02, 2013

SPARC v9 defines new 64-bit shift instructions. The 32-bit shift right
instructions are still usable as zero and sign extensions.

This adds new F3_Sr and F3_Si instruction formats that probably should
be used for the 32-bit shifts as well. They don't really encode an
simm13 field.

llvm-svn: 178525

c1d1a481
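
The use of 32-bit right shifts as extensions can be sketched with a small model. This assumes the v9 behaviour that the 32-bit shifts operate on the low word and define the upper word of the result; the `srl` and `sra` functions below are illustrative models, not LLVM code.

```cpp
#include <cstdint>

// 32-bit logical shift right: operates on the low word and leaves the upper
// 32 bits of the result zero.
uint64_t srl(uint64_t r, unsigned sh) {
  return (uint64_t)((uint32_t)r >> (sh & 31));
}

// 32-bit arithmetic shift right: fills with bit 31 of the low word and
// sign-extends the result to 64 bits.
uint64_t sra(uint64_t r, unsigned sh) {
  unsigned s = sh & 31;
  uint32_t lo = (uint32_t)r;
  uint32_t shifted = lo >> s;
  if (lo & 0x80000000u)
    shifted |= ~(0xFFFFFFFFu >> s); // replicate the sign bit into vacated bits
  uint64_t result = shifted;
  if (shifted & 0x80000000u)
    result |= 0xFFFFFFFF00000000ULL; // extend to the full register
  return result;
}
```

With a shift amount of 0, `srl` acts as a zero extension from 32 to 64 bits and `sra` as a sign extension.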

Add predicates for distinguishing 32-bit and 64-bit modes. · 739d722e

Jakob Stoklund Olesen authored Apr 02, 2013

The 'sparc' architecture produces 32-bit code while 'sparcv9' produces
64-bit code.

It is also possible to run 32-bit code using SPARC v9 instructions with:

  llc -march=sparc -mattr=+v9

llvm-svn: 178524

739d722e

Add support for 64-bit calling convention. · 0b21f35a

Jakob Stoklund Olesen authored Apr 02, 2013

This is far from complete, but it is enough to make it possible to write
test cases using i64 arguments.

Missing features:
- Floating point arguments.
- Receiving arguments on the stack.
- Calls.

llvm-svn: 178523

0b21f35a

Add an I64Regs register class for 64-bit registers. · 5ad3b353

Jakob Stoklund Olesen authored Apr 02, 2013

We are going to use the same registers for 32-bit and 64-bit values, but
in two different register classes. The I64Regs register class has a
larger spill size and alignment.

The addition of an i64 register class confuses TableGen's type
inference, so it is necessary to clarify the type of some immediates and
the G0 register.

In 64-bit mode, pointers are i64 and should use the I64Regs register
class. Implement getPointerRegClass() to dynamically provide the pointer
register class depending on the subtarget. Use ptr_rc and iPTR for
memory operands.

Finally, add the i64 type to the IntRegs register class. This register
class is not used to hold i64 values, I64Regs is for that. The type is
required to appease TableGen's type checking in output patterns like this:

  def : Pat<(add i64:$a, i64:$b), (ADDrr $a, $b)>;

SPARC v9 uses the same ADDrr instruction for i32 and i64 additions, and
TableGen doesn't know to check the type of register sub-classes.

llvm-svn: 178522

5ad3b353

Fix typo in PPCISelLowering · 93d75ea0

Hal Finkel authored Apr 02, 2013

Thanks to Bill Schmidt for finding this in review of r178480.

llvm-svn: 178521

93d75ea0

The divide unit is not pipelined, but it is still buffered. · e1d88cfb

Andrew Trick authored Apr 02, 2013

Buffered means a later divide may be executed out-of-order while a
prior divide is sitting (buffered) in a reservation station.

You can tell it's not pipelined, because operations that use it
reserve it for more than one cycle:

def : WriteRes<WriteIDiv, [HWPort0, HWDivider]> {
  let Latency = 25;
  let ResourceCycles = [1, 10];
}

We don't currently distinguish between an unpipelined operation and one
that is split into multiple micro-ops requiring the same unit, except
that the latter may have NumMicroOps > 1 if they also consume
issue/dispatch resources.

llvm-svn: 178519

e1d88cfb
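
The cost of an unpipelined unit can be sketched with a back-of-the-envelope timing model. This is a hypothetical illustration that borrows the Latency and ResourceCycles numbers from the WriteRes record above; it is not how the scheduler itself computes anything, and it assumes independent divides with the next one always buffered and ready.

```cpp
// Hypothetical timing parameters; field names are illustrative.
struct DivTiming {
  unsigned latency;       // cycles until a result is available (25 above)
  unsigned dividerCycles; // cycles the divider stays reserved (10 above)
};

// Cycle at which the last of `n` independent divides completes when each one
// must wait for the divider to be released by the previous one.
unsigned finishCycle(unsigned n, DivTiming t) {
  if (n == 0)
    return 0;
  return (n - 1) * t.dividerCycles + t.latency;
}

// The same question for a fully pipelined unit that accepts one operation
// per cycle.
unsigned finishCyclePipelined(unsigned n, unsigned latency) {
  if (n == 0)
    return 0;
  return (n - 1) + latency;
}
```

Under these assumptions three divides finish at cycle 45 on the unpipelined divider, against cycle 27 on a pipelined unit with the same latency.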

Target/R600: Fix CMake build to add missing files. · fd98f7f2
NAKAMURA Takumi authored Apr 01, 2013
```
llvm-svn: 178508
```
fd98f7f2

Apr 01, 2013
- Mips direct object exception handling regression · 9423f507
  Jack Carter authored Apr 01, 2013
```
Revision 177141 caused a regression in all but
mips64 little endian. That is because none of the
other Mips targets had test cases checking the
contents of the .eh_frame section. This patch both
fixes the llvm code and adds an assembler test case
to include the current 4 flavors.

The test cases unfortunately rely on llvm-objdump. A
preferable method would be to use a pretty printer output
such as what readelf -wf <elf_file> would give.

I also changed the name of the test case to correct a typo.

llvm-svn: 178506
```
  9423f507
- R600: Add support for native control flow · bfaa63a6
  Vincent Lejeune authored Apr 01, 2013
```
llvm-svn: 178505
```
  bfaa63a6
- R600/SI: Share code recording ShaderTypeAttribute between generations · ace6f735
  Vincent Lejeune authored Apr 01, 2013
```
llvm-svn: 178504
```
  ace6f735
- R600: Emit CF_ALU and use true kcache register. · f43bc57b
  Vincent Lejeune authored Apr 01, 2013
```
llvm-svn: 178503
```
  f43bc57b
- Fix top-comment header and some indentation · e60fc2f6
  Eli Bendersky authored Apr 01, 2013
```
llvm-svn: 178492
```
  e60fc2f6
- Fix a bad assert in PPCTargetLowering · 3f88d089
  Hal Finkel authored Apr 01, 2013
```
llvm-svn: 178489
```
  3f88d089
- Correct assertion condition · 6662fd0f
  Shuxin Yang authored Apr 01, 2013
```
llvm-svn: 178484
```
  6662fd0f