Commits · cf5014771d0e33c984b4b6946d6b0f181c3108e5 · Roger Ferrer / llvm-epi

Dec 28, 2011

PR11662. · 3c3dd6e5

Nadav Rotem authored Dec 28, 2011

Promotion of the mask operand needs to be done using PromoteTargetBoolean, and not padded with garbage.

llvm-svn: 147309

3c3dd6e5

Fixed a bug in LowerVECTOR_SHUFFLE and LowerBUILD_VECTOR. · b3515a8d

Elena Demikhovsky authored Dec 28, 2011

Matching MOVLP mask for AVX (256-bit vectors) was wrong.
The failure was detected by conformance tests.

llvm-svn: 147308

b3515a8d

Demystify this comment. · 8640fdf0
Nick Lewycky authored Dec 28, 2011
```
llvm-svn: 147307
```
8640fdf0

Dec 27, 2011
- PR11642 has been fixed, enable -fvisibility-inlines-hidden everywhere. · 07935469
  Rafael Espindola authored Dec 27, 2011
```
llvm-svn: 147296
```
  07935469
- Switch StringMap from an array of structures to a structure of arrays. · 46236ee5
  Benjamin Kramer authored Dec 27, 2011
```
- -25% memory usage of the main table on x86_64 (was wasted in struct padding).
- no significant performance change.

llvm-svn: 147294
```
  46236ee5
- Use false not zero, as a bool. · 398255e7
  Nick Lewycky authored Dec 27, 2011
```
llvm-svn: 147292
```
  398255e7
- Turn cos(-x) into cos(x). Patch by Alexander Malyshev! · a8e84fb5
  Nick Lewycky authored Dec 27, 2011
```
llvm-svn: 147291
```
  a8e84fb5
- Clean up some Release build warnings. · b668401b
  Benjamin Kramer authored Dec 27, 2011
```
llvm-svn: 147289
```
  b668401b
- Add handling of x86_avx2_pmovmskb to computeMaskedBitsForTargetNode for... · df34d152
  Craig Topper authored Dec 27, 2011
```
Add handling of x86_avx2_pmovmskb to computeMaskedBitsForTargetNode for consistency. Add comments and an assert for BMI instructions to PerformXorCombine since the enabling of the combine is conditional on it, but the function itself isn't.

llvm-svn: 147287
```
  df34d152
- Teach simplifycfg to recompute branch weights when merging some branches, and · c554a9b5
  Nick Lewycky authored Dec 27, 2011
```
to discard weights when appropriate. Still more to do (and a new TODO), but
it's a start!

llvm-svn: 147286
```
  c554a9b5
- Using Inst->setMetadata(..., NULL) should be safe to remove metadata even when · 4c131387
  Nick Lewycky authored Dec 27, 2011
```
there is none of that type to remove. This fixes a crasher in the particular
case where the instruction has metadata but no metadata storage in the context
(this is only possible if the instruction has !dbg but no other metadata info).

llvm-svn: 147285
```
  4c131387
- Fix warning. · 2b14b80b
  Rafael Espindola authored Dec 26, 2011
```
llvm-svn: 147284
```
  2b14b80b
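
The 25% saving reported for the StringMap layout change (46236ee5) follows from how a 4-byte hash and an 8-byte pointer pack on x86_64. The sketch below is only an illustration: `Entry`, `AosBucket`, `aosBytes`, `soaBytes` and `savedPercent` are hypothetical names, not the actual StringMap members.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for whatever each bucket points at.
struct Entry;

// Array-of-structures bucket: on x86_64 the 4-byte hash is followed by
// 4 bytes of padding so that the pointer stays 8-byte aligned.
struct AosBucket {
  std::uint32_t hash;
  Entry *item;
};

// Bytes needed for n buckets stored as one array of structures.
inline std::size_t aosBytes(std::size_t n) { return n * sizeof(AosBucket); }

// Bytes needed for n buckets stored as two parallel arrays (pointers, then
// hashes); neither array needs per-element padding.
inline std::size_t soaBytes(std::size_t n) {
  return n * sizeof(Entry *) + n * sizeof(std::uint32_t);
}

// Percentage of the table saved by switching layouts.
inline double savedPercent(std::size_t n) {
  assert(n > 0 && "need at least one bucket");
  return 100.0 * (aosBytes(n) - soaBytes(n)) / aosBytes(n);
}
```

With 8-byte pointers this is 16 bytes per bucket before and 12 bytes after, so `savedPercent` comes out at 25.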
Dec 26, 2011
- Make sure DAGCombiner doesn't introduce multiple loads from the same memory... · e96286cd
  Eli Friedman authored Dec 26, 2011
```
Make sure DAGCombiner doesn't introduce multiple loads from the same memory location.  PR10747, part 2.

llvm-svn: 147283
```
  e96286cd
- Update the branch weight metadata when reversing the order of a branch. · 8d302df4
  Nick Lewycky authored Dec 26, 2011
```
llvm-svn: 147280
```
  8d302df4
- Sort includes, canonicalize whitespace, fix typos. No functionality change. · e87d54c8
  Nick Lewycky authored Dec 26, 2011
```
llvm-svn: 147279
```
  e87d54c8
Dec 25, 2011
- Update the LangRef documentation: the codegen does support this instruction. · 4c4d254f
  Nadav Rotem authored Dec 25, 2011
```
llvm-svn: 147274
```
  4c4d254f
- Fix a typo in the widening of vectors in PromoteIntRes. Patch by Shemer Anat. · c1faeac4
  Nadav Rotem authored Dec 25, 2011
```
llvm-svn: 147272
```
  c1faeac4
- Sparc: Implement emitFrameIndexDebugValue and getDebugValueLocation hooks. · 1fc8263b
  Venkatraman Govindaraju authored Dec 25, 2011
```
llvm-svn: 147269
```
  1fc8263b
- Add braces to remove silly warning. · 2990ec6e
  Bill Wendling authored Dec 25, 2011
```
llvm-svn: 147264
```
  2990ec6e
- Remove unused variables. · 2d3dac3e
  Rafael Espindola authored Dec 25, 2011
```
llvm-svn: 147261
```
  2d3dac3e
Dec 24, 2011

Add an explicit test that we now fold cttz.i32(..., true) >> 5 -> 0. · 8b7e71ff
Chandler Carruth authored Dec 24, 2011
```
This is a result of Benjamin's work on ValueTracking.

llvm-svn: 147259
```
8b7e71ff
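
Why the fold in 8b7e71ff is valid can be sketched in plain C++: with the zero-is-undefined flag set, a 32-bit trailing-zero count is at most 31, so it fits in five bits and shifting it right by 5 always gives 0. The helper names are hypothetical, and `__builtin_ctz` is the GCC/Clang builtin, used here as a stand-in for the intrinsic.

```cpp
#include <cassert>
#include <cstdint>

// Trailing-zero count with the "zero input is undefined" contract,
// mirroring cttz.i32(x, true).
inline std::uint32_t cttzZeroUndef(std::uint32_t x) {
  assert(x != 0 && "result is undefined for zero");
  return static_cast<std::uint32_t>(__builtin_ctz(x));
}

// For nonzero x the count lies in [0, 31], so bits 5 and above are clear
// and the shift always produces zero.
inline std::uint32_t shiftedCttz(std::uint32_t x) {
  return cttzZeroUndef(x) >> 5;
}
```

Without the flag, a zero input yields 32, and `32 >> 5` is 1, so the fold only holds for the zero-undef form.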

InstCombine: Add a combine that turns (2^n)-1 ^ x back into (2^n)-1 - x iff x... · b16bd77b

Benjamin Kramer authored Dec 24, 2011

InstCombine: Add a combine that turns (2^n)-1 ^ x back into (2^n)-1 - x iff x is smaller than 2^n and it fuses with a following add.

This was intended to undo the sub canonicalization in cases where it's not profitable, but it also
finds some cases on its own.

llvm-svn: 147256

b16bd77b
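
A rough sketch of why turning the xor back into a sub lets it fuse with a following add (b16bd77b): once `mask ^ x` is known to equal `mask - x`, the added constant can be folded into the minuend. The function names are hypothetical, and this only models the arithmetic, not the InstCombine code.

```cpp
#include <cassert>
#include <cstdint>

// Both helpers require mask to have the form (2^n)-1 and x <= mask.
inline bool validOperands(std::uint32_t mask, std::uint32_t x) {
  return (mask & (mask + 1)) == 0 && x <= mask;
}

// Canonical form: an xor followed by a separate add.
inline std::uint32_t xorThenAdd(std::uint32_t mask, std::uint32_t x,
                                std::uint32_t c) {
  assert(validOperands(mask, x));
  return (mask ^ x) + c;
}

// Fused form: the constant is merged into the minuend, leaving a single sub.
inline std::uint32_t fusedSub(std::uint32_t mask, std::uint32_t x,
                              std::uint32_t c) {
  assert(validOperands(mask, x));
  return (mask + c) - x;
}
```

For example, with `mask = 31`, `x = 5` and `c = 10`, both forms give 36.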

ComputeMaskedBits: Make knownzero computation more aggressive for ctlz with undef zero. · 4ee5747f
Benjamin Kramer authored Dec 24, 2011
```
unsigned foo(unsigned x) { return 31 - __builtin_clz(x); }
now compiles into a single "bsrl" instruction on x86.

llvm-svn: 147255
```
4ee5747f
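
For reference, `31 - __builtin_clz(x)` is the index of the highest set bit, which is the value 'bsr' produces. The sketch below compares `foo` from the commit message (4ee5747f) against a shift-based reference; `highestSetBit` is a hypothetical helper and `unsigned` is assumed to be 32 bits wide.

```cpp
#include <cassert>

static_assert(sizeof(unsigned) == 4, "sketch assumes 32-bit unsigned");

// The function from the commit message; __builtin_clz is undefined for 0.
inline unsigned foo(unsigned x) { return 31 - __builtin_clz(x); }

// Reference implementation: count how many times x can be shifted right
// before it becomes zero.
inline unsigned highestSetBit(unsigned x) {
  assert(x != 0 && "no set bit in zero");
  unsigned index = 0;
  while (x >>= 1)
    ++index;
  return index;
}
```

Both functions agree for every nonzero input, for example `foo(12345u)` and `highestSetBit(12345u)` are both 13.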

InstCombine: Canonicalize (2^n)-1 - x into (2^n)-1 ^ x iff x is known to be smaller than 2^n. · 010337c8

Benjamin Kramer authored Dec 24, 2011

This has the obvious advantage of being commutable and is always a win on x86 because
const - x wastes a register there. On less weird architectures this may lead to
a regression because other arithmetic doesn't fuse with it anymore. I'll address that
problem in a followup.

llvm-svn: 147254

010337c8
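
The identity behind the canonicalization in 010337c8 can be checked directly: when every low bit of the constant is set and x does not exceed it, the subtraction never borrows, so each result bit is simply the complement of the matching bit of x. The names below are hypothetical and the code is a sketch of the arithmetic only.

```cpp
#include <cassert>
#include <cstdint>

// True if m has the form (2^n)-1, i.e. a run of set low bits.
inline bool isLowBitMask(std::uint32_t m) { return (m & (m + 1)) == 0; }

// The original form: (2^n)-1 - x.
inline std::uint32_t subForm(std::uint32_t mask, std::uint32_t x) {
  assert(isLowBitMask(mask) && x <= mask);
  return mask - x;
}

// The canonical form: (2^n)-1 ^ x.
inline std::uint32_t xorForm(std::uint32_t mask, std::uint32_t x) {
  assert(isLowBitMask(mask) && x <= mask);
  return mask ^ x;
}
```

The precondition matters: `15u - 16u` wraps around while `15u ^ 16u` is 31, so the two forms diverge once x exceeds the mask.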

Section relative fixups are a COFF concept, not an x86 one. Replace the · a56ab0ed
Rafael Espindola authored Dec 24, 2011
```
x86 specific reloc_coff_secrel32 with a generic FK_SecRel_4.

llvm-svn: 147252
```
a56ab0ed

Use standard promotion for i8 CTTZ nodes and i8 CTLZ nodes when the · a3d54fe0

Chandler Carruth authored Dec 24, 2011

LZCNT instructions are available. Force promotion to i32 to get
a smaller encoding since the fix-ups necessary are just as complex for
either promoted type.

We can't do standard promotion for CTLZ when lowering through BSR
because it results in poor code surrounding the 'xor' at the end of this
instruction. Essentially, if we promote the entire CTLZ node to i32, we
end up doing the xor on a 32-bit CTLZ implementation, and then
subtracting appropriately to get back to an i8 value. Instead, our
custom logic just uses the knowledge of the incoming size to compute
a perfect xor. I'd love to know of a way to fix this, but so far I'm
drawing a blank. I suspect the legalizer could be more clever and/or it
could collude with the DAG combiner, but how... ;]

llvm-svn: 147251

a3d54fe0
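
One reading of the BSR paragraph in a3d54fe0, sketched in C++: for an i8 input, the custom lowering can xor the 'bsr' result with 7 directly, whereas promoting the whole CTLZ to i32 computes a 32-bit count (xor with 31) and then subtracts 24. The helper names are hypothetical and `bsr32` merely models the instruction with a GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstdint>

// Index of the highest set bit, modelling 'bsr'; undefined for zero.
inline unsigned bsr32(std::uint32_t x) {
  assert(x != 0 && "bsr result is undefined for zero");
  return 31u - static_cast<unsigned>(__builtin_clz(x));
}

// Custom i8 lowering: the input width is known, so one xor with 7 suffices.
inline unsigned ctlz8Direct(std::uint8_t x) { return bsr32(x) ^ 7u; }

// Promoted lowering: 32-bit ctlz via xor with 31, then an extra subtraction
// to get back to the i8 count.
inline unsigned ctlz8Promoted(std::uint8_t x) {
  return (bsr32(x) ^ 31u) - 24u;
}
```

Both give the same answer for nonzero inputs; the promoted form just carries the extra subtract described above.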

Add systematic testing for cttz as well, and fix the bug I spotted by · 38ce2445
Chandler Carruth authored Dec 24, 2011
```
inspection earlier.

llvm-svn: 147250
```
38ce2445
Add i8 and i64 testing for ctlz on x86. Also simplify the i16 test. · 103ca80f
Chandler Carruth authored Dec 24, 2011
```
llvm-svn: 147249
```
103ca80f

Tidy up this rather crufty test. Put the declarations at the top to make · 44cf0722

Chandler Carruth authored Dec 24, 2011

my C-brain happy. Remove the unnecessary bits of pedantic IR fluff like
nounwind. Remove stray uses comments. Name things semantically rather
than tN so that adding a new test in the middle doesn't cause pain, and
so that new tests can be grouped semantically.

This exposes how little systematic testing is going on here. I noticed
this by finding several bugs via inspection and wondering why this test
wasn't catching any of them. =[

llvm-svn: 147248

44cf0722

Chandler fixed this. · 767bbe48
Benjamin Kramer authored Dec 24, 2011
```
llvm-svn: 147247
```
767bbe48

Expand more when we have a nice 'tzcnt' instruction, to avoid generating · c9fcde23

Chandler Carruth authored Dec 24, 2011

'bsf' instructions here.

This one is actually debatable to my eyes. It's not clear that any chip
implementing 'tzcnt' would have a slow 'bsf' for any reason, and unless
EFLAGS or a zero input matters, 'tzcnt' is just a longer encoding.
Still, this restores the old behavior with 'tzcnt' enabled for now.

llvm-svn: 147246

c9fcde23

Tidy up some of these tests. · eeb3a1ce
Chandler Carruth authored Dec 24, 2011
```
llvm-svn: 147245
```
eeb3a1ce

Switch the lowering of CTLZ_ZERO_UNDEF from a .td pattern back to the · 7e9453e9

Chandler Carruth authored Dec 24, 2011

X86ISelLowering C++ code. Because this is lowered via an xor wrapped
around a bsr, we want the dagcombine which runs after isel lowering to
have a chance to clean things up. In particular, it is very common to
see code which looks like:

  (sizeof(x)*8 - 1) ^ __builtin_clz(x)

Which is trying to compute the most significant bit of 'x'. That's
actually the value computed directly by the 'bsr' instruction, but if we
match it too late, we'll get completely redundant xor instructions.

The more naive code for the above (subtracting rather than using an xor)
still isn't handled correctly due to the dagcombine getting confused.

Also, while here fix an issue spotted by inspection: we should have been
expanding the zero-undef variants to the normal variants when there is
an 'lzcnt' instruction. Do so, and test for this. We don't want to
generate unnecessary 'bsr' instructions.

These two changes fix some regressions in encoding and decoding
benchmarks. However, there is still a *lot* to be improved on in this
type of code.

llvm-svn: 147244

7e9453e9
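
The redundancy described in 7e9453e9 can be modelled as two xors that cancel: if clz is lowered as `bsr ^ 31`, then `(sizeof(x)*8 - 1) ^ clz(x)` is `31 ^ (bsr ^ 31)`, which is just the 'bsr' result. The sketch assumes a 32-bit `unsigned`; `bsr`, `clzViaBsr` and `msbIndex` are hypothetical names.

```cpp
#include <cassert>

static_assert(sizeof(unsigned) == 4, "sketch assumes 32-bit unsigned");

// Index of the highest set bit, modelling the 'bsr' instruction.
inline unsigned bsr(unsigned x) {
  assert(x != 0 && "bsr result is undefined for zero");
  unsigned index = 0;
  while (x >>= 1)
    ++index;
  return index;
}

// Leading-zero count lowered as an xor wrapped around bsr.
inline unsigned clzViaBsr(unsigned x) { return bsr(x) ^ 31u; }

// The source-level idiom from the commit message, written against the
// lowering above; the two xors with 31 cancel each other.
inline unsigned msbIndex(unsigned x) {
  return static_cast<unsigned>(sizeof(x) * 8 - 1) ^ clzViaBsr(x);
}
```

So `msbIndex(x)` equals `bsr(x)` for every nonzero x, which is why matching the pattern early avoids emitting either xor.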

Cleanup this test a bit, sorting things and grouping them more clearly. · 15075d4b
Chandler Carruth authored Dec 24, 2011
```
llvm-svn: 147243
```
15075d4b
Fix Comments. · 103318e9
Jakob Stoklund Olesen authored Dec 24, 2011
```
llvm-svn: 147238
```
103318e9
Add MachineMemOperands to instructions generated in storeRegToStackSlot or · 1cf75767
Akira Hatanaka authored Dec 24, 2011
```
loadRegFromStackSlot. 

llvm-svn: 147235
```
1cf75767
Detect unaligned loads/stores that have been added for Mips64 support. · 6f54a461
Akira Hatanaka authored Dec 24, 2011
```
llvm-svn: 147234
```
6f54a461
Test case for r147232. · 79329ce4
Akira Hatanaka authored Dec 24, 2011
```
llvm-svn: 147233
```
79329ce4
If target ABI is N64, LEA should be daddiu. · 695d113a
Akira Hatanaka authored Dec 24, 2011
```
llvm-svn: 147232
```
695d113a
Move x86 specific bits of the COFF writer to lib/Target/X86. · 908d2ed1
Rafael Espindola authored Dec 24, 2011
```
llvm-svn: 147231
```
908d2ed1