Commits · 3e869f002c5ecfbbe5c3596026b2fc73421b2f91 · Roger Ferrer / llvm-epi-0.8

Apr 12, 2012

Generalize r153635 to deal with TokenFactor chains; also clean up the logic... · 3e869f00

Evan Cheng authored Apr 12, 2012

Generalize r153635 to deal with TokenFactor chains; also clean up the logic and fix the tests. rdar://11069732, rdar://11236106

llvm-svn: 154604

3e869f00

Apr 09, 2012

Cleanup and relax a restriction on the matching of global offsets into · 3779ac10

Chandler Carruth authored Apr 09, 2012

x86 addressing modes. This allows PIE-based TLS offsets to fit directly
into an addressing mode immediate offset, which is the last remaining
code quality issue from PR12380. With this patch, that PR is completely
fixed.

To understand why this patch is correct to match these offsets into
addressing mode immediates, break it down by cases:
1) 32-bit is trivially correct, and unmodified here.
2) 64-bit non-small mode is unchanged and never matches.
3) 64-bit small PIC code which is RIP-relative is handled specially in
the match to try to fit RIP into the base register. If it fails, it
now early exits. This behavior is unchanged by the patch.
4) 64-bit small non-PIC code which is not RIP-relative continues to work
as it did before. The reason these immediates are safe is because the
ABI ensures they fit in small mode. This behavior is unchanged.
5) 64-bit small PIC code which is *not* using RIP-relative addressing.
This is the only case changed by the patch, and the primary place you
see it is in TLS, either the win64 section offset TLS or Linux
local-exec TLS model in a PIC compilation. Here the ABI again ensures
that the immediates fit because we are in small mode, and any other
operations required due to the PIC relocation model have been handled
externally to the Wrapper node (extra loads etc are made around the
wrapper node in ISelLowering).

I've tested this as much as I can comparing it with GCC's output, and
everything appears safe. I discussed this with Anton and it made sense
to him at least at face value. That said, if there are issues with PIC
code after this patch, yell and we can revert it.

llvm-svn: 154304

3779ac10
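
The five cases in the message above can be restated as a small decision table. This is only a sketch: `AddrCase` and `foldsGlobalOffset` are hypothetical names, not identifiers from the X86 backend, and the `ripFitsInBase` flag is an assumption standing in for the attempt to fit RIP into the base register described in case 3.

```cpp
#include <cassert>

// Hypothetical labels for the five cases listed in the commit message.
enum class AddrCase {
  Bits32,            // 1) 32-bit
  Bits64NonSmall,    // 2) 64-bit, non-small code model
  Small64PicRipRel,  // 3) 64-bit small PIC, RIP-relative
  Small64NonPic,     // 4) 64-bit small non-PIC, not RIP-relative
  Small64PicNoRip    // 5) 64-bit small PIC, not RIP-relative
};

// Returns true when, per the message, the global offset may be folded
// into the addressing-mode immediate. ripFitsInBase only matters for
// case 3 and models whether RIP could be placed in the base register.
bool foldsGlobalOffset(AddrCase c, bool ripFitsInBase) {
  switch (c) {
  case AddrCase::Bits32:           return true;
  case AddrCase::Bits64NonSmall:   return false;
  case AddrCase::Small64PicRipRel: return ripFitsInBase;  // otherwise early exit
  case AddrCase::Small64NonPic:    return true;
  case AddrCase::Small64PicNoRip:  return true;  // the case this patch enables
  }
  assert(false && "unhandled AddrCase");
  return false;
}
```

Only the last case changes behavior; the other four rows describe what the message says was already true before the patch.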

Apr 04, 2012

Always compute all the bits in ComputeMaskedBits. · ba0a6cab

Rafael Espindola authored Apr 04, 2012

This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011

ba0a6cab
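
As a rough illustration of what "computing all the bits" means, the sketch below tracks known-zero and known-one bits for every bit position, with no demanded-bits mask involved. The `Known` struct and helper names are hypothetical and use plain 32-bit integers rather than LLVM's APInt; this is not the ComputeMaskedBits implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical known-bits record: a set bit in Zero means that bit of the
// value is known to be 0; a set bit in One means it is known to be 1.
struct Known {
  uint32_t Zero;
  uint32_t One;
};

// Nothing is known about an arbitrary input value.
Known knownNothing() { return {0u, 0u}; }

// Every bit of a constant is known.
Known knownConstant(uint32_t c) { return {~c, c}; }

// AND: a result bit is 0 if either side is 0, and 1 only if both are 1.
Known knownAnd(Known a, Known b) {
  Known r{a.Zero | b.Zero, a.One & b.One};
  assert((r.Zero & r.One) == 0 && "a bit cannot be both 0 and 1");
  return r;
}

// OR: a result bit is 1 if either side is 1, and 0 only if both are 0.
Known knownOr(Known a, Known b) {
  Known r{a.Zero & b.Zero, a.One | b.One};
  assert((r.Zero & r.One) == 0 && "a bit cannot be both 0 and 1");
  return r;
}
```

For an unknown `x`, analysing `(x & 0xF0) | 0x01` this way yields `Zero == 0xFFFFFF0E` and `One == 0x01`, which is information about all 32 bits regardless of which bits a caller originally asked about.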

Mar 29, 2012

Replace assert(0) with llvm_unreachable to avoid warnings about dropping off... · 8619c37b

Benjamin Kramer authored Mar 29, 2012

Replace assert(0) with llvm_unreachable to avoid warnings about dropping off the end of a non-void function in Release builds.

llvm-svn: 153643

8619c37b
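
A minimal sketch of why the replacement matters, using a hypothetical `markUnreachable` helper as a stand-in for `llvm_unreachable` (the real macro lives in LLVM's support headers and is not reproduced here). With `assert(0)`, an NDEBUG build compiles the assertion away, so a non-void function needs a dummy return to avoid falling off the end; a `[[noreturn]]` helper makes that unnecessary.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for llvm_unreachable: never returns, in any build mode.
[[noreturn]] inline void markUnreachable(const char *msg) {
  std::fprintf(stderr, "UNREACHABLE: %s\n", msg);
  std::abort();
}

enum class Width { W8, W16, W32, W64 };

// Old style: the assert vanishes under NDEBUG, so a dummy return is
// required to keep the compiler from warning about a missing return.
int bytesOld(Width w) {
  switch (w) {
  case Width::W8:  return 1;
  case Width::W16: return 2;
  case Width::W32: return 4;
  case Width::W64: return 8;
  }
  assert(0 && "unknown width");
  return 0;
}

// New style: the noreturn call tells the compiler control never continues.
int bytesNew(Width w) {
  switch (w) {
  case Width::W8:  return 1;
  case Width::W16: return 2;
  case Width::W32: return 4;
  case Width::W64: return 8;
  }
  markUnreachable("unknown width");
}
```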

For X86, change load/dec-or-inc/store into dec-or-inc, respectively. · 68d59e8a

Joel Jones authored Mar 29, 2012

This is a code change to add support for changing instruction sequences of the form:

  load
  inc/dec of 8/16/32/64 bits
  store

into the appropriate X86 inc/dec through memory instruction:

  inc[qlwb] / dec[qlwb]

The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better
named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode.  The comments have also been expanded.

llvm-svn: 153635

68d59e8a
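
For orientation, the source-level shape that typically gives rise to a load / inc-or-dec / store sequence is an in-place update through a pointer. The functions below are hypothetical examples; whether a given compiler and flag set actually emits a single memory-operand instruction such as `incq (%rdi)` or `decl (%rdi)` for them is an assumption, not something the message states.

```cpp
#include <cassert>
#include <cstdint>

// Conceptually: load *p, add 1, store back. A candidate for a single
// 64-bit increment-through-memory instruction.
void bump(int64_t *p) {
  assert(p && "pointer must be valid");
  *p = *p + 1;
}

// Conceptually: load *p, subtract 1, store back. A candidate for a single
// 32-bit decrement-through-memory instruction.
void drop(int32_t *p) {
  assert(p && "pointer must be valid");
  *p = *p - 1;
}
```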

Reverted to revision 153616 to unblock build · b474099e
Joel Jones authored Mar 29, 2012
```
llvm-svn: 153623
```
b474099e

For X86, change load/dec-or-inc/store into dec-or-inc, respectively. · b88c81fe

Joel Jones authored Mar 29, 2012

This is a code change to add support for changing instruction sequences of the form:

  load
  inc/dec of 8/16/32/64 bits
  store

into the appropriate X86 inc/dec through memory instruction:

  inc[qlwb] / dec[qlwb]

The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better
named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode.  The comments have also been expanded.

llvm-svn: 153617

b88c81fe

Mar 27, 2012
- Prune some includes · 1fcf5bca
  Craig Topper authored Mar 27, 2012
```
llvm-svn: 153502
```
  1fcf5bca
- Remove unnecessary llvm:: qualifications · f6e7e12f
  Craig Topper authored Mar 27, 2012
```
llvm-svn: 153500
```
  f6e7e12f
Mar 17, 2012
- Reorder includes in Target backends to follow coding standards. Remove some... · b25fda95
  Craig Topper authored Mar 17, 2012
```
Reorder includes in Target backends to follow coding standards. Remove some superfluous forward declarations.

llvm-svn: 152997
```
  b25fda95
Mar 09, 2012
- Use uint16_t to store opcodes in static tables in X86 backend. · 2dac9628
  Craig Topper authored Mar 09, 2012
```
llvm-svn: 152391
```
  2dac9628
Feb 22, 2012
- Declare register classes as const. Fix a couple pointers to register classes... · cc830f8c
  Craig Topper authored Feb 22, 2012
```
Declare register classes as const. Fix a couple pointers to register classes that weren't already const.

llvm-svn: 151138
```
  cc830f8c
Feb 16, 2012

Use the same CALL instructions for Windows as for everything else. · 97e3115d

Jakob Stoklund Olesen authored Feb 16, 2012

The different calling conventions and call-preserved registers are
represented with regmask operands that are added dynamically.

llvm-svn: 150708

97e3115d

Feb 15, 2012
- Stop custom lowering for x86 DEC64m from happening if the load in the lowered... · c21ebf5c
  Pete Cooper authored Feb 15, 2012
```
Stop custom lowering for x86 DEC64m from happening if the load in the lowered sequence has more than 1 user

llvm-svn: 150537
```
  c21ebf5c
Feb 13, 2012

Fixed bug when custom lowering DEC64m on x86. · 71be57bb

Pete Cooper authored Feb 13, 2012

If the DEC node had more than one user, it was doing this lowering but
leaving the original DEC node around and so decrementing twice.

Fixes PR11964.

llvm-svn: 150356

71be57bb

Jan 20, 2012
- More dead code removal (using -Wunreachable-code) · 46a9f016
  David Blaikie authored Jan 20, 2012
```
llvm-svn: 148578
```
  46a9f016
Jan 12, 2012

Switch all of the uses of my InsertDAGNode helper to follow the exact · eb21da06

Chandler Carruth authored Jan 12, 2012

same pattern. We already had this pattern in a few places, but others
tried to make a rough approximation of an actual DAG structure. As not
everywhere went to this trouble, nothing could rely on this being done.
In fact, I've checked all references to these node Ids, and the ones
that are using the topo-sort properties are actually satisfied with
a strict-weak-ordering. The requirement appears to be that Use >= Def.

I've added a big blurb of comments to this bit of the transform to
clarify why the order is so important for the next reader of the code.

I'm starting with this change as it is very small, and trivially
reverted if something breaks or the >= above really does need to be >.
If that proves the case, we can hide the problem by reverting this
patch, but the problem exists elsewhere as well, and so a more
comprehensive solution will be needed.

llvm-svn: 148001

eb21da06

Jan 11, 2012

Revert r147945 which disabled an addressing mode transformation. I had · 3212a342

Chandler Carruth authored Jan 11, 2012

hoped this would revive one of the llvm-gcc selfhost build bots, but it
didn't so it doesn't appear that my transform is the culprit.

If anyone else is seeing failures, please let me know!

llvm-svn: 147957

3212a342

Disable the transformation I added in r147936 to see if it fixes some · 9bc48e52

Chandler Carruth authored Jan 11, 2012

strange build bot failures that look like a miscompile into an infloop.
I'll investigate this tomorrow, but I'd both like to know whether my
patch is the culprit, and get the bots back to green.

llvm-svn: 147945

9bc48e52

Hoist a really redundant code pattern into a helper function, and delete · 3eacfb83
Chandler Carruth authored Jan 11, 2012
```
lots of lines of code. No functionality changed.

llvm-svn: 147942
```
3eacfb83
Simplify the AND-rooted mask+shift checking code to match that of the · b0049f4a
Chandler Carruth authored Jan 11, 2012
```
SRL-rooted code.

llvm-svn: 147941
```
b0049f4a

Unify the interface of the three mask+shift transform helpers, and · 3dbcda84

Chandler Carruth authored Jan 11, 2012

factor the differences that were hiding in one of them into its other
caller, the SRL handling code. No change in behavior.

llvm-svn: 147940

3dbcda84

Clarify and make explicit some of the requirements for transforming · aa01e666

Chandler Carruth authored Jan 11, 2012

mask+shift pairs at the beginning of the ISD::AND case block, and then
hoist the final pattern into a helper function, simplifying and
reflowing it appropriately. This should have no observable behavior
change, but several simplifications fell out of this such as directly
computing the new mask constant, etc.

llvm-svn: 147939

aa01e666

Hoist the logic to transform shift+mask combinations into sub-register · 51d3076b

Chandler Carruth authored Jan 11, 2012

extracts and scaled addressing modes into its own helper function. No
functionality changed here, just hoisting and layout fixes falling out
of that hoisting.

llvm-svn: 147937

51d3076b

Teach the X86 instruction selection to do some heroic transforms to · 55b2cdee

Chandler Carruth authored Jan 11, 2012

detect a pattern which can be implemented with a small 'shl' embedded in
the addressing mode scale. This happens in real code as follows:

  unsigned x = my_accelerator_table[input >> 11];

Here we have some lookup table that we look into using the high bits of
'input'. Each entity in the table is 4-bytes, which means this
implicitly gets turned into (once lowered out of a GEP):

  *(unsigned*)((char*)my_accelerator_table + ((input >> 11) << 2));

The shift right followed by a shift left is canonicalized to a smaller
shift right and masking off the low bits. That hides the shift right
which x86 has an addressing mode designed to support. We now detect
masks of this form, and produce the longer shift right followed by the
proper addressing mode. In addition to saving a (rather large)
instruction, this also reduces stalls in Intel chips on benchmarks I've
measured.

In order for all of this to work, one part of the DAG needs to be
canonicalized *still further* than it currently is. This involves
removing pointless 'trunc' nodes between a zextload and a zext. Without
that, we end up generating spurious masks and hiding the pattern.

llvm-svn: 147936

55b2cdee
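
The arithmetic behind the r147936 message can be checked directly. The sketch below uses hypothetical function names and 32-bit unsigned values: `gepOffset` is the byte offset as written after GEP lowering, `canonicalOffset` is the smaller-shift-plus-mask form the message says it is canonicalized to, and `scaledOffset` is the form the transform recovers, a shift right by 11 whose result is multiplied by the element size in the addressing-mode scale.

```cpp
#include <cassert>
#include <cstdint>

// Byte offset as written once lowered out of a GEP: (input >> 11) << 2.
uint32_t gepOffset(uint32_t input) { return (input >> 11) << 2; }

// Canonical form: a smaller shift right, then mask off the low two bits.
uint32_t canonicalOffset(uint32_t input) {
  return (input >> 9) & ~uint32_t{3};
}

// Recovered form: the longer shift right produces the index, and the
// multiply by 4 corresponds to a scale of 4 in the addressing mode.
uint32_t scaledOffset(uint32_t input) {
  const uint32_t index = input >> 11;
  return index * 4;
}

// Convenience check that all three forms agree for one input.
bool formsAgree(uint32_t input) {
  return gepOffset(input) == canonicalOffset(input) &&
         canonicalOffset(input) == scaledOffset(input);
}
```

All three agree for every 32-bit input because `input >> 11` has at most 21 significant bits, so neither the left shift by 2 nor the multiply by 4 can overflow.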

Jan 09, 2012

Don't rely on the fact that shift values are never very large, and thus · c16622da

Chandler Carruth authored Jan 09, 2012

this subtraction will result in small negative numbers at worst which
become very large positive numbers on assignment and are thus caught by
the <=4 check on the next line. The >0 check was clearly intended to catch
these as negative numbers.

Spotted by inspection, and impossible to trigger given the shift widths
that can be used.

llvm-svn: 147773

c16622da
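
A small sketch of the wraparound the message describes. The constants `kBase` and `kMaxScaleLog` and the function names are hypothetical; the message only mentions a subtraction, a `>0` check, and a `<=4` check. With an unsigned result, a "negative" difference wraps to a huge positive value and is rejected only by the upper bound; with a signed result, the `>0` check does the work it appears to have been written for.

```cpp
#include <cassert>
#include <limits>

// Hypothetical constants chosen for illustration only.
constexpr unsigned kBase = 8;
constexpr unsigned kMaxScaleLog = 4;

// Unsigned subtraction: wraps modulo 2^N when shiftAmount > kBase.
unsigned wrappedDifference(unsigned shiftAmount) { return kBase - shiftAmount; }

// Relies on wraparound: out-of-range values fail the upper bound, not > 0.
bool acceptsUnsigned(unsigned shiftAmount) {
  const unsigned scaleLog = kBase - shiftAmount;
  return scaleLog > 0 && scaleLog <= kMaxScaleLog;
}

// Signed version: negative differences are rejected by the > 0 check itself.
bool acceptsSigned(unsigned shiftAmount) {
  assert(shiftAmount <= 64 && "shift widths are small in practice");
  const int scaleLog = static_cast<int>(kBase) - static_cast<int>(shiftAmount);
  return scaleLog > 0 && scaleLog <= static_cast<int>(kMaxScaleLog);
}
```

Both versions accept and reject the same inputs, which matches the message's note that the problem was impossible to trigger; the signed form simply does not depend on wraparound to get there.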

Nov 16, 2011
- Added missing comment about new custom lowering of DEC64 · 48784ed5
  Pete Cooper authored Nov 16, 2011
```
llvm-svn: 144811
```
  48784ed5
Nov 15, 2011
- Added custom lowering for load->dec->store sequence in x86 when the EFLAGS register is used · 7c7ba1ba
  Pete Cooper authored Nov 15, 2011
```
by later instructions.

Only done for DEC64m right now.

Fixes <rdar://problem/6172640>

llvm-svn: 144705
```
  7c7ba1ba
Nov 03, 2011

Reapply r143206, with fixes. Disallow physical register lifetimes · 198b7ffc

Dan Gohman authored Nov 03, 2011

across calls, and only check for nested dependences on the special
call-sequence-resource register.

llvm-svn: 143660

198b7ffc

Oct 29, 2011
- Revert r143206, as there are still some failing tests. · 9b9c9701
  Dan Gohman authored Oct 29, 2011
```
llvm-svn: 143262
```
  9b9c9701
Oct 28, 2011

Reapply r143177 and r143179 (reverting r143188), with scheduler · 73057ad2

Dan Gohman authored Oct 28, 2011

fixes: Use a separate register, instead of SP, as the
calling-convention resource, to avoid spurious conflicts with
actual uses of SP. Also, fix unscheduling of calling sequences,
which can be triggered by pseudo-two-address dependencies.

llvm-svn: 143206

73057ad2

Speculatively disable Dan's commits 143177 and 143179 to see if · 225a7037

Duncan Sands authored Oct 28, 2011

it fixes the dragonegg self-host (it looks like gcc is miscompiled).
Original commit messages:
Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW
on every node as it legalizes them. This makes it easier to use
hasOneUse() heuristics, since unneeded nodes can be removed from the
DAG earlier.

Make LegalizeOps visit the DAG in an operands-last order. It previously
used operands-first, because LegalizeTypes has to go operands-first, and
LegalizeTypes used to be part of LegalizeOps, but they're now split.
The operands-last order is more natural for several legalization tasks.
For example, it allows lowering code for nodes with floating-point or
vector constants to see those constants directly instead of seeing the
lowered form (often constant-pool loads). This makes some things
somewhat more complicated today, though it ought to allow things to be
simpler in the future. It also fixes some bugs exposed by Legalizing
using RAUW aggressively.

Remove the part of LegalizeOps that attempted to patch up invalid chain
operands on libcalls generated by LegalizeTypes, since it doesn't work
with the new LegalizeOps traversal order. Instead, define what
LegalizeTypes is doing to be correct, and transfer the responsibility
of keeping calls from having overlapping calling sequences into the
scheduler.

Teach the scheduler to model callseq_begin/end pairs as having a
physical register definition/use to prevent calls from having
overlapping calling sequences. This is also somewhat complicated, though
there are ways it might be simplified in the future.

This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others.
Please direct high-level questions about this patch to management.

Delete #if 0 code accidentally left in.

llvm-svn: 143188

225a7037

Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW · 4db3f7dd

Dan Gohman authored Oct 28, 2011

on every node as it legalizes them. This makes it easier to use
hasOneUse() heuristics, since unneeded nodes can be removed from the
DAG earlier.

Make LegalizeOps visit the DAG in an operands-last order. It previously
used operands-first, because LegalizeTypes has to go operands-first, and
LegalizeTypes used to be part of LegalizeOps, but they're now split.
The operands-last order is more natural for several legalization tasks.
For example, it allows lowering code for nodes with floating-point or
vector constants to see those constants directly instead of seeing the
lowered form (often constant-pool loads). This makes some things
somewhat more complicated today, though it ought to allow things to be
simpler in the future. It also fixes some bugs exposed by Legalizing
using RAUW aggressively.

Remove the part of LegalizeOps that attempted to patch up invalid chain
operands on libcalls generated by LegalizeTypes, since it doesn't work
with the new LegalizeOps traversal order. Instead, define what
LegalizeTypes is doing to be correct, and transfer the responsibility
of keeping calls from having overlapping calling sequences into the
scheduler.

Teach the scheduler to model callseq_begin/end pairs as having a
physical register definition/use to prevent calls from having
overlapping calling sequences. This is also somewhat complicated, though
there are ways it might be simplified in the future.

This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others.
Please direct high-level questions about this patch to management.

llvm-svn: 143177

4db3f7dd

Oct 08, 2011

Add TEST8ri_NOREX pseudo to constrain sub_8bit_hi copies. · 729abd36

Jakob Stoklund Olesen authored Oct 08, 2011

In 64-bit mode, sub_8bit_hi sub-registers can only be used by NOREX
instructions. The COPY created from the EXTRACT_SUBREG DAG node cannot
target all GR8 registers, only those in GR8_NOREX.

To enforce this, we ensure that all instructions using the
EXTRACT_SUBREG are GR8_NOREX constrained.

This fixes PR11088.

llvm-svn: 141499

729abd36

Aug 01, 2011
- Teach PreprocessISelDAG to be aware of vector types and to not process them. · 616fe605
  Bruno Cardoso Lopes authored Aug 01, 2011
```
llvm-svn: 136653
```
  616fe605
Jul 13, 2011

Make sure we don't combine a large displacement and a frame index in the same... · 344ec797

Eli Friedman authored Jul 13, 2011

Make sure we don't combine a large displacement and a frame index in the same addressing mode on x86-64.  It can overflow, leading to a crash/miscompile.

<rdar://problem/9763308>

llvm-svn: 135084

344ec797
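
One conservative way to rule out this kind of overflow is sketched below. It rests on an assumption that is not stated in the message: that the frame offset eventually substituted for the frame index itself fits in 31 signed bits. Under that assumption, restricting the explicit displacement to 31 signed bits guarantees the sum still fits the 32-bit displacement field. The function names are hypothetical and this is not presented as the code the commit added.

```cpp
#include <cassert>
#include <cstdint>

// True when v is representable as a two's-complement integer of `bits` bits.
bool fitsSigned(int64_t v, unsigned bits) {
  assert(bits >= 1 && bits <= 63 && "bit width out of range for this sketch");
  const int64_t lo = -(int64_t{1} << (bits - 1));
  const int64_t hi = (int64_t{1} << (bits - 1)) - 1;
  return v >= lo && v <= hi;
}

// Conservative check: only allow displacements that leave one bit of
// headroom for the (assumed 31-bit) frame offset added later.
bool dispSafeWithFrameIndex(int64_t disp) { return fitsSigned(disp, 31); }
```

Two values in the signed 31-bit range sum to at most `2^31 - 2` and at least `-2^31`, both of which fit in a signed 32-bit field.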

Refactor out checking for displacements on x86-64 addressing modes. No... · ef67e7d6

Eli Friedman authored Jul 13, 2011

Refactor out checking for displacements on x86-64 addressing modes. No functionality change. Refactoring in preparation for an additional safety check in FoldOffsetIntoAddress.

Part of <rdar://problem/9763308>.

llvm-svn: 135079

ef67e7d6

Jul 02, 2011
- TargetConstant immediates won't be placed into registers so tighten · a8a56f7e
  Eric Christopher authored Jul 01, 2011
```
up the valid constant check earlier.

rdar://9692967

llvm-svn: 134286
```
  a8a56f7e
Jun 30, 2011
- Fix a small thinko for constant i64 lock/orq optimization where we · c9321737
  Eric Christopher authored Jun 30, 2011
```
didn't have an opcode for 64-bit constant or expressions.

Fixes rdar://9692967

llvm-svn: 134121
```
  c9321737
May 20, 2011
- Re-commit 131641 with fixes; de-pseudoize MOVSX16rr8 and friends. · 91f1d247
  Stuart Hastings authored May 20, 2011
```
rdar://problem/8614450

llvm-svn: 131746
```
  91f1d247