Commits · 7dee697faa5361fc953909b8d4547f2d08af5cab · Roger Ferrer / llvm-epi-0.8

Jul 28, 2013
- Added encoding prefixes for KNL instructions (EVEX). · 003e7d73
  Elena Demikhovsky authored Jul 28, 2013
```
Added 512-bit operands printing.
Added instruction formats for KNL instructions.

llvm-svn: 187324
```
  003e7d73
May 07, 2013
- Re-enable AVX detection on x64 platforms. · 1a0c91f7
  Michael Kuperstein authored May 07, 2013
```
llvm-svn: 181313
```
  1a0c91f7
May 03, 2013
- Unbreaking the non-x86 build bots by protecting the AVX test code properly. · cc958f00
  Aaron Ballman authored May 03, 2013
```
llvm-svn: 180992
```
  cc958f00
- Correctly testing for AVX support in x86 based off code from Hosts.cpp. · 63fe0148
  Aaron Ballman authored May 03, 2013
```
llvm-svn: 180991
```
  63fe0148
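
The commit above does not spell out the test it performs. As a rough sketch of the usual x86 convention (not the code from this commit or from Host.cpp), AVX is generally treated as usable only when CPUID leaf 1 reports both the AVX and OSXSAVE bits in ECX and the operating system has enabled XMM and YMM state in XCR0. The helper name `avxUsable` and its parameters are hypothetical; it takes raw register values so the logic can be exercised without executing CPUID or XGETBV.

```cpp
#include <cstdint>

// Hypothetical helper, not LLVM's implementation.
//   cpuid1Ecx: ECX as returned by CPUID leaf 1
//   xcr0:      low 32 bits of XCR0 as returned by XGETBV with ECX = 0
//              (real code must only execute XGETBV when OSXSAVE is set)
bool avxUsable(uint32_t cpuid1Ecx, uint32_t xcr0) {
  const uint32_t kOsxsave = 1u << 27;   // OS has enabled XSAVE/XGETBV
  const uint32_t kAvx = 1u << 28;       // CPU implements AVX
  const uint32_t kXmmYmmState = 0x6;    // XCR0 bits 1 (XMM) and 2 (YMM)

  // Both the hardware feature and OSXSAVE must be reported.
  if ((cpuid1Ecx & (kOsxsave | kAvx)) != (kOsxsave | kAvx))
    return false;

  // The OS must save and restore both XMM and YMM registers.
  return (xcr0 & kXmmYmmState) == kXmmYmmState;
}
```

Checking the CPUID feature bit alone is not sufficient under this convention, because an OS that does not preserve YMM state would corrupt registers across context switches.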
Apr 25, 2013

This patch adds the X86FixupLEAs pass, which will reduce instruction · 8b7ab4ba

Preston Gurd authored Apr 25, 2013

latency for certain models of the Intel Atom family, by converting
instructions into their equivalent LEA instructions, when it is both
useful and possible to do so.

llvm-svn: 180573

8b7ab4ba

Apr 03, 2013
- Formatting. · e2fbc67e
  Eric Christopher authored Apr 02, 2013
```
llvm-svn: 178589
```
  e2fbc67e
Mar 29, 2013
- Add support of RDSEED defined in AVX2 extension · a486a11d
  Michael Liao authored Mar 28, 2013
```
llvm-svn: 178314
```
  a486a11d
Mar 28, 2013
- Add ADX CPUID detection · c93fe7f8
  Michael Liao authored Mar 28, 2013
```
llvm-svn: 178299
```
  c93fe7f8
Mar 27, 2013

· 663e6f95

Preston Gurd authored Mar 27, 2013

For the current Atom processor, the fastest way to handle a call
indirect through a memory address is to load the memory address into
a register and then call indirect through the register.

This patch implements this improvement by modifying SelectionDAG to
force a function address which is a memory reference to be loaded
into a virtual register.

Patch by Sriram Murali.

llvm-svn: 178171

663e6f95

Mar 26, 2013
- Add HLE target feature · e344ec91
  Michael Liao authored Mar 26, 2013
```
llvm-svn: 178082
```
  e344ec91
- Add PREFETCHW codegen support · 5173ee03
  Michael Liao authored Mar 26, 2013
```
- Add 'PRFCHW' feature defined in AVX2 ISA extension

llvm-svn: 178040
```
  5173ee03
Feb 27, 2013
- Revert r176166 because it broke one of the lit tests. · 08ab877c
  Nadav Rotem authored Feb 27, 2013
```
llvm-svn: 176171
```
  08ab877c
- std::string to StringRef. · 85e1211f
  Nadav Rotem authored Feb 27, 2013
```
llvm-svn: 176166
```
  85e1211f
Feb 16, 2013
- Reinitialize the ivars in the subtarget so that they can be reset with the new features. · 61375d89
  Bill Wendling authored Feb 16, 2013
```
llvm-svn: 175336
```
  61375d89
- Temporary revert of 175320. · e9434778
  Bill Wendling authored Feb 15, 2013
```
llvm-svn: 175322
```
  e9434778
- Reinitialize the ivars in the subtarget. · a060d0ef
  Bill Wendling authored Feb 15, 2013
```
When we're recalculating the feature set of the subtarget, we need to have the
ivars in their initial state.

llvm-svn: 175320
```
  a060d0ef
Feb 15, 2013

Use the 'target-features' and 'target-cpu' attributes to reset the subtarget features. · aef9c37c

Bill Wendling authored Feb 15, 2013

If two functions require different features (e.g., `-mno-sse' vs. `-msse') then
we want to honor that, especially during LTO. We can do that by resetting the
subtarget's features depending upon the 'target-features' attribute.

llvm-svn: 175314

aef9c37c

Feb 14, 2013
- added basic support for Intel ADX instructions · f809c649
  Kay Tiong Khoo authored Feb 14, 2013
```
- feature flag, instruction definitions, test cases

llvm-svn: 175196
```
  f809c649
Jan 30, 2013
- Restrict sin/cos optimization to 64-bit only for now. 32-bit is a bit messy and less critical. · d2ca4e2e
  Evan Cheng authored Jan 30, 2013
```
llvm-svn: 173987
```
  d2ca4e2e
Jan 29, 2013

Teach SDISel to combine fsin / fcos into a fsincos node if the following · 0e88c7d8

Evan Cheng authored Jan 29, 2013

conditions are met:
1. They share the same operand and are in the same BB.
2. Both outputs are used.
3. The target has a native instruction that maps to ISD::FSINCOS node or
   the target provides a sincos library call.

Implemented the generic optimization in sdisel and enabled it for
Mac OSX. Also added an additional optimization for x86_64 Mac OSX by
using an alternative entry point __sincos_stret which returns the two
results in xmm0 / xmm1.

rdar://13087969
PR13204

llvm-svn: 173755

0e88c7d8
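
The conditions listed above describe a source pattern like the following. This is only an illustrative sketch: the struct and function names are hypothetical, and whether a given compiler actually merges the two calls depends on the target and optimization level.

```cpp
#include <cmath>

// Hypothetical example type holding both results.
struct SinCosResult {
  double sine;
  double cosine;
};

// Both calls take the same operand, sit in the same basic block, and both
// results are used, so the pair is a candidate for a single combined
// sincos operation where the target provides one.
SinCosResult sinCosOf(double angle) {
  return {std::sin(angle), std::cos(angle)};
}
```

If only one of the two results were used, or the calls were separated by control flow, the conditions in the commit message would not hold and the calls would stay separate.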

Jan 08, 2013

Pad Short Functions for Intel Atom · a01daace

Preston Gurd authored Jan 08, 2013

The current Intel Atom microarchitecture has a feature whereby
when a function returns early then it is slightly faster to execute
a sequence of NOP instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction until
the return address is ready.

When compiling for X86 Atom only, this patch will run a pass,
called "X86PadShortFunction" which will add NOP instructions where less
than four cycles elapse between function entry and return.

It includes tests.

This patch has been updated to address Nadav's review comments
- Optimize only at >= O1 and don't do optimization if -Os is set
- Stores MachineBasicBlock* instead of BBNum
- Uses DenseMap instead of std::map
- Fixes placement of braces

Patch by Andy Zhang.

llvm-svn: 171879

a01daace

Jan 05, 2013

Revert revision 171524. Original message: · 478b6a47

Nadav Rotem authored Jan 05, 2013

URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev
Log:
The current Intel Atom microarchitecture has a feature whereby when a function
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.

When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.

It includes tests.

Patch by Andy Zhang.

llvm-svn: 171603

478b6a47

Jan 04, 2013

The current Intel Atom microarchitecture has a feature whereby when a function · e36b685a

Preston Gurd authored Jan 04, 2013

returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.

When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.

It includes tests.

Patch by Andy Zhang.

llvm-svn: 171524

e36b685a

Jan 02, 2013

Move all of the header files which are involved in modelling the LLVM IR · 9fb823bb

Chandler Carruth authored Jan 02, 2013

into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.

There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.

The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.

I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).

I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.

llvm-svn: 171366

9fb823bb

Dec 10, 2012

- Fix a typo in my previous commit -- bloomfield is 0x1A not 0x2A. · 17f25c4e
  Chandler Carruth authored Dec 10, 2012
```
Thanks to the PaX folks for noticing in review! We need some tests here,
any suggestions welcome...

llvm-svn: 169739
```
17f25c4e
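
For context on the model numbers in this fix, here is a minimal sketch of how a family and model are conventionally decoded from the EAX value of CPUID leaf 1. It is not the code from this commit; `decodeFamilyModel` and `FamilyModel` are hypothetical names, and the two signature values in the usage note are illustrative inputs.

```cpp
#include <cstdint>

// Hypothetical result type for the decoded CPU signature.
struct FamilyModel {
  unsigned family;
  unsigned model;
};

// Decodes the CPUID leaf 1 EAX signature using the standard x86 layout:
// bits 7:4 model, 11:8 family, 19:16 extended model, 27:20 extended family.
FamilyModel decodeFamilyModel(uint32_t eax) {
  unsigned family = (eax >> 8) & 0xF;
  unsigned model = (eax >> 4) & 0xF;
  if (family == 6 || family == 0xF) {
    if (family == 0xF)
      family += (eax >> 20) & 0xFF;     // extended family applies only to 0xF
    model += ((eax >> 16) & 0xF) << 4;  // extended model forms the high nibble
  }
  return {family, model};
}
```

With this decoding, a signature of 0x000106A5 yields family 6, model 0x1A, while 0x000206A7 yields family 6, model 0x2A, which shows how a one-digit slip selects a different chip.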

Address a FIXME and update the fast unaligned memory feature for newer · 0f585581

Chandler Carruth authored Dec 10, 2012

Intel chips.

The model number rules were determined by inspecting Intel's
documentation for their newer chip model numbers. My understanding is
that all of the newer Intel chips have fast unaligned memory access, but
if anyone is concerned about a particular chip, just shout.

No tests updated; it's not clear we have dedicated tests for the chips'
various features, but if anyone would like tests (or can point me at
some existing ones), I'm happy to oblige.

llvm-svn: 169730

0f585581

Dec 03, 2012

Use the new script to sort the includes of every file under lib. · ed0881b2

Chandler Carruth authored Dec 03, 2012

Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131

ed0881b2

Nov 09, 2012
- Switch FreeBSD/i386 back to 4byte stack alignment. This partially · 22135678
  Roman Divacky authored Nov 09, 2012
```
reverts r126226.

llvm-svn: 167632
```
  22135678
Nov 08, 2012

Add support of RTM from TSX extension · 73cffddb

Michael Liao authored Nov 08, 2012

- Add RTM code generation support through 3 X86 intrinsics:
  xbegin()/xend() to start/end a transaction region, and xabort() to abort a
  transaction region

llvm-svn: 167573

73cffddb

Oct 08, 2012
- misched: remove the unused getSpecialAddressLatency hook. · 07dced62
  Andrew Trick authored Oct 08, 2012
```
llvm-svn: 165418
```
  07dced62
Oct 03, 2012

Set up MCSchedModel after detecting the CPU type in X86SubTarget. · 35fcb54c

Preston Gurd authored Oct 03, 2012

Corrects a problem whereby MCSchedModel was not being set up when
the CPU type was auto-detected.

Patch by Andy Zhang.

llvm-svn: 165122

35fcb54c

Sep 04, 2012

Generic Bypass Slow Div · cdf540d5

Preston Gurd authored Sep 04, 2012

- CodeGenPrepare pass for identifying div/rem ops
- Backend specifies the type mapping using addBypassSlowDivType
- Enabled only for Intel Atom with O2 32-bit -> 8-bit
- Replace IDIV with instructions which test its value and use DIVB if the value
is positive and less than 256.
- In the case when the quotient and remainder of a divide are used a DIV
and a REM instruction will be present in the IR. In the non-Atom case
they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
using the quotient and remainder from the first IDIV. However,
due to this optimization CSE is not able to eliminate redundant
IDIV instructions because they are located in different basic blocks.
This is overcome by calculating both the quotient (DIV) and remainder (REM)
in each basic block that is inserted by the optimization and reusing the result
values when a subsequent DIV or REM instruction uses the same operands.
- Test cases check for the presence of the optimization when calculating
either the quotient, remainder, or both.

Patch by Tyler Nowicki!

llvm-svn: 163150

cdf540d5
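
The idea described above can be sketched at the source level as follows. This is an assumption-laden illustration, not the CodeGenPrepare pass itself: `divRemBypass` is a hypothetical helper, and it models the 32-bit to 8-bit case by testing whether both operands fit in eight unsigned bits before choosing the narrow divide. It computes quotient and remainder together, mirroring the commit's point that both values are produced in each inserted block.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper: returns {quotient, remainder}.
// Preconditions: divisor != 0, and not (dividend == INT32_MIN && divisor == -1).
std::pair<int32_t, int32_t> divRemBypass(int32_t dividend, int32_t divisor) {
  // OR the operands together: if no bit above the low 8 is set, both values
  // are non-negative and less than 256, so an 8-bit divide gives the same
  // result as the 32-bit one.
  const uint32_t combined =
      static_cast<uint32_t>(dividend) | static_cast<uint32_t>(divisor);
  if ((combined & ~0xFFu) == 0) {
    const uint8_t a = static_cast<uint8_t>(dividend);
    const uint8_t b = static_cast<uint8_t>(divisor);
    // Fast path: stands in for the narrow DIVB instruction.
    return {static_cast<int32_t>(a / b), static_cast<int32_t>(a % b)};
  }
  // Slow path: stands in for the full-width IDIV.
  return {dividend / divisor, dividend % divisor};
}
```

In C++ the narrow operands are promoted to `int` before dividing, so this sketch only shows the control flow; the real benefit comes from the backend selecting the shorter-latency instruction on the fast path.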

Aug 13, 2012
- X86: when auto-detecting the subtarget features, make sure to use IsIntel to detect · e90e94f1
  Manman Ren authored Aug 13, 2012
```
Nehalem, Westmere and Sandy Bridge. AMD also has processor family 6.

llvm-svn: 161763
```
  e90e94f1
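
Because family 6 is shared between vendors, a vendor check has to accompany the family and model test. A minimal sketch of such a check, assuming the register values come from CPUID leaf 0, is shown below. `cpuVendor` and `isIntelVendor` are hypothetical names and do not reproduce the `IsIntel` logic in X86Subtarget.cpp.

```cpp
#include <cstdint>
#include <initializer_list>
#include <string>

// Hypothetical helper: rebuilds the 12-character vendor string from the
// EBX, EDX, ECX values of CPUID leaf 0 (in that order), reading each
// register from its least significant byte upward.
std::string cpuVendor(uint32_t ebx, uint32_t edx, uint32_t ecx) {
  std::string vendor;
  for (uint32_t reg : {ebx, edx, ecx})
    for (int shift = 0; shift < 32; shift += 8)
      vendor.push_back(static_cast<char>((reg >> shift) & 0xFF));
  return vendor;
}

// True only for the Intel vendor string.
bool isIntelVendor(uint32_t ebx, uint32_t edx, uint32_t ecx) {
  return cpuVendor(ebx, edx, ecx) == "GenuineIntel";
}
```

Using shifts rather than copying raw memory keeps the helper independent of the byte order of the machine it runs on.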
Aug 11, 2012

X86: when we are auto-detecting the subtarget features, make sure we turn on · 1acb6707

Manman Ren authored Aug 10, 2012

FeatureFastUAMem for Nehalem, Westmere and Sandy Bridge.

FeatureFastUAMem is already on if we pass in nehalem or westmere as a command
argument.

rdar: 7252306
llvm-svn: 161717

1acb6707

Aug 07, 2012

Allow x86 subtargets to use the GenericModel defined in X86Schedule.td. · e0c83b1f

Andrew Trick authored Aug 07, 2012

This allows codegen passes to query properties like
InstrItins->SchedModel->IssueWidth. It also ensures that
computeOperandLatency returns the X86 defaults for loads and "high
latency ops". This should have no significant impact on existing
schedulers because X86 defaults happen to be the same as global
defaults.

llvm-svn: 161370

e0c83b1f

Aug 01, 2012
- Whitespace. · 24c19d20
  Chad Rosier authored Aug 01, 2012
```
llvm-svn: 161122
```
  24c19d20
Jul 19, 2012
- Adds the family codes for the Midview Atom processors so that the · 8e082688
  Preston Gurd authored Jul 19, 2012
```
Atom buildbot will auto-detect Atom.

llvm-svn: 160521
```
  8e082688
Jul 18, 2012

This patch fixes 8 out of 20 unexpected failures in "make check" · f0a48ec8

Preston Gurd authored Jul 18, 2012

when run on an Intel Atom processor. The failures have arisen due
to changes elsewhere in the trunk over the past 8 weeks or so.

These failures were not detected by the Atom buildbot because the
CPU on the Atom buildbot was not being detected as an Atom CPU.
The fix for this problem is in Host.cpp and X86Subtarget.cpp, but
shall remain commented out until the current set of Atom test failures
are fixed.

Patch by Andy Zhang and Tyler Nowicki!

llvm-svn: 160451

f0a48ec8

Jun 03, 2012
- Rename FMA3 feature flag to just FMA to match gcc so it can be added to clang. · 79dbb0c6
  Craig Topper authored Jun 03, 2012
```
llvm-svn: 157903
```
  79dbb0c6
Jun 01, 2012
- Enable automatic detection of FMA3 support to allow intrinsics to be used. · 1d4d62d7
  Craig Topper authored Jun 01, 2012
```
llvm-svn: 157805
```
  1d4d62d7