  1. Jul 28, 2013
  2. Jul 24, 2013
  3. May 03, 2013
  4. Apr 25, 2013
  5. Apr 19, 2013
  6. Mar 29, 2013
  7. Mar 28, 2013
  8. Mar 27, 2013
    • Preston Gurd authored · 663e6f95
      For the current Atom processor, the fastest way to handle a call
      indirect through a memory address is to load the memory address into
      a register and then call indirect through the register.
      
      This patch implements this improvement by modifying SelectionDAG to
      force a function address which is a memory reference to be loaded
      into a virtual register.
      
      Patch by Sriram Murali.
      
      llvm-svn: 178171
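The two code shapes the commit contrasts can be illustrated at the C level. This is only a sketch (the function names are hypothetical, and C source does not dictate the exact instructions a backend emits): calling through a pointer first copied into a local mirrors the load-then-`call *%reg` sequence the patch forces, versus the single memory-operand form `call *(addr)`.

```c
#include <assert.h>

static int add_one(int x) { return x + 1; }

/* A function pointer held in memory, e.g. a global table slot. */
static int (*fn_in_memory)(int) = add_one;

int call_via_register(int x) {
    /* Loading the pointer into a local first corresponds to the
     * codegen the patch forces for Atom:
     *     mov  fn_in_memory(%rip), %reg
     *     call *%reg
     * rather than the memory-indirect form  call *fn_in_memory(%rip). */
    int (*fn)(int) = fn_in_memory;
    return fn(x);
}
```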
  9. Mar 26, 2013
  10. Feb 14, 2013
  11. Jan 08, 2013
    • Pad Short Functions for Intel Atom · a01daace
      Preston Gurd authored
      The current Intel Atom microarchitecture has a feature whereby,
      when a function returns early, it is slightly faster to execute
      a sequence of NOP instructions to wait until the return address is ready,
      as opposed to simply stalling on the ret instruction until
      the return address is ready.
      
      When compiling for X86 Atom only, this patch will run a pass,
      called "X86PadShortFunction" which will add NOP instructions where less
      than four cycles elapse between function entry and return.
      
      It includes tests.
      
      This patch has been updated to address Nadav's review comments
      - Optimize only at >= O1 and don't do optimization if -Os is set
      - Stores MachineBasicBlock* instead of BBNum
      - Uses DenseMap instead of std::map
      - Fixes placement of braces
      
      Patch by Andy Zhang.
      
      llvm-svn: 171879
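The padding rule the message describes can be modeled in a few lines. This is an illustrative sketch only, with a hypothetical helper name; the real X86PadShortFunction pass walks MachineBasicBlocks and estimates cycle counts from instruction latencies.

```c
#include <assert.h>

/* Illustrative model of the padding rule: given the estimated cycle
 * count from function entry to a ret, return how many one-cycle NOPs
 * to insert so that at least four cycles elapse before the ret.
 * (Hypothetical helper; not the pass's actual code.) */
static unsigned nops_needed(unsigned cycles_to_ret) {
    const unsigned threshold = 4; /* from the commit message */
    return cycles_to_ret < threshold ? threshold - cycles_to_ret : 0;
}
```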
  12. Jan 05, 2013
    • Revert revision 171524. Original message: · 478b6a47
      Nadav Rotem authored
      URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev
      Log:
      The current Intel Atom microarchitecture has a feature whereby when a function
      returns early then it is slightly faster to execute a sequence of NOP
      instructions to wait until the return address is ready,
      as opposed to simply stalling on the ret instruction
      until the return address is ready.
      
      When compiling for X86 Atom only, this patch will run a pass, called
      "X86PadShortFunction" which will add NOP instructions where less than four
      cycles elapse between function entry and return.
      
      It includes tests.
      
      Patch by Andy Zhang.
      
      llvm-svn: 171603
  13. Jan 04, 2013
    • The current Intel Atom microarchitecture has a feature whereby when a function · e36b685a
      Preston Gurd authored
      returns early then it is slightly faster to execute a sequence of NOP
      instructions to wait until the return address is ready,
      as opposed to simply stalling on the ret instruction
      until the return address is ready.
      
      When compiling for X86 Atom only, this patch will run a pass, called
      "X86PadShortFunction" which will add NOP instructions where less than four
      cycles elapse between function entry and return.
      
      It includes tests.
      
      Patch by Andy Zhang.
      
      llvm-svn: 171524
  14. Dec 15, 2012
    • Make '-mtune=x86_64' assume fast unaligned memory accesses. · 7a28f954
      Chandler Carruth authored
      Not all chips targeted by x86_64 have this feature, but a dramatically
      increasing number do. Specifying a chip-specific tuning parameter will
      continue to turn the feature on or off as appropriate for that
      particular chip, but the generic flag should try to achieve the best
      performance on the most widely available hardware. Today, the number of
      chips with fast UA access dwarfs those without in the x86-64 space.
      
      Note that this also brings LLVM's code generation for this '-march' flag
      more in line with that of modern GCCs. Reviewed by Dan Gohman.
      
      llvm-svn: 170269
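The feature being toggled affects how code like the following is lowered. A minimal sketch, assuming a hypothetical helper name: on subtargets with fast unaligned access, the `memcpy` below can be lowered to a single (possibly unaligned) `mov`, rather than a sequence of narrower aligned loads.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value at an arbitrary (possibly unaligned) byte
 * address. The memcpy is the portable way to express this; on chips
 * with fast unaligned memory access the backend can emit one mov. */
static uint32_t load_u32_unaligned(const unsigned char *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```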
  15. Dec 10, 2012
    • Revert "Make '-mtune=x86_64' assume fast unaligned memory accesses." · 867c7bff
      Chandler Carruth authored
      Accidental commit... git svn betrayed me. Sorry for the noise.
      
      llvm-svn: 169741
    • Make '-mtune=x86_64' assume fast unaligned memory accesses. · 7eaa45c7
      Chandler Carruth authored
      Summary:
      Not all chips targeted by x86_64 have this feature, but a dramatically
      increasing number do. Specifying a chip-specific tuning parameter will
      continue to turn the feature on or off as appropriate for that
      particular chip, but the generic flag should try to achieve the best
      performance on the most widely available hardware. Today, the number of
      chips with fast UA access dwarfs those without in the x86-64 space.
      
      Note that this also brings LLVM's code generation for this '-march' flag
      more in line with that of modern GCCs.
      
      CC: llvm-commits
      
      Differential Revision: http://llvm-reviews.chandlerc.com/D195
      
      llvm-svn: 169740
    • Address a FIXME and update the fast unaligned memory feature for newer Intel chips. · 0f585581
      Chandler Carruth authored
      
      The model number rules were determined by inspecting Intel's
      documentation for their newer chip model numbers. My understanding is
      that all of the newer Intel chips have fast unaligned memory access, but
      if anyone is concerned about a particular chip, just shout.
      
      No tests updated; it's not clear we have dedicated tests for the chips'
      various features, but if anyone would like tests (or can point me at
      some existing ones), I'm happy to oblige.
      
      llvm-svn: 169730
  16. Nov 08, 2012
    • Add support of RTM from TSX extension · 73cffddb
      Michael Liao authored
      - Add RTM code generation support through 3 X86 intrinsics:
        xbegin()/xend() to start/end a transaction region, and xabort() to abort a
        transaction region
      
      llvm-svn: 167573
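A typical use of the three intrinsics from C looks like the sketch below. This is illustrative only: the transactional path is guarded by the compiler's `__RTM__` macro (defined when building with RTM support, e.g. `-mrtm`), with a plain fallback so the example also builds and runs without TSX hardware.

```c
#include <assert.h>
#if defined(__RTM__)
#include <immintrin.h>   /* _xbegin, _xend, _XBEGIN_STARTED */
#endif

static int counter = 0;

/* Increment the counter inside an RTM transaction when available,
 * falling back to a plain increment otherwise. Illustrative sketch. */
static void increment_transactionally(void) {
#if defined(__RTM__)
    unsigned status = _xbegin();          /* start transaction region */
    if (status == _XBEGIN_STARTED) {
        counter++;                        /* transactional work */
        _xend();                          /* commit the region */
        return;
    }
    /* Transaction aborted (hardware abort or _xabort): fall through. */
#endif
    counter++;                            /* non-transactional fallback */
}
```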
  17. Oct 25, 2012
  18. Oct 03, 2012
  19. Sep 12, 2012
  20. Sep 04, 2012
    • Generic Bypass Slow Div · cdf540d5
      Preston Gurd authored
      - CodeGenPrepare pass for identifying div/rem ops
      - Backend specifies the type mapping using addBypassSlowDivType
      - Enabled only for Intel Atom with O2 32-bit -> 8-bit
      - Replace IDIV with instructions which test its value and use DIVB if the value
      is positive and less than 256.
      - In the case when the quotient and remainder of a divide are used a DIV
      and a REM instruction will be present in the IR. In the non-Atom case
      they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
      using the quotient and remainder from the first IDIV. However,
      due to this optimization CSE is not able to eliminate redundant
      IDIV instructions because they are located in different basic blocks.
      This is overcome by calculating both the quotient (DIV) and remainder (REM)
      in each basic block that is inserted by the optimization and reusing the result
      values when a subsequent DIV or REM instruction uses the same operands.
      - Test cases check for the presence of the optimization when calculating
      either the quotient, remainder, or both.
      
      Patch by Tyler Nowicki!
      
      llvm-svn: 163150
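The transform the commit describes can be sketched directly in C (illustrative only, with a hypothetical function name; the real pass rewrites IR in CodeGenPrepare): when both 32-bit operands fit in 8 bits, the cheap 8-bit divide (DIVB on x86) replaces the much slower 32-bit divide on Atom.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the bypass-slow-division idea: test the operands and use
 * the 8-bit divide when both values are less than 256, otherwise fall
 * back to the general-purpose 32-bit divide. */
static uint32_t div_bypass(uint32_t a, uint32_t b) {
    if ((a | b) < 256u) {
        /* Both operands fit in one byte: take the fast 8-bit path,
         * which lowers to DIVB on x86. */
        return (uint32_t)((uint8_t)a / (uint8_t)b);
    }
    return a / b;   /* slow general-purpose divide (IDIV/DIV) */
}
```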
  21. Aug 16, 2012
  22. Jul 07, 2012
    • I'm introducing a new machine model to simultaneously allow simple · 87255e34
      Andrew Trick authored
      subtarget CPU descriptions and support new features of
      MachineScheduler.
      
      MachineModel has three categories of data:
      1) Basic properties for coarse grained instruction cost model.
      2) Scheduler Read/Write resources for simple per-opcode and operand cost model (TBD).
      3) Instruction itineraries for detailed per-cycle reservation tables.
      
      These will all live side-by-side. Any subtarget can use any
      combination of them. Instruction itineraries will not change in the
      near term. In the long run, I expect them to only be relevant for
      in-order VLIW machines that have complex constraints and require a
      precise scheduling/bundling model. Once itineraries are only actively
      used by VLIW-ish targets, they could be replaced by something more
      appropriate for those targets.
      
      This tablegen backend rewrite sets things up for introducing
      MachineModel type #2: per opcode/operand cost model.
      
      llvm-svn: 159891
  23. Jun 03, 2012
  24. May 31, 2012
  25. May 01, 2012
  26. Apr 26, 2012
  27. Feb 18, 2012
  28. Feb 07, 2012
  29. Feb 02, 2012
    • Instruction scheduling itinerary for Intel Atom. · 8523b16f
      Andrew Trick authored
      Adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT.
      
      Sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches.
      
      Adds a test to verify that the scheduler is working.
      
      Also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP.
      
      Patch by Preston Gurd!
      
      llvm-svn: 149558
  30. Jan 12, 2012
  31. Jan 10, 2012