complement" operation, which is the "~" operator in C.
Arguments:
""""""""""
The two arguments to the '``xor``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.
Semantics:
""""""""""
The truth table used for the '``xor``' instruction is:
+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
| 0 | 0 | 0 |
+-----+-----+-----+
| 0 | 1 | 1 |
+-----+-----+-----+
| 1 | 0 | 1 |
+-----+-----+-----+
| 1 | 1 | 0 |
+-----+-----+-----+
Example:
""""""""
.. code-block:: llvm
<result> = xor i32 4, %var ; yields {i32}:result = 4 ^ %var
<result> = xor i32 15, 40 ; yields {i32}:result = 39
<result> = xor i32 4, 8 ; yields {i32}:result = 12
<result> = xor i32 %V, -1 ; yields {i32}:result = ~%V
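
The constant results and the complement identity in the examples above can
be checked with a small C model. This is only an illustrative sketch; the
function names ``xor_i32`` and ``not_i32`` are hypothetical and are not part
of LLVM.

.. code-block:: c

   #include <stdint.h>

   /* Models 'xor i32 %a, %b' on 32-bit values. */
   uint32_t xor_i32(uint32_t a, uint32_t b)
   {
       return a ^ b;
   }

   /* Models 'xor i32 %V, -1': xor with all-ones flips every bit,
    * which is what the C '~' operator does. */
   uint32_t not_i32(uint32_t v)
   {
       return xor_i32(v, UINT32_MAX);
   }
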
Vector Operations
-----------------
LLVM supports several instructions to represent vector operations in a
target-independent manner. These instructions cover the element-access
and vector-specific operations needed to process vectors effectively.
While LLVM does directly support these vector operations, many
sophisticated algorithms will want to use target-specific intrinsics to
take full advantage of a specific target.
.. _i_extractelement:
'``extractelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
::
<result> = extractelement <n x <ty>> <val>, i32 <idx> ; yields <ty>
Overview:
"""""""""
The '``extractelement``' instruction extracts a single scalar element
from a vector at a specified index.
Arguments:
""""""""""
The first operand of an '``extractelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is an index indicating
the position from which to extract the element. The index may be a
variable.
Semantics:
""""""""""
The result is a scalar of the same type as the element type of ``val``.
Its value is the value at position ``idx`` of ``val``. If ``idx``
exceeds the length of ``val``, the results are undefined.
Example:
""""""""
.. code-block:: llvm
<result> = extractelement <4 x i32> %vec, i32 0 ; yields i32
.. _i_insertelement:
'``insertelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
::
<result> = insertelement <n x <ty>> <val>, <ty> <elt>, i32 <idx> ; yields <n x <ty>>
Overview:
"""""""""
The '``insertelement``' instruction inserts a scalar element into a
vector at a specified index.
Arguments:
""""""""""
The first operand of an '``insertelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is a scalar value whose
type must equal the element type of the first operand. The third operand
is an index indicating the position at which to insert the value. The
index may be a variable.
Semantics:
""""""""""
The result is a vector of the same type as ``val``. Its element values
are those of ``val`` except at position ``idx``, where it gets the value
``elt``. If ``idx`` exceeds the length of ``val``, the results are
undefined.
Example:
""""""""
.. code-block:: llvm
<result> = insertelement <4 x i32> %vec, i32 1, i32 0 ; yields <4 x i32>
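
As a rough C analogy, ``extractelement`` and ``insertelement`` behave like
functions that take a small fixed-size array by value. The sketch below
assumes a ``<4 x i32>`` vector; the type ``v4i32`` and both function names
are hypothetical. The caller is expected to keep ``idx`` below 4, since the
out-of-range case is undefined.

.. code-block:: c

   #include <stdint.h>

   /* Stands in for the LLVM type <4 x i32>. */
   struct v4i32 { int32_t e[4]; };

   /* Models 'extractelement <4 x i32> %vec, i32 %idx'.
    * idx must be less than 4. */
   int32_t extract_element(struct v4i32 vec, uint32_t idx)
   {
       return vec.e[idx];
   }

   /* Models 'insertelement <4 x i32> %vec, i32 %elt, i32 %idx'.
    * The vector is passed by value, so the caller's copy is unchanged
    * and a new vector is returned. idx must be less than 4. */
   struct v4i32 insert_element(struct v4i32 vec, int32_t elt, uint32_t idx)
   {
       vec.e[idx] = elt;
       return vec;
   }
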
.. _i_shufflevector:
'``shufflevector``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
::
<result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask> ; yields <m x <ty>>
Overview:
"""""""""
The '``shufflevector``' instruction constructs a permutation of elements
from two input vectors, returning a vector with the same element type as
the input and length that is the same as the shuffle mask.
Arguments:
""""""""""
The first two operands of a '``shufflevector``' instruction are vectors
with the same type. The third argument is a shuffle mask whose element
type is always 'i32'. The result of the instruction is a vector whose
length is the same as the shuffle mask and whose element type is the
same as the element type of the first two operands.
The shuffle mask operand is required to be a constant vector with either
constant integer or undef values.
Semantics:
""""""""""
The elements of the two input vectors are numbered from left to right
across both of the vectors. The shuffle mask operand specifies, for each
element of the result vector, which element of the two input vectors the
result element gets. The element selector may be undef (meaning "don't
care") and the second operand may be undef if performing a shuffle from
only one vector.
Example:
""""""""
.. code-block:: llvm
<result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
<4 x i32> <i32 0, i32 4, i32 1, i32 5> ; yields <4 x i32>
<result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
<4 x i32> <i32 0, i32 1, i32 2, i32 3> ; yields <4 x i32> - Identity shuffle.
<result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
<4 x i32> <i32 0, i32 1, i32 2, i32 3> ; yields <4 x i32>
<result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
<8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7> ; yields <8 x i32>
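
The element numbering described above can be modeled in C. The following is
a minimal sketch for ``i32`` elements, not LLVM's implementation: the helper
name ``shuffle_i32`` is hypothetical, and a negative selector is an assumed
stand-in for an ``undef`` mask element.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /* v1 and v2 each hold n elements; mask holds m selectors.
    * Selectors 0..n-1 pick from v1 and selectors n..2n-1 pick from v2,
    * matching the left-to-right numbering across both inputs.
    * A negative selector stands in for undef; this sketch writes 0
    * there, although any value would be acceptable. */
   void shuffle_i32(const int32_t *v1, const int32_t *v2, size_t n,
                    const int32_t *mask, size_t m, int32_t *out)
   {
       for (size_t i = 0; i < m; i++) {
           int32_t sel = mask[i];
           if (sel < 0)
               out[i] = 0;
           else if ((size_t)sel < n)
               out[i] = v1[sel];
           else
               out[i] = v2[(size_t)sel - n];
       }
   }

With the mask ``<i32 0, i32 4, i32 1, i32 5>`` from the first example, the
result interleaves the first two elements of each input.
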
Aggregate Operations
--------------------
LLVM supports several instructions for working with
:ref:`aggregate <t_aggregate>` values.
.. _i_extractvalue:
'``extractvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
::
<result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*
Overview:
"""""""""
The '``extractvalue``' instruction extracts the value of a member field
from an :ref:`aggregate <t_aggregate>` value.
Arguments:
""""""""""
The first operand of an '``extractvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands
are constant indices that specify which value to extract, in a similar
manner to indices in a '``getelementptr``' instruction.
The major differences from ``getelementptr`` indexing are:
- Since the value being indexed is not a pointer, the first index is
omitted and assumed to be zero.
- At least one index must be specified.
- Not only struct indices but also array indices must be in bounds.
Semantics:
""""""""""
The result is the value at the position in the aggregate specified by
the index operands.
Example:
""""""""
.. code-block:: llvm
<result> = extractvalue {i32, float} %agg, 0 ; yields i32
.. _i_insertvalue:
'``insertvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
::
<result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}* ; yields <aggregate type>
Overview:
"""""""""
The '``insertvalue``' instruction inserts a value into a member field in
an :ref:`aggregate <t_aggregate>` value.
Arguments:
""""""""""
The first operand of an '``insertvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
a first-class value to insert. The following operands are constant
indices indicating the position at which to insert the value in a
similar manner as indices in an '``extractvalue``' instruction. The value
to insert must have the same type as the value identified by the
indices.
Semantics:
""""""""""
The result is an aggregate of the same type as ``val``. Its value is
that of ``val`` except that the value at the position specified by the
indices is that of ``elt``.
Example:
""""""""
.. code-block:: llvm
%agg1 = insertvalue {i32, float} undef, i32 1, 0 ; yields {i32 1, float undef}
%agg2 = insertvalue {i32, float} %agg1, float %val, 1 ; yields {i32 1, float %val}
%agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0 ; yields {i32 undef, {float %val}}
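
Because aggregates are SSA values rather than memory, ``insertvalue``
produces a new aggregate and leaves its operand untouched. A C struct passed
and returned by value gives a similar effect. This is a sketch only; the
struct ``pair`` and the function names are hypothetical.

.. code-block:: c

   /* Stands in for the LLVM type {i32, float}. */
   struct pair { int first; float second; };

   /* Models 'insertvalue {i32, float} %agg, float %elt, 1'.
    * agg is a copy, so the caller's aggregate is not modified. */
   struct pair insert_second(struct pair agg, float elt)
   {
       agg.second = elt;
       return agg;
   }

   /* Models 'extractvalue {i32, float} %agg, 0'. */
   int extract_first(struct pair agg)
   {
       return agg.first;
   }
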
.. _memoryops:
Memory Access and Addressing Operations
---------------------------------------
A key design point of an SSA-based representation is how it represents
memory. In LLVM, no memory locations are in SSA form, which makes things
very simple. This section describes how to read, write, and allocate
memory in LLVM.
.. _i_alloca:
'``alloca``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
::
<result> = alloca <type>[, <ty> <NumElements>][, align <alignment>] ; yields {type*}:result
Overview:
"""""""""
The '``alloca``' instruction allocates memory on the stack frame of the
currently executing function, to be automatically released when this
function returns to its caller. The object is always allocated in the
generic address space (address space zero).
Arguments:
""""""""""
The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
bytes of memory on the runtime stack, returning a pointer of the
appropriate type to the program. If "NumElements" is specified, it is
the number of elements allocated; otherwise "NumElements" defaults
to one. If a constant alignment is specified, the result of the
allocation is guaranteed to be aligned to at least that boundary. If not
specified, or if zero, the target can choose to align the allocation on
any convenient boundary compatible with the type.
'``type``' may be any sized type.
Semantics:
""""""""""
Memory is allocated; a pointer is returned. The operation is undefined
if there is insufficient stack space for the allocation. '``alloca``'d
memory is automatically released when the function returns. The
'``alloca``' instruction is commonly used to represent automatic
variables that must have an address available. When the function returns
(either with the ``ret`` or ``resume`` instructions), the memory is
reclaimed. Allocating zero bytes is legal, but the result is undefined.
The order in which memory is allocated (i.e., which way the stack grows)
is not specified.
Example:
""""""""
.. code-block:: llvm
%ptr = alloca i32 ; yields {i32*}:ptr
%ptr = alloca i32, i32 4 ; yields {i32*}:ptr
%ptr = alloca i32, i32 4, align 1024 ; yields {i32*}:ptr
%ptr = alloca i32, align 1024 ; yields {i32*}:ptr
.. _i_load:
'``load``' Instruction
^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
::
<result> = load [volatile] <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.load !<index>]
<result> = load atomic [volatile] <ty>* <pointer> [singlethread] <ordering>, align <alignment>
!<index> = !{ i32 1 }
Overview:
"""""""""
The '``load``' instruction is used to read from memory.
Arguments:
""""""""""
The argument to the '``load``' instruction specifies the memory address
from which to load. The pointer must point to a :ref:`first
class <t_firstclass>` type. If the ``load`` is marked as ``volatile``,
then the optimizer is not allowed to modify the number or order of
execution of this ``load`` with other :ref:`volatile
operations <volatile>`.
If the ``load`` is marked as ``atomic``, it takes an extra
:ref:`ordering <ordering>` and optional ``singlethread`` argument. The
``release`` and ``acq_rel`` orderings are not valid on ``load``
instructions. Atomic loads produce :ref:`defined <memmodel>` results
when they may see multiple atomic stores. The type of the pointee must
be an integer type whose bit width is a power of two greater than or
equal to eight and less than or equal to a target-specific size limit.
``align`` must be explicitly specified on atomic loads, and the load has
undefined behavior if the alignment is not set to a value which is at
least the size in bytes of the pointee. ``!nontemporal`` does not have
any defined semantics for atomic loads.
The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). A value of 0
or an omitted ``align`` argument means that the operation has the ABI
alignment for the target. It is the responsibility of the code emitter
to ensure that the alignment information is correct. Overestimating the
alignment results in undefined behavior. Underestimating the alignment
may produce less efficient code. An alignment of 1 is always safe.
The optional ``!nontemporal`` metadata must reference a single
metadata name ``<index>`` corresponding to a metadata node with one
``i32`` entry of value 1. The existence of the ``!nontemporal``
metadata on the instruction tells the optimizer and code generator
that this load is not expected to be reused in the cache. The code
generator may select special instructions to save cache bandwidth, such
as the ``MOVNT`` instruction on x86.
The optional ``!invariant.load`` metadata must reference a single
metadata name ``<index>`` corresponding to a metadata node with no
entries. The existence of the ``!invariant.load`` metadata on the
instruction tells the optimizer and code generator that this load
address points to memory which does not change value during program
execution. The optimizer may then move this load around, for example, by
hoisting it out of loops using loop invariant code motion.
Semantics:
""""""""""
The location of memory pointed to is loaded. If the value being loaded
is of scalar type then the number of bytes read does not exceed the
minimum number of bytes needed to hold all bits of the type. For
example, loading an ``i24`` reads at most three bytes. When loading a
value of a type like ``i20`` with a size that is not an integral number
of bytes, the result is undefined if the value was not originally
written using a store of the same type.
Examples:
"""""""""
.. code-block:: llvm
%ptr = alloca i32 ; yields {i32*}:ptr
store i32 3, i32* %ptr ; yields {void}
%val = load i32* %ptr ; yields {i32}:val = i32 3
.. _i_store:
'``store``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
::
store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>] ; yields {void}
store atomic [volatile] <ty> <value>, <ty>* <pointer> [singlethread] <ordering>, align <alignment> ; yields {void}
Overview:
"""""""""
The '``store``' instruction is used to write to memory.
Arguments:
""""""""""
There are two arguments to the '``store``' instruction: a value to store
and an address at which to store it. The type of the '``<pointer>``'
operand must be a pointer to the :ref:`first class <t_firstclass>` type of
the '``<value>``' operand. If the ``store`` is marked as ``volatile``,
then the optimizer is not allowed to modify the number or order of
execution of this ``store`` with other :ref:`volatile
operations <volatile>`.
If the ``store`` is marked as ``atomic``, it takes an extra
:ref:`ordering <ordering>` and optional ``singlethread`` argument. The
``acquire`` and ``acq_rel`` orderings aren't valid on ``store``
instructions. Atomic loads produce :ref:`defined <memmodel>` results
when they may see multiple atomic stores. The type of the pointee must
be an integer type whose bit width is a power of two greater than or
equal to eight and less than or equal to a target-specific size limit.
``align`` must be explicitly specified on atomic stores, and the store
has undefined behavior if the alignment is not set to a value which is
at least the size in bytes of the pointee. ``!nontemporal`` does not
have any defined semantics for atomic stores.
The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). A value of 0
or an omitted ``align`` argument means that the operation has the ABI
alignment for the target. It is the responsibility of the code emitter
to ensure that the alignment information is correct. Overestimating the
alignment results in undefined behavior. Underestimating the
alignment may produce less efficient code. An alignment of 1 is always
safe.
The optional ``!nontemporal`` metadata must reference a single metadata
name ``<index>`` corresponding to a metadata node with one ``i32`` entry of
value 1. The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that this store is not expected to
be reused in the cache. The code generator may select special
instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
x86.
Semantics:
""""""""""
The contents of memory are updated to contain '``<value>``' at the
location specified by the '``<pointer>``' operand. If '``<value>``' is
of scalar type then the number of bytes written does not exceed the
minimum number of bytes needed to hold all bits of the type. For
example, storing an ``i24`` writes at most three bytes. When writing a
value of a type like ``i20`` with a size that is not an integral number
of bytes, it is unspecified what happens to the extra bits that do not
belong to the type, but they will typically be overwritten.
Example:
""""""""
.. code-block:: llvm
%ptr = alloca i32 ; yields {i32*}:ptr
store i32 3, i32* %ptr ; yields {void}
%val = load i32* %ptr ; yields {i32}:val = i32 3
.. _i_fence:
'``fence``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
::
fence [singlethread] <ordering> ; yields {void}
Overview:
"""""""""
The '``fence``' instruction is used to introduce happens-before edges
between operations.
Arguments:
""""""""""
'``fence``' instructions take an :ref:`ordering <ordering>` argument which
defines what *synchronizes-with* edges they add. They can only be given
``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.
Semantics:
""""""""""
A fence A which has (at least) ``release`` ordering semantics
*synchronizes with* a fence B with (at least) ``acquire`` ordering
semantics if and only if there exist atomic operations X and Y, both
operating on some atomic object M, such that A is sequenced before X, X
modifies M (either directly or through some side effect of a sequence
headed by X), Y is sequenced before B, and Y observes M. This provides a
*happens-before* dependency between A and B. Rather than an explicit
``fence``, one (but not both) of the atomic operations X or Y might
provide a ``release`` or ``acquire`` (resp.) ordering constraint and
still *synchronize-with* the explicit ``fence`` and establish the
*happens-before* edge.
A ``fence`` which has ``seq_cst`` ordering, in addition to having both
``acquire`` and ``release`` semantics specified above, participates in
the global program order of other ``seq_cst`` operations and/or fences.
The optional ":ref:`singlethread <singlethread>`" argument specifies
that the fence only synchronizes with other fences in the same thread.
(This is useful for interacting with signal handlers.)
Example:
""""""""
.. code-block:: llvm
fence acquire ; yields {void}
fence singlethread seq_cst ; yields {void}
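
The fence-to-fence synchronization described above corresponds closely to
the C11 fence model. The sketch below labels the fences A and B and the
atomic operations X and Y as in the Semantics text. It is illustrative only:
the names ``payload``, ``ready``, ``publish`` and ``try_consume`` are
hypothetical, and the mapping assumes a compiler that lowers
``atomic_thread_fence`` to a ``fence`` instruction, as Clang typically does.

.. code-block:: c

   #include <stdatomic.h>
   #include <stdbool.h>

   static int payload;        /* ordinary non-atomic data */
   static atomic_bool ready;  /* the atomic object M */

   /* Writer side: fence A is sequenced before the atomic store X. */
   void publish(int value)
   {
       payload = value;
       atomic_thread_fence(memory_order_release);                  /* A */
       atomic_store_explicit(&ready, true, memory_order_relaxed);  /* X */
   }

   /* Reader side: the atomic load Y is sequenced before fence B.
    * Returns true and fills *out only once the flag has been observed. */
   bool try_consume(int *out)
   {
       if (!atomic_load_explicit(&ready, memory_order_relaxed))    /* Y */
           return false;
       atomic_thread_fence(memory_order_acquire);                  /* B */
       *out = payload;
       return true;
   }

If ``try_consume`` observes the flag set by ``publish``, fence A synchronizes
with fence B, so the read of ``payload`` sees the published value. In C11 the
closest counterpart of a ``singlethread`` fence is ``atomic_signal_fence``.
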
.. _i_cmpxchg:
'``cmpxchg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
::
cmpxchg [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [singlethread] <ordering> ; yields {ty}
Overview:
"""""""""
The '``cmpxchg``' instruction is used to atomically modify memory. It
loads a value in memory and compares it to a given value. If they are
equal, it stores a new value into the memory.
Arguments:
""""""""""
There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
address, and a new value to place at that address if the compared values
are equal. The type of '<cmp>' must be an integer type whose bit width
is a power of two greater than or equal to eight and less than or equal
to a target-specific size limit. '<cmp>' and '<new>' must have the same
type, and the type of '<pointer>' must be a pointer to that type. If the
``cmpxchg`` is marked as ``volatile``, then the optimizer is not allowed
to modify the number or order of execution of this ``cmpxchg`` with
other :ref:`volatile operations <volatile>`.
The :ref:`ordering <ordering>` argument specifies how this ``cmpxchg``
synchronizes with other atomic operations.
The optional "``singlethread``" argument declares that the ``cmpxchg``
is only atomic with respect to code (usually signal handlers) running in
the same thread as the ``cmpxchg``. Otherwise the ``cmpxchg`` is atomic
with respect to all other code in the system.
The pointer passed into ``cmpxchg`` must have alignment greater than or
equal to the size in memory of the operand.
Semantics:
""""""""""
The contents of memory at the location specified by the '``<pointer>``'
operand are read and compared to '``<cmp>``'; if the read value is
equal, '``<new>``' is written. The original value at the location is
returned.
A successful ``cmpxchg`` is a read-modify-write instruction for the purpose
of identifying release sequences. A failed ``cmpxchg`` is equivalent to an
atomic load with an ordering parameter determined by dropping any
``release`` part of the ``cmpxchg``'s ordering.
Example:
""""""""
.. code-block:: llvm
entry:
%orig = load atomic i32* %ptr unordered, align 4 ; yields {i32}
br label %loop
loop:
%cmp = phi i32 [ %orig, %entry ], [%old, %loop]
%squared = mul i32 %cmp, %cmp
%old = cmpxchg i32* %ptr, i32 %cmp, i32 %squared seq_cst ; yields {i32}
%success = icmp eq i32 %cmp, %old
br i1 %success, label %done, label %loop
done:
...
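
For comparison, the same retry loop can be written with C11 atomics. This is
a sketch under the assumption that the compiler lowers
``atomic_compare_exchange_strong`` to a ``cmpxchg`` instruction; the function
name ``atomic_square`` is hypothetical.

.. code-block:: c

   #include <stdatomic.h>
   #include <stdint.h>

   /* Atomically replaces *ptr with its square and returns the value
    * that was squared. */
   uint32_t atomic_square(_Atomic uint32_t *ptr)
   {
       uint32_t expected = atomic_load_explicit(ptr, memory_order_relaxed);

       /* On failure, 'expected' is overwritten with the value found in
        * memory, which plays the role of the phi node taking %old. */
       while (!atomic_compare_exchange_strong(ptr, &expected,
                                              expected * expected))
           ;
       return expected;
   }
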
.. _i_atomicrmw:
'``atomicrmw``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
::
atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [singlethread] <ordering> ; yields {ty}
Overview:
"""""""""
The '``atomicrmw``' instruction is used to atomically modify memory.
Arguments:
""""""""""
There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to
the operation. The operation must be one of the following keywords:
- xchg
- add
- sub
- and
- nand
- or
- xor
- max
- min
- umax
- umin
The type of '<value>' must be an integer type whose bit width is a power
of two greater than or equal to eight and less than or equal to a
target-specific size limit. The type of the '``<pointer>``' operand must
be a pointer to that type. If the ``atomicrmw`` is marked as
``volatile``, then the optimizer is not allowed to modify the number or
order of execution of this ``atomicrmw`` with other :ref:`volatile
operations <volatile>`.
Semantics:
""""""""""
The contents of memory at the location specified by the '``<pointer>``'
operand are atomically read, modified, and written back. The original
value at the location is returned. The modification is specified by the
operation argument:
- xchg: ``*ptr = val``
- add: ``*ptr = *ptr + val``
- sub: ``*ptr = *ptr - val``
- and: ``*ptr = *ptr & val``
- nand: ``*ptr = ~(*ptr & val)``
- or: ``*ptr = *ptr | val``
- xor: ``*ptr = *ptr ^ val``
- max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
- min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
- umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned
comparison)
- umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned
comparison)
Example:
""""""""
.. code-block:: llvm
%old = atomicrmw add i32* %ptr, i32 1 acquire ; yields {i32}
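
Some of these operations have direct C11 counterparts (``add`` matches
``atomic_fetch_add``), while others such as ``nand`` and ``max`` do not. The
sketch below models the latter with a compare-exchange loop purely to
illustrate the semantics; it says nothing about how a target actually lowers
``atomicrmw``. All function names are hypothetical.

.. code-block:: c

   #include <stdatomic.h>
   #include <stdint.h>

   /* add: *ptr = *ptr + val, returning the original value. */
   int32_t rmw_add(_Atomic int32_t *ptr, int32_t val)
   {
       return atomic_fetch_add(ptr, val);
   }

   /* nand: *ptr = ~(*ptr & val), returning the original value.
    * The loop retries until no other thread has changed *ptr between
    * the read and the write. */
   int32_t rmw_nand(_Atomic int32_t *ptr, int32_t val)
   {
       int32_t old = atomic_load(ptr);
       while (!atomic_compare_exchange_weak(ptr, &old, ~(old & val)))
           ;
       return old;
   }

   /* max: *ptr = *ptr > val ? *ptr : val, using a signed comparison. */
   int32_t rmw_max(_Atomic int32_t *ptr, int32_t val)
   {
       int32_t old = atomic_load(ptr);
       while (!atomic_compare_exchange_weak(ptr, &old,
                                            old > val ? old : val))
           ;
       return old;
   }
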
.. _i_getelementptr:
'``getelementptr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
::
<result> = getelementptr <pty>* <ptrval>{, <ty> <idx>}*
<result> = getelementptr inbounds <pty>* <ptrval>{, <ty> <idx>}*
<result> = getelementptr <ptr vector> ptrval, <vector index type> idx
Overview:
"""""""""
The '``getelementptr``' instruction is used to get the address of a
subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
address calculation only and does not access memory.
Arguments:
""""""""""
The first argument is always a pointer or a vector of pointers, and
forms the basis of the calculation. The remaining arguments are indices
that indicate which of the elements of the aggregate object are indexed.
The interpretation of each index is dependent on the type being indexed
into. The first index always indexes the pointer value given as the
first argument, the second index indexes a value of the type pointed to
(not necessarily the value directly pointed to, since the first index
can be non-zero), etc. The first type indexed into must be a pointer
value, subsequent types can be arrays, vectors, and structs. Note that
subsequent types being indexed into can never be pointers, since that
would require loading the pointer before continuing calculation.
The type of each index argument depends on the type it is indexing into.
When indexing into a (optionally packed) structure, only ``i32`` integer
**constants** are allowed (when using a vector of indices they must all
be the **same** ``i32`` integer constant). When indexing into an array,
pointer or vector, integers of any width are allowed, and they are not
required to be constant. These integers are treated as signed values
where relevant.
For example, let's consider a C code fragment and how it gets compiled
to LLVM:
.. code-block:: c
struct RT {
char A;
int B[10][20];
char C;
};
struct ST {
int X;
double Y;
struct RT Z;
};
int *foo(struct ST *s) {
return &s[1].Z.B[5][13];
}
The LLVM code generated by Clang is:
.. code-block:: llvm
%struct.RT = type { i8, [10 x [20 x i32]], i8 }
%struct.ST = type { i32, double, %struct.RT }
define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp {
entry:
%arrayidx = getelementptr inbounds %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13
ret i32* %arrayidx
}
Semantics:
""""""""""
In the example above, the first index is indexing into the
'``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
= '``{ i32, double, %struct.RT }``' type, a structure. The second index
indexes into the third element of the structure, yielding a
'``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
structure. The third index indexes into the second element of the
structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
dimensions of the array are subscripted into, yielding an '``i32``'
type. The '``getelementptr``' instruction returns a pointer to this
element, thus computing a value of '``i32*``' type.
Note that it is perfectly legal to index partially through a structure,
returning a pointer to an inner element. Because of this, the LLVM code
for the given testcase is equivalent to:
.. code-block:: llvm
define i32* @foo(%struct.ST* %s) {
%t1 = getelementptr %struct.ST* %s, i32 1 ; yields %struct.ST*:%t1
%t2 = getelementptr %struct.ST* %t1, i32 0, i32 2 ; yields %struct.RT*:%t2
%t3 = getelementptr %struct.RT* %t2, i32 0, i32 1 ; yields [10 x [20 x i32]]*:%t3
%t4 = getelementptr [10 x [20 x i32]]* %t3, i32 0, i32 5 ; yields [20 x i32]*:%t4
%t5 = getelementptr [20 x i32]* %t4, i32 0, i32 13 ; yields i32*:%t5
ret i32* %t5
}
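
The address arithmetic implied by the five indices can also be written out
in C with ``sizeof`` and ``offsetof``. This is a sketch that reuses the
``struct RT`` and ``struct ST`` definitions from the C fragment above; the
helper ``foo_offset`` is hypothetical and only restates the computation.

.. code-block:: c

   #include <stddef.h>

   struct RT { char A; int B[10][20]; char C; };
   struct ST { int X; double Y; struct RT Z; };

   /* Byte offset computed by
    * getelementptr %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13 */
   size_t foo_offset(void)
   {
       return 1 * sizeof(struct ST)     /* i64 1: step over one struct ST */
            + offsetof(struct ST, Z)    /* i32 2: third field of ST */
            + offsetof(struct RT, B)    /* i32 1: second field of RT */
            + 5 * sizeof(int[20])       /* i64 5: row 5 of the outer array */
            + 13 * sizeof(int);         /* i64 13: element 13 of that row */
   }

   /* The original C function, for comparison. */
   int *foo(struct ST *s)
   {
       return &s[1].Z.B[5][13];
   }

The pointer returned by ``foo`` lies exactly ``foo_offset()`` bytes past
``s``, and computing it performs no memory access.
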
If the ``inbounds`` keyword is present, the result value of the
``getelementptr`` is a :ref:`poison value <poisonvalues>` if the base
pointer is not an *in bounds* address of an allocated object, or if any
of the addresses that would be formed by successive addition of the
offsets implied by the indices to the base address with infinitely
precise signed arithmetic are not an *in bounds* address of that
allocated object. The *in bounds* addresses for an allocated object are
all the addresses that point into the object, plus the address one byte
past the end. In cases where the base is a vector of pointers the
``inbounds`` keyword applies to each of the computations element-wise.
If the ``inbounds`` keyword is not present, the offsets are added to the
base address with silently-wrapping two's complement arithmetic. If the
offsets have a different width from the pointer, they are sign-extended
or truncated to the width of the pointer. The result value of the
``getelementptr`` may be outside the object pointed to by the base
pointer. However, the result value may not necessarily be usable to access
memory, even if it happens to point into allocated storage. See the
:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
information.
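The address arithmetic used when ``inbounds`` is absent can be sketched as follows. This is an illustrative model only: the function ``gep_add`` and its ``ptr_bits`` parameter are hypothetical, addresses are treated as plain integers, and the offset is assumed to fit in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Adds a signed offset to a base address with wrapping two's complement
 * arithmetic at the given pointer width. Passing a narrower signed offset
 * (e.g. int16_t) sign-extends it; masking truncates wider offsets. */
uint64_t gep_add(uint64_t base, int64_t offset, unsigned ptr_bits) {
    assert(ptr_bits >= 1 && ptr_bits <= 64);
    uint64_t mask = ptr_bits == 64 ? UINT64_MAX
                                   : (UINT64_C(1) << ptr_bits) - 1;
    /* Unsigned addition wraps modulo 2^64; the mask reduces it to ptr_bits. */
    return (base + (uint64_t)offset) & mask;
}
```

With a 32-bit pointer, adding 1 to ``0xFFFFFFFF`` wraps to 0 in this model, and an ``i64`` offset of ``0x100000004`` is truncated to 4 before it takes effect.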
The ``getelementptr`` instruction is often confusing. For some more insight
into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.
Example:
""""""""
.. code-block:: llvm
; yields [12 x i8]*:aptr
%aptr = getelementptr {i32, [12 x i8]}* %saptr, i64 0, i32 1
; yields i8*:vptr
%vptr = getelementptr {i32, <2 x i8>}* %svptr, i64 0, i32 1, i32 1
; yields i8*:eptr
%eptr = getelementptr [12 x i8]* %aptr, i64 0, i32 1
; yields i32*:iptr
%iptr = getelementptr [10 x i32]* @arr, i16 0, i16 0
In cases where the pointer argument is a vector of pointers, each index
must be a vector with the same number of elements. For example:
.. code-block:: llvm
%A = getelementptr <4 x i8*> %ptrs, <4 x i64> %offsets
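Element-wise, the vector form behaves like four independent scalar address computations. The following sketch models that with C arrays standing in for the ``<4 x i8*>`` and ``<4 x i64>`` vectors; the function name ``gep_v4`` is an assumption, and the model only covers the single-index case shown above.

```c
#include <assert.h>
#include <stdint.h>

/* Lane i of the result is ptrs[i] advanced by offsets[i] elements of i8. */
void gep_v4(uint8_t *const ptrs[4], const int64_t offsets[4], uint8_t *out[4]) {
    assert(ptrs && offsets && out);
    for (int lane = 0; lane < 4; ++lane) {
        out[lane] = ptrs[lane] + offsets[lane];
    }
}
```

In C, each resulting pointer must stay within its array or one past the end, which corresponds to the ``inbounds`` flavor of the instruction.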
Conversion Operations
---------------------
The instructions in this category are the conversion instructions
(casting) which all take a single operand and a type. They perform
various bit conversions on the operand.
'``trunc .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
::
<result> = trunc <ty> <value> to <ty2> ; yields ty2
Overview:
"""""""""
The '``trunc``' instruction truncates its operand to the type ``ty2``.
Arguments:
""""""""""
The '``trunc``' instruction takes a value to truncate, and a type to truncate
it to. Both types must be :ref:`integer <t_integer>` types, or vectors
of the same number of integers. The bit size of the ``value`` must be
larger than the bit size of the destination type, ``ty2``. Equal sized
types are not allowed.
Semantics:
""""""""""
The '``trunc``' instruction truncates the high order bits in ``value``
and converts the remaining bits to ``ty2``. Since the source size must
be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
It will always truncate bits.
Example:
""""""""
.. code-block:: llvm
%X = trunc i32 257 to i8 ; yields i8:1
%Y = trunc i32 123 to i1 ; yields i1:true
%Z = trunc i32 122 to i1 ; yields i1:false
%W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>
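The bit-level effect of ``trunc`` can be modelled in C by masking off the high-order bits. This is a sketch for scalar integers of up to 64 bits; the helper ``trunc_bits`` and its width parameters are hypothetical and stand in for the source and destination types.

```c
#include <assert.h>
#include <stdint.h>

/* Keeps the low dst_bits bits of value and discards the rest. */
uint64_t trunc_bits(uint64_t value, unsigned src_bits, unsigned dst_bits) {
    /* trunc requires a strictly smaller destination type. */
    assert(dst_bits >= 1 && dst_bits < src_bits && src_bits <= 64);
    return value & ((UINT64_C(1) << dst_bits) - 1);
}
```

Applied to the scalar examples above, the model maps 257 to 1 for ``i32`` to ``i8``, and maps 123 and 122 to 1 and 0 for ``i32`` to ``i1``.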
'``zext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
::
<result> = zext <ty> <value> to <ty2> ; yields ty2
Overview:
"""""""""
The '``zext``' instruction zero extends its operand to type ``ty2``.
Arguments:
""""""""""
The '``zext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
the same number of integers. The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.
Semantics:
""""""""""
The '``zext``' instruction fills the high order bits of the ``value`` with
zero bits until it reaches the size of the destination type, ``ty2``.
When zero extending from ``i1``, the result will always be either 0 or 1.
Example:
""""""""
.. code-block:: llvm
%X = zext i32 257 to i64 ; yields i64:257
%Y = zext i1 true to i32 ; yields i32:1
%Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
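A C model of ``zext`` simply keeps the source bits and leaves every higher bit zero. As before, this is a sketch for scalar integers of up to 64 bits, and the helper name ``zext_bits`` is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Interprets the low src_bits bits of value as unsigned and widens them. */
uint64_t zext_bits(uint64_t value, unsigned src_bits, unsigned dst_bits) {
    /* zext requires a strictly larger destination type. */
    assert(src_bits >= 1 && src_bits < dst_bits && dst_bits <= 64);
    return value & ((UINT64_C(1) << src_bits) - 1);
}
```

In this model an ``i8`` holding the bit pattern ``0xFF`` zero extends to 255, never to a negative value, and an ``i1`` can only become 0 or 1.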
'``sext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
::
<result> = sext <ty> <value> to <ty2> ; yields ty2
Overview:
"""""""""
The '``sext``' instruction sign extends ``value`` to the type ``ty2``.
Arguments:
""""""""""
The '``sext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
the same number of integers. The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.
Semantics:
""""""""""
The '``sext``' instruction performs a sign extension by copying the sign
bit (highest order bit) of the ``value`` until it reaches the bit size
of the type ``ty2``.
When sign extending from ``i1``, the extension always results in -1 or 0.
Example:
""""""""
.. code-block:: llvm
%X = sext i8 -1 to i16 ; yields i16:-1 (65535 if read as unsigned)
%Y = sext i1 true to i32 ; yields i32:-1
%Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
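Sign extension can be modelled by copying the source sign bit into every higher bit of the destination. The sketch below returns the result as an unsigned bit pattern of ``dst_bits`` bits; the helper ``sext_bits`` is hypothetical and handles scalar widths up to 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Widens the low src_bits bits of value, replicating the sign bit. */
uint64_t sext_bits(uint64_t value, unsigned src_bits, unsigned dst_bits) {
    /* sext requires a strictly larger destination type. */
    assert(src_bits >= 1 && src_bits < dst_bits && dst_bits <= 64);
    uint64_t src_mask = (UINT64_C(1) << src_bits) - 1;
    uint64_t sign_bit = UINT64_C(1) << (src_bits - 1);
    uint64_t low = value & src_mask;
    /* (low ^ sign) - sign extends the sign bit through all 64 bits. */
    uint64_t extended = (low ^ sign_bit) - sign_bit;
    uint64_t dst_mask = dst_bits == 64 ? UINT64_MAX
                                       : (UINT64_C(1) << dst_bits) - 1;
    return extended & dst_mask;
}
```

The ``i8`` value -1 has the bit pattern ``0xFF``; extending it to ``i16`` gives ``0xFFFF``, which is -1 as a signed value and 65535 when read as unsigned.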
'``fptrunc .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^