Commit list

Tags
No Tags

GCC with patches for OS216


Rev. Time Author
cfa2652 devel/unified-autovect 2018-03-14 20:22:05 Sameera Deshpande

Add tile generation algorithm fixes

From-SVN: r258522

d5e3c14 2018-01-25 21:09:30 Sameera Deshpande

Add target specific tile generation algorithm

From-SVN: r257048

066fa0f 2017-03-31 18:09:57 Sameera Deshpande

stage 2...

stage 2: implementation of k-arity promotion/reduction in the series "Improving
effectiveness and generality of autovectorization using unified representation".

The permute nodes within the primitive reorder tree (PRT) generated from the
input program can have any arity, depending on the stride of the accesses.
However, the target cannot have instructions to support all arities. Hence, we
need to promote or reduce the arity of the PRT to enable successful tree tiling.

In classic autovectorization, if vectorization stride > 2, arity reduction is
performed by generating cascaded extract and interleave instructions as
described by "Auto-vectorization of Interleaved Data for SIMD" by D. Nuzman,
I. Rosen and A. Zaks.

Moreover, to enable SLP across loop, "Loop-aware SLP in GCC" by D. Nuzman,
I. Rosen and A. Zaks unrolls loop till stride = vector size.

k-arity reduction/promotion algorithm makes use of modulo arithmetic to generate
PRT of desired arity for both above-mentioned cases.

A single ILV node of arity k can be reduced to cascaded ILV nodes: a single
parent node of arity m whose children are ILV nodes of arity k/m, such that the
ith child of the original ILV node becomes the floor(i/m)th child of the (i%m)th
child of the new parent.
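
The index mapping for ILV reduction can be checked on plain lists. The following is a minimal sketch, not the GCC implementation: it assumes ILV of arity k places element j of the result at children[j % k][j // k], and the helper names `ilv` and `reduce_ilv` are hypothetical.

```python
def ilv(children):
    """Interleave equal-length streams: output[j] = children[j % k][j // k]."""
    k = len(children)
    n = len(children[0])
    return [children[j % k][j // k] for j in range(k * n)]


def reduce_ilv(children, m):
    """Rewrite one ILV of arity k as an ILV of arity m over ILVs of arity k/m."""
    k = len(children)
    if k % m != 0:
        raise ValueError("m must divide k")
    # groups[c][g] is the g-th child of the c-th child of the new parent.
    groups = [[None] * (k // m) for _ in range(m)]
    for i, child in enumerate(children):
        groups[i % m][i // m] = child
    return ilv([ilv(group) for group in groups])


# Four streams (k = 4) reduced to a cascade of arity-2 nodes (m = 2).
streams = [[10 * i + t for t in range(3)] for i in range(4)]
flat = ilv(streams)
cascaded = reduce_ilv(streams, 2)
```

Under these assumptions both `flat` and `cascaded` hold the same sequence, which is what the rewrite must preserve.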

A single EXTR node with k parts and selector i can be reduced to cascaded EXTR
nodes: the parent EXTR node has m parts and selector i/(k/m), and it operates on
a child EXTR node with k/m parts and selector i%(k/m).
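
The EXTR reduction can be sketched the same way. This is an illustration under an assumed semantics, where EXTR with k parts and selector i keeps elements i, i+k, i+2k, ... of its input; `extr` and `reduce_extr` are hypothetical names, not functions from the patch.

```python
def extr(parts, sel, stream):
    """Split `stream` into `parts` interleaved parts and keep part `sel`."""
    return stream[sel::parts]


def reduce_extr(k, i, m, stream):
    """Rewrite EXTR with k parts and selector i as two cascaded EXTR nodes."""
    if k % m != 0:
        raise ValueError("m must divide k")
    inner = k // m
    child = extr(inner, i % inner, stream)  # k/m parts, selector i % (k/m)
    return extr(m, i // inner, child)       # m parts, selector i / (k/m)


# Compare the direct and the cascaded extraction for every selector of k = 4.
data = list(range(24))
results = [(extr(4, i, data), reduce_extr(4, i, 2, data)) for i in range(4)]
```

Each pair in `results` should contain two identical lists, since the child selects positions congruent to i%(k/m) and the parent then keeps every mth of those, starting at i/(k/m).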

Similarly, loop unrolling to get desired arity m can be represented as arity
promotion from k to m.

A single ILV node of arity k can be promoted to a single ILV node of arity m:
the ith child of the new ILV node is an extraction with m/k parts and selector
i/k applied to the (i%k)th child of the original tree.
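
A small sketch of arity promotion on plain lists follows. It assumes the same list semantics for ILV and EXTR as described in the text (interleave round-robin, extract every (m/k)th element) and uses hypothetical helper names; it is meant only to check the index arithmetic.

```python
def ilv(children):
    """Interleave equal-length streams: output[j] = children[j % k][j // k]."""
    k = len(children)
    n = len(children[0])
    return [children[j % k][j // k] for j in range(k * n)]


def extr(parts, sel, stream):
    """Split `stream` into `parts` interleaved parts and keep part `sel`."""
    return stream[sel::parts]


def promote_ilv(children, m):
    """Rewrite an ILV of arity k as an ILV of arity m (k must divide m)."""
    k = len(children)
    if m % k != 0:
        raise ValueError("k must divide m")
    # New child i extracts part i // k (of m // k parts) from old child i % k.
    new_children = [extr(m // k, i // k, children[i % k]) for i in range(m)]
    return ilv(new_children)


# Promote an arity-2 interleave to arity 4.
streams = [[0, 1, 2, 3], [10, 11, 12, 13]]
original = ilv(streams)
promoted = promote_ilv(streams, 4)
```

Each original stream must have a length divisible by m/k for the extracted parts to be equal in length, which corresponds to unrolling the loop by that factor.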

To enable loop-aware SLP, we first promote the arity of the input PRT to the
maximum vector size permissible on the architecture. This can affect vector
code size, though performance will be the same. To eliminate redundant ILV and
EXTR operations, thereby undoing unnecessary unrolling, we can perform the unity
reduction optimization:
- EXTR_m,x (ILV_m(S1, S2, ... Sm)) => Sx
- ILV_m (EXTR_0(S), EXTR_1(S),...EXTR_m-1(S)) => S
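
The two unity reduction rules can be sketched as a bottom-up rewrite over a toy tree. The tuple encoding below (`("ILV", [children])`, `("EXTR", parts, selector, child)`, strings as leaves) and the function name `simplify` are assumptions made for illustration, and selectors are taken to be 0-based; the actual pass operates on GCC's own tree nodes.

```python
def simplify(node):
    """Apply the two unity-reduction rules bottom-up to a tuple-encoded tree."""
    if isinstance(node, str):  # leaf: a named stream
        return node
    if node[0] == "EXTR":
        _, parts, sel, child = node
        child = simplify(child)
        # EXTR_m,x(ILV_m(S_0, ..., S_m-1)) => S_x
        if (not isinstance(child, str) and child[0] == "ILV"
                and len(child[1]) == parts):
            return child[1][sel]
        return ("EXTR", parts, sel, child)
    if node[0] == "ILV":
        children = [simplify(c) for c in node[1]]
        m = len(children)
        # ILV_m(EXTR_m,0(S), ..., EXTR_m,m-1(S)) => S
        if children and not isinstance(children[0], str) \
                and children[0][0] == "EXTR":
            source = children[0][3]
            if all(c == ("EXTR", m, x, source)
                   for x, c in enumerate(children)):
                return source
        return ("ILV", children)
    raise ValueError("unknown node kind: %r" % (node[0],))


# Splitting S into two parts and interleaving them again is the identity.
round_trip = simplify(("ILV", [("EXTR", 2, 0, "S"), ("EXTR", 2, 1, "S")]))
# Extracting part 1 of an interleave of A and B yields B directly.
picked = simplify(("EXTR", 2, 1, ("ILV", ["A", "B"])))
```

Here `round_trip` collapses to the leaf `"S"` and `picked` to the leaf `"B"`, so neither ILV nor EXTR survives in the simplified tree.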

Later we apply the arity promotion/reduction algorithm to the output tree to get
a tree with the desired arity. For now, we support only target arity = 2, as
most architectures support it. However, the code can be extended to support
additional arities as well.

We have also implemented the unity reduction optimization, which eliminates
redundant ILV and EXTR nodes, thereby undoing unnecessary unrolling that would
otherwise bloat the code size.

From-SVN: r246610

08a2348 2017-02-14 23:47:59 Sameera Deshpande

Incremental changes to toT merge.

From-SVN: r245428

571e95c 2017-02-14 16:25:57 Sameera Deshpande

Merged to ToT dated 13th Feb 2017.

From-SVN: r245416

e218490 2016-07-18 16:35:34 Sameera Deshpande

Misc changes - accumulate probable root nodes for the loop in ITER_node.

From-SVN: r238425

94c2e8d 2016-07-11 17:01:08 Sameera Deshpande

Add new pass to perform autovectorization using unified representation - Current GCC framework does not give complete overview of the loop to be vectorized ...

Add new pass to perform autovectorization using unified representation - The
current GCC framework does not give a complete overview of the loop to be
vectorized: it either breaks the loop across the body or across iterations. As a
result, these data structures cannot be reused for our approach, which gathers
all the information about the loop body in one place using primitive permute
operations. Hence, define new data structures and populate them.

Add support for vectorization of LOAD/STORE instructions
a. Create permute order tree for the loop with LOAD and STORE instructions
for single or multi-dimensional arrays, aggregates within nested loops.

This change adds new pass to perform autovectorization using unified
representation, defines new data structures to cater to this requirement and
creates primitive reorder tree for LOAD/STORE instructions within the loop.

The whole loop is represented using the ITER_NODE, which has information about
- The preparatory statements for vectorization to be executed before entering
the loop (like initialization of vectors, prepping for reduction operations,
peeling etc.)
- Vectorizable loop body represented as PRIMOP_TREE (primitive reordering tree)
- Final statements (For peeling, variable loop bound, COLLAPSE operation for
reduction etc.)
- Other loop attributes (loop bound, peeling needed, dependences, etc.)

Memory accesses within a loop have a definite repetitive pattern, which can be
captured using primitive permute operators; these in turn determine the
desired permute order for the vector computations. The PRIMOP_TREE is an AST
which records all computations and permutations required to store the
destination vector into contiguous memory at the end of all iterations of the
loop. It can have
INTERLEAVE, CONCAT, EXTRACT, SPLIT, ITER or any compute operation as
intermediate node. Leaf nodes can either be memory reference, constant or vector
of loop invariants. Depending upon the operation, PRIMOP_TREE holds appropriate
information about the statement within the loop which is necessary for
vectorization.

At this stage, these data structures are populated by gathering all the
information of the loop, statements within the loop and correlation of the
statements within the loop. Moreover, the loop body is analyzed to check whether
vectorization of each statement is possible. Note, however, that this
analysis phase gives only a worst-case estimate of instruction selection, as it
checks whether a specific named pattern is defined in the target's .md file. It
does not necessarily give the optimal cover, which is the aim of the
transformation phase using the tree tiling algorithm; that phase can be invoked
only once the loop body is represented using the primitive reorder tree.

At this stage, the focus is to create permute order tree for the loop with LOAD
and STORE instructions only. The code we intend to compile is of the form
FOR (i = 0; i < N; i++)
{
  stmt 1: D[k * i + d_1] = S_1[k * i + c_11]
  stmt 2: D[k * i + d_2] = S_1[k * i + c_21]
  ...
  stmt k: D[k * i + d_k] = S_1[k * i + c_k1]
}
Here we are assuming that any data reference can be represented using base + k *
index + offset (The data structure struct data_reference from GCC is used
currently for this purpose). If not, the address is normalized to convert to
such representation.
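
One way to read the loop form above is that the destination is an interleave of k extractions from the source. The sketch below compares a scalar version of the loop with that permute form on plain lists. It is an interpretation, not the pass itself: it assumes a single source array, destination offsets d_1..d_k that cover 0..k-1 exactly once, and hypothetical function names.

```python
def scalar_loop(src, k, n, pairs):
    """Run the loop literally: dst[k*i + d] = src[k*i + c] for each (d, c)."""
    dst = [None] * (k * n)
    for i in range(n):
        for d, c in pairs:
            dst[k * i + d] = src[k * i + c]
    return dst


def permute_form(src, k, n, pairs):
    """Same result as an interleave of k extractions (assumed reading)."""
    children = [None] * k
    for d, c in pairs:
        # EXTR with k parts and selector c feeds child d of the ILV node.
        children[d] = src[c:k * n:k]
    # ILV of arity k: output[j] = children[j % k][j // k]
    return [children[j % k][j // k] for j in range(k * n)]


# k = 3 statements, n = 4 iterations; each pair is (dest offset, src offset).
pairs = [(0, 2), (1, 0), (2, 1)]
source = list(range(12))
by_loop = scalar_loop(source, 3, 4, pairs)
by_permute = permute_form(source, 3, 4, pairs)
```

Both computations give the same destination array for this input, which suggests why stride-k accesses map naturally onto EXTR and ILV nodes of arity k.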

From-SVN: r238205

2660286 2016-07-08 16:52:03 Martin Liska

Do not consider COMPLEX_TYPE as fold_convertible_p

PR middle-end/71606
* fold-const.c (fold_convertible_p): As COMPLEX_TYPE
folding produces SAVE_EXPRs, thus return false for the type.
* gcc.dg/torture/pr71606.c: New test.

From-SVN: r238157

70cdd4a 2016-07-08 13:36:16 Jerry DeLisle

re PR fortran/71764 (ICE in gfc_trans_structure_assign)

2016-07-07 Jerry DeLisle <jvdelisle@gcc.gnu.org>

PR fortran/71764
* trans-expr.c (gfc_trans_structure_assign): Remove assert.

* gfortran.dg/pr71764.f90: New test.

From-SVN: r238156

cb0044d 2016-07-08 09:16:21 GCC Administrator

Daily bump.

From-SVN: r238155

842dc2e 2016-07-08 03:45:43 Jakub Jelinek

re PR c++/70869 (internal compiler error: Segmentation fault on array of pointer to function members)

PR c++/70869
PR c++/71054
* cp-gimplify.c (cp_genericize_r): For DECL_EXPR for non-static
artificial vars, genericize their initializers.

* g++.dg/cpp0x/pr70869.C: New test.
* g++.dg/cpp0x/pr71054.C: New test.

Co-Authored-By: Kai Tietz <ktietz70@googlemail.com>

From-SVN: r238124

31be426 2016-07-08 02:59:54 David Edelsohn

* g++.dg/debug/pr71432.C: Fail on AIX.

From-SVN: r238122

9fc0faf 2016-07-08 01:35:43 Jonathan Wakely

Update libstdc++ status docs

* doc/xml/manual/status_cxx2014.xml: Update LFTS status table.
* doc/html/*: Regenerate.

From-SVN: r238120

86ec3bf 2016-07-07 22:20:30 Arnaud Charlet

[multiple changes]

2016-07-07 Ed Schonberg <schonberg@adacore.com>

* exp_ch6.adb (Expand_Internal_Init_Call): Subsidiary procedure
to Expand_Protected_Subprogram_Call, to handle properly a
call to a protected function that provides the initialization
expression for a private component of the same protected type.
* sem_ch9.adb (Analyze_Protected_Definition): Layout must be
applied to itypes generated for a private operation of a protected
type that has a formal of an anonymous access to subprogram,
because these itypes have no freeze nodes and are frozen in place.
* sem_ch4.adb (Analyze_Selected_Component): If prefix is a
protected type and it is not a current instance, do not examine
the first private component of the type.

2016-07-07 Arnaud Charlet <charlet@adacore.com>

* exp_imgv.adb, g-dynhta.adb, s-regexp.adb, s-fatgen.adb, s-poosiz.adb:
Minor removal of extra whitespace.
* einfo.ads: minor removal of repeated "as" in comment

2016-07-07 Vadim Godunko <godunko@adacore.com>

* adaint.c: Complete previous change.

From-SVN: r238117

0640c7d 2016-07-07 22:17:51 Arnaud Charlet

[multiple changes]

2016-07-07 Vadim Godunko <godunko@adacore.com>

* adainit.h, adainit.c (__gnat_is_read_accessible_file): New
subprogram.
(__gnat_is_write_accessible_file): New subprogram.
* s-os_lib.ads, s-os_lib.adb (Is_Read_Accessible_File): New subprogram.
(Is_Write_Accessible_File): New subprogram.

2016-07-07 Justin Squirek <squirek@adacore.com>

* sem_ch12.adb (Install_Body): Minor refactoring in the order
of local functions.
(In_Same_Scope): Change loop condition to be more expressive.

From-SVN: r238116

8c51903 2016-07-07 22:16:05 Arnaud Charlet

[multiple changes]

2016-07-07 Gary Dismukes <dismukes@adacore.com>

* sem_ch3.adb, sem_prag.adb, sem_prag.ads, prj-ext.adb, freeze.adb,
sem_attr.adb: Minor reformatting, fix typos.

2016-07-07 Justin Squirek <squirek@adacore.com>

* sem_ch12.adb (In_Same_Scope): Created this function to check
a generic package definition against an instantiation for scope
dependencies.
(Install_Body): Add function In_Same_Scope and
amend conditional in charge of delaying the package instance.
(Is_In_Main_Unit): Add guard to check if parent is present in
assignment of Current_Unit.

From-SVN: r238115

1c12209 2016-07-07 22:15:39 Martin Liska

Optimize fortran loops with +-1 step.

* gfortran.dg/do_1.f90: Remove a corner case that triggers
an undefined behavior.
* gfortran.dg/do_3.F90: Likewise.
* gfortran.dg/do_check_11.f90: New test.
* gfortran.dg/do_check_12.f90: New test.
* gfortran.dg/do_corner_warn.f90: New test.
* lang.opt (Wundefined-do-loop): New option.
* resolve.c (gfc_resolve_iterator): Warn for Wundefined-do-loop.
(gfc_trans_simple_do): Generate a c-style loop.
(gfc_trans_do): Fix GNU coding style.
* invoke.texi: Mention the new warning.

From-SVN: r238114

9cc6b3f 2016-07-07 22:12:55 Eric Botcazou

sem_ch6.adb (Analyze_Subprogram_Body_Helper): Remove redundant test, adjust comments and formatting.

2016-07-07 Eric Botcazou <ebotcazou@adacore.com>

* sem_ch6.adb (Analyze_Subprogram_Body_Helper): Remove redundant test,
adjust comments and formatting.
* sem_prag.adb (Inlining_Not_Possible): Do not test Front_End_Inlining
here but...
(Make_Inline): ...here before calling Inlining_Not_Possible instead.
(Set_Inline_Flags): Remove useless test.
(Analyze_Pragma) <Pragma_Inline>: Add comment about -gnatn switch.

From-SVN: r238113

7119f1b 2016-07-07 22:11:05 Martin Liska

Add PRED_FORTRAN_LOOP_PREHEADER to DO loops with step

* trans-stmt.c (gfc_trans_do): Add expect builtin for DO
loops with step bigger than +-1.
* gfortran.dg/predict-1.f90: Amend the test.
* gfortran.dg/predict-2.f90: Likewise.

From-SVN: r238112

0e77949 2016-07-07 22:05:08 Arnaud Charlet

[multiple changes]

2016-07-07 Ed Schonberg <schonberg@adacore.com>

* sem_prag.ads, sem_prag.adb (Build_Classwide_Expression): Include
overridden operation as parameter, in order to map formals of
the overridden and overriding operation properly prior to rewriting
the inherited condition.
* freeze.adb (Check_Inherited_Cnonditions): Change call to
Build_Class_Wide_Expression accordingly. In Spark_Mode, add
call to analyze the contract of the parent operation, prior to
mapping formals between operations.

2016-07-07 Arnaud Charlet <charlet@adacore.com>

* adabkend.adb (Scan_Back_End_Switches): Ignore -o/-G switches
as done in back_end.adb.
(Scan_Compiler_Args): Remove special case for CodePeer/SPARK, no longer
needed, and prevents proper handling of multi-unit sources.

2016-07-07 Thomas Quinot <quinot@adacore.com>

* g-sechas.adb, g-sechas.ads (GNAT.Secure_Hashes.H): Add Hash_Stream
type with Write primitive calling Update on the underlying context
(and dummy Read primitive raising P_E).

2016-07-07 Thomas Quinot <quinot@adacore.com>

* sem_ch13.adb: Minor reformatting.

From-SVN: r238111

7dccd19 2016-07-07 22:02:31 Arnaud Charlet

[multiple changes]

2016-07-07 Thomas Quinot <quinot@adacore.com>

* g-socket.ads: Document performance consideration for stream
wrapper.

2016-07-07 Arnaud Charlet <charlet@adacore.com>

* osint-c.ads (Set_File_Name): Clarify spec.

From-SVN: r238110

c765803 2016-07-07 22:00:54 Eric Botcazou

freeze.adb: Reenable code.

2016-07-07 Eric Botcazou <ebotcazou@adacore.com>

* freeze.adb: Reenable code.

From-SVN: r238109

d1ce5f8 2016-07-07 21:59:19 Arnaud Charlet

minor reformatting.

From-SVN: r238107

0bb97bd 2016-07-07 21:59:06 Yannick Moy

sem_ch6.adb (Process_Formals): Set ghost flag on formal entities of ghost subprograms.

2016-07-07 Yannick Moy <moy@adacore.com>

* sem_ch6.adb (Process_Formals): Set ghost flag
on formal entities of ghost subprograms.
* ghost.adb (Check_Ghost_Context.Is_OK_Ghost_Context): Accept ghost
entities in use type clauses.

From-SVN: r238106

f965d3d 2016-07-07 21:03:39 Martin Liska

Prevent LTO wrappers to process a recursive execution

* file-find.c (remove_prefix): New function.
* file-find.h (remove_prefix): Declare the function.
* gcc-ar.c (main): Skip a folder of the wrapper if
a wrapped binary would point to the same file.

From-SVN: r238089

019d659 2016-07-07 20:50:55 Jan Hubicka

tree-scalar-evolution.c (iv_can_overflow_p): export.


* tree-scalar-evolution.c (iv_can_overflow_p): export.
* tree-scalar-evolution.h (iv_can_overflow_p): Declare.
* tree-ssa-loop-ivopts.c (alloc_iv): Use it.

From-SVN: r238088

275792f 2016-07-07 20:45:11 Ilya Enkovich

re PR ipa/71624 ([CHKP] internal compiler error: in duplicate_thunk_for_node)

gcc/

PR ipa/71624
* ipa-inline-analysis.c (compute_inline_parameters): Set
local.can_change_signature to false for instrumentation
thunk callees.

gcc/testsuite/

PR ipa/71624
* g++.dg/pr71624.C: New test.

From-SVN: r238086

33427b4 2016-07-07 17:54:59 Thomas Preud'homme

arm.h (TARGET_USE_MOVT): Check MOVT/MOVW availability with TARGET_HAVE_MOVT.

2016-07-07 Thomas Preud'homme <thomas.preudhomme@arm.com>

gcc/
* config/arm/arm.h (TARGET_USE_MOVT): Check MOVT/MOVW availability
with TARGET_HAVE_MOVT.
(TARGET_HAVE_MOVT): Define.
* config/arm/arm.c (const_ok_for_op): Check MOVT/MOVW
availability with TARGET_HAVE_MOVT.
* config/arm/arm.md (arm_movt): Use TARGET_HAVE_MOVT to check MOVT
availability.
(addsi splitter): Use TARGET_THUMB && TARGET_HAVE_MOVT rather than
TARGET_THUMB2.
(symbol_refs movsi splitter): Remove TARGET_32BIT check.
(arm_movtas_ze): Use TARGET_HAVE_MOVT to check MOVT availability.
* config/arm/constraints.md (define_constraint "j"): Use
TARGET_HAVE_MOVT to check MOVT availability.

From-SVN: r238083

3129a32 2016-07-07 17:54:50 Thomas Preud'homme

arm-protos.h: Reindent FL_FOR_* macro definitions.

2016-07-07 Thomas Preud'homme <thomas.preudhomme@arm.com>

gcc/
* config/arm/arm-protos.h: Reindent FL_FOR_* macro definitions.

From-SVN: r238082

05a437c 2016-07-07 17:54:40 Thomas Preud'homme

arm-arches.def (armv8-m.base): Define new architecture.

2016-07-07 Thomas Preud'homme <thomas.preudhomme@arm.com>

gcc/
* config/arm/arm-arches.def (armv8-m.base): Define new architecture.
(armv8-m.main): Likewise.
(armv8-m.main+dsp): Likewise.
* config/arm/arm-protos.h (FL_FOR_ARCH8M_BASE): Define.
(FL_FOR_ARCH8M_MAIN): Likewise.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/bpabi.h: Add armv8-m.base, armv8-m.main and
armv8-m.main+dsp to BE8_LINK_SPEC.
* config/arm/arm.h (TARGET_HAVE_LDACQ): Exclude ARMv8-M.
(enum base_architecture): Add BASE_ARCH_8M_BASE and BASE_ARCH_8M_MAIN.
* config/arm/arm.c (arm_arch_name): Increase size to work with ARMv8-M
Baseline and Mainline.
(arm_option_override_internal): Also disable arm_restrict_it when
!arm_arch_notm. Update comment for -munaligned-access to also cover
ARMv8-M Baseline.
(arm_file_start): Increase buffer size for printing architecture name.
* doc/invoke.texi: Document architectures armv8-m.base, armv8-m.main
and armv8-m.main+dsp.
(mno-unaligned-access): Clarify that this is disabled by default for
ARMv8-M Baseline architectures as well.

gcc/testsuite/
* lib/target-supports.exp: Generate add_options_for_arm_arch_FUNC and
check_effective_target_arm_arch_FUNC_multilib for ARMv8-M Baseline and
ARMv8-M Mainline architectures.

libgcc/
* config/arm/lib1funcs.S (__ARM_ARCH__): Define to 8 for ARMv8-M.

From-SVN: r238081