[sac-user] A Question about the Optimizations of Single Assignment C
Sven-Bodo Scholz
S.Scholz at herts.ac.uk
Tue Nov 27 16:37:17 GMT 2007
On Tue, Nov 27, 2007 at 02:53:01PM +0000, Li Bin wrote:
> Dear Bodo,
>
> I tried to modify the definition of v1 and v2 as you indicated in your
> email, and I just intended to deal with the files corresponding to those
> three segments. My compiler can also generate three simd<x>.c files as
> your compiler did, but if I wrap the with-loop in a for-loop,
> like this:
>
> 7 for(i=0;i<10;i++)
> 8 { v1=with(iv)
> 9 (.<=iv<=[5000]):v1[iv]+v2[iv];
> 10 ([5001]<=iv<=[9000]):v1[iv]-v2[iv];
> 11 ([9001]<=iv<=.):v1[iv]+v2[iv];
> 12 modarray(v1); }
>
> Then the simd.c file corresponding to segment [5001,9000] is
> gone. Does the for-loop prevent some folding at compile time?
> I am really sorry, but this is the practical situation I ran into.
> Previously I thought that this for-loop shouldn't make any difference, so
> I didn't mention it, but obviously I was wrong about that.
Well, I would have thought so too ;-) Unfortunately, it does :-(
I will look into that..... meanwhile:
Stephan, would it be possible to combine -simd with sac4c?
Idea:
int[10000] vectadd( int[10000] v1, int[10000] v2)
{
  v1=with(iv)
       (.<=iv<=[5000]):v1[iv]+v2[iv];
       ([5001]<=iv<=[9000]):v1[iv]-v2[iv];
       ([9001]<=iv<=.):v1[iv]+v2[iv];
     modarray(v1);
  return( v1);
}
Now, we export this thing using sac4c (with -simd).
After that, Bin Li could write a C-function:
int *v1_raw, *v2_raw;
SACarg *v1, *v2;
int i;

v1_raw = (int *)malloc( 10000 * sizeof( int));
v2_raw = (int *)malloc( 10000 * sizeof( int));
/* some init code */
v1 = SACARGconvertFromIntPointer( v1_raw, 1, 10000);
v2 = SACARGconvertFromIntPointer( v2_raw, 1, 10000);
for(i=0;i<10;i++) {
  vectadd( &v1, v1, SACARGnewReference(v2));
}
As soon as simd is fixed we could compare that against this
in order to see the overhead of the SACarg'ification process....
I am not sure to what extent that would "taint" Bin's measurements.
Any ideas, anyone?
Bodo
>
> Cheers.
> Bin
>
>
>
> Sven-Bodo Scholz wrote:
> >On Mon, Nov 26, 2007 at 04:21:09PM +0000, Li Bin wrote:
> >
> >>Dear Bodo,
> >>
> >>When I checked the C files generated with the -simd option, I saw
> >>something like this:
> >>
> >>I used the following program:
> >>
> >>1 use Structures:all;
> >>2 use SimplePrint:all;
> >>3 int main()
> >>4 {
> >>5 v1=genarray([10000],5);
> >>6 v2=genarray([10000],5);
> >>7 v1=with(iv)
> >>8 (.<=iv<=[5000]):v1[iv]+v2[iv];
> >>9 ([5001]<=iv<=[9000]):v1[iv]-v2[iv];
> >>10 ([9001]<=iv<=.):v1[iv]+v2[iv];
> >>11 modarray(v1);
> >>12 res=print(v1);
> >>13 return(res);
> >>14 }
> >>
> >>
> >>The compiler only generates the "simd" C files corresponding to
> >>initializing v1 and to performing the addition on v1 and v2
> >>(lines 8 and 10 in the original SAC code). If I change
> >>line 8 or line 10 to a subtraction, the result stays the same,
> >>i.e. it still only generates the C file for the addition. So I
> >>suppose that the subtraction must have been optimized away somewhere
> >>while generating the intermediate code.
> >>
> >>Would you please give me a hint about what the compiler did there?
> >>
> >
> >Well, v1 and v2 are defined by identical expressions
> >=> the compiler collapses them into one, giving something similar to
> >
> >
> >>5 v2=genarray([10000],5);
> >>7 v1=with(iv)
> >>8 (.<=iv<=[5000]):v2[iv]+v2[iv];
> >>9 ([5001]<=iv<=[9000]):v2[iv]-v2[iv];
> >>10 ([9001]<=iv<=.):v2[iv]+v2[iv];
> >>11 modarray(v1);
> >>12 res=print(v1);
> >>
> >
> >Then, the compiler folds the definition of v2 into that of v1
> >=> we obtain something similar to:
> >
> >
> >>7 v1=with(iv)
> >>8 (.<=iv<=[5000]): 5 + 5;
> >>9 ([5001]<=iv<=[9000]): 5 - 5;
> >>10 ([9001]<=iv<=.): 5+ 5;
> >>11 modarray(v1);
> >>12 res=print(v1);
> >>
> >
> >which finally yields (after constant folding):
> >[The following was obtained by compiling with -b11]
> >
> > _pinl_276__flat_108 = 0;
> > _dl_341 = 10;
> > v1__SSA0_1 = with ( iv__SSA0_2 )
> > ([ 0 ] <= iv__SSA0_2=[_eat_39] (IDXS:_wlidx_368_v1__SSA0_1) < [ 5001 ])
> > {
> > /* empty */
> > } : _dl_341 ; ,
> > ([ 5001 ] <= iv__SSA0_2=[_eat_39] (IDXS:_wlidx_368_v1__SSA0_1) < [ 9001 ])
> > {
> > /* empty */
> > } : _pinl_276__flat_108 ; ,
> > ([ 9001 ] <= iv__SSA0_2=[_eat_39] (IDXS:_wlidx_368_v1__SSA0_1) < [ 10000 ])
> > {
> > /* empty */
> > } : _dl_341 ;
> > genarray( [ 10000 ] ,IDX(_wlidx_368_v1__SSA0_1));
> >
> >On my system (current compiler) this generates 3 simd<x>.c files, one
> >each for the 3 segments from above.
> >
> >Which of the above computations do you actually wish to vectorise?
> >If you want the 2 "initial" vectors v1 and v2 to actually be built, you
> >have to prevent With-Loop-Folding from happening (this can be achieved
> >by calling the compiler with -noWLF) AND you have to define v1 and v2
> >differently, such as:
> >
> >v1 = genarray( [10000], 2);
> >v2 = genarray( [10000], 3);
> >
> >' hope that helps a bit.
> >Best wishes,
> >
> > Bodo
> >
>
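As a rough illustration of why the folded with-loop in the -b11 output carries only the constants 10 and 0, here is a plain C sketch of the three-segment computation. It is not SaC and not what sac2c emits; the names vectadd_segments, folded_value and N are made up for this example. The first function does the element-wise work on real arrays; the second is what is left once both inputs are known to be 5 everywhere.

```c
#include <assert.h>
#include <stddef.h>

#define N 10000  /* vector length used in the thread */

/* Plain C stand-in for the three-partition with-loop:
   indices 0..5000 add, 5001..9000 subtract, 9001..N-1 add.
   v1 is updated in place, as with modarray(v1). */
void vectadd_segments(int *v1, const int *v2)
{
    size_t i;

    assert(v1 != NULL && v2 != NULL);
    for (i = 0; i <= 5000; i++)
        v1[i] = v1[i] + v2[i];
    for (i = 5001; i <= 9000; i++)
        v1[i] = v1[i] - v2[i];
    for (i = 9001; i < N; i++)
        v1[i] = v1[i] + v2[i];
}

/* The same result when v1 and v2 are both 5 everywhere:
   5 + 5 = 10 in the outer segments, 5 - 5 = 0 in the middle one.
   No array reads are left, only one constant per segment. */
int folded_value(size_t i)
{
    assert(i < N);
    return (i >= 5001 && i <= 9000) ? 0 : 10;
}
```

Filling two arrays of length N with 5 and calling vectadd_segments once should leave every element equal to folded_value(i), which mirrors the 10 / 0 / 10 pattern of the three partitions in the compiler output quoted above.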