[sac-user] A Question about the Optimizations of Single Assignment C

Sven-Bodo Scholz S.Scholz at herts.ac.uk
Tue Nov 27 16:37:17 GMT 2007


On Tue, Nov 27, 2007 at 02:53:01PM +0000, Li Bin wrote:
> Dear Bodo,
> 
> I tried to modify the definition of v1 and v2 as you indicated in your 
> email, and I only intend to deal with the files corresponding to those 
> three segments. My compiler also generates three simd<x>.c files, as 
> your compiler did, but if I wrap the with-loop in a for-loop, 
> like this:
> 
> for(i=0;i<10;i++)
> { v1=with(iv)
> (.<=iv<=[5000]):v1[iv]+v2[iv];
> ([5001]<=iv<=[9000]):v1[iv]-v2[iv];
> ([9001]<=iv<=.):v1[iv]+v2[iv];
> modarray(v1); }
> 
> Then the simd.c file corresponding to segment [5001,9000] is 
> gone. Does the for-loop prevent some folding at compile time?
> I am really sorry, but this is the practical situation I actually face. 
> Previously I thought that this for-loop shouldn't make any difference, so 
> I didn't mention it, but obviously I was wrong about that.

Well, I would have thought so too ;-) Unfortunately, it does :-(
I will look into that... Meanwhile:

Stephan, would it be possible to combine -simd with sac4c?
Idea:

int[10000] vectadd( int[10000] v1, int[10000] v2)
{
 v1=with(iv)
 (.<=iv<=[5000]):v1[iv]+v2[iv];
 ([5001]<=iv<=[9000]):v1[iv]-v2[iv];
 ([9001]<=iv<=.):v1[iv]+v2[iv];
modarray(v1);

return( v1); }

Now, we export this thing using sac4c (with -simd).
After that, Bin Li could write a C-function:

int *v1_raw, *v2_raw;
SACarg *v1, *v2;
int i;

v1_raw = (int *)malloc( 10000 * sizeof( int));
v2_raw = (int *)malloc( 10000 * sizeof( int));
/* some init code */

/* wrap the raw buffers as SACargs (dimension 1, 10000 elements) */
v1 = SACARGconvertFromIntPointer( v1_raw, 1, 10000);
v2 = SACARGconvertFromIntPointer( v2_raw, 1, 10000);

/* v1 is both argument and result; v2 is passed as a new reference each time */
for(i=0;i<10;i++) {
  vectadd( &v1, v1, SACARGnewReference(v2));
}

As soon as simd is fixed we could compare that against this
in order to see the overhead of the SACarg'ification process....
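
For such a comparison, a plain C baseline of the same three-segment update might also be useful. The following is only a sketch: `vectadd_ref`, `time_baseline` and `N` are hypothetical names, nothing here is produced by sac2c or sac4c, and no SACarg is involved.

```c
#include <time.h>

#define N 10000  /* vector length, as in vectadd above */

/* Hypothetical plain-C reference for the three-segment with-loop:
 * indices 0..5000 add, 5001..9000 subtract, 9001..N-1 add. */
static void vectadd_ref(int *v1, const int *v2)
{
    int i;
    for (i = 0; i <= 5000; i++)    v1[i] += v2[i];
    for (i = 5001; i <= 9000; i++) v1[i] -= v2[i];
    for (i = 9001; i < N; i++)     v1[i] += v2[i];
}

/* Apply the update reps times and return the CPU time used, in seconds. */
static double time_baseline(int *v1, const int *v2, int reps)
{
    clock_t start = clock();
    int r;
    for (r = 0; r < reps; r++)
        vectadd_ref(v1, v2);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling `time_baseline(v1_raw, v2_raw, 10)` on the initialised buffers would mirror the for-loop above. With only 10 repetitions the time is likely below the resolution of `clock()`, so a much larger repetition count is probably needed for a meaningful number.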

I am not sure to what extent that would "taint" Bin's measurements.

Any ideas, anyone?

Bodo

> 
> Cheers.
> Bin
> 
> 
> 
> Sven-Bodo Scholz wrote:
> >On Mon, Nov 26, 2007 at 04:21:09PM +0000, Li Bin wrote:
> >  
> >>Dear Bodo,
> >>
> >>When I checked the C files generated with the -simd option, I noticed 
> >>the following.
> >>
> >>I use this program:
> >>
> >>1 use Structures:all;
> >>2 use SimplePrint:all;
> >>3 int main()
> >>4 {
> >>5   v1=genarray([10000],5);
> >>6    v2=genarray([10000],5);
> >>7   v1=with(iv)
> >>8        (.<=iv<=[5000]):v1[iv]+v2[iv];
> >>9        ([5001]<=iv<=[9000]):v1[iv]-v2[iv];
> >>10       ([9001]<=iv<=.):v1[iv]+v2[iv];
> >>11       modarray(v1);
> >>12    res=print(v1);
> >>13    return(res);
> >>14   }
> >>
> >>
> >>The compiler only generates the "simd" C files corresponding to 
> >>initializing v1 and to the addition of v1 and v2 
> >>(lines 8 and 10 in the original SAC code). If I change 
> >>line 8 or line 10 to a subtraction, the result stays the same, 
> >>i.e. it still only generates the C file for the addition. So I 
> >>suppose that the subtraction must have been optimized away somewhere while 
> >>generating intermediate code.
> >>
> >>Would you please give me a hint about what the compiler did there?
> >>    
> >
> >Well, v1 and v2 are defined by identical expressions
> >=> the compiler merges them into one, giving something similar to
> >
> >  
> >>5   v2=genarray([10000],5);
> >>7   v1=with(iv)
> >>8        (.<=iv<=[5000]):v2[iv]+v2[iv];
> >>9        ([5001]<=iv<=[9000]):v2[iv]-v2[iv];
> >>10       ([9001]<=iv<=.):v2[iv]+v2[iv];
> >>11       modarray(v1);
> >>12    res=print(v1);
> >>    
> >
> >Then, the compiler folds the definition of v2 into that of v1
> >=> we obtain something similar to:
> >
> >  
> >>7   v1=with(iv)
> >>8        (.<=iv<=[5000]): 5 + 5;
> >>9        ([5001]<=iv<=[9000]): 5 - 5;
> >>10       ([9001]<=iv<=.): 5+ 5;
> >>11       modarray(v1);
> >>12    res=print(v1);
> >>    
> >
> >which finally yields (after constant folding):
> >[The following was obtained by compiling with -b11]
> >
> >  _pinl_276__flat_108 = 0;
> >  _dl_341 = 10;
> >  v1__SSA0_1 = with ( iv__SSA0_2 )
> >        ([ 0 ] <= iv__SSA0_2=[_eat_39] (IDXS:_wlidx_368_v1__SSA0_1) < [ 5001 ])
> >        {
> >          /* empty */
> >        } : _dl_341 ; ,
> >        ([ 5001 ] <= iv__SSA0_2=[_eat_39] (IDXS:_wlidx_368_v1__SSA0_1) < [ 9001 ])
> >        {
> >          /* empty */
> >        } : _pinl_276__flat_108 ; ,
> >        ([ 9001 ] <= iv__SSA0_2=[_eat_39] (IDXS:_wlidx_368_v1__SSA0_1) < [ 10000 ])
> >        {
> >          /* empty */
> >        } : _dl_341 ;
> >      genarray( [ 10000 ] ,IDX(_wlidx_368_v1__SSA0_1));
> >
> >In my system (current compiler) this generates 3 simd<x>.c files, one
> >each for the 3 segments from above.
> >
> >Which of the above computations do you actually wish to vectorise?
> >If you want the 2 "initial" vectors v1 and v2 to be actually built you
> >have to prevent With-Loop-Folding from happening (this can be achieved 
> >by calling the compiler with -noWLF) AND you have to define v1 and v2 
> >differently, such as:
> >
> >v1 = genarray( [10000], 2);
> >v2 = genarray( [10000], 3);
> >
> >' hope that helps a bit.
> >Best wishes,
> >
> >  Bodo
> >  
> 
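
The folding chain quoted above (identical definitions merged, v2 folded into the with-loop, then constant folding) can be checked by hand in plain C. This is only an illustrative sketch: `wl_element`, `folded_element` and `folding_agrees` are hypothetical names and do not correspond to anything the SAC compiler emits.

```c
#define N 10000  /* vector length from the quoted program */

/* Element the original with-loop computes at index iv. */
static int wl_element(int iv, const int *v1, const int *v2)
{
    if (iv <= 5000) return v1[iv] + v2[iv];
    if (iv <= 9000) return v1[iv] - v2[iv];
    return v1[iv] + v2[iv];
}

/* Element after folding, mirroring the -b11 output:
 * 10 on [0,5001), 0 on [5001,9001), 10 on [9001,10000). */
static int folded_element(int iv)
{
    if (iv < 5001) return 10;
    if (iv < 9001) return 0;
    return 10;
}

/* Returns 1 if both forms agree at every index when v1 and v2
 * are both genarray([10000],5), otherwise 0. */
static int folding_agrees(void)
{
    static int v1[N], v2[N];
    int iv;
    for (iv = 0; iv < N; iv++) { v1[iv] = 5; v2[iv] = 5; }
    for (iv = 0; iv < N; iv++)
        if (wl_element(iv, v1, v2) != folded_element(iv))
            return 0;
    return 1;
}
```

Because every element becomes a compile-time constant, no addition or subtraction on the vectors is left to vectorise, which is consistent with the advice to use -noWLF and distinct initial values.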


