[PATCH] ipsum_calc_block: Optimize size and speed
joakim.tjernlund at transmode.se
Fri Apr 23 16:25:21 CEST 2010
Joakim Tjernlund/Transmode wrote on 2010/04/23 16:14:58:
> > Hello!
> > > But you can't get rid of:
> > > z + (z < sum)
> > > which is the real bottleneck. Perhaps this doesn't cost much
> > > on high end CPUs but it sure does on embedded CPUs
> > Why should it be? It can be compiled as a sequence of "add with carry"
> > instructions, can't it?
> Yes, but have you seen gcc do that? I haven't; perhaps gcc has become smarter.
I just tried this, and gcc 3.4.3 on PowerPC did not do it.
Some architectures do not have an add-with-carry insn (MIPS?).
unsigned long add32(unsigned long sum, unsigned long x)
{
	unsigned long z = sum + x;
	return z + (z < sum);	/* z < sum exactly when the add wrapped */
}
/* gcc -O3 -S gives:
.gnu_attribute 4, 2
.gnu_attribute 8, 1
.type add32, @function
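
For reference, here is a minimal sketch of what the z + (z < sum) idiom in add32 above computes. It is not taken from the patch. It assumes a fixed 32-bit width (uint32_t instead of unsigned long, which may be 64 bits on some targets), and add32_ref is a hypothetical helper that does the same end-around carry add through a 64-bit intermediate, only for comparison.

```c
#include <stdint.h>

/* End-around carry (one's-complement) add of two 32-bit values.
 * Assumption: fixed 32-bit width, unlike the unsigned long in the mail. */
static uint32_t add32(uint32_t sum, uint32_t x)
{
    uint32_t z = sum + x;   /* wraps modulo 2^32 on overflow */
    return z + (z < sum);   /* z < sum holds exactly when the add carried */
}

/* Hypothetical reference: add in 64 bits, then fold the carry back in. */
static uint32_t add32_ref(uint32_t sum, uint32_t x)
{
    uint64_t wide = (uint64_t)sum + x;
    return (uint32_t)wide + (uint32_t)(wide >> 32);
}
```

Both functions should return the same value for every pair of inputs. Whether either one compiles down to an add-with-carry sequence depends on the compiler version and target, so checking the -S output is still needed.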