Page 1 of 7
Stream FFT and iFFT
Posted: Mon Jun 17, 2013 11:00 pm
by trogluddite
It's just a delicate little baby at the moment, so treat it with care...

Re: Stream FFT and iFFT
Posted: Mon Jun 17, 2013 11:24 pm
by tester
First what I of course did - was to change that value in order to see the crash. But - it did not crashed. It looks that this value is changable when the sound is offline.
Re: Stream FFT and iFFT
Posted: Tue Jun 18, 2013 7:17 pm
by trogluddite
Yes, that's right - it's only when the assembly blocks are running that there can be problems.
Because it is such an intensive algorithm, I left out the parts that check whether memory addresses are valid inside the code arrays - if they go out of range, there's an instant crash. Saves a fair chunk of CPU, but means that there are many hard-wired values that need to be right.
Ultimately, I'll probably just release a set of modules, each preset to a different buffer length - there's a big trade off with this job between versatility and CPU load due to the limitations of the FS assembly instruction set. There's still a lot more code to add yet too - FFT -> iFFT on its own is pretty dull; messing with the data in between is where things will get much more interesting.
Re: Stream FFT and iFFT
Posted: Tue Jun 18, 2013 9:52 pm
by tester
//edit:
redirecting this to new topic.
viewtopic.php?f=4&t=1513#p6442
Re: Stream FFT and iFFT
Posted: Tue Jun 18, 2013 9:55 pm
by MyCo
The problem with realtime FFT and iFFT is, you have to overlap the buffers and apply the correct window to the FFT. And finally you have to find the right spot where you start the overlapping FFT. That's actually a very hard task.
Have you done the conversion to SSE from x86 code? It's very hard to read, but it looks like you don't really use the benefits of SSE.
When I understand it correctly, you buffer the 1024 samples linear into "inArray" and at the end of the buffer you calculate the FFT. Wouldn't it be faster to buffer the incoming samples at the bitshifted positions in an Array? Then you could leave this step (Loop1) out of the main code.
Re: Stream FFT and iFFT
Posted: Tue Jun 18, 2013 10:17 pm
by MyCo
I think I've found a bug in the FFT code. On line 125:
xmm3 wasn't set before
Re: Stream FFT and iFFT
Posted: Tue Jun 18, 2013 10:29 pm
by MyCo
Another related bug:
Code: Select all
mov eax,1;
cmp eax,0;
jnz JgteMtest;
the Jump is always executed because 1-0 is not zero
Re: Stream FFT and iFFT
Posted: Tue Jun 18, 2013 10:39 pm
by digitalwhitebyte
@Trog:
your implementation seems to use this algorithm.
Code: Select all
class Array
# DFT and inverse.
#
# Algorithm from
# 'Meyberg, Vachenauer: Hoehere Mathematik II, Springer Berlin, 1991, page 332'
#
# See http://blog.mro.name/2011/04/simple-ruby-fast-fourier-transform/
#
def fft doinverse = true
src = self
# raise ArgumentError.new "Expected array input but was '#{src.class}'" unless src.kind_of? Array
n = src.length
nlog2 = Math.log( n ) / Math.log( 2 )
raise ArgumentError.new "Input array size must be a power of two but was '#{n}'" unless nlog2.floor - nlog2 == 0.0
n2 = n / 2
phi = Math::PI / n2
if doinverse
phi = -phi
else
src.collect!{|c| c /= n.to_f}
end
# build a sine/cosine table
wt = Array.new n2
wt.each_index { |i| wt[i] = Complex(Math.cos(i * phi), Math.sin(i * phi)) }
# bit reordering
n1 = n - 1
j = 0
1.upto(n1) do |i|
m = n2
while j >= m
j -= m
m /= 2
end
j += m
src[i],src[j] = src[j],src[i] if j > i
end
# 1d(N) Ueberlagerungen
mm = 1
begin
m = mm
mm *= 2
0.upto(m - 1) do |k|
w = wt[ k * n2 ]
k.step(n1, mm) do |i|
j = i + m
src[j] = src[i] - (temp = w * src[j])
src[i] += temp
end
end
n2 /= 2
end while mm != n
src
end
end
//usage
real = []
imag = []
@in.fft(false).each {|i| real.push i.real; imag.push i.imag}
output 0, real; output 1, imag
Re: Stream FFT and iFFT
Posted: Wed Jun 19, 2013 9:01 am
by trogluddite
Hi Guys
MyCo wrote: you don't really use the benefits of SSE.
Saving that for later - I'll look into that once the other coding is done, but I thought I'd keep the other SSE channels free for now to give the option of parallel (e.g. stereo) operation.
Bugs! - nope!
There are a few unconditional jumps (why no 'jmp' opcode!) - they are just used to get the execution order right to switch between testing conditions at loop start/end; so the xmm3 value is actually set
later in the code.
MyCo wrote:buffer the incoming samples at the bitshifted positions in an Array? Then you could leave this step (Loop1) out of the main code.
Yes, I have tried several methods of doing that - but oddly enough, they've all come out with higher CPU load so far, despite, as you say, less read/writes and way fewer opcodes. Seems that streamlining the memory pipeline is the crucial thing here - a handful less cache misses per iteration makes a huge difference.
I'll keep working on it though - that transfer is where scaling/windowing would be applied, so it needs tightening up!
digitalwhitebyte wrote:your implementation seems to use this algorithm.
Yup, that's the one!
Re: Stream FFT and iFFT
Posted: Wed Jun 19, 2013 11:28 am
by Tronic
I think it is a good read for understanding the world of the FFT
The FFT Demystified