Discussion:
[Freetel-codec2] alloc/malloc in fsk
Steve
2016-02-14 01:45:47 UTC
I was thinking you were creating a monster there with the memory alloc code.

I replaced all that with:

complex float f1_int[(nsym + 1) * P];
complex float f2_int[(nsym + 1) * P];
complex float f3_int[(nsym + 1) * P];
complex float f4_int[(nsym + 1) * P];
complex float f1_intbuf[Ts];
complex float f2_intbuf[Ts];
complex float f3_intbuf[Ts];
complex float f4_intbuf[Ts];

Poof!

I usually deprecate malloc, as it is old school. If I want true block
storage or dual-port memory, I just use shared memory.

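/* needs <fcntl.h>, <sys/mman.h>, <sys/stat.h>, <stdlib.h>, <unistd.h> */
int fd = open("/run/codec2/codecMemory", O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);

if (fd == -1) {
    exit(1);
}

/* Size the file before mapping it; touching pages past EOF raises SIGBUS. */
if (ftruncate(fd, sizeof(struct codec2)) == -1) {
    exit(1);
}

c2 = (struct codec2 *) mmap(0, sizeof(struct codec2), PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);

if (c2 == MAP_FAILED) {
    exit(1);
}

close(fd); /* the mapping stays valid after the descriptor is closed */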

I've been playing with it for C99. Haven't tested it yet though.

https://bitbucket.org/a-la-mode/fsk

Course, I haven't done firmware since the Z-80 :-)

Steve
Tomas Härdin
2016-02-15 22:14:47 UTC
Post by Steve
I was thinking you were creating a monster there with the memory alloc code.
    complex float f1_int[(nsym + 1) * P];
    complex float f2_int[(nsym + 1) * P];
    complex float f3_int[(nsym + 1) * P];
    complex float f4_int[(nsym + 1) * P];
    complex float f1_intbuf[Ts];
    complex float f2_intbuf[Ts];
    complex float f3_intbuf[Ts];
    complex float f4_intbuf[Ts];
Poof!
Not sure if it's a priority, but VLAs don't work on Visual Studio's C
compiler. Since there are Windows builds on freedv.org I'm assuming
this is important (unless you're using gcc via MSYS or something like
c99-to-c89).

/Tomas
Bruce Perens
2016-02-15 22:21:21 UTC
Malloc is in no way "old school", and is the only way that we should be
allocating variable-length arrays or any large array at all. You should
consider that our platform is often an embedded one and the stack may be as
small as 4k bytes.
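
To put a number on the stack concern, here is a minimal sketch that adds up the eight scratch arrays from the first post and compares the total against a 4096-byte budget. The function names and the sample values of nsym, P and Ts used in the note below are assumptions made for illustration; they are not taken from the codec2 sources.

```c
#include <complex.h>
#include <stddef.h>

/* Hypothetical stack budget, taken from the "4k bytes" figure above. */
#define STACK_BUDGET_BYTES 4096u

/* Bytes needed by the four f*_int arrays plus the four f*_intbuf arrays,
 * with every array length counted in elements, not bytes. */
static size_t fsk_scratch_bytes(size_t nsym, size_t P, size_t Ts)
{
    size_t int_elems    = (nsym + 1) * P;  /* one f*_int array    */
    size_t intbuf_elems = Ts;              /* one f*_intbuf array */

    return 4 * (int_elems + intbuf_elems) * sizeof(float complex);
}

/* Returns 1 if the scratch arrays alone fit in the budget, else 0. */
static int fsk_scratch_fits_stack(size_t nsym, size_t P, size_t Ts)
{
    return fsk_scratch_bytes(nsym, P, Ts) <= STACK_BUDGET_BYTES;
}
```

With made-up values nsym = 48, P = 8 and Ts = 40, the total is 4 * (392 + 40) elements, which at 8 bytes per element is well past a 4k stack before any other locals or call frames are counted.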
_______________________________________________
Freetel-codec2 mailing list
https://lists.sourceforge.net/lists/listinfo/freetel-codec2
Brady O'Brien
2016-02-15 22:33:29 UTC
In the current codebase, I've got all of those wrapped in an ifdef block to
switch between malloc/free and alloca.
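
For readers who have not seen the pattern, a switch of this kind might look roughly like the sketch below. This is not the code from the fsk sources: the FSK_USE_ALLOCA flag, the FSK_SCRATCH macros and the sum_via_scratch function are hypothetical names, and <alloca.h> is a non-standard header that exists on glibc and similar systems.

```c
#include <complex.h>
#include <stddef.h>
#include <stdlib.h>

#ifdef FSK_USE_ALLOCA                      /* hypothetical build flag */
#include <alloca.h>                        /* non-standard header */
#define FSK_SCRATCH(n)      alloca(n)
#define FSK_SCRATCH_FREE(p) ((void)(p))    /* released when the function returns */
#else
#define FSK_SCRATCH(n)      malloc(n)
#define FSK_SCRATCH_FREE(p) free(p)
#endif

/* Copies n samples into scratch storage and writes their sum to *out.
 * Returns 0 on success, -1 if scratch storage could not be obtained. */
static int sum_via_scratch(const float complex *in, size_t n, float complex *out)
{
    float complex *tmp = FSK_SCRATCH(n * sizeof *tmp);
    float complex acc = 0.0f;
    size_t i;

    if (tmp == NULL)
        return -1;
    for (i = 0; i < n; i++)
        tmp[i] = in[i];
    for (i = 0; i < n; i++)
        acc += tmp[i];
    FSK_SCRATCH_FREE(tmp);
    *out = acc;
    return 0;
}
```

The macro has to be expanded inside the function that uses the buffer, because alloca storage disappears as soon as that function returns. The NULL check only matters on the malloc path; alloca gives no failure indication at all, which is part of the stack-size worry raised above.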
glen english
2016-02-15 23:17:38 UTC
Bruce Perens
2016-02-16 17:10:56 UTC
Post by glen english
I use aligned_alloc ... but that's because I write NEON code and like
alignment.

Malloc is guaranteed to return an address that is suitably aligned for any
fundamental type. Be careful when using larger alignments on small CPUs. Pernicious
cache behavior is possible. The classic case is a 2-way set-associative
cache where the addresses of three operands end up hashing to the same
cache bucket because they are aligned and the hash is simply the low
address bits. Every operation causes a cache spill and reload.
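
The aliasing case can be checked on paper with a few lines of arithmetic. The sketch below assumes a made-up cache geometry (4 KiB, 2-way, 32-byte lines, so 64 sets) and indexes the cache with the low address bits above the line offset; real parts differ, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache geometry, chosen only for illustration. */
#define CACHE_BYTES 4096u
#define CACHE_WAYS  2u
#define LINE_BYTES  32u
#define CACHE_SETS  (CACHE_BYTES / (CACHE_WAYS * LINE_BYTES))  /* 64 sets */

/* Set index when the "hash" is simply the low address bits above the
 * line offset. */
static unsigned cache_set(uintptr_t addr)
{
    return (unsigned)((addr / LINE_BYTES) % CACHE_SETS);
}

/* Returns how many of the n addresses land in the same set as addrs[0],
 * counting addrs[0] itself. */
static unsigned same_set_count(const uintptr_t *addrs, size_t n)
{
    unsigned count = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (cache_set(addrs[i]) == cache_set(addrs[0]))
            count++;
    return count;
}
```

Three buffers aligned to 2 KiB (for example 0x20000000, 0x20000800 and 0x20001000) all map to set 0 under these assumptions, which is one more than a 2-way set can hold. Offsetting each buffer by one extra cache line spreads them over sets 0, 1 and 2.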

Thanks

Bruce
glen english
2016-02-15 23:50:35 UTC
Bruce Perens
2016-02-16 17:29:56 UTC
Post by glen english
absolutely NOT on embedded systems with MM!
Glen,

I think you mean embedded systems *without* memory management.

SM-1000 uses an STM32F405 (or F407) and is not in general configured with a
large enough stack.

We also have embedded platforms where memory banks have different speeds,
such as the ARMv7 in Katena. In general these platforms offer a very
limited amount of fast on-die RAM to host the stack and you can't have as
much as you want without putting the whole stack off-die.
glen english
2016-02-16 20:21:53 UTC
Steve
2016-02-16 18:16:45 UTC
I'm wondering whether the SM1000/2000 really needs external RAM on the
board. The RAM seems way too small.

I guess my other objection to malloc is the overhead: all the pointers, and
even the algorithm itself, suck up RAM for no good reason.

I'm a big fan of BSS storage. Make everything a global, and the heap and
stack can never crash. But then I'm showing my Small-C heritage :-)
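
As a rough illustration of the BSS approach, the sketch below gives the scratch arrays static storage sized for the worst case. The FSK_MAX_* bounds and the accessor names are assumptions invented for this example; real limits would have to come from the modem configuration.

```c
#include <complex.h>
#include <stddef.h>

/* Hypothetical compile-time upper bounds. */
#define FSK_MAX_NSYM 48
#define FSK_MAX_P    8
#define FSK_MAX_TS   40

/* Zero-initialised file-scope arrays live in .bss: they cost no stack and
 * no heap, but their size is fixed at build time. */
static float complex f_int[4][(FSK_MAX_NSYM + 1) * FSK_MAX_P];
static float complex f_intbuf[4][FSK_MAX_TS];

/* Returns the f_int row for tone 0..3, or NULL if the requested run-time
 * sizes exceed the compile-time bounds. */
static float complex *fsk_int_row(unsigned tone, size_t nsym, size_t P)
{
    if (tone >= 4 || nsym > FSK_MAX_NSYM || P > FSK_MAX_P)
        return NULL;
    return f_int[tone];
}

/* Same idea for the f_intbuf rows, bounded by Ts. */
static float complex *fsk_intbuf_row(unsigned tone, size_t Ts)
{
    if (tone >= 4 || Ts > FSK_MAX_TS)
        return NULL;
    return f_intbuf[tone];
}
```

The trade-off is that the worst-case footprint is paid permanently and shows up in the linker map, and the code is no longer reentrant: two modem instances would share the same buffers.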
Bruce Perens
2016-02-16 18:35:43 UTC
Katena uses external RAM on essentially the same class of CPU as SM-1000,
but we are running uClinux.
Post by Steve
I guess my other objection to malloc is the overhead: all the pointers, and
even the algorithm itself, suck up RAM for no good reason.

It's really small. Especially compared to the other code we have. Breaking
up and coalescing buffers isn't rocket science.

The problem with automatic variables is that the stack ends up having to be
as large as your largest allocation at your deepest depth of recursion,
even though 99% of the time you are using much less memory than that. It's
also more difficult to measure what you really are using.

Although you get rid of a few pointers, you end up using addressing modes
with larger offsets, and these assemble into larger instructions and more
instructions. ARM has too many addressing modes (like an old CISC
processor) and they have varying efficiency.
glen english
2016-02-16 23:47:20 UTC
Bruce Perens
2016-02-17 00:04:24 UTC
Post by glen english
My concern on anything embedded is that through bad design/bad writing/
bugs, malloc could fail, and that is bad.

You are talking to the guy who wrote Electric Fence :-). If you think
things are bad now, there was a time when the typical programmer could not
find a buffer overrun at all.

Valgrind is the modern tool, and it's very helpful with isolating that sort
of problem. It should always be run, in its various modes, before
production software is released.
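
On the worry that malloc could fail, one common defensive pattern is to allocate everything up front and report failure to the caller rather than carrying on. The sketch below is only an illustration of that pattern: struct fsk_scratch and both functions are hypothetical names, not codec2 API.

```c
#include <complex.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical scratch bundle; the fields mirror the arrays discussed
 * earlier in the thread. */
struct fsk_scratch {
    float complex *f_int;     /* (nsym + 1) * P elements */
    float complex *f_intbuf;  /* Ts elements             */
};

/* Allocates both buffers or neither. Returns 0 on success, -1 on failure,
 * leaving both pointers NULL so a later free is harmless. Sizes are
 * expected to be non-zero. */
static int fsk_scratch_init(struct fsk_scratch *s, size_t nsym, size_t P, size_t Ts)
{
    s->f_int    = calloc((nsym + 1) * P, sizeof *s->f_int);
    s->f_intbuf = calloc(Ts, sizeof *s->f_intbuf);
    if (s->f_int == NULL || s->f_intbuf == NULL) {
        free(s->f_int);
        free(s->f_intbuf);
        s->f_int = NULL;
        s->f_intbuf = NULL;
        return -1;
    }
    return 0;
}

/* Releases both buffers and clears the pointers. */
static void fsk_scratch_free(struct fsk_scratch *s)
{
    free(s->f_int);
    free(s->f_intbuf);
    s->f_int = NULL;
    s->f_intbuf = NULL;
}
```

A test program built around these functions can then be run under Valgrind, for example with valgrind --leak-check=full followed by the test binary, to confirm that every path releases what it allocated.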

For a while I've been trying to write a functional programming language
that is actually practical :-) There are many software errors that our
tools should not allow us to make.

Thanks

Bruce
