Microcontroller C programming
Good examples of efficient algorithms, ones that improve both execution speed and code size, are easy to find. To write a truly optimal algorithm you have to know how the compiler works. We are not going to analyze compilers here, but we can go through a few rules and tricks that help get there.
Beginning
What can you do when the program exceeds its size limit, or when some part of it is not fast enough? You might say that those parts are best written in assembler, but is that the only solution? Maybe there is a way to do it in C.
It is no secret that the same result can be achieved in different ways: some programmers prefer dynamic variables and others arrays, some prefer case and others if statements. This is programming style, and almost everyone has their own. The examples below use the AVR-GCC compiler with the optimization keys “-O0” (no optimization), “-O3” (optimize for speed) and “-Os” (optimize for code size). The “-Os” and “-O3” optimizations give similar results.
What C cannot do
C is a high-level language, which means it is not tied to particular hardware. This is why the programmer has no direct access to processor resources such as the stack, the flags or the registers. For example, in pure C:
• It is impossible to check whether an arithmetic operation overflowed (to check this you have to read the overflow flag);
• It is impossible to organize multithreaded operation, because that requires saving the register values to preserve each thread's state.
Of course, we all know that for those tasks we can use libraries such as io.h.
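The overflow flag itself is out of reach, but for unsigned operands the same information can often be recovered in plain C by comparing the result with one of the operands. A minimal sketch, assuming 8-bit unsigned values; the helper name add_u8_checked is hypothetical and does not come from the article:

```c
#include <stdint.h>

/* Adds two 8-bit unsigned values (hypothetical helper).
   Stores the wrapped sum in *sum and returns 1 if the addition
   overflowed, 0 otherwise. Unsigned wrap-around is well defined in C,
   so an overflow shows up as a result smaller than the first operand. */
static int add_u8_checked(uint8_t a, uint8_t b, uint8_t *sum)
{
    *sum = (uint8_t)(a + b);
    return *sum < a;
}
```

Whether the compiler turns this comparison into a test of the carry flag depends on the target and the optimization level, so it is worth checking the listing.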
Harvard Architecture
Usually we program for the von Neumann architecture, where program and data share the same memory space. In a Harvard-type architecture there are several kinds of memory (Flash, EEPROM, RAM…) and they differ from each other. Traditional C provides no support for different kinds of memory. It would be convenient to write something like this:
ram char buffer[10]; // Array in RAM
disk char file[10]; // Array in Disc
for (i=0;i<10;i++) // Writing 10 chars '0'
{
file[i]='0'; //To disc
}
strncpy(buffer,file,10); // from disc to buffer
Because this is not supported, we have to use special functions to work with the different kinds of memory.
Array of structures or “structure” of arrays
The structure is one of C's most convenient constructs. Using structures makes code easier to read and analyze, and it is also the only way to place related data in memory in a fixed order. But let's see whether using structures is always a good idea.
For example, let's describe 10 sensors:
struct SENSOR
{
unsigned char state;
unsigned char value;
unsigned char count;
};
struct SENSOR Sensors[10];
What do you think the compiler does when reading the value of sensor x? It multiplies x by 3 and adds 1. So a multiplication is needed to read a single byte, which is very inefficient. It is better to use arrays:
unsigned char Sensor_states[10];
unsigned char Sensor_values[10];
unsigned char Sensor_counts[10];
This is less readable, but the code runs faster because no multiplication is needed. On the other hand, structures are the better choice when whole-structure operations such as copying are needed.
It is worth mentioning that in this case the compiler replaced the multiplication with shifts, but with more complicated structures that is not always possible.
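The address arithmetic behind the two layouts can be made visible with sizeof and offsetof. The sketch below repeats the declarations from the text; the two function names are hypothetical and exist only to show the calculation:

```c
#include <stddef.h>

struct SENSOR
{
    unsigned char state;
    unsigned char value;
    unsigned char count;
};

struct SENSOR Sensors[10];
unsigned char Sensor_values[10];

/* Byte offset of Sensors[x].value from the start of Sensors:
   x * sizeof(struct SENSOR) + offsetof(struct SENSOR, value),
   i.e. x * 3 + 1 when the structure has no padding. */
size_t struct_value_offset(size_t x)
{
    return x * sizeof(struct SENSOR) + offsetof(struct SENSOR, value);
}

/* Byte offset of Sensor_values[x] from the start of Sensor_values:
   just x, because each element is one byte. */
size_t array_value_offset(size_t x)
{
    return x * sizeof Sensor_values[0];
}
```

With the separate array the index is the offset, which is why no multiplication appears in the generated code.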
The results:
==============================================================
                            -O0               -O3
                        words  clocks     words  clocks
==============================================================
Reading from structure    16     19         12     13
Reading from array         9     12          6      7
--------------------------------------------------------------
Ratio (times)            1.8    1.6        2.0    1.9
==============================================================
Copy of structure         41     81         26     42
Copy of array elements    44     55         43     49
--------------------------------------------------------------
Ratio (times)            0.9    1.5        0.6    0.9
==============================================================
Listings can be viewed here.
Branching “Switch”
In C it is convenient to write conditional statements with switch(). The compiler optimizes this construct well. But there are some nuances: the switch statement promotes its variable to int, even if the variable is of type char. For example:
char a;
char With_switch()
{
switch(a)
{
case '0': return 0;
case '1': return 1;
case 'A': return 2;
case 'B': return 3;
default: return 255;
}
}
char With_if()
{
if(a=='0') return 0;
else if(a=='1') return 1;
else if(a=='A') return 2;
else if(a=='B') return 3;
else return 255;
}
The type conversion adds code size. Listings are here.
Results:
====================================
                 -O0      -O3
                 words    words
====================================
With_switch       57       33
With_if           40       25
------------------------------------
Ratio (times)    1.4      1.3
====================================
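The promotion that switch() applies can be observed directly, because unary plus performs the same integer promotions on its operand. A small sketch; the function names are hypothetical:

```c
#include <stddef.h>

char a = 'A';

/* Size of the variable as declared: one byte. */
size_t declared_size(void)
{
    return sizeof(a);
}

/* Size of the value that switch(a) actually compares: the controlling
   expression undergoes the integer promotions, so a char becomes an int.
   Unary plus applies the same promotions, which makes the effect visible. */
size_t promoted_size(void)
{
    return sizeof(+a);
}
```

On AVR-GCC an int is two bytes, so every comparison in the switch works on a 16-bit value, while the if chain can compare single bytes.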
Signed bytes
Starting with AVR-GCC version 3.2, it is not possible to pass a char variable to a function, or return one from it, as a single byte: char values are always extended to int.
char b;
unsigned char get_b_unsigned()
{
return b;
}
signed char get_b_signed()
{
return b;
}
After compiling with the “-O3” key, the result does not depend on the variable “b” itself; what differs is the return type. Listing fragment:
get_b_unsigned:
lds r24,b ; LSB to r24
clr r25 ; MSB=0.
ret
get_b_signed:
lds r24,b ; LSB to r24
clr r25 ;MSB=0
sbrc r24,7 ;skip if bit 7 is clear (LSB>=0)
com r25 ;0xFF otherwise
ret
Although the result is a single byte, GCC always computes the second byte as well. If the number is unsigned, the MSB is always zero; if the number is signed, the processor has to execute 2 additional instructions.
Of course, in many cases this does not play a significant role in performance, but if you know that the results will never be negative, it is better to use unsigned types. This matters most to those who like to use a lot of functions, although frequent function calls slow the program down anyway.
To make char unsigned by default, add “-funsigned-char” to the makefile. This makes every plain char unsigned; otherwise the compiler makes its own choice…
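The two listings correspond to the two ways C widens a byte to an int. A minimal sketch with hypothetical function names, written so that it also runs off-target:

```c
/* Zero extension: the high byte of the result is always 0,
   matching the single "clr r25" in the listing. */
int widen_unsigned(unsigned char b)
{
    return b;
}

/* Sign extension: the high byte becomes all ones when bit 7 of b is set,
   matching the extra "sbrc"/"com" pair in the listing. */
int widen_signed(signed char b)
{
    return b;
}
```

The same bit pattern 0xF0 therefore widens to 240 through the unsigned version and to -16 through the signed one.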
Help compiler
There are situations where the branches of a large program share common parts, e.g. the branches end with the same statements: clear a buffer, increment a counter, set a flag and so on. It is not always convenient to put those operations into a function or macro. The compiler can merge them by itself; it just needs a little help.
Let’s look at an example. The function does something depending on a variable and then performs the same operations: increments a counter, clears the state and adds the length to the index (this is only a demonstration). Let’s write it as a switch() statement:
void long_branch(unsigned char c)
{
switch(c)
{
case 'a':
UDR = 'A';
count++;
index+=length;
state=0;
break;
case 'b':
UDR = 'B';
state=0;
count++;
index+=length;
break;
case 'c':
UDR = 'C';
index+=length;
state=0;
count++;
break;
default:
error=1;
state=0;
break;
}
}
Compile this with the “-O3” key. Result: 66 words. Now let’s reorder the statements:
void long_branch_opt(unsigned char c)
{
switch(c)
{
case 'a':
UDR = 'A';
count++;
index+=length;
state=0;
break;
case 'b':
UDR = 'B';
count++;
index+=length;
state=0;
break;
case 'c':
UDR = 'C';
count++;
index+=length;
state=0;
break;
default:
error=1;
state=0;
break;
}
}
Compilation now gives 36 words.
What happened? Nothing special: after reordering, every branch ends with the same statements. The compiler recognizes the identical tails, compiles one copy and puts a JMP to it in the other places. It is important that the identical parts are at the ends of the branches, otherwise this does not work.
In real programs this is not always possible, but:
• Sometimes it can be arranged artificially by adding code;
• Not all branches have to end identically; there can be several groups with different tails.
So code size can be reduced just by changing the order of statements.
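When the shared statements really are identical, the same effect can also be obtained by hand, by moving the tail out of the switch. The sketch below shows that variant; it is not the article's measured code, and UDR_sketch and index_pos are hypothetical stand-ins for the UART data register and the index variable so that the example also builds off-target:

```c
/* Plain variables stand in for the article's globals (assumed types). */
unsigned char UDR_sketch, count, index_pos, length = 4, state, error;

void long_branch_shared_tail(unsigned char c)
{
    switch (c)
    {
    case 'a': UDR_sketch = 'A'; break;
    case 'b': UDR_sketch = 'B'; break;
    case 'c': UDR_sketch = 'C'; break;
    default:
        /* The error branch does not share the tail. */
        error = 1;
        state = 0;
        return;
    }
    /* Common tail, written once for the three matching cases. */
    count++;
    index_pos += length;
    state = 0;
}
```

This is only practical when the branches differ in one small step; the reordering trick from the text works even when the tail cannot be pulled out so cleanly.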
Why the “heaps” are needed
Many programmers like to use dynamic memory. A special structure, the heap, is used for this. On computers the heap is managed by the operating system, but on microcontrollers, where there is no operating system, the compiler creates a special segment for it. The standard library defines the functions malloc and free for allocating and freeing memory.
Dynamic memory is sometimes convenient, but the price of that convenience is high, and when resources are limited it can be critical.
What happens when the heap is used? Let’s write a simple program that doesn’t use dynamic memory:
char a[100];
void main(void)
{
a[30]=77;
}
The compiled code is small. Writing to an array element takes two clock cycles, because the address of each element is known at compile time. Program size is 50 words and data memory is 100 bytes. The main() function executes in 9 cycles, including stack initialization.
The same program, but using the heap:
#include <stdlib.h>
char * a;
void main(void)
{
a=malloc(100);
a[30]=77;
free(a);
}
Program size is 325 words and data memory is 114 bytes. Writing to an array element takes 6 cycles (5 opcodes). The main() function executes in 147 cycles, including stack initialization.
The program grew by 275 words: malloc takes 157 words, free takes 104 words, and the remaining 14 words are for calling those functions. Working with the array elements has also become more complicated. Initialization of the array writes 0 to each element. The extra 14 bytes of data memory are used as follows: 10 bytes for the heap management variables, 2 bytes for the pointer to the array, and 2 bytes in front of the memory block to store its size, which free uses.
So it is better not to use dynamic memory when resources are very limited.
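When blocks really must be handed out and returned at run time, one common alternative to the heap is a small pool of fixed-size blocks reserved statically. This is not from the original article; the sketch below assumes that all requests fit in one block, and the names and sizes are hypothetical:

```c
#include <stddef.h>

#define BLOCK_SIZE  16  /* bytes per block (assumed value) */
#define BLOCK_COUNT 4   /* number of blocks (assumed value) */

static unsigned char pool[BLOCK_COUNT][BLOCK_SIZE];
static unsigned char in_use[BLOCK_COUNT];

/* Returns a free block, or NULL when all blocks are taken. */
void *pool_alloc(void)
{
    unsigned char i;
    for (i = 0; i < BLOCK_COUNT; i++)
    {
        if (!in_use[i])
        {
            in_use[i] = 1;
            return pool[i];
        }
    }
    return NULL;
}

/* Marks a block obtained from pool_alloc() as free again. */
void pool_free(void *p)
{
    unsigned char i;
    for (i = 0; i < BLOCK_COUNT; i++)
    {
        if (p == (void *)pool[i])
        {
            in_use[i] = 0;
            return;
        }
    }
}
```

The memory cost is fixed and visible at link time, at the price of wasting space when requests are smaller than a block.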
Typical errors
Let’s go through a few typical errors; knowing them helps to avoid trouble.
Reading string from flash memory
AVR-GCC does not know which memory a pointer refers to, program memory or data memory. By default it assumes RAM. To read from Flash memory, use the macros from the “pgmspace.h” library:
#include <io.h>
#include <pgmspace.h>
prog_char hello_str[]="Hello AVR!";
void puts(char * str)
{
while(PRG_RDB(str) != 0)
{
PORTB=PRG_RDB(str++);
}
}
void main(void)
{
puts(hello_str);
}
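avr-libc also provides PROGMEM and pgm_read_byte() in <avr/pgmspace.h> for the same purpose. The sketch below uses them to copy a Flash string into a RAM buffer; copy_from_flash is a hypothetical helper name, and the #else branch is an assumption added only so that the example builds off-target:

```c
#include <stddef.h>

#ifdef __AVR__
#include <avr/pgmspace.h>   /* PROGMEM, pgm_read_byte */
#else
/* Off-target fallback: everything lives in one address space there. */
#define PROGMEM
#define pgm_read_byte(addr) (*(const unsigned char *)(addr))
#endif

const char hello_str[] PROGMEM = "Hello AVR!";

/* Copies a zero-terminated string from program memory into a RAM buffer
   of dst_size bytes, always terminating the result.
   Returns the number of characters copied. */
size_t copy_from_flash(char *dst, size_t dst_size, const char *src)
{
    size_t n = 0;
    char c;
    if (dst_size == 0)
        return 0;
    while (n + 1 < dst_size && (c = (char)pgm_read_byte(src + n)) != 0)
    {
        dst[n++] = c;
    }
    dst[n] = '\0';
    return n;
}
```

Once the string is in RAM it can be passed to ordinary functions that expect a data-memory pointer.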
Reading bit from port
void Wait_for_bit()
{
while( PINB & 0x01 );
}
When optimization is turned on, the compiler evaluates (PINB & 0x01) once, stores the result in a register and then tests that register. The compiler does not know that PINB can change at any moment, independently of program flow. To avoid this, use the macro from the file “sfr_defs.h” (which is included by “io.h”). For example:
void Wait_for_bit()
{
while ( bit_is_set(PINB,0) );
}
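The underlying requirement is that the register is read through a volatile-qualified access on every pass of the loop. A sketch of that idea in plain C follows; fake_pinb is a hypothetical stand-in for a port register, and the poll limit is an assumption added so that the loop cannot hang off-target:

```c
#include <stdint.h>

/* Hypothetical stand-in for an I/O register; on real hardware the
   register lives at a fixed address. */
volatile uint8_t fake_pinb;

/* Tests one bit through a volatile-qualified pointer, so the compiler
   must re-read the location on every call. */
static int reg_bit_is_set(const volatile uint8_t *reg, uint8_t bit)
{
    return (*reg & (uint8_t)(1u << bit)) != 0;
}

/* Spins while the bit is set, giving up after max_polls reads.
   Returns the number of polls that found the bit set. */
unsigned int wait_while_set(const volatile uint8_t *reg, uint8_t bit,
                            unsigned int max_polls)
{
    unsigned int polls = 0;
    while (polls < max_polls && reg_bit_is_set(reg, bit))
        polls++;
    return polls;
}
```

A poll limit of this kind is also a cheap way to avoid locking up forever when the expected signal never arrives.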
Waiting for interrupt flag
The function has to wait until an interrupt occurs:
unsigned char flag;
void Wait_for_interrupt()
{
while(flag==0);
flag=0;
}
SIGNAL(SIG_OVERFLOW0)
{
flag=1;
}
The problem is the same: the compiler does not know when flag can change. The solution is to declare the variable volatile:
volatile unsigned char flag;
void Wait_for_interrupt()
{
while(flag==0);
flag=0;
}
SIGNAL(SIG_OVERFLOW0)
{
flag=1;
}
Delay
This function is meant to produce a delay:
void Big_Delay()
{
long i;
for(i=0;i<1000000;i++);
}
The problem lies in compiler optimization. It is obvious to the compiler that the function does nothing: it returns no value and changes nothing that is visible outside the function. The function can be optimized down to nothing, although in practice the compiler leaves a few cycles.
To avoid this, use a macro from “delay.h”, or put an assembler instruction inside the loop to force the compiler to generate the full loop:
#define nop() {asm("nop");}
void Big_Delay()
{
long i;
for(i=0;i<1000000;i++) nop();
}
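Another way to keep the loop, not mentioned in the article, is to declare the counter volatile, so that every increment and comparison counts as an observable access. A minimal sketch with a hypothetical function name:

```c
/* Busy-wait with a volatile counter: the compiler has to perform every
   read and write of i, so the loop cannot be removed.
   Returns the final counter value. */
unsigned long busy_wait(unsigned long iterations)
{
    volatile unsigned long i;
    for (i = 0; i < iterations; i++)
    {
        /* empty on purpose */
    }
    return i;
}
```

The actual delay still depends on the clock frequency and on the code the compiler generates, so a loop like this is not calibrated.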
Source:
• http://myavr.narod.ru/c_style.htm

December 9th, 2006 at 2:26 am
Thanks Very much for this Good introduction,I Learned alot from it
Thanks Again
December 5th, 2007 at 5:48 pm
while going through it some points became clear.
thanks
January 2nd, 2008 at 7:47 am
I want code for lcd display in lpc938 through I2c.
can you help me?