Tag: programming

Strcpy blues

Where do you set the notch for a C programmer declaring “good C/C++” in his/her resume? It’s “developpa” hiring time again at the company I work for, so I am conducting some technical interviews with junior programmers to find the best candidate.
Before this point we had many internal discussion about having the candidate to perform some coding during the interview or not, to the point I got a reputation of being the Bad One – a sort of a Torquemada of the C Language.
In fact I fairly agree with many others that if you are looking for a programmer (i.e. someone who writes code for a living) you should see how he/she actually writes code. Some colleagues felt embarrassed in asking this to an experienced programmer or were afraid of intimidating excessively the candidate.
Then we had a couple of hires whose skill claims weren’t exactly matched by their real skills, therefore we eventually agreed on proposing the candidate to write a few lines of code then using those as a base for more profound questions.
With the best intentions of making the candidate at ease we asked to write the implementation of a short standard library function, say memcpy, strcpy or strcmp. In order to have a common base for evaluation I focused my demand on strcpy.
Such functions should be famous enough not to scare anyone and easy enough to implement so that further discussion (improvement, optimization, pitfalls…) could be tackled to know better the candidate’s attitudes.

Well, I was wrong.

The first problem my candidates encountered was well before the implementation, in fact they had trouble in the definition of a string in C. Most of them opted for a function signature employing an undefined type “String”, someone opted for a pair of square brackets at the right end of the name.
That sheds an unsettling light on the candidate’s knowledge both of the standard library and the C string concept.
Anyway I didn’t say anything and left candidates going on. For all of them it was clear that a loop was needed, but most of them panicked on the string length. As I expected from the string type, no one of them was sure about what a C string is. One said that he expected a string length somewhere. So I pointed them in the right direction stating that a C string is more a convention that a language type, consisting in a sequence of characters terminated by 0.
Most of the candidates were able to use the array notation with an assignment. Notably one candidate failed to write the assignment operator pretending that it was the double equal (he said this twice, he was pretty emotional, but I am afraid he’s really convinced).
No one of them attempted the pointer way with post-increment. I am not a fan of the pointer notation, sometimes it is clearer, sometimes is not. For simple cases like this the optimizer is able to convert the source into the most optimized version. But I expected that a candidate could go the pointer way in order to impress the interviewer. Anyway it wasn’t the case. Worse a candidate had a doubt whether he should dereference or not the pointer with the array notation.
Well, we were looking for a junior C programmer, but I didn’t expect such low score in the C knowledge. Maybe in their experience they hadn’t chance of using the string part of the C standard library, but I would say that pointers were quite stranger to most of them.
Given the technical interview I underwent in my career and the technical interview I read about in Internet forums, I would say that in other countries programmers (even junior) are better prepared, while in Italy it seldom occurs in interviews to require the candidate to write lines of code.
I quite agree that the knowledge of the specific language is not everything for evaluating a candidate, but I think it says a lot about his/her interests, will to learn and to work as a programmer. For these reasons I tried to be flexible, ignoring details of C syntax rather focusing on the problem solving abilities. But the lack of a starting point (the code implementation) seriously hampered my options.
Also I recall that when I left the school anyone in my university friends circle could craft a strcpy in a matter of minutes (I would add blindfolded, but that would be an exaggeration). So have the times changed? Or it is my memories going wild? Should I lower the notch or changing entirely the kind of interview questions?

From the management point of view (as they told me) a junior programmer is expected not to be fluent in a programming language and I expect that a junior has no experience of medium/large code base and software engineering can be just abstract rubbish (at best) to him/her. So I have been instructed to look for learning skill and passion. I found a hard time in formulating questions for evaluating the learning skill of a candidate (well I had some ideas about mazes, food and electrical shocks, but they told me that can’t be done with humans… even if they are junior programmers).

PIC18F Software Project Survival Guide 5

This is the fifth installment of a series of posts regarding software programming on PIC18F cpu family. Previous installments: first, second, third and fourth.
Lost in memories
The first problem you are faced with when designing your application is the PIC memory architecture. And saying that this is a problem is like saying that a multi-head nuke missile is a sort of firecracker. You could say that most of the problems you’ll find working with PIC18s can be classified in two sets: those who stem from memory and those who impact on memory.
First problem is size.
PIC 18F if programmed in C (and you want to use C and not assembly to keep you mental sanity for what is left of your life out of the working hours) tends to be quite memory hungry. Code density is low even if compared to the early CPUs developed by mankind.
To make things worse, Harvard Architecture isn’t going to help you – since pointers are implemented differently down into the assembly instructions level whether they point to Ram (file register) or to flash (Program Memory), you will need to code the same function twice (or more). Consider that the standard library strcpy function has four different implementations because of the four combination you get from its arguments (copy ram to ram, ram to flash, flash to ram and flash to flash).

I read about a C compiler that masks these differences away, (with a pointer wrapper, I presume) but according to the website is far from being production level completeness. Also such an approach penalizes execution time when you know where data are located and that their position is not going to change.
If you really need to handle data both in data memory and program memory you can write a wrapper. I need to sequentially read bytes from any source, so I wrote two wrapper classes ByteReader (and ByteWriter with limited functionalities). The additional benefit is that you can adapt the wrapper to read from an external flash memory as well.

When you let your hardware engineers decide which PIC18 to put on board of the device you are developing, take care about some flash memory subtleties.
All PICs 18F can program their own program memory. But there is a sub-family of PICs (18Fx(456)K2x) whose members have a dedicated small (usually 1k or less) data flash attached to the CPU core by a third memory bus.
You may wonder why the Microchip engineers went the chore of adding a third bus and differentiate on chip memory and addressing. Well, they had, indeed, very good reasons:

  1. program memory can be written one byte at time, but needs to be erased one 4k page at time. 4K on 128k is quite a fraction, but worse, you are forced to juggle with data you want to preserve and considering you don’t have 4k of data memory it is going to be quite a juggle.
  2. When you write the program memory the corresponding bus is stalled, since this is the bus where the program instructions are taken and there is no instruction cache, the CPU is stalled. Typical erase/write times for a flash memory causes the CPU to stall for 5-10ms, possibly more and that can be a showstopper for a real time application.

If you need persistent storage and there are no PICs with the feature list you need, you may resort to an external flash memory connected either by I2C or SPI.

Anyway, regardless of the application, always beg for the device with the largest memory (i.e. 128k in the current production)
Since you are going to sell billions of devices there will be some pressures about picking a small memory footprint device. Resist! You must not lose this battle! You can argue that PIC with different memory sizes are pin-to-pin compatible and there’s no need to add risks on the development when you can downsize the memory in pre-production or at the first technical review.
128k of program memory may seem a lot for an embedded system, but given the low code density and the optimizer naivete is not that much.
On some devices of the 18F family (the ones with a high pin count) you can extend program memory with an external memory.
For our application we managed to fit everything in the base memory and we used the extra pins to connect a LCD screen (the parallel port and memory bus share the same pins). Also we employed an external flash for data only connected either via an SPI or an I2C depending on the specific device we were developing.

Harvard Architecture
Although it could seem a good idea from a theoretical point of view, having two distinct addressing spaces for program and data, isn’t a good idea and having distinct instructions, with distinct addressing modes makes things ugly.
In fact you do want store data in non volatile memory – initialization data, constants, lookup tables – would be only for you have at most 4k of data memory.
The compiler trying to favor performance over conformance doesn’t help much.
Let’s start from the void pointer (void*). In C this kind of pointer is just a (temporary) container for any pointer. You take any typed pointer and you can convert it into void* and then, when you need it, you have to convert it back to the original type.
With MCC18 you have two main problems stemming from the default storage chosen for pointers. In facts pointers are data memory pointers by default. Void pointer makes no exception and it is two bytes long. The problem is that it cannot host a program memory pointer when you use the large memory model. Large memory model is needed only when you need to access more than 64k of program memory and implies that program memory pointers are 3 bytes long.
The other problem is that, by this convention, the pointed type it is not enough to tell apart program and data memory pointer. That is be it a uint8_t* or a uint8_t const* you cannot tell anything about the memory region the uint8_t is stored.
For this discrimination MCC18 provides two qualifiers: rom and ram; the latter being optional, since by default everything is stored in ram.
On one side I prefer the qualifier approach when compared to an “intelligent” approach where the compiler decides silently where to put what based on “const”-ness or other consideration. In fact I use const qualifier quite everywhere and not just for data stored in program memory.
On the other side having to explicitly provide several versions of the same function is a plain waste of space. I would like having a flexible approach where by default I have generic pointers, handled by the compiler through an adapter layer, and specialized pointers (rom/ram) on demand where performance matters.

Hardware and Software stack
First PICs were really simple processors featuring a couple of levels for the call stack. 18F architecture has 31levels of call stack, thus enabling the CPUs to power medium sized architecture.
At a first glance 31 levels may seem a lot… what the heck, even Windows stack trace rarely spans over such a distance.
That would be fine, but let’s take a closer look at what happens when you run low on memory and you enable all the optimizations.
One of such optimization is called Procedural Abstraction and does a fine work in trimming down the size of the code. This transformation examines the intermediate code (or maybe the assembly code directly) and creates subroutines for duplicated code snippets. Operating at a lower abstraction level than the C source, the optimization has far more opportunities of applying the transformation.
Although clever the optimization has a drawback – it takes out of programmer’s hands the call stack control. This is generally true for every optimizing compiler (e.g.: when the compiler moves a static function into the calling place), but to a much lesser extent. MCC18 is capable of factorizing every bunch of assembly instructions in the middle of every C statement building up C lines by composing several subroutines. A nightmare to debug, a hell to understand in the disassembly listing and a sure way to eat those 31 call stack entries really quick.
I already wrote about how to recover from a stack over/underflow and restart debugging without having to re-program the chip. Now let’s see how to avoid the overflow at all.
First you can decided the stack overflow handling by setting the STVREN configuration bit. Basically you can chose among the “ignore” and the “trap” policies. Unfortunately, as we’re going to see, they are both rather ineffective.
Ignoring stack overflow means that when the limit is trespassed the execution continues jumping at the requested address without pushing the return address into the call-stack.
This means that at the first return instruction, rather than returning to the caller, execution jumps back last non-overflowing call.
Ideally, since overflow flag STKFUL (in STKPTR register) is set on overflow you could think of a function stub that checks this flag on entry. The trouble is that once you are in the called function you have no way to recover the return address since it is lost.
Changing ignore to trapping may seem more promising. In fact when this mode is selected, on limit trespassing the execution jumps to address 0x0 and the STKFUL flag is set. This acts somewhat like a reset, but since the micro state is not reset you could think of it as a trap.
Shamefully, yet again you don’t have a way to recover a return address, so you can’t do a save/restore call-stack.
After headscraping far too long on this problem, I decided to simplify the problem making the assumption that if the stack overflow occurs, it occurs only within interrupts. That makes somewhat sense since for sure interrupts are going to impact on the stack. So I added some code in the low level interrupt for checking the stack pointer against a threshold and saving all the hardware stack in a data memory region. This allows to reset the stack pointer and continue to execute the interrupt code. Before leaving the interrupt the opposite operation is performed.
Ideally it would have been nice to have the C run time support to handle the issue by adding a prologue/epilogue to every function so that hardware stack could have been “virtualized”. It is not much of a pessimization since you have already to handle the software stack for parameters and you may optimize out prologue/epilogue for functions that do not perform subroutine calls.
As of today no industrial C compiler implements this.

Next time I’ll write about Extended Mode.

PIC18F Software Project Survival Guide 4

This is the fourth installment of a series of posts regarding software programming on PIC18F cpu family. Previous installments: firstsecond and third.
IDE
The IDE, named MPLAB, is not brilliant, but it could be worse. I saw on the Microchip website that they are going to have a new edition – namely MPLAB X – (now in Beta) that is based on Netbeans. That means that what I am writing here may become quite obsolescent in the next months.
At the time of writing MPLAB X lacks of a number of feature (extended mode just to name one) that prevent its use in a professional environment. If you do not need extended mode, then maybe the worse part is that compilation is overly sluggish.
Anyway I commend the choice of Netbeans, which I prefer over the most widespread Eclipse.
Back to the current MPLAB IDE. You can configure the IDE to get it less uncomfortable. For example you can right click on top left corner of the windows and make them dock-able. When you are satisfied with the layout remember to save it because it is not granted that MPLAB restores it at the next run (it is about two years I am using MPLAB, but I haven’t found yet the reason for which often windows and panels layout is lost from a run to another).
You can redefine keyboard shortcut, but there isn’t a keyboard equivalent for each GUI control. E.g. the pretty useful “compile this file” is missing.
There is a “Goto Locator” contextual option that (possibly) brings you to the definition/declaration of the symbol under the cursor. You have to explicitly enable this option and it works only after a successful compilation. You can also enable auto-completion, but it seldom works. Goto Locator option is hidden nearly as the legendary pin in the haystack. Just open any text file, right click in the edit area and select “Properties”. “Editor Properties” dialog box pops up. Click on “General” tab and select “Enable Tag Locators”. Now that you are there you can switch to the “C File Types” tab and add line numbers, set the tab size and have the tab key to insert spaces . You can also have a look at the “Tooltips” tab where you could enable the auto-complete (which is mostly wrong), and at the tab “Other” where you can enable the ruler marker for the 80th column.
One of the most annoying limitation we experienced is the limitation to 32 characters for the find/find in files commands.

About debugging
Debugging is not a nightmare provided you have the ICD3 hardware debugger AND you keep enabling the software breakpoints (not that MPLAB 8.66 doesn’t let you use software breakpoints because of a bug, so don’t install that version).
(Well actually it is not a nightmare until you get short on Program Memory and you need to enable space saving optimization, but that it is another story).
Hardware breakpoints have weird limitations – first and foremost they are just three, and, at least one, is used for single stepping. Next they “skid”, when the debugger stops on a hardware breakpoint it never halts exactly at the line where the breakpoint is set, but a few lines after. If the line after the breakpoint contain a function call, then the execution may stop in that function call, i.e. somewhere else with respect to where you placed the breakpoint.
In the same way single stepping is much like ice skating, especially if you are trying to step over function calls.
Software breakpoints are immune the plagues that pester hardware cousins. Therefore I cannot see any reason, beyond self punishment, to stay with hardware breakpoints.
The only point of attention with software breakpoints is to routinely check that they are enabled since MPLAB seems to forget this option.
In the same way MPLAB is likely to forget the event breakpoints. These are special conditions that can be programmed to halt the debugger. One of such conditions is the processing of a Sleep instruction and the other is the stack overflow/underflow condition.
When running the debug version of a program, I usually #define assert to halt the debugger, so that failed assertion directly bumps up on the PC screen and I am able to examine the whole machine state. So Sleep break event is very handy (when MPLAB remembers about it, otherwise is… surprising).
Halting on stack overflow is welcome since there is no other way to detect call stack overflow. According to the configuration bit you may chose whether to reset or to continue
If you stumble into a stack overflow you are left at address 0, apparently without any chance to run the application again.
This is because the code has to clear the stack overflow/underflow condition, but this is prevented by the debugger that is halted because of such condition. At the same time this is a very clear sign that the stack messed-up. The confirmation for stack over/underflow may be obtained by a look at the STKPTR register (search for it in the View/Special Register File window). If STKPTR has bit 7 or 6 set then the stack exploded.
To run the program, unless you want to re-program the chip, you simply change the STKPTR register value to 0x01.

If you are looking for ram or data memory bear in mind that it is called “File Registers”. While “Program Memory” is the flash memory. Unfortunately program memory can be displayed only word by word (2 bytes) (and, confusingly enough, in little endian format). Conversely ram can be display only byte by byte.
The other serious limitation is that program memory located variables are not displayed neither in the watch window nor in the local variable window. Your only chance is to note down the address of the variable and look its memory content up in the memory dump window.
The feature I miss most is the stack view with the run-to-return debugging option. You cannot peek neither at the hardware stack (return stack) nor at the software stack.
You will find plenty of options that are not disabled when they should be, for example you can always select the stack analysis report, but it works only when the extended mode is selected (and when you don’t employ function pointers which is quite a constraint). Hardware stack is always displayed as filled by zero, and I haven’t found (yet) the conditions under which it reports something meaningful.
MPLAB is so clumsy when compared to any modern IDE (Eclipse, Netbeans, Visual Studio) that you may find yourself more comfortable to develop and maintain the code with one of those IDE and then to resort to MPLAB only for building the firmware and debugging it.

Next time I’ll write about memory.

PIC18F Software Project Survival Guide 3

This is the third installment on a series of post regarding software programming on PIC18F cpu family. You can find the first here and the second here.
Linker
The linker is expected to group together the code from all the different compilation units produced by the compiler and generate the binary code. Since Pic18f architecture is what it is, this is not a trivial task.
Compiler groups data in data sections. These sections may be assigned to a specific region of the data memory via the linker script.
This file is defined for every project and you’d better to familiarize with its syntax.
In fact some troubles may arise by bad linker script and can be fixed by changing the linker script.
For example the compiler uses a temporary data section (named .tmpdata) – to store intermediate expression results – that is free to float around in the data memory. If the linker script is modified without care, then this section may fall across the banks boundary causing wrong computation (in the best case, memory corruption in the worst).
The default project comes with a default linker script that avoids data object to get across bank boundaries. (Note that linker script banks are not data memory banks, but user defined regions of memory, you may want to make linker script banks coincide with data memory banks for avoiding bank switching problems). So, by default, you are protected from this kind of fault (at the cost of some slack space, unless your code is so lucky to fill perfectly all the pages). But when the project size increases your data object will grow as well. So you may be tempted (I was) to merge all the banks into a big one.
I did, then I found many unexpected troubles because of this (see the .tmpdata and C startup problems for example). So I wrote a small awk script to run over the map file to spot these problems:

#!/usr/bin/awk -f

/<[[iu]data>/ {
        len=strtonum($5)
        if( len > 0 )
        {
            lastByte=strtonum($3)+len-1
            if( and(lastByte, 0xFFFFFF00) != and(strtonum($3),0xFFFFFF00))
            {
                print "Warning file " $1 " spans over multiple pages (data "
                    "size=" len ")"
            }
        }
    }

From the results I selected those modules that have large data object. I found three large data objects of 360, 600 and 501 bytes respectively. So I modified the linker script to have 3 multi-page banks – 2 banks composed by 2 pages and 1 spanning over 3.
In this way the linker is forced to put big objects in those multi-pages banks, but it will keep all the other objects within a single bank as required.
The best option you have is to start with a default linker script and then merge together adjacent banks as soon as you discover a large data object (this will be reported by an obscure linker error message pointing to a completely innocent module).
The Linker is also very uninformative about errors, you are allowed only to know that you ran out of memory. To be more precise you are allowed to know it only after a training, because the error message is very obscure, something on the lines of “section <a section name you are unaware of> cannot fit some other section”.

Assembler
Since Pic 18 are basically C-unfriendly, some assembly may be required. If you need a little bit of assembly then you can write it directly in the C source code (at a price we’ll see later). If you need some more assembly you want to the separate assembler. In this case you can take full advantage of specific assembly directives and/or macros, but then you lose integration with C language. In fact the assembler cannot fully understand C preprocessor directives, making it impossible to use the same header file for inclusion in both C and assembly.
There are two ways to workaround this, both not very satisfying. First you can write shared header files with the common subset of preprocessor directives shared both by assembly and C. Just keep in mind that rules for searching header file differs.
The other way is to write a filter (more or less complex according to the complexity of your header files) for converting C headers into assembly includes.
I went the last way because it seemed simpler, just convert C comments into assembly language comments, then I modified the filter to resolve include files. I gave up when I tried to switch from #if defined(X) to the old #ifdef X supported by assembler.
Eventually I decided to opt for very basic header files included both from assembly and integrated in more convoluted header file structure for C. I resorted to this solution only because it would take too much time to write a complete filter. If you do this just keep in mind that although comments are not compatible, you can use #if 0/#endif to bracket away parts from both the assembly and the C part.
When you mix assembly and C in the same source file you may get surprising results. As I wrote before I had defined an assert macro to execute a Sleep instruction in order to halt the debugger. My assert was something like:

#define ASSERT(X__) do { if( !(X__) ) { asm_ Sleep endasm_ } } while( false )

The effect is that this inserts the assembly fragment with the Sleep instruction everywhere you want to assert something. I was running short on program memory so I tried several configuration on debugging and optimization options and I discovered a significant difference in memory usage whether asserts where implemented with the assembly fragment or via a function call.
Apparently the optimizer has a hard time in doing its work when an assembly code block is inserted in a C function, no matter what the content of the block is (the sleep instruction has no side effects that can disturb the C code execution).
I think the assert is one of the rare case where you want assembly language not for performance reason. Therefore it is a sort of contradiction – you use assembly language fragment to improve speed, but you kill the C language optimization.
If you need assembly for performance, put it in a specific .asm source file.

Next time I’ll write about the IDE and debugging.

PIC18F Software Project Survival Guide 2

This is the second installment on a series of post regarding software programming on PIC18F cpu family. You can find the first here.
Tools
You can (and you really should for a non-trivial project if you care about your mental sanity) program the 18F using the C language.
Compiler
Basically there are two options – the first is the MCC18 compiler from Microchip and the other is the HiTech C. MCC18 is cheap and crappy, HiTech C is expensive and more optimizing (I cannot say whether is crappy or not since I never used it).
MCC18 is not fully C89 standard, on the other hand you need some extension to get your work done on this little devil. HiTech could be more ISO/ANSI compliant (I don’t know), but it is not compatible with MCC18 (is something they are planning to add in next releases, anyway I wouldn’t hold my breath). For this reason you’d better chose early which compiler you want to go with since they are not compatible. Probably you can manage to write portable code, but be prepared to write a lot of wrapper layers. Nonetheless you have to sort this out before starting coding.
Just to give you a hint about the compatibility problem I am talking about, apart from the way the two compilers provide access to the hardware registers, the HiTech uses the “const” attribute to chose the storage for variables, while MCC18 relies on non-standard storage qualifier keywords rom and (optionally) ram.

When I say that MCC18 is crappy, I have a number of arguments to support my point. Each point cost me at least a couple of hours to discover and work around, but sometimes I needed to spend days.
ISO/ANSI compliance is lacking from the preprocessor to the compiler. Not only preprocessor fails to properly expand nested macros, but it also messes up line numbering when a function-like macro is invoked on multiple lines.
For the first problem I haven’t found any workaround but to hand-code a part of the preprocessor work. For line numbers I use the backslash to foul the preprocessor in believing it is just a long line

#define A(B,C,D) /* macro definition */

A( longParameterB,
    longParameterC,
    longParameterD );

Compiler warnings are inadequate at best. For example you don’t get any message if a function that return a non-void type has no return statement. On the contrary when you compare an unsigned int to 0 (and not 0u) you get a warning. And you get warning for correct code, for example you can’t pass a T* to a const void* parameter without getting the warning, event if the two pointers have the same size and the same internal representation.
This behavior makes your life hard if your programming guidelines require maximum warning level and no warnings and doesn’t help you with real problems in your code. I use PC-Lint to spot real problems, but a run of gcc with some #define to handle non-standard constructs will spot most of them.
About warnings I had to fight back my loathing of useless casts and add them just for shutting up the compiler.
Given the poor state of the lad, I hadn’t been able to write a static assertion macro. Usually you write such a macro by turning a boolean condition in a compile-time expression that can be either valid or invalid (e.g. declaring an array with -1 or 0 elements, declaring an enum and assigning the first value to 1/0 or 1/1…). I haven’t found any way to get the compiler refuse any of these constructs.
One of the worst part of the toolchain is that it can produce code that breaks some hardware limitation without a warning. For example the compiler relies on a global temporary area for computing numerical expressions (array access is a case). The code generated expects that the temporary area is entirely contained in a data memory bank. The compiler nor the linker are able to detect when this area falls across a data memory bank boundary and alert the programmer. This is nasty because you can get subtle problems or having a failing program just after a recompilation.
Similarly the C startup code relies on a similar constraint for a group of variables, should they not fit in the same data memory bank, the initialization silently fails.
It took me few minutes to rewrite the startup code initialization routine and can’t see any noticeable slowdown.
I would advise to:

  • rewrite the C startup code, keep in mind the limitations of the compiler (breaks on objects laid across page boundary, breaks on accessing struct larger than 127 bytes, break on accessing automatic variables if the space taken is some tens of bytes);
  • use another tool (gcc/Pc-lint) to parse the source code and get meaningful warnings (missing returns, == instead of =, unused variables, uninitialized variables and so on);
  • enforce data structure invariants and consistency by use of assertion;
  • if you find a way to implement static assertion, let me know.

Next time I’ll write about linker and assembler.

PIC18F Software Project Survival Guide

Now that I’m getting nearly through I feel confident about posting this series of posts about my work experience on PIC18F. Although my writing could seem a bit intimidating or cathedratical I would like to receive your feedback and your thoughts on the matter. I got through, but I don’t like to have universal solutions 🙂
So, at last you failed in defending your position. No use in all the documentation you provided, the articles from internet and blog posts where with no doubt, PIC was clearly depicted as the wrong choice.
But either your boss or your customer (at this point it makes not much difference) imposed a PIC18F on your project. And she also gave a reason you can hardly argue about – Microchip never sends a CPU the way of the dodo… so, twenty years from now we could continue to manufacture the same device with the same hardware avoiding the need for engineering maintenance.
Given that this device will be sold in billion units, that makes a lot of sense.
Their problem is solved, but yours are just looming at a horizon crowded with dark clouds.
Good news first, you can do it – PIC 18Fs (after some twiddling) have enough CPU power to fit most of the applications you can throw at them. I just completed a device which acts as a central hub of a real time network and provides information via a 128×64 pixels display.
Bad news – it won’t be easy, for anything more convoluted than a remote gate opener, due at most by yesterday (as most of the project requires nowadays) your life is going to be a little hell. I’ll try to describe if not the safest path in this hell, at least the one where you cannot get hurt too badly.
So, let’s start by architecture.

Architecture
PIC18 architecture is described almost everywhere (checked on the back of your cereal box, recently?), but the first place you are going to look, the datasheet, will be mostly helpless. So I will try not to repeat anything and I will not go much into details, but I will try to give you a picture with the objective of showing the capabilities and the drawbacks of these gizmos.
First these are 8 bits CPUs rooted in the RISC field – simple instruction, simple task, low code density.
The memory follows the so called Harvard architecture – two distinct memories for data and for program instructions. Data memory is called Register File, while program memory is called… Program Memory. Data memory is a RAM, while program memory is a flash.
Program memory is linear (no banks, no pages), each word is 16 bits wide, but the memory can be accessed for reading (or writing) data one byte at time. Current PIC18s have program memory sizes up to 128k, but nothing in their design prevents them to address up to 16Mbytes (2^24).
You can erase and write the program memory from the PIC program itself (this is called self-programming), but there are some constraints – first memory is organized in pages, 1024 bytes each. In order to write the program memory you have first to erase it and this can be done only one page at time. Once the page has been erased you may write it altogether or just one byte at time.
The worst part is that when the program memory is erased or written the program memory bus is used and therefore the execution is stalled. This stall can last for several milliseconds.
Data memory can be accessed either linearly or through banks of 256 bytes each depending on the assembly instruction you use. Data memory for PIC18s is up to 4k, again there’s nothing in the design that prevents the CPU to address up to 64k or RAM. In the data memory there is a special section (Special Function Registers) where hardware registers can be accessed.
PIC18 architecture becomes quite funny on SFR since you can find the usual timer, interrupts and peripherals control registers along with CPU registers such as the status flags, the W register (a sort of accumulator). Further there are registers that basically implements specific addressing. For example PIC18 has no instruction for indirect addressing (i.e. read from a location pointed to by a register); if you want to indirect access a location you have to load the location into a SFR (say FSR0 for example) and then read from another SFR (e.g. INDF0). If you want a post-increment you read from POSTINC0.
That may sound elegant, but it is a nightmare for a C compiler, basically any function that accepts a pointer could thrash part of the CPU state, since most of the CPU state is memory mapped!
That’s also the reason why, conservatively, the C compiler pushes about 60 bytes of context into the stack on entering a generic interrupt handler.
There is a third memory in every PIC18F – the hardware return stack. This is a LIFO memory with 31 entries. Each entry is the return address stored every time that a CALL (or RCALL) assembly instruction is executed.
Still on the CPU side, PIC18F features two level of interrupts – the high priority level interrupt and the low priority interrupt, you can assign every interrupt on the MCU to one or the other of the levels.
Talking about the peripherals you will find almost everything – from low pin count device to 100 pins MCUs with parallel port interface, external memory and ethernet controller. Even in a 28 pins DIL package you found a number of digital I/Os, comparators, DA and AD converters, PWMs. Every pin is multiplexed on two or three different functions. I2C and SPI are available on each chip, while USB port is available only on a couple of sub-families.

Next time, I’ll talk about tools

Commit logs

Geek’s humor comes in several flavors, “[http://whatthecommit.com/|what the commit]” is a tasty one. I wonder how funny it is looking at my commit log… say yesterday:


 9:42 - 127 bytes saved
 11:07 - 140 bytes saved
 11:20 - 882 bytes saved (got rid of Xyz state machine)
 11:25 - 22 bytes saved (got rid of Wuz - unused code, just program memory waster)
 11:41 - 431 bytes saved (got rid entirely of Xyz)
 14:00 - prevented race condition on frame receive.
 14:13 - ISO/ANSI insulting warning removed with the following comment
     // The cast below is not needed in ANSI C and should be considered a
     // bad coding practice. BUT we are stuck with MCC18 compiler that is
     // NOT ANSI compliant.
 14:46 - 18 bytes saved by splitting fn() into newFn() and retryFn()
 15:55 - 54 bytes saved by removing a couple of arguments from a function.
 16:20 - 54 bytes saved by replacing function fu().
 

With less than 10k free in the program memory, yesterday it was again Spring Cleaning Day… but long are gone the days of easy gains, now we are sorting of scraping the bottom of the barrel. Here’s some math for you.. if a contractor is paid 400€/day and the difference between a @!* PIC18 and any other micro is less than half a dollar, how many units of your device you need to sell to justify the contractor work?

PIC 18F and its stinking stack

During the past week PIC 18f continued to be my main headaches supplier. I think I already wrote enough about the messiness of the Harvard architecture combined with “modern” languages such as C. Well, since two separate address spaces didn’t seem enough, the chip designers opted to add a third for call return stack. The place where the CPU stores the return address before jumping to a subroutine is into a distinct address space… that’s not so addressable – since you can read and write only the topmost location.
Being our model the top of the range, it features 31 levels in the stack. That means that you can sub-call no more than 31 times without breaking the stack (or, you can’t compute factorials greater than 31! :-)).
Well 31 is a reasonable amount, or at least I thought so. Then we ran out of Program Memory, so we enabled the compiler space optimization. I must admit that these optimizations are quite effective being able to trim down the size of your binary of some 30%. Most of this percentage is achieved by the so-called “procedural abstraction” optimization. This procedure factorize common sequence of assembly into sub-routine. By repeatedly applying the technique, more and more blocks get extracted and replaced by a CALL instruction.
The side effect is that this technique has a dramatic impact on the stack – it may doubles the number of nested calls.
In our case the stack exploded when a deeply nested graphic routine was interrupted by the low-level interrupt that in turn was interrupted by the high level interrupt.
I had the terrible “game-over” feeling – I had to use the optimization to fit in the memory, but I had no control over the nesting of the calls. In fact every action I could take to soften the nesting would be undone automatically by the optimizer.
I don’t like to give up, so I fired up acrobat reader on the datasheet trying to dig a way out of troubles.
PIC 18f datasheet about call return stack isn’t terribly clear, so I had to read it several times. Eventually I concluded that stack overflow and underflow have two kinds of handling selectable once for all by configuration bits (sort of fuses you set when programming the device). In the first way when the execution attempts to push the 31st entry in the stack, just flips a flag. Next pushes just overwrite the entry. I thought hard about this, but I found no use whatsoever.
Next mode provides a sort of interrupt. On the 31st push the execution jumps to 0x000000 (the reset vector) with the overflow condition flagged.
Woha, I realized, I could trap this address and check for the overflow before handing the execution flow to the standard firmware. If I am here for an overflow, I could dump the hardware stack (one entry at time, it is possible) into somewhere in RAM and continue the execution where…
Exactly, where?
The CPU discards the target address of the call, it just pushes the return address into the stack. Well I could peek around the return address to see whether a CALL either relative or absolute has been issued and recover the target address.
I really enjoyed the idea, but the more I though of it, the more I found critical cases. What if the jump was caused by a call to a pointer? The PIC code for such an addressing is all but straightforward. Maybe I could detect the call pattern and perform some recovery.
Eventually I found the real killer of my idea in interrupts. In fact the call return stack is shared for normal execution and interrupts. The code for recovering the target address would have to be rather convoluted – check for a CALL/RCALL instruction, check for a CALL to a pointer, check for interrupt flags (there are tons of them) and decided whether a low or high priority interrupt has triggered…
Dead end. I wondered what was the idea of the chip designer when implementing the stack overflow mechanism. I found it useless, done this way, and it could have worked fine with some additional circuitry – it needed just to save the target address and it would have been great.
Back to paper and headache.
I did some search on internet and found another compiler (a bit too experimental to be employed in an industrial project) that save the return address from the hardware stack into the software stack. Doing this, the hardware stack never grew over 3 entries. Good! But the MCC18 compiler is really basic, it doesn’t allow this technique, nor it allows the programmer to specify user defined prologue/epilogue for functions.
At this point occurred to me that the stack explosion had always been during interrupt time. So, I thought, I may save the stack when I enter the low level interrupt and the stack level is over a certain threshold, and then restore everything before leaving the interrupt.
The implementation was quite straightforward (almost flawless, if you pardon my ubiquitous off-by-one bug). The only problem is that sometimes the low level interrupt incurs in a bad penalty of copying some 100 bytes from the stack into RAM in order to avoid the stack explosion.
But it makes a good war story.

messed up startup

  /* we'll make the assumption in the following code that these statics
   * will be allocated into the same bank.
   */

When I first saw the this comment in the MCC18 C startup code some time ago, it sent a chill down my spine. But promptly I recovered, after all, this is one of the most widely used MCU on the planet and this is the preferred development environment. If they did it this way, they would have good reasons.

Then I wasted a whole afternoon tracking a problem in initialization, only to discover that the assumption had been silently broken by the linker and the code ceased to properly work, wreaking havoc in initialization of the static data.
It is beyond my understanding how could you, while professionally doing your job, base a core component (the startup code) on such assumption without either being sure that the assumption is always correct or there is an automatic detection that reports when the assumption no longer holds true.
Objectively it would have been fairer to provide a less efficient version safely and always working leaving to the eager programmer the task of crafting a high performance version if needed, rather than vice-versa.
Valentino, in comments to a previous post, asked me if I have data about difference in pricing among PIC and other MCUs. Well, not handy, but the difference is really small (my guess is 0.5€ to 1.5€) and it doesn’t really make sense to undergo all these troubles unless your volumes are filthy high.