Sunday, August 8, 2010

blinking leds

As usual I am not satisfied with something like this:


#include <platform.h>
out port bled = PORT_BUTTONLED ;
int main () {
bled <: 0b0011 ;
while (1)
;
return 0;
}


Particularly when looking at the documentation for the part (XMOS XS1): ports are not managed in the traditional way through registers; instead there are instructions specifically for setting up clocks, ports and such things. It was also interesting that the documentation did not make things obvious; nowhere does it say, for basic gpio push-pull stuff just do this.

I started to dig through the assembler generated by that simple turn-the-led-on program above. I have an XMOS XC-1A board, and the above program does in fact turn the led, or maybe two of them, on. Initially I failed to understand parts of the machine/assembler code produced, so I posted a question to Stack Overflow. A while later someone looked at the problem, I also looked at it with fresh eyes, and it started to come together.

One very nice thing about this chip product is that the development tools, for which you do have to give them your email address (register), are complete and include what so far is a very nice simulator. Having just moved from a chip design project (to silicon, and now board development/support), I was happy to see .vcd output, waveforms you look at with a tool like gtkwave. I much prefer to debug software by looking inside the chip in this manner. So you do not need to run out and buy any hardware in order to start using this product. Just register and download the tools.

Well, to understand the example here you will also need the schematic. If the tools don't include it, you will need the xs1_en.pdf file for the chip as well, and maybe another document or two from xmos.

First embedded project on any new chip, blink the led, right? The XC-1A has a number of leds on it. The simpler ones are right next to the four pushbuttons. In the schematic they are called KEYLED A through KEYLED D. One side is ground, the other is tied to X0D14, X0D15, X0D20 and X0D21. The chip on this board has four xcores inside, so pin numbers with X0 are from xcore 0, X1 from xcore 1, etc. So these are on xcore 0 and that is good; so far I don't know how to switch to the other cores without using C (well, XC, their version of C with extensions).

Next we need the XS1-G4 512BGA datasheet. In table 2.2 we look up those pins: XnD14 is on port/bit P4C0, XnD15 on P4C1, and we find the other two on P4C2 and P4C3. That means the four bits of port 4C (when viewed as a 4 bit port) are the four button leds on this board. Why 4 bit ports and not 8 bit or 16 or 32? Well, because when you disassemble the output of the above .xc code their compiler used P4C, as we will see.
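
To make that table lookup concrete, here is a small host-side sketch in Python (not something that runs on the xcore). It takes X0D15 as bit 1 of port 4C; the dictionary and function names are made up for illustration. It shows which pins the 0b0011 value from the XC program at the top would drive high.

```python
# Pin-to-bit mapping for port 4C as read from the datasheet table
# (X0D15 is taken to be bit 1). Names here are illustrative only.
P4C_BITS = {"X0D14": 0, "X0D15": 1, "X0D20": 2, "X0D21": 3}


def pins_driven_high(value):
    """Return the pins that a 4-bit value written to port 4C drives high."""
    return [pin for pin, bit in P4C_BITS.items() if value & (1 << bit)]


# The XC program writes 0b0011, so two pins (and two leds) go high.
print(pins_driven_high(0b0011))
```

That would explain why the XC program seems to turn on two leds rather than one.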

Some documentation or examples related to this board indicated to use PORT_BUTTONLED as the name of the port to get at those leds. If you look in the xmos tools configs/XC-1A.xn file we see that

<Port Location="XS1_PORT_4C" Name="PORT_BUTTONLED"/>

And that is consistent with what we find using the datasheet for the part and the pin names in the schematic.

The next thing we need to find is in the target/include directory file xs1_g4000b-512.h. We take the XS1_PORT_4C name and look that up and get

#define XS1_PORT_4C 0x40200

That 0x40200 is the address of the port if you will.

The xmos compiler tools are obviously directly modified gnu tools: binutils, gcc, etc. So they are very familiar in that sense; just put an x in front of most things, gas becomes xas, objdump becomes xobjdump, etc. If you use xobjdump -D to disassemble the above project and try to repeat what I have done, there are some basics of the instruction set you need to understand. Like ARM and other relatively modern instruction sets, you are not able to load any immediate (constant) you want into a register in a single instruction. The ARM allows for about 8 significant bits; the xcore seems to allow about 16 bits. So, similar to say the x86 family, the xcore can use a data pointer: basically a register used as a base address into memory from which you can do register offset type addressing. Now I am still having problems using this dp register. Sometimes the code works, then I add a line or two and it breaks. To set up this data area you would do something like this:

ldap r11,constants
set dp,r11
...

constants:
.word 0x00040200
.word 0x00080300


So with that setup you can fetch numbers with more than 16 bits in them, for example:

ldw r3,dp[0x0]


I had, and am still having, problems with that, so I am going to avoid the approach the XC compiler took and do my own thing.

ldc r3,0x4020
shl r3,r3,4
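
A quick way to check that arithmetic on the host is the Python sketch below. It assumes a single ldc can carry at most a 16 bit constant (that limit is my assumption, not something taken from the architecture document), and the function names are made up.

```python
# Assumption: the widest constant a single ldc can carry is 16 bits.
LDC_IMMEDIATE_BITS = 16


def fits_in_ldc(value):
    """True if the constant fits the assumed ldc immediate width."""
    return 0 <= value < (1 << LDC_IMMEDIATE_BITS)


def ldc_then_shl(immediate, shift):
    """Model 'ldc rX,immediate' then 'shl rX,rX,shift' on a 32-bit register."""
    if not fits_in_ldc(immediate):
        raise ValueError("immediate too wide for a single ldc")
    return (immediate << shift) & 0xFFFFFFFF


# 0x40200 does not fit directly, but 0x4020 shifted left by 4 gets there.
print(fits_in_ldc(0x40200))
print(hex(ldc_then_shl(0x4020, 4)))
```

The cost is one extra instruction, in exchange for not needing the dp register or a constants table at all.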


Now another thing that I don't quite understand, and from looking today I didn't see a real answer on the xmos support/forum website (I will join there in a bit and post more questions), is why you need this next sequence and what clock or resource we are messing with.


ldc r3, 0x6
setc res[r3], 0x8
setc res[r3], 0xF


You just have to run that sequence, otherwise your IO ports won't work and sometimes the chip (and sim) hang.

The 0x8 and 0xF we can look up under the setci instruction. (Another thing you notice very quickly from the disassembly is that the instruction names as documented in the XS1 Architecture document do not match the assembler directly. The assembler knows that a setc with an immediate is really a setci, and a setc with a register as the second operand is the setc in the docs.) 0x8 is CTRL_INUSE_ON and 0xF is CTRL_RUN_STARTR. Whatever clock number 0x6 is, we want to use it and start it. We get the indication it is some sort of clock because the compiler output assigns this clock to the port using the setclk instruction. For this simple program that setclk is not necessary to make it work.
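
One possible reading of these numbers, and it is an assumption on my part rather than something confirmed above, is that a resource identifier packs a resource type into its low byte, a resource number into the next byte, and (for ports) the width above that, with type 0 meaning a port and type 6 a clock block. The Python sketch below decodes the three values seen so far under that assumed layout; the table and function names are hypothetical.

```python
# Assumed type codes: only the two needed here, and both are assumptions.
RESOURCE_TYPES = {0: "port", 6: "clock block"}


def decode_resource(res_id):
    """Split a resource id into (type name, number, width) per the assumed layout."""
    res_type = res_id & 0xFF        # assumed: low byte is the resource type
    number = (res_id >> 8) & 0xFF   # assumed: next byte is the resource number
    width = res_id >> 16            # assumed: remaining bits are the port width
    return RESOURCE_TYPES.get(res_type, "unknown"), number, width


for res_id in (0x6, 0x40200, 0x80300):
    print(hex(res_id), decode_resource(res_id))
```

If that layout holds, 0x40200 reads as 4 bit port number 2 (A, B, C), which matches XS1_PORT_4C, and 0x6 would be clock block number 0.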

I think we are ready to blink the led.


.globl _start
_start:
ldc r0,4
ldc r2,8
ldc r3,16
ldc r1, 100
notmain:
sub r1,r1,1
bt r1, notmain

ldc r3, 0x6
setc res[r3], 0x8
setc res[r3], 0xf

ldc r3,0x4020
shl r3,r3,4

setc res[r3],0x8

top:
ldc r0, 0x8
out res[r3], r0
bl delay

ldc r0, 0x4
out res[r3], r0
bl delay

ldc r0, 0x2
out res[r3], r0
bl delay

ldc r0, 0x1
out res[r3], r0
bl delay

ldc r0, 0x2
out res[r3], r0
bl delay

ldc r0, 0x4
out res[r3], r0
bl delay

bu top

delay:
retsp 0x0
ldc r1, 0x8000
shl r1,r1,8
db:
sub r1,r1,1
bt r1, db
retsp 0x0




As with other gcc/binutils programs we need a _start entry point. We don't really need a main, and the tools might warn about it, but it is fine. The tools were complaining about a few registers not being initialized, so the first few lines of code are for that. The little delay loop after them is just a simulation thing to allow the chip to spin up. I am guessing it is not needed for this chip, but who knows; it also gives us a feel for the simulator.

Then we see the setcs for clock number 0x6.

And now we start talking to our port 4C. We prep (and consume) a register for holding the port's address, 0x40200, and we use setc to put that port in use. I don't seem to need to do more to get that port ready for bit banging.

The docs showed us that the four bits of that (four bit) port are tied to the led outputs, so we only need to wiggle those four bits. You can see the main loop here turns the four bits on one at a time. Initially the delay function returns immediately so that you can simulate this. To run on real hardware, take out that first retsp 0x0 right after the delay label so that your eyes can see the lights blink.
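
As a sanity check on the loop, this host-side Python sketch lists which port bit each out in the main loop turns on, and how many times the db loop spins once the early retsp is removed. The values are copied from the listing; the function names are made up for illustration.

```python
# Values the main loop writes with 'out res[r3], r0', in order.
PATTERN = [0x8, 0x4, 0x2, 0x1, 0x2, 0x4]


def lit_bit(value):
    """Index of the single bit set in a one-hot port value."""
    return value.bit_length() - 1


def delay_iterations():
    """'ldc r1,0x8000' then 'shl r1,r1,8' gives the count the db loop runs down."""
    return 0x8000 << 8


# The lit bit walks from bit 3 down to bit 0 and back up again.
print([lit_bit(v) for v in PATTERN])
# Number of trips around the db loop per call to delay.
print(delay_iterations())
```

So the light bounces back and forth across the four leds, with about 8.4 million passes through the db loop between steps.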

So assuming the above assembler was named m.s:

xas m.s -o m.o
xcc m.o -nostartfiles -target=XC-1A -o m.xe

With gcc/binutils tools I generally prefer to use the linker directly instead of indirectly through gcc. But in this case I am using xcc as a straight path to the .xe file output. Perhaps I had problems figuring out their tools.

The simulation syntax wasn't initially obvious to me but I figured it out. I use gtkwave to look at .vcd files, and at least for the .vcd format I was using there is a problem with the output of this simulation. This is the way I like to run it:


xsim --max-cycles 2000 --vcd-tracing "-o m.vcd -ports -cycles -threads -timers -instructions -functions -pads" m.xe


The problem gtkwave has is the -pads option adds the edge of the chip pins/pads to the output, something we do care about, but in the .vcd file it does this:

$var wire 1 paa3 0:X0D52 $end


The fix is to edit the .vcd file and rewrite each of the paa definitions without the zero colon:


before:
$var wire 1 paa3 0:X0D52 $end
after:
$var wire 1 paa3 X0D52 $end


Now gtkwave is happy with the file. I ended up writing a little utility to create a fixed .vcd file so I didn't have to keep doing a search and replace.
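
That utility is not shown here, but a minimal Python sketch of the same idea might look like the following. It assumes every offending line has the shape shown above (a $var line whose signal name starts with digits and a colon); the function names and the m_fixed.vcd output name are hypothetical.

```python
import re

# A $var line whose signal name carries a "<digits>:" prefix, e.g.
#   $var wire 1 paa3 0:X0D52 $end
VAR_WITH_PREFIX = re.compile(r"^(\$var\s+\S+\s+\d+\s+\S+\s+)\d+:(\S+)")


def fix_line(line):
    """Drop the '<digits>:' prefix from the signal name; other lines pass through."""
    return VAR_WITH_PREFIX.sub(r"\1\2", line)


def fix_vcd(src_path, dst_path):
    """Copy a .vcd file line by line, fixing the $var definitions on the way."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(fix_line(line))


print(fix_line("$var wire 1 paa3 0:X0D52 $end"))
```

Calling fix_vcd("m.vcd", "m_fixed.vcd") would then write a copy that gtkwave should accept, leaving the original file alone.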

Unfortunately I won't tell you here how to use gtkwave in detail, but some pointers: run gtkwave m.vcd to open the file. Lower leftish are all the signals; click on one and press CTRL-A to highlight all of them. Then add all of those to the output with the add button below that selection. Then go up to the magnifying glass icon that zooms out to fit the whole sim in the window. Then slide up and down looking for the wiggly paa output lines.

If you have an XC-1A board then remove the first retsp after delay, re-compile, and use xrun m.xe to load and run the program and see the blinking lights. Now on linux (ubuntu) you first need to

sudo mount --bind /dev/bus /proc/bus

in order for the tool to find the boards out there. Then unmount it when you are finished so the operating system operates normally.