
Practical Embedded System Issues

Edwin Olson, MIT CSAIL, eolson@mit.edu. May 8, 2008.



  1. Practical Embedded System Issues Edwin Olson MIT CSAIL eolson@mit.edu May 8, 2008

  2. Today’s Goal • Give you a feeling of how an embedded system works • How is embedded development different from general-purpose development? • Show you some implementation nuts and bolts • Help you avoid common blunders when implementing embedded systems in the future

  3. Agenda • What is an embedded system? • Introduction to Embedded Development • Implementing a pre-emptive kernel • Processor/Memory model • Context switching and scheduling • Implementing a device driver • Many ways to do it… most wrong! • A simple embedded system • DARPA Urban Challenge ADU

  4. Embedded Systems Development • What is an embedded system? • A system designed for a specific (fixed) function • Hardware and software design are often coupled • Resources (memory, CPU) often limited • Real-time issues more common • Ludicrous cost sensitivity • Smaller teams: • Developers are a lot “closer” to the hardware

  5. Embedded System Development • Examples: • Mars rovers • Industrial automation • Automotive ECU • OrcBoard • ADU

  6. ARM (Advanced RISC Machine) • ARM is big in the embedded world • Low power • Range of products • Applications • iPhone • Game Boy Advance • Automotive • 90 processors shipped per second

  7. Microcontrollers/SoC • System-on-chip • CPU + RAM + FLASH + I/O devices integrated into one package • Advantages • Lower system cost • Faster development • Greater reliability • Physically smaller • Disadvantages • Less customizable • Less expandability • Example (Luminary ARMv7-M part): CPU 50 MHz, 32-bit; RAM 64 KB; FLASH 256 KB; peripherals: ADC, PWM, QEI, UART, I2C, SPI, CAN, EMAC

  8. ARM v7m Memory Map • Single 32 bit address space • Code, Data, IO Space • No MMU (Memory Management Unit) • No virtual memory • All processes exist in same space • Some memory protection (woo!) • Guard pages • No-execute regions • Memory map: 0x00000000 FLASH (vectors, code, data init); 0x20000000 RAM (data, bss, kernel stack, task 0 stack, task 1 stack, …Heap…); 0x40000000 IO Space (uart, ethernet, timer, …)
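
To make the single address space concrete, here is a minimal sketch in C. The three base addresses come from the memory map on the slide; the helper names `region_of` and `mmio_write32` are hypothetical, and treating each region as running up to the next base is a simplification made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Region bases from the slide's memory map. Treating each region as
 * extending up to the next base is an assumption for illustration. */
#define FLASH_BASE 0x00000000u
#define RAM_BASE   0x20000000u
#define IO_BASE    0x40000000u

typedef enum { REGION_FLASH, REGION_RAM, REGION_IO } region_t;

/* Hypothetical helper: classify an address in the single 32-bit space. */
region_t region_of(uint32_t addr)
{
    if (addr >= IO_BASE)
        return REGION_IO;
    if (addr >= RAM_BASE)
        return REGION_RAM;
    return REGION_FLASH;
}

/* With no separate I/O instructions, a peripheral register is written
 * with an ordinary store through a volatile pointer. */
void mmio_write32(volatile uint32_t *reg, uint32_t value)
{
    assert(reg != 0);
    *reg = value;
}
```

Because code, data, and peripherals share one address space, a stray pointer can reach any of them, which is why even limited memory protection is welcome.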

  9. Power On • On a general purpose system (Linux/Solaris) many things happen for you automatically • In the embedded world, you’re responsible! • What happens when we turn the system on? • How does our code start?

  10. “Booting” • Vectors at 0x00000000: reset address, IRQ handler addresses • Memory map: FLASH (vectors, code, data init); RAM (data, bss, …Heap…); IO Space (uart, ethernet, timer, …)

      int myGlobalVariable = 50;
      int myOtherGlobal;

      void crt() {
          // initialize hardware, e.g., PLL

          // copy .data section
          for (uint32_t *src = _etext, *dst = _data; dst != _edata; src++, dst++)
              (*dst) = (*src);

          // zero .bss section
          for (uint32_t *p = __bss_start__; p != __bss_end__; p++)
              (*p) = 0;

          main();
      }

      void main() { … }

  11. Global Variables • Memory map: FLASH (vectors, code, data init); RAM (data, bss, …Heap…); IO Space (uart, ethernet, timer, …)

      int myGlobalVariable = 50;
      int myOtherGlobal;

      void crt() {
          // initialize hardware, e.g., PLL

          // copy .data section
          for (uint32_t *src = _etext, *dst = _data; dst != _edata; src++, dst++)
              (*dst) = (*src);

          // zero .bss section
          for (uint32_t *p = __bss_start__; p != __bss_end__; p++)
              (*p) = 0;

          main();
      }

      void main() { … }

  12. Global Variables • Memory map: FLASH (vectors, code, data init); RAM (data, copied from data init; bss, zeroed; …Heap…); IO Space (uart, ethernet, timer, …)

      int myGlobalVariable = 50;
      int myOtherGlobal;

      void crt() {
          // initialize hardware, e.g., PLL

          // copy .data section
          for (uint32_t *src = _etext, *dst = _data; dst != _edata; src++, dst++)
              (*dst) = (*src);

          // zero .bss section
          for (uint32_t *p = __bss_start__; p != __bss_end__; p++)
              (*p) = 0;

          main();
      }

      void main() { … }
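
The two loops in `crt()` can be tried on a desktop machine by passing the section boundaries as parameters instead of linker symbols. This is a sketch under that assumption: the function names `copy_data_section` and `zero_bss_section` are hypothetical, and on the target the arguments would be the linker-provided `_etext`, `_data`, `_edata`, `__bss_start__`, and `__bss_end__` from the slide.

```c
#include <assert.h>
#include <stdint.h>

/* Copy the initial values of .data from their load image (FLASH on the
 * target) to their run address in RAM, one 32-bit word at a time. */
void copy_data_section(const uint32_t *load_image, uint32_t *data_start,
                       uint32_t *data_end)
{
    assert(data_start <= data_end);
    const uint32_t *src = load_image;
    for (uint32_t *dst = data_start; dst != data_end; src++, dst++)
        *dst = *src;
}

/* Zero .bss so that globals without an initializer start at 0. */
void zero_bss_section(uint32_t *bss_start, uint32_t *bss_end)
{
    assert(bss_start <= bss_end);
    for (uint32_t *p = bss_start; p != bss_end; p++)
        *p = 0;
}
```

In the slide's example, `myGlobalVariable` would live in .data (its value 50 is stored in FLASH and copied), while `myOtherGlobal` would live in .bss (only zeroed, so it costs no FLASH space).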

  13. Using a real-time kernel • Often have several things to do at once: • Operate peripherals • Command interface • Watchdogs • Cooperative versus pre-emptive multitasking

  14. Initializing Application • Memory map: FLASH (vectors, code, data init); RAM (data, bss, kernel stack allocated via malloc, …Heap…); IO Space (uart, ethernet, timer, …)

      void main() {
          nkern_init();
          nkern_task_create(task0, PRIORITY_HIGH, 1024);
          nkern_task_create(task1, PRIORITY_LOW, 1024);
          nkern_bootstrap(); // never returns
      }

      void task0() { … }
      void task1() { … }

  15. Initializing Application • Memory map: FLASH (vectors, code, data init); RAM (data, bss, kernel stack, task 0 stack allocated via malloc, …Heap…); IO Space (uart, ethernet, timer, …)

      void main() {
          nkern_init();
          nkern_task_create(task0, PRIORITY_HIGH, 1024);
          nkern_task_create(task1, PRIORITY_LOW, 1024);
          nkern_bootstrap(); // never returns
      }

      void task0() { … }
      void task1() { … }

  16. Initializing Application • Memory map: FLASH (vectors, code, data init); RAM (data, bss, kernel stack, task 0 stack, task 1 stack allocated via malloc, …Heap…); IO Space (uart, ethernet, timer, …)

      void main() {
          nkern_init();
          nkern_task_create(task0, PRIORITY_HIGH, 1024);
          nkern_task_create(task1, PRIORITY_LOW, 1024);
          nkern_bootstrap(); // never returns
      }

      void task0() { … }
      void task1() { … }

  17. Context Switching • At any point in time, either: • A user task is running: nkern_running_task • An IRQ or kernel is running • How do we switch from one task to another? • What state does a task have? • CPU register state • Task stack state • Virtual memory mapping state (no MMU)

  18. ARM v7m Processor State

  19. Context Switching • Our strategy: • We’ll store registers on task’s stack • Just need to remember each task’s stack pointer • Each task represented by an nkern_task_t

      typedef struct {
          uint32_t sp;                 // stack pointer for task
          uint32_t priority;           // task priority
          nkern_wait_list_t waitlist;  // discuss these later…
          uint64_t wait_utime;
      } nkern_task_t;

      nkern_task_t *nkern_running_task; // pointer to currently-running task
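
If every task's registers live on its own stack, a newly created task needs a stack that already looks as if it had been interrupted. The sketch below shows one way to build such an initial frame. It is not taken from the nkern source: `task_stack_init` and its parameters are hypothetical, the stack is modeled as an array of 32-bit words, and an index is returned instead of a real address so the code also runs on a 64-bit host. The frame layout (eight software-saved words r4-r11 below the eight words the ARMv7-M hardware stacks on exception entry) and the Thumb bit in xPSR are architectural facts.

```c
#include <assert.h>
#include <stdint.h>

#define INITIAL_XPSR 0x01000000u /* Thumb bit set, required on ARMv7-M */
#define FRAME_WORDS  16u         /* r4-r11 plus 8 hardware-saved words */

/* Build an initial frame at the top of a task's stack and return the
 * index of the word the saved stack pointer should refer to.
 * Layout from low to high address:
 *   r4-r11 (software-saved), r0-r3, r12, LR, PC, xPSR (hardware-saved) */
uint32_t task_stack_init(uint32_t *stack, uint32_t stack_words,
                         uint32_t entry_pc, uint32_t exit_lr)
{
    assert(stack_words >= FRAME_WORDS);
    uint32_t sp = stack_words - FRAME_WORDS;

    for (uint32_t i = 0; i < FRAME_WORDS; i++)
        stack[sp + i] = 0;           /* general registers start at 0 */

    stack[sp + 13] = exit_lr;        /* LR: where the task "returns" to */
    stack[sp + 14] = entry_pc;       /* PC: first instruction of the task */
    stack[sp + 15] = INITIAL_XPSR;   /* xPSR */
    return sp;
}
```

With a frame like this in place, the first switch to the task can use exactly the same restore path as every later switch.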

  20. Context switching cartoon • SysTick timer generates periodic interrupts • Configured by writing to I/O memory space • Figure: timeline of IRQs, with the IRQ handler running between time slices of Task 0 and Task 1 • How often should SysTick timer generate interrupts?
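
One constraint on the tick rate is the timer itself: SysTick is a 24-bit down-counter, and a reload value of N gives a period of N + 1 clock cycles. The sketch below computes the reload value for a requested rate. The function name `systick_reload` is hypothetical, and the 1 kHz rate used in the usage note is an assumption, not something the slides specify.

```c
#include <assert.h>
#include <stdint.h>

#define SYSTICK_MAX_RELOAD 0x00FFFFFFu /* 24-bit reload register */

/* Compute the SysTick reload value for a tick rate of tick_hz on a CPU
 * clocked at cpu_hz. Returns 0 on success, -1 if the rate cannot be
 * produced by a 24-bit counter running at the CPU clock. */
int systick_reload(uint32_t cpu_hz, uint32_t tick_hz, uint32_t *reload)
{
    assert(reload != 0);
    if (tick_hz == 0 || tick_hz > cpu_hz)
        return -1;

    uint32_t cycles = cpu_hz / tick_hz;   /* cycles per tick, >= 1 */
    if (cycles - 1u > SYSTICK_MAX_RELOAD)
        return -1;                        /* period too long for 24 bits */

    *reload = cycles - 1u;                /* counter counts N, N-1, ..., 0 */
    return 0;
}
```

At the 50 MHz clock mentioned earlier, a 1 kHz tick needs a reload of 49999, while a 1 Hz tick would need about 50 million cycles and does not fit. Faster ticks lower scheduling latency but spend more CPU time in the handler.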

  21. Before the IRQ • Suppose task0 is executing • …Doing something useful… • …Using its stack and registers… • Stack pointer (SP) marks the boundary between used and unused stack (on ARM it points at the most recently pushed word) • RAM diagram: Task 0 Stack (used stack: local variables, function calls; SP; unused stack), Task 1 Stack

  22. An IRQ Occurs… • Hardware pushes some of task0’s registers onto the stack: • PC, xPSR • r0, r1, r2, r3, r12, r14 (LR) • Hardware then invokes the IRQ handler… • RAM diagram: Task 0 Stack (used stack: local variables, function calls; hardware-saved registers; SP; unused stack), Task 1 Stack

  23. A Useless IRQ Handler • RAM diagram: Task 0 Stack (used stack: local variables, function calls; hardware-saved registers; software-saved registers; SP; unused stack), Task 1 Stack • Why does the hardware save only some of the registers?

      systick_irq_handler:
          stmdb sp!, {r4-r11}            // push remaining registers
          ldr   r0, =nkern_running_task
          ldr   r0, [r0]
          str   sp, [r0]                 // save SP in nkern_task_t
                                         // task state is now saved!
          …
          ldr   r0, =nkern_running_task  // load SP from nkern_task_t
          ldr   r0, [r0]
          ldr   sp, [r0]
          ldmia sp!, {r4-r11}            // pop registers
                                         // task is now restored!
          rti                            // return from interrupt
                                         // hardware will take over

  24. SysTick IRQ Handler • RAM diagram: Task 0 Stack and Task 1 Stack each hold used stack (local variables, function calls), hardware-saved registers, software-saved registers, then unused stack; SP moves from Task 0 Stack to the stack of the newly chosen task

      systick_irq_handler:
          stmdb sp!, {r4-r11}            // push remaining registers
          ldr   r0, =nkern_running_task
          ldr   r0, [r0]
          str   sp, [r0]                 // save SP in nkern_task_t
          …
          call nkern_scheduler           // pick a different task to run
          …
          ldr   r0, =nkern_running_task  // load SP from nkern_task_t
          ldr   r0, [r0]
          ldr   sp, [r0]
          ldmia sp!, {r4-r11}            // pop registers
          rti                            // return from interrupt
                                         // hardware will take over

  25. Scheduler: preemptive round-robin • Each priority level maintains a queue of runnable tasks. • Algorithm: • Put the old task at the end of its priority queue or into the sleep queue • Examine queue of sleeping tasks • Wake those whose sleep interval has expired • Find the highest-priority, non-empty queue • Remove and return first item in queue.
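
The algorithm above can be sketched in a few dozen lines of C. This is an illustration, not the nkern implementation: the types `task_t`, `queue_t`, and `sched_t`, the function names, the two priority levels, and the convention that 0 is the highest priority are all assumptions. Only the `wait_utime` field name is borrowed from the task structure shown earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_PRIORITIES 2 /* assumption: 0 is the highest priority */

typedef struct task {
    uint32_t priority;
    uint64_t wait_utime;  /* wake-up time, used only while sleeping */
    struct task *next;
} task_t;

typedef struct { task_t *head, *tail; } queue_t;

typedef struct {
    queue_t run[NUM_PRIORITIES]; /* one FIFO of runnable tasks per level */
    task_t *sleeping;            /* unsorted list of sleeping tasks */
} sched_t;

void queue_push(queue_t *q, task_t *t)
{
    t->next = NULL;
    if (q->tail) q->tail->next = t; else q->head = t;
    q->tail = t;
}

task_t *queue_pop(queue_t *q)
{
    task_t *t = q->head;
    if (t) {
        q->head = t->next;
        if (!q->head) q->tail = NULL;
    }
    return t;
}

/* Pick the next task. `old` is the task that was running (NULL on the
 * first call); `old_sleeps` says whether it asked to sleep. */
task_t *schedule(sched_t *s, task_t *old, int old_sleeps, uint64_t now)
{
    assert(s != NULL);

    /* 1. Put the old task at the end of its queue or on the sleep list. */
    if (old) {
        if (old_sleeps) { old->next = s->sleeping; s->sleeping = old; }
        else queue_push(&s->run[old->priority], old);
    }

    /* 2. Wake sleepers whose interval has expired. */
    task_t **link = &s->sleeping;
    while (*link) {
        task_t *t = *link;
        if (t->wait_utime <= now) {
            *link = t->next;                 /* unlink before re-queueing */
            queue_push(&s->run[t->priority], t);
        } else {
            link = &t->next;
        }
    }

    /* 3. Take the first task from the highest-priority non-empty queue. */
    for (int p = 0; p < NUM_PRIORITIES; p++) {
        task_t *t = queue_pop(&s->run[p]);
        if (t) return t;
    }
    return NULL; /* nothing runnable; a real kernel would run an idle task */
}
```

Because the old task is re-queued at the tail, tasks of equal priority take turns, while a lower-priority task only runs when every higher queue is empty.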

  26. What else do we need? • If our tasks are always runnable, we’re done! • If a task wants to wait for a fixed time: • nkern_running_task->wait_utime = now() + delay • manually cause a SysTick IRQ • What if a task wants to wait for some asynchronous event? • Receive data from ethernet/serial • A button press

  27. An example problem • Serial-To-LCD • Receive characters from serial port, write them to a graphical LCD display

  28. Serial-to-LCD: dumb • Continuously check for occurrence of event • Miserable! • High CPU usage • Higher latency for other tasks • Higher power consumption • … and really common • (thanks to reusing vendor-supplied sample code)

      void serial_echo_task() {
          while (1) {
              while (!serial_rx_data_available()); // spin wait
              char c = serial_rx_get_data();
              lcd_draw_character(c);
          }
      }

  29. Serial-to-LCD: half a good idea • Suppose the hardware can generate an IRQ when our desired event occurs • What’s wrong with this? • Long-running IRQ • lcd_draw_character() called from IRQ context • Destroys real-time performance of system!

      void serial_irq() {
          if (serial_rx_data_available()) {
              char c = serial_rx_get_data();
              lcd_draw_character(c);
          }
      }

  30. Serial-to-LCD: smart • Add task to a waitlist • Scheduler will stop scheduling that task • An IRQ will “wake up” the task

      nkern_wait_list_t *serial_rx_wait_list;

      void serial_echo_task() {
          while (1) {
              nkern_wait(serial_rx_wait_list); // won’t return until data avail
              char c = serial_rx_get_data();
              lcd_draw_character(c);
          }
      }

      void serial_irq() {
          if (serial_rx_data_available())
              nkern_wake_all(serial_rx_wait_list);
      }

  31. Serial-to-LCD: smarter! • Trigger a reschedule from within the IRQ • Don’t have to wait for SysTick before the task can wake up. • Greatly reduces servicing latency • One more tiny change needed for “smartest”… hint: prevent unnecessary calls to nkern_schedule()

      void serial_irq() {
          if (serial_rx_data_available()) {
              nkern_wake_all(serial_rx_wait_list);
              nkern_schedule();
          }
      }
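
One reading of the hint is to reschedule only when the wake-up actually made a task runnable, which is what the `reschedule` flag in the uncensored handler on slide 33 does. The sketch below models that idea on a host machine. Everything in it is a stand-in: `wait_list_t` is reduced to a counter, and `wake_all`, `request_schedule`, and `serial_irq_sketch` are hypothetical names, not the nkern API.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a kernel wait list: just counts the waiting tasks. */
typedef struct { uint32_t num_waiting; } wait_list_t;

uint32_t schedule_calls; /* counts reschedule requests, for illustration */

/* Wake every waiting task and report how many were woken. */
uint32_t wake_all(wait_list_t *wl)
{
    assert(wl != 0);
    uint32_t woken = wl->num_waiting;
    wl->num_waiting = 0;
    return woken;
}

void request_schedule(void)
{
    schedule_calls++;
}

/* IRQ handler sketch: skip the scheduler when nobody was waiting. */
void serial_irq_sketch(wait_list_t *rx_wait_list, int rx_data_available)
{
    if (!rx_data_available)
        return;
    if (wake_all(rx_wait_list) > 0)
        request_schedule();
}
```

If no task was waiting, the set of runnable tasks has not changed, so running the scheduler would only add interrupt latency.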

  32. Serial-to-LCD: Moral • All four methods are “correct” • Implement Serial-to-LCD’s requirements • Are interchangeable • You might find any one of these in a system! • But only the fourth method is good! • The first will consume tons of CPU and power • The second can violate real-time requirements of other tasks in the system • The third will have high latency • We can still shoot ourselves in the foot, even if we are using the right tools (e.g., real-time OS)

  33. Serial IRQ Handler: Uncensored • What’s up with the __attribute__ stuff?

      static void serial0_irq_real(void) __attribute__ ((noinline));
      static void serial0_irq_real() {
          uint32_t status = UART0_FR_R;
          uint32_t reschedule = 0;

          if (!(status & ((1<<RX_IRQ) | (1<<RX_TIMEOUT_IRQ)))) {
              // data is available.
              UART0_ICR_R = (1<<RX_IRQ) | (1<<RX_TIMEOUT_IRQ);
              reschedule |= _nkern_wake_all(&rx_waitlist[0]);
          }

          if (!(status & (1<<TX_IRQ))) {
              // room to send
              UART0_ICR_R = 1<<TX_IRQ;
              reschedule |= _nkern_wake_all(&tx_waitlist[0]);
          }

          if (reschedule)
              _nkern_schedule();
      }

      static void serial0_irq(void) __attribute__ ((naked));
      static void serial0_irq(void) {
          IRQ_TASK_SAVE;
          NKERN_IRQ_ENTER;
          serial0_irq_real();
          NKERN_IRQ_EXIT;
          IRQ_TASK_RESTORE;
      }

      #define IRQ_TASK_SAVE \
          asm volatile ("mrs r12, PSP \r\n\t" \
                        "stmdb r12!, {r4-r11} \r\n\t" \
                        "ldr r0, =nkern_running_task \r\n\t" \
                        "ldr r0, [r0] \r\n\t" \
                        "str r12, [r0] \r\n\t");

      #define IRQ_TASK_RESTORE \
          asm volatile ("ldr r0, =nkern_running_task \r\n\t" \
                        "ldr r12, [r0] \r\n\t" \
                        "ldr r12, [r12] \r\n\t" \
                        "ldmia r12!, {r4-r11} \r\n\t" \
                        "msr psp, r12 \r\n\t" \
                        "ldr pc, =0xfffffffd \r\n\t" \
                        ".ltorg \r\n\t");

      #define NKERN_IRQ_ENTER nkern_in_interrupt_flag++;
      #define NKERN_IRQ_EXIT nkern_in_interrupt_flag--;

  34. Top-Half/Bottom-Half Handlers • Consider Ethernet peripheral • When a packet arrives, a lot of processing can result • Checksums to verify • IP fragments to reassemble • TCP windows to update • Applications to wake up • Want to keep IRQ Handler as fast as possible: • Modern strategy • IRQ Handler is minimalist. Does least amount of work possible, wakes up another thread to finish the work. • Thread processes incoming data while respecting priorities of other tasks in system.
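
A common way to connect the two halves is a small ring buffer: the IRQ handler only stores the incoming byte, and a thread drains the buffer later at its own priority. The sketch below is one such buffer, not code from the talk. The names `rx_ring_t`, `rx_ring_put`, and `rx_ring_get` and the 16-byte size are assumptions, and the lock-free scheme relies on a single core with exactly one producer (the IRQ) and one consumer (the thread).

```c
#include <assert.h>
#include <stdint.h>

#define RX_BUF_SIZE 16u /* must be a power of two */

typedef struct {
    volatile uint32_t head;  /* total bytes written; IRQ handler only */
    volatile uint32_t tail;  /* total bytes read; thread only */
    uint8_t data[RX_BUF_SIZE];
} rx_ring_t;

/* Top half: called from the IRQ handler. Does the minimum amount of
 * work. Returns 0 on success, -1 if the buffer is full (byte dropped). */
int rx_ring_put(rx_ring_t *r, uint8_t byte)
{
    assert(r != 0);
    if (r->head - r->tail == RX_BUF_SIZE)
        return -1;
    r->data[r->head % RX_BUF_SIZE] = byte;
    r->head = r->head + 1u;  /* publish only after the data is stored */
    return 0;
}

/* Bottom half: called from the thread. Returns 0 and stores one byte
 * in *out, or -1 if the buffer is empty. */
int rx_ring_get(rx_ring_t *r, uint8_t *out)
{
    assert(r != 0 && out != 0);
    if (r->head == r->tail)
        return -1;
    *out = r->data[r->tail % RX_BUF_SIZE];
    r->tail = r->tail + 1u;
    return 0;
}
```

In a full driver, the top half would also wake the thread through a wait list, and the expensive work (checksums, reassembly) would happen after `rx_ring_get` returns.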

  35. DARPA Urban Challenge

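  36. System Data Flow • Sensors send raw sensor data to the blade cluster • Blade cluster: sensor processing and path planning on 40 CPUs, non-real-time Linux • Blade cluster sends the motion plan to the ADU • ADU inputs: Manual Override, Run, Stop, E-Stop • ADU outputs: Steering, Gas, Brake, Shifter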

  37. Sensor Processing (Blades) • Obstacle Detection, tracking • Using LIDAR, RADAR

  38. Sensor Processing (Blades) • Detect road paint • Using Cameras

  39. Sensor Processing (Blades) • Estimate Lanes

  40. Motion Planning (Blades) • Figure labels: Goal Point; Car; Lamp Post; Curb seen by “hazards”, but not yet by lane tracker • Legend: Gray = Bad, Red = Infeasible

  41. Motion Planning (Blades) • Search for a series of steer/gas commands that get us closer to goal • Figure labels: Obstacle, Oncoming lane, Off-road, Current Position

  42. Waveform Generation • Blade cluster (not real time) generates waveform plan in advance • ADU executes plan, generating waveforms in real-time • Figure: gas/brake and steering wheel position plotted against time, with markers for plan received, now, and next plan deadline
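
Executing a plan in real time means evaluating it at the current time on every output update. The sketch below shows one possible form of that step. It is not the ADU code: the `plan_point_t` layout, the function `plan_sample`, microsecond timestamps, and linear interpolation between points are all assumptions. Treating a time past the last point as an error is likewise an assumption, meant to model a missed plan deadline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t utime;  /* time of this point, in microseconds (assumed) */
    float value;     /* commanded output, e.g. steering position */
} plan_point_t;

/* Evaluate a plan (points sorted by time) at time `now`.
 * Returns 0 and writes *out on success, or -1 when `now` lies outside
 * the plan, e.g. because no new plan arrived before the deadline. */
int plan_sample(const plan_point_t *plan, size_t n, uint64_t now, float *out)
{
    assert(out != NULL);
    if (n == 0 || now < plan[0].utime || now > plan[n - 1].utime)
        return -1;

    for (size_t i = 1; i < n; i++) {
        if (now <= plan[i].utime) {
            uint64_t span = plan[i].utime - plan[i - 1].utime;
            if (span == 0) {            /* duplicate timestamps */
                *out = plan[i].value;
                return 0;
            }
            float frac = (float)(now - plan[i - 1].utime) / (float)span;
            *out = plan[i - 1].value
                 + frac * (plan[i].value - plan[i - 1].value);
            return 0;
        }
    }

    *out = plan[n - 1].value;           /* single-point plan */
    return 0;
}
```

The caller decides what a -1 means for the vehicle; in a design like the ADU's, running out of plan would be a natural trigger for a watchdog transition.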

  43. ADU • Interface between the vehicle and our primary computers • Need physical interface to real world • DACs, UARTs, CAN • Basic mode switching of car (even if cluster is off) • Detect failures/bugs in our main software • Maintains a big finite state machine • “RUN”, “PAUSE”, “STANDBY” • Transitions caused by: • Commands from blade cluster • Human button presses • Time-outs

  44. ADU Finite State Machine • States in diagram: Standby, Run, Shift, Stop, ManualOverride, Error • Transition labels in diagram: Shift Command, Shift Done, 5 seconds, “Run” Button, “Stop” Button, Invalid command, Watchdog timeout, Manual Override engaged, Manual Override released, Assertion Failure

  45. ADU Tasks • tick_task(): trigger periodic FSM state transitions; watchdog timer • emc_poll_task: query car’s drive-by-wire for status periodically • emc_async_task: execute queued commands to drive-by-wire system (shifting, turn signals) • dac_task(): generate drive/steer analog output signals • shift_task(): perform and monitor transmission shifting • udp_command_task(): handle incoming ethernet command packets • button_task(): sample button inputs; debounce; trigger state transitions • music_task(): play music/make sounds to report state changes • status_task(): send periodic status messages via UDP • lcd_task(): display status on LCD display • debug_task(): dump kernel statistics over serial port on command • Task groups in diagram: FSM/Watchdog, Command Inputs, Vehicle control interface, Diagnostic/Monitoring

  46. Communications • How do we interface a non real-time system to a real-time system? • TCP? • Retransmissions, windowing heuristics → hard-to-predict latency • UDP? • Nominally unreliable • Others: • CAN, RS-232, RS-485, USB…

  47. Communications: UDP • Error rates over local LAN? • Gigabit ethernet bit error rate = 10^-12 • Dominant loss mode: host buffer overflows • Idempotent commands are robust to packet loss • Don’t send: “Turn steering wheel clockwise”, send “Set target steering wheel position to 0.92”. • Our strategy: • Retransmit idempotent commands at a rate high enough where packet loss failure mode is negligible
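
The difference between the two command styles, and the effect of retransmission, can be shown with a small sketch. All names here (`cmd_state_t`, `apply_set_target`, `apply_relative`, `prob_all_lost`) are hypothetical. The loss estimate assumes packets are lost independently, which is optimistic when the dominant loss mode is buffer overflow, since those losses tend to come in bursts.

```c
#include <assert.h>

typedef struct { float steer_target; } cmd_state_t;

/* Idempotent: applying the same command twice leaves the same state. */
void apply_set_target(cmd_state_t *s, float target)
{
    s->steer_target = target;
}

/* Not idempotent: a lost or duplicated packet changes the outcome. */
void apply_relative(cmd_state_t *s, float delta)
{
    s->steer_target += delta;
}

/* Probability that all n copies of a command are lost, assuming each
 * copy is lost independently with probability p. */
double prob_all_lost(double p, unsigned n)
{
    assert(p >= 0.0 && p <= 1.0);
    double result = 1.0;
    while (n--)
        result *= p;
    return result;
}
```

Under the independence assumption, even a 1% loss rate drops to about one in a million once each command is sent three times, and an idempotent command makes the extra copies harmless.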

  48. But why Ethernet? • Very high bit rate • Low error rate • Low latency • Multi-point • Cheap • Noise Immunity • Electrical Isolation

  49. Conclusions • Embedded systems are neat! • Tightly coupled hardware+software • Possible (necessary?) to understand the whole system • Interesting & unique challenges: • Limited resources: CPU, memory, power, size, cost • Simple real-time schedulers • Real time savers • Not a panacea: success requires knowledge and care • ADU as a simple embedded system
