but why?
When working on my bedside table lamp project (lampy2), I was faced with the dreaded moment where the good old while (1) loop was showing its limits:
- I needed to generate frames regularly
- and then send them to the LED driver
- while also checking for touch input and other things (including sending debug messages over UART when developing)
As a result, my FPS was not constant and the animations were just not smooth enough. Naturally, I looked at RTOSes and other flavours of schedulers for embedded systems, and to be honest they all seemed way too powerful (and therefore complex). The STM32 FreeRTOS documentation, the package in STM32CubeIDE, etc. all looked daunting. All I needed was to define a few tasks, say how often they should run, and sleep in between! Since the execution time of my tasks was mostly constant, and their sequencing as well (apart from touch signals, nothing is really unexpected), I decided to build something much smaller and much simpler, something I could actually understand so that I can evolve it as I need.
what it does
Lelu scheduler is a cooperative (non-preemptive) task scheduler. Yep, I looked it up, that's what it's called. Basically it does what I said above, or in other words: tasks run to completion - they're not interrupted mid-execution. This makes reasoning about shared state much simpler than with preemptive schedulers, at the cost of requiring well-behaved tasks that don't hog the CPU, and of assuming no external events (like interrupts) suddenly demand work mid-task.
Key features:
- Configurable task periods - each task has its own execution interval
- Priority by registration order - first registered = highest priority
- Runtime task enable/disable
- Overrun detection - warns when tasks take too long
- Execution time profiling - track time spent in each task
- Debug output via UART - optional diagnostic messages
- Works with STM32 HAL (F4, G0, and other families; it should not be too hard to extend further tbh)
how it works
The scheduler is built around a simple tick-based model. Here’s the big picture:
```
HAL_IncTick() called every 1ms by SysTick interrupt
            │
            ▼
┌─────────────────────────┐
│ lelu_scheduler_systick  │  ← counts milliseconds
└───────────┬─────────────┘
            │
            │  every TICK_PERIOD_MS (default 25ms)
            ▼
     tick_pending = true
            │
            │
════════════╪════════════════ main loop ════════════════
            │
            ▼
┌─────────────────────────┐
│  lelu_scheduler_run()   │  ← check all tasks
└───────────┬─────────────┘
            │
    ┌───────┴───────┐
    ▼               ▼
Task ready?     Task ready?    ... (for each registered task)
    │               │
    ▼               │
┌─────────┐         │
│ Run it! │         │
│ (timed) │         │
└────┬────┘         │
     │              │
     ▼              ▼
update stats    skip, not due yet
     │
     ▼
┌─────────────────────────┐
│ __WFI() - sleep until   │  ← CPU sleeps, saves power
│ next interrupt          │
└─────────────────────────┘
```
the tick
The scheduler hooks into STM32’s HAL_IncTick() which is called every 1ms by the SysTick interrupt. Inside, we count milliseconds and set a flag every LELU_TICK_PERIOD_MS (default 25ms). This is the scheduler’s ticking clock.
Why not check tasks every 1ms? I mean, you can, but for me it's overkill. If you need that level of responsiveness/accuracy, you probably want to use something else anyway :)
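For illustration, here's roughly what that hook can boil down to: a counter and a flag. This is a sketch of the idea with my own variable names, not the actual lelu_scheduler source:

```c
#include <stdbool.h>
#include <stdint.h>

static volatile uint32_t ms_elapsed   = 0;      // names are mine, not the library's
static volatile bool     tick_pending = false;

void lelu_scheduler_systick(void)   // called from HAL_IncTick(), i.e. every 1ms
{
    if (++ms_elapsed >= LELU_TICK_PERIOD_MS) {
        ms_elapsed   = 0;
        tick_pending = true;        // the main loop picks this up and runs tasks
    }
}
```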
task scheduling
Each task has a period (e.g. 100ms) and a counter tracking the time since its last execution. When lelu_scheduler_run() is called, it loops through all tasks in registration order (that's the priority!) and checks: has enough time passed since this task last ran?
Task: "LED_blink" (period=500ms)
time ──────────────────────────────────────────────────────▶
│ │ │ │ │ │
0ms 100ms 200ms 300ms 400ms 500ms
│ │
└── task runs ─────────────────────────────────── └── task runs again
If yes, the task function is called; if not, the scheduler skips to the next task.
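In sketch form, the check loop could look like this. Again, this is my reconstruction with assumed struct fields, not the actual source:

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {              // hypothetical task record, field names are my guess
    void   (*handler)(void);  // the task function
    uint32_t period_ms;       // desired execution interval
    uint32_t elapsed_ms;      // time accumulated since the last run
    bool     enabled;         // runtime enable/disable flag
} task_slot_t;

static task_slot_t tasks[LELU_MAX_TASKS];   // LELU_MAX_TASKS: see configuration below
static uint8_t     num_tasks;

void run_sketch(void)
{
    for (uint8_t i = 0; i < num_tasks; i++) {      // registration order = priority
        tasks[i].elapsed_ms += LELU_TICK_PERIOD_MS;
        if (tasks[i].enabled && tasks[i].elapsed_ms >= tasks[i].period_ms) {
            tasks[i].elapsed_ms = 0;
            tasks[i].handler();                    // runs to completion, no preemption
        }
    }
}
```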
profiling and overrun detection
Super simple: the scheduler records the time just before and just after each task runs.
If a task takes longer than one tick period, that's an "overrun". The scheduler logs OR- to UART so you know something's hogging the CPU. This makes it easy to spot, just by watching the terminal, which tasks take too long or have variable execution times.
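In code, the idea is just a pair of HAL_GetTick() reads around the handler call. This is a sketch of mine sitting inside the run loop; boot_done and uart_log() are stand-ins, not real library names:

```c
uint32_t t0 = HAL_GetTick();               // ms since boot, courtesy of SysTick
tasks[i].handler();                        // run the task to completion
uint32_t dt = HAL_GetTick() - t0;          // execution time in ms

tasks[i].total_ms += dt;                   // accumulated stats for print_stats()
if (boot_done && dt > LELU_TICK_PERIOD_MS) {
    uart_log("OR-");                       // overrun: task ate more than one tick
}
```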
the main loop
The main loop is dead simple:
```c
while (1) {
    lelu_scheduler_run();                        // run any tasks that are due
    while (!lelu_scheduler_tick_pending()) {     // nothing to do?
        __WFI();                                 // sleep until next interrupt
    }
    lelu_scheduler_clear_tick();                 // acknowledge the tick
}
```
The __WFI() (Wait For Interrupt) instruction puts the CPU to sleep until the next interrupt fires. This means the MCU isn't burning cycles in a busy loop - it wakes up, does its work, and goes back to sleep. I think there are better ways to do that, especially on low-power STM32 lines.
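One option I have in mind is going through the HAL power driver instead of a bare __WFI(): same effect today, but it leaves the door open to deeper sleep modes. A sketch, not something the scheduler currently does:

```c
// Equivalent to __WFI(), but via the HAL PWR API:
HAL_PWR_EnterSLEEPMode(PWR_MAINREGULATOR_ON, PWR_SLEEPENTRY_WFI);

// On low-power lines, Stop mode would save much more, but SysTick stops
// too, so the scheduler tick would need an LPTIM/RTC wakeup instead:
// HAL_PWR_EnterSTOPMode(PWR_LOWPOWERREGULATOR_ON, PWR_STOPENTRY_WFI);
```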
quick start
1. Include the header
#include "lelu_scheduler.h"
2. Hook into HAL_IncTick
Add the scheduler systick call to your HAL_IncTick() function:
```c
void HAL_IncTick(void)
{
    uwTick += (uint32_t)uwTickFreq;
    lelu_scheduler_systick();   // <-- this is the magic line
}
```
3. Initialize and run
#include "lelu_scheduler.h"
// task functions
void task_blink_led(void) { HAL_GPIO_TogglePin(GPIOA, GPIO_PIN_5); }
void task_read_sensor(void) { /* read your sensor */ }
int main(void)
{
HAL_Init();
SystemClock_Config();
MX_GPIO_Init();
MX_USART2_UART_Init();
// initialize scheduler (pass UART for debug output, or NULL)
lelu_scheduler_init(&huart2);
// register tasks with name, handler, period in ms
lelu_scheduler_add_task("LED_blink", task_blink_led, 500, NULL);
lelu_scheduler_add_task("Sensor", task_read_sensor, 100, NULL);
// signal that boot is complete (enables overrun detection)
lelu_scheduler_set_boot_done();
// main loop
while (1)
{
lelu_scheduler_run();
while (!lelu_scheduler_tick_pending()) { __WFI(); }
lelu_scheduler_clear_tick();
}
}
the blinky example
The classic “hello world” of embedded systems, but with two LEDs blinking at different rates:
```c
/* Task Functions */
void task_blink_led1(void) { HAL_GPIO_TogglePin(GPIOA, GPIO_PIN_5); }
void task_blink_led2(void) { HAL_GPIO_TogglePin(GPIOA, GPIO_PIN_6); }

int main(void)
{
    // ... HAL init ...
    lelu_scheduler_init(&huart2);

    lelu_scheduler_add_task("LED1_fast", task_blink_led1, 250, NULL);   // 2 Hz
    lelu_scheduler_add_task("LED2_slow", task_blink_led2, 1000, NULL);  // 0.5 Hz

    lelu_scheduler_set_boot_done();

    while (1)
    {
        lelu_scheduler_run();
        while (!lelu_scheduler_tick_pending()) { __WFI(); }
        lelu_scheduler_clear_tick();
    }
}
```
Connect a serial terminal at 115200 baud and you’ll see:
```
[LELU] Scheduler initialized (max 8 tasks, 25ms tick)
[LELU] Added task 'LED1_fast' (id=0, period=250ms)
[LELU] Added task 'LED2_slow' (id=1, period=1000ms)
[LELU] Boot done, scheduler active with 2 tasks
```
configuration
Configuration happens via preprocessor defines. Set these before including lelu_scheduler.h:
```c
#define LELU_MAX_TASKS      4    // only need 4 tasks, save some RAM
#define LELU_TICK_PERIOD_MS 10   // 10ms tick for finer resolution
#include "lelu_scheduler.h"
```
| Define | Default | Description |
|---|---|---|
| `LELU_MAX_TASKS` | 8 | Maximum number of tasks (~32 bytes each) |
| `LELU_TASK_NAME_LEN` | 20 | Max characters for task names |
| `LELU_TICK_PERIOD_MS` | 25 | Base scheduler tick period in milliseconds |
Choosing LELU_TICK_PERIOD_MS: ideally use the GCD of all your task periods. Tasks at 100ms and 500ms? Use 100ms (or 50ms, 25ms). Tasks at 30ms and 50ms? Use 10ms. Lower values = more responsive but more CPU overhead.
execution profiling
Call lelu_scheduler_print_stats() and you get a breakdown of how each task is performing. See this output from my current lampy3 project:
```
[LELU] Task Statistics (total_ticks=2113727)
----------------------------------------
button:        total=1ms       period=20ms    RUNNING
compute:       total=256818ms  period=20ms    RUNNING
i2c_send:      total=1011470ms period=20ms    RUNNING
temp_flag_chk: total=1916ms    period=1100ms  RUNNING
temp_read:     total=872ms     period=9700ms  RUNNING
temp_report:   total=1518ms    period=15300ms RUNNING
fault_detect:  total=4056ms    period=44700ms RUNNING
stats_report:  total=5763ms    period=30100ms RUNNING
----------------------------------------
```
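A handy pattern is to make the stats printer itself a low-frequency task; that's presumably what the stats_report entry above does. Something like:

```c
void task_stats_report(void)
{
    lelu_scheduler_print_stats();   // dump the table above over UART
}

// during setup, alongside the other tasks (30s period picked arbitrarily):
lelu_scheduler_add_task("stats_report", task_stats_report, 30000, NULL);
```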
cooperative scheduling: keep tasks short
This is the main gotcha. Since tasks aren’t preempted, a long-running task blocks everything else:
```c
// BAD - blocks the entire scheduler for 1 second!
void bad_task(void) {
    HAL_Delay(1000);
    do_something();
}

// GOOD - use a state machine, let other tasks run
void good_task(void) {
    static uint8_t state = 0;
    switch (state) {
        case 0: start_operation();    state = 1; break;
        case 1: if (operation_done()) state = 2; break;
        case 2: finish_operation();   state = 0; break;
    }
}
```
API reference
Initialization
| Function | Description |
|---|---|
| `lelu_scheduler_init(uart)` | Initialize scheduler. Pass UART handle for debug, or NULL |
| `lelu_scheduler_set_boot_done()` | Enable overrun detection. Call after setup is complete |
Task Management
| Function | Description |
|---|---|
| `lelu_scheduler_add_task(name, handler, period, &id)` | Register a new task |
| `lelu_scheduler_start_task(id)` | Enable a task |
| `lelu_scheduler_stop_task(id)` | Disable a task |
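For example, keeping the id handed back at registration lets you pause and resume a task at runtime (I'm assuming here that the id out-parameter is a small integer handle):

```c
uint8_t sensor_id;
lelu_scheduler_add_task("Sensor", task_read_sensor, 100, &sensor_id);

// later, e.g. while the lamp is off:
lelu_scheduler_stop_task(sensor_id);    // the run loop now skips this task

// and when it is needed again:
lelu_scheduler_start_task(sensor_id);   // resumes on its normal 100ms period
```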
Execution
| Function | Description |
|---|---|
| `lelu_scheduler_systick()` | Call from HAL_IncTick() every 1ms |
| `lelu_scheduler_run()` | Execute ready tasks (call from main loop) |
| `lelu_scheduler_tick_pending()` | Check if a tick period has elapsed |
| `lelu_scheduler_clear_tick()` | Clear the tick flag after processing |
Statistics
| Function | Description |
|---|---|
| `lelu_scheduler_print_stats()` | Print all task statistics via UART |
| `lelu_scheduler_get_stats(id, &stats)` | Get stats for a specific task |
| `lelu_scheduler_get_total_ticks()` | Get total ms elapsed since init |
memory footprint
| Component | Size |
|---|---|
| Per task | ~32 bytes |
| Global state | ~16 bytes |
| Debug buffer | 128 bytes |
| Total (8 tasks) | ~400 bytes |
Quite light for what you get. I don't have a good way to measure the CPU overhead, but I assume it's very low.
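The ~32 bytes per task checks out on the back of an envelope. These are hypothetical fields (matching the sketch in the task scheduling section above), not the actual struct:

```c
#include <stdint.h>

typedef struct {
    char     name[20];        // LELU_TASK_NAME_LEN default -> 20 bytes
    void   (*handler)(void);  // function pointer, Cortex-M ->  4 bytes
    uint32_t period_ms;       //                            ->  4 bytes
    uint32_t elapsed_ms;      //                            ->  4 bytes
} task_guess_t;               // sizeof == 32 on a 32-bit target; flags and
                              // stats counters would add a little on top
```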
installation
Option 1: Git submodule (recommended)
```sh
cd your_project/Core
git submodule add https://github.com/atelierlelu/lelu_scheduler.git
```
Then add lelu_scheduler/include to your include paths and lelu_scheduler/src/lelu_scheduler.c to your sources.
Option 2: Just copy the files
```sh
cp lelu_scheduler/include/lelu_scheduler.h your_project/Core/Inc/
cp lelu_scheduler/src/lelu_scheduler.c your_project/Core/Src/
```
license
MIT.