STM32 Development and Remote Debugging
STM32 Development
Prerequisites
- Get a free account at https://my.st.com
- Get an STLink adapter or a Nucleo Board with embedded STLink adapter for flashing and remote debugging
- Linux (preferably with libusbx >= 1.0), Windows or Mac
- An STM32 chip, of course
Total cost starts at roughly 3€ for the adapter and 1€ for the chip
Installation
- Log in with your account
- Go to https://www.st.com/en/development-tools/stm32cubeide.html and download STM32CubeIDE for your OS (Linux/Mac/Windows)
- Unzip the download and install by running the sh file as root
unzip en.st-stm32cubeide_1.3.0_5720_20200220_1053_amd64.rpm_bundle.sh.zip
sudo sh st-stm32cubeide_1.3.0_5720_20200220_1053_amd64.rpm_bundle.sh
- If this fails on Linux because libusbx is missing, repeat the installation and, while the installer waits for you to accept the first license, do this in a second terminal:
cd st-stm32cubeide_1.3.0_5720_20200220_1053_amd64.rpm_bundle.sh.root/
rpm -ihv --nodeps st-stlink-server-1.3.0-4-linux-amd64.rpm
Prepare the STM32 chip
In my case an STM32L011D3 (TSSOP14), YMMV
- Pull pin 1 (BOOT0) down to GND through a 10k resistor
- Connect your STLink (mine is on a Nucleo F401RE, connectors CN4 and JP1) to your STM32, ground first (Nucleo pin - chip pin):
  * CN4-3 - 9 (VSS aka ground)
  * CN4-2 - 14 (SWCLK)
  * CN4-4 - 13 (SWDIO)
  * CN4-5 - 4 (NRST)
  * JP1-1 - 8 (VDD)
- Connect an LED between PA0 (anode, +) and VSS (cathode, -)
First Steps (Blink)
- Start the IDE
- Create new STM32 project
- Select your chip from the query form
- Give the project a name, rest default, select Finish
- CubeMX starts and shows your chip (this helps with configuring the clocks and pins)
- On the first tab (Pinout & Configuration) select System Core/SYS and check Debug Serial Wire
- Select pin PA0 and select GPIO_Output as its function
- Select Save and allow generation of code, if asked
 
- Open file Core/Src/main.c
- Add the two HAL lines below, right after the opening brace of the while (1):
/* Infinite loop */
/* USER CODE BEGIN WHILE */
while (1)
{
  HAL_GPIO_TogglePin(GPIOA, GPIO_PIN_0);  /* flip the LED on PA0 */
  HAL_Delay(200);                         /* wait 200 ms */
- Click on Run (green circle with white triangle)
Code should get compiled, flashed and started now: the LED blinks
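HAL_Delay() blocks the CPU for the whole 200 ms. If the loop later has to do other work as well, a common alternative is to compare a millisecond tick against the time of the last toggle. The following is only a minimal sketch, not generated code: interval_elapsed is a hypothetical helper name, and it assumes the tick comes from HAL_GetTick(), which returns a free-running 32-bit millisecond counter. It is plain C, so the logic can be tried on a PC first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true once at least interval_ms have passed
 * since *last_ms, and then moves *last_ms forward to now_ms.
 * The unsigned subtraction stays correct when the 32-bit tick wraps. */
bool interval_elapsed(uint32_t now_ms, uint32_t *last_ms, uint32_t interval_ms)
{
    assert(last_ms != NULL);
    if ((uint32_t)(now_ms - *last_ms) >= interval_ms) {
        *last_ms = now_ms;
        return true;
    }
    return false;
}
```

Inside the while (1) loop this would be used as interval_elapsed(HAL_GetTick(), &last_toggle, 200), calling HAL_GPIO_TogglePin(GPIOA, GPIO_PIN_0) whenever it returns true. last_toggle is an assumed uint32_t variable that starts at 0.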
Debugging
To debug your code, select Debug (the small green bug) instead of Run. The code will be flashed, started and then paused before the first statement in main(), the call to HAL_Init(). Some options:
- Step Over (F6) -> Execute the next statement, but don't descend into functions
- Step Into (F5) -> Execute the next statement; if it is a function call, go to its first instruction
- Set a breakpoint (Double click before the line number) -> Execution will pause before that line
- Resume (F8) -> Code runs until it hits a breakpoint
- Suspend -> Pause code wherever it is right now
While the execution is suspended, you can examine variables, memory and registers - no need for printf :)
Hints
HAL
HAL is the Hardware Abstraction Layer for ST chips. It aims to offer the same high-level interface on all STM32 chips. The drawback is resource usage: the blink example alone already uses 5432 bytes of flash. Not good if you only have 8192, like me with the STM32L011D3. You can't even enable all peripherals like SPI, I2C and UARTs before hitting the limit.
LL
LL is the low-level programming API for ST chips. It is much closer to programming the registers, and much more economical with resources. To use it, open the project's ioc file (where the pin config is stored). Select the tab Project Manager, subsection Advanced Settings, and change HAL to LL for each component in the Driver Selector. Now replace the HAL functions with LL functions, e.g.
- HAL_Delay(100) -> LL_mDelay(100)
- HAL_GPIO_TogglePin(GPIOA, GPIO_PIN_0) -> LL_GPIO_TogglePin(GPIOA, LL_GPIO_PIN_0)
Now the blink code is less than half the size (2648 bytes) for the same function!
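To put both sizes in proportion to the 8192 bytes of flash on the STM32L011D3, here is the arithmetic as a small host-side sketch. flash_used_percent is a hypothetical helper name, not part of any ST library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: flash usage in whole percent, rounded down. */
uint32_t flash_used_percent(uint32_t used_bytes, uint32_t flash_bytes)
{
    assert(flash_bytes != 0u);
    return (uint32_t)(((uint64_t)used_bytes * 100u) / flash_bytes);
}
```

With the numbers above, the HAL blink (5432 bytes) fills about 66% of the flash and the LL blink (2648 bytes) about 32%, a saving of 2784 bytes.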
Clocks
So far the chip ran at its default speed of only about 2MHz. For more (up to 32MHz for my L0), configure it in CubeMX: open the ioc file and select Clock Configuration. The options I know so far:
- MSI RC: internal multi-speed oscillator (default)
- HSI: internal high speed (16MHz RC that can be multiplied by the PLL up to 32MHz)
- HSE: external high speed, e.g. a quartz crystal, for a more precise clock than the internal RC
- LSE: external low speed, typically used for the RTC
To use 32MHz from HSI, select PLL Source Mux HSI16, PLLMul x4 and PLLDiv /2, then System Clock Mux PLLCLK. This is then used as the basis for the CPU clock and the peripheral clocks. Keep in mind that faster also means more power usage, so keep every clock as low as your application allows.
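The PLL setting boils down to one multiplication and one division: 16MHz x 4 / 2 = 32MHz. A minimal sketch of that check in plain C follows. The function and macro names are made up for illustration, and it only tests the 32MHz limit mentioned above, not the other PLL constraints that CubeMX validates for you.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HSI16_HZ      16000000u  /* internal 16 MHz RC oscillator */
#define SYSCLK_MAX_HZ 32000000u  /* upper limit quoted above for the L0 */

/* PLL output frequency = input * PLLMul / PLLDiv. */
uint32_t pll_output_hz(uint32_t input_hz, uint32_t mul, uint32_t div)
{
    assert(div != 0u);
    return (uint32_t)(((uint64_t)input_hz * mul) / div);
}

/* True if the resulting system clock stays within the 32 MHz limit. */
bool sysclk_in_range(uint32_t sysclk_hz)
{
    return sysclk_hz <= SYSCLK_MAX_HZ;
}
```

pll_output_hz(HSI16_HZ, 4, 2) gives 32000000, which is in range. The same multiplier with a divider of 1 would give 64MHz and fail the check.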
SPI
The TSSOP14 package of the STM32L011D3 does not have enough pins for both SPI and online debugging (Debug Serial Wire). They share the same pins and you can only have one at runtime :(
UART
UARTs are the serial ports. You can send data over one line and receive over another, either blocking or asynchronously. Here is just the simplest form: a blocking send.
- In CubeMX (Pinout & Configuration, Connectivity), enable LPUART1, USART2 or whichever port you want that does not collide with pins already in use
- As Mode select Half Duplex (sends and receives over one shared pin, taking turns)
- In Parameter Settings select the communication parameters (common values in parentheses):
  - Baud Rate (115200)
  - Word Length (8)
  - Parity (None)
  - Stop Bits (1)
  - Data Direction (for sending only debug prints, "Transmit Only" is enough)
 
- Generating the code adds a call to MX_LPUART1_UART_Init() to your main()
- To send over lpuart1, just call
HAL_UART_Transmit(&hlpuart1, buffer, bufferlength, HAL_MAX_DELAY);
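Because this call blocks until the data has been handed to the UART, its duration grows with the message length and shrinks with the baud rate. A rough estimate can be sketched in plain C. The helper names are hypothetical, and the result ignores software overhead and assumes the frames go out back to back.

```c
#include <assert.h>
#include <stdint.h>

/* Bits on the wire per byte: 1 start bit + data bits + parity bit (0 or 1)
 * + stop bits. */
uint32_t uart_frame_bits(uint32_t data_bits, uint32_t parity_bits,
                         uint32_t stop_bits)
{
    return 1u + data_bits + parity_bits + stop_bits;
}

/* Time to shift out n_bytes in microseconds, rounded up. */
uint32_t uart_tx_time_us(uint32_t n_bytes, uint32_t frame_bits, uint32_t baud)
{
    assert(baud != 0u);
    uint64_t total_bits = (uint64_t)n_bytes * frame_bits;
    return (uint32_t)((total_bits * 1000000u + baud - 1u) / baud);
}
```

With the common settings listed above (8 data bits, no parity, 1 stop bit) each byte costs 10 bits, so a 21-byte message at 115200 baud keeps the CPU busy for roughly 1.8 ms.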
Again, quite resource hungry. Just the HAL init and one send come close to the 8k flash limit. So let's try the LL API:
- In the CubeMX Project Manager switch the LPUART driver from HAL to LL
- Regenerate the sources and replace HAL_UART_Transmit(&hlpuart1, "Hello, serial world!\n", 21, HAL_MAX_DELAY); with its LL equivalent:
LL_LPUART_Enable(LPUART1);
const char *str = "Hello, serial world!\n";
while( *str ) {
  while( !LL_LPUART_IsActiveFlag_TXE(LPUART1) ) ;  /* wait until the transmit register is empty */
  LL_LPUART_TransmitData8(LPUART1, *str++);        /* hand over the next byte */
}
while( !LL_LPUART_IsActiveFlag_TC(LPUART1) ) ;     /* wait until the last byte is fully sent */
LL_LPUART_Disable(LPUART1);
Et voilà: longer source code (that you can pack into a function), but only about 5k of flash.
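One possible way to pack the send loop into a function is sketched below. To keep it compilable on a PC it does not call the LL API directly. Instead it takes three callbacks, which on the target would be thin wrappers around LL_LPUART_IsActiveFlag_TXE(), LL_LPUART_TransmitData8() and LL_LPUART_IsActiveFlag_TC() for LPUART1. All names here (serial_port_t, serial_puts, the fake_* items) are hypothetical, and on an 8k chip you may prefer to call the LL functions directly to save the indirection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical port description: the three operations the send loop needs. */
typedef struct {
    int  (*tx_empty)(void);        /* transmit register empty? */
    void (*put_byte)(uint8_t b);   /* hand one byte to the UART */
    int  (*tx_complete)(void);     /* last byte fully sent? */
} serial_port_t;

/* Blocking send of a NUL-terminated string; returns the number of bytes. */
size_t serial_puts(const serial_port_t *port, const char *str)
{
    size_t sent = 0;
    assert(port != NULL && str != NULL);
    while (*str) {
        while (!port->tx_empty()) { }      /* wait for room */
        port->put_byte((uint8_t)*str++);
        sent++;
    }
    while (!port->tx_complete()) { }       /* wait for the last byte */
    return sent;
}

/* Host-side fake "wire" so the function can be tried without hardware. */
char   fake_wire[64];
size_t fake_len;

int fake_ready(void) { return 1; }

void fake_put(uint8_t b)
{
    if (fake_len < sizeof fake_wire - 1u) {
        fake_wire[fake_len++] = (char)b;
    }
}

const serial_port_t fake_port = { fake_ready, fake_put, fake_ready };
```

Calling serial_puts(&fake_port, "Hello, serial world!\n") on a PC copies the 21 bytes into fake_wire. On the chip, the same call with a port built from the LL wrappers would send them over LPUART1.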