Firmware for MTL X-Ray detector
Firmware for a novel MTL X-Ray detector with built-in CUDA image processing and efficient streaming capabilities
Customer:
MTL Medical Technologies Limited (Russia)

Links and publications:
X-Ray image processing and tomographic reconstruction on GPU, page 40 (VIII Russian X-Ray producers conference)
Operating systems:
Linux

Platforms:
Arm64

Technology stack:
C++ 14–17, CUDA

Languages:
English, Russian

Firmware description

The integrated firmware controls the detector hardware, acquires image parts from the separate X-Ray sensors, composes them into a single output image, processes that image and sends it to the host machine.
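
The composition step can be pictured as placing each sensor's part at its position in the output frame. The following is only a minimal sketch: the `Tile` structure, the `composeFrame` function and the row-major 16-bit layout are assumptions made for illustration, not the firmware's actual data structures.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical description of the image part delivered by one sensor.
struct Tile {
    const std::uint16_t* pixels; // row-major tile data
    std::size_t width;           // tile width in pixels
    std::size_t height;          // tile height in pixels
    std::size_t dstX;            // left edge of the tile in the output frame
    std::size_t dstY;            // top edge of the tile in the output frame
};

// Writes every tile into a row-major output frame that is frameWidth pixels wide.
// The caller must guarantee that all tiles fit inside the frame.
inline void composeFrame(const std::vector<Tile>& tiles,
                         std::uint16_t* frame, std::size_t frameWidth) {
    for (const Tile& t : tiles) {
        for (std::size_t row = 0; row < t.height; ++row) {
            const std::uint16_t* src = t.pixels + row * t.width;
            std::uint16_t* dst = frame + (t.dstY + row) * frameWidth + t.dstX;
            std::memcpy(dst, src, t.width * sizeof(std::uint16_t));
        }
    }
}
```

Describing each part by its destination offset keeps the layout data-driven, so a different sensor arrangement would only change the tile list.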

The built-in CUDA-based image processing pipeline applies real-time corrections: dark frame subtraction, gain correction, defect and sensor gap correction, levels and gamma tuning, and HDR image construction. The pipeline also accepts external processing modules, which can be combined into a processing chain without modifying the firmware.
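
To make some of these corrections concrete, here is a CPU reference sketch of dark frame subtraction, gain and gamma applied to one pixel. The function names, the order of operations and the per-pixel gain map are assumptions for illustration; in the firmware this kind of per-pixel work runs in CUDA, and its actual formulas are not given here.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical per-pixel correction: dark subtraction, then gain, then gamma.
inline std::uint16_t correctPixel(std::uint16_t raw, std::uint16_t dark,
                                  float gain, float gamma) {
    const float kMax = 65535.0f;
    // Dark frame subtraction, clamped at zero so the result cannot wrap around.
    float v = raw > dark ? static_cast<float>(raw - dark) : 0.0f;
    // Gain correction with a per-pixel coefficient, clamped to the 16-bit range.
    v = std::min(v * gain, kMax);
    // Gamma applied to the value normalized to [0, 1].
    v = kMax * std::pow(v / kMax, gamma);
    return static_cast<std::uint16_t>(std::lround(std::min(v, kMax)));
}

// Applies the correction to a whole frame of count pixels.
inline void correctFrame(const std::uint16_t* raw, const std::uint16_t* dark,
                         const float* gain, float gamma,
                         std::uint16_t* out, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = correctPixel(raw[i], dark[i], gain[i], gamma);
    }
}
```

Because each output pixel depends only on the same pixel of the inputs, the loop body maps naturally onto one GPU thread per pixel.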

Acquired images can be stored locally on the device, viewed directly over an HDMI connection, sent to a host machine via the GigE Vision protocol or streamed to the built-in web interface.
Key features

  • Integrated flexible CUDA-based image processing pipeline
  • Ability to use custom processing modules
  • High processing performance: up to 60 fps on large 3624x4560 (16-bit) images
  • HDR image acquisition mode support
  • Fast in-house GigE Vision protocol implementation with no intermediate memory copying and support for links of up to 10 Gbps
  • On-the-fly generation of GenICam features based on the device configuration and settings
  • Web interface with real-time video streaming
  • Direct HDMI output with zooming/padding capability
Development process

Acquiring images, processing them and sending them to a host machine is challenging because of the large image size (up to 3624x4560, 16-bit) and the limited compute capability of the embedded hardware. The main effort therefore went into optimizing the image processing pipeline, reducing intermediate copying of large data and parallelizing the internal processes.
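
A quick calculation shows the scale of the data involved. The sketch below assumes uncompressed frames at the maximum size with 2 bytes per pixel; the constant and function names are illustrative only.

```cpp
#include <cstdint>

// Maximum frame geometry quoted in the text; 16-bit pixels occupy 2 bytes.
constexpr std::uint64_t kWidth = 3624;
constexpr std::uint64_t kHeight = 4560;
constexpr std::uint64_t kBytesPerPixel = 2;

// Size of one uncompressed frame in bytes.
constexpr std::uint64_t frameBytes() {
    return kWidth * kHeight * kBytesPerPixel;
}

// Data volume per second that a stage must handle at the given frame rate.
constexpr std::uint64_t bytesPerSecond(std::uint64_t fps) {
    return frameBytes() * fps;
}

static_assert(frameBytes() == 33050880ULL, "about 33 MB per frame");
static_assert(bytesPerSecond(60) == 1983052800ULL, "about 1.98 GB per second at 60 fps");
```

At roughly 33 MB per frame, every extra copy of a frame costs about 2 GB/s of additional memory traffic at 60 fps, which is why avoiding intermediate copies matters on embedded hardware.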

Most of the image processing steps are performed with CUDA in a single pass, which sustains up to 60 fps. At the same time, the processing pipeline supports external customizable processing modules that can be configured and used in real time.
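
One way to picture a chain of pluggable modules is a list of named callables applied to a frame in order. This is a sketch under assumptions: `Frame`, `Module` and `ProcessingChain` are hypothetical names, and the firmware's real module interface is not described here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical frame type used only for this illustration.
struct Frame {
    std::vector<std::uint16_t> pixels;
    std::size_t width = 0;
    std::size_t height = 0;
};

// A module is any callable that modifies a frame in place.
using Module = std::function<void(Frame&)>;

class ProcessingChain {
public:
    // Appends a named module to the end of the chain.
    void add(std::string name, Module module) {
        stages_.emplace_back(std::move(name), std::move(module));
    }

    // Removes all modules with the given name; returns true if any was removed.
    bool remove(const std::string& name) {
        auto it = std::remove_if(stages_.begin(), stages_.end(),
                                 [&](const Stage& s) { return s.first == name; });
        const bool found = it != stages_.end();
        stages_.erase(it, stages_.end());
        return found;
    }

    // Runs the modules in insertion order on the same frame.
    void run(Frame& frame) const {
        for (const Stage& s : stages_) {
            s.second(frame);
        }
    }

    std::size_t size() const { return stages_.size(); }

private:
    using Stage = std::pair<std::string, Module>;
    std::vector<Stage> stages_;
};
```

Since the chain is plain data, modules can be added, removed or reordered at runtime without rebuilding the code that drives it.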

A single image buffer pool is used for all stages, from acquiring parts of an image to processing and streaming. Because every stage works on buffers from the same pool, images are never copied between stages, which keeps performance high.
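
The idea can be sketched with reference-counted handles that return their buffer to the pool when the last stage releases it. `BufferPool` and `Buffer` are hypothetical names; the firmware's own pool (which presumably manages GPU-accessible memory) is not shown here, and this version uses ordinary host memory.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <vector>

using Buffer = std::vector<std::uint16_t>;

// Fixed-size pool of preallocated image buffers shared by all stages.
// The pool must outlive every handle it has given out.
class BufferPool {
public:
    BufferPool(std::size_t count, std::size_t pixelsPerBuffer) {
        free_.reserve(count);
        for (std::size_t i = 0; i < count; ++i) {
            free_.push_back(std::unique_ptr<Buffer>(new Buffer(pixelsPerBuffer)));
        }
    }

    // Hands out a free buffer, or nullptr if all buffers are in use.
    // Copies of the returned handle share the same memory; the buffer
    // goes back to the pool when the last copy is destroyed.
    std::shared_ptr<Buffer> acquire() {
        std::unique_ptr<Buffer> buffer;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (free_.empty()) {
                return nullptr;
            }
            buffer = std::move(free_.back());
            free_.pop_back();
        }
        return std::shared_ptr<Buffer>(buffer.release(), [this](Buffer* b) {
            std::lock_guard<std::mutex> lock(mutex_);
            free_.push_back(std::unique_ptr<Buffer>(b));
        });
    }

    // Number of buffers currently available.
    std::size_t available() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return free_.size();
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::unique_ptr<Buffer>> free_;
};
```

Passing the handle from acquisition to processing to streaming moves only a pointer, so the pixel data stays in place for the whole lifetime of the frame.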

The fast in-house implementation of the GigE Vision protocol optimizes network usage and supports connection speeds of up to 10 Gbps without frame drops.
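
Sending a frame without copying it can be illustrated by describing each network packet as a view into the frame buffer rather than as a separate payload buffer. The names `PacketView` and `splitIntoPackets` are hypothetical, and protocol headers and the payload size negotiated with the host are deliberately left out of this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// A packet payload described as a slice of the frame buffer (no copy is made).
struct PacketView {
    const std::uint8_t* data; // points into the original frame buffer
    std::size_t size;         // payload bytes in this packet
    std::uint32_t index;      // position of the packet within the frame
};

// Splits a frame into payload views of at most payloadBytes each.
inline std::vector<PacketView> splitIntoPackets(const std::uint8_t* frame,
                                                std::size_t frameBytes,
                                                std::size_t payloadBytes) {
    std::vector<PacketView> packets;
    if (payloadBytes == 0) {
        return packets; // nothing sensible to do with an empty payload size
    }
    packets.reserve((frameBytes + payloadBytes - 1) / payloadBytes);
    std::uint32_t index = 0;
    for (std::size_t offset = 0; offset < frameBytes; offset += payloadBytes) {
        const std::size_t size = std::min(payloadBytes, frameBytes - offset);
        packets.push_back(PacketView{frame + offset, size, index++});
    }
    return packets;
}
```

With views like these, a sender could pair a small per-packet header with the payload slice using scatter-gather I/O (for example POSIX `sendmsg` with an `iovec` array), so the image data is read straight from the frame buffer.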

A central store of GenICam features gives the user full control over all device functions. The .xml file with the feature list is generated at runtime, so it automatically reflects the device's configuration and settings.
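
Runtime generation of the feature list can be sketched as serializing only the features that the current configuration enables. This is a heavily simplified illustration: `IntegerFeature` and `buildFeatureXml` are hypothetical names, and the output is a bare fragment, whereas a real GenICam description also needs the schema attributes, register mappings and categories required by the GenApi standard.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical entry of the central feature store.
struct IntegerFeature {
    std::string name;   // feature name, assumed to need no XML escaping
    std::int64_t min;   // smallest allowed value
    std::int64_t max;   // largest allowed value
    std::int64_t value; // current value
    bool enabled;       // whether the current device configuration exposes it
};

// Builds a simplified XML fragment listing only the enabled features.
inline std::string buildFeatureXml(const std::vector<IntegerFeature>& features) {
    std::ostringstream xml;
    xml << "<RegisterDescription>\n";
    for (const IntegerFeature& f : features) {
        if (!f.enabled) {
            continue; // features unavailable in this configuration are omitted
        }
        xml << "  <Integer Name=\"" << f.name << "\">\n"
            << "    <Value>" << f.value << "</Value>\n"
            << "    <Min>" << f.min << "</Min>\n"
            << "    <Max>" << f.max << "</Max>\n"
            << "  </Integer>\n";
    }
    xml << "</RegisterDescription>\n";
    return xml.str();
}
```

Generating the description from the same store that holds the live values keeps the file and the device state from drifting apart when the configuration changes.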