The objective of the Unicore Solutions team was to create an efficient multi-GPU implementation of the classical Feldkamp cone-beam algorithm. No quality trade-offs were allowed: quality came first, and speed second. We targeted large datasets (≥ 5K) while staying efficient on smaller ones.
The module is integrated into the Neoscan microCT software and supports all common options, such as 360/180+ scans, beam hardening and ring artifact correction, smoothing, misalignment compensation, and ROI.
Our optimization goal was a CUDA solution that is ready for both multi-GPU and cluster use. We maximized the performance of every computational stage by balancing the load across the different GPU subsystems, such as memory bandwidth, caches, texture units, and SMs. We also implemented efficient asynchronous direct disk I/O that needs no slow external codecs for the main formats.
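To give a feel for why large datasets call for a multi-GPU design, here is a minimal sketch of one possible way to split a reconstruction volume into z-slabs across several GPUs. It is not the module's actual scheduling logic. It assumes that "5K" means a 5000 × 5000 × 5000 voxel volume stored as 32-bit floats, which comes to 500 GB and cannot fit on a single GPU. The names `Slab` and `plan_slabs`, the round-robin assignment, and the 16 GiB per-GPU budget are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slab:
    gpu: int      # index of the GPU assigned to this slab (hypothetical scheme)
    z_start: int  # first slice of the slab, inclusive
    z_stop: int   # last slice of the slab, exclusive


def plan_slabs(nx, ny, nz, num_gpus, gpu_budget_bytes, bytes_per_voxel=4):
    """Split nz slices into contiguous slabs that each fit the per-GPU budget.

    Slabs are assigned to GPUs round-robin. The slab count is rounded up to a
    multiple of num_gpus (when possible) so every GPU gets the same number.
    """
    slice_bytes = nx * ny * bytes_per_voxel
    max_slices = gpu_budget_bytes // slice_bytes
    if max_slices < 1:
        raise ValueError("a single slice does not fit in the GPU budget")

    min_slabs = -(-nz // max_slices)                   # ceil(nz / max_slices)
    num_slabs = -(-min_slabs // num_gpus) * num_gpus   # round up to a multiple
    num_slabs = min(num_slabs, nz)                     # never create empty slabs

    base, extra = divmod(nz, num_slabs)  # first `extra` slabs get one more slice
    slabs, z = [], 0
    for i in range(num_slabs):
        size = base + (1 if i < extra else 0)
        slabs.append(Slab(gpu=i % num_gpus, z_start=z, z_stop=z + size))
        z += size
    return slabs


if __name__ == "__main__":
    # Assumed example: 5000^3 float32 volume, 4 GPUs, 16 GiB usable per GPU.
    plan = plan_slabs(5000, 5000, 5000, num_gpus=4, gpu_budget_bytes=16 * 2**30)
    total_gb = 5000**3 * 4 / 1e9
    print(f"volume: {total_gb:.0f} GB, slabs: {len(plan)}")
    for slab in plan[:4]:
        print(slab)
```

Under these assumptions each slice takes 100 MB, so at most 171 slices fit in the budget and the volume is cut into 32 slabs of 156 or 157 slices, eight per GPU. A real implementation would also have to budget for projection data and intermediate buffers, which this sketch ignores.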