I recently investigated the use of Hadoop in medical image analysis, where the aim was to automate the calculation of lung volume from 3D CT thorax scans. The project was successful (details are outlined in the previous link); however, what became apparent over the course of the investigation is that there are many ways to attempt – and perhaps even achieve – automated image analysis.
Each digital image processing algorithm has its own set of characteristics. Some are easier to parallelise; some are faster; some are more accurate (many algorithms trade exactness for speed, producing approximations rather than perfect results); some are greedy; some are more computationally demanding; some are more memory intensive. In addition, algorithms often need to be pipelined to achieve the required result, combining a number of these varying characteristics in a single process.
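To make the pipelining idea concrete, here is a minimal sketch in Python. It is a toy illustration, not the project's actual code: the "image" is just a list of lists of greyscale values, and the stage functions (`threshold`, `invert`, `count_foreground`, `run_pipeline`) are hypothetical names invented for this example.

```python
# Toy illustration of pipelining image-processing stages.
# An "image" here is a 2D list of greyscale pixel values.

def threshold(image, level):
    """Binarise: 1 where the pixel is at or above level, else 0."""
    return [[1 if px >= level else 0 for px in row] for row in image]

def invert(image):
    """Flip a binary mask: foreground becomes background and vice versa."""
    return [[1 - px for px in row] for row in image]

def count_foreground(image):
    """Count the pixels set in a binary mask."""
    return sum(sum(row) for row in image)

def run_pipeline(image, stages):
    """Apply each stage in order; each stage's output feeds the next."""
    for stage in stages:
        image = stage(image)
    return image

# A tiny fake CT slice. Air-filled lung tissue appears dark on CT,
# so we threshold out the bright structures and invert to keep the dark ones.
ct_slice = [
    [0, 200, 50],
    [180, 30, 220],
]

mask = run_pipeline(ct_slice, [lambda img: threshold(img, 100), invert])
print(count_foreground(mask))  # prints 3: the number of dark ("lung") pixels
```

Even in this toy form, the trade-offs from the paragraph above show up: `threshold` is embarrassingly parallel per pixel, while a stage that needed neighbouring pixels (a smoothing filter, say) would be harder to split across workers, and each stage's memory and compute cost shapes how the whole pipeline behaves.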
This huge degree of variation means it is a non-trivial task to create an environment and infrastructure optimised to process images consistently and efficiently. That said – if it were easy, it would be boring!