As explained in https://www.stanford.edu/class/cs124/AIMagzine-DeepQA.pdf:
"To preprocess the corpus and create fast run-time indices we used Hadoop. UIMA annotators were easily deployed as mappers in the Hadoop map-reduce framework. Hadoop distributes the content over the cluster to afford high CPU utilization and provides convenient tools for deploying, managing, and monitoring the corpus analysis process."
Which is exactly what Behemoth does (how very reassuring!).
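To make the "annotators as mappers" idea from the quote a bit more concrete, here is a minimal sketch in plain Python. It does not use the real Hadoop or UIMA APIs, and it is not how DeepQA or Behemoth are implemented. The names `capitalized_token_annotator`, `map_document` and `run_map_only_job` are hypothetical and exist only for illustration. The point is simply that an annotator which handles one document at a time, with no shared state, fits naturally into a map-only job over splits of the corpus.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical stand-in for a UIMA annotator: text in, (type, covered text) pairs out.
Annotation = Tuple[str, str]
Annotator = Callable[[str], List[Annotation]]


def capitalized_token_annotator(text: str) -> List[Annotation]:
    """Toy annotator (assumption): tags every capitalized token."""
    return [("Capitalized", tok) for tok in text.split() if tok[:1].isupper()]


def map_document(
    doc_id: str, text: str, annotator: Annotator
) -> Iterable[Tuple[str, List[Annotation]]]:
    """Mapper: one document in, one annotated document out, no shared state."""
    yield doc_id, annotator(text)


def run_map_only_job(
    corpus: Dict[str, str], annotator: Annotator, num_splits: int = 2
) -> Dict[str, List[Annotation]]:
    """Simulates a map-only job: the corpus is cut into splits and each split
    is processed independently, as separate nodes of a cluster would do."""
    splits: Dict[int, List[Tuple[str, str]]] = defaultdict(list)
    for i, item in enumerate(sorted(corpus.items())):
        splits[i % num_splits].append(item)

    output: Dict[str, List[Annotation]] = {}
    for split in splits.values():  # each iteration stands in for one map task
        for doc_id, text in split:
            for key, value in map_document(doc_id, text, annotator):
                output[key] = value
    return output


corpus = {"doc1": "Watson played Jeopardy", "doc2": "hadoop runs mappers"}
annotated = run_map_only_job(corpus, capitalized_token_annotator)
```

Because each document is annotated independently, no reduce step is needed, and adding nodes mostly just means more splits processed in parallel.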
The article also mentions UIMA-AS, and it is not entirely clear which part of the system uses which: is UIMA-AS used for the runtime analysis of the questions and Hadoop for the background learning?
It would be interesting to know what sort of UIMA annotators were used internally for the analysis of the text and, more importantly from Behemoth's point of view, whether Behemoth could have been used for this project and/or what features it would have needed in order to work on DeepQA.