Medical reports, including clinical information and patient reports, are currently provided in heterogeneous formats and structures.
Accordingly, there is a need in the art to make such medical reports accessible so that their textual contents can be used by machine-operated systems. To address the overall aim of providing means for the seamless integration of patient data, it is particularly important to identify information units within these reports which are of high relevance for later diagnosis decisions.
However, clinicians are usually required to provide a comprehensive medical report covering every finding in order to verify the completeness of a diagnosis. Besides pathological findings, which are of particular interest for later diagnosis decisions, medical reports usually also include non-pathological findings. Thus, the requirement of comprehensiveness results in extensive texts in which the relevant information units are usually difficult to identify.