There are two main optical character recognition (OCR) approaches:
1: line detection -> character segmentation -> character classification -> string formation
2: sliding-window character detection/identification -> line detection -> string formation
The 1st approach fails if either line detection or character segmentation fails. Segmentation is typically very difficult to do robustly, especially since it happens before recognition: we cannot take advantage of knowledge of the objects to be recognized, so we try to “detect” without “recognizing”, which is hard.
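To make the fragility of segmentation-first pipelines concrete, here is a minimal sketch of one simple segmentation technique: splitting a binarized text line wherever a column contains no ink. The function name `segment_columns` and the toy images are assumptions made for illustration; real OCR systems may segment quite differently.

```python
def segment_columns(rows):
    """Split a binarized text line into (start, end) column spans.

    `rows` is a list of equal-length lists of 0/1 values (1 = ink).
    A new span starts at every ink column that follows an empty column.
    """
    width = len(rows[0]) if rows else 0
    # A column "has ink" if any row has a 1 in that column.
    ink = [any(row[x] for row in rows) for x in range(width)]

    spans, start = [], None
    for x, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = x                      # entering a character
        elif not has_ink and start is not None:
            spans.append((start, x))       # leaving a character
            start = None
    if start is not None:
        spans.append((start, width))       # character touches the right edge
    return spans


# Two glyphs separated by an empty column: segmentation works.
separated = [[1, 1, 0, 1, 1],
             [1, 1, 0, 1, 1]]

# The same glyphs touching in one row: they merge into a single span.
touching = [[1, 1, 0, 1, 1],
            [1, 1, 1, 1, 1]]

spans_separated = segment_columns(separated)
spans_touching = segment_columns(touching)
```

In the second image a single connecting pixel row is enough to merge both glyphs into one span, so the classifier downstream never sees two separate characters.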
The 2nd approach can be more robust because it does not require a separate segmentation/detection step before recognition. Instead, we first find candidate character recognitions throughout the image, and then assemble the candidates into the most likely string(s). The main difficulty with this approach is that the character recognizer may generate false-positive detections, which must be filtered out.
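The assembly step of the 2nd approach can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Candidate` type, the score and overlap thresholds, and the use of greedy non-maximum suppression to drop false positives are all choices made here, not something a particular OCR system is known to use. The sliding-window classifier itself is assumed to exist already and to have produced the candidate list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical output of a sliding-window character classifier."""
    char: str
    x: float      # left edge of the window
    y: float      # top edge of the window
    w: float
    h: float
    score: float  # classifier confidence in [0, 1]


def iou(a, b):
    """Intersection-over-union of two candidate boxes."""
    ix = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    inter = ix * iy
    union = a.w * a.h + b.w * b.h - inter
    return inter / union if union > 0 else 0.0


def suppress(cands, min_score=0.5, max_iou=0.3):
    """Drop weak candidates and candidates overlapping a stronger one."""
    kept = []
    for c in sorted(cands, key=lambda c: c.score, reverse=True):
        if c.score < min_score:
            break  # all remaining candidates are weaker still
        if all(iou(c, k) <= max_iou for k in kept):
            kept.append(c)
    return kept


def assemble(cands, min_score=0.5, max_iou=0.3):
    """Filter candidates, group them into lines, and read each line left to right."""
    lines = []
    for c in sorted(suppress(cands, min_score, max_iou), key=lambda c: c.y + c.h / 2):
        cy = c.y + c.h / 2
        for line in lines:
            ref = line[0]
            # Same line if the vertical centers are within half a character height.
            if abs(cy - (ref.y + ref.h / 2)) < 0.5 * ref.h:
                line.append(c)
                break
        else:
            lines.append([c])
    return ["".join(c.char for c in sorted(line, key=lambda c: c.x)) for line in lines]


cands = [
    Candidate("c", 0, 0, 10, 10, 0.90),
    Candidate("o", 1, 0, 10, 10, 0.60),   # overlaps "c" with a lower score
    Candidate("a", 12, 1, 10, 10, 0.80),
    Candidate("t", 24, 0, 10, 10, 0.95),
    Candidate("x", 50, 0, 10, 10, 0.20),  # low-confidence false positive
    Candidate("h", 0, 30, 10, 10, 0.90),
    Candidate("i", 12, 30, 10, 10, 0.90),
]
strings = assemble(cands)
```

Here line detection happens after recognition, on the surviving candidates, so it can rely on character positions and sizes rather than on raw pixels. A fuller version might also score whole strings against a dictionary or language model when choosing between overlapping candidates.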
Most OCR approaches are based on an initial (character) segmentation, i.e. they follow the 1st approach.