A large and growing population of users enjoys entertainment through the consumption of media items, including electronic media such as electronic books (also referred to herein as ebooks), electronic newspapers, electronic magazines, and other electronic reading material. Users employ various electronic devices to consume such media items. Among these electronic devices are electronic book readers, cellular telephones, personal digital assistants (PDAs), smart phones, portable media players, tablet computers, electronic pads, netbooks, desktop computers, notebook computers, and the like.
Books (e.g., hard copy books or electronic books that do not contain formatting or region type information, such as portable document format (PDF) books) are often converted into other electronic media item formats (e.g., a Mobipocket (MOBI) format) in order to provide the book as an electronic media item to electronic devices (e.g., user devices). When books are converted to a different electronic media item format, the converted books are often just scanned images of the pages of the book. In some instances, optical character recognition (OCR) may be performed on the images of the pages to extract the text of the pages.