Current methods for processing information from websites (e.g., records of merchandise for sale) usually begin with human identification of the headers of the records. For example, a human operator may perform the identification by manually entering text into a computer-readable file or program. In some instances, a graphical interface may be utilized to allow the human operator to “point and click” on an appropriate record bounding box, which may be interpreted by a computing system to identify record headers. If the human operator identifies two or more records, the computing system may collect information about record headers based on the records identified by the human operator. These techniques, however, are not appropriate when sheer scale makes human identification either inefficient or practically impossible. Further, where websites are identified programmatically on the fly, human intervention may not be able to address them ahead of time.