PDF files or documents generally include various types of data, which can be of value to consumers of the PDF files. This data can be as simple as the text and images rendered on the screen by a PDF viewer, but can also include resources such as file attachments, form data, annotations, metadata, and bookmarks. The PDF format is also increasingly being used as a container for a variety of presentation formats. Many of these alternative presentation formats are defined by HTML-related material including Cascading Style Sheets (CSS), HTML files, images, JavaScript, JavaScript Object Notation (JSON), etc. Although these PDF resources have value to a variety of applications, there are relatively few tools available that allow a user to edit all aspects of a PDF file. For example, although PDF files can contain HTML content, existing HTML authoring tools have no knowledge of internal PDF file formats. This is generally because, in order to extract or edit such content in a PDF file, application developers must create, purchase, or otherwise include relatively sophisticated PDF parser libraries in their applications. These PDF parser libraries are typically written in low-level languages such as C, C#, C++, and Java, which tend to be complex and often require a higher level of programming skill to create and maintain. This can result in an application development process that is more complex and costly than would otherwise be the case, which in turn inhibits broad adoption and usage by those lacking the requisite technical skills.