The present invention relates generally to distributing content (text, images, etc.) on a network, and more particularly to distributing content to multiple target sites having different site hierarchies and/or different content layouts.
The Internet is a worldwide collection of cooperating computer networks. A user typically accesses the Internet through a “client” computer. The client computer communicates with a “server” computer on a remote computer network using telephone, ISDN, or T1 lines or similar physical connections. The server computer may download content to the client computer for viewing or execution by the user.
The client and server computers communicate through software protocols, such as File Transfer Protocol (FTP), Simple Mail Transfer Protocol (SMTP), Hypertext Transfer Protocol (HTTP), Gopher, etc. Currently, HTTP is the most widely used protocol and is used for accessing the World Wide Web. The World Wide Web is a portion of the Internet consisting of server computers, also referred to as “sites,” which make multi-media documents (often referred to as content) generally available for downloading or retrieval by individuals having a computer with Internet access. The documents are generally created in Hypertext Markup Language (HTML) and displayed on a user's computer.
One advantage of the Internet is immediate and up-to-date access to information, such as news, sports, stock prices, etc. Many sites, such as MSNBC, ZDNet, SNAP, etc., publish content several times per hour. Users viewing such content know that they are viewing the latest information available on any given topic. Many of the sites have staff reporters that create some of the published content, but other content is purchased from third-party news sources. For example, MSNBC sells content to other content publishers, such as MSN, ZDNet, etc., for display on their sites. Additionally, major newspapers, such as the Wall Street Journal, and other news publications sell content to sites.
Each of the sites is organized differently and/or has a different look and feel for displaying content. That is, different sites may have different section hierarchies (e.g., headings and stories under the headings) and different story layouts (arrangement of words, images, etc. for the stories). The different look and feel is particularly apparent when content designed for a full-size monitor is displayed on a hand-held device. Typically, content for a full-size monitor has one or more pictures associated with the content, while a hand-held device uses only text when displaying content.
A problem with purchasing content from third-party sources is that the content has the look and feel of the source site (the site supplying the content), rather than the target site (the site receiving the content). Consequently, a site supplying content must quickly reformat the content before sending it to a target site so that the content has the look and feel of the target site. Additionally, the source site must determine where the content should be located in the site hierarchy for the target site. For example, the same story may be located under a section “other sports” on one target site and under “tennis” on another target site.
Another problem is that target sites have restrictions on content that can be displayed based on licensing agreements with third-party content providers. Thus, a source site must ensure that the target site does not receive content that it cannot legally display.
Currently, the work required to supply content to multiple target sites is labor intensive. There are publishing tools, such as Adobe PageMaker 6.5 Plus, that allow content to be saved as different file types, such as HTML or PDF. However, selecting these file types does not affect the look and feel of the content. Instead, it allows the content to be displayed using different software applications. For example, an HTML document can be displayed in a browser, while a PDF document can be displayed in Adobe Acrobat. However, both software applications display essentially identical documents in terms of the document's layout.
It is desirable, therefore, to provide a system and method wherein content is distributed to multiple target sites taking into account each target site's hierarchy and/or look and feel.
A method and apparatus are disclosed that allow for republication of content to multiple target sites in a customized format for each target site. Users on client computers can then view the customized content by connecting to the target sites through the Internet or other network.
In one aspect, a source site receives content (e.g., stories) from a journalist client computer and automatically republishes the content to multiple target sites. The source site includes a database engine having stored procedures thereon containing the custom section hierarchy of the target sites. The custom section hierarchy for a given target site is used to map a story on the source site to any desired section on the target site. Thus, different target sites can have the stories displayed under different section headings.
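The section mapping described above can be illustrated with a minimal sketch. It assumes an in-memory dictionary in place of the stored procedures on the database engine; the site names, section names, and the functions `map_section` and `place_story` are hypothetical and chosen only for illustration.

```python
# Hypothetical per-target section maps. The source keeps this information in
# stored procedures on a database engine; a dictionary is used here only to
# keep the sketch self-contained.
SECTION_MAPS = {
    "target_a": {"tennis": "other sports", "golf": "other sports"},
    "target_b": {"tennis": "tennis", "golf": "golf"},
}


def map_section(target_site, source_section):
    """Return the target-site section for a source section, or None if unmapped."""
    return SECTION_MAPS[target_site].get(source_section)


def place_story(story, target_sites):
    """Return {target_site: target_section} for one story.

    Target sites with no mapping for the story's source section are skipped.
    """
    placements = {}
    for site in target_sites:
        section = map_section(site, story["section"])
        if section is not None:
            placements[site] = section
    return placements


story = {"id": "s1", "section": "tennis"}
placements = place_story(story, ["target_a", "target_b"])
```

Here the same story lands under "other sports" on one target site and under "tennis" on the other, which matches the example given in the background.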
In another aspect, the source site includes a difference engine that analyzes time stamps and a browse order of stories already published to determine whether a story needs to be republished. Typically, a source site is instructed to republish an entire section of stories even though some of the stories have not changed on the target site. The difference engine determines the stories in the section that have changed by analyzing a master index file indicating the current state of the target site. Only stories that have changed since the last publication are republished.
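One way to picture the comparison performed by the difference engine is the sketch below. It assumes the master index file has already been loaded into a dictionary that maps a story identifier to a (time stamp, browse order) pair; that layout, the field names, and the function `stories_to_republish` are assumptions made for illustration and are not taken from the disclosure.

```python
def stories_to_republish(section_stories, master_index):
    """Return the ids of stories whose time stamp or browse order has changed.

    section_stories: list of story dicts in their current browse order, each
        with hypothetical keys "id" and "modified" (a time stamp).
    master_index: dict mapping story id -> (time stamp, browse order) as last
        published to the target site (an assumed in-memory form of the
        master index file).
    """
    changed = []
    for position, story in enumerate(section_stories):
        current = (story["modified"], position)
        # A story missing from the index has never been published there,
        # so it also counts as changed.
        if master_index.get(story["id"]) != current:
            changed.append(story["id"])
    return changed


section = [
    {"id": "a", "modified": 100},
    {"id": "b", "modified": 205},
    {"id": "c", "modified": 150},
]
master = {"a": (100, 0), "b": (200, 1)}
to_publish = stories_to_republish(section, master)
```

In this example story "a" is skipped because neither its time stamp nor its position changed, while "b" (edited) and "c" (new) are selected. Stories that were removed from the section are not handled in this sketch.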
In another aspect, the story layout for each target site is customized. Style sheets containing customization information for the stories are used for the customization. The style sheets can be used to customize the location of a picture, add a header, add a deck, customize the font, etc. In the illustrated implementation, the stories are converted to an XML file type and XSL style sheets are used for the customization. Other file types and style sheets also may be used.
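The idea of applying a per-target style to a single XML story can be sketched as follows. This is not an XSL transformation: Python's standard library has no XSLT processor, so a plain dictionary stands in for each style sheet. The XML element names, the style keys, and the function `render_story` are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical XML form of one story.
STORY_XML = """<story>
  <headline>Local team wins</headline>
  <deck>A late goal decides the match</deck>
  <picture src="team.jpg"/>
  <body>The match ended 1-0.</body>
</story>"""

# Stand-ins for per-target style sheets (assumed keys, not XSL).
STYLES = {
    "desktop_site": {"picture": "top", "show_deck": True},
    "handheld_site": {"picture": None, "show_deck": False},
}


def render_story(story_xml, style):
    """Render one XML story to HTML according to a target site's style."""
    story = ET.fromstring(story_xml)
    picture = story.find("picture")
    image = None
    if picture is not None and style["picture"] is not None:
        image = '<img src="%s">' % escape(picture.get("src"))

    parts = []
    if image and style["picture"] == "top":
        parts.append(image)
    parts.append("<h1>%s</h1>" % escape(story.findtext("headline")))
    deck = story.findtext("deck")
    if style["show_deck"] and deck:
        parts.append("<h2>%s</h2>" % escape(deck))
    parts.append("<p>%s</p>" % escape(story.findtext("body")))
    if image and style["picture"] == "bottom":
        parts.append(image)
    return "\n".join(parts)


desktop_html = render_story(STORY_XML, STYLES["desktop_site"])
handheld_html = render_story(STORY_XML, STYLES["handheld_site"])
```

The same story yields a page with a picture and deck for the full-size site and a text-only page for the hand-held site, mirroring the difference in look and feel noted in the background.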
In yet a further aspect, a status table is used to track each republication job as content is customized for each target site. The status table allows multiple jobs to be republished simultaneously either through parallel processing or multithreading.
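A status table shared by concurrent republication jobs might look like the sketch below, which uses multithreading. The class `StatusTable`, the status values, and the functions `run_job` and `republish_all` are assumptions for illustration; the disclosure does not specify the table's columns or states at this point.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class StatusTable:
    """Thread-safe table mapping a job id to its current status (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._rows = {}

    def set(self, job_id, status):
        with self._lock:
            self._rows[job_id] = status

    def snapshot(self):
        with self._lock:
            return dict(self._rows)


def run_job(table, job_id, customize):
    """Run one republication job and record its outcome in the table."""
    table.set(job_id, "running")
    try:
        customize(job_id)
        table.set(job_id, "done")
    except Exception:
        table.set(job_id, "failed")


def republish_all(job_ids, customize, workers=4):
    """Run all jobs on a thread pool and return the final status table."""
    table = StatusTable()
    for job_id in job_ids:
        table.set(job_id, "queued")
    # Leaving the "with" block waits for every submitted job to finish.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for job_id in job_ids:
            pool.submit(run_job, table, job_id, customize)
    return table.snapshot()


def demo_customize(job_id):
    # Hypothetical customization step; one target fails to show error tracking.
    if job_id == "site_b":
        raise ValueError("style sheet missing")


final_status = republish_all(["site_a", "site_b", "site_c"], demo_customize)
```

Because each job writes only its own row and all access goes through the lock, a failure in one job is recorded without disturbing the others.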
Further features and advantages of the invention will become apparent with reference to the following detailed description and accompanying drawings.