It is common for today's enterprise networks to comprise scattered arrangements of different hardware and software systems. This is due to the ever-changing data management needs of corporate enterprises, and to continuing advances in the computing hardware and software available to meet those needs. Commonly, different entities within an enterprise (e.g., different departments or work sites) have disparate software applications, groupware systems, or data maintenance architectures/procedures, such that information created or maintained by one entity is not usable by another entity.
Corporate portals, also referred to as intranet portals, have been introduced to increase the accessibility and usability of information stored across heterogeneous systems of an enterprise network. A corporate portal, which is usually overlaid onto an existing enterprise network, is designed to extract content from disparate systems on the enterprise network and to make that content searchable. The corporate portal is further designed to subdivide the content into taxonomic categories useful to the enterprise, and to allow individual users to access the content using an intuitive, customizable user interface. The customizable user interface is usually web-based to enhance its intuitiveness.
As part of the customization process, the user may configure their display to include one or more portal processing objects. Portal processing objects are adapted to access, process, and display content in a predefined manner appropriate for a class of user. For example, a company executive might import a first portal processing object into her display that illustrates certain sales data in a quarterly summary format. In contrast, a field sales supervisor might import a second portal processing object into his display that shows a real-time “streaming” version of that sales data.
A corporate portal comprises a database, such as a relational database, for storing the organizational schema of the enterprise. To maintain a taxonomic structure for organizing access to content, the corporate portal stores the names of categories and the references to content associated with those categories in a database. Due to storage limitations, and the problems associated with replicating content to a single format, a corporate portal may not store all of the content accessible from the portal. Rather, only references to that content are stored.
A corporate portal further comprises a search engine to automate the organization of text-based content. The search engine allows users to perform general searches for desired content contained across the many heterogeneous systems of an enterprise network.
A corporate portal further comprises a metadata-driven utility to automate the organization of different types of structured content. Corporate portals organize access to structured content including, for example, enterprise resource planning records, data warehousing reports, extensible markup language content, and groupware documents. Such content is often described by metadata that an application stores as columns, fields, or tags to describe the actual data, often for the benefit of other applications or an administrator. Using metadata to organize content provides for meaningful, integrated access to the many repositories of structured information that contain little text. Evaluating metadata also increases the precision of the organizing utility for semi-structured content, which consists of both metadata and text. Because metadata formats and terminology vary, the corporate portal is designed to recognize a wide variety of metadata. Once the metadata is understood in a standard form, the corporate portal can display that information with each reference, much like a card in a card catalog displays the author, subject and title of books in a collection.
Finally, a corporate portal incorporates web publishing tools similar to many document publishing and conversion tools. The portal publishes a taxonomy of content references, but may not necessarily publish the content itself in an HTML format. When a user clicks on a link, the portal may open desktop tools such as a Lotus Notes client or a database query tool instead of a web page, depending on the user's preference and/or privileges.
One example of a corporate portal is the Plumtree Corporate Portal 4.0 available from Plumtree Software, Inc. of San Francisco, Calif. Aspects of the Plumtree Corporate Portal 4.0 are described in a publicly available document, entitled “The Plumtree Corporate Portal 4.0: Technical White Paper,” which is available from Plumtree Software, Inc. and which is posted on their public web site at plumtree.com as of the filing date of this disclosure.
Desirable attributes of a corporate portal system include the ability to extract information from a wide variety of different formats (e.g., MICROSOFT OFFICE™, LOTUS NOTES™, VISIO™, HTML, ADOBE PDF™, etc.), and the ability to organize access to that information in a meaningful manner appropriate for the enterprise. A further desirable attribute is high extensibility, i.e., the ability to be extended to accommodate and bring together many heterogeneous enterprise network systems. In addition to being important during high corporate growth periods or merger activity, extensibility is also important for “future-proofing” of the corporate portal, such that information systems and data formats not currently in existence can be accommodated in the future. Yet another important attribute is security, wherein access to sensitive internal information is selectively brokered among different classes or types of employees, or selectively provided to certain users outside the company.
Finally, another crucial feature of a corporate portal is ease of administration. It is important that the person or persons administering and maintaining the corporate portal be provided with easy-to-use tools for keeping the corporate portal up-to-date, secure, and comprehensive, without requiring extensive manual upkeep and intervention. Stated another way, manual effort by a portal administrator should not be wasted on excessively tedious or repetitive tasks.
FIG. 1 shows a corporate portal configuration in accordance with the prior art, comprising a corporate portal system 102 overlaid onto an enterprise network 104. It is to be appreciated that enterprise network 104 of FIG. 1 is a simplified example, and that the preferred embodiments described herein are applicable to enterprise networks that may comprise many heterogeneous sub-networks or domains across many different work groups or spanning many different work sites connected by a wide area network. It is to be further appreciated that while different domains are often physically collocated or intermingled in practice, depending on their purposes and distinctions, different domains herein are shown as physically separate collections for clarity of presentation. In the simplified example of FIG. 1, enterprise network 104 comprises a communications backbone 106, a marketing domain 108, an engineering domain 110, and one or more miscellaneous nodes 112 (e.g., receptionist, executive, etc.). Enterprise network 104 is usually connected to the Internet 114. Marketing domain 108 comprises marketing groupware 116, a marketing archive 118, and a plurality of user terminals 120, such as personal computers, for use by marketing personnel. Engineering domain 110 comprises engineering groupware 122, an engineering archive 124, and a plurality of user terminals 126, such as personal computers or engineering workstations, for use by engineering personnel.
Generally speaking, due to different computing needs and/or different evolutionary paths within the enterprise, the marketing domain 108 and engineering domain 110 often contain vastly different types of computing hardware and software. Thus, for example, the marketing domain 108 may be based on a Lotus Notes groupware architecture, whereas the engineering domain 110 may be based on a Windows NT network platform. Each of these domains will usually maintain its own lists of users and groups in its own distinct format. Still other domains (not shown) of the enterprise network 104 may maintain lists of users and groups according to a standardized format such as LDAP (Lightweight Directory Access Protocol). As known in the art, a user refers to a particular person using the system (e.g., Bob, Mary, Steve, etc.), while a group refers to a logical collection of users (e.g., Executives, Engineers, Marketing, Company_Picnic_Committee, etc.). As used herein, “external users” and “external groups” refer to users and groups as identified by their native domains. In contrast, “portal users” and “portal groups” refer to users and groups as identified by the corporate portal system 102.
Corporate portal system 102 comprises a web server 128, a portal processing object server 130, a job server 132, a first data storage server 134, and a second data storage server 136 coupled as shown in FIG. 1. It is to be appreciated that the elements 128–136 are shown on separate hardware systems and coupled over a network due to practical implementation requirements. However, the different elements of corporate portal 102 may be implemented on different combinations of hardware and networking connections, and may even be implemented on a single computing machine, although such a configuration is generally not recommended for performance reasons.
Portal processing object server 130 has a connection to web server 128 for performing the role of serving requests from portal processing objects being executed for portal users. Job server 132 comprises a search engine 138 and a file crawler 140. Data storage server 134 comprises a relational database 142 into which are stored directory tables 146 comprising metadata about selected objects (e.g., documents, databases, executables, and other objects) contained in the enterprise network 104 (hereinafter referred to as “external objects” because they are external to the corporate portal). The contents of directory tables 146 form a metadata object corresponding to each external object, often referred to as a card for that external object. Relational database 142 further comprises an access control list 144 comprising, for each external object, a list of the portal users and portal groups that may access that object. The second data storage server 136 comprises a text index 150 used in conjunction with the search engine 138, and a set of portal processing object snapshots 152 for use by the portal processing object server 130.
In operation, when the file crawler 140 (which may comprise Notes, Exchange, web crawlers, etc.) discovers a new document (or other new object) in the enterprise network 104, the document is given an object identifier (OID). The document is also text-indexed by search engine 138, with the resulting data stored in text index 150. Also, metadata corresponding to the new document (e.g., title, location, author, creation date, type, and many other attributes) is stored in directory tables 146 to form a metadata object (i.e., card) 148 for that document. For clarity of explanation, the examples presented herein are for a common situation in which the newly discovered object is a document, such as a word processing file, HTML file, spreadsheet file, or the like. It is to be appreciated, however, that the newly discovered object may generally be any type of object (e.g., XML file, executable, database file, directory object, groupware-related reference files, executables, or objects, etc.) in accordance with the preferred embodiments described herein.
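By way of illustration only, the import sequence just described (assign an OID, text-index the document, and store a card) might be sketched as follows. The class name `Portal`, the method `import_document`, the sequential OID counter, the word-level index, and the sample path are all assumptions made for this sketch and are not taken from any actual portal product.

```python
import itertools
import re
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Portal:
    # Hypothetical stand-in for text index 150: word -> set of OIDs.
    text_index: dict = field(default_factory=dict)
    # Hypothetical stand-in for directory tables 146: OID -> card.
    directory_tables: dict = field(default_factory=dict)
    # Assumed OID allocator; real OIDs need not be sequential.
    _oids: Iterator[int] = field(default_factory=lambda: itertools.count(1))

    def import_document(self, location, title, author, text):
        """Give a newly discovered document an OID, index it, and store its card."""
        oid = next(self._oids)
        # Text-index the document body (a simple lowercase word index).
        for word in set(re.findall(r"\w+", text.lower())):
            self.text_index.setdefault(word, set()).add(oid)
        # Store the metadata object (card) for the document.
        self.directory_tables[oid] = {
            "title": title,
            "location": location,
            "author": author,
        }
        return oid


portal = Portal()
oid = portal.import_document(
    "//eng/specs/widget.doc", "Widget Spec", "Mary", "Widget torque limits"
)
```

Note that only the card and index entries are stored; the document itself remains at its original location on the enterprise network.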
Finally, after indexing and metadata object creation, the access control list 144 is updated to include the object identifiers (OIDs) of the portal users and portal groups that may access that document. For clarity of explanation, the presence of an OID associated with a document, user, group, or other object is established herein by reference to the object name itself. Thus, for example, a reference to a user (cn=John, OID=0xf9c6b332) shall simply be “John,” it being understood that the corporate portal system will actually be storing or manipulating the OID corresponding to John. When a user logs onto the corporate portal and performs a search, a first set of documents may satisfy their search parameters. For each document in that set, the access control list 144 is checked to see if that portal user has access permission to that document, or if that user is a member of a portal group having access permission to that document. The portal user is only presented with a listing of documents for which they have access permission.
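The permission check described above can be sketched, under simplifying assumptions, as a filter applied to the raw search hits. The names `AccessControlList`, `may_access`, and `visible_results`, as well as the sample documents, are hypothetical; as in the text, names stand in for the OIDs that a real system would store.

```python
from dataclasses import dataclass, field


@dataclass
class AccessControlList:
    # Document name -> set of portal users and portal groups allowed to see it.
    entries: dict = field(default_factory=dict)

    def may_access(self, document, user, user_groups):
        """True if the user, or any group the user belongs to, is listed."""
        allowed = self.entries.get(document, set())
        return user in allowed or bool(allowed & set(user_groups))


def visible_results(hits, acl, user, user_groups):
    """Keep only the search hits the portal user is permitted to see."""
    return [doc for doc in hits if acl.may_access(doc, user, user_groups)]


acl = AccessControlList({
    "Q3 forecast": {"Executives"},
    "Widget spec": {"Engineers", "John"},
    "Picnic flyer": {"Executives", "Engineers", "Marketing"},
})
hits = ["Q3 forecast", "Widget spec", "Picnic flyer"]
john_sees = visible_results(hits, acl, "John", ["Marketing"])
```

In this sketch John sees "Widget spec" because he is listed individually and "Picnic flyer" through his Marketing membership, while "Q3 forecast" is omitted from his results.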
More particularly, the portal user is presented with information selected from the metadata object corresponding to that document (e.g., document title, author, abstract, hyperlink to the document, etc.). The portal user may then instantiate a document viewing session, in which web server 128 accesses the document from its location on the enterprise network 104 and presents it to the user in a browser window. Alternatively, depending on the configuration of that specific corporate portal for that user, the user may be required to separately log in to the domain containing the document (using an external user name and password) and view it using their own desktop software applications. Many other scenarios are possible depending on the configuration of the corporate portal 102 and the enterprise network 104. For purposes of the present disclosure, it is mainly important to note that documents to which a portal user does not have access permission are kept invisible to that portal user by the corporate portal 102.
Problems arise in the administration of the prior art corporate portal 102, specifically in maintaining the access control list 144. In particular, a disadvantageous trade-off is presented between maintaining proper document security settings in the access control list and the amount of corporate portal system administrator time and effort required to do so. According to the prior art of FIG. 1, the file crawling sessions that discover new documents and populate the access control list (often referred to as “crawls”) are administered by content managers 154 and 156. For example, the content manager 154 may be responsible for marketing content present on the corporate portal system, while the content manager 156 may be responsible for engineering content present on the corporate portal system. Content managers 154 and 156 are expected to have a close connection to the content and users in their particular area of responsibility. More particularly, content managers 154 and 156 are expected to have a practical knowledge of the external users and external groups that may be associated with documents' native security settings, and a practical knowledge of how those settings should be reflected in the corporate portal security system with respect to portal users and portal groups.
FIG. 2 shows steps for crawling and assigning document security settings in accordance with the prior art. At step 202, the content manager configures crawl parameters. For example, the content manager will specify the domains, directories, file types, etc. for a crawl. The content manager may specify that the crawl be executed on a one-time basis, on a regular periodic basis (e.g., nightly, weekly), or according to a custom schedule. At step 204, while configuring the crawl parameters, the content manager specifies the security settings for the documents that will be imported by that crawl, in particular specifying the portal users and portal groups that will have access to the imported documents. After crawling begins, at step 206 the file crawler finds a new document meeting the crawl parameters, and at step 208 the document is imported into the corporate portal through the generation and storage of an associated metadata object in the relational database 142. At step 210, the document is “stamped” with access permissions, i.e., the access control list 144 is populated for that document with the portal users and portal groups pre-specified by the content manager. At step 212, the content manager reviews the crawl results, and may manually change access settings in the access control list 144 if required.
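For illustration, the flow of FIG. 2 might be sketched as below. This is a minimal sketch under stated assumptions: `CrawlConfig`, `run_crawl`, the glob-style path pattern, and the sample file paths are hypothetical, and plain dictionaries stand in for the relational database 142 and access control list 144.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class CrawlConfig:
    path_pattern: str   # step 202: crawl parameters (assumed glob-style)
    allowed: frozenset  # step 204: portal users/groups chosen before the crawl


def run_crawl(config, network_files, cards, acl):
    """Import new matching documents and stamp each with the preset permissions."""
    imported = []
    for path in network_files:
        # Step 206: consider only new documents that meet the crawl parameters.
        if path in cards or not fnmatchcase(path, config.path_pattern):
            continue
        cards[path] = {"location": path}   # step 208: store the metadata object
        acl[path] = set(config.allowed)    # step 210: "stamp" access permissions
        imported.append(path)
    return imported                        # step 212: results for manual review


cards, acl = {}, {}
config = CrawlConfig("marketing/*.doc", frozenset({"Marketing"}))
files = ["marketing/plan.doc", "engineering/spec.doc", "marketing/logo.png"]
imported = run_crawl(config, files, cards, acl)

# Step 212: the content manager may adjust an entry by hand after review.
acl["marketing/plan.doc"].add("Executives")
```

Every document imported by a given crawl receives the same stamped permissions, regardless of the security settings it carried in its native domain.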
Disadvantageously, the method of FIG. 2 results in corporate portal security settings having limited precision and extensibility. First, the security for each document imported into the corporate portal is extrinsically dictated by the corporate portal system itself (via the content manager), rather than by the external network administrator or external user who created the document. In large corporations, the content manager may be substantially removed from an understanding of the access control required for documents in a given location. While the content manager might attempt to access and emulate the local security settings for documents, this would be a manually intensive task that is made even more difficult by the many heterogeneous security systems in the enterprise network. Often, the content manager will take the “safe road” and provide very limited access to crawled documents (e.g., by doing per-domain crawls and only allowing portal groups corresponding to that specific domain to view the document). This can defeat the very purpose of the corporate portal system, which is to enhance intelligence and best-practices sharing among the enterprise network users. Alternatively, the content manager might “throw their hands up” and allow every portal group to see the document, which might compromise corporate security policies. Accordingly, if the content manager is unwilling or unable to shoulder an intensive, laborious workload in properly keeping up the access control list, the appropriateness of the settings in the access control list suffers.
Furthermore, the extensibility of the corporate portal security settings is limited in the prior art method of FIG. 2. It is often the case that entire domains, users, and groups are added to the corporate portal all at once (e.g., in a corporate acquisition or merger scenario). In such a situation, members of newly added portal groups will generally not be able to view currently existing documents in the corporate portal system, unless a new set of crawls is performed, wherein new custom parameters must be added specifying which of the newly added portal groups should see the documents, in addition to any old custom parameters for current portal groups. Alternatively, the content manager may individually “stamp” the newly allowed portal groups onto the cards of the existing documents. Either of these scenarios represents a substantial, administration-intensive task. Similar tasks would also need to be performed on new documents from the added domain with respect to portal groups that were already in existence.
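The extensibility limitation can be made concrete with a small sketch. The helper names `can_view` and `restamp`, the group name `Acquired_Engineers`, and the count of 1,000 documents are illustrative assumptions only.

```python
def can_view(acl, document, groups):
    """True if any of the given portal groups is stamped on the document."""
    return bool(acl.get(document, set()) & set(groups))


def restamp(acl, documents, new_groups):
    """Add new groups to each existing entry; returns the number of entries edited."""
    touched = 0
    for doc in documents:
        acl.setdefault(doc, set()).update(new_groups)
        touched += 1
    return touched


# Existing documents were stamped for "Engineers" when they were first crawled.
acl = {f"doc-{i}": {"Engineers"} for i in range(1000)}

# A portal group added later (e.g., after an acquisition) sees nothing at first.
before = can_view(acl, "doc-0", ["Acquired_Engineers"])

# Granting access requires editing every affected entry individually.
edits = restamp(acl, list(acl), {"Acquired_Engineers"})
after = can_view(acl, "doc-0", ["Acquired_Engineers"])
```

Here `before` is false and `after` is true only once all 1,000 entries have been edited, which is the administration-intensive work described above. The effort grows with the number of existing documents, and the content manager must still decide, document by document, which new groups belong on which entries.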
Accordingly, it would be desirable to provide a corporate portal in which security settings are more easily administered.
It would be further desirable to provide a corporate portal in which object security settings are established and maintained with increased precision and relevance.
It would be still further desirable to provide a corporate portal security system that is more extensible and does not require excessive manual intervention upon the addition of new domains, users, or groups to the enterprise network.