The rapid development of geographic information systems (“GIS”) has prompted some researchers to reconsider the fundamental nature of GIS and its social implications. GIS (sometimes described as a form of “media”) has been widely used in various types of business, government and university projects. For instance, in North America alone, the value of the GIS market, even in a slow economy, is expected to increase from $1.4 billion in 2001 to $2.0 billion in 2004. This market value will continue to expand because GIS is finding new markets on the Internet. Today, GIS serves as a means of communication by conveying information and knowledge to the public. This growth may be attributed to many factors, of which at least two are primary: the worldwide building of Spatial Data Infrastructure (SDI), and the rapid development of computing technology and information technology in general.
During the past decade or so, the construction of SDI has proliferated geographically across all levels. It ranges from the National SDI in the U.S. to both the local SDI, such as state SDI and county SDI, and Global SDI. In the U.S., building SDI has involved almost every department of the federal government. Large volumes of geographic data, which are valuable to various organizations, have been accumulated mostly in a traditional, hierarchical manner: objects and their related attributes are collected and classified according to different themes or layers, and different layers are overlain to produce a specific map. To fully utilize the available spatial data efficiently and effectively, GIS has to play a critical role, not just in disseminating raw data, but also in providing information and offering value-added services to potential users.
However, to access and utilize valuable data, GIS-enabled environments have to be available to the public. The fast development of the Internet, especially the World Wide Web (WWW) and wireless communication, provides an ideal platform for bringing GIS technology to the general public through WebGIS and Location Based Service (LBS). The geospatial enablement of everyday tools (e.g., cars and phones) has given the general public channels to access GIS environments almost anywhere and anytime. These developments, facilitated by the general advancement of computing technology, permit data and information sharing through SDI and different types of distributed systems. In all of these diverse systems, the common vital units are the computing components, which handle the access, processing and visualization of the geographic information, and the interactions between the users and the data. Therefore, a distributed GIS may be abstractly organized into different computing units, which are themselves connected through various types of networks (e.g., coaxial cable, optical fiber and satellite), and may be represented by the client/server computing model suggested by many scholars. According to their different roles, a typical system broadly consists of three parts: a client, a server and a network. The client interacts with the users and performs some computing functions on spatial data. The server supplies data and information, and performs some value-added services for the client. The network carries the transmission of information between the client and the server.
Among various types of network GIS, including the LBS, WebGIS is extensively developed and widely used. It has accompanied the rapid development of the WWW during the past decade. Most users of the Internet have experience using WebGIS. Mapquest, Terraserver, Weather.com, and many other WebGIS applications have been widely used in, for example, online route selection, city planning, environmental exploration, watershed management, land use planning, road/rail construction, business analysis, airport construction, and data integration and dissemination. These popular tools serve different types of users. To give the user a better view of data or information, 3D visualization, Virtual Reality Modeling Language (VRML), and multimedia have also been integrated into certain WebGIS applications.
While WebGIS is gaining in popularity, disseminating voluminous and heterogeneous data becomes a challenge because Internet bandwidth is limited. To handle this challenge, two important issues must be considered: (1) sharing heterogeneous data, and making the data interoperable, among different systems, different communities, and different users; and (2) improving system performance so that data are delivered to the users within a reasonable time span. The OpenGIS Consortium (OGC) and Technical Committee 211 of the International Organization for Standardization (ISO/TC 211) address the first issue by providing a series of standardized interface specifications that allow different components of the system, including data, to support interoperability. Although various solutions to the second issue have been proposed, research on the performance of WebGIS has two limitations: (1) most methods focus on only one aspect of the performance problem; and (2) most methods do not consider how the hierarchical structure of map, layer, object and attribute may affect performance.
Research on WebGIS performance focuses on how to allocate both raster and vector data in a client-server-based web platform, as well as how to allocate functions to different system components in processing data to satisfy users' needs.
Raster Data
To handle raster data transmission, useful solutions may be borrowed from research and applications in image transmission in computer science. For instance, a progressive raster transmission technique has been frequently suggested. The basic idea of progressive transmission is to use image compression techniques to gradually extract and transmit raster data. After the compressed image is transmitted to the client, the image is gradually reconstructed on the client side. A simple progressive raster transmission technique randomly extracts and transmits the image without following a systematic algorithmic process. More sophisticated techniques for progressive raster transmission could be based on image compression techniques, such as Joint Photographic Experts Group (JPEG), wavelet, fractal, or a combination of techniques. Because of their complexity and their computing requirements, these progressive techniques are best suited to transmitting fixed-size images on the Internet; they lack the flexibility to efficiently handle the large volumes and variable image sizes found in a WebGIS environment. Nevertheless, the fundamental techniques of image compression may still be used to reduce the overall image transmission size.
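The coarse-to-fine idea behind progressive raster transmission may be illustrated with a minimal sketch. The sketch below is an assumption made for illustration only: it uses plain stride-based subsampling rather than any of the compression techniques named above (JPEG, wavelet, fractal), and the names `progressive_passes` and `reconstruct` are hypothetical.

```python
from typing import Iterator, List, Optional, Tuple

Pixel = Tuple[int, int, int]  # (row, col, value)


def progressive_passes(image: List[List[int]],
                       strides=(4, 2, 1)) -> Iterator[List[Pixel]]:
    """Server side: yield pixel batches from coarse to fine.

    Each pass samples the image on a grid with the given stride and sends
    only the pixels that no earlier pass has already sent.
    """
    sent = set()
    for stride in strides:
        batch = []
        for r in range(0, len(image), stride):
            for c in range(0, len(image[0]), stride):
                if (r, c) not in sent:
                    sent.add((r, c))
                    batch.append((r, c, image[r][c]))
        yield batch


def reconstruct(height: int, width: int,
                batches: List[List[Pixel]]) -> List[List[Optional[int]]]:
    """Client side: fill a canvas as batches arrive.

    None marks a pixel that has not been received yet.
    """
    canvas: List[List[Optional[int]]] = [[None] * width for _ in range(height)]
    for batch in batches:
        for r, c, value in batch:
            canvas[r][c] = value
    return canvas


# Hypothetical 8x8 single-band image used only for demonstration.
image = [[r * 8 + c for c in range(8)] for r in range(8)]
batches = list(progressive_passes(image))
preview = reconstruct(8, 8, batches[:1])  # coarse preview after the first pass
full = reconstruct(8, 8, batches)         # complete image after the last pass
```

In this sketch the client could display `preview` as soon as the first, small batch arrives and refine it as later batches are received; a compression-based scheme would replace the subsampling step but keep the same overall flow.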
A relatively large image may be resampled into different levels of detail to construct a hierarchical or pyramid structure. In each hierarchical layer, the image may be cut into pieces or tiles, which are logically connected through their respective coordinates. The image data may be transmitted in a load-on-demand manner (i.e., only the data of interest, as requested by the user, are combined from the respective tiles and transmitted to the client side). TerraServer (available from the United States Geological Survey of Reston, Va.) uses this technique to help SQL Server manage images, but this approach is not applicable to managing pyramids in WebGIS, as most WebGIS deployments do not include SQL Server. ArcGIS (available from ESRI of Redlands, Calif.) also adopts the pyramid technique in handling big images, but the pyramid is built every time the big image is accessed. This temporary approach is not suitable for managing images in WebGIS because the response time will be too long if the pyramid is rebuilt for every access. Therefore, elaborate permanent pyramid-management strategies need to be developed.
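The load-on-demand lookup described above may be sketched as follows. This is a minimal illustration under stated assumptions, not the TerraServer or ArcGIS implementation: the tile size of 256 pixels, the factor-of-two reduction between pyramid levels, and the names `level_size` and `tiles_for_window` are all hypothetical.

```python
from typing import List, Tuple

TILE = 256  # assumed tile edge length, in pixels


def level_size(width: int, height: int, level: int) -> Tuple[int, int]:
    """Image size at a pyramid level.

    Level 0 is full resolution; each higher level halves the resolution.
    """
    scale = 2 ** level
    # -(-a // b) is ceiling division for positive integers.
    return max(1, -(-width // scale)), max(1, -(-height // scale))


def tiles_for_window(window: Tuple[int, int, int, int], level: int,
                     width: int, height: int,
                     tile: int = TILE) -> List[Tuple[int, int, int]]:
    """Return (level, col, row) keys of the tiles that cover a window.

    The window is given in full-resolution pixels as half-open ranges
    [x0, x1) x [y0, y1); only these tiles need to be read and transmitted.
    """
    x0, y0, x1, y1 = window
    scale = 2 ** level
    w, h = level_size(width, height, level)
    # Project the window onto the requested level and clamp to the image.
    lx0, ly0 = x0 // scale, y0 // scale
    lx1 = min(w, -(-x1 // scale))
    ly1 = min(h, -(-y1 // scale))
    cols = range(lx0 // tile, (lx1 - 1) // tile + 1)
    rows = range(ly0 // tile, (ly1 - 1) // tile + 1)
    return [(level, c, r) for r in rows for c in cols]


# A 100 x 100 pixel window that straddles a tile corner at full resolution.
keys = tiles_for_window((1000, 1000, 1100, 1100), level=0,
                        width=4096, height=4096)
# keys == [(0, 3, 3), (0, 4, 3), (0, 3, 4), (0, 4, 4)]
```

Because the tile keys depend only on coordinates, a pyramid built once and stored permanently could be addressed this way on every request without being rebuilt.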
Some scholars suggest adopting tile pre-fetching and caching techniques to improve the performance of raster data transmission. Unfortunately, this combined technique is effective for raster data only, whereas the complex nature of WebGIS involves the handling of both raster data and vector data, as well as the support of spatial analysis. Moreover, users may request raster images in a relatively random manner, and therefore the pre-fetching technique may not be efficient or effective in handling such requests.
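The caching half of this technique may be sketched as a small least-recently-used (LRU) tile cache. The eviction policy, the capacity, and the names `TileCache` and `fetch` are assumptions made for illustration; the cited work may use a different policy.

```python
from collections import OrderedDict
from typing import Callable, Hashable


class TileCache:
    """Least-recently-used cache in front of a slow tile source."""

    def __init__(self, fetch: Callable[[Hashable], object], capacity: int = 3):
        self.fetch = fetch          # hypothetical function that loads one tile
        self.capacity = capacity    # maximum number of tiles kept in memory
        self._tiles: "OrderedDict[Hashable, object]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> object:
        if key in self._tiles:
            # Cache hit: mark the tile as most recently used.
            self._tiles.move_to_end(key)
            self.hits += 1
            return self._tiles[key]
        # Cache miss: load the tile, then evict the oldest one if over capacity.
        self.misses += 1
        data = self.fetch(key)
        self._tiles[key] = data
        if len(self._tiles) > self.capacity:
            self._tiles.popitem(last=False)
        return data


cache = TileCache(fetch=lambda key: f"tile-data-{key}", capacity=2)
for key in [(0, 0, 0), (0, 1, 0), (0, 0, 0), (0, 2, 0), (0, 1, 0)]:
    cache.get(key)
# cache.hits == 1 and cache.misses == 4: the repeated tile (0, 0, 0) is served
# from memory, while (0, 1, 0) was evicted before it was requested again.
```

The usage above also hints at the limitation noted in the text: when requests arrive in a relatively random order, few of them are hits and the cache contributes little.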
Vector Data
The progressive transmission technique can also be used for vector data, but the process is different from the one applied to raster data in the pyramid structure. Vector data may be extracted using cartographic principles to construct multilevel or multilayer structures, instead of using the simple resampling process applied to raster data. A mesh scheme has been regarded as a milestone for progressive vector transmission. By slightly modifying the topology of the input mesh, a higher compression ratio for transmitting Triangulated Irregular Network (TIN) data can be achieved. Also, an encoding structure may improve the efficiency of progressive transmission. Further, a model to generate multiple map representations and a set of generalization operators may be used. These processes may be good for transmitting single-layer maps, performing atomic topological changes on a vector map to achieve better transmission performance, and preserving the topology of spatial data. However, these processes do not take into account the de facto hierarchical geographic data organization of maps, layers, objects, and attributes. Therefore, these processes cannot be used generically to handle heterogeneous datasets in a WebGIS environment.
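The notion of a multilevel vector structure may be illustrated with a minimal sketch. As a stand-in for the cartographic generalization operators discussed above, the sketch uses Douglas-Peucker line simplification, which is a substitution made for illustration and which, unlike some of the cited processes, does not by itself guarantee that topology is preserved. The names `simplify` and `progressive_levels` and the tolerance values are hypothetical.

```python
import math
from typing import Iterator, List, Sequence, Tuple

Point = Tuple[float, float]


def _dist(p: Point, a: Point, b: Point) -> float:
    """Distance from p to segment ab (distance to a if a == b)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points: Sequence[Point], tol: float) -> List[int]:
    """Douglas-Peucker: sorted indices of the vertices kept at tolerance tol."""
    keep = {0, len(points) - 1}
    stack = [(0, len(points) - 1)]
    while stack:
        i, j = stack.pop()
        worst, idx = 0.0, None
        for k in range(i + 1, j):
            d = _dist(points[k], points[i], points[j])
            if d > worst:
                worst, idx = d, k
        if idx is not None and worst > tol:
            keep.add(idx)
            stack += [(i, idx), (idx, j)]
    return sorted(keep)


def progressive_levels(points: Sequence[Point],
                       tolerances: Sequence[float]) -> Iterator[List[int]]:
    """Yield, coarse to fine, only the vertex indices not sent earlier."""
    sent = set()
    for tol in sorted(tolerances, reverse=True):
        new = [i for i in simplify(points, tol) if i not in sent]
        sent.update(new)
        yield new


line = [(0, 0), (1, 1), (2, 0), (3, 0.2), (4, 0)]
levels = list(progressive_levels(line, [2.0, 0.5, 0.1]))
# levels == [[0, 4], [1, 2], [3]]
```

Here the client would first draw the segment between vertices 0 and 4, then insert vertices 1 and 2, and finally vertex 3. The sketch handles a single polyline only and ignores the map, layer, object and attribute hierarchy.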
Another technique for improving the performance of vector data transmission is indexing. Indexing techniques have been widely studied, mainly from two perspectives: (1) thematic indexing, based on the attributes of spatial objects; and (2) spatial indexing. Thematic indexing is the process of indexing attributes such as addresses, postcodes, phone numbers and feature names. Attribute data may be efficiently processed using popular commercial databases. An R-tree index method has also been used for thematic indexing. Spatial indexing may be more complex than thematic indexing and may be classified into two general categories: (1) hierarchical access indexing (such as R-tree and Quad-tree); and (2) hash indexing (such as Grid-files and R-files). An R-tree may be based on a feature's Minimum Bounding Box, which can be the minimum rectangle containing the feature. There are several extensions of R-tree (such as R*-tree and R+-tree) that allow dynamic indexing. To support multilevel data structures, Reactive-tree, PR-file and Multiscale Hilbert R-tree have been proposed.
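How a Minimum Bounding Box may drive a spatial index can be sketched with a simple uniform grid. This is a deliberately simplified illustration in the spirit of grid-based hash indexing, not an R-tree and not a Grid-file implementation; the cell size, the feature names, and the names `mbr`, `intersects` and `GridIndex` are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Set, Tuple

Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def mbr(points: Sequence[Tuple[float, float]]) -> Box:
    """Minimum bounding box: the smallest axis-aligned rectangle containing the feature."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def intersects(a: Box, b: Box) -> bool:
    """True if two boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class GridIndex:
    """Uniform grid that maps each cell to the features whose box overlaps it."""

    def __init__(self, cell: float = 10.0):
        self.cell = cell
        self.cells: Dict[Tuple[int, int], Set[str]] = defaultdict(set)
        self.boxes: Dict[str, Box] = {}

    def _cells(self, box: Box) -> Iterator[Tuple[int, int]]:
        x0, y0, x1, y1 = box
        for cx in range(int(x0 // self.cell), int(x1 // self.cell) + 1):
            for cy in range(int(y0 // self.cell), int(y1 // self.cell) + 1):
                yield cx, cy

    def insert(self, fid: str, points: Sequence[Tuple[float, float]]) -> None:
        box = mbr(points)
        self.boxes[fid] = box
        for key in self._cells(box):
            self.cells[key].add(fid)

    def query(self, window: Box) -> List[str]:
        """Return ids of features whose bounding box intersects the window."""
        candidates: Set[str] = set()
        for key in self._cells(window):
            candidates |= self.cells.get(key, set())
        return sorted(f for f in candidates if intersects(self.boxes[f], window))


index = GridIndex(cell=10.0)
index.insert("road", [(1, 1), (25, 4)])
index.insert("lake", [(40, 40), (48, 47), (44, 52)])
index.insert("park", [(12, 12), (18, 18)])
found = index.query((10, 0, 20, 15))
# found == ["park", "road"]
```

The query inspects only the grid cells under the window, so features far from the requested area (here, "lake") are never compared. A bounding-box match is only a candidate; an exact geometry test would normally follow.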
These spatial and thematic indexing research efforts may provide a basis for implementing vector data indexing. The indexing techniques mentioned above may be used on the client side, the server side, or both to improve access to vector data, and may be combined with related computing techniques.
Other Computing Techniques
Some have used pre-fetching and caching techniques for raster data transmission in systems exclusively handling raster data. In a WebGIS, which may involve both raster and vector data, caching could be used not only for transmitting data, but also for allocating data between a client and a server, especially for metadata and fundamental layer information. Multithreading techniques have also been suggested to improve performance by processing more than one task simultaneously. This technique may be very useful for handling concurrent access on the server side, as well as for improving interactive capability on the client side. When a server is opened to the public, many users may access the server simultaneously. Under such potentially massive access, the system may also be required to adopt cluster techniques so that multiple servers can serve users, ensure reliability and improve overall performance.
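Server-side multithreading for concurrent access may be sketched with a thread pool from the Python standard library. The request format, the worker count, and the name `handle_request` are hypothetical and stand in for whatever data access and rendering a real server would perform.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Request = Tuple[str, int]  # (layer name, tile number): an assumed request shape


def handle_request(request: Request) -> str:
    """Hypothetical handler; a real server would read and encode data here."""
    layer, tile = request
    return f"{layer}:{tile}"


requests: List[Request] = [("roads", 1), ("rivers", 2), ("parcels", 3)]

# Up to four requests are processed at the same time; map() returns the
# responses in the same order as the incoming requests.
with ThreadPoolExecutor(max_workers=4) as pool:
    responses = list(pool.map(handle_request, requests))
```

A cluster of servers would extend the same idea across machines, with a dispatcher distributing requests among them.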
Consequently, what is needed are a system and method for efficiently handling the transmission of large volumes of raster and vector data of variable size in a WebGIS environment. In addition, such a system and method would take into consideration the hierarchical geographic data organization of maps, layers, objects, and attributes to improve overall performance. Furthermore, it is preferable to have fast transmission, good response time and reliability.