This invention relates in general to using data pipes to transport data from one application to another application, and more particularly, to coordinating two applications which use a data pipe differently. Specifically, the invention addresses the case where a writer application attempts to write data to the pipe after a last parallel reader application has closed access to the pipe, i.e., after only partially reading the data.
Data piping between two units of work (e.g., a writer application and a reader application) includes the writer application writing data to a pipe and the reader application reading data from the pipe. The pipe is a conduit of a stream of data. As data is written to the pipe, the data is read from the pipe.
Within a single system (i.e., one operating system image), data piping is typically implemented using a first-in first-out (FIFO) buffer queue structure located in inboard memory. The pipe is accessible to both applications participating in the piping. Further, multiple writer and reader applications of the same system can access the same pipe.
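The single-system piping described above can be illustrated with a minimal sketch: a bounded FIFO buffer shared by a writer thread and a reader thread. All names (PIPE_DEPTH, SENTINEL, etc.) are illustrative assumptions for this sketch, not elements of any actual BatchPipes implementation.

```python
import queue
import threading

PIPE_DEPTH = 4          # number of records the pipe can hold at once
SENTINEL = object()     # marks end-of-data (the writer "closing" its side)

pipe = queue.Queue(maxsize=PIPE_DEPTH)

def writer(records):
    for rec in records:
        pipe.put(rec)       # blocks when the pipe media is full
    pipe.put(SENTINEL)      # signal that the data set is complete

def reader(out):
    while True:
        rec = pipe.get()    # blocks when the pipe is empty
        if rec is SENTINEL:
            break
        out.append(rec)

received = []
w = threading.Thread(target=writer, args=(range(10),))
r = threading.Thread(target=reader, args=(received,))
w.start(); r.start()
w.join(); r.join()
```

Because the buffer is bounded, data flows through the pipe as it is written, and neither side needs the whole data set in memory at once.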
Data piping can also be performed between applications on different systems using an external shared memory (i.e., cross-system data piping). This is described in co-pending, commonly assigned, U.S. patent application Ser. No. 08/846,718, filed Apr. 30, 1997, Bobak et al., entitled "Cross-System Data Piping Method Using An External Shared Memory," which is hereby incorporated herein by reference in its entirety. (This reference to application Ser. No. 08/846,718 in the Background of the Invention section is not an admission that said application is prior art to this invention.)
As noted, there are certain cases where a last reader application accessing a data pipe closes a data set prior to reading all of the data being written to the data pipe by a writer application. For example, the reading process may be sampling only a first portion of the records contained in a data set being written by the writer application to the pipe media. Since the writing process has no knowledge of the destination of the data being written, it continues to operate as before, writing all data of the data set until closing the output file.
In a parallel processing data piping scenario, if the reading process does not consume all data from the data set, the writing process will eventually fill all allocated pipe media, and will then be placed in a wait state until a further partner process (referred to herein as a "drain step") drains the data pipe. This is the behavior of the current implementation of an I/O subsystem running on OS/390 offered by International Business Machines Corporation, entitled "IBM BatchPipes/MVS," which is described further herein below.
For example, consider a two-step job that executes on a system such as International Business Machines' OS/390. The first step of the job creates a sequential data set and, upon completion of the first step, the second step executes and reads the data set created by the first step. However, when the second step reads the data set, it reads only the first part of the data set. Jobs that operate in this manner pose problems when parallelized, which must be overcome in order to allow successful execution.
Another technique which can be employed to keep writer applications and reader applications of a data pipe in sync in a parallel processing environment involves the use of "fittings," which are described in detail in "IBM BatchPipes/MVS User's Guide and Reference," GC28-1215-01, Second Edition, September 1995. Using this technique, a fitting could be placed on the writing application to ensure that only data needed by the reading application is written to the pipe. For example, it is possible to create a fitting on the writing application's pipe data definition statement that would allow only 500,000 records to flow into the pipe if the reading application wanted only 500,000 records. This fitting, which could be specified in BatchPipeWorks syntax as "BPREAD:TAKE 500000:BPWRITE," would allow these processes to work in parallel. However, the method has a disadvantage in that it consumes unnecessary resources to process the fitting, and adds management complexity since the writing application must be aware of the way in which the reading process accesses the data.
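The effect of such a "TAKE n" fitting can be sketched as a simple filter placed between the writer and the pipe media: only the first n records flow through, and the remainder are consumed and discarded. The function name and structure below are illustrative assumptions; this is not BatchPipeWorks code, only a model of what the fitting accomplishes.

```python
def take_fitting(records, n):
    """Pass at most n records downstream, modeling 'BPREAD:TAKE n:BPWRITE'."""
    for i, rec in enumerate(records):
        if i >= n:
            break           # stop feeding the pipe once n records have flowed
        yield rec

# A reader wanting only the first 5 of 1,000,000 records sees just 5.
piped = list(take_fitting(range(1_000_000), 5))
```

Note that the fitting still processes records on the writer's side of the pipe, which is the resource cost the passage above identifies.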
Existing solutions for this problem are less than cost effective since they either require the reader application to read the entire data set even though the reader only requires a portion of the data set, or they require that a subsequent process be scheduled to drain the data pipe. Since the balance of the data set may be unwanted, such solutions involve unnecessary processing which wastes resources.
Therefore, a need exists in the art for an enhanced data pipe processing capability which enables pipe-access support to selectively and transparently discard data to be written to a data pipe by a writer application, without requiring additional processes to be started and managed.
Briefly summarized, the invention comprises in one aspect a data piping method which includes: writing data via a writer application to a data pipe; reading data from the data pipe via at least one reader application; determining when a last reader application of the at least one reader application closes the data pipe before the writer application completes writing all data to the data pipe; and upon determining that the last reader application closes the data pipe before the writer application completes writing all data, preventing further writing of data to the data pipe by the writer application, wherein the preventing of further writing to the data pipe is transparent to the writer application.
In another aspect, the invention comprises a data piping system including a data pipe, a writer application and at least one reader application. The writer application writes data to the data pipe, while the at least one reader application reads the data from the data pipe. Means are provided for determining when a last reader application of the at least one reader application closes the data pipe before the writer application completes writing data to the data pipe, and for responding thereto by preventing future writing of data to the data pipe by the writer application. The preventing of future writing is transparent to the writer application.
In a further aspect, the invention comprises an article of manufacture which includes at least one computer useable medium having computer readable program code means embodied therein for causing the piping of data. The computer readable program code means in the article of manufacture comprises computer readable program code means for causing a computer to effect writing of data by a writer application to a data pipe and to effect reading of data by at least one reader application from the data pipe; as well as computer readable program code means for causing a computer to effect determining when a last reader application of the at least one reader application closes the data pipe before the writer application completes writing all data to the data pipe and responding thereto by transparently preventing further writing of data to the data pipe by the writer application.
To restate, presented herein is a technique for selectively dummying a data pipe transparently to a writer application, to enhance the ability of writer and reader applications to access a common data pipe in different ways without degrading performance, e.g., without loss of data or processing time, and without the additional resource consumption caused by an inadvertent early reader close. Advantageously, this technique eliminates the need for a drain step, either as an additional step within a consuming job or as a separate job. Therefore, fewer resources are used to accomplish the same function. This also means that it is easier to implement parallelism in customer environments, and facilitates transparent implementation since application changes and additional processes are unneeded. Further, the solution presented herein is specific to dummying the pipe media and allows a fitting associated therewith to still be driven. Also, the technique presented facilitates the automated parallelization inherent in IBM SmartBatch for OS/390 (referenced below). Significant processing time enhancement is attained.