1. Field of the Invention
The present invention relates generally to error correction coding for data storage systems such as flash memory.
2. Description of the Related Art
Many flash memories have a page-accessible interface: the flash array is arranged into “pages”, the page being the unit of host data access, and the host transfers data to and from the flash memory one page at a time. The flash array is further arranged into “blocks”, the block being the unit of data erasure. When activated by the host, the flash memory performs an “erase” operation, which forces all bits in a specified block to a default value, which is ‘1’ (one) in most flash memories. Usually, the block is a larger unit than the page and contains a plurality of pages. For the host to write new data to an already written page, the block containing the page must first be erased. If the host writes only portions of a page, all bits in the unwritten portions of the page retain the default value. In the present invention, “writing a page” refers to the operation of the host sending data to the flash memory and the flash memory programming the data into the flash array of a specified page. Therefore, the terms “programming a page” and “writing a page” are used interchangeably. An “unwritten page” refers to a page which has been erased but has not been written with host data. The term “default value” refers to the value of a data unit, such as a bit or a symbol, in an unwritten page. As an example, the default value of a bit is one in the present invention, which may be easily modified for a default value of zero. Further, although the present invention uses the flash memory as an example of the data storage medium in the host system, it is obvious to those skilled in the art that the principles of the present invention apply to any data storage medium which, when unwritten, has a uniform default value.
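The page, block and default-value terminology above can be illustrated with a small simulation. The following is a minimal sketch only: the class name `FlashSim`, its methods and the array dimensions are hypothetical, and the rule that programming can only change bits from one to zero is an assumption of the model, chosen to be consistent with the requirement that a block be erased before a written page is rewritten.

```python
ERASED_BYTE = 0xFF  # eight bits, all at the default value '1'


class FlashSim:
    """Toy page-accessible flash array (hypothetical model, not a device API)."""

    def __init__(self, blocks=2, pages_per_block=4, page_size=8):
        self.pages_per_block = pages_per_block
        self.page_size = page_size
        # Start fully erased: every page holds only default-value bits.
        self.pages = [bytearray([ERASED_BYTE] * page_size)
                      for _ in range(blocks * pages_per_block)]

    def erase_block(self, block):
        """Force all bits of every page in the block to the default value."""
        start = block * self.pages_per_block
        for page in range(start, start + self.pages_per_block):
            self.pages[page] = bytearray([ERASED_BYTE] * self.page_size)

    def write_page(self, page, data, offset=0):
        """Program host data into a page; untouched bytes keep the default."""
        if offset + len(data) > self.page_size:
            raise ValueError("data does not fit in the page")
        for i, byte in enumerate(data):
            # Model assumption: programming can only clear bits (1 -> 0).
            self.pages[page][offset + i] &= byte

    def read_page(self, page):
        return bytes(self.pages[page])

    def is_unwritten(self, page):
        """True if every bit of the page still has the default value."""
        return all(b == ERASED_BYTE for b in self.pages[page])
```

In this model, writing two bytes to an erased page leaves the remaining bytes at 0xFF, and erasing the containing block returns every page in that block to the unwritten state.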
Error correction codes (ECC) are often used with flash memories to protect the integrity of the stored data against data-corrupting conditions such as storage medium defects and random read errors. Linear block codes are a class of error correction codes that is often used and is the focus of the present invention.
A “symbol” refers to a data unit of a fixed number of bits. Symbols of m bits can be uniquely represented by elements of a Galois Field (or GF) of order 2^m. Once a particular GF is selected to represent the symbols, the arithmetic operations between any two symbols are defined by the GF. In the vector representation, each GF element is represented by a vector of m bits, and the GF addition and subtraction operations between any two elements are equivalent to bitwise exclusive OR (XOR) of their corresponding vectors. In the present invention, an addition operation, indicated as a plus sign ‘+’ in the drawings, between two symbols or between two sequences of symbols refers to bitwise XOR between two symbols or two sequences of symbols, respectively.
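As an illustration of the addition operation defined above, the following sketch represents each m-bit symbol as a non-negative integer whose binary digits form the m-bit vector. The function names `gf_add` and `gf_add_seq` are hypothetical and are introduced only for this example.

```python
def gf_add(a, b):
    """Add two symbols of GF(2^m) in the vector representation (bitwise XOR)."""
    return a ^ b


def gf_add_seq(xs, ys):
    """Add two equal-length sequences of symbols, symbol by symbol."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    return [gf_add(x, y) for x, y in zip(xs, ys)]
```

Because XOR is its own inverse, the same function also serves as subtraction, and adding any symbol to itself yields the zero symbol.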
Encoding of an (n, k) linear block code means mapping a sequence of k message symbols into another sequence of n symbols, where n is greater than k. The resultant sequence of n symbols is commonly referred to as the “code word”. Encoding methods are generally divided into two categories, namely the systematic and the non-systematic encoding. With systematic encoding, the message appears in the code word itself, occupying the first k symbols of the code word. The other n-k redundant symbols are commonly referred to as the “parity symbols” or “parity”. On the other hand, with non-systematic encoding, the message does not necessarily appear in its corresponding code word.
FIG. 1 illustrates a generalized ECC scheme in a host system (101) using the flash memory as the storage medium. The flash memory interface (106) sends and receives signals to and from the flash memory to operate the flash memory according to the interface protocol of the flash memory. In a flash memory write, the data source (102) outputs data in the form of “messages” to the Encoder (104), which encodes each message into a code word. The code words are sent to the flash memory interface, which writes the code words to the flash memory (107). In a flash memory read, the flash memory interface (106) reads from the flash memory the code words stored therein and sends the read code words to the Decoder (105). The Decoder recovers the message from each read code word and sends the recovered message to the data destination (103). Due to the data-corrupting conditions in the flash memory, a code word read from the flash memory may contain errors.
FIG. 2 illustrates the systematic and the non-systematic encoding from a mathematical viewpoint. Let polynomials M(X), p(X) and C(X) represent the k-symbol message, the (n-k)-symbol parity and the corresponding n-symbol code word, respectively. With systematic encoding, the encoder (201) takes M(X) (202) as input, computes p(X) (204), and appends p(X) to X^(n-k)M(X) (203) to form the code word C(X) = X^(n-k)M(X) + p(X). Multiplying M(X) by X^(n-k) is equivalent to shifting M(X) left by n-k symbols. With non-systematic encoding, the encoder (205) takes M(X) (206) as input and computes C(X) (207) directly.
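The systematic construction C(X) = X^(n-k)M(X) + p(X) can be sketched for the simplest case of one-bit symbols (m = 1). The text above does not state how p(X) is computed. The sketch assumes a cyclic code in which p(X) is the remainder of X^(n-k)M(X) divided by a generator polynomial g(X). The (7, 4) code with g(X) = X^3 + X + 1 and the function names are likewise assumptions made for illustration.

```python
def poly_mod(dividend, divisor):
    """Remainder of GF(2) polynomial division.

    Polynomials are integers; bit i is the coefficient of X^i.
    """
    divisor_len = divisor.bit_length()
    while dividend.bit_length() >= divisor_len:
        # Cancel the leading term of the dividend (GF(2) subtraction is XOR).
        dividend ^= divisor << (dividend.bit_length() - divisor_len)
    return dividend


def encode_systematic(message, n, k, generator):
    """Return C(X) = X^(n-k) M(X) + p(X) for a binary cyclic code."""
    shifted = message << (n - k)            # X^(n-k) M(X): shift left by n-k
    parity = poly_mod(shifted, generator)   # p(X), under the assumed rule
    return shifted ^ parity                 # append parity below the message


# Assumed example: (7, 4) code, g(X) = X^3 + X + 1, message 1101.
G = 0b1011
codeword = encode_systematic(0b1101, n=7, k=4, generator=G)
print(format(codeword, "07b"))  # prints 1101001
```

The upper four bits of the code word reproduce the message unchanged, which is the defining property of systematic encoding. In this assumed construction the lower three bits are the parity, and the whole code word is divisible by g(X).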
In some applications, the host may read from an unwritten page. If the host has prior knowledge that the page to be read is unwritten and thus does not contain a valid ECC code word, then the host can disable the ECC decoder while reading the page. However, random bit errors can occur while the host reads the page, and these errors cannot be detected or corrected with the ECC decoder disabled. If the host does not have prior knowledge that the page to be read is unwritten, and thus does not disable the ECC decoder while reading the page, then, since the page has not been written with valid code words, the decoder would perform erroneous corrections to the page data even if the page is read without any random bit errors. Therefore, it would be advantageous to devise an ECC scheme whereby data in an unwritten page, as with written pages, is under the protection of ECC when read by the host, such that the host may read any page without prior knowledge as to whether the page has been written and without disabling the ECC decoder for the page read.
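The difficulty with decoding an unwritten page can be seen in a small binary example. The sketch below is an illustration under stated assumptions and is not the scheme of the present invention. It assumes a (7, 3) cyclic code with generator polynomial g(X) = X^4 + X^3 + X^2 + 1 and treats the remainder of a received word modulo g(X) as its syndrome. Whether the all-ones word is a valid code word depends on the code chosen. In this assumed code it is not.

```python
def poly_mod(dividend, divisor):
    """Remainder of GF(2) polynomial division; bit i is the X^i coefficient."""
    divisor_len = divisor.bit_length()
    while dividend.bit_length() >= divisor_len:
        dividend ^= divisor << (dividend.bit_length() - divisor_len)
    return dividend


def syndrome(word, generator):
    """Zero if and only if the word is a code word of the assumed cyclic code."""
    return poly_mod(word, generator)


N, K = 7, 3
G = 0b11101  # assumed g(X) = X^4 + X^3 + X^2 + 1

# A written code word: message 101 encoded systematically.
shifted = 0b101 << (N - K)
written = shifted ^ poly_mod(shifted, G)

# An unwritten word read without any errors: every bit at the default '1'.
unwritten = (1 << N) - 1

print(syndrome(written, G))    # prints 0
print(syndrome(unwritten, G))  # prints 11 (non-zero)
```

A non-zero syndrome for an error-free unwritten word means that a decoder left enabled would treat the default data as corrupted and attempt to correct it, which is the erroneous correction described above.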
Since, with such an ECC scheme, the host is not required to distinguish written pages from unwritten pages, it would be advantageous to devise an apparatus which is capable of determining whether or not a page is unwritten after the page is read from the flash memory. For the apparatus to work properly, it is important that the unwritten pages be under the protection of the ECC, such that the probability of random bit errors adversely affecting the result reported by the apparatus is minimized.