The software code executing in remote or embedded devices often needs to be updated both during development of the embedded device and after the device has been delivered to the customer (post-issuance). Typically, several code updates are required during product development. The reliability and efficiency of updates during development of the embedded device affect the time required to develop the product. The reliability and efficiency of post-issuance updates are also important, because non-functioning devices typically must be returned to the device issuer, resulting in shipping-related costs and delays.
One type of update is performed to fix “bugs” or problems with the software. Another type of update is performed to include new product features, such as special functions tailored for particular customers. A code update may fail for a number of reasons, including operator error, power failure and other unexpected events. These failure events sometimes make the device inoperable. These failure events may also make the device appear to function normally despite the existence of a failure within the code space. This latent problem may manifest itself subsequently during normal operation, often resulting in an inoperable device. Therefore, it is important to detect such a failure prior to program execution.
Such failure conditions are typically detected by reading back each byte of code previously written and comparing it against an expected value. This process is explained in more detail with reference to FIG. 1.
Turning to FIG. 1, a flow diagram that illustrates a typical method for detecting corrupted code by comparing the value of each storage unit downloaded to the device with the value of the corresponding storage unit read back from the device is presented. At 100, the code is downloaded to the embedded device. At 105, a previously downloaded byte is read from the embedded device. At 110, the byte is compared to the expected value. At 115, a determination is made regarding whether the byte value matches the expected value. If the byte value does not match the expected value, an indication that the code is corrupted is made at 120 and the process ends at 135. If the byte value matches the expected value, at 125 a determination is made regarding whether there is another byte to check. This process continues until all bytes in the program have been checked. If all of the bytes have been checked and all of them match their expected values, an indication that the program is valid is made at 130.
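The read-back comparison of FIG. 1 can be sketched as follows. This is a minimal illustration only, assuming the expected image and the bytes read back from the device are both available as byte strings; the function names `find_first_mismatch` and `code_is_valid` are hypothetical and do not come from the described method.

```python
from typing import Optional


def find_first_mismatch(expected: bytes, read_back: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if the images match."""
    # Compare byte by byte, stopping at the first mismatch (steps 105-120).
    for offset, (want, got) in enumerate(zip(expected, read_back)):
        if want != got:
            return offset
    # Assumption: an image that is too short or too long is also treated as corrupted.
    if len(expected) != len(read_back):
        return min(len(expected), len(read_back))
    # Every byte matched its expected value (step 130).
    return None


def code_is_valid(expected: bytes, read_back: bytes) -> bool:
    """True when every byte read back matches the expected value."""
    return find_first_mismatch(expected, read_back) is None
```

Note that the check needs the complete expected image for the specific software version being verified, and it visits every byte when the code is valid.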
Unfortunately, this method of reading back every byte and comparing it to an expected value is time-consuming. Additionally, the expected values for particular bytes may change with each software version, thus requiring special knowledge about each software version.
An improvement is made possible by using a checksum. This process of using a checksum is explained in more detail with reference to FIG. 2.
Turning now to FIG. 2, a flow diagram that illustrates a typical method for using a checksum to detect corrupted code is presented. At 200, the code is downloaded to the embedded device. At 205, a checksum is initialized. At 210, a byte previously downloaded is read. At 215, the byte value is added to the checksum. At 220, a determination is made regarding whether there is another byte to check. This process continues until the value of each program byte has been added to the checksum. At 225, the calculated checksum is compared to the expected checksum. If the calculated checksum and the expected checksum are not the same, an indication that the code is corrupted is made at 230 and the process ends at 240. If the calculated checksum and the expected checksum match, an indication that the code is valid is made at 235.
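The checksum method of FIG. 2 can be sketched as follows. This is an illustration under stated assumptions: the text says only that each byte value is added to the checksum, so the 16-bit width (`modulus`) and the function names `additive_checksum` and `checksum_is_valid` are hypothetical choices made for the example.

```python
def additive_checksum(image: bytes, modulus: int = 1 << 16) -> int:
    """Sum every byte of the image, wrapping at an assumed 16-bit width."""
    checksum = 0                      # step 205: initialize the checksum
    for byte_value in image:          # steps 210-220: read each byte in turn
        checksum = (checksum + byte_value) % modulus   # step 215: add it
    return checksum


def checksum_is_valid(image: bytes, expected_checksum: int) -> bool:
    """Step 225: compare the calculated checksum to the expected checksum."""
    return additive_checksum(image) == expected_checksum
```

Unlike the byte-by-byte comparison, only a single expected value has to be known for each software version, but every byte of the program is still read.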
Unfortunately, this checksum method requires reading back each byte, adding it to the checksum and comparing the resulting checksum against an expected value. While this method typically takes less time than reading and comparing every byte to an expected value, the method is still time-consuming. Additionally, the checksum method sometimes fails to detect corrupt code, because different byte sequences can produce the same sum. In these cases, the calculated checksum of corrupted code matches the expected checksum, causing corrupt code to be used.
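A small example shows how a simple additive checksum can miss corruption. The byte values below are made up for illustration, and `byte_sum` is a hypothetical helper that assumes a 16-bit sum of byte values.

```python
def byte_sum(image: bytes) -> int:
    """Additive checksum: sum of all byte values, wrapped to an assumed 16 bits."""
    return sum(image) % 65536


good = bytes([0x10, 0x20, 0x30, 0x40])

# Two bytes written in the wrong order: the sum does not depend on order.
swapped = bytes([0x20, 0x10, 0x30, 0x40])

# Two errors that cancel: one byte is one too high, another is one too low.
compensated = bytes([0x11, 0x1F, 0x30, 0x40])

for corrupted in (swapped, compensated):
    # The images differ, yet their checksums are identical.
    print(corrupted != good, byte_sum(corrupted) == byte_sum(good))
```

In both cases the image differs from the good one, yet the calculated checksum equals the expected checksum, so the corruption would go undetected.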
Once the determination that the code is corrupted is made, other code that is not corrupt must be executed. One typical solution is to provide complete redundancy by maintaining two copies of the code. One copy is typically maintained in “boot” flash, while the other is maintained in “main” flash. This is illustrated by FIG. 3.
Turning now to FIG. 3, a flow diagram that illustrates a method for updating code on an embedded device using separate copies of the program in boot flash memory and main flash memory is presented. At 300, the code is downloaded. At 305, a first copy of the program code is maintained in main flash memory and a second copy of the program code is maintained in boot flash memory. At 310, execution begins from boot flash memory. At 315, a determination is made regarding whether main flash memory is corrupted. If main flash memory is corrupted, the copy of the program code in boot flash memory is executed at 320. If main flash memory is not corrupted, a copy of the program code in main flash memory is executed at 325.
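The selection step of FIG. 3 can be sketched as follows. This is a minimal illustration, assuming the two copies are available as byte strings and that some corruption test is supplied by the caller; `select_program` and `is_corrupted` are hypothetical names, and the text does not specify which corruption test is used at 315.

```python
from typing import Callable


def select_program(
    main_flash: bytes,
    boot_flash: bytes,
    is_corrupted: Callable[[bytes], bool],
) -> bytes:
    """Return the copy of the program code that should be executed."""
    # Step 315: determine whether the copy in main flash is corrupted.
    if is_corrupted(main_flash):
        # Step 320: fall back to the copy kept in boot flash.
        return boot_flash
    # Step 325: the copy in main flash is intact, so execute it.
    return main_flash
```

The boot flash copy is only a useful fallback if it is itself intact, which is why this approach keeps a complete second copy of the code.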
This solution provides complete redundancy so that one copy of the code may be executed when the other is corrupted. However, this redundancy also requires twice the memory, thus constraining the maximum program size in embedded devices that follow this approach.
Other solutions provide additional levels of redundancy, but this additional redundancy typically comes at the expense of even higher memory requirements, further constraining the maximum program size.
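The memory cost of redundancy can be made concrete with a rough calculation. This sketch assumes the available flash is divided equally among the copies and ignores any other overhead; the function name `max_program_size` and the 64 KiB flash size are hypothetical and chosen only for the example.

```python
def max_program_size(total_flash_bytes: int, copies: int) -> int:
    """Largest program that fits when `copies` full copies share the flash equally."""
    if copies < 1:
        raise ValueError("at least one copy of the program is required")
    return total_flash_bytes // copies


total = 64 * 1024                      # hypothetical 64 KiB of flash
print(max_program_size(total, 1))      # no redundancy: 65536 bytes
print(max_program_size(total, 2))      # boot + main copies: 32768 bytes
print(max_program_size(total, 3))      # one more level of redundancy: 21845 bytes
```

Each additional full copy shrinks the largest program that can be stored in the same amount of memory.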