
Application and Configuration checksum errors explained (KB1027)

The Neuron chip states of Applicationless and Unconfigured due to checksum failures are explained in the Neuron 5000 and 6000 Users Guide. See the section on Application Integrity Checksums. However, what can cause these states in a device that seems to have been working perfectly well? This series of questions and answers is taken from just such projects and provides guidance on what to look for as the cause and remedy.

These guidance notes were prepared in response to the following:

 

Question: I have been working with a customer that has been experiencing EEPROM corruption problems on an older 3120 or 3150 Neuron chip. At this point we have done a formal design review and have suggested possible problems with their design. The customer believes that part of the problem may be due to improper power supply design. He is concerned that fast transients may be getting through the supply and creating bad memory reads during initial checksumming, which ultimately causes his nodes to go Applicationless. Before making any corrections to the design, several questions need to be answered:

 

Answer) They are using a Dallas Semiconductor DS1233 LVI. Dallas states, below, that a transient on VCC must last at least 50us to be caught by the LVI. In our design experience at Echelon with the LVI, is this reasonable, or have we seen cases where longer VCC transients have not been caught? This is important when redesigning the power supply, so that it properly blocks any transient faster than the LVI can detect.

 

"The minimum tf of 300us was specified to insure that the RST! output goes active before Vcc has dropped below 3V, based on the delay from Vcc crossing the trip point to RST! active. For a negative going transient on Vcc which just reaches the trip point, the DS1233 may not respond to voltages at the trip point for less than 50us."

 

The timing feature that Echelon cares about for the DS1233 pulse-stretching LVI is that it be able to catch the ~1us wide RESET- output pulse from a software reset of the Neuron chip, and turn that into a longer reset pulse that is reliably seen by other circuitry. Dallas doesn't specify in their datasheet the minimum "pushbutton" RESET- pulse width that the DS1233 will detect; they just talk about detecting a low-going edge. In the DS1813 datasheet (the 1813 is a newer version of the 1233), they specify a minimum RESET- pulse width of 1us.
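When reviewing a scope capture of VCC against the 50us figure quoted above, a small script can flag dips that are too short for the LVI to be relied upon. The following is a minimal sketch only: the function name, the sample format and the 4.5 V trip voltage in the usage line are hypothetical placeholders, and the real trip point must be taken from the datasheet of the part fitted.

```python
def short_dips(samples, trip_v, min_detect_us=50.0):
    """Find VCC dips that are shorter than the LVI's detection time.

    samples: list of (time_us, volts) tuples in time order (hypothetical format).
    trip_v: LVI trip voltage, to be taken from the datasheet (not assumed here).
    Returns a list of (start_us, width_us) for every dip at or below trip_v
    that ended in less than min_detect_us. A dip still in progress at the end
    of the capture is ignored because its width is unknown.
    """
    dips = []
    start = None
    for t, v in samples:
        if v <= trip_v:
            if start is None:
                start = t          # dip begins
        elif start is not None:
            width = t - start      # dip ended at this sample
            if width < min_detect_us:
                dips.append((start, width))
            start = None
    return dips


# Usage with made-up data and a placeholder trip voltage of 4.5 V.
capture = [(0, 5.0), (10, 4.4), (30, 5.0), (100, 4.4), (200, 5.0)]
print(short_dips(capture, trip_v=4.5))
```

Any dip reported here is one that the power supply itself would need to block, since the LVI may not respond to it.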

  

 

Question: If a device indicates it is in the applicationless state, is it necessarily defective?

 

Answer: No, but it is an indication of a device failure. Often it is caused by a checksum test failure, which in turn can be caused by bad memory or an incorrect memory map.

If one device goes applicationless more than once all by itself, you need to understand the root cause. In this case, it is likely that the device has a memory problem.

 

There are also situations when a system failure could leave a device applicationless. For example, the network connection to the device could fail (a router failure or a power failure) in the middle of an application download to the device or at a certain stage of commissioning the device. This failure is not directly related to the device, but it will leave the device applicationless. If this temporary condition is relieved, the next reload should succeed.

 

There are many nuances in checksum behaviour, depending on the image export options used in NodeBuilder, the physical configuration, etc. Here is a concise answer that applies to a 3150 Neuron with firmware v6 and should provide some guidance.

 

Note: If you have an old version of the original Motorola Neuron Chip Data Book, then page 3-13, or section 3.2.6 Memory Integrity, also defines what the Configuration image, Application image and System image are, and what happens if the checksum in one of these memory areas fails.

LonSupport can provide .pdf copies of this DataBook.

 

Q: It is our understanding that two different checksums are performed, an Application sum and a Configuration sum. When the Neuron does a checksum on power-up or a reset, what exactly happens? Which occurs first?

If both fail, what state does the Neuron go into and what error is logged?

  

A: The configuration checksum goes first; the application checksum goes next. If the configuration checksum fails, by default the node goes unconfigured. You may alter this behaviour by changing the export options.

 

 

Here is the sequence:

 

  1. Configuration checksum. If it fails, the device will either repair the checksum and go unconfigured, or repair the checksum and preserve its state, depending on whether the configuration recovery flag is set.

  

  2. Application checksum. This one actually consists of several checksums, depending on the physical configuration of the node; see page 3-13 of the Motorola data book. First, the system image is checksummed (this is optional!). A checksum error in the system image forces the node to the applicationless state. Second, other areas are checksummed. Here the behaviour on failure depends on the recovery flag. If the recovery flag is not set, the node state is set to applicationless. Behaviour when the recovery flag is on is discussed on page 3-15 of the Motorola data book.
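The sequence above can be sketched as a small decision function. This is an illustrative model only, not Neuron firmware: the function name, parameter names and the NodeState enum are hypothetical, the node is assumed to start out configured, and a set configuration recovery flag is assumed to be the case that preserves the state. The application-side behaviour with the recovery flag on is not modelled, because it is described only in the data book.

```python
from enum import Enum


class NodeState(Enum):
    CONFIGURED = "configured"
    UNCONFIGURED = "unconfigured"
    APPLICATIONLESS = "applicationless"


def boot_checksum_sequence(config_ok, system_ok, app_ok,
                           config_recovery=False, app_recovery=False,
                           check_system_image=True):
    """Model the reset-time checksum order described in the text.

    Each *_ok argument says whether that checksum (after its retry) passed.
    Returns (NodeState, list of notes).
    """
    state = NodeState.CONFIGURED   # assumption: node was configured before reset
    notes = []

    # Step 1: configuration checksum is always repaired on failure.
    if not config_ok:
        notes.append("configuration checksum repaired")
        if not config_recovery:    # assumption: flag set means state is preserved
            state = NodeState.UNCONFIGURED

    # Step 2a: optional system image checksum.
    if check_system_image and not system_ok:
        notes.append("system image checksum failed")
        return NodeState.APPLICATIONLESS, notes

    # Step 2b: remaining application areas.
    if not app_ok:
        if not app_recovery:
            notes.append("application checksum failed")
            return NodeState.APPLICATIONLESS, notes
        notes.append("application checksum failed; recovery behaviour not modelled")

    return state, notes


# Example: both configuration and application checksums fail, no recovery flags.
print(boot_checksum_sequence(config_ok=False, system_ok=True, app_ok=False))
```

In this model a node whose configuration and application checksums both fail ends up applicationless, because the application check runs second and overrides the state.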

 

Q: How many checksum failures does it take to go Applicationless? If it is two, what happens if there is a reset between the two?

 

Answer: It takes two failures. The exact meaning is: when a checksum fails, the firmware immediately starts a new, second check. If it fails as well, the checksum is considered failed, with the consequences as described above.

 

Q: What happens if there is a reset between each checksum?

 

Answer: Since the sequence was incomplete and the applicationless state was not forced, upon reset the firmware will start the checksum sequence again, with no knowledge of previous attempts.

 

   Q: Do the checksum failures need to be consecutive or cumulative?

Answer: Consecutive.
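The two-failure rule in the answers above can be illustrated with a few lines of code. This is a hedged sketch, not firmware source: `checksum_failed` and the `verify` callable are hypothetical names. Note that no counter is kept outside the function, which mirrors the point that a reset between the two checks simply restarts the sequence from scratch.

```python
def checksum_failed(verify, attempts=2):
    """Return True only if `verify` fails `attempts` times back to back.

    verify: a callable returning True when the checksum matches.
    A single pass at any attempt ends the check with a "not failed" result.
    """
    for _ in range(attempts):
        if verify():
            return False
    return True


# A transient bad read followed by a good read is not treated as a failure.
reads = iter([False, True])
print(checksum_failed(lambda: next(reads)))

# Two consecutive bad reads are treated as a failure.
reads = iter([False, False])
print(checksum_failed(lambda: next(reads)))
```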

  

Q: After the node goes applicationless, can we determine whether the EEPROM has been corrupted as well? One method would be to read out the EEPROM on a good node and compare it to the values in the Applicationless node. However, can you get the values out of a node that is Applicationless?

  

A: You can use memory reads, since the firmware is presumably intact and you can use network management commands. A corrupted configuration table is not a problem, since the network management commands use Neuron ID addressing. One tool that can do the job is NodeUtil.
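Once the memory of a good node and of the applicationless node has been read out and saved, the comparison suggested in the question can be automated. The sketch below assumes the two dumps are already available as byte strings of equal length; the function name and the base address in the usage lines are hypothetical, and nothing here talks to a node or to NodeUtil.

```python
def diff_ranges(good, suspect, base_addr=0):
    """Return inclusive (start_addr, end_addr) ranges where two dumps differ.

    good, suspect: bytes objects of equal length read from the same
    address range on each node. base_addr is the address of byte 0.
    """
    if len(good) != len(suspect):
        raise ValueError("dumps must cover the same address range")
    ranges = []
    start = None
    for i, (a, b) in enumerate(zip(good, suspect)):
        if a != b:
            if start is None:
                start = i                      # first differing byte of a run
        elif start is not None:
            ranges.append((base_addr + start, base_addr + i - 1))
            start = None
    if start is not None:                      # run continues to end of dump
        ranges.append((base_addr + start, base_addr + len(good) - 1))
    return ranges


# Usage with made-up data and a placeholder base address.
good = bytes([0, 1, 2, 3, 4, 5])
bad = bytes([0, 9, 9, 3, 4, 7])
for lo, hi in diff_ranges(good, bad, base_addr=0xF000):
    print(f"differs: 0x{lo:04X}-0x{hi:04X}")
```

Some bytes can legitimately differ between two nodes, for example per-device configuration data, so the reported ranges are a starting point for inspection rather than proof of corruption.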

 

For a 5000 or 6000 Neuron chip, the external memory contents should be read back via the memory programmer. The configuration data structure is contained within the first few hundred bytes, starting from location 0x000 of the physical serial memory.

 
