PCB Hardware Testing – diagnostics and remedy

Over the last few days I have been investigating why my hardware does not seem to work.  There are a few software “modules” that are supposed to run “out of the box” from their authors’ HEX file builds.  Thus far, I have downloaded and attempted to run PAULMON2.1, MCS-51 BASIC (v1.1, v1.2 and v1.31), KEIL’s “MON51” and the NMIY-0031 support files.  None of these ran successfully, which I found very odd.

Having no success with pre-assembled HEX files, I found some simple program examples on the internet for doing things like toggling port bits and emitting simple text messages from the UART.

I took the time to reacquaint myself with the 805x instruction set, to better familiarize myself with KEIL’s A51 assembler and learn how to modify some of these sample programs I had obtained.  KEIL’s “A51” assembler varies in syntax from the “asem51”, “as31”, “as8051” (Linux) assemblers and the “A51” DOS assembler so most of them needed modification, including the PAULMON program.

One of the things I discovered along the way was that my address decoder, which was using a GAL20V8Z, was not working correctly for the ROM OE pin.  I knew the simulations were correct, and after once again looking at the Boolean logic equations resulting from my CUPL code, which also STILL looked correct, I was not sure why there was a problem.  But then I recalled that my TL-866 programmer did not directly support the “Z” suffix of those PLDs, only the “A”, “B”, “C” and “D” suffixes.  So I concluded that was the issue.  I later recalled that I had run into this same problem a few weeks earlier when I was working on the initial address decoder for the solder-less breadboard version of the Z80 AVR project, and that’s why I switched to an AT16V8B for that project.  Since the AT16V8B is housed in a 20-pin DIP package and the GAL20V8Z is in a 24-pin DIP package, I decided to target a GAL22V10D instead.  A slight modification in target device and the CUPL code compiled without a hitch.  I programmed the GAL22V10D and inserted it into the PCB, and the ROM CS and OE functions then worked as they were supposed to, but I still had issues getting some code to work properly on the 8052.

After the PLD fix, I was able to get some of the sample programs to assemble using KEIL’s “C51 lite” IDE and they ran properly in the simulator.  I might add that PAULMON 2.1 also ran in the simulator.  I was able to get the port-bit toggling and the UART programs to run on my hardware, which made it even more confusing as to what was the actual issue … and the culprit since the PLD was only one part of the issue.

I further pondered and investigated the dysfunctional hardware issue but eventually got tired of hardware trouble-shooting with software that seemed to always end with no conclusive (and positive) results.

I decided it was time to break out the logic analyzer and start looking at the execution order of the program code.  In my investigation, I could see that there were two distinct values on the data bus: one coincided with the falling edge of the ALE signal, which was the multiplexed lower 8 address bits, and the other was the actual data.  It’s one thing to see the graphic illustrations of the waveforms in the datasheets but quite another to see it on a logic analyzer with the hardware in action.  Since I triggered the capture on the falling edge of the RESET signal, code execution started at the RESET vector address of 0x0000.
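The demultiplexing seen on the analyzer can be sketched in a few lines of Python.  This is only a minimal sketch, assuming the capture has been exported as a list of (ALE, P0, P2) samples.  That tuple format, the sample values and the function name `latched_addresses` are made up for illustration and are not the output of any particular analyzer.

```python
def latched_addresses(samples):
    """Rebuild 16-bit addresses from (ale, p0, p2) samples.

    On the 8051 family, port 0 carries the low address byte while ALE is
    high and port 2 carries the high address byte.  An external latch
    captures the low byte on the falling edge of ALE, so the last sample
    taken while ALE was still high holds the address.
    """
    addresses = []
    prev = None
    for ale, p0, p2 in samples:
        # Falling edge: previous sample had ALE high, this one has it low.
        if prev is not None and prev[0] == 1 and ale == 0:
            addresses.append((prev[2] << 8) | prev[1])
        prev = (ale, p0, p2)
    return addresses


# Hypothetical capture: address on P0 while ALE is high, data afterwards.
capture = [
    (1, 0x00, 0x00), (0, 0x02, 0x00),
    (1, 0x01, 0x00), (0, 0x08, 0x00),
    (1, 0x02, 0x00), (0, 0x60, 0x00),
]
print([hex(a) for a in latched_addresses(capture)])
```

The same loop could also collect the port 0 value seen while ALE is low to recover the data bytes that follow each address.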

The capture shows that the address does in fact increment from 0x00 to 0x01 and then 0x02.  The first three bytes of code contain an LJMP instruction, which is 3 bytes long.  The target address is 0x0860, which is correct per the assembly listing of the code.

0000 020860          ljmp    poweron         ;reset vector
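To check bytes read off the bus against a listing line like the one above, the LJMP encoding is easy to decode by hand or in code: opcode 0x02 followed by the high and then the low byte of the target.  A small sketch (the function name `decode_ljmp` is my own, for illustration):

```python
def decode_ljmp(code: bytes) -> int:
    """Return the 16-bit target of an 8051 LJMP (opcode 0x02, addr high, addr low)."""
    if len(code) < 3 or code[0] != 0x02:
        raise ValueError("not an LJMP instruction")
    return (code[1] << 8) | code[2]


# The three bytes at the reset vector, as shown in the listing.
target = decode_ljmp(bytes.fromhex("020860"))
print(f"LJMP target: 0x{target:04X}")
```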

So at least the 8052 is executing code from the program memory starting from address 0x0000 but it still does not explain the mystery of why some code functions and others do not.

One of the programs I found online was a simple program memory (ROM) dumper written in assembly language.  With some slight modifications, like adding UART init and dumping 256 bytes at a time, it assembled in 189 bytes.  I programmed the system EEPROM and plugged it in.  It also worked as expected, but in looking at the dump of program memory contents, I could see that the first page of program memory contained the dump program and, to my surprise, so did the 2nd 256-byte page.  Every sequential page dump from there on yielded 0xFF, which is the correct erased EEPROM state.  Okay, that makes sense, but why the duplicate code in the 2nd page?  My first thought was that my memory decoder logic was STILL wrong, but in looking at the simulations again and the resulting Boolean equations, they were too simple to be incorrect and they had nothing to do with the lower 12 bits of the address bus.
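Spotting a repeated page by eye works for a 189-byte program, but the comparison is easy to automate once the dump has been captured to a file.  The sketch below is not the dumper itself; it is a hypothetical helper (`find_shadow_pages` is my own name) that assumes the dump is available as raw bytes and reports 256-byte pages that repeat an earlier page, ignoring erased (all 0xFF) pages.

```python
from typing import List, Tuple

PAGE_SIZE = 256


def find_shadow_pages(image: bytes) -> List[Tuple[int, int]]:
    """Return (first_page, repeated_page) pairs for identical non-erased pages."""
    erased = b"\xff" * PAGE_SIZE
    first_seen = {}
    shadows = []
    for start in range(0, len(image), PAGE_SIZE):
        page = image[start:start + PAGE_SIZE]
        index = start // PAGE_SIZE
        # Skip erased pages and a short trailing fragment.
        if len(page) < PAGE_SIZE or page == erased:
            continue
        if page in first_seen:
            shadows.append((first_seen[page], index))
        else:
            first_seen[page] = index
    return shadows


# Stand-in for the observed dump: a 189-byte program padded to one page,
# appearing twice, followed by erased pages.
program_page = bytes(range(189)) + b"\xff" * (PAGE_SIZE - 189)
dump = program_page * 2 + b"\xff" * (2 * PAGE_SIZE)
print(find_shadow_pages(dump))
```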

ROM_CS => !a15 & !psen_l
ROM_OE => !a15 & !psen_l
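Read as plain Boolean logic, both equations say the same thing: the ROM is enabled only when A15 is low and PSEN is asserted.  A quick sketch of that truth function, assuming `psen_l` is the active-low PSEN signal as the `_l` suffix suggests (the Python function name is mine):

```python
def rom_selected(addr: int, psen_l: int) -> bool:
    """True when the decoder asserts ROM_CS/ROM_OE: A15 low and PSEN low."""
    a15 = (addr >> 15) & 1
    return a15 == 0 and psen_l == 0


# Only A15 and PSEN matter, so A8 and A9 cannot change the result.
for addr in (0x0000, 0x0100, 0x0200, 0x7FFF, 0x8000):
    print(f"0x{addr:04X}", rom_selected(addr, psen_l=0))
```

Since none of the lower address lines appear in the equations, a fault that depends on A8 or A9 has to live somewhere other than the PLD.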

Well, something was certainly amiss in the address decoding but what?  I turned to the schematic and after only a few seconds, I saw it!  I had mistakenly labeled the 28C64’s “A9” pin with the “A8” net.  Sure enough, the faithful PCB layout program did as I asked in the netlist and connected pin U4-24 (A9) to pin U4-25 (A8).  This explained the duplicate code in even and odd pages since A8 is the 256-byte page boundary and A9 is the 512-byte page boundary.  I am still at a loss as to how I missed that on the schematic, as many times as I have scoured over it for mistakes before moving to PCB layout.  Then … how could I have missed two pins connected together on the EEPROM symbol when I scoured the PCB design before I released the PCBs for fabrication?  I’ll never know but the fact remains that there is an artwork error that needs correcting before I can move forward on this project.
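The effect of two address pins tied together can be modeled directly.  The sketch below is a simplified model, not a description of the exact board: it assumes the EEPROM’s A8 and A9 pins both receive a single CPU address line, and it takes which line that is as a parameter (`driver_bit`) rather than asserting it.  The function name is hypothetical.

```python
def eeprom_address(cpu_addr: int, driver_bit: int) -> int:
    """Address the EEPROM sees when its A8 and A9 pins are shorted together
    and both driven by CPU address line `driver_bit` (8 or 9)."""
    level = (cpu_addr >> driver_bit) & 1
    addr = cpu_addr & ~0x0300          # clear the A8 and A9 positions
    if level:
        addr |= 0x0300                 # both pins follow the one driver
    return addr & 0x1FFF               # a 28C64 has 13 address pins


def pages_seen(driver_bit: int, count: int = 4):
    """EEPROM page returned for each of the first `count` CPU pages."""
    return [eeprom_address(page << 8, driver_bit) >> 8 for page in range(count)]


print("driven by A9:", pages_seen(9))
print("driven by A8:", pages_seen(8))
```

Either way, only two of every four 256-byte pages in the EEPROM are reachable, so any program longer than one page will eventually fetch the wrong bytes.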

The next step was to figure out how to correct the mistake on my built PCB with a minimum of rework.  The fix was simple enough: cut the A8/A9 connection between pins 24 and 25 as well as the A8 net connected to the A9 pin, then reconnect pin “A8” to the A8 net and pin “A9” to the A9 net.  That is two trace cuts and two wires.  It seemed reasonable to use the RAM chip “right down the road” for the connection points but the 8052 would have been just as easy.  A blank PCB would be easy to rework as I could run the two KYNAR wires needed right under the ICs (or sockets) to both hide and affix them securely.  However, the A8/A9 trace needed to be cut and it was on the top layer of the PCB, under the 28-pin socket for the 28C64.  I had to remove that socket, hopefully with a clean de-soldering job that would not result in any of the through-hole plating getting ripped out in the process.  I was successful in that only two of the 28 pins had a very small finger of solder holding them firmly in place.  A little more soldering and de-soldering and they were clean enough to pull without causing any damage.  From there, I found a via to use for one point of the A8 connection.  I like to use vias for rework wiring because they make a nice solder pad to insert the rework wire into.  The other end I threaded through the A8 pin hole.  I remounted the 28-pin socket and re-soldered all but the A9 and A10 connections, which I wrapped and then soldered using the existing wire for A10, and added another wire with one end on A9 of the RAM IC and the other on A9 of the EEPROM.  Rework finished!

Once I finished the necessary modification, I was eager to test as I was 100% positive that this was going to allow the code to correctly execute.  The reason why my own test programs worked was because their code was under 256 bytes and did not overlap into the 2nd 256-byte page of memory.  PAULMON and the MCS-51 BASIC programs did, plus all subsequent even pages were shadowed into the next adjacent odd page.  Any subroutine calls to odd pages would surely cause an infinite loop.

BTW: the 2nd KYNAR wire shown on the bottom side connects the 28C256’s “A13” pin to the 8052’s “A13” signal.  One may recall that the PCB was laid out for a 28C64, which I am still waiting to arrive from China.  In the interim, I am using a 28C256, which adds two more address lines: A13 and A14.  A13 was an easy connection because that pin (26) is a no-connect on the 28C64.  A14 lands on pin 1 of the 28C64 footprint, which goes to the 8052 “INT0” pin, which is pulled “high” via a 100KΩ resistor to Vcc.  Effectively, I have a simple “bank-switching” arrangement to select between the upper and lower 16KB halves of the 28C256.  When loading code images into the TL-866 programmer, I have to select the starting offset as either 0x0000 (lower bank) or 0x4000 (upper bank).
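The bank arithmetic is simple enough to write down.  A minimal sketch, assuming the EEPROM sees CPU lines A0 to A13 directly and that its A14 pin follows whatever level is on the INT0 net (the function and constant names are mine):

```python
BANK_SIZE = 0x4000  # 16KB, half of the 32KB 28C256


def eeprom_offset(cpu_addr: int, a14_level: int) -> int:
    """Offset inside the 28C256 for a CPU code fetch, given the A14 pin level."""
    return (a14_level << 14) | (cpu_addr & (BANK_SIZE - 1))


# With A14 high the reset vector comes from offset 0x4000 (upper bank);
# with A14 low it comes from offset 0x0000 (lower bank).
for level in (0, 1):
    print(level, hex(eeprom_offset(0x0000, level)), hex(eeprom_offset(0x0860, level)))
```

This is why the image has to be loaded at the offset that matches the bank the A14 level selects.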

Since I already had the dump program loaded into the EEPROM, I used it and sure enough, there was no longer a “shadow” of the dump program showing in the 2nd page of program memory.

I loaded the PAULMON v2.1 code into the EEPROM and as expected, it worked.  I then tested MCS-51 BASIC, v1.1 and v1.31 and both emitted their respective sign-on messages.

Additional note:  I have also measured the run current on this board and it is about 220 mA.  The INTEL 8052AH that I am using specifies the run current to be a maximum of 175 mA @ 12 MHz with “all outputs disconnected”, so the 8052AH is pulling most of the current thus far.  I expect it will run quite a bit lower with the AT89C52, which specifies “active current” to be a maximum of 25 mA @ 12 MHz.  That’s a sevenfold reduction (about 86%) in operating current!

In hindsight, I will state that:

  • No matter how simple a design might look (and be), when “complex tools” are brought into play, Murphy’s Law usually has a tendency to creep in as well.  I cannot imagine if this were a highly complex multi-layer PCB.  How then does one cut traces on inner layers and/or attach wires to tiny SMT components to rework a PCB with artwork errors?
  • If I had not just happened to find and use the ROM dumper program, I might not have found this issue as quickly as I did.  I should add that “quick” was after about 8 hours of investigating and testing.

Hardware testing is only partially completed as I am still waiting for the AT89S52s, which are now at about 50 days since the initial shipping date, more than the “20 to 40 days” that the vendor states.  I recently ordered more AT89C52s from a different source that uses the ePacket method of delivery, since ePacket shipments carry tracking numbers that can be followed on the USPS.COM web site.  In fact, the tracking number has already shown up there.  The 28C64s are not yet in and they are nearly 2 weeks overdue now, since they too were not shipped via ePacket.  Is it the “Christmas Season” that is causing these delays?  Is it that some eBay vendors simply don’t ship the product, hoping you will forget that you ordered it so they get to (hopefully) keep your $$$?  Do they have employees sneaking packages out the back door and reselling them on their own?  Or are the packages really getting “lost” in shipment from China to the USA?  I may never know but the anxiety, frustration and loss of time having to re-order and wait some more can be overwhelming if you let it get to you … and sometimes, I do.  🙂
