How Long Will It Work? Testing the Reliability of Electronic Systems

Testing the reliability of electronic systems at Fraunhofer IZM

When developing new products and bringing them to market, knowing their lifetime can be essential. Manufacturers want their products to work for a specific period of time or number of uses, but things break and products fail. To know when and why that happens, Fraunhofer IZM uses reliability testing.

The technique works efficiently by accelerating the testing process, although this can make the results less specific and harder to interpret. Despite this inherent concern, reliability tests are a very useful tool for the industry.

Daniel Hahn works in Fraunhofer IZM’s “Environmental and Reliability Engineering” department. He explains to RealIZM which requirements need to be met when working with reliability testing.

A typical reliability test

When testing the reliability of electronics, there are many challenges and requirements to consider. In his work, Daniel Hahn focuses on understanding why products fail as they age by testing them under real-life conditions of use, that is, the environmental conditions dictated by the application and the market. The aim is to define the reliability parameters (design goals) and to optimize the product design so that it meets these goals more closely. This is the purpose of reliability testing in a nutshell: reliability tests are used to qualify a product for a specific set of reliability parameters.

Every product has a desired lifetime. This is mostly defined by the clients of Fraunhofer IZM, in other words, the manufacturers. It can also be derived from customer requirements. “The parameters could be anything”, Hahn explains. “It could be a period of time, a number of cycles, or mileage, or indeed a combination of all of the above.” Ideally, the reliability tests would cover the entire target lifetime, but efficiency demands that the process be accelerated. The length of the actual test is therefore calculated, often from experience, but also from sufficient knowledge about the failure mechanisms at work. With experience-based approaches, a product that exhibits a weakness in a first reliability test will be subjected to an even longer test the second time around. New, irreversible test parameters are set for each iteration. Another important requirement is that each test must be completed with zero failures. It also needs to be remembered that electronic products have an actual lifetime, irrespective of the test duration, and that a defined share of failures is tolerated once the target lifetime has been reached. For example, fewer than five percent of units should have failed once the desired lifetime has been reached.

The challenges of zero-failure and accelerated test strategy

Daniel Hahn speaks about the different challenges that come with this specific approach to testing. Many of them relate to the differences between the desired lifetime and the actual life of a product. In most cases, only one test is conducted, in a brief window of the overall time in question. There is no way to know the distance between the product’s test and its actual life, and the researchers also cannot know how much of the desired product life is covered by the test.

Figure 1: Typical situation for standard test procedure. It is unknown what lifetime the test covers and where this lies relative to the desired lifetime and fatigue failures

Daniel Hahn knows that this leads to a number of problems that need to be considered. For instance, the tests could be too harsh, making it necessary to overengineer the products just to comply with the test.

Figure 2: Test too harsh results in large gap between covered lifetime (test coverage) and desired lifetime. Leads to overengineering to comply with the test

By contrast, the test could be too weak. If that is the case, the test is basically useless, as it does not cover the desired lifetime. When the test and the fatigue failures overlap, the product’s reliability is shown to be insufficient, but this requires choosing the right sample size to achieve the desired probability of catching failures.

Figure 3: Test is too weak and doesn’t cover the desired lifetime
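
The sample-size question can be made concrete with a short sketch. It assumes that every sample fails independently during the test with the same probability; the function names and the numbers are illustrative and do not come from Fraunhofer IZM.

```python
import math


def detection_probability(failure_fraction: float, sample_size: int) -> float:
    """Probability that at least one of the tested samples fails.

    Assumes each sample fails independently with probability
    `failure_fraction` during the test (a simple binomial model).
    """
    return 1.0 - (1.0 - failure_fraction) ** sample_size


def min_sample_size(failure_fraction: float, target_probability: float) -> int:
    """Smallest sample size whose detection probability reaches the target."""
    if not (0.0 < failure_fraction < 1.0 and 0.0 < target_probability < 1.0):
        raise ValueError("both arguments must lie strictly between 0 and 1")
    return math.ceil(
        math.log(1.0 - target_probability) / math.log(1.0 - failure_fraction)
    )


# Hypothetical numbers: 5 % of units would fail within the test coverage,
# and the test should catch at least one failure with 90 % probability.
samples_needed = min_sample_size(0.05, 0.90)
```

Under these assumptions, 45 samples are needed. With a smaller sample, a zero-failure result says little about whether the product is actually reliable enough.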

Why then do Daniel Hahn and his colleagues use these strategies if it is so hard to produce meaningful results? They do so because it makes the tests faster, more efficient, and cheaper; the only alternative would be to test products continuously for their entire life.

The purpose of reliability tests

Acknowledging the challenges inherent in testing, Daniel Hahn has a specific definition for the actual purpose of reliability testing: “Reliability testing wants to narrow the gap between the desired life of a product and the test coverage. On top of that, reliability design wants to narrow the gap between the desired life of the product and the failure distribution.” This explicitly does not mean that products are designed to fail or expire at a certain time. Instead, it is about enabling testers to make precise statements about the product’s lifetime.

Figure 4: Points that can be influenced to improve the test and achieve the desired results

Testers can influence the reliability test procedure with so-called “mission profiles”, which are the exact requirements for the product’s desired life. These mission profiles state the real-life conditions of use, which in turn influence the test duration. Real-life conditions could include, as Hahn specifies, “every relevant load coming from environmental and application influences as well as loads due to transportation and storage conditions.” The scientists also need specific knowledge about the failure mechanisms to make more accurate predictions about the lifetime, and they need to filter for the relevant loads that could actually occur. Effective communication along the supply chain is vital for these mission profiles, since different links in the chain have different insights and information.
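
How a mission profile can feed into the test duration is sketched below. The sketch assumes a purely temperature-driven failure mechanism described by the Arrhenius model, which the article does not name. The `LoadBin` structure, the activation energy, and all profile values are hypothetical.

```python
import math
from dataclasses import dataclass

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


@dataclass
class LoadBin:
    """One operating state of a (hypothetical) mission profile."""
    temperature_c: float  # temperature the product sees in this state
    hours: float          # time spent in this state over the desired life


def arrhenius_factor(t_bin_c: float, t_test_c: float, activation_energy_ev: float) -> float:
    """Ratio of the ageing rate at test temperature to that at bin temperature."""
    t_bin_k = t_bin_c + 273.15
    t_test_k = t_test_c + 273.15
    return math.exp(
        activation_energy_ev / BOLTZMANN_EV_PER_K * (1.0 / t_bin_k - 1.0 / t_test_k)
    )


def equivalent_test_hours(profile, t_test_c: float, activation_energy_ev: float) -> float:
    """Test time at t_test_c that ages the product as much as the whole profile."""
    return sum(
        b.hours / arrhenius_factor(b.temperature_c, t_test_c, activation_energy_ev)
        for b in profile
    )


# Hypothetical profile: storage, normal operation, and hot operation.
profile = [LoadBin(25.0, 60000.0), LoadBin(60.0, 25000.0), LoadBin(85.0, 2600.0)]
# 0.7 eV is an assumed activation energy, not a measured value.
test_hours = equivalent_test_hours(profile, t_test_c=125.0, activation_energy_ev=0.7)
```

The result depends strongly on the assumed activation energy, which is one reason why specific knowledge of the failure mechanism matters for the prediction.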

The test in practice

Daniel Hahn explains the procedure for reliability tests with a simple model: The greatest possible load is used as the test condition. The duration of the test is calculated from a simple aging model or, alternatively, defined by experience. With electronic devices or products, different components are tested separately from the system, which usually depends on the supply chain. Daniel Hahn knows that such simple models are very limited, but they are good enough for a meaningful estimate. If the scientists want to include as many failures of the system as possible, they have to choose very conservative test parameters. “Precise and detailed models are very specific in terms of failure mechanisms as well, so the stress and model parameters are specific to failure locations and materials.” This means that all factors that cause, for example, mechanical stress on a PCB (printed circuit board) have to be considered, in the tested device or in the test itself. A possible solution would be to move to a higher level of the system. This could mean, for example, testing the ECU (electronic control unit) instead of the PCB, but this drastically increases the cost and complexity of the test. The alternative is to keep testing at the PCB level but change the test parameters to take other stress factors into account, which again requires a precise model and good knowledge of the system. For Daniel Hahn, this is still a big challenge to be mastered.
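
As an illustration of a simple aging model, the sketch below uses a Coffin-Manson type relation for thermal cycling. This is a commonly used model chosen here as an assumption; the article does not say which model is applied at Fraunhofer IZM. The exponent and all load values are hypothetical.

```python
import math


def coffin_manson_factor(delta_t_field: float, delta_t_test: float, exponent: float) -> float:
    """Acceleration factor of a test temperature swing over the field swing.

    Simplified Coffin-Manson form: AF = (dT_test / dT_field) ** exponent.
    The exponent depends on material and failure mechanism.
    """
    if delta_t_field <= 0 or delta_t_test <= 0:
        raise ValueError("temperature swings must be positive")
    return (delta_t_test / delta_t_field) ** exponent


def required_test_cycles(field_cycles: int, delta_t_field: float,
                         delta_t_test: float, exponent: float) -> int:
    """Number of test cycles that represents the given number of field cycles."""
    factor = coffin_manson_factor(delta_t_field, delta_t_test, exponent)
    return math.ceil(field_cycles / factor)


# Hypothetical example: 10,000 field cycles with a 40 K swing, tested with
# the greatest expected swing of 120 K and an assumed exponent of 2.
cycles = required_test_cycles(10_000, delta_t_field=40.0, delta_t_test=120.0, exponent=2.0)
```

With these assumed values the acceleration factor is 9, so 1,112 test cycles stand in for the 10,000 field cycles. The estimate is only as good as the exponent, which is specific to the failure location and material, as the quote above points out.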

Application areas

One example of the described testing method is condition monitoring for offshore wind turbines. The question is: How does the system correlate with the wind? How does the wind influence, for example, the temperature of the electronics, and how does this then affect the lifetime of the system? This can be defined by mission profiles and models. By developing tests that represent the lifetime conditions of the wind turbine, the function of the condition monitoring system can be adjusted, tested, and verified. Fraunhofer IZM also uses these reliability testing methods for the automotive industry and its suppliers. Various other clients come to the Institute to find out which mission profiles they can use for their products. Another line of work for Daniel Hahn and his testing methods is the development of test strategies for electronics to reach specific reliability targets.

The industry depends on reliability testing to gain specific knowledge about the lifetime and possible failures of its products. It makes electronic systems in particular and products in general more sustainable. With many challenges still to be solved and fascinating new avenues, like Artificial Intelligence or digital twins, still to be explored, there is a lot left to do for Daniel Hahn and his colleagues.

Nevertheless, IZM is quite advanced in analyzing whole systems, thanks to the experience gained in developing and testing different technologies. For Daniel Hahn and his colleagues, reliability testing is characterized above all by knowledge about product reliability that has been gathered holistically. The tests are designed primarily on the basis of this expertise. As Daniel Hahn explains, an understanding of failure mechanisms is already available from previous tests on technologies developed in-house or on other products. Thanks to the experience at IZM, such analyses and tests can be carried out much faster and in a more targeted way. There is also constant research into which new methodologies can be used for reliability testing. Daniel Hahn sums it up: “IZM looks at the whole spectrum, from the component and its failure mechanisms to the entire system with all its levels of reliability.”

Daniel Hahn, Fraunhofer IZM

Daniel Hahn

Daniel Hahn joined Fraunhofer IZM as a research scientist in 2011. He studied electrical engineering at the Technical University of Berlin. Due to his academic focus on microsystems engineering, he is interested in the reliability of electronic systems, especially microsystems, and the manifold interactions of technologies in these systems.

His research focuses on the development of methods for application-oriented testing (including topics such as mission profiles, aging models, accelerated testing, system reliability and statistics).
