- What is Automated Error Prevention?
- Rethinking Our Approach to Software
Quality
- Learning From The Automotive Industry's
Experience
- Defining the Concept of Error Prevention
- What Error Prevention Means to the
Software Industry
- What is Automated Error Prevention
(AEP)?
- How Automated Error Prevention (AEP)
Works
- Thinking the "AEP Concept" Way
- Implementing the Parasoft AEP Methodology
- AEP Methodology Principle 1: Apply
industry best practices to prevent
common errors and establish a foundation
for full-lifecycle error prevention
- AEP Methodology Principle 2: Modify
practices as needed to prevent unique
errors
- AEP Methodology Principle 3: Ensure
that each group implements AEP correctly
and consistently
- Principle 3a: Introduce AEP
on a group-by-group basis
- Principle 3b: Ensure that
each group has an appropriate
supporting infrastructure
- Principle 3c: Implement a
group workflow that ensures
error prevention practices are
performed appropriately
- AEP Methodology Principle 4: Phase
in each practice incrementally
- AEP Methodology Principle 5: Use
statistics to stabilize each process,
then make it capable
- How Automated is AEP?
- Human Concerns Related to AEP
- Redefining Roles
- Establishing Group Culture
- Final Thoughts
- Learning More About AEP
- Notes

What is Automated
Error Prevention?
By Adam Kolawa, Ph.D.
Chairman/CEO of Parasoft Corporation
Rethinking Our Approach to
Software Quality
If you really look at IT, you will see that its
mission is to improve business processes and increase
profits. Companies are constantly rethinking and
struggling with how to use IT to a competitive advantage,
reduce IT operating and maintenance costs, and reduce
the total cost of ownership, all while attempting
to deliver increased value. This manifests itself
through the following concerns and initiatives shared
by many organizations:
- Outsourcing (lowering costs with outsourcing,
achieving greater quality with outsourcing, gaining
visibility into outsourced development, etc.)
- IT Strategy (controlling the costs of system
integration and maintenance, ensuring the quality,
cost effectiveness, and success of upcoming migrations,
etc.)
- Quality Initiatives (obtaining the desired
assessment/certification, sustaining existing
quality procedures, building quality into the
process, etc.)
- Cost Reduction (controlling spiraling software
development and labor costs, producing more with
the same resources, reducing the costs associated
with rework and poor quality, etc.)
- On-time Product Delivery (ensuring that projects
are shipped on time, with the requested functionality)
- Process Improvement (streamlining the software
development lifecycle for faster time to market,
increasing developer productivity, making the
process predictable and reliable, etc.)
- Security/Privacy (verifying that appropriate
levels of security and privacy are built into
the product)
- Adherence to Standards (verifying compliance
with specific standards [such as MISRA or Section
508], training developers and contractors to follow
the standards, etc.)
Most of these problems can be traced to the same
source: the struggle to make software work . . .
without incurring unreasonable costs. So, it all
seems to lead back to cost, which raises the question
of why software development is so costly. Most people
in the industry would agree that low IT productivity
is the culprit here. But why are IT teams, with
all their expertise and hard work, suffering from
low productivity? I believe that the root cause
of this low productivity can be traced to errors
that result from mistakes made throughout the software
development lifecycle. These errors include everything
from performance errors, to security errors, to
misimplemented functionality, to errors that crash
an entire system. They essentially stifle IT teams'
ability to produce working software in a reasonable
time and at reasonable costs. In fact, if you look
at virtually any IT team, you will see that its
team members spend about 80% of their time chasing
and fixing bugs, and only about 20% of their time
on tasks that deliver value and improve the business.
The logical reaction, then, is to prevent errors.
Many other industries, such as the automotive industry,
have also struggled with low quality, high costs,
and low productivity as a result of human error.
These industries recognized that although mistakes
cannot be eliminated, they can indeed be controlled.
They then modified their production lines to prevent
as many errors as possible. By preventing scores
of errors from ever entering the products, they
addressed their most critical industry problems
and were able to remain viable industries.
The software industry still has not learned this
lesson. Many people do not think that error prevention
is even possible in the software industry; they
believe that because each piece of software is different,
the lessons learned from working on one piece of
software cannot be applied to other pieces. Thus,
instead of trying to prevent errors from entering
software, the software industry tries to test errors
out of software. First we build a product, then
we attempt to use testing to determine whether the
product works and finally remove any errors that
the testing process exposes. Throughout this process,
we cross our fingers and hope that the most insidious
and embarrassing problems will be identified before
the release. However, a consideration of the number
and impact of software errors (including the study
which found that software errors cost the U.S. economy
a staggering $60 billion per year [i])
suggests that this "quality through testing" approach
is not yielding the desired results.
This belief that testing can create quality software
systems is a fundamental problem in the software
industry. We don't think of the whole process of
building and deploying software in a way that would
prevent errors because we don't believe that it
can actually be done. Yet, this error prevention
approach is not only possible, but necessary. Every
mature industry has already figured this out and
stopped relying on testing as a way to make products
work. We continue to have faith that testing will
deliver quality, but it never does.
If the software industry is serious about reducing
the error rate and resolving the issues that stem
from errors, we can't afford to continue hoping
that our current approach to testing will miraculously
start yielding quality software. Instead, we need
to follow in the footsteps of other industries and
start preventing errors throughout the software
development lifecycle, the industry's own version of
the manufacturing "production line." For this strategy
to work, we need a formalized process for integrating
error prevention in the software lifecycle, and
this formalized process must redefine, but not eliminate,
the role of testing. While testing has proven to
be an ineffective method of building quality into
software, it can and should be used to measure how
well software is being produced and maintained.
Parasoft understands the need for this redefinition
of testing and for error prevention, and has developed
Automated Error Prevention (AEP) to help the software
industry make the transition as painless and efficient
as possible.
Learning From The Automotive
Industry Experience
Other industries have learned how to control the
process of production, utilizing error prevention
techniques as a means of creating high quality,
affordable, and abundant products. The automotive
industry is a prime example of this paradigm shift.
For the automotive industry, a major advance in
product quality was made when the meaning of testing
was redefined. In the years prior to the Second
World War, automakers in both America and Europe
would test their products after they came off the
assembly line. Defects would be corrected one-by-one.
There were two problems with this method. First,
it slowed production and delivery of the end product,
making production more expensive than it should
have been. Second, it did not find and fix all of
the defects, allowing many defects to remain in
the autos after they had been sent to market. This
led to a short product "shelf-life" and high consumer
dissatisfaction.
After the war, this "hunt and fix" mentality remained
deeply rooted in the American automotive industry.
Americans were willing to overlook flaws, it seemed,
if their lust for style was fulfilled; in other
words, fins, pinstripes, and white-walls made the
glitches and poor performance easier to live with.
Ironically, it was an American, W. Edwards Deming,
who tried to educate the American automotive industry
about error prevention techniques and their application
to the assembly line. When his efforts fell on deaf
ears, Deming turned to the Japanese, who were quick
to realize the potential profit Deming's methods
had for their industry.
Deming's great insight into the manufacturing process
was that quality was essentially locked out of the
process if defects were only looked for after the
autos were finished. Looking for a rattling piece
of metal, for example, in a finished product was
like looking for a needle in a haystack. This is
the same reason post-production testing fails in
the software industry. Looking for one fatal runtime
error in 50,000 lines of code is much harder than
looking for the same error in only 500 lines.
Defining the Concept of Error
Prevention
Deming taught that fixing problems where they occur
in the manufacturing process not only eliminates
many quality problems in the finished product, but
also promotes the ultimate goal of improving the
quality of the manufacturing process. He found that
by fixing the process itself, it's possible to prevent
the same types of errors from occurring over and
over again. In other words, his quality improvement
initiative promotes and requires error prevention.
Deming advocated process quality improvement through
a procedure of root cause analysis and elimination
of error causes. The basic procedure for implementing
Deming's process quality improvements is as follows:
- 1. Identify an error.
- 2. Find the cause of the error.
- 3. Locate the point in production that created
the error.
- 4. Implement preventative practices to ensure
that errors do not reoccur.
- 5. Monitor the process.
For example, inspectors on an auto assembly line
discover that seat bolts are not being tightened
properly. The cause of this error is that the bolts
do not exactly fit the tool used to tighten them.
The corrective action is to provide the proper fitting
tool to the location in the line where the seats
are installed. Monitoring the process is accomplished
by closely inspecting the seat bolts for tightness.
Monitoring the station where the seats are installed
also provides data about the amount of time saved
by using the right tool.
Error prevention is very different from error detection.
Error detection is the process of finding and fixing
errors after an application is built; the flawed
process that generated those errors is left uncorrected.
In the seat example, error detection would have
simply tightened the seat bolts at the end of the
assembly line. This action may have ensured that
the bolts were installed properly, but it would
have left the root of the problem embedded in the
manufacturing process. The problem would never go
away because the root cause was not corrected.
Implementing Deming's methodology took time, years
in some instances, as every step in the manufacturing
process had to be analyzed and in most cases altered.
However, in adopting Deming's error prevention techniques,
Japan, Germany, and other nations saw the quality
of their products skyrocket. In turn, the manufacturing
process became more efficient, resulting in greater
production numbers and reduction of per-unit costs.
This savings was immediately transferred to the
consumer.
The American automotive industry remained skeptical
of the Deming approach, despite the unexpected and
astonishing improvement in quality experienced by
the Japanese. Quite simply, nobody in the U.S. believed
such an advance could be achieved, let alone maintained.
Post-production testing remained the standard approach
to quality in the American auto industry well into
the 1970s. Only after the 1973 oil crisis, when
Americans turned to smaller foreign imports in an
effort to save fuel, did the American auto industry
take notice. What they saw was an ever-widening
gap between the quality of Japanese and European
cars and their domestic counterparts. Failure to
prevent errors in the production process nearly
destroyed the American automotive industry altogether.
What Error Prevention Means
to the Software Industry
Error detection is how the software industry currently
deals with bugs, by treating the "symptom" (bugs)
and not the "disease" (the development process). Consequently,
the quality problems facing software are the same
as those facing the automotive industry 50 years
ago. Given that detection does not cure the root
cause of errors, we must ask: why isn't our industry
taking advantage of error prevention? Why isn't software
behaving like other industries when it comes to
preventing errors? These are vital questions, since
software is in practically everything we touch and
use.
To begin learning the same lesson that the automotive
industry (among others) has taken to heart, the
software industry must first concede that software
is no different from other manufactured products.
Software is not a specialty service industry, as
so many believe, but a manufacturing sector. Cars,
toasters, coffee pots, and computers all come off
assembly lines. If at the end of the line a defect
is found, it serves no purpose to fix that defect
item by item. The assembly line itself must be examined
and fixed. Software is no different from any other
manufactured product in this sense. Software defects
must also be prevented, not simply detected and
fixed individually.
The reason defects still plague the industry is
that many within the industry itself doubt that
the lessons learned by other industries can be learned
by the software sector. The rapid evolution of software
demanded by the consumer suggests to many that a
stable production process, an assembly line model,
is not possible in software. Where is the production
line when you produce one thing only one time? How
do you ensure quality when one product is as different
from the next as an apple is different from an orange?
The answer to this quandary requires rethinking
of what is meant by "production line." There is indeed
a production line in software manufacturing, but
not one that repeatedly makes multiple copies of
the same thing. The software production line exists
in the transition of raw ideas into software products
that provide intelligence to computers. Producing
software requires not a traditional rigid production
line, of machines making machines, but rather a
framework, sophisticated and flexible, that can
adapt to the rapid changes that occur from one piece
of software to the next. In software, intelligence
creating intelligent tools is, in and of itself,
a production process.

Figure 1: The Software "Production
Line"
The industry lost many years of productivity believing
that there is no way to produce software in the
same way other durable goods are manufactured. Focusing
on processes rather than production, as quality
control initiatives such as ISO and SEI do, has
proven to be an unfruitful diversion from the real
problem of software bugs. Just as unfruitful was
the incessant search for a "silver bullet" that would
enable the industry to deal effectively with the
problem of bugs once and for all, without mess or
fuss.
Such a silver bullet is not likely to appear as
long as the software industry continues to ignore
the vital lessons that Deming and others have to
teach regarding error prevention. We've had plenty
of time to learn how to deal with defects from the
automotive and appliance industries and apply the
lessons of error prevention to software production.
Like these other industries, software must redefine
what it means by testing. We must stop thinking
about testing as a means of cleaning a product of
bugs and start regarding it as a means to measure
how well processes and production line techniques
are working. This is the first fundamental need.
A second fundamental need, just as important as
the first, is to realize that errors cannot be prevented
without automation; people are simply not capable
of constantly and consistently maintaining processes
and practices over time. The human mind is great
at creating processes, but not at maintaining them
(this is the fatal flaw in processes such as ISO,
CMM, and other similar process initiatives). Mundane
practices, left to human fallibility, will decay
over time. Automation is the only solution to this
problem. Again, looking at the automotive industry,
mundane tasks such as repetitive welding have been
fully automated. The result? Weld errors are rarely
found anymore.
What is Automated Error Prevention
(AEP)?
For the software industry, the only true way to
make high-quality applications is through Automated
Error Prevention (AEP). To fully understand how
revolutionary AEP is, you must understand the difference
between the AEP Concept and the Parasoft AEP
Methodology.
The AEP Concept addresses a basic need within the
software industry: improving application quality
through the automatic prevention of errors during
the entire software development lifecycle. The AEP
Concept is fundamental for improving the efficiency
of the software industry; in order for the industry
to mature it must fully embrace the AEP Concept.
The AEP Concept was developed over the course of
many years through close work with the software
industry. It builds on Deming's work, then revolutionizes
it by automating it as much as possible. Deming
promoted quality improvement through a manual process
of root cause analysis and elimination of error
causes. This manual process is feasible for industries
where the same exact production process is repeated
over and over to create a large quantity of the
same product. To prevent errors from hundreds, thousands,
or even millions of products made from the same
production line, you modify an error-prone process
once, verify that it improves product quality, then
repeat this same modified process, in the exact same
way, every time you manufacture another copy of this
same product. Preventing software errors is more
complicated because each piece of software is a
unique product; while the same general error prevention
practices can be applied to almost any piece of
software, each application of that practice must
be tailored to the specifics of the current piece
of software. Consequently, manually applying process
improvements to the software lifecycle is extremely
time-consuming and expensive. That's why CMM, CMMI,
and other software quality initiatives based on
Deming's findings rarely deliver the promised results
and often endure only in notebooks that sit on a
shelf and gather dust.
AEP's automation is the innovation that allows
AEP to overcome the main obstacle to other software
quality initiatives: their manual implementation
is typically so difficult and time-consuming that
they cannot feasibly be integrated into most development
organizations and software product lifecycles. The
introduction of automation is essential to making
error prevention a practical strategy for the software
industry. When key error prevention practices are
automated, organizations can ensure that they are
performed thoroughly and precisely, with minimal
disruption to existing processes and projects. As
a result, error prevention practices can become
a practical, enduring part of a team's development
process rather than an idea that is appreciated
in principle, but never truly embraced and implemented.
Because the AEP Concept is so important, it must
be shared with others in the industry. Therefore,
Parasoft is pleased to place the Automated Error
Prevention (AEP) Concept, the "five steps plus automation," into
the public domain. Parasoft is dedicated to the
AEP Concept, and wishes to see it embraced by companies
and organizations that want to improve their software
manufacturing and development processes. Others
in the software industry can adopt the AEP Concept,
improve it or change it as needed, and build their
own methodologies and products from it.
The Parasoft AEP Methodology is a specific
and practical application of the AEP Concept. It
defines how you effectively implement the AEP Concept
into your software development lifecycle. The Parasoft
AEP Methodology is not a product. Rather, it
is the definition and description of a process in
which proven error prevention practices are adopted
and automated for the entire software development
lifecycle.
There are other AEP methodologies waiting to be
discovered and applied to the software development
lifecycle. Just as with the AEP Concept, others
in the software industry are encouraged to conduct
research to facilitate the creation and dissemination
of these additional methodologies. Parasoft Corporation
has simply conducted the first phases of this research;
there is still much more that can be done.
Wedding error prevention and automation is a difficult
task, and many in the software industry have failed
in the attempt to automate error prevention techniques.
Together, the AEP Concept and the Parasoft AEP
Methodology are unique in their ability to
help the software industry transition from ineffective
error detection into comprehensive automated error
prevention.
Let us show you how this is done.
How Automated Error Prevention
(AEP) Works
A simple example will demonstrate the thought processes
involved when utilizing the AEP Concept, and the
powerful effect that such thinking will have on
how you address and eliminate errors. Normal testing
cycles allow you to find and fix only one error
type at a time. Using the AEP Concept, you can abstract
single errors into entire classes of errors, preventing
many types of errors from reoccurring.
For a simple example of how this strategy might
work, imagine that you have an N-tier system, which
includes a client, middleware written in Java, and
a database. Naturally, you want to determine how
robust that system is. This means that you need
to conduct load tests to establish what types of
traffic that system can manage.
Fair enough. Load tests are a fairly straightforward,
if mundane, task. This isn't your first N-tier system
so you anticipate few problems, if any. Unfortunately,
your optimism isn't rewarded. During testing you
discover that loads overwhelm the system, forcing
it to stop working altogether. What happened? When
you investigate the cause of this problem you discover
that the middleware is not talking to the database.
More to the point, the middleware is leaking resources
and running out of connections, effectively overloading
the system and causing it to shut down.
In a normal software testing cycle, you would attempt
to fix this problem. If you were the developer,
you would attempt to fix it on your own; if you
were the project manager, you would have someone
on your team work on it. Your task would be to go
through the code and find each leaking connection.
During your search, you discover a pattern to this
error: assuming that the middleware is written in
Java, the problem is that classes opening connections
do not have either 1) a finalize() method that closes
the connection or 2) connection openings that are
enclosed in a try/finally block which ensures the
connection is closed. This is an easy, if repetitive,
repair. You go through the code, pair each open
connection with a finalize() method, and ensure
that every connection opening is enclosed in a try/finally
block.
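The try/finally repair described above can be sketched as follows. This is a minimal illustration, not code from the system in the example: `TrackedConnection`, `leaky`, and `safe` are hypothetical names, and the class merely counts open connections so the leak is visible without a real database. The sketch uses try/finally, the second of the two options named above; on current Java versions, where `finalize()` is deprecated, try/finally (or try-with-resources) is the usual choice.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ConnectionLeakDemo {
    // Hypothetical stand-in for a database connection pool counter.
    static final AtomicInteger openConnections = new AtomicInteger();

    // Hypothetical connection type: opening increments the counter,
    // closing decrements it.
    static class TrackedConnection {
        TrackedConnection() { openConnections.incrementAndGet(); }

        void query(boolean fail) {
            if (fail) throw new IllegalStateException("query failed");
        }

        void close() { openConnections.decrementAndGet(); }
    }

    // Leaky version: close() is skipped whenever query() throws.
    static void leaky(boolean fail) {
        TrackedConnection c = new TrackedConnection();
        c.query(fail);
        c.close();
    }

    // Repaired version: the finally block closes the connection on every path.
    static void safe(boolean fail) {
        TrackedConnection c = new TrackedConnection();
        try {
            c.query(fail);
        } finally {
            c.close();
        }
    }

    public static void main(String[] args) {
        try { leaky(true); } catch (IllegalStateException e) { /* expected */ }
        System.out.println("open after leaky: " + openConnections.get()); // 1

        openConnections.set(0);
        try { safe(true); } catch (IllegalStateException e) { /* expected */ }
        System.out.println("open after safe: " + openConnections.get());  // 0
    }
}
```

Under load, every failed call through a method like `leaky` strands one connection, which is how a pool eventually runs dry.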
Now that your connections are closed, you retest
and find that your system is more robust. You've
found the source of the problem, correctly diagnosed
it, and fixed the problem. End of story, right?
But what if other resource leaks crop up? You would
have to fix them one-by-one as well. For complex
N-tier systems, you might end up spending a great
deal of time debugging and very little time actually
writing code.
The ineffectiveness of this "hunt and peck" method
is obvious. It is simple error detection; by fixing
only the error in front of you, you've done nothing
to prevent an entire class of similar errors. What
about resource leaks to network connections, or
to files? What about the next N-tier system you
create? Have you done anything to ensure that you
pair each open connection with a finalize() method
and enclose each connection opening with a try/finally
block? If not, you will probably have to find and
fix this same error in the next system as well.
Thinking the "AEP Concept" Way
When you use the AEP Concept to address this problem,
you are not interested only in fixing the leaking
connections between your middleware and your database.
Rather, when faced with this problem, you ask yourself
two questions:
- a. Why am I leaking connections? (And how can
I quickly and effectively fix this particular
problem?). This can be understood as your immediate
short-term goal.
- b. How can I prevent lost resources in general
across all error types? This can be understood
as your larger, long-term goal. Asking this question
begins to move you away from error detection into
true error prevention.
Using the AEP Concept, you move through Deming's
five steps to answer these questions:
- 1. Identify the error. In this case, you are
leaking connections between your middleware and
the database, leading to system overload and shutdown.
- 2. Find the cause of the error. The leaked connections
are due to open connections that are not closed.
- 3. Locate the point in production that created
the error. For the software industry, this is
where a lot of people get lost. "Development" is
the location of the error. But is it one particular
developer that continually forgets to close connections,
or is it a common group error that is repeated
again and again? Once you've located the source,
you must address these types of questions in order
to move to the next point. For our N-tier example,
let's assume that it isn't one sloppy programmer
but a common mistake that is made throughout your
development group. Knowing this, you need a common
solution that can be implemented automatically
to ensure that the error is always looked for.
- 4. Implement preventative practices to ensure
that errors do not reoccur. Here, you need to
pair each open connection with a finalize() method
and enclose each connection opening in a try/finally
block so that every opened connection is closed.
How can you repeat this practice
for future development projects? Through coding
standards. Coding standards are a great way to
achieve this level of prevention: simply create
a rule that checks for this pairing. This action
ensures that each opened connection is also closed.
By creating this rule you have moved upstream
from the load testing error and established a
coding rule that mandates that no connections
should be left open.
- 5. Monitor the process. An automated coding
standards scanning tool can be used to ensure
that this rule is enforced across the development
group. This is true AEP implementation. Not only
do you change the practice to ensure that errors
are not repeated, but you also automate the procedure
and use its results to measure how well that change
is being followed across the development group.
This makes it possible to determine if your change
is effective, or if further changes need to be
implemented somewhere else in the process.
Most importantly, using this rule and the scanning
tool, you can extrapolate this rule from leaked
connections between middleware and the database
into a larger class of lost resource errors. Lost
network connections, bad file locations, and similar
errors can be prevented by creating coding standards
that look for such errors. Having an entire set
of rules that prevent lost resources makes it easier
and quicker to determine how robust your final N-tier
systems are.
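As a rough illustration of steps 4 and 5, and of extending one rule to a whole class of lost-resource errors, the sketch below checks method bodies against a small rule set. It is deliberately naive and rests on assumptions: it matches plain text rather than parsed code, the `OPENERS` list and the method names are hypothetical, and a real coding-standards tool would analyze the syntax tree instead.

```java
import java.util.ArrayList;
import java.util.List;

public class ResourceRuleCheck {
    // Hypothetical rule set: call patterns that open a resource needing close().
    static final List<String> OPENERS = List.of(
        "getConnection(",         // database connections
        "new Socket(",            // network connections
        "new FileInputStream(");  // files

    // Returns one message per opener found in a method body that has
    // no finally block calling close(). Text matching only (assumption).
    static List<String> check(String methodName, String methodBody) {
        List<String> violations = new ArrayList<>();
        int finallyAt = methodBody.indexOf("finally");
        boolean closesInFinally =
            finallyAt >= 0 && methodBody.indexOf(".close()", finallyAt) >= 0;
        for (String opener : OPENERS) {
            if (methodBody.contains(opener) && !closesInFinally) {
                violations.add(methodName + ": " + opener
                    + " without close() in finally");
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        String leaky = "Connection c = ds.getConnection(); "
                     + "c.createStatement(); c.close();";
        String safe  = "Connection c = ds.getConnection(); "
                     + "try { c.createStatement(); } finally { c.close(); }";

        List<String> report = new ArrayList<>();
        report.addAll(check("loadOrders", leaky));
        report.addAll(check("loadCustomers", safe));
        report.forEach(System.out::println);

        // The violation count is the figure a group would track over time (step 5).
        System.out.println("violations: " + report.size());
    }
}
```

Adding a new resource type means adding one entry to the rule set, and the per-build violation count gives the group a number to monitor.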
Implementing the Parasoft
AEP Methodology
As you can see from the previous section and its
example, using the AEP Concept is a powerful way
to prevent errors. It is also a very complex concept.
The process described here for eliminating one
bug is elaborate and difficult to perform. It requires
experience, patience, and creative thinking to implement
correctly. In reality, if you tried to perform this
set of steps for every error that you needed to
eliminate, you probably would not succeed.
To remove these barriers to applying the AEP Concept,
Parasoft researched the most effective way to implement
the AEP Concept in a software group environment
and developed the Parasoft AEP Methodology,
which codifies Parasoft's research and experience
with AEP. This methodology provides a well-tested
blueprint for effectively implementing the five-step
AEP Concept in a group. It provides clear and practical
guidelines for every implementation detail, from
the required practices and infrastructure elements,
to a plan for introducing each practice and element,
to a team workflow that makes the most efficient
use of the prescribed practices and elements. By
leveraging the experience and research represented
by the Parasoft AEP Methodology, you can start applying
and benefiting from the AEP Concept as rapidly as
possible.
There are 5 main principles to the Parasoft AEP
Methodology:
- 1. Apply industry best practices to prevent
common errors and establish a foundation for full-lifecycle
error prevention.
- 2. Modify practices as needed to prevent unique
errors.
- 3. Ensure that each group implements AEP correctly
and consistently.
- 3a. Introduce AEP on a group-by-group basis.
- 3b. Ensure that each group has an appropriate
supporting infrastructure.
- 3c. Implement a group workflow that ensures
error prevention practices are performed appropriately.
- 4. Phase in each practice incrementally.
- 5. Use statistics to stabilize each process,
and then make it capable.
AEP Methodology
Principle 1: Apply industry best practices to prevent
common errors and establish a foundation for full-lifecycle
error prevention
The Parasoft AEP Methodology was developed in recognition
of the fact that different development processes
share many of the same characteristics and pitfalls,
but that each project and organization will also
encounter its own unique challenges. Consequently,
Parasoft AEP Methodology defines how to best prevent
the errors common to most development groups, but
also provides a blueprint for identifying and preventing
recurrences of the errors that are unique to a specific
development group, process, or project.
At a high-level, the software lifecycle always
looks the same, whether it involves developing Java-enabled
Web applications with Extreme Programming, developing
C embedded system programs with a waterfall process,
or anything in between. In all cases, software development
involves the natural design > develop > deploy >
manage lifecycle sequence illustrated in the left
circle in Figure 2.
Figure 2: Anchoring and customizing
AEP in the software development lifecycle
To anchor basic error prevention into the common
software lifecycle, you integrate industry-accepted
best practices (such as coding standards, unit testing,
integration testing, and so on) into the lifecycle
as described in the middle circle of Figure 2 and
automate these practices as much as possible. This
automated full-lifecycle error prevention is the
basic foundation of the Parasoft AEP Methodology.
Automation is Essential to AEP
Automation is vital to the success of AEP.
Without technology that automates each AEP
practice as fully as possible, AEP will become
too difficult to practice, will not be implemented
thoroughly and consistently, and will not
reach its potential.
The integrated best practices are the product of
software industry experts examining the most common
errors for different languages, and then developing
best practices designed to prevent these common
errors. They represent a wealth of knowledge that
was developed through many prior rounds through
the five-step AEP Concept process. By leveraging
these existing practices, you can start preventing
many common and serious errors before you ever perform
the five-step process yourself. Essentially, by
adopting these "out-of-the-box" practices, you can
instantly progress from following the few best practices
that your organization has developed over the years
to following a comprehensive set of best practices
that have been developed, tested, and perfected
by industry experts who have analyzed vast amounts
of code, and errors, worldwide. You inherit the benefits
of tremendous experience without having to perform
any of the groundwork required to acquire that experience.
AEP Methodology
Principle 2: Modify practices as needed to prevent
unique errors
Inevitably, some errors will slip through these
"out-of-the-box" practices because every development
process and project has its own unique challenges.
The AEP Methodology expects these unique errors
and provides a mechanism for customizing the practices
to prevent these errors.
Each time you discover an error that evades the
existing error prevention practices, you apply the
five-step procedure that is the core of the AEP
Concept. You identify the error, determine the cause
of the error, look upstream in the software lifecycle
to determine the root cause of the error, modify
an existing practice (or implement a new one) to
prevent that type of error from recurring, then
check adherence to that practice to monitor whether
it is being followed. This process is illustrated
by the right circle in Figure 2, which shows how
this principle might be applied to prevent the problem
in the previous section's example: a performance problem
that was identified during load testing, traced
back to a resource leak caused by a development
coding error, then prevented through the implementation
of a new coding standard. When you apply this five-step
process to each error that evades the "out-of-the-box"
practices, you not only fix each individual bug
you encounter, but also modify the development methodology
to prevent the same type of error from recurring.
As a result, the development methodology becomes
increasingly error-resistant with each bug identified.
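As a rough illustration of the last two steps (turning a root cause into a practice, then checking adherence automatically), the following Python sketch encodes one hypothetical coding standard: "every open() call must be managed by a with statement." The rule, the function name, and the use of Python's ast module are assumptions made for illustration only; they are not part of the Parasoft AEP Methodology or any Parasoft tool.

```python
import ast


def find_unmanaged_open_calls(source: str) -> list[int]:
    """Return line numbers of open() calls that are not the context
    expression of a with statement (a hypothetical leak-prevention rule)."""
    tree = ast.parse(source)

    # Collect the calls that ARE used directly as "with <call> as ...".
    managed = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.With, ast.AsyncWith)):
            for item in node.items:
                managed.add(id(item.context_expr))

    # Any other direct call to open() is reported as a violation.
    violations = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "open"
            and id(node) not in managed
        ):
            violations.append(node.lineno)
    return sorted(violations)
```

Once a check like this runs automatically on every file, the same type of leak no longer depends on a reviewer happening to notice it.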
Using AEP Leverages Valuable Industry
Knowledge
Before moving to our third fundamental consideration
(how to ensure that AEP practices are performed
correctly and consistently within a group), it
is essential to reiterate that the Parasoft
AEP Methodology is not a labor-intensive
approach to error prevention. Why is this?
While formulating the methodology, we realized
that a lot of people in the software industry
have thought a great deal about error prevention.
There is a vast body of knowledge available
about error prevention techniques such as
coding standards, unit testing, and so on.
There is a lot of valuable advice available
about how to write good code and avoid errors;
indeed, there are volumes upon volumes of
coding rules for any number of languages.
The trouble with this information is that
it is not easily applied, and therefore it
is underused. This is unfortunate, since this
knowledge is a great shortcut to implementing
AEP. Because so many software industry experts
have analyzed the bad practices and behaviors
that lead to errors, and have discovered how
to avoid them, it is imperative that this
knowledge be used. The brilliance inherent
in the Parasoft AEP Methodology
is that by deploying these practices, you
bring into your software lifecycle a huge
body of knowledge that prevents errors.
Even when you need to prevent unique errors,
you can leverage the work that others have
already completed. As you apply the five-step
process to prevent unique errors, you determine
why the standard, prepackaged sets of rules
and techniques may not have caught that particular
error, and then you modify the existing practices
to prevent that error. You already have a
fully usable and implemented system based
on a prepackaged set of rules and techniques,
so you do not need to start from scratch.
The work described in our load testing example
would only need to be done for errors you
are finding that have not been previously
analyzed by industry experts. Because you
are already leveraging industry-accepted best
practices, you need only analyze your own
special development and coding needs.
AEP Methodology
Principle 3: Ensure that each group implements AEP
correctly and consistently
Even if a development group or organization has
carefully implemented the "out-of-the-box" AEP practices
and diligently modified them as needed to prevent
their unique errors, this effort is useless if the
vital AEP practices are not implemented correctly
and consistently throughout the organization's development
groups. This consistent application requires that
you introduce AEP on a group-by-group basis, ensure
that each group has an appropriate supporting infrastructure,
then ensure that each group follows a group workflow
that ensures error prevention practices are performed
appropriately.
Principle 3a: Introduce
AEP on a group-by-group basis
The most effective way to bring AEP into an organization
is to implement it on a group-by-group basis. You
start with one group, then, after this group is
effectively practicing AEP, you start another group,
and so on.
The definition of "group" tends to vary from organization
to organization. For the purposes of AEP, a "group"
is a set of people who work on a common development
project. This group typically includes five
to ten developers, an architect, a project manager,
and any QA testers assigned to that project.
AEP group definitions were developed by reviewing
the natural structure of most groups, then formalizing
that natural structure in a way that would best
support AEP.
Principle 3b: Ensure that
each group has an appropriate supporting infrastructure
Each group must have a functioning source control
system and automated build process before its members
can start practicing AEP. A source control system
(also known as a configuration management system)
is a database where source code is stored. Its purpose
is to provide a central place where the team members
can store and access the entire source base. An
automated build process automatically executes the
necessary build steps (compilations, transfers,
etc.) at a scheduled time, without any human intervention.
Principle 3c: Implement
a group workflow that ensures error prevention practices
are performed appropriately
To ensure that error prevention techniques are
used consistently, the Parasoft AEP Methodology
describes how to build an automated support infrastructure
to ensure that these practices become an integral
and enduring part of the team members' day-to-day
workflow.
Experience has taught us that these practices must
be embedded into the software development groups
that will use them. This means that everyone in
the development group must understand their role
(be it Developer, Architect, QA, or Project Manager)
and that they must understand how to adhere to
that role given the practices in place. For example,
understanding who creates coding standards, where
these standards are stored, and who uses them and
when, is necessary for the group to successfully
adopt automated coding standards. Most importantly,
defining group behavior ensures that the practices
adopted remain ingrained and that they do not
deteriorate over time.
In a model AEP implementation, every developer
and QA team member has a local installation of each
AEP supporting technology; the technology settings
are determined by the team architect and standardized
across the team.
The development team members will have the appropriate
AEP technologies installed on their workstations
and integrated into their daily edit, compile, debug
process. When a developer creates a new file or
checks existing code out of source control and edits
it, he must ensure that he has performed the practices
defined by the team's architect before adding that
new or modified code to source control. Practice
compliance is checked by using the appropriate AEP
technologies to automate the error prevention practices
(static analysis, unit testing, etc.). Developers
fix any reported errors so that their code complies
with the required practices, and then check their
code back into source control.
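The check-in rule described above can be sketched as a simple gate: run every practice the architect has required, and allow the check-in only if none reports an error. This is a minimal sketch in Python; the PracticeResult type, the ready_for_check_in function, and the idea of passing each practice as a callable are hypothetical and stand in for whatever AEP technologies a team actually uses.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PracticeResult:
    practice: str       # e.g. "static analysis", "unit testing"
    errors: list[str]   # messages reported by that practice


def ready_for_check_in(
    checks: dict[str, Callable[[], list[str]]],
) -> tuple[bool, list[PracticeResult]]:
    """Run each required practice; return (ok, failing results).

    `checks` maps a practice name to a callable that returns the list of
    errors it found (an empty list means the code complies).
    """
    results = [PracticeResult(name, run()) for name, run in checks.items()]
    failed = [r for r in results if r.errors]
    return (not failed, failed)
```

A developer would fix whatever appears in the failing results and rerun the gate before adding the code to source control.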
The QA team performs integration testing on the
entire checked-in project baseline (the checked-in
work of all project developers) by using workstation
versions of the standard team AEP technologies and
by manually exercising the application as needed.
Preferably, QA tests new and modified functionality
as soon as developers add it to source control.
To streamline the QA testing process, QA's technologies
are configured to access the same standard team
test settings and files that the developers used
and created. QA can then extend the developers' tests
to perform more complex functional tests (such as
system level tests that check real-life operations
which span multiple developers' work) as well as
advanced load tests (if applicable). If QA uncovers
a new error through these or other tests, the QA
team notifies the architect and then works with
the architect to diagnose the root cause of the
error, then design and implement an error prevention
mechanism to prevent this error from recurring.
This error prevention mechanism might involve requiring
developers to follow a new practice, modifying how
developers should perform a current practice, or
simply changing the team's standard tool settings.
QA also adds to the test suite one or more test
cases that verify whether the error has been resolved.
At that point, the architect asks the responsible
developer to correct the problem. Once the new test
cases pass, the error is deemed corrected.
To verify that the required practices are being
implemented correctly and do not decay, the AEP
technologies run as a batch process to test the
entire project baseline at regularly scheduled intervals
(usually, at the same intervals as the
team's automated build). These tests verify whether
the appropriate coding standards, unit testing,
and integration testing practices have been implemented.
They use the same test parameters and files used
by development and QA and automatically generate
and execute additional tests as needed. No errors
should be reported at this point. If any coding
standard or unit test errors are reported, the tools
notify the architect and developer; this notification
serves as an invitation for a code review. During
the code review, the architect and developer review
the violation of the practice and determine whether
this was an appropriate violation of the team's required
practices. If any other errors are reported, the
tools notify the architect and the QA team; this
notification serves as an invitation for a meeting
to diagnose the root cause of the error, then design
and implement an error prevention mechanism to prevent
this error from recurring.
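The notification rule for the scheduled batch run can be summarized in a few lines. The sketch below is only an illustration in Python; the error-kind labels and the route_notification function are hypothetical names, not identifiers from any Parasoft product.

```python
def route_notification(error_kind: str) -> tuple[set[str], str]:
    """Decide who is notified and what follow-up the notification invites.

    Coding standard and unit test errors lead to a code review between the
    architect and the developer; any other error leads to a root cause
    meeting between the architect and QA.
    """
    if error_kind in {"coding_standard", "unit_test"}:
        return {"architect", "developer"}, "code review"
    return {"architect", "qa"}, "root cause meeting"
```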
In addition, this infrastructure should store and
analyze the information from this verification.
Team members can access these analyses and use them
to assess the effectiveness of the current practices
and determine what additional practices would be
helpful to implement. The team can then use this
same infrastructure to implement and monitor any
process improvements that they decide to adopt.
The recommended workflow is illustrated in Figure
3.
Figure 3: The recommended group
workflow for success with AEP
AEP Methodology
Principle 4: Phase in each practice incrementally
One reason why best practices (such as automated
coding standard enforcement) have historically failed
is due to the overwhelming amount of information
that typically is delivered when developers first
apply a new practice to their source code. In fact,
the amount of information is usually so overwhelming
that many developers reject the findings out-of-hand
as "noise" and simply ignore the results altogether.
The key to overcoming these challenges and successfully
implementing a practice is introducing it in phases.
This prevents the team from being overwhelmed by
having to learn and follow an unmanageable number
of new requirements at once, in addition to performing
their normal job responsibilities. One particularly
helpful strategy is to divide each practice into
several levels (critical, important, and recommended), then
introduce each level incrementally. You would start
by requiring that team members follow only the most
"critical" practice requirements. Once the team was
able to comply with these requirements, you would
then require that they also follow the "important" practice
requirements. Once all of these tasks were mastered,
you would phase in the "recommended" practice requirements.
Only at that point would the team members be required
to follow the complete practice.
Another useful strategy is to apply the practice
only to files that were created or modified after
a predetermined "cutoff date," or, even better, to
apply the practice to only the specific lines of
code that were created or modified after the cutoff
date. Having a reasonable implementation strategy
facilitates practice implementation and enforcement,
and also maximizes its benefits.
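Both phasing strategies amount to filtering what is reported. The following Python sketch assumes each reported violation carries a level and the modification date of its file; the Violation type, the violations_in_force function, and the sample rule names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

# Levels in the order they are phased in.
LEVELS = ["critical", "important", "recommended"]


@dataclass(frozen=True)
class Violation:
    rule: str
    level: str            # one of LEVELS
    file_modified: date   # last modification date of the offending file


def violations_in_force(
    violations: list[Violation], active_level: str, cutoff: date
) -> list[Violation]:
    """Keep only violations the team is currently required to fix:
    levels phased in so far, in files created or modified after the cutoff."""
    enforced = set(LEVELS[: LEVELS.index(active_level) + 1])
    return [
        v for v in violations
        if v.level in enforced and v.file_modified > cutoff
    ]
```

Raising active_level from "critical" to "important" to "recommended" phases in the full practice without changing the underlying rule set.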
AEP Methodology
Principle 5: Use statistics to stabilize each process,
then make it capable
AEP will not deliver the maximum benefits until
its processes are stable and capable. A stable process
is predictable; its variation is under control (i.e.,
when representative variables are plotted on a control
chart, they fall between the upper control limit
and the lower control limit, which are based on
quality controls such as Six Sigma). For a process
to be considered capable, it must be stable and
the average of its plotted variables must fall within
the specification limits, which vary for each process.
A process that is under control, but does not meet
its target levels, would be considered stable but
not capable.
AEP processes can and should be treated as statistical
processes. To facilitate this, Parasoft has defined
variables to represent each AEP process and defined
the target average value used to assess each process's
capability. Each group implementing AEP needs to
monitor these predefined variables, plot them in
a control graph, then use statistical control limits
and target average values to statistically analyze
the related processes. To make a process stable,
the group must identify and remove the "special
causes" that are responsible for the significant
variance in the results. If the process is stable
but not yet capable, the group must modify and improve
the process so that the related variables are consistently
at a more desirable level.
For example, the Confidence Factor is one variable
that measures the group's adherence to all error
prevention practices. When the Confidence Factor
variable values are plotted in a control graph,
the graph illustrates the typical day-to-day behavior
of the development group. What you want to achieve
is the long-term stability of this behavior, so
that it is moving across time in a small range near
the top of the scale. When the Confidence Factor
is in this small range it indicates that the code
is not being broken through feature additions, test
cases are succeeding, and so on. When this is the
case, the application can be released because everything
is working as it should, and errors are not being
found.
The process represented in the following graphic
is statistically stable, since it remains well within
the control limits. However, the average Confidence
Factor is not terribly high; although the process
is stable, it is not yet capable.
Figure 4: A control graph for the
Confidence Factor Variable
To make this process capable, the group must determine
what process modifications would raise their Confidence
Factor measurements, modify the process, then continue
plotting and analyzing this variable. Once this
variable is both stable and at an acceptable level
(for example, an average of 80), the process is
capable.
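To make the stable versus capable distinction concrete, the following Python sketch assumes control limits set at three standard deviations around the mean of a baseline period. That is a common control-chart convention, not something this paper specifies. It treats the example average of 80 as the capability target. The function names and the sample Confidence Factor readings are hypothetical.

```python
from statistics import mean, pstdev


def control_limits(baseline: list[float]) -> tuple[float, float]:
    """Lower and upper control limits: baseline mean -/+ 3 standard
    deviations (an assumed convention)."""
    center = mean(baseline)
    spread = 3 * pstdev(baseline)
    return center - spread, center + spread


def is_stable(values: list[float], lower: float, upper: float) -> bool:
    """Stable: every plotted value falls between the control limits."""
    return all(lower <= v <= upper for v in values)


def is_capable(
    values: list[float], lower: float, upper: float, target: float
) -> bool:
    """Capable: the process is stable and its average meets the target."""
    return is_stable(values, lower, upper) and mean(values) >= target


# Hypothetical daily Confidence Factor readings.
baseline = [60, 62, 58, 61, 59]
recent = [61, 59, 60, 62]
lower, upper = control_limits(baseline)
print("stable:", is_stable(recent, lower, upper))
print("capable at target 80:", is_capable(recent, lower, upper, 80))
```

With readings like these, the process stays inside its limits but averages far below 80, which is the "stable but not yet capable" situation shown in Figure 4.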
How Automated is AEP?
I estimate that approximately 50% of errors'
root causes can be prevented automatically
by applying "pre-canned" general best practices
that leverage the experience of industry experts.
These best practices are the product of software
industry experts examining the most common
errors for different languages, and then developing
best practices designed to prevent these common
errors. They represent a wealth of knowledge
that was developed through many prior rounds
through the five-step AEP Concept process.
By leveraging these existing practices, you
can start preventing many common and serious
errors before you ever perform the five-step
process yourself. Essentially, by adopting
these "out-of-the-box" practices, you can
instantly progress from following the few
best practices that your organization has
developed over the years to following a comprehensive
set of best practices that have been developed,
tested, and perfected by industry experts
who have analyzed vast amounts of code, and
errors, worldwide. You inherit the benefits
of tremendous experience without having to
perform any of the groundwork required to
acquire that experience.
Whether or not a practice can be automated
very much depends on its degree of generality.
Generally, industry best practices are developed
in an attempt to devise general rules that
apply to a wide variety of possible code usages.
However, the most general best practices are
typically the most difficult to automate.
Often, a best practice's generality needs
to be restricted before it can be automated.
Not every practice can be automated; however,
in many situations, you can use automation
to streamline processes that require human
involvement. Automation cannot relieve you
from having to manually review code to determine
whether it satisfies very subjective, situation-based
criteria, but it can automatically identify
all code that needs to be reviewed.
For example, assume that you have classes
that are designed to perform transactions.
The transactions inside these classes always
have to be properly rolled back if the transaction
encounters problems. However, your team knows
from prior experience that transaction rollbacks
typically have a lot of problems. There are
some general rules you can apply to make rollbacks
less error prone (for example, "Do not throw
exceptions unless you have the class in the
initial state"). This rule is so general that
it means different things to different pieces
of code, so it's virtually impossible to automatically
verify whether the code follows this rule.
Consequently, you need to manually review
all exceptions in transaction-related classes.
If you develop an automated way to identify
all transaction-related classes that throw
exceptions, you are relieved from having to
manually search through your entire code base,
and have more time and energy to dedicate
to performing the subjective review that cannot
be automated.
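A minimal sketch of such an automated search, written in Python for illustration: it assumes transaction-related classes can be recognized by a naming convention (here, "Transaction" in the class name) and uses Python's ast module to list the lines where those classes raise exceptions. The naming convention and the function name are assumptions; a real code base would need its own way of recognizing these classes.

```python
import ast


def transaction_classes_that_raise(
    source: str, marker: str = "Transaction"
) -> dict[str, list[int]]:
    """Map each class whose name contains `marker` to the line numbers of
    the raise statements inside it. Classes with no raise are omitted."""
    tree = ast.parse(source)
    flagged: dict[str, list[int]] = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and marker in node.name:
            lines = sorted(
                n.lineno for n in ast.walk(node) if isinstance(n, ast.Raise)
            )
            if lines:
                flagged[node.name] = lines
    return flagged
```

The result is only a review list: a person still has to judge whether each flagged exception leaves the class in its initial state.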
Moreover, some errors will inevitably slip
through the "pre-canned" automated
practices because every development process
and project has its own unique challenges.
The AEP Methodology expects these unique errors
and provides a mechanism for customizing the
practices to prevent these errors automatically.
I recommend that once you start fully implementing
AEP, you take the last version of your software,
classify all errors that you found for that
version, identify their root causes, then
abstract general mechanisms that will prevent
that same type of error from occurring. Identifying
and introducing the process modifications
will require some time and energy upfront,
but the automated error prevention that results
should make it well worth the effort.
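One simple way to start the classification exercise is to tally the errors from the last version by root cause, so that the most frequent causes are addressed first. The sketch below is in Python; the bug identifiers and root cause labels are hypothetical sample data.

```python
from collections import Counter


def rank_root_causes(errors: list[tuple[str, str]]) -> list[tuple[str, int]]:
    """Given (error id, root cause) pairs, return root causes ordered from
    most to least frequent, each with its count."""
    return Counter(cause for _, cause in errors).most_common()


# Hypothetical errors classified after the last release.
last_release = [
    ("BUG-1", "resource leak"),
    ("BUG-2", "missing null check"),
    ("BUG-3", "resource leak"),
]
print(rank_root_causes(last_release))
```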
Human Concerns Related to
AEP
One goal of AEP is to automate error prevention
as much as possible. However, in the end, there's
no value to automation if team members do not take
the appropriate action based on the data yielded
from that automation. Ironically, while AEP strives
to remove the dependence on human factors as much
as possible, in the end, it is humans that will
make or break AEP. The prime reasons why AEP may
not deliver the desired results are:
- Insufficient management support: The manager
does not frequently monitor the process and strictly
require that team members follow the prescribed
practices and correct all problems exposed by
the practices.
- Insufficient architect and QA support: The
architect and QA are not dedicated to ensuring
that each time an error is detected, the process
is modified to prevent all similar errors. If
the architect and QA are not committed to this
process improvement, the team will not be able
to improve the way in which they produce software.
- Lack of buy-in to the importance of error prevention:
The entire team does not recognize the value in
identifying and correcting code that does not yet produce
a runtime error, but which is error-prone. Developers
often argue that if a reported code problem does
not result in a runtime error, the code does not
need to be modified. They don't understand that
in order to prevent errors, you need to write
code in a way that reduces the possibility for
errors, not just rid the code of full-fledged errors.
Management needs to educate the team on why error-prone
code needs to be taken just as seriously as error-causing
code.
- An overwhelming introduction to AEP: All AEP
practices are suddenly required for all code,
which typically causes the team to become overwhelmed,
reject the findings as "noise," and simply ignore
the results altogether. A better strategy is to
introduce practices one by one, and introduce
each practice in phases. One particularly helpful
strategy is to divide each practice into several
levels, then introduce each level incrementally.
Another useful strategy is to apply the practice
only to files that were created or modified after
a predetermined "cutoff date," or, even better, to
apply the practice to only the specific lines
of code that were created or modified after the
cutoff date.
Redefining Roles
One key requirement for success is that the team's
architect and QA team members are willing and able
to extend their role as defined by AEP.
QA team members need to adopt a new approach to
errors. Each time an error is found, they need to
consider what other errors might be related to that
error, and how to abstract a general error prevention
mechanism from that error. The most talented QA
team members have a sharp intuition about where
errors are coming from, and they write test cases
to check whether the anticipated errors are actually
present. They strive to break the system, not find
positive reassurance that it is working. Every time
they find an error, QA team members need to try
to understand its root cause. This requires that
they understand the system's overall architecture
and have an intimate understanding of the part of
the architecture related to that error. Moreover,
to help the team improve the process and monitor
attempted process improvements, QA team members
must understand the statistics behind measuring,
stabilizing, and managing processes, as well as
how to identify whether a special cause or variation
is affecting a process.
Under AEP, the architect not only needs to be able
to design a system, but also to ensure that this
system is built properly and works properly. He
or she needs to have excellent vision, many years of
experience writing code, a keen sense of where problems
can occur in code, the ability to determine algorithmic
solutions for problems, and the ability to abstract
from technical details to a higher-level understanding
of code operations.
Both QA and the architect need the talent and drive
to determine how the team can modify the way the
team designs, writes, deploys, and monitors code
in a way that will prevent errors from recurring;
they need to adopt the mantra "I can find each type
of error only once." To do this, they need to be
very skilled at abstracting, at looking at a simple
error, determining its root cause, and determining
how this simple error can provide great insight
into ways to improve the team's process. The team's
ability to improve its process is directly related
to the QA team members' and architect's ability
to abstract.
Establishing Group
Culture
For AEP to become an enduring part of the team,
it must be supported by a strong group culture.
Group culture is a development team's way of working
together, including their shared habits, traditions,
and beliefs. A positive group culture should promote
code ownership, group cooperation, peer learning,
common working hours, and mutual respect. When managers
and leaders focus on developing and supporting a
positive group culture, the team is typically more
self-regulating, creative, effective, and satisfied.
The most important element of such a group culture
is code ownership. Code is the group's greatest asset
because it is the main thing that they have to show
for all of their work. It also serves as a means of
communication: developers exchange the majority
of their ideas by reading and writing code. Just
as mathematicians communicate their ideas most precisely
with equations, developers communicate their ideas
most precisely with code. Thus, by protecting the
quality of their code, developers can preserve their
best ideas in the clearest, most concise way possible,
as well as ensure that their communications are
as effective as possible.
Because code is such an important expression and
product of the group, caring about the quality and
success of this code is fundamental to group culture.
It is the glue that holds a group together. You
want to build a culture where the developers' attitude
towards the code reflects the code's importance.
Developers should show that they care about the
code because caring about the code is synonymous
with caring about the group. If a developer cares
about the code, he will care about the group, and
if he cares about the group, he will care about
the code. It's fundamental that everyone feels that
they have a stake in maintaining high-quality code.
This prevents group members from doing anything
that harms code quality: if they care about the
code, they won't hack at it, shortchange it, wire
it, etc.; they will always try to ensure that it
is solid and working. This, in turn, helps filter
"bad apples" out of your development group. In an
environment where group members feel a strong investment
in code quality, any developer who does not care
about the code will alienate himself from the group.
Most group developers will soon become frustrated
with someone who constantly introduces problems
into the code and try to help this developer improve.
If he does improve, the group is stronger; if he
does not, group conflict will generally lead him
to leave the group.
Final Thoughts
The software industry must mature. There can be
no maturation without effective error prevention
and process improvement. We have no other options; we
must do it. If we don't, our industry will pay dearly
for it, much as the U.S. automotive industry did
in the 1960s and '70s, when it lost nearly half
of its market to foreign competitors that knew how
to prevent errors and make better quality products.
Offshore outsourcing is already starting to do this
to us. How long will we wait before we make the
necessary changes?
AEP, with its comprehensive collection of "ready-to-use"
error prevention practices and its unique embrace
of automation, offers the software industry a prime
opportunity to start making the necessary changes
in a way that is as painless as possible. By providing
a feasible way to implement error prevention in
the software industry, AEP helps software development
organizations eliminate the errors that lie at the
root of virtually every concern they are currently
struggling with, from high development and labor
costs, to quality issues, to security weaknesses,
to the need for more (or more rapid) strategic improvements.
Moreover, when an organization adopts the Parasoft
AEP Methodology, it will also enjoy some pleasant
side effects, including 1) the ability to capture
and continuously reuse organizational knowledge,
2) improved process visibility and control, and
3) the ability to establish a quality "baseline"
that can be improved upon to ensure that the organization
is always moving forward and using past experience
to improve its future.
Learning More
About AEP
Additional Parasoft AEP documents provide detailed
instructions for following the Parasoft AEP Methodology
principles and explain how the Parasoft AEP Methodology
supports industry quality control initiatives such
as SEI-CMM, SEI-CMMI, and ISO 9000-3.
About Parasoft
We make software work.
Parasoft provides Automated Error Prevention solutions
that combine advanced products, services and expertise
to help companies automatically prevent errors throughout
the software lifecycle, to improve software quality
and reliability. Based on Parasoft AEP Methodology,
the company's solutions and products automate practices
such as coding standards, static analysis, unit
testing, regression testing, load & stress testing,
functional testing, integration testing, application
testing and monitoring. The solutions enable software
development and IT organizations to significantly
reduce costs by shortening production cycles, improving
overall quality and reducing time-to-market. Parasoft
has been granted nine patents and numerous awards
for the technology behind its innovative line of
solutions and products. Founded in 1987, Parasoft
is a privately held company whose clients include
IBM, HP, Daimler Chrysler and over 10,000 companies
worldwide. Parasoft is headquartered in Monrovia,
CA. Telephone (888) 305-0041. Fax (626) 305-3036.
Email: info@parasoft.com.
URL: http://www.parasoft.com
Notes
i "The Economic Impacts of Inadequate
Infrastructure for Software Testing." Washington
D.C., National Institute of Standards and Technology
(NIST), 2002. This report is available online
at http://www.nist.gov. The astronomical
sum of $60 billion suggests the exponential increase
in software functionality in the last several
decades has been devoid of true quality innovations.
The study also notes that roughly 64% of this
enormous cost is absorbed directly by the consumer,
yet despite this tremendous burden there is no
other consumer goods market where poor products
enjoy such a vast popularity.
This economic toll is not immediately obvious
to the consumer. Confronted with endless application
variety, and stimulated by the constant growth
of features and performance, consumers see the
software industry as dynamic and inventive. The
industry actively propagates this view. Constant
innovation churns out "cutting edge" applications,
rendering recent products obsolete, and fueling
constant demand for newer and better programs.
Consumers buy into this view wholeheartedly, despite
the fact that it costs them dearly.
Indeed, this nasty cycle of innovation and obsolescence
hides the fact that most of the software available
today, from games to word processing to data and
industrial controls, is simply bad. Cries for
better software development techniques are made
when the havoc wreaked by bad software takes its toll
on property and lives. But mostly, even when they
make the news, bugs are shrugged off as the necessary
price of innovation.