"Quality metrics let you know when to laugh and when to cry", Tom Gilb
"If you can't measure it, you can't manage it", Deming
"Count what is countable, measure what is measurable. What is not measurable, make measurable", Galileo
"Not everything that can be counted counts, and not everything that counts can be counted.", Albert Einstein
SUBJECTIVENESS OF QUALITY
"When a measure becomes a target, it ceases to be a good measure"
GOODHART'S LAW
E.g. Increase number of test cases, detected bugs, penalize false bug reports...
Fingerpointing, Create Reports, Evaluate People...
BECAUSE OF THE TARGET OF MEASURING
Goodhart's Law problems occur when metrics are used for:
Measuring something should not imply that the target is always to improve that metric
Collect (Data)
Calculate (Metrics)
Evaluate (Metrics)
Measures
Metrics
Indicator
Examples: 120 detected bugs, 12 months of project duration, 10 engineers working on the project, 100.000 Lines of Code (LOC)
Examples: 2 bugs found per engineer-month, 1.2 bugs per KLOC
Examples: bugs found per engineer-month might be an indicator of the test process efficiency, bugs/KLOC an indicator of code quality, etc..
Measure: Quantitative indication of the exact amount (dimension, capacity, etc.) of some attributes
Measurement: The act of determining a measure
Metric: a quantitative measure of the degree to which a system, component, or process possesses a given attribute
Indicator: a metric that provides insights into the product, process, project
Process Project Product
Process Metrics: Related to the Software Development Life Cycle (i.e. the process through which we develop the software). E.g. if I measure the defects detected every month, I can get the metric "Defect Arrival Pattern" and obtain an indication of how good my process is at removing defects. E.g. if the number of defects detected increases over time, I might need to increase testing, review the testing approach, add code reviews, etc...
Project Metrics: Related to the team that develops the product, usually focused on efficiency, productivity, etc. Errors found per engineer-month is an example of a project metric that provides an indication of the efficiency of my engineers at detecting defects.
Product Metrics: Related to the finished product itself. They measure aspects such as product size, complexity, quality level... E.g. in the case of quality, a possible metric is the Defect Density (number of defects / size), which provides an indication of the quality of the product (the higher, the worse).
120 defects detected during 6 months by 2 engineers
Defects detected every month: 10, 10, 20, 20, 25, 35
Defects remaining in the final product: 40
Size of the Product: 40.000 Lines of Code
Process Metric: Defect Arrival Pattern 10, +0, +10, +0, +5, +10 -> Indicator of Maturity
Project Metric: 40 KLOC / 2 engineers / 6 months = 3.33 KLOC per eng-month -> Indicator of Productivity
Product Metric: 40 defects / 40 KLOC = 1 defect / KLOC -> Indicator of Quality
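A minimal sketch (Python, using the numbers above) of how the three indicators are derived from the raw measures:

```python
# Raw measures from the example above
defects_per_month = [10, 10, 20, 20, 25, 35]   # defects detected each month
remaining_defects = 40                          # defects left in the final product
size_kloc = 40                                  # 40.000 LOC = 40 KLOC
engineers = 2
months = 6

# Process metric: defect arrival pattern (month-to-month increments)
arrival_pattern = [defects_per_month[0]] + [
    b - a for a, b in zip(defects_per_month, defects_per_month[1:])
]
print("Defect arrival pattern:", arrival_pattern)          # [10, 0, 10, 0, 5, 10]

# Project metric: productivity in KLOC per engineer-month
productivity = size_kloc / engineers / months
print("Productivity:", round(productivity, 2), "KLOC/eng-month")   # 3.33

# Product metric: defect density of the final product
defect_density = remaining_defects / size_kloc
print("Defect density:", defect_density, "defects/KLOC")   # 1.0
```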
Measurements
Metrics and Indicators
Not all the metrics above are related to quality; let's focus on the ones that are
End Product
In Process
Intrinsic
Customer
Quality Metrics
Metrics that provide indications of the Quality of the Product and Process
Metrics that provide indications of the Quality of the Process. Useful to take actions to improve the process the team is following for building products.
Metrics that provide indications of the Quality of the end product. They don't take into account how the product has been developed; they just look at a snapshot.
Not useful to improve the current product, but useful to understand its quality
Don't take the customer into account: the product and just the product.
Focused on customer perception of the product.
Prevent Bugs
Detect Bugs and fix them
Implement techniques to contain them
Measure and Calculate them
AND
Anticipate end-product metrics
POINT OF NO RETURN
Controlled roll-out
A/B Testing
Beta programs
are intended to mitigate this "leap of faith"
QA Activities
Process metrics
End Product
In Process
Intrinsic
Customer
Quality Metrics
RELIABILITY: How often your product has a failure, i.e.
Probability of not having a failure during a specific amount of time
R(n) - Probability of not failing during n time
where n is the number of time units (days, hours...)
F(n) = 1 - R(n) - Probability of failing during n time
For instance, if the time is measured in days, R(1) is the probability of the product not having a failure during 1 day
Metrics closely related to Reliability
Mean Time To Failure (MTTF) is the average time that elapses between two failures of the system
Error Rate is the average number of failures suffered by the system during a given amount of time
Relationship between them depends on the statistical distribution of the errors, not only on the error rate.
Example: The number of failures of one product during the days of one week is 10, 20, 30, 40, 50, 60, 70. A second product fails in the opposite way: 70, 60, 50, 40, 30, 20, 10.
The total number of errors is 280, so the Error Rate = 280/7 = 40 errors/day for both systems, but the reliability during the first day is very different, right?
If the system failures follow an exponential distribution:
Constant Error Rate, in failures per unit of measurement, (e.g., failures per hour, per cycle, etc.)
Probability Density Function: f(t) = λ·e^(−λt)
Cumulative Distribution Function: F(t) = 1 − e^(−λt), hence R(t) = e^(−λt) and MTTF = 1/λ
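A small sketch (Python) of the exponential model; the error rate value is invented for illustration:

```python
import math

lam = 0.1          # assumed constant error rate: 0.1 failures per day (illustrative)

def reliability(n, lam):
    """R(n): probability of surviving n time units without a failure (exponential model)."""
    return math.exp(-lam * n)

def failure_prob(n, lam):
    """F(n) = 1 - R(n): probability of failing within n time units."""
    return 1 - reliability(n, lam)

mttf = 1 / lam     # mean time to failure under the exponential model

print(round(reliability(1, lam), 3))   # R(1) ≈ 0.905
print(round(failure_prob(7, lam), 3))  # F(7) ≈ 0.503
print(mttf)                            # 10.0 days
```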
Think about Hardware...
Do you think hardware has a constant error rate, or do you think it changes over time? If it changes over time, how?
The Weibull distribution is versatile and flexible enough to deal with this kind of failure rate (e.g. the "bathtub curve" typical of hardware)
[Weibull distribution plots with fixed eta]
To not fail, the system requires all three modules not to fail (modules in series)
To not fail, the system requires that the three modules do not fail at the same time (modules in parallel, i.e. redundancy)
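A minimal sketch (Python) of the standard formulas for the two cases, assuming independent modules; the reliability values are invented:

```python
# Assumed (illustrative) reliabilities of the three modules over the same time period
r1, r2, r3 = 0.95, 0.90, 0.99

# Modules in series: the system fails if ANY module fails, so all three must survive
r_series = r1 * r2 * r3                              # ≈ 0.846

# Modules in parallel (redundancy): the system fails only if ALL modules fail together
r_parallel = 1 - (1 - r1) * (1 - r2) * (1 - r3)      # ≈ 0.99995

print(r_series, r_parallel)
```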
Different types of densities can be calculated depending on the type of defects that we count into the numerator, for instance:
... in any case, the only difference is the name and the meaning of the metric, the way to calculate it is the same
The easiest way to measure the size of a piece of software is counting the Lines of Code (LOC). The more LOC, the bigger the software and the more opportunities to inject defects.
With the same number of defects, the bigger the software, the smaller the density.
1.000 LOC = 1 KLOC
Product A - v1.0
30 KLOC
During the development of v1.0 we reported 1000 bugs: 900 Critical, 70 Major, 30 Minor. At release time, 30 known defects remained unfixed. During the first month of usage, users reported 20 new bugs.
PLEASE CALCULATE THE DIFFERENT DEFECT DENSITIES YOU CAN IMAGINE
FOR REPORTED BUGS:
DD = 1000 / 30 = 33.33 Defects reported / KLOC
Per type:
900 / 30 = 30 Critical Defects Reported / KLOC
70 / 30 = 2.3 Major Defects Reported / KLOC
30 / 30 = 1 Minor Defects Reported / KLOC
FOR BUGS IN THE FINAL PRODUCT (AT RELEASE TIME):
DD = 30 / 30 = 1 Defect / KLOC
FOR ALL THE BUGS WE KNOW AFTER 1 MONTH OF USAGE:
DD = (30+20) / 30 = 1.67 Defects / KLOC
Product A - v1.1
30 KLOC from v1.0 + 5 KLOC of new code (35 KLOC in total)
Timeline: v1.0 released -> 1 month of usage -> development of v1.1 -> v1.1 released -> 1 month of usage
- The team fixed all the bugs remaining in v1.0 (30) and those reported by users (20)
- The team added new functionality (5 KLOC = 5000 new LOC)
- During the development of v1.1 we reported 200 bugs: 180 Critical, 10 Major, 10 Minor
- We fixed all the bugs except the Minor ones
- During the first month of usage of v1.1, users reported 10 new bugs
PLEASE CALCULATE THE DIFFERENT DEFECT DENSITIES YOU CAN IMAGINE
FOR REPORTED BUGS (v1.1, counting all 35 KLOC):
DD = 200 / 35 = 5.71 Defects reported / KLOC
Per type:
180 / 35 = 5.14 Critical Defects Reported / KLOC
10 / 35 = 0.29 Major Defects Reported / KLOC
10 / 35 = 0.29 Minor Defects Reported / KLOC
FOR BUGS IN THE FINAL PRODUCT (AT RELEASE TIME):
DD = 10 / 35 = 0.29 Defects / KLOC
FOR ALL THE BUGS WE KNOW AFTER 1 MONTH OF USAGE:
DD = (10+10) / 35 = 0.57 Defects / KLOC
Which number would you choose as the Defect Density of v1.0 and v1.1?
An alternative for v1.1: compute the densities over only the new code (5 KLOC)
FOR REPORTED BUGS:
DD = 200 / 5 = 40 Defects reported / KLOC
Per type:
180 / 5 = 36 Critical Defects Reported / KLOC
10 / 5 = 2 Major Defects Reported / KLOC
10 / 5 = 2 Minor Defects Reported / KLOC
FOR BUGS IN THE FINAL PRODUCT (AT RELEASE TIME):
DD = 10 / 5 = 2 Defects / KLOC
FOR ALL THE BUGS WE KNOW AFTER 1 MONTH OF USAGE:
DD = (10+10) / 5 = 4 Defects / KLOC
ALL LINES (35 KLOC) vs. NEW LINES (5 KLOC)
YOU NEED TO UNDERSTAND WHAT YOU ARE MEASURING, OTHERWISE, IT'S JUST A NUMBER OF NO HELP TO YOU
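A short sketch (Python) with the v1.1 numbers above, showing how the choice of denominator changes the reported density:

```python
reported_bugs_v11 = 200     # bugs reported during the development of v1.1
all_kloc = 35               # 30 KLOC inherited from v1.0 + 5 KLOC of new code
new_kloc = 5                # only the code written for v1.1

dd_all_lines = reported_bugs_v11 / all_kloc
dd_new_lines = reported_bugs_v11 / new_kloc

print(round(dd_all_lines, 2))   # ≈ 5.71 defects reported / KLOC
print(round(dd_new_lines, 2))   # 40.0 defects reported / KLOC
```

Same product, same defects: the number only means something if you state which lines you counted.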
When the only programming language was assembler, this could be easy:
Physical Line = Instruction = LOC
But now there are multiple programming languages with different approaches that include comments, control sentences, and logical lines that occupy multiple screen lines
... and as programming languages are very flexible, the same programming language could be used in different styles
Is it a good measurement of how big the code is? Maybe we should not talk about the size of the code in terms of length but in terms of how many opportunities for injecting defects there are. (OFE)
for (i=0; i<100; ++i) printf("I love compact coding"); /* what is the number of lines of code in this case? */
/* How many lines of code is this? */
for (i=0; i<100; ++i)
{
printf("I am the most productive developer");·
}
/* end of for */
1 LOC if I count lines limited by line breaks
6 LOC if I count lines limited by line breaks, 4 if I don't count comments...
Conclusion: If I want to measure how many opportunities to inject defects are there in the software, using Lines of Code is not the best approach
The opportunities for injecting defects are about the same in both examples, but the LOC count is very different
Remember Goodhart's Law: if a metric becomes a target, it is not useful anymore
If I want to improve productivity and, to achieve that, my target is increasing the number of LOC per developer-month, it could have a very negative impact on the project (devs can start gaming the metric, e.g. by adding unneeded comments)
Please have a look at:
External Inputs (EI): Processes in which data comes in from the outside (e.g. from an input screen, from another app...)
External Outputs (EO): Processes in which data goes from the inside to the outside. The data must be derived data (i.e. data that was not stored directly but has been calculated from other data)
External Inquiries (EQ): Similar to EO, but the data must not be derived data.
Internal Logical Files (ILF): A group of logically related data that resides inside the application (and hence is maintained through EIs).
External Interface Files (EIF): A group of logically related data that resides outside the app and is maintained by another app.
Need to count the amount of:
1 - Count Number of EIs, EOs, EQs, ILF, EIF
2 - Divide every bucket you have counted into 3 groups depending on its complexity (Low, Mid, High)

| | High | Mid | Low |
|---|---|---|---|
| #EIs | X1 | Y1 | Z1 |
| #EOs | X2 | Y2 | Z2 |
| #EQs | X3 | Y3 | Z3 |
| #ILFs | X4 | Y4 | Z4 |
| #EIFs | X5 | Y5 | Z5 |

3 - Multiply the count of every bucket by the predefined factor (based on a table) and add everything up to calculate the Function Count:

X1*Factor-EI-High + Y1*Factor-EI-Mid + Z1*Factor-EI-Low + X2*Factor-EO-High + ... + X5*Factor-EIF-High + Y5*Factor-EIF-Mid + Z5*Factor-EIF-Low = FUNCTION COUNT
4 - Multiply the Function Count by the adjustment factor that depends on the type of software
FUNCTION COUNT * VALUE ADJUSTMENT FACTOR = FUNCTION POINTS
In order to fill in the table we need to:
- Know how to count the components
- Know how to classify depending on complexity
- Know how to calculate the Value Adjustment Factor
Know how to classify depending on complexity
FTR (File Types Referenced): Number of files updated or referenced
DET (Data Element Type): Number of unique fields in the files
RET (Record Element Type): A user-recognizable subgroup of data elements (e.g. a table in a DB); DET: a unique, user-recognizable, non-recursive field
Know how to calculate the Value Adjustment Factor
Based on a list of 14 characteristics, evaluate the degree of influence of each one on the software from 0 to 5 (ci). After that, apply the following formula:
VAF = 0.65 + 0.01 × Σ ci (sum over the 14 characteristics, so VAF ranges from 0.65 to 1.35)
Characteristics: the 14 general system characteristics (data communications, distributed processing, performance, transaction rate, end-user efficiency, etc.)
- Count the components
- Classify depending on complexity
- Calculate the Value Adjustment Factor
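A sketch (Python) of the whole calculation. The component counts and ci values are invented, and the weights are the commonly published IFPUG-style factors; treat them as an assumption, since the course may use a different table:

```python
# Commonly published IFPUG-style weights per component and complexity
# (Low, Mid, High) -- an assumption, not necessarily the course's official table.
WEIGHTS = {
    "EI":  (3, 4, 6),
    "EO":  (4, 5, 7),
    "EQ":  (3, 4, 6),
    "ILF": (7, 10, 15),
    "EIF": (5, 7, 10),
}

# Invented counts: (low, mid, high) occurrences of each component type
counts = {
    "EI":  (2, 3, 1),
    "EO":  (1, 2, 0),
    "EQ":  (3, 0, 0),
    "ILF": (1, 1, 0),
    "EIF": (0, 1, 0),
}

# Step 3: multiply every bucket by its weight and add everything up
function_count = sum(
    n * w
    for kind, buckets in counts.items()
    for n, w in zip(buckets, WEIGHTS[kind])
)

# Step 4: apply the Value Adjustment Factor (VAF = 0.65 + 0.01 * sum of the 14 ci)
ci = [3] * 14                      # invented degrees of influence, each between 0 and 5
vaf = 0.65 + 0.01 * sum(ci)        # here: 0.65 + 0.42 = 1.07
function_points = function_count * vaf

print(function_count, round(vaf, 2), round(function_points, 2))
```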
After developing an assembly version of one piece of software, we want to move to a higher-level programming language (Java). Please check the impact on the project figures:

| | Assembly Version | Java Version | Comment |
|---|---|---|---|
| LOC | 1000 | 200 | Java is higher level and hence requires fewer LOC |
| FP | 5 | 5 | As the SW has the same features, the FP are equal |
| Coding Effort | 2 months | 1.25 months | Using a higher-level language helps to reduce time |
| Cost per month | $5,000 | $5,000 | The cost of the team per month is exactly the same |
| Cost | $10,000 | $6,250 | As the team spends less time, the cost is smaller |
| LOC per month | 500 | 160 | However, the number of LOC per month decreases |
| FP per month | 2.5 | 4 | While the number of FP per month grows |
| $ per LOC | $10 per LOC | $31.25 per LOC | The cost per LOC is higher... BUT that's not what matters |
| $ per FP | $2,000 per FP | $1,250 per FP | What matters is the cost per functionality, which drops a lot |
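The derived rows of the table can be reproduced with a few lines (Python), using the base figures above:

```python
versions = {
    "Assembly": {"loc": 1000, "fp": 5, "months": 2.0,  "cost_per_month": 5000},
    "Java":     {"loc": 200,  "fp": 5, "months": 1.25, "cost_per_month": 5000},
}

for name, v in versions.items():
    cost = v["months"] * v["cost_per_month"]
    print(
        name,
        "LOC/month:", v["loc"] / v["months"],
        "FP/month:", v["fp"] / v["months"],
        "$/LOC:", cost / v["loc"],
        "$/FP:", cost / v["fp"],
    )
# Assembly: 500 LOC/month, 2.5 FP/month, $10/LOC,    $2000/FP
# Java:     160 LOC/month, 4.0 FP/month, $31.25/LOC, $1250/FP
```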
End Product
In Process
Intrinsic
Customer
Quality Metrics
Problems
Satisfaction
Actionable
1.000 users using a product during 1 year = 12.000 user-months
(The software has been used for 12.000 user-months in total)
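Problems per User-Month (PUM) is typically computed as the customer-reported problems divided by the total user-months; a tiny sketch (Python) with the numbers above and an invented problem count:

```python
users = 1000
months_of_usage = 12
user_months = users * months_of_usage      # 12.000 user-months

problems_reported = 240                    # invented count of customer-reported problems
pum = problems_reported / user_months
print(pum)                                 # 0.02 problems per user-month
```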
DIFFERENCES BETWEEN PROBLEMS PER USER-MONTH (PUM) AND DEFECT DENSITY?
ASK END-USERS DIRECTLY:
CSAT (CUSTOMER SATISFACTION)
NPS (NET PROMOTER SCORE)
CES (CUSTOMER EFFORT SCORE)
...
GUIDELINES
- AVOID ANY ASSUMPTIONS
- DON'T ASK ABOUT HYPOTHETICAL SITUATIONS
- USE CLEAR AND COMPREHENSIBLE LANGUAGE
- ASK ONLY NECESSARY QUESTIONS
- IF YOU NEED DETAILED INFO USE "HOW" QUESTIONS
Measure of how products and services supplied by a company meet or surpass customer expectations
SCALE:
1 - Strongly Disagree
2 - Disagree
3 - Somewhat Disagree
4 - Neither Agree nor Disagree
5 - Somewhat Agree
6 - Agree
7 - Strongly Agree
Measures the level of difficulty customers experience when using a product
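A sketch (Python) of one common way to compute these scores from raw survey answers; the response lists are invented and the exact conventions (e.g. which answers count as "satisfied") vary between organisations:

```python
# Invented survey responses
nps_answers = [10, 9, 9, 8, 7, 6, 10, 3, 9, 5]   # "How likely are you to recommend us?" 0..10
csat_answers = [7, 6, 6, 5, 7, 4, 6]             # agreement scale 1..7 (as in the slide above)
ces_answers = [2, 3, 1, 2, 4, 2]                 # effort scale 1..7 (lower = less effort)

# NPS: % promoters (9-10) minus % detractors (0-6)
promoters = sum(1 for a in nps_answers if a >= 9)
detractors = sum(1 for a in nps_answers if a <= 6)
nps = 100 * (promoters - detractors) / len(nps_answers)

# CSAT: here, the share of "satisfied" answers (top two points of the scale)
csat = 100 * sum(1 for a in csat_answers if a >= 6) / len(csat_answers)

# CES: here, simply the average reported effort
ces = sum(ces_answers) / len(ces_answers)

print(round(nps), round(csat), round(ces, 1))    # 20, 71, 2.3
```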
Typical answer: we've made many changes during the last month, our customers seem to like them and our overall numbers are higher this month...
Real project graphs: a start-up running for about 2 years that got significant funding.
Everything looks great, every month there are more customers. Right?
Let's look at the data from a different Point of View...
Don't look at total numbers (to avoid vanity metrics); instead, check the "performance" of each group of customers that comes into contact with the product independently (group = cohort)
After months of work, investment, improvements, new features, the percentage of new customers who subsequently pay money is exactly the same as in the beginning, even though we have more users using our product...
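A minimal sketch (Python) of the cohort idea: group users by the month they arrived and look at each group's conversion separately instead of at the global total. The data is invented:

```python
from collections import defaultdict

# Invented data: (signup_month, converted_to_paying)
users = [
    ("2023-01", True), ("2023-01", False), ("2023-01", False), ("2023-01", False),
    ("2023-02", False), ("2023-02", True), ("2023-02", False), ("2023-02", False),
    ("2023-03", False), ("2023-03", False), ("2023-03", True), ("2023-03", False),
]

cohorts = defaultdict(lambda: {"total": 0, "paying": 0})
for month, paying in users:
    cohorts[month]["total"] += 1
    cohorts[month]["paying"] += paying

for month in sorted(cohorts):
    c = cohorts[month]
    print(month, f"{100 * c['paying'] / c['total']:.0f}% paying")
# The total number of users keeps growing month after month,
# but every cohort converts at the same 25%: the product is not actually improving.
```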
STEP 1. CREATE A FIREBASE ACCOUNT https://firebase.google.com/
STEP 2. ADD A NEW PROJECT
STEP 3. ENABLE HOSTING
STEP 4. FOLLOW THE INSTRUCTIONS TO CONFIGURE THE ENV
STEP 5. DEPLOY THE DEFAULT PAGES

STEP 1. CREATE A GOOGLE ANALYTICS ACCOUNT
STEP 2. ADD A NEW PROPERTY (INCLUDING THE DOMAIN)
STEP 3. GET THE CODE TO BE ADDED TO YOUR WEBSITE
STEP 4. INSERT IT INTO YOUR WEBPAGES
STEP 5. DEPLOY THE NEW VERSION

MONITOR REAL-TIME INFORMATION
GET INFORMATION ABOUT YOUR AUDIENCE: TIMES, VISITED PAGES, FLOWS, ORIGINS...
CREATE GOALS
CREATE EXPERIMENTS (A/B TESTING)
End Product
In Process
Intrinsic
Customer
Quality Metrics
Calculated during the development of the product
Most of them are based on:
a/ the number of defects detected
b/ when they are detected
c/ when they are fixed
Calculate the Defect Density of the non-final product after a development cycle (e.g. iteration, sprint, etc.) or after a testing cycle (e.g. integration testing)
Keep track of the evolution of this metric over the different cycles: E.g. is the metric getting better or worse in every sprint?
Purpose: get an indication of the potential Defect Density of the final product when it's on the field
Why do you think it's important to get that indication?
Myers suggested a counter-intuitive principle: the more defects found during testing, the more defects will be found later. That is, there is a positive correlation between defect rates during development/testing and in the field.
In general, we could try to identify patterns in the evolution of the Defect Density
[Example A and Example B: two different patterns in the evolution of the Defect Density]
But... can you think of any reason for detecting more defects other than the software getting worse?
The defect rate is the same as or lower than in the previous iteration (or in a similar product). But has testing been worse?
YES
NO
The defect rate is higher than in the previous iteration. But did we plan to test more thoroughly?
YES
NO
Although the previous metric is useful, it is not enough to understand the progress of the development team:
1 - Because it's a discrete function: we only measure it after an iteration
2 - Because, like any Defect Density, it tends to improve as the product matures: when fewer features are introduced, fewer defects are injected, and they can be fixed faster (fixed defects are not counted in the Defect Density)
We use the Defect Arrival Pattern to complement the Defect Density.
What do you think the defect arrival pattern is?
DAP: Distribution of the defects detected over the time
There are two ways to represent it: Rate or Cumulative
Rate (defects detected per day):

| Day1 | Day2 | Day3 | Day4 | Day5 | Day6 | Day7 |
|---|---|---|---|---|---|---|
| 2 | 4 | 5 | 5 | 3 | 4 | 4 |

Cumulative (total defects detected so far):

| Day1 | Day2 | Day3 | Day4 | Day5 | Day6 | Day7 |
|---|---|---|---|---|---|---|
| 2 | 6 | 11 | 16 | 19 | 23 | 27 |
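Both representations carry the same information; a few lines (Python) with the numbers above convert one into the other:

```python
from itertools import accumulate

rate = [2, 4, 5, 5, 3, 4, 4]          # defects detected per day (Day1..Day7)

cumulative = list(accumulate(rate))   # running total of detected defects
print(cumulative)                     # [2, 6, 11, 16, 19, 23, 27]

# And back again: daily rate from the cumulative curve
back = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]
print(back)                           # [2, 4, 5, 5, 3, 4, 4]
```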
Imagine this was part of a single iteration. Which one would have the best Defect Density?
Valid/All Defects: We could count all the defects reported or only those that have been confirmed as valid
Defect Severity: Maybe we are just interested in the critical/major defects
So far, we have talked about metrics related to the defects that I have detected.
Not all the defects that are detected can be immediately removed. Remember that a defect is just the visible symptom of something that needs to be fixed, and in order to fix it, several questions must be answered:
- Is it a valid defect?
- What is the root cause behind the defect?
- How complicated is it to fix?
- By fixing it, am I breaking other functionalities?
....
Can you think of some metrics to take this into account?
It describes the distribution of removed defects over time.
It can be done per-unit-of-time or per-iteration.
But this metric in isolation is of limited use. There are some additional ones that complement it:
- Average time to fix a defect
- Average time to detect a defect
Remember that the cost of a defect is higher, the later we find it and the later we fix it
A burn-down chart is a graphical representation of work left to do vs. time
Typical Agile Burn-down chart
Work left: the number of defects remaining, which can go up (new defect found) or down (defect fixed)
In an ideal world, before releasing a product both curves should be at the same point. What can we do to make it happen?
In summary, the classic quality, scope, time, resources dilemma
Defect Removal Efficiency (DRE) measures how good my process is at detecting and removing defects: "How many defects have I removed from the potential list of defects that I could have removed?"
This question could be asked globally or in any phase/iteration.
Trying to be more formal:
- DRE = (Defects Removed / Defects Latent)*100 %
where Defects Latent = Defects Removed + Defects Found Later
An example
For instance, during the development of a product I detected and fixed 80 defects. When the product hit the market, 20 additional defects were discovered.
DRE = Defects Removed / Defects Latent = 80 / (80+20) = 80%
On average, I have removed 80 of every 100 defects that were injected.
Another view
Another example
| Removed in \ Injected in | IT1 | IT2 | IT3 | IT4 | TOTAL REMOVED |
|---|---|---|---|---|---|
| IT1 | 5 | | | | 5 |
| IT2 | 10 | 15 | | | 25 |
| IT3 | 5 | 5 | 10 | | 20 |
| IT4 | 5 | 5 | 0 | 5 | 15 |
| TOTAL INJECTED | 25 | 25 | 10 | 5 | 65 |
During It1 25 defects were injected. 5 of them were fixed during the same iteration, 10 in It2, 5 in It3 and 5 in It4.
During It4 15 defects were removed. 5 of them were introduced in It1, 5 in It2 and 5 in It4.
What is the DRE of all the phases?
DRE-IT1 = 5 / 25 = 20%
DRE-IT2 = 25 / ((25+25) - 5) = 25 / 45 = 55.6%
DRE-IT3 = 20 / ((25+25+10) - 5 - 25) = 20 / 30 = 66.7%
DRE-IT4 = 15 / ((25+25+10+5) - 5 - 25 - 20) = 15 / 15 = 100%
DRE = 65 / 65 = 100%
Another example
| Removed in \ Injected in | IT1 | IT2 | IT3 | IT4 | TOTAL REMOVED |
|---|---|---|---|---|---|
| IT1 | 2 | | | | 2 |
| IT2 | 2 | 3 | | | 5 |
| IT3 | 4 | 2 | 4 | | 10 |
| IT4 | 2 | 1 | 2 | 3 | 8 |
| ON THE FIELD | 2 | 1 | 1 | 1 | 5 |
| TOTAL INJECTED | 12 | 7 | 7 | 4 | 30 |
During It1, 12 defects were injected: 2 of them were fixed in It1, 2 in It2, 4 in It3 and 2 in It4; 2 were only discovered on the field.
During It4, 8 defects were removed: 2 of them were introduced in It1, 1 in It2, 2 in It3 and 3 in It4.
On the field we discovered 5 remaining defects, introduced as follows: 2 in It1 and 1 in each of It2, It3 and It4.
What is the DRE of all the phases?
DRE-IT1 = 2 / 12 = 16.67%
DRE-IT2 = 5 / ((12+7) - 2) = 5 / 17 = 29.41%
DRE-IT3 = 10 / ((12+7+7) - 2 - 5) = 10 / 19 = 52.63%
DRE-IT4 = 8 / ((12+7+7+4) - 2 - 5 - 10) = 8 / 13 = 61.54%
DRE = (2+5+10+8) / (12+7+7+4) = 25 / 30 = 83.33%
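A sketch (Python) that reproduces the calculation above from the injection/removal matrix of this example:

```python
# removed[i][j] = defects removed in phase i that were injected in phase j
# Phases: IT1..IT4; 'field' holds the remaining defects discovered on the field.
removed = [
    [2, 0, 0, 0],   # IT1
    [2, 3, 0, 0],   # IT2
    [4, 2, 4, 0],   # IT3
    [2, 1, 2, 3],   # IT4
]
field = [2, 1, 1, 1]   # escaped defects, per injection phase

injected_per_phase = [
    sum(row[j] for row in removed) + field[j] for j in range(4)
]                                                       # [12, 7, 7, 4]

removed_so_far = 0
for i, row in enumerate(removed):
    injected_so_far = sum(injected_per_phase[: i + 1])
    latent = injected_so_far - removed_so_far           # defects we could have removed
    print(f"DRE-IT{i + 1} = {100 * sum(row) / latent:.2f}%")
    removed_so_far += sum(row)

print(f"DRE = {100 * removed_so_far / sum(injected_per_phase):.2f}%")   # 83.33%
```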
Injecting vs. Removing defects
So far we have been discussing Quality Metrics, but it's also interesting to look at some Software Metrics because, as we studied in Unit 1, internal quality affects external quality:
Internal Software Metrics can affect External Quality Metrics in the long-term
Quality is linked to Size, Control Flow Complexity, Complexity, Understandability, Testability
There are some software metrics that are linked to them: average method size, cyclomatic number, average number of methods/instance variables per class, nesting level, number of classes/relationships within a module or outside of the module...
Understanding the relationship between them is critical
Cyclomatic Complexity
This metric measures the complexity of the control flow graph of a method or procedure
High Value: High Complexity, difficult to test and maintain
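An illustrative (hypothetical) example in Python: counting one for the straight-line path plus one for every decision point gives the cyclomatic complexity of the method:

```python
def count_urgent_defects(defects):
    urgent = 0                            # 1: the straight-line path
    for d in defects:                     # +1: loop decision
        if d["severity"] == "critical":   # +1
            urgent += 1
        elif d["severity"] == "major":    # +1
            if d["age_days"] > 30:        # +1
                urgent += 1
    return urgent

# Cyclomatic complexity = 1 + 4 decision points = 5
```

A method full of nested conditions and loops quickly pushes this number up, which is the signal that it is becoming hard to test and maintain.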
WMC: Weighted Methods per Class
This metric measures the complexity of a class. Class complexity can be calculated, for example, as the sum of the cyclomatic complexities of its methods
High value of WMC indicates the class is more complex and hence difficult to maintain
DIT: Depth of Inheritance Tree
It measures the maximum level of the inheritance hierarchy of a class
If DIT increases, it means that more methods are expected to be inherited, which makes it more difficult to predict a class's behavior. Thus it can be hard to understand a system with many inheritance layers.
NOC: Number of Children
how many sub-classes are going to inherit the methods of the parent class
If NOC grows it means reuse increases. On the other hand, as NOC increases, the amount of testing will also increase because more children in a class indicate more responsibility
Most of these metrics (and many others) can be calculated using static analysis tools
IF THESE METRICS SIGNAL PROBLEMS, REFACTOR YOUR CODE...
YOU SHOULD ALSO REFACTOR IF YOU DETECT OTHER ISSUES, E.G. DUPLICATED CODE, SHOTGUN SURGERY, ETC.
SOME OF THESE ISSUES CAN ALSO BE SPOTTED WITH TOOLS (E.G. DUPLICATE-CODE DETECTORS)