Project size estimation is a crucial aspect of software engineering, as it helps in planning and allocating resources for the project. Here are some of the popular project size estimation techniques used in software engineering:
Expert Judgment: In this technique, a group of experts in the relevant field estimates the project size based on their experience and expertise. This technique is often used when there is limited information available about the project.
Analogous Estimation: This technique involves estimating the project size based on the similarities between the current project and previously completed projects. This technique is useful when historical data is available for similar projects.
Bottom-up Estimation: In this technique, the project is divided into smaller modules or tasks, and each task is estimated separately. The estimates are then aggregated to arrive at the overall project estimate.
Three-point Estimation: This technique involves estimating the project size using three values: optimistic, pessimistic, and most likely. These values are then used to calculate the expected project size using a formula such as the PERT formula.
Function Points: This technique involves estimating the project size based on the functionality provided by the software. Function points consider factors such as inputs, outputs, inquiries, and files to arrive at the project size estimate.
Use Case Points: This technique involves estimating the project size based on the number of use cases that the software must support. Use case points consider factors such as the complexity of each use case, the number of actors involved, and the number of use cases.
Each of these techniques has its strengths and weaknesses, and the choice of technique depends on various factors such as the project’s complexity, available data, and the expertise of the team.
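As an illustration of the three-point technique listed above, the following is a minimal sketch of the commonly used PERT weighting, where the expected value is (optimistic + 4 × most likely + pessimistic) / 6. The function name `pert_estimate` and the sample figures are illustrative assumptions, not part of any standard tool.

```python
def pert_estimate(optimistic: float, most_likely: float, pessimistic: float) -> tuple[float, float]:
    """Return (expected, standard_deviation) using the classic PERT weighting."""
    if not optimistic <= most_likely <= pessimistic:
        raise ValueError("expected optimistic <= most_likely <= pessimistic")
    # The most likely value is weighted four times as heavily as the extremes.
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    # Conventional PERT approximation of the spread of the estimate.
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev


# Hypothetical size estimates in KLOC for one project.
expected, std_dev = pert_estimate(optimistic=8, most_likely=12, pessimistic=22)
print(f"expected size: {expected:.1f} KLOC (std dev {std_dev:.2f})")
```

The standard deviation gives a rough sense of how uncertain the estimate is; a wide gap between the optimistic and pessimistic values signals that more analysis may be needed before planning.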
Estimation of the size of the software is an essential part of Software Project Management. It helps the project manager to further predict the effort and time which will be needed to build the project. Various measures are used in project size estimation. Some of these are:
- Lines of Code
- Number of entities in ER diagram
- Total number of processes in detailed data flow diagram
- Function points
1. Lines of Code (LOC): As the name suggests, LOC counts the total number of lines of source code in a project. The units of LOC are:
- KLOC: Thousands of lines of code
- NLOC: Non-comment lines of code
- KDSI: Thousands of delivered source instructions
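To make the difference between LOC and NLOC concrete, here is a small sketch that counts both for a piece of source text. It assumes a simplified rule: blank lines and lines consisting only of a single-line comment are excluded from NLOC. Block comments and docstrings are not handled, and the helper name `count_loc` is hypothetical.

```python
def count_loc(source: str, comment_prefix: str = "#") -> dict[str, int]:
    """Count total lines (loc) and non-blank, non-comment lines (nloc)."""
    total = 0
    nloc = 0
    for line in source.splitlines():
        total += 1
        stripped = line.strip()
        # Simplified rule (an assumption): skip blank and full-line comment lines.
        if stripped and not stripped.startswith(comment_prefix):
            nloc += 1
    return {"loc": total, "nloc": nloc}


SAMPLE = "\n".join([
    "# compute a total",
    "def total(xs):",
    "    # running sum",
    "    s = 0",
    "",
    "    for x in xs:",
    "        s += x",
    "    return s",
])

print(count_loc(SAMPLE))  # {'loc': 8, 'nloc': 5}
```

Real counting tools apply more detailed rules per language, which is one reason LOC figures from different sources are hard to compare.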
The size is estimated by comparing it with the existing systems of the same kind. The experts use it to predict the required size of various components of software and then add them to get the total size.
It’s tough to estimate LOC by analyzing the problem definition. Only after the whole code has been developed can accurate LOC be estimated. This statistic is of little utility to project managers because project planning must be completed before development activity can begin.
Two separate source files having a similar number of lines may not require the same effort. A file with complicated logic would take longer to create than one with simple logic. Proper estimation may not be attainable based on LOC.
LOC also depends heavily on who writes the code. The count for the same logic can differ greatly from one programmer to the next: a seasoned programmer can often write the same logic in fewer lines than a novice.
Advantages:
- Universally accepted and is used in many models like COCOMO.
- Estimation is closer to the developer’s perspective.
- It is used and understood by practitioners throughout the world.
- At project completion, LOC is easily quantified.
- It is directly tied to the end product, the delivered code.
- Simple to use.
Disadvantages:
- The same functionality takes a different number of lines in different programming languages.
- No proper industry standard exists for this technique.
- It is difficult to estimate the size using this technique in the early stages of the project.
- LOC cannot be used to compare or normalize size across projects built on different platforms and languages.
2. Number of entities in ER diagram: The ER model provides a static view of the project. It describes the entities and their relationships. The number of entities in the ER model can be used as a measure of project size, because it tends to grow with the project: more entities need more classes/structures, which in turn leads to more code.
Advantages:
- Size estimation can be done during the initial stages of planning.
- The number of entities is independent of the programming technologies used.
Disadvantages:
- No fixed standards exist. Some entities contribute more to project size than others.
- Like function point analysis (FPA), it is rarely used directly in cost estimation models. Hence, the entity count must be converted to LOC.
3. Total number of processes in detailed data flow diagram: A Data Flow Diagram (DFD) represents the functional view of software. The model depicts the main processes/functions involved in the software and the flow of data between them. The number of processes in the DFD is used to predict software size. Existing processes of a similar type are studied and used to estimate the size of each process, and the sum of the estimated sizes of all processes gives the final estimated size.
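The procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the process kinds, the historical LOC figures in `HISTORICAL_LOC`, and the choice of the mean as the per-process estimate are all hypothetical.

```python
from statistics import mean

# Hypothetical sizes (in LOC) of existing processes, grouped by kind.
HISTORICAL_LOC = {
    "validate_input": [180, 220, 200],
    "generate_report": [450, 550],
}


def estimate_process(kind: str) -> float:
    """Estimate one process as the mean size of existing processes of the same kind."""
    return mean(HISTORICAL_LOC[kind])


def estimate_total(process_kinds: list[str]) -> float:
    """Sum the per-process estimates over every process in the DFD."""
    return sum(estimate_process(kind) for kind in process_kinds)


# Processes found in a hypothetical detailed DFD.
dfd_processes = ["validate_input", "validate_input", "generate_report"]
print(estimate_total(dfd_processes))  # 200 + 200 + 500 = 900
```

Decomposing a major process into smaller ones simply adds more entries to the list, each matched against its own historical data.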
Advantages:
- It is independent of the programming language.
- Each major process can be decomposed into smaller processes, which increases the accuracy of the estimation.
Disadvantages:
- Studying similar kinds of processes to estimate size takes additional time and effort.
- Not all software projects require the construction of a DFD.
4. Function Point Analysis: In this method, the number and type of functions supported by the software are utilized to find FPC(function point count). The steps in function point analysis are:
- Count the number of functions of each proposed type.
- Compute the Unadjusted Function Points(UFP).
- Find the Total Degree of Influence(TDI).
- Compute Value Adjustment Factor(VAF).
- Find the Function Point Count(FPC).
The explanation of the above points is given below:
- Count the number of functions of each proposed type: Find the number of functions belonging to the following types:
  - External Inputs: Functions related to data entering the system.
  - External Outputs: Functions related to data exiting the system.
  - External Inquiries: Functions that retrieve data from the system without changing it.
  - Internal Logical Files: Logical files maintained within the system. Log files are not included here.
  - External Interface Files: Logical files maintained by other applications and used by our system.
- Compute the Unadjusted Function Points (UFP): Categorise the functions of each of the five types as simple, average, or complex based on their complexity. Multiply the count of functions of each type and complexity level by the corresponding weighting factor and find the weighted sum. The weighting factors for each type based on their complexity are as follows:
| Function type | Simple | Average | Complex |
|---|---|---|---|
| External Inputs | 3 | 4 | 6 |
| External Outputs | 4 | 5 | 7 |
| External Inquiries | 3 | 4 | 6 |
| Internal Logical Files | 7 | 10 | 15 |
| External Interface Files | 5 | 7 | 10 |
- Find Total Degree of Influence: Use the 14 general characteristics of a system to find the degree of influence of each of them. Each characteristic is evaluated on a scale of 0-5, and the sum of all 14 degrees of influence gives the TDI, so the range of TDI is 0 to 70. The 14 general characteristics are: Data Communications, Distributed Data Processing, Performance, Heavily Used Configuration, Transaction Rate, On-Line Data Entry, End-user Efficiency, Online Update, Complex Processing, Reusability, Installation Ease, Operational Ease, Multiple Sites, and Facilitate Change.
- Compute Value Adjustment Factor (VAF): Use the following formula to calculate VAF:
VAF = (TDI * 0.01) + 0.65
- Find the Function Point Count: Use the following formula to calculate FPC
FPC = UFP * VAF
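The steps above can be tied together in a short sketch. The weights come from the table above and the formulas are the ones just given; the function names, the sample counts, and the sample degrees of influence are hypothetical values chosen only for illustration. Since TDI ranges from 0 to 70, VAF always lies between 0.65 and 1.35.

```python
# Weighting factors from the table above.
WEIGHTS = {
    "external_inputs": {"simple": 3, "average": 4, "complex": 6},
    "external_outputs": {"simple": 4, "average": 5, "complex": 7},
    "external_inquiries": {"simple": 3, "average": 4, "complex": 6},
    "internal_logical_files": {"simple": 7, "average": 10, "complex": 15},
    "external_interface_files": {"simple": 5, "average": 7, "complex": 10},
}


def unadjusted_function_points(counts: dict[str, dict[str, int]]) -> int:
    """Weighted sum of function counts per type and complexity level (UFP)."""
    return sum(
        WEIGHTS[ftype][level] * n
        for ftype, levels in counts.items()
        for level, n in levels.items()
    )


def value_adjustment_factor(degrees: list[int]) -> float:
    """VAF = (TDI * 0.01) + 0.65, where TDI is the sum of the 14 ratings."""
    if len(degrees) != 14:
        raise ValueError("expected 14 general characteristics")
    if any(d < 0 or d > 5 for d in degrees):
        raise ValueError("each degree of influence must be between 0 and 5")
    tdi = sum(degrees)  # Total Degree of Influence, between 0 and 70
    return tdi * 0.01 + 0.65


def function_point_count(counts: dict[str, dict[str, int]], degrees: list[int]) -> float:
    """FPC = UFP * VAF."""
    return unadjusted_function_points(counts) * value_adjustment_factor(degrees)


# Hypothetical project: function counts per type and complexity level.
counts = {
    "external_inputs": {"simple": 5, "average": 2},
    "external_outputs": {"average": 4},
    "external_inquiries": {"simple": 3},
    "internal_logical_files": {"average": 2, "complex": 1},
    "external_interface_files": {"simple": 1},
}
degrees = [3] * 14  # hypothetical: every characteristic rated 3, so TDI = 42

print(unadjusted_function_points(counts))                # 92
print(f"{function_point_count(counts, degrees):.2f}")    # 98.44
```

In this example UFP is 23 + 20 + 9 + 35 + 5 = 92, VAF is 0.42 + 0.65 = 1.07, and the function point count is about 98.44.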
Advantages:
- It can be easily used in the early stages of project planning.
- It is independent of the programming language.
- It can be used to compare different projects even if they use different technologies(database, language, etc).
Disadvantages:
- It is not well suited to real-time systems and embedded systems.
- Many cost estimation models like COCOMO use LOC and hence FPC must be converted to LOC.
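The conversion mentioned in the last point can be sketched as a simple multiplication. The ratio of lines of code per function point depends on the programming language and the team, so `loc_per_fp` is left as a parameter here; the value 50 used below is a purely hypothetical placeholder and should be replaced with a figure calibrated from your own historical data.

```python
def fp_to_kloc(fpc: float, loc_per_fp: float) -> float:
    """Convert a function point count to KLOC using a caller-supplied ratio."""
    if loc_per_fp <= 0:
        raise ValueError("loc_per_fp must be positive")
    return fpc * loc_per_fp / 1000


# Hypothetical ratio of 50 LOC per function point (an assumption, not a standard value).
print(f"{fp_to_kloc(98.44, loc_per_fp=50):.3f} KLOC")  # 4.922 KLOC
```

The resulting KLOC figure can then be fed into a LOC-based model such as COCOMO.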