Fault Diagnosis And Troubleshooting Information Technology Essay
Network management is the set of activities, together with the supporting technology, needed to keep a network running. Networks now carry many kinds of converged communications, including video. A network is an interconnected structure that demands close attention and must be planned carefully. Network devices must be configured without affecting the rest of the network. Failures will occur, so they need to be detected and repaired; reliability and availability are therefore central concerns. Network managers do not only observe the performance and security of the network; they also anticipate problems and stay ahead of the technology to make sure that everything works well. Two frameworks, FCAPS and ITIL, are useful for interpreting and describing network management. This paper shows how they help in thinking about management tools.
Nowadays many UK public sector organizations use ITIL. Some companies use FCAPS, which is layered with TMN, but ITIL is more beneficial than FCAPS. The main objective of this paper is to provide detailed information about FCAPS and ITIL and to set out their advantages. Both are helpful in managing the networks of medium and large organizations. The paper also compares FCAPS and ITIL, which would be helpful for NMRU in migrating to ITIL.
The International Telecommunication Union (ITU) developed FCAPS, presenting it as a model and not as a product. Along with the TMN layering, ITU-T divided the functionality provided by management into five areas: fault, configuration, accounting, performance and security. FCAPS functionality is performed at various levels of TMN.
1.1 Fault management
Fault management is the group of operations that detect errors and correct them. Good fault management needs to capture the problem, pass the information to the person concerned and track the problem through trouble ticketing. The aim is to find and report errors wherever they arise in the network, and to identify and rectify them with minimal delay. It contains the functions given below.
The main function of network monitoring is to check whether the network is performing well, to observe the actual state of the network and, where necessary, to modify that state. The fundamental step is to recognise errors that occur in the network and respond accordingly.
A central part of network monitoring is alarm handling. Alarms are messages sent from the network to report that something unexpected has occurred. The unexpected event can be of any kind: a router reporting that a line card is not working, a sudden change in signal quality in a wireless network, or an unauthorized user entering the network. A network alarm is much like a fire alarm: it signals an unexpected event.
Alarm management is sometimes treated as synonymous with fault management. It can be divided into two sets of functions, basic and advanced.
The basic functions of alarm management are collecting alarms, maintaining an accurate and current list of alarms, and updating that list. The main task is to collect alarms from the network in such a way that nothing important is missed. Received alarms are stored in memory so that a human or an application can process them further. Collection also includes persisting the alarms, written to disk or stored in a database, so that a record of past alarms is built up.
In most cases alarm collection also includes mechanisms to check that no alarms have been lost and to request a replay of alarms. Alarms can be lost in several ways. For example, the underlying transport may not be reliable, so alarm information may be lost on its way to the management application. Another reason is network congestion, which may prevent alarms from reaching their destination. In a third case, the alarm reaches its destination but is not collected properly because the application or the database was not functioning correctly.
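One way to check that no alarms have been lost can be sketched in Python. This is only an illustration under an assumption the essay does not state: that each alarm from a given source carries an increasing sequence number. The function name is hypothetical.

```python
def find_missing_sequence_numbers(received):
    """Return the sequence numbers missing between the lowest and highest received.

    Assumption (not from the essay): every alarm from one source carries an
    integer sequence number that increases by one per alarm.
    """
    if not received:
        return []
    seen = set(received)
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


# Alarms may arrive out of order, so the check works on the set of numbers.
received = [101, 102, 104, 107, 105]
missing = find_missing_sequence_numbers(received)
print("Request replay for sequence numbers:", missing)
```

The list of missing numbers is what a management application could hand to a replay request, if the device supports one.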
After the alarms are collected, a list of alarms must be maintained. The list tells the operator the current state of the entities, for instance whether any device is experiencing problems. It is essential to consider how alarms are presented to users. Each alarm results in an entry in the list that contains the alarm's information. The list can be examined, sorted and filtered by criteria such as the alarm type, the type of network element affected and the time the alarm occurred. Alarm information can be visualized in different ways, but topology maps are the most popular.
Advanced alarm management provides additional functions for managing alarms, giving network managers great flexibility in how alarms are processed. For example, with alarm forwarding an alarm can be sent on to an operator so that someone can be dispatched, much as a home intrusion detection system automatically calls the local police.
A second function is alarm acknowledgment: the network operator confirms that the alarm has been seen and is being handled. A third is alarm clearing. An alarm message is sent to report an alarm condition; some time later a second message is sent to indicate that the alarm condition no longer exists.
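The acknowledgment and clearing of alarms can be pictured as a small state model. The sketch below is a simplification; the class, state and device names are hypothetical and are not taken from any particular management product.

```python
from dataclasses import dataclass
from enum import Enum


class AlarmState(Enum):
    ACTIVE = "active"
    ACKNOWLEDGED = "acknowledged"
    CLEARED = "cleared"


@dataclass
class Alarm:
    source: str
    condition: str
    state: AlarmState = AlarmState.ACTIVE

    def acknowledge(self):
        # The operator confirms that the alarm has been seen and is being handled.
        # Only an active alarm can be acknowledged.
        if self.state is AlarmState.ACTIVE:
            self.state = AlarmState.ACKNOWLEDGED

    def clear(self):
        # A second message from the network says the condition no longer exists.
        self.state = AlarmState.CLEARED


alarm = Alarm("router-1", "line card not working")
alarm.acknowledge()
alarm.clear()
print(alarm.state.value)
```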
Two techniques deal with information overload. One is filtering, which removes unimportant event information so that the receiver can concentrate on the relevant events. The other is correlation, which pre-processes and aggregates the data from events and alarms. Both techniques are discussed in more detail here.
Filtering applies not only to alarms but also to events in general. It is essential to hide as many unimportant events as possible. Filtering can be enabled in two ways. One is to allow operators to subscribe only to the events and alarms that are relevant to them, chosen according to some criteria; operators then receive only the events that match their criteria. The other is deduplication of alarms. An alarm condition may cause the same alarm to be sent repeatedly. Because a repeated alarm carries no new information, each new instance of an alarm already received can be discarded. This process of removing redundant alarms is called deduplication.
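Both forms of filtering can be shown in a few lines of Python. This is a minimal sketch: the alarm fields (source, type, severity) and the function names are assumptions made for the example, and treating two alarms as duplicates when source and type match is a simplification.

```python
def matches(alarm, criteria):
    """True if the alarm has every field value the operator subscribed to."""
    return all(alarm.get(field) == value for field, value in criteria.items())


def filter_and_deduplicate(alarms, criteria):
    """Keep alarms that match the subscription, dropping repeated instances."""
    seen = set()
    kept = []
    for alarm in alarms:
        if not matches(alarm, criteria):
            continue  # not relevant to this operator
        key = (alarm["source"], alarm["type"])
        if key in seen:
            continue  # a repeat carries no new information
        seen.add(key)
        kept.append(alarm)
    return kept


alarms = [
    {"source": "router-1", "type": "line_card_down", "severity": "critical"},
    {"source": "router-1", "type": "line_card_down", "severity": "critical"},
    {"source": "ap-7", "type": "signal_degraded", "severity": "minor"},
    {"source": "router-2", "type": "line_card_down", "severity": "critical"},
]
kept = filter_and_deduplicate(alarms, {"severity": "critical"})
print([a["source"] for a in kept])
```

A real system would also forget a key once the alarm is cleared, so that a later recurrence is reported again.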
Alarm correlation refers to functions that filter and pre-process alarms. Received alarm messages are intercepted, analysed and compared with other alarms that are likely to be related. For example, alarm messages might be related because they stem from the same problem. The general idea behind event and alarm correlation is that, rather than forwarding and reporting many separate messages, it is better to send a few that combine and summarize the information from the different raw events. In this way the number of alarm messages reported can be decreased automatically.
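A very simple form of correlation groups alarms that probably share a cause and reports one summary per group. The sketch below assumes a hypothetical table that maps each device to the upstream device it depends on; real correlation engines use much richer rules, so this is an illustration only.

```python
from collections import defaultdict


def correlate(alarms, upstream):
    """Group raw alarms by suspected root device and return one summary per group.

    `upstream` is an assumed lookup table: device name -> device it depends on.
    A device with no entry is treated as its own suspected root.
    """
    groups = defaultdict(list)
    for alarm in alarms:
        root = upstream.get(alarm["source"], alarm["source"])
        groups[root].append(alarm)
    return [
        {
            "suspected_root": root,
            "alarm_count": len(members),
            "sources": sorted({a["source"] for a in members}),
        }
        for root, members in groups.items()
    ]


upstream = {"sw-1": "router-1", "sw-2": "router-1"}
raw = [
    {"source": "sw-1", "type": "link_down"},
    {"source": "sw-2", "type": "link_down"},
    {"source": "router-1", "type": "line_card_down"},
    {"source": "ap-9", "type": "signal_degraded"},
]
for summary in correlate(raw, upstream):
    print(summary)
```

Four raw alarms are reduced to two reported messages, which is the reduction the paragraph above describes.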
Fault Diagnosis and troubleshooting
Network diagnosis does not differ much from medical diagnosis; the difference is simply the patient. When the network has a fault, the key to solving the problem is quickly finding its cause. This process is known as root cause analysis. An alarm alerts us only to the symptom, not to the cause of the problem.
Troubleshooting supports diagnosis. It can be as simple as retrieving data about a device. Tests of a device or a network provide essential support for diagnosis. Tests can be used not only after a problem has occurred but also proactively, so that a problem is discovered before users notice it. Avoiding faults altogether is the best form of fault management.
Proactive Fault Management
Most fault management functions are reactive: they become active only after a fault has occurred. Proactive fault management instead takes precautions in the network so that failures do not occur. It also includes analysing alarms to recognise those caused by minor errors, so that these can be addressed before a failure occurs.
A very large network might have ten thousand users. In that case hundreds of problems may occur in a single day, and only a few of them, or none, may be solved. Many individual users may experience problems that are serious to them. A trouble ticket is not raised for every alarm, because issuing that many tickets is not feasible.
1.2 Configuration Management
The first step is to configure the network. This covers hardware and software changes: adding new programs and equipment to those already in place, modifying existing systems, and removing unused systems and programs. An inventory of the equipment and programs should be kept and updated regularly.
Configuring Managed Resources
Configuration management begins with configuring the managed resources, the things that are being managed. This involves sending commands to network equipment to change its configuration settings. Sometimes it involves only a single device in isolation, for example configuring just one interface on a port.
Keeping the management information consistent can be viewed in two ways: treating the network as the master, or treating the management system as the master.
In reconciliation the network is considered the master: the information in the management system is made to reflect what is in the network. Information is synchronized from the network to the management system.
In reprovisioning the management system is considered the master of the management information. Information flows from the management system to the network, resulting in changes to the network's configuration. Until the management system receives a report from the network device that the changes have been made, it maintains a flag indicating that the device is out of synchronization.
In discrepancy reporting, discrepancies are detected and flagged to the user. No direction of synchronization is imposed; the user decides case by case. If the user decides that the management system should reflect the network, reconciliation is requested; if the network should reflect the management system, reprovisioning is requested.
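The three approaches can be contrasted with a small Python sketch. Configurations are modelled as plain dictionaries, which is an assumption made for illustration; the function names and settings are hypothetical.

```python
def find_discrepancies(network_cfg, mgmt_cfg):
    """Discrepancy reporting: list settings that differ, without choosing a side.

    Each entry maps a setting to (value in network, value in management system).
    """
    keys = sorted(set(network_cfg) | set(mgmt_cfg))
    return {
        k: (network_cfg.get(k), mgmt_cfg.get(k))
        for k in keys
        if network_cfg.get(k) != mgmt_cfg.get(k)
    }


def reconcile(network_cfg, mgmt_cfg):
    """Reconciliation: the network is master, so the management copy follows it."""
    return dict(network_cfg)


def reprovision(network_cfg, mgmt_cfg):
    """Reprovisioning: the management system is master, so the network follows it."""
    return dict(mgmt_cfg)


network = {"hostname": "edge-1", "mtu": 1500, "vlan": 10}
management = {"hostname": "edge-1", "mtu": 9000, "vlan": 10}

print(find_discrepancies(network, management))
management_after = reconcile(network, management)   # management now matches network
network_after = reprovision(network, management)    # network now matches management
```

In practice reprovisioning would send commands to the device and wait for its confirmation; here the new network state is simply returned.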
Backup and Restore
A virus can destroy the data on a hard disk; if a backup exists, the data can be recovered. The same applies to the network, which needs backup and restore functions. The data in this case is not users' Word or Excel files but the configuration of the network. This data is very important and needs to be protected, just as a company protects its databases. If the configurations in the network are accidentally wiped, many people will be affected, and there will be no time to reconfigure the network by hand. The easiest way to bring things back up is to restore the network to its last backed-up configuration.
Many network vendors issue new versions of their software, so it must be possible to upgrade the network. The difficulty is that thousands of devices may be connected across the same network. We must know which devices have which software versions installed, so that the images to be updated can be distributed and installed without disrupting network services. This is known as image management.
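The bookkeeping behind image management can be sketched as follows. The inventory layout, device names and version strings are hypothetical, and the sketch covers only the question of which devices need a new image, not the distribution itself.

```python
def devices_needing_upgrade(inventory, target_versions):
    """Return the names of devices whose installed version differs from the target.

    inventory: device name -> (model, installed version)
    target_versions: model -> version that should be installed
    Devices whose model has no target version are left alone.
    """
    return sorted(
        name
        for name, (model, version) in inventory.items()
        if model in target_versions and version != target_versions[model]
    )


inventory = {
    "sw-1": ("switch-x", "1.0"),
    "sw-2": ("switch-x", "1.2"),
    "rtr-1": ("router-y", "3.1"),
}
targets = {"switch-x": "1.2", "router-y": "3.1"}
print(devices_needing_upgrade(inventory, targets))
```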
1.3 Accounting Management
Accounting management comprises the functions that allow an organization to collect revenue and get credit for the services it provides. It needs to be extremely robust, and high standards of availability and reliability apply.
1.4 Performance Management
The performance metrics
Throughput: the number of communication units handled per unit time. The communication units depend on the layer, the network and the services the network provides. Examples:
In the network layer, the number of packets sent per second.
In the application layer, the number of voice calls or call attempts per hour.
Delay: measured in units of time. Different kinds of delay can be measured depending on the layer or the network service.
In the network layer, the time taken for an IP packet to reach its destination.
In the application layer, the time taken to receive a dial tone after we lift the receiver.
Quality: measured in different ways depending on the network service.
In the network layer, the percentage of packets lost.
In the application layer, the percentage of calls terminated or dropped.
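The three metrics reduce to simple arithmetic. The Python sketch below uses made-up sample figures, and averaging the delay samples is one choice among several (a median or percentile could equally be reported).

```python
def throughput(units, seconds):
    """Communication units (packets, calls, ...) handled per second."""
    return units / seconds


def average_delay(delays_ms):
    """Mean of the measured delays, in the same unit as the samples."""
    return sum(delays_ms) / len(delays_ms)


def loss_percentage(sent, received):
    """Share of units sent that did not arrive, as a percentage."""
    return 100.0 * (sent - received) / sent


# Illustrative figures only: 1200 packets in 60 seconds, 990 of 1000 delivered.
print(throughput(1200, 60))
print(average_delay([10, 20, 30]))
print(loss_percentage(1000, 990))
```

The same functions apply at the application layer by counting calls instead of packets, for example dropped calls out of calls attempted.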
1.5 Security Management
Security management covers the aspects of securing the network from threats such as hacker attacks, worms, viruses and malicious intrusion attempts. Two aspects are distinguished.
Security of management means that the management itself is secure. Management applications must be accessed securely. Access is generally authorized on a per-application basis rather than a per-user basis. Unless the management application is secured, there is little point in securing the management interfaces and the management network.
Management of security means that the network is secured; it is concerned with managing the security of the network. Nowadays there are many online threats. Many security threats do not target the network itself but the devices connected to it, that is, the end users' systems.
The uses of FCAPS in managing the network:
It can manage all kinds of networks: private, public, mobile, narrowband and broadband, across all area networks (WAN, MAN, LAN).
The cost of implementation is reduced.
It covers digital and analog transmission systems.
It covers signalling systems and terminals, including signal transfer points.
Performance problems are located easily.
User satisfaction is improved.
Schedules are implemented in less time.
Feedback on the design is effective.
The procedures of the network operations center are simplified.
Telecommunication services are supported with software.
Nowadays many organizations depend heavily on IT, and for that reason ITIL was developed by the CCTA in the UK. It provides a common framework for the different kinds of activities performed by an IT department. ITIL is organized into sets of related functions: service support, service delivery, and other operational guidance covering managerial, software support, computer operations, security management and environmental topics. ITIL was designed to supply a sound framework for delivering high-quality services. It was originally owned by the CCTA, but it is now maintained and developed by the Office of Government Commerce.
2.1 Service Support
Service support focuses on the users. Customers and users are the starting point of the model. They are involved in:
Requesting changes
Communication and updates
Reporting difficulties and queries
Delivery of the process
In most organizations this function is a Network Operations Center (NOC). It focuses on one discipline: whether users are able to access the applications they require. It concentrates on finding problems, helping users and delivering new applications over the internet. It includes the following.
The main aim of incident management is to restore normal service operation as quickly as possible, minimize the effect on business operations, and ensure that the agreed levels of service quality and availability are maintained.
An incident can be defined as an event that is not part of the standard operation of a service and that causes, or may cause, an interruption to or a reduction in the quality of that service. In practice, normal operation must be restored as soon as possible with the least possible effect on the business or the end user.
Configuration management helps in representing the logical and physical relationships of the ICT services provided or delivered to the end user. It is more than an asset register, because it also contains information about the maintenance of configuration items and the problems that occur with them.
The main aim of problem management is to find the root causes of incidents and to minimise the adverse effect of incidents and problems caused by errors. A problem is the unknown underlying cause of one or more incidents, and a known error is a problem that has been successfully diagnosed. The CCTA defines problems and known errors as follows.
A problem is a condition often identified as the result of multiple incidents that show common symptoms. It can also be identified from a single incident, indicative of a single error, for which the cause is unknown.
A known error is a condition identified by successful diagnosis of the root cause of a problem and the development of a workaround.
The aim of change management is to ensure that changes are handled using standardized methods and procedures. A change is an event that results in a new status of one or more configuration items and that is approved by management.
The aims of change management include:
Reducing back-out activities.
Economic utilization of the resources involved in a change.
Minimal disruption of services.
Change management terminology:
Change: the addition, modification or removal of CIs (configuration items).
Change Request: the form used to record the details of a requested change; it is submitted to Change Management by the Change Requestor.
Forward Schedule of Changes (FSC): a schedule that lists all the forthcoming changes.
The main aim of the Service Desk is to handle incidents and requests and to provide an interface to the other ITSM processes. It acts as:
A single point of contact.
A single point of entry.
A single point of exit.
The Service Desk functions include:
Incident Control: life-cycle management of all service requests.
Communication: keeping customers informed of progress and advising them on workarounds.
The Service Desk goes by different names:
Call Center: manages large volumes of telephone-based transactions.
Help Desk: resolves incidents as quickly as possible at the primary support level.
Service Desk: not only handles incidents and solves problems but also provides an interface for other activities such as change requests and maintenance contracts.
It comes in three different structures:
Central Service Desk: for organizations that have multiple locations.
Local Service Desk: to meet local business needs.
Virtual Service Desk: for organizations that have locations in multiple countries.
Release management is used by the software migration team for the platform-independent distribution of software and hardware. Proper control of software and hardware ensures the availability of licensed and certified versions. Release management is responsible for quality control of hardware and software during development and implementation.
The goals of release management include:
Planning the rollout of software.
Designing and implementing procedures for distributing and installing changes to IT systems.
Communicating and managing customers' expectations effectively during the planning of new releases.
Controlling the distribution and installation of changes to IT systems.
Release management focuses on protecting the live environment. A release consists of the new or changed software or hardware required to implement approved changes.
Releases fall into three categories:
Major software releases and major hardware upgrades, which contain large amounts of new functionality.
Minor software releases and hardware upgrades, which contain smaller enhancements and fixes, some of which may already have been issued as emergency fixes.
Emergency software and hardware fixes, which contain corrections to a small number of known problems.
Based on the release unit, releases are divided into:
Delta Release: only the parts of the software that have changed are released.
Full Release: the complete software program is distributed.
Packaged Release: a combination of several changes is released together.
2.2 Service Delivery
Service delivery concentrates on the services that ICT must deliver in order to give adequate support to business users. It consists of the following processes.
Service Level Management
Service level management provides for the identification, monitoring and review of the levels of IT service specified in the Service Level Agreements. It involves assessing the impact of changes on the quality of service. It works together with the operational processes to control their activities. It is the direct interface to the customer. It is responsible for the following:
Checking whether the agreed IT services are delivered.
Producing and maintaining the Service Catalog.
Checking that IT Service Continuity plans exist to support the business and its requirements.
Capacity management supports the optimum and cost-effective provision of IT services by helping organizations match their IT resources to their business demands. It includes:
Application sizing.
Capacity planning.
IT Service Continuity Management
This process manages an organisation's ability to continue to provide the required level of service following an interruption of service. It covers proactive as well as reactive measures.
It involves the following steps:
Prioritising activities by conducting a Business Impact Analysis.
Evaluating the options for recovery.
Producing a contingency plan.
Testing, reviewing and revising the plan on a regular basis.
Availability management optimizes the capabilities of the ICT infrastructure and its services and minimizes service outages, so that a sustained level of service is provided to meet business requirements. It addresses the ability of an IT component to perform at an agreed level over a period of time.
Reliability: the ability of an IT component to perform at an agreed level under stated conditions.
Maintainability: the ability of an IT component to remain in, or be restored to, an operational state.
Serviceability: the ability of an external supplier to maintain the availability of a component or function under a third-party contract.
Resilience: a measure of freedom from operational failure and a way of keeping services reliable. Redundancy is one of the most popular methods of achieving resilience.
Security: a service may hold associated data. Security covers the confidentiality, integrity and availability of that data.
Financial management is the process of dealing with the cost of providing the organisation with the services or resources needed to meet business requirements. It may refer to:
Managerial Finance: the branch of finance concerned with the managerial significance of financial techniques.
Corporate Finance: the area of finance that deals with financial decisions.
2.3 Security Management
For several years security management has been a prominent part of network management. External threats are mitigated with firewalls and access prevention. The rights and permissions of configuration management are included in security management, so that end users are not granted unauthorized access.
2.4 Infrastructure Management
In large organizations, the teams that design and troubleshoot systems are different from the team that installs the equipment. For this reason configuration management is essential to the success of IT organizations. Infrastructure management is responsible for installing and configuring the network devices in an organization.
2.5 Application Management
Application management is designed to ensure that an application has the correct configuration design to be deployed in the environment. This can cover several aspects of network management. It ensures that the application is fully able to deliver its service to end users.
2.6 Software Asset Management
Software asset management is considered essential to managing an organization, because software products and licences are very expensive. It is designed along the same lines as configuration management, because it provides information about the software installed on each device. In large organizations, maintaining software and accounting for licences is a complex task.
Uses of ITIL in an organization
The utilization of the resources is improved.
Rework is reduced.
Project delivery to the client and time management are improved.
The cost of service quality is justified.
Central processes are integrated.
Excess work is decreased.
Services are provided in a way that meets customer demand.
Lessons are learned from earlier experience.
The organization becomes more competitive.
Comparison of ITIL and FCAPS:
FCAPS focuses mainly on technology management. ITIL focuses on how to run an IT organization efficiently, that is, on process and workflow.
One limitation of FCAPS is that it does not address the operational processes required to run a Service Desk. The ITIL framework includes the Service Desk within service support, which provides operational services to customers and end users.
FCAPS only informs us about a problem; it does not give a solution to it. ITIL provides services to resolve the problem through service delivery and service management.
The main task of FCAPS is to help manage the objectives of the network. ITIL is intended to supply an improved framework.
A further difference is that FCAPS contains only five functional areas, whereas ITIL contains eleven.
Incident management and availability management in ITIL are similar to fault management in FCAPS. The purpose of fault management in FCAPS is to find faults in the network and correct them, whereas in ITIL, if a problem occurs, there is no need to rework the entire process.
The purpose of incident management is to restore normal operations, and availability management is concerned with the availability of services to the business at a justifiable cost.
In conclusion, the discussion above shows how FCAPS and ITIL are used in network management. Organizations that implement ITIL obtain good results in the way services are designed and delivered. ITIL does not prescribe any particular technology, but it makes the use of tools more effective. FCAPS starts from a technological view and has proved to be low-risk and logical. For an organization to enhance its performance and obtain the proper outcome, FCAPS and ITIL have to be used together.