Difference Between Data Warehousing And Data Mining Information Technology Essay
Data warehousing is a key development in the information systems (IS) field and in the database environment. Its benefits are plentiful, but some organizations receive more significant returns than others, and the impact of data warehousing differs from one organization to the next. The analysis shows that the benefits each company receives can be tied to the way in which it conforms to the framework. We will see how an organization can be transformed by a data warehouse, and the analysis also explains the differences in impact. Case studies of data warehousing at a large manufacturing company (LMC), a financial services company (FSC) and the Internal Revenue Service are presented and discussed within the context of the framework.
Keywords : Multidimensional modelling; Conceptual modelling; Time-series; Data warehouses; Data-mining
Introduction
Data warehouses are becoming part of the technology. Data warehouses are used to consolidate data located in disparate databases. A data warehouse stores large quantities of data by specific categories so it can be more easily retrieved, interpreted, and sorted by users. Warehouses enable executives and managers to work with vast stores of transactional or other data to respond faster to markets and make more informed business decisions. It has been predicted that every business will have a data warehouse within ten years. But merely storing data in a data warehouse does a company little good. Companies will want to learn more about that data to improve knowledge of customers and markets. The company benefits when meaningful trends and patterns are extracted from the data. Data mining, or knowledge discovery, is the computer-assisted process of digging through and analyzing enormous sets of data and then extracting the meaning of the data. Data mining tools predict behaviors and future trends, allowing businesses to make proactive, knowledge-driven decisions. Data mining tools can answer business questions that traditionally were too time consuming to resolve. They scour databases for hidden patterns, finding predictive information that experts may miss because it lies outside their expectations. Data mining derives its name from the similarities between searching for valuable information in a large database and mining a mountain for a vein of valuable ore. Both processes require either sifting through an immense amount of material, or intelligently probing it to find where the value resides.
What is Data Warehousing
‘Data warehousing’ is a collection of decision support technologies that enable the knowledge worker, the statistician, the business manager and the executive to process the information contained in a data warehouse meaningfully and to make well-informed decisions based on the outputs.
The data warehousing system includes back-end tools for extracting, cleaning and loading data from Online Transaction Processing (OLTP) databases and historical repositories of data. It also includes the data storage area, composed of the data warehouse, the data marts and the data store. It provides tools such as OLAP for organizing, partitioning and summarizing data in the data warehouse and data marts, and finally it contains front-end tools for mining, querying and reporting on data. It is important to distinguish between a “data warehouse” and “data warehousing”. A ‘data warehouse’ is a component of the data warehousing system. It is a facility that provides a consolidated, flexible and accessible collection of data for end-user reporting and analysis. A data warehouse has been defined by Inmon (considered one of the founders of the data warehouse concept) as a “subject-oriented, integrated, time-varying, non-volatile collection of data that is primarily used in organizational decision making.”
The data in a data warehouse is categorized on the basis of subject area, and hence it is “subject-oriented”.
Universal naming conventions, measurements, classifications and so on are used in the data warehouse to provide a consolidated, enterprise-wide view of data, and therefore it is designated as “integrated”.
Once loaded, the data can only be read. Users cannot make changes to the data, and this makes it “non-volatile”.
Finally, data is stored for long periods of time, quantified in years, and bears a time and date stamp, and therefore it is described as “time-variant”.
Operational systems all collect data, but the format of the data they collect is not always the same. Data warehousing is the combination of data from all of these sources into a single, homogeneous form. The result is a subject-oriented, integrated, time-variant and non-volatile collection of data that supports management in making decisions.
1) Time-variant
Changes to the data in the data warehouse are tracked and recorded so that reports can show how the data changes over time.
2) Non-volatile
Data in the data warehouse is never deleted. It is read-only and static, and it is kept for future reporting.
3) Integrated
Data is consistent and represents all operational systems.
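The three characteristics listed above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the two source systems, their record layouts, the `sales_fact` table and the trigger names are all hypothetical, and SQLite from the Python standard library stands in for a real warehouse platform. Two differently formatted sources are mapped to one schema (integrated), every row carries a load timestamp (time-variant), and triggers reject updates and deletes (non-volatile).

```python
import sqlite3

# Hypothetical operational sources that record the same kind of fact
# (a sale) in different formats.
STORE_SALES = [{"cust": "C1", "amount_usd": 12.50, "day": "2024-01-04"}]
WEB_SALES = [("C2", 830, "04/01/2024")]  # amount in cents, date as DD/MM/YYYY

# Hypothetical warehouse schema. The triggers make loaded rows read-only.
SCHEMA = """
CREATE TABLE sales_fact (
    customer TEXT, amount_cents INTEGER, sale_date TEXT,
    source TEXT, loaded_at TEXT
);
CREATE TRIGGER sales_fact_no_update BEFORE UPDATE ON sales_fact
BEGIN SELECT RAISE(ABORT, 'warehouse rows are read-only'); END;
CREATE TRIGGER sales_fact_no_delete BEFORE DELETE ON sales_fact
BEGIN SELECT RAISE(ABORT, 'warehouse rows are read-only'); END;
"""


def from_store(rec):
    """Map a store record to the common schema (amount in cents, ISO date)."""
    return (rec["cust"], round(rec["amount_usd"] * 100), rec["day"], "store")


def from_web(rec):
    """Map a web record to the common schema (amount in cents, ISO date)."""
    cust, cents, dmy = rec
    day, month, year = dmy.split("/")
    return (cust, cents, f"{year}-{month}-{day}", "web")


def load(conn, rows, loaded_at):
    """Append rows with a load timestamp; existing rows are never changed."""
    conn.executemany(
        "INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)",
        [row + (loaded_at,) for row in rows],
    )
    conn.commit()


def build_warehouse():
    """Create an in-memory warehouse and load both sources into it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    rows = [from_store(r) for r in STORE_SALES] + [from_web(r) for r in WEB_SALES]
    load(conn, rows, loaded_at="2024-01-05T00:00:00")
    return conn
```

After `build_warehouse()` both sales sit in one table with the same units and date format, and an attempt to run `UPDATE` or `DELETE` against `sales_fact` raises an error instead of changing history.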
Data warehousing can be defined as a process of data management and data retrieval. Organizations with different capabilities and cultures can use it to integrate their various databases into data warehouses. The data warehouse brings the idea of maintaining a central repository of data. Like data mining, data warehousing is a relatively new term, although the concept itself has been around for years.
Data Warehouse Design
There are two different approaches to data warehouse design: Ralph Kimball’s model and Bill Inmon’s model.
In Inmon’s architecture, data from the OLTP databases is first stored in a warehouse before it is transferred to the data marts. In Inmon’s view, the data warehouse is a real database.
In Kimball’s architecture, data is transferred directly from the OLTP databases to the data marts. As the picture shows, the data warehouse is then formed by the collection of integrated data marts.
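The difference between the two data flows can be sketched in a few lines. This is a deliberately simplified illustration, not either author's actual methodology: the rows, the department-based marts and all function names are hypothetical.

```python
# Hypothetical OLTP rows: (department, product, amount).
OLTP_ROWS = [
    ("sales", "widget", 100),
    ("sales", "gadget", 250),
    ("finance", "widget", 40),
]


def build_marts(rows):
    """Group rows into one data mart per department."""
    marts = {}
    for dept, product, amount in rows:
        marts.setdefault(dept, []).append((product, amount))
    return marts


def inmon_flow(oltp_rows):
    """Inmon-style flow: OLTP -> central warehouse -> data marts."""
    warehouse = list(oltp_rows)      # the warehouse is a store of its own
    marts = build_marts(warehouse)   # marts are derived from the warehouse
    return warehouse, marts


def kimball_flow(oltp_rows):
    """Kimball-style flow: OLTP -> data marts; the marts form the warehouse."""
    marts = build_marts(oltp_rows)   # marts are loaded directly from OLTP
    warehouse = [
        (dept, product, amount)
        for dept, items in marts.items()
        for product, amount in items
    ]
    return warehouse, marts
```

In this toy setting both flows end with the same marts and the same warehouse contents. What differs is the order in which they are built, and which structure is treated as the primary store.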
What is Data Mining?
Data mining is also called data or knowledge discovery. It is the process of taking data from different sources and turning it into information that is useful for the organization. Data mining software is one of a number of tools for analyzing data. It lets users analyze data that comes from many different sources and has different uses and characteristics. Technically, data mining is the process of finding correlations among fields in large relational databases.
In general, data mining is the process of analyzing data from different perspectives and turning it into useful information. Users can analyze data from many different dimensions or angles and then summarize the relationships they find. Finding these relationships is not easy, however, and requires several processing and analysis steps.
There are five major elements in data mining:
Data is presented in a useful format, such as a graph or table.
Transaction data is loaded onto the data warehouse system.
Data is stored and managed in a multidimensional database system.
Application software is used to analyze and interpret the data.
Information professionals, especially those involved in the business world, are given access to the data.
Continuous Innovation
Although data mining is a relatively new term, the technology is not. For example, one Midwest grocery retailer used Oracle software to analyze local buying patterns. It discovered that when men bought cigarettes on Thursdays and Saturdays, they also tended to buy beer. Further analysis showed that these shoppers typically did their weekly grocery shopping on Saturdays. On Thursdays, however, they bought only a few items. The retailer concluded that they purchased the beer to have it available for the weekend ahead. With this information, the retailer could make sure that cigarettes and beer were sold at full price on Thursdays.
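A finding like the one in the grocery example is usually expressed as an association rule with a support and a confidence value. The sketch below shows how those two numbers could be computed for the rule "cigarettes -> beer", overall and per weekday. The baskets are invented for illustration and are not the retailer's data, and the function name is hypothetical.

```python
# Hypothetical shopping baskets: (weekday, set of items bought).
BASKETS = [
    ("Thu", {"cigarettes", "beer"}),
    ("Thu", {"cigarettes", "beer", "chips"}),
    ("Thu", {"milk"}),
    ("Sat", {"cigarettes", "beer", "bread", "milk"}),
    ("Sat", {"cigarettes", "bread"}),
    ("Mon", {"bread", "milk"}),
]


def rule_stats(baskets, antecedent, consequent, day=None):
    """Return (support, confidence) for the rule antecedent -> consequent.

    support    = share of baskets that contain both items
    confidence = share of baskets with the antecedent that also
                 contain the consequent
    If `day` is given, only baskets from that weekday are considered.
    """
    selected = [items for d, items in baskets if day is None or d == day]
    if not selected:
        return 0.0, 0.0
    with_a = [items for items in selected if antecedent in items]
    with_both = [items for items in with_a if consequent in items]
    support = len(with_both) / len(selected)
    confidence = len(with_both) / len(with_a) if with_a else 0.0
    return support, confidence
```

Calling `rule_stats(BASKETS, "cigarettes", "beer", day="Thu")` restricts the analysis to Thursday baskets, which is the kind of drill-down by weekday that the example describes.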
What can data mining do?
Data mining is used today mainly by companies with a strong customer focus: retail, financial, communication and marketing organizations. It lets these companies examine internal factors such as the product, the condition of the product and the day of business. It also lets them determine the effect on sales performance, customer satisfaction and profits. Finally, it lets them drill down from summary information to view detailed transactional data.
Demographic data gathered from customer comments or warranty cards also helps a company develop products and promotions that appeal to specific customer segments. Data mining can likewise be used to transform supplier relationships.
Applications of Data Mining
Data mining is used in many fields, such as business, scientific research and the banking sector. It is also easier to use than it once was: modern tools make it practical to carry out complex data mining techniques with ease. Because of the expanding variety of tools and software, data mining is used more widely today than ever before. Businesses use data mining to improve their marketing and to understand their clients’ buying patterns. Intelligence agencies such as the FBI and CIA use data mining to identify terrorist threats and uncover terrorist plots, particularly since the 9/11 attacks in America. However, people are concerned that the data collected for such work invades many people’s privacy. The banking sector uses data mining to detect credit card fraud and other credit card crime. It also helps banks reduce credit risk among their customers, identify potential customers and decide whether a loan can be approved.
Steps of Data Mining
The picture shows the steps of data mining:
Data Integration: Data from all the different sources is collected and integrated.
Data Selection: We select the data that is useful for data mining.
Data Cleaning: The collected data may not all be correct, so it must be checked before it is used, to avoid data errors and uncertain values.
Data Transformation: Even after cleaning, the data may not be ready for mining. It still has to be transformed into a form suitable for the mining process. Techniques such as smoothing, aggregation and normalization can be used for data transformation.
Data Mining: Once the data is ready, we apply data mining techniques to it to discover interesting patterns. Clustering and association analysis are among the many techniques used for data mining.
Pattern Evaluation and Knowledge Presentation: The patterns we have generated are transformed and visualized, and redundant patterns are removed.
Decisions / Use of Discovered Knowledge: In this step, the knowledge acquired is used to make better decisions.
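The steps above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not a production workflow: the two sources and all function names are hypothetical, min-max normalization stands in for the transformation step, and a one-dimensional k-means with two clusters stands in for the mining step.

```python
from statistics import mean

# Hypothetical sources of (customer, monthly_spend) records.
SOURCE_A = [("c1", 20.0), ("c2", 22.0), ("c3", None)]
SOURCE_B = [("c4", 95.0), ("c5", 100.0), ("c2", 22.0)]


def integrate(*sources):
    """Data integration: combine the records from all sources."""
    return [rec for src in sources for rec in src]


def select_and_clean(records):
    """Selection and cleaning: drop missing values and duplicate records."""
    seen, out = set(), []
    for rec in records:
        if rec[1] is None or rec in seen:
            continue
        seen.add(rec)
        out.append(rec)
    return out


def normalize(records):
    """Transformation: min-max normalization of the values to [0, 1]."""
    values = [v for _, v in records]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    return [(c, (v - lo) / span) for c, v in records]


def cluster_two(records, iterations=10):
    """Mining: 1-D k-means with k=2, seeded at the min and max values."""
    centers = [min(v for _, v in records), max(v for _, v in records)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for c, v in records:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append((c, v))
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            mean(v for _, v in g) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return groups


def run_pipeline():
    """Run all steps and present the customers in each discovered group."""
    data = normalize(select_and_clean(integrate(SOURCE_A, SOURCE_B)))
    low, high = cluster_two(data)
    return sorted(c for c, _ in low), sorted(c for c, _ in high)
```

`run_pipeline()` returns two lists of customer ids, one for the low-spend group and one for the high-spend group. A decision maker could then act on these groups, which corresponds to the final step in the list.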
Difference between Data Warehousing and Data Mining
Data warehousing can be defined as the integration of data from different sources and in various formats into a single form or schema. It provides a storage mechanism for huge amounts of data. Data warehousing provides the enterprise with a memory, while data mining provides it with intelligence. Useful patterns can therefore be discovered by applying data mining techniques to the data warehouse.
Data Mining Tools
Organizations that want to use data mining can purchase mining programs or build their own custom mining solution. Purchased programs are designed for existing software and hardware platforms and can be integrated into new products and systems. For example, it is quite common to feed the output of data mining into another system, such as a neural network, to give the mined data more value. This is because the data mining tool gathers the data, while the other program makes decisions based on the data that has been collected.
Many data mining products are available in the marketplace, in different models and with different tools or techniques, each with its own strengths and weaknesses. Choosing the right kind of data mining tool is very important, because a tool that has nothing to do with the organization’s business goals will be useless. This is an especially important consideration for organizations that plan to expand.
Most data mining tools can be classified into one of these three categories:
Dashboards. Installed on a computer to monitor information in a database, dashboards reflect data changes and updates onscreen, usually in charts or tables, so that the user can see how the business is performing.
Traditional Data Mining Tools. Traditional data mining programs are used within the organization, where their effect can be seen directly. Some of these tools are installed on the desktop to monitor the data and highlight trends, and others capture information residing outside a database.
Text-mining Tools. These tools can mine data from many kinds of text, from PDF documents to simple text. They scan content that is either unstructured (information is scattered almost randomly across the document, as in audio, video or internet-based data) or structured (the data’s form and purpose is known, as in database content). Capturing these inputs can provide organizations with a wealth of information that can be mined to discover all kinds of concepts and trends.
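As a rough illustration of what scanning unstructured text for concepts and trends can mean at its simplest, the sketch below counts the most frequent terms across a few documents. Real text-mining tools go much further. The sample documents, the stop-word list and the function name are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical unstructured inputs, for example customer comments.
DOCUMENTS = [
    "The delivery was late and the box was damaged.",
    "Late delivery again.",
    "Great price, fast delivery.",
]

# Hypothetical stop-word list; real tools use much larger ones.
STOP_WORDS = {"the", "a", "is", "and", "of", "to", "was"}


def top_terms(documents, n=3):
    """Return the n most frequent non-stop-word terms across the documents."""
    counts = Counter()
    for text in documents:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)
```

On these sample comments the two most frequent terms are "delivery" and "late", which hints at the kind of trend such a scan might surface.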
The Benefits of Data Mining / Data Warehouse to Organizations
With a data warehouse, business executives can look at the company as a whole unit instead of looking at it in terms of the departments that it comprises. This is one reason why many corporations spend so much money to implement data warehouses. Data warehouses can also handle many tasks that involve different departments. Every organization sets up a transaction system to make sure that each transaction is processed within a certain time frame. The biggest problem with reports and queries is that they often cannot be completed within that time frame, so they are compiled late. To overcome this problem, many companies build a data warehouse to take over reporting and queries. Another benefit of a data warehouse is that it can use data models designed for queries, and the results are convincing.
Models for queries are especially important for producing good reports. A transaction processing system does not really need such models, although a good model can still help the company. The wrong modelling method, however, can slow down transaction processing. Likewise, servers that are built to process transactions at speed will process queries slowly.
A data warehouse is efficient because queries can be run against it directly. A company with a large number of transaction systems, however, may have to prepare different data warehouses or even different processing models. Cooperation among all departments in the company is therefore important to overcome problems with the processing and transaction of data.
The Benefits of Data Mining
Data mining allows companies to exploit information and use it to obtain competitive advantage.
Data mining helps identify trends such as:
why customers buy certain products
ideas for very direct marketing
ideas for shelf placement
training of employees vs. employee retention
employee benefits vs. employee retention
Conclusion
Time-series analysis is a powerful technique for discovering trends and patterns in temporal data. These data are represented at a low level of abstraction, and their management is expensive. Most analysts face two main problems: first, cleaning the huge amount of potentially analysable data, and second, correctly defining the data-mining algorithms to be employed. Since their appearance, data warehouses have proved to be a powerful repository of historical data, especially for data mining. In addition, their modelling paradigm, multidimensional modelling, is very close to the problem domain. A coherent conceptual modelling framework for data mining is therefore believed to assure a better and easier knowledge-discovery process on top of data warehouses.