The concepts of data warehouse and data mining in an organization

Introduction

Today, most information and data are managed and organized using information technology and information systems. Information systems are now widely used in every industry to store data and information for future use. Data warehousing and data mining are common processes in the information technology field. A data warehouse is used to store a huge volume of data, and data mining can be defined as the process of extracting patterns from data.

Data warehouse

A data warehouse works as an electronic storage area for an organization's data. Data warehouses are designed to support reporting and analysis across an organization. Retrieving and analyzing data; extracting, transforming and loading (ETL) data; and managing data are the fundamental components of data warehousing. A data warehouse has specific characteristics, which include the following:

1. Subject-Oriented

Information is presented according to specific subjects or areas of interest, not simply as computer files. Data is manipulated to provide information about a particular subject.

2. Integrated

Data is stored in a consistently accepted format, with uniform measurements, naming conventions, physical characteristics and encoding structures.

3. Non-Volatile

Stable information that does not change each time an operational process is executed. Information is consistent regardless of when the warehouse is accessed.

4. Time-Variant

Containing a history of the subject, as well as current information. Historical information is an important component of a data warehouse.

5. Process-Oriented

It is important to view data warehousing as a process for delivery of information. The maintenance of a data warehouse is ongoing and iterative in nature.

6. Accessible

Provides end-users with easy access to information.

There are three Data Warehouse Models:

• Enterprise warehouse

– collects all of the information about subjects across the entire organization

• Data Mart

– a subset of corporate-wide data that is of value to a specific group of users. Its scope is confined to specific, selected subjects, such as a marketing data mart

• Virtual warehouse

– a set of views over operational databases. Only some of the possible summary views may be materialized (see the short sketch after this list)
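As a rough illustration of the virtual warehouse idea, here is a minimal sketch that defines a summary view directly over an operational table, using Python's built-in sqlite3 module. The table and column names (sales, region, amount) are invented for the example.

    import sqlite3

    # An in-memory "operational" database standing in for a production system.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("North", "A", 120.0), ("North", "B", 80.0), ("South", "A", 200.0)],
    )

    # A virtual warehouse is a set of views over the operational data:
    # nothing is copied, and the summary is computed only when the view is queried.
    conn.execute(
        "CREATE VIEW sales_by_region AS "
        "SELECT region, SUM(amount) AS total_amount FROM sales "
        "GROUP BY region ORDER BY region"
    )

    for row in conn.execute("SELECT region, total_amount FROM sales_by_region"):
        print(row)  # ('North', 200.0) then ('South', 200.0)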

Data Warehouse Concepts

In data warehousing, there are several core concepts, which are described below:

1. Dimensional Data Model: The dimensional data model is commonly used in data warehousing systems. This section describes this modelling technique and its two common schema types, the star schema and the snowflake schema. It differs from the 3rd normal form model, which is commonly used for transactional (OLTP) systems. A few terms need to be defined to understand dimensional data modelling (a small sketch follows these definitions):

Dimension: A category of information.


For example, the time dimension.

Attribute: A unique level within a dimension.

For example, Month is an attribute in the Time Dimension.

Hierarchy: The specification of levels that represents the relationships between different attributes within a dimension.

For example, one possible hierarchy in the Time dimension is Year → Quarter → Month → Day.
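To make these terms concrete, the following is a minimal sketch of a star schema in Python with pandas. The fact table, the Time dimension and the column names are invented for the example, not taken from the text above.

    import pandas as pd

    # Time dimension: one row per day, carrying the Year -> Quarter -> Month -> Day hierarchy.
    time_dim = pd.DataFrame({
        "date_key": [20230101, 20230102, 20230401],
        "day":      [1, 2, 1],
        "month":    ["Jan", "Jan", "Apr"],
        "quarter":  ["Q1", "Q1", "Q2"],
        "year":     [2023, 2023, 2023],
    })

    # Fact table: numeric measures keyed by the dimension's key.
    sales_fact = pd.DataFrame({
        "date_key": [20230101, 20230102, 20230401],
        "amount":   [100.0, 150.0, 200.0],
    })

    # A typical star-schema query: join the fact to the dimension,
    # then roll the measure up the hierarchy (here to year and quarter).
    joined = sales_fact.merge(time_dim, on="date_key")
    print(joined.groupby(["year", "quarter"])["amount"].sum())

A snowflake schema would differ only in that the dimension itself is normalized into several linked tables (for example, a separate Month table).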

– Slowly Changing Dimension: This is a common issue facing data warehousing practitioners. This section explains the problem and describes the three ways of handling it, with examples.
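As a rough illustration (the naming is common practice rather than taken from the text above), the three usual ways of handling a slowly changing dimension are labelled Type 1, Type 2 and Type 3. The sketch below shows all three for a hypothetical customer who moves city; the keys and column names are made up.

    from datetime import date

    # Original dimension row: the customer lives in Chicago.
    customer = {"customer_key": 1, "name": "Alice", "city": "Chicago"}

    # Type 1: overwrite the attribute in place; no history is kept.
    type1 = dict(customer, city="Boston")

    # Type 2: keep the old row and add a new row with validity dates,
    # so the full history is preserved.
    type2_rows = [
        dict(customer, valid_from=date(2020, 1, 1), valid_to=date(2023, 6, 30), current=False),
        dict(customer, customer_key=2, city="Boston",
             valid_from=date(2023, 7, 1), valid_to=None, current=True),
    ]

    # Type 3: add a column holding the previous value; only limited history is kept.
    type3 = dict(customer, city="Boston", previous_city="Chicago")

    print(type1, type2_rows, type3, sep="\n")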

– Conceptual Data Model: A conceptual data model identifies the relationships between the different entities. Characteristics of a conceptual data model include:

Includes the important entities and the relationships among them.

No attributes are specified.

No primary key is specified.

The figure below is an example of a conceptual data model.

[Figure: Conceptual data model]

From the figure above, we can see that the only information shown via the conceptual data model is the entities that describe the data and the relationships between those entities. No other information is shown through the conceptual data model.

– Logical Data Model: A logical data model describes the data in as much detail as possible, without regard to how it will be physically implemented in the database. Features of a logical data model include:

* Includes all entities and the relationships between them.

* All attributes for each entity are specified.

* The primary key for each entity is specified.

* Foreign keys (keys identifying the relationships between different entities) are specified.

* Normalization occurs at this level.

The steps for designing the logical data model are as follows:

1. Identify primary keys for all entities.

2. Locate the relationships between different entities.

3. Discover all attributes for each entity.

4. Resolve many-to-many relationships.

5. Normalization.

The figure below is an example of a logical data model.

[Figure: Logical data model]

The differences between a conceptual data model and a logical data model are listed below:

* In a logical data model, primary keys are present, whereas in a conceptual data model, no primary key is present.

* In a logical data model, all attributes of an entity are specified, whereas no attributes are specified in a conceptual data model.

* In a conceptual data model, relationships are simply stated, not specified, so we only know that two entities are related, but we do not specify which attributes are used for the relationship. In a logical data model, the relationships between entities are specified using primary keys and foreign keys. (A brief sketch of the distinction follows this list.)
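A minimal sketch of that distinction, using plain Python dictionaries; the Customer and Order entities, their attributes and their keys are invented for the illustration.

    # Conceptual model: only the entities and the fact that they are related.
    conceptual = {
        "entities": ["Customer", "Order"],
        "relationships": [("Customer", "places", "Order")],
    }

    # Logical model: every attribute is listed, the primary key is marked,
    # and the relationship is expressed through a foreign key.
    logical = {
        "Customer": {
            "attributes": ["customer_id", "name", "email"],
            "primary_key": "customer_id",
        },
        "Order": {
            "attributes": ["order_id", "customer_id", "order_date", "total"],
            "primary_key": "order_id",
            "foreign_keys": {"customer_id": "Customer.customer_id"},
        },
    }

    print(conceptual)
    print(logical)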

– Physical Data Model: Represents how the model will actually be implemented in the database, including table names, column names, column data types, primary keys and foreign keys.

– Conceptual, Logical, and Physical Data Model: Different levels of abstraction for a data model. This part compares and contrasts the three types of data models.


– Data Integrity: What data integrity is and how it is enforced in data warehousing.

– OLAP: Stands for On-Line Analytical Processing. The first attempt to provide a definition of OLAP was by Dr. Codd, who proposed 12 rules for OLAP. It was later discovered that this particular white paper was sponsored by one of the OLAP tool vendors, which called its objectivity into question. The OLAP Report has since proposed the FASMI test: Fast Analysis of Shared Multidimensional Information.

– Bill Inmon vs. Ralph Kimball: These two data warehousing heavyweights have different views on the relationship between the data warehouse and the data mart. In the data warehousing field, we frequently hear discussions about whether a person's or organization's viewpoint falls into Bill Inmon's camp or into Ralph Kimball's camp. The difference between the two is described below.

Bill Inmon’s paradigm: Data warehouse is one part of the overall business intelligence system. An enterprise has one data warehouse, and data marts source their information from the data warehouse. In the data warehouse, information is stored in 3rd normal form.

Ralph Kimball’s paradigm: Data warehouse is the conglomerate of all data marts within the enterprise. Information is always stored in the dimensional model.

(Source: http://www.1keydata.com/datawarehousing/concepts.html)

There is no right or wrong between these two ideas, as they represent different data warehousing philosophies. In reality, the data warehouse in most enterprises is closer to Ralph Kimball's idea. This is because most data warehouses start out as a departmental effort, and hence they originate as a data mart. Only when more data marts are built later do they evolve into a data warehouse.

There are many theories that can be used in implementing a data warehouse, depending on the nature of the data and the requirements of the system needed. These concepts are adapted from http://www.1keydata.com/datawarehousing/inmon-kimball.html.

The benefits of a data warehouse to the organization

* The ability to handle server tasks related to querying that are not supported by most operational systems.

* Can be completed within a reasonable time frame.

* The setup does not require highly technically skilled workers.

* Data warehouses are unique in that they can act as a repository for cleaned data from transaction processing systems.

* Can produce reports and data extracts, which can also draw on outside data sources.

* Provides historical information for competitive analysis.

* Improved data quality and completeness.

* Enhances disaster recovery plans by providing another data backup source.

Data Mining

Introduction

Data mining is the process of analyzing data from different perspectives and summarizing it into useful information: information that can be used to increase profits, cut costs, or both. Data mining is also called data or knowledge discovery. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different angles or dimensions, categorize it, and summarize the relationships identified. In technical terms, data mining is the process of finding correlations or patterns among fields in large relational databases. The Knowledge Discovery in Databases (KDD) process consists of several steps, leading from the collection of raw data to some form of new knowledge. The process consists of the following steps² (a small end-to-end sketch follows the list):


* Data cleaning: also known as data cleansing, this is a phase in which noisy data and irrelevant data are removed from the collection.

* Data integration: at this point, multiple data sources, often heterogeneous, may be combined into a common source.

* Data selection: at this step, the data relevant to the analysis is decided on and retrieved from the data collection.

* Data transformation: also known as data consolidation, it is a phase in which the selected data is transformed into forms suitable for the mining procedure.

* Data mining: it is the crucial step in which intelligent techniques are applied to extract potentially useful patterns.

* Pattern evaluation: in this step, strictly interesting patterns representing knowledge are identified based on given measures.

* Knowledge representation: the final phase, in which the discovered knowledge is visually represented to the user. This essential step uses visualization techniques to help users understand and interpret the data mining results.
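As a rough, end-to-end illustration of these steps (the data, column names and choice of technique are invented, not taken from the text above), the sketch below cleans a small pandas DataFrame, selects and transforms two numeric fields, and mines them with k-means clustering from scikit-learn.

    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans

    # Raw collection, with a missing value and a duplicate row to clean out.
    raw = pd.DataFrame({
        "customer": ["a", "b", "c", "d", "d"],
        "age":      [25, 47, None, 52, 52],
        "spend":    [120.0, 450.0, 90.0, 610.0, 610.0],
    })

    # Data cleaning: remove duplicates and rows with missing values.
    clean = raw.drop_duplicates().dropna()

    # Data selection: keep only the fields relevant to the analysis.
    selected = clean[["age", "spend"]]

    # Data transformation: scale the values into a form suitable for mining.
    scaled = StandardScaler().fit_transform(selected)

    # Data mining: apply a clustering technique to extract patterns.
    model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(scaled)

    # Pattern evaluation / knowledge representation: inspect the discovered groups.
    print(clean.assign(cluster=model.labels_))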

Function

Data mining is mainly a set of tools for turning stored data into knowledge. It enables an organization to determine relationships among internal factors and external factors in each study. While large-scale information technology has been evolving separate transaction and analytical systems, data mining provides the link between the two. Data mining software analyzes relationships and patterns in stored transaction data based on open-ended user queries. Data mining consists of five major elements³ (a brief sketch follows the list):

* Extract, transform, and load transaction data onto the data warehouse system.

* Store and administer the data in a multidimensional database system.

* Provide data access to business analysts and information technology professionals.

* Analyze the data with application software.

* Present the data in a useful format, such as a graph or chart.
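To illustrate the last few elements in a very small way (all names and figures are invented), the sketch below stores transaction data in a pandas pivot table as a stand-in for a multidimensional store, analyzes it, and presents it as a chart with matplotlib.

    import pandas as pd
    import matplotlib.pyplot as plt

    # Transaction data after it has been extracted, transformed and loaded.
    transactions = pd.DataFrame({
        "region":  ["North", "North", "South", "South"],
        "quarter": ["Q1", "Q2", "Q1", "Q2"],
        "sales":   [120.0, 150.0, 90.0, 200.0],
    })

    # Store the data in a (tiny) multidimensional structure: region x quarter.
    cube = transactions.pivot_table(index="region", columns="quarter",
                                    values="sales", aggfunc="sum")

    # Analyze: total sales per region across quarters.
    print(cube.sum(axis=1))

    # Present the data in a useful format, such as a bar chart.
    cube.plot(kind="bar")
    plt.ylabel("sales")
    plt.tight_layout()
    plt.savefig("sales_by_region.png")  # or plt.show()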

² http://www.exinfm.com/pdffiles/intro_dm.pdf

³ http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm

Data Mining Concepts

The data mining process consists of five steps⁴ (see the short sketch after this list):

* State the problem

* Collect the data

* Perform pre-processing

* Approximate the model (mine the data)

* Interpret the model and draw the conclusions
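As a small sketch of the last two steps (the dataset and the choice of a decision tree are illustrative assumptions, not prescribed by the text above), the example below approximates a model on a built-in scikit-learn dataset and then interprets it by printing the learned rules.

    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, export_text

    # Collect the data: a built-in sample dataset stands in for real collection.
    data = load_iris()

    # Pre-processing is trivial here; real data would need cleaning first.
    X, y = data.data, data.target

    # Approximate the model (mine the data): fit a small decision tree.
    tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

    # Interpret the model and draw the conclusions: read the learned rules.
    print(export_text(tree, feature_names=list(data.feature_names)))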

⁴ http://media.wiley.com/product_data/excerpt/24/04712285/0471228524-1.pdf
