Term paper data warehousing and data mining

Emphasizing applicability to real-world problems, this journal meets the needs of both academic researchers and practicing IT professionals. The journal is devoted to the publication of high-quality papers on theoretical developments and practical applications in data warehousing and data mining. Original research papers, state-of-the-art reviews, and technical notes are invited for publication. The journal accepts submissions of any work relevant to data warehousing and data mining. Special attention will be given to papers focusing on mining of data from data warehouses; integration of databases, data warehousing, and data mining; and holistic approaches to mining and archiving data.

Technologies that offer valuable insight and predictive capabilities to drive business growth and improve ROI are a great next step after the data warehouse is in place.


Data mining is just the right technology for supercharging CRM and analytic applications by inserting intelligence in the form of predictions, scores, descriptions, and profiles where data mining excels. Volumes of historical data containing facts about what occurred in business operations can be analyzed and used to predict what will happen in the future. Data mining is one of the fastest growing business intelligence technologies because it pays off in quantitative value.

Here are just a few results from companies that have embraced data mining:

What Exactly Is Data Mining?


Data mining is a powerful technology that converts detail data into competitive intelligence that businesses can use to predict future trends and behaviors. Some vendors define data mining as a tool or as the application of an algorithm to data. The truth is, data mining is not just a tool or algorithm. Data mining is a process of discovering and interpreting previously unknown patterns in data to solve business problems. Data mining is an iterative process, which means that each cycle further refines the result set.

This can be a complex process, but there are tools and approaches available today to help you navigate successfully through the steps of data mining projects. From an IT perspective, the data mining process requires support for the following activities:

Therefore, the IT organization must provide an environment capable of addressing the following challenges:

Data Mining Makes Its Way to the Business World

Data mining has been very effective in focused areas, such as medical diagnosis, scientific research, and behavioral profiling since the mids.

In the past 10 years, data mining technology has journeyed into the business world where it has added the new dimension of predictive analysis. To be effective in the business world, the data mining process had to be adapted to deliver models in a much more time-sensitive manner. Today, with the advent of in-database data mining techniques, businesses have finally found it possible and affordable to benefit from the advanced capabilities of this powerful technology. For years, businesses have relied on reports and ad hoc query tools to glean useful information from data.

However, as data volumes continue to increase, finding valuable information becomes a daunting task. Data mining technology was designed to sift through detailed historical data to identify hidden patterns that are not obvious to humans or query tools. Many of these previously hidden patterns reveal intelligence that can be integrated into business processes to provide predictive capabilities for improving strategic business decision making.

Data mining makes analytical business applications, such as CRM, smarter by providing insights into many new areas of your business that would otherwise go unnoticed. By making your applications smarter, data mining translates into a higher return on your warehouse investment. OLAP results are also factual results. But what if you want to make a prediction about future demand for portable CD players with a high degree of confidence so that the amount in inventory will fulfill demand?

These types of business questions challenge traditional query and OLAP techniques beyond their capabilities. Data mining, on the other hand, is a form of discovery-driven analysis where statistical and machine-learning techniques are used to make predictions or estimates about outcomes or traits before knowing their true values.

With data mining, predictions are accompanied by specific estimates of the sources and number of errors that are likely to be made. Estimates of errors translate directly to estimates of risk. Consequently, with data mining, making business decisions in the presence of uncertainty can be done with detailed and reliable information about associated risks. Data mining techniques are used to find meaningful, often complex, and previously unknown patterns in data.
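One way to picture how error estimates arise is to compare a model's predictions with known outcomes on cases held back from model building. The sketch below is only an illustration: the held-out data is hypothetical, and the function name `error_profile` is an assumption introduced here, not something from a specific data mining product.

```python
from collections import Counter

def error_profile(actual, predicted):
    """Tally each kind of outcome for yes/no predictions on held-out cases."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if a and p:
            counts["true_positive"] += 1
        elif p:
            counts["false_positive"] += 1   # predicted yes, actually no
        elif a:
            counts["false_negative"] += 1   # predicted no, actually yes
        else:
            counts["true_negative"] += 1
    errors = counts["false_positive"] + counts["false_negative"]
    return counts, errors / sum(counts.values())

# Hypothetical held-out customers: did they respond, and what did the model say?
actual    = [True, True, False, False, True, False, False, True, False, False]
predicted = [True, False, False, True, True, False, False, True, False, False]

counts, error_rate = error_profile(actual, predicted)
print(dict(counts), error_rate)
```

Separating false positives from false negatives matters because the two kinds of error usually carry different business costs, such as a wasted mailing versus a missed sale.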

Typically, OLAP analyses use predefined, summarized, or aggregated data, such as multi-dimensional cubes. Data mining requires detail data so that it can be aggregated to, and analyzed at, optimal levels during exploratory analyses. The optimal levels are unique to a specific business question and the data attributes available to address it in a specific data warehouse. Although these technologies are used for different purposes, OLAP and data mining are complementary. During the data mining exploration phase, you may use OLAP technology to help you understand your data.

Data mining results can also be used in OLAP applications by incorporating new predictive variables or scores as dimensions or attributes in your OLAP tool. When retailers analyze which products to stock, they can consider products that attract high-value or profitable customers.

How Does Data Mining Work?

Data mining leverages artificial intelligence and statistical techniques to build models.


Data mining models are built from situations where you know the outcome. These models are then applied to other situations where you do not know the outcome. For example, if your data warehouse identifies customers who have responded to past marketing campaigns, you can create a model that identifies the characteristics of those customers. This model can be applied to a wider customer database, identifying customers who demonstrate the same characteristics, allowing you to target those likely to respond, thereby improving response rates and reducing marketing cost. Business problems that lend themselves to data mining are predictive and descriptive in nature.
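The campaign example above can be sketched in a few lines. This is a deliberately simple stand-in for a real modeling algorithm: the "model" is just the observed response rate for each customer profile, and all field names (`age_band`, `owns_card`, `responded`) and records are hypothetical.

```python
from collections import defaultdict

def build_response_model(history, features):
    """Learn the observed response rate for each combination of feature values."""
    tallies = defaultdict(lambda: [0, 0])  # profile -> [responders, total]
    for row in history:
        profile = tuple(row[f] for f in features)
        tallies[profile][0] += row["responded"]
        tallies[profile][1] += 1
    return {profile: hits / total for profile, (hits, total) in tallies.items()}

def score(model, customer, features, default=0.0):
    """Look up the learned response rate for a customer whose outcome is unknown."""
    return model.get(tuple(customer[f] for f in features), default)

features = ["age_band", "owns_card"]

# Known outcomes: customers from past campaigns (hypothetical data).
history = [
    {"age_band": "18-34", "owns_card": True,  "responded": 1},
    {"age_band": "18-34", "owns_card": True,  "responded": 1},
    {"age_band": "18-34", "owns_card": False, "responded": 0},
    {"age_band": "35-54", "owns_card": True,  "responded": 1},
    {"age_band": "35-54", "owns_card": True,  "responded": 0},
    {"age_band": "35-54", "owns_card": False, "responded": 0},
]

# Unknown outcomes: the wider customer base to be scored.
prospects = [
    {"id": "A", "age_band": "18-34", "owns_card": True},
    {"id": "B", "age_band": "35-54", "owns_card": True},
    {"id": "C", "age_band": "35-54", "owns_card": False},
]

model = build_response_model(history, features)
targets = [p["id"] for p in prospects if score(model, p, features) >= 0.5]
print(targets)
```

The 0.5 cutoff is an arbitrary choice for the sketch; in practice the threshold would depend on the cost of a contact and the value of a response.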

Predictive models are used to predict an outcome, referred to as the dependent or target variable, based on the value of other variables in the data set. For example, a predictive model could determine the likelihood that a customer will purchase a product based on her income, number of children, current product ownership, or debt. The algorithm analyzes the values of all input variables and identifies which variables are significant as predictors for a desired outcome. Unlike predictive models, descriptive models do not predict variables based on known outcomes.
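To illustrate how an algorithm might judge which input variables are significant predictors, the sketch below ranks variables by information gain, the measure used by many decision-tree learners. The text does not name a specific algorithm, so this is one possible choice among several, and the customer records and variable names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of outcome labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, variable, target):
    """Reduction in target entropy obtained by splitting the rows on one variable."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[variable]].append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder

# Hypothetical customers with a known outcome ("purchased").
rows = [
    {"income": "high", "has_children": "yes", "purchased": "yes"},
    {"income": "high", "has_children": "no",  "purchased": "yes"},
    {"income": "high", "has_children": "yes", "purchased": "yes"},
    {"income": "low",  "has_children": "no",  "purchased": "no"},
    {"income": "low",  "has_children": "yes", "purchased": "no"},
    {"income": "low",  "has_children": "no",  "purchased": "yes"},
]

gains = {v: information_gain(rows, v, "purchased") for v in ("income", "has_children")}
ranking = sorted(gains, key=gains.get, reverse=True)
print(ranking, gains)
```

In this toy data, income separates purchasers from non-purchasers far better than the presence of children, so it ranks first.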

Instead, they describe a particular pattern that has no known outcome. Common techniques include data visualization, where large volumes of data are reduced to a picture that can be easily understood. Another common descriptive technique is clustering, where data is grouped into subsets based on common attributes. For example, you may use descriptive techniques to determine customer segments and their attributes.
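As a rough illustration of clustering for customer segmentation, the following sketch implements plain k-means, one common clustering technique (the text does not prescribe a particular one). The customer data, the two attributes, and the choice of starting centroids are all assumptions made for the example.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Squared Euclidean distance from this point to every centroid.
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical customers: (annual spend in $1000s, store visits per month)
customers = [(1.0, 1), (1.5, 2), (1.2, 1), (8.0, 9), (9.0, 10), (8.5, 8)]

centroids, segments = kmeans(customers, centroids=[customers[0], customers[3]])
for center, members in zip(centroids, segments):
    print(center, members)
```

No outcome variable appears anywhere in the code: the segments emerge only from the similarity of the attributes, which is what distinguishes a descriptive technique from a predictive one.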

In many cases, both descriptive and predictive models are used to solve business problems. A descriptive technique may identify customer segments based on value in terms of profitability to your business, and a predictive technique may identify the likelihood that a particular segment will defect to your competitor. By combining the results of the descriptive technique with the predictions of customer defection, you can act to prevent attrition of your high-value customers.

The Data Mining Process

You cannot buy a data mining product, apply it to data, and expect to generate a meaningful model.

Data mining models are built as part of a data mining process, an ongoing process requiring maintenance throughout the life of the model. The data mining process is not linear but iterative: you may loop back to a previous phase. For example, the initial model you create may lead to insight requiring you to return to the data pre-processing phase to create new analytical variables.

The data mining process contains four high-level steps: (1) define the business problem, (2) explore and pre-process the data, (3) develop the data model, and (4) deploy knowledge.


A Data Mining Primer for the Data Warehouse Professional

Tasks for each step are listed in Figure 1 to provide a brief overview of the data mining process. Although each step is important, most of your time will be spent in the data exploration and pre-processing phase. A well-structured data warehouse can significantly reduce the pain felt in this phase. You can mine inconsistent or dirty data and find patterns. However, the patterns will be meaningless if your data does not accurately reflect the business you are modeling. The key to data mining is ensuring that you have a foundation of quality data that is clean, consistent, and accurate.

A data warehouse provides the right foundation for data mining. Although data mining can be done without having a warehouse in place, the process of gathering, cleansing, and transforming the data from multiple data sources can be arduous.


Once the process has been completed for one model, you must repeat the process for subsequent data mining projects. Approximately 70 percent of the data mining process involves accessing, exploring, and preparing the data. The data warehouse makes data mining more viable by removing many of the data redundancy and system management issues, allowing people to focus on analysis.

Data Mining Terms and Techniques

Following are some data mining terms and techniques commonly used to solve predictive and descriptive analytical problems.

Analytic Model

A model is a set of logical rules or a mathematical formula that represents patterns found in data that are useful for a business purpose.

Once a model has been built based on one set of data, it can be reused to search for the discovered patterns in similar data.


Models are sometimes called predictive models since they can be used to predict behaviors based on the discovered patterns.

Association

This modeling technique is commonly referred to as affinity analysis and is used to identify items that occur together during a particular event. Affinity analysis is commonly used to study market baskets by identifying which combinations of products are most likely to be purchased together. Another form of this technique is sequence analysis in which you can understand the order in which customers tend to purchase specific products.
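A minimal market-basket sketch, assuming the usual support, confidence, and lift measures for pairs of items, might look like the following. The baskets are hypothetical, the function name `pair_rules` is introduced only for this example, and sequence analysis is not covered.

```python
from collections import Counter
from itertools import combinations

def pair_rules(baskets, min_support=0.3):
    """Score item pairs that appear together in at least min_support of all baskets."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in set(basket))
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(set(basket)), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n                       # share of baskets containing both items
        if support < min_support:
            continue
        confidence = count / item_counts[a]       # share of baskets with a that also hold b
        lift = confidence / (item_counts[b] / n)  # above 1 means bought together more than chance
        rules.append({"if": a, "then": b, "support": support,
                      "confidence": confidence, "lift": lift})
    return sorted(rules, key=lambda r: r["lift"], reverse=True)

# Hypothetical market baskets.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "jam"],
    ["bread", "butter"],
    ["milk", "cereal"],
    ["milk", "cereal"],
    ["milk", "bread"],
]

rules = pair_rules(baskets)
for rule in rules:
    print(rule)
```

Only one direction of each pair is scored here (alphabetical order), which keeps the sketch short; lift is symmetric, but confidence would differ for the reverse rule.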

These results may be helpful in the early phases of establishing cross-selling strategies.

Clustering

Clustering is a class of modeling techniques that can be used to place items into groups based on like characteristics.