[Abstract] With the spread of e-commerce and the maturing of data mining technology, applying data mining to e-commerce makes it possible to extract genuinely valuable information from the huge volumes of data that e-commerce generates. This paper briefly introduces the concepts of e-commerce and data mining, and then analyzes in detail how data mining techniques are used in e-commerce.
Keywords: e-commerce, data mining, clustering, association rule mining
1 Introduction
With the spread of the Internet, e-commerce has developed at an unprecedented pace: dealers and customers trade over the Internet, saving considerable cost and time. E-commerce, however, generates a great deal of data, and how to mine truly valuable information from it to help dealers devise better marketing strategies has become an urgent problem. Data mining, also known as knowledge discovery in databases (KDD), is a technique for extracting useful information from large amounts of data. It allows dealers to dig useful information out of large data sets to support decision-making and to gain a dominant position in market competition.
2 E-commerce Overview
E-commerce refers to the various commercial activities that trading parties or participants carry out using modern information technology and computer networks (primarily the Internet), including trade in goods, trade in services, and trade in intellectual property. In the term 'e-commerce', 'modern information technology' should cover all kinds of electronic means of communication, and 'commerce' covers all matters arising from relationships of a commercial nature, whether contractual or not. If 'modern information technology' is regarded as one set and 'commerce' as another, the scope of e-commerce is the intersection of the two sets, which under the heading of 'electronic commerce' broadly includes the use of the Internet, intranets, and electronic data interchange for trade.
Compared with traditional business, e-commerce has the following advantages:
(1) E-commerce digitizes traditional business processes, turning them into electronic flows and information flows. This breaks through the limitations of time and space and greatly improves operational efficiency.
(2) E-commerce simplifies the circulation links between enterprises, and between enterprises and individuals, and minimizes distribution costs, which effectively improves an enterprise's competitiveness in modern commerce.
(3) E-commerce is commercial activity conducted on the Internet. Because the Internet is open and global in nature, e-commerce can provide rich information resources for enterprises and individuals and create more business opportunities for enterprises.
(4) E-commerce benefits large enterprises and SMEs alike. Larger enterprises handle more transactions, which e-commerce helps them manage effectively and efficiently. Small businesses benefit equally, because e-commerce lets them conduct online transactions at low cost, so that SMEs can have the same distribution channels and information resources as large enterprises, which greatly improves their competitiveness.
(5) E-commerce moves most business online, so enterprises can run a paperless office and save money.
3 Data Mining
Data mining (DM) technology developed as computers came into wide use and data accumulated. Data mining is the extraction, or 'mining', of knowledge from large amounts of data, that is, the process of discovering implicit, previously unknown, and meaningful information. Some call it 'knowledge discovery in databases' (KDD), while others regard data mining as one basic step of KDD. The knowledge discovery process consists of the following steps: (1) data cleaning; (2) data integration; (3) data selection; (4) data transformation; (5) data mining; (6) pattern evaluation; (7) knowledge representation.
From a business perspective, data mining is a new business information processing technology. Its main feature is that it extracts, transforms, analyzes, and models the large volumes of business data held in commercial databases in order to obtain the key data that support business decisions. Powerful data mining technology allows enterprises to transform data into useful information that supports decision-making and helps them gain a dominant position in market competition. Data mining differs from traditional data analysis in that it mines information and discovers knowledge without a clearly stated prior hypothesis. The information obtained by data mining should have three characteristics: it should be previously unknown, valid, and practical.
4 The Role of Data Mining in E-commerce
Data mining can serve e-commerce because it can uncover latent information in the course of e-commerce activities and use it to guide those activities. Its role in e-commerce has seven aspects:
(1) Mining customers' activity patterns so as to provide targeted, 'personalized' services on the e-commerce platform.
(2) Identifying potential customers among the visitors who browse an e-commerce Web site.
(3) Mining visitors' activity information to gain a deeper understanding of customer needs.
(4) Mining the purchase records of online customers to help formulate reasonable product and pricing strategies.
(5) Mining the sales of goods to help develop product marketing strategies and optimize promotional activities.
(6) Optimizing the information navigation of the e-commerce site so that customers can browse more easily.
(7) Discovering site performance bottlenecks from records of the network congestion that customers encounter while browsing, thereby improving the stability of the site and keeping online shopping running smoothly.
5 Techniques and Methods of Data Mining in E-commerce
The e-commerce data mining process generally includes three main stages: data preparation, data mining, and interpretation and evaluation of results.
(1) Data preparation. This stage can be divided into two steps: data selection and data preprocessing. Data selection determines the objects of the discovery task, that is, the target data: a set of data extracted from the original database according to the user's needs. Data preprocessing generally includes eliminating noise, inferring and filling in missing values, eliminating duplicate records, converting data types, and reducing the dimensionality of the data.
(2) Data mining. This stage first determines the objective of the mining and the type of knowledge to be mined. Once the mining task is determined, a suitable mining algorithm is selected according to the task and the knowledge type, and the mining operation is then carried out, using the selected algorithm to extract the required knowledge from the database.
(3) Interpretation and evaluation of results. After evaluation, the knowledge discovered in the mining stage may turn out to contain redundant or irrelevant knowledge, which must be removed. The knowledge may also fail to meet the user's needs, in which case the mining process must be repeated. In addition, since data mining ultimately serves end users, the mined knowledge must be explained and presented in a form that users can understand.
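The preprocessing step of data preparation can be illustrated with a minimal Python sketch. The records, field names, and the age threshold below are invented for illustration and do not come from any real system; dimensionality reduction is left out to keep the example short.

```python
from statistics import mean

# Hypothetical raw customer records; all fields arrive as strings.
raw_records = [
    {"id": "1", "age": "34", "income": "52000"},
    {"id": "2", "age": "", "income": "61000"},      # missing age
    {"id": "1", "age": "34", "income": "52000"},    # exact duplicate of the first record
    {"id": "3", "age": "290", "income": "48000"},   # implausible age, treated as noise
    {"id": "4", "age": "46", "income": "75000"},
]

def preprocess(records, max_age=120):
    """Remove duplicates, convert types, discard noisy ages, fill missing ages."""
    seen = set()
    cleaned = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key in seen:              # eliminate duplicate records
            continue
        seen.add(key)
        age = int(rec["age"]) if rec["age"] else None   # data type conversion
        if age is not None and age > max_age:
            age = None               # eliminate noise: treat the value as missing
        cleaned.append({"id": int(rec["id"]), "age": age,
                        "income": float(rec["income"])})
    # Fill missing ages with the mean of the known ages (one simple default).
    known_ages = [r["age"] for r in cleaned if r["age"] is not None]
    default_age = round(mean(known_ages))
    for r in cleaned:
        if r["age"] is None:
            r["age"] = default_age
    return cleaned

clean_records = preprocess(raw_records)
```

Here the mean of the known ages is only one possible default value; a median or a model-based estimate could be used instead, depending on the data.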
By mining task, data mining techniques include classification and prediction, clustering, association rule mining, regression, and sequential pattern discovery. Before choosing a technique, the problem to be solved must first be translated into the right data mining task; the task then determines which technique to use. The main data mining techniques used in e-commerce activities are described below.
5.1 Classification
Classification identifies the common characteristics of a set of objects in a database and divides them into different classes according to a classification model. Its purpose is to build a classification model or classification function that maps data items in the database to one of a given set of categories. The main classification methods include decision tree models (such as the ID3 algorithm), Bayesian classification, and the BP neural network algorithm.
Suppose we have a database describing customer attributes, including name, age, income, and occupation, and that customers can be classified by whether they have bought a product (for example, a computer). When new customers are added to the database, we would like to send them sales information about a new computer. Distributing promotional material to every new customer in the database may consume considerable effort and resources. If we send material only to those customers who are likely to buy a new computer, the cost savings are substantial. To do this, we can construct and use a classification model. The classification method analyzes a sample of data from the database to build a classification model, and then uses the model to classify the other records in the database.
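The computer-promotion scenario can be sketched with a small naive Bayes classifier, one form of Bayesian classification. This is a minimal illustration under stated assumptions, not a production model: the training sample, the attribute values ('young', 'high', and so on), and the customer names are all invented, and only two attributes are used.

```python
from collections import Counter, defaultdict

# Hypothetical training sample: (age_group, income_level) -> bought a computer?
training = [
    (("young", "high"), "yes"),
    (("young", "medium"), "yes"),
    (("young", "low"), "no"),
    (("middle", "high"), "yes"),
    (("middle", "low"), "no"),
    (("senior", "medium"), "no"),
    (("senior", "low"), "no"),
    (("senior", "high"), "yes"),
]

def train(samples):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(label for _, label in samples)
    feature_counts = defaultdict(Counter)   # (attribute index, label) -> value counts
    values = defaultdict(set)               # attribute index -> distinct values seen
    for features, label in samples:
        for i, value in enumerate(features):
            feature_counts[(i, label)][value] += 1
            values[i].add(value)
    return class_counts, feature_counts, values

def predict(model, features):
    """Return the class with the highest naive Bayes score."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, n in class_counts.items():
        score = n / total                    # prior probability of the class
        for i, value in enumerate(features):
            # Laplace smoothing avoids zero probabilities for unseen values.
            score *= (feature_counts[(i, label)][value] + 1) / (n + len(values[i]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(training)

# Send promotional material only to new customers predicted to buy.
new_customers = {"Li": ("young", "high"), "Wang": ("senior", "low")}
targets = [name for name, attrs in new_customers.items()
           if predict(model, attrs) == "yes"]
```

With this sample, only the young, high-income customer is selected for the mailing, which is the cost-saving behavior described above.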
5.2 Cluster Analysis
Cluster analysis divides a set of data into several categories according to their similarities and differences. It aims to make the similarity between data in the same category as large as possible and the similarity between data in different categories as small as possible. Cluster analysis is one of the most common data mining techniques. Commonly used methods include partitioning methods, hierarchical methods, density-based methods, and clustering algorithms for sparse high-dimensional data. Cluster analysis differs from classification in that nothing is known about the distribution of the data set before clustering. Therefore, once the clusters have been obtained, someone very familiar with the business is needed to interpret them. In many cases the clusters obtained in a single pass are not useful for the business; variables then need to be removed or added to change how the data are grouped, and a satisfactory result is reached after a few iterations. Cluster analysis is also widely used in e-commerce. One typical application is helping market analysts discover distinct customer groups in the customer database and characterize each group by its buying patterns. By clustering customers on extracted features, the customer base can be segmented into finer markets that receive targeted services.
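As an illustration of a partitioning method, the following sketch implements a basic k-means procedure in plain Python. The customer features (orders per year, average order value), the data values, and the choice of the first k points as initial centroids are assumptions made for this example; real data would normally be scaled first so that no single feature dominates the distance.

```python
import math

def kmeans(points, k, iterations=20):
    """Basic k-means: returns the final centroids and a cluster label per point."""
    centroids = [points[i] for i in range(k)]   # assumption: first k points as seeds
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[c])   # keep an empty cluster's centroid
        if new_centroids == centroids:
            break                                    # converged
        centroids = new_centroids
    labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
              for p in points]
    return centroids, labels

# Hypothetical customers: (orders per year, average order value)
customers = [(2, 20.0), (25, 150.0), (3, 25.0), (30, 160.0), (1, 15.0), (28, 155.0)]
centroids, labels = kmeans(customers, k=2)
```

In this toy data the two resulting groups correspond to occasional low-spending customers and frequent high-spending customers; interpreting such groups is the step that requires business knowledge.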
5.3 Association Rule Mining
Association rules describe relationships among items in a database: from the presence of certain items in a transaction, it can be inferred that other items also appear in the same transaction. They capture associations hidden in the data, such as the correlation among the different commodities bought in one purchase. In e-commerce, discovering interesting associations in a large number of transaction records can support many business decisions. The earliest and most typical form of association rule mining is market basket analysis, which analyzes customers' buying habits by finding links between the different products they place in their shopping baskets. For example, if a customer buys milk on a trip to the supermarket, how likely is he to buy bread (of any type) on the same trip? This information can help retailers decide what to sell and how to arrange their shelves so as to guide sales. For example, placing milk and bread close together may further encourage customers to buy both in a single visit. In e-commerce, Web server log files record users' visits, so association rule mining applied to these records can reveal the correlations among the products customers buy online, their preference for and loyalty to certain brands, the price range they find acceptable, and their packaging requirements. The mining results can help site managers plan and decide which types of goods to invest in, how to price them, and which new products to introduce.
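The milk-and-bread question can be made concrete with the two standard measures for an association rule, support and confidence, which the text above does not name explicitly. The sketch below computes them for a handful of invented baskets; the transactions and item names are assumptions for illustration.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"milk", "butter"},
    {"bread", "jam"},
    {"milk", "bread", "butter"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Among transactions containing the antecedent, the fraction that
    also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / count(antecedent, transactions)

rule_support = support({"milk", "bread"}, transactions)
rule_confidence = confidence({"milk"}, {"bread"}, transactions)
```

In this sample, three of the five baskets contain both milk and bread, and three of the four baskets with milk also contain bread, so the rule 'milk implies bread' has a confidence of three quarters. A full mining algorithm such as Apriori would search all itemsets above chosen thresholds rather than test a single rule.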
5.4 Sequential Pattern Analysis
Sequential pattern analysis is similar to association rule mining, but it focuses on the order in which data occur. It finds patterns in a database of the following form: within a certain period, customers buy product A, then product B, then product C; that is, the sequence A, B, C occurs with high frequency. An example of a sequential pattern is: 'a customer who bought a Pentium PC nine months ago is likely to order a new CPU chip within a month.'
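A minimal sketch of the idea: given each customer's purchase history in time order, measure how often the products A, B, and C appear in that order. The histories are invented, and the sketch ignores the time-window constraint ('within a certain period') for brevity.

```python
def contains_in_order(history, pattern):
    """True if the items of pattern occur in history in the same order,
    not necessarily adjacent."""
    remaining = iter(history)
    # 'item in remaining' advances the iterator past the match, so each
    # later item must be found after the previous one.
    return all(item in remaining for item in pattern)

def sequence_support(histories, pattern):
    """Fraction of customers whose history contains the pattern in order."""
    matches = sum(1 for h in histories if contains_in_order(h, pattern))
    return matches / len(histories)

# Hypothetical purchase histories, one time-ordered list per customer.
histories = [
    ["A", "B", "C"],
    ["A", "D", "B", "C"],
    ["B", "A", "C"],       # B comes before A, so the pattern does not match
    ["A", "B"],            # C is never bought
]

abc_support = sequence_support(histories, ["A", "B", "C"])
```

Two of the four hypothetical customers follow the A, B, C order, so the pattern's support in this sample is one half. A real sequential pattern miner would also enumerate candidate sequences and apply time constraints.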
6 Conclusion
The information and data generated in e-commerce processes are the foundation of e-commerce activities. To conduct e-commerce better, enterprises can extract valuable information by selecting appropriate data mining techniques, so that they make the right decisions in fierce market competition and maintain a strong competitive advantage. As data mining techniques continue to develop, we believe they will be used more widely in e-commerce and will promote its development more quickly and efficiently.