Remember how Walmart was able to increase its sales by anticipating demand for Pop-Tarts? That is what association rule mining does. Proteins are sequences built from twenty types of amino acids. Each protein has a unique 3D structure that depends on the order of these amino acids, and a slight change in the sequence may alter the structure in a way that impairs the protein's function. This dependence of protein function on amino acid sequence has been the subject of much research. These sequences were once considered random, but that is no longer believed to be the case. Nitin Gupta, Nitin Mangal, Kamal Tiwari and Pabitra Mitra have deciphered the nature of the associations between the different amino acids present in a protein. Knowing and understanding these association rules will be extremely useful in the synthesis of artificial proteins. A rule may not stand out particularly well in a data set as a whole, yet continued analysis may show that it holds very often whenever its antecedent appears. This would be a case of high confidence and low support.
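Support and confidence can be written down compactly. The notation below is an assumption introduced for illustration and does not come from the original text: $T$ is the set of transactions, $Z$ is any itemset, and $X \Rightarrow Y$ is a rule with antecedent $X$ and consequent $Y$.

```latex
\mathrm{supp}(Z) = \frac{\left|\{\, t \in T : Z \subseteq t \,\}\right|}{|T|},
\qquad
\mathrm{supp}(X \Rightarrow Y) = \mathrm{supp}(X \cup Y),
\qquad
\mathrm{conf}(X \Rightarrow Y) = \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X)}
```

For instance, with hypothetical counts of 1,000 transactions, 10 of which contain $X$ and 9 of which contain both $X$ and $Y$, the rule has support $9/1000 = 0.009$ and confidence $9/10 = 0.9$: low support, high confidence.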

Using these measures helps analysts separate causation from correlation and allows them to evaluate a particular rule correctly. Tarik Agouti is an Associate Professor in the Department of Computer Science at Cadi Ayyad University, Morocco. His research interests include mathematical economics, supply chain management, operations research, information systems, decision systems, data mining, GIS, spatial databases, fuzzy logic, multi-criteria analysis, and distributed systems. High-order pattern discovery makes it easier to capture high-order (polythetic) patterns or event associations, which are intrinsic to complex real-world data. [40] Although the concepts behind association rules date back earlier, association rule mining was defined in the 1990s when computer scientists Rakesh Agrawal, Tomasz Imieliński and Arun Swami developed an algorithm-based method for finding relationships between items using point-of-sale (POS) systems. By applying the algorithms to supermarket data, the scientists were able to discover links between different items purchased, called association rules, and ultimately use this information to predict the likelihood of different products being purchased together. Recently, with the rapid expansion of information technology, data analysis has become increasingly complex. To overcome these challenges, many approaches have been proposed.

Park et al. [7] proposed a classification-based approach to build a predictive model that could handle large volumes of data in the transportation domain using the Hadoop MapReduce framework [8]. MapReduce is a good solution for single-pass computations but is not very efficient for use cases that require multi-pass computations; it tends to be slow because of the huge space requirements of every job. In addition, Chen et al. [9] proposed an evolutionary algorithm, niche-aided gene expression programming (NGEP), to address the problem of computational cost and inefficiency in reaching the goal. Although these methods, along with many other published works [10,11,12], have contributed a great deal to association rule mining, they still suffer from limited accuracy and relevance of the extracted rules. The motivation for this research is to propose an approach that addresses this problem. In this context, we propose an approach based on Apache Spark [13] and multi-criteria decision analysis to extract relevant association rules using the parallel FP-growth algorithm [14]. The approach consists of four main steps: preprocessing the data, mining frequent patterns without candidate generation, extracting association rules, and prioritizing the extracted rules. In order to select interesting rules from the set of all possible rules, constraints on various measures of significance and interest are used. The best-known constraints are minimum thresholds on support and confidence.
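The middle two steps, mining frequent itemsets and then extracting rules under minimum support and confidence thresholds, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it uses brute-force enumeration rather than FP-growth or Spark, and the function names, the transactions, and the threshold values are all hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support.

    Brute-force, level-by-level enumeration for illustration only.
    This is NOT FP-growth, which avoids generating candidates like this.
    """
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            support = sum(1 for t in transactions if cset <= t) / n
            if support >= min_support:
                result[cset] = support
                found = True
        if not found:
            break  # no frequent itemset of this size, so no larger one exists
    return result


def association_rules(itemsets, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in itemsets.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                a = frozenset(antecedent)
                # Every subset of a frequent itemset is itself frequent,
                # so its support is already in the dictionary.
                confidence = support / itemsets[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, support, confidence))
    return rules


# Hypothetical market-basket data.
transactions = [
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk"}),
    frozenset({"butter", "milk"}),
    frozenset({"bread", "butter", "milk"}),
]

itemsets = frequent_itemsets(transactions, min_support=0.3)
rules = association_rules(itemsets, min_confidence=0.6)
for antecedent, consequent, support, confidence in rules:
    print(sorted(antecedent), "=>", sorted(consequent),
          f"support={support:.2f} confidence={confidence:.2f}")
```

Raising either threshold shrinks the rule set, which is why the two thresholds are the usual first filter before any further prioritization.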

As the name suggests, association rules are simple if/then statements that help identify relationships between seemingly independent data in relational databases or other data repositories. For larger datasets, a minimum threshold, or percentage cutoff, on confidence can be useful for determining relationships between items. When this method is applied to some of the data in Table 2, information that does not meet the requirement is removed. Table 4 shows examples of association rules where the minimum confidence threshold is 0.5 (50%). Any rule with a confidence below 0.5 is omitted. Setting thresholds helps make the association between items stronger as the data is explored further, by emphasizing the items that co-occur most often. The table uses the confidence information from Table 3 to implement the Support × Confidence column, which highlights the relationship between items through both their confidence and their support, rather than just one of the two. Ranking rules by Support × Confidence multiplies the confidence of a particular rule by its support and is often implemented for a deeper understanding of the relationship between the items. An example rule for a supermarket could be $\{\mathrm{butter, bread}\} \Rightarrow \{\mathrm{milk}\}$, which means that when butter and bread are bought, customers also buy milk. Association rule mining is a technique that aims to observe frequently occurring patterns, correlations, or associations in datasets found in various kinds of databases, such as relational databases, transactional databases, and other forms of repositories. Data mining techniques and the extraction of patterns from large data sets play an important role in knowledge discovery.
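The confidence cutoff and the Support × Confidence ranking described here can be sketched as follows. Tables 2 to 4 are not reproduced in this text, so the rules and their support and confidence values below are hypothetical, as is the function name `filter_and_rank`.

```python
# Hypothetical rules: (antecedent, consequent, support, confidence).
candidate_rules = [
    (("butter", "bread"), ("milk",), 0.20, 0.80),
    (("milk",), ("bread",), 0.40, 0.50),
    (("bread",), ("butter",), 0.30, 0.45),
    (("milk",), ("butter",), 0.10, 0.90),
]


def filter_and_rank(rules, min_confidence=0.5):
    """Drop rules below the confidence cutoff, then sort the rest by
    support * confidence, highest first."""
    kept = [r for r in rules if r[3] >= min_confidence]
    return sorted(kept, key=lambda r: r[2] * r[3], reverse=True)


ranked = filter_and_rank(candidate_rules)
for antecedent, consequent, support, confidence in ranked:
    print(antecedent, "=>", consequent,
          f"support x confidence = {support * confidence:.3f}")
```

With these made-up numbers the rule with the highest confidence ends up last, because its support is low; that is the effect the combined column is meant to expose.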

Most decision-makers face a large number of decision rules resulting from association rule mining. In addition, the volume of datasets brings a new challenge for pattern extraction, namely the computational cost and inefficiency of reaching the relevant rules. To overcome these challenges, this article aims to build a learning model based on FP-growth and the Apache Spark framework to process and extract relevant association rules. We also integrate multi-criteria decision analysis to prioritize the extracted rules, taking into account the decision-maker's subjective judgment. We believe that this approach would be a useful model, especially for decision-makers who struggle with conflicts between extracted rules and with the difficulty of retaining only the most interesting ones. The experimental results on road accident analysis show that the proposed approach can extract more association rules with a higher accuracy rate and can improve the response time of the algorithm. The results clearly show that the proposed approach performs well and can provide useful information that could help decision-makers improve road safety. Association rules in medical diagnosis can help doctors treat patients. Diagnosis is not an easy process and is subject to errors that can lead to unreliable end results. Using relational association rule mining, we can identify the probability of the occurrence of an illness based on various factors and symptoms. In addition, this interface can be extended using learning techniques by adding new symptoms and defining relationships between the new signs and the corresponding diseases.