par Calders, Toon ;Ramon, J.;Van Dyck, Dries
Référence Quantitative Graph Theory: Mathematical Foundations and Applications, CRC Press, page (303-324)
Publication Publié, 2014
Partie d'ouvrage collectif
Résumé : Over the last several years, graph mining has emerged as a new field within contemporary data mining. One of the central tasks is the search for subgraphs, called patterns, that occur frequently in either a collection of graphs (e.g., databases of molecules [6], game positions [15], scene descriptions) or in a single large graph (e.g., the Internet, citation networks [16], social networks [12], protein interaction networks [10]). In the literature, the terms frequency and support have been used interchangeably to denote the measure to quantify the prevalence of a pattern. In the single-graph setting, however, the notion of frequency is not at all straightforward to define. For example, the obvious definition of taking the number of instances of a pattern as its frequency has the undesirable property that extending the pattern (i.e., making it more restrictive) may actually increase its frequency. Indeed, consider for instance, the unlabeled k-clique Kk as the single graph in which we want to find patterns. There are [see content source for equation details] different embeddings under subgraph isomorphism of the unlabeled path of length 1 in Kk, whereas there are 3[see content source for equation details] embeddings of the path of length 2 in Kk. In fact, the number of different embeddings may increase exponentially in the size of the pattern. Hence, as pointed out by Vanetik et al. [17], a good frequency measure must be such that the frequency of a superpattern is always at most as high as that of a subpattern. This property is called the antimonotonicity. Also, for reasons of efficiency, antimonotonicity of the frequency measure is highly desirable, as it allows for pruning large parts of the search space in a general-to-specific exploration. The efficiency and correctness of most existing graph pattern miners relies critically on the antimonotonicity of the frequency measure being used.