by Louchard, Guy; Szpankowski, Wojciech
Reference: Lecture Notes in Computer Science, 684 LNCS, pages 152-163
Publication: Published, 1993
Peer-reviewed article
Abstract: We consider a string edit problem in a probabilistic framework. This problem is of considerable interest to many facets of science, most notably molecular biology and computer science. String editing transforms one string into another by performing a series of weighted edit operations of overall maximum (minimum) cost. An edit operation can be the deletion of a symbol, the insertion of a symbol, or the substitution of a symbol. We assume that these weights can be arbitrarily distributed. We reduce the problem to finding an optimal path in a weighted grid graph, and provide several results regarding the typical behavior of such a path. In particular, we observe that the cost of the optimal path (i.e., the edit distance) is asymptotically almost surely (a.s.) equal to αn, where α is a constant and n is the sum of the lengths of both strings. We also obtain some bounds on α in the so-called independent model, in which all weights (in the associated grid graph) are assumed to be independent. More importantly, we show that the edit distance is well concentrated around its average value. As a by-product of our results, we also present a precise estimate of the number of alignments between two strings. To prove these findings we use techniques of random walks, diffusion limiting processes, generating functions, and the method of bounded differences.
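The reduction to an optimal path in a weighted grid graph can be illustrated with a short dynamic program. This is only a minimal sketch under stated assumptions, not the authors' construction: the function names `optimal_path_cost` and `levenshtein_weight`, the `weight(kind, i, j)` callback, and the edge labels `"del"`, `"ins"`, `"sub"` are all hypothetical and chosen here for illustration. Vertex (i, j) stands for "the first i symbols of one string have been edited into the first j symbols of the other", and each edge entering (i, j) carries the weight of one edit operation.

```python
import random


def optimal_path_cost(rows, cols, weight, maximize=False):
    """Cost of an optimal path from (0, 0) to (rows, cols) in a grid graph.

    weight(kind, i, j) is a hypothetical callback returning the weight of
    the edge that ENTERS vertex (i, j):
      "del": vertical edge   (i-1, j)   -> (i, j)
      "ins": horizontal edge (i, j-1)   -> (i, j)
      "sub": diagonal edge   (i-1, j-1) -> (i, j)
    Each edge is queried exactly once, so a random callback is safe.
    """
    best = max if maximize else min
    cost = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows + 1):
        for j in range(cols + 1):
            if i == 0 and j == 0:
                continue  # the source vertex has cost 0
            candidates = []
            if i > 0:
                candidates.append(cost[i - 1][j] + weight("del", i, j))
            if j > 0:
                candidates.append(cost[i][j - 1] + weight("ins", i, j))
            if i > 0 and j > 0:
                candidates.append(cost[i - 1][j - 1] + weight("sub", i, j))
            cost[i][j] = best(candidates)
    return cost[rows][cols]


def levenshtein_weight(a, b):
    """Unit-cost weights: a deterministic special case of the grid model."""
    def weight(kind, i, j):
        if kind == "sub":
            return 0.0 if a[i - 1] == b[j - 1] else 1.0
        return 1.0
    return weight


if __name__ == "__main__":
    # Deterministic case: the classical Levenshtein distance.
    print(optimal_path_cost(6, 7, levenshtein_weight("kitten", "sitting")))

    # Independent model (assumed here: i.i.d. uniform weights on [0, 1)).
    rng = random.Random(0)
    size = 200
    total = optimal_path_cost(size, size, lambda kind, i, j: rng.random(),
                              maximize=True)
    # Ratio of the optimal cost to n = sum of both lengths.
    print(total / (2 * size))
```

With unit weights the routine reduces to the familiar Levenshtein distance. With independent random weights, the printed ratio of the optimal cost to n gives a rough empirical feel for the constant α mentioned in the abstract; the uniform distribution used here is purely an illustrative choice.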