by Louchard, Guy; Szpankowski, Wojciech
Reference: Data Compression Conference Proceedings, pages 262-271
Publication: Published, 1995
Peer-reviewed article
Abstract: The goal of this contribution is twofold: (i) to introduce a generalized Lempel-Ziv parsing scheme, and (ii) to analyze second-order properties of some compression schemes based on this parsing scheme. We consider a generalized Lempel-Ziv parsing scheme that partitions a sequence of length n into variable-length phrases (blocks) such that a new block is the longest substring seen in the past by at most b-1 phrases. The case b = 1 corresponds to the original Lempel-Ziv scheme. In this paper, we investigate the size of a randomly selected phrase and the average number of phrases of a given size by analyzing the so-called b-digital search tree (b-DST) representation. For a memoryless source, we prove that the size of a typical phrase is asymptotically normally distributed. This result is new even for b = 1, and the case b > 1 is a non-trivial extension.
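To make the parsing rule concrete, the following is a minimal sketch rather than the authors' implementation. The abstract's one-line description leaves the exact rule open, so the sketch assumes this reading: the next phrase is the shortest prefix of the unparsed remainder that matches fewer than b earlier phrases, which at b = 1 reduces to the incremental Lempel-Ziv parsing. The function name `generalized_lz_parse` and the use of a plain counter in place of a b-DST are illustrative choices, not taken from the paper.

```python
from collections import Counter


def generalized_lz_parse(text, b=1):
    """Split `text` into phrases under the assumed generalized rule.

    Assumption: a candidate prefix is accepted as the next phrase as soon
    as it has been used as a phrase fewer than `b` times so far.
    """
    if b < 1:
        raise ValueError("b must be at least 1")

    seen = Counter()   # phrase -> number of times it has been emitted
    phrases = []
    start = 0
    n = len(text)

    while start < n:
        end = start + 1
        # Extend the candidate while it has already been used b times.
        while end < n and seen[text[start:end]] >= b:
            end += 1
        phrase = text[start:end]
        # The final phrase may be an incomplete one if the text runs out.
        seen[phrase] += 1
        phrases.append(phrase)
        start = end

    return phrases


if __name__ == "__main__":
    sample = "ababbababaaaaaaaaac"
    print(generalized_lz_parse(sample, b=1))
    # ['a', 'b', 'ab', 'ba', 'bab', 'aa', 'aaa', 'aaaa', 'c']
    print(generalized_lz_parse(sample, b=2))
    # ['a', 'b', 'a', 'b', 'ba', 'ba', 'baa', 'aa', 'aa', 'aaa', 'c']
```

Under this reading, each distinct phrase can occur up to b times, which mirrors a b-DST node holding up to b strings. The counter keeps the sketch short; a trie-based version would avoid re-slicing the text for each candidate.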