CNR Institutional Research Information System

Two space saving tricks for linear time LCP array computation

Manzini, Giovanni

2004

Abstract

In this paper we consider the linear time algorithm of Kasai et al. [6] for the computation of the Longest Common Prefix (LCP) array given the text and the suffix array. We show that this algorithm can be implemented without any auxiliary array in addition to the ones required for the input (the text and the suffix array) and the output (the LCP array). Thus, for a text of length n, we reduce the space occupancy of this algorithm from 13n bytes to 9n bytes. We also consider the problem of computing the LCP array by "overwriting" the suffix array. For this problem we propose an algorithm whose space occupancy can be bounded in terms of the empirical entropy of the input text. Experiments show that for linguistic texts our algorithm uses roughly 7n bytes. Our algorithm makes use of the Burrows-Wheeler Transform even though it does not represent any data in compressed form. To our knowledge this is the first application of the Burrows-Wheeler Transform outside the domain of data compression. The source code for the algorithms described in this paper has been included in the lightweight suffix sorting package [13], which is freely available under the GNU GPL. © Springer-Verlag Berlin Heidelberg 2004.
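For orientation, the following is a minimal Python sketch of the textbook form of the Kasai et al. algorithm that the abstract takes as its starting point. It is not the paper's space-saving variant: it keeps the explicit `rank` array (the inverse of the suffix array), which is exactly the auxiliary array the paper shows can be avoided. Assuming one byte per character and four bytes per integer, the 13n figure is consistent with text (n) + suffix array (4n) + LCP array (4n) + rank array (4n), and dropping the rank array leaves 9n. The function names `naive_suffix_array` and `kasai_lcp` are illustrative and do not come from the paper or its software package; the suffix array here is built naively only to keep the example self-contained.

```python
def naive_suffix_array(text):
    """Build a suffix array by sorting suffixes directly.

    Quadratic-or-worse and for demonstration only; a real suffix sorter
    would be used in practice.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def kasai_lcp(text, sa):
    """Compute the LCP array in linear time from the text and suffix array.

    lcp[r] is the length of the longest common prefix of the suffixes
    at sa[r - 1] and sa[r]; lcp[0] is defined as 0.
    """
    n = len(text)

    # Auxiliary rank array: rank[i] is the position of suffix i in sa.
    # This is the extra 4n bytes in the 13n accounting (an assumption of
    # 4-byte integers, stated in the lead-in).
    rank = [0] * n
    for r, start in enumerate(sa):
        rank[start] = r

    lcp = [0] * n
    h = 0  # current match length, carried over between iterations
    for i in range(n):  # visit suffixes in text order, not sorted order
        r = rank[i]
        if r == 0:
            h = 0  # the smallest suffix has no predecessor
            continue
        j = sa[r - 1]  # suffix that precedes suffix i in sorted order
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[r] = h
        if h > 0:
            h -= 1  # the next suffix shares at least h - 1 characters
    return lcp


if __name__ == "__main__":
    text = "banana"
    sa = naive_suffix_array(text)
    print(sa)                   # [5, 3, 1, 0, 4, 2]
    print(kasai_lcp(text, sa))  # [0, 1, 3, 0, 0, 2]
```

The key step is that `h` decreases by at most one per iteration, so the total number of character comparisons stays linear in n.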


Year: 2004
Language(s): English
Conference title: 9th Scandinavian Workshop on Algorithm Theory
Volume: 3111
First page: 372
Last page: 383
ISBN: 3-540-22339-8
DOI: https://dx.doi.org/10.1007/978-3-540-27810-8_32
URL: http://www.scopus.com/record/display.url?eid=2-s2.0-35048818017&origin=inward
Peer reviewed: Yes, but type not specified
Conference dates: July 8-10, 2004
Conference location: Humlebaek, Denmark
Keywords: Suffix Array; LCP array; Compression
Scopus ID: 2-s2.0-35048818017
Web of Science ID: WOS:000222682500032
Number of authors: 2
Fulltext: none
All authors: Manzini, Giovanni; Manzini, Giovanni
MIUR login type: 273
Type: info:eu-repo/semantics/conferenceObject
Type: 04 Conference contribution::04.01 Contribution in conference proceedings
Appears in types: 04.01 Contribution in conference proceedings

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/318203
