WebDocs: a real-life huge transactional dataset
Lucchese C; Orlando S; Perego R; Silvestri F
2004
Abstract
This short note describes the main characteristics of WebDocs, a huge real-life transactional dataset that we made publicly available to the Data Mining community through the FIMI repository. We built WebDocs from a spidered collection of HTML web documents. The whole collection contains about 1.7 million documents, mainly written in English, and its size is about 5 GB.

Files in this record:
File | Description | Type | Size | Format | Access
---|---|---|---|---|---
prod_91780-doc_125585.pdf | WebDocs: a real-life huge transactional dataset | Publisher's version (PDF) | 858.22 kB | Adobe PDF | Authorized users only