Artigo Acesso aberto Revisado por pares

Permuting Web and Social Graphs

2009; Taylor & Francis; Volume: 6; Issue: 3 Linguagem: Inglês

10.1080/15427951.2009.10390641

ISSN

1944-9488

Autores

Paolo Boldi, Massimo Santini, Sebastiano Vigna,

Tópico(s)

Caching and Content Delivery

Resumo

Since the first investigations on web-graph compression, it has been clear that the ordering of the nodes of a web graph has a fundamental influence on the compression rate (usually expressed as the number of bits per link). The authors of the LINK database [Randall et al. 02], for instance, investigated three different approaches: an _extrinsic_ ordering (URL ordering) and two _intrinsic_ orderings based on the rows of the adjacency matrix (lexicographic and Gray code); they concluded that URL ordering has many advantages in spite of a small penalty in compression. In this paper we approach this issue in a more systematic way, testing some known orderings and proposing some new ones. Our experiments are made in the WebGraph framework [Boldi and Vigna 04], and show that the compression technique and the structure of the graph can produce significantly different results. In particular, we show that for a _transposed_ web graph, URL ordering is significantly less effective, and that some new mixed orderings combining host information and Gray/lexicographic orderings outperform all previous methods: in some large transposed graphs they yield the quite incredible compression rate of 1 bit per link. We experiment with these simple ideas on some nonweb social networks and obtain results that are extremely promising and are very close to those recently achieved using shingle orderings and backlink compression schemes [Chierichetti et al. 09].

Referência(s)