Random geometric graph

Summary

In graph theory, a random geometric graph (RGG) is the mathematically simplest spatial network, namely an undirected graph constructed by randomly placing N nodes in some metric space (according to a specified probability distribution) and connecting two nodes by a link if and only if their distance is in a given range, e.g. smaller than a certain neighborhood radius, r.

Random geometric graphs resemble real human social networks in a number of ways. For instance, they spontaneously demonstrate community structure, that is, clusters of nodes with high modularity. Other random graph models, such as the Erdős–Rényi model or the Barabási–Albert (BA) model, do not create this type of structure. Additionally, random geometric graphs display degree assortativity according to their spatial dimension:[1] "popular" nodes (those with many links) are particularly likely to be linked to other popular nodes.

A real-world application of RGGs is the modeling of ad hoc networks.[2] Furthermore, they are used as benchmark instances for graph algorithms.

Definition

 
The generation of a random geometric graph for different connectivity parameters r.

In the following, let G = (V, E) denote an undirected graph with a set of vertices V and a set of edges E ⊆ V × V. The set sizes are denoted by |V| = n and |E| = m. Additionally, if not noted otherwise, the metric space $[0,1)^d$ with the Euclidean distance is considered, i.e. for any points $x, y \in [0,1)^d$ the Euclidean distance of x and y is defined as

$$d(x, y) = \lVert x - y \rVert_2 = \sqrt{\sum_{i=1}^{d} (x_i - y_i)^2}.$$

A random geometric graph (RGG) is an undirected geometric graph with nodes randomly sampled from the uniform distribution of the underlying space $[0,1)^d$.[3] Two vertices p, q ∈ V are connected if, and only if, their distance is less than a previously specified parameter r ∈ (0,1), excluding any loops. Thus, the parameters r and n fully characterize an RGG.

Algorithms

Naive algorithm

The naive approach is to calculate the distance of every vertex to every other vertex. As there are $\frac{n(n-1)}{2}$ possible connections that are checked, the time complexity of the naive algorithm is $\Theta(n^2)$. The samples are generated by using a random number generator (RNG) on $[0,1)^d$. Practically, one can implement this using d random number generators on $[0,1)$, one RNG for every dimension.

Pseudocode

V := generateSamples(n)  // Generates n samples in the unit cube.
for each p ∈ V do
    for each q ∈ V \ {p} do
        if distance(p, q) < r then  // Strict inequality, as in the definition.
            addConnection(p, q) // Add the edge (p, q) to the edge data structure.
        end if
    end for
end for
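
The pseudocode can be turned into a short runnable sketch. The following Python version is only an illustration: the function name `naive_rgg` and its parameters are assumptions made here, it uses the strict inequality from the definition, and it visits each unordered pair once instead of each ordered pair.

```python
import itertools
import math
import random


def naive_rgg(n, r, d=2, seed=None):
    """Return (points, edges) of a random geometric graph on [0,1)^d."""
    rng = random.Random(seed)
    # One uniform draw per coordinate gives one sample in [0,1)^d per vertex.
    points = [tuple(rng.random() for _ in range(d)) for _ in range(n)]
    edges = []
    # Check every unordered pair of vertices exactly once.
    for i, j in itertools.combinations(range(n), 2):
        if math.dist(points[i], points[j]) < r:
            edges.append((i, j))
    return points, edges
```

Calling `naive_rgg(1000, 0.05, seed=1)` returns the sampled points together with an edge list of index pairs `(i, j)` with `i < j`.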

As this algorithm is not scalable (every vertex needs information of every other vertex), Holtgrewe et al. and Funke et al. have introduced new algorithms for this problem.

Distributed algorithms

Holtgrewe et al.

This algorithm, proposed by Holtgrewe et al., was the first distributed RGG generator for dimension 2.[4] It partitions the unit square into equal-sized cells with side length of at least $r$. For a given number $P = p^2$ of processors, each processor is assigned $\frac{k}{p} \times \frac{k}{p}$ cells, where $k = \lfloor 1/r \rfloor$. For simplicity, $P$ is assumed to be a square number, but this can be generalized to any number of processors. Each processor then generates $\frac{n}{P}$ vertices, which are distributed to their respective owners. The vertices are then sorted by the number of the cell they fall into, for example with Quicksort. Next, each processor sends its adjacent processors the information about the vertices in the border cells, so that each processing unit can calculate the edges in its partition independently of the other units. The expected running time is $O\left(\frac{n}{P} \log \frac{n}{P}\right)$. An upper bound for the communication cost of this algorithm is given by $T_{\text{all-to-all}}(n/P, P) + T_{\text{all-to-all}}(1, P) + T_{\text{point-to-point}}(n/(k \cdot P) + 2)$, where $T_{\text{all-to-all}}(l, c)$ denotes the time for an all-to-all communication with messages of length l bits to c communication partners, and $T_{\text{point-to-point}}(l)$ is the time taken for a point-to-point communication for a message of length l bits.
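
The cell partitioning on its own can be illustrated in a single process. The sketch below is not the distributed algorithm: it leaves out processors, sorting and communication, and only shows that, with cells of side length at least r, edge candidates need to be searched in a cell and its eight neighbours. The function name `grid_rgg` and the dictionary-based cell storage are assumptions made for this example.

```python
import math
from collections import defaultdict
from itertools import product


def grid_rgg(points, r):
    """Edges of the RGG on 2-D points in [0,1)^2, found via a cell grid."""
    k = max(1, math.floor(1 / r))  # cells per dimension, so the side 1/k is >= r
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        # min() guards against a coordinate rounding up to the last boundary.
        cells[(min(k - 1, int(x * k)), min(k - 1, int(y * k)))].append(idx)
    edges = set()
    for (cx, cy), members in cells.items():
        # Only the cell itself and its 8 neighbours can contain partners.
        for dx, dy in product((-1, 0, 1), repeat=2):
            for i in members:
                for j in cells.get((cx + dx, cy + dy), ()):
                    if i < j and math.dist(points[i], points[j]) < r:
                        edges.add((i, j))
    return edges
```

For points spread roughly evenly over the unit square, each vertex is compared only with the vertices in nine cells rather than with all n - 1 others.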

Since this algorithm is not communication-free, Funke et al. proposed[4] a scalable distributed RGG generator for higher dimensions, which works without any communication between the processing units.

Funke et al.

The approach used in this algorithm[4] is similar to the approach of Holtgrewe et al.: partition the unit cube into equal-sized chunks with side length of at least r. For d = 2 these are squares, for d = 3 they are cubes. As at most $\lfloor 1/r \rfloor$ chunks fit per dimension, the number of chunks is capped at $\lfloor 1/r \rfloor^d$. As before, each processor is assigned $\frac{\lfloor 1/r \rfloor^d}{P}$ chunks, for which it generates the vertices. To achieve a communication-free process, each processor then generates the same vertices in the adjacent chunks by exploiting pseudorandomization of seeded hash functions. This way, each processor calculates the same vertices and there is no need for exchanging vertex information.
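
The idea of recomputing a neighbouring chunk instead of requesting it can be illustrated with a seeded generator. The following sketch is an assumption-laden illustration, not the generator of Funke et al.: the function `chunk_vertices`, the use of SHA-256 to derive a per-chunk seed, and the premise that every process already knows how many vertices a chunk holds are all choices made for this example.

```python
import hashlib
import random


def chunk_vertices(chunk, count, k=4, global_seed=42):
    """Deterministically regenerate the vertices of one chunk.

    chunk is a tuple of integer chunk coordinates, k the number of chunks
    per dimension, and count the (assumed known) number of vertices in it.
    """
    # Hash (seed, chunk index) so every process derives the same chunk seed.
    key = f"{global_seed}:{','.join(map(str, chunk))}".encode()
    seed = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    rng = random.Random(seed)
    side = 1.0 / k
    # Place `count` points uniformly inside the chunk's cube.
    return [tuple((c + rng.random()) * side for c in chunk) for _ in range(count)]
```

Two processes that both call `chunk_vertices((1, 2), 5)` obtain identical coordinates, so the owner of chunk (1, 1) can compute the edges that cross into chunk (1, 2) without receiving any message.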

For dimension 3, Funke et al. showed that the expected running time is $O\left(\frac{m+n}{P} + \log P\right)$, without any cost for communication between processing units.

Properties

Isolated vertices and connectivity

The probability that a single vertex is isolated in a two-dimensional RGG is $(1 - \pi r^2)^{n-1}$.[5] Let $X$ be the random variable counting how many vertices are isolated. Then the expected value of $X$ is $E(X) = n(1 - \pi r^2)^{n-1} = n e^{-\pi r^2 n} - O(r^4 n)$. The term $\mu = n e^{-\pi r^2 n}$ provides information about the connectivity of the RGG. For $\mu \to 0$, the RGG is asymptotically almost surely connected. For $\mu \to \infty$, the RGG is asymptotically almost surely disconnected. And for $\mu = \Theta(1)$, the RGG has a giant component that covers more than $\frac{n}{2}$ vertices and $X$ is Poisson distributed with parameter $\mu$. It follows that if $\mu = \Theta(1)$, the probability that the RGG is connected is $P[X = 0] \sim e^{-\mu}$ and the probability that the RGG is not connected is $P[X > 0] \sim 1 - e^{-\mu}$.

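For any $\ell_p$-norm ($1 \le p \le \infty$) and for any number of dimensions $d$, an RGG possesses a sharp threshold of connectivity at $r \sim \left(\frac{\ln n}{\alpha_{p,d}\, n}\right)^{1/d}$ with constant $\alpha_{p,d}$. In the special case of a two-dimensional space and the Euclidean norm ($d = 2$ and $p = 2$) this yields $r \sim \sqrt{\frac{\ln n}{\pi n}}$.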
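
As a numerical illustration of these quantities, the sketch below evaluates them for the two-dimensional Euclidean case. It assumes the expressions $n(1 - \pi r^2)^{n-1}$ for the expected number of isolated vertices, $\mu = n e^{-\pi r^2 n}$, and the threshold radius $\sqrt{\ln n / (\pi n)}$, and it ignores boundary effects; the function names are chosen here for illustration only.

```python
import math


def expected_isolated(n, r):
    """Expected number of isolated vertices, n * (1 - pi r^2)^(n - 1)."""
    return n * (1 - math.pi * r * r) ** (n - 1)


def mu(n, r):
    """Connectivity indicator mu = n * exp(-pi r^2 n)."""
    return n * math.exp(-math.pi * r * r * n)


def threshold_radius(n):
    """Connectivity threshold sqrt(ln n / (pi n)) for d = 2, Euclidean norm."""
    return math.sqrt(math.log(n) / (math.pi * n))


def connected_probability(n, r):
    """Asymptotic estimate exp(-mu), meaningful when mu is of constant order."""
    return math.exp(-mu(n, r))
```

At the threshold radius, $\mu$ equals 1 by construction, so the estimated probability of being connected is $e^{-1}$; enlarging r slightly drives $\mu$ towards 0 and the estimate towards 1.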

Hamiltonicity

It has been shown that, in the two-dimensional case, the threshold $r \sim \sqrt{\frac{\ln n}{\pi n}}$ also provides information about the existence of a Hamiltonian cycle (Hamiltonian path).[6] For any $\epsilon > 0$, if $r \sim \sqrt{\frac{\ln n}{(\pi + \epsilon) n}}$, then the RGG has asymptotically almost surely no Hamiltonian cycle, and if $r \sim \sqrt{\frac{\ln n}{(\pi - \epsilon) n}}$, then the RGG has asymptotically almost surely a Hamiltonian cycle.

Clustering coefficient

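The clustering coefficient of RGGs only depends on the dimension d of the underlying space $[0,1)^d$. The clustering coefficient is[7]

$C_d = 1 - H_d(1)$ for even $d$ and $C_d = \frac{3}{2} - H_d\left(\frac{1}{2}\right)$ for odd $d$, where

$$H_d(x) = \frac{1}{\sqrt{\pi}} \sum_{i=x}^{d/2} \frac{\Gamma(i)}{\Gamma\left(i + \frac{1}{2}\right)} \left(\frac{3}{4}\right)^{i + \frac{1}{2}}.$$

For large $d$, this simplifies to $C_d \sim 3 \sqrt{\frac{2}{\pi d}} \left(\frac{3}{4}\right)^{\frac{d+1}{2}}$.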
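
The formula can be evaluated directly. The sketch below assumes the form $C_d = 1 - H_d(1)$ for even $d$ and $C_d = \frac{3}{2} - H_d(\frac{1}{2})$ for odd $d$, with $H_d(x) = \frac{1}{\sqrt{\pi}} \sum_{i=x}^{d/2} \frac{\Gamma(i)}{\Gamma(i + 1/2)} (3/4)^{i + 1/2}$ and the sum running in steps of 1; the function names are chosen for this example.

```python
import math


def clustering_coefficient(d):
    """Clustering coefficient C_d of an RGG in dimension d (d >= 1)."""
    even = d % 2 == 0
    # The sum index starts at 1 for even d and at 1/2 for odd d, in steps of 1.
    start = 1.0 if even else 0.5
    terms = (d + 1) // 2  # number of indices from `start` up to d/2
    h = sum(
        math.gamma(start + t) / math.gamma(start + t + 0.5) * 0.75 ** (start + t + 0.5)
        for t in range(terms)
    ) / math.sqrt(math.pi)
    return (1.0 if even else 1.5) - h


def clustering_asymptotic(d):
    """Large-d approximation 3 * sqrt(2 / (pi d)) * (3/4)^((d + 1) / 2)."""
    return 3 * math.sqrt(2 / (math.pi * d)) * 0.75 ** ((d + 1) / 2)
```

For d = 1, 2 and 3 this gives 3/4, $1 - \frac{3\sqrt{3}}{4\pi} \approx 0.5865$ and 15/32, so the clustering coefficient decreases as the dimension grows.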

Generalized random geometric graphs

In 1988, Waxman[8] generalised the standard RGG by introducing a probabilistic connection function as opposed to the deterministic one suggested by Gilbert. The example introduced by Waxman was a stretched exponential where two nodes $i$ and $j$ connect with probability given by $H_{ij} = \beta e^{-r_{ij}/r_0}$, where $r_{ij}$ is the Euclidean separation and $\beta$, $r_0$ are parameters determined by the system. This type of RGG with a probabilistic connection function is often referred to as a soft random geometric graph, which has two sources of randomness: the location of nodes (vertices) and the formation of links (edges). This connection function has been generalized further in the literature to $H_{ij} = \beta e^{-(r_{ij}/r_0)^{\eta}}$, which is often used to study wireless networks without interference. The parameter $\eta$ represents how the signal decays with distance: $\eta = 2$ corresponds to free space, $\eta > 2$ models a more cluttered environment like a town ($\eta = 6$ models cities like New York), whilst $\eta < 2$ models highly reflective environments. Note that $\eta = 1$ gives the Waxman model, whilst $\eta \to \infty$ with $\beta = 1$ gives the standard RGG. Intuitively, these types of connection functions model how the probability of a link being made decays with distance.
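
A soft RGG can be sampled in the same way as the standard one, with the distance test replaced by a random draw. The sketch below assumes the connection function $H = \beta e^{-(r_{ij}/r_0)^{\eta}}$ on the unit square; the function names and default parameter values are illustrative choices, not taken from the cited works.

```python
import itertools
import math
import random


def connection_probability(distance, beta=1.0, r0=0.1, eta=2.0):
    """Probability beta * exp(-(distance / r0) ** eta) that two nodes link."""
    return beta * math.exp(-((distance / r0) ** eta))


def soft_rgg(n, beta=1.0, r0=0.1, eta=2.0, seed=None):
    """Sample a soft RGG on [0,1)^2 and return (points, edges)."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    edges = []
    for i, j in itertools.combinations(range(n), 2):
        p = connection_probability(math.dist(points[i], points[j]), beta, r0, eta)
        # Second source of randomness: each pair links with probability p.
        if rng.random() < p:
            edges.append((i, j))
    return points, edges
```

With `eta=1` the function reduces to the exponential Waxman form, and for very large `eta` with `beta=1` it approaches a step at distance `r0`, which recovers the hard distance test of the standard RGG.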

Overview of some results for soft RGGs

In the high-density limit, for a network with an exponential connection function, the number of isolated nodes is Poisson distributed, and the resulting network contains a unique giant component and isolated nodes only.[9] Therefore, by ensuring there are no isolated nodes, in the dense regime the network is a.a.s. fully connected, similar to the results shown in [10] for the disk model. Properties of these networks, such as betweenness centrality[11] and connectivity,[9] are often studied in the limit as the density tends to infinity, which often means that border effects become negligible. However, real networks are finite, and although they can still be extremely dense, border effects will have an impact on full connectivity; in fact, [12] showed that full connectivity, with an exponential connection function, is greatly impacted by boundary effects, as nodes near a corner or face of the domain are less likely to connect than those in the bulk. As a result, full connectivity can be expressed as a sum of the contributions from the bulk and from the boundaries of the geometry. A more general analysis of the connection functions in wireless networks has shown that the probability of full connectivity can be well approximated by a few moments of the connection function and the geometry of the region.[13]

References

  1. ^ Antonioni, Alberto; Tomassini, Marco (28 September 2012). "Degree correlations in random geometric graphs". Physical Review E. 86 (3): 037101. arXiv:1207.2573. Bibcode:2012PhRvE..86c7101A. doi:10.1103/PhysRevE.86.037101. PMID 23031054. S2CID 14750415.
  2. ^ Nekovee, Maziar (28 June 2007). "Worm epidemics in wireless ad hoc networks". New Journal of Physics. 9 (6): 189. arXiv:0707.2293. Bibcode:2007NJPh....9..189N. doi:10.1088/1367-2630/9/6/189. S2CID 203944.
  3. ^ Penrose, Mathew. (2003). Random geometric graphs. Oxford: Oxford University Press. ISBN 0198506260. OCLC 51316118.
  4. ^ a b c von Looz, Moritz; Strash, Darren; Schulz, Christian; Penschuck, Manuel; Sanders, Peter; Meyer, Ulrich; Lamm, Sebastian; Funke, Daniel (2017-10-20). "Communication-free Massively Distributed Graph Generation". arXiv:1710.07565v3 [cs.DC].
  5. ^ Perez, Xavier; Mitsche, Dieter; Diaz, Josep (2007-02-13). "Dynamic Random Geometric Graphs". arXiv:cs/0702074. Bibcode:2007cs........2074D.
  6. ^ Perez, X.; Mitsche, D.; Diaz, J. (2006-07-07). "Sharp threshold for hamiltonicity of random geometric graphs". arXiv:cs/0607023. Bibcode:2006cs........7023D.
  7. ^ Christensen, Michael; Dall, Jesper (2002-03-01). "Random Geometric Graphs". Physical Review E. 66 (1 Pt 2): 016121. arXiv:cond-mat/0203026. Bibcode:2002PhRvE..66a6121D. doi:10.1103/PhysRevE.66.016121. PMID 12241440. S2CID 15193516.
  8. ^ Waxman, B.M (1988). "Routing of multipoint connections". IEEE Journal on Selected Areas in Communications. 6 (9): 1617–1622. doi:10.1109/49.12889.
  9. ^ a b Mao, G; Anderson, B.D (2013). "Connectivity of large wireless networks under a general connection model". IEEE Transactions on Information Theory. 59 (3): 1761–1772. doi:10.1109/tit.2012.2228894. S2CID 3027610.
  10. ^ Penrose, Mathew D (1997). "The longest edge of the random minimal spanning tree". The Annals of Applied Probability: 340–361.
  11. ^ Giles, Alexander P.; Georgiou, Orestis; Dettmann, Carl P. (2015). "Betweenness centrality in dense random geometric networks". 2015 IEEE International Conference on Communications (ICC). pp. 6450–6455. arXiv:1410.8521. Bibcode:2014arXiv1410.8521K. doi:10.1109/ICC.2015.7249352. ISBN 978-1-4673-6432-4. S2CID 928409.
  12. ^ Coon, J; Dettmann, C P; Georgiou, O (2012). "Full connectivity: corners, edges and faces". Journal of Statistical Physics. 147 (4): 758–778. arXiv:1201.3123. Bibcode:2012JSP...147..758C. doi:10.1007/s10955-012-0493-y. S2CID 18794396.
  13. ^ Dettmann, C.P; Georgiou, O (2016). "Random geometric graphs with general connection functions". Physical Review E. 93 (3): 032313. arXiv:1411.3617. Bibcode:2016PhRvE..93c2313D. doi:10.1103/physreve.93.032313. PMID 27078372. S2CID 124506496.