logo
vision scalability social networking revelation

Sybil Attack

We are able to pretend that bank transactions are instant, because they are reversible. And then they get reversed for the wrong reasons, often very bad reasons.

But in general, with irreversible transactions, hard to pretend they go through in an instant. Bitcoin times are pretty good, but not good enough to pay at the cash register.

For this reason, all rhocoin transactions are made with an identity. You can have as many identities as you please, and these identities are not visible to third parties unless one of the parties to the transaction chooses to make them visible.

To support transactions at the cash register, going to need credit rating agencies. If you make the payment as an identity that has an adequate credit rating, the cash register trusts that the transaction will go through.

So, how do we stop someone from generating a credit rating by issuing a lot of sham transactions from his right hand to his left hand, and then buying something and not paying? The credit card problem.

And how do we stop someone from generating a pile of fake sales, and then making a real sale and not delivering? The Ebay problem.

There is a solution to this problem. Each rating agency will treat payments and reviews as a directed graph, and measure graph connectivity. each subgraph of fake payments, and fake reviews will have good connectivity to itself and poor connectivity to the rest of the graph.

1 SybilGuard

The SybilGuard paper provides graph theory solutions against sybils.

According to SybilGuard, social networks are empirically determined to be “fast mixing”

“social networks tend to be fast mixing (defined in the next section), which necessarily means that subsets of honest nodes have good connectivity to the rest of the social network”

So the legitimate musician with legitimate fans will be connected to the rest of the social network – his legitimate fans will have interactions with legitimate fans of other legitimate musicians, and will purchase stuff from other legitimate musicians, while the fake fans making fake purchases will not.

“in a fast mixing graph, after a small number of hops a random walk is equally likely to be traversing any edge in a given hop.”

The paper argues that Sybil subsets of the network are substantially less likely to be entered after a small number of hops.

“With a length w random walk, clearly the distribution of the ending point of the walk depends on the starting point. However, for connected and nonbipartite graphs, the ending point distribution becomes independent of the starting point when w → ∞. This distribution is called the stationary distribution of the graph. The mixing time T of a graph quantifies how fast the ending point of a random walk approach the stationary distribution. In other words, after Θ(T) steps, the node on the random walk becomes roughly independent of the starting point. If T = Θ(log(n)), the graph is called fast mixing.”

Θ(f) means that the space or running time is always proportional to f, O(f) means that the common worst case space or running time is proportional to f. Engineers generally do not care about such differences, and just use O everywhere.

If we mix the graph for a suitable time, legitimate nodes will be well mixed with each other, and Sybil nodes will not be well mixed with legitimate nodes.

Legitimate nodes will have positions that are close to each other in the space of distributions of end probabilities, and Sybil nodes will be far from legitimate nodes in that space.

This algorithm is O(n^2) in data size (the distribution for each starting point), and Θ(n^2{log(n)}) in computational time, which is a lot better than non polynomial, though still apt to be inconveniently large. It can probably be speeded up considerably by random dimensional reduction, wherein a vector of very large dimension (the probability of reaching each end point, the distribution of end points) is randomly mapped to a vector of dimension Θ(log(n)) log of the original dimension.

In this case, the vector space of large dimension is of vectors whose elements are the probability of arriving at a given node after a random walk of distance w, “the distribution of the ending point of the walk”

Which will, I think, speed it up to O(n×{{(log (n))}^2}) which is acceptable.

Mixing for a suitable time, a suitable path distance, divides the graph into well mixed regions, one of which is the legitimate nodes. We identify which of the regions is non Sybil by checking the region of a small number of known legitimate nodes. If they all belong to the same region, our mixing distance is sufficient, and chances are the Sybil nodes are not in that region.

During mixing, the points draw closer to each other in the space of distributions. The fan group of a particular musician will coalesce rapidly, and then the fan groups of all legitimate coalitions will, more slowly draw together, while the fake fan group drifts very slowly indeed. We halt the mixing when the known legitimate nodes are all close to each other, because by then everyone who is a legitimate node is close to them.

We then downrank reviews or page ranks that are distant from the coalescence.

Creative Commons License reaction.la gpg key 154588427F2709CD9D7146B01C99BB982002C39F
This work is licensed under the Creative Commons Attribution 4.0 International License.