STING -- Surveying The INternet's Growth
Measuring and predicting growth in Internet
addressing, routing complexity and energy
usage
Secure Fast Set Intersection (SeFaSI) Tool
Overview
SeFaSI is a tool that allows multiple parties to securely compute
the set intersection cardinality of data sets while keeping the data
items private. SeFaSI is secure (with appropriately selected encryption parameters),
works with two or more parties, is fast and scalable due to data sampling,
and also supports a technique to prevent probe attacks.
We developed SeFaSI to use capture-recapture techniques to estimate the
used IPv4 space from multiple datasets of observed IPv4 addresses while ensuring
the anonymity of the IPv4 addresses in each dataset.
However, the underlying technique and a large part of the code is generic,
and SeFaSI could be used for other applications.
SeFaSI is based on commutative encryption. Each participating party first
encrypts and permutes their own data set. Then all other parties encrypt and permute
each other party's encrypted data set. Since the encryption is commutative, the set
intersection cardinality of the n-party encrypted and permuted sets is the
same as the intersection cardinality of the unencrypted sets. Since the
encryption is secure and the data sets are permuted by each party,
no party can learn anything about another party's data set apart from the
intersection cardinality.
The commutative encryption method
is more efficient than other secure set intersection cardinality computation methods,
but it still has significant computational and storage/transmission costs. SeFaSI
supports hash-based sampling of the input data to
drastically speed up the encryption and drastically reduce the storage/
transmission costs for the encrypted data sets. Large performance gains can be
achieved with relatively small sampling errors.
SeFaSI can detect probe attacks if the input data is from some known finite
set. Probe attacks are attacks where an attacker constructs a data set with mostly
invalid/impossible items and only a few valid items. Successful probe attacks
allow the attacker to determine if a few specific items are present in other
parties' data sets.
SeFaSI runs under Linux and FreeBSD. See INSTALL and README for further information.
The approach and the SeFaSI implementation are described in more detail in the
tech report CAIA-TR-130930A.pdf.
The tech report CAIA-TR-130930B.pdf
describes how SeFaSI can be used by multiple collaborators to estimate the size of the used IPv4 space
while ensuring the anonymity of the IPv4 addresses each collaborator has observed.
Download
Latest Version
Version 0.5 (released 14th of May 2014):
Release 0.5: renaming of project in doc, small changes in how sample salts are generated
Old versions
Version 0.4 (released 13th of January 2014):
Release 0.4: fix against an attack when input data contains duplicates (i.e. not a proper set)
Version 0.3 (released 2nd of December 2013):
Release 0.3: test script reorganisation, added test script for n local parties, bug fixes
Version 0.2 (released 18th of November 2013):
Release 0.2: bug fixes
Version 0.1 (released 1st of October 2013):
Initial release
Authors
SeFaSI has been developed by the following people:
Sebastian Zander
(szander@swin.edu.au)
License and Terms of Use
The SeFaSI source code is publicly available under the terms
and conditions of the GNU General Public License version 2 (GPLv2),
which can be viewed
here.
Acknowledgements
This tool has been made possible in part by an Australian Research Council (ARC)
Linkage Project grant (project LP110100240) with APNIC Pty Ltd as partner organisation, for a
project titled "Tools and models for measuring and predicting growth in internet addressing and routing
complexity".
|