STING -- Surveying The INternet's Growth

Measuring and predicting growth in Internet addressing, routing complexity and energy usage

Secure Fast Set Intersection (SeFaSI) Tool

Overview

SeFaSI is a tool that allows multiple parties to securely compute the set intersection cardinality of data sets while keeping the data items private. SeFaSI is secure (with appropriately selected encryption parameters), works with two or more parties, is fast and scalable due to data sampling, and also supports a technique to prevent probe attacks. We developed SeFaSI to use capture-recapture techniques to estimate the used IPv4 space from multiple datasets of observed IPv4 addresses while ensuring the anonymity of the IPv4 addresses in each dataset. However, the underlying technique and a large part of the code is generic, and SeFaSI could be used for other applications.

SeFaSI is based on commutative encryption. Each participating party first encrypts and permutes their own data set. Then all other parties encrypt and permute each other party's encrypted data set. Since the encryption is commutative, the set intersection cardinality of the n-party encrypted and permuted sets is the same as the intersection cardinality of the unencrypted sets. Since the encryption is secure and the data sets are permuted by each party, no party can learn anything about another party's data set apart from the intersection cardinality.

The commutative encryption method is more efficient than other secure set intersection cardinality computation methods, but it still has significant computational and storage/transmission costs. SeFaSI supports hash-based sampling of the input data to drastically speed up the encryption and drastically reduce the storage/ transmission costs for the encrypted data sets. Large performance gains can be achieved with relatively small sampling errors.

SeFaSI can detect probe attacks if the input data is from some known finite set. Probe attacks are attacks where an attacker constructs a data set with mostly invalid/impossible items and only a few valid items. Successful probe attacks allow the attacker to determine if a few specific items are present in other parties' data sets.

SeFaSI runs under Linux and FreeBSD. See INSTALL and README for further information. The approach and the SeFaSI implementation are described in more detail in the tech report CAIA-TR-130930A.pdf. The tech report CAIA-TR-130930B.pdf describes how SeFaSI can be used by multiple collaborators to estimate the size of the used IPv4 space while ensuring the anonymity of the IPv4 addresses each collaborator has observed.

Download

Latest Version

Version 0.5 (released 14th of May 2014):

Release 0.5: renaming of project in doc, small changes in how sample salts are generated

Source tarball sefasi-0.5.tar.gz
INSTALL.txt (also in tarball)
README.txt (also in tarball)

Old versions

Version 0.4 (released 13th of January 2014):

Release 0.4: fix against an attack when input data contains duplicates (i.e. not a proper set)

Source tarball sefasi-0.4.tar.gz
INSTALL.txt (also in tarball)
README.txt (also in tarball)

Version 0.3 (released 2nd of December 2013):

Release 0.3: test script reorganisation, added test script for n local parties, bug fixes

Source tarball sefasi-0.3.tar.gz
INSTALL.txt (also in tarball)
README.txt (also in tarball)

Version 0.2 (released 18th of November 2013):

Release 0.2: bug fixes

Source tarball sefasi-0.2.tar.gz
INSTALL.txt (also in tarball)
README.txt (also in tarball)

Version 0.1 (released 1st of October 2013):

Initial release

Source tarball sefasi-0.1.tar.gz
INSTALL.txt (also in tarball)
README.txt (also in tarball)

Authors

SeFaSI has been developed by the following people:

Sebastian Zander (szander@swin.edu.au)

License and Terms of Use

The SeFaSI source code is publicly available under the terms and conditions of the GNU General Public License version 2 (GPLv2), which can be viewed here.

Acknowledgements

This tool has been made possible in part by an Australian Research Council (ARC) Linkage Project grant (project LP110100240) with APNIC Pty Ltd as partner organisation, for a project titled "Tools and models for measuring and predicting growth in internet addressing and routing complexity".

Centre closure