Ready for Transfer

Method for Measuring Similarity Between Data Sets

Laboratory: National Security Agency (NSA)

Technology: Measuring similarity between data sets

Opportunity: This NSA technology has completed all of its development stages and is available for licensing.

Details: This technology enables users to measure similarity between data sets without knowing how the sets intersect. Other approaches, such as those based on the Jaccard index, require knowledge of the sets’ intersection. The new process is more efficient and accurate because it runs searches despite the insertion of duplicate information, which also allows greater flexibility in search parameters.
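
For readers unfamiliar with the Jaccard index mentioned above, the following Python sketch shows how it is computed and why it depends on the sets' intersection. It also includes MinHash, a well-known public technique that estimates the same similarity from compact signatures without intersecting the sets directly and without being affected by duplicate entries. MinHash is used here for illustration only; this document does not describe the internals of the NSA method, and all function names, parameters, and sample records below are hypothetical.

```python
import hashlib


def jaccard(a, b):
    """Exact Jaccard index: |A intersect B| / |A union B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)


def minhash_signature(items, num_hashes=128):
    """Build a MinHash signature: one minimum hash value per seeded hash function.

    Illustrative stand-in only, not the licensed NSA technique.
    """
    unique = set(items)  # duplicates cannot change a minimum, so drop them
    if not unique:
        raise ValueError("cannot build a signature for an empty collection")
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "big")  # a different salt gives a different hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(str(x).encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for x in unique
        ))
    return signature


def estimate_similarity(sig_a, sig_b):
    """Estimate the Jaccard index as the fraction of matching signature slots."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


# Hypothetical sample records; "erin" is deliberately duplicated.
records_a = ["alice", "bob", "carol", "dave"]
records_b = ["bob", "carol", "dave", "erin", "erin"]

exact = jaccard(records_a, records_b)  # 3 shared / 5 distinct = 0.6
approx = estimate_similarity(
    minhash_signature(records_a), minhash_signature(records_b)
)
print(f"exact Jaccard: {exact:.2f}, MinHash estimate: {approx:.2f}")
```

The exact calculation needs both full sets in hand to form their intersection, while the signature comparison needs only two fixed-length lists of numbers. The estimate is approximate, and its accuracy improves as num_hashes grows.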

This capability is critical for managing and sorting immense quantities of data, and can enhance big data analytics in multiple fields, including finance, genetics, and law enforcement.

Benefits: Thanks to its novel approach to measuring similarity between data sets, this technology offers more flexible search parameters and more efficient processing. It can also compare sets without knowing their points of intersection and can run searches despite the insertion of duplicate information.

Potential Applications:

  • Knowledge management
  • Genetic analysis
  • Social network analysis
  • Financial forensics

Contact: For more information about this technology, contact the NSA’s Office of Research and Technology Applications at tech_transfer@nsa.gov.