Nilsimsa locality-sensitive hashing algorithm in Elixir

philipbrown/elixir-nilsimsa

Nilsimsa

Documentation

Add nilsimsa to the list of dependencies in your mix.exs:

def deps do
  [
    {:nilsimsa, "~> 1.0.0"}
  ]
end

Nilsimsa is an implementation of a locality-sensitive hashing algorithm, in which similar input values produce similar hashes. The more similar the input strings are, the smaller the bitwise difference between the generated hashes.

Nilsimsa hashes are useful for detecting texts of the same origin.
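
To make "bitwise difference" concrete, here is a minimal sketch that counts the bits that differ between two hex digests of equal length. The module name DigestDiff and the function bit_difference/2 are hypothetical and not part of this library, and the result is not necessarily the value that compare/2 returns. It relies only on the Elixir standard library.

```elixir
defmodule DigestDiff do
  # Hypothetical helper for illustration; not part of the nilsimsa package.
  import Bitwise

  # Count the bits that differ between two hex digests of equal length.
  def bit_difference(hex_a, hex_b) when byte_size(hex_a) == byte_size(hex_b) do
    bytes_a = hex_a |> Base.decode16!(case: :mixed) |> :binary.bin_to_list()
    bytes_b = hex_b |> Base.decode16!(case: :mixed) |> :binary.bin_to_list()

    bytes_a
    |> Enum.zip(bytes_b)
    # XOR leaves a 1 in every position where the two bytes disagree.
    |> Enum.map(fn {a, b} -> popcount(bxor(a, b)) end)
    |> Enum.sum()
  end

  # Count the set bits in a non-negative integer.
  defp popcount(0), do: 0
  defp popcount(n), do: (n &&& 1) + popcount(n >>> 1)
end
```

For example, DigestDiff.bit_difference("0f", "00") returns 4, and two identical digests return 0. A smaller count means the two digests, and so presumably the two inputs, are closer.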

Processing a string

To process a string, pass the value to the process/1 function:

Nilsimsa.process("abcdefgh")

You can also process a stream by starting from an empty struct created with new/0 and folding each chunk into it with process/2:

"war_and_peace.txt"
|> File.stream!()
|> Enum.reduce(Nilsimsa.new(), &Nilsimsa.process/2)

Generating a digest

To generate a digest of the Nilsimsa hash, pass the struct returned by process/1 to the to_string/1 function:

to_string(Nilsimsa.process("abcdefgh"))
# => 14c8118000000000030800000004042004189020001308014088003280000078

Comparing values

To compare two values, pass both processed structs to the compare/2 function:

Nilsimsa.compare(Nilsimsa.process("hello world"), Nilsimsa.process("all of your base"))
# => 3
