Page 1 of 1

xxHash conversion to Livecode

Posted: Tue Jul 19, 2016 9:35 am
by vince
I'm trying to build a Livecode implementation of the xxHash algorithm. Originally it's written in C but there have been built implementations in a lot of other languages and i'm using the C# version as a source because i understand the language better than plain C: https://github.com/noricube/xxHashSharp ... /xxHash.cs

The first problem i'm facing is that in this code the uint variable type (unsigned 32bit integer) is used and they also use a "feature" (which i always saw as a limitation) where if it reaches it's maximum value of 4,294,967,295 it will loop around. The problem i'm facing is that in LC i can't find a variable type which works this way because of the loose typing which is a good thing normally but not in this case.

For example, take the following which seems like a simple sum:

_state.v1 = seed + PRIME32_1 + PRIME32_2; // (seed = uint = 0, PRIME32_1 = uint = 2654435761, PRIME32_2 = uint = 2246822519)

In LC _state.v1 will have the value 4901258280 but in C# it will overflow at 4294967295 so it will have the value 606290985

Now my actual question: what is the best way to implement this in LC and mimick the way a uint works (also for subtraction)?

Re: xxHash conversion to Livecode

Posted: Tue Jul 19, 2016 7:13 pm
by richmond62
I understand nothing about this; but I did look up "uint"
in Livecode's built-in Dictionary.
uint.png

Re: xxHash conversion to Livecode

Posted: Tue Jul 19, 2016 7:16 pm
by FourthWorld
Vince, might you have better luck implementing the 32-bit version of the xxHash algo?

Re: xxHash conversion to Livecode

Posted: Tue Jul 19, 2016 7:41 pm
by vince
@Richmond62: for a moment I thought that I had missed a crucial LC command but unfortunately these commands don't do what I need...

@FourthWorld: to my knowledge this is the 32bit version.

Re: xxHash conversion to Livecode

Posted: Tue Jul 19, 2016 8:43 pm
by shaosean

Code: Select all

if (_state.v1 > 4294967295) then
   _state.v1 = _state.v1- 4294967295
end if

Re: xxHash conversion to Livecode

Posted: Tue Jul 19, 2016 9:46 pm
by [-hh]
Hi 'shaosean'.
I came to LC after you were "gone" for a while. But I read a lot of your brilliant posts. Thus I like it very much that you are here again.

p.s. Why do you dislike "mod"?

Re: xxHash conversion to Livecode

Posted: Tue Jul 19, 2016 9:49 pm
by vince
shaosean wrote:

Code: Select all

if (_state.v1 > 4294967295) then
   _state.v1 = _state.v1- 4294967295
end if
Yes, i guess something like that is what i have to do to get this working. I was hoping there would be a more elegant solution because this way it can get less efficient and a lot of extra code very quickly. I will have to cater for every type of interaction: add, substract, multiply, etc...

PS: for the add i would resort to a "mod" call so it will work fine even if it would overflow several times.

Thanks all!

Re: xxHash conversion to Livecode

Posted: Tue Jul 19, 2016 9:54 pm
by [-hh]
"a mod b" works also for *negative* numbers a. Only the divisor b has to be positive (as you need it).

== "if (_state.v1 > 4294967295)" is not slower than "(_state.v1 mod 4294967295)"

Two ideas:
== You could also think about an implementation of python's "bignum" package?
== You could think about using log2?

[As this is a very fast algorithm: Do you plan to share your LC-implementation?]

Re: xxHash conversion to Livecode

Posted: Tue Jul 19, 2016 10:08 pm
by vince
[-hh] wrote:"a mod b" works also for *negative* numbers a. Only the divisor b has to be positive (as you need it).

You could also think about an implementation of python's "bignum" package.
[As this is a very fast algorithm: Do you plan to share your LC-implementation?]
Thanks.

If i manage to translate xxHash to LC i might share it. Something tells me that the best way to do it would be to use LC builder/studio but this is very new to me and seems more complicated to do. I selected xxHash because of it's speed/quality and because i need to hash potentially very big files while i am copying them (50GB and larger). So for this to work i will have to continuously "stream" my data through the hashing function and not perform the hash like md5Digest does.

Re: xxHash conversion to Livecode

Posted: Tue Jul 19, 2016 10:13 pm
by [-hh]
vince wrote:need to hash potentially very big files while i am copying them (50GB and larger)
This is by far too much for an interpreter, you will probably need a native procedure, that is, if using LiveCode, an external or a FFI (there is even a very fast LuaJIT implementation https://github.com/szensk/luaxxhash ).

Re: xxHash conversion to Livecode

Posted: Tue Jul 19, 2016 10:19 pm
by FourthWorld
vince wrote:@FourthWorld: to my knowledge this is the 32bit version.
Yes, I saw that in your OP, but the sample code uses a ulong and the LC version needs to handle numbers larger than 4 bytes can accommodate. Looks like shaosean's got you covered (good to see you here again, shaosean).

Re: xxHash conversion to Livecode

Posted: Tue Jul 19, 2016 10:22 pm
by vince
[-hh] wrote:This is by far too much for an interpreter, you will probably need a native procedure, that is, if using LiveCode, an external or a FFI.
I will have to find out if it is too much. To me it seems you call Init once at the start, after that you stream your data through the Update function which will contiuously update the hash counters with a not too complex calculation and at the end calculate/call the Digest function to obtain the hash. By doing it this way i think it should be possible even from pure LC.

But i'm by no means an expert, just someone with a challenge to keep me busy :wink:

Re: xxHash conversion to Livecode

Posted: Tue Jul 19, 2016 10:26 pm
by vince
[-hh] wrote:...there is even a very fast LuaJIT implementation https://github.com/szensk/luaxxhash ).
I have seen this, is this something i could easily interface from within LC? (not knowing Lua / LuaJIT at all and how it relates to LC)

(But a downside of that LuaJIT implementation is that it doesn't implement the update/digest way of calculating the hash which is what i need)

Re: xxHash conversion to Livecode

Posted: Tue Jul 19, 2016 10:36 pm
by [-hh]
There are other xxHash-LuaJIT-implementations cited there (one very fast, that uses C-bindings).

LuaJIT is, being mostly written in Assembler, for a lot of 'basic' procedures even faster than C. You can use it via LC's "shell" handler.

An example for using LC as a GUI to LuaJIT is here (imageJIT -- beta version)
http://forums.livecode.com/viewtopic.ph ... 32#p143932

Re: xxHash conversion to Livecode

Posted: Wed Jul 20, 2016 2:05 pm
by vince
Thanks -hh, you have given me good ideas!

I have now built a small test with LuaJIT and xxHash and it performs hash calculations at the speed of the copy while using < 1% CPU! For example i hashed a 25GB file in under a minute from my internal SSD.

Based on these results i now think it is better to use LuaJIT in combination with LC and interface it with "open process". This also has the advantage that it creates a seperate process which frees the main LC program of the i/o intensive data transfer handling.