Just to see if I could, I decided to try implementing SHA256 in DAX. I was partly successful but limited lack of recursion, which meant I had to unroll loops manually with a bunch of variables (like@Greg_Deckleremulating recursion). I did that but ultimately ran into the problem that the hash for each chunk uses the output of the prior chunk, so the DAX version only works on strings short than 55 characters. (If you want to take a peek at the gory details, check the attached .pbix.)
After brooding on that for while, I decided to try to write an M function that wouldn't have the single chunk limitation. This is my first version that appears to be working (also included in the attached file).
(text as text) as text => let input = text, len = Text.Length(input), nChunks = Number.RoundUp( ( len * 8 + 65 ) / 512 ), bytes = nChunks * 64, output = Text.ToBinary(input, TextEncoding.Ascii), /*Define Constants*/ t = List.Transform({0..32}, each Number.Power(2, _)), h0 = { 1779033703, 3144134277, 1013904242, 2773480762, 1359893119, 2600822924, 528734635, 1541459225 }, k = { 1116352408, 1899447441, 3049323471, 3921009573, 961987163, 1508970993, 2453635748, 2870763221, 3624381080, 310598401, 607225278, 1426881987, 1925078388, 2162078206, 2614888103, 3248222580, 3835390401, 4022224774, 264347078, 604807628, 770255983, 1249150122, 1555081692, 1996064986, 2554220882, 2821834349, 2952996808, 3210313671, 3336571891, 3584528711, 113926993, 338241895, 666307205, 773529912, 1294757372, 1396182291, 1695183700, 1986661051, 2177026350, 2456956037, 2730485921, 2820302411, 3259730800, 3345764771, 3516065817, 3600352804, 4094571909, 275423344, 430227734, 506948616, 659060556, 883997877, 958139571, 1322822218, 1537002063, 1747873779, 1955562222, 2024104815, 2227730452, 2361852424, 2428436474, 2756734187, 3204031479, 3329325298 }, /*Define Functions*/ XOR = Number.BitwiseXor, AND = Number.BitwiseAnd,MOD32 = (n) => Number.Mod(n, t{32}), RROT32 = (n, r) => Number.BitwiseShiftRight(n, r) + t{32-r} * Number.Mod(n, t{r}), s0 = (w) => XOR(XOR(RROT32(w, 7), RROT32(w, 18)), Number.BitwiseShiftRight(w, 3)), s1 = (w) => XOR(XOR(RROT32(w, 17), RROT32(w, 19)), Number.BitwiseShiftRight(w, 10)),S1 = (w) => XOR(XOR(RROT32(w, 6), RROT32(w, 11)), RROT32(w, 25)),S0 = (w) => XOR(XOR(RROT32(w, 2), RROT32(w, 13)), RROT32(w, 22)), /*Main compression function*/ CHUNK = (hash, w) => let G = List.Generate( () => [ i = -1, a = hash{0}, b = hash{1}, c = hash{2}, d = hash{3}, e = hash{4}, f = hash{5}, g = hash{6}, h = hash{7}, temp1 = 0, temp2 = 0, ch = 0, maj = 0 ], each [i] < 64, each [ ch = XOR(AND([e], [f]), AND(XOR([e], t{32} - 1), [g])), temp1 = MOD32([h] + S1([e]) + ch + k{i} + w{i}), maj = XOR(XOR(AND([a], [b]), AND([a], [c])), AND([b], [c])), temp2 = MOD32(S0([a]) + maj), h = [g], g = [f], f = [e], e = MOD32([d] + temp1), d = [c], c = [b], b = [a], a = MOD32(temp1 + temp2), i = [i] + 1 ]//, //each [s] ), list_abc = Record.ToList(Record.SelectFields(List.Last(G), {"a", "b", "c", "d", "e", "f", "g", "h"})) in list_abc, /*Create Message Schedule (w)*/ list_bytes = {1..bytes}, list_chunks = List.Transform(list_bytes, each Number.IntegerDivide((_ - 1), 64)), list_words = List.Transform(list_bytes, each Number.Mod(Number.IntegerDivide((_ - 1), 4), 16)), list_chars = List.Transform(list_bytes, each if _ < len + 1 then Character.ToNumber(Text.At(input, _ - 1)) else if _ = len + 1 then 128 else if _ > bytes - 8 then Number.Mod(Number.BitwiseShiftRight(len * 8, (bytes - _) * 8), 256) else 0 ), lists_table = Table.FromColumns( {list_bytes, list_chunks, list_words, list_chars }, {"index", "chunk", "word", "u"}), group_words = Table.Group(lists_table, {"chunk", "word"}, {{"w", each List.Accumulate([u], 0, (state, current) => 256 * state + current), Int32.Type}}), /*Loop through chunks*/ update_hash = List.Accumulate( {0..nChunks - 1}, h0, /*Initial hash*/ (state, current) => let w16 = List.Range(group_words[w], 16 * current, 16), w48 = List.Generate( () => [i = 15, w = w16, s = 0], each [i] < 64, each [ s = MOD32([w]{i - 16} + s0([w]{i - 15}) + [w]{i - 7} + s1([w]{i - 2})), w = [w] & { s }, i = [i] + 1 ], each [s] ), w64 = w16 & List.Skip(w48), hash = CHUNK(state, w64), endstate = List.Transform({0..7}, each MOD32(state{_} + hash{_})) in endstate ), /*Convert hash to hex code*/ hash_bytes = List.Transform(update_hash, (hash) => List.Transform({0..3}, (byte) => Number.BitwiseShiftRight(Number.Mod(hash, t{32 - 8 * byte}), 24 - 8 * byte))), hash_hex = List.Transform(hash_bytes, each Binary.ToText(Binary.FromList(_), BinaryEncoding.Hex)), concatenate = Text.Combine(hash_hex)in concatenate
This is clearly not a production-ready function and isn't much faster than the DAX version (on the order of seconds per call).
If I want to improve the performance of this function, how do I even approach this? I've used DAX Studio to examine DAX performance but I don't know if there is anything comparable for M that people actually use. Any advice or references are welcome, either related to the M code itself or related to tools/resources.
I haven't learned who the best members to ask this are, so I hope you'll indulge me in a bit of mention spamming.
@edhans,@ImkeF,@MarcelBeug,@mahoneypat,@BA_Pete,@AlB,@KNP,@lbendlin,@HotChilli,@Jimmy801,@Anonymous,@Vera_33@CNENFRNL,@artemus
Feel free to ignore if you're not interested.
Solved!Go to Solution.