Anyone familiar with importing a binary field? - Qlik Community

evan_kurowski · ‎2020-08-28

Hello Qlik community,

I have a fixed width flat file in which the first 4 characters are a binary encoded integer.

These 4 characters appear to be in the Unicode realm of possibilities (so not limited to HEX or extended ASCII range)

Here is an example value, where I can derive the Unicode ordinal of each of the first 4 characters separately, and then form binary strings based on conversion of the ordinal value to binary, but from this point I'm unsure how to reassemble the 4 binary sequences into the final binary chain that then get reinterpreted back to an integer (do the bit strings get appended left to right? Is there a sign bit? do they pad zeros to always be bytes? etc..)

(It has been suggested this is compatible to the SAS informat ib4.)

	Ord Char 1	Ord Char 2	Ord Char 3	Ord Char 4	Binary Char 1	Binary Char2	Binary Char 3	Binary Char 4
†©ät	8224	169	228	116	10000000100000	10101001	11100100	1110100

Also, a second similar but unrelated matter. When I attempt to interpret a binary string, I can only feed the Num() format 11 bits.

So for example if I convert a number to binary, QlikView can display the number in bits out to 53 bit places.
=Num(Pow(2,53), '(BIN)')
10000000000000000000000000000000000000000000000000

But if I attempt the inverse and try to feed the Num() function those 53 bits to arrive back at the original number, the Num() function cannot process more than 14 bits (not even enough to derive a 16 bit number).

e.g.
=Num(Num#('11111111111111','(BIN)'),'#.') // at 14 bits will interpret 16383
=Num(Num#('111111111111111','(BIN)'),'#.') // anything beyond 14 bits will not return result

Appreciate any insight on this matter, thanks for your help Qlik Community.

Brett_Bleess · ‎2020-09-09

I believe the following Help link should prove useful:

https://help.qlik.com/en-US/qlikview/April2020/Subsystems/Client/Content/QV_QlikView/conventions-num...

I am not sure that is what you were looking for though after I reread things, but I ran across the following statement from one of the engineers, which I hope is what you needed:

QlikView (and Qlik Sense) uses IEEE 754 64-bit double-precision binary floating-point format to store all numbers. This means that 64 bits are used to store a number: One bit for the sign, 11 bits for the exponent and 52 bits for the number itself. A 52 bit number corresponds to approximately 15 significant decimal digits. This means that QlikView can store integers up to 14 digits while preserving the exactness of the number. However, integers with more than 14 digits will be rounded to 14 significant digits.

I am pretty sure this explains things and it would seem things are working as expected in this case given the 14 significant digits...

The only thing you could do is put in a request in the IDEAS area:

https://community.qlik.com/t5/Ideas/idb-p/qlik-ideas

Regards,
Brett

To help users find verified answers, please do not forget to use the "Accept as Solution" button on any post(s) that helped you resolve your problem or question.
I now work a compressed schedule, Tuesday, Wednesday and Thursday, so those will be the days I will reply to any follow-up posts.

marcus_sommer · ‎2020-09-10

I'm really not sure that I could comprehend how your data looks like and what exactly the problem is to load and/or to interpret them. Nevertheless I thought about the matter and think you may be able to bypass the num#() limitation by interpreting the values on yourself with an additionally logic, like:

t1:
load *, rowno() as RowNo, B2 * E as B3;
load *, iterno() as IterNo, pow(2, iterno()-1) as E, mid(B1, len(B1) - iterno() + 1, 1) as B2
while iterno() <= len(B1);
load recno() as RecNo, * inline [
B1
1111111100110011
1100001100000000001
];

t2: load B1, sum(B3) as B4 resident t1 group by B1;

I think this logic could be adjusted, further extended or even reversed to your requirements. Of course this adds some overhead to a pure loading/converting/formatting but if the dataset isn't really large it should be fast enough (and if not you may apply some incremental approaches).

- Marcus

marcus_sommer · ‎2020-09-13

It's really an interesting matter. Therefore I thought if it couldn't be solved directly within a single load. And yes there are possibilities to fetch the relevant information from the field - each positions of 1 - and to calculate with it the appropriate partial sums, like:

pow(2, index(Field, 1, -1) -1)

But this approach needs to be repeated multiple times within a rangesum() for each single value and also extended to probably one or two additionally checks to the len() and/or to handle errors or NULL, maybe with alt(). So it would be possible but not very beautiful.

And this led to the idea to shorten it anyway. And of course there are ways. For the most things which needs to be matched or replaced I use mappings. Why not here. A classical mapping is very simple and even with larger tables quite performant. But here more values needed would be as available fieldvalues are in Qlik. Therefore the mapping itself needs to be shortened. The essential approach to simplify the matter here is fact that the mapping could be divided into smaller parts and to adjust the output the a bit-shifting. Sounds more complicated as it is. And so could it be look like:

t0:
load *, V<<8 as X, V<<16 as Y, V<<24 as Z;
load *, repeat('0', 8 - len(text(num(V, '(bin)')))) & text(num(V, '(bin)')) as B;
load recno() -1 as V autogenerate 256;

m: mapping load B, V & chr(1) & X & chr(1) & Y & chr(1) & Z resident t0;

set v = "subfield(applymap('m', mid(B1, $1, 8), 0), chr(1), $2)";

t1:
load *, rangesum($(v(1, -1)), $(v(9, -2)), $(v(17, -3)), $(v(25, -4))) as Result;
load repeat('0', 32 - len(B1)) & B1 as B1 inline [
B1
1111111100110011
1100001100000000001
];

This means creating a generic table with 256 records whereby recno() - 1 is the formatted lookup-string and also the direct value respectively the basis for a multiple 8 bit-shift. Within the mapping the returns are concatenated and within the final load subfield() picks the right part from the applymap(). An additionally variable simplifies this access. Of course it could be also done with multiple mapping-tables and you may need here and there further adjustments.

- Marcus