I've recently inherited a Keras-based network from a colleague, and I want to quantize it down to 8-bit fixed point.
Unfortunately, I'm not overly familiar with Keras itself.
I've been looking around, and there don't seem to be any easy ways to do this without converting to something like tf.lite, and even that seems to have problems (please correct me if I'm missing a good solution here).
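For reference, the tf.lite route I mean is the standard post-training integer quantization flow, roughly like this (the `model` and the calibration batches are placeholders for the inherited network and some representative inputs):

```python
import tensorflow as tf

def representative_data():
    # `calibration_batches` is a placeholder for a few real input batches
    for batch in calibration_batches:
        yield [batch.astype("float32")]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()
```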
So I'm wondering if I can do it manually. I understand the formula, and I don't think I'd have any major trouble implementing it, but I'm not sure how Keras handles weights under the hood. If I were to manually map a weight from 32-bit to 8-bit, would Keras be fine with that, or would it do something annoying like pad the value back out to some internally expected width?
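To make it concrete, here's a rough sketch of what I mean by the manual route, assuming a simple symmetric per-tensor scheme and using `get_weights`/`set_weights` to write the de-quantized values back as float32:

```python
import numpy as np

for layer in model.layers:
    weights = layer.get_weights()
    if not weights:
        continue  # skip layers with no trainable weights
    new_weights = []
    for w in weights:
        max_abs = np.max(np.abs(w))
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        q = np.clip(np.round(w / scale), -128, 127)          # int8 levels
        new_weights.append((q * scale).astype(np.float32))   # back to float32
    layer.set_weights(new_weights)
```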
Any help or pointers in this area would be greatly appreciated.