Show HN: TurboQuant-WASM – Google's vector quantization in the browser (github.com/teamchong)
125 points by teamchong 9 hours ago | 4 comments
- netdur 1 hour ago
I tried TQ for vector search and my findings are not good: it is not worth it if you cannot use a GPU. However, I got the same search quality as f32 using 8-bit quant.
I wrote an ANN extension for SQLite using TQ. I do save a lot of space, but f32 is still faster despite everything I have tried.
- glohbalrob 7 hours ago
Very cool. I added Google's new multi embedding 2 model to my site the other week.
I guess I need to dig into this and see if it’s faster and has more use cases! Thanks for publishing your work.
- hhthrowaway1230 7 hours ago
Awesome! Also love the Gaussian splat demo, cool use case!
- refulgentis 2 hours ago
Sloppiest slop I've seen in a couple of weeks:
  - Fork of a fork of a quantization technique
  - Only contribution is...compiling JS to WASM by default?
  - Suspicious burst of ~nothing comments from new accounts
  - 6 comments 7 hours in: 4 flagged/dead, the other 2 also spammy, confused and making category errors at best, more spam at worst.
  - Demo shows it's worse: 800 ms instead of 2.6 ms for text embedding search
  - "But it saves space": yes! 1.2 MB in RAM instead of 7.2 MB, at the cost of turning search into 1 s on a MacBook Pro M4 Max instead of sub-frame duration.
  - It's not even wrong to do this with the output embeddings; there are far more obvious ways to save space that don’t affect retrieval time this much.
- himmelsee2018 4 hours ago [flagged]
- newbrowseruser 7 hours ago [dead]
- bingbong06 7 hours ago [flagged]
- aritzdf 5 hours ago [flagged]