Aussie AI

Hashing

  • Last Updated 14 September, 2024
  • by David Spuler, Ph.D.

Hashing is a well-known optimization in general programming. The idea is to "hash" the characters of a string together to create a number, which is then mapped to an index in a lookup table, called a "hash table". When implemented properly, hashing can reduce a search lookup to an O(1) operation on average.
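As a minimal sketch of the idea in C++ (illustrative only, using the standard std::unordered_map container as the hash table):

    // Minimal sketch: hashing string keys into a lookup table for O(1) average-case lookups.
    // std::unordered_map hashes the key internally and maps it to a bucket index.
    #include <iostream>
    #include <string>
    #include <unordered_map>

    int main() {
        std::unordered_map<std::string, int> table;   // the "hash table"
        table["transformer"] = 1;
        table["attention"] = 2;

        // Average-case O(1) lookup: the string is hashed to find its bucket.
        auto it = table.find("attention");
        if (it != table.end()) {
            std::cout << it->second << "\n";          // prints 2
        }
        return 0;
    }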

A related, more general technique that combines hashing with bit vectors is the Bloom filter. There are also various papers on using Bloom filters to speed up model inference.
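As an illustration only, a toy Bloom filter in C++ might look like the sketch below; the bit vector size and the two salted hash probes are arbitrary choices for the example, not a design from any of the cited papers.

    // Toy Bloom filter sketch: a fixed bit vector plus two hash probes per key.
    // Membership tests can give false positives, but never false negatives.
    #include <bitset>
    #include <functional>
    #include <iostream>
    #include <string>

    class BloomFilter {
        static const size_t NBITS = 1024;
        std::bitset<NBITS> bits_;
        // Two simple probes derived from std::hash with different "salts" (illustrative only).
        size_t hash1(const std::string& s) const { return std::hash<std::string>{}(s) % NBITS; }
        size_t hash2(const std::string& s) const { return std::hash<std::string>{}(s + "#2") % NBITS; }
    public:
        void add(const std::string& s) { bits_.set(hash1(s)); bits_.set(hash2(s)); }
        bool maybe_contains(const std::string& s) const {
            return bits_.test(hash1(s)) && bits_.test(hash2(s));
        }
    };

    int main() {
        BloomFilter bf;
        bf.add("gelu");
        std::cout << bf.maybe_contains("gelu") << "\n";   // 1 (definitely inserted)
        std::cout << bf.maybe_contains("relu") << "\n";   // usually 0 (small false-positive chance)
        return 0;
    }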

But how do you hash a vector? Or a matrix? This is a complicated theoretical area. Is there a way to convert the dot product of two vectors into a hash lookup, thereby avoiding all those multiplications? And what about speeding up matrix multiplication by hashing?

Remember that you can pre-compute anything about the weights before inference, because they don't change during inference. Hence, one of the vectors could potentially be pre-hashed offline. Maybe you could even use some type of "perfect hashing" for those vector hashes, if you've got a big enough compute budget for training. But you can't pre-hash both of the vectors, or pre-compute the dot product, because the other vector is calculated dynamically along the way, depending on the user's inputs.
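As a hypothetical sketch of this idea in C++, the dot product against a fixed weight vector can be cached in a hash table keyed by a hash of the dynamic activation vector; the quantize-then-hash function below is purely an assumption for illustration, and a real scheme would need care about collisions and accuracy.

    // Hypothetical sketch: cache dot products against a fixed (pre-known) weight vector,
    // keyed by a hash of the incoming activation vector.
    #include <cstdint>
    #include <iostream>
    #include <unordered_map>
    #include <vector>

    static uint64_t hash_vector(const std::vector<float>& v) {
        uint64_t h = 14695981039346656037ULL;      // FNV-1a offset basis
        for (float x : v) {
            int32_t q = (int32_t)(x * 64.0f);      // coarse quantization so near-equal vectors collide
            h ^= (uint64_t)(uint32_t)q;
            h *= 1099511628211ULL;                 // FNV-1a prime
        }
        return h;
    }

    static float cached_dot(const std::vector<float>& weights,           // fixed during inference
                            const std::vector<float>& act,               // dynamic activation vector
                            std::unordered_map<uint64_t, float>& cache) {
        uint64_t key = hash_vector(act);
        auto it = cache.find(key);
        if (it != cache.end()) return it->second;  // hash hit: skip all the multiplications
        float dot = 0.0f;
        for (size_t i = 0; i < weights.size(); ++i) dot += weights[i] * act[i];
        cache[key] = dot;                          // remember the result for next time
        return dot;
    }

    int main() {
        std::vector<float> weights = {0.5f, -1.0f, 2.0f};
        std::vector<float> act = {1.0f, 2.0f, 3.0f};
        std::unordered_map<uint64_t, float> cache;
        std::cout << cached_dot(weights, act, cache) << "\n";  // computed: 4.5
        std::cout << cached_dot(weights, act, cache) << "\n";  // served from the cache
        return 0;
    }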

Research on Hashing for AI Models

Various research has used hash tables to quickly identify vectors or sub-parts of AI models. Relevant research papers are listed in the sections below.

Hashing Theory for Vectors

There are various low-level papers on using hashing for computations involving vectors and higher-dimensional tensors. One of the main techniques is Locality-Sensitive Hashing (LSH), which hashes vectors so that vectors that are "close" in n-dimensional space tend to collide in the same bucket. For example, LSH can be used to hash vectors for caching vector dot products.
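As a simple sketch of one common LSH scheme (random-hyperplane hashing for angular distance), each hash bit is the sign of a dot product with a random direction, so vectors pointing in similar directions tend to get the same signature. The class below is illustrative only, not the construction from any single cited paper.

    // Minimal random-hyperplane LSH sketch for angular similarity:
    // each hash bit is the sign of the dot product with one random Gaussian direction.
    #include <cstdint>
    #include <iostream>
    #include <random>
    #include <vector>

    class RandomHyperplaneLSH {
        std::vector<std::vector<float>> planes_;   // one random direction per hash bit
    public:
        RandomHyperplaneLSH(int nbits, int dim, unsigned seed = 42)
            : planes_(nbits, std::vector<float>(dim)) {
            std::mt19937 gen(seed);
            std::normal_distribution<float> gauss(0.0f, 1.0f);
            for (auto& p : planes_)
                for (auto& x : p) x = gauss(gen);
        }
        uint32_t hash(const std::vector<float>& v) const {
            uint32_t sig = 0;
            for (size_t b = 0; b < planes_.size(); ++b) {
                float dot = 0.0f;
                for (size_t i = 0; i < v.size(); ++i) dot += planes_[b][i] * v[i];
                if (dot >= 0.0f) sig |= (1u << b); // one bit per hyperplane
            }
            return sig;
        }
    };

    int main() {
        RandomHyperplaneLSH lsh(16, 4);                      // 16-bit signatures for 4-dim vectors
        std::vector<float> a = {1.0f, 0.0f, 0.0f, 0.0f};
        std::vector<float> b = {0.9f, 0.1f, 0.0f, 0.0f};     // nearly the same direction as a
        std::cout << (lsh.hash(a) == lsh.hash(b)) << "\n";   // likely 1: similar vectors tend to collide
        return 0;
    }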

Research papers on vector-level hashing and neural networks:

  • Gionis, Aristides, Indyk, Piotr, and Motwani, Rajeev. Similarity search in high dimensions via hashing. In Atkinson, Malcolm P., Orlowska, Maria E., Valduriez, Patrick, Zdonik, Stanley B., and Brodie, Michael L. (eds.), Proceedings of the 25th International Conference on Very Large Data Bases, pp. 518–529. Morgan Kaufmann, 1999, https://dl.acm.org/doi/10.5555/645925.671516
  • Jaiyam Sharma, Saket Navlakha, Dec 2018, Improving Similarity Search with High-dimensional Locality-sensitive Hashing, https://arxiv.org/abs/1812.01844
  • A. Andoni, P. Indyk, T. Laarhoven, I. Razenshteyn, and L. Schmidt, “Practical and optimal LSH for angular distance,” in Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1, (Cambridge, MA, USA), pp. 1225–1233, MIT Press, 2015. https://arxiv.org/abs/1509.02897
  • M. Datar, N. Immorlica, P. Indyk, and V. S. Mirrokni, “Locality-sensitive hashing scheme based on p-stable distributions,” in Proceedings of the Twentieth Annual Symposium on Computational Geometry, (New York, NY, USA), pp. 253–262, ACM, 2004. https://www.cs.princeton.edu/courses/archive/spr05/cos598E/bib/p253-datar.pdf
  • P. Indyk and R. Motwani, “Approximate nearest neighbors: towards removing the curse of dimensionality,” in Proceedings of the thirtieth annual ACM symposium on Theory of computing, pp. 604–613, 1998. https://www.theoryofcomputing.org/articles/v008a014/v008a014.pdf
  • A. Andoni and P. Indyk, “Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions,” in Foundations of Computer Science, 2006. FOCS’06. 47th Annual IEEE Symposium on, pp. 459–468, 2006. http://web.mit.edu/andoni/www/papers/cSquared.pdf
  • K. Terasawa and Y. Tanaka, “Spherical LSH for approximate nearest neighbor search on unit hypersphere,” in Workshop on Algorithms and Data Structures, pp. 27–38, 2007. https://dl.acm.org/doi/10.5555/2394893.2394899
  • Giannis Daras, Nikita Kitaev, Augustus Odena, and Alexandros G Dimakis. SMYRF: Efficient Attention using Asymmetric Clustering. Advances in Neural Information Processing Systems, 33:6476–6489, 2020. https://arxiv.org/abs/2010.05315 (LSH used in attention algorithms.)
  • Chinnadhurai Sankar, Sujith Ravi, and Zornitsa Kozareva. 2021. ProFormer: Towards On-Device LSH Projection Based Transformers. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 2823–2828, Online. Association for Computational Linguistics. https://arxiv.org/abs/2004.05801 (LSH used for embedding vectors.)
  • J Zhang, 2023, Quantization for High-dimensional Data and Neural Networks: Theory and Algorithms, Ph.D. Thesis, University of California, San Diego, https://escholarship.org/content/qt9bd2k7gf/qt9bd2k7gf.pdf
  • Gonzalez TF, 1985, Clustering to minimize the maximum intercluster distance. Theoretical computer science 38:293–306, PDF: https://sites.cs.ucsb.edu/~teo/papers/TCS-ktmm.pdf (Approximate algorithm for vector clustering.)
  • Chen, B. and Shrivastava, A., 2018, Densified Winner Take All (WTA) Hashing for Sparse Datasets, Uncertainty in artificial intelligence, http://auai.org/uai2018/proceedings/papers/321.pdf (Uses hashing related to LSH.)
  • Hackerllama, January 7, 2024, Sentence Embeddings. Introduction to Sentence Embeddings, https://osanseviero.github.io/hackerllama/blog/posts/sentence_embeddings/
  • M Sponner, B Waschneck, A Kumar, 2024, Adapting Neural Networks at Runtime: Current Trends in At-Runtime Optimizations for Deep Learning, ACM Computing Surveys, PDF: https://dl.acm.org/doi/pdf/10.1145/3657283 (Survey of various adaptive inference optimization techniques with much focus on image and video processing optimization for LLMs.)
  • David Spuler, March 2024, Chapter 18. Parallel Data Structures, Generative AI in C++: Coding Transformers and LLMs, https://www.amazon.com/dp/B0CXJKCWX9
  • Yikun Han, Chunjiang Liu, Pengfei Wang, 18 Oct 2023, A Comprehensive Survey on Vector Database: Storage and Retrieval Technique, Challenge, https://arxiv.org/abs/2310.11703
  • Ala-Eddine Benrazek, Zineddine Kouahla, Brahim Farou, Hamid Seridi, Ibtissem Kemouguette, 28 Aug 2024, Efficient k-NN Search in IoT Data: Overlap Optimization in Tree-Based Indexing Structures, https://arxiv.org/abs/2408.16036
  • David Spuler, March 2024, Vector Hashing, in Generative AI in C++, https://www.aussieai.com/book/ch18-vector-hasing
  • Duy-Thanh Nguyen, Abhiroop Bhattacharjee, Abhishek Moitra, Priyadarshini Panda, 9 Feb 2023, DeepCAM: A Fully CAM-based Inference Accelerator with Variable Hash Lengths for Energy-efficient Deep Neural Networks, https://arxiv.org/abs/2302.04712
