I released a preprint of my paper Understanding Transformers via N-Gram Statistics last Friday, which provides insights into the ways in which LLM behavior can be described in terms of simple statistical rules. I wrote a detailed X thread summarizing the paper, so I don’t have anything else to add to that for now. I’m quite excited to finally make the paper available, given that the research was started last September upon my move to London from the US. It’s an idiosyncratic paper, which is perhaps inevitable given that it is a single-author paper and I do not have a background in natural language processing.
The X thread also explains why my paper is on ResearchGate and not (yet!) on the arXiv. This isn’t my first time encountering friction with the arXiv. For those with some time to kill, I’ve explained elsewhere how my paper A Response to Geometric Unity, which debunks Eric Weinstein’s “theory of everything,” was not accepted to the arXiv (see 26:22 of my interview on Decoding The Gurus). Nevertheless, for amusing incidents like the above and serious research alike, it seems the arXiv is unable to provide any transparency into how it operates, which is a truly unfortunate disservice to the research community.
I should point out one benefit of ResearchGate over the arXiv that I discovered from my upload: you can view how many times your article has been read on ResearchGate.
Update: The article is now on the arXiv. (The announcement appeared on the arXiv mailing list on July 17 in the USA; my submission date was June 30.) https://www.arxiv.org/abs/2407.12034