From the Understanding of Transformer to the Non-linearity of Time
Given that the token is the smallest unit in an LLM, while the pixel is the smallest unit in CV, if we put Transformer, RNN/LSTM, and CNN side by side, the Transformer is closer to the CNN in its ability to catch the panorama (considering that a CNN's kernel could, in principle, be as large as the whole picture).
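To make the architectural point concrete, here is a minimal NumPy sketch (an illustration of my own, not any library's API; the function names are mine): single-head self-attention computes a full (T, T) weight matrix in one step, so every token attends to every other token at once, much like a CNN kernel stretched to the size of the whole picture, while a plain RNN is forced to walk the sequence strictly left to right.

```python
import numpy as np

def self_attention(x):
    """x: (T, d) token embeddings. Every position mixes ALL positions at once."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                  # (T, T): each token scores each token
    w = np.exp(scores - scores.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)                  # softmax over the whole sequence
    return w @ x                                   # panoramic: order enters only via positional encodings

def rnn_last_state(x, W_h, W_x):
    """A plain RNN: position t only sees positions <= t, compressed into one state."""
    h = np.zeros(W_h.shape[0])
    for x_t in x:                                  # strictly sequential, like time
        h = np.tanh(W_h @ h + W_x @ x_t)
    return h

T, d = 5, 8
x = np.random.randn(T, d)
print(self_attention(x).shape)                    # (5, 8)
print(rnn_last_state(x, np.random.randn(d, d), np.random.randn(d, d)).shape)  # (8,)
```

The difference is the receptive field: attention's (T, T) weight matrix is the "kernel as large as the whole picture", whereas the RNN threads everything through one recurrent state, tying the computation to the arrow of time.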
To say it again in metaphor: Heptapod B, in the film Arrival, challenges the linearity of time. Under the Sapir-Whorf Hypothesis, the language the Heptapods speak changes the way they think, so they can foresee the future, and so, eventually, can Louise (the linguist sent to decipher the aliens' language).
Think further about art, specifically about music, and then painting.
Music is usually considered a sequential art form: the audience sitting in the Southbank Centre all listen to the same piece of symphony, at the same time; whereas with painting, when you walk into Tate Britain and are seized by a Turner, your eyes might wander in quite a different way from your friend's.
(There is no rule without exception, such as ... Bach palindromises? Schoenberg betrays?)
Then what about languages that fall in between? It reminds me of a trick in Chinese: sometimes when the character sequence is disorganised, you won't even notice when you skim it. Typoglycemia.
Not only sentences, but also paragraphs. When reading Chinese, I tend to capture the whole paragraph as an "image" input (I am not sure whether native English speakers do the same?).
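For the curious, a toy sketch of the trick (illustrative only; the function names are mine): classic English typoglycemia keeps each word's first and last letter and shuffles the interior, while the Chinese version swaps adjacent characters within a word (e.g. 顺序 -> 序顺); a skimming eye, taking the line in as an image, tends to repair both silently.

```python
import random

def scramble_word(word, rng):
    """Typoglycemia: fix the first and last letter, shuffle the middle."""
    if len(word) <= 3:
        return word
    inner = list(word[1:-1])
    rng.shuffle(inner)
    return word[0] + "".join(inner) + word[-1]

def scramble(text, seed=0):
    rng = random.Random(seed)
    return " ".join(scramble_word(w, rng) for w in text.split())

print(scramble("reading the whole paragraph as a single image"))
# e.g. "rdienag the wolhe prgaaprah as a slgnie igmae"

# Chinese analogue: swap characters inside two-character words,
# as in the well-known 研表究明 (for 研究表明, "research shows").
```

The scrambled text still reads because we do not decode it strictly token by token; we take in the word, even the line, as a whole.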
Behind Non-Linearity lies Uncertainty. I appreciate one review of the film saying that
the most fascinating thing about the heptapod language is not that it removes the dimension of time, but that it retains those uncertainties in consciousness that normal language strips away,
while I think it should not be a "rather than" but a "thereafter": the non-linearity comes after, and because of, retaining the uncertainty.
When we "read", we appreciate painting, when we "listen", we expect music,
when we "speak" we pause and think and usually end up with letting out scattered words,
when we "write" we are finally exact (wo)man, however, as tying the text to linearity of time, what we gain, what we lose?