The Zipf Mystery

The of and to. A in is I. That it, for you, was with on. As have ... but be they.
How many days have you been alive?
random letter generator:
Dictionary of Obscure Sorrows:
Word frequency resources:
[combined Wikipedia and Gutenberg]
Great Zipf's law papers:
Zipf’s law articles and discussions:
other Zipf’s law PDFs
in untranslated language:
Zipf’s law slides:
Pareto Principle and related ‘laws’:
Random typing and Zipf:
health 80/20:
Principle of least effort: [PDF]
self organized criticality:
Hapax Legomenon:
Learning curve:
Forgetting curve:
Experience curve effects:
and zipf's law:
music from:

Mathias Pampus 49 minutes ago
Hi everyone, I know it's a little late, but I just watched it and actually wondered if this video itself would show signs of zipf-iness. So I took the transcript and processed it a little.
Some info upfront:
I sliced things like "power-law" and "day-to-day basis" into completely separate words
I kept the distinction between singular and plural of the same word, as well as all the conjugations
I stripped words of possessive suffixes (languages' to languages; word's to word)
I extended abbreviated forms of "is", "have" etc. to full length (I've to I have; can't to can not; etc.)
I kept digits as digits, numbers as well; I did, however, separate "ten", "hundred", "thousand" etc. if they were terminating a number (30 thousand; 181 million)
So here are the results, draw your own conclusions:
Word count total: 2,885 (2,308 = 80%)
Word count unique: 853 (171 = 20%)
Hapax legomena: 514
First 20% of most common words are used 1998 times total which is 71% of the total word count.
Words used only once make 64% of the unique word count and 18% of the total word count.
-----------------------------------------------------------
the: 164, of: 103, a: 84, is: 73, and: 72, in: 53, to: 51, that: 47, it: 46, words: 31
percent: 30, are: 29, word: 27, as: 27, for: 25, used: 22, we: 21, but: 21, be: 20, Zipf: 20
about: 19, or: 19, more: 18, not: 18, one: 17, I: 17, this: 16, have: 16, will: 16, often: 16
on: 16, you: 16, most: 15, so: 14, language: 14, law: 13, just: 13, even: 12, what: 12, 20: 12
get: 12, if: 12, by: 12, there: 12, all: 11, letter: 11, way: 11, up: 11, can: 10, out: 10
at: 10, when: 10, has: 10, world: 10, was: 9, 0: 8, how: 8, them: 8, they: 8, 80: 8
principle: 8, said: 8, many: 8, according: 8, things: 8, some: 8, does: 8, appear: 7, from: 7, number: 7
than: 7, which: 7, only: 7, point: 7, much: 7, these: 7, English: 7, like: 7, every: 6, 26: 6
Pareto: 6, times: 6, corpus: 6, been: 6, few: 6, once: 6, languages: 5, here: 5, something: 5, after: 5
eighty: 5, least: 5, half: 5, frequency: 5, likely: 5, typing: 5, result: 5, randomly: 5, also: 5, spacebar: 5
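For anyone who wants to try the same experiment on another transcript, here is a minimal Python sketch of the kind of processing described above. It is not the script behind those numbers, which was not posted. The names `CONTRACTIONS`, `tokenize` and `summarize` are made up for illustration, the contraction table is a tiny hypothetical stand-in, and lowercasing everything is an assumption (the list above seems to keep some capitals, such as "Zipf" and "English").

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny contraction table; the comment above does
# not give the full list that was used.
CONTRACTIONS = {"i've": "i have", "can't": "can not", "it's": "it is"}


def tokenize(text):
    """Split text into words, roughly following the rules listed above."""
    # Assumption: lowercase everything and normalise curly apostrophes.
    text = text.lower().replace("\u2019", "'")
    # Expand the known contractions before touching possessives.
    for short, full in CONTRACTIONS.items():
        text = re.sub(r"\b" + re.escape(short) + r"\b", full, text)
    # Strip possessive suffixes. Contractions missing from the table
    # (e.g. "that's") lose their 's here too, a limitation of this sketch.
    text = re.sub(r"(\w)'s\b", r"\1", text)     # word's -> word
    text = re.sub(r"(\w)'(?!\w)", r"\1", text)  # languages' -> languages
    # Anything that is not an ASCII letter or digit (hyphens included)
    # separates words, so "power-law" becomes "power" and "law".
    return re.findall(r"[a-z0-9]+", text)


def summarize(tokens):
    """Return total/unique counts, hapax legomena and the top-20% share."""
    if not tokens:
        raise ValueError("no tokens to summarize")
    counts = Counter(tokens)
    ranked = counts.most_common()  # most frequent first
    top_n = max(1, round(0.2 * len(ranked)))
    return {
        "total": len(tokens),
        "unique": len(ranked),
        "hapax": sum(1 for c in counts.values() if c == 1),
        "top20_share": sum(c for _, c in ranked[:top_n]) / len(tokens),
        # Zipf's law predicts rank * count stays roughly constant.
        "rank_times_count": [
            (rank, word, count, rank * count)
            for rank, (word, count) in enumerate(ranked[:10], start=1)
        ],
    }
```

Calling `summarize(tokenize(transcript))` on a transcript string gives the same kind of figures as above: total and unique word counts, the number of words used only once, and the share of the text covered by the most common 20% of distinct words. Different tokenizing choices will shift the exact numbers.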
1:48 I love that Chilean appears like a different language than Spanish
We remember the less used words from a book because they are what make it different from all other books.
I'm breathless.
I'm a 3D artist and I'm confused as well. When rendering 3D objects with lighting, the way the light falls off from 100% intensity (at the light source) to the furthest place reflecting the light also follows Zipf's law, so we as 3D artists tend to put lights far away from the object with high intensity so the difference isn't noticed and the light is almost equally distributed.
Holy shit we also follow the 20/80 rule when learning and practicing.
He should make a video about Benford's law. It's really cool. If you don't know what it is, look it up.
Michael: diameter of moon craters Me: You’ve lost this time
can somebody write out everything Michael said in this video and see if zipf law works with it
This would make a great audiobook
heyyyyy how to, basic here
This guy is entirely too happy to actually be as intelligent as he tries to come off.
So we are in a simulation after all
you use 20% of blenders features 80% of the time
Time to use quizatiously or however you spell it on a paper and confuse my ela teacher
So 20% of scientists discover 80% of stuff. So 80% of scientists discover 20% of stuff.
In case anyone's wondering... I think I counted 126 "the" s in this video. Probably not too accurate but I used a Tally Counter lol
LMAO Vilfredo Pareto... I can't be the only one dying at that name. 5:00
Humans interpret information logarithmic to base ten, so that's probably what causes our language to exhibit Zipf's law.
My theory as to why this exists is the nature of language. Certain words are preferred and play different roles in a language (words such as 'a' or 'the' are used in pairings with other words, meaning they will occur more often than the other words themselves). That is why, when a language forms, it naturally uses shorter words for these roles, so they can be used with higher EFFICIENCY and allow better expression. Imagine if we replaced the/a with baculiform, ex: "baculiform cat ate baculiform red treat". In general, the goal of a language is to be an efficient and fast way to express ideas, and using short sounds to do this would make sense. Zipf's law makes a language what it is supposed to be.
i’m 80% confused and 20% understand
... the number of people that die in morgues...
You have to wonder with all these coincidences, if it points to coding in real life...
80% of what i say is 20% of what the voices are telling me to say jk they arnt real but they do got some good ideas
11:30 almost closed the video :|
awsome video. needs more views
So I tried to count how many times he said the word 'the', but I lost. He's too interesting
And that is why english is ez
Jokes on you, my language doesn’t have the word ”the”.
I have a question. Why do we want to learn a language? Why do we (I) feel as if we need to learn everything, or seek out knowledge?
Soooo... does this mean that studying 20% of the work give me 80% of the grades?? 🧐
Your grasp, communication, and expressiveness of the subjects you expound on is without equal in a delightful way. Makes me want to pay attention and that is rewarding. Thank you.
God:They are finding out the Easter eggs
While accurate, interesting, and of value to examine and analyze - A problem (80%) is addressed by a solution (20%). Be careful of a solution in search of a problem! i.e. what about the OTHER problems? If one is looking for this pattern, one will find it. -- Sorry - Explained at 8:41 - If you are a mathematician, give up and dislike here.
Why am I getting Christmas ads it’s not even November yet
The thing about Vsauce videos is that they throw out mysteries and questions without answering them, and even when they do answer, it makes things more complicated. It must be listed as one of the most mentally haunting sites on YouTube
idk why vsauce makes me really tired
This is vsauce's best video
what's the song at the end of the video?
20% of all countries have 80% of coronavirus cases
Why you many words, few work fine
Maybe the Zipf Mystery was just an Easter egg put in by the developers.
Could this be used to find out if the Voynich manuscript is bullshit or not?
the most common words make a sentence. not a good sentence.. just a sentence
3:25 Sorry God forgive please i like-1 sorry
We are in the matrix... (the)
0:27 *my English teacher is questioning reality rn*
18:08 imagine his video started like this😅 You wouldnt be surprised, would ya?
Beautiful video! Very profound! Michael is such a pure example of the ENTP personality type...asking questions about overlooked aspects of reality and going on numerous rants about various different but seemingly unrelated topics just because it's fun and interesting!!!
We are what we eat.... we are what we read...
Reality is probably some toy bought by a rich kid in a parallel universe.
Pea pod pea pod pea pod
And so chaos theory comes into play
Because I'm a smart ass dickhead whenever someone posts one of those naf 'the first word you see' blah, blah says blah, blah grids I always answer what the Hell does 'A' indicate I guess it indicates that I'm a smart ass dickhead.
Instead of writing my essay, I watched this video and got an A because I used a word my teacher hasn't heard. thank you quizzaciously AND DONT TREAT ME QUIZZACIOUSLY IN THE COMMENTS FOR TELLING A JOKE :]
Zipf's law seen in comments and comment's replies and likes of this video Most liked comment has twice the number of likes of the second most liked one...
0:27 Sounds like Dennis reading the campaign video monologue that Charlie wrote in Sunny
I always find myself back on this channel because I want to watch a vsauce video but there isn't a new one. It's whatever, Michael's videos are timeless anyway.
But not Arabic 👌🏽😎
The of and to a in is that it for you was with on as have but be they
That last segment was pretty sorrowful and angsty even before I discovered Michael is only a day older than me
So it seems two exponential data sets that conflict often create a zipf relationship. Interesting
old english be like 0:27
the pareto principle sounds really dumb. of course 80/20 is gonna apply to a bunch of things, but so would 70/30 and any other combination of numbers
Shakespeare laughs in 'Thine'
I'm gonna go read a book now
And as always, thanks for (2 unskippable 15 second ads)
When he was listing all of those examples with the graphs, what was that last one?
20% of my basement is occupied by 80% of the neighbor kids..
Now I’m looking every single comment to find it
Fun fact: Michael uses a Mac.
"It looks like this test brought down the class average by 80%" _Me and the other 20% of kids_
How do I find the list of the number of times the word was used?
six year old me when I don't get to eat ice cream 30 minutes before my bed time: 2:45
sauce cant be the 5555th most common word.. i use your name in conversation ALL the time because i cant wait to discuss theories like this...
i see this as undeniable evidence that the theory of everything does exist. we will find it one day.
Yeah like how time feels like it's going faster as you age. 1 1/2 1/3 ... 1/27 1/32 1/40 Etc You are experiencing it by that much of a fractional amount.
I watched this video a while back and was amazed at how much I forgot.
He reads the top 20 most popular words like my grandma reads a bible verse
Interesting stuff! Now, could you please explain how Zipf applies to things like commodities and the stock market? I mean, if it applies to all of the different things we talked about here then it has to apply to the market n stuff, right? Come to think of it, this explains why some guys have all the luck....;(
Does this confirm that we dont have free will?
i came for knowledge but i got existential crisis in the end
18:15 Michael spitting some faccs part of the video...
20:03 too...
U know the theory about our life was a simulation?? Maybe the coder set this up
How many times did he say "the "
How many times did he say the
Could Zipfs Law be used to determine if the Voynich Manuscript is written in a real language by looking at the number of times the “words” that repeat appear? 🤔
"Hey 55555, Michael here, what is.. Here"
sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce sauce
I don't think the number of letters is the reason for that. I think it's the other way around: more common words are shorter because they are used so often. This might sound kind of strange, but think about how language might have come into being. And think about what algorithm your brain might use to formulate a phrase. It's not like: "Take random letters until I get the word I need -> finish". For me it works like this: "How can I tell someone what I'd like to say -> What kind of words are there -> Which word has a matching meaning -> translate them so non-German people know what you are talking about -> finish". For example: 130,000 years ago humans wanted to say they were hungry. They said: "Huguh guahah guguluggugulugukuuku". But "I'm hungry" was often needed, so the said "Huguh" became "I", "guahah" became "am" and "guguluggugulugukuuku" became "hungry". That's it (or at least: that might be it).
WHY. DOES. THIS. KEEP. COMING. UP. NEXT. i've sadly even been forced to dislike this video, and the which way is down video. because they just dont stop coming up next, highly annoying.
90% of population lives in 30% of land
The 16 words sounded like a quote from 200 years ago for me
Quizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously QuizzaciouslyQuizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously QuizzaciouslyQuizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously QuizzaciouslyQuizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously QuizzaciouslyQuizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously QuizzaciouslyQuizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously QuizzaciouslyQuizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously QuizzaciouslyQuizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously QuizzaciouslyQuizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously QuizzaciouslyQuizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously QuizzaciouslyQuizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously QuizzaciouslyQuizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously QuizzaciouslyQuizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously QuizzaciouslyQuizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously QuizzaciouslyQuizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously QuizzaciouslyQuizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously Quizzaciously
mind blown
