Thanks for your patience, we’re back up and running.
Hi all, you might be surprised to be on this page. We’re updating www.wordnik.com right now, and it’ll be back shortly.
What kind of Wordnik are you? Now you can find out by taking a look at your Wordnik profile!
Your Wordnik profile (which is available whenever you’re logged in) shows you (and only you — it’s private) how many words you’ve looked up, and the most recent words you’ve looked up, tagged, left notes about, recorded pronunciations for, and declared to be your favorites!
When you’re logged in, your profile page will help you keep track of cool words you’ve found or the words you frequently misspell.
To find your profile page, click on your username in the upper right-hand corner of the site. (Never logged in? Join today!)
And if you’ve logged in with Facebook Connect, we’ll even show you your user picture (just in case you’ve been spending so much time at Wordnik that you’ve forgotten what you look like).
While putting the profiles together, we noticed a few Wordnik types …
The Enthusiast: has lots of words marked as favorites. (Enthusiasts like to tweet their new favorites, too!)
The Organizer: has tagged lots of words. Organizers’ tags range from the purely informational (consecutivevowels) to the editorial (funnysounding) to the just plain funny (apersonwhoeatsonlyvegetarians).
The Explainer: leaves a lot of helpful notes. (Or funny notes, which are also helpful in their way.)
The Announcer: records a lot of pronunciations. (Or finds them online and does some kind of prestidigitation to add them to Wordnik. Check out the one by “Vizzini” here.)
Soon we’ll be adding even more information to your profile pages, including your complete browsing history and some fun ways to compare yourself to other Wordniks. (A hint: start trying to score those Wordniks now …)
At Wordnik, our plan is to give you as much information as we can about as many words as we can — and that includes information about your own word use. Please let us know what else you’d like your profile to keep track of for you!
Over the last few days we’ve added a couple new things to Wordnik that we hope you’ll like — first, autocomplete! (Quite a few people have requested this.) Now, when you start typing in the search box, you’ll see a list of suggestions.
We’ve also added a “forms” graph. What’s a forms graph? A forms graph tells you stuff like this:
In our data, upper-case “Internet” is still slightly more common than lower-case “internet”.
We’ve also started showing you words used in the same contexts as the word you’ve looked up. These are the words that we’ve found to be used in the same contexts as the word wry:
These words aren’t necessarily synonyms; they are words that are used in the same way in the same kinds of sentences. Have a word on the tip of your tongue? Check out our same-context list for a word that describes the same kind of thing as the word you can’t think of. Something that is described as wry (like a sense of humor or a comment) might very well also be called sardonic.
At Wordnik, our plan is to give you as much information as we can about as many words as we can — please let us know how useful you find it!
One of the most common questions about our site, on Twitter and in emails to feedback@wordnik.com, is “What exactly is the Statistics bubble chart saying?” Like everything else on our beta site, this chart is very much a work in progress. But it is grounded in real counts of word occurrences, so here’s the full explanation.
The legend reads “Bubble size: how much this word was used in a year. Bubble height: unusualness in that year”. “Unusualness” is hopelessly vague, but the vagueness there — and the absence of any numbering on the vertical axis — was meant to avoid misleading false precision.
The size of the bubble represents the count of occurrences within the given year. The count comes from our collection of text, currently around 4 billion words of running text from Project Gutenberg, web feeds from Spinn3r, and a human-directed crawl of interesting texts from all around the web. Since we have widely varying amounts of text for many years, and lower (but growing!) amounts for the public-domain black hole of years between 1923 and the rise of the Internet, the raw count of a word’s occurrences is not very useful for showing how often it was used in a given year. Plenty of words will show up millions of times in the 21st century, because we’ll always have an endless flow of new text from now on. But that would mean that in any year before 2008 or so, everything would always have pitifully low frequency. So instead of showing the raw count, we divide that count by the total number of word tokens used in that year. The words with bigger bubbles are the words that make up a higher proportion of all the words used in the year that bubble represents. The formula is just:
bubble size = (occurrences of the word in the year) / (total word tokens in that year)
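Here is a minimal Python sketch of that division. It is only an illustration: the function name `bubble_size` and the sample counts are made up for this example and are not taken from Wordnik’s code or data.

```python
def bubble_size(word_count, total_tokens):
    """Share of a year's word tokens taken up by one word.

    word_count   -- occurrences of the word in that year (hypothetical input)
    total_tokens -- all word tokens collected for that year (hypothetical input)
    """
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return word_count / total_tokens


# Two invented years with very different amounts of text.
# The raw counts differ by a factor of 100, but the word makes up
# the same share of each year's text, so the bubbles come out equal.
small_year = bubble_size(50, 1_000_000)
large_year = bubble_size(5_000, 100_000_000)
print(small_year, large_year)
```

Dividing by the year’s total is what keeps a thinly covered year from looking as if nobody used the word at all.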
The height/unusualness is an attempt to highlight years where the word was used more often than it normally is in other years. Some have inferred that a word’s unusualness should be the inverse of its frequency: that a rare word in 1960 would have high unusualness. That isn’t the intent. What we’re trying to show is the years where the word is used unusually often, compared to how often the word is used in other years. So while the bubble size reflects how much the word was used in that year, the bubble height considers all of the word’s uses in all years, and reflects the proportion of those uses that occurred within the given year.
For example: a word like the should be pretty evenly un-unusual, with lots of fairly big bubbles (since “the” is pretty much always the most frequent word in a given year) hanging out around the baseline (since it’s very frequent in every year). Instead, at the moment, “the” gives you this:
with most of the years about a quarter of the way above the baseline. This is a flaw in the way we’re generating our charts via Google: the vertical axis is not constant from chart to chart, so the charts are not comparable, and “the” is spreading a very small amount of frequency variation across the whole vertical space. The next iteration of these charts will have a constant vertical axis, to make them more usefully comparable from word to word, but we’re still looking for the right answer to what the constant axis should be. The current formula for this is:
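Going only by the description above (the height reflects the proportion of all of a word’s uses that fall in a given year), the calculation might look something like this Python sketch. The function name `bubble_heights` and the sample counts are invented for illustration. Whether the real chart feeds raw counts or per-year frequencies into this ratio is an assumption here, not something this post confirms.

```python
def bubble_heights(counts_by_year):
    """Map each year to its share of the word's uses across all years.

    counts_by_year -- {year: occurrences of the word in that year}
                      (hypothetical input; raw counts are an assumption)
    """
    total = sum(counts_by_year.values())
    if total == 0:
        # The word never occurs, so no year stands out.
        return {year: 0.0 for year in counts_by_year}
    return {year: count / total for year, count in counts_by_year.items()}


# Invented counts: the word spikes in 1960, so 1960 floats highest.
heights = bubble_heights({1950: 10, 1960: 30, 1970: 10})
for year, height in sorted(heights.items()):
    print(year, height)
```

A word used evenly across years gets nearly the same share in each of them, which is why a chart that stretches its vertical axis to fit can make tiny differences look dramatic.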
For our bubble-size number, many people use measures like “count per 10,000 words”, which might be a better way, since the axis is pre-defined. We would need to give it some sort of logarithmic smoothing so that the bubbles for low-frequency words don’t completely disappear. A word that occurs only once in a billion words will only occur 0.00001 times per 10,000 words, but we’d still like you to know, right away, that it was used that one time.
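To make that concrete, here is one possible sketch of a per-10,000 rate with logarithmic smoothing, in Python. Everything in it is an assumption made for illustration: the function names, the `floor` and `min_size` parameters, and the choice of a base-10 logarithm. It is one way to do it, not the way Wordnik does or will do it.

```python
import math


def per_10k(count, total_tokens):
    """Occurrences per 10,000 word tokens."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return count / total_tokens * 10_000


def smoothed_size(count, total_tokens, floor=1e-6, min_size=1.0):
    """Log-scaled bubble size that never vanishes for a word that occurs.

    floor    -- smallest rate we bother to distinguish (hypothetical value)
    min_size -- size given to a word sitting right at the floor (hypothetical)
    """
    if count == 0:
        return 0.0  # the word really was not used, so no bubble at all
    rate = per_10k(count, total_tokens)
    return min_size + math.log10(max(rate, floor) / floor)


# One use in a billion words versus a very common word (invented counts).
rare = smoothed_size(1, 1_000_000_000)
common = smoothed_size(60_000_000, 1_000_000_000)
print(per_10k(1, 1_000_000_000), rare, common)
```

With these made-up settings the once-in-a-billion word gets a size of about 2 and the very common word a size a bit under 10, so the rare word stays visible next to the common one.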
If you have suggestions for more useful metrics, and more useful visualizations, bring ’em on!
Photo by, and licensed from, Arbitrary.Marks.
Thank you all so much for trying out Wordnik! We hope you’re enjoying using the site (and that you let us know about things you’d like us to change/fix/update).
A few of our favorite things from our first few days:
The tags “sprinkles” and “ice cream toppings”. (And the images at “jimmies”, yum!)
The pronunciation of “inconceivable” by Mitchell.
That 81 people (and counting) have looked up “lexicographer”!
The title of this post (which comes from a comment left on this post from our friends at the TED blog — thanks Shanna!)
Also, here’s our Firefox search bar plugin, for those of you who were looking for it.
Photo by, and licensed from, Gary Simmons.
Hey folks! Wordnik is now in open beta!
What does that mean? Well, you no longer need a username and password to check out the site (although you do still need one to leave notes, record pronunciations, and add related words and tags). Come on over and check us out!
We’re still in beta, though, so please don’t be alarmed if things don’t work perfectly smoothly — we’re working hard to add more words, more data, more cool, interesting, and informative features, and just more MORE in general!
Feel free to leave us feedback at any word—and have fun exploring words with Wordnik!