“It used to be my little secret, my secret that is until I found out that many of the writers I know practice the same habit. We love to read the dictionary. Many times I have pulled out the dictionary to look up the spelling of a particular word and then another word on the page catches my eye. Twenty minutes later I am still engrossed in the dictionary, browsing through the less familiar definitions.” —Creating Copy by William Ackerly

It’s true for more than just advertising copywriters: people love the serendipity of a dictionary. They like to get lost for a while, to be distracted, to learn something new.

We like to do that, too, so we’ve made many ways to explore Wordnik.

For example, you can explore another user’s lists. You can look at the related items for a word. You can check out zeitgeist and see what other words people are visiting right now.

But for my money, tags are the feature that offers the most subtle pathways to the unexpected. You can find tags on the right-hand side of a word’s main page.

There’s nothing particularly Linnaean about tags. They’re not meant to be universal. No governing body is going to insist on a hierarchy, a structure, or a form. Unlike Wordnik lists, which can have a mission statement (such as “words I found while reading Great Expectations by Charles Dickens”), tags’ intentions are usually silent.

Tags are personal. They are a way of classifying a word in a way that suits you. Beyond “don’t be a knucklehead,” there aren’t really any rules. You can use short tags, long tags, tags in other languages. You can tag a lot or a little. You can let that basic human need to sort and organize take over. Tag like a maniac in any way that is useful to you or the world.

In lieu of rules, I offer two tag guidelines that have been helpful to me:

1. Make your tags true as far as you know.
2. Make your tags memorable to you.

That way, you’ll have left clues for yourself (if you forget the word) and for other serendipiters who come across the same word. (See, I used a new word there and then tagged it with “neologism.”)

Tags are so personal that often the only obvious intention behind a tag is to demonstrate a connection between two words. For example, if someone tags the word basilect with language, then there’s a pretty good chance that basilect has something to do with language. That’s about as much as we can glean.

However, if someone tags the word language with cvccvvcv, most people are going to be mystified. It doesn’t even look like a word! But there was indeed a connection there for somebody, and, it turns out, the tags are useful if you need to know something about the orthography of a set of words. (Hint: each “c” stands for “consonant” and each “v” stands for “vowel.” Full explanation here.)

Remember that a word can be tagged and can also be a tag itself. At the top of every word’s tag page you’ll see “words tagged” with the word you’re looking at, and at the bottom you’ll see “the word has been tagged.” Check out the tag page for neologism to see what I mean.

If you want a bit of guided serendipity, you can browse the tags made by any user who has a public profile. Here are some of mine.

If you’re looking for a little more about tagging from an insider’s point of view, I recommend the book Tagging: People-powered Metadata for the Social Web.

Happy tagging!

Announcing the new Wordnik alpha APIs! (UPDATED)

Today we’re happy to announce the alpha version of our new Wordnik APIs! UPDATE: See the video of the announcement we made at the Web 2.0 Summit.

Wordnik’s goal is not just to collect at least some information about every word in English — it’s also to make great information about words widely available, and our alpha APIs are a first step towards that goal.

Our new APIs include:

  • a definitions server, with definitions from The Century Dictionary (other dictionaries will be coming soon);
  • a “frequency” API, which returns a frequency number based on our initial API corpus*;
  • an “examples” API, which will return up to five example sentences for any word that appears in our initial API corpus;
  • the Wordnik word-of-the-day API (so you can create your own word-of-the-day wrapper or widget);
  • and it’s not really a standalone API, but we’re also throwing in an autocomplete API that is useful for making stuff with the other APIs.

You can sign up for our APIs here. Depending on demand, we may have to stagger approvals so as not to overwhelm the servers. If you want a better chance of being approved, give us as much detail as possible about how you plan to use our APIs. Coolness counts (but spelling doesn’t — since we haven’t released a spelling API yet).

Rudimentary documentation is here.

This is just a start — we’re hoping to release new APIs at regular intervals, so if there’s a kind of word data you’re longing to have access to, please let us know!

(* Our initial API corpus is about 3 billion words of running text. The API corpus is slightly different from the corpus that drives the Wordnik web site.)

Wordnik word of the day: pluck

Today’s word of the day is pluck. Naturally, if we’re going to choose a word that seems so ordinary, we’re going to tell you about a meaning that isn’t. This pluck is the heart, liver, windpipe, and lungs of a sheep, ox, or other animal used as butchers’ meat. It’s also used figuratively or humorously for similar parts of a human being, especially when talking about “having the pluck” or “being plucky,” meaning, “showing courage and spirit in trying circumstances” or “being bold or brave.” In other words, “having the guts or the stomach to do something” or “showing intestinal fortitude.”

Why doesn’t anyone ever say, “He has the belly button to do what’s right”?

We Love the Century Dictionary

One of our favorite parts of Wordnik is the Century Dictionary. With more than 530,000 definitions and discursive notes, it is the second-largest English-language dictionary ever published.

But the Century isn’t just big—it’s beautiful, too. To quote expert etymologist Anatoly Liberman, “The Century is one of the great reference works in American history (some would say the greatest).” In the Oxford History of English Lexicography, Thomas Herbst and Michael Klotz write that “it is a superb dictionary in many respects and still has much to offer to those interested in the vocabulary of the period. It was from the beginning a quixotic venture (as many new dictionaries are), and it occupies a singular place in American lexicography for its attempt to marry the highest form of the printers art with dictionary-making.”

The Century—despite having been available online as searchable images from the nice folks at Global Language, and in scanned and OCR (optical character recognition) versions at the Internet Archive and through Google Books—has been too little-known for too long. So we knew we wanted it to be a part of Wordnik in a format that was a little less archival and a little more useful, to give more people the joy of browsing through it.

We didn’t want to change the spirit of the original text, but we did want to make the Century a bit more readable. So we expanded thousands of abbreviations (such as mycol., priv., and Lett.) to their full forms (mycology, privative, and Lettish, in case you were curious). We also converted more than 240,000 pronunciations from the obsolete Century format (they had about a dozen different representations for schwa [ə]!) to the International Phonetic Alphabet.
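To give a rough idea of what abbreviation expansion involves, here is a small sketch. It is not the process we actually used: the table holds only the three abbreviations mentioned above, and the names `ABBREVIATIONS` and `expand_abbreviations`, along with the sample phrase, are made up for illustration.

```python
import re

# Only the three abbreviations named in the post; a real table would be far larger.
ABBREVIATIONS = {
    "mycol.": "mycology",
    "priv.": "privative",
    "Lett.": "Lettish",
}

# Longest abbreviations first, so a longer one wins over a shorter prefix.
# The lookbehind prevents matching in the middle of a longer word.
_PATTERN = re.compile(
    r"(?<!\w)("
    + "|".join(re.escape(a) for a in sorted(ABBREVIATIONS, key=len, reverse=True))
    + r")"
)


def expand_abbreviations(text: str) -> str:
    """Replace each known abbreviation in text with its full form."""
    return _PATTERN.sub(lambda m: ABBREVIATIONS[m.group(1)], text)


# Hypothetical sample phrase, not a quotation from the Century.
print(expand_abbreviations("a priv. prefix, as in Lett."))
```

One limitation of this simple approach: when an abbreviation ends a sentence, its period is swallowed along with it, so a real pipeline would need to restore the sentence-final punctuation.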

Even though we had the entire Century keyed from scanned pages instead of using OCR (for better accuracy), there are still some typos scattered through the text. If you see a typo in any entry, please do use the “Report a typo” link at the top of the page to let us know!

Other usability improvements are coming soon, but in the meantime, if you’d like more information about the Century Dictionary, see the Wikipedia entry. Also, in the 1996 (number 17) issue of the journal Dictionaries, published by the Dictionary Society of North America, there are a number of excellent articles celebrating the centennial of the first edition of the Century.