Built-in Translations

Wordnik is primarily an English-language resource, but we just added a feature to help bridge the gap between English and the rest of the world. Every word page now sports a “translate to” option, which lets you view that word in any of 50 languages. Translation is “sticky,” meaning that once you select a language, subsequent words will appear with a translation into that language until you turn translation off (it’s easily dismissible).

The translations come from Google’s amazing Language API, the only downside of which is that if we want to support other languages (Tamil, for instance), we need to wait for Google to support them first.

For those learning English, we hope that having translations alongside the context Wordnik provides will make for a richer learning environment than standalone translations. If you have any questions, suggestions, or comments on how we can better implement translation, please email us, or let us know in the comments.

Spring News from Wordnik

in just-spring

Photo by, and licensed (CC BY-NC 2.0) from, cuellar.

Spring is always a time for new growth, and we’re certainly growing here at Wordnik!

Some new stuff we think you’d like:

  • We now have a beta mobile site at http://m.wordnik.com, optimized for small-screen devices.
  • We have more new (and better!) example sentences, from new sources, with more on the way soon.
  • Check out our improved word frequency charts!
  • The Wordnik Word of the Day is now available as a daily email. You can sign up for it now by logging in to Wordnik and editing your preferences.
  • Our new autoexpanding comment areas make it easier to write and edit comments of more than a few lines (for when you have a lot to say about a particular word).
  • You’ll find improved definition data from the GNU Webster’s 1913 dictionary, available both on the site and through the API.
  • Developers, check out the new API calls for retrieving examples, related words (synonyms, antonyms, and the like), phrases, and definitions by part of speech. Support for JSONP is now available as well.
  • Our corpus is now using MongoDB under the hood, providing improved performance now, and interesting feature possibilities down the road.
  • And just for fun, follow us on Twitter and Facebook to play SECRET WORD WEDNESDAY! Guess the SECRET WORD OF THE DAY, and win Wordnik stickers!

Hungry for more? Email us at feedback@wordnik.com and let us know what you’d like to see!

Also — for all you developers out there, keep an eye out for details of Wordnik’s first developer contest! We’ll be making an announcement this Friday …

Word frequency charts

awesome!

This spring, our word statistics pages have quietly improved. We now show the frequency of a word and how it has changed over the last 200 years. Our new graphs plot word occurrences for each year as counts per million words of text, which for most words will be no more than a handful.

It’s neat to look at how some words have appeared over time (Internet, a fad which will never catch on) or disappeared (e.g., hansom, a two-wheeled horse-drawn carriage). Also neat to see are words that have changed their sense: icon took on a new meaning in the late 20th century, and this remarkably changes its frequency (from 1-3 per million up to 10+ in the last fifteen years). (We note that not all statistics are entirely safe for work.)

Since our corpus varies in its density (we have far more text available for the last twenty years than we do in the 150 before that), our frequency plots are drawn with a 95% confidence interval* for each year. (Sometimes that gives us unusually spiky plots, because the sparse years offer relatively little information.)

In future releases, we’d like to compare two words on the same plot (compare apple to Apple) or explore other aspects of the words’ appearance.

What would you like to see?

* Our confidence intervals use the Agresti-Coull approximation, which is probably too generous in its upper-bound, especially for rare words. We’d like to fix that to include Bayesian priors on word frequencies in a future release.
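
For the curious, here is a minimal Python sketch of the Agresti-Coull interval mentioned in the footnote, applied to a per-million word frequency. It is an illustration under assumptions, not necessarily how the charts are computed: the function names (agresti_coull, per_million) and the example counts are hypothetical, and z = 1.96 is the usual value for a 95% interval.

```python
import math


def agresti_coull(occurrences, total_words, z=1.96):
    """Approximate confidence interval for a proportion (Agresti-Coull).

    occurrences: times the word was seen in a given year (hypothetical input)
    total_words: total words of text available for that year
    z:           normal quantile; 1.96 corresponds to a 95% interval
    """
    n_adj = total_words + z ** 2                 # adjusted sample size
    p_adj = (occurrences + z ** 2 / 2) / n_adj   # adjusted proportion
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clamp to [0, 1], since a proportion cannot fall outside that range.
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


def per_million(proportion):
    """Express a proportion as a count per million words of text."""
    return proportion * 1_000_000


# A word seen 3 times in a year with only 1,000,000 words of text...
sparse = agresti_coull(3, 1_000_000)
# ...versus the same rate in a year with ten times as much text.
dense = agresti_coull(30, 10_000_000)

print([round(per_million(p), 2) for p in sparse])
print([round(per_million(p), 2) for p in dense])
```

Both years have the same observed rate (3 per million), but the sparse year gets a much wider interval, which is the effect behind the spiky plots described above.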

See also previous post on word-frequency visualization.

Are your words smart enough?

it's love

Today, at the O’Reilly Tools of Change conference, we’ll be announcing an initiative to create a new standard for getting and publishing information about words.

We’re calling it “smartwords”, and it will be an open standard — meaning anyone can publish data sets or develop applications using it. Smartwords will be context-aware and real-time … but also lightweight, easy-to-use, and versatile. We’re developing this standard with help from our first smartwords partners, including The New York Times, Forbes, The Huffington Post, O’Reilly, Vook, Scribd, ibis reader, and the Internet Archive.

With smartwords, you’ll be able to access not just traditional “dictionary-style” information, but also metadata, such as how frequently a word is used, where words are used, and who uses particular words. You’ll also be able to publish information about words — if you create a word, you can put a flag in the ground and claim it for your own — and smartwords will enable cool social features, like sharing and tagging.

What would a world with smarter words look like?

— You’re reading a new popular-science bestseller and your reader shows you quick definitions of the most difficult words, set right in the text … based on knowing what books you’ve already read and what words you’ve already seen!

— You’re a consumer and you have a few sources you trust for information (like, say, the New York Times). When you’re reading something from a different source, you can set your ereader to highlight what you’re reading to link you to good definitions (or similar content) in your trusted sources. (Instant fact-check!)

— You’re reading a great new novel and you see a great quote you’d like to pass along — you highlight it and share it on Facebook or Twitter.

The question is: if every word became a smart word, what would you ask it and what would it tell you?

We’ll be releasing version 1 of the smartwords standard in Summer 2010. With this new standard, we should be able to do fantastic things with smartwords, and we want to hear from you about the kinds of information you would like to access and the kinds of applications you would like to build. Visit us at smartwords.wordnik.com to learn more!

(There’s more information in this nice writeup about smartwords from the Wall Street Journal’s Digits blog.)

Serendipi-tag

“It used to be my little secret, my secret that is until I found out that many of the writers I know practice the same habit. We love to read the dictionary. Many times I have pulled out the dictionary to look up the spelling of a particular word and then another word on the page catches my eye. Twenty minutes later I am still engrossed in the dictionary, browsing through the less familiar definitions.” —Creating Copy by William Ackerly

vacuum tube schematic

It’s true for more than just advertising copywriters: people love the serendipity of a dictionary. They like to get lost for a while, to be distracted, to learn something new.

We like to do that, too, so we’ve made many ways to explore Wordnik.

For example, you can explore another user’s lists. You can look at the related items for a word. You can check out zeitgeist and see what other words people are visiting right now.

But for my money, tags are the feature that offers the most subtle pathways to the unexpected. You can find tags on the right-hand side of a word’s main page.

There’s nothing particularly Linnaean about tags. They’re not meant to be universal. No governing body is going to insist on a hierarchy, a structure, or a form. Unlike Wordnik lists, which can have a mission statement (such as “words I found while reading Great Expectations by Charles Dickens”), tags’ intentions are usually silent.

Tags are personal. They are a way of classifying a word in a way that suits you. Beyond “don’t be a knucklehead,” there aren’t really any rules. You can use short tags, long tags, tags in other languages. You can tag a lot or a little. You can let that basic human need to sort and organize take over. Tag like a maniac in any way that is useful to you or the world.

In lieu of rules, I offer two tag guidelines that have been helpful to me:

1. Make your tags true as far as you know.
2. Make your tags memorable to you.

That way, you’ll have left clues for yourself (if you forget the word) and for other serendipiters who come across the same word. (See, I used a new word there and then tagged it with “neologism.”)

Tags are so personal that often the only obvious intention behind a tag is to demonstrate a connection between two words. For example, if someone tags the word basilect with language, then there’s a pretty good chance that basilect has something to do with language. That’s about as much as we can glean.

However, if someone tags the word language with cvccvvcv, most people are going to be mystified. It doesn’t even look like a word! But there was indeed a connection there for somebody, and, it turns out, the tags are useful if you need to know something about the orthography of a set of words. (Hint: each “c” stands for “consonant” and each “v” stands for “vowel.” Full explanation here.)
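
If you’d like to generate that kind of tag yourself, here is a small Python sketch. It is only a guess at the scheme based on the hint above: the function name cv_pattern is made up, and it assumes that a, e, i, o, and u are the only vowels (so “y” counts as a consonant), which may not match the tagger’s own rules.

```python
VOWELS = set("aeiou")  # assumption: "y" is treated as a consonant


def cv_pattern(word):
    """Return a consonant/vowel tag such as 'cvccvvcv' for a word.

    Non-letters (hyphens, apostrophes, spaces) are skipped.
    """
    return "".join(
        "v" if ch in VOWELS else "c"
        for ch in word.lower()
        if ch.isalpha()
    )


print(cv_pattern("language"))
```

Running it on language gives cvccvvcv, the tag mentioned above.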

Remember that a word can both be tagged and be a tag itself. At the top of every word’s tag page you’ll see “words tagged” with the word you’re looking at and at the bottom you’ll see “the word has been tagged.” Check out the tag page for neologism to see what I mean.

If you want a bit of guided serendipity, you can browse the tags made by any user who has a public profile. Here are some of mine.

If you’re looking for a little more about tagging from an insider’s point of view, I recommend the book Tagging: People-powered Metadata for the Social Web.

Happy tagging!

Photo by Paula Rey. Used under a Creative Commons license.

Hurrah! We Have a Word-of-the-Day Widget!

Wordnik now has a word-of-the-day (henceforth WOTD) widget!


Wordnik WOTD widget


You can check it out and grab the code here.


With our new widget you can display the Wordnik WOTD on your blog or website, for the entertainment and edification of your readers!


If you’d rather follow the WOTD through RSS you can use this link. (You can also follow us on Twitter for WOTDs, interesting language links, and more.)


We’ve also added a new graph to some word pages—a punctuation profile!


Hurrah!


The punctuation profile gives you an idea of how often a word is followed by an exclamation point, a question mark, or a period at the end of a sentence, as compared with the average for all words.


As you can see, an exclamation like hurrah is more likely than average to be followed by an exclamation point, and less likely to be followed by a question mark.
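
For the technically inclined, the idea behind a punctuation profile can be sketched in a few lines of Python. This is a toy illustration, not the code behind the graph: the names punctuation_counts and profile, the regular expression, and the sample text are all assumptions, and the sketch makes no attempt to tell a sentence-final period from one in an abbreviation.

```python
import re
from collections import Counter, defaultdict

# A word, optionally followed directly by one of ! ? .
TOKEN = re.compile(r"([A-Za-z']+)([!?.]?)")


def punctuation_counts(text):
    """Count, for every word, which mark (if any) directly follows it."""
    per_word = defaultdict(Counter)
    for word, mark in TOKEN.findall(text):
        per_word[word.lower()][mark or "none"] += 1
    return per_word


def profile(per_word, word):
    """For each mark, return (share after `word`, share after all words)."""
    word_counts = per_word.get(word.lower(), Counter())
    word_total = sum(word_counts.values())
    if word_total == 0:
        raise ValueError(f"no occurrences of {word!r}")
    overall = Counter()
    for counts in per_word.values():
        overall.update(counts)
    overall_total = sum(overall.values())
    return {
        mark: (word_counts[mark] / word_total, overall[mark] / overall_total)
        for mark in "!?."
    }


sample = "Hurrah! We won. Hurrah! Is it a peanut? Hurrah for peanut butter."
counts = punctuation_counts(sample)
for mark, (for_word, average) in profile(counts, "hurrah").items():
    print(mark, round(for_word, 2), round(average, 2))
```

In this made-up sample, hurrah is followed by an exclamation point more often than the average word and by a question mark less often, mirroring the graph described above.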


The punctuation profiles are turning up some interesting conundrums: for instance, why is the tally of question marks so high for the word peanut?


peanut?


(It can’t all be because of Wordnik’s favorite movie…)


We hope you enjoy the words of the day and the punctuation profiles! If you’d like to email us suggestions for future WOTD candidates, you can do so at feedback@wordnik.com.

What Does Your Wordnik Profile Say About You?

What kind of Wordnik are you? Now you can find out by taking a look at your Wordnik profile!


Your Wordnik profile (which is available whenever you’re logged in) shows you (and only you — it’s private) how many words you’ve looked up, and the most recent words you’ve looked up, tagged, left notes about, recorded pronunciations for, and declared to be your favorites!


When you’re logged in, your profile page will help you keep track of cool words you’ve found or the words you frequently misspell.


To find your profile page, click on your username in the upper right-hand corner of the site. (Never logged in? Join today!)


Hap E Wordnik's username


And if you’ve logged in with Facebook Connect, we’ll even show you your user picture (just in case you’ve been spending so much time at Wordnik that you’ve forgotten what you look like).


While putting the profiles together, we think we’ve identified some Wordnik types …


The Enthusiast: has lots of words marked as favorites. (Enthusiasts like to tweet their new favorites, too!)


The Organizer: has tagged lots of words. Organizers’ tags range from the purely informational (consecutivevowels) to the editorial (funnysounding) to the just plain funny (apersonwhoeatsonlyvegetarians).


The Explainer: leaves a lot of helpful notes. (Or funny notes, which are also helpful in their way.)


The Announcer: records a lot of pronunciations. (Or finds them online and does some kind of prestidigitation to add them to Wordnik. Check out the one by “Vizzini” here.)


Soon we’ll be adding even more information to your profile pages, including your complete browsing history and some fun ways to compare yourself to other Wordniks. (A hint: start trying to score those Wordniks now …)


At Wordnik, our plan is to give you as much information as we can about as many words as we can — and that includes information about your own word use. Please let us know what else you’d like your profile to keep track of for you!