Natural Language Search

Natural language processing, the design of computer systems that understand human language, has proven a tough nut to crack. Yesterday a TechCrunch UK post covered the beta launch of True Knowledge, a UK startup offering a natural language search engine. They join competitors like Powerset, which has yet to launch but is apparently tackling the same problem. It’ll be interesting to see if these go anywhere, or become the next Ask Jeeves.

True Knowledge has a good demo video explaining their technology, the most interesting part of which is hearing a British voice pronounce the word “beta.” It comes out sounding like beet-ah. Is that how it’s actually pronounced over there, or is that just a quirk of the guy talking? Team Wordie, UK division, please report.

Paginated Word Lists

Finally, word lists have been broken into pages, to make it easier to go through long lists (and to prevent long lists from crashing browsers–Wordie now passes the stpeter test).

I cranked this out, so it’s pretty basic, and probably buggy. Right now each page is 100 words long; eventually I’ll make that configurable, and otherwise fancy it up. Let me know if you see any problems.
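
For the curious, fixed-size paging along these lines can be sketched in a few lines of Python. This is only an illustration, not Wordie's actual code: the names `paginate`, `page_count`, and `PAGE_SIZE` are hypothetical, and the only detail taken from above is the 100-word page length.

```python
from math import ceil

PAGE_SIZE = 100  # words per page, as described above; the constant name is hypothetical


def paginate(words, page, page_size=PAGE_SIZE):
    """Return the slice of `words` shown on the 1-indexed `page`.

    A page past the end of the list yields an empty list.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    return words[start:start + page_size]


def page_count(n_words, page_size=PAGE_SIZE):
    """Number of pages needed to show `n_words` words."""
    if page_size < 1:
        raise ValueError("page_size must be positive")
    return ceil(n_words / page_size)
```

Passing `page_size` as a parameter is what would make the page length configurable later: a 250-word list comes out as three pages, the last holding 50 words.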

The Swearing Festival

This coming Saturday, November 10, is the second annual Swearing Festival, which is exactly what it sounds like: an exploration and celebration of expletives.

The afternoon program is relatively staid, with a panel of linguists, authors, and publishers talking about cursing. The evening program is pretty much just cursing: competitive cursing, cursing to music, cursing in different languages.

If you’re in the Bay Area (the one near San Francisco, not the Bay of Fundy or Bengal), you might want to check it out.

The Cupertino Effect

Ben Zimmer* has an interesting and amusing post in today’s OUPblog about the Cupertino Effect: the tendency of spellcheckers, due to outdated dictionaries, bad algorithms, or a combination thereof, to insert or suggest nonsensical words.

The recent addition of WordNet definitions to Wordie (which I’ll blog at greater length on Monday) was resulting in a version of this before I tweaked the algorithm. As someone famous once said (Barbie, I think), natural language processing is hard.

* update: I incorrectly called Ben “Bill Zimmer” when this was first posted. Not sure where that came from, sorry Ben!

Wordie hearts vajayjay

Stephanie Rosenbloom of The New York Times has an excellent piece about the word “vajayjay,” a euphemism for vagina coined on “Grey’s Anatomy” and popularized by Oprah Winfrey.

There was some gnashing of teeth on Wordie when vajayjay was first listed a few months ago, but the word fills a linguistic void, according to Rosenbloom: There are a slew of lighthearted euphemisms for male genitalia (enough to constitute a Monty Python song), but fewer for the female equivalent, and fewer still that aren’t vulgar or sexist.

Rosenbloom takes a silly word as an occasion to talk to some serious linguists and writers on some interesting topics. Well worth the read.

Give That Woman a Crappuccino!*

My pal Theo pointed me to this WSJ Law Blog piece on Sharon Nichols, founder of the “I Judge You When You Use Poor Grammar” Facebook group. The group’s stated mission is to document bad grammar, and to date almost 5,000 photos have been uploaded for that purpose. One example: a rather large tattoo claiming “You Bleed Just To Know Your Alive.”

Nichols, a student at Alabama Law, was also covered last week in The New York Times Fashion & Style section, which I found a bit odd–does good grammar ever go out of style?

* See crappuccino. And don’t forget your unlimited edition crappuccino mugs.