In the past week or so Hank, John and Ken have let slip some of the improvements we’re planning for V3. In this milestone blog post (ok, I’m overstating things, but it *is* my first) I want to talk about some of the backend changes we’re instituting in the run-up to V3, and start a conversation about what this can bring us in the post-V3 era.
As John mentioned in his last post, one of the bigger changes that will happen with V3 is the shift to having customized pinyin popups for the dialogue and expansion sentences. The basic idea is to provide more useful and contextually clear explanations. This doesn’t mean taking away the ability for readers to access all of the other definitions for any particular word, but it does help us be as specific as we need to be to explain any single sentence. This has required some significant changes to the way we develop lessons on the backend as well as a lot of editing work. It is one of the major improvements that is taking up a lot of the academic team’s time.
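To make the idea concrete, here is a minimal sketch of what a per-sentence popup record could look like. It is not our actual backend schema: the `Token` class, the `popup` method and the tiny `DICTIONARY` are all hypothetical names invented for illustration. The point is only that each word in a sentence carries one contextual definition while the full dictionary entry stays reachable.

```python
from dataclasses import dataclass

# Hypothetical mini-dictionary: every sense on record for a word.
DICTIONARY = {
    "开": ["to open", "to start", "to drive", "to boil"],
    "户": ["account", "household", "door"],
}


@dataclass
class Token:
    """One word of a lesson sentence, annotated for its context (assumed model)."""
    hanzi: str
    pinyin: str
    gloss: str  # the single definition chosen for this sentence

    def popup(self) -> dict:
        """Return the contextual gloss plus the remaining dictionary senses."""
        others = [d for d in DICTIONARY.get(self.hanzi, []) if d != self.gloss]
        return {"pinyin": self.pinyin, "gloss": self.gloss, "other_definitions": others}


# The two words of 开户 ("to open an account"), each with its contextual gloss.
sentence = [
    Token("开", "kāi", "to open"),
    Token("户", "hù", "account"),
]

for token in sentence:
    print(token.hanzi, token.popup())
```

The popup shows the specific gloss by default, and the other definitions remain one lookup away.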
The result of this is more than better popups, though. What it gives us (you) is deeper knowledge of our Chinese content. There are a number of reasons why this matters pedagogically. Practically, it also gives us a much more flexible system for designing new games, exercises and other site features. Here are three benefits off the top of my head:
(1) Better Support for Traditional Chinese:
We’ve been getting a lot of email from people expressing concern about continued support for traditional characters. It almost feels like Ken declared traditional Chinese dead without telling us. By ensuring that we have all the necessary data on traditional characters for our materials, the new system should provide a stronger base for supporting traditional Chinese moving forward.
(2) Better Exercises and Games:
Most user-oriented Chinese language learning exercises are either flashcards or flashcards-in-disguise. Look past the interface and you’ll notice that gameplay consists mostly of recognizing a Chinese word and matching it to either the pinyin or the English. There isn’t a lot of variety in these sorts of exercises even if we add audio and switch things up. A word-by-word understanding of sentence content is the first step in building games with different sorts of fundamental gameplay. I have no idea what these will be. But if anyone has suggestions please let us know.
(3) Better Site Search
It’s hard to find lessons about “opening a bank account” in Chinese if you don’t already know how to say “bank” or “account” or “are my deposits insured” in the language. Tagging and English-language site introductions are helping make Chinese content accessible now and in V3. In the long term we will want better bilingual search, which requires very fine-grained knowledge of lesson content.
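As a rough sketch of why word-level data helps search: if every lesson word carries an English gloss, an index can map both the Chinese word and the words of its gloss to the lesson. Everything below is an assumption for illustration (the lesson titles, the `build_index` and `search` helpers, and the sample data); it is not how site search is actually implemented.

```python
from collections import defaultdict

# Hypothetical lessons: each is a list of (hanzi, English gloss) pairs.
LESSONS = {
    "Opening a Bank Account": [("银行", "bank"), ("开户", "open an account"), ("存款", "deposit")],
    "Ordering Noodles": [("面条", "noodles"), ("碗", "bowl")],
}


def build_index(lessons):
    """Map every Chinese word and every English gloss word to lesson titles."""
    index = defaultdict(set)
    for title, tokens in lessons.items():
        for hanzi, gloss in tokens:
            index[hanzi].add(title)
            for word in gloss.lower().split():
                index[word].add(title)
    return index


def search(index, query):
    """Return the lessons that match every term in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    matches = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(matches)


index = build_index(LESSONS)
print(search(index, "bank account"))
print(search(index, "银行"))
```

An English query and a Chinese query land on the same lesson because both go through the same word-level annotations.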
That’s it off the top of my head. None of this will be ready for the V3 launch, but it is all in the pipeline for the long term. Thoughts and suggestions are invited as always.
David Lancashire
[Ps, I thought I would include this pic of David solving the pop up question in an unusual fashion. Ken Carroll, ed.]

Jason S Says:
March 20th, 2007 at 11:46 pm
I would love to see a game of sorts in which the user is asked questions and is required to respond in written Chinese. (Er…typed)
I’m not sure how hard something like this would be to program, but it might be easier (and even more beneficial for the user) to have a word bank from which the user is required to take their vocab. (Which could increase as the game progresses, etc. Something like that)
I think the possibilities for a game like that are endless. You could move along a story, meet different characters, etc. All the while participating in different sorts of mini-games.
I think something like this might require utilizing a ’save’ function, which would be all the cooler.
Just some ideas off the top of my bald head.
Jeff Says:
March 21st, 2007 at 12:17 am
I agree that one of the keys to a successful game would be to get away from the flashcard approach and towards a more cognitive approach. Taking Jason’s idea of written responses, an incredibly simple (but incredibly useful) game might be to have a character spit out sentences (in English) at you, which you translate into Chinese and type. Then you could compare your written sentence with the correct one. I realize that there are many ways to translate a sentence, but knowing at least 1 correct way would be useful. The game could have different levels of complexity, much like the expansion sentences for each lesson. By the end of the game the user could have translated an entire short story or dialogue. Who knows? Just throwing some ideas out there.
海宁 / Henning Says:
March 21st, 2007 at 1:35 am
There are tons of possibilities which might impose more or less strong demands on the database schema and on algorithmic capabilities. Here is an unordered quick & dirty idea collection:
- Ordering words/phrases by degree of similarity
- Ordering words/phrases by level of formality
- Ordering words/phrases based on emotional impact (neutral –> heavy)
- Word domino - align words with similar meaning
- Grouping words/phrases of similar domains
- Drag/drop vocab on respective picture areas
- The “directions on a map” lesson as an interactive game with different levels of difficulty
- Crossword puzzles (with Chinese definitions!)
- Identification of synonyms and antonyms
- Interactive menu to select a nice little meal
- Vocab Tetris (like the game in the dictionary at http://www.xuexizhongwen.de/)
- Games that are geared at repeating groups of the same semantic category (colors, forms, sizes etc.): the “colored ball” lesson in game form
- Games built around the idea of understanding increasingly complex Mandarin commands (either Hanzi or Audio)
- Text Adventures - “Monkey Island” Style
- Grammar/vocab/characters: presenting wrong sentences (e.g. user posts) with corrections (after the user finished his own correction)
- Chinese Scrabble (needs a big vocab base in the back)
- Dictation (either pure or camouflaged as a game - type the code word!)
- Differentiation of similar characters
- Stroke order test
- Using some kind of Chinese voice recognition software to train pronunciation - this could be used on its own or built into games
Ifung Lu Says:
March 21st, 2007 at 1:45 am
Scrabble!!! :p
Ifung Lu Says:
March 21st, 2007 at 1:58 am
Regarding traditional Chinese, thank you for working on improving this. For me, being an American of Chinese / Taiwanese descent (currently living in Britain… funny how things work out), it’s both a matter of historical pride as well as practicality why I want to be able to read traditional script in addition to simplified.
Beyond the few characters I am currently able to read, it’s nice to be able to figure out a new word’s meaning by looking at the radicals that compose the character, and then deducing its meaning from that as well as the context in the sentence.
Thanks again on working to improve traditional character support.
hanyu_xuesheng Says:
March 21st, 2007 at 3:04 am
I would like to thank you for your efforts to improve the traditional character support in CP. After relaunch I will make another test run. If my expectations and hopes (after all these v3 announcements) are met, I will subscribe to CP.
Thanks again.
Henning Says:
March 21st, 2007 at 3:27 am
Ifung Lu:
Scrabble might indeed be technically feasible.
As far as I know, Google has a SOAP API.
You could use that to check the validity of an answer. Impose some threshold like “100 Google hits on the complete input counts as a correct answer.”
The software opponent, however, would have to use the CPod glossary…
I do not know about possible legal implications of such a solution.
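The threshold rule above can be sketched in a few lines. This is only an illustration: `is_valid_play` is a made-up name, the hit counter is passed in as a plain function, and the numbers in `FAKE_HITS` are invented stand-ins. No real search API is called here, and a real one would have its own interface and terms of use.

```python
from typing import Callable


def is_valid_play(text: str, hit_count: Callable[[str], int], threshold: int = 100) -> bool:
    """Accept a play if the complete input gets at least `threshold` hits."""
    return hit_count(text) >= threshold


# Invented hit counts, standing in for whatever a search service would report.
FAKE_HITS = {"银行": 250_000, "行银": 12}


def fake_hit_count(text: str) -> int:
    """Hypothetical stand-in for a call to a search service."""
    return FAKE_HITS.get(text, 0)


print(is_valid_play("银行", fake_hit_count))
print(is_valid_play("行银", fake_hit_count))
```

Keeping the hit counter as a parameter means the same rule could be tested offline or pointed at a glossary instead of a search engine.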
dave lancashire Says:
March 21st, 2007 at 11:19 am
This is great feedback, especially with Henning’s long list. I’m going to print this one out and stick it up by the tech team. Maybe it can inspire us. Scrabble would be an interesting game to mix with Chinese.
@Ifung - there are fanti champions on the inside too. Maybe we can look at fanti/jianti matching games as well.
Michael Butler Says:
March 23rd, 2007 at 11:21 am
David,
My suggestion for a fun game that encourages people to use and remember what they have learned is an Economist style quiz where people are asked about lessons over the past month (they do it weekly).
Users are asked questions about podcasts over the past month. You could ask people to complete the quiz in, say, a 24-hour window.
People who got all the answers correct would be automatically entered into a drawing for some kind of swag or…? Perhaps there would only be one clear winner but my guess is that more than one person would get every answer correct. To do well on this requires that you listen to every podcast over the course of a month. Heh, isn’t it neat that a game would reward diligence?
For example:
In one lesson Ken asked Jenny the following: …….. and Jenny answered as follows:
a.
b.
c.
d.
This all meets my criteria for a game which is to recycle, reapply, or review what was already learned or, more aptly, covered.
Michael Butler Says:
March 23rd, 2007 at 1:14 pm
David,
I realized I missed explaining one step. In A-D above the listener would hear 4 answers, only one of which would be correct.
dave lancashire Says:
March 23rd, 2007 at 11:57 pm
@Michael,
Review-based games are a good idea. I think the critical thing from the development perspective, as Henning mentioned, is going to be the distinction between algorithmic intelligence and database-driven intelligence.
We should be well-positioned to take advantage of database-driven intelligence in creating games and study tools. I don’t see anyone else doing anything really algorithmic in the short term because POS analysis for Chinese is so difficult computationally. I’ll pass the suggestion forward to the Saturday Show, maybe they’d like to do something like this?
–dave
Michael Butler Says:
March 24th, 2007 at 11:35 am
Dave,
I’m fascinated by your reply. Oh, BTW I agree that this game idea makes sense on a Saturday show first.
I’m confused by what you mean by database driven and algorithmic computation in regards to games. Maybe this is beyond me but I see both used in tandem in creating games (in reading and writing data and in simple/complex chains of If-then and more highly evolved statements).
I also think I understand what you are saying about POS analysis but not all games need to do POS analysis on the fly. If you limit the user decisions ahead of time then you can limit their choices to things you have analyzed in advance (especially easy for beginners).
At the edges, in terms of so-called artificial intelligence, I understand how you might need to solve the problem about POS but short of that I don’t see why database driven and algorithmic computation can’t be used in tandem especially if applied to user actions as opposed to raw language input.
BTW here is an interesting list of games. I find the categories very instructive. http://flashchild.com/all-games/
dave lancashire Says:
March 25th, 2007 at 2:31 pm
@Michael,
Thanks for the link to the page of differently categorized flash games. Really useful. I thought it was funny to see that bust-a-move had its own category of gameplay… and then had a flash of recognition a moment later that they’re right.
What I meant by the database-driven approach is having the intelligence in the game (the determination of what constitutes a correct answer, or a branch in game choice) specified in an external resource (i.e. pulling various sample sentences, definitions, etc. from a backend database or corpus of texts, recordings, etc.). This content could be manipulated in relatively simple ways by the game engine. In contrast, the algorithmic approach would put the onus of creating game branches and correct and incorrect answers on the logic of the game engine itself.
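Here is a minimal sketch of the database-driven side of that distinction, under stated assumptions: the `PROMPTS` dictionary stands in for rows pulled from a backend database, and `normalize` and `check_answer` are hypothetical helpers. The engine knows nothing about grammar. It only compares the player’s input against answers that editors have already approved.

```python
# Hypothetical content records, standing in for rows from a backend database.
PROMPTS = {
    "I want to open a bank account.": {
        "我想开一个银行账户。",
        "我要开一个银行账户。",
    },
}


def normalize(answer: str) -> str:
    """Strip surrounding whitespace and trailing sentence punctuation."""
    return answer.strip().rstrip("。.!！?？")


def check_answer(prompt: str, answer: str) -> bool:
    """Correct means the input matches one of the stored, pre-approved answers."""
    accepted = {normalize(a) for a in PROMPTS.get(prompt, set())}
    return normalize(answer) in accepted


prompt = "I want to open a bank account."
print(check_answer(prompt, " 我想开一个银行账户 "))
print(check_answer(prompt, "我银行开想"))
```

The trade-off is visible in the data: a valid translation that nobody stored is marked wrong, but nothing stored is ever judged by a fallible grammar engine.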
The problem isn’t really the ability to handle relatively simple (or even quite complex) IF-THEN chains so much as the way Chinese grammar is so open-ended and flexible that it would be difficult to hit the levels of reliability that I imagine are necessary to make a game a good learning/review experience. Would people want to play a game that made a mistake every now and then, or (worse?) couldn’t recognize correct answers because it didn’t fit the engine’s pre-coded understanding of grammar structures? I’m not sure.
A bit off topic, but I think it’s revealing that the most effective approach to machine translation being used in the C-E space right now is the statistical machine translation approach Google has adopted, which basically ignores grammar completely in favor of subphrase-matching on enormous corpora.
Michael Butler Says:
March 26th, 2007 at 12:04 pm
David,
Interesting reply and on the weekend at that!
Correct me if I’m wrong, but the advantage of having in-game intelligence is two-fold. First, there is less stress on the back-end due to reduced usage and second, and more importantly, there is less/no need for a human to program or account for every possible feedback/answer.
Moving on, I guess one of your big questions deals with the issue of branching. How many branches do you want to allow for and what kind of language manipulation do you expect to be doing?
My own feeling is that to get a sense of what is possible you need to begin at a simple level and add complexity as your capabilities improve. I’m interested to see, for example, how you could improve on your existing game (making it richer or more complex). My own gut feeling is that you started with a “game type” that to be successful needs to go down the path of other RPG-like games. That is a daunting task.
My other feeling is that you need to work on a more micro level at first. Do you have a game for using counters? A game for cardinal and ordinal numbers? A game for when to use certain fixed lexical phrases correctly (this being close to Ken’s heart)?
Just thinking out loud here.
Michael
Marc Says:
April 25th, 2007 at 8:14 pm
I just had a closer look at the new type of exercises. The scoring facility looks promising, but it fails to deliver, because other than the score for a particular exercise there doesn’t seem to be any other feedback.
Usually I pay a lot of attention to the exercises, so I often get the maximum score. Now I got curious and I deliberately entered a mistake in the exercise I was doing. The score was correct (9/10), but what was my mistake? There is no way I could tell, and trying to find the mistake by trial and error isn’t very interesting. In the V2 website it was either impossible to make a mistake (the drag and drop exercises) or it was signalled immediately (the connect exercises). I hope we will get something similar in V3.
Marc
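The kind of feedback Marc describes, a score plus a pointer to the item that went wrong, could be sketched roughly as follows. The `grade` function and the question and answer labels are hypothetical. This is not how the V3 exercises are actually scored.

```python
def grade(answers: dict, key: dict) -> dict:
    """Score an exercise and list each wrong item with the expected answer."""
    mistakes = {
        question: {"given": answers.get(question), "expected": correct}
        for question, correct in key.items()
        if answers.get(question) != correct
    }
    return {"score": len(key) - len(mistakes), "total": len(key), "mistakes": mistakes}


# Ten made-up questions with one deliberate mistake, as in the 9/10 case above.
key = {f"q{i}": f"a{i}" for i in range(1, 11)}
answers = dict(key, q4="wrong")

report = grade(answers, key)
print(f"{report['score']}/{report['total']}")
print(report["mistakes"])
```

Returning the mistakes alongside the score lets the interface show which item to revisit instead of leaving the learner to find it by trial and error.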