Development of a Lexical Platform for Distant Languages
#TCLT8
2014-jun-07
@edouard_lopez
About me (我)
↖That's where I live and work
- Languages: French, English, Spanish, Chinese, Japanese
INALCO
- Best French University to study foreign languages.
- ~93 languages
- poorly structured/accessible data
- Collaboration with one of their PhD students.
- Project in early stage
- using Chinese as pilot language
Importance of Lexicon
- Lexical competence
- elementary building blocks;
- critical in foreign language acquisition.
- Chinese (Foreign) Language
- can be opaque;
- vocabulary growth is hard to sustain for advanced learners.
- Great need to provide the right stimulus at the right moment.
But how?
- Limited human resources
- Classes of 30+ people
- Each different from the others
- Limited time
- Limited knowledge (we are humans after all)
- Use technology, right?
- There is a lack of electronic resources in French
- A dictionary as a .doc file? Please don't: ask your IT staff, they can help
- Writing a dictionary: what format to use?
- it does NOT matter to the end user
Platform Stack
- Disclaimer
- Work in progress;
- Throwaway version;
- we are exploring, testing, failing… to improve!
- Some technical principles we want:
- cross-platform;
- flexibility;
- openness;
- plug-n-play;
- user-tailored content.
Back-end
- Database (MySQL)
- pros: widespread, easy to deploy
- cons: inconsistent behavior (types, UTF-8), no full-text search (FTS) before 5.6, slow at 300k entries.
- Server Application (PHP/CodeIgniter)
- pros: widespread, light, fast, easy to deploy
- cons: slow & dirty
- API (JSON)
- pros: loose coupling (REST), human- and machine-friendly, widespread, lightweight.
- cons: v1.0
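To illustrate the JSON API, here is a minimal sketch of how a client might parse and check one lexical entry. The field names `ort`, `pho`, and `def` come from the DSL slide. The sample values, the meaning given to each field, and the `parseEntry` helper are assumptions for illustration, not the platform's actual schema.

```javascript
// Hypothetical payload: the field names appear in the DSL slide,
// but the values and their interpretation are assumed here.
const samplePayload = JSON.stringify({
  ort: "你好",       // assumed: written form
  pho: "ni3 hao3",  // assumed: pronunciation as numbered pinyin
  def: "bonjour"    // assumed: definition in French
});

const REQUIRED_FIELDS = ["ort", "pho", "def"];

// Parse a JSON string and verify that every required field
// is present as a non-empty string.
function parseEntry(json) {
  const entry = JSON.parse(json);
  for (const field of REQUIRED_FIELDS) {
    if (typeof entry[field] !== "string" || entry[field].length === 0) {
      throw new Error("Missing or invalid field: " + field);
    }
  }
  return entry;
}

const entry = parseEntry(samplePayload);
console.log(entry.ort + " [" + entry.pho + "] " + entry.def);
```

Because the payload is plain JSON, the same check could run unchanged in a browser, a PhoneGap app, or a script.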
Front-end
- Web Browser
- we use AngularJS by Google
- goal: provide a high-level DSL for end users
- Mobile
- Default webapp is mobile-friendly
- PhoneGap (see Hugo's presentation)
- web → native
- works offline
- Others
- Whatever you want: we've got an API!
DSL
<span>{{entry.def}}</span>
<span>{{entry.ort}}</span>
<span>{{entry.ortx1}}</span>
<span>{{entry.pho}}</span>
<span>{{entry.pho | pinyin}}</span>
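The `pinyin` filter in the last template line is not defined on the slide. The sketch below shows one way such a filter could work, assuming that `entry.pho` holds numbered pinyin (for example `ni3 hao3`) and that the filter's job is to render tone marks. The function names and the `lexicon` module name are hypothetical.

```javascript
// Tone-marked forms of each vowel, in tone order 1 to 4.
const TONE_MARKS = {
  a: "āáǎà", e: "ēéěè", i: "īíǐì",
  o: "ōóǒò", u: "ūúǔù", "ü": "ǖǘǚǜ"
};

// Convert one numbered syllable, e.g. "hao3" -> "hǎo".
// Output is lowercased; "v" and "u:" are accepted as spellings of "ü".
function markSyllable(syllable) {
  const match = /^([a-zü:]+)([1-5])$/i.exec(syllable);
  if (!match) return syllable;           // not numbered pinyin: leave untouched
  const letters = match[1].toLowerCase().replace(/u:|v/g, "ü");
  const tone = Number(match[2]);
  if (tone === 5) return letters;        // neutral tone carries no mark
  // Placement rule: "a" or "e" takes the mark; otherwise the "o" of "ou";
  // otherwise the last vowel of the syllable.
  let index = letters.search(/[ae]/);
  if (index === -1) index = letters.indexOf("ou");
  if (index === -1) {
    for (let i = letters.length - 1; i >= 0; i--) {
      if (TONE_MARKS[letters[i]]) { index = i; break; }
    }
  }
  if (index === -1) return letters;      // no vowel found
  const marked = TONE_MARKS[letters[index]][tone - 1];
  return letters.slice(0, index) + marked + letters.slice(index + 1);
}

// Convert a whitespace-separated string of numbered syllables.
function pinyin(input) {
  return String(input || "").split(/\s+/).map(markSyllable).join(" ");
}

// In an AngularJS app, the function could be exposed as the `pinyin` filter.
// The guard keeps this sketch runnable outside a browser.
if (typeof angular !== "undefined") {
  angular.module("lexicon", []).filter("pinyin", () => pinyin);
}
```

Keeping the conversion in a filter means the API can store a single machine-friendly form of the pronunciation while each template decides how to display it.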