Mentors: Hrishi, copyninja
The project aims to tackle the problems that the agglutinative nature of Indic languages causes for search and spellchecking. The implementation in this project targets Malayalam. The project also covers a REST API for libindic modules.
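To illustrate why agglutination hurts dictionary-based spellchecking, here is a minimal sketch. It is not the project's implementation: the lexicon, the function names, and the English placeholder tokens are all assumptions made for illustration. It also ignores the character changes that real sandhi introduces at the join, and only handles plain concatenation of two words.

```python
# Toy lexicon of root words. English placeholders, not real Malayalam data.
LEXICON = {"tree", "branch", "flower"}


def naive_is_known(word, lexicon=LEXICON):
    """Baseline: plain dictionary lookup with no splitting."""
    return word in lexicon


def candidate_splits(word):
    """Yield every (left, right) pair obtained by cutting the word once."""
    for i in range(1, len(word)):
        yield word[:i], word[i:]


def is_known(word, lexicon=LEXICON):
    """Accept a word if it is in the lexicon or splits into two known words.

    Hypothetical helper: real sandhi also alters characters at the
    boundary, which this sketch does not model.
    """
    if word in lexicon:
        return True
    return any(left in lexicon and right in lexicon
               for left, right in candidate_splits(word))


if __name__ == "__main__":
    compound = "treebranch"
    print(naive_is_known(compound))  # plain lookup rejects the compound
    print(is_known(compound))        # split-aware lookup accepts it
```

The plain lookup flags the joined form as a misspelling even though both parts are valid words, which is the failure a splitter is meant to remove.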
Subprojects
- sandhi-splitter for Malayalam
- Improvements to spellchecker using sandhi-splitter
- libindic REST API
Target Milestones
Community Bonding Period
- Semi-automated annotation tool
- Feasibility study - non-rule-based splitting/joining
- Initial Setup
- Standard - PEP-8
- Packaging - pbr
- Testing Tool - testtool
Mid Term Evaluation
- Sandhi splitter
- split point identification
- splitter
- joiner
- Spellchecker Integration
Final Evaluation
- REST API for libindic
- Finishing touches.
Links
- sandhi-splitter, working GitHub repo under libindic.
- Proposal Draft - Google Docs, view only.