Workshop on

"Shallow Parsing in South Asian Languages"

Downloads & References

  • The proceedings of the workshop can be downloaded here.

  • The proceedings of the NLPAI Machine Learning Contest-2006 on POS tagging & Chunking for Indian languages can be accessed here.

  • Download Unicode fonts for Indian languages here.

  • The guidelines for chunking can be found here.

  • The guidelines for POS annotation can be found here.

  • The training data has been released in Shakti Standard Format (SSF). A short note on SSF is here. For a detailed description of SSF, please refer to Section 4 of this PDF.

  • The data provided is in UTF-8 encoding. Read more about UTF-8 and converting it to ASCII and back.
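
For readers who want to try the UTF-8 to ASCII round trip mentioned above, the following is a minimal sketch in Python. It assumes an ASCII-safe representation based on Python's built-in `unicode_escape` codec, which is not necessarily the conversion scheme described on the linked page. The function names `to_ascii` and `from_ascii` and the sample sentence are hypothetical, chosen only for illustration.

```python
def to_ascii(text: str) -> str:
    """Return an ASCII-only form of text using backslash escapes.

    Assumption: backslash escapes (e.g. \\u0915) are an acceptable
    ASCII representation; the workshop data may use another scheme.
    """
    return text.encode("unicode_escape").decode("ascii")


def from_ascii(escaped: str) -> str:
    """Invert to_ascii, recovering the original Unicode text."""
    return escaped.encode("ascii").decode("unicode_escape")


def read_utf8(path: str) -> str:
    """Read a whole file, decoding it explicitly as UTF-8."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# Hypothetical sample sentence in Devanagari script.
sample = "राम घर गया"
escaped = to_ascii(sample)
restored = from_ascii(escaped)
```

Passing `encoding="utf-8"` explicitly in `read_utf8` avoids depending on the platform's default encoding, which may differ from the encoding of the released data.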

References

  1. Daniel Jurafsky and James H. Martin, Speech and Language Processing, Prentice-Hall, 2000. (Indian edition available from Pearson.)

  2. C.D. Manning and Hinrich Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999.

  3. Akshar Bharati, Rajeev Sangal and Vineet Chaitanya, Natural Language Processing: A Paninian Perspective, Prentice-Hall of India, New Delhi, 1995.


