Word segmentation is the process of determining word boundaries in a given text. Urdu text is written as a sequence of ligatures with no explicitly marked word boundaries. The word segmentation system converts a sequence of ligatures into the best sequence of words: it takes a sequence of ligatures as input and outputs a space-separated sequence of words with 97.9% accuracy. The system is statistically trained on a one-million-word corpus.
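The idea of choosing the "best sequence of words" for a ligature sequence can be illustrated with a small sketch. The description above does not say which statistical model the system uses, so the code below assumes a simple unigram word model with dynamic programming. The names `TOY_CORPUS`, `train_unigram`, `segment`, `max_len`, and `unk_logprob` are hypothetical, and the four-word toy corpus stands in for the one-million-word training corpus.

```python
import math

# Toy training data (hypothetical): each word is listed as its ligatures.
TOY_CORPUS = [
    ["پا", "کستا", "ن"],
    ["ز", "ند", "ہ"],
    ["با", "د"],
    ["پا", "کستا", "ن"],
]


def train_unigram(corpus):
    """Estimate log-probabilities of words from relative frequencies."""
    counts = {}
    for ligatures in corpus:
        word = "".join(ligatures)
        counts[word] = counts.get(word, 0) + 1
    total = sum(counts.values())
    return {word: math.log(c / total) for word, c in counts.items()}


def segment(ligatures, logprob, max_len=5, unk_logprob=-20.0):
    """Return the highest-scoring space-separated word sequence.

    best[i] holds the best score for the first i ligatures, and
    back[i] holds the start index of the last word on that path.
    """
    n = len(ligatures)
    best = [-math.inf] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = "".join(ligatures[start:end])
            if word in logprob:
                score = logprob[word]
            elif end - start == 1:
                # Assumption: an unseen single ligature is kept as its
                # own word with a fixed penalty, so a path always exists.
                score = unk_logprob
            else:
                continue
            if best[start] + score > best[end]:
                best[end] = best[start] + score
                back[end] = start
    # Walk the back-pointers from the end to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append("".join(ligatures[start:end]))
        end = start
    return " ".join(reversed(words))


if __name__ == "__main__":
    model = train_unigram(TOY_CORPUS)
    print(segment(["پا", "کستا", "ن", "ز", "ند", "ہ", "با", "د"], model))
```

A system of the kind described would likely use a richer model (for example, word n-grams) and handle unseen words more carefully; the sketch only shows the general shape of the search over candidate word boundaries.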
In the “ان پٹ” (Input) field, enter the Urdu ligatures separated by spaces, then press the “تخصیص کریں” button. The system processes the ligatures and displays the best sequence of words in the “آؤٹ پٹ” (Output) text field.