... about 50% by using UTF‐8. Although the Standard Compression Scheme for Unicode (SCSU) can compress Unicode strings to the size ...
Topic(s): Particle accelerators and beam dynamics
2001 - Wiley | Software Practice and Experience
... SMS message by changing the coding of Arabic characters from Unicode to a Base64 coding scheme and developing a runt version of the lossless Huffman coding scheme. Examples are shown where applying the text compressor to short message services offers more than three times the capacity of a standard message.
Topic(s): Cryptographic Implementations and Security
2009 - | Maǧallaẗ al-rāfidayn li-ʿulūm al-ḥāsibāt wa-al-riyāḍiyyāẗ/Al-Rafidain journal for computer sciences and mathematics
This paper describes a sorting algorithm for Bengali texts; sorting is one of the most vital tasks in Bengali Natural Language Processing. As Unicode is much preferable to ASCII encoding, we need to use this representation for the Bengali language. But due to some distinct properties of the Bengali language, texts cannot be sorted directly using the order of the Unicode character scheme. A few works have addressed this topic; some of them are for ASCII encoding while others are for Unicode. But still ...
Topic(s): DNA and Biological Computing
2016 - | International Journal of Computer Applications
Pitambar Behera, Atul Kr. Ojha, Girish Nath Jha
Low-density languages are also known as lesser-known, poorly-described, less-resourced, minority or less-computerized languages because they have fewer resources available. Collecting and annotating a voluminous corpus for NLP applications in these languages proves to be quite challenging. To develop any NLP application for a low-density language, one needs an annotated corpus and a standard annotation scheme. Because of their non-standard usage in text and ...
Topic(s): Handwritten Text Recognition Techniques
2018 - Springer Science+Business Media | Lecture notes in computer science
Roberto Grossi, Jeffrey Scott Vitter
... size, such as in ASCII or Unicode. On the other hand, these indexes support fast ... both time and space of previous indexing schemes. Listing the pattern occurrences introduces a sublogarithmic slowdown ...
Topic(s): DNA and Biological Computing
2005 - Society for Industrial and Applied Mathematics | SIAM Journal on Computing
The growth of information technology has played a great role in connecting the world. The to and fro of information is common in this world. Fonts play a key role in this communication process in the digital domain. A common encoding scheme for a language helps in lossless digital communication. Indian fonts are lacking in this respect, as no Indian font has a standard encoding format for mapping characters. Numerous Indic fonts were created with diverse mapping schemes. Gurmukhi as one of the prominent ...
Topic(s): Computational Physics and Python Applications
2015 - | International Journal of Computer Applications