Project 4: Decoding Text

Given a binary prefix tree, decode a list of messages.

Both prefix encoding and LZW are widely used in data compression, and are a part of many real-world standards such as GIF, JPEG, MPEG, MP3, and more. Prefix codes are variable-length encoding, fixed in advanced, perhaps by some a priori knowledge of the frequency of the characters.

For example, the binary tree.

encodes the characters ABCDE (the leaves of the tree). Each character has a fixed bit string representing it. But the bit strings are possibly different lengths. Here is a table of the encoding represented by the tree above.
character  encoding
   A        1
   B        00
   C        011
   D        0100
   E        0101 
Each character is encoded by a string of binary digits. Notice that no character is a prefix of another.

We use the preorder traversal of the binary tree to represent the tree as a string. Internal nodes are labeled with the special character '*'. (For convenience, we restrict ourselves to messages that do not contain this special character.) The preorder traversal of the above tree is:



Read the input from the standard input stream. The first line of input is the binary tree encoding two or more printable US-ASCII characters (but not '*'). Every internal node has two branches. The rest of the input consists of zero or more lines of binary digits. Each line is a single message of the given characters encoded correctly.



Decode each message and print the US-ASCII characters of the decoded message on its own line.

Information on the net

Be sure to eliminate all warnings detected by -Wall before submitting your project. Do not use the do construct, records, or $. Prove you know the basics by using the more common constructs, otherwise one might assume you are mimicing other Haskell code you do not understand.

Turning it in

Usually, the file should be named after the module name, replacing dots in the module name by directory separators. For example, on a Unix system, the module A.B.C should be placed in the file A/B/C.hs, relative to some base directory. If the module is not going to be imported by another module (Main, for example), then you are free to use any filename for it.

Turn in the source code for your project in a file named main.hs. Your project will be compiled like this:

ghc -Wall --make main.hs

Write the program as clearly as possible, including reasonable comments. You may work alone or with one other student in the class, as long as you did not work with this partner in any of the previous projects. It is important to get the output formatted correctly, as the output must conform character by character to the specification. Incorrectly formatted output will fail the test cases. If there is any question about the output, please ask.

Turn in the Haskell source code for the program using the Submit Server. The project tag is proj4. In the comments somewhere at the beginning of your source file include a header like the following:

 - Author: name1, e-mail address
 - Author: name2, e-mail address
 - Course: CSE 4250, Fall 2017
 - Project: Proj4, Decoding Text
 - Language implementation:

All programs are analyzed for similarity with other programs, both current and past. Students whose programs are very similar to other sources will receive no credit. This policy is necessary to ensure that students take reasonable action to avoid and prevent plagiarism, and to ensure the proper recognition of independent effort. Without student cooperation individual grades would have less meaning and the incentive for learning decreases. If you have evidence of academic misconduct, you should bring it to the attention of your instructor, the head of the department, or Dean of the school of engineering.