File Decoding Program

Last update: 4/19/02

Instructors Only

Strategy: I let the students analyze the files and try to figure out the locations of the encoded bits. They eventually discover that the first two bytes of each file contain the 16-bit file size, and the third byte indicates the positions of the encoded bits. For example, in encryp1.bin, the first two bytes contain 0018h, and the third byte is 43h. The "43h" indicates that the first encrypted bit is in position 3, and the second encrypted bit is in position 4. The next four bytes following this contain the eight bits in the letter "T" (54h):

 11100011  11110011  11110011  11110011
 
	This produces 54h:  01010100

And so on, through the file. I require their program to correctly decode three test files (encryp1.bin, encryp2.bin, and encryp3.bin), and I also test their program with a fourth file that students are not allowed to see (encryp4.bin).
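
As a minimal sketch of reading the third header byte, the instructions below split it into its two 4-bit halves. This follows the 43h example above, where the low digit (3) is the first position and the high digit (4) is the second. The names posByte, firstPos, and secondPos are hypothetical and are not taken from any solution program.

```asm
.data
posByte   BYTE 43h          ; third header byte (value from encryp1.bin)
firstPos  BYTE ?            ; hypothetical: position of first encoded bit
secondPos BYTE ?            ; hypothetical: position of second encoded bit
.code
	mov al,posByte          ; AL = 43h
	mov ah,al               ; keep a copy in AH
	and al,0Fh              ; AL = low digit  (03h)
	mov firstPos,al
	mov cl,4
	shr ah,cl               ; AH = high digit (04h)
	mov secondPos,ah
```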

Evaluation: I received very favorable responses from students, who said that they enjoyed having to figure out the algorithm completely on their own. This was their final project for the semester, and they were given 3 weeks to finish. A few students finished within 3 days. About three days before the due date, to help those who were completely lost, I let students know how to read the 3-byte header of each file.


Scenario

You're a programmer for an agency whose name is so secret that we cannot even mention it here. You have been given several binary files to decode, each containing a different message. You have found out, through intelligence sources, that each byte in the encoded messages contains two bits that have been extracted from the original message. The remaining 6 bits in each encoded byte are meaningless. The problem is, you don't know which 2 bits are being used, and you have learned that the positions of the two bits are different in each of the encoded files. Fortunately, you do know that each file contains heading information regarding the message size and the positions of the two encoded bits. Somehow, you will have to study the file headers and figure out how to use them.

Encoding Method

Suppose a message byte contained 01000001b (ASCII code for capital A). If the encoded file used bits 0 and 1 to encode this first byte, the first four bytes of the file would contain the following binary values:

xxxxxx01  xxxxxx00  xxxxxx00  xxxxxx01

Note that I've used x's to represent the meaningless bit positions. Also, note that the message byte was encoded in little endian order (bits 0-1 were first, then bits 2-3, and so on). The files you decode will not necessarily use bit positions 0 and 1.
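
To make the example concrete, here is a minimal sketch that rebuilds the letter A from four encoded bytes, assuming (as in this example only) that bits 0 and 1 carry the message. The sample values in encoded, and the names encoded, decoded, and NextPair, are made up for illustration; the x positions are filled with arbitrary bits.

```asm
.data
encoded BYTE 10110101b, 11001100b, 01110000b, 00011001b
decoded BYTE ?
.code
	mov si,OFFSET encoded   ; point to the first encoded byte
	mov cx,4                ; four encoded bytes per message byte
	mov bl,0                ; BL accumulates the message byte
NextPair:
	mov al,[si]             ; get the next encoded byte
	and al,00000011b        ; keep bits 0 and 1, clear the rest
	ror al,1                ; rotate the pair up into bits 6-7
	ror al,1
	shr bl,1                ; slide earlier pairs down two places
	shr bl,1
	or  bl,al               ; insert the new pair at the top
	inc si
	loop NextPair
	mov decoded,bl          ; decoded = 41h, the letter A
```

Because each new pair enters at the top and earlier pairs slide down, the first pair read ends up in bits 0-1, which matches the little endian order described above.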

Suggestions:

1. Use a debugger to examine the encrypted files. Write out the binary values of the first eight or ten bytes to see if you can find out which bits form a recognizable word. Four encrypted bytes are used to hold a single byte of the original message.

2. To open a file, use INT 21h function 716Ch. To read a file, use INT 21h function 3Fh. To close a file, use INT 21h function 3Eh. The following instructions open a file for input:

	mov ax,716Ch 			; extended open/create
	mov bx,0 				; mode: read input
	mov cx,0 				; normal file
	mov dx,1 				; action: open existing file
	mov si,OFFSET fileName 	; filename (null-terminated string)
	int 21h 				; call MS-DOS
	jc  CannotOpenFile 		; carry set: open failed
	mov fileHandle,ax 		; save handle
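
Closing the file follows the same pattern. The sketch below assumes fileHandle holds the handle saved by the open call; CannotCloseFile is a hypothetical label for your own error handling.

```asm
	mov ah,3Eh              ; close file handle
	mov bx,fileHandle       ; handle returned by the open call
	int 21h                 ; call MS-DOS
	jc  CannotCloseFile     ; carry set: close failed
```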


3. To read the entire encrypted file into a buffer, use INT 21h Function 3Fh:

.data
buffer DB 5000 dup(?)
.code
	mov ah,3Fh 				; read from file or device
	mov bx,fileHandle 		; input file handle
	mov cx,SIZEOF buffer 	; maximum number of bytes to read
	mov dx,OFFSET buffer 	; buffer pointer
	int 21h 				; call MS-DOS
	jc  ReadError 			; error reading file?
	mov fileSize,ax 		; if not, AX = number of bytes read

4. To get the filename from the user, call the ReadString procedure from the book's link library.
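
A minimal sketch of the call, assuming the 16-bit library version of ReadString takes the buffer offset in DX and the maximum character count in CX, and returns the number of characters entered in AX (check the book's library documentation to confirm). The 80-byte buffer size and the QuitProgram label are arbitrary choices for illustration.

```asm
.data
fileName BYTE 80 DUP(0)             ; filename buffer (size is arbitrary)
.code
	mov dx,OFFSET fileName          ; DX = input buffer
	mov cx,SIZEOF fileName - 1      ; leave room for the null terminator
	call ReadString                 ; assumed to return the length in AX
	cmp ax,0                        ; was a blank name entered?
	je  QuitProgram                 ; hypothetical label: exit the program
```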

5. When writing the solution program, I found the SHR instruction to be useful, particularly the version that uses the CL register as a counter. Imagine that shiftCount is a BYTE variable:

    mov cl,shiftCount
    shr al,cl
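
For instance, a shift followed by an AND mask isolates a pair of bits. This sketch assumes the two bits are adjacent and that shiftCount holds the lower of the two positions; neither assumption is guaranteed for the files you are given, and the sample byte is made up.

```asm
.data
shiftCount BYTE 2           ; assumed: lower position of an adjacent pair
.code
	mov al,10110110b        ; sample encoded byte
	mov cl,shiftCount
	shr al,cl               ; AL = 00101101b (bits 2-3 moved to bits 0-1)
	and al,00000011b        ; AL = 00000001b (all other bits cleared)
```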

Specifications

Your program must do the following steps, in order:

1. Display a program heading that includes your name.

2. Prompt the user for a file name. If a blank name is entered, quit immediately.

3. Attempt to open the file. If the file cannot be opened, return to Step 2. If the file is opened, continue to Step 4.

4. Read the file into an array of bytes.

5. Display the message size (in bytes).

6. Display the decoded message.

7. Return to Step 2.