Huffman Codes Using Binary Files
Getting Started • Last class we extended a program to create a Huffman code and permit the user to encode and decode messages. • We will use that program as our starting point today: • http://www.cse.usf.edu/~turnerr/Data_Structures/Downloads/2011_04_13_Huffman_Codes_with_Binary_IO/ • File Huffman_Code_with_Associative_Map.zip • Download, extract, build, and run.
Program in Action Widen window to 100.
Binary Output • Huffman codes are useful in real life only when we output the coded message as binary. • Let's modify do_encode to output a file. • Start with a text file. • ASCII 1's and 0's • Modify to write a binary file.
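As a rough illustration of why binary output matters, the sketch below compares the file size when each code bit is stored as an ASCII character with the size when eight bits are packed into each byte. The function names are hypothetical and not part of the course program, and the counts ignore any header such as a stored bit count.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Bytes needed when each code bit is written as an ASCII '0' or '1'.
std::size_t text_file_bytes(const std::string& bit_string)
{
    return bit_string.size();             // One byte per bit
}

// Bytes needed when eight code bits are packed into each byte.
std::size_t binary_file_bytes(const std::string& bit_string)
{
    return (bit_string.size() + 7) / 8;   // Round up to a whole byte
}
```

For a 16-bit coded message, the text form takes 16 bytes while the packed form takes 2, so only the packed form actually compresses the message.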
main.cpp • Add at top of main.cpp: #include <fstream> • Modified version of do_encode: • http://www.cse.usf.edu/~turnerr/Data_Structures/Downloads/2011_04_13_Huffman_Codes_with_Binary_IO/do_encode.cpp.txt
Modifications to do_encode()
    void do_encode(void)
    {
        string msg;
        string output_filename;
        ofstream outfile;
        string junk;
        // A default-constructed ofstream reports good(), so test is_open()
        while (!outfile.is_open())
        {
            cout << "File name for output? ";
            cin >> output_filename;
            getline(cin, junk);     // Skip newline char
            outfile.clear();        // Reset error flags from a failed attempt
            outfile.open(output_filename.c_str());
            if (!outfile.is_open())
            {
                cout << "Failed to open output file\n";
                cout << "Please try again\n";
            }
        }
Modifications to do_encode()
        cout << "\n\nEnter message to encode\n";
        getline(cin, msg);
        for (size_t i = 0; i < msg.length(); ++i)
        {
            char next_char = tolower(msg[i]);
            string code = huffman_tree.Encode_Char(next_char);
            cout << code;
            outfile << code;
        }
        cout << endl << endl;
        outfile << endl << endl;
        outfile.close();
        cout << "File " << output_filename << " written\n";
    }
Clean Up Output • Comment out statements that output the tree and the code. • In main():
    int main(void)
    {
        cout << "This is the Huffman Code Program" << endl;
        build_huffman_tree();
        //huffman_tree.Display_List();
In Huffman_Tree.cpp
    void Huffman_Tree::Make_Decode_Tree(void)
    {
        node_list.sort();
        //cout << "\nSorted list:\n";
        //Display_List();
        ...
        //cout << endl << "The Huffman Tree" << endl;
        //Display_Decode_Tree(&decode_tree_root, 0);
        //cout << endl << "The Code: " << endl;
        //Display_Code(&decode_tree_root, "");
    }
Program in Action Examine c:\out.txt
Invalid Characters • What should we do with characters that are not in the code? • Encode_Char() returns a zero length string. • Detect the error in do_encode(). • Tell user about the error. • Skip the invalid character in output.
main.cpp In do_encode()
        for (size_t i = 0; i < msg.length(); ++i)
        {
            char next_char = tolower(msg[i]);
            string code = huffman_tree.Encode_Char(next_char);
            if (code.size() == 0)
            {
                cout << endl << "Invalid character in input to do_encode: "
                     << next_char << endl;
                continue;
            }
            cout << code;
            outfile << code << " ";
        }
Binary File I/O • Issues with binary files. • Hardware architecture dependencies. • Code is typically not portable. • Output is by byte, not by bit. • For Huffman coding we need variable length bit strings. • Must know number of bits. • Encapsulate code to do binary file I/O in classes. • Provide relatively simple interface to the rest of the program.
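The issues listed above can be made concrete with a minimal sketch that packs a string of '0' and '1' characters into bytes and records how many bits are valid. The function name `pack_bits` and the four-byte, least-significant-byte-first header are assumptions chosen for illustration; they are not the layout used by the course's Binary_File classes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Pack a string of '0'/'1' characters into bytes, most significant bit first.
// The first four bytes hold the bit count, least significant byte first,
// so the layout does not depend on the machine's size_t or byte order.
// (This header layout is an assumption for illustration.)
std::vector<unsigned char> pack_bits(const std::string& bit_string)
{
    const std::size_t HEADER_BYTES = 4;
    const std::size_t nr_bits = bit_string.size();
    std::vector<unsigned char> bytes(HEADER_BYTES + (nr_bits + 7) / 8, 0);

    // Store the bit count one byte at a time.
    for (std::size_t i = 0; i < HEADER_BYTES; ++i)
    {
        bytes[i] = static_cast<unsigned char>((nr_bits >> (8 * i)) & 0xFF);
    }

    // Set each '1' bit; unused bits in the last byte stay 0.
    for (std::size_t i = 0; i < nr_bits; ++i)
    {
        assert(bit_string[i] == '0' || bit_string[i] == '1');
        if (bit_string[i] == '1')
        {
            bytes[HEADER_BYTES + i / 8] |=
                static_cast<unsigned char>(0x80 >> (i % 8));
        }
    }
    return bytes;
}
```

Without the stored bit count, a reader could not tell the padding bits in the final byte from real code bits, which is why the number of bits must be known.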
Binary Output File Class [Diagram: Client Classes; Binary Output File Class; Buffer holding Bit Count and Bits]
Binary Input File Class [Diagram: Client Classes; Binary Input File Class; Buffer holding Bit Count and Bits]
Binary File Classes
Binary_File • is_open • buffer • next_bit_position • filename • BUFFER_SIZE • FIRST_BIT_POSITION • + Is_Open
Binary_Output_File • - fstream • + Output_Bit_String • + Close • - Write_Buffer
Binary_Input_File • - fstream • + Get_Next_Bit • + Close • - Read_Buffer
Binary File I/O • Download • http://www.cse.usf.edu/~turnerr/Data_Structures/Downloads/2011_04_13_Binary_File_IO/ • File Binary_File_IO_Classes.zip
Binary File IO Classes Copy into project folder and add to project.
Add Binary File IO Files to Project Build project.
Binary_File.h
    #pragma once
    #include <string>
    using std::string;

    class Binary_File
    {
    public:
        Binary_File(const string& Filename);
        virtual void Close() = 0;
        bool Is_Open() const { return is_open; }

    protected:
        static const int BUFFER_SIZE = 1024;    // Size in bytes
        static const int FIRST_BIT_POSITION = 8*sizeof(size_t);

        // bit_count overlays the first sizeof(size_t) bytes of bits
        union Buffer
        {
            char bits[BUFFER_SIZE];
            size_t bit_count;
        };

        void Reset_Buffer(void);

        const string filename;
        bool is_open;
        Buffer buffer;
        size_t next_bit_position;
    };
Binary_File.cpp
    #include "Binary_File.h"

    // Initializers listed in member declaration order
    Binary_File::Binary_File(const string& Filename) :
        filename(Filename), is_open(false)
    {
        Reset_Buffer();
    }

    void Binary_File::Reset_Buffer(void)
    {
        for (int i = 0; i < BUFFER_SIZE; ++i)
        {
            buffer.bits[i] = 0;
        }
        next_bit_position = FIRST_BIT_POSITION;
    }
Binary_Output_File.h
    #pragma once
    #include <iostream>
    #include <fstream>
    #include <string>
    #include "Binary_File.h"
    using std::string;

    class Binary_Output_File : public Binary_File
    {
    public:
        Binary_Output_File(const string& filename);
        void Output(const string& bit_string);
        void Close();

    private:
        std::fstream outfile;
        void Write_Buffer();
    };
Binary_Output_File.cpp
    #include <cassert>
    #include <cmath>
    #include "Binary_Output_File.h"
    using namespace std;

    Binary_Output_File::Binary_Output_File(const string& filename) :
        Binary_File(filename)
    {
        outfile.open(filename.c_str(), ios::out | ios::binary);
        if (outfile.fail())
        {
            string err_msg("Error opening output file ");
            err_msg += filename;
            throw err_msg;
        }
        Reset_Buffer();
        is_open = true;
    }
Binary_Output_File.cpp
    void Binary_Output_File::Write_Buffer()
    {
        assert(is_open);
        if (next_bit_position == FIRST_BIT_POSITION)
        {
            return;     // Nothing to write
        }
        buffer.bit_count = next_bit_position - FIRST_BIT_POSITION;
        // Bytes used, including the bit count, rounded up to a whole byte
        size_t nr_bytes = (size_t) ceil(next_bit_position / 8.0);
        outfile.write(buffer.bits, nr_bytes);
        Reset_Buffer();
    }
Binary_Output_File.cpp
    void Binary_Output_File::Output(const string& bit_string)
    {
        assert(is_open);
        for (size_t i = 0; i < bit_string.size(); ++i)
        {
            if (bit_string[i] == '1')
            {
                size_t byte_position = next_bit_position / 8;
                size_t bit_position_within_byte = next_bit_position % 8;
                // Bits fill each byte from the most significant bit down
                buffer.bits[byte_position] |= (0x80 >> bit_position_within_byte);
            }
            else
            {
                assert(bit_string[i] == '0');
            }
            ++next_bit_position;
            if (next_bit_position == BUFFER_SIZE*8)
            {
                Write_Buffer();     // Buffer is full
            }
        }
    }
Binary_Output_File.cpp
    void Binary_Output_File::Close()
    {
        Write_Buffer();
        outfile.close();
        is_open = false;
    }
Using Binary File IO • Now let's modify do_encode() to write a binary file. • Add at top of main.cpp: #include "Binary_Output_File.h"
do_encode()
    void do_encode(void)
    {
        string msg;
        string output_filename;
        Binary_Output_File* outfile;
        string junk;
        while (true)
        {
            cout << "File name for output? ";
            cin >> output_filename;
            getline(cin, junk);     // Skip newline char
            try
            {
                outfile = new Binary_Output_File(output_filename);
                break;
            }
            catch (const string& msg)
            {
                cout << msg << endl;
            }
        }
do_encode()
        cout << "Enter message to encode\n";
        getline(cin, msg);
        for (size_t i = 0; i < msg.length(); ++i)
        {
            char next_char = tolower(msg[i]);
            string code = huffman_tree.Encode_Char(next_char);
            if (code.size() == 0)
            {
                cout << endl << "Invalid character in input to do_encode: "
                     << next_char << endl;
                continue;
            }
            cout << code;
            outfile->Output(code);
        }
do_encode()
        cout << endl << endl;
        outfile->Close();
        delete(outfile);
        cout << "File " << output_filename << " written\n";
    }
c:\test.dat • Look at the output file in Visual Studio • File > Open > File • [Screenshot annotations: Bit Count = 16; bits 0000 0001 0010 0011]
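If Visual Studio is not available, the bytes of the output file can also be inspected with a few lines of code. The sketch below is one possible way to do it; `hex_dump` is a hypothetical helper, not part of the course downloads.

```cpp
#include <cassert>
#include <fstream>
#include <iomanip>
#include <sstream>
#include <string>

// Return the contents of a file as two-digit hex bytes separated by spaces.
// Returns an empty string if the file cannot be opened or is empty.
std::string hex_dump(const std::string& filename)
{
    std::ifstream infile(filename.c_str(), std::ios::in | std::ios::binary);
    std::ostringstream out;
    out << std::hex << std::setfill('0');

    char c;
    bool first = true;
    while (infile.get(c))
    {
        if (!first)
        {
            out << ' ';
        }
        // Convert through unsigned char so bytes above 0x7F are not sign-extended
        out << std::setw(2)
            << static_cast<unsigned>(static_cast<unsigned char>(c));
        first = false;
    }
    return out.str();
}
```

Calling `hex_dump` on the file written by do_encode() should show the bit count bytes first, followed by the packed code bits. The exact number and order of the bit count bytes depend on the size and byte order of size_t on the machine that wrote the file.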
Binary Input • Now let's modify do_decode() to read a binary input file rather than reading 1's and 0's from the keyboard. • Add at top of main.cpp: #include "Binary_Input_File.h" • Modified version of do_decode: • http://www.cse.usf.edu/~turnerr/Data_Structures/Downloads/2011_04_13_Huffman_Codes_with_Binary_IO/do_decode.cpp.txt
do_decode()
    void do_decode(void)
    {
        string msg;
        string input_filename;
        Binary_Input_File* infile;
        string junk;
        while (true)
        {
            cout << "File name for input? ";
            cin >> input_filename;
            getline(cin, junk);     // Skip newline char
            try
            {
                infile = new Binary_Input_File(input_filename);
                break;
            }
            catch (const string& msg)
            {
                cout << msg << endl;
            }
        }
do_decode()
        string coded_message = "";
        string original_message;
        while (infile->Is_Open())
        {
            int next_bit = infile->Get_Next_Bit();
            if (next_bit < 0) break;    // No more bits
            if (next_bit == 0)
            {
                coded_message += "0";
            }
            else
            {
                coded_message += "1";
            }
        }
        // Release the input file, as do_encode() does for its output file
        infile->Close();
        delete infile;
        original_message = huffman_tree.Decode_Msg(coded_message);
        cout << "Original message: " << original_message << endl;
        cout << endl << endl;
    }
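The slides do not show the source of Binary_Input_File, so the following is only a sketch of how a Get_Next_Bit style reader might work, using an in-memory byte buffer instead of a file. The class name `Bit_Reader` and the helper `read_all_bits` are hypothetical. The sketch assumes the same bit order as Binary_Output_File::Output (most significant bit first) and returns a negative value when the bits run out, which is the condition do_decode() checks for.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory bit reader; not the course's Binary_Input_File.
class Bit_Reader
{
public:
    Bit_Reader(const std::vector<unsigned char>& input_bytes,
               std::size_t bit_count)
        : bytes(input_bytes), nr_bits(bit_count), next_bit_position(0)
    {
        assert(nr_bits <= bytes.size() * 8);
    }

    // Return the next bit (0 or 1), or -1 when all bits have been read.
    int Get_Next_Bit()
    {
        if (next_bit_position >= nr_bits)
        {
            return -1;
        }
        std::size_t byte_position = next_bit_position / 8;
        std::size_t bit_position_within_byte = next_bit_position % 8;
        ++next_bit_position;
        return (bytes[byte_position] & (0x80 >> bit_position_within_byte)) ? 1 : 0;
    }

private:
    std::vector<unsigned char> bytes;
    std::size_t nr_bits;
    std::size_t next_bit_position;
};

// Collect all remaining bits as a string of '0' and '1' characters,
// mirroring the loop in do_decode().
std::string read_all_bits(Bit_Reader& reader)
{
    std::string coded_message;
    int next_bit;
    while ((next_bit = reader.Get_Next_Bit()) >= 0)
    {
        coded_message += (next_bit == 0) ? '0' : '1';
    }
    return coded_message;
}
```

Passing the bit count separately lets the reader stop before the padding bits in the final byte, so the decoder never sees bits that were not part of the coded message.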