My Way of Implementing Streams #3

Warning! The information on this page is more than 6 years old. I still make it available, but it probably does not reflect my current knowledge or beliefs.

Mon 17 Aug 2009

I've described my way of implementing base classes for binary data streams in my recent posts. This time I want to show some of my specific stream classes. The main one is of course FileStream, which gives access to disk files. It opens the file in the constructor (throwing an exception if opening fails) and of course closes it in the destructor.

class FileStream : public SeekableStream
{
public:
  FileStream(const std::string &FileName, FILE_MODE FileMode, bool Lock = true);
  ...

I use standard C functions on Linux (fopen, fclose, fread, fwrite) and WinAPI functions on Windows (CreateFile, CloseHandle, ReadFile, WriteFile). These two APIs are very different. WinAPI gives much more flexibility, because with CreateFile you can choose separately whether you want write and/or read access, whether to protect the file from being opened by another application, whether to clear the file or access existing data, whether to create a new file or fail if it doesn't exist, etc. Still, I've decided to define my FILE_MODE enum in the C-like style (where you pass the mode as a string like "wb", "rb+" etc.), together with comments explaining the details of each mode.

enum FILE_MODE
{
  // write: yes, read: no, initial pos: 0
  // not exists: create, exists: clear
  FM_WRITE,
  // write: yes, read: yes, initial pos: 0
  // not exists: create, exists: clear
  FM_WRITE_PLUS,
  // write: no, read: yes, initial pos: 0
  // not exists: error, exists: open
  FM_READ,
  // write: yes, read: yes, initial pos: 0
  // not exists: error, exists: open
  FM_READ_PLUS,
  // write: yes, read: no, initial pos: end
  // not exists: create, exists: open
  FM_APPEND,
  // write: yes, read: yes, initial pos: end
  // not exists: create, exists: open
  FM_APPEND_PLUS,
};

There is one issue in the Linux implementation though. The documentation says that seeking doesn't work as expected when a file is opened in "a" or "a+" mode: every write goes to the end of the file, regardless of the current position. WinAPI doesn't have such problems :)

Having an abstract stream class gives a lot of flexibility. Code that writes or reads data doesn't need to know where the data goes to or comes from. So next to the file stream I also have some stream classes that operate on memory. MemoryStream uses a memory block of constant size, either given to the constructor or allocated and freed internally by the class (when Data=NULL).

class MemoryStream : public SeekableStream
{
public:
  MemoryStream(size_t Size, void *Data = NULL);
  char *Data() { return m_Data; }
  ...
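
To make the two ownership cases concrete, here is a minimal sketch of such a fixed-size memory stream. The SeekableStream interface shown here (Write, Read, SetPos) is an assumption made for the example, as are the member names and the choice to throw std::runtime_error on overflow; the real class may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>

// Assumed minimal interface, for illustration only.
class SeekableStream
{
public:
  virtual ~SeekableStream() { }
  virtual void Write(const void *Data, size_t Size) = 0;
  virtual size_t Read(void *Out, size_t MaxSize) = 0;
  virtual void SetPos(size_t Pos) = 0;
};

class MemoryStream : public SeekableStream
{
public:
  // Data == NULL: the block is allocated here and freed in the destructor.
  // Otherwise the caller keeps ownership of the given block.
  MemoryStream(size_t Size, void *Data = NULL)
    : m_Data(static_cast<char*>(Data)), m_Size(Size), m_Pos(0), m_Owned(Data == NULL)
  {
    if (m_Owned)
      m_Data = new char[Size];
  }
  virtual ~MemoryStream()
  {
    if (m_Owned)
      delete[] m_Data;
  }

  char *Data() { return m_Data; }

  virtual void Write(const void *Data, size_t Size)
  {
    assert(Data != NULL || Size == 0);
    // The block has constant size, so writing past its end is an error.
    if (Size > m_Size - m_Pos)
      throw std::runtime_error("MemoryStream: write past end of block");
    std::memcpy(m_Data + m_Pos, Data, Size);
    m_Pos += Size;
  }

  // Returns the number of bytes actually read (0 at the end of the block).
  virtual size_t Read(void *Out, size_t MaxSize)
  {
    size_t n = m_Size - m_Pos;
    if (MaxSize < n)
      n = MaxSize;
    std::memcpy(Out, m_Data + m_Pos, n);
    m_Pos += n;
    return n;
  }

  virtual void SetPos(size_t Pos)
  {
    if (Pos > m_Size)
      throw std::runtime_error("MemoryStream: position out of range");
    m_Pos = Pos;
  }

private:
  // Not copyable: two copies would free the same block.
  MemoryStream(const MemoryStream &);
  MemoryStream &operator=(const MemoryStream &);

  char *m_Data;
  size_t m_Size;
  size_t m_Pos;
  bool m_Owned;
};
```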

I also have memory streams that can automatically resize themselves: VectorStream (based on std::vector) and StringStream (based on std::string). It's worth mentioning that you can always rely on std::vector being implemented as just a wrapper around a dynamically allocated array in contiguous memory. So you can take the address &MyVector[0] (of a non-empty vector) and use it as a normal array. According to the standard (C++03) you can't do the same with std::string, because it can have a different memory representation (for example, storing the first few characters inside the string object instead of in a dynamically allocated block). So to wrap std::string into a stream I had to write and read single characters.
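
A self-resizing stream built on that &MyVector[0] trick can be sketched roughly as follows. This is only an illustration: the class is kept standalone (no base class) to stay short, and the method names Write, Read, SetPos and Size are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

class VectorStream
{
public:
  VectorStream() : m_Pos(0) { }

  void Write(const void *Data, size_t Size)
  {
    assert(Data != NULL || Size == 0);
    if (Size == 0)
      return; // &m_Data[0] must not be taken on an empty vector
    // Grow the vector when the write would go past its current end.
    if (m_Pos + Size > m_Data.size())
      m_Data.resize(m_Pos + Size);
    // The elements are contiguous, so one memcpy is enough.
    std::memcpy(&m_Data[0] + m_Pos, Data, Size);
    m_Pos += Size;
  }

  // Returns the number of bytes actually read (0 at the end of the data).
  size_t Read(void *Out, size_t MaxSize)
  {
    size_t n = m_Data.size() - m_Pos;
    if (MaxSize < n)
      n = MaxSize;
    if (n == 0)
      return 0;
    std::memcpy(Out, &m_Data[0] + m_Pos, n);
    m_Pos += n;
    return n;
  }

  // Positions past the end are clamped to the end.
  void SetPos(size_t Pos) { m_Pos = Pos <= m_Data.size() ? Pos : m_Data.size(); }
  size_t Size() const { return m_Data.size(); }

private:
  std::vector<char> m_Data;
  size_t m_Pos;
};
```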

I have many utility streams that can be attached to other streams and work as another level of indirection. OverlayStream is the base class for all such streams:

class OverlayStream : public Stream
{
private:
  Stream *m_Stream;
public:
  OverlayStream(Stream *a_Stream) : m_Stream(a_Stream) { }
  Stream * GetStream() { return m_Stream; }
  ...

One of the specific overlay streams is BufferingStream. It manages internal read and write buffers and uses the attached stream only when the write buffer is full or the read buffer is empty. Using it speeds up data transfer, especially when writing or reading single bytes or other small values. Of course it still has the overhead of virtual function calls, but it avoids the overhead of a system function call each time you want to transfer a single value to a disk file.
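
The write half of this idea can be sketched as below. Everything here is a simplified assumption made for the example: the one-method Stream interface, the BufferingWriter and CountingSink names, and the 4096-byte buffer size. CountingSink exists only to show how many calls reach the underlying stream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Assumed minimal interface, for illustration only.
class Stream
{
public:
  virtual ~Stream() { }
  virtual void Write(const void *Data, size_t Size) = 0;
};

// Demo target: stores the bytes and counts how often Write is called.
class CountingSink : public Stream
{
public:
  CountingSink() : Calls(0) { }
  virtual void Write(const void *Data, size_t Size)
  {
    Bytes.append(static_cast<const char*>(Data), Size);
    ++Calls;
  }
  std::string Bytes;
  size_t Calls;
};

class BufferingWriter : public Stream
{
public:
  explicit BufferingWriter(Stream *Target) : m_Target(Target), m_Used(0)
  {
    assert(Target != NULL);
  }
  // Flushes the rest on destruction. A real class would have to decide
  // what to do when the target fails at this point.
  virtual ~BufferingWriter() { Flush(); }

  virtual void Write(const void *Data, size_t Size)
  {
    const char *src = static_cast<const char*>(Data);
    while (Size > 0)
    {
      // Touch the target only when the buffer is full.
      if (m_Used == BUFFER_SIZE)
        Flush();
      size_t n = BUFFER_SIZE - m_Used;
      if (Size < n)
        n = Size;
      std::memcpy(m_Buffer + m_Used, src, n);
      m_Used += n;
      src += n;
      Size -= n;
    }
  }

  void Flush()
  {
    if (m_Used > 0)
    {
      m_Target->Write(m_Buffer, m_Used);
      m_Used = 0;
    }
  }

private:
  static const size_t BUFFER_SIZE = 4096;
  Stream *m_Target;
  char m_Buffer[BUFFER_SIZE];
  size_t m_Used;
};
```

With this sketch, writing 10000 single bytes through a BufferingWriter results in only three Write calls on the target instead of 10000.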

Another group of overlay streams is designed to change data on the fly - encode or decode it. E.g. I have Base64Encoder and Base64Decoder classes to do Base64 encoding and decoding. Very important for me are the ZlibCompressionStream and ZlibDecompressionStream classes. Zlib is a great data compression library but it has the worst API I've ever seen, so I'm very glad to have simple, object-oriented wrappers for it.
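
As an illustration of such an encoding overlay, here is a rough sketch of a streaming Base64 encoder. The one-method Stream interface, the StringSink helper and the explicit Finish method are assumptions made for this example, so the real Base64Encoder may look different. The point is that input can arrive in chunks of any size, because up to two leftover bytes are kept between Write calls.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumed minimal interface, for illustration only.
class Stream
{
public:
  virtual ~Stream() { }
  virtual void Write(const void *Data, size_t Size) = 0;
};

// Demo target that collects everything written to it.
class StringSink : public Stream
{
public:
  virtual void Write(const void *Data, size_t Size)
  {
    Text.append(static_cast<const char*>(Data), Size);
  }
  std::string Text;
};

class Base64Encoder : public Stream
{
public:
  explicit Base64Encoder(Stream *Target) : m_Target(Target), m_PendingCount(0)
  {
    assert(Target != NULL);
  }

  virtual void Write(const void *Data, size_t Size)
  {
    const unsigned char *src = static_cast<const unsigned char*>(Data);
    for (size_t i = 0; i < Size; ++i)
    {
      // Collect input bytes until a full 3-byte group is available.
      m_Pending[m_PendingCount++] = src[i];
      if (m_PendingCount == 3)
        EmitGroup();
    }
  }

  // Call once after the last Write: pads the final partial group with '='.
  void Finish()
  {
    if (m_PendingCount > 0)
      EmitGroup();
  }

private:
  // Turns 1 to 3 pending bytes into 4 output characters.
  void EmitGroup()
  {
    static const char ALPHABET[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    const unsigned b0 = m_Pending[0];
    const unsigned b1 = m_PendingCount > 1 ? m_Pending[1] : 0u;
    const unsigned b2 = m_PendingCount > 2 ? m_Pending[2] : 0u;
    char out[4];
    out[0] = ALPHABET[b0 >> 2];
    out[1] = ALPHABET[((b0 & 0x03u) << 4) | (b1 >> 4)];
    out[2] = m_PendingCount > 1 ? ALPHABET[((b1 & 0x0Fu) << 2) | (b2 >> 6)] : '=';
    out[3] = m_PendingCount > 2 ? ALPHABET[b2 & 0x3Fu] : '=';
    m_Target->Write(out, 4);
    m_PendingCount = 0;
  }

  Stream *m_Target;
  unsigned char m_Pending[3];
  size_t m_PendingCount;
};
```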

The stream concept can also be used in a slightly more exotic way. I use it to incrementally calculate checksums and hashes. For example, I have the MD5_Calc class, which can be written to like any other stream. At the end I can call the Finish method to get the MD5 checksum calculated from the data passed so far.

struct MD5_SUM
{
  uint1 Data[16];
  ...
};

class MD5_Calc : public Stream
{
public:
  MD5_Calc();
  virtual void Write(const void *Data, size_t Size);
  void Finish(MD5_SUM *Out);
  
  // Just for convenience. Calculating hash from single data buffer.
  static void Calc(MD5_SUM *Out, const void *Buf, uint4 BufLen);
  ...
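
The same pattern can be shown in a few lines with a much simpler checksum. The sketch below deliberately uses Adler-32 instead of MD5, only because it fits in a short example; the Adler32_Calc name and the one-method Stream interface are assumptions made for the illustration. The structure mirrors MD5_Calc: Write updates the running state, Finish returns the result, and a static Calc covers the single-buffer case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed minimal interface, for illustration only.
class Stream
{
public:
  virtual ~Stream() { }
  virtual void Write(const void *Data, size_t Size) = 0;
};

class Adler32_Calc : public Stream
{
public:
  Adler32_Calc() : m_A(1), m_B(0) { }

  // Updates the running sums, so data can be passed in any number of pieces.
  virtual void Write(const void *Data, size_t Size)
  {
    assert(Data != NULL || Size == 0);
    const unsigned char *bytes = static_cast<const unsigned char*>(Data);
    for (size_t i = 0; i < Size; ++i)
    {
      m_A = (m_A + bytes[i]) % MOD;
      m_B = (m_B + m_A) % MOD;
    }
  }

  // Returns the checksum of all data written so far.
  std::uint32_t Finish() const { return (m_B << 16) | m_A; }

  // Just for convenience. Calculating the checksum from a single buffer.
  static std::uint32_t Calc(const void *Buf, size_t BufLen)
  {
    Adler32_Calc calc;
    calc.Write(Buf, BufLen);
    return calc.Finish();
  }

private:
  static const std::uint32_t MOD = 65521; // largest prime below 2^16
  std::uint32_t m_A;
  std::uint32_t m_B;
};
```

Feeding the data in several pieces gives the same result as hashing it in one call, which is what makes it possible to plug such a class into a chain of streams.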

Here is a quite complex (yet still very short and clear) example of how I can compress a file to a GZ archive and calculate its MD5 hash at the same time using my stream classes.

string srcFileName = "c:\\WINDOWS\\Media\\ding.wav";
string dstFileName = "g:\\tmp\\ding.gz";

FileStream inputFile(srcFileName, FM_READ);
FileStream outputFile(dstFileName, FM_WRITE);

GzipCompressionStream gzipCompression(&outputFile, NULL, NULL);

MD5_Calc md5Calc;

MultiWriterStream multiWriter;
multiWriter.AddStream(&gzipCompression);
multiWriter.AddStream(&md5Calc);

CopyToEnd(&multiWriter, &inputFile);

MD5_SUM md5Sum;
md5Calc.Finish(&md5Sum);
LOG(1, Format("Hash=#") % md5Sum);
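
The two helpers that tie this example together, MultiWriterStream and CopyToEnd, are not shown in this post. A minimal sketch of how they could work is below. The Stream interface is an assumption (in particular that Read returns the number of bytes read and 0 at the end of data), and StringSource and StringSink are hypothetical stand-ins for the file, compression and hash streams.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Assumed minimal interface, for illustration only.
class Stream
{
public:
  virtual ~Stream() { }
  virtual void Write(const void *, size_t) { throw std::logic_error("Stream is not writable"); }
  virtual size_t Read(void *, size_t) { throw std::logic_error("Stream is not readable"); }
};

// Forwards every write to all attached streams.
class MultiWriterStream : public Stream
{
public:
  void AddStream(Stream *a_Stream)
  {
    assert(a_Stream != NULL);
    m_Streams.push_back(a_Stream);
  }
  virtual void Write(const void *Data, size_t Size)
  {
    for (size_t i = 0; i < m_Streams.size(); ++i)
      m_Streams[i]->Write(Data, Size);
  }
private:
  std::vector<Stream*> m_Streams;
};

// Reads Src in chunks until it is exhausted and writes everything to Dst.
void CopyToEnd(Stream *Dst, Stream *Src)
{
  char buf[4096];
  for (;;)
  {
    const size_t n = Src->Read(buf, sizeof(buf));
    if (n == 0)
      break;
    Dst->Write(buf, n);
  }
}

// Hypothetical helper: readable stream over a string.
class StringSource : public Stream
{
public:
  explicit StringSource(const std::string &Text) : m_Text(Text), m_Pos(0) { }
  virtual size_t Read(void *Out, size_t MaxSize)
  {
    size_t n = m_Text.size() - m_Pos;
    if (MaxSize < n)
      n = MaxSize;
    if (n > 0)
      std::memcpy(Out, m_Text.data() + m_Pos, n);
    m_Pos += n;
    return n;
  }
private:
  std::string m_Text;
  size_t m_Pos;
};

// Hypothetical helper: writable stream that collects data in a string.
class StringSink : public Stream
{
public:
  virtual void Write(const void *Data, size_t Size)
  {
    Text.append(static_cast<const char*>(Data), Size);
  }
  std::string Text;
};
```

The design choice worth noting is that the source is read only once: each chunk goes to every attached stream before the next chunk is fetched, so compression and hashing happen in a single pass over the input.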

Copyright © 2004-2024