C++: Minimalistic CSV Streams

Shao Voon Wong, 19 Aug 2014, MIT License
Write and read CSV in a few lines of code!

Introduction

MiniCSV is a small, single-header library that is based on C++ file streams and is comparatively easy to use. Without further ado, let us see some code in action.

Writing

Here is an example of writing tab-separated values to a file using the csv::ofstream class. Tab is a good separator to use because it seldom appears in the data. I once encountered a comma in a company name, which ruined the CSV processing.

#include "minicsv.h"
#include <string>

struct Product
{
    Product() : name(""), qty(0), price(0.0f) {}
    Product(std::string name_, int qty_, float price_) 
        : name(name_), qty(qty_), price(price_) {}
    std::string name;
    int qty;
    float price;
};

int main()
{
    csv::ofstream os("products.txt", std::ios_base::out);
    os.set_delimiter('\t');
    if(os.is_open())
    {
        Product product("Shampoo", 200, 15.0f);
        os << product.name << product.qty << product.price << NEWLINE;
        Product product2("Soap", 300, 6.0f);
        os << product2.name << product2.qty << product2.price << NEWLINE;
    }
    return 0;
}

NEWLINE is defined as '\n'. We cannot use std::endl here because csv::ofstream is not derived from std::ofstream.
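
The separator problem mentioned at the start of this section can be illustrated with a small sketch. It is not part of MiniCSV: split_fields is a hypothetical helper written only for this illustration, and the sample record is made up. It shows that a comma inside a field produces an extra column when the comma is the delimiter, while the same record splits cleanly on tabs.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of MiniCSV): split one line on a
// single-character delimiter using std::getline.
std::vector<std::string> split_fields(const std::string& line, char delimiter)
{
    std::vector<std::string> fields;
    std::istringstream is(line);
    std::string field;
    while (std::getline(is, field, delimiter))
        fields.push_back(field);
    return fields;
}

// Usage sketch with a made-up record: name, quantity, price.
void check_separator_choice()
{
    // The comma inside the name is taken as a delimiter: 4 fields, not 3.
    assert(split_fields("Acme, Inc.,200,15", ',').size() == 4);
    // With tabs, the comma is ordinary data and the record has 3 fields.
    assert(split_fields("Acme, Inc.\t200\t15", '\t').size() == 3);
}
```

Note that this getline-based helper is only meant to count columns; it is not a substitute for the field parsing that csv::ifstream performs.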

Reading

To read back the same file, csv::ifstream is used, and std::cout displays the items read on the console. The Product struct is the one defined in the writing example.

#include "minicsv.h"
#include <iostream>

int main()
{
    csv::ifstream is("products.txt", std::ios_base::in);
    is.set_delimiter('\t');
    if(is.is_open())
    {
        Product temp;
        while(!is.eof())
        {
            is >> temp.name >> temp.qty >> temp.price;
            // display the read items
            std::cout << temp.name << "," << temp.qty << "," << temp.price << std::endl;
        }
    }
    return 0;
}

The console output is as follows.

Shampoo,200,15
Soap,300,6

ofstream

We first look at the ofstream class, its constructors and its data members.

class ofstream
{
public:
    ofstream() : after_newline(true), delimiter(',')
    {
    }
    ofstream(const char * file, std::ios_base::openmode mode)
    {
        open(file, mode);
    }
    void open(const char * file, std::ios_base::openmode mode)
    {
        init();
        ostm.open(file, mode);
    }
    void init()
    {
        after_newline = true; 
        delimiter = ',';
    }
    void flush()
    {
        ostm.flush();
    }
    void close()
    {
        ostm.close();
    }
    bool is_open()
    {
        return ostm.is_open();
    }
    void set_delimiter(char delimiter_)
    {
        delimiter = delimiter_;
    }
    char get_delimiter() const
    {
        return delimiter;
    }
    void set_after_newline(bool after_newline_)
    {
        after_newline = after_newline_;
    }
    bool get_after_newline() const
    {
        return after_newline;
    }
    std::ofstream& get_ofstream()
    {
        return ostm;
    }
private:
    std::ofstream ostm;
    bool after_newline;
    char delimiter;
};

What follows are the non-member << operators. The first << operator is a template, so csv::ofstream supports every data type that std::ofstream can handle. If there are custom data types to be handled, overload the << operator for std::ofstream (or, more generally, its base class std::ostream), not csv::ofstream! The second, specialized << operator tracks the linefeed and sets after_newline to true.

#define NEWLINE '\n'

template<typename T>
csv::ofstream& operator << (csv::ofstream& ostm, const T& val)
{
    if(!ostm.get_after_newline())
        ostm.get_ofstream() << ostm.get_delimiter();

    ostm.get_ofstream() << val;

    ostm.set_after_newline(false);

    return ostm;
}
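// Specialization for char: NEWLINE ends the row; any other char is written as-is.
// inline, because a full specialization defined in a header is an ordinary function.
template<>
inline csv::ofstream& operator << (csv::ofstream& ostm, const char& val)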
{
    if(val==NEWLINE)
    {
        ostm.get_ofstream() << std::endl;

        ostm.set_after_newline(true);
    }
    else
        ostm.get_ofstream() << val;

    return ostm;
}
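
As a minimal sketch of the advice about custom data types, the overload below targets the standard stream rather than csv::ofstream. Date, date_to_text and check_date_output are hypothetical names invented for this illustration; they are not part of MiniCSV. The sketch uses std::ostringstream so that it runs without the library.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical custom type, used only for this illustration.
struct Date
{
    int year;
    int month;
    int day;
};

// Overload for the standard stream, not for csv::ofstream.
std::ostream& operator<<(std::ostream& os, const Date& d)
{
    return os << d.year << '-' << d.month << '-' << d.day;
}

// Helper for the illustration: format a Date through the overload above.
std::string date_to_text(const Date& d)
{
    std::ostringstream oss;
    oss << d;
    return oss.str();
}

// Usage sketch.
void check_date_output()
{
    Date d = {2014, 8, 19};
    assert(date_to_text(d) == "2014-8-19");
}
```

With such an overload in place, the template operator shown above should accept a Date on a csv::ofstream and write it as one delimited field, since it forwards the value to the underlying std::ofstream.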

ifstream

csv::ifstream is structured much like csv::ofstream; as the reader can see, even their constructors are similar.

class ifstream
{
public:
    ifstream() : str(""), pos(0), delimiter(',')
    {
    }
    ifstream(const char * file, std::ios_base::openmode mode)
    {
        open(file, mode);
    }
    void open(const char * file, std::ios_base::openmode mode)
    {
        init();
        istm.open(file, mode);
    }
    void init()
    {
        str = "";
        pos = 0;
        delimiter = ',';
    }
    void close()
    {
        istm.close();
    }
    bool is_open()
    {
        return istm.is_open();
    }
    bool eof() const
    {
        return (istm.eof()&&str == "");
    }
    void set_delimiter(char delimiter_)
    {
        delimiter = delimiter_;
    }
    char get_delimiter() const
    {
        return delimiter;
    }
    void skip_1st_line()
    {
        if(!istm.eof())
        {
            std::getline(istm, str); // read and discard the 1st line.
            std::getline(istm, str); // load the 2nd line as the current line.
            pos = 0;
        }
    }
    void skip_line()
    {
        if(!istm.eof())
        {
            std::getline(istm, str);
            pos = 0;
        }
    }
    std::string get_delimited_str()
    {
        std::string str = ""; // field being built; shadows the member, hence this->str below
        char ch = '\0';
        do
        {
            if(pos>=this->str.size())
            {
                if(!istm.eof())
                {
                    std::getline(istm, this->str);
                    pos = 0;
                }
                else
                {
                    this->str = "";
                    break;
                }

                if(!str.empty())
                    return str;
            }

            ch = this->str[pos];
            ++(pos);
            if(ch==delimiter||ch=='\r'||ch=='\n')
                break;

            str += ch;
        }
        while(true);

        return str;
    }
private:
    std::ifstream istm;
    std::string str;
    size_t pos;
    char delimiter;
};

All the member functions delegate their calls to std::ifstream except for one heavy-duty function, get_delimited_str. The main reason get_delimited_str does not make use of strtok is that strtok treats consecutive delimiters as a single delimiter, which is a serious problem for CSV processing. For instance, ",," is read as one delimiter, not as two delimiters with an empty string in between them.
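
The strtok behaviour described above can be reproduced with the following sketch. Both functions are hypothetical helpers written for this illustration, not MiniCSV code: one tokenizes with std::strtok, the other splits with std::string::find and keeps empty fields.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Tokenize with std::strtok. The line is taken by value because strtok
// overwrites delimiters in the buffer it is given.
std::vector<std::string> split_with_strtok(std::string line, const char* delimiters)
{
    std::vector<std::string> fields;
    if (line.empty())
        return fields;
    for (char* token = std::strtok(&line[0], delimiters);
         token != NULL;
         token = std::strtok(NULL, delimiters))
    {
        fields.push_back(token);
    }
    return fields;
}

// Split on a single delimiter and keep empty fields.
std::vector<std::string> split_keep_empty(const std::string& line, char delimiter)
{
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    while (true)
    {
        std::string::size_type end = line.find(delimiter, start);
        if (end == std::string::npos)
        {
            fields.push_back(line.substr(start));
            break;
        }
        fields.push_back(line.substr(start, end - start));
        start = end + 1;
    }
    return fields;
}

// Usage sketch: a record whose middle field is empty.
void check_consecutive_delimiters()
{
    assert(split_with_strtok("Soap,,6", ",").size() == 2); // empty field lost
    assert(split_keep_empty("Soap,,6", ',').size() == 3);  // empty field kept
}
```

Losing the empty field shifts every later column by one position, which is why a CSV reader has to scan for delimiters itself.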

Now, we'll look at the >> operator. The first template operator calls get_delimited_str and uses std::istringstream to convert the text to the data type. The second, specialized form does not make use of std::istringstream, because extracting a std::string from std::istringstream stops at the first whitespace and would split a field that contains spaces. It is advisable to switch to boost::lexical_cast by defining USE_BOOST_LEXICAL_CAST, because std::istringstream is slow and its data conversion is not robust. For example, during a string-to-integer conversion, an empty string raises no error: the extraction fails silently and the caller is left with a value, often zero, that was never in the file!

template<typename T>
csv::ifstream& operator >> (csv::ifstream& istm, T& val)
{
    std::string str = istm.get_delimited_str();
    std::istringstream is(str);
    
    is >> val;

    return istm;
}
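// Specialization for std::string: take the whole field, whitespace included.
// inline, because a full specialization defined in a header is an ordinary function.
template<>
inline csv::ifstream& operator >> (csv::ifstream& istm, std::string& val)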
{
    val = istm.get_delimited_str();

    return istm;
}
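
The silent conversion failure mentioned above can be made visible by checking the stream state after the extraction. The sketch below uses only the standard library; try_parse and check_try_parse are hypothetical names for this illustration and are neither MiniCSV code nor a replacement for boost::lexical_cast.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper (not part of MiniCSV): convert one field and report
// whether the conversion succeeded and consumed the whole field.
template <typename T>
bool try_parse(const std::string& field, T& out)
{
    std::istringstream is(field);
    T value;
    if (!(is >> value))
        return false;   // empty or unconvertible field
    char extra;
    if (is >> extra)
        return false;   // trailing characters, as in "12abc"
    out = value;        // out is only touched on success
    return true;
}

// Usage sketch.
void check_try_parse()
{
    int qty = -1;
    assert(!try_parse("", qty) && qty == -1);       // empty field is rejected
    assert(try_parse("300", qty) && qty == 300);    // valid field is accepted
    assert(!try_parse("12abc", qty) && qty == 300); // partial match is rejected
}
```

Returning a bool leaves the decision to the caller, who may skip the row, substitute a default or stop processing.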

Conclusion

MiniCSV is a minimalistic CSV library that is based on C++ file streams. The initial plan was to base it on my Elmax file library for UTF-8, but that is a monolithic library! To keep things small, file streams were chosen. MiniCSV is hosted at Google Code. Thank you for reading!

History

  • 2014-03-09: Initial Release
  • 2014-08-20: Remove the use of smart ptr

License

This article, along with any associated source code and files, is licensed under The MIT License

About the Author

Shao Voon Wong
Software Developer
Singapore Singapore
Article Copyright 2014 by Shao Voon Wong