C++: Minimalistic CSV Streams

10 Mar 2023 · MIT · 4 min read
Read/write CSV in a few lines of code!

Introduction

MiniCSV is a small, single-header library based on C++ file streams that is comparatively easy to use. Without further ado, let us see some code in action.

Writing

The following example writes tab-separated values to a file using the csv::ofstream class. Since version 1.7, you can also specify the escape string when calling set_delimiter.

C++
#include "minicsv.h"

struct Product
{
    Product() : name(""), qty(0), price(0.0f) {}
    Product(std::string name_, int qty_, float price_) 
        : name(name_), qty(qty_), price(price_) {}
    std::string name;
    int qty;
    float price;
};

int main()
{
    csv::ofstream os("products.txt");
    os.set_delimiter('\t', "##");
    if(os.is_open())
    {
        Product product("Shampoo", 200, 15.0f);
        os << product.name << product.qty << product.price << NEWLINE;
        Product product2("Soap", 300, 6.0f);
        os << product2.name << product2.qty << product2.price << NEWLINE;
    }
    os.flush();
    return 0;
}

NEWLINE is defined as '\n'. We cannot use std::endl here because csv::ofstream is not derived from std::ofstream.
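
To illustrate why a plain '\n' works where std::endl does not, here is a minimal sketch of a writer that wraps a standard stream instead of deriving from it. The class tiny_csv_writer and the function demo_writer are hypothetical names made up for this illustration; this is not MiniCSV's actual implementation.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for a CSV output stream; NOT MiniCSV's real class.
// It wraps a std::ostringstream rather than deriving from it.
class tiny_csv_writer
{
public:
    explicit tiny_csv_writer(char delimiter)
        : delimiter_(delimiter), at_line_start_(true) {}

    // Any value is written as one field.
    template<typename T>
    tiny_csv_writer& operator<<(const T& value)
    {
        return write_field(value);
    }

    // A lone '\n' ends the current line instead of being written as a field.
    tiny_csv_writer& operator<<(char ch)
    {
        if (ch == '\n')
        {
            out_ << '\n';
            at_line_start_ = true;
            return *this;
        }
        return write_field(ch);
    }

    std::string text() const { return out_.str(); }

private:
    // Insert the delimiter between fields that share a line.
    template<typename T>
    tiny_csv_writer& write_field(const T& value)
    {
        if (!at_line_start_)
            out_ << delimiter_;
        out_ << value;
        at_line_start_ = false;
        return *this;
    }

    std::ostringstream out_;
    char delimiter_;
    bool at_line_start_;
};

// Writes the two product lines from the article into a string.
std::string demo_writer()
{
    const char kNewline = '\n';  // plays the role of NEWLINE
    tiny_csv_writer os('\t');
    os << "Shampoo" << 200 << 15.0f << kNewline;
    os << "Soap" << 300 << 6.0f << kNewline;
    // os << std::endl;  // would not compile: std::endl is a function template
    //                   // for std::basic_ostream and no overload here accepts it
    return os.text();
}
```

Calling demo_writer() returns the two tab-separated lines as one string. Because the wrapper is not a std::basic_ostream, stream manipulators such as std::endl have nothing to bind to, which is why a character constant is used to end the line.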

Reading

To read back the same file, csv::ifstream is used, and std::cout displays the items read on the console. The Product struct is the one defined in the writing example.

C++
#include "minicsv.h"
#include <iostream>

int main()
{
    csv::ifstream is("products.txt");
    is.set_delimiter('\t', "##");
    if(is.is_open())
    {
        Product temp;
        while(is.read_line())
        {
            is >> temp.name >> temp.qty >> temp.price;
            // display the read items
            std::cout << temp.name << "," << temp.qty << "," << temp.price << std::endl;
        }
    }
    return 0;
}

The output in console is as follows:

C++
Shampoo,200,15
Soap,300,6
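
The read_line-then-extract pattern above can be approximated with the standard library alone. The following is a rough sketch under that assumption, not how MiniCSV is implemented; ProductRow and parse_products are hypothetical names used only here.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type mirroring the article's Product struct.
struct ProductRow
{
    std::string name;
    int qty = 0;
    float price = 0.0f;
};

// Parse tab-separated text line by line. std::getline on '\n' plays the
// role of read_line(); std::getline on '\t' extracts one field at a time.
std::vector<ProductRow> parse_products(const std::string& text)
{
    std::vector<ProductRow> rows;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line))
    {
        std::istringstream fields(line);
        std::string name, qty, price;
        if (std::getline(fields, name, '\t') &&
            std::getline(fields, qty, '\t') &&
            std::getline(fields, price, '\t'))
        {
            ProductRow row;
            row.name = name;
            row.qty = std::stoi(qty);     // throws std::invalid_argument on bad input
            row.price = std::stof(price);
            rows.push_back(row);
        }
    }
    return rows;
}
```

Passing the contents of products.txt as a string yields one ProductRow per line. Lines with fewer than three fields are skipped in this sketch.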

Overloaded Stream Operators

String streams were introduced in v1.6. Here is an example of how to overload the string stream operators for the Product class. The concept is the same for file streams.

C++
#include "minicsv.h"
#include <iostream>

struct Product
{
    Product() : name(""), qty(0), price(0.0f) {}
    Product(std::string name_, int qty_, float price_) : name(name_), 
                               qty(qty_), price(price_) {}
    std::string name;
    int qty;
    float price;
};

template<>
inline csv::istringstream& operator >> (csv::istringstream& istm, Product& val)
{
    return istm >> val.name >> val.qty >> val.price;
}

template<>
inline csv::ostringstream& operator << (csv::ostringstream& ostm, const Product& val)
{
    return ostm << val.name << val.qty << val.price;
}

int main()
{
    // test string streams using overloaded stream operators for Product
    {
        csv::ostringstream os;
        os.set_delimiter(',', "$$");
        Product product("Shampoo", 200, 15.0f);
        os << product << NEWLINE;
        Product product2("Towel, Soap, Shower Foam", 300, 6.0f);
        os << product2 << NEWLINE;

        csv::istringstream is(os.get_text().c_str());
        is.set_delimiter(',', "$$");
        Product prod;
        while (is.read_line())
        {
            is >> prod;
            // display the read items
            std::cout << prod.name << "|" << prod.qty << "|" << prod.price << std::endl;
        }
    }
    return 0;
}

This is what is displayed on the console.

C++
Shampoo|200|15
Towel, Soap, Shower Foam|300|6
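
The commas inside "Towel, Soap, Shower Foam" survive the round trip because of the escape string passed to set_delimiter. Below is a minimal sketch of the idea, assuming the escape string simply replaces each delimiter inside a field on writing and is swapped back on reading. The helper names are hypothetical and are not MiniCSV functions.

```cpp
#include <string>

// Replace every occurrence of `from` in `text` with `to`.
std::string replace_all(std::string text, const std::string& from, const std::string& to)
{
    if (from.empty())
        return text;
    std::string::size_type pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos)
    {
        text.replace(pos, from.size(), to);
        pos += to.size();  // continue after the text just inserted
    }
    return text;
}

// Before writing: hide delimiters that occur inside the field text.
std::string escape_field(const std::string& field, char delimiter, const std::string& escape)
{
    return replace_all(field, std::string(1, delimiter), escape);
}

// After reading: restore the delimiters that were hidden on writing.
std::string unescape_field(const std::string& field, char delimiter, const std::string& escape)
{
    return replace_all(field, escape, std::string(1, delimiter));
}
```

With ',' as the delimiter and "$$" as the escape string, escape_field turns "Towel, Soap, Shower Foam" into "Towel$$ Soap$$ Shower Foam", and unescape_field restores it. In this simple scheme the escape string should be one that never occurs in the data itself, otherwise the round trip is not exact.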

What if the type has private members? Create a member function that takes in the stream object.

C++
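class Product
{
public:
    Product() : name(""), qty(0), price(0.0f) {}
    // Public member function that reads into the private members.
    void read(csv::istringstream& istm)
    {
        istm >> this->name >> this->qty >> this->price;
    }
private:
    std::string name;
    int qty;
    float price;
};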

template<>
inline csv::istringstream& operator >> (csv::istringstream& istm, Product& prod)
{
    prod.read(istm);
    return istm;
}
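
Another common way to reach private members is to declare the stream operator as a friend. The sketch below uses std::istream so that it stands alone; whether the same friend declaration works unchanged with MiniCSV's templated operators is an assumption that has not been verified here. Item and parse_item are hypothetical names.

```cpp
#include <istream>
#include <sstream>
#include <string>

class Item  // hypothetical class with private data
{
public:
    Item() : name_(), qty_(0), price_(0.0f) {}

    const std::string& name() const { return name_; }
    int qty() const { return qty_; }
    float price() const { return price_; }

    // The friend declaration gives this one free function access
    // to the private members below.
    friend std::istream& operator>>(std::istream& in, Item& item)
    {
        return in >> item.name_ >> item.qty_ >> item.price_;
    }

private:
    std::string name_;
    int qty_;
    float price_;
};

// Read one whitespace-separated item from a string.
Item parse_item(const std::string& text)
{
    std::istringstream in(text);
    Item item;
    in >> item;
    return item;
}
```

For example, parse_item("Soap 300 6") fills the three private members without exposing a public read function. The trade-off is that the class declaration has to name the operator, whereas the member-function approach keeps the class independent of any particular stream type's operator.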

Conclusion

MiniCSV is a small CSV library based on C++ file streams. Because the delimiter can be changed on the fly, I have used this library to write file parsers for the MTL and Wavefront OBJ formats in a relatively short time compared to handwriting them with no library help. MiniCSV is now hosted on GitHub. Thank you for reading!

History

  • 2014-03-09: Initial release
  • 2014-08-20: Remove the use of smart ptr
  • 2015-03-23: 75% performance increase on writing by removing the flush on every line; fixed the LNK2005 error of multiple redefinition; read_line replaces eof on ifstream.
  • 2015-09-22: v1.7: Escape/unescape and surround/trim quotes on text
  • 2015-09-24: Added overloaded stringstream operators example.
  • 2015-09-27: Stream operator overload for const char* in v1.7.2.
  • 2015-10-04: Fixed G++ and Clang++ compilation errors in v1.7.3.
  • 2015-10-20: Ignore delimiters within quotes during reading when enable_trim_quote_on_str is enabled in v1.7.6. Example: 10.0,"Bottle,Cup,Teaspoon",123.0 will be read as 3 tokens: <10.0><Bottle,Cup,Teaspoon><123.0>
  • 2016-05-05: Quotes inside your quoted string are now escaped. The default escape string is "&quot;", which can be changed through os.enable_surround_quote_on_str() and is.enable_trim_quote_on_str()
  • 2016-07-10: Version 1.7.9: Reading UTF-8 BOM
  • 2016-08-02: Version 1.7.10: Added a separator class for the stream, so there is no need to call set_delimiter repeatedly if the delimiter keeps changing. See the code example below:
    C++
    // demo sep class usage
    csv::istringstream is("vt:33,44,66");
    is.set_delimiter(',', "$$");
    csv::sep colon(':', "<colon>");
    csv::sep comma(',', "<comma>");
    while (is.read_line())
    {
        std::string type;
        int r = 0, b = 0, g = 0;
        is >> colon >> type >> comma >> r >> b >> g;
        // display the read items
        std::cout << type << "|" << r << "|" << b << "|" << g << std::endl;
    }
  • 2016-08-23: Version 1.7.11: Fixed num_of_delimiter function: do not count delimiter within quotes
  • 2016-08-26: Version 1.8.0: Added better error message for data conversion during reading. Before that, data conversion error with std::istringstream went undetected.

    Before change:
    C++
    template<typename T>
    csv::ifstream& operator >> (csv::ifstream& istm, T& val)
    {
        std::string str = istm.get_delimited_str();
        
    #ifdef USE_BOOST_LEXICAL_CAST
        val = boost::lexical_cast<T>(str);
    #else
        std::istringstream is(str);
        is >> val;
    #endif
    
        return istm;
    }

    After change:

    C++
    template<typename T>
    csv::ifstream& operator >> (csv::ifstream& istm, T& val)
    {
        std::string str = istm.get_delimited_str();
    
    #ifdef USE_BOOST_LEXICAL_CAST
        try 
        {
            val = boost::lexical_cast<T>(str);
        }
        catch (boost::bad_lexical_cast& e)
        {
            throw std::runtime_error(istm.error_line(str).c_str());
        }
    #else
        std::istringstream is(str);
        is >> val;
        if (!(bool)is)
        {
            throw std::runtime_error(istm.error_line(str).c_str());
        }
    #endif
    
        return istm;
    }

    Breaking change: existing user code that catches boost::bad_lexical_cast must be changed to catch std::runtime_error. The same applies to csv::istringstream. Beware that std::istringstream is not as good as boost::lexical_cast at catching errors. For example, "4a" gets converted to the integer 4 without error.

    Example of the csv::ifstream error log as follows:

    C++
    csv::ifstream conversion error at line no.:2, 
    filename:products.txt, token position:3, token:aa

    Similar for csv::istringstream except there is no filename.

    C++
    csv::istringstream conversion error at line no.:2, token position:3, token:aa
  • 2017-01-08: Version 1.8.2 with better input stream performance. Run the benchmark to see the difference (note: you need to update the drive/folder location first).

    Benchmark results against version 1.8.0:

    C++
         mini_180::csv::ofstream:  348ms
         mini_180::csv::ifstream:  339ms <<< v1.8.0
             mini::csv::ofstream:  347ms
             mini::csv::ifstream:  308ms <<< v1.8.2
    mini_180::csv::ostringstream:  324ms
    mini_180::csv::istringstream:  332ms <<< v1.8.0
        mini::csv::ostringstream:  325ms
        mini::csv::istringstream:  301ms <<< v1.8.2
    
  • 2017-01-23: Version 1.8.3 adds unit tests and allows two quotes to escape one quote, in line with the CSV specification.
  • 2017-02-07: Version 1.8.3b adds more unit tests and removes the CPOL license file.
  • 2017-03-12: Version 1.8.4 fixed some char output problems and added the NChar (char wrapper) class to write numeric values [-128..127] to char variables.
    C++
    bool test_nchar(bool enable_quote)
    {
        csv::ostringstream os;
        os.set_delimiter(',', "$$");
        os.enable_surround_quote_on_str(enable_quote, '\"');
    
        os << "Wallet" << 56 << NEWLINE;
    
        csv::istringstream is(os.get_text().c_str());
        is.set_delimiter(',', "$$");
        is.enable_trim_quote_on_str(enable_quote, '\"');
    
        while (is.read_line())
        {
            try
            {
                std::string dest_name = "";
                char dest_char = 0;
    
                is >> dest_name >> csv::NChar(dest_char);
    
                std::cout << dest_name << ", " 
                    << (int)dest_char << std::endl;
            }
            catch (std::runtime_error& e)
            {
                std::cerr << __FUNCTION__ << ": " << e.what() << std::endl;
            }
        }
        return true;
    }

    Display Output:

    C++
    Wallet, 56
  • 2017-09-18: Version 1.8.5:

    If your escape parameter in set_delimiter() is empty, text with delimiter will be automatically enclosed in quotes (to be compliant with Microsoft Excel and general CSV practice).

    "Hello,World",600

    Microsoft Excel and MiniCSV read this as "Hello,World" and 600.

  • 2021-02-21: Version 1.8.5d: Fixed infinite loop in quote_unescape.
  • 2021-05-06: MiniCSV detects the end of a line by the presence of a newline, so a newline inside a string input inevitably breaks the parsing. Version 1.8.6 handles this by escaping the newline.
  • 2023-03-11: v1.8.7 added set_precision(), reset_precision() and get_precision() to ostream_base for setting float/double/long double precision in the output.
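
Two behaviours from the history above, ignoring delimiters inside quotes (v1.7.6) and rejecting tokens such as "4a" instead of silently converting them (the caveat noted under v1.8.0), can be sketched with the standard library alone. This is an illustrative sketch, not MiniCSV's code; split_quoted and strict_convert are hypothetical names, and doubled quotes used as an escape are not handled.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Split one line on `delimiter`, ignoring delimiters that appear inside
// double quotes. The quote characters themselves are dropped.
std::vector<std::string> split_quoted(const std::string& line, char delimiter)
{
    std::vector<std::string> tokens;
    std::string current;
    bool in_quotes = false;
    for (std::string::size_type i = 0; i < line.size(); ++i)
    {
        const char ch = line[i];
        if (ch == '"')
        {
            in_quotes = !in_quotes;  // toggle state; the quote is not stored
        }
        else if (ch == delimiter && !in_quotes)
        {
            tokens.push_back(current);
            current.clear();
        }
        else
        {
            current += ch;
        }
    }
    tokens.push_back(current);
    return tokens;
}

// Convert a token to a numeric type T and throw if the conversion fails
// or if any non-whitespace characters are left over.
template<typename T>
T strict_convert(const std::string& token)
{
    std::istringstream is(token);
    T value;
    char leftover;
    if (!(is >> value) || (is >> leftover))
        throw std::runtime_error("conversion error, token:" + token);
    return value;
}
```

With these helpers, the line 10.0,"Bottle,Cup,Teaspoon",123.0 splits into three tokens, strict_convert<int>("4") returns 4, and strict_convert<int>("4a") throws std::runtime_error.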

FAQ

Why does the reader stream encounter errors for CSV with text not enclosed within quotes?

Answer: To resolve it, please remember to call enable_trim_quote_on_str with false.

Points of Interest

Recently, I encountered an interesting benchmark result when reading a 5MB file, up against a string_view CSV parser by Vincent La. You can see the effects of Short String Optimization (SSO).

Benchmark of every column is 12 chars in length

The length is within SSO limit (24 bytes) to avoid heap allocation.

C++
csv_parser timing:113ms
MiniCSV timing:71ms
CSV Stream timing:187ms

Benchmark of every column is 30 chars in length

The length is outside the SSO limit, so memory has to be allocated on the heap! Now the string_view csv_parser wins.

C++
csv_parser timing:147ms
MiniCSV timing:175ms
CSV Stream timing:434ms

Note: I am not sure why CSV Stream is so slow in the VC++ 15.9 update.

Note: Benchmark results could be different with other C++ compilers such as G++ and Clang++, to which I do not have access now.
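
The SSO effect discussed above can be observed directly by checking whether a string's character buffer lies inside the std::string object itself. This is a small diagnostic sketch; stored_inline is a hypothetical helper, and the length at which strings stop being stored inline is implementation-defined, so it will differ between standard libraries.

```cpp
#include <cstdint>
#include <string>

// Returns true when the character buffer of `s` lies inside the
// std::string object itself, meaning no separate heap block holds
// the characters.
bool stored_inline(const std::string& s)
{
    const std::uintptr_t begin = reinterpret_cast<std::uintptr_t>(&s);
    const std::uintptr_t end = begin + sizeof(std::string);
    const std::uintptr_t data = reinterpret_cast<std::uintptr_t>(s.data());
    return data >= begin && data < end;
}
```

Calling stored_inline on a 12-character string and on a 30-character string, the two column widths benchmarked above, shows on a given compiler which of them needs a heap allocation. A string far larger than sizeof(std::string) can never be stored inline.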

License

This article, along with any associated source code and files, is licensed under The MIT License


Written By
Software Developer (Senior)
Singapore
Shao Voon is from Singapore. His interest lies primarily in computer graphics, software optimization, concurrency, security, and Agile methodologies.

In recent years, he shifted focus to software safety research. His hobby is writing a free C++ DirectX photo slideshow application which can be viewed here.
