C++14: CSV Stream based on C File API

6 May 2021
This header-only library is based on the C file API. Its purpose is to reduce the code bloat brought on by the use of C++ STL streams.

Primary Motivation

The library is built on the C file API in order to reduce the code bloat brought on by C++ STL streams. Its usage is similar to that of Minimalistic CSV Stream, which is based on C++ file streams and is likewise a header-only library; just change the namespace from mini to capi. Some of its optimizations have been back-ported to Minimalistic CSV Stream version 1.8.3, including passing by reference whenever possible, caching results in data members, and avoiding operations that return a new string object. Readers can diff v1.8.2 against v1.8.3 to see the changes.
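
As a rough illustration of the three optimizations mentioned above (passing by reference, caching in a data member, and not returning new string objects), here is a minimal sketch. The LineSplitter class and all of its members are hypothetical names invented for this example; they are not taken from CSV Stream or Minimalistic CSV Stream.

```cpp
#include <cstddef>
#include <string>

// Hypothetical illustration only; not code from either CSV library.
class LineSplitter
{
public:
    explicit LineSplitter(char delimiter) : m_delimiter(delimiter) {}

    // Pass by reference: the line is copied once into the member buffer,
    // which keeps its capacity between calls.
    void set_line(const std::string& line)
    {
        m_line = line;
        m_pos = 0;
    }

    // Writes the next token into a caller-supplied string instead of
    // returning a new std::string, so the caller's buffer can be reused.
    // Returns false when no tokens are left.
    bool next_token(std::string& token)
    {
        if (m_pos > m_line.size())
            return false;
        std::size_t end = m_line.find(m_delimiter, m_pos);
        if (end == std::string::npos)
            end = m_line.size();
        token.assign(m_line, m_pos, end - m_pos);
        m_pos = end + 1;
        return true;
    }

private:
    char m_delimiter;
    std::string m_line;     // cached in a data member and reused for every line
    std::size_t m_pos = 0;  // start of the next token within m_line
};
```

A caller would declare one std::string token outside its loop and pass it to next_token() repeatedly, so that no temporary string is created per field.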

Clearing the Misconception

Using this class alone does not reduce code bloat in your application. The reduction only comes about when all other fstream, stringstream and cout/cin calls are removed or replaced with non-STL-stream equivalents.

Breaking Changes

If you overloaded the STL stream operators, rather than the CSV stream operators, for your custom data type, this class is not a drop-in replacement for MiniCSV. You have to overload the CSV stream operators instead.
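
The following sketch shows what overloading a CSV stream operator for a custom type could look like. To keep it compilable on its own, it uses a tiny stand-in class, demo::ocsvstream, which is hypothetical and is not the capi::csv class; the Product struct is likewise made up for illustration. With the real library, the same pattern would presumably target the capi::csv stream types.

```cpp
#include <string>

// Hypothetical stand-in for a CSV output stream, used only so that this
// sketch compiles by itself. It is NOT the capi::csv class.
namespace demo
{
    class ocsvstream
    {
    public:
        ocsvstream& operator<<(const std::string& s) { append(s); return *this; }
        ocsvstream& operator<<(int v) { append(std::to_string(v)); return *this; }
        const std::string& text() const { return m_text; }

    private:
        void append(const std::string& field)
        {
            if (!m_first)
                m_text += ',';   // separate fields with a comma
            m_text += field;
            m_first = false;
        }
        std::string m_text;
        bool m_first = true;
    };
}

// Hypothetical custom data type.
struct Product
{
    std::string name;
    int qty;
};

// Overload the operator for the CSV stream type, not for std::ostream.
demo::ocsvstream& operator<<(demo::ocsvstream& os, const Product& p)
{
    return os << p.name << p.qty;
}
```

An overload written for std::ostream would not be picked up here, because the CSV stream type is unrelated to the STL stream classes.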

Optional Dependencies

Boost Spirit Qi v2

To use Boost Spirit Qi for string to data conversion, define USE_BOOST_SPIRIT_QI before the header inclusion.

C++
#define USE_BOOST_SPIRIT_QI
#include "csv_stream.h"

To read char as an ASCII character rather than as an integer, define CHAR_AS_ASCII before the header inclusion.

C++
#define CHAR_AS_ASCII
#include "csv_stream.h"

Warning: Detection of this macro was removed in v0.5.2 because it is a global setting. To read or write a char as a numeric 8-bit integer, use the NChar class. Use os << csv::NChar(ch) for writing (alternatively, cast the char to int without using NChar), and is >> csv::NChar(ch) for reading an integer in the range -128 to 127 into a char variable.
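
The difference between treating a char as a character and treating it as a small integer can be sketched with the standard library alone. The helper names below are hypothetical and this is not how NChar is implemented; the sketch only shows the two interpretations of the same byte.

```cpp
#include <string>

// Writes the char as a one-character text field, e.g. 'A' -> "A".
std::string char_as_ascii(char ch)
{
    return std::string(1, ch);
}

// Writes the char as a small integer, e.g. 'A' -> "65"
// (assuming an ASCII execution character set).
std::string char_as_number(char ch)
{
    return std::to_string(static_cast<int>(ch));
}

// Reads a numeric field back into a char, e.g. "56" -> char with value 56.
// std::stoi throws std::invalid_argument if the field is not numeric.
char number_to_char(const std::string& field)
{
    return static_cast<char>(std::stoi(field));
}
```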

Benchmark

Note: Benchmark results are based on the latest minicsv, v1.8.2.
Note: The various conversion methods affect only the input stream benchmark results.

File Stream Benchmark

C++
      // minicsv using std::stringstream
      mini::csv::ofstream:  387ms
      mini::csv::ifstream:  386ms
      // minicsv using Boost lexical_cast
      mini::csv::ofstream:  405ms
      mini::csv::ifstream:  283ms
      // capi csv using to_string
      capi::csv::ofstream:  152ms
      capi::csv::ifstream:  279ms
      // capi csv using Boost Spirit Qi
      capi::csv::ofstream:  163ms
      capi::csv::ifstream:  266ms
      // capi in-memory cached file csv
capi::csv::ocachedfstream:  124ms
capi::csv::icachedfstream:  127ms
      // capi in-memory cached file csv using Boost Spirit Qi
capi::csv::ocachedfstream:  122ms
capi::csv::icachedfstream:  100ms

Note: In-memory input stream means loading the whole file in memory before processing.
Note: In-memory output stream means keeping the contents in memory before saving.
Caution: In-memory streams require sufficient memory to hold the entire file contents.
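
To make the idea of an in-memory cached file concrete, here is a minimal sketch that loads a whole file into a std::string, and saves a buffered string in one call, using only the C file API. The function names load_file and save_file are hypothetical and are not part of the library; the library's own cached streams may work differently.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Reads the whole file into 'contents' using the C file API.
// Returns false if the file cannot be opened or fully read.
// Seeking to the end to find the size works on common platforms.
bool load_file(const char* path, std::string& contents)
{
    std::FILE* fp = std::fopen(path, "rb");
    if (!fp)
        return false;

    bool ok = false;
    if (std::fseek(fp, 0, SEEK_END) == 0)
    {
        const long size = std::ftell(fp);
        if (size >= 0 && std::fseek(fp, 0, SEEK_SET) == 0)
        {
            contents.resize(static_cast<std::size_t>(size));
            std::size_t read = 0;
            if (size > 0)
                read = std::fread(&contents[0], 1, contents.size(), fp);
            ok = (read == contents.size());
        }
    }
    std::fclose(fp);
    return ok;
}

// Writes the buffered text to disk in a single fwrite call.
bool save_file(const char* path, const std::string& contents)
{
    std::FILE* fp = std::fopen(path, "wb");
    if (!fp)
        return false;
    const std::size_t written =
        std::fwrite(contents.data(), 1, contents.size(), fp);
    const bool ok = (written == contents.size());
    return (std::fclose(fp) == 0) && ok;
}
```

Parsing then happens entirely on the in-memory string, which is why memory use grows with file size.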

String Stream Benchmark

C++
// minicsv using std::stringstream
mini::csv::ostringstream:  362ms
mini::csv::istringstream:  377ms
// minicsv using Boost lexical_cast
mini::csv::ostringstream:  383ms
mini::csv::istringstream:  283ms
// capi csv
capi::csv::ostringstream:  113ms
capi::csv::istringstream:  127ms
// capi csv using Boost Spirit Qi
capi::csv::ostringstream:  116ms
capi::csv::istringstream:  106ms

Caveat

Instantiation can be slow because there are many data members to initialize.

Sample Code for File Stream

C++
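#include <iostream>
#include <stdexcept>
#include <string>
#include "csv_stream.h"

using namespace capi;

// The statements below are meant to run inside a function such as main().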

csv::ofstream os("products.txt");
os.set_delimiter(',', "$$");
os.enable_surround_quote_on_str(true, '\"');
if (os.is_open())
{
    os << "Shampoo" << 200 << 15.0f << NEWLINE;
    os << "Towel" << 300 << 6.0f << NEWLINE;
}
os.flush();
os.close();

csv::ifstream is("products.txt");
is.set_delimiter(',', "$$");
is.enable_trim_quote_on_str(true, '\"');

if (is.is_open())
{
    std::string name = "";
    int qty = 0;
    float price = 0.0f;
    while (is.read_line())
    {
        try
        {
            is >> name >> qty >> price;
            // display the read items
            std::cout << name << "," << qty 
                      << "," << price << std::endl;
        }
        catch (std::runtime_error& e)
        {
            std::cerr << e.what() << std::endl;
        }
    }
}

Sample Code for Cached File Stream

C++
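#include <iostream>
#include <stdexcept>
#include <string>
#include "csv_stream.h"

using namespace capi;

// The statements below are meant to run inside a function such as main().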

csv::ocachedfstream os;
os.set_delimiter(',', "$$");
os.enable_surround_quote_on_str(true, '\"');
if (os.is_open())
{
    os << "Shampoo" << 200 << 15.0f << NEWLINE;
    os << "Towel" << 300 << 6.0f << NEWLINE;
}
os.write_to_file("products.txt");

csv::icachedfstream is("products.txt");
is.set_delimiter(',', "$$");
is.enable_trim_quote_on_str(true, '\"');

if (is.is_open())
{
    std::string name = "";
    int qty = 0;
    float price = 0.0f;
    while (is.read_line())
    {
        try
        {
            is >> name >> qty >> price;
            // display the read items
            std::cout << name << "," << qty 
                      << "," << price << std::endl;
        }
        catch (std::runtime_error& e)
        {
            std::cerr << e.what() << std::endl;
        }
    }
}

Sample Code for String Stream

C++
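#include <iostream>
#include <stdexcept>
#include <string>
#include "csv_stream.h"

using namespace capi;

// The statements below are meant to run inside a function such as main().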

csv::ostringstream os;
os.set_delimiter(',', "$$");
os.enable_surround_quote_on_str(true, '\"');
if (os.is_open())
{
    os << "Shampoo" << 200 << 15.0f << NEWLINE;
    os << "Towel" << 300 << 6.0f << NEWLINE;
}
os.write_to_file("products.txt");

csv::istringstream is(os.get_text().c_str());
is.set_delimiter(',', "$$");
is.enable_trim_quote_on_str(true, '\"');

if (is.is_open())
{
    std::string name = "";
    int qty = 0;
    float price = 0.0f;
    while (is.read_line())
    {
        try
        {
            is >> name >> qty >> price;
            // display the read items
            std::cout << name << "," << qty 
                      << "," << price << std::endl;
        }
        catch (std::runtime_error& e)
        {
            std::cerr << e.what() << std::endl;
        }
    }
}

Output

File content:

"Shampoo",200,15.000000
"Towel",300,6.000000

Display output:

Shampoo,200,15
Towel,300,6

Change Delimiter on the Fly

The delimiter can be changed on the fly on an input or output stream with the sep class. In the example, the text uses both whitespace and a comma as delimiters.

C++
// demo sep class usage
csv::istringstream is("vt 37.8,44.32,75.1");
is.set_delimiter(' ', "$$");
csv::sep space(' ', "<space>");
csv::sep comma(',', "<comma>");
while (is.read_line())
{
    std::string type;
    float r = 0, b = 0, g = 0;
    is >> space >> type >> comma >> r >> b >> g;
    // display the read items
    std::cout << type << "|" << r << "|" << b << "|" << g << std::endl;
}
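
The idea behind switching delimiters between fields can be sketched without the library. The split_with function below is hypothetical: it takes one delimiter per field boundary, which is a simplification of what a stream manipulator such as sep does, and it ignores quoting and escaping.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits 'line' into fields, where delimiters[i] terminates field i.
// The final field runs to the end of the line.
std::vector<std::string> split_with(const std::string& line,
                                    const std::string& delimiters)
{
    std::vector<std::string> fields;
    std::size_t pos = 0;
    for (char delimiter : delimiters)
    {
        const std::size_t end = line.find(delimiter, pos);
        if (end == std::string::npos)
            break; // delimiter not found: the rest of the line is the last field
        fields.push_back(line.substr(pos, end - pos));
        pos = end + 1;
    }
    fields.push_back(line.substr(pos));
    return fields;
}
```

Calling split_with("vt 37.8,44.32,75.1", " ,,") splits the first field on a space and the remaining ones on commas, giving the four fields vt, 37.8, 44.32 and 75.1.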

The code is hosted on GitHub.

History

  • 28th January, 2017: Version 0.5.0: First release
  • 19th February, 2017: Version 0.5.1: Fix Input Stream exception while reading char
  • 12th March, 2017: Version 0.5.2: Fixed some char output problems and added the NChar (char wrapper) class to read and write a char variable as a numeric value in the range [-128..127].
    C++
    bool test_nchar(bool enable_quote)
    {
        csv::ostringstream os;
        os.set_delimiter(',', "$$");
        os.enable_surround_quote_on_str(enable_quote, '\"');
    
        os << "Wallet" << 56 << NEWLINE;
    
        csv::istringstream is(os.get_text().c_str());
        is.set_delimiter(',', "$$");
        is.enable_trim_quote_on_str(enable_quote, '\"');
    
        while (is.read_line())
        {
            try
            {
                std::string dest_name = "";
                char dest_char = 0;
    
                is >> dest_name >> csv::NChar(dest_char);
    
                std::cout << dest_name << ", " 
                    << (int)dest_char << std::endl;
            }
            catch (std::runtime_error& e)
            {
                std::cerr << __FUNCTION__ << e.what() << std::endl;
            }
        }
        return true;
    }

    Display output:

    Wallet, 56
  • 18th September, 2017: Version 0.5.3:

    If your escape parameter in set_delimiter() is empty, text with delimiter will be automatically enclosed in quotes (to be compliant with Microsoft Excel and general CSV practice)

    JavaScript
    "Hello,World",600

    Microsoft Excel and CSV Stream read this as "Hello,World" and 600.

  • 12th August, 2018: Version 0.5.4: Added overloaded file open functions that take a wide char file parameter (only available on Win32)
  • 21st February, 2021: Version 0.5.4e: Fixed infinite loop in quote_unescape.
  • 6th May, 2021: Version 0.5.5: CSV Stream detects the end of a line by the presence of a newline, so a newline inside a string field inevitably broke the parsing. This version takes care of such newlines by escaping them.
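
The automatic quoting described under version 0.5.3 can be pictured with a short sketch. The function quote_if_needed is a hypothetical name, and doubling embedded quote characters follows the common RFC 4180 convention; that part is an assumption and may not match how CSV Stream itself handles quotes.

```cpp
#include <string>

// Returns the field as it would be written to a CSV row: quoted only
// when it contains the delimiter or a quote character.
std::string quote_if_needed(const std::string& field, char delimiter)
{
    const bool needs_quotes =
        field.find(delimiter) != std::string::npos ||
        field.find('"') != std::string::npos;
    if (!needs_quotes)
        return field;

    std::string quoted = "\"";
    for (char ch : field)
    {
        if (ch == '"')
            quoted += '"'; // assumption: double embedded quotes (RFC 4180 style)
        quoted += ch;
    }
    quoted += '"';
    return quoted;
}
```

With a comma delimiter, the text Hello,World is written with surrounding quotes, while 600 is written unchanged.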

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
Singapore
Shao Voon is from Singapore. His interest lies primarily in computer graphics, software optimization, concurrency, security, and Agile methodologies.

In recent years, he shifted focus to software safety research. His hobby is writing a free C++ DirectX photo slideshow application which can be viewed here.
