C++14: CSV Stream based on C File API






C++14: CSV Stream based on C File API to remove code bloat from STL File Streams
Table of Contents
- Primary Motivation
- Clearing the Misconception
- Breaking Changes
- Optional Dependencies
- Benchmark
- Caveat
- Sample Code for File Stream
- Sample Code for Cached File Stream
- Sample Code for String Stream
- Output
- Change Delimiter on the Fly
- History
Primary Motivation
The library is based on the C File API. Its purpose is to reduce the code bloat brought on by the use of C++ STL streams. Its usage is similar to Minimalistic CSV Stream, which is based on C++ File Streams and is likewise a header-only library: just change the namespace from mini to capi. Some of its optimizations have been back-ported to Minimalistic CSV Stream version 1.8.3, including passing by reference whenever possible, caching results in data members, and avoiding operations that return a new string object. Readers can diff v1.8.2 against v1.8.3 to see the changes.
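To make the "C File API instead of STL streams" idea concrete, here is a minimal sketch of writing and reading lines with fopen, fputs and fgets only. The helper names write_lines and read_lines are hypothetical and are not part of the library; the library's own classes add delimiter handling, quoting and type conversion on top of this kind of I/O.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helpers for illustration only; not the library's code.

// Write each string as one line using fopen/fputs/fputc/fclose.
inline bool write_lines(const char* path, const std::vector<std::string>& lines)
{
    std::FILE* fp = std::fopen(path, "wb");
    if (!fp) return false;
    bool ok = true;
    for (const std::string& line : lines)
    {
        if (std::fputs(line.c_str(), fp) < 0 || std::fputc('\n', fp) == EOF)
        {
            ok = false;
            break;
        }
    }
    if (std::fclose(fp) != 0) ok = false;
    return ok;
}

// Read the file back line by line using fgets, dropping the trailing newline.
inline std::vector<std::string> read_lines(const char* path)
{
    std::vector<std::string> lines;
    std::FILE* fp = std::fopen(path, "rb");
    if (!fp) return lines;
    char buf[256];
    std::string current;
    while (std::fgets(buf, sizeof(buf), fp))
    {
        current += buf; // a long line may arrive in several chunks
        if (!current.empty() && current.back() == '\n')
        {
            current.pop_back();
            lines.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) lines.push_back(current); // last line without newline
    std::fclose(fp);
    return lines;
}
```

Neither function pulls in fstream or iostream, which is the point of the exercise: the bloat only goes away if no STL stream header is used anywhere in the program.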
Clearing the Misconception
Using this class alone does not reduce the code bloat in your application. That only comes about when all other fstream, stringstream and cout/cin calls are removed or replaced with non-STL-stream equivalents.
Breaking Changes
If you overloaded the STL stream operators, rather than the CSV stream operators, for your custom data type, this class cannot serve as a drop-in replacement for MiniCSV. You have to overload the CSV stream operators instead.
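The sketch below shows what overloading the operator for the CSV stream type, rather than for std::ostream, looks like. Since the library header is not reproduced here, toy_csv_ostream is a hypothetical stand-in that only joins fields with a delimiter, and Product is an invented example type. With the real library, the overload would presumably be written against its stream classes; check the header for the exact signatures.

```cpp
#include <string>

// Hypothetical stand-in for a CSV output stream; NOT the library's class.
class toy_csv_ostream
{
public:
    explicit toy_csv_ostream(char delimiter)
        : delimiter_(delimiter), at_line_start_(true) {}

    // Append one field, preceded by the delimiter unless it starts a line.
    void write_field(const std::string& field)
    {
        if (!at_line_start_) text_ += delimiter_;
        text_ += field;
        at_line_start_ = false;
    }
    void end_line() { text_ += '\n'; at_line_start_ = true; }
    const std::string& get_text() const { return text_; }

private:
    std::string text_;
    char delimiter_;
    bool at_line_start_;
};

inline toy_csv_ostream& operator<<(toy_csv_ostream& os, const std::string& s)
{
    os.write_field(s);
    return os;
}

inline toy_csv_ostream& operator<<(toy_csv_ostream& os, int v)
{
    os.write_field(std::to_string(v));
    return os;
}

// Invented custom type for the example.
struct Product
{
    std::string name;
    int qty;
};

// The overload targets the CSV stream type, so each member becomes its own
// delimited field. An std::ostream overload would not be picked up here.
inline toy_csv_ostream& operator<<(toy_csv_ostream& os, const Product& p)
{
    return os << p.name << p.qty;
}
```

Writing a Product through this overload yields one field per member, which is what a CSV stream needs and what an std::ostream overload cannot provide.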
Optional Dependencies
Boost Spirit Qi v2
To use Boost Spirit Qi for string to data conversion, define USE_BOOST_SPIRIT_QI before the header inclusion.
#define USE_BOOST_SPIRIT_QI
#include "csv_stream.h"
To read char as ASCII rather than as an integer, define CHAR_AS_ASCII before the header inclusion.
#define CHAR_AS_ASCII
#include "csv_stream.h"
Warning: Detection of this macro was removed in v0.5.2 because it is a global setting. To read or write char as a numeric 8-bit integer, use the NChar class. Use os << csv::NChar(ch) for writing (alternatively, cast the char to int without using NChar), and is >> csv::NChar(ch) for reading integers in the range -128 to 127 into a char variable.
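The wrapper idea can be sketched independently of the library. In the sketch below, NumericChar, to_field and from_field are hypothetical names standing in for NChar and the stream operators, and the range check assumes a signed 8-bit char; the library's actual implementation may differ.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the idea behind NChar: a thin wrapper that tells
// the conversion code to treat the char as a number instead of a character.
struct NumericChar
{
    explicit NumericChar(char& c) : ref(c) {}
    char& ref;
};

// A plain char is written as the character itself.
inline std::string to_field(char c)
{
    return std::string(1, c);
}

// A wrapped char is written as its numeric value.
inline std::string to_field(const NumericChar& n)
{
    return std::to_string(static_cast<int>(static_cast<signed char>(n.ref)));
}

// Parse a numeric field into the wrapped char.
// Assumption: the valid range is that of a signed 8-bit integer.
inline void from_field(const std::string& field, NumericChar n)
{
    const int value = std::stoi(field);
    if (value < -128 || value > 127)
        throw std::runtime_error("value does not fit in a char");
    n.ref = static_cast<char>(value);
}
```

Because the wrapper is opted into per value, char fields and numeric 8-bit fields can be mixed in the same row, which a global macro cannot offer.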
Benchmark
Note: Benchmark results are based on the latest minicsv v1.8.2.
Note: The choice of conversion method only affects the input stream benchmark results.
File Stream Benchmark
// minicsv using std::stringstream
mini::csv::ofstream: 387ms
mini::csv::ifstream: 386ms
// minicsv using Boost lexical_cast
mini::csv::ofstream: 405ms
mini::csv::ifstream: 283ms
// capi csv using to_string
capi::csv::ofstream: 152ms
capi::csv::ifstream: 279ms
// capi csv using Boost Spirit Qi
capi::csv::ofstream: 163ms
capi::csv::ifstream: 266ms
// capi in-memory cached file csv
capi::csv::ocachedfstream: 124ms
capi::csv::icachedfstream: 127ms
// capi in-memory cached file csv using Boost Spirit Qi
capi::csv::ocachedfstream: 122ms
capi::csv::icachedfstream: 100ms
Note: In-memory input stream means loading the whole file in memory before processing.
Note: In-memory output stream means keeping the contents in memory before saving.
Caution: In-memory streams require sufficient memory to hold the file contents.
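The two notes above describe a simple pattern: read the whole file into one buffer before parsing, and build the whole output in memory before a single save. A minimal sketch of that pattern with the C File API follows; load_file and save_file are hypothetical names, not the library's functions.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helpers for illustration only; not the library's code.

// Load the entire file into a string before any parsing takes place.
inline bool load_file(const char* path, std::string& contents)
{
    contents.clear();
    std::FILE* fp = std::fopen(path, "rb");
    if (!fp) return false;
    char buf[4096];
    std::size_t n = 0;
    while ((n = std::fread(buf, 1, sizeof(buf), fp)) > 0)
        contents.append(buf, n);
    const bool ok = std::ferror(fp) == 0;
    std::fclose(fp);
    return ok;
}

// Keep the output in a string and save it with one fwrite call.
inline bool save_file(const char* path, const std::string& contents)
{
    std::FILE* fp = std::fopen(path, "wb");
    if (!fp) return false;
    bool ok = std::fwrite(contents.data(), 1, contents.size(), fp) == contents.size();
    if (std::fclose(fp) != 0) ok = false;
    return ok;
}
```

The trade-off is the one stated in the caution: the whole file lives in memory at once, in exchange for fewer I/O calls.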
String Stream Benchmark
// minicsv using std::stringstream
mini::csv::ostringstream: 362ms
mini::csv::istringstream: 377ms
// minicsv using Boost lexical_cast
mini::csv::ostringstream: 383ms
mini::csv::istringstream: 283ms
// capi csv
capi::csv::ostringstream: 113ms
capi::csv::istringstream: 127ms
// capi csv using Boost Spirit Qi
capi::csv::ostringstream: 116ms
capi::csv::istringstream: 106ms
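For readers who want to run a comparison of their own, the sketch below times two integer-to-string conversions, one through std::ostringstream and one through std::to_string. It is not the benchmark that produced the figures above; the function names and the loop are assumptions made for illustration, and results will vary by compiler and machine.

```cpp
#include <chrono>
#include <cstddef>
#include <sstream>
#include <string>

// Conversion through an STL string stream.
inline std::string int_via_stringstream(int value)
{
    std::ostringstream oss;
    oss << value;
    return oss.str();
}

// Conversion through std::to_string.
inline std::string int_via_to_string(int value)
{
    return std::to_string(value);
}

// Time `iterations` calls of `convert` and return the elapsed milliseconds.
template <typename Convert>
long long time_ms(Convert convert, int iterations)
{
    const auto start = std::chrono::steady_clock::now();
    std::size_t total = 0;
    for (int i = 0; i < iterations; ++i)
        total += convert(i).size(); // use the result so the call is not dropped
    const auto stop = std::chrono::steady_clock::now();
    volatile std::size_t sink = total; // keep `total` observable
    (void)sink;
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

A call such as time_ms(int_via_to_string, 1000000) gives one data point; repeat the run several times and build with optimizations enabled before drawing conclusions.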
Caveat
Instantiation can be slow because there are many data members to initialize.
Sample Code for File Stream
#include <iostream>
#include <stdexcept>
#include <string>
#include "csv_stream.h"
using namespace capi;

// Write two rows to the file
csv::ofstream os("products.txt");
os.set_delimiter(',', "$$");
os.enable_surround_quote_on_str(true, '\"');
if (os.is_open())
{
    os << "Shampoo" << 200 << 15.0f << NEWLINE;
    os << "Towel" << 300 << 6.0f << NEWLINE;
}
os.flush();
os.close();

// Read the rows back
csv::ifstream is("products.txt");
is.set_delimiter(',', "$$");
is.enable_trim_quote_on_str(true, '\"');
if (is.is_open())
{
    std::string name = "";
    int qty = 0;
    float price = 0.0f;
    while (is.read_line())
    {
        try
        {
            is >> name >> qty >> price;
            // display the read items
            std::cout << name << "," << qty
                << "," << price << std::endl;
        }
        catch (const std::runtime_error& e)
        {
            std::cerr << e.what() << std::endl;
        }
    }
}
Sample Code for Cached File Stream
#include <iostream>
#include <stdexcept>
#include <string>
#include "csv_stream.h"
using namespace capi;

// Build the rows in memory, then save them in one go
csv::ocachedfstream os;
os.set_delimiter(',', "$$");
os.enable_surround_quote_on_str(true, '\"');
if (os.is_open())
{
    os << "Shampoo" << 200 << 15.0f << NEWLINE;
    os << "Towel" << 300 << 6.0f << NEWLINE;
}
os.write_to_file("products.txt");

// Load the whole file into memory, then parse it
csv::icachedfstream is("products.txt");
is.set_delimiter(',', "$$");
is.enable_trim_quote_on_str(true, '\"');
if (is.is_open())
{
    std::string name = "";
    int qty = 0;
    float price = 0.0f;
    while (is.read_line())
    {
        try
        {
            is >> name >> qty >> price;
            // display the read items
            std::cout << name << "," << qty
                << "," << price << std::endl;
        }
        catch (const std::runtime_error& e)
        {
            std::cerr << e.what() << std::endl;
        }
    }
}
Sample Code for String Stream
#include <iostream>
#include <stdexcept>
#include <string>
#include "csv_stream.h"
using namespace capi;

// Write two rows into the string stream
csv::ostringstream os;
os.set_delimiter(',', "$$");
os.enable_surround_quote_on_str(true, '\"');
if (os.is_open())
{
    os << "Shampoo" << 200 << 15.0f << NEWLINE;
    os << "Towel" << 300 << 6.0f << NEWLINE;
}
os.write_to_file("products.txt");

// Parse the text held by the output stream
csv::istringstream is(os.get_text().c_str());
is.set_delimiter(',', "$$");
is.enable_trim_quote_on_str(true, '\"');
if (is.is_open())
{
    std::string name = "";
    int qty = 0;
    float price = 0.0f;
    while (is.read_line())
    {
        try
        {
            is >> name >> qty >> price;
            // display the read items
            std::cout << name << "," << qty
                << "," << price << std::endl;
        }
        catch (const std::runtime_error& e)
        {
            std::cerr << e.what() << std::endl;
        }
    }
}
Output
File content:
"Shampoo",200,15.000000
"Towel",300,6.000000
Display output:
Shampoo,200,15
Towel,300,6
Change Delimiter on the Fly
The delimiter can be changed on the fly on an input/output stream with the sep class. The example below has both whitespace and a comma as delimiters in the text.
// demo sep class usage
csv::istringstream is("vt 37.8,44.32,75.1");
is.set_delimiter(' ', "$$");
csv::sep space(' ', "<space>");
csv::sep comma(',', "<comma>");
while (is.read_line())
{
    std::string type;
    float r = 0, b = 0, g = 0;
    is >> space >> type >> comma >> r >> b >> g;
    // display the read items
    std::cout << type << "|" << r << "|" << b << "|" << g << std::endl;
}
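One way to picture what switching the delimiter mid-line means is a cursor that takes the delimiter as an argument on every read. The class below is a hypothetical sketch of that idea, not how sep is implemented in the library, and it ignores quoting and escaping.

```cpp
#include <cstddef>
#include <string>

// Hypothetical cursor-style tokenizer for illustration only.
class field_reader
{
public:
    explicit field_reader(const std::string& line) : line_(line), pos_(0) {}

    // Return the text up to the next occurrence of `delimiter` (or the end
    // of the line) and advance past it. Returns "" once the line is consumed.
    std::string next(char delimiter)
    {
        if (pos_ > line_.size()) return std::string();
        std::size_t end = line_.find(delimiter, pos_);
        if (end == std::string::npos) end = line_.size();
        std::string field = line_.substr(pos_, end - pos_);
        pos_ = end + 1;
        return field;
    }

private:
    std::string line_;
    std::size_t pos_;
};
```

For the line "vt 37.8,44.32,75.1", calling next(' ') once and then next(',') three times yields the same four fields that the sep example extracts.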
The code is hosted at GitHub.
History
- 28th January, 2017: Version 0.5.0: First release
- 19th February, 2017: Version 0.5.1: Fixed input stream exception while reading char
- 12th March, 2017: Version 0.5.2: Fixed some char output problems and added the NChar (char wrapper) class to write numeric values [-128..127] to char variables.
  bool test_nchar(bool enable_quote)
  {
      csv::ostringstream os;
      os.set_delimiter(',', "$$");
      os.enable_surround_quote_on_str(enable_quote, '\"');
      os << "Wallet" << 56 << NEWLINE;

      csv::istringstream is(os.get_text().c_str());
      is.set_delimiter(',', "$$");
      is.enable_trim_quote_on_str(enable_quote, '\"');
      while (is.read_line())
      {
          try
          {
              std::string dest_name = "";
              char dest_char = 0;
              is >> dest_name >> csv::NChar(dest_char);
              std::cout << dest_name << ", " << (int)dest_char << std::endl;
          }
          catch (std::runtime_error& e)
          {
              std::cerr << __FUNCTION__ << e.what() << std::endl;
          }
      }
      return true;
  }
  Display output:
  Wallet, 56
- 18th September, 2017: Version 0.5.3: If the escape parameter in set_delimiter() is empty, text containing the delimiter is automatically enclosed in quotes (to be compliant with Microsoft Excel and general CSV practice).
  "Hello,World",600
  Microsoft Excel and CSV Stream read this as "Hello,World" and 600.
- 12th August, 2018: Version 0.5.4: Added overloaded file open functions that take a wide char file parameter (only available on Win32)
- 21st February, 2021: Version 0.5.4e: Fixed infinite loop in quote_unescape.
- 6th May, 2021: Version 0.5.5: CSV Stream detects the end of a line by the presence of a newline, so a newline inside a string input inevitably broke the parsing. This version takes care of newlines by escaping them.
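The automatic quoting described for version 0.5.3 can be sketched as a small function. quote_if_needed is a hypothetical name, and doubling embedded quote characters is the general CSV convention (RFC 4180) rather than something the article confirms about the library. How version 0.5.5 escapes newlines is not stated here, so the sketch leaves that out.

```cpp
#include <string>

// Hypothetical helper for illustration only; not the library's code.
// Enclose the field in double quotes when it contains the delimiter or a
// quote character; otherwise return it unchanged.
inline std::string quote_if_needed(const std::string& field, char delimiter)
{
    const bool needs_quotes =
        field.find(delimiter) != std::string::npos ||
        field.find('"') != std::string::npos;
    if (!needs_quotes) return field;

    std::string out = "\"";
    for (char c : field)
    {
        if (c == '"') out += '"'; // assumption: double embedded quotes (RFC 4180)
        out += c;
    }
    out += '"';
    return out;
}
```

Applied to the text Hello,World with a comma delimiter, the function produces the quoted form shown in the 0.5.3 entry, while a field such as Towel passes through untouched.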