randolf.ca  1.00
Randolf Richardson's C++ classes
randolf::Atomize Class Reference

The Atomize class provides an object-oriented interface with array-style access to a string that has been efficiently separated into atoms, with more granularity and functionality available through modes (see mode for details) and certain API methods. Hence one could say "Atomize can split your atoms safely."

#include <Atomize>


Public Types

enum  ATOMIZE_FLAGS : int {
  ATOMIZE_DEFAULT = 0,
  ATOMIZE_USE_ALL_QUOTES = 1,
  ATOMIZE_IGNORE_QUOTES = 2,
  ATOMIZE_DELETE_QUOTES = 4
}
 Optional flags that alter, modify, or enhance the operation of atomization intake.
 

Public Member Functions

 Atomize (const char *intake, const int len, const int flags, const char mode)
 Instantiate an Atomize object using the specified ASCIIZ string for intake.
 
 Atomize (const char *intake, const int len=-1, const int flags=ATOMIZE_DEFAULT, const char *mode=nullptr)
 Instantiate an Atomize object using the specified ASCIIZ string for intake.
 
 Atomize (const int flags, const char mode) noexcept
 Instantiate an empty Atomize object, which is expected to be used with the assign method at some later point. (This is particularly useful for defining a local Atomize object in a header file in a way that won't throw an exception, including invalid mode codes {which will just be ignored}.)
 
 Atomize (const int flags=ATOMIZE_DEFAULT, const char *mode=nullptr) noexcept
 Instantiate an empty Atomize object, which is expected to be used with the assign method at some later point. (This is particularly useful for defining a local Atomize object in a header file in a way that won't throw an exception, including invalid mode codes {which will just be ignored}.)
 
 Atomize (const std::string intake, const int len, const int flags, const char mode)
 Instantiate an Atomize object using the specified string for intake.
 
 Atomize (const std::string intake, const int len=-1, const int flags=ATOMIZE_DEFAULT, const char *mode=nullptr)
 Instantiate an Atomize object using the specified string for intake.
 
 ~Atomize () noexcept
 Destructor.
 
Atomize * assign (const char *intake, const int len=-1)
 Assign (and interpret) a new ASCIIZ string (flags and modes are inherited).
 
Atomize * assign (const std::string intake, const int len=-1)
 Assign (and interpret) a new string (flags and modes are inherited).
 
std::string at (int index, const char *mode=nullptr)
 Access to atoms, whilst utilizing the operator mode that was configured using the mode method. Return an entire atom.
 
Atomize * clear ()
 Clear this Atomize's underlying data and reset all states. This does not reset nor alter flags or modes.
 
bool empty ()
 Confirm that there are no atoms.
 
int flags ()
 Obtain current set of internal flags.
 
Atomize * flags (const int flags)
 Set the internal flags.
 
std::string get (int index)
 Return the entire atom.
 
std::string get_key (int index)
 Return the key portion of an atom, or the entire atom if a key-value pair wasn't detected.
 
std::string get_value (int index)
 Return the value portion of an atom, or an empty string if a key-value pair wasn't detected.
 
bool has_kv (int index)
 Indicates whether the specified atom was split into a key-value pair (if it was, then the key and the value are delimited by the first instance of an equal sign {=}).
 
std::string mode () noexcept
 Get the operator modes that are set for the operator[] operator.
 
Atomize * mode (const char *mode)
 Set the operator modes for use with the operator[] operator (modes that are not specified will be reset to their defaults).
 
Atomize * mode (const char mode)
 Set the operator modes for use with the operator[] operator (modes that are not specified will be reset to their defaults).
 
std::string operator[] (int index)
 Array-style access to atoms, whilst utilizing the operator mode that was configured using the mode method.
 
size_t size ()
 Return the total quantity of atoms.
 
std::vector< std::string > to_vector (bool split_kv_pairs=false) noexcept
 Generate an std::vector<std::string> that contains all atoms.
 

Detailed Description

The Atomize class provides an object-oriented interface with array-style access to a string that has been efficiently separated into atoms, with more granularity and functionality available through modes (see mode for details) and certain API methods. Hence one could say "Atomize can split your atoms safely."

When parsing a line or block of text, the following is assumed:

  • parameters are separated by one or multiple consecutive whitespace characters (space, null, tab, linefeed, carriage return)
  • values that are enclosed within a set of quotation marks may include whitespace characters that will then not be interpreted as delimiters
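The two assumptions above can be illustrated with a minimal, single-pass sketch. This is not the library's implementation: `split_atoms` is a hypothetical helper, it assumes that only the double-quote character acts as a quotation mark, and it keeps the quotation marks in the resulting atom.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (NOT part of randolf::Atomize): splits text into atoms
// on whitespace, treating double-quoted spans as regions that are not split.
// Assumptions: only '"' is a quotation mark, and quotation marks are kept.
std::vector<std::string> split_atoms(std::string_view text) {
  std::vector<std::string> atoms;
  std::string current;
  bool in_quotes = false;
  for (const char c : text) {
    // Whitespace set taken from the list above: space, null, tab, LF, CR
    const bool is_ws = (c == ' ' || c == '\0' || c == '\t' || c == '\n' || c == '\r');
    if (c == '"') in_quotes = !in_quotes; // toggle the grouping state
    if (is_ws && !in_quotes) {            // delimiter outside of quotes
      if (!current.empty()) {             // consecutive delimiters are collapsed
        atoms.push_back(current);
        current.clear();
      }
    } else {
      current.push_back(c);               // regular character, or quoted whitespace
    }
  }
  if (!current.empty()) atoms.push_back(current); // flush the final atom
  return atoms;
}
```

With this sketch, the input `set  name="John Smith"` followed by a tab and `verbose` yields three atoms, and the quoted space does not act as a delimiter.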

Data is interpreted in a single pass during instantiation or assignment, and the interpretation algorithm is written in an optimized programming style to ensure high efficiency.

There are no memory leaks. This matters because this kind of specialized parsing is often used in heavy data-processing loops where speed and reliability are needed, and because some of the libraries I've tested that provide similar functionality leak memory or fail in other ways when given peculiar or fuzzed data. This class is not affected by those problems, primarily because I don't trust data to always be "as expected".

Use case

Parsing command lines or configuration settings can become challenging when multiple parameters are provided on a single line, and some of those parameters include quoted text that contains spaces. This class handles all of these scenarios and makes it easy to access each parameter in the same manner that arrays and vectors are accessed.

Background

I created this class to make it easier to write internet server daemons.

Getting started
Author
Randolf Richardson
Version
1.00
History
2025-Jan-20 v1.00 Initial version
Conventions
Lower-case letter "a" is regularly used in partial example code to represent an instantiated Atomize object.

An ASCIIZ string is a C-string (char* array) that includes a terminating null (0) character at the end.

Notes

I use the term "ASCIIZ string" to indicate an array of characters that's terminated by a 0 (a.k.a., null). Although this is very much the same as a C-string, the difference is that in many API functions a C-string must often be accompanied by its length value. When referring to an ASCIIZ string, I'm intentionally indicating that the length of the string is not needed because the string is null-terminated. (This term was also commonly used in assembly language programming in the 1970s, 1980s, and 1990s, and as far as I know is still used by machine language programmers today.)

Examples
#include <cstdlib>  // EXIT_SUCCESS
#include <iostream> // std::cout, std::cerr, std::endl, etc.
#include <randolf/Atomize>
int main(int argc, char *argv[]) {
  randolf::Atomize a("parameters key=value");
  std::cout << "atom0: " << a.at(0) << std::endl;
  std::cout << "atom1: " << a.at(1) << std::endl;
  std::cout << "key1: " << a.at(1, "k") << std::endl; // mode is a const char*, so "k" rather than 'k'
  std::cout << "val1: " << a.at(1, "v") << std::endl;
  return EXIT_SUCCESS;
} // -x- int main -x-

Parameter stacking is also supported (with methods that return Atomize*).
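A minimal sketch of what stacking looks like when setters return a pointer to the same object. `MiniSettings` and `configure` are hypothetical stand-ins written for this illustration; they are not randolf::Atomize and only mirror the pointer-returning pattern described above.

```cpp
#include <string>

// Hypothetical stand-in class (NOT randolf::Atomize) that only demonstrates
// the pointer-returning "stacking" pattern.
class MiniSettings {
  int flags_ = 0;
  std::string mode_;
  std::string text_;
public:
  // Setters return `this` so that further calls can be stacked with `->`
  MiniSettings* flags(const int f) { flags_ = f; return this; }
  MiniSettings* mode(const char* m) { mode_ = (m != nullptr) ? m : ""; return this; }
  MiniSettings* assign(const std::string& s) { text_ = s; return this; }
  // Getters
  int flags() const { return flags_; }
  std::string mode() const { return mode_; }
  std::string text() const { return text_; }
};

// Three settings applied in a single stacked expression.
inline MiniSettings* configure(MiniSettings& s) {
  return s.flags(4)->mode("k")->assign("key=value");
}
```

Because a pointer is returned rather than a reference, the first call uses `.` on the object and every subsequent call in the stack uses `->`.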

Member Enumeration Documentation

◆ ATOMIZE_FLAGS

Optional flags that alter, modify, or enhance the operation of atomization intake.

Enumerator
ATOMIZE_DEFAULT 

The ATOMIZE_DEFAULT flag isn't necessary, but it's included here for completeness as it accommodates programming styles that prefer to emphasize when defaults are being relied upon.

ATOMIZE_USE_ALL_QUOTES 

Interpret all quotation marks (the default is to only utilize enclosing quotation marks).

ATOMIZE_IGNORE_QUOTES 

Don't interpret quotation marks as grouping characters.

ATOMIZE_DELETE_QUOTES 

Delete quotation marks that function as grouping characters (this flag has no effect when ATOMIZE_IGNORE_QUOTES is set).
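Since the flag values are distinct powers of two and are passed as an int, they can presumably be combined with bitwise OR. The sketch below assumes exactly that; the `SKETCH_*` enumerators copy the documented values so that the code compiles without the library, and `has_flag` and `deletes_quotes` are hypothetical helpers.

```cpp
// Values copied from the ATOMIZE_FLAGS enumeration documented above; the enum
// is redeclared under different names only so this sketch is self-contained.
enum SKETCH_FLAGS : int {
  SKETCH_DEFAULT        = 0,
  SKETCH_USE_ALL_QUOTES = 1,
  SKETCH_IGNORE_QUOTES  = 2,
  SKETCH_DELETE_QUOTES  = 4
};

// Hypothetical helper: true if every bit of `flag` is set in `flags`.
constexpr bool has_flag(const int flags, const int flag) {
  return flag != 0 && (flags & flag) == flag;
}

// Hypothetical helper mirroring the documented rule that the DELETE_QUOTES
// flag has no effect while the IGNORE_QUOTES flag is set.
constexpr bool deletes_quotes(const int flags) {
  return has_flag(flags, SKETCH_DELETE_QUOTES) && !has_flag(flags, SKETCH_IGNORE_QUOTES);
}
```

For example, `SKETCH_USE_ALL_QUOTES | SKETCH_DELETE_QUOTES` produces the value 5, for which `deletes_quotes` is true, whereas adding `SKETCH_IGNORE_QUOTES` makes it false.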

Constructor & Destructor Documentation

◆ Atomize() [1/6]

randolf::Atomize::Atomize ( const int flags = ATOMIZE_DEFAULT,
const char * mode = nullptr )
inline noexcept

Instantiate an empty Atomize object, which is expected to be used with the assign method at some later point. (This is particularly useful for defining a local Atomize object in a header file in a way that won't throw an exception, including invalid mode codes {which will just be ignored}.)

Parameters
flags: See ATOMIZE_FLAGS for a list of options
mode: Set the modes (nullptr default means don't set the modes)
Granularity (default is to return the entire atom):
"\0" = entire atom (default)
"k" = key (same as 0 if no key-value pair was detected)
"v" = value (will be empty if no key-value pair was detected)
"p" = returns: "1" = is a key-value pair / "" = not a key-value pair
Conversion options (default is for no conversion):
"c" = Camel_Case
"f" = First character in upper-case
"l" = all lower case
"u" = ALL UPPER CASE
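To make the mode codes concrete, here is a minimal sketch of how one granularity code and one conversion code might be applied to a single atom. `apply_modes` is a hypothetical helper, not the library's code. It assumes that "f" changes only the first character and leaves the rest untouched, it skips unknown codes, and it leaves out "c" (Camel_Case) because the exact rules for that code are not spelled out above.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (NOT the library's code): applies a string of mode
// codes to one atom. Granularity codes: k, v, p. Conversion codes: f, l, u.
std::string apply_modes(const std::string& atom, const std::string& modes) {
  const std::size_t eq = atom.find('=');  // first equal sign, if any
  const bool has_kv = (eq != std::string::npos);
  std::string out = atom;                 // default: the entire atom
  char conversion = '\0';                 // default: no conversion
  for (const char m : modes) {
    switch (m) {
      case 'k': out = has_kv ? atom.substr(0, eq) : atom; break;
      case 'v': out = has_kv ? atom.substr(eq + 1) : std::string(); break;
      case 'p': out = has_kv ? "1" : ""; break;
      case 'f': case 'l': case 'u': conversion = m; break;
      default: break;                     // unknown codes are skipped in this sketch
    }
  }
  for (std::size_t i = 0; i < out.size(); i++) {
    const unsigned char c = static_cast<unsigned char>(out[i]);
    if (conversion == 'u' || (conversion == 'f' && i == 0)) {
      out[i] = static_cast<char>(std::toupper(c));
    } else if (conversion == 'l') {
      out[i] = static_cast<char>(std::tolower(c));
    }
  }
  return out;
}
```

Under these assumptions, the atom `Name=Value` with modes "vu" gives `VALUE`, and the atom `plain` with mode "v" gives an empty string.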

◆ Atomize() [2/6]

randolf::Atomize::Atomize ( const int flags,
const char mode )
inline noexcept

Instantiate an empty Atomize object, which is expected to be used with the assign method at some later point. (This is particularly useful for defining a local Atomize object in a header file in a way that won't throw an exception, including invalid mode codes {which will just be ignored}.)

Parameters
flags: See ATOMIZE_FLAGS for a list of options
mode: Set the modes (0 means don't set the modes)
Granularity (default is to return the entire atom):
"\0" = entire atom (default)
"k" = key (same as 0 if no key-value pair was detected)
"v" = value (will be empty if no key-value pair was detected)
"p" = returns: "1" = is a key-value pair / "" = not a key-value pair
Conversion options (default is for no conversion):
"c" = Camel_Case
"f" = First character in upper-case
"l" = all lower case
"u" = ALL UPPER CASE

◆ Atomize() [3/6]

randolf::Atomize::Atomize ( const char * intake,
const int len = -1,
const int flags = ATOMIZE_DEFAULT,
const char * mode = nullptr )
inline

Instantiate an Atomize object using the specified ASCIIZ string for intake.

Exceptions
std::invalid_argument: If the parameters are malformed in some way.
Parameters
intake: The intake ASCIIZ string
len: The length of the intake string
-1 = Measure ASCIIZ string
flags: See ATOMIZE_FLAGS for a list of options
mode: Set the modes (nullptr default means don't set the modes)
Granularity (default is to return the entire atom):
"\0" = entire atom (default)
"k" = key (same as 0 if no key-value pair was detected)
"v" = value (will be empty if no key-value pair was detected)
"p" = returns: "1" = is a key-value pair / "" = not a key-value pair
Conversion options (default is for no conversion):
"c" = Camel_Case
"f" = First character in upper-case
"l" = all lower case
"u" = ALL UPPER CASE
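The len parameter's -1 convention can be sketched as follows. `resolve_length` is a hypothetical helper, and the two validation checks are assumptions: the text above only says that std::invalid_argument is thrown when the parameters are malformed, without listing which conditions qualify.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>

// Hypothetical helper: works out how many characters of `intake` to read.
// -1 means "measure the ASCIIZ string", as described for the len parameter.
std::size_t resolve_length(const char* intake, const int len = -1) {
  // Assumption: a null pointer counts as a malformed parameter
  if (intake == nullptr) throw std::invalid_argument("intake is null");
  // Assumption: any negative length other than -1 counts as malformed
  if (len < -1) throw std::invalid_argument("len must be -1 or greater");
  return (len == -1) ? std::strlen(intake) : static_cast<std::size_t>(len);
}
```

Passing an explicit length is what allows a buffer that isn't null-terminated, or only a prefix of a longer string, to be used as intake.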

◆ Atomize() [4/6]

randolf::Atomize::Atomize ( const char * intake,
const int len,
const int flags,
const char mode )
inline

Instantiate an Atomize object using the specified ASCIIZ string for intake.

Exceptions
std::invalid_argument: If the parameters are malformed in some way.
Parameters
intake: The intake ASCIIZ string
len: The length of the intake string
-1 = Measure ASCIIZ string
flags: See ATOMIZE_FLAGS for a list of options
mode: Set the modes (0 means don't set the modes)
Granularity (default is to return the entire atom):
"\0" = entire atom (default)
"k" = key (same as 0 if no key-value pair was detected)
"v" = value (will be empty if no key-value pair was detected)
"p" = returns: "1" = is a key-value pair / "" = not a key-value pair
Conversion options (default is for no conversion):
"c" = Camel_Case
"f" = First character in upper-case
"l" = all lower case
"u" = ALL UPPER CASE

◆ Atomize() [5/6]

randolf::Atomize::Atomize ( const std::string intake,
const int len = -1,
const int flags = ATOMIZE_DEFAULT,
const char * mode = nullptr )
inline

Instantiate an Atomize object using the specified string for intake.

Exceptions
std::invalid_argument: If the parameters are malformed in some way.
Parameters
intake: The intake C++ string
len: The length of the intake string
-1 = Obtain length from intake.size() method
flags: See ATOMIZE_FLAGS for a list of options
mode: Set the modes (nullptr default means don't set the modes)
Granularity (default is to return the entire atom):
"\0" = entire atom (default)
"k" = key (same as 0 if no key-value pair was detected)
"v" = value (will be empty if no key-value pair was detected)
"p" = returns: "1" = is a key-value pair / "" = not a key-value pair
Conversion options (default is for no conversion):
"c" = Camel_Case
"f" = First character in upper-case
"l" = all lower case
"u" = ALL UPPER CASE

◆ Atomize() [6/6]

randolf::Atomize::Atomize ( const std::string intake,
const int len,
const int flags,
const char mode )
inline

Instantiate an Atomize object using the specified string for intake.

Exceptions
std::invalid_argument: If the parameters are malformed in some way.
Parameters
intake: The intake C++ string
len: The length of the intake string
-1 = Obtain length from intake.size() method
flags: See ATOMIZE_FLAGS for a list of options
mode: Set the modes (0 means don't set the modes)
Granularity (default is to return the entire atom):
"\0" = entire atom (default)
"k" = key (same as 0 if no key-value pair was detected)
"v" = value (will be empty if no key-value pair was detected)
"p" = returns: "1" = is a key-value pair / "" = not a key-value pair
Conversion options (default is for no conversion):
"c" = Camel_Case
"f" = First character in upper-case
"l" = all lower case
"u" = ALL UPPER CASE

◆ ~Atomize()

randolf::Atomize::~Atomize ( )
inline noexcept

Destructor.

Member Function Documentation

◆ assign() [1/2]

Atomize * randolf::Atomize::assign ( const char * intake,
const int len = -1 )
inline

Assign (and interpret) a new ASCIIZ string (flags and modes are inherited).

Exceptions
std::invalid_argument: If the parameters are malformed in some way.
Returns
The same Atomize object so as to facilitate stacking
Parameters
intake: The intake ASCIIZ string
len: The length of the intake string
-1 = Measure ASCIIZ string

◆ assign() [2/2]

Atomize * randolf::Atomize::assign ( const std::string intake,
const int len = -1 )
inline

Assign (and interpret) a new string (flags and modes are inherited).

Exceptions
std::invalid_argument: If the parameters are malformed in some way.
Returns
The same Atomize object so as to facilitate stacking
Parameters
intake: The intake C++ string
len: The length of the intake string
-1 = Obtain length from intake.size() method

◆ at()

std::string randolf::Atomize::at ( int index,
const char * mode = nullptr )
inline

Access to atoms, whilst utilizing the operator mode that was configured using the mode method. Return an entire atom.

Exceptions
std::out_of_range: If the index is out-of-range
Returns
Entire atom
See also
get
get_key
get_value
operator[]
Parameters
index: Which atom to obtain (0 = first atom; negative values count backward from the last atom in the internal array)
mode: Temporarily override the current modes (nullptr default means don't change modes)
Granularity (default is to return the entire atom):
"\0" = entire atom (default)
"k" = key (same as 0 if no key-value pair was detected)
"v" = value (will be empty if no key-value pair was detected)
"p" = returns: "1" = is a key-value pair / "" = not a key-value pair
Conversion options (default is for no conversion):
"c" = Camel_Case
"f" = First character in upper-case
"l" = all lower case
"u" = ALL UPPER CASE
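The index rule (0 is the first atom, negative values count backward from the last atom, anything else throws std::out_of_range) can be sketched like this. `normalize_index` and `atom_at` are hypothetical helpers operating on a plain std::vector, not the class's internal array.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: converts a possibly negative index into a position,
// where -1 refers to the final element.
std::size_t normalize_index(const int index, const std::size_t count) {
  const long long n = static_cast<long long>(count);
  const long long pos = (index < 0) ? n + index : index;
  if (pos < 0 || pos >= n) throw std::out_of_range("atom index is out of range");
  return static_cast<std::size_t>(pos);
}

// Hypothetical helper: array-style access using the rule above.
const std::string& atom_at(const std::vector<std::string>& atoms, const int index) {
  return atoms[normalize_index(index, atoms.size())];
}
```

With three atoms, the valid indexes in this sketch are 0 through 2 and -1 through -3; both 3 and -4 throw.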

◆ clear()

Atomize * randolf::Atomize::clear ( )
inline

Clear this Atomize's underlying data and reset all states. This does not reset nor alter flags or modes.

Returns
The same Atomize object so as to facilitate stacking

◆ empty()

bool randolf::Atomize::empty ( )
inline

Confirm that there are no atoms.

Returns
TRUE = no atoms
FALSE = at least one atom exists
See also
size

◆ flags() [1/2]

int randolf::Atomize::flags ( )
inline

Obtain current set of internal flags.

Returns
Current flags, as defined in ATOMIZE_FLAGS
See also
flags(const int)
mode

◆ flags() [2/2]

Atomize * randolf::Atomize::flags ( const int flags)
inline

Set the internal flags.

Returns
The same Atomize object so as to facilitate stacking
See also
flags
mode
Parameters
flags: See ATOMIZE_FLAGS for a list of options

◆ get()

std::string randolf::Atomize::get ( int index)
inline

Return the entire atom.

Exceptions
std::out_of_range: If the index is out-of-range
Returns
Entire atom
See also
at
get_key
get_value
has_kv
operator[int]
Parameters
index: Which atom to obtain (0 = first atom; negative values count backward from the last atom in the internal array)

◆ get_key()

std::string randolf::Atomize::get_key ( int index)
inline

Return the key portion of an atom, or the entire atom if a key-value pair wasn't detected.

Exceptions
std::out_of_range: If the index is out-of-range
Returns
Key portion of atom (or the entire atom if a key-value pair wasn't detected)
See also
at
get
get_value
has_kv
operator[int]
Parameters
index: Which atom to obtain (0 = first atom; negative values count backward from the last atom in the internal array)

◆ get_value()

std::string randolf::Atomize::get_value ( int index)
inline

Return the value portion of an atom, or an empty string if a key-value pair wasn't detected.

Exceptions
std::out_of_range: If the index is out-of-range
Returns
Value portion of atom (or an empty string if a key-value pair wasn't detected)
See also
at
get
get_key
has_kv
operator[int]
Parameters
index: Which atom to obtain (0 = first atom; negative values count backward from the last atom in the internal array)

◆ has_kv()

bool randolf::Atomize::has_kv ( int index)
inline

Indicates whether the specified atom was split into a key-value pair (if it was, then the key and the value are delimited by the first instance of an equal sign {=}).

Exceptions
std::out_of_range: If the index is out-of-range
Returns
TRUE = key-value pair was detected by the parsing algorithm
FALSE = this atom was not split into a key-value pair
See also
at
get
get_key
get_value
operator[int]
Parameters
index: Which atom to obtain (0 = first atom; negative values count backward from the last atom in the internal array)
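A minimal sketch of the "first equal sign" rule, written to match the documented results of get_key (the entire atom when there is no pair) and get_value (an empty string when there is no pair). `KeyValue` and `split_kv` are hypothetical names; how the class stores this internally is not described here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch.
struct KeyValue {
  bool has_kv;        // true if an equal sign was found
  std::string key;    // text before the first '=' (or the entire atom)
  std::string value;  // text after the first '=' (or empty)
};

// Hypothetical helper: splits an atom at the FIRST equal sign only, so any
// later equal signs remain part of the value.
KeyValue split_kv(const std::string& atom) {
  const std::size_t eq = atom.find('=');
  if (eq == std::string::npos) return {false, atom, ""};
  return {true, atom.substr(0, eq), atom.substr(eq + 1)};
}
```

Splitting at the first equal sign only means that an atom such as `path=/tmp/a=b` has the key `path` and the value `/tmp/a=b`.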

◆ mode() [1/3]

std::string randolf::Atomize::mode ( )
inline noexcept

Get the operator modes that are set for the operator[] operator.

Returns
The operator modes that are currently set, as an std::string
See also
flags
mode(const char*)

◆ mode() [2/3]

Atomize * randolf::Atomize::mode ( const char * mode)
inline

Set the operator modes for use with the operator[] operator (modes that are not specified will be reset to their defaults).

Calling this method with "\0" as the parameter will result in resetting all operator modes to the base defaults.

Exceptions
std::invalid_argument: If an incorrect value is provided
Returns
The same Atomize object so as to facilitate stacking
See also
flags
mode
Parameters
mode: Granularity (default is to return the entire atom):
"\0" = entire atom (default)
"k" = key (same as 0 if no key-value pair was detected)
"v" = value (will be empty if no key-value pair was detected)
"p" = returns: "1" = is a key-value pair / "" = not a key-value pair
Conversion options (default is for no conversion):
"c" = Camel_Case
"f" = First character in upper-case
"l" = all lower case
"u" = ALL UPPER CASE

◆ mode() [3/3]

Atomize * randolf::Atomize::mode ( const char mode)
inline

Set the operator modes for use with the operator[] operator (modes that are not specified will be reset to their defaults).

Calling this method with "\0" as the parameter will result in resetting all operator modes to the base defaults.

Exceptions
std::invalid_argument: If an incorrect value is provided
Returns
The same Atomize object so as to facilitate stacking
See also
flags
mode
Parameters
mode: Granularity (default is to return the entire atom):
"\0" = entire atom (default)
"k" = key (same as 0 if no key-value pair was detected)
"v" = value (will be empty if no key-value pair was detected)
"p" = returns: "1" = is a key-value pair / "" = not a key-value pair
Conversion options (default is for no conversion):
"c" = Camel_Case
"f" = First character in upper-case
"l" = all lower case
"u" = ALL UPPER CASE

◆ size()

size_t randolf::Atomize::size ( )
inline

Return the total quantity of atoms.

Returns
Quantity of atoms
See also
empty

◆ to_vector()

std::vector< std::string > randolf::Atomize::to_vector ( bool split_kv_pairs = false)
inline noexcept

Generate an std::vector<std::string> that contains all atoms.

Returns
std::vector<std::string> containing all atoms
Parameters
split_kv_pairs: FALSE = don't split key-value pairs (default)
TRUE = split key-value pairs into separate entries (for key names, the equal sign will be included at the end of the string)
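The effect of split_kv_pairs can be sketched on a plain vector of atoms. `flatten_atoms` is a hypothetical helper that follows the description above, including the detail that a key name keeps its trailing equal sign; it assumes that a pair is split at the first equal sign.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: copies atoms into a new vector, optionally splitting
// each key-value pair into two entries ("key=" followed by "value").
std::vector<std::string> flatten_atoms(const std::vector<std::string>& atoms,
                                       const bool split_kv_pairs = false) {
  std::vector<std::string> out;
  for (const std::string& atom : atoms) {
    const std::size_t eq = atom.find('=');
    if (split_kv_pairs && eq != std::string::npos) {
      out.push_back(atom.substr(0, eq + 1)); // key name, including the '='
      out.push_back(atom.substr(eq + 1));    // value
    } else {
      out.push_back(atom);                   // atom is kept whole
    }
  }
  return out;
}
```

Keeping the equal sign on the key entry lets a consumer of the flat vector tell key names apart from ordinary atoms.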

◆ operator[]()

std::string randolf::Atomize::operator[] ( int index)
inline

Array-style access to atoms, whilst utilizing the operator mode that was configured using the mode method.

Exceptions
std::out_of_range: If the index is out-of-range
Returns
std::string
See also
at
mode
Parameters
index: Index of the atom to access (0 = first atom; negative index values are calculated in reverse, starting with -1 as the final atom)
