/*======================================================================*//**
The @ref Atomize class provides an object-oriented interface with array-style
access to a string that was efficiently separated into atoms, and with more
granularity and functionality through the use of modes (see @ref mode for
details) and certain API methods, hence one could say "Atomize can split your

When parsing a line or block of text, the following is assumed:

- parameters are separated by one or multiple consecutive whitespace
characters (space, null, tab, linefeed, carriage return)

- values that are enclosed within a set of quotation marks may include
whitespace characters that will then not be interpreted as delimiters

Data is interpreted in a single pass during instantiation or assignment, and
the interpretation algorithm is written in an optimized programming style to
ensure high efficiency.
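
The parsing assumptions above can be illustrated with a minimal, standalone
sketch. The @c split_atoms function below is hypothetical and is not part of
this class; it only shows one way to separate text in a single pass while
keeping quoted whitespace intact, and it keeps the quotation marks in the
resulting atoms.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of Atomize): split text into atoms in a
// single pass.  Space, null, tab, linefeed, and carriage return are treated
// as delimiters unless they appear between a pair of quotation marks.
std::vector<std::string> split_atoms(const std::string& text) {
  std::vector<std::string> atoms;
  std::string current;
  bool in_atom = false;   // Currently inside an atom
  bool in_quotes = false; // Currently inside quotation marks
  for (const char c : text) {
    const bool ws = c == ' ' || c == '\0' || c == '\t' || c == '\n' || c == '\r';
    if (ws && !in_quotes) {
      if (in_atom) atoms.push_back(current); // End of atom
      current.clear();
      in_atom = false;
      continue;
    }
    if (c == '"') in_quotes = !in_quotes; // Quotation marks stay in the atom
    current.push_back(c);
    in_atom = true;
  }
  if (in_atom) atoms.push_back(current); // Save the final atom
  return atoms;
}
```

With this sketch, consecutive delimiters never produce empty atoms, and a
quoted value such as @c "three four" remains a single atom.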

There are no memory leaks. This is particularly important because the
specialized parsing involved is often utilized in heavy data processing loops
where speed and reliability are needed. It also matters because some of the
libraries I've tested that provide similar functionality leak memory, or fail
in other ways, when given peculiar or fuzzed data. This class is not impacted
by such data, primarily because I don't trust data to always be "as expected."

Parsing command lines or configuration settings can become challenging when
multiple parameters are provided on a single line, and some of those
parameters include quoted text that contains spaces. This class handles all
of these scenarios and makes it easy to access each parameter in the same
manner that arrays and vectors are accessed.

I created this class to make it easier to write internet server daemons.

@author Randolf Richardson

2025-Jan-20 v1.00 Initial version

Lower-case letter "a" is regularly used in partial example code to represent
an instantiated Atomize object.

An ASCIIZ string is a C-string (char* array) that includes a terminating null
(0) character at the end.

I use the term "ASCIIZ string" to indicate an array of characters that's
terminated by a 0 (a.k.a., null). Although this is very much the same as a
C-string, the difference is that in many API functions a C-string must often
be accompanied by its length value. When referring to an ASCIIZ string, I'm
intentionally indicating that the length of the string is not needed because
the string is null-terminated. (This term was also commonly used in assembly
language programming in the 1970s, 1980s, and 1990s, and as far as I know is
still used by machine language programmers today.)
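
The difference between an ASCIIZ string and length-accompanied character data
can be shown with a short sketch. The @c from_chars helper is hypothetical and
only mirrors the "-1 means measure the string" convention that the
constructors and assign methods document for their @c len parameter.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: build a std::string from character data.  When no
// length is supplied (len < 0) the data is measured with std::strlen(), which
// is only safe when the data is null-terminated (an ASCIIZ string).
std::string from_chars(const char* data, const int len = -1) {
  const std::size_t n = len >= 0 ? static_cast<std::size_t>(len)
                                 : std::strlen(data);
  return std::string(data, n);
}
```

Calling @c from_chars("hello world") relies on the terminating null, whereas
@c from_chars("hello world", 5) uses only the first five characters.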

#include <iostream> // std::cout, std::cerr, std::endl, etc.

#include <randolf/Atomize>

int main(int argc, char *argv[]) {
  randolf::Atomize a("parameters key=value");
  std::cout << "atom0: " << a.at(0) << std::endl;
  std::cout << "atom1: " << a.at(1) << std::endl;
  std::cout << "key1: " << a.at(1, 'k') << std::endl;
  std::cout << "val1: " << a.at(1, 'v') << std::endl;
} // -x- int main -x-

Parameter stacking is also supported (with methods that return @c Atomize*).
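
As a rough illustration of what stacking means, the sketch below uses a
hypothetical stand-in class named @c Stackable (it is not Atomize, and its
methods are invented for this example). Each setter returns a pointer to the
same object, so calls can be joined with the arrow operator.

```cpp
#include <string>

// Hypothetical stand-in class (not Atomize itself) showing how methods that
// return a pointer to the same object can be stacked with "->".
class Stackable {
  std::string text_;
  char mode_ = '\0'; // '\0' = no conversion, 'u' = all upper-case
public:
  Stackable* assign(const std::string& text) { text_ = text; return this; }
  Stackable* mode(const char m) { mode_ = m; return this; }
  std::string get() const {
    std::string out = text_;
    if (mode_ == 'u')
      for (char& c : out)
        if (c >= 'a' && c <= 'z') c -= 32; // Convert to upper-case
    return out;
  }
};
```

For example, given @c Stackable @c s, the expression
@c s.assign("hello")->mode('u')->get() performs all three steps in one
statement.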
*///=========================================================================

// --------------------------------------------------------------------------
// Internal structures.
// --------------------------------------------------------------------------
uint fpos = 0; // First position/offset (key also begins here)
uint flen = 0; // Full length
uint klen = 0; // Key length
uint vpos = 0; // Value position/offset (0 == not present)
uint vlen = 0; // Value length (do not use this to test for presence of value)
}; // -x- struct __atom -x-

// --------------------------------------------------------------------------
// Internal variables.
// --------------------------------------------------------------------------
char* __origin = nullptr; // Copy of original string
std::vector<__atom*> __data_points;

// --------------------------------------------------------------------------
// Operator modes (all are disabled by default).
// --------------------------------------------------------------------------
bool mode_k = false; // Key
bool mode_v = false; // Value
bool mode_p = false; // Presence of key-value pair
bool mode_c = false; // Camel_Case
bool mode_f = false; // First letter upper-case
bool mode_l = false; // All lower-case
bool mode_u = false; // All upper-case

/*======================================================================*//**
Optional flags that alter, modify, or enhance the operation of atomization
*///=========================================================================
enum ATOMIZE_FLAGS: int {

/*----------------------------------------------------------------------*//**
The ATOMIZE_DEFAULT flag isn't necessary, but it's included here for
completeness as it accommodates programming styles that prefer to emphasize
when defaults are being relied upon.
*///-------------------------------------------------------------------------

/*----------------------------------------------------------------------*//**
Interpret all quotation marks (the default is to only utilize enclosing
*///-------------------------------------------------------------------------
ATOMIZE_USE_ALL_QUOTES = 1,

/*----------------------------------------------------------------------*//**
Don't interpret quotation marks as grouping characters.
*///-------------------------------------------------------------------------
ATOMIZE_IGNORE_QUOTES = 2,

/*----------------------------------------------------------------------*//**
Delete quotation marks that function as grouping characters (this flag has no
effect when @ref ATOMIZE_IGNORE_QUOTES is set).
*///-------------------------------------------------------------------------
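
The interaction between these two flags can be sketched as follows. This is
an illustration only: the @c SKETCH_ names are local copies of the values
listed here so that the example compiles on its own, and testing the flags
with bitwise AND is an assumption based on the values being powers of two.

```cpp
// Local copies of the flag values, so that this sketch compiles by itself.
enum SKETCH_FLAGS: int {
  SKETCH_USE_ALL_QUOTES = 1,
  SKETCH_IGNORE_QUOTES  = 2,
  SKETCH_DELETE_QUOTES  = 4,
};

// Hypothetical helper: report whether grouping quotation marks would be
// deleted.  Assumes flags are combined with bitwise OR and tested with
// bitwise AND; the delete flag has no effect when quotes are ignored.
bool deletes_quotes(const int flags) {
  if (flags & SKETCH_IGNORE_QUOTES) return false;
  return (flags & SKETCH_DELETE_QUOTES) != 0;
}
```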
*///-------------------------------------------------------------------------
ATOMIZE_DELETE_QUOTES = 4,

}; // -x- enum ATOMIZE_FLAGS -x-

/*======================================================================*//**
*///=========================================================================
/// The intake ASCIIZ string
/// The length of the intake string

// --------------------------------------------------------------------------
// Internal variables.
// --------------------------------------------------------------------------
__atom* atom = new __atom{}; // Data structure that will keep being replaced
bool k = false; // Key-value pair detection flag
bool q = false; // Begin in non-quote mode
bool w = true; // Begin in whitespace mode to skip any leading whitespace

// --------------------------------------------------------------------------
// Allocate internal memory.
// --------------------------------------------------------------------------
if (__origin != nullptr) __clear(); //::free(__origin); // Memory management
__origin = (char*)::calloc(1, len + 1); // Allocate memory

// --------------------------------------------------------------------------
// --------------------------------------------------------------------------
for (int i = 0; i < len; i++) {

// --------------------------------------------------------------------------
// Extract current character and copy original string at the same time, which
// also creates an optimization opportunity for compilers to store "c" in a
// register, which is faster than accessing a memory location.
// --------------------------------------------------------------------------
char c = __origin[i] = intake[i];

// --------------------------------------------------------------------------
// Process character.
// --------------------------------------------------------------------------
case '\0': // Whitespace: NULL
//if (len == 0) break; // End of ASCIIZ string
case ' ': // Whitespace: Space
case '\t': // Whitespace: Tab
case '\r': // Whitespace: Carriage Return
case '\n': // Whitespace: Linefeed
if (!w) { // End of atom
if (k) atom->vlen = i - atom->vpos;
else atom->klen = i - atom->fpos;
__data_points.push_back(atom); // Save current atom to vector
atom = new __atom{.fpos = (uint)i}; // Create new atom
k = false; // Disable key-value pair mode
w = true; // Enable whitespace mode
goto __assign_default;
case '=': // Key-value pair detected
if (!k) { // Key-value pair mode is not already detected
k = true; // Enable key-value pair mode
atom->klen = w ? 0 : i - atom->fpos;
atom->vpos = i + 1; // Save position of value
default: // Non-whitespace characters
if (w) { // White space mode is enabled
atom->fpos = i; // Save new starting position
w = false; // Disable whitespace mode
} // -x- switch str -x-

// --------------------------------------------------------------------------
// Save final atom if it's not empty.
// --------------------------------------------------------------------------
if (atom->flen != 0) __data_points.push_back(atom);

} // -x- void __assign -x-

/*======================================================================*//**
Clear internal data, but not flags or modes. Called by the destructor, the
clear() method, and the various assign() methods.
*///=========================================================================

// --------------------------------------------------------------------------
// Delete all data entry points. These need to be deleted separately because
// the std::vector won't automatically delete structures.
// --------------------------------------------------------------------------
for (int i = __data_points.size() - 1; i >= 0; i--)
delete __data_points.at(i);
__data_points.clear();

// --------------------------------------------------------------------------
// Free the original string, if one has been allocated (in most cases, there
// will be allocated memory here that needs to be freed).
// --------------------------------------------------------------------------
if (__origin != nullptr) {
::free(__origin); // Memory management
} // -x- if !__origin -x-

} // -x- void __clear -x-

/*======================================================================*//**
Instantiate an empty Atomize object, which is expected to be used with the
@ref assign method at some later point. (This is particularly useful for
defining a local Atomize object in a header file in a way that won't throw an
exception, including invalid mode codes {which will just be ignored}.)
*///=========================================================================
/// See @ref ATOMIZE_FLAGS for a list of options
const int flags = ATOMIZE_DEFAULT,
/// Set the modes (@c nullptr default means don't set the modes) @n
/// Granularity (default is to return the entire atom): @n
/// @c "\0" = entire atom (default) @n
/// @c "k" = key (same as 0 if no key-value pair was detected) @n
/// @c "v" = value (will be empty if no key-value pair was detected) @n
/// @c "p" = returns: "1" = is a key-value pair / "" = not a key-value pair@n
/// Conversion options (default is for no conversion): @n
/// @c "c" = Camel_Case @n
/// @c "f" = First character in upper-case @n
/// @c "l" = all lower case @n
/// @c "u" = ALL UPPER CASE
const char* mode = nullptr) noexcept {
if (mode != nullptr) __mode(mode, false);
} // -x- constructor Atomize -x-

/*======================================================================*//**
Instantiate an empty Atomize object, which is expected to be used with the
@ref assign method at some later point. (This is particularly useful for
defining a local Atomize object in a header file in a way that won't throw an
exception, including invalid mode codes {which will just be ignored}.)
*///=========================================================================
/// See @ref ATOMIZE_FLAGS for a list of options
/// Set the modes (@c 0 default means don't set the modes) @n
/// Granularity (default is to return the entire atom): @n
/// @c "\0" = entire atom (default) @n
/// @c "k" = key (same as 0 if no key-value pair was detected) @n
/// @c "v" = value (will be empty if no key-value pair was detected) @n
/// @c "p" = returns: "1" = is a key-value pair / "" = not a key-value pair@n
/// Conversion options (default is for no conversion): @n
/// @c "c" = Camel_Case @n
/// @c "f" = First character in upper-case @n
/// @c "l" = all lower case @n
/// @c "u" = ALL UPPER CASE
const char mode) noexcept {
char new_mode[]{mode, 0};
__mode(new_mode, false);
} // -x- constructor Atomize -x-

/*======================================================================*//**
Instantiate an Atomize object using the specified ASCIIZ string for intake.
@throws std::invalid_argument If the parameters are malformed in some way.
*///=========================================================================
/// The intake ASCIIZ string
/// The length of the intake string@n
/// -1 = Measure ASCIIZ string
/// See @ref ATOMIZE_FLAGS for a list of options
const int flags = ATOMIZE_DEFAULT,
/// Set the modes (@c nullptr default means don't set the modes) @n
/// Granularity (default is to return the entire atom): @n
/// @c "\0" = entire atom (default) @n
/// @c "k" = key (same as 0 if no key-value pair was detected) @n
/// @c "v" = value (will be empty if no key-value pair was detected) @n
/// @c "p" = returns: "1" = is a key-value pair / "" = not a key-value pair@n
/// Conversion options (default is for no conversion): @n
/// @c "c" = Camel_Case @n
/// @c "f" = First character in upper-case @n
/// @c "l" = all lower case @n
/// @c "u" = ALL UPPER CASE
const char* mode = nullptr) {
if (mode != nullptr) __mode(mode);
__assign(intake, len >= 0 ? len : std::strlen(intake));
} // -x- constructor Atomize -x-

/*======================================================================*//**
Instantiate an Atomize object using the specified ASCIIZ string for intake.
@throws std::invalid_argument If the parameters are malformed in some way.
*///=========================================================================
/// The intake ASCIIZ string
/// The length of the intake string@n
/// -1 = Measure ASCIIZ string
/// See @ref ATOMIZE_FLAGS for a list of options
/// Set the modes (@c 0 default means don't set the modes) @n
/// Granularity (default is to return the entire atom): @n
/// @c "\0" = entire atom (default) @n
/// @c "k" = key (same as 0 if no key-value pair was detected) @n
/// @c "v" = value (will be empty if no key-value pair was detected) @n
/// @c "p" = returns: "1" = is a key-value pair / "" = not a key-value pair@n
/// Conversion options (default is for no conversion): @n
/// @c "c" = Camel_Case @n
/// @c "f" = First character in upper-case @n
/// @c "l" = all lower case @n
/// @c "u" = ALL UPPER CASE
char new_mode[]{mode, 0};
__mode(new_mode, false);
__assign(intake, len >= 0 ? len : std::strlen(intake));
} // -x- constructor Atomize -x-

/*======================================================================*//**
Instantiate an Atomize object using the specified string for intake.
@throws std::invalid_argument If the parameters are malformed in some way.
*///=========================================================================
/// The intake C++ string
const std::string intake,
/// The length of the intake string@n
/// -1 = Obtain length from @c intake.size() method
/// See @ref ATOMIZE_FLAGS for a list of options
const int flags = ATOMIZE_DEFAULT,
/// Set the modes (@c nullptr default means don't set the modes) @n
/// Granularity (default is to return the entire atom): @n
/// @c "\0" = entire atom (default) @n
/// @c "k" = key (same as 0 if no key-value pair was detected) @n
/// @c "v" = value (will be empty if no key-value pair was detected) @n
/// @c "p" = returns: "1" = is a key-value pair / "" = not a key-value pair@n
/// Conversion options (default is for no conversion): @n
/// @c "c" = Camel_Case @n
/// @c "f" = First character in upper-case @n
/// @c "l" = all lower case @n
/// @c "u" = ALL UPPER CASE
const char* mode = nullptr) {
if (mode != nullptr) __mode(mode);
__assign(intake.data(), len >= 0 ? len : intake.size());
} // -x- constructor Atomize -x-

/*======================================================================*//**
Instantiate an Atomize object using the specified string for intake.
@throws std::invalid_argument If the parameters are malformed in some way.
*///=========================================================================
/// The intake C++ string
const std::string intake,
/// The length of the intake string@n
/// -1 = Obtain length from @c intake.size() method
/// See @ref ATOMIZE_FLAGS for a list of options
/// Set the modes (@c 0 default means don't set the modes) @n
/// Granularity (default is to return the entire atom): @n
/// @c "\0" = entire atom (default) @n
/// @c "k" = key (same as 0 if no key-value pair was detected) @n
/// @c "v" = value (will be empty if no key-value pair was detected) @n
/// @c "p" = returns: "1" = is a key-value pair / "" = not a key-value pair@n
/// Conversion options (default is for no conversion): @n
/// @c "c" = Camel_Case @n
/// @c "f" = First character in upper-case @n
/// @c "l" = all lower case @n
/// @c "u" = ALL UPPER CASE
char new_mode[]{mode, 0};
__mode(new_mode, false);
__assign(intake.data(), len >= 0 ? len : intake.size());
} // -x- constructor Atomize -x-

/*======================================================================*//**
*///=========================================================================
~Atomize() noexcept {
} // -x- destructor Atomize -x-

/*======================================================================*//**
Assign (and interpret) a new ASCIIZ string (flags and modes are inherited).
@throws std::invalid_argument If the parameters are malformed in some way.
@returns The same Atomize object so as to facilitate stacking
*///=========================================================================
/// The intake ASCIIZ string
/// The length of the intake string@n
/// -1 = Measure ASCIIZ string
const int len = -1) {
__assign(intake, len >= 0 ? len : std::strlen(intake));
} // -x- Atomize* assign -x-

/*======================================================================*//**
Assign (and interpret) a new string (flags and modes are inherited).
@throws std::invalid_argument If the parameters are malformed in some way.
@returns The same Atomize object so as to facilitate stacking
*///=========================================================================
/// The intake C++ string
const std::string intake,
/// The length of the intake string@n
/// -1 = Obtain length from @c intake.size() method
const int len = -1) {
__assign(intake.data(), len >= 0 ? len : intake.size());
} // -x- Atomize* assign -x-

/*======================================================================*//**
Access to atoms, whilst utilizing the operator mode that was configured using
the @ref mode method.
Return an entire atom.
@throws std::out_of_range if the index is out-of-range
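
The index handling can be sketched with a standalone helper. The
@c at_index function below is hypothetical (it works on a plain vector of
strings instead of this class's internal data) and shows one way to support
negative index values while still reporting out-of-range requests.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: fetch an element by index, where negative values count
// backward from the last element (-1 = last).  Throws std::out_of_range when
// the resulting position does not exist.
const std::string& at_index(const std::vector<std::string>& atoms, int index) {
  if (index < 0) index += static_cast<int>(atoms.size());
  if (index < 0 || static_cast<std::size_t>(index) >= atoms.size())
    throw std::out_of_range("atom index is out of range");
  return atoms[static_cast<std::size_t>(index)];
}
```

The explicit range check matters because @c std::vector::operator[] does not
perform one; @c std::vector::at() is the standard member that does.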
*///=========================================================================
/// Which atom to obtain (0 = first atom; negative values count backward from
/// the last atom in the internal array)
/// Temporarily override the current modes (@c nullptr default means don't
/// Granularity (default is to return the entire atom): @n
/// @c "\0" = entire atom (default) @n
/// @c "k" = key (same as 0 if no key-value pair was detected) @n
/// @c "v" = value (will be empty if no key-value pair was detected) @n
/// @c "p" = returns: "1" = is a key-value pair / "" = not a key-value pair@n
/// Conversion options (default is for no conversion): @n
/// @c "c" = Camel_Case @n
/// @c "f" = First character in upper-case @n
/// @c "l" = all lower case @n
/// @c "u" = ALL UPPER CASE
const char* mode = nullptr) {

// --------------------------------------------------------------------------
// Internal variables.
// --------------------------------------------------------------------------
if (index < 0) index = __data_points.size() + index;
__atom* atom = __data_points.at(index); // at() throws std::out_of_range
std::string previous_mode;

// --------------------------------------------------------------------------
// Save and change modes.
// --------------------------------------------------------------------------
if (mode != nullptr) {
previous_mode = this->mode(); // Save current mode

// --------------------------------------------------------------------------
// Presence of key-value pair (results in ignoring all other modes).
// --------------------------------------------------------------------------
if (mode_p) return atom->vpos != 0 ? "1" : "";

// --------------------------------------------------------------------------
// --------------------------------------------------------------------------
if (mode_k) str.assign(__origin, atom->fpos, atom->klen);
else if (mode_v) str.assign(__origin, atom->vpos, atom->vlen);
else str.assign(__origin, atom->fpos, atom->flen);

// --------------------------------------------------------------------------
// --------------------------------------------------------------------------
char* data = str.data();
if (mode_l) { // All lower-case
for (int i = 0; i < str.size(); i++) {
if (ch >= 'A' && ch <= 'Z') data[i] += 32; // Convert to lower-case
} else if (mode_u) { // All upper-case
for (int i = 0; i < str.size(); i++) {
if (ch >= 'a' && ch <= 'z') data[i] -= 32; // Convert to upper-case
} else if (mode_c) { // Camel_Case
for (int i = 0; i < str.size(); i++) {
if (pch < 'A' || (pch > 'Z' && pch < 'a') || pch > 'z') {
if (ch >= 'a' && ch <= 'z')
data[i] -= 32; // Convert to upper-case
} else if (mode_f) { // First letter
if (ch >= 'a' && ch <= 'z') data[0] -= 32; // Convert to upper-case
} // -x- if mode_c -x-

// --------------------------------------------------------------------------
// Restore previously saved modes.
// --------------------------------------------------------------------------
if (mode != nullptr) this->mode(previous_mode.data());

} // -x- std::string at -x-

/*======================================================================*//**
Clear this Atomize's underlying data and reset all states. This does not
reset nor alter flags or modes.
@returns The same Atomize object so as to facilitate stacking
*///=========================================================================
} // -x- Atomize* clear -x-

/*======================================================================*//**
Confirm that there are no atoms.
@returns TRUE = no atoms@n
FALSE = at least one atom exists
*///=========================================================================
return __data_points.empty();
} // -x- bool empty -x-

/*======================================================================*//**
Obtain current set of internal flags.
@returns Current flags, as defined in @ref ATOMIZE_FLAGS
@see flags(const int)
*///=========================================================================
} // -x- int flags -x-

/*======================================================================*//**
Set the internal flags.
@returns The same Atomize object so as to facilitate stacking
*///=========================================================================
/// See @ref ATOMIZE_FLAGS for a list of options
} // -x- Atomize* flags -x-

/*======================================================================*//**
Return the entire atom.
@throws std::out_of_range if the index is out-of-range
@returns The entire atom
*///=========================================================================
/// Which atom to obtain (0 = first atom; negative values count backward from
/// the last atom in the internal array)
if (index < 0) index = __data_points.size() + index;
__atom* atom = __data_points.at(index); // at() throws std::out_of_range
return std::string(__origin, atom->fpos, atom->flen);
} // -x- std::string get -x-

/*======================================================================*//**
Return the key portion of an atom, or the entire atom if a key-value pair
@throws std::out_of_range if the index is out-of-range
@returns Key portion of atom (or the entire atom if a key-value pair wasn't
*///=========================================================================
/// Which atom to obtain (0 = first atom; negative values count backward from
/// the last atom in the internal array)
if (index < 0) index = __data_points.size() + index;
__atom* atom = __data_points.at(index); // at() throws std::out_of_range
return std::string(__origin, atom->fpos, atom->klen);
} // -x- std::string get_key -x-

/*======================================================================*//**
Return the value portion of an atom, or an empty string if a key-value pair
@throws std::out_of_range if the index is out-of-range
@returns Value portion of atom (or an empty string if a key-value pair wasn't
*///=========================================================================
std::string get_value(
/// Which atom to obtain (0 = first atom; negative values count backward from
/// the last atom in the internal array)
if (index < 0) index = __data_points.size() + index;
__atom* atom = __data_points.at(index); // at() throws std::out_of_range
return atom->vpos != 0 ? std::string(__origin, atom->vpos, atom->vlen) : "";
} // -x- std::string get_value -x-

/*======================================================================*//**
Indicates whether the specified atom was split into a key-value pair (if it
was, then the @c key and the @c value are delimited by the first instance of
an equal sign {`=`}).
@throws std::out_of_range if the index is out-of-range
@returns TRUE = key-value pair was detected by the parsing algorithm@n
FALSE = this atom was not split into a key-value pair
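
Splitting on the first equal sign can be sketched as follows. The
@c KeyValue structure and @c split_kv function are hypothetical and operate
on a single standalone string instead of this class's internal offsets.

```cpp
#include <string>

// Hypothetical result type for this sketch.
struct KeyValue {
  std::string key;   // Entire atom when no key-value pair is present
  std::string value; // Empty when no key-value pair is present
  bool has_kv;       // true = an equal sign was found
};

// Hypothetical helper: split an atom at the first equal sign only, so that
// any further equal signs remain part of the value.
KeyValue split_kv(const std::string& atom) {
  const std::string::size_type eq = atom.find('=');
  if (eq == std::string::npos) return {atom, "", false};
  return {atom.substr(0, eq), atom.substr(eq + 1), true};
}
```

With this sketch, @c "key=a=b" yields the key @c "key" and the value
@c "a=b", while @c "plain" is reported as not being a key-value pair.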
*///=========================================================================
/// Which atom to obtain (0 = first atom; negative values count backward from
/// the last atom in the internal array)
if (index < 0) index = __data_points.size() + index;
__atom* atom = __data_points.at(index); // at() throws std::out_of_range
return atom->vpos != 0;
} // -x- bool has_kv -x-

/*======================================================================*//**
Get the operator modes that are set for the @ref operator[] operator.
@returns The current operator modes, as a string of mode codes
@see mode(const char*)
*///=========================================================================
std::string mode() noexcept {
if (mode_k) modes.append("k");
if (mode_v) modes.append("v");
if (mode_p) modes.append("p");
if (mode_c) modes.append("c");
if (mode_f) modes.append("f");
if (mode_l) modes.append("l");
if (mode_u) modes.append("u");
} // -x- std::string mode -x-

/*======================================================================*//**
Set the operator modes for use with the @ref operator[] operator (modes that
are not specified will be reset to their defaults).

Calling this method with @c "\0" as the parameter will result in resetting
all operator modes to the base defaults.
@throws std::invalid_argument if an incorrect value is provided
@returns The same Atomize object so as to facilitate stacking
*///=========================================================================
/// Granularity (default is to return the entire atom): @n
/// @c "\0" = entire atom (default) @n
/// @c "k" = key (same as 0 if no key-value pair was detected) @n
/// @c "v" = value (will be empty if no key-value pair was detected) @n
/// @c "p" = returns: "1" = is a key-value pair / "" = not a key-value pair@n
/// Conversion options (default is for no conversion): @n
/// @c "c" = Camel_Case @n
/// @c "f" = First character in upper-case @n
/// @c "l" = all lower case @n
/// @c "u" = ALL UPPER CASE
/// This is primarily used by the empty constructor@n
/// TRUE = invalid mode throws an std::invalid_argument exception (default) @n
/// FALSE = ignore invalid mode
const bool throw_exception = true) {

// --------------------------------------------------------------------------
// Clear all settings.
// --------------------------------------------------------------------------

// --------------------------------------------------------------------------
// Set modes (duplicates are effectively ignored, and 0 never shows up here
// because it terminates the string).
// --------------------------------------------------------------------------
const size_t len = std::strlen(mode);
for (int i = 0; i < len; i++) {
// std::string(1, mode[i]) reports the character itself, not its numeric code
if (throw_exception) throw std::invalid_argument("unrecognized operator_mode \"" + std::string(1, mode[i]) + "\"");
} // -x- switch mode[i] -x-
} // -x- void mode -x-

/*======================================================================*//**
Set the operator modes for use with the @ref operator[] operator (modes that
are not specified will be reset to their defaults).

Calling this method with @c "\0" as the parameter will result in resetting
all operator modes to the base defaults.
@throws std::invalid_argument if an incorrect value is provided
@returns The same Atomize object so as to facilitate stacking
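
To make the Camel_Case conversion option more concrete, here is a minimal
standalone sketch. The @c camel_case function is hypothetical; it assumes
ASCII text and assumes that the start of the string counts as following a
non-letter, so the first letter is also converted.

```cpp
#include <string>

// Hypothetical helper: convert each lower-case ASCII letter that follows a
// non-letter (or that begins the string) to upper-case.
std::string camel_case(std::string str) {
  char prev = '\0'; // Treat the start of the string as a non-letter
  for (char& ch : str) {
    const bool prev_is_letter = (prev >= 'A' && prev <= 'Z')
                             || (prev >= 'a' && prev <= 'z');
    if (!prev_is_letter && ch >= 'a' && ch <= 'z') ch -= 32; // To upper-case
    prev = ch;
  }
  return str;
}
```

For instance, @c camel_case("content_type") produces @c "Content_Type".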
*///=========================================================================
/// Granularity (default is to return the entire atom): @n
/// @c "\0" = entire atom (default) @n
/// @c "k" = key (same as 0 if no key-value pair was detected) @n
/// @c "v" = value (will be empty if no key-value pair was detected) @n
/// @c "p" = returns: "1" = is a key-value pair / "" = not a key-value pair@n
/// Conversion options (default is for no conversion): @n
/// @c "c" = Camel_Case @n
/// @c "f" = First character in upper-case @n
/// @c "l" = all lower case @n
/// @c "u" = ALL UPPER CASE
} // -x- Atomize* mode -x-

/*======================================================================*//**
Set the operator modes for use with the @ref operator[] operator (modes that
are not specified will be reset to their defaults).

Calling this method with @c "\0" as the parameter will result in resetting
all operator modes to the base defaults.
@throws std::invalid_argument if an incorrect value is provided
@returns The same Atomize object so as to facilitate stacking
*///=========================================================================
/// Granularity (default is to return the entire atom): @n
/// @c "\0" = entire atom (default) @n
/// @c "k" = key (same as 0 if no key-value pair was detected) @n
/// @c "v" = value (will be empty if no key-value pair was detected) @n
/// @c "p" = returns: "1" = is a key-value pair / "" = not a key-value pair@n
/// Conversion options (default is for no conversion): @n
/// @c "c" = Camel_Case @n
/// @c "f" = First character in upper-case @n
/// @c "l" = all lower case @n
/// @c "u" = ALL UPPER CASE
char new_mode[]{mode, 0};
__mode(new_mode, false);
} // -x- Atomize* mode -x-

/*======================================================================*//**
Return the total quantity of atoms.
@returns Quantity of atoms
*///=========================================================================
return __data_points.size();
} // -x- size_t size -x-

/*======================================================================*//**
Generate an std::vector<std::string> that contains all atoms.
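
When key-value pairs are split into separate entries, the key entry keeps its
trailing equal sign. The sketch below illustrates the resulting layout using
a hypothetical @c expand_pairs function that works on plain strings instead
of this class's internal offsets.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: expand a list of atoms into a flat list in which each
// key-value pair becomes two entries (the key entry keeps the equal sign).
std::vector<std::string> expand_pairs(const std::vector<std::string>& atoms) {
  std::vector<std::string> out;
  for (const std::string& atom : atoms) {
    const std::string::size_type eq = atom.find('=');
    if (eq == std::string::npos) {
      out.push_back(atom);                   // Not a key-value pair
    } else {
      out.push_back(atom.substr(0, eq + 1)); // Key, including the equal sign
      out.push_back(atom.substr(eq + 1));    // Value
    }
  }
  return out;
}
```

Keeping the equal sign on key entries lets a caller distinguish a key from an
ordinary atom when walking the flat list.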
*///=========================================================================
std::vector<std::string> to_vector(
/// FALSE = don't split key-value pairs (default) @n
/// TRUE = split key-value pairs into separate entries (for key names, the
/// equal sign will be included at the end of the string)
bool split_kv_pairs = false) noexcept {
std::vector<std::string> v;

// --------------------------------------------------------------------------
// Splitting key-value pairs is best handled in a separate loop.
// --------------------------------------------------------------------------
if (split_kv_pairs) {
for (int i = 0; i < __data_points.size(); i++) {
__atom* atom = __data_points[i];
if (atom->vpos != 0) { // Non-zero indicates that a key-value pair was detected
v.push_back(std::string(__origin, atom->fpos, atom->klen + 1)); // +1 includes equal sign
v.push_back(std::string(__origin, atom->vpos, atom->vlen));
} else { // No key-value pair was detected
v.push_back(std::string(__origin, atom->fpos, atom->flen));
} // -x- if atom->vpos -x-
} // -x- if split_kv_pairs -x-

// --------------------------------------------------------------------------
// Returning full atoms is straightforward.
// --------------------------------------------------------------------------
for (int i = 0; i < __data_points.size(); i++) {
__atom* atom = __data_points[i];
v.push_back(std::string(__origin, atom->fpos, atom->flen));
} // -x- std::vector<std::string> to_vector -x-

/*======================================================================*//**
Array-style access to atoms, whilst utilizing the operator mode that was
configured using the @ref mode method.
@throws std::out_of_range if the index is out-of-range
*///=========================================================================
std::string operator[](
/// Index of atom to access (0 = first atom; negative index values are
/// calculated in reverse, starting with -1 as the final atom)
} // -x- std::string operator[] -x-

}; // -x- class Atomize -x-

}; // -x- namespace randolf -x-