diff --git a/CHANGES.md b/CHANGES.md index e5b8e42..21d6857 100644 --- a/CHANGES.md +++ b/CHANGES.md @@ -29,6 +29,17 @@ + Change the information variables to functions. These serve the same purpose, but don't use variable names that Lua explicitly warns users about. +## *3.0.0-1* (April 24, 2016) + ++ Clean up tests. ++ Change the name of `spliterator` to `each`. The new name is less silly and + hopefully clearer. **NB**: For the moment, `spliterator` is still provided as + an alias to `each`. However, in the next major version release (i.e., + 4.0.0-1), `spliterator` will be removed. Please start switching any code that + uses `spliterator` to `each`. ++ Add `first_and_rest`, a string equivalent to a function that splits a list + into its head and tail. + Would you rather view the [documentation][d]? [d]: /README.md diff --git a/README.md b/README.md index 900785b..a871ddb 100644 --- a/README.md +++ b/README.md @@ -2,9 +2,9 @@ ## Synopsis -A string `split` function and iterator for Lua, which doesn't provide such -a function in its standard string library. Such a function is clearly useful, -and [many people have written their own][wiki]. +A string `split` function and iterator for Lua since Lua's standard string +library doesn't provide such a function. When working with text, `split` is very +useful, and [many people have written a version for Lua][wiki]. [wiki]: http://lua-users.org/wiki/SplitJoin @@ -14,16 +14,17 @@ and [many people have written their own][wiki]. The delimiter can be a literal string or a Lua pattern. The function returns a table of items found by splitting the string up into pieces divided by the - delimiter. - - Extra delimiters anywhere in the string will result in empty strings being - returned as part of the results table. + delimiter. If the delimiter is not present in the string, then the result + will be a table consisting of one item: the original string parameter. 
 Extra + delimiters anywhere in the string will result in empty strings being returned + as part of the results table. The function also provides two shortcuts for common situations. If the delimiter parameter is an empty string, the function returns a table - containing every character in the original string as a separate item. If the - delimiter parameter is `nil`, the function considers this equivalent to the - Lua pattern `'%s+'` and splits the string on whitespace. + containing every character in the original string as a separate item. (I.e., + if the delimiter is the empty string, the function explodes the string.) If + the delimiter parameter is `nil`, the function considers this equivalent to + the Lua pattern `'%s+'` and splits the string on whitespace. Examples: @@ -39,41 +40,95 @@ and [many people have written their own][wiki]. * A special case: empty string delimiter - A pattern of the empty string is special. It tells the function to - return each character from the original string as an individual item. - Think of this as "explode the string". + If the delimiter is an empty string, the function returns each + character from the original string as an individual item. Think of + this as "explode the string". split('foo', '') -- returns {'f', 'o', 'o'} - * Another special case: nil delimiter + * Another special case: `nil` delimiter - Passing nothing or an explicit `nil` as the delimiter is a second - special case. `split` treats this as equivalent to a pattern of `'$s+'` - and splits on consecutive runs of whitespace. + Pass nothing or an explicit `nil` as the delimiter and `split` acts as + if the delimiter were `'%s+'`. This makes it easier to split on + consecutive runs of whitespace. 
split('foo bar buzz') -- returns {'foo', 'bar', 'buzz'} -+ `spliterator(string, delimiter) => custom iterator` ++ `each(string, delimiter) => custom iterator` + + **NB**: This function was previously called `spliterator`, but I've renamed + it to the shorter and less goofy `each`. In order to give people who might + rely on the previous name time to switch over, `spliterator` is still + provided as an alias for `each`. However, that name will be removed in the + next major version release (i.e., 4.0.0) of this module. - This is an iterator version of the same idea. Everything from above applies, - except that the function returns a custom iterator to work through results - rather than a table. + This is an iterator version of the same idea as `split`. Everything from + above applies, except that the function returns an iterator to work through + results rather than a table. - local spliter = require 'split'.spliterator + local split_each = require 'split'.each - local str = 'foo,bar,bizz,buzz,' + local str = 'foo,bar,bizz,buzz' local count = 1 - for p in spliter(str, ',') do + for p in split_each(str, ',') do print(count .. '. [' .. p .. ']') count = count + 1 end ++ `first_and_rest(string, delimiter) => string, string (or nil)` + + This function is a string equivalent for a function that divides a list into + its head and tail. The head of the string is everything that appears before + the first appearance of a specified delimiter; the tail is the rest of the + string. `first_and_rest` attempts to split a string into two pieces, and it + returns two results using Lua's multiple return. The exact return values vary + depending on the string and delimiter. + + In the simplest case, the string contains the delimiter at least once. If so, + the first return value will be the portion of the string before the first + appearance of the delimiter, and the second return value will be the rest of + the string after that delimiter. 
+ + If the delimiter does not appear in the string, however, then there's no + possible split. In this case, the first return value will be the entire + string, and the second return value will be `nil`. (From Lua's point of view, + a second return value of `nil` is equivalent to saying that the function only + returns one value.) + + If the second return value is `nil`, there is probably a problem or malformed + record. So it will often make sense to test the second return value before + proceeding. For example: + + local head, tail = first_and_rest(record, '%s*:%s*') + if not tail then + -- Signal an error to the caller. + else + -- Process the record. + end + + A second complication is that the strings returned by the function may be + empty. If the delimiter is found, but the portion of the string before or + after it is zero-length, then an empty string may be returned. The examples + below show various possible outcomes. + + first_and_rest('head: tail', ': ') -- returns 'head', 'tail' + first_and_rest('head, tail', ': ') -- returns 'head, tail', nil + first_and_rest(': tail', ': ') -- returns '', 'tail' + first_and_rest('head: ', ': ') -- returns 'head', '' + + Like `split` and `each`, `first_and_rest` accepts `nil` or an empty string as + special cases for the delimiter. `nil` is automatically transformed into + '%s+', a generic "separated by space" pattern. In the case of an empty string + delimiter, `first_and_rest` returns the first character of the input and the + rest of the input. (This seems to be the only reasonable interpretation of + "exploding" the input string in the context of this function.) + ## Varia The module provides four informational functions that return strings. They should be self-explanatory. -+ `version() -- 2.0.0-1` ++ `version() -- 3.0.0-1` + `author() -- Peter Aronoff` @@ -86,8 +141,12 @@ should be self-explanatory. Many of my ideas came from reading [the LuaWiki page on split][wiki]. 
I thank all those contributors for their suggestions and examples. +[Alexey Melnichuk, AKA moteus][moteus] provided the idea and initial code for +`first_and_rest`. + All mistakes are mine. See [version history][c] for release details. +[moteus]: https://bitbucket.org/moteus [c]: /CHANGES.md --- diff --git a/doc/changes.html b/doc/changes.html index 6eedf6f..b5ce2b0 100644 --- a/doc/changes.html +++ b/doc/changes.html @@ -44,6 +44,20 @@
spliterator
to each
. The new name is less silly and
+hopefully clearer. NB: For the moment, spliterator
is still provided as
+an alias to each
. However, in the next major version release (i.e.,
+4.0.0-1), spliterator
will be removed. Please start switching any code that
+uses spliterator
to each
.first_and_rest
, a string equivalent to a function that splits a list
+into its head and tail.
 Would you rather view the documentation?
A string split
function and iterator for Lua, which doesn’t provide such
-a function in its standard string library. Such a function is clearly useful,
-and many people have written their own.
A string split
 function and iterator for Lua since Lua’s standard string
+library doesn’t provide such a function. When working with text, split
is very
+useful, and many people have written a version for Lua.
The delimiter can be a literal string or a Lua pattern. The function returns a table of items found by splitting the string up into pieces divided by the -delimiter.
- -Extra delimiters anywhere in the string will result in empty strings being -returned as part of the results table.
+delimiter. If the delimiter is not present in the string, then the result
+will be a table consisting of one item: the original string parameter. Extra
+delimiters anywhere in the string will result in empty strings being returned
+as part of the results table.
 The function also provides two shortcuts for common situations. If the
delimiter parameter is an empty string, the function returns a table
-containing every character in the original string as a separate item. If the
-delimiter parameter is nil
, the function considers this equivalent to the
-Lua pattern '%s+'
and splits the string on whitespace.
nil
, the function considers this equivalent to
+the Lua pattern '%s+'
and splits the string on whitespace.
Examples:
@@ -49,37 +50,92 @@A special case: empty string delimiter
-A pattern of the empty string is special. It tells the function to - return each character from the original string as an individual item. - Think of this as “explode the string”.
+If the delimiter is an empty string, the function returns each + character from the original string as an individual item. Think of + this as “explode the string”.
split('foo', '') -- returns {'f', 'o', 'o'}
Another special case: nil delimiter
+Another special case: nil
delimiter
Passing nothing or an explicit nil
as the delimiter is a second
- special case. split
treats this as equivalent to a pattern of '$s+'
- and splits on consecutive runs of whitespace.
Pass nothing or an explicit nil
as the delimiter and split
acts as
+ if the delimiter were '%s+'
. This makes it easier to split on
+ consecutive runs of whitespace.
split('foo bar buzz') -- returns {'foo', 'bar', 'buzz'}
spliterator(string, delimiter) => custom iterator
each(string, delimiter) => custom iterator
NB: This function was previously called spliterator
, but I’ve renamed
+it to the shorter and less goofy each
. In order to give people who might
+rely on the previous name time to switch over, spliterator
is still
+provided as an alias for each
. However, that name will be removed in the
+next major version release (i.e., 4.0.0) of this module.
This is an iterator version of the same idea. Everything from above applies, -except that the function returns a custom iterator to work through results -rather than a table.
+This is an iterator version of the same idea as split
. Everything from
+above applies, except that the function returns an iterator to work through
+results rather than a table.
local spliter = require 'split'.spliterator
+ local split_each = require 'split'.each
- local str = 'foo,bar,bizz,buzz,'
+ local str = 'foo,bar,bizz,buzz'
local count = 1
- for p in spliter(str, ',') do
+ for p in split_each(str, ',') do
print(count .. '. [' .. p .. ']')
count = count + 1
end
first_and_rest(string, delimiter) => string, string (or nil)
This function is a string equivalent for a function that divides a list into
+its head and tail. The head of the string is everything that appears before
+the first appearance of a specified delimiter; the tail is the rest of the
+string. first_and_rest
attempts to split a string into two pieces, and it
+returns two results using Lua’s multiple return. The exact return values vary
+depending on the string and delimiter.
In the simplest case, the string contains the delimiter at least once. If so, +the first return value will be the portion of the string before the first +appearance of the delimiter, and the second return value will be the rest of +the string after that delimiter.
+ +If the delimiter does not appear in the string, however, then there’s no
+possible split. In this case, the first return value will be the entire
+string, and the second return value will be nil
. (From Lua’s point of view,
+a second return value of nil
is equivalent to saying that the function only
+returns one value.)
If the second return value is nil
, there is probably a problem or malformed
+record. So it will often make sense to test the second return value before
+proceeding. For example:
local head, tail = first_and_rest(record, '%s*:%s*')
+ if not tail then
+ -- Signal an error to the caller.
+ else
+ -- Process the record.
+ end
+
+
+A second complication is that the strings returned by the function may be +empty. If the delimiter is found, but the portion of the string before or +after it is zero-length, then an empty string may be returned. The examples +below show various possible outcomes.
+ + first_and_rest('head: tail', ': ') -- returns 'head', 'tail'
+ first_and_rest('head, tail', ': ') -- returns 'head, tail', nil
+ first_and_rest(': tail', ': ') -- returns '', 'tail'
+ first_and_rest('head: ', ': ') -- returns 'head', ''
+
+
+Like split
and each
, first_and_rest
accepts nil
or an empty string as
+special cases for the delimiter. nil
is automatically transformed into
+‘%s+’, a generic “separated by space” pattern. In the case of an empty string
+delimiter, first_and_rest
returns the first character of the input and the
+rest of the input. (This seems to be the only reasonable interpretation of
+“exploding” the input string in the context of this function.)
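
The return-value rules described above for `first_and_rest` can be sketched in a few lines of plain Lua. This is only an illustrative re-implementation built on the standard `string.find` and `string.sub`, written under the assumption that the delimiter is always treated as a Lua pattern; it is not the module's actual code, and the local function here is a hypothetical stand-in for the real one.

```lua
-- Hypothetical sketch of the documented behavior; the module's real
-- implementation may differ.
local function first_and_rest(str, delimiter)
  -- A nil delimiter falls back to the whitespace pattern '%s+'.
  delimiter = delimiter or '%s+'

  -- An empty delimiter yields the first character and the remainder.
  if delimiter == '' then
    return str:sub(1, 1), str:sub(2)
  end

  -- Locate the first match of the delimiter, treated as a Lua pattern.
  local first, last = str:find(delimiter)
  if not first then
    -- No delimiter, so no split: the whole string, then nil.
    return str, nil
  end

  -- Everything before the match, then everything after it.
  return str:sub(1, first - 1), str:sub(last + 1)
end

print(first_and_rest('head: tail', ': '))  -- head    tail
print(first_and_rest('head, tail', ': '))  -- head, tail    nil
```

Because only the first match is consumed, any later delimiters stay in the second return value, so a caller can keep applying the function to the tail to walk through a record piece by piece.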
version() -- 2.0.0-1
version() -- 3.0.0-1
author() -- Peter Aronoff
url() -- https://bitbucket.org/telemachus/split
license() -- BSD 3-Clause
Many of my ideas came from reading the LuaWiki page on split. I thank all those contributors for their suggestions and examples.
+Alexey Melnichuk, AKA moteus provided the idea and initial code for
+first_and_rest
.
All mistakes are mine. See version history for release details.