Prepare for split 3.0.0-1
Peter Aronoff committed Apr 24, 2016
1 parent 7a58a2d commit a741d06
Showing 13 changed files with 401 additions and 109 deletions.
11 changes: 11 additions & 0 deletions CHANGES.md
@@ -29,6 +29,17 @@
+ Change the information variables to functions. These serve the same purpose,
but don't use variable names that Lua explicitly warns users about.

## *3.0.0-1* (April 24, 2016)

+ Clean up tests.
+ Change the name of `spliterator` to `each`. The new name is less silly and
hopefully clearer. **NB**: For the moment, `spliterator` is still provided as
an alias to `each`. However, in the next major version release (i.e.,
4.0.0-1), `spliterator` will be removed. Please start switching any code that
uses `spliterator` to `each`.
+ Add `first_and_rest`, a string equivalent to a function that splits a list
into its head and tail.

Would you rather view the [documentation][d]?

[d]: /README.md
109 changes: 84 additions & 25 deletions README.md
@@ -2,9 +2,9 @@

## Synopsis

A string `split` function and iterator for Lua, which doesn't provide such
a function in its standard string library. Such a function is clearly useful,
and [many people have written their own][wiki].
A string `split` function and iterator for Lua, since Lua's standard string
library doesn't provide such a function. When working with text, `split` is very
useful, and [many people have written a version for Lua][wiki].

[wiki]: http://lua-users.org/wiki/SplitJoin

@@ -14,16 +14,17 @@ and [many people have written their own][wiki].

The delimiter can be a literal string or a Lua pattern. The function returns
a table of items found by splitting the string up into pieces divided by the
delimiter.

Extra delimiters anywhere in the string will result in empty strings being
returned as part of the results table.
delimiter. If the delimiter is not present in the string, then the result
will be a table consisting of one item: the original string parameter. Extra
delimiters anywhere in the string will result in empty strings being returned
as part of the results table.

The function also provides two shortcuts for common situations. If the
delimiter parameter is an empty string, the function returns a table
containing every character in the original string as a separate item. If the
delimiter parameter is `nil`, the function considers this equivalent to the
Lua pattern `'%s+'` and splits the string on whitespace.
containing every character in the original string as a separate item. (I.e.,
if the delimiter is the empty string, the function explodes the string.) If
the delimiter parameter is `nil`, the function considers this equivalent to
the Lua pattern `'%s+'` and splits the string on whitespace.

Examples:

@@ -39,41 +40,95 @@ and [many people have written their own][wiki].

* A special case: empty string delimiter

A pattern of the empty string is special. It tells the function to
return each character from the original string as an individual item.
Think of this as "explode the string".
If the delimiter is an empty string, the function returns each
character from the original string as an individual item. Think of
this as "explode the string".

split('foo', '') -- returns {'f', 'o', 'o'}

* Another special case: nil delimiter
* Another special case: `nil` delimiter

Passing nothing or an explicit `nil` as the delimiter is a second
special case. `split` treats this as equivalent to a pattern of `'%s+'`
and splits on consecutive runs of whitespace.
Pass nothing or an explicit `nil` as the delimiter and `split` acts as
if the delimiter were `'%s+'`. This makes it easier to split on
consecutive runs of whitespace.

split('foo bar buzz') -- returns {'foo', 'bar', 'buzz'}
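
The behaviour described above can be sketched in a few lines of plain Lua. This is only an illustration written from the documentation, not the module's actual source: `sketch_split` is a hypothetical name, and the sketch assumes the delimiter is always handed to `string.find` as a Lua pattern.

```lua
-- Hypothetical re-implementation for illustration; not the module's source.
local function sketch_split(str, delimiter)
  local results = {}

  -- Special case: an empty delimiter "explodes" the string into characters.
  if delimiter == '' then
    for i = 1, #str do
      results[#results + 1] = str:sub(i, i)
    end
    return results
  end

  -- Special case: a missing or nil delimiter means "runs of whitespace".
  delimiter = delimiter or '%s+'

  local position = 1
  while true do
    local first, last = str:find(delimiter, position)
    -- Stop when nothing more matches. The second test guards against
    -- patterns that match the empty string, which would loop forever.
    if not first or last < first then break end
    results[#results + 1] = str:sub(position, first - 1)
    position = last + 1
  end

  -- Whatever follows the last delimiter (possibly '') is the final item.
  results[#results + 1] = str:sub(position)
  return results
end

print(table.concat(sketch_split('foo,bar,,buzz', ','), '|'))
```

Joining the pieces with `|` makes the empty item produced by the doubled comma visible between `bar` and `buzz`.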

+ `spliterator(string, delimiter) => custom iterator`
+ `each(string, delimiter) => custom iterator`

**NB**: This function was previously called `spliterator`, but I've renamed
it to the shorter and less goofy `each`. In order to give people who might
rely on the previous name time to switch over, `spliterator` is still
provided as an alias for `each`. However, that name will be removed in the
next major version release (i.e., 4.0.0) of this module.
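
Code that has to run against more than one release of the module could look up the new name first and fall back to the old one. The sketch below is an assumption about how a caller might handle the transition, not something the module itself provides; `pick_iterator` and the stand-in tables are hypothetical names used so the example runs without the module installed.

```lua
-- Hypothetical helper: prefer the new name, fall back to the deprecated alias.
-- `split_module` stands in for the table returned by require 'split'.
local function pick_iterator(split_module)
  return split_module.each or split_module.spliterator
end

-- Stand-in module tables (assumptions, not the real module).
local function dummy_iterator() end
local old_release = { spliterator = dummy_iterator }
local new_release = { each = dummy_iterator, spliterator = dummy_iterator }
local future_release = { each = dummy_iterator }

print(pick_iterator(old_release) == dummy_iterator)
print(pick_iterator(new_release) == dummy_iterator)
print(pick_iterator(future_release) == dummy_iterator)
```

Once only 3.0.0-1 or later needs to be supported, a plain `require 'split'.each` is simpler.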

This is an iterator version of the same idea. Everything from above applies,
except that the function returns a custom iterator to work through results
rather than a table.
This is an iterator version of the same idea as `split`. Everything from
above applies, except that the function returns an iterator to work through
results rather than a table.

local spliter = require 'split'.spliterator
local split_each = require 'split'.each

local str = 'foo,bar,bizz,buzz,'
local str = 'foo,bar,bizz,buzz'
local count = 1
for p in spliter(str, ',') do
for p in split_each(str, ',') do
print(count .. '. [' .. p .. ']')
count = count + 1
end
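
For readers curious how such an iterator can work, here is a minimal closure-based sketch. It is an illustration under stated assumptions rather than the module's code: `sketch_each` is a hypothetical name, the delimiter is treated as a Lua pattern, and the empty-string delimiter case is left out to keep the example short.

```lua
-- Hypothetical closure-based iterator; not the module's source.
local function sketch_each(str, delimiter)
  delimiter = delimiter or '%s+'   -- nil means "runs of whitespace"
  local position = 1
  local done = false

  return function()
    if done then return nil end
    local first, last = str:find(delimiter, position)
    -- No further (non-empty) match: hand back the remainder, then stop.
    if not first or last < first then
      done = true
      return str:sub(position)
    end
    local piece = str:sub(position, first - 1)
    position = last + 1
    return piece
  end
end

local count = 0
for piece in sketch_each('foo,bar,bizz,buzz', ',') do
  count = count + 1
  print(count .. '. [' .. piece .. ']')
end
```

Because the closure keeps only the current position, no results table is built; pieces are produced one at a time as the `for` loop asks for them.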

+ `first_and_rest(string, delimiter) => string, string (or nil)`

This function is a string equivalent for a function that divides a list into
its head and tail. The head of the string is everything that appears before
the first appearance of a specified delimiter; the tail is the rest of the
string. `first_and_rest` attempts to split a string into two pieces, and it
returns two results using Lua's multiple return. The exact return values vary
depending on the string and delimiter.

In the simplest case, the string contains the delimiter at least once. If so,
the first return value will be the portion of the string before the first
appearance of the delimiter, and the second return value will be the rest of
the string after that delimiter.

If the delimiter does not appear in the string, however, then there's no
possible split. In this case, the first return value will be the entire
string, and the second return value will be `nil`. (From Lua's point of view,
a second return value of `nil` is equivalent to saying that the function only
returns one value.)

If the second return value is `nil`, there is probably a problem or malformed
record. So it will often make sense to test the second return value before
proceeding. For example:

local head, tail = first_and_rest(record, '%s*:%s*')
if not tail then
-- Signal an error to the caller.
else
-- Process the record.
end

A second complication is that the strings returned by the function may be
empty. If the delimiter is found, but the portion of the string before or
after it is zero-length, then an empty string may be returned. The examples
below show various possible outcomes.

first_and_rest('head: tail', ': ') -- returns 'head', 'tail'
first_and_rest('head, tail', ': ') -- returns 'head, tail', nil
first_and_rest(': tail', ': ') -- returns '', 'tail'
first_and_rest('head: ', ': ') -- returns 'head', ''

Like `split` and `each`, `first_and_rest` accepts `nil` or an empty string as
special cases for the delimiter. `nil` is automatically transformed into
`'%s+'`, a generic "separated by space" pattern. In the case of an empty string
delimiter, `first_and_rest` returns the first character of the input and the
rest of the input. (This seems to be the only reasonable interpretation of
"exploding" the input string in the context of this function.)

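The return values described above can be reproduced with a single call to `string.find`. The following is a sketch written from the documentation, not the module's implementation; `sketch_first_and_rest` is a hypothetical name, and the delimiter is assumed to be passed through as a Lua pattern.

```lua
-- Hypothetical re-implementation for illustration; not the module's source.
local function sketch_first_and_rest(str, delimiter)
  -- Special case: an empty delimiter yields the first character and the rest.
  if delimiter == '' then
    return str:sub(1, 1), str:sub(2)
  end

  -- Special case: a missing or nil delimiter means "runs of whitespace".
  delimiter = delimiter or '%s+'

  local first, last = str:find(delimiter)
  if not first then
    -- No delimiter, so no split: the whole string and an explicit nil.
    return str, nil
  end
  -- Either piece may be the empty string.
  return str:sub(1, first - 1), str:sub(last + 1)
end

local head, tail = sketch_first_and_rest('head: tail', ': ')
print(head, tail)
```
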
## Varia

The module provides four informational functions that return strings. They
should be self-explanatory.

+ `version() -- 2.0.0-1`
+ `version() -- 3.0.0-1`

+ `author() -- Peter Aronoff`

@@ -86,8 +141,12 @@ should be self-explanatory.
Many of my ideas came from reading [the LuaWiki page on split][wiki]. I thank
all those contributors for their suggestions and examples.

[Alexey Melnichuk, AKA moteus][moteus] provided the idea and initial code for
`first_and_rest`.

All mistakes are mine. See [version history][c] for release details.

[moteus]: https://bitbucket.org/moteus
[c]: /CHANGES.md

---
14 changes: 14 additions & 0 deletions doc/changes.html
@@ -44,6 +44,20 @@ <h2><em>2.0.0-1</em> (March 5, 2016)</h2>
</ul>


<h2><em>3.0.0-1</em> (April 24, 2016)</h2>

<ul>
<li>Clean up tests.</li>
<li>Change the name of <code>spliterator</code> to <code>each</code>. The new name is less silly and
hopefully clearer. <strong>NB</strong>: For the moment, <code>spliterator</code> is still provided as
an alias to <code>each</code>. However, in the next major version release (i.e.,
4.0.0-1), <code>spliterator</code> will be removed. Please start switching any code that
uses <code>spliterator</code> to <code>each</code>.</li>
<li>Add <code>first_and_rest</code>, a string equivalent to a function that splits a list
into its head and tail.</li>
</ul>


<p>Would you rather view the <a href="index.html">documentation</a>?</p>

<hr />
109 changes: 84 additions & 25 deletions doc/index.html
@@ -12,9 +12,9 @@ <h1>split <a href="https://drone.io/bitbucket.org/telemachus/split/latest"><img

<h2>Synopsis</h2>

<p>A string <code>split</code> function and iterator for Lua, which doesn&rsquo;t provide such
a function in its standard string library. Such a function is clearly useful,
and <a href="http://lua-users.org/wiki/SplitJoin">many people have written their own</a>.</p>
<p>A string <code>split</code> function and iterator for Lua, since Lua&rsquo;s standard string
library doesn&rsquo;t provide such a function. When working with text, <code>split</code> is very
useful, and <a href="http://lua-users.org/wiki/SplitJoin">many people have written a version for Lua</a>.</p>

<h2>Usage</h2>

@@ -23,16 +23,17 @@ <h2>Usage</h2>

<p>The delimiter can be a literal string or a Lua pattern. The function returns
a table of items found by splitting the string up into pieces divided by the
delimiter.</p>

<p>Extra delimiters anywhere in the string will result in empty strings being
returned as part of the results table.</p>
delimiter. If the delimiter is not present in the string, then the result
will be a table consisting of one item: the original string parameter. Extra
delimiters anywhere in the string will result in empty strings being returned
as part of the results table.</p>

<p>The function also provides two shortcuts for common situations. If the
delimiter parameter is an empty string, the function returns a table
containing every character in the original string as a separate item. If the
delimiter parameter is <code>nil</code>, the function considers this equivalent to the
Lua pattern <code>'%s+'</code> and splits the string on whitespace.</p>
containing every character in the original string as a separate item. (I.e.,
if the delimiter is the empty string, the function explodes the string.) If
the delimiter parameter is <code>nil</code>, the function considers this equivalent to
the Lua pattern <code>'%s+'</code> and splits the string on whitespace.</p>

<p>Examples:</p>

@@ -49,37 +50,92 @@ <h2>Usage</h2>
</code></pre></li>
<li><p>A special case: empty string delimiter</p>

<p> A pattern of the empty string is special. It tells the function to
return each character from the original string as an individual item.
Think of this as &ldquo;explode the string&rdquo;.</p>
<p> If the delimiter is an empty string, the function returns each
character from the original string as an individual item. Think of
this as &ldquo;explode the string&rdquo;.</p>

<pre><code> split('foo', '') -- returns {'f', 'o', 'o'}
</code></pre></li>
<li><p>Another special case: nil delimiter</p>
<li><p>Another special case: <code>nil</code> delimiter</p>

<p> Passing nothing or an explicit <code>nil</code> as the delimiter is a second
special case. <code>split</code> treats this as equivalent to a pattern of <code>'%s+'</code>
and splits on consecutive runs of whitespace.</p>
<p> Pass nothing or an explicit <code>nil</code> as the delimiter and <code>split</code> acts as
if the delimiter were <code>'%s+'</code>. This makes it easier to split on
consecutive runs of whitespace.</p>

<pre><code> split('foo bar buzz') -- returns {'foo', 'bar', 'buzz'}
</code></pre></li>
</ul>
</li>
<li><p><code>spliterator(string, delimiter) =&gt; custom iterator</code></p>
<li><p><code>each(string, delimiter) =&gt; custom iterator</code></p>

<p><strong>NB</strong>: This function was previously called <code>spliterator</code>, but I&rsquo;ve renamed
it to the shorter and less goofy <code>each</code>. In order to give people who might
rely on the previous name time to switch over, <code>spliterator</code> is still
provided as an alias for <code>each</code>. However, that name will be removed in the
next major version release (i.e., 4.0.0) of this module.</p>

<p>This is an iterator version of the same idea. Everything from above applies,
except that the function returns a custom iterator to work through results
rather than a table.</p>
<p>This is an iterator version of the same idea as <code>split</code>. Everything from
above applies, except that the function returns an iterator to work through
results rather than a table.</p>

<pre><code> local spliter = require 'split'.spliterator
<pre><code> local split_each = require 'split'.each

local str = 'foo,bar,bizz,buzz,'
local str = 'foo,bar,bizz,buzz'
local count = 1
for p in spliter(str, ',') do
for p in split_each(str, ',') do
print(count .. '. [' .. p .. ']')
count = count + 1
end
</code></pre></li>
<li><p><code>first_and_rest(string, delimiter) =&gt; string, string (or nil)</code></p>

<p>This function is a string equivalent for a function that divides a list into
its head and tail. The head of the string is everything that appears before
the first appearance of a specified delimiter; the tail is the rest of the
string. <code>first_and_rest</code> attempts to split a string into two pieces, and it
returns two results using Lua&rsquo;s multiple return. The exact return values vary
depending on the string and delimiter.</p>

<p>In the simplest case, the string contains the delimiter at least once. If so,
the first return value will be the portion of the string before the first
appearance of the delimiter, and the second return value will be the rest of
the string after that delimiter.</p>

<p>If the delimiter does not appear in the string, however, then there&rsquo;s no
possible split. In this case, the first return value will be the entire
string, and the second return value will be <code>nil</code>. (From Lua&rsquo;s point of view,
a second return value of <code>nil</code> is equivalent to saying that the function only
returns one value.)</p>

<p>If the second return value is <code>nil</code>, there is probably a problem or malformed
record. So it will often make sense to test the second return value before
proceeding. For example:</p>

<pre><code> local head, tail = first_and_rest(record, '%s*:%s*')
if not tail then
-- Signal an error to the caller.
else
-- Process the record.
end
</code></pre>

<p>A second complication is that the strings returned by the function may be
empty. If the delimiter is found, but the portion of the string before or
after it is zero-length, then an empty string may be returned. The examples
below show various possible outcomes.</p>

<pre><code> first_and_rest('head: tail', ': ') -- returns 'head', 'tail'
first_and_rest('head, tail', ': ') -- returns 'head, tail', nil
first_and_rest(': tail', ': ') -- returns '', 'tail'
first_and_rest('head: ', ': ') -- returns 'head', ''
</code></pre>

<p>Like <code>split</code> and <code>each</code>, <code>first_and_rest</code> accepts <code>nil</code> or an empty string as
special cases for the delimiter. <code>nil</code> is automatically transformed into
<code>'%s+'</code>, a generic &ldquo;separated by space&rdquo; pattern. In the case of an empty string
delimiter, <code>first_and_rest</code> returns the first character of the input and the
rest of the input. (This seems to be the only reasonable interpretation of
&ldquo;exploding&rdquo; the input string in the context of this function.)</p></li>
</ul>


@@ -89,7 +145,7 @@ <h2>Varia</h2>
should be self-explanatory.</p>

<ul>
<li><p><code>version() -- 2.0.0-1</code></p></li>
<li><p><code>version() -- 3.0.0-1</code></p></li>
<li><p><code>author() -- Peter Aronoff</code></p></li>
<li><p><code>url() -- https://bitbucket.org/telemachus/split</code></p></li>
<li><p><code>license() -- BSD 3-Clause</code></p></li>
@@ -101,6 +157,9 @@ <h2>Credits</h2>
<p>Many of my ideas came from reading <a href="http://lua-users.org/wiki/SplitJoin">the LuaWiki page on split</a>. I thank
all those contributors for their suggestions and examples.</p>

<p><a href="https://bitbucket.org/moteus">Alexey Melnichuk, AKA moteus</a> provided the idea and initial code for
<code>first_and_rest</code>.</p>

<p>All mistakes are mine. See <a href="changes.html">version history</a> for release details.</p>

<hr />
24 changes: 24 additions & 0 deletions split-3.0.0-1.rockspec
@@ -0,0 +1,24 @@
package = 'split'
version = '3.0.0-1'
source = {
url = 'https://bitbucket.org/telemachus/split/downloads/split-v3.0.0-1.tar.gz',
dir = 'split'
}
description = {
summary = 'String split function and iterator for Lua',
detailed = [[
A string split function and iterator for Lua since the string standard
library doesn't come with one.
]],
homepage = 'https://bitbucket.org/telemachus/split',
license = 'BSD 3-Clause',
maintainer = 'Peter Aronoff <[email protected]>'
}
dependencies = { 'lua >= 5.1' }
build = {
type = 'builtin',
modules = {
split = 'src/split.lua',
},
copy_directories = { 'doc' }
}