CSed - a cheap sed implementation
=================================

This is yet another descendant of the small-sed written ages ago by
Eric S. Raymond.  He described the original version thus:

  This is a smaller, cheaper, faster SED utility.  Minix uses it.  GNU used
  to use it, until they built their own sed around an extended (some would
  say over-extended) regexp package.

While trying to keep it small, cheap and fast, my goals were to:

- convert it to a more modern C dialect
- make it as compliant with the modern POSIX sed specification as feasible
- remove as many limitations (fixed memory, ...) as feasible

I started from HHsed (also known as sed 1.5), the latest known descendant of
small-sed at the time (March 2003). It turned out that I had to remove a
number of HHsed 'extensions' that I personally consider either useless
or harmful. I hope I haven't betrayed the intentions of the original authors
in doing so.

Laurent Vogel, September 2003  


Licence
=======

This is distributed under the GPL, version 2 or (at your option) any later
version (see http://www.gnu.org/copyleft/gpl.html).
It seems to me that doing this is consistent with

1) the fact that Sed-1.3 itself was released under the GPL in 1998;
2) the wishes of the authors of HHsed:

  You, Dear Reader, may do *anything* you wish with it except steal it.

  Copyright (c) 1991 Eric S. Raymond, David P Kirschbaum & Howard L. Helman
  All Rights Reserved

Features
========

Cheap-sed implements POSIX sed, plus extensions, and minus some bugs
and limitations.

  POSIX extensions
  ----------------

  - \t, \n, \a, \b, ..., \xXX are recognised in REs (including in bracket
    expressions), in the RHS of substitutions, in the text argument of
    i\, a\ and c\, and in the y command. In any other backslash sequence
    the backslash is ignored, except in bracket expressions.
  
  - \+ in a RE is a synonym for \{1,\}, and \? is a synonym for \{0,1\}

  - \< and \> match beginning and end of word respectively

  - collation-related bracket symbols (like [:digit:], [=a=] and [.[.] 
    in bracket expressions) are only recognised in the POSIX locale.
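
  The extensions above can be tried out with a few one-liners. This is
  only a sketch: it assumes that the "sed" found on the PATH is csed (GNU
  sed happens to accept the same escapes), and the input strings and
  variable names are made up for illustration.

```shell
# \+ : one or more occurrences (same as \{1,\})
out_plus=$(printf 'foo boo\n' | sed 's/o\+/o/g')

# \? : zero or one occurrence (same as \{0,1\})
out_opt=$(printf 'color colour\n' | sed 's/colou\?r/X/g')

# \< and \> : beginning and end of a word
out_word=$(printf 'cat concat cat\n' | sed 's/\<cat\>/dog/g')

# \t in the RHS of a substitution stands for a tab character
out_tab=$(printf 'a b\n' | sed 's/ /\t/')

printf '%s\n' "$out_plus" "$out_opt" "$out_word" "$out_tab"
```

  None of these escapes is required by POSIX, so scripts that rely on
  them may not run unchanged on every other sed.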
  
  Optional behaviour
  ------------------

  - sub-expressions are not anchored: /\(^a\)/ is a synonym of /\(\^a\)/, 
    not of /^\(a\)/ (and similarly for "$")

  - "a**" is considered as "a*\*", "a*\{2\}" as "a*{2}"
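
  Because the meaning of "^" inside \( ... \) is optional behaviour, a
  script meant to run on several sed implementations can avoid relying on
  it by spelling out its intent. A small sketch, assuming that "sed" on
  the PATH is csed or another POSIX sed (the input lines and variable
  names are invented for illustration):

```shell
# Anchor outside the group: matches "a" only at the beginning of the line.
anchored=$(printf 'ab\n^ab\n' | sed 's/^\(a\)/<\1>/')

# Escaped caret inside the group: matches a literal "^" followed by "a".
literal=$(printf 'ab\n^ab\n' | sed 's/\(\^a\)/<\1>/')

printf '%s\n' "$anchored" "$literal"
```

  The first command changes only the line "ab", the second only the line
  "^ab".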

  Bugs
  ----

  - Only up to nine subexpressions \(...\) are currently supported. An
    arbitrary number of them should be supported (although only the
    first nine of them can be recalled using \1 to \9).
 
  - \1 ... \9 will have a wrong value if referring to a subexpression for
    which backtracking occurred. For example:

      echo abcada | sed 's/\(\(a\([cd]\)b\)*\)\{2\}/'
    
    reports "d" instead of the correct answer "c"

  - Cheap-sed does not strictly implement the leftmost, longest matching
    rule mandated by POSIX. Instead, each part of the regular expression is
    tried from left to right for the longest match. This is a bit hard to
    explain in words, but here is an example:

      echo "aaabaaa" | sed 's/a*\(a*\)b\1/<&>/'

    outputs "<aaab>aaa" in csed (as most other sed implementations do?),
    whereas according to POSIX it should output "<aaabaaa>" (as GNU sed 4.0
    does).
    (Actually, I have not yet decided whether this is a bug or a feature: I
    don't currently know if it can be implemented at a reasonable cost.)
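
    To find out which of the two behaviours a given sed has, the example
    above can be wrapped in a small check (with the closing delimiter of
    the s command spelled out). This is only a sketch; the variable names
    and the wording of the verdicts are made up for illustration.

```shell
# Run the example and classify the result.
result=$(echo "aaabaaa" | sed 's/a*\(a*\)b\1/<&>/')

case $result in
  '<aaabaaa>') verdict='leftmost, longest match (POSIX)' ;;
  '<aaab>aaa') verdict='left-to-right longest match (csed)' ;;
  *)           verdict="unexpected result: $result" ;;
esac

echo "$verdict"
```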
  
  Limitations
  -----------

  The current version still has the following limitations:

  - no more than 20 levels of { ... } nesting

  - fewer than 10 writeout files

  - in "\{n\}", "\{n,\}" and "\{n,m\}" n and m must be <= 32767 (which, by the
    way, is far more than RE_DUP_MAX = 255 required by POSIX)
  
  - matched \( ... \), when repeated, must be less than 32767 bytes away in
    compiled form.

  Bug reports are welcome by mail at <lvl@club-internet.fr>.


(This archive contains only the 16-bit DOS executable.)

That's it for documentation! I didn't include other HHsed files as
I didn't have time to maintain them and as there is plenty of accurate
documentation available elsewhere, including the POSIX specs for free.

Changes from HHsed
==================

(see the changelog)

A bit of history
================

March 2003
  started working on csed
1998
  sed 1.3 put under the GPL
October 1995
  manpage for sed 1.3 (?) written by Eric S. Raymond
October 1991
  sed 1.5, with more changes from Howard L. Helman, published by David Kirschbaum
September 1991
  sed 1.4 (aka HHsed) a massive rewrite of sed 1.2 by Howard L. Helman,
  distributed by David Kirschbaum
around 1991?
  sed 1.2 for MSDOS (ported to Turbo C 2.0) published by David Kirschbaum
1991
  sed 1.3 (aka small-sed) 
before 1988? (from a date in the sedmod readme file)
  sed 1.0 written by Eric S. Raymond. 
around 1973/1974
  sed designed and implemented on Unix by Lee E. McMahon

Authors
=======

(in chronological order)
Eric S. Raymond
David Kirschbaum
Howard L. Helman
Laurent Vogel

Thanks go to
============

- the Open Group for releasing the POSIX/Single UNIX Specification
  free of charge
- Eric S. Pement for the sed FAQ and for hosting a copy of hhsed source
- Aurélio Marinho Jargas for the authoritative http://sed.sf.net
- Jason Molenda for adding a testsuite in GNU sed (which I stole partly)
- Ken Pizzini and Paolo Bonzini for maintaining the GNU sed testsuite
- Greg Ubben for dc.sed
- members of the sed-users@yahoogroups.com mailing list for discussing
  portability issues.

Historical files also record contributions from (in alphabetical order):
Mark Adler, Robin Cover, John Hollow, Michael D. Lawler, James McNealy,
Tom Oehser.

