Fw: [RndTbl] Oh great RE master
Gilles Detillieux
grdetil at scrc.umanitoba.ca
Wed May 9 16:45:00 CDT 2007
The problem with 's/.*\([[:digit:]]*\).*/\1/g' is that the first .* is
greedy: it swallows as many characters as it can while still letting the
rest of the expression match something. Because * means zero or more of
the preceding item, the [[:digit:]]* and the trailing .* will happily
match nothing at all, so the initial .* swallows the whole line and \1
ends up empty. (That's also why your version without the * returned 5:
the group had to match exactly one digit, so .* gave back only the last
one.) The fix is to make the first part more restrictive than .*,
e.g. [^0-9]* or [^[:digit:]]*, so it won't chew up your digits. But
then Sean's RE is even simpler, so long as you want all the digits and
it doesn't matter where they are. If you needed to extract only the first
contiguous run of digits from a string containing several runs, though,
you'd need to get more elaborate.
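
As a rough sketch of the above (assuming a POSIX sed; the second sample
string, AM005-b12, is made up here purely for illustration), the greedy
version and one possible restricted version can be compared side by side:

```shell
# Sketch only: assumes a POSIX shell and sed.
s='BUILD-AM005-a'

# Original attempt: the leading .* takes the whole line, so the
# group \( \) is left matching the empty string.
greedy=$(echo "$s" | sed 's/.*\([[:digit:]]*\).*/\1/')

# Restricted version: [^0-9]* cannot cross a digit, so it has to stop
# right before the first one and the group picks up the digits.
fixed=$(echo "$s" | sed 's/[^0-9]*\([0-9]*\).*/\1/')

# Hypothetical input with two runs of digits: the same expression keeps
# only the first run, because the trailing .* discards the rest.
first_run=$(echo 'AM005-b12' | sed 's/[^0-9]*\([0-9]*\).*/\1/')

echo "greedy='$greedy' fixed='$fixed' first_run='$first_run'"
```

The restricted expression is one way to get just the first run of digits;
Sean's 's/[^0-9]//g' on the same made-up input would instead join both
runs together.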
An equivalent to Sean's command would be:
echo BUILD-AM005-a | tr -dc '0-9'
This would chew up the newline character as well, but that doesn't
matter if you're going to use the result in a variable using var=`...`
or var=$(...) .
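
A small sketch of the newline point, assuming a POSIX shell with tr, sed
and wc (the variable names are just for illustration):

```shell
# Sketch only: assumes a POSIX shell with tr, sed and wc.
# tr -dc deletes every character NOT in the set, newline included.
num=$(echo BUILD-AM005-a | tr -dc '0-9')

# Count the bytes each command emits. The $(( )) wrapper strips the
# leading blanks that some wc implementations print.
tr_bytes=$(( $(echo BUILD-AM005-a | tr -dc '0-9' | wc -c) ))
sed_bytes=$(( $(echo BUILD-AM005-a | sed 's/[^0-9]//g' | wc -c) ))

# Command substitution strips trailing newlines anyway, so the variable
# holds the same value whichever command produced it.
echo "num=$num tr_bytes=$tr_bytes sed_bytes=$sed_bytes"
```

The sed output is one byte longer only because it keeps the newline.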
Gilles
On 05/09/2007 04:14 PM, Steve Moffat wrote:
> Well, ya... I guess I did the equivalent (though not so concise) method
> after sending the first email to roundtable...
>
> echo APP-AM005-a | sed 's/[[:alpha:]]//g;s/[[:punct:]]//g'
>
> I like the search inversion though Sean. Much cleaner!
>
> So the problem I have is solved, thanks Sean. But why won't my original
> method work?
> The [[:digit:]]* should have matched all the consecutive digits
> shouldn't it? And then the ( ) brackets should place the match into
> buffer 1.
>
> Steve
>
> IBM Global Services
> sjm at ca.ibm.com
> (204)792-3245
>
> ----- Forwarded by Steve Moffat/CanWest/IBM on 05/09/2007 04:08 PM -----
>
> From: "Sean Walberg" <sean at ertw.com>
> Sent by: swalberg at gmail.com
> Date: 05/09/2007 04:05 PM
> To: Steve Moffat/CanWest/IBM at IBMCA
> cc: roundtable at muug.mb.ca
> Subject: Re: [RndTbl] Oh great RE master
>
> # echo BUILD-AM005-a | sed 's/[^0-9]//g'
> 005
>
> Sean
>
>
>
> On 5/9/07, Steve Moffat <Steve.Moffat at ca.ibm.com> wrote:
>
> Hi All;
> I've been trying to write a sed function to return only a numeric
> portion of a string, but can't seem to get it working.
> The input is a single string of letters and numbers, with the
> numbers always consecutive.
> For example: BUILD-AM005-a
>
> I want to get the 005 out of this string.
>
> echo BUILD-AM005-a | sed 's/.*\([[:digit:]]\).*/\1/g'
>
> will return the digit 5. This is good!
>
> So I add an asterisk to try to match multiple digits like:
> echo BUILD-AM005-a | sed 's/.*\([[:digit:]]*\).*/\1/g'
>
> and instead of returning 005, it doesn't match anything, so
> returns nothing.
>
> Can any of you RE masters help me out?
>
> Steve Moffat
> IBM Global Services
> sjm at ca.ibm.com
> (204)792-3245
--
Gilles R. Detillieux E-mail: <grdetil at scrc.umanitoba.ca>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba Winnipeg, MB R3E 3J7 (Canada)