[ale] sed head scratcher

Jim Kinney wrote:
> On Tue, Aug 4, 2009 at 2:33 PM, JK<jknapka at kneuro.net> wrote:
>> Jim Kinney wrote:
>>> breakout the greybeards....
>>> I have a pile of text lines that need to have a certain portion
>>> transposed from upper case (windows dweebs did the hostnames) to lower
>>> case.
>>> So many lines like:
>>> MACHINE102
>>>  MACHINE304 FredsBox
>>>   MACHINE599  TestingSystemB
>>> etc
>>> So varying IP address then varying spaces then upper case name with
>>> digits then varying spaces and sometimes followed by other name with
>>> mixed case.
>>> I want to ONLY lower case the names after the IP address, not anything
>>> else in the line.
>>> Here's what I have so far:
>>> cat /etc/hosts | sed '/ [A-Z]{7}[0-9]{1,3}/ y/[A-Z]/[a-z]/'
>>> seems like it should work but it only replaces the 'A' with 'a' and it
>>> does it anywhere in the line.
>> Disclaimer 1: I just scrambled the keycaps on my laptop, so this message
>> may make absolutely no sense whatsoever.
>> Disclaimer 2: I Am Not A sed Expert, though I've used it on occasion
>> (and have to re-read the manpage each time).
>> That said:
>> (1) First regexp selects lines to change, but does not restrict the
>> scope of the following command within the selected lines.  The
>> command itself, IIRC, must do the work to restrict itself to a particular
>> region of interest within the line it's operating on. (Accomplishing
>> this is left as an exercise for the reader, but I'm pretty sure that
>> in this case it would involve capture groups in the "to change"
>> regexp and capture-group backrefs in the replacement :-)
>> (2) Assuming sed substitutions are like vi ones, wouldn't you need
>> a "g" (global) qualifier to make the subst cmd do more than a single
>> substitution?
> My thinking as well. However:
> cat testfile
> ABCDFGHI01 little stuff
> 123 BSGXAAKI01 456
> ABCDEFGI02 FredsPage
> [jkinney at worktop tmp]$ cat testfile | sed -r '/[A-Z]{8}[0-9]{1,3}/
> y/[QWERTYUIOPASDFGHJKLZXCVBNM]{8}[0-9]{1,3}/[qwertyuiopasdfghjklzxcvbnm]{8}[0-9]{1,3}/'
> abcdefgi01
> bcdefghi02
> abcdfghi01 little stuff
> 123 bsgxaaki01 456
> abcdefgi02 fredspage
> It LC's all text yet it looks like the search field requires 1-3
> digits. So the string "FredsPage" should not be altered.

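The 'y' command doesn't accept regexp syntax.  It just interprets the
original and substitute strings as literals, and maps them character-
for-character.  So in your second attempt the brackets, braces and
digits simply map to themselves, and every upper-case letter anywhere
on a selected line maps to its lower-case twin, "FredsPage" included.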
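
A quick sketch of that behaviour, assuming a sed that tolerates the
literal brackets in 'y' the way the runs quoted above evidently do.
The sample strings and the variable names (partial, everything) are
made up for illustration:

```shell
# y/// pairs characters by position; '[', '-' and ']' are ordinary characters.
# Here '[' -> '[', 'A' -> 'a', '-' -> '-', 'Z' -> 'z', ']' -> ']'.
partial=$(printf 'ABZ [A-Z]\n' | sed 'y/[A-Z]/[a-z]/')
printf '%s\n' "$partial"      # only 'A' and 'Z' change; 'B' is left alone

# To lowercase every letter with y, both alphabets must be spelled out,
# and the mapping still applies to the whole line, not to a region of it.
everything=$(printf 'MACHINE102 FredsBox\n' |
  sed 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/')
printf '%s\n' "$everything"
```

Which is why 'y' on its own can't leave the mixed-case names alone.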

I think there's a flag to the regexp-based 's' command that means
"downcase the resulting string"; or maybe a flag to use in the
replacement that means "use downcased capture group N".  But I
be far beyond the limits o' me sed knowledge, mate, swimmin' in
punctuation-infested waters.
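
For the record, GNU sed does have such a thing: \L in the replacement
of 's' lowercases whatever follows it, up to \E or the end of the
replacement.  It is a GNU extension, so other seds may not have it.
A minimal sketch using the testfile lines from above; the address
192.0.2.10 and the variable names (fixed, hosts_line) are hypothetical,
chosen only for illustration:

```shell
# \L& lowercases the matched text only; the rest of the line is untouched.
fixed=$(printf '%s\n' 'ABCDFGHI01 little stuff' '123 BSGXAAKI01 456' 'ABCDEFGI02 FredsPage' |
  sed -E 's/[A-Z]{8}[0-9]{1,3}/\L&/')
printf '%s\n' "$fixed"

# hosts-style line: keep the address and spacing (group 1) as they are,
# lowercase only the hostname (group 2). The address is a made-up example.
hosts_line=$(printf '192.0.2.10   MACHINE304 FredsBox\n' |
  sed -E 's/^([0-9.]+[[:space:]]+)([A-Z]+[0-9]+)/\1\L\2/')
printf '%s\n' "$hosts_line"
```

Anything the regexp does not match, such as a trailing "FredsBox", is
not part of the replacement and so keeps its case.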

-- JK

(Yes, I know Talk Like A Pirate Day is a ways off. But it can be
Type Like A Pirate Day every day of the year.)