
[ale] sed head scratcher



Jim Kinney wrote:
> On Tue, Aug 4, 2009 at 2:33 PM, JK<jknapka at kneuro.net> wrote:
>> Jim Kinney wrote:
>>> breakout the greybeards....
>>>
>>>
>>> I have a pile of text lines that need to have a certain portion
>>> transposed from upper case (windows dweebs did the hostnames) to lower
>>> case.
>>>
>>> So many lines like:
>>>
>>> 192.168.0.2 MACHINE102
>>> 192.168.3.4  MACHINE304 FredsBox
>>> 10.0.2.3   MACHINE599  TestingSystemB
>>> etc
>>>
>>> So varying IP address then varying spaces then upper case name with
>>> digits then varying spaces and sometimes followed by other name with
>>> mixed case.
>>>
>>> I want to ONLY lower case the names after the IP address, not anything
>>> else in the line.
>>>
>>> Here's what I have so far:
>>>
>>> cat /etc/hosts | sed '/ [A-Z]{7}[0-9]{1,3}/ y/[A-Z]/[a-z]/'
>>>
>>> seems like it should work but it only replaces the 'A' with 'a' and it
>>> does it anywhere in the line.
>>
>> Disclaimer 1: I just scrambled the keycaps on my laptop, so this message
>> may make absolutely no sense whatsoever.
>>
>> Disclaimer 2: I Am Not A sed Expert, though I've used it on occasion
>> (and have to re-read the manpage each time).
>>
>> That said:
>>
>> (1) First regexp selects lines to change, but does not restrict the
>> scope of the following command within the selected lines.  The
>> command itself, IIRC, must do the work to restrict itself to a particular
>> region of interest within the line it's operating on. (Accomplishing
>> this is left as an exercise for the reader, but I'm pretty sure that
>> in this case it would involve capture groups in the "to change"
>> regexp and capture-group backrefs in the replacement :-)
>>
>> (2) Assuming sed substitutions are like vi ones, wouldn't you need
>> a "g" (global) qualifier to make the subst cmd do more than a single
>> substitution?
> 
> My thinking as well. However:
> 
> cat testfile
> ABCDEFGI01
> BCDEFGHI02
> ABCDFGHI01 little stuff
> 123 BSGXAAKI01 456
> 123.0.23.45 ABCDEFGI02 FredsPage
> 
> 
> [jkinney at worktop tmp]$ cat testfile | sed -r '/[A-Z]{8}[0-9]{1,3}/
> y/[QWERTYUIOPASDFGHJKLZXCVBNM]{8}[0-9]{1,3}/[qwertyuiopasdfghjklzxcvbnm]{8}[0-9]{1,3}/'
> abcdefgi01
> bcdefghi02
> abcdfghi01 little stuff
> 123 bsgxaaki01 456
> 123.0.23.45 abcdefgi02 fredspage
> 
> It LC's all text yet it looks like the search field requires 1-3
> digits. So the string "FredsPage" should not be altered.

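The 'y' command doesn't accept regexp syntax.  It just interprets the
original and substitute strings as literals, and maps them character-
for-character.  So your brackets, braces, and digits each map to
themselves, every upper-case letter maps to its lower-case twin, and
that mapping is applied to the whole of each selected line, "FredsPage"
included.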
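
A quick way to see the literal mapping is to run the original
y/[A-Z]/[a-z]/ on one of the test lines. This is only a sketch; the
function names y_demo and y_full are made up for illustration. Since
'[', '-' and ']' are treated as ordinary characters, only 'A' and 'Z'
are actually translated:

```shell
# y/src/dst/ maps the Nth character of src to the Nth character of dst.
# Here that means '[' -> '[', 'A' -> 'a', '-' -> '-', 'Z' -> 'z', ']' -> ']'.
y_demo() {
    sed 'y/[A-Z]/[a-z]/'
}

# Spelling out both alphabets maps every letter, but the mapping still
# applies to the whole line, not just to the hostname.
y_full() {
    sed 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/'
}

echo '123.0.23.45 ABCDEFGI02 FredsPage' | y_demo
echo '123.0.23.45 ABCDEFGI02 FredsPage' | y_full
```

The first command prints "123.0.23.45 aBCDEFGI02 FredsPage" and the
second prints "123.0.23.45 abcdefgi02 fredspage", which matches the
"only the 'A' changes" symptom from the first attempt.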

I think there's a flag to the regexp-based 's' command that means
"downcase the resulting string"; or maybe a flag to use in the
replacement that means "use downcased capture group N".  But I
be far beyond the limits o' me sed knowledge, mate, swimmin' in
punctuation-infested waters.
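
For what it's worth, here is a sketch of the "downcased capture group"
idea. It assumes GNU sed, where \L in the replacement text lower-cases
whatever follows it; that is a GNU extension, not POSIX, so other seds
may not have it. The function name lc_hostname and the exact regexp are
my own guesses at what fits the sample lines, not a tested recipe for
every hosts file:

```shell
# Assumes GNU sed: \L in the replacement is a GNU extension, not POSIX.
# \1 = the IP address plus the spaces after it, \2 = the upper-case hostname.
# \L lower-cases what follows it in the replacement, so only \2 is affected;
# the rest of the line lies outside the match and passes through untouched.
lc_hostname() {
    sed -E 's/^([0-9.]+[[:space:]]+)([A-Z]+[0-9]+)/\1\L\2/'
}

printf '%s\n' \
    '192.168.0.2 MACHINE102' \
    '192.168.3.4  MACHINE304 FredsBox' \
    '10.0.2.3   MACHINE599  TestingSystemB' | lc_hostname
```

On the sample lines this should lower-case MACHINE102, MACHINE304 and
MACHINE599 while leaving FredsBox and TestingSystemB alone. No "g" flag
is needed because there is only one hostname to fix per line.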

-- JK

(Yes, I know Talk Like A Pirate Day is a ways off. But it can be
Type Like A Pirate Day every day of the year.)