[nycbug-talk] Text parsing question

Wed Dec 17 22:16:06 EST 2008

>
>
> For example, here are 2 lines:
>
> Dec 15 05:15:56 - abc1234 tried logging in from 192.168.8.17
> Dec 15 05:15:56 - abc1234 tried logging in from 192.168.18.13
>
> where 192.168.8.17 is the Windows DC, and the other is the IP of the
> webmail server.
>
> I need to remove the line that contains the DC _ONLY_WHEN_ there is a
> duplicate entry (same timestamp) with another IP.  The text file
> contains hundreds of other entries, and there are single entries where
> the DC IP is the only entry.  Using the above examples, I need to
> remove the first line and only retrieve the second line:
>
> Dec 15 05:15:56 - abc1234 tried logging in from 192.168.18.13
>
>

Perhaps this:

#!/usr/bin/perl
use strict;
use warnings;

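# The most recently seen (timestamp, username, IP) record, not yet printed.
my @last;
my @this;
my $pattern = qr/^
     ([a-zA-Z]{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}) # date string; day may be space-padded
     \s-\s
     (\w+)                                      # username
     .*?
     (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})       # IP address
     $/x;

# Prime @last with the first line that matches the pattern.
while (my $firstline = <DATA>) {
     if ($firstline =~ /$pattern/) {
         @last = ( $1, $2, $3 );
         last;
     }
}

# Consecutive lines with the same timestamp and username collapse into one
# record that keeps the IP seen last. This relies on the DC line coming
# first in each group; it does not check for the DC address itself.
while (my $l = <DATA>) {
     if ($l =~ /$pattern/) {
         @this = ( $1, $2, $3 );
         if ( $this[0] eq $last[0] and $this[1] eq $last[1] ) {
             $last[2] = $this[2];
         } else {
             print ( ( join '|' => @last ), "\n" );
             @last = @this;
         }
     }
}
# Flush the final record, unless no line matched at all.
print ( ( join '|' => @last ), "\n" ) if @last;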

__DATA__
Dec 15 05:15:33 - abc1234 tried logging in from 192.168.8.17
Dec 15 05:15:56 - abc1234 tried logging in from 192.168.8.17
Dec 15 05:15:56 - abc1234 tried logging in from 192.168.18.13
Dec 15 05:16:03 - xyz1ahj tried logging in from 192.168.18.43
Dec 15 05:16:03 - xyz1ahj tried logging in from 192.168.15.220
Dec 15 05:16:05 - xyz1ahj tried logging in from 192.168.15.220
Dec 15 05:16:05 - xyz1ahj tried logging in from 192.168.15.221
Dec 15 05:16:05 - xyz1ahj tried logging in from 192.168.15.79
Dec 15 05:16:07 - vig1234 tried logging in from 192.168.15.79
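
If the DC line does not always come first, or if two non-DC hosts can share
a timestamp, a variant that checks the DC address explicitly might be safer.
The following is only a sketch under stated assumptions, not the script
above: filter_dc_duplicates and $DC_IP are names made up for illustration,
the DC address is the one given in the question, and a "duplicate" is taken
to mean the same timestamp and the same username. It drops a DC line only
when another IP shares that key, and passes every other line through
unchanged.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: the DC address from the question; change it for a real log.
my $DC_IP = '192.168.8.17';

# Hypothetical helper: returns the input lines in their original order,
# minus any DC line whose timestamp and username also appear with another IP.
sub filter_dc_duplicates {
    my @lines = @_;
    my $pattern = qr/^(\w{3}\s+\d{1,2}\s[\d:]{8})\s-\s(\w+).*?(\d{1,3}(?:\.\d{1,3}){3})$/;

    # First pass: record every timestamp+username key seen with a non-DC IP.
    my %has_other;
    for my $line (@lines) {
        next unless $line =~ $pattern;
        $has_other{"$1 $2"} = 1 if $3 ne $DC_IP;
    }

    # Second pass: skip DC lines that are shadowed by another IP.
    # Lines that do not match the pattern at all are kept as they are.
    my @kept;
    for my $line (@lines) {
        if ($line =~ $pattern and $3 eq $DC_IP and $has_other{"$1 $2"}) {
            next;
        }
        push @kept, $line;
    }
    return @kept;
}

my @sample = (
    "Dec 15 05:15:33 - abc1234 tried logging in from 192.168.8.17\n",
    "Dec 15 05:15:56 - abc1234 tried logging in from 192.168.8.17\n",
    "Dec 15 05:15:56 - abc1234 tried logging in from 192.168.18.13\n",
    "Dec 15 05:16:03 - xyz1ahj tried logging in from 192.168.18.43\n",
);
print filter_dc_duplicates(@sample);
```

On these four sample lines it keeps the 05:15:33 DC entry (no other IP shares
its timestamp), drops the 05:15:56 DC entry, and keeps the other two. It
reads the whole file into memory and makes two passes, which should be fine
for a log with hundreds of entries.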