[nycbug-talk] Text parsing question

maddaemon at gmail.com maddaemon at gmail.com
Wed Dec 31 12:10:37 EST 2008


On Wed, Dec 17, 2008 at 10:16 PM, James E Keenan <jkeen at verizon.net> wrote:
>>
>>
>> For example, here are 2 lines:
>>
>> Dec 15 05:15:56 - abc1234 tried logging in from 192.168.8.17
>> Dec 15 05:15:56 - abc1234 tried logging in from 192.168.18.13
>>
>> where 192.168.8.17 is the Windows DC, and the other is the IP of the
>> webmail server.
>>
>> I need to remove the line that contains the DC _ONLY_WHEN_ there is a
>> duplicate entry (same timestamp) with another IP.  The text file
>> contains hundreds of other entries, and there are single entries where
>> the DC IP is the only entry.  Using the above examples, I need to
>> remove the first line and only retrieve the second line:
>>
>> Dec 15 05:15:56 - abc1234 tried logging in from 192.168.18.13
>>
>>
>
> Perhaps this:
>
> #!/usr/bin/perl
> use strict;
> use warnings;
>
> my @last = ( '', '', '' );
> my @this;
> my $pattern = qr/^
>     ([a-zA-Z]{3}\s\d{2}\s\d{2}:\d{2}:\d{2}) # date string
>     \s-\s
>     (\w+)                                   # username
>     .*?
>     (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})    # IP address
>     $/x;
>
> while (my $firstline = <DATA>) {
>     if ($firstline =~ /$pattern/) {
>         @last = ( $1, $2, $3 );
>         last;
>     }
> }
>
> while (my $l = <DATA>) {
>     if ($l =~ /$pattern/) {
>         @this = ( $1, $2, $3 );
>         if ( $this[0] eq $last[0] and $this[1] eq $last[1] ) {
>             $last[2] = $this[2];
>         } else {
>             print ( ( join '|' => @last ), "\n" );
>             @last = @this;
>         }
>     }
> }
> print ( ( join '|' => @last ), "\n" );
>
> __DATA__
> Dec 15 05:15:33 - abc1234 tried logging in from 192.168.8.17
> Dec 15 05:15:56 - abc1234 tried logging in from 192.168.8.17
> Dec 15 05:15:56 - abc1234 tried logging in from 192.168.18.13
> Dec 15 05:16:03 - xyz1ahj tried logging in from 192.168.18.43
> Dec 15 05:16:03 - xyz1ahj tried logging in from 192.168.15.220
> Dec 15 05:16:05 - xyz1ahj tried logging in from 192.168.15.220
> Dec 15 05:16:05 - xyz1ahj tried logging in from 192.168.15.221
> Dec 15 05:16:05 - xyz1ahj tried logging in from 192.168.15.79
> Dec 15 05:16:07 - vig1234 tried logging in from 192.168.15.79

Since I don't know Perl (yet), I showed that to my boss, who then
modified it.  His Perl has some rust on it, though, and the modified
script (below) fails with a pile of errors.  Can anyone show me what
should be fixed so I can get this working and off my plate?  Many thanks.

Oh, and what would need to change so I could pull the data from a file
rather than appending the data to the bottom of the script?  I realize
that this isn't the proper forum for this question, so thanks to
everyone for putting up with me!

#!/usr/bin/perl
use strict;
use warnings;

my @last = ( '', '', '' );
my @this;
my @addys;
my @dcs = ('192.168.8.3', '192.168.8.17', '192.168.32.100');
my $pattern = qr/^
    ([a-zA-Z]{3}\s\d{2}\s\d{2}:\d{2}:\d{2}) # date string
    \s-\s
    (\w+)                                   # username
    .*?
    (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})    # IP address
    $/x;

while ($line = <DATA>) {
    if ($firstline =~ /$pattern/) {
        @last = ( $1, $2);
        push @addys, $3;
        last;
    }
}

while ($line = <DATA>) {
    if ($line =~ /$pattern/) {
        @this = ( $1, $2);
        if ( $this[0] eq $last[0] and $this[1] eq $last[1] ) {
            push @addys, $3;
        }
        else {
            if ($addys == 1) {
                print "$this[0] - $this[1] tried logging in from $addys[0]\n";
            }
            else {
                foreach $addy in @addys {
                   my $flag = false;
                   foreach $dc in @dcs {
                       if $addy eq $dc {$flag = true;}
                   }
                if !$flag {
                   print "$this[0] - $this[1] tried logging in from $addy\n";
                }
            }
        @last = @this;
        @addys = ();

        }
    }
}

foreach $addy in @addys {
    my $flag = false;
    foreach $dc in @dcs {
       if $addy eq $dc {$flag = true;}
    }
    if !$flag {
       print "$this[0] - $this[1] tried logging in from $addy\n";
    }
}

__DATA__
Dec 30 09:34:53 user1234 (tried logging in from 192.168.32.100)
Dec 30 09:34:53 user1234 (tried logging in from 192.168.32.7)
Dec 30 14:38:37 user5678 (tried logging in from 192.168.32.100)
Dec 30 14:38:37 user5678 (tried logging in from 192.168.32.8)
Dec 30 14:38:44 user5678 (tried logging in from 192.168.32.100)
Dec 30 14:38:44 user5678 (tried logging in from 192.168.32.8)
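
For reference, here is a minimal sketch of one way the modified script
above could be repaired; it is not the only possible fix.  The problems
it addresses: $line is never declared with "my", and the first loop
tests $firstline instead; "foreach $addy in @addys" is not Perl syntax
("foreach my $addy (@addys)" is); true and false are not Perl keywords;
"if" conditions need parentheses; "$addys == 1" should be "@addys == 1";
one closing brace is missing; the address on the first line of each new
group is thrown away, and the old group is printed with the new group's
timestamp; and the pattern expects " - " after the timestamp and the IP
at the very end of the line, which the new sample data (no dash,
trailing parenthesis) does not have.  The names flush_group,
filter_lines and %is_dc are made up for this sketch, and keeping the DC
entries when no other address shares the timestamp is an assumption
based on the original requirement.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Domain controller addresses, copied from the script above.
my %is_dc = map { $_ => 1 } ('192.168.8.3', '192.168.8.17', '192.168.32.100');

# Assumption: accept both sample formats seen in this thread, with or
# without the dash after the timestamp and with or without parentheses.
my $pattern = qr/^
    ([a-zA-Z]{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})  # date string
    \s+(?:-\s+)?                                # optional dash separator
    (\w+)                                       # username
    .*?
    (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})        # IP address
    \)?\s*$/x;

# Build the output lines for one (timestamp, user) group.
sub flush_group {
    my ($stamp, $user, @addys) = @_;
    my @others = grep { !$is_dc{$_} } @addys;
    # Drop DC entries only when some other address shares the timestamp.
    my @keep = @others ? @others : @addys;
    return map { "$stamp - $user tried logging in from $_\n" } @keep;
}

# Group consecutive lines by timestamp and user, then filter each group.
sub filter_lines {
    my @lines = @_;
    my (@out, @last, @addys);
    for my $line (@lines) {
        next unless $line =~ $pattern;
        my ($stamp, $user, $ip) = ($1, $2, $3);
        if (@last and ($stamp ne $last[0] or $user ne $last[1])) {
            push @out, flush_group(@last, @addys);
            @addys = ();
        }
        @last = ($stamp, $user);
        push @addys, $ip;
    }
    push @out, flush_group(@last, @addys) if @last;
    return @out;
}

# Sample lines taken from the __DATA__ section above.
my @sample = (
    'Dec 30 09:34:53 user1234 (tried logging in from 192.168.32.100)',
    'Dec 30 09:34:53 user1234 (tried logging in from 192.168.32.7)',
    'Dec 30 14:38:37 user5678 (tried logging in from 192.168.32.100)',
    'Dec 30 14:38:37 user5678 (tried logging in from 192.168.32.8)',
    'Dec 30 14:38:44 user5678 (tried logging in from 192.168.32.100)',
    'Dec 30 14:38:44 user5678 (tried logging in from 192.168.32.8)',
);
print filter_lines(@sample);
```

With the six sample lines this should print three lines, one per
timestamp, each showing only the non-DC address (192.168.32.7, then
192.168.32.8 twice).  A timestamp whose only entry is a DC address is
printed unchanged.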

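On pulling the data from a file rather than from the bottom of the
script: one common approach is to open the file with the three-argument
form of open and read from that handle wherever the script currently
reads <DATA>.  The sketch below is only an illustration; the helper
name read_log is made up, and the path is whatever file holds the log.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return every line of the named file, or die
# with the operating system's error message if it cannot be opened.
sub read_log {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my @lines = <$fh>;
    close($fh);
    return @lines;
}

# Usage: perl script.pl logfile
if (@ARGV) {
    my @lines = read_log($ARGV[0]);
    print scalar(@lines), " lines read from $ARGV[0]\n";
}
```

A smaller change is to replace <DATA> with the empty diamond operator
<> and delete the __DATA__ section; the script then reads the files
named on the command line, or standard input if none are given.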

