Newsgroups: comp.lang.perl.misc
From: "A. Sinan Unur" <1...@llenroc.ude.invalid>
Date: Thu, 15 May 2008 23:14:45 GMT
Local: Thurs, May 15 2008 7:14 pm
Subject: Re: Need ideas on how to make this code faster than a speeding turtle
cha...@lonemerchant.com wrote in
news:b0833433-862e-4895-8002-66c92cc53b15@w5g2000prd.googlegroups.com:

> On May 15, 3:16 pm, "Gordon Etly" <ge...@bentsys-INVALID.com> wrote:
>> cha...@lonemerchant.com wrote:
>> > On May 15, 1:37 pm, Uri Guttman <u...@stemsystems.com> wrote:
>> > chadda  <cha...@lonemerchant.com> writes:
>> > > i have to know if you could write this mess any slower? you are
>> > > doing everything possible to slow you down.
>> > I know I shouldn't criticize free help, but you seem to have some
>> > anger management issues.

...

>> As a simple answer, take a look at LWP::UserAgent
>> (http://search.cpan.org/~gaas/libwww-perl-5.812/lib/LWP/UserAgent.pm),
>> as a good start in the right direction.

...

> I just tried LWP, and now I can't get the code to work for the life of
> me. Here is what I attempted

As I mentioned elsewhere, all you need is LWP::Simple.

So, here is a fish for you:

C:\Temp> cat p.pl
#!/usr/bin/perl

use strict;
use warnings;

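use Data::Dumper;
use HTML::TokeParser;
use LWP::Simple;

my ($input_file) = @ARGV;
die "No input file specified\n" unless defined $input_file;

open my $INPUT, '<', $input_file
    or die "Cannot open '$input_file': $!";

# One product id per line in the input file.
ID:
while ( my $id = <$INPUT> ) {
    chomp $id;
    next ID unless length $id;    # skip blank lines

    my $url = make_url( $id );
    my $html = get $url;

    # LWP::Simple's get returns undef on failure.
    unless ( defined $html ) {
        warn "Error downloading from '$url'\n";
        next ID;
    }

    my $parser = HTML::TokeParser->new( \$html );

    TABLE:
    while ( my $token = $parser->get_tag('table') ) {
        # Not every table carries an id attribute; skip those without
        # triggering an "uninitialized value" warning.
        my $table_id = $token->[1]{id};
        next TABLE unless defined $table_id
            and lc $table_id eq 'product_details';

        my $td = $parser->get_tag('td');
        last TABLE unless $td;
        my $cell = $parser->get_text('/td');

        # The cell holds "Label: digits" pairs; each value must be
        # followed by whitespace for the pattern to match.
        my %data;
        while ( $cell =~ /\s*([^:]+?):\s+(\d+)\s+/g ) {
            $data{$1} = $2;
        }
        print Dumper \%data;
    }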

}

sub make_url {
    return
    sprintf q{http://www.doba.com/members/catalog/%s.html}, $_[0];

}

__END__
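
If you want to see what the pattern in the inner loop does without hitting the network, here is a minimal sketch. The cell text is made up for illustration (I am assuming the labels and digits are separated by whitespace, as the output suggests); the real page may be laid out differently.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical cell text, invented for this example; not fetched from the site.
my $cell = "Product ID: 3308191\nItem ID: 3653992\nSKU: 8930\nUPC: 896207999816\n";

# Same pattern as in p.pl: a label up to the first colon, then digits.
# Each value has to be followed by whitespace for the match to succeed.
my %data;
while ( $cell =~ /\s*([^:]+?):\s+(\d+)\s+/g ) {
    $data{$1} = $2;
}

# Print the pairs in a stable order.
printf "%s => %s\n", $_, $data{$_} for sort keys %data;
```

This prints the four label/value pairs sorted by label. Note that a value sitting at the very end of the string with no trailing whitespace would be skipped by this pattern.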

C:\Temp> timethis p list

$VAR1 = {
          'Product ID' => '3308191',
          'UPC' => '896207999816',
          'Item ID' => '3653992',
          'SKU' => '8930'
        };

TimeThis :  Command Line :  p list
TimeThis :    Start Time :  Thu May 15 18:19:28 2008
TimeThis :      End Time :  Thu May 15 18:19:29 2008
TimeThis :  Elapsed Time :  00:00:01.062

Comparing this to the overhead of an empty script:

C:\Temp> cat t.pl
#!/usr/bin/perl

use strict;
use warnings;

C:\Temp> timethis t

TimeThis :  Command Line :  t
TimeThis :    Start Time :  Thu May 15 18:20:38 2008
TimeThis :      End Time :  Thu May 15 18:20:38 2008
TimeThis :  Elapsed Time :  00:00:00.218

Subtracting the two (1.062 - 0.218), it took 0.844 seconds to retrieve
and parse the required information. Of course, the fixed startup cost
would be better amortized if you ran a lot of these queries in a single
run.
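
To put rough numbers on the amortization, here is a back-of-the-envelope sketch using the two timings above. It assumes every query costs about the same as the single one I measured, which may well not hold for other pages or network conditions; the sub name is just something I picked for the example.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Figures taken from the two timethis runs.
my $startup   = 0.218;               # empty script
my $one_query = 1.062 - $startup;    # fetch + parse for one id

# Assumption: each query costs the same as the one measured.
sub average_per_query {
    my ($n) = @_;
    return ( $startup + $n * $one_query ) / $n;
}

printf "n = %3d: %.3f s per query\n", $_, average_per_query($_)
    for 1, 10, 100;
```

With these figures the average comes out at 1.062, 0.866 and 0.846 seconds per query for 1, 10 and 100 ids, so the startup cost stops mattering fairly quickly and the download itself dominates.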

--
A. Sinan Unur <1...@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)

comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/

