The following Perl script currently reads in an HTML file and strips out what I don't need. It also opens a CSV document, which is blank.
My problem is that I want to import the stripped-down results into the CSV's 3 fields, using Name as field 1, Lives in as field 2 and Commented as field 3.
The results are displayed in the cmd prompt but not in the CSV.

use warnings; 
use strict;  
use DBI;
use HTML::TreeBuilder;  
use Text::CSV;
open (FILE, 'punter.htm'); 
#open (my $fh, ">punter.csv") || die "couldn't open the file!";
 my $csv = Text::CSV->new (); 
$csv->column_names('field1', 'field2', 'field3'); 
open my $fh, ">", "punter.csv" or die "new.csv $!"; 
while ( my $l = $csv->getline_hr(my $fh)) { 
    next if ($l->{'field1'} =~ /xxx/); 
    printf "Field1: %s Field2: %s Field3: %s\n", $l->{'field1'}, $l->{'field2'}, $1->{'field3'}; 
$csv->print (my $fh, [my $name, my $location, my $comment]);
close my $fh1 or die "$!"; 
my $tree = HTML::TreeBuilder->new_from_content(     do { local $/; <FILE> } ); 
for ( $tree->look_down( 'class' => 'postbody' ) ) 
my $location = $_->look_down( 'class' => 'posthilit' )->as_trimmed_text;     
my $comment  = $_->look_down( 'class' => 'content' )->as_trimmed_text;     my $name     = $_->look_down( '_tag'  => 'h3' )->as_text;     
$name =~ s/^Re:\s*//;     
$name =~ s/\s*$location\s*$//;      
print "Name: $name\nLives in: $location\nCommented: $comment\n"; } 

An example of the html is -
<div class="postbody"> <h3><a href "foo">Re: John Smith <span class="posthilit">England</span></a></h3> <div class="content">Is C# better than Visula Basic?</div> </div>

How can I get the results into a CSV?
Posted 7-Jul-11 4:40am

Solution 2

I believe your error is rooted in not really understanding the meaning of my. You are using it all over the place, but when you do that you are creating a new variable in the enclosing block. You should really go back and check every instance of my to see if that is what you really want to do.

Specifically in (but not limited to) the line:
$csv->print (my $fh, [my $name, my $location, my $comment]);

You are:
  • creating a new variable $fh (masking the $fh from your open) where print is expecting you to give it an IO handle

  • creating three new variables $name, $location, $comment where print is expecting to get an arrayref

and note that none of these new variables have values, so it is no wonder nothing is being printed. The only reason your close is not giving you a warning is that you mistyped $fh as $fh1. Just fix that and you should see the warning "my" variable $fh masks earlier declaration in same scope.
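To make the point about my concrete, here is a minimal sketch using only core Perl. The variable names and the value 'John Smith' are illustrative, not taken from the poster's real data. It shows that writing my inside an argument list creates a fresh, undefined variable instead of passing the one you already filled in.

```perl
use strict;
use warnings;

my $name = 'John Smith';    # outer variable that already holds a value

# Wrong: "my" inside the list declares a brand-new $name that is
# undefined and visible only inside this block.
my @wrong;
{
    @wrong = (my $name);
}

# Right: leave "my" off to reuse the variable declared earlier.
my @right = ($name);

print defined $wrong[0] ? "wrong: $wrong[0]\n" : "wrong: (undef)\n";
print "right: $right[0]\n";
```

The first print reports (undef) and the second reports John Smith, which is the same effect that leaves the CSV empty in the original script.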

The CSV section would be better as something like this (untested):
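my @names = qw(name location comment);
$csv->column_names(@names);    # getline_hr and print_hr rely on these names
open my $fh, ">", "punter.csv" or die "punter.csv: $!";
# Note: getline_hr reads rows, so it needs a handle opened for reading;
# a handle opened with ">" yields no rows.
while ( my $l = $csv->getline_hr($fh)) {
    next if ($l->{'name'} =~ /xxx/);
    for(@names) { print "$_: ", $l->{$_}, "\n" }
    $csv->print_hr ($fh, $l);    # print_hr takes a hashref; print expects an arrayref
}
close $fh or die "$!";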

This takes advantage of the column naming feature also, and should cope with any number of fields. By the way, I'd suggest never using $l as a variable name in Perl as it looks too much like $1 in many fonts, which of course has a special regex meaning.
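Putting the pieces together, the original goal (one CSV row per post) could be sketched as follows. This is only an illustration under some assumptions: the sample markup is based on the question with the href attribute tidied up, extract_rows is a hypothetical helper name, and the header labels simply reuse the field names from the question.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use Text::CSV;

# Sample markup based on the question (href attribute tidied up).
my $html = <<'HTML';
<div class="postbody">
  <h3><a href="foo">Re: John Smith <span class="posthilit">England</span></a></h3>
  <div class="content">Is C# better than Visual Basic?</div>
</div>
HTML

# Hypothetical helper: returns one [name, location, comment] arrayref per post.
sub extract_rows {
    my ($content) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($content);
    my @rows;
    for my $post ( $tree->look_down( class => 'postbody' ) ) {
        my $location = $post->look_down( class => 'posthilit' )->as_trimmed_text;
        my $comment  = $post->look_down( class => 'content' )->as_trimmed_text;
        my $name     = $post->look_down( _tag => 'h3' )->as_trimmed_text;
        $name =~ s/^Re:\s*//;                  # drop the reply prefix
        $name =~ s/\s*\Q$location\E\s*$//;     # drop the trailing location
        push @rows, [ $name, $location, $comment ];
    }
    $tree->delete;    # release the parse tree
    return @rows;
}

my @rows = extract_rows($html);

my $csv = Text::CSV->new( { binary => 1, eol => "\n" } )
    or die "Cannot use Text::CSV: " . Text::CSV->error_diag;

# Declare $fh once with "my", then reuse it without "my".
open my $fh, '>', 'punter.csv' or die "punter.csv: $!";
$csv->print( $fh, [ 'Name', 'Lives in', 'Commented' ] );    # header row
$csv->print( $fh, $_ ) for @rows;                           # one row per post
close $fh or die "punter.csv: $!";
```

Collecting the rows first and writing them afterwards keeps the parsing and the CSV output separate, so the file handle is declared exactly once.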
LamboLambo 20-Jul-11 11:07am
Great stuff, works a treat. Thanks for the input about the usage of 'my'; I understand how it can conflict if used too often.

Solution 1

If you just want a plain implementation, use a single print statement to a file. A CSV file is nothing but fields separated by commas.
open (MYFILE, ">>$tempFile");
print MYFILE  "field1,field2,field3\n";

If you want to use Text::CSV, then I don't think you should use "my" in "my $fh" again here:
while ( my $l = $csv->getline_hr(my $fh)) {
Member 4749791 19-Jul-11 12:53pm
While I do sometimes use simple prints for CSVs too, it is a bad habit and will break easily. In this case, the second and third fields in particular could contain commas (e.g. location of "Dallas, TX"), and Text::CSV will quote that properly to avoid issues when reading it later.
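The comma problem can be seen in a small sketch. The field values are made up for illustration (only "Dallas, TX" comes from the comment above). It compares a plain join with Text::CSV's combine and parse.

```perl
use strict;
use warnings;
use Text::CSV;

# Illustrative data; the second field contains a comma.
my @fields = ( 'John Smith', 'Dallas, TX', 'Nice post' );

# Naive approach: the comma inside "Dallas, TX" looks like a separator.
my $naive      = join ',', @fields;
my @naive_back = split /,/, $naive;

# Text::CSV quotes the field when writing, so it survives a round trip.
my $csv = Text::CSV->new( { binary => 1 } )
    or die "Cannot use Text::CSV: " . Text::CSV->error_diag;
$csv->combine(@fields) or die "combine failed";
my $line = $csv->string;
$csv->parse($line) or die "parse failed";
my @csv_back = $csv->fields;

printf "naive: %d fields, Text::CSV: %d fields\n",
    scalar @naive_back, scalar @csv_back;
# prints "naive: 4 fields, Text::CSV: 3 fields"
```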
harish85 20-Jul-11 19:42pm
Thanks. Yes, what I meant was that if the OP (after seeing the redeclaration of variables with my everywhere) just requires a plain implementation for CSV, they can print to a file directly. But I don't consider it a bad habit not to use "Text::CSV"; you can customize the writing in your own implementation too.
You have explained it very well in your post. Neat work, My 5!

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 19 Jul 2011
Copyright © CodeProject, 1999-2017