Hi,
 
I have only been learning Perl for less than 2 weeks. I am a C++ programmer.
I have attached a portion of the data below. The data is in file1.txt. I would like to move the data from file1.txt to file2.txt. But, I only want to keep the numbers.
 
Eg: I want row 1 to look like this:
1       1549367 11     8       3      11      0       -12.00  6.00    -0.25   -3.00   0.00    -1.67   -12.00  6.00    -0.64
 
Instead of this:
 
1       Chr26   1549367 11      GGGGGGGAAGA     8       3       Transition      11      0       -12.00  6.00    -0.25   -3.00   0.00    -1.67   -12.00  6.00    -0.64
 
This is what I have done so far (file1.txt will be in @ARGV):
 
open FILE2, "+>file2.txt" or die "Cant not open file2.txt!";
my $line;
while($line = readline(ARGV))
{
        print FILE2 $line;
}
 
The code above only copies content of file1.txt (ARGV) into file2.txt.
I tried to use 'seek' and 'tell()' to solve the problem above, but I got confused :(
 
I also tried this:
 
open(FILE, "file1.txt");
@theFile = <FILE>;
 
This puts every row in the array @theFile. But how can I now modify the elements of one row? (I'm still a novice Perl programmer)
 
Thank you for your help
 
/………………………………………………………………………………………../
The file portion
 
1       Chr26   1549367 11      GGGGGGGAAGA     8       3       Transition      11      0       -12.00  6.00    -0.25   -3.00   0.00    -1.67   -12.00  6.00    -0.64
1       Chr26   1549501 15      ccCctctccccctCC 12      3       Transition      3       12      -17.00  6.00    0.50    1.00    6.00    2.67    -17.00  6.00    0.93
1       Chr26   1549552 14      AagAAaaAAAagga  11      3       Transition      6       8       -31.00  6.00    -2.09   -12.00  3.00    -5.67   -31.00  6.00    -2.86
1       Chr26   1549563 14      tAAaaAAAattat^Ft        9       5       Transversion    5       9       -7.00   6.00    0.22    -64.00  4.00    -18.40  -64.00  6.00    -6.43
1       Chr26   1549726 14      TtTtctTtTtTTTT  13      1       Transition      8       6       -3.00   6.00    1.92    6.00    6.00    6.00    -3.00   6.00    2.21
2       Chr26   1549737 16      T+1Atttt+1aT+1At+1aTt+1aT+1AT+1AT+1AT+1AtT+1A^FA        15      11      Transversion    16      10      -64.00  6.00    -35.67  -64.00  6.00    -46.18  -64.00  6.00    -40.12
2       Chr26   1549815 9       CtCTTTTTT       7       2       Transition      8       1       -3.00   6.00    -0.14   -9.00   0.00    -4.50   -9.00   6.00    -1.11
1       Chr26   1549914 12      gGGGGGGGAGgg    11      1       Transition      9       3       -9.00   6.00    1.18    -4.00   -4.00   -4.00   -9.00   6.00    0.75
1       Chr26   1550018
Posted 19-Oct-11 3:14am by hervebags
Edited 19-Oct-11 4:55am by Mehdi Gholam (v2)

1 solution


Solution 1

You could do this in a couple of different ways.
 
The best would be to write a regular expression that cleans the data the way you want it. Another solution is to split each line into columns as described below, and then add a check for non-numeric characters in each column. I found this here (http://perldoc.perl.org):
 

How do I extract selected columns from a string?
(contributed by brian d foy)
If you know the columns that contain the data, you can use substr to extract a single column.
my $column = substr( $line, $start_column, $length );
You can use split if the columns are separated by whitespace or some other delimiter, as long as whitespace or the delimiter cannot appear as part of the data.
my $line    = ' fred barney   betty   ';
my @columns = split /\s+/, $line;
    # ( '', 'fred', 'barney', 'betty' );
my $line = 'fred||barney||betty';
my @columns = split /\|/, $line;
    # ( 'fred', '', 'barney', '', 'betty' );
If you want to work with comma-separated values, don't do this since that format is a bit more complicated. Use one of the modules that handle that format, such as Text::CSV , Text::CSV_XS , or Text::CSV_PP .

If you want to break apart an entire line of fixed columns, you can use unpack with the A (ASCII) format. By using a number after the format specifier, you can denote the column width. See the pack and unpack entries in perlfunc for more details.
my @fields = unpack( "A8 A8 A8 A16 A4", $line );
Note that spaces in the format argument to unpack do not denote literal spaces. If you have space separated data, you may want split instead.
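
Applied to the data in the question, the split approach might look like the sketch below. The helper name numeric_fields and the pattern used to decide what counts as a number are my own assumptions, not part of the FAQ; the pattern accepts an optional sign followed by an integer or a decimal and rejects everything else, so adjust it if your numbers can take other forms.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the FAQ): return only those fields of a
# whitespace-separated line that look like plain integers or decimals.
sub numeric_fields {
    my ($line) = @_;

    # split ' ' skips leading whitespace, so there is no empty first
    # field as there would be with split /\s+/.
    my @fields = split ' ', $line;

    # Assumed definition of "number": optional sign, then digits with
    # an optional decimal part (or a leading decimal point).
    return grep { /^[-+]?(?:\d+\.?\d*|\.\d+)$/ } @fields;
}

# Start of the first row from the question.
my $row = "1\tChr26\t1549367\t11\tGGGGGGGAAGA\t8\t3\tTransition\t11\t0\t-12.00";
print join("\t", numeric_fields($row)), "\n";
```

Columns such as Chr26, the base string and Transition contain characters outside the pattern, so they are dropped, while signed values like -12.00 are kept.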

 
I haven't the time at the moment to create the regex for you as that would be my primary choice, or to update the code above to accommodate your question completely, but hopefully it can get you down the right path. I'll update my answer if I get a chance in the next couple of hours.
Comments
hervebags at 19-Oct-11 9:59am
   
Wow! Thanks for the many options.
Let me get to work now. I will contact you in a bit.
Simon Bang Terkildsen at 19-Oct-11 11:10am
   
My 5
hervebags at 19-Oct-11 14:08pm
   
Hello Marcus,
 
Thanks very much for your help.
I am unfortunately still having problems.
 
This is what I now have:
 
[code]
#!/usr/bin/perl -w
require 5.10.1; ## The required version
use strict;
 
open FILE1, "<../file1.txt" or die "Cant not create the file!";
open FILE2, "+>file2.txt" or die "Cant not create file2.txt!";
 
while(my $line = )
{
    my @words = split /\s+/,$line;
    my $line_out;
    foreach my $word (@words)
    {
        if ($word !~ m/[^-+.d]/)
        {
            $line_out .= $word . '';
        }
    }
    print FILE2 "$line_out\n";
}
close FILE1;
close FILE2;
[/code]
 
This is my error message:
[code]
Use of uninitialized value $_ in split at extractColumns3.pl line 25, <> line 1.
Use of uninitialized value $_ in split at extractColumns3.pl line 25, <> line 2.
Use of uninitialized value $_ in split at extractColumns3.pl line 25, <> line 3.
Use of uninitialized value $_ in split at extractColumns3.pl line 25, <> line 4.
[/code]
Apologies for the basic questions. I'm just a beginner.
 
Thank you for your help.
 
Herve
Marcus Kramer at 20-Oct-11 13:09pm
   
It doesn't look to me like you are reading anything from the file into $line, either before you hit your while loop or inside it to get the next line from the file.
Instead of using $line at all, you could follow the suggestions from perlfect.com. Scroll down to the "Reading Files" section; I would suggest using that approach to iterate through the lines in the file.
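
For reference, a line-by-line read loop could be sketched roughly as follows. This is only an illustration under my own assumptions: the name filter_file, the lexical filehandles $in and $out, and the number pattern are not taken from your script or from perlfect.com. The point is that the filehandle has to appear inside the angle brackets of the while condition, so that each pass reads the next line into $line.

```perl
use strict;
use warnings;

# Hypothetical helper: copy only the numeric columns of $in_path to $out_path.
sub filter_file {
    my ($in_path, $out_path) = @_;

    # Three-argument open with lexical filehandles.
    open my $in,  '<', $in_path  or die "Cannot open $in_path: $!";
    open my $out, '>', $out_path or die "Cannot open $out_path: $!";

    # <$in> returns the next line on each pass and undef at end of file.
    while (my $line = <$in>) {
        chomp $line;
        # Assumed definition of "number": optional sign, integer or decimal.
        my @numbers = grep { /^[-+]?(?:\d+\.?\d*|\.\d+)$/ } split ' ', $line;
        print {$out} join("\t", @numbers), "\n";
    }

    close $in;
    close $out or die "Cannot close $out_path: $!";
    return;
}

# Run only when both file names are given on the command line.
filter_file(@ARGV[0, 1]) if @ARGV == 2;
```

Saved under a name of your choice, say filter.pl, it would be run as: perl filter.pl file1.txt file2.txt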
hervebags at 20-Oct-11 13:49pm
   
Sorry! I made a mistake in one of the lines.
It is -------> while(my $line = )
Not -------> while(my $line = )

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
