Hi All,

I am having a lot of trouble doing something that I think should be trivial, but I cannot find the right solution.

The problem is this: I have a CSV text file (whose content I have no control over) in which some lines end with "\t\t\r\n" when they should end with "\r\n". The extra '\t' characters cause problems when importing the text file into MySQL.

I thought sed could take care of that, but I have not found anything that really works. Hours of Googling have not helped either.

Any help/suggestion would be very much appreciated.

What I have tried:

I have gotten far enough to find a way to declare the two strings:

mytest=$'\t\t\r\n'
mytest2=$'\r\n'

When I echo those strings to a file as follows:

echo "$mytest"+"$mytest2" > bin.txt

I get the bin.txt file and, sure enough, it has the expected content ("\t\t\r\n+\r\n").

What I have not managed so far is to get the sed command to use these strings to replace each occurrence of mytest with the content of mytest2 in a file.
Posted
Updated 13-Nov-20 3:29am

Try something like:
cat sourcefile | sed 's/\t//g' > destfile

That should replace every occurrence of '\t' with nothing, effectively removing them.
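One caveat worth illustrating: removing every tab also removes any tabs inside a line, which matters if the file happens to use tabs as field separators (an assumption; the question only says the file is a CSV). The sketch below contrasts the two behaviours in Python, the language used later in this thread. The function names and the sample data are made up for illustration.

```python
import re

def strip_all_tabs(data: bytes) -> bytes:
    # Same effect as the sed expression s/\t//g: every tab byte is removed.
    return data.replace(b"\t", b"")

def strip_trailing_tabs(data: bytes) -> bytes:
    # Remove only a run of tabs that sits directly before a CRLF line ending.
    return re.sub(rb"\t+(?=\r\n)", b"", data)

# Hypothetical sample: tab-separated fields, first line has two stray tabs.
sample = b"id\tname\t\t\r\n1\tAlice\r\n"

print(strip_all_tabs(sample))       # tabs between fields are gone as well
print(strip_trailing_tabs(sample))  # only the stray trailing tabs are gone
```

If the file is comma-separated and contains no other tabs, both approaches give the same result.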
 
Comments
k5054 13-Nov-20 9:21am    
Not sure why you would use cat when sed knows how to read from a file
sed -e 's/\t//g' sourcefile > destfile

You could also do this in place:
sed -i -e 's/\t//g' sourcefile
fd9750 13-Nov-20 9:29am    
Hi,
I have tried innumerable variants of that, but none of them ever succeeded.
In the meantime I have found a way to do it with a Python script.
Richard MacCutchan 13-Nov-20 9:30am    
I tried it with my suggestion and it worked fine.
k5054 13-Nov-20 9:32am    
Works for me too... Maybe the OP actually has a literal '\' followed by 't' to replace, rather than tab characters?
Richard MacCutchan 13-Nov-20 9:33am    
I wondered about that too.
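One way to settle that question is to count both forms in the raw bytes of the file. This is only a rough diagnostic sketch; the function name and the sample data are invented for illustration.

```python
def count_tab_forms(data: bytes) -> dict:
    # b"\t" is a single tab byte (0x09);
    # b"\\t" is two bytes: a backslash followed by the letter t.
    return {
        "real_tabs": data.count(b"\t"),
        "literal_backslash_t": data.count(b"\\t"),
    }

# Hypothetical samples: one line with real tabs, one with the escape
# sequences spelled out as ordinary characters.
sample_real = b"a\t\t\r\n"
sample_literal = b"a\\t\\t\\r\\n"

print(count_tab_forms(sample_real))
print(count_tab_forms(sample_literal))

# For a real file, read it in binary mode first, for example:
#   with open("MyTestFile.txt", "rb") as f:
#       print(count_tab_forms(f.read()))
```

A non-zero "literal_backslash_t" count would suggest that a pattern matching real tab characters cannot match the file.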
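#!/usr/bin/python
# replace.py
# Replace a byte string in a file, in place.

match = b'\t\t\r\n'    # unwanted line ending: two tabs before CRLF
replace = b'\r\n'      # desired line ending: plain CRLF
filename = 'MyTestFile.txt'

print("Replacing strings in", filename)

# Read in binary mode so the \r\n line endings are not translated.
with open(filename, "rb") as f:
    data = f.read().replace(match, replace)

# Write back in binary mode so the bytes are stored exactly as given.
with open(filename, "wb") as f:
    f.write(data)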

The trick is to open the file in binary mode, specify the match and replace strings as bytes, and write the file back in binary mode. It works like a charm.
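A likely reason binary mode matters here: when Python opens a file in text mode with the default newline handling, "\r\n" is translated to "\n" on reading, so a pattern containing "\r\n" would never match. The sketch below shows the difference using a temporary file; the helper name and the sample bytes are made up for illustration.

```python
import os
import tempfile

def read_both_ways(raw: bytes):
    # Write the raw bytes to a temporary file, then read the file back
    # once in text mode and once in binary mode.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(raw)
        path = tmp.name
    try:
        with open(path, "r", encoding="ascii") as f:
            as_text = f.read()    # default newline handling translates \r\n to \n
        with open(path, "rb") as f:
            as_bytes = f.read()   # bytes are returned unchanged
    finally:
        os.remove(path)
    return as_text, as_bytes

as_text, as_bytes = read_both_ways(b"a\t\t\r\nb\r\n")
print(repr(as_text))
print(repr(as_bytes))
```

Opening the file in text mode with newline="" also turns the translation off, but working with bytes avoids having to pick an encoding.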
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


