I have a .txt file with a bunch of sentences and a second .txt file with words I want to remove from the first file. I just started playing around with regex and I'm not quite sure how to do it.

What I have tried:

I have tried the following method which did not work for me.
C#
class Program
{
    static void Main(string[] args)
    {
        Program p = new Program();
        const string CFd = "..\\..\\Duomenys.txt";
        const string CFr = "..\\..\\Rezultatai.txt";
        const string Remove = "..\\..\\NorimiZodziai.txt";
        string lines = ReadingText(CFd);
        string removeW = ReadingText(Remove);                    //Words i want to remove from CFd file
        string replacedLines = Replaceing(lines, removeW);
        p.Printing(replacedLines,CFr);
    }

    static string ReadingText(string CFd)
    {
        string lines = File.ReadAllText(CFd, Encoding.GetEncoding(1257));
        return lines;
    }
    static string Replaceing(string lines, string removeW)
    {
        string pattern = @"\b"+removeW+"\b";
        string output = Regex.Replace(lines, pattern, "");
        return output;
    }
    void Printing (string replacedLines, string file)
    {
        using (StreamWriter writer = new StreamWriter(file))
        {
            writer.WriteLine(replacedLines);
        }
    }
}
Updated 17-Nov-18 4:34am

You don't need a regex for that: string.Replace will do the job just as well, and quicker. Regex is a general-purpose string processor, so it isn't as fast as a dedicated method.
String.Replace Method (System) | Microsoft Docs
Comments
Mantukasx 17-Nov-18 8:06am    
If I want to remove words from different places, it doesn't remove anything. Do I need to create an array of the words I want to remove?
OriginalGriff 17-Nov-18 8:21am    
Do you want to try that again, this time without expecting me to be able to see your screen, access your HDD, or read your mind?
Mantukasx 17-Nov-18 8:58am    
removeW has 3 words
class Program
{
static void Main(string[] args)
{
Program p = new Program();
Encoding.GetEncoding(1257);
const string CFd = "..\\..\\Duomenys.txt";
const string CFr = "..\\..\\Rezultatai.txt";
const string Remove = "..\\..\\NorimiZodziai.txt";
string lines = ReadingText(CFd);
string removeW = ReadingText(Remove); //Words i want to remove from CFd file
string replacedLines = Replaceing(lines, removeW);
p.Printing(replacedLines,CFr);
}

static string ReadingText(string CFd)
{
string lines = File.ReadAllText(CFd, Encoding.GetEncoding(1257));
return lines;
}

static string Replaceing(string lines, string removeW)
{
lines = lines.Replace(removeW, "");
return lines;
}
void Printing (string replacedLines, string file)
{
using (StreamWriter writer = new StreamWriter(file))
{
writer.WriteLine(replacedLines);
}
}
}
OriginalGriff 17-Nov-18 10:01am    
If you are reading a bunch of words into a single string, you need to use string.Split to "break" the words you want to remove into single items. So if they are separated by space, then
string[] words = removeW.Split(' ');
Then use a loop to remove each of them in turn.
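As a rough sketch of that split-and-loop idea (the class and method names here are made up for illustration, and splitting on any whitespace rather than only spaces is an assumption about how the word file is laid out):

```csharp
using System;

static class WordRemover
{
    // Hypothetical helper: splits the removal list on whitespace and
    // removes each word from the text in turn using string.Replace.
    public static string RemoveEach(string text, string removeList)
    {
        // Assumption: words in the list are separated by spaces, tabs or line breaks.
        string[] words = removeList.Split(
            new[] { ' ', '\t', '\r', '\n' },
            StringSplitOptions.RemoveEmptyEntries);

        foreach (string word in words)
        {
            // string.Replace matches substrings, so removing "cat"
            // would also cut "cat" out of "category".
            text = text.Replace(word, "");
        }
        return text;
    }
}
```

Called as WordRemover.RemoveEach(lines, removeW), this leaves the spaces that surrounded each removed word in place, and it does not distinguish whole words from parts of longer words.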
Quote:
I have tried the following method which did not work for me.

It is a good idea to show examples of how it doesn't work.
Quote:
I have a .txt file with a bunch of sentences and a .txt file with words I want to remove from the first file.

Removing words is a little more complicated than what you did.
C#
string pattern = @"\b"+removeW+"\b";

In a sentence, a word is not necessarily surrounded by spaces: it can sit next to , . ? ! or next to nothing at all if it is the first or last word in the string.
Another problem is that when you remove a word, you don't want to remove both spaces around it; you need to keep one space.
So replacing with a space is already an improvement:
C#
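// Both \b parts must be verbatim strings; a plain "\b" is a backspace character, not a word boundary.
string pattern = @"\b" + removeW + @"\b";
string output = Regex.Replace(lines, pattern, " ");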

You need to define what to do when a word is not between spaces, then work out how that translates into code.
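One way this could look when the removal file holds several words is sketched below. It is only an illustration: the class and method names are invented, and it assumes the words are separated by whitespace and that whole-word matching via \b is what is wanted.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class RegexWordRemover
{
    // Hypothetical helper: removes every whole-word occurrence of the listed words.
    public static string RemoveWholeWords(string text, string removeList)
    {
        // Assumption: words in the list are separated by spaces, tabs or line breaks.
        string[] words = removeList.Split(
            new[] { ' ', '\t', '\r', '\n' },
            StringSplitOptions.RemoveEmptyEntries);

        if (words.Length == 0)
        {
            return text; // nothing to remove
        }

        // Escape each word so characters such as '.' or '+' are taken literally,
        // then join them into one alternation. Both \b parts are verbatim strings.
        string alternatives = string.Join("|", words.Select(w => Regex.Escape(w)));
        string pattern = @"\b(?:" + alternatives + @")\b";

        return Regex.Replace(text, pattern, "");
    }
}
```

With this pattern, "cat" is removed from "A cat, a category." but "category" is left alone. The punctuation and spaces around the removed word stay where they were.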
-----
Just a few interesting links to help with building and debugging RegEx.
Here is a link to RegEx documentation:
perlre - perldoc.perl.org
Here are links to tools that help build and debug RegEx:
.NET Regex Tester - Regex Storm
Expresso Regular Expression Tool
RegExr: Learn, Build, & Test RegEx
Online regex tester and debugger: PHP, PCRE, Python, Golang and JavaScript
This one shows the RegEx as a nice graph, which is really helpful for understanding what a RegEx is doing:
Debuggex: Online visual regex tester. JavaScript, Python, and PCRE.
This site also shows the RegEx as a nice graph, but it can't test what the RegEx matches:
Regexper
Thank you guys, I fixed it. However, after the words are removed, there are blank spaces and separators left. Is there any way I can remove them?
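A possible clean-up pass, offered only as a sketch: the names are hypothetical, and which separators count as leftovers is an assumption, since the thread doesn't say. It collapses repeated spaces and drops a space that ends up in front of punctuation.

```csharp
using System.Text.RegularExpressions;

static class TextCleanup
{
    // Hypothetical helper: tidies the spacing left behind after words are removed.
    public static string Tidy(string text)
    {
        // Collapse runs of two or more spaces/tabs into a single space.
        text = Regex.Replace(text, @"[ \t]{2,}", " ");

        // Drop spaces or tabs that now sit directly before punctuation.
        // Assumption: these six marks are the separators of interest.
        text = Regex.Replace(text, @"[ \t]+([,.;:!?])", "$1");

        // Remove leading and trailing whitespace.
        return text.Trim();
    }
}
```

This does not handle doubled punctuation such as ", ," left behind when a word between two commas is removed; that would need its own rule.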
This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)