Regular Expressions

You don't seem to be allowing any content between the <a> and </a>. I think you need (.*) in there (it should be a replacement group because it's what you want to keep), and the matching to be done non-greedily.
Also, you don't allow for multiple <a> inside the same <pre>.
I don't really think regex is the right tool for this job, as there's a lot of context and state change in what you're trying to do. An XML-based solution seems better to me. But something like
<pre[^>]*>(.*)(?:<a[^>]*>(.*)</a>(.*))*</pre>
... in non-greedy mode and writing all the found groups into the output might work.
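To illustrate what a parser-based approach could look like, here is a minimal Python sketch using the standard library's `html.parser`. The thread does not spell out the exact transformation wanted, so collecting the text of each `<a>` inside a `<pre>` is only an assumed stand-in task, and the names `PreLinkCollector` and `links_in_pre` are made up for the example. The point is that the parser tracks the nesting state instead of the pattern having to.

```python
from html.parser import HTMLParser


class PreLinkCollector(HTMLParser):
    """Collects the text of every <a> element that sits inside a <pre>."""

    def __init__(self):
        super().__init__()
        self.pre_depth = 0    # greater than 0 while inside a <pre>
        self.in_link = False  # True while inside an <a> within a <pre>
        self.links = []       # collected link texts

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.pre_depth += 1
        elif tag == "a" and self.pre_depth > 0:
            self.in_link = True
            self.links.append("")

    def handle_endtag(self, tag):
        if tag == "pre" and self.pre_depth > 0:
            self.pre_depth -= 1
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current link.
        if self.in_link:
            self.links[-1] += data


def links_in_pre(html):
    parser = PreLinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

Several `<a>` elements in the same `<pre>` are handled naturally, and links outside any `<pre>` are ignored.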

Hi,
Thanks for your answer. The content between "a" and "/a" has to stay, so my variant might be OK on this issue. But I never get the complete content of the "pre" tag. This is quite a problem.
What do you mean by "XML-based solution"? How might XML help me with changing heaps of HTML pages?

Sometimes it makes sense to split the problem up. Rather than attempting to grab the pre tags that contain a tags, why not grab the pre tags and then iterate over them - and do a second regex for the a tag. That way, you have the full context without worrying about backtracking.
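A rough Python sketch of that two-pass idea, under the assumption that `<pre>` blocks are not nested: the first pattern grabs each `<pre>` block non-greedily, and the second one is run only on that block's content. The function name `links_per_pre` and the choice to return the link texts are illustrative; what you do with each match depends on the actual replacement you need.

```python
import re

# \b keeps "<a" from also matching tags such as <abbr>; DOTALL lets "." span lines.
PRE_BLOCK = re.compile(r"<pre\b[^>]*>(.*?)</pre>", re.DOTALL | re.IGNORECASE)
A_TAG = re.compile(r"<a\b[^>]*>(.*?)</a>", re.DOTALL | re.IGNORECASE)


def links_per_pre(html):
    """Return one list per <pre> block holding the text of each <a> in it."""
    result = []
    for pre in PRE_BLOCK.finditer(html):
        body = pre.group(1)                # full content of this <pre>
        result.append(A_TAG.findall(body)) # second pass, only inside the block
    return result
```

Because the inner pattern only ever sees one block at a time, several `<a>` tags in the same `<pre>` are no longer a problem.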

My friend is building a Windows Forms application and needs to validate a username that is a minimum of 4 and a maximum of 15 characters long. Hyphens, underscores and dots are allowed in the middle, but not at the start or the end of the username. There may be no more than one hyphen, one underscore or one dot in a row.
Examples of disallowed usernames:
-aquib
_aquib
.aquibxyz
aquib.
aquibxyz--qureshi
aquib__xyzqureshi
aquibqureshi-
aquib_qureshi-qureshi
aquib..qureshi
1236584 // a username of only digits is not allowed
aquib_ // no symbols at the end
The username should not consist of only digits; it should be either a mix of digits and letters or letters only.
I hope this is understandable.
I have got this regex:
^([a-zA-Z0-9](?(?!__|--)[a-zA-Z0-9_\-]){0,4}[a-zA-Z0-9])$
which is not useful enough :(

Yup, thanks. But can you simply provide me with a regex that accepts only letters and digits? (No special characters: dots, hyphens, underscores and all other symbols are not allowed.)
The length must be a minimum of 3 and a maximum of 15 characters,
and it cannot consist of only digits; it should be either letters and digits or only letters
(a username should not be only numbers; it should be alphanumeric).
I have downloaded Expresso, but it's too difficult to understand. I am a beginner in C#.
Sorry, I must have frustrated many coders
:(

If you take a look at The 30 Minute Regex Tutorial here on CodeProject, you can work it out fairly easily to be something like:
\b\w{3,15}\b
You may need to modify the above for your specific requirements but it should get you started.
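As one possible modification for the stated requirements (letters and digits only, 3 to 15 characters, not digits only), here is a small Python sketch. Note that `\w` also accepts the underscore and that `\b\w{3,15}\b` on its own accepts all-digit names, so the sketch uses an explicit character class plus a negative lookahead. The name `is_valid_username` is just for illustration.

```python
import re

# (?![0-9]+$) rejects names made of digits only;
# [A-Za-z0-9]{3,15} then allows 3 to 15 letters or digits and nothing else.
USERNAME = re.compile(r"(?![0-9]+$)[A-Za-z0-9]{3,15}")


def is_valid_username(name):
    # fullmatch requires the whole string to match, so no ^ and $ are needed.
    return USERNAME.fullmatch(name) is not None


for candidate in ["abc", "abc123", "12345", "ab", "a_b"]:
    print(candidate, is_valid_username(candidate))
```

The same pattern should carry over to other regex flavours if you wrap it in `^` and `$`, though that is worth testing in your own environment.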
One of these days I'm going to think of a really clever signature.

OK, I've now worked out how to do it almost perfectly. For explanation purposes I've split the expression across several lines further down, but if you use it, it has to be written on one line like below:
(?!^[a-zA-Z0-9\.\-_]{1,2}$)(?!^[a-zA-Z0-9\.\-_]{16,}$)(?!^[0-9]{3,15}$)^(?:[a-zA-Z0-9](?:\.|_|-)?)+[a-zA-Z0-9]$
Explanation:
There is a grouping operator for regular expressions that is called zero-width negative lookahead (?! < subexpression >). What does that mean? Let's dissect this monster description:
- Zero-width lookahead
- This essentially means that this matching expression will not consume any characters. So once this expression has been evaluated, any following non-zero-width expression will start off at the same place in the string as the zero-width expression did.
- Negative
- This signifies that all following expressions can only match if this expression did not produce a match.
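A tiny Python illustration of both properties, using a made-up pattern that is unrelated to the username rules: the lookahead `(?!foo)` only lets the rest of the pattern proceed where "foo" does not start, and the three letters are then matched from the very same position.

```python
import re

# (?!foo) is a zero-width negative lookahead: it succeeds only at a position
# where "foo" does NOT start, and it consumes no characters itself.
NOT_FOO = re.compile(r"(?!foo)[a-z]{3}")


def first_three(text):
    """Return the three letters matched at the start of text, or None."""
    m = NOT_FOO.match(text)
    return m.group() if m else None


print(first_three("barfoo"))  # bar
print(first_three("foobar"))  # None
```

In the first call the match still begins at index 0, which shows that the lookahead did not move the position forward.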
(?!^[a-zA-Z0-9\.\-_]{1,2}$) // This expression asserts that a username made of the characters a-z A-Z 0-9 . _ and - is not less than three characters long
(?!^[a-zA-Z0-9\.\-_]{16,}$) // This expression asserts that a username made of the characters a-z A-Z 0-9 . _ and - is no more than 15 characters long
(?!^[0-9]{3,15}$) // This expression asserts that a username is not made up of only digits [0-9]
^(?:[a-zA-Z0-9](?:\.|_|-)?)+[a-zA-Z0-9]$
Given the three assertions at the beginning of the regular expression we can now be sure that there are at least 3 characters and no more than 15 and that a username of only digits is also disallowed. The ^ and $ in the assertions are important here.
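Now let's have a look at the final part, which is also broken into several lines to better annotate it:
^ // start of the string
(?:[a-zA-Z0-9] // a letter or digit ...
(?:\.|_|-)? // ... optionally followed by a single '.', '_' or '-'
)+ // the pair above, repeated one or more times
[a-zA-Z0-9] // the last character must be a letter or digit
$ // end of the string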
I do hope you could follow my explanations and found this solution helpful. The only drop of bitterness to me is that we now have an additional constraint:
It is true that each of the characters '.', '-' and '_' may appear no more than once in a row (valid constraint from the OP), but now no two characters from the class [\.\-_] can appear one after another at all, even if they are different (new constraint introduced by the solution).
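For anyone who wants to try the expression before wiring it into a form, here is a small Python sketch that runs it against the disallowed examples from the original question. The pattern is copied unchanged; the helper name `is_valid` and the few accepted sample names are only for illustration.

```python
import re

USERNAME = re.compile(
    r"(?!^[a-zA-Z0-9\.\-_]{1,2}$)"               # not shorter than 3
    r"(?!^[a-zA-Z0-9\.\-_]{16,}$)"               # not longer than 15
    r"(?!^[0-9]{3,15}$)"                         # not digits only
    r"^(?:[a-zA-Z0-9](?:\.|_|-)?)+[a-zA-Z0-9]$"  # alnum, optional single separator, ends in alnum
)


def is_valid(name):
    return USERNAME.match(name) is not None


disallowed = [
    "-aquib", "_aquib", ".aquibxyz", "aquib.", "aquibxyz--qureshi",
    "aquib__xyzqureshi", "aquibqureshi-", "aquib_qureshi-qureshi",
    "aquib..qureshi", "1236584", "aquib_",
]
allowed = ["aquib", "aquib_qureshi", "aquib.q-x_1"]

for name in disallowed + allowed:
    print(name, is_valid(name))
```

A name such as `ab._cd` is rejected as well, which is the extra constraint mentioned above.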
Regards,
— Manfred
"I had the right to remain silent, but I didn't have the ability!"
Ron White, Comedian

It does accept
shariq_k.-kha. How do I avoid that? I mean, allowing only one dot, dash or underscore in a row.

You must be doing something wrong. It works exactly the way you want it to.
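A quick way to check this outside the form is to run the complete one-line expression against the reported name. The Python sketch below does that; the helper name `check` is hypothetical, and the second name is just a variant with the adjacent ".-" reduced to a single dot.

```python
import re

# The complete expression from the earlier post, written on one line.
USERNAME = re.compile(r"(?!^[a-zA-Z0-9\.\-_]{1,2}$)(?!^[a-zA-Z0-9\.\-_]{16,}$)(?!^[0-9]{3,15}$)^(?:[a-zA-Z0-9](?:\.|_|-)?)+[a-zA-Z0-9]$")


def check(name):
    return USERNAME.match(name) is not None


print(check("shariq_k.-kha"))  # False
print(check("shariq_k.kha"))   # True
```

If your own test accepts the first name, it may be worth checking that the whole expression, including the `^` and `$` anchors, was copied.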