Display first n characters of a string by stripping html

Question

I have a long string and I want to display the first 50 characters of it (without including the HTML content). Can anyone suggest any method?

Some sample HTML code:

HTML

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
			<html>
			   <head>
				  <title>Paula - Microsoft Word - Comparison of the different image compression algorithms.doc</title>
				  <title></title><link href="/DigitalLibrary/extData.aspx?filePath=stylesheet.css&epub=b3aab940-fb48-4f6c-ae63-d599f4893795_aguilera_rpt.epub" type="text/css" rel="stylesheet"/>
			   </head>
			   <body>
				  
      <div class="body">
         <div id="frontmatter">
            <div id="titlepage">
            </div>    
         </div>
      </div>
   

<a id="1"></a><p><pre>

Comparison of different image

compression formats

Posted 21-May-12 20:27pm

Member 8491154

Updated 21-May-12 21:10pm

v3

Comments

VJ Reddy 22-May-12 2:42am

Can you please post some sample content of the string?

Member 8491154 22-May-12 3:25am

Posted.

CodingLover 22-May-12 3:09am

Is that static content?

Member 8491154 22-May-12 3:25am

It's just a code sample that comes in the string.

Technoses 22-May-12 3:46am

Show your content along with its type. What do you actually want?

VJ Reddy 28-Jun-12 9:53am

Thank you for viewing and accepting the solution :)

5 solutions

Solution 1

Use jQuery.
Calling $("#areaid").text() returns the element's text content without the HTML tags.

Posted 21-May-12 20:32pm

mr.priyank

Comments

Member 8491154 22-May-12 2:40am

What is "#areaid"? Can you explain it in detail?

mr.priyank 22-May-12 3:55am

areaid is the id of the element (a div, the body, etc.) that contains the HTML.
This is done in JavaScript.
<div id="areaid"> Your Html Content </div>

Solution 4

You can use a regular expression to remove the HTML tags.
For details, refer to this link:
http://www.niceonecode.com/Q-A/DotNet/CSharp/Get%20the%20first%20200%20characters%20of%20a%20string%20without%20breaking%20HTML%20tags%20at%20the%20end/20132
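
As a rough illustration of the regex approach (not taken from the linked article), here is a minimal C# sketch. The class name HtmlText and the method StripTags are hypothetical names chosen for this example. It assumes the input is reasonably simple markup; a regex is not a full HTML parser, so comments, script blocks, or a ">" inside an attribute value can defeat it.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

static class HtmlText
{
    // Hypothetical helper: removes anything that looks like a tag and returns plain text.
    public static string StripTags(string html)
    {
        if (string.IsNullOrEmpty(html))
            return string.Empty;

        // Replace every <...> run (tags and the DOCTYPE) with a space
        // so that words separated only by a tag do not run together.
        string noTags = Regex.Replace(html, "<[^>]+>", " ");

        // Turn entities such as &amp; back into their characters.
        string decoded = WebUtility.HtmlDecode(noTags);

        // Collapse runs of whitespace left behind by the removed tags.
        return Regex.Replace(decoded, @"\s+", " ").Trim();
    }
}

static class Program
{
    static void Main()
    {
        string html = "<p>Hello <b>world</b> &amp; all</p>";
        Console.WriteLine(HtmlText.StripTags(html));
    }
}
```

Replacing each tag with a space is a deliberate trade-off: it keeps block-level text apart, but it also splits a word that contains inline markup (for example "wor<b>ld</b>" becomes "wor ld"). Note that the text inside <title> is kept too, since only the tags themselves are removed.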

Posted 31-May-15 5:18am

mr.rahulmaurya

Comments

CHill60 31-May-15 12:55pm

You're only 3 years late with this, and regex has already been suggested. As this links to your own site, many will consider this post spam. Please consider this before answering old posts.

Solution 5

Quote:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html>
<head>
<title>Paula - Microsoft Word - Comparison of the different image compression algorithms.doc</title>
<title></title><link href="/DigitalLibrary/extData.aspx?filePath=stylesheet.css&epub=b3aab940-fb48-4f6c-ae63-d599f4893795_aguilera_rpt.epub" type="text/css" rel="stylesheet"/>
</head>
<body>

Posted 31-May-15 6:41am

Member 11731058

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

VJ Reddy · Accepted Answer · 2012-05-21T23:01:00

jQuery is a powerful way to extract the content of an HTML document.

However, if you can't use jQuery, the Regex class can be used to extract the content between <title> and </title>, which is what the question asks for, as shown below:

C#

using System;
using System.Text.RegularExpressions;

string htmlText = @"<!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.1//EN"" ""http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"">
            <html>
               <head>
                  <title>Paula - Microsoft Word - Comparison of the different image compression algorithms.doc</title>
                  <title></title><link href=""/DigitalLibrary/extData.aspx?filePath=stylesheet.css&epub=b3aab940-fb48-4f6c-ae63-d599f4893795_aguilera_rpt.epub"" type=""text/css"" rel=""stylesheet""/>
               </head>
               <body>
                <div class=""body"">
                    <div id=""frontmatter"">
                        <div id=""titlepage"">
                        </div>
                    </div>
                </div>
            <a id=""1"">";

// Capture the text between the first <title> and </title> pair.
Match match = Regex.Match(htmlText, @"<title>([^<>]*)</title>",
    RegexOptions.CultureInvariant | RegexOptions.IgnoreCase);

// Groups[1] holds the captured title text when the match succeeds.
if (match.Success && match.Groups.Count > 1)
    Console.WriteLine(match.Groups[1].Value);

// Output:
// Paula - Microsoft Word - Comparison of the different image compression algorithms.doc
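
The question also asks for only the first 50 characters. A minimal sketch of that last step, applied to the title text extracted above, could look like the following. TextPreview and FirstChars are hypothetical names, and "characters" is assumed here to mean UTF-16 code units, so a cut could in principle land in the middle of a surrogate pair.

```csharp
using System;

static class TextPreview
{
    // Hypothetical helper: returns at most maxLength leading characters of text.
    public static string FirstChars(string text, int maxLength)
    {
        if (string.IsNullOrEmpty(text) || maxLength <= 0)
            return string.Empty;

        // Substring throws when the requested length exceeds the string,
        // so short inputs are returned unchanged.
        return text.Length <= maxLength ? text : text.Substring(0, maxLength);
    }
}

static class Program
{
    static void Main()
    {
        string title = "Paula - Microsoft Word - Comparison of the different image compression algorithms.doc";

        // Print only the first 50 characters of the extracted title.
        Console.WriteLine(TextPreview.FirstChars(title, 50));
    }
}
```

The length check matters because a plain Substring(0, 50) fails on strings shorter than 50 characters.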

adkalavadia · Accepted Answer · 2012-05-21T22:36:00

Solution 2

Please refer to the links below for HTML tag stripping.

For C#:

Convert HTML to Plain Text

HTML Tag Stripper

For SQL:
MS SQL Function

Posted 21-May-12 22:36pm

adkalavadia

Updated 21-May-12 22:39pm

v3
