HTML conversion and C# interpretation

Question

Feel free to correct this text. Any help is welcome.
I am sorry that I don't know the correct terminology for this post. I did my best.

I want to make a program that makes it easy to find the artists I watch on deviantart.com (I have 3 pages of users, approx. 200).
To view a user page on that website, use this form: http://q12a.deviantart.com/ (this is my account, by the way).

I am new to this: reading web pages and processing them in C#.
I don't even know how to begin to explain anything.
All the C# code works fine, except that when I run the program, [some] images from [some] users do not load.

I have looked at the page source (Ctrl+U) in Firefox.
I saved the user page to disk.
I used "Copy Value" on the string at a breakpoint in C#.

What I can observe is that the three texts are not the same:
In the page source from Firefox, all the keywords are in their place, nice and clean. If I search for the keyword "data-super-img=", I find it for every image link on that page. The same is true if I download the page in .htm format.
Things change when I look at the "Copy Value" text from the debugger.
In some areas of that text, the "data-super-img=" keyword is missing! Why? I checked about 100 times before asking here.
For some very obscure reason, some keywords are not present in what the stream reader returns.
I am not accusing the stream reader of anything.
Most probably it is related to PHP, to how HTML is read, or to the HTML arriving as a stream and not as a page; I can't tell.
Maybe my HttpWebRequest/HttpWebResponse code does not read the stream completely. Maybe another approach must be used to turn that stream into HTML. I really don't know, I am just using my intuition.

I don't want to give the impression that I am spamming or promoting artists. My only intention is to solve this mystery, so I will give the example that showed the most of this misbehavior (not finding the correct link).
The page is from this artist: http://aeolus06.deviantart.com/gallery/
[SOME] pages show this phenomenon of skipped keywords (or links), but the majority work fine.

Thank you.

// Needed at the top of the file:
using System.IO;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

// Fields of the form class
string streamx = "", file = "";
string urlx = "", text3564 = "", codRX = "";
int i7 = 0;

public void userGallery()
{
    // Nothing selected (can happen when the selection is cleared)
    if (listBox1.SelectedItem == null) return;

    //http://q12a.deviantart.com/gallery/
    urlx = "http://" + listBox1.SelectedItem.ToString().ToLower() + ".deviantart.com/gallery/";
    linkLabel1.Text = urlx;

    //-------------------------------------------------------------
    //Special webpage Reading (extract info from page)
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlx);
    request.UserAgent = "Foo";
    request.Accept = "*/*";
    // using blocks close the response and the reader even if reading throws.
    // UTF-8 is assumed here; Encoding.Default is the local ANSI code page.
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    using (StreamReader sr = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
    {
        streamx = sr.ReadToEnd();
    }
    //-------------------------------------------------------------
    //sample search for:  [ data-super-img="http://fc05.deviantart.net/fs70/f/2014/dante_by_w-d89.png" ]

    //First
    //cut the page at segment: [folderview-art] (to start from correct position)
    text3564 = streamx;
    i7 = text3564.IndexOf("<div class=\"folderview-art\">");
    if (i7 < 0) return;               // segment not found: Remove(0, -1) would throw
    text3564 = text3564.Remove(0, i7);

    //Second
    //Sample Pics x6
    pictureBox3.ImageLocation = nextImageUrl(); //pic01
    pictureBox4.ImageLocation = nextImageUrl(); //pic02
    //pic03...06 (i have 6 picture boxes)
}

// Returns the next data-super-img URL in text3564 and cuts the text just past it.
// Returns null when the keyword does not occur again.
private string nextImageUrl()
{
    // Group 1 captures only the URL between the quotes
    Match m = Regex.Match(text3564, "data-super-img=\"(.*?)\"");
    if (!m.Success) return null;
    text3564 = text3564.Substring(m.Index + m.Length);
    return m.Groups[1].Value;
}

private void listBox1_SelectedIndexChanged(object sender, EventArgs e)
{
    userGallery();
}
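
As a side note, the repeated cut-and-search steps for each picture box can be collapsed into one pass. The following is only a minimal sketch: the class name GalleryParser and the method name ExtractSuperImgUrls are made up for illustration, and it assumes the attribute always appears in the form data-super-img="..." as in the sample above.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GalleryParser
{
    // Returns every data-super-img URL found in the HTML, in page order.
    // Thumbnails that lack the attribute simply produce no entry.
    public static List<string> ExtractSuperImgUrls(string html)
    {
        var urls = new List<string>();
        // Group 1 captures the text between the quotes after data-super-img=
        foreach (Match m in Regex.Matches(html, "data-super-img=\"([^\"]*)\""))
        {
            urls.Add(m.Groups[1].Value);
        }
        return urls;
    }
}
```

The resulting list can then be assigned to the picture boxes by index, checking the list's Count first so that a page with fewer than six matches does not cause an out-of-range error.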

Posted 19-Mar-15 11:54am

_Q12_

Comments

[no name] 19-Mar-15 18:01pm

Try this: Log out from deviantart and reload the page. Do the keywords and links that your C#-program doesn't pick up still show up in the browser?

_Q12_ 19-Mar-15 18:27pm

I did as you requested: I logged out from my account.
And guess what? Those EXACT images were also blocked (replaced with a special image) by the website itself. When I clicked on that "blocking" image, it led me to a page that said: "Mature Content Filter is On". So my program is seeing the page from the perspective of a logged-out visitor. Whaaaaaat? It is so simple and I missed it. Un_freaking_believable.
Thank you, my good friend from the internet! You saved my brain from a good fry.

Now... I suppose I must be [Logged IN] to actually view those images from my little application. How do I do that? And worse, how do I set the "mature" option from my app? But I'm happy that I got the answer... man, what a relief, you have no idea. Thank you 1000 times. I will probably show some "mature content warning" of my own in those locations.

Please post it as an [Answer] so I can give you some stars for this. You deserve it.

[no name] 19-Mar-15 18:32pm

You're welcome! I wrote it up as a "solution" (see below) - please be so kind and mark it as accepted. Good luck with your program!

1 solution

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

manchanx · Accepted Answer · 2015-03-19T12:31:00

Solution 1

The content your application receives differs from what your browser shows because in the browser you are logged in to DeviantArt, while your application is treated by the website as an anonymous user and therefore does not get the same privileged access.
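
One way to confirm this from code is to count how often the keyword occurs in the page saved from the browser and in the string the program downloaded. This is a minimal sketch only; the class name PageDiff and the method name CountOccurrences are hypothetical helpers, not part of the original program.

```csharp
using System;

public static class PageDiff
{
    // Counts non-overlapping occurrences of keyword in text.
    public static int CountOccurrences(string text, string keyword)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(keyword)) return 0;

        int count = 0;
        int pos = 0;
        // IndexOf returns -1 once no further occurrence exists
        while ((pos = text.IndexOf(keyword, pos, StringComparison.Ordinal)) >= 0)
        {
            count++;
            pos += keyword.Length; // continue searching after this match
        }
        return count;
    }
}
```

For example, calling it once on the text of the saved .htm file (read with File.ReadAllText) and once on streamx, both with the keyword "data-super-img=", should give a lower count for streamx when the site has withheld images from the anonymous request.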

Posted 19-Mar-15 12:31pm

manchanx

Comments

_Q12_ 19-Mar-15 18:33pm

thanks! ;)

[no name] 19-Mar-15 18:35pm

You're welcome :)

_Q12_ 19-Mar-15 18:37pm

I know it's a bit off the subject now that it is resolved, but can you indulge me and give me some "knowledge" about this [logged in] thingy? How do I do it?

[no name] 19-Mar-15 18:48pm

I was just about to give you a hint about that :) The easiest way would be this: The "stay logged in" feature between website and browser is done with a "cookie" that your browser saves and associates with that website. You would have to locate that cookie file (they have random names AFAIK) and provide the content of it as part of the HTTP-Header that your program is sending with the HttpWebRequest. The drawback would be that it only works as long as that cookie is valid, that is, as long as you don't log out and log in again, which would replace the cookie with a new one.
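
The cookie idea described above could be sketched roughly as follows. This is an illustration under assumptions, not a tested login for DeviantArt: the class name LoggedInRequest is made up, and the cookie text is a placeholder that would have to be copied from the browser, since the names of the cookies the site actually uses are not known here.

```csharp
using System.Net;

public static class LoggedInRequest
{
    // cookieHeader is the raw "name=value; name2=value2" text taken from the browser.
    // It is only valid for as long as that browser session stays logged in.
    public static HttpWebRequest Create(string url, string cookieHeader)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.UserAgent = "Foo";
        request.Accept = "*/*";
        // Send the browser's cookies so the site can treat the request as logged in
        request.Headers[HttpRequestHeader.Cookie] = cookieHeader;
        return request;
    }
}
```

In userGallery() the returned request would replace the one built with WebRequest.Create(urlx); the rest of the download code stays the same. Whether the site then serves the filtered images also depends on the account's own mature content setting.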