I have some XML files like the sample below.
I want to search each mixed-citation element that has a specific publication-type attribute value and check whether it contains any tag from a given list of unsupported tags. For every match I need the parent ref and the name of the unsupported tag that matched.
Each publication-type has its own list of unsupported tags.
SAMPLE XML:
XML
<?xml version="1.0"?>
<ref-list>
<title>References</title>
<ref id="ref1"><label>[1]</label><mixed-citation publication-type="patent"><person-group person-group-type="author"><string-name><surname>Angel’skii</surname>, <given-names>O.,V.</given-names></string-name>, <string-name><surname>Ushenko</surname>, <given-names>A.,G.</given-names></string-name>, <string-name><surname>Arkhelyuk</surname>, <given-names>A.,D.</given-names></string-name>, <string-name><surname>Ermolenko</surname>, <given-names>S.,B.</given-names></string-name>, <string-name><surname>Burkovets</surname>, <given-names>D.,N.</given-names></string-name></person-group>, "<article-title>Scattering of laser radiation by multifractal biological structures</article-title>." <source>Optika ieee Spektroskopiya 88</source> (<issue>3</issue>), <fpage>495</fpage><lpage>498</lpage> (<year>2000</year>).</mixed-citation></ref>
<ref id="ref2"><label>[2]</label><mixed-citation publication-type="periodical"><person-group person-group-type="author"><string-name><surname>Angelsky</surname>, <given-names>O.,V.</given-names></string-name>, <string-name><surname>Hanson</surname>, <given-names>S., G.</given-names></string-name>, <string-name><surname>Zenkova</surname>, <given-names>C.,Yu.</given-names></string-name>, <string-name><surname>Gorsky</surname>, <given-names>M.,P.</given-names></string-name>, <string-name><surname>Gorodyns’ka</surname>, <given-names>N.,V.</given-names></string-name></person-group>, "<article-title>On polarization metrology (estimation) of the degree of coherence of optical waves</article-title>." <source>Optics Express</source> <volume>17</volume>(<issue>18</issue>), pp.<fpage>15623</fpage><lpage>15634</lpage> (<year>2009</year>).</mixed-citation></ref>
<ref id="ref3"><label>[3]</label><mixed-citation publication-type="periodical"><person-group person-group-type="author"><string-name><surname>Angelsky</surname>, <given-names>O.,V.</given-names></string-name>, <string-name><surname>Maksimyak</surname>, <given-names>P.,P.</given-names></string-name>, <string-name><surname>Hanson</surname>, <given-names>S.,G.</given-names></string-name>, <string-name><surname>Ryukhin</surname>, <given-names>V.,V.</given-names></string-name></person-group>, "<article-title>New Feasibilities for Characterizing Rough Surfaces by Optical-Correlation Techniques</article-title>" <source>Applied Optics</source> (<issue>40</issue>) , pp. <fpage>5693</fpage><lpage>5707</lpage> <conf-date>12-15-2007</conf-date> (<year>2001</year>).</mixed-citation></ref>
<ref id="ref4"><label>[4]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Ushenko</surname>, <given-names>Yu.,O.</given-names></string-name>, <string-name><surname>Dubolazov</surname>, <given-names>O., V.</given-names></string-name>, <string-name><surname>Karachevtsev</surname>, <given-names>A.,O.</given-names></string-name>, <string-name><surname>Gorsky</surname>, <given-names>M., P.</given-names></string-name>, <string-name><surname>Marchuk</surname>, <given-names>Yu., F.</given-names></string-name></person-group>, "<article-title>Wavelet analysis of Fourier polarized images of the human bile</article-title>." <source specific-use="IEEE">Applied Optics</source> (<issue>51</issue>), P. <fpage>133</fpage><lpage>139</lpage> (<year>2012</year>).</mixed-citation></ref>
<ref id="ref5"><label>[5]</label><mixed-citation publication-type="periodical"><person-group person-group-type="author"><string-name><surname>Angelsky</surname>, <given-names>O.,V.</given-names></string-name>, <string-name><surname>Ushenko</surname>, <given-names>A.,G.</given-names></string-name>, <string-name><surname>Burkovets</surname>, <given-names>D.,N.</given-names></string-name>, <string-name><surname>Ushenko</surname>, <given-names>Y., A.</given-names></string-name></person-group>, "<article-title>Polarization visualization and selection of biotissue image two-layer scattering medium</article-title>." <source>Journal of biomedical optics</source> <volume>10</volume>(<issue>1</issue>), P.<fpage>14010</fpage> (<year>2005</year>).</mixed-citation></ref>
<ref id="ref6"><label>[6]</label><mixed-citation publication-type="periodical"><person-group person-group-type="author"><string-name><surname>Angelsky</surname>, <given-names>O.,V.</given-names></string-name>, <string-name><surname>Polyanskii</surname>, <given-names>P.,V.</given-names></string-name>, <string-name><surname>Felde</surname>, <given-names>C.,V.</given-names></string-name></person-group>, "<article-title>The emerging field of correlation optics</article-title>." <source>Optics and Photonics News</source> <volume>23</volume>(<issue>4</issue>), p.p.<fpage>25</fpage><lpage>29</lpage> (<year>2012</year>).</mixed-citation></ref>
<ref id="ref7"><label>[7]</label><mixed-citation publication-type="periodical"><person-group person-group-type="author"><string-name><surname>Angelsky</surname>, <given-names>O.,V.</given-names></string-name>, <string-name><surname>Bekshaev</surname>, <given-names>A.,Ya.</given-names></string-name>, <string-name><surname>Maksimyak</surname>, <given-names>P.,P.</given-names></string-name>, <string-name><surname>Maksimyak</surname>, <given-names>A.,P.</given-names></string-name>, Mokhun, <string-name><surname>Hanson</surname>, <given-names>S.,G.</given-names></string-name>, <string-name><surname>Zenkova</surname>, <given-names>C., Yu.</given-names></string-name>, <string-name><surname>Tyurin</surname>, <given-names>A.,V.</given-names></string-name></person-group>, "<article-title>Circular motion of particles suspended in a Gaussian beam with circular polarization validates the spin part of the internal energy flow</article-title>." <source>Optics Express</source> <volume>20</volume>(<issue>10</issue>), pp.<fpage>11351</fpage><lpage>11356</lpage> (<year>2012</year>).</mixed-citation></ref>
<ref id="ref8"><label>[8]</label><mixed-citation publication-type="periodical"><person-group person-group-type="author"><string-name><surname>Angelsky</surname>, <given-names>O.V.</given-names></string-name>, <string-name><surname>Besaha</surname>, <given-names>R.N.</given-names></string-name>, <string-name><surname>Mokhun</surname>, <given-names>I.I.</given-names></string-name></person-group> "<article-title>Appearance of wavefront dislocations under interference among beams with simple wavefronts</article-title>," <source>Optica Applicata</source> <volume>27</volume>(<issue>4</issue>), Pages <fpage>272</fpage><lpage>278</lpage> <edition>5</edition> (<year>1997</year>).</mixed-citation></ref>
<ref id="ref9"><label>[9]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Angelsky</surname>, <given-names>P., O.</given-names></string-name>, <string-name><surname>Ushenko</surname>, <given-names>A., G.</given-names></string-name>, <string-name><surname>Dubolazov</surname>, <given-names>A., V.</given-names></string-name>, <string-name><surname>Sidor</surname>, <given-names>M., I.</given-names></string-name>, <string-name><surname>Bodnar</surname>, <given-names>G., B.</given-names></string-name>, <string-name><surname>Koval</surname>, <given-names>G.</given-names></string-name>, <string-name><surname>Trifonyuk</surname>, <given-names>L.</given-names></string-name></person-group>, "<article-title>The singular approach for processing polarization-inhomogeneous laser images of blood plasma layers</article-title>." <source source-type="IEEE">J. Opt.</source> (<issue>15</issue>), <fpage>044030</fpage> (<year>2013</year>).</mixed-citation></ref>
</ref-list>


What I have tried:

C#
XDocument doc=XDocument.Load(@"D:\ref.xml");

var invalid_citations1=doc.Descendants("mixed-citation").Where(q=>q.Attribute("publication-type").Value=="periodical")
	.Where(a=>a.Descendants("edition").Any() || a.Descendants("chapter-title").Any()||a.Descendants("conf-date").Any()||a.Descendants("conf-loc").Any()||a.Descendants("conf-name").Any()||a.Descendants("conf-sponsor").Any())
	.Select(x=>x.Parent.Attribute("id"));

var invalid_citations2=doc.Descendants("mixed-citation").Where(q=>q.Attribute("publication-type").Value=="book")
	.Where(a=>a.Descendants("article-title").Any() || a.Descendants("conf-sponsor").Any()||a.Descendants("conf-date").Any()||a.Descendants("conf-loc").Any()||a.Descendants("conf-name").Any()||a.Descendants("conf-sponsor").Any()|| a.Descendants("institution").Any() || a.Descendants("ref-degree").Any() || a.Descendants("patent").Any() || a.Descendants("std").Any())
	.Select(x=>x.Parent.Attribute("id"));

foreach (var element in invalid_citations1) {
	Console.WriteLine("Check "+element+" ==> publication-type=\"periodical\": for unsupported tag/tags");
}
foreach (var element in invalid_citations2) {
	Console.WriteLine("Check "+element+" ==> publication-type=\"book\": for unsupported tag/tags");
}
Console.ReadLine();

But I cannot get the name of the unsupported tag that matched in each mixed-citation. How do I get that?
Also, how can I do this for all the different publication-type values in a single expression, rather than writing invalid_citations1, invalid_citations2, etc. (as I did in my code)?
Updated 28-Apr-18 19:55pm
Comments
Wendelius 29-Apr-18 0:40am    
Can you post an example XML?
Member 12692000 29-Apr-18 1:09am    
I did post a link to a sample XML file. Is it not working? Anyway, I have added the sample to the question.

1 solution

I'm not sure I understand the question correctly, but to use a dynamic list in the where clause and to fetch the tags that matched, consider the following example:

C#
// Tags that are not allowed inside a "periodical" citation.
var invalidTags = new List<string> { "edition", "chapter-title", "conf-date", "conf-loc", "conf-name", "conf-sponsor" };

var citationQuery = from item in doc.Descendants("mixed-citation")
                    // The string cast yields null instead of throwing when the attribute is missing.
                    where (string)item.Attribute("publication-type") == "periodical"
                    && item.Descendants().Any(x => invalidTags.Contains(x.Name.LocalName))
                    select new {
                       a = item.Parent.Attribute("id"),   // id attribute of the parent ref
                       b = item.Descendants().Where(x => invalidTags.Contains(x.Name.LocalName))   // matching elements
                    };

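Each item in the result is an anonymous-type object: property a holds the id attribute of the parent ref, while b holds the sequence of matching elements (use Name.LocalName on each element to get the tag name).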
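As a rough sketch of how the result of the query above might be consumed, the following small program prints the ref id and the names of the matching tags. It is written as a top-level program (C# 9 or later). The inline XML is a trimmed, hypothetical stand-in for the sample in the question, and the report list is an illustrative name that is not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Trimmed, hypothetical stand-in for the sample XML in the question.
var doc = XDocument.Parse(@"<ref-list>
  <ref id=""ref3""><mixed-citation publication-type=""periodical""><conf-date>12-15-2007</conf-date><year>2001</year></mixed-citation></ref>
  <ref id=""ref5""><mixed-citation publication-type=""periodical""><year>2005</year></mixed-citation></ref>
  <ref id=""ref8""><mixed-citation publication-type=""periodical""><edition>5</edition><year>1997</year></mixed-citation></ref>
</ref-list>");

var invalidTags = new List<string> { "edition", "chapter-title", "conf-date", "conf-loc", "conf-name", "conf-sponsor" };

var citationQuery = from item in doc.Descendants("mixed-citation")
                    where (string)item.Attribute("publication-type") == "periodical"
                    && item.Descendants().Any(x => invalidTags.Contains(x.Name.LocalName))
                    select new {
                       a = item.Parent.Attribute("id"),
                       b = item.Descendants().Where(x => invalidTags.Contains(x.Name.LocalName))
                    };

// Build one report line per flagged citation: "<ref id>: <tag>, <tag>, ..."
var report = new List<string>();
foreach (var hit in citationQuery) {
   var names = hit.b.Select(e => e.Name.LocalName).Distinct();
   report.Add(hit.a.Value + ": " + string.Join(", ", names));
}

foreach (var line in report) {
   Console.WriteLine(line);
}
```

For this trimmed input the program reports ref3 with conf-date and ref8 with edition; ref5 contains no unsupported tag and is skipped.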

ADDITION
--------
To fetch the tags based on publication type, consider the following

Method to list tags
C#
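// Returns the unsupported tags for a publication type (empty list for unknown types).
public static List<string> GetInvalidTags(string publicationType) {
   List<string> invalidTags = new List<string>();

   if (publicationType == "periodical") {
      invalidTags = new List<string> { "edition", "chapter-title", "conf-date", "conf-loc", "conf-name", "conf-sponsor" };
   } else if (publicationType == "book") {
      // List taken from the code in the question.
      invalidTags = new List<string> { "article-title", "conf-date", "conf-loc", "conf-name", "conf-sponsor", "institution", "ref-degree", "patent", "std" };
   }
   return invalidTags;
}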

And usage
C#
var citationQuery = from item in doc.Descendants("mixed-citation")
                    // Look up the tag list once per citation; a missing attribute yields an empty list.
                    let invalidTags = GetInvalidTags((string)item.Attribute("publication-type"))
                    let matches = item.Descendants().Where(x => invalidTags.Contains(x.Name.LocalName))
                    where matches.Any()
                    select new {
                       a = item.Parent.Attribute("id"),   // id attribute of the parent ref
                       b = matches,                       // all matching elements
                       c = matches.First().Name           // name of the first matching tag
                    };
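
An alternative to the if-based method is a lookup table keyed by publication type, which lets one query cover every type. The following is only a sketch under some assumptions: invalidTagsByType and findings are hypothetical names, the tag lists are copied from the code in the question, the inline XML is a trimmed stand-in for the question's sample, and publication types without an entry (such as patent here) are simply skipped. It is written as a top-level program (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Trimmed stand-in for the sample XML in the question.
var doc = XDocument.Parse(@"<ref-list>
  <ref id=""ref1""><mixed-citation publication-type=""patent""><article-title>A</article-title></mixed-citation></ref>
  <ref id=""ref3""><mixed-citation publication-type=""periodical""><article-title>B</article-title><conf-date>12-15-2007</conf-date></mixed-citation></ref>
  <ref id=""ref4""><mixed-citation publication-type=""book""><article-title>C</article-title></mixed-citation></ref>
  <ref id=""ref5""><mixed-citation publication-type=""periodical""><article-title>D</article-title></mixed-citation></ref>
  <ref id=""ref8""><mixed-citation publication-type=""periodical""><edition>5</edition></mixed-citation></ref>
</ref-list>");

// Hypothetical lookup table: publication-type value -> unsupported tag names.
var invalidTagsByType = new Dictionary<string, HashSet<string>> {
   ["periodical"] = new HashSet<string> { "edition", "chapter-title", "conf-date", "conf-loc", "conf-name", "conf-sponsor" },
   ["book"] = new HashSet<string> { "article-title", "conf-date", "conf-loc", "conf-name", "conf-sponsor", "institution", "ref-degree", "patent", "std" }
};

var findings = (from item in doc.Descendants("mixed-citation")
                // A missing attribute becomes "" so the dictionary lookup never sees null.
                let type = (string)item.Attribute("publication-type") ?? ""
                where invalidTagsByType.ContainsKey(type)
                // Distinct names of descendants that are unsupported for this type.
                let tags = item.Descendants()
                               .Select(x => x.Name.LocalName)
                               .Where(name => invalidTagsByType[type].Contains(name))
                               .Distinct()
                               .ToList()
                where tags.Count > 0
                select new { Id = (string)item.Parent.Attribute("id"), Type = type, Tags = tags }).ToList();

foreach (var f in findings) {
   Console.WriteLine("Check " + f.Id + " ==> publication-type=\"" + f.Type + "\": unsupported tag(s): " + string.Join(", ", f.Tags));
}
```

For this trimmed input the query flags ref3 (conf-date), ref4 (article-title) and ref8 (edition). Supporting another publication type would only mean adding one more dictionary entry; the query itself stays unchanged.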
 
Comments
Member 12692000 29-Apr-18 2:14am    
Thanks, it did help a little. However, one of my main questions was whether I can, in one query (maybe using GroupBy and anonymous types; I'm just guessing), check the invalidTags for different values of item.Attribute("publication-type").Value, e.g. "periodical", "book", "patent", etc. Each publication-type value also has its own list of invalidTags.
Wendelius 29-Apr-18 2:48am    
In my example I used a static list. What you can do is create a method that returns a publication-type-specific tag list and use it in the condition.
Wendelius 29-Apr-18 4:51am    
See the updated answer.
Member 12692000 29-Apr-18 8:55am    
I'm getting this error: The name 'invalidTags' does not exist in the current context (CS0103)
Wendelius 29-Apr-18 9:09am    
Sorry, there was a typo in the anonymous class. See the updated answer

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


