Hi,
I want to get earthquake information from the USGS web site.
The URL is: http://earthquake.usgs.gov/earthquakes/recenteqsww/Quakes/quakes_big.php

I am only interested in the following fields:
     MAG    UTC DATE-TIME          LAT        LON         DEPTH    Region

MAP  6.2    2011/09/03 04:49:01    -56.551    -27.039     106.0    SOUTH SANDWICH ISLANDS REGION
MAP  5.1    2011/09/03 01:25:39     52.056    -171.567    51.2     FOX ISLANDS, ALEUTIAN ISLANDS, ALASKA
MAP  5.8    2011/09/03 01:06:56    -12.784     166.672    101.2    SANTA CRUZ ISLANDS


I want to use Java to extract the contents of this table from the web page.
A regular expression may be the best way.
Could anyone help me with the regular expression?

Thanks a lot!

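// sb is assumed to hold the downloaded page source.
Pattern p = Pattern.compile("<table([\\s\\S]+?)>([\\s\\S]+?)</table>", Pattern.MULTILINE); // find each table's contents
Matcher m = p.matcher(sb.toString());
while (m.find()) {
    // System.out.println(m.group()); // finding the table contents works
    // Something may be wrong in the next line: '.' does not match line breaks
    // without Pattern.DOTALL, and the literal "<tr>" allows no attributes.
    Pattern p1 = Pattern.compile("<table.*>(<tr>(<td.*>.*</td>){6}</tr>)+</table>");
    Matcher m1 = p1.matcher(m.group());
    if (m1.find()) {
        // p1 has only two capturing groups, so group(4) throws IndexOutOfBoundsException.
        System.out.println(m1.group(0) + "__" + m1.group(4));
    }
    System.out.println("-----------------------");
}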



Please help.
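One detail that may explain part of the trouble with a pattern like `(<td.*>.*</td>){6}`: a capturing group that is repeated by a quantifier keeps only the text of its last repetition, and the number of groups is fixed by the parentheses in the pattern, not by the repeat count. The sketch below illustrates this on a made-up three-cell row; the class name `GroupCountDemo` and the sample row are assumptions for illustration, not taken from the USGS page.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupCountDemo {
    // Hypothetical simplified row pattern: group 1 is one whole cell, group 2 is its text.
    static final Pattern ROW = Pattern.compile("<tr>(<td>(.*?)</td>){3}</tr>");

    // Number of capturing groups declared in the pattern (independent of the {3} repeat).
    public static int groupCount() {
        return ROW.matcher("").groupCount();
    }

    // Text kept in group 2 after a full match: only the last repetition survives.
    public static String lastCell(String row) {
        Matcher m = ROW.matcher(row);
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String row = "<tr><td>6.2</td><td>-56.551</td><td>106.0</td></tr>";
        System.out.println(groupCount()); // 2
        System.out.println(lastCell(row)); // 106.0
    }
}
```

To read every cell, it is usually simpler to match one cell at a time in a `while (matcher.find())` loop than to try to capture all of them with a single repeated group.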
 
 
The following works well, but it is not ideal. Could it be optimized further?
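// Compile each pattern once, outside the loops. DOTALL lets '.' span line breaks;
// MULTILINE is dropped because it only affects ^ and $, which are not used here.
int flags = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;
Pattern pTable = Pattern.compile("<table[^>]*>(.*?)</table>", flags);
Pattern pTr = Pattern.compile("<tr[^>]*>(.*?)<\\s*/tr[^>]*>", flags);
Pattern pTd = Pattern.compile("<td[^>]*>(.*?)<\\s*/td[^>]*>", flags);
Matcher mTable = pTable.matcher(sb.toString()); // sb holds the downloaded page source
while (mTable.find()) {
    Matcher mTr = pTr.matcher(mTable.group(1));
    while (mTr.find()) {
        // A StringBuilder avoids creating a new String for every cell.
        StringBuilder row = new StringBuilder();
        Matcher mTd = pTd.matcher(mTr.group(1));
        while (mTd.find()) {
            // Strip nested tags and &nbsp; entities from the cell text.
            row.append(mTd.group(1).replaceAll("<[^>]*>", "").replace("&nbsp;", "")).append('\t');
        }
        System.out.println(row);
    }
}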
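For anyone who wants to try the row-then-cell approach without downloading the page, here is a minimal self-contained sketch. The class name `QuakeTableSketch`, the method `parseRows`, and the sample HTML are assumptions made for illustration; the real USGS markup may differ, so the patterns would likely need adjusting. Regular expressions are also fragile against nested tables or unusual markup, so treat this as a starting point rather than a robust HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuakeTableSketch {
    private static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;
    // Patterns are compiled once and reused for every row and cell.
    private static final Pattern ROW = Pattern.compile("<tr\\b[^>]*>(.*?)</tr\\s*>", FLAGS);
    private static final Pattern CELL = Pattern.compile("<td\\b[^>]*>(.*?)</td\\s*>", FLAGS);

    // Returns one list of cell texts per <tr> that contains at least one <td>.
    public static List<List<String>> parseRows(String html) {
        List<List<String>> rows = new ArrayList<>();
        Matcher row = ROW.matcher(html);
        while (row.find()) {
            List<String> cells = new ArrayList<>();
            Matcher cell = CELL.matcher(row.group(1));
            while (cell.find()) {
                // Drop nested tags, turn &nbsp; into a space, then trim.
                cells.add(cell.group(1).replaceAll("<[^>]*>", "").replace("&nbsp;", " ").trim());
            }
            if (!cells.isEmpty()) {
                rows.add(cells); // header rows made only of <th> cells are skipped
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical markup, loosely modelled on the rows quoted in the question.
        String html = "<table>\n"
                + "<tr><th>MAG</th><th>UTC DATE-TIME</th></tr>\n"
                + "<tr><td><a href=\"#\">MAP</a></td><td>&nbsp;6.2</td>"
                + "<td>2011/09/03 04:49:01</td><td>-56.551</td><td>-27.039</td>"
                + "<td>106.0</td><td>SOUTH SANDWICH ISLANDS REGION</td></tr>\n"
                + "</table>";
        for (List<String> r : parseRows(html)) {
            System.out.println(String.join("\t", r));
        }
    }
}
```

Returning the cells as a list instead of printing them directly makes it easier to pick out individual fields (magnitude, date, latitude, and so on) by index afterwards.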
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


