I currently have this code in Python, using the feedparser module:

import feedparser

RSS_FEEDS = {'cnn': ''}

def get_news_test(publication="cnn"):
    feed = feedparser.parse(RSS_FEEDS[publication])
    articles_cnn = feed['entries']

    for article in articles_cnn:
        print(article)


This returns the following information (a single iteration):

                <![CDATA[Are China's latest weapons science fiction or battle-ready?]]>
                <![CDATA[Since the beginning of January, the Chinese military has revealed a dizzying array of sophisticated and powerful new weaponry. ]]>
            <guid isPermaLink="true"></guid>
            <pubDate>Sun, 20 Jan 2019 06:04:16 GMT</pubDate>
                <media:content medium="image" url="" height="619" width="1100" />
                <media:content medium="image" url="" height="300" width="300" />
                <media:content medium="image" url="" height="552" width="414" />
                <media:content medium="image" url="" height="480" width="640" />
                <media:content medium="image" url="" height="324" width="576" />
                <media:content medium="image" url="" height="250" width="250" />
                <media:content medium="image" url="" height="360" width="270" />
                <media:content medium="image" url="" height="169" width="300" />
                <media:content medium="image" url="" height="250" width="250" />
                <media:content medium="image" url="" height="186" width="248" />
                <media:content medium="image" url="" height="144" width="256" />

I know I can return some portions of this, for instance, the title by calling:


Someone said this is JSON data, but I am having a hard time getting the individual image tags.

What I have tried:

I have tried using <media:content> as a key, but that doesn't work.
Updated 19-Jan-19 22:44pm

1 solution

Solution 1

It is not JSON, it is XML. See "XML Processing Modules" in the Python 3.7.2 documentation.
Member 14123629 20-Jan-19 4:44am
Thanks! I did this and can get a list of the image URLs, but I still don't know how to get to the individual elements. :(

from bs4 import BeautifulSoup
import requests

source = requests.get('')

soup = BeautifulSoup(source.text, 'xml')

#media = media.find_all("url")

for url in soup.find_all("media:content"):
    print(url)
Richard MacCutchan 20-Jan-19 6:49am
Sorry, but I do not know BeautifulSoup. Try a Google search to find sample code.
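
For the individual elements asked about above, here is a minimal sketch that uses the standard library's xml.etree.ElementTree (one of the XML processing modules the solution points to) instead of BeautifulSoup. It rests on several assumptions: the sample feed text and the example.com URLs are placeholders, the helper name get_images is made up for illustration, and the namespace URI is the usual Media RSS one, which the question does not show. Each <media:content> tag is turned into a plain dict of its attributes, so a single image can then be picked by index or by size.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modelled on the item in the question. The example.com
# URLs are placeholders, and the xmlns:media URI is the usual Media RSS
# namespace, assumed here because the question does not show the root element.
SAMPLE_RSS = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title><![CDATA[Are China's latest weapons science fiction or battle-ready?]]></title>
      <pubDate>Sun, 20 Jan 2019 06:04:16 GMT</pubDate>
      <media:content medium="image" url="https://example.com/a.jpg" height="619" width="1100" />
      <media:content medium="image" url="https://example.com/b.jpg" height="300" width="300" />
      <media:content medium="image" url="https://example.com/c.jpg" height="144" width="256" />
    </item>
  </channel>
</rss>"""

# Maps the "media" prefix used in the queries below to its namespace URI.
NS = {"media": "http://search.yahoo.com/mrss/"}


def get_images(rss_text):
    """Return, for each <item>, a list of attribute dicts (one per image)."""
    root = ET.fromstring(rss_text)
    result = []
    for item in root.iter("item"):
        # Each matching element exposes its XML attributes via .attrib.
        images = [dict(el.attrib) for el in item.findall("media:content", NS)]
        result.append(images)
    return result


images = get_images(SAMPLE_RSS)[0]   # images of the first item
first_url = images[0]["url"]         # an individual element, by index
# Attribute values are strings, so convert before comparing sizes.
largest = max(images, key=lambda img: int(img["width"]))

print(first_url)
print(largest["url"])
```

With a real feed, the text returned by requests.get(...).text could be passed to get_images in place of SAMPLE_RSS, provided the feed declares the same namespace.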

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
