File List Downloader

1 Jun 2011, CPOL, 7 min read
A tool that automatically downloads files from websites that offer direct downloads

Introduction

File List Downloader (FLD) is a tool that crawls direct-download websites and downloads the files it finds.

The source code and a ready-to-use demo are available as downloads. To learn how to use the tool, watch the YouTube video linked in the Demo section.

Background

You often need to download files from a website, and sometimes a website offers a lot of them. First, you go to the page for the title you want, which shows a list of files. You click a file and are redirected to another page listing the download links or mirrors, where you click a link to start the download. Because you have to visit the inner page of every file to reach its download link, this becomes very tedious when there are hundreds of files to download.

For such a scenario, an ordinary downloader is not enough. You need one that can identify download links, select a mirror for you, and even log in for you when the website requires a login before downloading.

File List Downloader handles this scenario. All you have to do is enter the URL of the page that lists the files, and FLD will crawl it for you. The only drawback is that FLD needs a plugin written for that website. Currently FLD supports 4 plugins, and I will create more on request to support a wider variety of websites. FLD is also open source, so anybody can contribute to it.

Concept

Imagine a website where you pass through 3 levels of pages before a download starts. The first level, called level 0, shows a list of episodes to download: episode1, episode2, episode3. Clicking an episode shows the video formats available for it; some users prefer 720p (HD), while others prefer a lower quality such as 480p or 360p. So at level 1, the user chooses which format to download. After clicking 720p (for example), the user is redirected to level 2, which shows the download link.

Suppose this website, ABC.com, structures its URLs like this:
www.abc.com/AnimeTitle (replace AnimeTitle with, for example, Naruto or Bleach) is the level 0 URL, while a level 1 URL might be www.abc.com/AnimeTitle/Episode1, and so on.

It is tedious for users to click each episode, select the format they want, and then click the download link in the browser before the download starts. Worse, the user has to wait for each download to finish before clicking the next episode, so downloading a whole list of anime means spending time monitoring the download progress.
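
To make the level numbering concrete, here is a minimal Python sketch that maps a URL to its crawl level by counting path segments. It is only an illustration: the function name crawl_level is hypothetical, the rule "one extra path segment per level" and the shape of the level 2 URL are assumptions about the fictional ABC.com, and in FLD itself the level is tracked by the crawler and handed to the plugin rather than derived from the URL.

```python
from urllib.parse import urlsplit


def crawl_level(url: str) -> int:
    """Return the crawl level of a URL under the assumed ABC.com layout."""
    # urlsplit only recognises the host when a scheme is present.
    if "://" not in url:
        url = "http://" + url
    segments = [part for part in urlsplit(url).path.split("/") if part]
    if not segments:
        raise ValueError("URL has no title segment: " + url)
    # One segment (the title) is level 0; each extra segment goes one level deeper.
    return len(segments) - 1


for example in (
    "www.abc.com/Naruto",
    "www.abc.com/Naruto/Episode1",
    "www.abc.com/Naruto/Episode1/720p",  # assumed shape of a level 2 URL
):
    print(crawl_level(example), example)
```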

fld1.jpg

In the example above, the user gives www.abc.com/AnimeTitle (the level 0 URL) to the software. Step by step, the process is as follows:

  1. The user selects a plugin, enters the level 0 URL, and clicks Crawl in FLD. The plugin has a property stating that the download link is generated at level=2. This value is specific to abc.com, because each plugin is written for one website.

    fld2.jpg

  2. FLD downloads the URL the user entered and receives the page's HTML.
  3. FLD passes the HTML to the selected plugin with level=0. The plugin knows that level 0 means the page is a list of episodes, so it parses the HTML and returns the list of episode URLs.
  4. For each episode URL, FLD downloads the page and passes the HTML to the plugin with level=1. The plugin knows that level 1 means the page lists the formats available for that episode (some have 720p, some only 480p), so it parses the HTML, returns the URL of every available format, and marks one as the default selection, 720p in this example. Which format is selected by default depends on the plugin implementation: it could be the highest quality or the lowest. In the future this may become configurable, but for now the plugin is hard-coded to select the highest quality.
  5. FLD shows the user the list of episodes and the formats available for each, with the default format (720p) preselected. FLD knows to stop crawling here, because it has reached level=1, the deepest level before the download link is generated (at level=2). FLD itself does not limit the depth to 3 levels; how many levels are supported depends on the plugin implementation.
  6. FLD waits for the user to amend the selection or click Add To Download List.
  7. The user amends the selection, for example choosing 480p for some episodes and unselecting episodes that were already watched.
  8. The user clicks Add To Download List, and all selected downloads are added to the bottom section (the Download Section).

    fld3.jpg

  9. The user clicks Start Download, and downloading starts automatically. For the first file in the Download Section, FLD downloads its URL and passes the HTML to the plugin with level=2; the plugin parses it and returns the download link, and FLD starts downloading from that link. When the file finishes, FLD continues with the second file in the Download Section, and so on until all files have been downloaded.

    In short, FLD crawls each episode as if the user were browsing, fetches the format the user selected as if the user had clicked it, and finally downloads every selected episode in the Download Section.
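
The steps above can be sketched in a few lines of Python. This is an illustration of the idea only, not FLD's code: FLD's real plugins are .NET DLLs, and every name here (Link, AbcPlugin, download_link_level, crawl, resolve_download) is hypothetical. To keep the sketch runnable without a network, the "pages" are a dictionary and each page is a list of "title|url" lines rather than real HTML.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Link:
    title: str
    url: str
    selected: bool = True


# Fake site: URL -> page body. Each line is "title|url" instead of real HTML.
PAGES: Dict[str, str] = {
    "www.abc.com/Naruto": (
        "Episode1|www.abc.com/Naruto/Episode1\n"
        "Episode2|www.abc.com/Naruto/Episode2"
    ),
    "www.abc.com/Naruto/Episode1": (
        "480p|www.abc.com/Naruto/Episode1/480p\n"
        "720p|www.abc.com/Naruto/Episode1/720p"
    ),
    "www.abc.com/Naruto/Episode2": (
        "360p|www.abc.com/Naruto/Episode2/360p\n"
        "480p|www.abc.com/Naruto/Episode2/480p"
    ),
    "www.abc.com/Naruto/Episode1/720p": "Download|files.abc.com/naruto-01-720p.mkv",
    "www.abc.com/Naruto/Episode2/480p": "Download|files.abc.com/naruto-02-480p.mkv",
}


class AbcPlugin:
    """Hypothetical plugin for the fictional abc.com."""

    download_link_level = 2  # the download link is generated at level 2

    def parse(self, page: str, level: int) -> List[Link]:
        links = [Link(*line.split("|", 1)) for line in page.splitlines() if line]
        if level == 1:
            # Level 1 lists formats: preselect only the highest quality.
            best = max(links, key=lambda link: int(link.title.rstrip("p")))
            for link in links:
                link.selected = link is best
        return links


def crawl(plugin: AbcPlugin, fetch: Callable[[str], str], url: str, level: int = 0):
    """Steps 2-5: crawl until the level just above the download link."""
    links = plugin.parse(fetch(url), level)
    if level == plugin.download_link_level - 1:
        return links  # deepest level to show the user; stop crawling
    return {link.title: crawl(plugin, fetch, link.url, level + 1) for link in links}


def resolve_download(plugin: AbcPlugin, fetch: Callable[[str], str], chosen: Link) -> str:
    """Step 9: fetch the chosen page and ask the plugin for the download link."""
    return plugin.parse(fetch(chosen.url), plugin.download_link_level)[0].url


plugin = AbcPlugin()
fetch = PAGES.__getitem__
tree = crawl(plugin, fetch, "www.abc.com/Naruto")
# Steps 6-8: here the user would amend the selection; we keep the defaults.
queue = [link for formats in tree.values() for link in formats if link.selected]
for link in queue:
    print(resolve_download(plugin, fetch, link))
```

Note that the level 2 page is only fetched when a download actually starts, which matches step 9: the crawl stops one level above the download link, and the link itself is resolved per file at download time.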

Plugin

All the power of FLD is concentrated in the plugin. It is the plugin developer's job to make the plugin as capable as possible, while FLD handles browsing, downloading, and the user's selections. A plugin can have as many levels as it needs; the depth is essentially unlimited. A plugin can also tell FLD that the website requires a login before downloading. In that case the plugin supplies the URL to post the login information to and the format of the POST string, and FLD uses them to log in before downloading.

On top of that, some websites limit how many files a user can download per day, for example 4 files a day. A smart user then logs out and continues with another login, and it works, which means the website counts downloads per user account rather than per IP address. FLD supports this: it can be configured with multiple user accounts and rotates through them to download as much as possible. All the plugin needs to do is tell FLD how to recognize that the limit has been reached, so FLD knows it is time to switch to another username.

Each plugin is a .NET 2.0 DLL file and supports downloading from only one website, so supporting a wide variety of websites requires many DLL files. Currently there are only 4 plugins, all for anime direct-download websites.
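
As a rough illustration of the login hand-off, the Python sketch below shows a plugin exposing a login URL and a POST string format, and the downloader filling in the credentials to build the request. The attribute names (requires_login, login_url, post_format), the URL, and the form field names are all assumptions made for this example; the actual property names in FLD's plugin interface may differ. The request is only constructed, not sent.

```python
from urllib.parse import quote_plus
from urllib.request import Request


class AbcPlugin:
    """Hypothetical plugin describing how to log in to the fictional abc.com."""

    requires_login = True
    login_url = "http://www.abc.com/login"                # assumed URL
    post_format = "username={user}&password={password}"   # assumed field names


def build_login_request(plugin, user: str, password: str) -> Request:
    """Fill the plugin's POST format with URL-encoded credentials."""
    body = plugin.post_format.format(
        user=quote_plus(user), password=quote_plus(password)
    )
    return Request(
        plugin.login_url,
        data=body.encode("ascii"),  # supplying data makes this a POST request
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_login_request(AbcPlugin, "alice", "p@ss word")
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))
```

A real client would also need to keep the session cookies returned by the login response and send them with later download requests.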
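
The rotation idea can be sketched as follows, again in Python and under stated assumptions: the plugin recognizes the limit by looking for a marker string in the response, and the downloader moves to the next account and retries the same file. The names limit_marker, limit_reached, FakeSite, and download_all are hypothetical, and FakeSite merely stands in for a website that allows 4 downloads per account.

```python
from collections import Counter
from typing import List, Tuple


class AbcPlugin:
    """Hypothetical plugin that knows what the 'limit reached' page looks like."""

    limit_marker = "You have reached your daily download limit"  # assumed text

    def limit_reached(self, page: str) -> bool:
        return self.limit_marker in page


class FakeSite:
    """Stand-in for the website: allows `quota` downloads per account."""

    def __init__(self, quota: int) -> None:
        self.quota = quota
        self.used = Counter()

    def request(self, item: str, account: str) -> str:
        if self.used[account] >= self.quota:
            return AbcPlugin.limit_marker
        self.used[account] += 1
        return "file data for " + item


def download_all(items: List[str], accounts: List[str],
                 plugin: AbcPlugin, site: FakeSite) -> List[Tuple[str, str]]:
    """Download every item, switching accounts whenever the limit is hit."""
    log = []
    current = 0
    for item in items:
        while current < len(accounts):
            page = site.request(item, accounts[current])
            if not plugin.limit_reached(page):
                log.append((item, accounts[current]))
                break
            current += 1  # limit hit: switch to the next account and retry
        else:
            raise RuntimeError("every account has reached its limit")
    return log


episodes = ["ep1", "ep2", "ep3", "ep4", "ep5", "ep6"]
log = download_all(episodes, ["user1", "user2"], AbcPlugin(), FakeSite(quota=4))
for item, account in log:
    print(item, "downloaded as", account)
```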

Demo

I prepared a video demonstrating how to use File List Downloader. Check it out here:

Future Enhancements

  1. Fix a bug that sometimes stops the download and prevents it from continuing to the next episode (rare case)
  2. Create more plugins for more direct-download websites
  3. Support resuming downloads
  4. Support multi-segment downloads

Change Log

  • 2 June 2011: Fixed the hylia plugin so that it can now log in. Since hylia limits downloads to 3 per day by checking the IP address, it is pointless to use this tool for only 3 files, so I now require the user to specify login information, and FLD acts as a premium member when downloading.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
Singapore
I write code mostly in C#, VB.NET, PHP and Assembly.
