
Monday, February 12, 2007

Download all?

If you have ever faced a site with hundreds of files that you want to download, you will know what this is about.

I've faced two of them today: a site containing the MDC pictures, and a site containing MP3s of the whole Qur'an recited by Mashari.

On the page containing all the links, I could have gone through them link by link, pressing 'Save link as'. Instead, I did the following.

In Firefox, I right-clicked on the page and chose 'View Page Info'. In the 'Media' tab, I selected all the images and saved the link list to a file called list.txt.

Then I created a new folder, put the file in it, typed the following simple command, and pressed Enter:
for File in $(cat list.txt); do wget "$File"; done
And I let it download all the links mentioned in the file.

As for the MP3s, I went to the 'Links' tab and likewise saved all the links to a file, also called list.txt, in another folder. But this time there were two links for each Sura, one MP3 and one Zip. I did this to keep the zip links only:
grep -i zip list.txt > list.tmp && mv list.tmp list.txt
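One pitfall here: redirecting grep's output straight back into its own input file (a plain > list.txt) truncates the file before grep ever reads it, which is why the filtering goes through a temporary file. A quick local check, with made-up link names:

```shell
# Made-up link names, just to demonstrate the safe filtering pattern.
printf '001.mp3\n001.zip\n002.mp3\n002.zip\n' > list.txt
# Filter into a temp file first, then replace the original.
grep -i zip list.txt > list.tmp && mv list.tmp list.txt
cat list.txt
```

Only the .zip lines survive.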
Then I issued the same command to start downloading. After that I did:
for File in *.zip; do unzip "$File" && rm "$File"; done
to unzip each file and delete the .zip archive once it was extracted.

After that I did:
play *.mp3
to play them all :D ( yeah mp3 from the command line :D )

The final download script for the MP3s was:
for File in $(cat list.txt)
do wget "$File"
   File=$(basename "$File")
   unzip "$File"
   rm "$File"
done
This way, once a file is downloaded, it is unzipped and the archive deleted automatically.

I am writing this to let people who use non-Unix systems know how simple and handy Linux can be.


Islam Ossama said...

Or you can download GetRight, copy the page's link, open the GetRight Browser, select the files you want to download from the list, and click download ;)! (Same or very similar procedure for any download manager, too!)

Ahmed Mubarak said...

May Allah reward you both with good, Nabil and Islam.

Mohammad Nabil said...

Islam: you will still click hundreds of times, congratulations :D

Ahmad: and you as well


Btw, this command normalizes the gain across all your MP3s:
find . -type f -iname '*.mp3' -print0 | xargs -0 mp3gain -r -k

This script:
for file in *.mp3
do file_name2=$(echo "$file" | sed 's/^00*//')
   [ "$file" != "$file_name2" ] && mv "$file" "$file_name2"
done
removes the leading zeroes from the file names so you can play them using another script like that:
./ 80 114
to play Suras from 80 to 114
The script is:
i=$1
while [ $i -le $2 ]
do play --volume=1.6 ${i}.mp3
   i=$(( $i + 1 ))
done
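The leading-zero strip in the rename script can be sanity-checked on its own, without touching any real files (the numbers below are just examples):

```shell
# sed 's/^00*//' deletes a run of leading zeroes;
# names with no leading zero pass through unchanged.
printf '001\n080\n114\n' | sed 's/^00*//'
```

This prints 1, 80, and 114, one per line.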

amk said...

To download the mp3 files only from the site, use:

wget -r --no-parent -A.mp3

After a very brief initialization you'll have a folder containing all the mp3 files.
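Spelled out with a placeholder URL (the real address is whatever page holds the files), the command would look like this; it is shown through echo so that copying the line won't actually start a crawl:

```shell
# http://example.com/mashari/ is a placeholder, not the actual site.
# -r           recurse into linked pages
# --no-parent  never climb above the starting directory
# -A.mp3       accept (keep) only files ending in .mp3
echo wget -r --no-parent -A.mp3 http://example.com/mashari/
```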

wget is a powerful tool :)

Ibrahim said...

Nice writeup. I use wget for this kind of task too, but I understand that Mohamed Nabil's point is that shell programming is powerful and fun.

Personally, I got the list from Firefox > Page Info, then pasted it into Vim. I wanted to download the mp3 files, so I did ":g/zip/d" to delete the zip links, then saved the list of URLs and fed it to wget with "-i" to read the URLs from the input file.

It is fun to know others' ways of doing stuff, isn't it?

Mohammad Nabil said...

Neat one :D I think that's much better than my solution and more straightforward than Islam Ossama's :)

That was exactly my point; shell programming. I am relatively new to it and I am totally fascinated.

Yes indeed, it's amazing how many ways stuff can be done under Linux.

I've figured out another way, instead of looping over the URLs: we can just 'cat' list.txt and pipe it into 'xargs' running wget. I haven't tried that one yet, though.
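Something like this should do it (untested; shown here with echo standing in for wget, and a made-up list.txt):

```shell
# Made-up URLs; drop the word "echo" to actually download.
printf 'http://example.com/080.zip\nhttp://example.com/081.zip\n' > list.txt
# xargs -n 1 runs the command once per URL read from stdin.
cat list.txt | xargs -n 1 echo wget
```

Each line of output is one wget invocation that would have run.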

amk said...

I totally agree with you: even if shell programming were not needed for this example, it is indeed very useful, essential, and sometimes the only solution for many other tasks.

By the way, in my last comment you don't need to include the -r option if you just want the Mashari files only.

Thanks and Nice Blog :)

Mohammad Nabil said...

I've been trying to convince people about shell scripting, but they keep insisting on not doing it if there is no GUI that does the same thing.

Thanks for the hint about -r; I also kind of guessed it might refer to some 'recursive' feature. I'll try it ASAP, because I am also curious how it would separate the files of different readers (directories, maybe). The reason I haven't tried it until now is that I was playing my favorite game on Windows :(.

You're welcome to read the blog anytime :). Thank you too for your script.

Mohammad Nabil said...

I've just noticed that the '-i' option is a replacement for using the xargs method :$