Monday, February 12, 2007

Download all?

If you have ever faced a site with hundreds of files that you want to download, you will know what this is about.

I've faced two of them today. A site containing the MDC pictures, and a site containing MP3s for the whole Qur'an by Mashari.

On the page containing all the links, I could have gone through them one by one, pressing 'save item as'. Instead, I did the following.

Using Firefox, I right-clicked on the page and chose 'View page info'. In the 'media' tab, I selected all the images and saved the link list to a file called list.txt.

Then I created a new folder and put the file in it. I wrote the following simple command and pressed enter:
for File in $(cat list.txt); do wget "$File"; done
and let it download all the links listed in the file.
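
The one-liner splits the file on whitespace, which is fine for plain URLs. A slightly more defensive sketch reads the list line by line and quotes each URL. The function name `download_list` is made up for illustration, and it assumes list.txt holds one URL per line:

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): download every URL listed
# in a file, assuming one URL per line.
download_list() {
    list_file=$1
    # The "|| [ -n ... ]" part also handles a last line with no trailing newline.
    while IFS= read -r url || [ -n "$url" ]; do
        # Skip blank lines so they do not turn into an empty wget call.
        [ -n "$url" ] || continue
        wget "$url"
    done < "$list_file"
}

# Usage (run in the folder that holds list.txt):
#   download_list list.txt
```

Quoting "$url" keeps characters such as ? and & in a link from being interpreted by the shell.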

As for the MP3s, I went to the 'links' tab and likewise put all the links in a file, also called list.txt, in another folder. But this time there were two links for each Sura, one MP3 and one Zip. I did this to keep the zip links only:
grep -i zip list.txt > zip.txt && mv zip.txt list.txt
Then I issued the same command to start downloading. After that I did:
for File in *.zip; do unzip "$File" && rm "$File"; done
to unzip the files and delete each .zip once it was unpacked.
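
One thing to watch in the filtering step: the shell opens (and empties) the target of `>` before the command starts, so the filter has to write somewhere other than the file it is reading. A minimal sketch of that pattern, where `keep_matching` is a made-up helper name and not part of any tool:

```shell
#!/bin/sh
# Hypothetical helper: keep only the lines of a file that match a pattern
# (case-insensitive), rewriting the file through a temporary copy.
keep_matching() {
    pattern=$1
    file=$2
    tmp=$(mktemp) || return 1
    status=0
    grep -i -- "$pattern" "$file" > "$tmp" || status=$?
    # grep exits with 1 when nothing matched; treat that as an empty result,
    # but give up on real errors (exit status 2 or higher).
    if [ "$status" -gt 1 ]; then
        rm -f "$tmp"
        return "$status"
    fi
    mv "$tmp" "$file"
}

# Usage: keep_matching zip list.txt
```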

After that I did:
play *.mp3
to play them all :D ( yeah mp3 from the command line :D )

The final download script for the MP3s was
for File in $(cat list.txt)
do wget "$File"
File=$(basename "$File")
unzip "$File" && rm "$File"
done
That way, once a file is downloaded it is unzipped and deleted automatically.
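
If some downloads might fail, it can help to delete an archive only after both the download and the unzip succeeded. The sketch below shows one way to do that. `fetch_and_unpack` is an invented name, and the `-o` flag (overwrite without asking) is my own choice, added so unzip never stops to prompt in the middle of the loop:

```shell
#!/bin/sh
# Hypothetical helper: download each archive in a list, unpack it, and
# delete the archive only if both earlier steps succeeded.
fetch_and_unpack() {
    list_file=$1
    while IFS= read -r url || [ -n "$url" ]; do
        [ -n "$url" ] || continue
        archive=$(basename "$url")
        # -o makes unzip overwrite existing files instead of prompting.
        if wget "$url" && unzip -o "$archive"; then
            rm -- "$archive"
        else
            # Leave whatever was downloaded in place and report the URL.
            echo "skipped: $url" >&2
        fi
    done < "$list_file"
}

# Usage: fetch_and_unpack list.txt
```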

I am writing this to let people who use 'non-unix' systems know how simple and handy Linux can be.

9 comments:

Islam Ossama said...

Or you can download GetRight, copy the page's link, open the GetRight Browser, select the files you want to download from the list, and click download ;)! (Same or very similar procedure for any download manager, too!)

Ahmed Mubarak said...

May God reward you both with good, Nabil and Islam.

Mohammad Alaggan said...

Islam: you still will click hundreds of times, congratulations :D

Ahmad: and you too

..

Btw this command normalizes the gain across all your MP3s:
---------8<--------------
find . -type f -iname '*.mp3' -print0 | xargs -0 mp3gain -r -k
---------8<--------------

This script:
---------8<--------------
for file in *.mp3
do file_name=$(basename "$file")
file_name2=$(echo "$file_name" | sed 's/^00*//')
[ "$file_name" = "$file_name2" ] || mv "$file_name" "$file_name2"
done
---------8<--------------
removes the leading zeroes from the file names, so you can play them with another script like this:
---------8<--------------
./play.sh 80 114
---------8<--------------
to play Suras from 80 to 114.
play.sh is:
---------8<--------------
i=$1
while [ "$i" -le "$2" ]
do play -v 1.6 "${i}.mp3"
i=$(( i + 1 ))
done
---------8<--------------
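
A range can also be played without renaming anything, by letting printf rebuild the zero-padded names. This is only a sketch: it assumes the files are named with three digits, such as 080.mp3, which nothing above actually confirms, and `sura_files` is an invented name.

```shell
#!/bin/sh
# Hypothetical helper: print the file names for an inclusive range of Suras,
# assuming three-digit, zero-padded names such as 080.mp3 (an assumption).
sura_files() {
    i=$1
    last=$2
    # -le keeps the last number in the range.
    while [ "$i" -le "$last" ]; do
        printf '%03d.mp3\n' "$i"
        i=$((i + 1))
    done
}

# Usage (left unquoted on purpose, since the names contain no spaces):
#   play $(sura_files 80 114)
```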

amk said...

To download only the mp3 files from the site, use:

wget -r --no-parent -A.mp3 http://www.mp3quran.net/afs.html

After a very brief initialization you'll have a folder containing all the mp3 files.

wget is a powerful tool :)

Ibrahim Ahmed said...

Nice writeup. I am into wget for this kind of task too, but I understand that Mohamed Nabil's point is that shell programming is powerful and fun.

Personally, I got the list from Firefox > page info and pasted it into Vim. I wanted to download the mp3 files, so I did ":g/zip/d" to delete the zip lines, saved the list of URLs, and fed it to wget with "-i", which reads the URLs from an input file.

It is fun to learn others' ways of doing stuff, isn't it?

Mohammad Alaggan said...

amk:
Neat one :D I think that's definitely better than my solution and much more straightforward than Islam Ossama's solution :)

Ibrahim:
That was exactly my point; shell programming. I am relatively new to it and I am totally fascinated.

Yes indeed, it's amazing how many ways things can be done under Linux.

I've figured out another way instead of looping over the URLs. We can just 'cat' list.txt and pipe it into 'xargs' with wget. I haven't tried that one yet, though.
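
A minimal sketch of that xargs idea, assuming list.txt holds one plain URL per line (xargs treats quotes and backslashes in its input specially, so unusual links may need extra care). The function name `fetch_with_xargs` is invented for illustration:

```shell
#!/bin/sh
# Hypothetical helper: hand each line of a URL list to wget through xargs.
fetch_with_xargs() {
    # -n 1 runs one wget per URL; without it, xargs would pass many URLs
    # to a single wget call, which wget also accepts.
    xargs -n 1 wget < "$1"
}

# Usage: fetch_with_xargs list.txt
```

Redirecting the file into xargs does the same job as piping it through cat, with one process fewer.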

amk said...

I totally agree with you that shell programming, even if it is not needed for this example, is indeed very useful and essential, and sometimes the only way to make many other tasks easier.

By the way, in my last comment you don't need to include the -r option if you just want the Mashari files.

Thanks and Nice Blog :)

Mohammad Alaggan said...

amk:
I've been trying to convince people about shell scripting, but they keep insisting on not doing something if there is no GUI that does it.

Thanks for the hint about -r; I had also kind of guessed it refers to some 'recursive' feature. I'll try it ASAP because I am also curious how it would separate the files of different readers (directories, maybe). The reason I haven't tried it yet is that I was playing my favorite game on Windows :(.

You're welcome to read the blog anytime :). Thank you too for your script.

Mohammad Alaggan said...

ibrahim:
I've just noticed that the '-i' option is a replacement for using the xargs method :$