proc parse_file {script} {
    set fd [open $script r]
    while {[gets $fd line] >= 0} {
        # skip lines that do not carry an http:// address
        if {![regexp {http://([^/]*)(.*)} $line match url fn]} {
            continue
        }
        if {$fn eq "" || $fn eq "/"} {
            set fn "noname"
        }
        get_webpage $url $fn
    }
    close $fd
}
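To see what the regular expression extracts, here is a quick check in an interactive tclsh session (the address is only an illustration):

% regexp {http://([^/]*)(.*)} "http://www.example.com/man/index.html" match url fn
1
% puts "$url -- $fn"
www.example.com -- /man/index.html

The first capture group becomes the host name and the second the path of the file to request.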
If no filename is given (the address ends with the host name or just a
slash "/"), we save the file as "noname", so it will be given a name
anyway. We also check whether the file already exists, so we do not
fetch a page we already have. A previously fetched page is assumed to
be good when it exists and is not empty.
This simple procedure does the job:
proc file_ok {fn} {
if ![file exists $fn] {
return 1
}
if {[file size $fn] > 0} {
return 0
}
return 1
}
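A quick sanity check (the file names are hypothetical):

file_ok "$savedir/index.html"   ;# 0 when already downloaded with content
file_ok "$savedir/new.html"     ;# 1 when missing, or present but empty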
Then, here is the main routine:
proc get_webpage {url fn} {
    global eventLoop sock savefd savedir retries max_retries
    set retries 0
    # "noname" stands for the server's root page, so request "/"
    set path $fn
    if {$fn eq "noname"} {
        set path "/"
    }
    puts "transferring: $url$path"
    if {[file_ok "$savedir/$fn"]} {
        while {$retries < $max_retries} {
            set savefd [open "$savedir/$fn" w]
            set sock [socket $url 80]
            fileevent $sock readable [list read_sock $sock]
            fconfigure $sock -buffering line
            puts $sock "GET $path"
            vwait eventLoop
            close $sock
            close $savefd
            # a non-empty file means the transfer succeeded
            if {[file size "$savedir/$fn"] > 0} {
                break
            }
            incr retries
        }
    }
}
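The bare "GET $path" line is an old HTTP/0.9-style request. Many servers today expect at least an HTTP/1.0 request line, a Host header and a blank line; if the simple form does not work, a sketch of the fuller request (using the same variables as above) would be:

fconfigure $sock -translation crlf
puts $sock "GET $path HTTP/1.0"
puts $sock "Host: $url"
puts $sock ""

Keep in mind that an HTTP/1.0 response starts with a status line and headers, which read_sock would write to the file as well.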
To manage the reading of lines from the connection, we registered a
fileevent with the callback "read_sock". This procedure is called each
time the channel (stored in the "sock" variable above) becomes readable.
proc read_sock {rsock} {
    global eventLoop savefd
    if {[eof $rsock]} {
        # the server closed the connection: end the vwait in get_webpage
        set eventLoop "done"
    } else {
        # write only complete lines; gets returns -1 otherwise
        if {[gets $rsock line] >= 0} {
            puts $savefd $line
        }
    }
}
To finish our simple utility, we just have to define some variables and
execute the parse_file command:
set savedir "/directory/where/to/save"
set max_retries 5
parse_file "webget.links"
exit

Remember to create the "webget.links" file and put one URL per line in it.
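For example, such a file could look like this (the addresses are only placeholders):

http://www.example.com/
http://www.example.com/index.html
http://www.example.org/faq.html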