    proc parse_file {script} {
        set fd [open $script r]
        while {![eof $fd]} {
            # skip lines that contain no http:// URL (e.g. blank lines)
            if {![regexp {http://([^/]*)(.*)} [gets $fd] match url fn]} {
                continue
            }
            if {[string length $fn] == 0} {
                set fn "noname"
            }
            get_webpage $url $fn
        }
        close $fd
    }

If no filename is given (the address consists of the host part only, so nothing follows it), we save the file as "noname", so it gets a name anyway. We also check whether this file already exists, so we don't fetch a page we already have.
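To see what the regexp captures, here is a small sketch; the sample URL is made up for illustration:

    # the pattern splits a URL into host part and path part
    set line "http://www.example.com/index.html"
    regexp {http://([^/]*)(.*)} $line match url fn
    puts $url   ;# www.example.com
    puts $fn    ;# /index.html
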
We assume a previously fetched page to be good when its file exists and has nonzero length.
This simple procedure does the job:
    proc file_ok {fn} {
        if {![file exists $fn]} {
            return 1
        }
        if {[file size $fn] > 0} {
            return 0
        }
        return 1
    }

Then, here is the main routine:
    proc get_webpage {url fn} {
        global eventLoop sock savefd savedir retries max_retries
        set retries 0
        puts "downloading: $url$fn"
        if {[file_ok "$savedir/$fn"]} {
            while {$retries < $max_retries} {
                set savefd [open "$savedir/$fn" w]
                set sock [socket $url 80]
                fileevent $sock readable [list read_sock $sock]
                fconfigure $sock -buffering line
                # "noname" is our own marker for an empty path:
                # request "/" from the server instead
                set path $fn
                if {$path eq "noname"} {
                    set path /
                }
                puts $sock "GET $path"
                vwait eventLoop
                close $sock
                close $savefd
                if {[file size "$savedir/$fn"] > 0} {
                    set retries $max_retries
                }
                incr retries
            }
        }
    }

To manage the reading of lines from the connection, we register a fileevent with the callback "read_sock". This procedure is called each time the channel (stored in the "sock" variable above) becomes readable.
    proc read_sock {rsock} {
        global eventLoop savefd
        if {[eof $rsock]} {
            set eventLoop "done"
        } else {
            puts $savefd [gets $rsock]
        }
    }

To finish our simple utility, we just have to define a few variables and run the parse_file command:
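The combination of vwait and fileevent is the heart of this utility: vwait enters the Tcl event loop and blocks until the named variable is written. A minimal sketch of the same pattern, using a timer instead of a socket so it is self-contained:

    # vwait blocks until eventLoop is set; here a 100 ms timer
    # plays the role that the eof branch of read_sock plays above
    set eventLoop ""
    after 100 {set eventLoop "done"}
    vwait eventLoop
    puts $eventLoop   ;# done
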
    set savedir "/directory/where/to/save"
    set max_retries 5
    parse_file "webget.links"
    exit

Remember to create the "webget.links" file and put one URL on each line.
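For example, a "webget.links" file might look like this (the URLs are placeholders, not part of the utility itself):

    http://www.example.com/index.html
    http://www.example.com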