wgetHandler

Implements an interface to run downloads with wget, intended to be used in shell scripts but also usable in a web environment.

Author

© 2012, Tobias Baeumer Tobias.nosp@m.Baeumer@gmai.nosp@m.l.com

License

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program.  If not, see http://www.gnu.org/licenses/.

Summary
wgetHandler - Implements an interface to run downloads with wget, intended to be used in shell scripts but also usable in a web environment.
Variables
strPathTowget - Path to wget binary to use.
strParams - Commandline params/options to use when launching wget.
strUserAgent - HTTP UserAgent to send.
arrErrorDescriptions - Descriptions of wget return values, keyed by return value.
arrCookies - Cookies to send with download requests.
intReturnValue - Holds wget's return value once doDownload has finished.
strDownloadPath - Path (full or relative) to target directory, without trailing slash, for downloaded files.
arrOutputLines - Will be filled with stdout (and stderr, since we use output redirection) from wget.
objDownloadResult - Holds the download results.
arrIndicatorChars - Each element contains one char, used in order of appearance in array for intLiveMode=3 param in doDownload.
Functions
__construct() - Initializes objDownloadResult.
__destruct() - Deletes the cookies.txt file, if it exists.
_saveCookieFile - Saves cookies from arrCookies to cookies.txt.
_parseOutput - Takes an array of wget output lines and extracts the average download speed, the time needed and the file size from it.
doDownload - Downloads a given URL.

Variables

strPathTowget

public $strPathTowget

Path to wget binary to use.  Can be relative, full, or just “wget” if the binary is in path.

Default

string - /usr/bin/wget

strParams

public $strParams

Commandline params/options to use when launching wget.

Notes

Do not add ‘--referer’ or ‘--user-agent’ here. Set them through strReferer (see <doDownload>) and strUserAgent instead, otherwise your values will be overwritten by those.

Default

string - -e robots=off --follow-tags=meta --follow-ftp --max-redirect 5 --span-hosts --cookies=on --load-cookies=cookies.txt
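
The interplay of strPathTowget, strParams, strUserAgent and strReferer described above can be sketched as follows. This is only an illustration under assumptions: buildWgetCommand is a hypothetical helper that is not part of the documented class, and the class may assemble its command line differently. The sketch relies on wget's --user-agent and --referer options and on shell redirection of stderr to stdout.

```php
<?php
// Hypothetical helper (not part of wgetHandler): assembles a wget command line.
function buildWgetCommand($strPathTowget, $strParams, $strUserAgent, $strReferer, $strURL)
{
    // $strParams holds several options, so it is passed through unescaped.
    $arrParts = array(escapeshellarg($strPathTowget), $strParams);

    // A null user agent is omitted, so wget sends its own.
    if ($strUserAgent !== null) {
        $arrParts[] = '--user-agent=' . escapeshellarg($strUserAgent);
    }
    // An empty referer is omitted as well.
    if ($strReferer !== '') {
        $arrParts[] = '--referer=' . escapeshellarg($strReferer);
    }
    $arrParts[] = escapeshellarg($strURL);

    // wget writes its progress to stderr; 2>&1 merges it into stdout.
    return implode(' ', $arrParts) . ' 2>&1';
}

// Example values only; nothing is executed here.
$strCommand = buildWgetCommand(
    '/usr/bin/wget',
    '-e robots=off --max-redirect 5',
    'Mozilla/5.0',
    '',
    'http://example.com/file.zip'
);
echo $strCommand, "\n";
```

Appending the user agent and referer after strParams is what makes values placed in strParams ineffective, which matches the note above.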

strUserAgent

public $strUserAgent

HTTP UserAgent to send.  If set to null this param will be omitted and wget will send its own UserAgent.

Default

const - null

arrErrorDescriptions

public $arrErrorDescriptions

Descriptions of wget return values, keyed by return value.  These strings are only used as the return value of doDownload on failure and can be translated.

Notes

As with intReturnValue, do not trust this value when using `intLiveMode` with doDownload.

Default

array - English descriptions; see the wget manpage for details.
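
As an illustration of such a mapping, the sketch below uses the exit status list from the wget manpage (versions that document exit codes 0 to 8). The array contents and the describeWgetExit helper are assumptions for this example; the strings shipped with the class may be worded differently.

```php
<?php
// Hypothetical stand-in for arrErrorDescriptions, keyed by wget exit status.
$arrErrorDescriptions = array(
    0 => 'No problems occurred.',
    1 => 'Generic error code.',
    2 => 'Parse error, for instance when parsing command-line options.',
    3 => 'File I/O error.',
    4 => 'Network failure.',
    5 => 'SSL verification failure.',
    6 => 'Username/password authentication failure.',
    7 => 'Protocol errors.',
    8 => 'Server issued an error response.',
);

// Hypothetical helper: look up a description, with a fallback for unknown codes
// such as the 99 placeholder used for intReturnValue.
function describeWgetExit(array $arrDescriptions, $intReturnValue)
{
    if (isset($arrDescriptions[$intReturnValue])) {
        return $arrDescriptions[$intReturnValue];
    }
    return 'Unknown wget return value: ' . $intReturnValue;
}

echo describeWgetExit($arrErrorDescriptions, 4), "\n";
```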

arrCookies

public $arrCookies

Cookies to send with download requests.  Array keys are used as cookie names, array values as cookie values.

Default

array - Empty array

intReturnValue

public $intReturnValue

Holds wget's return value once doDownload has finished.

Notes

Do not trust this value when using `intLiveMode` with doDownload.

Default

integer - 99 (as placeholder for init, no meaning)

strDownloadPath

public $strDownloadPath

Path (full or relative) to target directory, without trailing slash, for downloaded files.  The directory must be writable, so a chmod may be needed.

Notes

Will also be used for cookies.txt (used for passing cookies to wget) and .wgetout (used for intLiveMode in doDownload).

Default

string - ./downloads

arrOutputLines

private $arrOutputLines

Will be filled with stdout (and stderr, since we use output redirection) from wget.  Used by _parseOutput.

Notes

This is also available, despite using “to file” output redirection and detached exec(), when using intLiveMode in doDownload.

Default

array - Empty array

See Also

_parseOutput

objDownloadResult

public $objDownloadResult

Holds the download results.

Default

none

See Also

_parseOutput, arrOutputLines

arrIndicatorChars

public $arrIndicatorChars

Each element contains one char, used in order of appearance in array for intLiveMode=3 param in doDownload.

Default

array - |, /, -, \, |, /, -, \

See Also

doDownload

Functions

__construct()

public function __construct()

Initializes objDownloadResult.

__destruct()

public function __destruct()

Deletes the cookies.txt file, if it exists.

_saveCookieFile

private function _saveCookieFile($strDomainCookieScope)

Saves cookies from arrCookies to cookies.txt.

Parameters

string strDomainCookieScope - the domain/host for which these cookies are valid

Returns

boolean - true, if the save was successful

or

boolean - false, if an error occurred (most likely because of permission problems)

See Also

arrCookies
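
A minimal sketch of how such a cookie file could be written is shown below. It assumes the Netscape cookies.txt layout that wget's --load-cookies option reads (seven tab-separated fields per line). The function names, the fixed path "/", the non-secure flag and the one-day expiry are assumptions made for this example, not details taken from the class.

```php
<?php
// Hypothetical helper: render cookies in the Netscape cookies.txt format.
// Fields: domain, include-subdomains flag, path, secure flag, expiry, name, value.
function buildCookieFileContents(array $arrCookies, $strDomainCookieScope, $intExpires)
{
    // A leading dot on the domain conventionally means "also valid for subdomains".
    $strIncludeSubdomains = (strpos($strDomainCookieScope, '.') === 0) ? 'TRUE' : 'FALSE';

    $arrLines = array('# Netscape HTTP Cookie File');
    foreach ($arrCookies as $strName => $strValue) {
        $arrLines[] = implode("\t", array(
            $strDomainCookieScope,
            $strIncludeSubdomains,
            '/',          // assumed path
            'FALSE',      // assumed: not restricted to HTTPS
            $intExpires,  // Unix timestamp
            $strName,
            $strValue,
        ));
    }
    return implode("\n", $arrLines) . "\n";
}

// Hypothetical counterpart to _saveCookieFile: true on success, false on error.
function saveCookieFile($strFile, array $arrCookies, $strDomainCookieScope)
{
    $strContents = buildCookieFileContents($arrCookies, $strDomainCookieScope, time() + 86400);
    // Errors (for example missing write permission) are reported via the return value.
    return @file_put_contents($strFile, $strContents) !== false;
}
```

Keeping the formatting separate from the file write makes the permission-related failure path easy to test on its own.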

_parseOutput

private function _parseOutput()

Takes an array of wget output lines and extracts the average download speed, the time needed and the file size from it.

Returns

object

  • string - AvgSpeed,
  • string - Size (depending on filesize it will be either in KB, MB or GB, with trailing unit),
  • string - Time

or

boolean - false, if the server file didn’t change

See Also

arrOutputLines
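
The parsing step could look roughly like the sketch below. It is an assumption-laden illustration: the regular expressions target the English progress and summary lines that common wget versions print (a trailing "in 0.8s" on the progress line, "(1.23 MB/s) ... saved [bytes/bytes]" on the summary line, and "not retrieving" when timestamping finds no newer server file). The function names are hypothetical and the real method may parse differently.

```php
<?php
// Hypothetical helper: format a byte count as KB, MB or GB with trailing unit.
function formatSize($intBytes)
{
    if ($intBytes >= 1073741824) {
        return round($intBytes / 1073741824, 2) . ' GB';
    }
    if ($intBytes >= 1048576) {
        return round($intBytes / 1048576, 2) . ' MB';
    }
    return round($intBytes / 1024, 2) . ' KB';
}

// Hypothetical counterpart to _parseOutput.
function parseWgetOutput(array $arrOutputLines)
{
    $objResult = new stdClass();
    $objResult->AvgSpeed = '';
    $objResult->Size = '';
    $objResult->Time = '';

    foreach ($arrOutputLines as $strLine) {
        // Assumed wording for "server file did not change".
        if (strpos($strLine, 'not retrieving') !== false) {
            return false;
        }
        // Summary line: average speed in parentheses, byte count in brackets.
        if (preg_match('/\(([\d.,]+ [KMG]?B\/s)\).*saved \[(\d+)/', $strLine, $arrMatch)) {
            $objResult->AvgSpeed = $arrMatch[1];
            $objResult->Size = formatSize((int) $arrMatch[2]);
        }
        // Progress line: elapsed time at the end, e.g. "in 0.8s" or "in 2m 5s".
        if (preg_match('/\bin ([\d.,]+[smhd](?: ?[\d.,]+[smhd])*)\s*$/', $strLine, $arrMatch)) {
            $objResult->Time = $arrMatch[1];
        }
    }
    return $objResult;
}
```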

doDownload

public function doDownload($strURL,  
$strTargetFile = "",
$strReferer = "",
$intLiveMode = 0,
$bDontDeleteTargetFile = false)

Downloads a given URL.

Parameters

string strURL - URL to download
string strTargetFile - target file for download (defaults to the name given by the server)
string strReferer - HTTP referer to send (defaults to none)
integer intLiveMode - If greater than 0, wget runs in the background and a live status is printed; the value selects the display mode.  See Notes.
boolean bDontDeleteTargetFile - defaults to false; if set to true, deletion of the target file is skipped.

Returns

boolean - true, if the download was successful

or

string - containing the error description

Notes

intLiveMode can be 1, 2 or 3.  These modes have the following meaning:

  • 1: Print dots each time the progress percentage in wgets output changes.  The “dot-bar” will start and end with a space.
  • 2: Print percentage value, update it by moving the cursor back with ANSI codes.  This mode is for shell scripts.
  • 3: Print a spinning 1-Char length animation on cursor position, again using ANSI codes for positioning.  You can set the characters to use in arrIndicatorChars.
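
The display logic of modes 2 and 3 can be sketched as below. This is a hedged illustration only: the three function names are hypothetical, the percentage is assumed to appear as "NN%" in wget's output lines, and the cursor is moved back with the ANSI "cursor back" sequence ESC [ n D. The class itself may use other sequences.

```php
<?php
// Default characters as documented for arrIndicatorChars.
$arrIndicatorChars = array('|', '/', '-', '\\', '|', '/', '-', '\\');

// Hypothetical helper: last percentage found in a wget output line, or null.
function extractPercent($strLine)
{
    if (preg_match_all('/(\d{1,3})%/', $strLine, $arrMatches)) {
        return (int) end($arrMatches[1]);
    }
    return null;
}

// Mode 2 sketch: print the percentage, then move the cursor back over it.
function percentFrame($intPercent)
{
    $strText = sprintf('%3d%%', $intPercent);
    return $strText . "\033[" . strlen($strText) . 'D';
}

// Mode 3 sketch: print one indicator char, then move the cursor back one column.
function spinnerFrame(array $arrChars, $intStep)
{
    return $arrChars[$intStep % count($arrChars)] . "\033[1D";
}

// Usage: draw a few spinner steps on the current cursor position.
for ($intStep = 0; $intStep < 4; $intStep++) {
    echo spinnerFrame($arrIndicatorChars, $intStep);
}
echo "\n";
```

Because each frame ends by moving the cursor back, the next frame overwrites the previous one in place.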

See Also

_saveCookieFile, strPathTowget, strParams, arrOutputLines, intReturnValue, objDownloadResult, _parseOutput, arrErrorDescriptions, arrIndicatorChars
