Table of Contents
- Introduction to wget
- Installation of wget
- Basic Syntax of wget
- How to Use the wget Command in Linux
- 1. Downloading a Single File
- 2. Saving the File with a Different Name
- 3. Downloading Files in the Background
- 4. Limiting Download Speed
- 5. Resuming a Download
- 6. Downloading Multiple Files
- 7. Downloading an Entire Website
- 8. Downloading Files via FTP
- 9. Handling SSL/TLS Certificates
- 10. Setting User-Agent
- 11. Logging Output
- 12. Time-Stamping
- 13. Recursive Downloading
- 14. Excluding Certain Files
- 15. Setting a Timeout
- 16. Using a Proxy
- 17. Downloading with Authentication
- 18. Checking for Broken Links
- 19. Downloading in Quiet Mode
- 20. Displaying Version Information
- Conclusion
The wget command is one of the most powerful and versatile tools available in Linux for downloading files from the internet. Whether you’re a system administrator, a developer, or just a casual Linux user, understanding how to use the wget command can significantly enhance your productivity.
This article will provide a comprehensive guide on how to use the wget command in Linux, covering its syntax, options, and practical examples.
Introduction to wget
wget stands for “web get” and is a non-interactive command-line utility that allows you to download files from the web. It supports HTTP, HTTPS, and FTP protocols, and can also retrieve files recursively from directories. Unlike other download managers, wget is designed to work in the background, making it ideal for scripting and automated tasks.
Installation of wget
Most Linux distributions come with wget pre-installed. However, if it’s not available on your system, you can easily install it using your package manager.
- Debian/Ubuntu-based systems:
sudo apt-get install wget
- Red Hat/CentOS/Fedora-based systems (newer releases use dnf in place of yum):
sudo yum install wget
- Arch Linux:
sudo pacman -S wget
Once installed, you can verify the installation by typing wget --version in the terminal.
Basic Syntax of wget
The basic syntax of the wget command is as follows:
wget [options] [URL]
- [options]: The various flags and parameters that modify the behavior of wget.
- [URL]: The web address of the file or directory you want to download.
How to Use the wget Command in Linux
1. Downloading a Single File
The simplest use of wget is to download a single file from a specified URL. The command format is:
wget https://example.com/file.zip
This downloads file.zip from the given URL and saves it in the current working directory. If the download is interrupted, wget can automatically retry until the file is completely downloaded. Additionally, if a file with the same name already exists, wget saves the new copy under a numbered name (such as file.zip.1) rather than overwriting it, unless you pass options such as -nc (skip existing files) or -O (write to a specific name).
This command is widely used for retrieving files from remote servers, and is especially useful for downloading software packages, scripts, or documents directly to a Linux system. Because wget supports multiple protocols, including HTTP, HTTPS, and FTP, and because downloads are easy to automate, it is an ideal tool for system administrators and developers who need to retrieve and update files regularly.
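In a script, it is good practice to check wget's exit status before relying on the downloaded file. Here is a minimal sketch, assuming a placeholder URL:

#!/bin/bash
# Hypothetical example: download a file and verify that wget succeeded.
URL="https://example.com/file.zip"
if wget "$URL"; then
    echo "Download completed: $(basename "$URL")"
else
    echo "Download failed" >&2
    exit 1
fi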
2. Saving the File with a Different Name
By default, wget saves files with their original names. To rename a file during download, use:
wget -O myfile.zip https://example.com/file.zip
This command saves the downloaded file as myfile.zip, which is useful when downloading multiple files with similar names. This feature is beneficial when dealing with long, auto-generated file names, as it allows users to assign more readable and manageable names for easy reference.
Renaming files while downloading also helps when handling versioned downloads. For example, if multiple versions of a software package are being retrieved, assigning a meaningful name can help distinguish between them. Additionally, this option prevents conflicts when saving files in directories where files with similar names already exist. It also simplifies scripting, as you can ensure consistent file names across different downloads without manual renaming.
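As an illustration, a sketch like the following (the version number, URL, and query parameter are hypothetical) gives versioned downloads predictable names:

# Hypothetical versioned download: save each release under a consistent name.
VERSION="2.1.0"
wget -O "myapp-${VERSION}.tar.gz" "https://example.com/releases/download?v=${VERSION}"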
3. Downloading Files in the Background
If you’re downloading large files and want to continue working in the terminal, use:
wget -b https://example.com/largefile.zip
This command runs wget in the background. You can check the download progress by viewing the wget-log file. This feature is useful when handling large downloads on slow connections, as it allows an uninterrupted workflow while the download continues independently.
Background downloading is especially beneficial in remote server environments where maintaining an open session for an extended period is not feasible. By running wget in the background, users can initiate a download and disconnect from the server without affecting the process. This feature is commonly used in automated server scripts where multiple downloads need to be initiated without requiring active user intervention.
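A typical background workflow might look like the following; the log file name is an arbitrary choice passed via -o (without it, wget writes to wget-log):

# Start the download in the background, logging to a named file.
wget -b -o bigfile.log https://example.com/largefile.zip
# Check progress at any time without interrupting the download.
tail -f bigfile.log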
4. Limiting Download Speed
To prevent wget from consuming excessive bandwidth, use:
wget --limit-rate=100k https://example.com/largefile.zip
This restricts the download speed to 100 KB/s, helping maintain overall network performance. This is especially useful in shared networks where an unrestricted download could disrupt other users’ internet activity.
Limiting download speed is particularly helpful in enterprise environments, where network resources are shared among multiple users. It ensures that downloading large files does not interfere with other critical network activities, such as video conferencing or online collaboration. Additionally, this feature is useful for users on metered internet connections who need to control their data usage.
5. Resuming a Download
If a download is interrupted, resume it using:
wget -c https://example.com/largefile.zip
This resumes the download from where it left off, avoiding redundant data transfer. This is particularly helpful for large files or unstable connections, ensuring minimal data wastage.
Resuming downloads is useful when dealing with unreliable networks, such as mobile or satellite internet connections. Many download managers include similar functionality, but wget's built-in resume feature makes it particularly valuable in command-line environments. It ensures that even if a connection is lost, the download can continue without restarting from scratch.
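On very unreliable links, -c is often combined with wget's retry options; a sketch:

# Resume partial data (-c), retry indefinitely (--tries=0),
# and keep retrying even if the server refuses the connection.
wget -c --tries=0 --retry-connrefused https://example.com/largefile.zip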
6. Downloading Multiple Files
To download multiple files, list their URLs in a text file and use:
wget -i filelist.txt
Each URL should be on a separate line in filelist.txt, allowing efficient batch downloads. This is helpful when you need to download a series of files quickly without manually entering each URL.
Batch downloading is particularly useful in data collection tasks, where multiple datasets or reports need to be retrieved simultaneously. Researchers, developers, and system administrators use this option to streamline the download process, reducing the manual effort required to fetch multiple files. The ability to automate bulk downloads improves efficiency and reduces the risk of human error when dealing with large numbers of files.
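As a concrete example, a filelist.txt with hypothetical URLs might contain:

https://example.com/report-january.pdf
https://example.com/report-february.pdf
https://example.com/report-march.pdf

Running wget -i filelist.txt then fetches each file in turn.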
7. Downloading an Entire Website
Downloading an entire website using wget is useful for offline browsing, website archiving, or research. The following command:
wget --mirror --convert-links --adjust-extension --page-requisites --no-parent https://example.com
performs several tasks at once:
- --mirror downloads the website recursively while maintaining the correct directory structure.
- --convert-links rewrites links so they work offline.
- --adjust-extension ensures files are saved with the correct extension.
- --page-requisites downloads all resources, such as images, CSS, and JavaScript files, necessary for proper display.
- --no-parent prevents wget from moving up the directory tree, ensuring it only downloads from the specified URL.
This method is helpful when internet access is unreliable or when studying web design without affecting live sites. However, it’s essential to check a website’s robots.txt and terms of service to avoid violating policies. Additionally, large sites with dynamic content (e.g., JavaScript-heavy applications) may not be fully replicated using wget.
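When mirroring is permitted, it is courteous to throttle the crawl so the server is not overloaded; a sketch with arbitrary values:

# Mirror the site, pausing 1 second between requests and capping speed.
wget --mirror --convert-links --adjust-extension --page-requisites \
     --no-parent --wait=1 --limit-rate=200k https://example.com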
8. Downloading Files via FTP
The File Transfer Protocol (FTP) is commonly used for downloading and uploading files to remote servers. wget allows downloading files via FTP, even when authentication is required:
wget ftp://username:password@ftp.example.com/file.zip
This command logs into the FTP server with the specified credentials and retrieves file.zip. If authentication is not needed, users can omit the username:password@ part.
FTP downloads are especially useful for accessing large datasets, software updates, or backup files. System administrators and developers often rely on FTP to transfer files efficiently. However, since FTP transmits data in plain text, it poses security risks. Using SFTP (Secure File Transfer Protocol) or FTPS (FTP Secure with SSL/TLS) is recommended when dealing with sensitive data.
Another useful feature of wget is its ability to handle FTP directories recursively with -r, enabling the retrieval of multiple files at once. Automating FTP downloads with scripts can streamline workflows in environments where file updates are frequent.
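For example, a recursive FTP fetch of an entire directory might look like this (the server name and path are placeholders):

# Recursively download everything under /pub/datasets on the FTP server.
wget -r ftp://ftp.example.com/pub/datasets/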
9. Handling SSL/TLS Certificates
Secure websites use SSL/TLS certificates to encrypt data transmission. However, if a website has a self-signed or expired certificate, wget may refuse to download files, displaying an error. To bypass this check, use:
wget --no-check-certificate https://example.com/file.zip
This forces wget to ignore SSL validation and proceed with the download. While this is helpful for internal servers, test environments, or non-critical downloads, it introduces a security risk: ignoring certificate validation means attackers could intercept or modify the data in a man-in-the-middle attack.
For safer alternatives, consider:
- Adding the certificate manually to the system’s trusted list.
- Using cURL with --insecure if wget isn’t required.
- Verifying the certificate manually before disabling checks.
It is best to use --no-check-certificate only when necessary and for non-sensitive downloads. If dealing with secure transactions or confidential data, always ensure proper SSL/TLS configurations are in place.
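If you have a copy of the server's certificate, wget can also be pointed at it directly instead of disabling validation; a sketch with a hypothetical certificate path and host:

# Trust a specific CA certificate for this download only.
wget --ca-certificate=/path/to/internal-ca.pem https://internal.example.com/file.zip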
10. Setting User-Agent
Some websites block automated downloads by detecting the default wget user agent. To bypass this, specify a custom user-agent string that mimics a web browser:
wget --user-agent="Mozilla/5.0" https://example.com/file.zip
This makes wget appear as a normal browser (like Google Chrome or Firefox), allowing it to access content that might otherwise be restricted.
Websites often block wget to prevent scraping, automated downloads, or server overload. By modifying the user-agent, users can download resources without triggering restrictions. However, it’s important to use this responsibly and adhere to robots.txt policies.
Other ways to avoid detection include:
- Using a different browser’s user-agent string.
- Rotating user-agents dynamically using scripts.
- Using proxies to prevent IP-based blocking.
Many scraping tools incorporate these tactics, but misusing them may result in legal or ethical issues. Always respect website rules before automating downloads.
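For reference, a fuller user-agent string than the bare "Mozilla/5.0" shown above mimics a complete browser signature; the string below is just an example of a desktop Chrome signature:

wget --user-agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" https://example.com/file.zip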
11. Logging Output
When running wget in automated scripts or scheduled tasks, saving logs helps track downloads and troubleshoot issues. To log output:
wget -o logfile.txt https://example.com/file.zip
This command saves all status messages, errors, and progress updates to logfile.txt, making it easier to review later. This is useful for:
- Debugging failed downloads (e.g., network failures).
- Keeping records of automated downloads for auditing.
- Tracking file changes over time.
For less verbose logging, use -nv (no-verbose mode), which records a brief summary line per download instead of full progress output:
wget -nv -o logfile.txt https://example.com/file.zip
(Note that -q suppresses output entirely, so combining it with -o would leave the log essentially empty.)
Alternatively, for detailed logging, use -d (debug mode), which provides additional troubleshooting information.
Using logs is a best practice in automation and system administration, especially when wget is used in cron jobs or scripts handling large datasets.
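For instance, a cron entry might append each run's output to a single log with -a (append) rather than -o, which overwrites; the schedule and paths here are illustrative:

# Hypothetical crontab entry: fetch a report at 2 a.m. daily, appending to one log.
0 2 * * * wget -a /var/log/nightly-download.log https://example.com/report.csv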
12. Time-Stamping
Downloading a file repeatedly wastes bandwidth if the file hasn’t changed. To avoid unnecessary downloads, enable timestamping with:
wget -N https://example.com/file.zip
This tells wget to check the file’s last modified date and only download it if a newer version is available. It prevents redundant downloads and conserves bandwidth, making it ideal for:
- Updating software packages without re-downloading existing files.
- Synchronizing files across multiple systems.
- Minimizing storage usage.
If the server does not provide a Last-Modified timestamp, wget may not detect changes correctly. Combining -N with recursive downloading (-r) ensures entire directories stay updated efficiently.
Automating time-stamped downloads is common in backup systems, web scraping, and data mirroring. Pairing wget with cron jobs can make updates fully automatic.
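A sketch of such an automated update, with a hypothetical schedule and target directory:

# Hypothetical crontab entry: check hourly, downloading only if the file changed.
0 * * * * wget -N -P /srv/mirror https://example.com/file.zip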
13. Recursive Downloading
When you need to download multiple files or entire directories, use recursive mode:
wget -r https://example.com/directory/
This retrieves all files and subdirectories under the given URL, following internal links up to a default depth of 5. It is useful for:
- Downloading datasets from web servers.
- Mirroring directories for offline access.
- Fetching reports, logs, or structured data.
To limit recursion depth, use -l (level):
wget -r -l 2 https://example.com/directory/
This restricts downloads to two levels of subdirectories.
For downloading only specific file types, use --accept:
wget -r --accept="*.pdf" https://example.com/reports/
This fetches only PDF files, reducing unnecessary downloads.
14. Excluding Certain Files
Sometimes, you want to avoid downloading unnecessary files, such as large archives, videos, or executables. To exclude specific file types:
wget -r --reject="*.zip" https://example.com/directory/
This prevents wget from downloading ZIP files while retrieving everything else. This is useful when:
- Filtering downloads by file type (e.g., avoiding EXE, MP4, or large ISO files).
- Speeding up downloads by omitting unnecessary data.
- Saving storage space on limited devices.
You can also reject multiple file types:
wget -r --reject="*.zip,*.mp4,*.exe" https://example.com/directory/
For advanced filtering, use regex patterns with --reject-regex, enabling more precise control over excluded files.
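For example, a single regular expression can replace the comma-separated list above; the pattern here is illustrative:

# Reject any URL ending in .zip, .mp4, or .exe with one regex.
wget -r --reject-regex ".*\.(zip|mp4|exe)$" https://example.com/directory/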
15. Setting a Timeout
When downloading files from slow or unresponsive servers, wget may hang indefinitely. To avoid this, set a timeout using:
wget -T 30 https://example.com/file.zip
This command sets a 30-second timeout, meaning wget will wait that long for a server response before aborting the connection.
Timeouts are essential in scenarios where:
- A website is slow to respond due to high traffic.
- You are downloading multiple files and don’t want one slow file to hold up the process.
- You need to prevent stalled downloads in automated scripts.
You can also set timeouts for DNS resolution (--dns-timeout), connection attempts (--connect-timeout), and read operations (--read-timeout). For example:
wget --connect-timeout=20 --read-timeout=40 https://example.com/file.zip
This approach ensures downloads are efficient and prevents wget from waiting indefinitely for an unresponsive server.
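Timeouts pair naturally with retry limits so that a script fails fast instead of hanging; a sketch with arbitrary values:

# Abort any single attempt after 30 seconds, retry at most 3 times,
# waiting up to 10 seconds between retries.
wget -T 30 --tries=3 --waitretry=10 https://example.com/file.zip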
16. Using a Proxy
In corporate or restricted environments, internet access is often routed through a proxy server. To download files using a proxy, use:
wget -e use_proxy=yes -e http_proxy=http://proxy.example.com:8080 https://example.com/file.zip
This tells wget to route its request through proxy.example.com on port 8080.
Proxy servers are commonly used for:
- Bypassing network restrictions (e.g., workplace firewalls).
- Caching frequently accessed content to improve load speeds.
- Enhancing security by filtering and monitoring traffic.
If authentication is required, specify the credentials:
wget --proxy-user=username --proxy-password=password https://example.com/file.zip
Alternatively, configure proxy settings in the wget configuration file (.wgetrc) to avoid manually entering them every time.
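A minimal ~/.wgetrc for this might look like the following, with a placeholder proxy address:

# ~/.wgetrc (hypothetical proxy settings)
use_proxy = on
http_proxy = http://proxy.example.com:8080/
https_proxy = http://proxy.example.com:8080/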
For environments without a proxy, disable it using:
wget --no-proxy https://example.com/file.zip
This ensures wget connects directly to the internet instead of using a proxy.
17. Downloading with Authentication
Some files are stored behind password-protected areas, requiring user authentication. To access such files, use:
wget --http-user=username --http-password=password https://example.com/file.zip
This sends the credentials as part of the HTTP request.
This is useful for:
- Downloading files from restricted websites (e.g., membership portals, company intranets).
- Accessing private API endpoints that require authentication.
- Automating secure file downloads for scripts.
For FTP servers, the authentication method is slightly different:
wget ftp://username:password@ftp.example.com/file.zip
However, sending credentials in plain text is a security risk. A safer approach is to store credentials in a .netrc file or use an OAuth/token-based authentication system when possible.
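A minimal ~/.netrc entry might look like this (the credentials are placeholders); wget reads it automatically when no credentials are given on the command line:

machine ftp.example.com
login username
password password

Restrict the file with chmod 600 ~/.netrc so other users cannot read it.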
18. Checking for Broken Links
To check for broken links on a website, use:
wget --spider -r https://example.com
This tells wget to crawl the site recursively like a search engine but without downloading files. The --spider flag makes wget behave like a web crawler, only testing for broken links instead of saving data.
This is useful for:
- Website maintenance (detecting dead pages, 404 errors).
- Ensuring links in large projects work correctly before publishing.
- Checking external links to ensure they’re still valid.
To check only specific pages:
wget --spider https://example.com/page.html
For logging results, save the output with -o; wget writes its status messages to standard error, so a plain > redirect would capture nothing:
wget --spider -r -o broken_links.log https://example.com
This helps webmasters and developers identify and fix broken URLs quickly.
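Once the crawl finishes, failed requests can be pulled out of the log; a sketch, assuming the server reports missing pages as 404 Not Found:

# Show each request that came back 404, with two lines of context.
grep -B 2 "404 Not Found" broken_links.log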
19. Downloading in Quiet Mode
By default, wget displays progress updates and messages on the terminal, which can be distracting in scripts. To suppress output, use:
wget -q https://example.com/file.zip
This downloads the file without printing messages, making it ideal for:
- Automated scripts that don’t require logs.
- Reducing terminal clutter when downloading large files.
- Running wget in the background without unnecessary notifications.
For completely silent downloads, use -q with -O (output file):
wget -q -O output.zip https://example.com/file.zip
If you still want error messages but no status updates, use --no-verbose:
wget --no-verbose https://example.com/file.zip
This is useful for scenarios where only failures need to be logged while suppressing normal output.
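In scripts, quiet mode is typically paired with an exit-status check so that failures are still caught; a minimal sketch:

# Download silently; report only on failure.
if ! wget -q -O output.zip https://example.com/file.zip; then
    echo "Download failed" >&2
fi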
20. Displaying Version Information
To check which version of wget is installed, use:
wget --version
This displays information such as:
- wget version number
- Build date
- Supported features (SSL, IPv6, HTTP/2)
Checking the version is useful when:
- Troubleshooting issues (ensuring compatibility with scripts).
- Verifying SSL/TLS support for secure downloads.
- Checking for updates (to access new features or bug fixes).
If you need to update wget, use your system’s package manager:
sudo apt update && sudo apt install wget # Debian/Ubuntu
sudo yum install wget # RHEL/CentOS
brew install wget # macOS (Homebrew)
Keeping wget updated ensures better security, stability, and compatibility with modern web technologies.
Conclusion
The wget command is an indispensable tool for anyone who needs to download files from the internet in a Linux environment. Its versatility and wide range of options make it suitable for a variety of tasks, from simple file downloads to complex website mirroring. By mastering the wget command, you can streamline your workflow and handle downloads more efficiently.
In this article, we’ve covered the basics of how to use the wget command in Linux, including downloading single and multiple files, resuming interrupted downloads, limiting download speeds, and more. With this knowledge, you should be well-equipped to leverage the full potential of wget in your daily tasks.
Whether you’re a beginner or an experienced Linux user, the wget command is a valuable addition to your toolkit. The next time you need to download files from the web, remember to use wget and take advantage of its powerful features.
For more details, check out the official GNU wget documentation, which covers options and examples for downloading files, mirroring websites, and automating tasks in Linux.