What is the best PHP website crawler class?
by Mateus Mattos - 10 months ago (2023-11-08)
Download a complete website to local storage
I must download all HTML pages, JS, CSS, and images. The downloaded HTML must be rewritten so that the JS, CSS, and image URLs point to the local copies. I also want to select which pages to download, e.g. only HTML files from the first level of the domain.
2 Recommendations
PHP Website Downloader: Download pages and files of a site to local files
by Cedric Maenetja (package author) - 6 months ago (2024-02-21)

This package, the WebsiteDownloader class, fits the specified requirements for several reasons:
- Ability to download specific assets: the `WebsiteDownloader` class can download specific types of assets, including HTML pages, JS files, CSS files, and images, so users can select the asset types they need.
- Local access to assets: after downloading, the class rewrites the URLs of JS, CSS, and image files to enable local access, so all assets are available offline and the site can be viewed without an internet connection.
- Selective downloading: users can specify which pages to download. By selecting only HTML files from the first level of the domain, they control the scope of the crawl, which avoids unnecessary downloads and reduces the amount of data transferred.

Overall, the WebsiteDownloader class aligns well with the stated requirements: it downloads specific asset types, rewrites URLs for local access, and selectively downloads HTML pages from the first level of the domain.
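For reference, the general technique these points describe (fetch a page, save its assets, rewrite the HTML so asset URLs point at local copies) can be sketched in plain PHP with `DOMDocument`. This is a minimal single-page illustration, not the `WebsiteDownloader` API; the function names (`mirrorPage`, `resolveUrl`) and the simplified URL resolution are assumptions made for the sketch.

```php
<?php
// Minimal sketch: download one page, save its CSS/JS/image assets
// into $outDir, and rewrite the HTML to reference the local files.
function mirrorPage(string $url, string $outDir): void
{
    if (!is_dir($outDir)) {
        mkdir($outDir, 0777, true);
    }

    $html = file_get_contents($url); // cURL would allow more control
    $doc = new DOMDocument();
    @$doc->loadHTML($html);          // suppress warnings on real-world HTML

    // Tag/attribute pairs that reference downloadable assets.
    $targets = [
        ['tag' => 'img',    'attr' => 'src'],
        ['tag' => 'script', 'attr' => 'src'],
        ['tag' => 'link',   'attr' => 'href'], // CSS, icons
    ];

    foreach ($targets as $t) {
        foreach ($doc->getElementsByTagName($t['tag']) as $node) {
            $src = $node->getAttribute($t['attr']);
            if ($src === '') {
                continue;
            }
            $abs   = resolveUrl($url, $src);                 // absolute asset URL
            $local = basename(parse_url($abs, PHP_URL_PATH) ?: 'asset');
            $data  = @file_get_contents($abs);
            if ($data !== false) {
                file_put_contents("$outDir/$local", $data);
                $node->setAttribute($t['attr'], $local);     // rewrite to local path
            }
        }
    }

    file_put_contents("$outDir/index.html", $doc->saveHTML());
}

// Very small resolver: handles absolute, root-relative, and
// document-relative URLs only (no "../" normalization).
function resolveUrl(string $base, string $ref): string
{
    if (preg_match('#^https?://#', $ref)) {
        return $ref;
    }
    $p    = parse_url($base);
    $root = $p['scheme'] . '://' . $p['host'];
    if ($ref[0] === '/') {
        return $root . $ref;
    }
    return $root . rtrim(dirname($p['path'] ?? '/'), '/') . '/' . $ref;
}
```

A full crawler would repeat this per page up to the chosen depth and deduplicate assets by URL; the sketch only shows the core fetch-save-rewrite step.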
by Manuel Lemos - 9 months ago (2023-11-19)

Hello Mateus,
This package does part of what you want. It can crawl a Web site, follow links, and download its content.
If I am not mistaken, it does not currently convert the links in the downloaded content so that the site can be browsed locally.
Maybe one of our colleagues can suggest another package that does everything you want.

Would you like to add the missing features yourself, or ask the author of this package to improve it so it works the way you need?