Multi-threaded Stack Class for PHP
This is a PHP class to quickly download single URLs and stacks of URLs using multiple threads. It uses the OO cURL wrapper.
One of the most common URI-related tasks is building stacks of URLs (multi-dimensional arrays) to be downloaded and stored. The main purpose of the class is to provide a simple way to do this.
Downloading a Single URL
A single URL is downloaded using the get method of the cURL Request class.
Arguments
- The URL to download
- An array of cURL options, see the PHP reference for a list of options
Returns
An associative array containing the response of the request and information about the request.
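As a rough illustration of what a call like this does underneath, here is a minimal sketch built on PHP's native cURL functions rather than on the class itself. The function name `fetchUrl`, the `content`/`info`/`error` keys and the example URL are assumptions made for this sketch; the class's real method and return keys may differ.

```php
<?php
// Hypothetical helper, not part of the class described above.
function fetchUrl(string $url, array $options = []): array
{
    $handle = curl_init($url);

    // Return the body as a string instead of printing it. Caller options
    // are applied afterwards so they can override this default.
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
    curl_setopt_array($handle, $options);

    $content = curl_exec($handle);            // false on failure

    // The handle is released automatically when it goes out of scope.
    return [
        'content' => $content,
        'info'    => curl_getinfo($handle),   // URL, HTTP code, timings, ...
        'error'   => curl_error($handle),     // empty string on success
    ];
}

// Example call; the URL and the timeout option are placeholders.
// $page = fetchUrl('http://www.example.com/', [CURLOPT_TIMEOUT => 10]);
// echo $page['info']['http_code'];
```

Applying the caller's options after the default means a caller can still switch `CURLOPT_RETURNTRANSFER` off if they really want the output printed.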
Downloading Stacks of URLs
An array of URLs is downloaded using the getThreaded method. For example, with 13 URLs and 5 threads, up to 5 URLs are downloaded at a time, in 3 stacks (5, 5 and 3).
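The 13-URL, 5-thread split can be pictured with `array_chunk()`. This is only a sketch of the arithmetic: whether the class uses `array_chunk()` internally is an assumption, and the example.com URLs are placeholders.

```php
<?php
// 13 placeholder URLs; example.com is used purely for illustration.
$urls = [];
for ($i = 1; $i <= 13; $i++) {
    $urls[] = "http://www.example.com/page{$i}.html";
}

$threads = 5;

// array_chunk() with preserve_keys = true keeps each URL's original key.
$stacks = array_chunk($urls, $threads, true);

echo count($stacks), " stacks\n";                    // 3 stacks
foreach ($stacks as $n => $stack) {
    echo "stack {$n}: ", count($stack), " URLs\n";   // 5, 5, then 3
}
```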
Arguments
- Array of URLs
- An array of cURL options, see the PHP reference for a list of options
- Number of threads to use, i.e. the number of concurrent downloads
The number of threads can also be set using the $threads class variable.
Returns
An associative array containing the content of each URL and its request info. Note that each result is returned under the same key that was used in the array of URLs passed to getThreaded. This allows you to keep track of requests. For example, if you are downloading URLs from a database, you can use the key to keep track of the record ID.
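To make the key-preserving behaviour concrete, here is a hedged sketch of a stack-based downloader written directly against PHP's `curl_multi_*` functions. It is not the class's actual code: the function name `fetchAll` and the `content`/`info` result keys are assumptions for illustration.

```php
<?php
// Hypothetical stand-in for getThreaded, built on PHP's curl_multi_* functions.
function fetchAll(array $urls, array $options = [], int $threads = 5): array
{
    $results = [];

    // Work through the URLs one stack at a time, keeping the caller's keys.
    foreach (array_chunk($urls, max(1, $threads), true) as $stack) {
        $multi   = curl_multi_init();
        $handles = [];

        foreach ($stack as $key => $url) {
            $handle = curl_init($url);
            curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
            curl_setopt_array($handle, $options);
            curl_multi_add_handle($multi, $handle);
            $handles[$key] = $handle;
        }

        // Drive all transfers in this stack until none are still running.
        do {
            $status = curl_multi_exec($multi, $running);
            if ($running > 0) {
                curl_multi_select($multi, 1.0);   // wait for activity
            }
        } while ($running > 0 && $status === CURLM_OK);

        // Store each response under the key its URL had in the input array.
        foreach ($handles as $key => $handle) {
            $results[$key] = [
                'content' => curl_multi_getcontent($handle),
                'info'    => curl_getinfo($handle),
            ];
            curl_multi_remove_handle($multi, $handle);
        }
        curl_multi_close($multi);
    }

    return $results;
}

// Keys could be database record IDs; these values are placeholders.
// $pages = fetchAll([42 => 'http://www.example.com/a', 57 => 'http://www.example.com/b']);
// echo $pages[57]['info']['http_code'];
```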
Optimal Threads
When you are downloading thousands of URLs or more, it is best to find the optimal number of threads to reduce the execution time of your script. I find it is best to use one thread for every 5-10 kbps of bandwidth for HTML. The optimal number of threads will depend on what you are downloading, though.
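The rule of thumb can be turned into a starting range to experiment with. The helper below is a hypothetical sketch (`suggestThreadRange` is not part of the class), and the 100 kbps figure is just an example input.

```php
<?php
// Hypothetical helper: turns the "one thread per 5-10 kbps" rule of thumb
// into a range of thread counts to try.
function suggestThreadRange(float $bandwidthKbps): array
{
    $low  = max(1, (int) floor($bandwidthKbps / 10));  // one thread per 10 kbps
    $high = max(1, (int) floor($bandwidthKbps / 5));   // one thread per 5 kbps

    return ['min' => $low, 'max' => $high];
}

$range = suggestThreadRange(100);
echo "Try between {$range['min']} and {$range['max']} threads\n";  // 10 and 20
```

Timing a sample of your URLs at a few values inside that range, for instance with `microtime(true)` before and after the download, is a reasonable way to narrow it down.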
Comments
Nice one thanks for this, I had been meaning to ask you how you did this. I remember you saying you could thread with CURL but I had not seen it done.
Threading is dreadful in PHP... but the main thing I seem to want to thread is CURL, so this works out nicely... I guess that's why they built it in.
The class works nicely in my script, but there is one thing. When I try to process the result (let's say echo URLs with some conditions), the results appear all at once at the end. I thought it would process URLs stack by stack, not all at once.
I think the class currently gets URLs in sets of 5 or 10 or whatever the setting is, stores the results until all stacks are done, and then processes them. I think it should download a stack, process it, then download the next stack, and so on.
Please advise.
Hi Adam,
This class was designed to download all the URLs you feed to it in one block. If you would like to download URLs and then process them, try the stand-alone cURL class. You can build whatever stacks you want with that and process them as needed.
What I need is as follows:
total 100 urls in stacks of 5 urls/stack
curl multi thread
first 5 urls arrive then process them and when 2nd 5 urls arrive process them and so on
Can you please post an example for that using any of your 2 classes? Sorry I'm not a professional programmer yet and need some help.
Thanks
Something like this will work:
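A minimal sketch of that stack-by-stack pattern, under stated assumptions: `processInStacks` and `$fakeDownload` are hypothetical names, and the stand-in downloader only exists so the sketch runs without network access. In real use it would be replaced by a call that downloads one stack with either class.

```php
<?php
// Hypothetical helper: download one stack, process it, then move on.
// $downloadStack receives an array of URLs (keys preserved) and must return
// an array of results under the same keys; $process gets one stack's results.
function processInStacks(array $urls, int $stackSize, callable $downloadStack, callable $process): void
{
    foreach (array_chunk($urls, max(1, $stackSize), true) as $stack) {
        $results = $downloadStack($stack);   // blocks until this stack is done
        $process($results);                  // runs before the next stack starts
    }
}

// Stand-in downloader so the sketch runs offline; swap in a real
// multi-threaded download of the stack.
$fakeDownload = function (array $stack): array {
    return array_map(function ($url) {
        return "content of {$url}";
    }, $stack);
};

// 100 placeholder URLs, keyed 1..100.
$urls = [];
for ($i = 1; $i <= 100; $i++) {
    $urls[$i] = "http://www.example.com/page{$i}.html";
}

$stacksSeen = 0;
processInStacks($urls, 5, $fakeDownload, function (array $results) use (&$stacksSeen) {
    $stacksSeen++;
    echo "stack {$stacksSeen}: ", count($results), " pages processed\n";
});
```

With 100 URLs in stacks of 5, the processing callback runs 20 times. When the script runs behind a web server, output may still be buffered until the end, so calling `flush()` after each stack may be needed to see results as they arrive.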
The class works nicely in my script, thanks a lot.