SEM Labs

Handcrafted Pixels, Code & Title Tags

Multi-threaded Stack Class for PHP

This is a PHP class to quickly download single URLs and stacks of URLs using multiple threads. It uses the OO cURL wrapper.

One of the most common URL-related tasks is to build stacks of URLs (multi-dimensional arrays) to be downloaded and stored. The main purpose of the class is to provide a simple way to do this.

Downloading a Single URL

A single URL is downloaded with the get method of the cURL Request class.

Arguments

  1. The URL to download
  2. An array of cURL options (see the PHP reference for a list of options)

Returns

An associative array containing the response of the request and information about the request.
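As a rough illustration of the call shape described above, here is a minimal sketch built on PHP's own cURL functions rather than on the class itself. The function name `fetchUrl` and the `content`, `info` and `error` keys are assumptions made for this example; the class's real method and array keys may differ.

```php
<?php
// Hypothetical helper (not part of the class): downloads one URL with plain cURL.
function fetchUrl(string $url, array $options = []): array
{
    $ch = curl_init();

    // The URL and RETURNTRANSFER are forced; any other option comes from the caller.
    // With the + operator, the left-hand array wins for duplicate keys.
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
    ] + $options);

    $content = curl_exec($ch);    // response body as a string, or false on failure
    $info    = curl_getinfo($ch); // HTTP code, timings, sizes, etc.
    $error   = curl_error($ch);   // empty string when the transfer succeeded

    return ['content' => $content, 'info' => $info, 'error' => $error];
}
```

A call such as `fetchUrl('http://example.com/', [CURLOPT_TIMEOUT => 30])` would return the page body under `content` and the transfer details (for example `http_code`) under `info`.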

Downloading Stacks of URLs

An array of URLs is downloaded with the getThreaded method. For example, with 13 URLs and 5 threads, 5 URLs are downloaded at a time, in 3 stacks (5, 5 and 3 URLs).

Arguments

  1. An array of URLs
  2. An array of cURL options (see the PHP reference for a list of options)
  3. The number of threads to use, i.e. the number of concurrent downloads

The number of threads can also be set using the $threads class variable.

Returns

An associative array containing the content of each URL and its request info. Note that each result is returned under the same key that was used in the array of URLs passed to getThreaded. This lets you keep track of requests. For example, if you are downloading URLs from a database, you can use the key to keep track of the record ID.
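To make the stack and key behaviour concrete, here is a minimal sketch using PHP's `curl_multi_*` functions, which run several transfers concurrently. It is not the class's getThreaded implementation; the function name `fetchStacks` and the `content` / `info` keys are assumptions for illustration.

```php
<?php
// Hypothetical helper (not the class's getThreaded): downloads URLs in
// stacks of $threads concurrent transfers and keeps the caller's array keys.
function fetchStacks(array $urls, array $options = [], int $threads = 5): array
{
    $results = [];

    // preserve_keys = true, so each result can be stored under its original key
    foreach (array_chunk($urls, max(1, $threads), true) as $stack) {
        $mh      = curl_multi_init();
        $handles = [];

        foreach ($stack as $key => $url) {
            $ch = curl_init();
            curl_setopt_array($ch, [
                CURLOPT_URL            => $url,
                CURLOPT_RETURNTRANSFER => true,
            ] + $options);
            curl_multi_add_handle($mh, $ch);
            $handles[$key] = $ch;
        }

        // Drive every transfer in this stack until none is still running.
        do {
            $status = curl_multi_exec($mh, $running);
            if ($running > 0 && curl_multi_select($mh, 1.0) === -1) {
                usleep(1000); // select can fail immediately; avoid a busy loop
            }
        } while ($running > 0 && $status === CURLM_OK);

        foreach ($handles as $key => $ch) {
            $results[$key] = [
                'content' => curl_multi_getcontent($ch),
                'info'    => curl_getinfo($ch),
            ];
            curl_multi_remove_handle($mh, $ch);
        }
        curl_multi_close($mh);
    }

    return $results;
}
```

If the input array is keyed by database record ID, for example `[17 => 'http://example.com/a', 42 => 'http://example.com/b']`, the results come back under keys 17 and 42.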

Optimal Threads

When you are downloading thousands of URLs or more, it is worth finding the optimal number of threads to reduce the execution time of your script. As a rule of thumb, I find it best to use one thread for every 5-10 kbps of bandwidth when downloading HTML. The optimal number of threads will depend on what you are downloading, though.
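The rule of thumb above can be turned into a starting range to experiment with. This is only a sketch of the arithmetic; the function name `estimateThreadRange` is hypothetical, and the result is a first guess to be tuned by timing real runs.

```php
<?php
// Hypothetical helper: applies the "one thread per 5-10 kbps" rule of thumb.
function estimateThreadRange(float $bandwidthKbps): array
{
    $min = max(1, (int) floor($bandwidthKbps / 10)); // one thread per 10 kbps
    $max = max(1, (int) floor($bandwidthKbps / 5));  // one thread per 5 kbps

    return ['min' => $min, 'max' => $max];
}
```

For 100 kbps of available bandwidth this gives a range of 10 to 20 threads.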

Comments

Jez Replied at 4:36 PM on 30 Jun 2009

Nice one thanks for this, I had been meaning to ask you how you did this. I remember you saying you could thread with CURL but I had not seen it done.

Threading is dreadful in PHP... but the main thing I seem to want to thread is CURL, so this works out nicely... I guess that's why they built it in.

Adam Replied at 10:22 PM on 27 Aug 2009

The class works nicely in my script, but there is one thing. When I try to process the results (let's say echo URLs under some conditions), the results appear all at once at the end. I thought it would process URLs stack by stack, not all at once.

I think the class currently gets URLs in sets of 5 or 10, or whatever the setting is, and stores the results until all stacks are done, then processes them. I think it should download a stack, process it, then download the next stack, and so on.

Please advise.

David Replied at 3:56 PM on 28 Aug 2009

Hi Adam,
This class was designed to download all the URLs you feed to it in one block. If you would like to download URLs and then process them, try the stand-alone cURL class. You can build whatever stacks you want with that and process them as needed.

Adam Replied at 10:13 AM on 28 Aug 2009

What I need is as follows:

100 URLs in total, in stacks of 5 URLs per stack

cURL multi-threaded

When the first 5 URLs arrive, process them; when the second 5 URLs arrive, process them; and so on

Can you please post an example for that using any of your 2 classes? Sorry I'm not a professional programmer yet and need some help.

Thanks

David Replied at 7:37 PM on 28 Aug 2009

Something like this will work:
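The code from this reply is not preserved here. As a rough sketch of the approach (not David's original snippet), the URLs can be split into stacks and each stack processed as soon as it has been downloaded. The function name `processInStacks` and both callables are hypothetical: `$fetchStack` stands for whatever download routine you use, and `$process` for your own handling code.

```php
<?php
// Hypothetical sketch: download and process URLs one stack at a time.
// $fetchStack takes an array of URLs and returns results under the same keys;
// $process receives the results of one finished stack.
function processInStacks(array $urls, int $stackSize, callable $fetchStack, callable $process): void
{
    // preserve_keys = true, so results can still be matched to their URLs
    foreach (array_chunk($urls, max(1, $stackSize), true) as $stack) {
        $process($fetchStack($stack));
    }
}
```

With 100 URLs and a stack size of 5, `$process` is called 20 times, once after each stack finishes downloading.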

????? ??? ????? Replied at 8:48 PM on 7 Nov 2009

The class works nicely in my script, thanks a lot.
