View Full Version : Incomplete downloads


Yuriy
10-28-2003, 05:51 AM
Hi, guys.

I use several log analyzers, such as Weblog Expert 2 and 123loganalyzer. Both show an "Incomplete downloads" field.
I read that incomplete downloads can also be caused by the download managers people use. I get a lot of incomplete downloads, but judging by the total bandwidth transferred, it's impossible that so many were incomplete. How do you deal with this kind of thing? Is there a good log analyzer out there that gives a better report?

I do not think my host has problems though... I have 10 Mbit/s connectivity and have never experienced any problems when testing it myself.

Any advice?

RomanV
10-28-2003, 06:20 AM
One good way: add up the number of bytes transferred for the program's zip file and divide that total by the size of the file. Or read here:
http://www.abacre.com/ala/manual/shareware_authors.htm
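
The bytes-divided-by-size estimate can be sketched in a few lines of Python. This is only an illustration under some assumptions: the log is in the Apache common/combined format, and the function name, path and file size are made up for the example.

```python
import re

# Assumption: Apache common/combined log format. Adjust the pattern for other servers.
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\d+|-)'
)


def estimate_downloads(lines, path, file_size):
    """Return total bytes sent for `path` divided by the file size.

    The result is an estimate of the number of full downloads, not an exact count.
    """
    total = 0
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that do not look like access-log entries
        method, url, status, sent = m.group(3), m.group(4), m.group(5), m.group(6)
        # 200 = whole file requested, 206 = partial content (download managers)
        if method == "GET" and url == path and status in ("200", "206") and sent != "-":
            total += int(sent)
    return total / file_size
```

Usage would be something like `estimate_downloads(open("access.log"), "/file.zip", 1048576)`, where the log name, path and size are placeholders for your own values.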

dburger
10-28-2003, 07:24 AM
I'm far from an expert at this, but I'll chime in with my observations. I've hosted our game download with a number of sources and there is always a large difference between the number of downloads started and the downloads completed. I think there are two main culprits.

1. People click the download link, but when they get it going and see how long it is going to take, they cancel it.

2. Download managers. These programs can, among other things, download the file in pieces and then reassemble them. I've seen a single user in my logs start dozens of downloads within seconds, but they don't finish, as each one downloads only part of the file.

I used to worry about this, but I've never had a complaint about downloading problems, and no matter what host I use, there are always a fair number of unfinished downloads. I think that in most cases there is not really a technical problem with the download.

I've used about 4 different web log analysers and haven't found one I really like yet, but almost all of them will give you enough info to see what's going on.

-Denis

Matthijs Hollemans
10-28-2003, 07:59 AM
(Apparently my first reply got lost in the voids of cyberspace, so here it is again.)

This should be a FAQ ;-)

When someone uses a download manager, it typically downloads one or more chunks of the file at once. Every chunk ends up on a separate line in the log file for that day. If you open the log file with a text editor, you will usually see a bunch of similar log lines (all from the same host) in a row.

You can write a simple script (for example, using Perl) to count the real number of downloads by more-or-less following the algorithm below.

We will keep a list of records. Each record describes a download attempt made by a particular host in the 'past hour' (relative to the time of the current log line). Each record keeps track of how many bytes were transferred and the time of the most recent log line for this host (so we can throw away records that are 'too old').

These records are used to add up partial downloads that occurred within short time spans. If within the hour enough bytes have been transferred, we count this as one successful download. If too few bytes were transferred, this download attempt is considered a failure.

The steps are something like this. Repeat for every log line from every log file:

1) read the next log line
2) is this a request for the download (e.g. 'GET /file.exe')?
3) is the number of bytes transferred less than the size of the .exe?
4) then find the record for this host in the list; if no record is found, make a new one
5) set the record's time to the time from this log line
6) add the transferred size to the record's size counter
7) look at all the other records in the list to determine if they are too old

This last step is where we decide whether a record represents a broken or a managed download. If the total bytes transferred is too little, it counts as a broken download, otherwise it is a 'managed' download. Records that are too old are erased from the list. (Note: this algorithm as I have presented it here doesn't count successful 'single' downloads, but that is trivial to add.)
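
For anyone who would rather start from Python than Perl, the steps above could look roughly like the sketch below. It is not a tested production script. It assumes the Apache common/combined log format with English month names, lines in chronological order, and a one-hour window; the function and variable names are invented for the example. It also counts 'single' full downloads, and it expires old records before updating the current host's record so that a stale attempt is not merged into a new one.

```python
import re
from datetime import datetime, timedelta

# Assumption: Apache common/combined log format, lines in chronological order.
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\d+|-)'
)
WINDOW = timedelta(hours=1)  # the 'past hour' from the description


def count_downloads(lines, path, file_size):
    records = {}  # host -> (bytes so far, time of most recent log line)
    counts = {"single": 0, "managed": 0, "broken": 0}

    def close(host):
        # Decide whether an expired record was a managed or a broken download.
        total, _ = records.pop(host)
        counts["managed" if total >= file_size else "broken"] += 1

    for line in lines:                                   # step 1
        m = LOG_LINE.match(line)
        if not m:
            continue
        host, stamp, method, url, status, sent = m.groups()
        if method != "GET" or url != path:               # step 2
            continue
        if status not in ("200", "206") or sent == "-":
            continue
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")

        # Step 7, done first: close every record that is too old.
        for old in [h for h, (_, last) in records.items() if when - last > WINDOW]:
            close(old)

        size = int(sent)
        if size >= file_size:                            # step 3
            counts["single"] += 1                        # whole file in one request
            continue
        total, _ = records.get(host, (0, when))          # step 4
        records[host] = (total + size, when)             # steps 5 and 6

    for host in list(records):                           # flush what is left at the end
        close(host)
    return counts
```

Hosts are keyed by IP address here, so several people behind one proxy end up in the same record.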

This approach isn't perfect. It may even be considered overkill.
Using the numbers from my own log files, I could just as well assume that 20% of my downloads fail or are cancelled prematurely. About 15% of downloaders use a download manager tool. (Your mileage may vary.)

(By the way, this kind of functionality wouldn't be too hard to add to most existing log analyzer tools, because they already count 'visitors'.)

filekicker
10-31-2003, 09:30 AM
Originally posted by Matthijs Hollemans
1) read the next log line
2) is this a request for the download (e.g. 'GET /file.exe')?
3) is the number of bytes transferred less than the size of the .exe?
4) then find the record for this host in the list; if no record is found, make a new one
5) set the record's time to the time from this log line
6) add the transferred size to the record's size counter
7) look at all the other records in the list to determine if they are too old

FileKicker has a completed download report that is generated in a similar fashion. Our system is a little more sophisticated because every download gets a unique ID. It prevents people behind proxy servers and NAT gateways from being undercounted.

For the most part, completed download rates for 1-5 MB files are about 75%. If you use pop-up advertising, start downloads when the user isn't expecting it (META-REFRESH tags), or have aggressive affiliates, the completed download rate may be lower.

Michael
FileKicker Support
support@filekicker.com
http://www.filekicker.com/