Every Saturday morning our web server automatically generates a series of reports, one per customer, showing what kind of activity your web pages had during the previous week. Each report arrives as an email message to you from "System Administrator" with the subject "Web page usage report".
Over time it has become increasingly obvious that what this report is telling you is far from obvious, so I will try to explain what it all means. This will probably be less definitive than you would like, since the data itself is somewhat ambiguous.
For example, you should only get this report if your pages have seen activity. This would seem to imply that if you have no web pages, you'll never get the report, except that sometimes you might get one anyway. This is because every customer always has at least one page (an "index page" listing what other pages you have). It might be a blank page, but it's there. So if anyone looks at it, then the following Saturday morning you'll get a report telling you how much activity your blank page got.
The two key concepts for understanding the web page usage report are web objects and hits.
A web object is any sort of file that you put in your web space directory on our server. This includes web pages, of course, but it also includes images, WAV files, MIDI songs, and anything else that can be fetched from your web space.
A hit is one occurrence of some computer somewhere requesting one of your web objects. This is not the same as a "page view", which would be one person looking at one of your pages once. The two differ because of the miracles of modern networking. There are automated programs out there (including the recent versions of most browsers) that can grab web objects for later viewing "just in case", which generates hits without page views. Also, caching is a very widely used technique that can let a person view a page repeatedly while generating only a single hit. Hits are related to page views, but only roughly.
There are some other complicating factors as well, some of which I will deal with in the appropriate section below, but for now let's get on with explaining the web page usage report.
The report is in three sections. The first section, titled "Web page usage this week", gives the hit counts for each of your web objects that was accessed during the previous week. Web objects that were not hit are not included. It looks something like this:
Web page usage this week:
Hits  Page
====  ====
 101  / (your main page)
  35  /fashion.html
  22  /hots.html
  84  /i/hotlist.gif
   8  /spam/LocalIP.html
   6  /spam/RelayTo.html
 816  Total hits
 712  Through www.esva.net
 104  Through www.yourdomain.com
Your site was ranked #10 this week!
This is hopefully easy to understand. It is simply a list of how many hits each web object received, ending in a total number of hits for all web objects.
If you have a virtual domain (as in this example), then the report will also include subtotals showing how many hits came through your virtual domain, and how many came through the esva.net domain. By this I mean that if someone loads your main page as "http://www.esva.net/~yourid/" then it came through the esva.net domain, but if they load your main page as "http://www.yourdomain.com/" then it came through your domain.
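To make the tally concrete, here is a minimal Python sketch of the counting described above. It is only an illustration, not the script our server runs: the function name `tally_hits` and the input layout (one `(host, path)` pair per hit, with the path already made relative to your web space) are assumptions made for this example.

```python
from collections import Counter

def tally_hits(requests):
    """Count hits per web object and per host name the request came through.

    `requests` is an iterable of (host, path) pairs, one per hit.
    This layout is hypothetical; real server logs look different.
    """
    per_object = Counter()
    per_host = Counter()
    for host, path in requests:
        per_object[path] += 1          # one hit for this web object
        per_host[host.lower()] += 1    # subtotal for the domain used
    return per_object, per_host

# Made-up sample data, not taken from a real report.
requests = [
    ("www.esva.net", "/"),
    ("www.esva.net", "/fashion.html"),
    ("www.yourdomain.com", "/"),
    ("www.esva.net", "/i/hotlist.gif"),
]
per_object, per_host = tally_hits(requests)
for path, hits in per_object.most_common():
    print(f"{hits:4d}  {path}")
print(f"{sum(per_object.values()):4d}  Total hits")
for host, hits in per_host.items():
    print(f"{hits:4d}  Through {host}")
```

Because every hit is counted once per object and once per host, the per-domain subtotals always add up to the total.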
If you are one of the lucky ten who make it into our top ten list for the week, then this will also be noted. To figure out the top ten list, we find each customer's most-hit web page (not including images, sound files, et cetera), and then rank the customers by that page's hit count. In other words, the ranking is based on each customer's most popular page.
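The ranking rule can be sketched in a few lines of Python. Treat this as a rough illustration under stated assumptions: the list of file suffixes that count as "not a page", the function names, and the handling of ties (ties keep their input order here) are all guesses made for the example, not a description of the real script.

```python
# Suffixes treated as images or sounds rather than pages (an assumed list).
NON_PAGE_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".wav", ".mid", ".midi", ".au")

def best_page_hits(hits_by_object):
    """Return the hit count of the most popular page, ignoring images and sounds."""
    page_hits = [
        hits
        for path, hits in hits_by_object.items()
        if not path.lower().endswith(NON_PAGE_SUFFIXES)
    ]
    return max(page_hits, default=0)   # 0 if the customer has no page hits

def top_ten(customers):
    """Rank customers by their single most popular page and keep the first ten.

    `customers` maps a customer id to a {path: hits} dictionary.
    """
    scores = {cust: best_page_hits(objs) for cust, objs in customers.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:10]

# Made-up sample data: alice's image has the most hits, but it does not count.
customers = {
    "alice": {"/": 101, "/i/hotlist.gif": 500},
    "bob": {"/": 120, "/links.html": 7},
}
for rank, (cust, hits) in enumerate(top_ten(customers), start=1):
    print(f"#{rank} {cust} ({hits} hits on best page)")
```

Note that a customer with many moderately popular pages can rank below one with a single very popular page, since only the best page counts.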
The second section of the report, titled "Networks accessing your pages", shows more or less where your readership is coming from. It looks something like this:
Networks accessing your pages:
 146 (unknown)            2 inficad.com
   2 Australia            1 infoave.net
   5 Brazil               1 intergrafix.net
  14 Canada               1 iquest.net
  12 Mexico               1 ly.net
  18 United States        4 net2000ke.com
   1 accd.edu             2 netcom.com
  46 aol.com              5 odsi.net
   4 esva.net             2 umich.edu
   3 excite.com           2 uoregon.edu
   3 idbsu.edu            4 yahoo.com
   3 idirect.com          4 yni.net
   2 imall.com
This is not a particularly useful report, but it can make fascinating reading. You might not think so to look at it, but this report is incredibly condensed. It isn't really possible anymore to say where a person reading your pages is located, so giving the full details is meaningless. From the raw data I can see that a page was accessed by "slip129-37-204-50.in.us.ibm.net", but that doesn't really tell us very much, so in this section of the report we chop that down to just "ibm.net" and give a total for the whole domain. In the case of an overseas domain (or the ".us" domain) we just list the country of origin and let it go at that.
Sometimes the raw data is even less informative. Often the requesting system is known to us only as an IP address. It's hard to imagine how it could be useful to know that the computer at address "168.174.126.240" (with an unknown name or domain) accessed your pages, so we lump all those together as (unknown).
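For the curious, the condensing described above can be approximated by the following Python sketch. It is a simplified guess at the idea, not our actual code: the function name `condense`, the tiny country table, and the fallback for country codes missing from that table (they are treated like any other domain) are all assumptions made for this example.

```python
import re
from collections import Counter

# A deliberately tiny sample of country-code domains (an assumed table).
COUNTRIES = {
    "au": "Australia",
    "br": "Brazil",
    "ca": "Canada",
    "mx": "Mexico",
    "us": "United States",
}
IP_ADDRESS = re.compile(r"\d{1,3}(\.\d{1,3}){3}")

def condense(host):
    """Reduce a requesting host name to the short label used in the report."""
    host = host.strip().lower().rstrip(".")
    if not host or IP_ADDRESS.fullmatch(host):
        return "(unknown)"              # bare IP address, no name available
    labels = host.split(".")
    if labels[-1] in COUNTRIES:
        return COUNTRIES[labels[-1]]    # country domains: just name the country
    return ".".join(labels[-2:])        # otherwise keep only "domain.tld"

# Made-up sample hosts, one entry per hit.
hosts = [
    "slip129-37-204-50.in.us.ibm.net",
    "168.174.126.240",
    "proxy.example.com.au",
    "dialup7.ibm.net",
]
for label, hits in sorted(Counter(condense(h) for h in hosts).items()):
    print(f"{hits:4d} {label}")
```

With this approach every machine at a large provider collapses into a single line, which is why the section is so much shorter than the raw data.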
The third section of the report, titled "Outside referrals to your pages", gives some indication of how people are finding your pages. It looks something like this:
Outside referrals to your pages:
/ (your main page)
    20 Infoseek search engine
    17 Yahoo search engine
    10 Alta Vista search engine
     5 http://members.home.net/rudolph/
     1 http://search.go2net.com/crawler
This section, which can be extremely useful and informative, is unfortunately based on very flaky data. When a web browser asks our server for a web object, it is supposed to supply a "referring page". The idea is that if some page out there has a link to one of your pages, and someone clicks on that link, then when our server is asked for your page it is given the name of the page that contained the link.
Am I making things any clearer? Probably not.
But there are some problems. The browser doesn't always tell us the name of the referring page, and even when it does, the information can be incorrect, misleading, or useless. Also, we leave out internal referrals, that is, referrals from pages within the esva.net domain or your virtual domain. Surely it can't be all that interesting to know that an image on your main page is referred to by your main page. Including internal referrals would make this section of the report much larger, and much less interesting.
This section of the report is organized into "paragraphs", each beginning with the name of one of your web objects, followed by hit counts for each referring page. Where the referring page is a popular search engine (such as Yahoo or Excite), we generally try to list it as a reference from "Somethingorother search engine", since the search engines don't really have static pages with links to your pages.
In the example shown above, we can see that the main page got twenty hits from the Infoseek search engine, and five hits from a link on a page in the home.net domain. As people in distant lands put links to your pages on their pages, you will see them appear in this section of the report.
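The filtering and grouping described for this section can be sketched in Python as follows. This is an illustration only: the function names, the input layout (one `(path, referrer)` pair per hit), and the search engine host names in the table are assumptions made for the example, and the real report may recognize search engines differently.

```python
from collections import Counter, defaultdict
from urllib.parse import urlsplit

# Illustrative table; the host names are assumed, not taken from our server.
SEARCH_ENGINES = {
    "infoseek.com": "Infoseek",
    "yahoo.com": "Yahoo",
    "altavista.com": "Alta Vista",
    "excite.com": "Excite",
}

def in_domain(host, domain):
    """True if `host` is `domain` itself or any machine inside it."""
    return host == domain or host.endswith("." + domain)

def label_referrer(referrer, internal_domains):
    """Return the report label for a referring page, or None to leave it out."""
    host = (urlsplit(referrer).hostname or "").lower()
    if not host:
        return None                     # browser sent nothing usable
    if any(in_domain(host, d) for d in internal_domains):
        return None                     # internal referral, not reported
    for domain, name in SEARCH_ENGINES.items():
        if in_domain(host, domain):
            return f"{name} search engine"
    return referrer                     # ordinary outside page: show as-is

def outside_referrals(hits, internal_domains):
    """Group outside referrals by the web object that was requested."""
    report = defaultdict(Counter)
    for path, referrer in hits:
        label = label_referrer(referrer, internal_domains)
        if label is not None:
            report[path][label] += 1
    return report

# Made-up sample hits: one search, one internal link, one outside link, one blank.
hits = [
    ("/", "http://www.infoseek.com/Titles?qt=esva"),
    ("/", "http://www.esva.net/~yourid/links.html"),
    ("/", "http://members.home.net/rudolph/"),
    ("/", ""),
]
for path, counts in outside_referrals(hits, ["esva.net", "yourdomain.com"]).items():
    print(path)
    for label, n in counts.most_common():
        print(f"{n:6d} {label}")
```

In this sample only two of the four hits survive the filtering, which shows why the counts here come out lower than the hit counts in the first section.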
Don't expect the numbers in this section to match up with the other sections, since many hits arrive without a usable referring page and internal referrals are left out.
If there is anything I've left unclear, please send me email about it and I'll try again.