How good/bad is AWSTATS?

  • 09-01-2013 2:21pm
    #1
    Registered Users Posts: 1,363 ✭✭✭


    I am having a hard time believing the AWStats figures for a particular blog I have run for the last 4 years.
    AWStats apparently works well, but the difference between its traffic report and Google Analytics is so large that I am not sure which one to believe.
    Google Analytics shows a lot less: according to Google, my site received about 60,000 visitors in 2012.

    According to AWSTATS in total in 2012 my site generated the following traffic:

    - Unique visitors: 252,631
    - Number of visits: 382,962
    - Pages: 5,255,853
    - Hits: 27,533,179
    - Bandwidth: 661.33 GB
    I do get a lot of traffic (I am at 50GB so far this month), so much that the hosting company in the US stopped my site for 2 days last year. One big problem was that the search engines were spending too much time indexing my site, as there are thousands and thousands of files, so I had to reduce this traffic.
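
    As a rough sanity check, the figures above can be turned into a few ratios. This is only arithmetic on the numbers quoted in this post; the function name is made up for illustration, and the Google Analytics figure is approximate ("about 60,000").

```python
# AWStats totals for 2012 as quoted above, plus the approximate
# Google Analytics visitor count from the same post.
awstats = {
    "unique_visitors": 252_631,
    "visits": 382_962,
    "pages": 5_255_853,
    "hits": 27_533_179,
}
ga_visitors = 60_000  # "about 60,000", so treat every ratio as a rough guide


def ratios(stats, ga):
    """Return a few ratios that help judge how plausible the totals are."""
    return {
        "awstats_vs_ga_visitors": stats["unique_visitors"] / ga,
        "pages_per_visit": stats["pages"] / stats["visits"],
        "hits_per_page": stats["hits"] / stats["pages"],
    }


if __name__ == "__main__":
    for name, value in ratios(awstats, ga_visitors).items():
        print(f"{name}: {value:.1f}")
```

    AWStats reports roughly 4.2 times as many visitors as Google Analytics, and about 13.7 pages per visit. That second number looks high for human readers of a blog, which may hint that crawlers make up a good share of what AWStats counts.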

    Any experience on this guys?


Comments

  • Registered Users Posts: 7,739 ✭✭✭mneylon


    they work in different ways

    AWStats works directly on the logfiles, with optional Javascript (for screen resolution etc.)

    Google Analytics works with Javascript

    So you can and will end up with discrepancies

    You might want to look at the raw log files (both the access and the error log) to see what's been going on

    You can also control how Google and some of the other bots crawl via their respective "webmaster tools" consoles

    the bigger search engines respect robots.txt, while those that ignore it are often up to no good and might be better off being blocked entirely ..
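
    To see how much of the logged traffic comes from crawlers (which never run the Google Analytics Javascript), a minimal sketch along these lines could split an access log by user agent. It assumes the server writes the common "combined" log format; the marker list and the sample lines are made up for illustration, and real bot detection needs a maintained list.

```python
import re
from collections import Counter

# Assumption: Apache/NGINX "combined" log format.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Crude, illustrative substrings that often appear in crawler user agents.
BOT_MARKERS = ("bot", "crawl", "spider", "slurp")


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LINE_RE.match(line)
    if not match:
        return None
    record = match.groupdict()
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


def split_traffic(lines):
    """Count hits and bytes separately for likely bots and everything else."""
    hits = Counter()
    bytes_sent = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # skip lines in an unexpected format
        agent = record["agent"].lower()
        kind = "bot" if any(mark in agent for mark in BOT_MARKERS) else "other"
        hits[kind] += 1
        bytes_sent[kind] += record["size"]
    return hits, bytes_sent


# Made-up sample lines, not taken from any real log.
SAMPLE = [
    '1.2.3.4 - - [09/Jan/2013:14:21:00 +0000] "GET /diagrams/a.htm HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '5.6.7.8 - - [09/Jan/2013:14:21:05 +0000] "GET /wheels/ HTTP/1.1" '
    '200 20480 "http://www.google.com/" "Mozilla/5.0 (Windows NT 6.1) Firefox/18.0"',
]

if __name__ == "__main__":
    hits, bytes_sent = split_traffic(SAMPLE)
    print(dict(hits), dict(bytes_sent))
```

    On a real log you would pass an open file instead of SAMPLE. The "other" bucket is only an upper bound on human traffic, since bots that fake a browser user agent land there too.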


  • Registered Users Posts: 1,363 ✭✭✭bmstuff


    Blacknight wrote: »
    they work in different ways

    Awstats works directly on the logfiles with optional Javascript (for screen resolution etc.,)

    Google Analytics works with Javascript

    So you can and will end up with discrepancies

    You might want to look at the raw log files (both the access and the error log) to see what's been going on

    You can also control how Google and some of the other bots crawl via their respective "webmaster tools" consoles

    the bigger search engines respect robots.txt, while those that ignore it are often up to no good and might be better off being blocked entirely ..

    Cheers
    Yeah I ended up blocking Bing for instance, the amount of traffic it was generating was just insane.
    I made changes to my robots.txt and .htaccess files and it did make a big difference alright.
    Still I have no idea about my actual traffic. I will look into the logs.

    I have attached some logs, what do you think?

    Also I seem to have a large number of 404 errors that generated 12GB of traffic (See traffic3), is there a way to reduce this or is it expected?

    Cheers


  • Registered Users Posts: 739 ✭✭✭flynnlives


    Have you checked that all your links are working?
    404s sometimes come up if you changed a page's URL but forgot to update the links pointing to it.


  • Registered Users Posts: 7,739 ✭✭✭mneylon


    You shouldn't have to block bing

    Check why you're getting the 404 - you need to look at the raw logs and find out what's causing them
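
    One way to find out what is causing them is to tally the 404s in the raw access log by requested path and referrer. The sketch below assumes the "combined" log format; the sample lines and the file name in the closing note are made up for illustration.

```python
import re
from collections import Counter

# Assumption: "combined" log format; only the fields needed here are captured.
LINE_RE = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) "(?P<referer>[^"]*)"'
)


def top_404s(lines, limit=10):
    """Return the most frequent (path, referrer) pairs that produced a 404."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("status") == "404":
            counts[(match.group("path"), match.group("referer"))] += 1
    return counts.most_common(limit)


# Made-up sample lines; the missing script name is hypothetical.
SAMPLE = [
    '1.2.3.4 - - [09/Jan/2013:14:21:00 +0000] "GET /diagrams/menu.js HTTP/1.1" '
    '404 512 "http://example.com/diagrams/" "Mozilla/5.0"',
    '1.2.3.5 - - [09/Jan/2013:14:21:02 +0000] "GET /diagrams/menu.js HTTP/1.1" '
    '404 512 "http://example.com/diagrams/" "Mozilla/5.0"',
    '1.2.3.6 - - [09/Jan/2013:14:21:03 +0000] "GET /wheels/ HTTP/1.1" '
    '200 20480 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for (path, referer), count in top_404s(SAMPLE):
        print(count, path, "linked from", referer)
```

    On a real server you would pass an open log file (for example a hypothetical access.log) instead of SAMPLE. The referrer column shows which page carries the broken link or missing asset.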


  • Registered Users Posts: 1,363 ✭✭✭bmstuff


    Blacknight wrote: »
    You shouldn't have to block bing

    Check why you're getting the 404 - you need to look at the raw logs and find out what's causing them

    BING basically took the entire shared server down, CPU very high etc.
    The hoster disabled my site as a consequence, Bing spent 3 days leeching my site...
    At some stage the site went back up as a test, same thing, thousands and thousands of queries. It was like a denial of service lol.
    BING is useless anyways.


  • Registered Users Posts: 7,739 ✭✭✭mneylon


    You obviously need to check your site's configuration

    Bing wouldn't normally have any negative impact on a site's performance or bandwidth usage

    If it is impacting it then there's probably something else going on that's causing the issue


  • Registered Users Posts: 947 ✭✭✭Shzm


    Something on your site sounds dodgy.. you shouldn't need to block Bing (Bing is far from useless btw), and you shouldn't really be serving 12GB worth of 404s.


  • Registered Users Posts: 1,256 ✭✭✭blue4ever


    First port of call for me would be the 404s, then the 500s. The 404s not because they're eating b/w but because there are internal errors somewhere and it's a very poor user experience.


  • Registered Users Posts: 1,363 ✭✭✭bmstuff


    My site hosts a BMW electrical wiring diagrams application, which is over 2GB in size and has over 20,000 files.

    When I say Bing was trying to scan it all and caused serious performance degradation, I am being serious, this was highlighted by my hosting company. This was in the logs before.

    The site works very well, but by going through the raw logs right now, I can see there are a large number of 404, related to the wiring diagrams pages, it is missing a few pictures and files, but because my site is highly used on a daily basis, a few missing files translate into a large amount of 404 traffic.

    Maybe I can post my raw log from today and you guys can browse through it and share your feelings/findings?

    Cheers


  • Registered Users Posts: 7,739 ✭✭✭mneylon


    bmstuff wrote: »
    Maybe I can post my raw log from today and you guys can browse through it and share your feelings/findings?

    I assume you're joking?


  • Registered Users Posts: 1,363 ✭✭✭bmstuff


    Blacknight wrote: »
    I assume you're joking?

    Joking about what?


  • Registered Users Posts: 1,256 ✭✭✭blue4ever


    Joking about you not asking for a VAT number before someone spends time trawling through your log files and giving you advice on how to fix your site.

    Maybe that's the joke I like most

    (bar:

    Those Aldi horse burgers were nice, but I prefer My Lidl Pony

    - which is a cracker)


  • Registered Users Posts: 1,363 ✭✭✭bmstuff


    blue4ever wrote: »
    Joking about you not asking for a VAT number before someone spends time trawling through your log files and giving you advice on how to fix your site.

    Maybe thats the joke I like most

    (bar:

    Those Aldi horse burgers were nice, but I prefer My Lidl Pony

    - which is a cracker)

    I have no idea what you guys are talking about, we are talking technical issues here, I proposed to post my log file and I am asked if I am joking, you kinda lost me here.


  • Registered Users Posts: 1,256 ✭✭✭blue4ever


    You had a three fold increase in b/w for practically the same level of pages / hits/ visitors in nov/dec as opposed to Sep? what happened there, why the massive increase? anything configured differently?

    If the schematics application output doesn't need to be indexed - but the other e-com pages should be - then use your robots.txt file to block the bots from doing so - or use a meta to do it on those pages.

    The schematic pages don't seem to be in the SERPs for your site - so they shouldn't be crawled from here on in.
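
    A minimal sketch of the robots.txt approach, checked with Python's standard urllib.robotparser. The rules, the example.com host and the paths are assumptions for illustration; adjust the Disallow path to wherever the schematics application actually lives.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep every crawler out of /diagrams/, leave the rest open.
ROBOTS_TXT = """\
User-agent: *
Disallow: /diagrams/
"""


def build_parser(text):
    """Build a parser from robots.txt text without fetching anything."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

if __name__ == "__main__":
    # Blocked: the path falls under the Disallow rule.
    print(parser.can_fetch("bingbot", "http://example.com/diagrams/index.htm"))
    # Allowed: no rule matches this path.
    print(parser.can_fetch("bingbot", "http://example.com/wheels/"))
```

    This only shows what a well-behaved crawler would conclude from the rules. Bots that ignore robots.txt need to be blocked at the server instead.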

    As to the 404 - you have to fix them.

    What's the URL for the diagram application?

    C


  • Registered Users Posts: 1,363 ✭✭✭bmstuff


    blue4ever wrote: »
    You had a three fold increase in b/w for practically the same level of pages / hits/ visitors in nov/dec as opposed to Sep? what happened there, why the massive increase? anything configured differently?

    If the schematics application output doesn't need to be indexed - but the other e-com pages should be - then use your robots.txt file to block the bots from doing so - or use a meta to do it on those pages.

    the schematic pages dont seem to be in the serps for your site - so then they shouldn't be crawled from here on in.

    As to the 404 - you have to fix them.

    Whats the url for the diagram application?

    C

    There you go
    http://www.bmw-planet.com/diagrams/

    I don't know. Sometimes after posting an article I get massive traffic, pingbacks etc. Or if someone posts a link to the site in a forum thread, for example, I will get an increase in traffic.
    All the 404s are coming from the diagrams section: some missing scripts (js) are requested all the time, but that does not prevent the diagrams from working.

    I already excluded all robots from crawling the diagrams a few months back, been fine since.


  • Registered Users Posts: 1,256 ✭✭✭blue4ever


    OK and it was a brief look but…

    I’d start with tracking down the 404’s and there are literally hundreds of them;
    Take for example the page http://www.bmw-planet.com/wheels/ out of 185 links on that page 106 are 404’s – without putting too fine a point on that - it’s unacceptable not just for punters (and you’d have to be turning a huge amount away with those levels of defaults) but for bots as well.

    You have analytics, and if your 404 page carries the analytics code, then you should start tracing the source of the 404’s

    Your Feedburner links are creating some havoc on some pages (for example)
    http://www.bmw-planet.com/2012/09/page/40/

    The StumbleUpon and Digg This links are screwed up and, therefore, must be on the 47 pages!

    Get your hands on Xenu and start tracing your broken links
    http://home.snafu.de/tilman/xenulink.html
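
    For a quick spot check of a single page, the same idea can be sketched in a few lines of Python. This is not how Xenu works internally, just a hedged illustration: it collects the links on one page and reports those that answer 404. The function and class names are made up, and the status lookup is passed in so the logic can be tried without touching the network.

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def http_status(url):
    """Return the HTTP status for url. Performs a real HEAD request;
    connection failures (URLError) are not handled here and will propagate."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def broken_links(html, base_url, status_of=http_status):
    """Return the links found in html whose status is 404."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [url for url in collector.links if status_of(url) == 404]


if __name__ == "__main__":
    # Offline demo with a fake status table instead of real requests.
    page = '<a href="/ok/">ok</a> <a href="missing.html">gone</a>'
    fake = {
        "http://example.com/ok/": 200,
        "http://example.com/wheels/missing.html": 404,
    }
    print(broken_links(page, "http://example.com/wheels/", fake.get))
```

    A dedicated crawler is still the better choice for a whole site, since it follows links across pages and checks images and scripts too.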

    Again, I only looked at a few pages. Once you have reduced these, you can begin to look at the bandwidth.


  • Registered Users Posts: 1,363 ✭✭✭bmstuff


    blue4ever wrote: »
    OK and it was a brief look but…

    I’d start with tracking down the 404’s and there are literally hundreds of them;
    Take for example the page http://www.bmw-planet.com/wheels/ out of 185 links on that page 106 are 404’s – without putting too fine a point on that - it’s unacceptable not just for punters (and you’d have to be turning a huge amount away with those levels of defaults) but for bots as well.

    You have analytics, and if your 404 page carries the analytics code, then you should start tracing the source of the 404’s

    Your Feedburner links are creating some havoc on some pages (for example)
    http://www.bmw-planet.com/2012/09/page/40/

    The StumbleUpon and Digg This links are screwed up and, therefore, must be on the 47 pages!

    Get your hands on Xenu and start tracing your broken links
    http://home.snafu.de/tilman/xenulink.html

    Again, I only looked at a few pages. Once you have reduced these, you can begin to look at the bandwidth.

    Sounds good thanks for that

    Yeah the havoc is quite annoying and awful and I have not found a way to fix it.
    The page numbers at the bottom of the pages are quite annoying too, they overlap.

    If you are ready to take on the job and fix those 2 issues, I am happy to pay for your time.

    Cheers


  • Registered Users Posts: 1,256 ✭✭✭blue4ever


    The page numbers at the bottom are dictated by the plugin
    http://wordpress.org/extend/plugins/wp-pagenavi/

    Mess around with that (eg the option - number of pages to show) and it will sort it out

    And the settings of the feeds are set in feedflare http://wordpress.org/extend/plugins/tags/feedflare

    Again; muck around there and you should be sorted.

    I know I was flippant about the VAT etc (and thus the replies this morning!) - but to be honest - you'd be better off tackling it yourself as it's low-level stuff but really time consuming.

    Also - you'd be better off knowing what's screwing up, so you have the solutions in the future.

    Cheers

    C


  • Registered Users Posts: 1,363 ✭✭✭bmstuff


    blue4ever wrote: »
    The page numbers at the bottom are dictated by the plugin
    http://wordpress.org/extend/plugins/wp-pagenavi/

    Mess around with that (eg the option - number of pages to show) and it will sort it out

    And the settings of the feeds are set in feedflare http://wordpress.org/extend/plugins/tags/feedflare

    Again; muck around there and you should be sorted.

    I know I was flippant about the VAT etc (and thus the replies this morning!) - but to be honest - you'd be better off tackling it yourself as it's low-level stuff but really time consuming.

    Also - you'd be better off knowing what's screwing up, so you have the solutions in the future.

    Cheers

    C

    Yeah I know WordPress very well, got plenty of sites running on it. It is just that I like this particular template but can't get around those 2 issues with it.

    Thanks for your time anyway, much appreciated.


  • Registered Users Posts: 1,256 ✭✭✭blue4ever


    pleasure

