Publisher / Editor @ CloudAve and Enterprise Irregulars. Industry Observer, Blogger, Startup Advisor, Program Chair @ SVASE (Silicon Valley Association of Startup Entrepreneurs). In his "prior life" he spent 15 years immersed in the business of Enterprise Software, in management positions with SAP, IBM, Deloitte, KPMG and the like.

22 responses to “Hey CloudFlare, What’s Wrong with these Numbers?”

  1. Damon Billian

    “contrary to Pingdom’s stats”

    Just a quick note that it appears Pingdom is having some issues with sites and reporting (we’re working with them on it).

    Pageviews:
    Our terminology for reporting is a little bit different. I hope this clears up some of it…
    https://cloudflare.tenderapp.com/kb/cloudflare-statistics/how-does-cloudflare-count-page-views

    Threats:
    The confusing thing is that the reporting reflects *all* known threats on that overview page. If you go to your Threat Control panel, however, it is only going to reflect visitors that were actually challenged based on your security level settings (if threshold not met for security level = no challenge presented to visitor). If you have blocks or other custom rules in place as well, then these wouldn’t appear in your dashboard for challenged visitors.

    Speeding up sites:
    Most sites see at least a 40%-60% improvement in page loading times (will vary by site, site size, resources on site, etc.).

    The difference in 30 days and yesterday is a little hard to explain & we’re working on a fix to make it the same across the board (broken down by hour or 15 mins., for example, based on the sorting chosen).

  2. Matthew Prince

    A bit more technical color: The yesterday vs. last 30 days bug is a particular annoyance of mine and we’re working on how to get it fixed. To give you some sense, in any given minute we generate several million log lines. In order to process this volume of logs we need to reduce them at various steps. These steps get broken down into “units” that for the last day are in 15-minute increments, but for the last month are in 1-day increments. Doing a lookup for the 15-minute increments to calculate yesterday, and then doing it across the 1-day increments for the rest of the month, turns out to double the lookup time. To get around that we cheated so that the Last 30 Days number is really the Last 30 Days except yesterday. After you’ve been on the system for a while it averages out. In the beginning it looks weird. It really bugs me, and bugs me now even more that you’ve written an article pointing it out. We’ll get it fixed, but that’s the explanation of what’s going on and why it is happening.
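A minimal sketch of the rollup scheme described above, to make the "Last 30 Days except yesterday" effect concrete. The function name, the dictionary layout, and the sample numbers are assumptions for illustration only; the thread does not describe CloudFlare's actual storage or code.

```python
from datetime import date, datetime, timedelta


def last_30_days_total(daily_rollups, quarter_hour_rollups, today, include_yesterday):
    """Sum hits for the 30-day window that ends yesterday.

    daily_rollups:        {date: hits}, hypothetical finished 1-day units
    quarter_hour_rollups: {datetime: hits}, hypothetical 15-minute units
    """
    yesterday = today - timedelta(days=1)
    window_start = today - timedelta(days=30)

    # Cheap path: one lookup per 1-day unit, covering every day before yesterday.
    total = sum(
        hits for day, hits in daily_rollups.items()
        if window_start <= day < yesterday
    )

    if include_yesterday:
        # Second pass over a different kind of unit (the 15-minute increments).
        # This is the extra lookup that the shortcut in the comment skips.
        total += sum(
            hits for ts, hits in quarter_hour_rollups.items()
            if ts.date() == yesterday
        )
    return total


# A site that joined three days ago: two finished daily units plus yesterday.
today = date(2011, 7, 10)
daily = {date(2011, 7, 7): 100, date(2011, 7, 8): 100}
start_of_yesterday = datetime(2011, 7, 9)
quarter_hours = {start_of_yesterday + timedelta(minutes=15 * i): 10 for i in range(96)}

shortcut = last_30_days_total(daily, quarter_hours, today, include_yesterday=False)
full = last_30_days_total(daily, quarter_hours, today, include_yesterday=True)
print(shortcut, full)
```

With this made-up data the shortcut total is smaller than yesterday's own total, which is the "looks weird in the beginning" case; once 29 daily units exist, the missing day matters far less.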

    I wish I had as clear an answer for the Pingdom question. They are a partner and, usually, their service appears to track the experience of users. However, in some cases they will report radically different load times than browsers themselves report or any other speed monitoring service. We optimize for real world browser performance, not for Pingdom or any other testing tool. We’re working with them to figure out what is going wrong. Our working theory is that it has something to do with Browser Integrity Checks and/or Hotlink Protection. Other than the load-time-that-doesn’t-track-reality-in-some-cases issue, we love Pingdom, use them internally as one of our many monitoring services, and are proud to have them as a partner. We’ll get to the bottom of the problem and get it resolved.

  3. Matthew Prince

    PS – I’m 100% confident in the hits and uniques numbers. Any deviation from Google Analytics there can be explained by the Javascript issue you reference.

    Page views depend on how you count them. AJAX-driven sites will tend to over-report Page Views (since a “page view” is a not-entirely-compatible concept in something that is AJAX-driven). That said, we use the industry standard method for counting, the same method Facebook uses. Of course, that’s part of why Facebook reports nearly 900 billion monthly page views: each “poke” counts as a Page View.

    If you’d like to see the raw logs, we’re happy to send them to you. :-)

    1. David

      Hello, we have been looking at some sites that have very unusual stats. Here is how they look:

      Statcounter.com, Google Analytics and our own Advert Server all report 29 page loads/impressions per day.

      Cloudflare Analytics reports 12’000 page loads per day!

      I flicked on all stats systems within the cPanel account for this domain and left it for a few days. When I came back and checked, it had detected wild stats of 6’000 – 8’000 between the different systems. I know they all use the same file to log the stats.

      Now, this leads me to think that there is either one of two things happening:
      1) The stats are correct, and off-site/off-server stats systems such as Google Analytics and Statcounter are being blocked. Does the Firefox “do not track” option prevent some tracking? Are visitors using something that is blocking external tracking scripts? After all, all of the systems that are showing low numbers are external, including our advert server.
      2) Cloudflare and cPanel are picking up some activity and translating it to visits and loads. I know this is unlikely, but I can not prove #1.

      My concern here is that if this extreme variation between stats can happen, I need to know why and how. Naturally, when you see 29 page loads, you do not expect the site to be using many resources, but in reality it is experiencing 12’000 page loads per day. That’s a difference of 414x, or 41’379%!

      Presently we have two sites both with a similar setup which are experiencing very similar differences. We have one other site that has less of a difference. All of the others seem to be normal.

      I keep seeing that the only reason for a difference in stats is the script used, but I have tested with HTML tracking code and that produced the same low numbers. In fact, the only difference between JavaScript and HTML tracking code over 7 days was an average of 1 extra page load per day for the HTML code. That shows me that of the 29 page loads, there was likely an extra 1 page load from a user who was not using JavaScript.
      But I cannot accept the differences we are seeing between on-server tracking and off-server tracking. Even if CloudFlare does track a call returning HTML as a page load, it does not explain the difference between the cPanel stats and off-server stats.
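The 414x figure quoted in this comment can be checked with a few lines of arithmetic. This is only a worked check of the two numbers given (29 and 12’000 page loads per day); the function name is made up for illustration.

```python
def compare_counts(low, high):
    """Return (ratio, ratio as a percentage, percentage increase) of high over low."""
    ratio = high / low
    ratio_percent = ratio * 100
    percent_increase = (high - low) / low * 100
    return ratio, ratio_percent, percent_increase


ratio, ratio_percent, percent_increase = compare_counts(29, 12_000)
print(round(ratio), round(ratio_percent), round(percent_increase))
```

The quoted percentage is the ratio expressed as a percent. Measured as an increase over the 29 baseline, the figure comes out 100 percentage points lower.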

  4. Matthew Prince

    The hits / unique visitor numbers are definitely correct with CloudFlare and are more accurate than any beacon-based tracking (e.g., Google Analytics or Quantcast). Beacon-based services don’t actually report hits because they have no way of tracking them. That’s ok for their purpose — typically beacon-based analytics is used by the advertising industry to measure how many impressions a site will get for an ad and, since the ads themselves are served by JavaScript-based tags, a JavaScript-based beacon yields the appropriate answer — but there are other important metrics these measurements don’t capture. For instance, we often hear about users complaining that their server is slow when Google Analytics doesn’t show any increased traffic. However, when you look at CloudFlare’s data you’ll see at the same time the site slowed down there was a huge increase in visits from the Google, Bing, and Yahoo search crawlers. While advertisers may not care about this traffic, a server admin who has to pay for the bandwidth and server resources it uses certainly does.

    Page Views are a tricky beast. Google Analytics actually has this easier than we do. Where our base unit of measurement is a “hit” (i.e., a request for a resource from a server, whether it be HTML, an image, JavaScript, CSS, or anything else), Google Analytics’ is a page view (i.e., a request for a base HTML object). All page views are hits, but not all hits are page views. This means we actually see more data than an analytics program like Google Analytics, but our challenge is categorizing only certain hits into page views.

    We follow the most widely accepted industry standard to do this. Unfortunately, that standard is entirely unsatisfying. To explain why, think about your experience on Facebook. What, on Facebook, should count as a page view? When you load the page the first time, of course. But when you comment on a wall post? Thumb through a photo gallery? Poke someone? The idea of a “page view” is born out of a web where every interaction caused the page to reload. In modern, AJAX-driven sites, where the URL in the browser may stay on the same “page” but data is loaded in dynamically, the challenge is trying to map the old analytics standard everyone still uses to the new reality of the web. So in our case, and in Facebook’s since they use the same industry standard, those AJAX requests which count as HITS also get counted as “page views.” Put another way, if Facebook were using Google Analytics to report their Page View numbers, they would be a fraction of the nearly 1 trillion they are closing in on.
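A rough sketch of the hit versus page view distinction drawn above. The classification rule used here (any HTML response is a page view in a log-based count, while a beacon only fires on full page loads) is an assumption chosen to illustrate the comment; the thread does not spell out the exact rule CloudFlare or the industry standard applies. The `Request` type and the sample paths are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    path: str
    content_type: str
    is_ajax: bool = False  # True for data loaded dynamically after the first page load


def count_hits(requests):
    # Every request for any resource is a hit.
    return len(requests)


def count_log_page_views(requests):
    # Assumed log-based rule: any HTML response counts, AJAX fragments included.
    return sum(1 for r in requests if r.content_type.startswith("text/html"))


def count_beacon_page_views(requests):
    # Assumed beacon-style rule: only full page loads run the tracking tag.
    return sum(
        1 for r in requests
        if r.content_type.startswith("text/html") and not r.is_ajax
    )


# One visit: a full page load, its assets, then three AJAX interactions.
visit = [
    Request("/", "text/html"),
    Request("/style.css", "text/css"),
    Request("/app.js", "application/javascript"),
    Request("/logo.png", "image/png"),
    Request("/comments?page=2", "text/html", is_ajax=True),
    Request("/gallery/next", "text/html", is_ajax=True),
    Request("/poke", "text/html", is_ajax=True),
]

print(count_hits(visit), count_log_page_views(visit), count_beacon_page_views(visit))
```

Under these assumptions the same visit yields three different numbers, none of which is wrong for what it measures.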

    The takeaway, I think, is this: Google Analytics (and similar services) and CloudFlare both are accurate in what they’re measuring and both have a place in the analytics landscape. GA is great if you’re trying to measure for advertising impressions. CloudFlare is great if you’re trying to measure actual server resources used (hits/bandwidth/bot traffic). There’s a reason that one of the first features we included was the ability to install Google Analytics on all your pages with a single click: we think that data complements the analytics picture we provide and, together, provides a full picture of what someone running a website needs to know to do it well.

    Happy to talk more about this with you if you’re interested.

  5. John Keegan

    > Well, we don’t have those huge pipes

    Well you do… ;-) Your site is hosted at PressHarbor.com and is located in a well-connected data center with bandwidth from multiple Tier 1 providers. I think perhaps the reason you are not seeing the difference you thought you would see was because your performance at PressHarbor was quite excellent. You are on a well-tuned server in a high-performance network. Perhaps the difference would be more dramatic for hosts that can’t quite offer the level of “out of the box” performance we do.

  6. John Keegan

    Zoli -

    You said: “I am seeing uniques double / triple, and pageviews 7-8x compared to Google Analytics, WordPress, Statcounter, Quantcast, which all deviate from each other a little, but are in the same ballpark.”

    Yes, that is totally normal. CloudFlare is seeing all sorts of traffic that Google Analytics, WordPress, Statcounter, and Quantcast never see. Remember back in the day when we hosted your sites on Blogware? Blogware would give you the real traffic stats, which counted access from bots and spiders and other non-humans, and when you tried Google Analytics you wanted to know where all your traffic went? In some cases Blogware users saw 10% of the page views they thought they had, because they didn’t realize how many hits their sites got from bots and spiders. That’s still true today and now you are seeing it in reverse.

    As they point out, you can’t sell this traffic. ;-) It doesn’t change the answer to “How many people read my site?” That question is still best answered by Google Analytics.

    But when your hosting provider says – wow, look how much bandwidth you are using! The Cloudflare numbers are more reflective of the total amount of traffic your site has to serve… When Google decides to crawl through 10,000 of your dynamically generated pages (oh the CPU!).
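To illustrate the bots-and-spiders point, here is a small sketch that splits server-side requests into crawler and non-crawler traffic by user-agent string. The marker list and the sample user agents are illustrative assumptions, not a complete or authoritative crawler list, and real log analysis would need more care (user agents can be spoofed).

```python
# Substrings that commonly appear in the user agents of the Google, Bing,
# and Yahoo crawlers. Illustrative only; far from a complete list.
CRAWLER_MARKERS = ("googlebot", "bingbot", "slurp")


def is_crawler(user_agent):
    ua = user_agent.lower()
    return any(marker in ua for marker in CRAWLER_MARKERS)


def split_traffic(user_agents):
    """Return (non_crawler_requests, crawler_requests)."""
    crawlers = sum(1 for ua in user_agents if is_crawler(ua))
    return len(user_agents) - crawlers, crawlers


# Hypothetical user agents taken from a server-side access log.
sample_log = [
    "Mozilla/5.0 (Windows NT 6.1) Firefox/5.0",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
    "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)",
]

humans, bots = split_traffic(sample_log)
print(humans, bots)
```

A JavaScript beacon would typically report only the first entry, since crawlers generally do not run the tracking tag, while the server still has to answer all four requests.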

  7. Michael Scott

    I envy you guys. I’m still trying to figure out how to add GA to CloudFlare. I read it’s just a button click but where’s the button? It’s not under settings. My biggest complaint is that CF’s dashboard/settings template is not user friendly. The analytics method of tracking views/hits is confusing, not the theory, but the actual reading of it. I have CF pro and I understand that numbers are tracked every 15 minutes but what next? If you add up the numbers in these incremental boxes will that equal the total? According to WP Stats I have 892 page views so far today (holiday and all that), but I have over 4K (3K+ visitors) using CF analytics, and around 600 views on GA. It goes without saying that this makes me happy as I run a web magazine, but who should I really believe? I hate to sound obtuse but this seemed to be an ideal forum for such questions, and I apologize if I wandered off topic.

  8. Michael Scott

    I managed to set up GA, and now my bounce rate is dropping dramatically. Something to do with two codes?

  9. Roger Coathup

    Contrary to most people’s experience, we are seeing vastly lower pageview numbers being reported by CloudFlare – 25 times lower than Google Analytics is saying for the site.

    We know the CloudFlare ones are wrong – because we’ve had more new user registrations (and they go through several pages) than CloudFlare reports we’ve had pageviews.

    Anyone had a similar experience?

  10. CloudFlare | Crowdsourcing Web Traffic Control

    [...] (Editor’s note:  we are using CloudFlare @ CloudAve.  It’s a great service, despite questions re. funny numbers.) [...]

  11. Damir B.

    I’ve been using their service and I’m mostly satisfied.

    Pros:
    - Using the free service and experienced only a few downtimes so far
    - Witnessed a small drop in spam so their threat control is working (read the cons regarding this issue)
    - They lowered the load on my server
    - The service serves static content faster than my server would from one location

    Cons:
    - The Threat Control panel is plain awful, and it keeps adding threats even if you whitelist IPs from a certain country or countries. I even got some complaints from visitors, who can enter an additional message when they solve the CAPTCHA
    - Cookies sent with every file they serve (very bad practice for static files and an issue when it comes to new laws against cookie serving)
    - Stats are higher than ones on the server so I wouldn’t rely on them

  12. John Wheal

    Is the free cloudflare reliable enough to be used on a high traffic site? At the moment I use Route 53, but I don’t like the feeling of using a free service.

    I have tried CloudFlare on a small site I own and the stats were ridiculous: 100,000 page views a day, when the real figure is more like 2,000.

  13. Christian G

    Have similar, strange results…
    Pageviews according to Piwik and Analytics (before I switched to CloudFlare): 100 pageviews per day
    Pageviews according to CloudFlare: 6000 pageviews per day…
    That cannot be true!!
    Don’t know a reason or solution…

  14. Chris

    We are also seeing massive deviation in VISITOR stats between CloudFlare and our JS analytics (backed up by an image for non-JS-enabled users). The JS figures much more closely match our experience of what’s happening. The CloudFlare numbers don’t make sense.

  15. Mike Marcacci

    Old thread I know, but it’s the best discussion on this anywhere. I have a theory on the whole stats thing, which I experience as well (426k views/month on CF vs 144k on GA):

    It’s possible that CloudFlare is logging every request it passes to the server, *including* non-2XX responses. For example, my organization uses the root domain, and 301-redirects any www. subdomain traffic to the corresponding root URL. On CF, this might be counted as 2 separate requests.
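The redirect theory can be sketched as follows: a visit that arrives on the www. subdomain produces a redirect response followed by the real page, so a counter that includes every response sees two requests where a counter limited to 2XX responses sees one. The log layout, the example.org hostnames, and the use of a 301 status here are assumptions for illustration; whether CloudFlare actually counts this way is not confirmed in the thread.

```python
def count_all_requests(log):
    # Counts every logged response, whatever its status code.
    return len(log)


def count_successful_requests(log):
    # Counts only 2XX responses.
    return sum(1 for _, status in log if 200 <= status < 300)


# One visitor typing the www. address: a redirect, then the real page.
visit_via_www = [
    ("www.example.org/", 301),
    ("example.org/", 200),
]

print(count_all_requests(visit_via_www), count_successful_requests(visit_via_www))
```

As the next paragraph of the comment notes, this would inflate request counts but would not by itself explain a gap in unique visitors.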

    On the other hand, I’m not sure this would explain the discrepancy between the 36k monthly *uniques* on GA, versus the 112k on CF. Any thoughts?

  16. Cloudflare vs. Google Analytics: Why the Numbers Are So Different | Live Intensely.

    [...] a comment I just made to Hey CloudFlare, What’s Wrong with these Numbers?, I gave one [...]

  17. Mike Marcacci

    So, I got on a tangent thinking about this, and wrote a blog post:
    http://www.liveintensely.com/2013/04/cloudflare-vs-google-analytics-why-the-numbers-are-so-different/

    My newest idea: browser prerendering is responsible for a huge chunk of this, which would explain both the pageviews and uniques.