While learning Golang and working on some first experiments, I started a little side project: a tool that downloads the HTTP headers of the most popular webpages, stores them in a database, and does some analysis work. The list of domains was found on the following gist.

I simply assumed that each address should work with the https:// scheme as a prefix. If the “www.” prefix is needed, DNS or HTTPS redirects would take care of it. This was true for the majority of sites, although I had to ignore various certificate issues. Any aborted connections were simply ignored.

Of course, a webpage may generate different HTTP headers (or different values) depending on the context: Am I already logged in? Am I being redirected to another page? Am I using an API? Am I downloading some resource? Even: am I sending the right headers in the request? For now, all that was recorded was the basic response to a HEAD request. Nothing fancy. No AI or machine learning.
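The collection step described above can be sketched in Go roughly as follows. This is a minimal sketch, not the actual tool: the function names, the 10-second timeout, and the example.com domain are assumptions made for illustration. InsecureSkipVerify mirrors the "ignore certificate issues" decision and is only acceptable because the responses are inspected, never trusted.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"time"
)

// targetURL applies the assumption from the text: every domain is tried
// with the https:// scheme and without adding a "www." prefix.
func targetURL(domain string) string {
	return "https://" + domain
}

// newClient returns an HTTP client that tolerates certificate problems.
func newClient() *http.Client {
	return &http.Client{
		Timeout: 10 * time.Second, // assumed value, not from the original tool
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
		},
	}
}

// fetchHeaders sends a HEAD request and returns the headers of the final
// response (the client follows redirects by default).
func fetchHeaders(c *http.Client, domain string) (http.Header, error) {
	resp, err := c.Head(targetURL(domain))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return resp.Header, nil
}

func main() {
	headers, err := fetchHeaders(newClient(), "example.com")
	if err != nil {
		fmt.Println("skipped:", err) // aborted connections are simply ignored
		return
	}
	for name, values := range headers {
		fmt.Println(name, values)
	}
}
```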

Some stats

At first, my main goal was to identify interesting non-standard headers, but this is a bit trickier than just looking for headers with the “X-” prefix. In fact, this wouldn’t work at all, as explained in RFC 6648:

Historically, designers and implementers of application protocols have often distinguished between standardized and unstandardized parameters by prefixing the names of unstandardized parameters with the string “X-” or similar constructs. In practice, that convention causes more problems than it solves. Therefore, this document deprecates the convention for newly defined parameters with textual (as opposed to numerical) names in application protocols.

So, if “X-” isn’t a good indicator, then what is? How to tell if a header is non-standard? MDN points at the IANA registry. The list is full of exotic inventions. Ever heard of the Hobareg authentication header? The If-Schedule-Tag-Match conditional header? Default-style?

Neither had I. They aren’t necessarily supported by modern browsers either, but the mere fact that they are produced may leak some interesting information. Plus, they are somehow standardized, so an incorrect configuration is also an important metric. Non-standard headers matter too: the Cf-Ray header, for example, is non-standard, but it was observed in over 10% of pages. It’s there whenever Cloudflare is involved. Any header-processing tool should know how to make use of that information and possibly return some findings.
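One way to act on this is to classify headers by registry membership rather than by prefix. The sketch below is only an illustration: the `registered` map is a tiny hand-copied stand-in for the IANA registry (a real tool would load the full list), and `vendorHints` and `classify` are hypothetical names. Note that X-Frame-Options is registered (via RFC 7034) despite its “X-” prefix.

```go
package main

import (
	"fmt"
	"net/textproto"
)

// registered is a tiny stand-in for the IANA registry, for illustration only.
var registered = map[string]bool{
	"Hobareg":               true,
	"If-Schedule-Tag-Match": true,
	"Default-Style":         true,
	"Content-Type":          true,
	"X-Frame-Options":       true, // registered by RFC 7034 despite the prefix
}

// vendorHints maps non-standard headers to what their presence suggests.
var vendorHints = map[string]string{
	"Cf-Ray": "served through Cloudflare",
}

// classify normalizes the header name and looks it up in both tables.
func classify(name string) string {
	key := textproto.CanonicalMIMEHeaderKey(name)
	if registered[key] {
		return "registered"
	}
	if hint, ok := vendorHints[key]; ok {
		return "non-standard: " + hint
	}
	return "unknown"
}

func main() {
	for _, name := range []string{"x-frame-options", "cf-ray", "Eagleid"} {
		fmt.Printf("%s -> %s\n", name, classify(name))
	}
}
```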

Without any distinction between standard and non-standard headers… these are the most popular HTTP headers observed:

Header Occurrences
Set-Cookie 1620
Date 821
Content-Type 813
Server 686
Cache-Control 598
Vary 474
X-Frame-Options 406
Expires 370
Content-Length 360
Strict-Transport-Security 356
X-Xss-Protection 302
X-Content-Type-Options 257
Etag 209
Via 199
Accept-Ranges 197
X-Cache 184
Content-Security-Policy 167
Last-Modified 166
Pragma 160
Link 157
Age 140
Connection 125
Expect-Ct 125
P3p 122
Cf-Ray 115
Cf-Request-Id 114
Alt-Svc 113
Cf-Cache-Status 101
X-Powered-By 98
Referrer-Policy 93
Access-Control-Allow-Origin 86
X-Amz-Cf-Id 79
X-Amz-Cf-Pop 79
X-Cache-Hits 73
X-Served-By 67
Content-Language 59
X-Timer 58
Report-To 50
Nel 47
X-Request-Id 46
X-Ua-Compatible 41
Content-Security-Policy-Report-Only 40
Access-Control-Allow-Credentials 32
Access-Control-Allow-Methods 32
X-Download-Options 31
Access-Control-Allow-Headers 30
X-Envoy-Upstream-Service-Time 29
Timing-Allow-Origin 25
X-Varnish 24
Mime-Version 23
X-Dns-Prefetch-Control 21
Server-Timing 19
X-Runtime 18
X-Permitted-Cross-Domain-Policies 17
X-Via 15
Access-Control-Expose-Headers 13
Content-Encoding 13
X-Amz-Rid 13
X-Cache-Status 13
Accept-Ch 12
Eagleeye-Traceid 12
X-Ws-Request-Id 12
X-Aspnet-Version 11
Accept-Ch-Lifetime 10
Eagleid 10
X-Server-Id 10
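A tally like the one above could be produced along these lines. The spellings in the table (Etag, P3p, Cf-Ray) match the canonical form that Go's net/http applies to header names, and Set-Cookie appearing more often than Date suggests that every value was counted rather than every response. Both are my reading of the numbers, not something the tool's source confirms, and `countHeaders` and `sampleResponses` are hypothetical names.

```go
package main

import (
	"fmt"
	"net/http"
	"sort"
)

type headerCount struct {
	Name  string
	Count int
}

// countHeaders tallies every header value across all responses and sorts
// the result by count (descending), then by name.
func countHeaders(responses []http.Header) []headerCount {
	totals := map[string]int{}
	for _, h := range responses {
		for name, values := range h {
			totals[name] += len(values)
		}
	}
	out := make([]headerCount, 0, len(totals))
	for name, n := range totals {
		out = append(out, headerCount{Name: name, Count: n})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Count != out[j].Count {
			return out[i].Count > out[j].Count
		}
		return out[i].Name < out[j].Name
	})
	return out
}

// sampleResponses builds two made-up responses for demonstration.
func sampleResponses() []http.Header {
	a := http.Header{}
	a.Add("set-cookie", "a=1") // names are canonicalized to Set-Cookie
	a.Add("set-cookie", "b=2")
	a.Set("etag", `"abc"`)
	b := http.Header{}
	b.Set("ETag", `"def"`)
	b.Set("Date", "Thu, 01 Jan 2015 00:00:00 GMT")
	return []http.Header{a, b}
}

func main() {
	for _, hc := range countHeaders(sampleResponses()) {
		fmt.Println(hc.Name, hc.Count)
	}
}
```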

Some bad names

Very likely, some of the observed headers were simply invalid. Several were already obsolete: they shouldn’t be used anymore, and most browsers will just ignore them. Sometimes, obsolete headers were sent in addition to valid ones, perhaps to guarantee that old clients[1] are still supported. Some examples:

X-Webkit-Csp*
obsolete
X-Webkit-Csp-Report-Only
obsolete
X-Content-Security-Policy
obsolete
X-Content-Security-Policy-Report-Only
obsolete
X-Content-Secure-Policy
note typo, Secure instead of Security
Public-Key-Pins
deprecated

I also found minor typos on the list that render a header useless: Pramga, Strict-Transportsecurity
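Typos like these can be caught by comparing each observed name against known header names using edit distance. The following is a sketch under assumptions: the `known` list is a small illustrative subset, and the threshold of 2 edits is an arbitrary choice that happens to catch both examples above (a swapped letter pair costs 2 in plain Levenshtein distance).

```go
package main

import (
	"fmt"
	"strings"
)

// known is a small illustrative subset of valid header names.
var known = []string{"Pragma", "Strict-Transport-Security", "Content-Security-Policy"}

func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the byte-wise edit distance between a and b.
func levenshtein(a, b string) int {
	prev := make([]int, len(b)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(a); i++ {
		cur := make([]int, len(b)+1)
		cur[0] = i
		for j := 1; j <= len(b); j++ {
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev = cur
	}
	return prev[len(b)]
}

// suggest returns the known header that name most likely misspells, or ""
// if name matches a known header exactly or is further than maxDist away.
func suggest(name string, maxDist int) string {
	lower := strings.ToLower(name)
	best, bestDist := "", maxDist+1
	for _, k := range known {
		d := levenshtein(lower, strings.ToLower(k))
		if d == 0 {
			return "" // exact match, nothing to fix
		}
		if d < bestDist {
			best, bestDist = k, d
		}
	}
	return best
}

func main() {
	for _, name := range []string{"Pramga", "Strict-Transportsecurity", "X-Request-Id"} {
		fmt.Printf("%q -> %q\n", name, suggest(name, 2))
	}
}
```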

Some bad values

X-Frame-Options

Even if header names were correct, the values were sometimes a bit off. Certain headers are well-defined and accept only a few specific values. A good example is the X-Frame-Options (XFO) header, which offers simple protection against so-called clickjacking. It is recommended to configure this header properly alongside the much more powerful frame-ancestors directive of CSP. XFO originally tried to offer similar functionality via the ALLOW-FROM parameter, but that parameter was never widely deployed and is now obsolete.

If used with ALLOW-FROM, the header is simply ignored by modern browsers and treated as if no XFO was defined. This may open the door to clickjacking[2], hence such a configuration should never be used. Yet… I observed three interesting variants:

X-Frame-Options: ALLOW-FROM <some-url>
someone should look up CSP
X-Frame-Options: ALLOWALL
non-standard, not mentioned in RFC 7034, but because it will turn off XFO it somehow manages to do exactly that: allows all 🤪
X-Frame-Options:
empty, not expected by anyone
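A check for these cases could look like the sketch below. The function name and the wording of the findings are made up for illustration; the only accepted values are DENY and SAMEORIGIN, compared case-insensitively.

```go
package main

import (
	"fmt"
	"strings"
)

// checkXFO classifies an X-Frame-Options value.
func checkXFO(value string) string {
	v := strings.ToUpper(strings.TrimSpace(value))
	switch {
	case v == "DENY" || v == "SAMEORIGIN":
		return "ok"
	case strings.HasPrefix(v, "ALLOW-FROM"):
		return "obsolete: use CSP frame-ancestors instead"
	case v == "":
		return "invalid: empty value"
	default:
		return "invalid: unknown directive"
	}
}

func main() {
	for _, v := range []string{"SAMEORIGIN", "ALLOW-FROM https://example.com", "ALLOWALL", ""} {
		fmt.Printf("%q -> %s\n", v, checkXFO(v))
	}
}
```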

X-XSS-Protection

Next on the list is the X-XSS-Protection header. Since XSS auditor support was dropped from Chrome and Edge, the typo in its value doesn’t matter anymore, but I still want to mention it here because it did matter in the past. The directives within the value should be separated with semicolons; however, a plain comma was found in the value returned by one of the pages:

X-XSS-Protection: 1,mode=block
note comma
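A simple validator for this header might look as follows. It is a sketch that accepts only the commonly documented forms (0, 1, and 1 followed by mode=block or report=<uri>); the function name and error messages are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// checkXSSProtection returns nil if the value uses a documented form with
// semicolon-separated directives, and an error describing the problem otherwise.
func checkXSSProtection(value string) error {
	parts := strings.Split(value, ";")
	switch strings.TrimSpace(parts[0]) {
	case "0", "1":
		// valid on/off switch
	default:
		if strings.Contains(parts[0], ",") {
			return errors.New("directives must be separated by ';', not ','")
		}
		return fmt.Errorf("unexpected first token %q", parts[0])
	}
	for _, p := range parts[1:] {
		d := strings.ToLower(strings.TrimSpace(p))
		if d != "mode=block" && !strings.HasPrefix(d, "report=") {
			return fmt.Errorf("unknown directive %q", d)
		}
	}
	return nil
}

func main() {
	for _, v := range []string{"1; mode=block", "1,mode=block"} {
		fmt.Printf("%q -> %v\n", v, checkXSSProtection(v))
	}
}
```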

CORS

Access-Control-* headers are meant to inform a browser how a given API can be used. One of them, Access-Control-Allow-Methods, specifies which HTTP verbs may be used to access the endpoint. One page misused it and returned a header name as the value: Access-Control-Allow-Methods: Content-Type
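A tool could flag values like that by comparing each token against the well-known HTTP methods. In this sketch the tokens are only reported as suspicious rather than invalid, because custom methods are technically allowed; the function name is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// httpMethods lists the well-known HTTP methods.
var httpMethods = map[string]bool{
	"GET": true, "HEAD": true, "POST": true, "PUT": true, "DELETE": true,
	"CONNECT": true, "OPTIONS": true, "TRACE": true, "PATCH": true,
}

// suspiciousMethods returns the tokens of an Access-Control-Allow-Methods
// value that are neither a well-known method nor the "*" wildcard.
func suspiciousMethods(value string) []string {
	var bad []string
	for _, tok := range strings.Split(value, ",") {
		m := strings.TrimSpace(tok)
		if m == "" || m == "*" {
			continue
		}
		if !httpMethods[strings.ToUpper(m)] {
			bad = append(bad, m)
		}
	}
	return bad
}

func main() {
	fmt.Println(suspiciousMethods("GET, POST, OPTIONS")) // nothing suspicious
	fmt.Println(suspiciousMethods("Content-Type"))       // a header name, not a method
}
```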

More than one page used a non-standard way of declaring which origins are accepted, mixing hostnames with wildcard syntax and even using unsupported escaped patterns:

Access-Control-Allow-Origin: *.(vporn\\.com)
mix of wildcards and… not sure what
Access-Control-Allow-Origin: *.imgimg.com
hostname cannot be mixed with wildcard
Access-Control-Allow-Origin: *first*, *second*, *third
only a single origin can be specified

The null origin is also on the list. It makes the API accessible from sandboxed iframes and data: URLs, which have null as their origin: Access-Control-Allow-Origin: null
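The cases above can be told apart with a few string checks and net/url. This is a sketch with hypothetical finding messages; it treats anything other than "*", "null", or a single scheme://host[:port] origin as invalid.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// checkAllowOrigin classifies an Access-Control-Allow-Origin value.
func checkAllowOrigin(value string) string {
	v := strings.TrimSpace(value)
	switch {
	case v == "*":
		return "ok: any origin"
	case v == "null":
		return "risky: null origin"
	case strings.ContainsAny(v, ", "):
		return "invalid: only a single origin can be specified"
	case strings.Contains(v, "*"):
		return "invalid: wildcard cannot be mixed with a hostname"
	}
	u, err := url.Parse(v)
	if err != nil || u.Scheme == "" || u.Host == "" ||
		u.Path != "" || u.RawQuery != "" || u.Fragment != "" {
		return "invalid: not a scheme://host[:port] origin"
	}
	return "ok: single origin"
}

func main() {
	for _, v := range []string{"https://example.com", "*.imgimg.com", "null", "a.example, b.example"} {
		fmt.Printf("%q -> %s\n", v, checkAllowOrigin(v))
	}
}
```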

Strict-Transport-Security

Several pages use the “preload” directive in their Strict-Transport-Security header. To be properly preloaded, the header needs to meet a few more requirements, though. Besides that, the site has to be submitted to the proper registry and pass validation, which in turn requires a correct header configuration.

The requirements for preloading are described on hstspreload.org. The header must specify the includeSubDomains directive; 2% of sites did not meet this requirement while having the preload directive in the header. A couple of websites also didn’t meet the requirement of a max-age of at least one year (31536000 seconds). Two sites had this value set to 0.
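The two header-level requirements mentioned here can be checked with a small parser. This is a sketch, not a complete implementation of the hstspreload.org rules (which also cover redirects and certificates); the type and function names are made up for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// oneYear is the minimum max-age for preloading, in seconds.
const oneYear = 31536000

type hstsPolicy struct {
	MaxAge            int // -1 when missing or unparseable
	IncludeSubDomains bool
	Preload           bool
}

// parseHSTS extracts the directives relevant to preloading.
func parseHSTS(value string) hstsPolicy {
	p := hstsPolicy{MaxAge: -1}
	for _, part := range strings.Split(value, ";") {
		name, arg, _ := strings.Cut(strings.TrimSpace(part), "=")
		switch strings.ToLower(strings.TrimSpace(name)) {
		case "max-age":
			n, err := strconv.Atoi(strings.Trim(strings.TrimSpace(arg), `"`))
			if err == nil {
				p.MaxAge = n
			}
		case "includesubdomains":
			p.IncludeSubDomains = true
		case "preload":
			p.Preload = true
		}
	}
	return p
}

// preloadProblems lists unmet requirements for headers that ask for
// preloading; headers without the preload directive are not judged.
func preloadProblems(value string) []string {
	p := parseHSTS(value)
	if !p.Preload {
		return nil
	}
	var problems []string
	if p.MaxAge < oneYear {
		problems = append(problems, "max-age below 31536000 (one year)")
	}
	if !p.IncludeSubDomains {
		problems = append(problems, "includeSubDomains missing")
	}
	return problems
}

func main() {
	fmt.Println(preloadProblems("max-age=63072000; includeSubDomains; preload"))
	fmt.Println(preloadProblems("max-age=0; preload"))
}
```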

Some fresh stuff

Permissions Policy

Only two pages returned the Permissions-Policy header, which provides a mechanism to allow and deny the use of browser features. A few more (8) used the older name of this header, Feature-Policy. Should they change it right away? Not necessarily, since according to Can I use, Permissions-Policy isn’t recognized by default by any browser.

Can I use stats for Permissions-Policy header

Cross-Origin-*-Policy

Cross-Origin-Embedder-Policy wasn’t used anywhere; however, a single site used Cross-Origin-Embedder-Policy-Report-Only for testing. That’s good!

Cross-Origin-Opener-Policy had more love and was used by four sites.

Cookies with the __Host- prefix apparently aren’t that popular: only a single instance was observed, but at least it was properly configured.

That is still better than the __Secure- prefix, which no website used.
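What “properly configured” means can be sketched in code: a __Host- cookie must be Secure, have Path=/, and carry no Domain attribute, while a __Secure- cookie only needs Secure. The sketch below parses a single Set-Cookie value through net/http; the function name and result strings are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// checkCookiePrefix validates the attributes required by cookie name prefixes.
func checkCookiePrefix(setCookie string) string {
	h := http.Header{}
	h.Add("Set-Cookie", setCookie)
	// Response.Cookies parses the Set-Cookie headers for us.
	cookies := (&http.Response{Header: h}).Cookies()
	if len(cookies) != 1 {
		return "unparseable"
	}
	c := cookies[0]
	switch {
	case strings.HasPrefix(c.Name, "__Host-"):
		if !c.Secure || c.Path != "/" || c.Domain != "" {
			return "invalid __Host- cookie"
		}
		return "valid __Host- cookie"
	case strings.HasPrefix(c.Name, "__Secure-"):
		if !c.Secure {
			return "invalid __Secure- cookie"
		}
		return "valid __Secure- cookie"
	}
	return "no prefix"
}

func main() {
	fmt.Println(checkCookiePrefix("__Host-sid=abc; Path=/; Secure; HttpOnly"))
	fmt.Println(checkCookiePrefix("__Host-sid=abc; Path=/; Secure; Domain=example.com"))
	fmt.Println(checkCookiePrefix("__Secure-sid=abc; Path=/"))
}
```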

Summary

This post was meant to summarize some of the quirks spotted in HTTP headers from the most popular websites. If they get it wrong, so will others. At least I have lots of ideas to implement in my little headers analyzer project.


  1. Software.

  2. Ok, there are also frame busting scripts, but they have their own problems.