The Replacement of User Agent Strings

HTTP client hints (HTTP CHs) offer a new way for users to reveal browser information to websites.

Unlike with the traditional user agent string, clients share only low-entropy data by default. If websites want to know more about a user, they have to explicitly request this information via HTTP CHs.
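The opt-in mechanism can be sketched in a few lines of Python. This is only an illustration, under these assumptions: the default set is the three low-entropy user agent hints that Chromium-based browsers send, the server opts in through the `Accept-CH` response header, and the simulated client grants every requested hint (real browsers may decline). The function name `hints_sent` is hypothetical.

```python
# Low-entropy hints sent by default (assumption: Chromium-style defaults).
DEFAULT_HINTS = ("Sec-CH-UA", "Sec-CH-UA-Mobile", "Sec-CH-UA-Platform")


def hints_sent(accept_ch=None):
    """Return the hint header names a client would send on later requests.

    accept_ch is the value of the server's Accept-CH response header,
    or None if the server did not send one. The value is split on commas,
    which is a simplification of the structured-field list syntax.
    """
    sent = list(DEFAULT_HINTS)
    seen = {name.lower() for name in sent}
    if accept_ch:
        for token in accept_ch.split(","):
            name = token.strip()
            # Header names are case-insensitive, so deduplicate on lower case.
            if name and name.lower() not in seen:
                sent.append(name)
                seen.add(name.lower())
    return sent


# Without an opt-in, only the defaults are shared.
print(hints_sent())
# After the server asks for more detail, the extra hints are added.
print(hints_sent("Sec-CH-UA-Model, Sec-CH-UA-Platform-Version"))
```

The point of the sketch is that the website's request is visible in its response headers, which is what makes this kind of data collection observable from the outside.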

HTTP CHs can provide more transparency on what information websites request from a user. This means that we can also get information on what features websites potentially use for risk estimation, e.g., in the case of risk-based authentication (RBA).

Overview of HTTP Client Hints

Overview of the Client Hints Web crawling procedure

Studying 8M Websites

How do websites adapt to HTTP CHs and what features do different types of websites request via HTTP CHs?

To answer this question, we studied the use of HTTP CHs on the 8M most popular websites and their login pages. We also used historical data from the HTTP Archive to get a full picture of the overall adoption of HTTP CHs.
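A measurement of this kind boils down to collecting response headers and counting `Accept-CH` occurrences. The sketch below is not the crawler used in the study; it is a minimal illustration that assumes hints are requested via the `Accept-CH` response header only (requests made through HTML `meta` tags are ignored). The names `fetch_headers`, `hints_in`, and `summarize` are hypothetical.

```python
from collections import Counter
from urllib.request import Request, urlopen


def fetch_headers(url, timeout=10.0):
    """Fetch a page and return its response headers with lower-cased names.

    Repeated header names overwrite each other, which is acceptable for
    a sketch but loses information in a real crawl.
    """
    request = Request(url, headers={"User-Agent": "ch-survey-sketch"})
    with urlopen(request, timeout=timeout) as response:
        return {name.lower(): value for name, value in response.headers.items()}


def hints_in(headers):
    """Return the set of hint names requested in an Accept-CH header."""
    value = headers.get("accept-ch", "")
    return {token.strip().lower() for token in value.split(",") if token.strip()}


def summarize(crawl):
    """Summarize a crawl given as a dict mapping site -> response headers.

    Returns the share of sites that request at least one hint and a
    Counter of how many sites requested each hint.
    """
    counts = Counter()
    adopters = 0
    for headers in crawl.values():
        hints = hints_in(headers)
        if hints:
            adopters += 1
            counts.update(hints)
    share = adopters / len(crawl) if crawl else 0.0
    return share, counts
```

For example, `summarize({site: fetch_headers("https://" + site) for site in sites})` would give the adoption share for a list of host names in `sites`; the header dictionaries can equally come from archived crawl data.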

Overview of the Client Hints Web crawling procedure

HTTP CHs Over Time

The historical data shows that HTTP CH usage is increasing. Nevertheless, overall adoption is still quite low, despite use by very popular websites.

Interestingly, HTTP CH use increased after Google published a blog article about the deprecation of the traditional user agent string in the Chrome browser and its replacement with HTTP CHs.

Requested Level of Detail

We found that websites using RBA requested significantly more detailed information than those without known RBA usage. The 5,000 most popular websites also requested a higher level of detail than the full set of crawled websites.

Beyond that, different website types tended to require different levels of detail from their users.
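One simple way to compare the level of detail between groups of websites is to count how many requested hints go beyond the defaults. The sketch below is not the metric used in the paper; it assumes that the low-entropy set consists of `Save-Data`, `Sec-CH-UA`, `Sec-CH-UA-Mobile`, and `Sec-CH-UA-Platform`, and treats every other hint as requiring an explicit opt-in. The labels `none`, `default-only`, and `extended` are made up for this illustration.

```python
# Hints that a client may send without an explicit request (assumption).
LOW_ENTROPY = {"save-data", "sec-ch-ua", "sec-ch-ua-mobile", "sec-ch-ua-platform"}


def requested(accept_ch):
    """Parse an Accept-CH header value into a set of lower-cased hint names."""
    return {token.strip().lower() for token in accept_ch.split(",") if token.strip()}


def extended_count(accept_ch):
    """Count the requested hints that go beyond the low-entropy defaults."""
    return len(requested(accept_ch) - LOW_ENTROPY)


def detail_level(accept_ch):
    """Label a site by the level of detail it requests."""
    hints = requested(accept_ch)
    if not hints:
        return "none"
    if hints - LOW_ENTROPY:
        return "extended"
    return "default-only"


def mean_extended(accept_ch_values):
    """Average number of extended hints over a group of sites."""
    values = list(accept_ch_values)
    if not values:
        return 0.0
    return sum(extended_count(value) for value in values) / len(values)
```

Computing `mean_extended` separately for, say, sites with and without known RBA usage gives a rough comparison of the two groups; a statistical test would still be needed to call a difference significant.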


Privacy Risk: Third Parties

Many of the crawled login pages embedded third-party resources that requested detailed browser information. The same third parties were embedded by other login pages as well. This is a privacy risk, since HTTP CHs can provide more information about the client than the user agent string.
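The pattern described above, one third party requesting hints across several login pages, can be detected with a small grouping step. The sketch below rests on loud simplifications: a resource counts as third party whenever its host name differs from the page's host name (a real analysis would compare registrable domains using the Public Suffix List), and the input tuples are a hypothetical format, not the format of our data set.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def host(url):
    """Return the lower-cased host name of a URL, or an empty string."""
    return (urlsplit(url).hostname or "").lower()


def is_third_party(page_url, resource_url):
    """Naive check: any differing host name counts as a third party."""
    return host(page_url) != host(resource_url)


def shared_third_parties(observations, min_pages=2):
    """Find third-party hosts that request hints on several pages.

    observations is an iterable of (page_url, resource_url, accept_ch)
    tuples, where accept_ch is the Accept-CH value the resource's
    response carried (empty string if none).
    """
    pages_by_host = defaultdict(set)
    for page_url, resource_url, accept_ch in observations:
        if accept_ch.strip() and is_third_party(page_url, resource_url):
            pages_by_host[host(resource_url)].add(host(page_url))
    return {
        third_party: sorted(pages)
        for third_party, pages in pages_by_host.items()
        if len(pages) >= min_pages
    }
```

A host that shows up in the result sees the same detailed hints on every page that embeds it, which is what enables cross-site recognition.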

Based on browser market share, we assume that at least 78% of desktop users and 69% of mobile users are trackable with HTTP CHs. To protect against these practices, browsers should implement settings that let users control which information is shared via HTTP CHs. They should also ship with default settings that protect privacy.

Network of Websites and their connection to third parties

Technical Paper

You can find more details in our publication below.

The paper was accepted at ARES '24.

A Privacy Measure Turned Upside Down? Investigating the Use of HTTP Client Hints on the Web
Stephan Wiefling, Marian Hönscheid, Luigi Lo Iacono
 
Abstract

HTTP client hints are a set of standardized HTTP request headers designed to modernize and potentially replace the traditional user agent string. While the user agent string exposes a wide range of information about the client's browser and device, client hints provide a controlled and structured approach for clients to selectively disclose their capabilities and preferences to servers. Essentially, client hints aim at more effective and privacy-friendly disclosure of browser or client properties than the user agent string.

We present a first long-term study of the use of HTTP client hints in the wild. We found that despite being implemented in almost all web browsers, server-side usage of client hints remains generally low. However, in the context of third-party websites, which are often linked to trackers, the adoption rate is significantly higher. This is concerning because client hints allow the retrieval of more data from the client than the user agent string provides, and there are currently no mechanisms for users to detect or control this potential data leakage. Our work provides valuable insights for web users, browser vendors, and researchers by exposing potential privacy violations via client hints and providing help in developing remediation strategies as well as further research.

If you would like to cite the paper, please use the following BibTeX entry:

@inproceedings{Wiefling_A_2024,
  author = {Wiefling, Stephan and Hönscheid, Marian and {Lo Iacono}, Luigi},
  title  = {{A Privacy Measure Turned Upside Down? Investigating the Use of HTTP Client Hints on the Web}},
  booktitle = {19th {International} {Conference} on {Availability}, {Reliability} and {Security}},
  series = {{ARES} '24},
  location = {Vienna, Austria},
  doi = {10.1145/3664476.3664478},
  publisher = {ACM},
  month = aug,
  year   = {2024}
}

Data Set

To make our results reproducible and to make research on HTTP CHs easier for everyone, we provide an open data set. It contains the HTTP CH responses from all login pages that we crawled between August 7th, 2022 and December 21st, 2023.

Feel free to use this data set for your research. Please cite our publication when doing so.