%% Commands for TeXCount
%TC:macro \cite [option:text,text]
%TC:macro \citep [option:text,text]
%TC:macro \citet [option:text,text]
%TC:envir table 0 1
%TC:envir table* 0 1
%TC:envir tabular [ignore] word
%TC:envir displaymath 0 word
%TC:envir math 0 word
%TC:envir comment 0 0

\documentclass[sigplan,screen]{acmart}
\settopmatter{printacmref=false} \renewcommand\footnotetextcopyrightpermission[1]{}

\usepackage[strings]{underscore}
%% Code blocks and their styling.
\usepackage{listings}
\usepackage{xcolor}
\definecolor{codeblue}{rgb}{0.00,0.40,0.50}
\definecolor{codegray}{rgb}{0.5,0.5,0.5}
\definecolor{backcolour}{rgb}{0.95,0.95,0.95}
\lstdefinestyle{customstyle}{
  backgroundcolor=\color{backcolour},
  commentstyle=\color{codeblue},
  keywordstyle=\color{orange},
  numberstyle=\tiny\color{codegray},
  stringstyle=\color{magenta},
  basicstyle=\ttfamily\footnotesize\color{black},
  breakatwhitespace=false,
  breaklines=true,
  captionpos=b,
  keepspaces=false,
  numbers=none,
  numbersep=5pt,
  showspaces=false,
  showstringspaces=false,
  showtabs=false,
  tabsize=2
}
\lstset{style=customstyle}

%% \BibTeX command to typeset BibTeX logo in the docs
\AtBeginDocument{%
  \providecommand\BibTeX{{%
    \normalfont B\kern-0.5em{\scshape i\kern-0.25em b}\kern-0.8em\TeX}}}

%% Rights management information. This information is sent to you
%% when you complete the rights form. These commands have SAMPLE
%% values in them; it is your responsibility as an author to replace
%% the commands and values with those provided to you when you
%% complete the rights form.
%\setcopyright{acmcopyright}
%\copyrightyear{2018}
%\acmYear{2018}
%\acmDOI{XXXXXXX.XXXXXXX}

%% These commands are for a PROCEEDINGS abstract or paper.
\acmConference[TUBS]{Web Security Seminar}{July 21,
  2022}{TU Braunschweig, DE}
%
% Uncomment \acmBooktitle if th title of the proceedings is different
% from ``Proceedings of ...''!
%
%\acmBooktitle[TUBS]{Web Security Seminar 2022 - TU Braunschweig, DE}
%\acmBooktitle{Woodstock '18: ACM Symposium on Neural Gaze Detection,
% June 03--05, 2018, Woodstock, NY}
%\acmPrice{15.00}
%\acmISBN{978-1-4503-XXXX-X/18/06}

\begin{document}

\title{Cookiejar Kintsugi: Reviewing the state of web application session security}

\author{Julian Lobbes}
\email{j.lobbes@tu-braunschweig.de}
\affiliation{%
  \institution{Technische Universität Braunschweig}
  \streetaddress{Universitätsplatz 2}
  \city{Braunschweig}
  \state{Niedersachsen}
  \country{Germany}
  \postcode{38106}
}
%%
%% By default, the full list of authors will be used in the page
%% headers. Often, this list is too long, and will overlap
%% other information printed in the page headers. This command allows
%% the author to define a more concise list
%% of authors' names for this purpose.
%\renewcommand{\shortauthors}{Trovato and Tobin, et al.}

%%
%% The abstract is a short summary of the work to be presented in the
%% article.
% TODO add abstract
\begin{abstract}
A large number of web applications store sensitive information about their users.
Access to these applications is managed in the form of web sessions, which are
attractive targets for malicious actors.
This paper is a review of existing literature, outlining the methods used to handle
the shared secrets pertaining to web sessions, common attacks used to discover these secrets, and
defenses used to protect them.
I also review existing empirical studies which have attempted to uncover the prevalence of web
session vulnerabilities in the wild, showing that existing defensive mechanisms are often
misused or not utilized.
\end{abstract}

%%
%% Keywords. The author(s) should pick words that accurately describe
%% the work being presented. Separate the keywords with commas.
\keywords{Session security, Cookie hijacking, Empirical analysis, Review}

%% A "teaser" image appears between the author and affiliation
%% information and the body of the document, and typically spans the
%% page.
%\begin{teaserfigure}
% \includegraphics[width=\textwidth]{sampleteaser}
% \caption{Seattle Mariners at Spring Training, 2010.}
% \Description{Enjoying the baseball game from the third-base
% seats. Ichiro Suzuki preparing to bat.}
% \label{fig:teaser}
%\end{teaserfigure}

\maketitle

\section{Introduction}

Web applications have been a large part of the internet ecosystem ever since it became accessible
to the wider public in 1993\cite{jazayeri_trends_2007}.
Information is transferred between websites and their users using the
Hypertext Transfer Protocol (HTTP).

The protocol was designed for \textit{stateless} communication, but most web applications need to store
state information between requests, for instance to track whether a user has already
authenticated\cite{calzavara_surviving_2018}.
Thus, web applications require a mechanism for sharing and storing state information with clients
in order to track the user's identity across HTTP requests.

Different methods exist to implement such a tracking mechanism on top of HTTP.
Most commonly, session identifiers (SIDs) are used\cite{nikiforakis_sessionshield_2011} in the form of a
cookie\cite{dacostaitalo_one-time_2012}.
Since SIDs are often used to authenticate a client accessing a web application and its protected
resources, the SID must remain a shared secret between client and server, inaccessible to third
parties.

Ensuring the confidentiality of session identifiers has proven to be difficult since the inception of
web applications\cite{jain_session_2015}.
Many attack vectors have been identified in the past\cite{kolsek_session_2002}, and security
features and methods were added to existing protocols retroactively\cite{hodges_http_2012}
in order to patch these cracks.

The result is a patchwork of security mechanisms that leaves web developers and system
administrators considerable room for error, as the prevalence of these vulnerabilities in the
wild, both in the past\cite{nikiforakis_sessionshield_2011} and at present\cite{drakonakis_cookie_2020},
suggests.

\section{Background}

For context, I will provide a brief overview of the underlying technologies and mechanisms used by
web applications and the internet in general.

\subsection{Hypertext Transfer Protocol (HTTP)}

Content on the web is transferred using the Hypertext Transfer Protocol (HTTP), which is based on
a client-server paradigm.
The user-side client initiates communication by sending an \textit{HTTP request} to a web server,
most commonly requesting a specific resource (often an HTML document) from it.
The server responds with an \textit{HTTP response}, indicating the completion status of the request and,
if applicable, the requested resource.

HTTP was intended as a simple way to exchange documents formatted in the
\textit{Hypertext Markup Language}\cite{grigorik_high-performance_2013} (HTML).
As such, it is a stateless protocol, which treats each request as independent of preceding
requests\cite{nielsen_hypertext_1996}.
As its adoption grew rapidly shortly after its inception, many revisions of the protocol standard,
largely centered around performance increases, were published and adopted in quick succession.

\subsection{Web sessions}

Application developers soon desired ways to build applications which preserve information about their
users between requests, despite the stateless nature of HTTP.
Early web browsers did not support storing state information\cite{kristol_http_2001}, but the need to
keep track of such data quickly became apparent, especially in the context of
web applications\cite{dacostaitalo_one-time_2012}.

The notion of a \textit{session} captures the unique identification of a user across individual requests.
A session can be established by the server generating a \textit{session identifier} (SID) and sending
it to the client.
The client then retransmits the SID with each request for the duration of the session, thereby
authenticating itself as the session owner whom the application can identify.

Several mechanisms exist and have been widely used to persist SIDs across
requests\cite{jain_session_2015}.
Two methods which were widely employed in the past to exchange and store SIDs are
\textit{hidden form fields} and \textit{URL rewriting}.
Due to poor reliability, limitations in user experience and
security issues\cite{wedman_analytical_2013}\cite{jain_session_2015}, they have been largely replaced by
\textit{cookies}.

\subsubsection{Hidden form fields}

The server can embed an SID in a hidden field of an HTML form and send the document containing the
form to the client.
The client must then submit the form containing the SID to the server with each subsequent request
in order to keep the session alive.
The SID is embedded in the HTML source code of each page the user receives for the duration of the
session, albeit hidden from the user's direct view by the browser.

A downside to this approach is that the SID, or other state information, may get lost if the user clicks
their browser's \textit{back} button.
Additionally, the performance overhead of parsing a form for every request on the server side is
significant, and hidden form fields do not lend themselves well to client-side caching of web pages.

Since the SID is plainly visible to anyone with access to the HTML source code on the client's machine,
sessions relying on hidden form fields are particularly vulnerable to cross-site scripting (XSS)
attacks.
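
To illustrate how exposed such an SID is to anything that can read the page source, the
following Python sketch extracts a hidden field from an HTML document using only the standard library.
The field name \lstinline{sid}, the form markup and the class name \lstinline{HiddenFieldFinder} are
assumptions made for this example and are not taken from the cited literature.

\begin{lstlisting}[caption={Sketch: reading an SID from a hidden form field}, label=lst-hidden-field, language=Python]
from html.parser import HTMLParser


class HiddenFieldFinder(HTMLParser):
    """Collects name/value pairs of all hidden input fields."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and attributes.get("type") == "hidden":
            self.hidden[attributes.get("name")] = attributes.get("value")


# Hypothetical page as delivered by the server.
page = """
<form action="/profile" method="post">
  <input type="hidden" name="sid" value="a92nl52">
  <input type="submit" value="Continue">
</form>
"""

finder = HiddenFieldFinder()
finder.feed(page)
print(finder.hidden)
\end{lstlisting}

Any script running in the page could perform the equivalent lookup on the document it is embedded in.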
\subsubsection{URL rewriting}

In URL rewriting, the server generates an SID and redirects the client to a URL containing the SID as
a URL parameter.
The SID is appended to the URL for each subsequent request the client sends to the server.
A URL containing an SID may look like the one in listing~\ref{lst-url-rewriting}:

\begin{lstlisting}[caption=Session ID key-value pair embedded in a URL, label=lst-url-rewriting]
https://example.com/profile.html?sid=a92nl52
\end{lstlisting}

The SID is visible in the client's address bar and browser history.

Just like with hidden form fields, state information is lost if the user presses their browser's
\textit{back} button.
Web applications utilising URL rewriting suffer from poor server-side caching
performance\cite{kristol_http_2001}.
The biggest downside of this approach, however, is that the SID can quite easily leak to third parties.
Users may copy a link containing an SID from their address bar and share it with others.
The browser may also leak the SID to other web servers if a logged-in user follows a link leading to another domain,
because the URL of the page containing the link is sent in the \lstinline{Referer} HTTP header field of the resulting request.
Moreover, applications which manage sessions using URL rewriting are particularly susceptible to
\textit{session fixation} attacks.
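
The \lstinline{Referer} leak can be made concrete with a short Python sketch of what a third-party
server could do with an incoming request.
It assumes that the browser sends the full URL of the referring page; the function name
\lstinline{sid_from_referer}, the parameter name \lstinline{sid} and the host names are hypothetical.

\begin{lstlisting}[caption={Sketch: recovering an SID from the Referer header}, label=lst-referer-leak, language=Python]
from urllib.parse import parse_qs, urlsplit


def sid_from_referer(headers, parameter="sid"):
    """Return the SID found in the Referer URL's query string, if any."""
    referer = headers.get("Referer", "")
    values = parse_qs(urlsplit(referer).query).get(parameter, [])
    return values[0] if values else None


# Request headers as a third-party server might receive them after
# the user followed an outbound link from a page with a rewritten URL.
request_headers = {
    "Host": "other.example",
    "Referer": "https://example.com/profile.html?sid=a92nl52",
}
leaked_sid = sid_from_referer(request_headers)
print(leaked_sid)
\end{lstlisting}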
\subsubsection{Cookie-based session management}

The HTTP specification was extended to include \textit{cookies} in 1997\cite{kristol_http_2001}.
Cookies are name/value pairs contained within HTTP headers.
As shown in figure~\ref{fig-cookie-exchange}, a cookie is set by the web server and sent to the client.
Clients can choose to embed cookies in the header of any request they send, thereby preserving state
information across multiple requests\cite{barth_http_2011}.
If the cookie contains an SID, the server can identify the session and authenticate the user for each
request.

\begin{figure}[h]
\centering
\includegraphics[width=\linewidth]{figures/cookie-exchange.pdf}
\caption{HTTP cookie exchange}
\Description{An HTTP client receives cookie from a web server, and resends it in a subsequent request.}
\label{fig-cookie-exchange}
\end{figure}

Besides name/value pairs, cookies can carry additional attributes
telling both clients and servers to handle them in a certain way.
Two important attributes for cookies are the \lstinline{Secure} attribute and the
\lstinline{HttpOnly} attribute.
\lstinline{Secure} prevents cookies from being sent over unencrypted channels such as plaintext HTTP.
\lstinline{HttpOnly} disallows client-side JavaScript code from reading the cookie.
Later sections show that using these attributes correctly for authentication cookies is crucial
to securing web sessions.
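
As a minimal sketch of how a server might set both attributes, the following Python snippet builds
the value of a \lstinline{Set-Cookie} response header with the standard library's
\lstinline{http.cookies} module.
The cookie name \lstinline{sid} and its value are placeholders reused from listing~\ref{lst-url-rewriting}.

\begin{lstlisting}[caption={Sketch: session cookie with Secure and HttpOnly set}, label=lst-set-cookie, language=Python]
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["sid"] = "a92nl52"          # hypothetical cookie name and value
cookie["sid"]["secure"] = True     # only send over encrypted connections
cookie["sid"]["httponly"] = True   # hide from client-side JavaScript
cookie["sid"]["path"] = "/"

# Value of the Set-Cookie response header the server would emit.
set_cookie_value = cookie["sid"].OutputString()
print("Set-Cookie:", set_cookie_value)
\end{lstlisting}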
\section{Session security threats}

A vast number of web applications on the internet allow access to sensitive information.
With the number of IoT devices sharply on the rise, the number of endpoints providing access to not
just sensitive information, but to control systems as well, is growing rapidly.
Many of these devices are controlled via web interfaces\cite{sharma_history_2019}.

Many modern web applications utilise an authentication system, restricting access to such resources
to authorised users, whereby session management and authentication using cookie-based SIDs is the
de-facto standard\cite{dacostaitalo_one-time_2012}.
SIDs are a highly desirable target for malicious actors, particularly in light of how severe
the consequences of a breach in confidentiality of the SID can be.
\textit{Session hijacking} is a class of attack in which an attacker obtains an innocent user's
session identifier.
The attacker can then use the session identifier to authenticate as the victim, granting them
unauthorised access to protected resources.
Session hijacking can be carried out at the network layer, as well as at the application
layer\cite{jain_session_2015}.
The following subsections give an overview of several ways in which session tokens can be stolen or misused.

\subsection{Plaintext packet capture}

In this network-level attack, the attacker monitors TCP traffic on their local network, or on a route
between the victim and the destination web server.
If the vulnerable application transmits SIDs in plaintext HTTP, an attacker acting as a man in the middle
can read the SID.

Web applications utilizing URL rewriting or hidden forms are susceptible to plaintext packet capture,
as attackers can read the full HTTP request.
If cookies are used to transmit the SID, the session is only protected against this kind of attack if
the \lstinline{Secure} attribute is set on the cookie, as this flag prevents its transmission over plaintext
protocols.

\subsection{Cross-site scripting (XSS)}

Scripts running within a website's origin can access all of its cookies, except for those which have the
\lstinline{HttpOnly} flag set.
A web application vulnerable to XSS which does not set \lstinline{HttpOnly} for its session cookies allows
an attacker to inject a script which extracts the SID and sends it to the attacker.

\subsection{Unisolated scripts}

The majority of websites import third-party JavaScript from remote sources.
These scripts run in the website's origin if they are not isolated within an iframe, which effectively
gives the provider of the remote script access to all session cookies that are not set to
\lstinline{HttpOnly}\cite{nikiforakis_you_2012}.
Malicious or compromised script providers can abuse this to access unwitting users' sessions.

\subsection{Cross-site request forgery (CSRF)}

According to Zeller et al.\cite{zeller_cross-site_nodate}, ``CSRF attacks occur when a malicious
web site causes a user’s web browser to perform an unwanted action on a trusted site''.
Rather than stealing the session ID, the attacker's goal is to force the victim to unknowingly execute
an action chosen by the attacker on the target website.

An authenticated user's browser will send the session cookie along with every request to the target
website, re-authenticating them with every request.
An attacker can trick the user into submitting a specially crafted request to the target website,
for example by embedding the request as a hidden HTML element in an email or on a malicious website.
As an example, an attacker may craft an invisible image such as the one shown in listing~\ref{lst-csrf}:

\begin{lstlisting}[caption=HTML image containing a CSRF exploit, label=lst-csrf, language=HTML]
<img src="https://bank.com/transfer.php?acct=ATTACKER&amount=100000" width="0" height="0" border="0">
\end{lstlisting}

Simply viewing a page containing the above image would cause the user's browser to send a request for the
URL specified as the image source to \lstinline{bank.com}.
If the user is currently logged in, their session cookie for the site will be attached, authorizing the
transaction without their knowledge.

In contrast to web applications which utilize cookies for session management, those employing
URL rewriting or hidden forms are typically not vulnerable to CSRF.

\subsection{Session fixation}

In a session fixation attack, the attacker first establishes a session on the server, obtaining an SID.
They then introduce this SID into the victim's browser.
Once the victim authenticates using the planted SID, the attacker can use the same SID to act as the victim.
Web applications using URL rewriting are particularly vulnerable, as an attacker only needs to get the
victim to click on a link in order to gain access to their session.
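
The sequence of steps can be sketched with a toy in-memory session store in Python.
This is an illustration under simplifying assumptions rather than a model of any particular
application: the functions \lstinline{open_session}, \lstinline{log_in} and
\lstinline{current_user} are hypothetical, and the server is assumed to accept an SID supplied
by the client.

\begin{lstlisting}[caption={Sketch: session fixation against a toy session store}, label=lst-fixation, language=Python]
import secrets

sessions = {}  # SID -> authenticated user name (None while anonymous)


def open_session(sid=None):
    """Create a session; this vulnerable server accepts a client-chosen SID."""
    sid = sid or secrets.token_hex(8)
    sessions.setdefault(sid, None)
    return sid


def log_in(sid, user):
    """Vulnerable: binds the user to the SID that was already in use."""
    sessions[sid] = user


def current_user(sid):
    return sessions.get(sid)


# 1. The attacker obtains a valid SID from the server.
attacker_sid = open_session()
# 2. The victim follows a link carrying that SID as a URL parameter
#    and logs in while the planted SID is in use.
victim_sid = open_session(attacker_sid)
log_in(victim_sid, "victim")
# 3. The attacker reuses the SID and is treated as the victim.
print(current_user(attacker_sid))
\end{lstlisting}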
\subsection{Cache/Log sniffing}

The browser cache contains session cookies and cached HTML documents containing SIDs embedded in hidden
forms.
The browser's history keeps track of all URLs a user has visited, including those containing SIDs for
websites which use URL rewriting.
An attacker with access to a victim's browser cache can compromise their sessions on websites which use
hidden forms or cookies.
Access to the browsing history will reveal SIDs from websites which use URL rewriting.

\section{Threat mitigation}

Malicious actors can exploit various attack vectors, almost all of which can, and should,
be mitigated by application developers and system administrators.
This section gives an overview of the most fundamental and important security mechanisms which website
providers should employ in order to prevent session hijacking vulnerabilities.

\subsection{Use cookies}

The use of cookies for session management has become a standard\cite{dacostaitalo_one-time_2012},
and with good reason.
The previously widespread methods of using URL rewriting or hidden forms to handle SIDs have proven
to be unreliable and fundamentally insecure in the past\cite{wedman_analytical_2013}\cite{jain_session_2015}\cite{the_open_web_application_security_project_owasp_2017}\cite{the_open_web_application_security_project_owasp_2021}.
Using cookies in their place greatly reduces the attack surface for a range of attacks which enable session
takeover.

\subsection{Use HTTPS}

Utilizing encrypted communication to transfer session cookies is a prerequisite to preventing cookie hijacking.
Yet, a surprising number of websites, including some major ones, fail to do
so\cite{nikiforakis_sessionshield_2011}\cite{sivakorn_cracked_2016}\cite{drakonakis_cookie_2020}.
The most straightforward way to ensure that session cookies are never transmitted in the clear is by
setting their \lstinline{Secure} flag.

Other mechanisms which ensure HTTPS, such as upgrading a visitor's connection from HTTP to HTTPS, provide
some protection, but fall short if the session cookie is sent in the clear during the initial
request\cite{bugliesi_cookiext_2015}.
Web server administrators should ideally configure HTTP Strict Transport Security (HSTS) together with
HSTS preloading to ensure that only HTTPS is used across the domain and all subdomains\cite{hodges_http_2012}.
So far, HSTS has not been widely adopted, and misconfigurations are common\cite{drakonakis_cookie_2020}.
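
To show what such a configuration check might look like, the following Python sketch parses
the value of a \lstinline{Strict-Transport-Security} response header and tests whether the policy is
active and extends to all subdomains.
The helper names \lstinline{parse_hsts} and \lstinline{covers_subdomains} are hypothetical, and the
check is a simplification that ignores the additional requirements of browser preload lists.

\begin{lstlisting}[caption={Sketch: checking an HSTS policy for subdomain coverage}, label=lst-hsts, language=Python]
def parse_hsts(value):
    """Split a Strict-Transport-Security header value into directives."""
    directives = {}
    for part in value.split(";"):
        name, _, argument = part.strip().partition("=")
        if name:
            directives[name.lower()] = argument.strip('"')
    return directives


def covers_subdomains(value):
    """True if the policy is active and applies to all subdomains."""
    directives = parse_hsts(value)
    max_age = directives.get("max-age", "0")
    active = max_age.isdigit() and int(max_age) > 0
    return active and "includesubdomains" in directives


# Example policy: one year, all subdomains, marked for preloading.
policy = "max-age=31536000; includeSubDomains; preload"
print(covers_subdomains(policy))
\end{lstlisting}

A policy that omits \lstinline{includeSubDomains} is an example of the partial coverage
that counts as a misconfiguration in this sketch.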
\subsection{Disable script access to session cookies}

Developers should ensure that session cookies cannot be accessed from client-side scripts.
This can be achieved by adding the \lstinline{HttpOnly} flag to session cookies, mitigating the risks of
XSS vulnerabilities, at least in the context of session hijacking.

\subsection{Isolate third-party scripts}

Scripts imported from remote content providers run in the importing website's origin by default.
If session cookies are accessible to scripts, this gives other parties access to user sessions, and opens
the web application up to XSS attacks from compromised and untrustworthy content providers.
Application developers should isolate external scripts\cite{drakonakis_cookie_2020} to prevent this.

\section{Prevalence}

Gathering data for large-scale empirical assessments about the prevalence of web session vulnerabilities in
the wild is not straightforward.
Analyzing whether a web application handles its session tokens securely requires registering a user account
on each website, signing in using the created account and subsequently analysing the website's behaviour with
regard to the SID.
% automatic vulnerability scanning?

\subsection{Manual analysis}

Most data sources determining the incidence of session vulnerabilities rely on manual interaction for account
creation and sign-in\cite{drakonakis_cookie_2020}.
The Open Web Application Security Project (OWASP) draws on a large pool of contributors who report occurrences
of various types of web application vulnerabilities, which allows it to publish a comprehensive report
periodically\cite{noauthor_owasp_nodate}.
Over the past decade, XSS injection was ranked among the top three most significant risks in OWASP's top 10.
The significance rating of website misconfiguration, which includes exposed authentication cookies, has been
gradually increasing\cite{the_open_web_application_security_project_owasp_2010}\cite{the_open_web_application_security_project_owasp_2017}\cite{the_open_web_application_security_project_owasp_2021}, and remains high today.
Both of these vulnerabilities enable session hijacking.

This is reflected by studies such as the one carried out by Nikiforakis et al., who found that fewer than 23\%
of websites which use session cookies set the \lstinline{HttpOnly} flag on their
cookies\cite{nikiforakis_sessionshield_2011}.
A study by Sivakorn et al. utilizing network monitoring identified 15 major websites, including Google,
which expose session cookies via unencrypted connections, with most of them vulnerable to XSS cookie
interception as well\cite{sivakorn_cracked_2016}.
Relying on manual interaction to audit a small selection of the vast number of web applications present on the
web, however, only allows a small glimpse of the overall state of web session security.

\subsection{Automated analysis}

Drakonakis et al. developed a ``fully automated black-box auditing framework that analyzes web apps by exploring
their susceptibility to various cookie-hijacking attacks''.
The framework's goal was to audit web applications ``without knowledge of their structure, access to the
source code, or input from developers'' on a large scale\cite{drakonakis_cookie_2020}.

The study's methodology provides a good reference for the steps necessary to conduct an automated black-box
security audit of web applications on the public web.

\subsubsection{Methodology}

The framework is able to crawl websites from a large dataset of URLs, locating signup and login forms.
The semantic purpose of discovered pages is inferred from the number and types of HTML input fields
discovered.

After login and signup pages have been located, the discovered forms and input fields are labelled by the
framework, to enable filling them with data.
Reliable identification of a specific input field's purpose is necessary to pass signup form
validation and keep track of the login credentials used to register.
This is achieved by searching HTML element attributes for specific keywords which indicate the element's
purpose.

Once the signup form is filled with data and submitted, a registration status oracle determines whether the
signup process was successful.
This includes scraping received signup validation emails and pages to which the framework was redirected
following the signup form submission.

If the registration process was deemed to be successful, a login module attempts to log in automatically.
Success of the operation is determined by a login oracle, again using the presence of certain HTML elements,
such as a log out button, to infer the result.

If the registration was unsuccessful but the website supports it, the framework falls back to using
Google and Facebook single sign-on (SSO) to sign in.
Websites whose registration process involves solving a CAPTCHA were not audited automatically.
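
The keyword-based labelling of input fields described above can be illustrated with a small Python
sketch.
It is not the framework's implementation: the keyword lists, the inspected attributes and the names
\lstinline{label_field} and \lstinline{FieldLabeller} are assumptions chosen for this example.

\begin{lstlisting}[caption={Sketch: labelling form fields by attribute keywords}, label=lst-field-labels, language=Python]
from html.parser import HTMLParser

# Hypothetical keyword lists, checked in the order given here.
KEYWORDS = {
    "email": ("email", "e-mail", "mail"),
    "password": ("password", "passwd", "pwd"),
    "username": ("username", "user", "login"),
}


def label_field(attributes):
    """Guess a field's purpose from its type, name, id and placeholder."""
    text = " ".join(
        (attributes.get(key) or "").lower()
        for key in ("type", "name", "id", "placeholder")
    )
    for label, words in KEYWORDS.items():
        if any(word in text for word in words):
            return label
    return "unknown"


class FieldLabeller(HTMLParser):
    """Labels every input element encountered in a document."""

    def __init__(self):
        super().__init__()
        self.labels = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            self.labels.append(label_field(dict(attrs)))


# Hypothetical signup form.
form = """
<form action="/signup" method="post">
  <input type="text" name="user_name">
  <input type="email" id="contact">
  <input type="password" name="pw1">
  <input type="text" name="nickname">
</form>
"""

labeller = FieldLabeller()
labeller.feed(form)
print(labeller.labels)
\end{lstlisting}

Fields that match no keyword are labelled \lstinline{unknown}, which hints at why reliable
identification is the hard part of automated signup.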
\subsubsection{Audit}

The vulnerability assessment of audited sites centers on the framework's cookie auditor.
The auditor's goal is to find authentication cookies which are not sufficiently protected against session
hijacking.
Session cookies without an \lstinline{HttpOnly} flag are assumed to be vulnerable to XSS sniffing if the
website also imports unisolated scripts from third parties.
Cookies missing the \lstinline{Secure} flag are assumed to be vulnerable to network-based hijacking
if the site employs HSTS either incorrectly or not at all.
Whether or not HSTS usage is correct for an audited website is determined by a separate module also
implemented by the framework.

Determining which cookies are session cookies is crucial in this context.
In a series of requests to the website, differing sets of cookies are omitted from the request each time,
and the login oracle is consulted to determine whether the user is still logged in.
Once a set of authentication cookies has been identified, conclusions about their hijacking susceptibility
can be drawn based on which flags they have.

The framework also utilizes a privacy auditor, with the goal of determining what kind of user data can
be retrieved from the website by an attacker after successfully capturing a session cookie.
The privacy auditing module is capable of collecting sensitive information revealed once a user logs in.
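
The two steps of the cookie auditor described in this subsection, identifying authentication cookies and
then assessing their flags, can be sketched in Python as follows.
This is a simplified illustration and not the framework's code: it omits one cookie at a time rather
than differing sets, the login oracle is a stand-in, and all names (\lstinline{Cookie},
\lstinline{find_auth_cookies}, \lstinline{assess}) are hypothetical.

\begin{lstlisting}[caption={Sketch: identifying and assessing authentication cookies}, label=lst-cookie-audit, language=Python]
from dataclasses import dataclass


@dataclass
class Cookie:
    name: str
    value: str
    secure: bool = False
    http_only: bool = False


def find_auth_cookies(cookies, is_logged_in):
    """Return the cookies whose omission makes the login oracle fail."""
    required = []
    for candidate in cookies:
        remaining = [c for c in cookies if c is not candidate]
        if not is_logged_in(remaining):
            required.append(candidate)
    return required


def assess(cookie, unisolated_scripts, hsts_correct):
    """Apply the flag-based assumptions to one authentication cookie."""
    risks = []
    if not cookie.http_only and unisolated_scripts:
        risks.append("script access")
    if not cookie.secure and not hsts_correct:
        risks.append("eavesdropping")
    return risks


# Stand-in login oracle: a real one would replay the request and look
# for signs of a logged-in state, such as a log out button.
def oracle(sent):
    return any(c.name == "sid" for c in sent)


jar = [
    Cookie("sid", "a92nl52", secure=False, http_only=True),
    Cookie("theme", "dark"),
]
for cookie in find_auth_cookies(jar, oracle):
    print(cookie.name, assess(cookie, unisolated_scripts=True,
                              hsts_correct=False))
\end{lstlisting}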
\subsubsection{Findings}

Drakonakis et al.'s study inspected 1.5 million unique domains, 200 thousand of which were detected
to be web applications supporting account creation.
Of these, 25 thousand web applications were fully audited; automatic account creation proved to be
the most significant obstacle.

12,014 domains (48.43\%) were found to be vulnerable to eavesdropping because authentication cookies are
transmitted over unencrypted connections.
It is worth noting that out of these, 10,495 (87.36\% of those vulnerable to eavesdropping) did not deploy
HSTS.
Websites which do not set session cookies as \lstinline{Secure} are protected from eavesdropping if HSTS
is deployed correctly on the web server.
Drakonakis et al. showed that even on sites where HSTS was deployed \textit{incorrectly}, covering only
certain subdomains for example, cookies without the \lstinline{Secure} attribute were safe from eavesdropping
in many cases, yet the vast majority of sites do not use HSTS.
Four of the audited websites were themselves SSO identity providers providing authentication for many
other sites, while transmitting session cookies in cleartext.

5,680 domains did not set the \lstinline{HttpOnly} flag on their cookies. The vast majority of these sites
(5,099) also import remote JavaScript without isolation, allowing the imported script to read
authentication cookies.
While these sites are not necessarily vulnerable to XSS, they do allow third parties to read their session
cookies.

The findings indicate that the threat to a victim's privacy on those sites which are vulnerable to
session hijacking is significant, allowing attackers to retrieve full names, email addresses, phone numbers
and a plethora of other highly sensitive information.

\section{Conclusion}

Session hijacking attacks have existed since the dawn of the internet as we know it today.
Web sessions make for an attractive target for actors with malicious intent, as they allow access to sensitive
information about users.
Hijacked sessions may even provide access to control systems or administrative interfaces, depending on the
breached application and level of access obtained.
It is not unreasonable to assume that the significance of web applications in our lives will keep
increasing, at least in the near future.
As their significance grows, so does the impact of how securely they are managed.

Drakonakis et al. have succeeded in conducting the first large-scale, fully automated session security audit,
revealing that a large portion of web applications on the web today are still vulnerable to session hijacking.
While these vulnerabilities are preventable, the mechanisms required to substantially reduce these risks
are not being implemented widely enough.

As more ways to attack the complex web applications we see today become apparent, more complex defensive
mechanisms are proposed and standardised.
This leads to confusion and a further increase in complexity, leaving web application developers and
web server administrators even more room for error.

%%
%% The next two lines define the bibliography style to be used, and
%% the bibliography file.
\bibliographystyle{ACM-Reference-Format}
\bibliography{bib/bib.bib}

%%
%% If your work has an appendix, this is the place to put it.
\appendix

\section{Glossary}

\subsection{Kintsugi}

Kintsugi is a traditional Japanese craft in which ceramics, either accidentally or purposefully broken, are repaired with urushi lacquer and gold\cite{keulemans_geo-cultural_2016}.

\subsection{SID - session identifier}

A session identifier is a unique string of random data (typically consisting of numbers and characters) that is generated by a Web application and propagated to the client, usually through the means of a cookie\cite{nikiforakis_sessionshield_2011}.

\end{document}
\endinput