How are Web Apps Attacked?
TamperIE is a Browser Helper Object (BHO) from Bayden Systems. It is really simple—its only two options are to tamper with GETs and/or POSTs.
TamperIE is a very useful tool, perhaps the only one you really need for manual web app hacking. Its GET tampering feature bypasses any restrictions imposed by the browser, and the POST feature allows you to tamper with data in the body of the HTTP request that is not accessible from the browser’s address bar.
Firefox Extensions
Here are Firefox extensions for HTTP analysis and tampering, listed in order of our preference, with the most recommended first.
- LiveHTTPHeaders. This Firefox plug-in, by Daniel Savard and Nikolas Coukouma, dumps raw HTTP and HTTPS traffic into a separate sidebar within the browser interface.
- TamperData. This Firefox extension written by Adam Judson allows you to trace and modify HTTP and HTTPS requests, including headers and POST parameters. It can be loaded as a sidebar or as a separate window.
- Modify Headers. Another Firefox extension for modifying HTTP/S requests is Modify Headers by Gareth Hunt. Modify Headers is better for persistent modification than it is for per-request manipulation.
HTTP Proxies
HTTP proxies are stand-alone programs that intercept HTTP/S communications and enable the user to analyze or tamper with the data before submitting it. They do this by running a local HTTP service and redirecting the local web client there (usually by setting the client’s proxy configuration to a high local TCP port like 8888). The local HTTP service, or proxy, acts as a “man-in-the-middle” and permits analysis and tampering with any HTTP sessions that pass through it.
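As a minimal sketch of the client side of this setup, the following Python points a script's HTTP traffic at a local intercepting proxy. The address 127.0.0.1:8888 mirrors the example port above, the function name is our own, and the sketch assumes a proxy such as Fiddler or WebScarab is already listening there.

```python
import urllib.request

def build_proxied_opener(host="127.0.0.1", port=8888):
    """Return (opener, proxy_map) that sends HTTP and HTTPS traffic via a local proxy."""
    proxy_url = f"http://{host}:{port}"
    proxy_map = {"http": proxy_url, "https": proxy_url}
    opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxy_map))
    return opener, proxy_map

opener, proxy_map = build_proxied_opener()
# opener.open("http://target.example/") would now pass through the proxy,
# where the request can be inspected or modified before it is forwarded.
# (target.example is a placeholder host, not a real target.)
```

Nothing is sent until opener.open() is called, so the setup can be checked before any traffic leaves the machine.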
Currently available HTTP proxy tools include:
- OWASP WebScarab. This tool includes an HTTP proxy, crawler/spider, session ID analysis, script interface for automation, fuzzer, encoder/decoder utility for all of the popular web formats (Base64, MD5, and so on), and a Web Services Description Language (WSDL) and SOAP parser, to name a few of its more useful modules. It is licensed under the GNU General Public License v2. Like Paros, WebScarab is written in Java and thus requires the JRE to be installed.
- Fiddler. This handy tool is a free release from Eric Lawrence and Microsoft, and it’s the best non-Java freeware HTTP proxy we’ve seen.
Command-line Tools
cURL
Netcat
Application Profiling
The purpose of surveying the application is to generate a complete picture of the content, components, function, and flow of the web site in order to gather clues about where underlying vulnerabilities might be.
Whereas an automated vulnerability checker typically searches for known vulnerable URLs, the goal of an extensive application survey is to see how each of the pieces fits together.
Manual Inspection
The first thing we usually do to profile an application is a simple click-through. Become familiar with the site, look for all the menus, and watch the directory names in the URL change as you navigate.
Web applications are complex. They may contain a dozen files, or they may contain a dozen well-populated directories.
How to Document an Application
Open a text editor or spreadsheet program. We suggest documenting things such as:
- Page name Listing files in alphabetical order makes tracking down information about a specific page easier.
- Full path to the page This is the directory structure leading up to the page.
- Does the page require authentication? Yes or no.
- Does the page require SSL? The URI for a page may be HTTPS, but that does not necessarily mean the page cannot be accessed over normal HTTP. Put the DELETE key to work and remove the “S”!
- GET/POST arguments Record the arguments that are passed to the page.
- Comments Make personal notes about the page. Was it a search function, an admin function, or a Help page? Does the page “feel” insecure? Does it contain privacy information?
Some other information you should consider recording in your matrix/flowchart includes the following:
- Statically and dynamically generated pages
- Directory structure
- Common file extensions
- Common files
- Helper files
- Java classes and applets
- HTML source code
- Forms
- Query strings and parameters
- Common cookies
- Backend access points
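One way to keep such a matrix is as structured records exported to a spreadsheet-friendly CSV. The sketch below is only an illustration: the field names are our own rendering of the items suggested above, and the two sample pages are hypothetical.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields

@dataclass
class PageRecord:
    page_name: str
    full_path: str
    requires_auth: bool
    requires_ssl: bool
    arguments: str
    comments: str = ""

def to_csv(records):
    """Serialize records to CSV text, sorted alphabetically by page name."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(PageRecord)])
    writer.writeheader()
    for rec in sorted(records, key=lambda r: r.page_name.lower()):
        writer.writerow(asdict(rec))
    return buf.getvalue()

# Hypothetical entries, for illustration only.
records = [
    PageRecord("login.asp", "/secure/", False, True, "POST: user, pwd", "login form"),
    PageRecord("index.html", "/", False, False, "", "static landing page"),
]
print(to_csv(records))
```

Sorting by page name follows the suggestion that alphabetical order makes a specific page easier to track down.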
Statically and Dynamically Generated Pages
Static pages are the generic .html files usually relegated to FAQs and contact information. They may lack the functionality to attack with input validation tests, but the HTML source may contain comments or information. At the very least, contact information reveals e-mail addresses and usernames. Dynamically generated pages (.asp, .jsp, .php, etc.) are more interesting.
Directory Structure
The structure of a web application will usually provide a unique signature. Examining things as seemingly trivial as directory structure, file extensions, naming conventions used for parameter names or values, and so on, can reveal clues that will immediately identify what application is running.
The web server may have directories for administrators, old versions of the site, backup directories, data directories, or other directories that are not referenced in any HTML code. Try to guess the mindset of the administrators and site developers.
For example, if static content is in the /html directory and dynamic content is in the /jsp directory, then any cgi scripts may be in the /cgi directory.
Other common directories to check include these:
- Directories that have supposedly been secured, either through SSL, authentication, or obscurity: /admin/ /secure/ /adm/
- Directories that contain backup files or log files: /.bak/ /backup/ /back/ /log/ /logs/ /archive/ /old/
- Personal Apache directories: /~root/ /~bob/ /~cthulhu/
- Directories for include files: /include/ /inc/ /js/ /global/ /local/
- Directories used for internationalization: /de/ /en/ /1033/ /fr/
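The directory guesses above can be turned into a list of candidate URLs to check by hand or with a tool. This is a rough sketch: the directory list is a subset of the names above, the base URL is a placeholder, and the code only builds URLs without requesting them.

```python
from urllib.parse import urljoin

# A subset of the directory names listed above.
COMMON_DIRS = [
    "/admin/", "/secure/", "/adm/",
    "/backup/", "/logs/", "/archive/", "/old/",
    "/include/", "/inc/",
]

def candidate_urls(base_url, directories=COMMON_DIRS):
    """Resolve each directory against the site root of base_url."""
    return [urljoin(base_url, d) for d in directories]

for url in candidate_urls("http://target.example/app/index.jsp"):
    print(url)
```

Because each entry starts with a slash, urljoin resolves it against the web root rather than the current directory.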
Common Files
- Readme
- ToDo
- Changes
- Install.txt
- EULA.txt
By searching every folder and subfolder in a site, you might just hit on plenty of useful information that will tell you what applications and versions are running and a nice URL that will lead you to a download page for software and updates.
Helper Files
Helper file is a catch-all appellation for any file that supports the application but usually does not appear in the URL.
- Cascading Style Sheets CSS files (.css) instruct the browser on how to format text. They rarely contain sensitive information, but enumerate them anyway.
- XML Style Sheets Applications are turning to XML for data presentation. Style sheets (.xsl) define the document structure for XML requests and formatting. They tend to have a wealth of information, often listing database fields or referring to other helper files.
- JavaScript Files Nearly every web application uses JavaScript (.js). Much of it is embedded in the actual HTML file, but individual files also exist. Applications use JavaScript files for everything from browser customization to session handling. In addition to enumerating these files, it is important to note what types of functions the file contains.
- Include Files On IIS systems, include files (.inc) often control database access or contain variables used internally by the application. Programmers love to place database connection strings in this file—password and all!
- The “Others” References to ASP, PHP, Perl, text, and other files might be in the HTML source.
HTML Source Code
HTML source code can contain numerous juicy tidbits of information.
The most obvious place attackers look is in HTML comments, special sections of source code where the authors often place informal remarks that can be quite revealing. The <!-- characters mark the start of all basic HTML comments.
- Filename-like comments You will typically see plenty of comments with template filenames tucked in them. Download them and review the template code. You never know what you might find.
- Old code Look for links that might be commented out. They could point to an old portion of the web site that could contain security holes. Or maybe the link points to a file that once worked, but now, when you attempt to access it, a very revealing error message is displayed.
- Auto-generated comments A lot of comments that you might see are automatically generated by web content software. Take the comment to a search engine and see what other sites turn up those same comments. Hopefully, you’ll discover what software generated the comments and learn useful information.
- The obvious We’ve seen things like entire SQL statements, database passwords, and actual notes left for other developers in files such as IRC chat logs within comments.
Don’t stop at comment separators. HTML source has all kinds of hidden treasures. Try searching for a few of these strings:
SQL, Select, Insert, #include, #exec, Password, Catbase, Connect, //
Other interesting things to search for in HTML are markers that denote server-side execution, such as <? and ?> for PHP, and <% and %> and the runat=server attribute for ASP pages.
The tiniest amount of information in web assessments can bring the biggest breakthroughs. So don’t let anything slide by you, no matter how insignificant it may seem at first.
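The comment and keyword searches described above are easy to script against a saved copy of a page. The following is a minimal sketch: the keyword list is a subset of the strings suggested above, the sample HTML is invented, and a regular expression is only an approximation of how browsers parse comments.

```python
import re

COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)
KEYWORDS = ["sql", "select", "insert", "#include", "#exec", "password", "connect"]

def extract_comments(html):
    """Return the text of every HTML comment, stripped of surrounding whitespace."""
    return [c.strip() for c in COMMENT_RE.findall(html)]

def keyword_hits(html, keywords=KEYWORDS):
    """Return the keywords that appear anywhere in the source (case-insensitive)."""
    lowered = html.lower()
    return [k for k in keywords if k in lowered]

# Invented sample page, for illustration only.
sample = """<html><!-- template: header.tmpl -->
<body><!-- TODO: remove password=changeme before launch --></body></html>"""

print(extract_comments(sample))
print(keyword_hits(sample))
```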
Forms
Forms are the backbone of any web application.
When manually inspecting an application, note every page with an input field. You can find most of the forms by a click-through of the site. However, visual confirmation is not enough - check the source.
Tricky programmers might not use the password input type or have the words “password” or “passwd” or “pwd” in the form. You can search for a different string, although its hit rate might be lower. So when inspecting a page’s form, make notes about all of its aspects:
- Method Does it use GET or POST to submit data? GET requests are easier to manipulate on the URL.
- Action What script does the form call? What scripting language was used (.pl, .sh, .asp)? If you ever see a form call a script with a .sh extension (shell script), mark it. Shell scripts are notoriously insecure on web servers.
- Maxlength Are input restrictions applied to the input field? Length restrictions are trivial to bypass.
- Hidden Was the field supposed to be hidden from the user? What is the value of the hidden field? These fields are trivial to modify.
- Autocomplete Is the autocomplete attribute applied? Why? Does the input field ask for sensitive information?
- Password Is it a password field? What is the corresponding login field?
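The form attributes listed above can be collected from page source with Python's standard HTML parser. This is a sketch under simplifying assumptions: the class name and the sample form are our own, and any input that appears after a form's closing tag is still attributed to the most recent form.

```python
from html.parser import HTMLParser

class FormAuditor(HTMLParser):
    """Collect method, action, and per-input attributes for each form."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.forms.append({
                "method": (attrs.get("method") or "GET").upper(),
                "action": attrs.get("action") or "",
                "inputs": [],
            })
        elif tag == "input" and self.forms:
            self.forms[-1]["inputs"].append({
                "name": attrs.get("name"),
                "type": (attrs.get("type") or "text").lower(),
                "maxlength": attrs.get("maxlength"),
                "autocomplete": attrs.get("autocomplete"),
                "value": attrs.get("value"),
            })

# Invented sample form, for illustration only.
sample = """<form method="post" action="/cgi-bin/login.pl">
<input type="text" name="user" maxlength="8">
<input type="password" name="pwd" autocomplete="off">
<input type="hidden" name="role" value="user">
</form>"""

auditor = FormAuditor()
auditor.feed(sample)
for form in auditor.forms:
    print(form["method"], form["action"])
    for field in form["inputs"]:
        print("  ", field)
```

Hidden fields and maxlength values show up in the output alongside everything else, which is the point: nothing in the source is actually hidden from the person reading it.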
Query Strings and Parameters
Perhaps the most important part of a given URL is the query string, the part following the question mark that indicates some sort of arguments or parameters being fed to a dynamic executable or library within the application.
You can manipulate parameter values to attempt to impersonate other users, obtain restricted data, run arbitrary system commands, or execute other actions not intended by the application developers. Parameter names may also provide information about the internal workings of the application. They may represent database column names, be obvious session IDs, or contain the username. The application manages these strings, although it may not validate them properly.
Depending on the application or how the application is tailored, parameters have a recognizable look and implementation that you should be watching for.
Collecting query strings and parameters is a complicated task that is rarely the same between two applications. As you collect the variable names and values, watch for certain trends.
Here are some other common query string/parameter “themes” that might indicate potentially vulnerable application logic:
- Session identification Look for values that remain constant for an entire session. Cookies also perform session handling. Some applications may pass session information on the URL. Relevant strings are sessionid, session, sid, and s.
- Database queries Inspect the URL for any values that appear to be passed into a database. Common values are name, address information, preferences, or other user input. These are perfect candidates for input validation and SQL injection attacks.
- Look for encoded/encrypted values Don’t be intimidated by a complex-looking value string in a parameter.
- Boolean arguments These are easy to tamper with since the universe of possible values is typically quite small. For example, with Boolean arguments such as “debug,” attackers might try setting their values to TRUE, T, or 1. Other Boolean parameters include dbg, admin, source, and show.
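Some of these themes can be flagged mechanically while collecting URLs. The sketch below uses only the session and Boolean parameter names mentioned above; the function name and the sample URL are hypothetical, and anything it cannot classify still deserves a manual look.

```python
from urllib.parse import urlsplit, parse_qsl

SESSION_NAMES = {"sessionid", "session", "sid", "s"}
BOOLEAN_NAMES = {"debug", "dbg", "admin", "source", "show"}

def classify_parameters(url):
    """Map each query-string parameter name to the theme it appears to match."""
    findings = {}
    for name, _value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        key = name.lower()
        if key in SESSION_NAMES:
            findings[name] = "session identification"
        elif key in BOOLEAN_NAMES:
            findings[name] = "boolean argument"
        else:
            findings[name] = "unclassified"
    return findings

print(classify_parameters("http://target.example/view.asp?sid=9f3c&debug=0&item=42"))
```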
Robots.txt
The robots.txt file contains a list of directories that search engines such as Google are supposed to index or ignore. The file might even be on Google, or you can retrieve it from the site itself.
The point is that a robots.txt file provides an excellent snapshot of the directory structure—and maybe even some clear pointers toward misconfigurations that can be exploited later.
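Pulling the directory names out of a retrieved robots.txt takes only a few lines. This is a minimal sketch that handles just the Allow and Disallow fields; the function name and the sample file contents are our own.

```python
def robots_paths(robots_txt):
    """Return the Allow and Disallow paths found in robots.txt text."""
    paths = {"allow": [], "disallow": []}
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()   # drop trailing comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field in paths and value:
            paths[field].append(value)
    return paths

# Invented sample file, for illustration only.
sample = """User-agent: *
Disallow: /admin/
Disallow: /backup/   # old site
Allow: /public/
"""
print(robots_paths(sample))
```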
Automated Web Crawling
One of the most fundamental and powerful techniques used in profiling is the mirroring of the entire application to a local copy that can be scrutinized slowly and carefully. We call this process web crawling, and web crawling tools are an absolute necessity when it comes to large-scale web security assessments. Your web crawling results will create your knowledge baseline for your attacks, and this baseline is the most important aspect of any web application assessment.
The information you glean will help you to identify the overall architecture of your target, including all of the important details of how the web application is structured, input points, directory structures, and so on. Some other key positives of web crawling include the following:
- Spares tons of manual labor!
- Provides an easily browseable, locally cached copy of all web application components, including static pages, executables, forms, and so on.
- Enables easy global keyword searches on the mirrored content (think “password” and other tantalizing search terms).
- Provides a high-level snapshot that can easily reveal things such as naming conventions used for directories, files, and parameters.
Crawlers also have limitations that call for manual follow-up:
- Forms Crawlers, being automated things, often don’t deal well with filling in web forms designed for human interaction.
- Complex flows Some sites with unorthodox layouts may defy simple interpretation by a crawler and require that a human manually click through the site.
- Client-side code Problems handling client-side code are usually found in free and cheap web crawlers.
- State problems We suggest that you profile the authenticated portions of the website manually or look to a web security assessment product when your target site requires that you maintain state. No freeware crawler will do an adequate job for you.
- Broken HTML/HTTP
- Web services
General Counter Measures
After seeing what information is commonly leaked by web applications, you may be tempted to excise a great deal of content and functionality from your site.
Most information leakage can be stopped at the server level through strong configurations and least-privilege access policies. Keep in mind that web applications are designed to provide information to users. Just because a user can download the application’s local.js file doesn’t mean the application has a poor design; however, if the local.js file contains the username and password to the application’s database, then the system is going to be broken.
Protecting Directories
As we saw many times throughout this chapter, directories are the first line of defense against prying profilers. Here are some tips for keeping them sealed.
Location Headers
You can limit the contents of the Location header in the redirect so it doesn’t display the web server IP address, which can point attackers toward discrete servers with misconfigurations or vulnerabilities.
Protecting include Files
The best protection for all types of include files is to ensure that they do not contain passwords. This might sound trivial, but anytime a password is placed in a file in clear text, expect that password to be compromised.
Miscellaneous Tips
The following tips will help your web application resist the surveying techniques we’ve described in this chapter:
- Consolidate all JavaScript files to a single directory.
- Strip developer comments. A test environment should exist that is not Internet-facing where developer comments can remain in the code for debugging purposes.
- If a file must call any other file on the web server, then use path names relative to the web root or the current directory. Do not use full path names that include drive letters or directories outside of the web document root.
- If the site requires authentication, ensure authentication is applied to the entire directory and its subdirectories.
Best Practices
Implement Aggressive Network Access Control—in Both Directions!
TCP port 80 (and optionally 443 if you implement SSL/TLS) are the only ports that you should make available to general audiences in the inbound direction.
Although inbound filtering is broadly appreciated, one common mistake is to ignore outbound access control. One of the first things attackers will seek to do once they’ve gained the ability to run arbitrary commands on a web server is to “shovel” an outbound shell, or make an outbound connection to upload more files to the victim.
The simplest rule is to deny all outbound connections except those that are established, which can be implemented by blocking all packets bearing only a TCP SYN flag.
It’s important to note that sophisticated attackers may be able to hijack legitimate outbound connectivity to bypass outbound filtering. However, in our experience, this is difficult to achieve in practice, and establishing rigorous outbound access control remains one of the most important defensive layers you can implement for your web servers.
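The SYN-only rule can be expressed as a small predicate over the TCP flags byte. This sketch illustrates the logic only and is not a firewall configuration: the function names are our own, and the flag values are the standard TCP header bit positions.

```python
SYN = 0x02  # TCP SYN flag bit
ACK = 0x10  # TCP ACK flag bit

def is_new_connection_attempt(tcp_flags):
    """True when SYN is the only flag set, i.e. the first packet of a handshake."""
    return tcp_flags == SYN

def allow_outbound(tcp_flags):
    """Deny outbound packets that try to open a connection; allow the rest."""
    return not is_new_connection_attempt(tcp_flags)

print(allow_outbound(SYN))        # outbound connection attempt from the server
print(allow_outbound(SYN | ACK))  # reply to an inbound connection
print(allow_outbound(ACK))        # traffic on an established connection
```

Replies to inbound connections carry SYN together with ACK, so the web server can still answer its clients while being unable to initiate connections of its own.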
Keep Up with Security Patches
Don’t Put Private Data in Source Code
- Cleartext SQL connect strings in ASP scripts Use SQL integrated security or a binary COM object instead.
- Using cleartext passwords in application configuration files Always avoid cleartext passwords in application configuration files.
- Using include files with the .inc extension Rename include files to .asp, .php, or the appropriate extension for your web application platform.
- Comments within scripts that contain private information like e-mail addresses, directory structure information, and passwords.
Regularly Scan Your Network for Vulnerable Servers
Apache Hardening
Apache comes fairly secure right out of the box, and the Apache group does a good job at fixing most security problems quickly.
Disable Unneeded Modules
Implement ModSecurity
Chrooting Apache
One of the standard rules in security is to practice defense in depth. When attackers break into a web server, one of the first things the attackers will do is attempt to access files on the system such as /etc/passwd, or escalate their privileges via a local exploit. In order to prevent this type of attack, a method of putting the Apache server in a contained environment, or “jail” of sorts, has been created, and it is called chrooting. By implementing this, Apache runs with limited privileges inside of its own contained file system. If attackers were to gain access to the file system, they would be stuck inside this jail environment with no access to the real file system. There are two methods to chrooting Apache that we’ll review here.
External Chrooting
This type of chrooting starts out with a file system that contains nothing but the basic shell. All processes and required dependencies need to be copied to this environment in order to run. This is a real containment method for Apache in that if an attacker breaks into a shell somehow, he has nowhere to go. The method to set up and configure this kind of jail is complex and requires research, depending on what software is required to run with the web application. To find out more detailed steps on how to set up this environment, see the “References & Further Reading” section at the end of this chapter.
Internal Chrooting
Internal chrooting is different from external chrooting in that during internal chrooting, the chroot is created from inside the Apache process. Apache starts out and initializes normally but then creates a chroot environment for the process to run. By default, Apache does not support this kind of chroot method. However, a couple of people have created third-party add-ons that enable Apache to support this.
• ModSecurity supports a chroot environment via its SecChrootDir configuration. Just set the value to the directory where you would like Apache to be jailed.
• ModChroot is an Apache module that works in the same manner as the ModSecurity chroot. Just set the ChrootDir to the proper directory.
• Apache chroot(2) patch by Arjan De Vet is an actual patch to Apache that enables support for internal chrooting.
PHP Best Practices
Since we discussed a number of vulnerabilities in the popular PHP scripting platform, here are a few tips on making sure you avoid them:
- Apply strict input validation to all user input.
- Use eval(), passthru(), system(), and other functions sparingly and without user input.
- Turn register_globals off.
Common Security Options for PHP
The following configuration options are security-related and can be set in the php.ini file. Using these settings ensures that the PHP configuration you have running is securely set by default.
open_basedir
disable_functions
expose_php
display_errors
safe_mode
allow_url_fopen
Web Authentication Threats
- Username/password Because of its simplicity, this is the most prevalent form of authentication on the Web.
- Strong(er) authentication Since it’s widely recognized that username/password authentication has fundamental weaknesses, many web sites are beginning to provide stronger forms of authentication for their users, including token- and certificate-based authentication.
- Authentication services Many web sites outsource their authentication to Internet services.
Countermeasures for Password Guessing
The most effective countermeasure against password guessing is a combination of a strong password policy and a strong account lockout policy. After a small number of unsuccessful login attempts, the application should lock the account to limit the exposure to this type of attack.
A good compromise that many application developers choose is to temporarily lock the account for a small period of time, say ten minutes. This slows down the rate of password guessing, thereby hindering the effectiveness of password-guessing attacks.
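A temporary lockout of this kind can be sketched in a few lines. The class below is an in-memory illustration only: the threshold of five failures is an assumption, the 600-second lock mirrors the ten-minute example above, and a real application would persist this state.

```python
import time

class LockoutTracker:
    """Temporarily lock an account after too many consecutive failed logins."""

    def __init__(self, max_failures=5, lock_seconds=600, clock=time.monotonic):
        self.max_failures = max_failures
        self.lock_seconds = lock_seconds
        self.clock = clock
        self.failures = {}      # username -> consecutive failed attempts
        self.locked_until = {}  # username -> time at which the lock expires

    def is_locked(self, username):
        return self.clock() < self.locked_until.get(username, 0.0)

    def record_failure(self, username):
        count = self.failures.get(username, 0) + 1
        self.failures[username] = count
        if count >= self.max_failures:
            self.locked_until[username] = self.clock() + self.lock_seconds
            self.failures[username] = 0

    def record_success(self, username):
        self.failures.pop(username, None)
```

The clock is injectable so the expiry logic can be exercised without waiting; the login handler would call is_locked() before checking any password.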
Summary
Authentication plays a critical role in the security of any website with sensitive or confidential information.
- A strong password policy and account lockout policy will render most attacks based on password guessing useless.
- Ensure that all sections of the application requiring authentication are actually covered by the authentication component and that authentication cannot be bypassed by brute-forcing to the resource.
- Do not use personally identifiable information for credentials! They aren’t really secret, and they expose your business to liability if you store them.
- HTTPS should be used to protect authentication transactions from the risk of eavesdropping and replay attacks.
- Input validation goes a long way in preventing hacking on a web site. SQL injection, script injection, and command execution can all be prevented if input validation is properly performed.
- Ensure that authentication security tokens like session identifiers aren’t easily predictable and that they are generated using a sufficiently large key space that cannot easily be guessed.
- Do not allow users to preset session IDs prior to authentication (the server should always generate these values), and always issue a new session ID upon successful authentication.
- Do not forget to harden identity management systems like account registration and credential reset, as weaknesses in these systems can bypass authentication controls altogether.
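The two session ID points above (unpredictable values, and a fresh server-generated ID at login) might look like this in Python. It is a sketch under assumptions: the dictionary stands in for whatever session store the application uses, and the function names are our own.

```python
import secrets

def new_session_id(num_bytes=32):
    """Generate a session ID from a cryptographically secure random source."""
    return secrets.token_urlsafe(num_bytes)

def login(session_store, old_session_id, username):
    """Discard any pre-login session ID and issue a fresh one after authentication."""
    session_store.pop(old_session_id, None)
    fresh = new_session_id()
    session_store[fresh] = {"user": username}
    return fresh

# A value supplied by the client before login is never reused.
store = {"client-supplied-id": {}}
session_id = login(store, "client-supplied-id", "alice")
print(session_id in store, "client-supplied-id" in store)
```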
Authorization Best Practices
We’ve covered a lot of web app authorization attacks. Now, how do we mitigate all of those techniques?
Apache Authorization
The Apache web server uses two different directives to control user access to specific URLs. The Directory directive is used when access control is based on file paths. For example, the following set of directives limits access to the /admin URL. Only valid users who are also in the admin group can access this directory.
Web Authorization/Session Token Security
Best practices for authorization and session management:
- Use SSL. Any traffic that contains sensitive information should be encrypted to prevent sniffing attacks.
- Mark cookies using the Secure parameter of the Set-Cookie response header, per RFC 2109.
- Don’t roll your own authz.
- Don’t include personally sensitive data in the token.
- Regenerate session IDs upon privilege changes.
- Enforce session time limits to close down the window for replay attacks.
- Enforce concurrent login limits.
- Perform strict input validation.
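As an illustration of the Secure flag and session time limits, the standard library can build the Set-Cookie value. This is a sketch: the cookie name, the 900-second lifetime, and the function name are assumptions rather than recommendations from the book.

```python
from http.cookies import SimpleCookie

def session_cookie_header(session_id, max_age=900):
    """Build a Set-Cookie value marked Secure with a limited lifetime."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["secure"] = True     # only sent over SSL/TLS connections
    cookie["session"]["max-age"] = max_age # seconds until the cookie expires
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()

print(session_cookie_header("abc123"))
```

The cookie lifetime only limits the client side; the server still has to expire the session on its own clock to close the replay window.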
Security Logs
- Profile changes Record changes to significant personal information such as phone number, address, credit card information, and e-mail address.
- Password changes Record any time the user’s password is changed.
- Optionally, notify the user at their last known good e-mail address. (Yahoo! does this, for example.)
- Modify other user Record any time an administrator changes someone else’s profile or password information. This could also be triggered when other users, such as help desk employees, update other users’ information. Record the account that performed the change and the account that was changed.
- Add/delete user Record any time users are added to or removed from the system.
The application should log as much detail as possible. Of course, there must be a balance between the amount and type of information logged. At a minimum, information that identifies the user who originated the request should be logged. This information includes the source IP address, username, and other identification tokens, date, and time the event occurred.
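A security log entry carrying the minimum details listed above could be assembled as follows. This is a rough sketch: the field names, the logger name, and the sample event are hypothetical, and the IP address comes from a documentation range.

```python
import json
import logging
from datetime import datetime, timezone

audit_log = logging.getLogger("audit")

def audit_event(event, actor, target, source_ip):
    """Log who did what to which account, from where, and when."""
    record = {
        "time": datetime.now(timezone.utc).isoformat(),
        "event": event,
        "actor": actor,      # account that performed the change
        "target": target,    # account that was changed
        "source_ip": source_ip,
    }
    audit_log.info(json.dumps(record))
    return record

entry = audit_event("password_change", "helpdesk1", "alice", "192.0.2.10")
print(entry)
```

Recording both the actor and the target covers the case where a help desk employee or administrator changes someone else's information.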
Common Input Injection Attacks
Buffer Overflow
Canonicalization (dot-dot-slash)
Navigating Without Directory Listings
Canonicalization attacks allow directory traversal inside and outside of the web document root.
Error codes can also help us enumerate directories. We’ll use information such as “Path not found” and “Permission denied” to track down the directories that exist on a web server.
Canonicalization Countermeasures
HTML Injection
Embedded Scripts
Cookies and Predefined Headers
HTML Injection Countermeasures
Boundary Checks
Manipulate Application Behavior
Some applications may have special directives that the developers used to perform tests. One of the most prominent is debug=1. Appending this to a GET or POST request could return more information about variables, the system, or backend database connectivity. A successful attack may require a combination of debug, dbg and true, T, or 1.
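The combinations mentioned above are few enough to enumerate. The sketch below only builds the candidate URLs; the function name and the sample URL are hypothetical.

```python
from itertools import product
from urllib.parse import urlencode

def debug_variants(base_url, names=("debug", "dbg"), values=("true", "T", "1")):
    """Append every name/value combination to base_url as a query parameter."""
    sep = "&" if "?" in base_url else "?"
    return [f"{base_url}{sep}{urlencode({n: v})}" for n, v in product(names, values)]

for url in debug_variants("http://target.example/item.asp?id=7"):
    print(url)
```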
Search Engines
The mighty percent (%) often represents a wildcard match in SQL or search engines. Submitting the percent symbol in a search field might return the entire database content, or generate an informational error.
SQL also uses the underscore (_) to represent a single-character wildcard match. Web applications that employ LDAP backends may also be exposed to similar attacks based on the asterisk (*), which represents a wildcard match in that protocol.
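On the defensive side, an application that passes search terms into a LIKE clause can escape these wildcards so they match literally. This is a sketch under assumptions: the function name is our own, the backslash escape character must also be declared in the query with an ESCAPE clause, and the term should still be passed as a bound parameter.

```python
def escape_like(term, escape_char="\\"):
    """Escape %, _ and the escape character so they match literally in a LIKE pattern."""
    # The escape character itself is handled first so it is not doubled twice.
    for ch in (escape_char, "%", "_"):
        term = term.replace(ch, escape_char + ch)
    return term

print(escape_like("100%_done"))
```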
SQL Injection
Subqueries
COMMON COUNTERMEASURES
We’ve already covered several countermeasures during our discussion of input validation attacks. However, it’s important to reiterate several key points to stopping these attacks:
• Use client-side validation for performance, not security. Client-side input validation mechanisms prevent innocent input errors and typos from reaching the server. This preemptive validation step can reduce the load on a server by preventing unintentionally bad data from reaching the server. A malicious user can easily bypass client-side validation controls, so they should always be complemented with server-side controls.
• Normalize input values. Many attacks have dozens of alternate encodings based on character sets and hexadecimal representation. Input data should be canonicalized before security and validation checks are applied to them. Otherwise, an encoded payload may pass a filter only to be decoded as a malicious payload at a later step. This step also includes measures taken to canonicalize file- and pathnames.
• Apply server-side input validation. All data from the web browser can be modified with arbitrary content. Therefore, proper input validation must be done on the server, where it is not possible to bypass validation functions.
• Constrain data types. The application shouldn’t even deal with data that don’t meet basic type, format, and length requirements. For example, numeric values should be assigned to numeric data structures and string values should be assigned to string data structures. Furthermore, a U.S. ZIP code field should accept only numeric values that are exactly five digits long (or in the “ZIP plus four” format).
• Use secure character encoding and “output validation.” Characters used in HTML and SQL formatting should be encoded in a manner that will prevent the application from misinterpreting them. For example, present angle brackets in their HTML-encoded form (&lt; and &gt;). This type of output validation or character reformatting serves as an additional layer of security against HTML injection attacks. Even if a malicious payload successfully passes through an input filter, then its effect is negated at the output stage.
• Make use of white lists and black lists. Use regular expressions to match data for authorized or unauthorized content. White lists contain patterns of acceptable content. Black lists contain patterns of unacceptable or malicious content. It’s typically easier (and better advised) to rely on white lists because the set of all malicious content to be blocked is potentially unbounded. Also, you can only create blacklist patterns for known attacks; new attacks will fly by with impunity. Still, having a black list of a few malicious constructs like those used in simple SQL injection and cross-site scripting attacks is a good idea.
• Securely handle errors. Regardless of what language is used to write the application, error handling should follow the concept of try, catch, finally exception handling. Try an action; catch specific exceptions that the action may cause; finally exit nicely if all else fails. This also entails a generic, polite error page that does not contain any system information.
• Require authentication. In some cases, it may make sense to configure the server to require proper authentication at the directory level for all files within that directory.
• Use least-privilege access. Run the web server and any supporting applications as an account with the least permissions possible. The risk to an application susceptible to arbitrary command execution that cannot access the /sbin directory (where many Unix administrator tools are stored) is lower than a similar application that can execute commands in the context of the root user.
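Three of the countermeasures above (normalizing input, white-list validation, and output encoding) can be sketched with the Python standard library. Everything here is illustrative: the function names are our own, /var/www is an assumed document root, and the path check decodes URL encoding only once.

```python
import html
import posixpath
import re
from urllib.parse import unquote

# White list: five digits, optionally followed by the "ZIP plus four" suffix.
ZIP_RE = re.compile(r"[0-9]{5}(-[0-9]{4})?")

def is_valid_zip(value):
    return ZIP_RE.fullmatch(value) is not None

def canonical_path(raw_path, web_root="/var/www"):
    """Decode and normalize a request path; return None if it escapes web_root."""
    decoded = unquote(raw_path)
    full = posixpath.normpath(posixpath.join(web_root, decoded.lstrip("/")))
    if full == web_root or full.startswith(web_root + "/"):
        return full
    return None

def encode_for_html(value):
    """Encode characters that HTML would otherwise interpret as markup."""
    return html.escape(value, quote=True)

print(is_valid_zip("12345-6789"))
print(canonical_path("/%2e%2e/%2e%2e/etc/passwd"))
print(encode_for_html("<script>"))
```

The path is canonicalized before the containment check is applied, which is the ordering the normalization point calls for: an encoded dot-dot-slash is decoded first and then rejected.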
Bibliographic Information
HACKING EXPOSED WEB APPLICATIONS, 3rd Edition
These are notes I made after reading this book.
Just to let you know, this page was last updated Thursday, Nov 21 24