- Above the Fold: Understanding the Principles of Successful Web Site Design
- Adapting to Web Standards
- Art of Non-Conformity
- Art of Readable Code
- Art of SEO
- Back to the User
- Beginning PHP6, Apache, MySQL Web Development
- Book Notes
- Books to Read
- Bored and Brilliant
- Born For This
- Choosing A Vocation
- Complete E-Commerce Book
- Content Inc
- Core PHP Programming
- CRM Fundamentals
- CSS Text
- Dealing with Difficult People
- Defensive Design for the Web
- Deliver First Class Web Sites
- Design for Hackers: Reverse-Engineering Beauty
- Designing Web Interfaces
- Designing Web sites that Work: Usability for the Web
- Designing with Progressive Enhancement
- Developing Large Web Applications
- Developing with Web Standards
- Economics of Software Quality
- Effortless E-Commerce with PHP and MySQL
- Epic Content Marketing
- Extending Bootstrap
- Foundation Version Control for Web Developers
- Guerrilla Marketing for a Bulletproof Career
- Hacking Exposed Web Applications, 3rd Edition
- Hacking Web Apps
- Happiness At Work
- Implementing Responsive Design
- Inmates Are Running the Asylum
- Instant LESS CSS Preprocessor How-to
- jQuery Pocket Reference
- Letting Go of the Words
- Lost and Found: A Painfully Honest Field Guide to the Startup World
- Making Every Meeting Matter
- Manage Your Day to Day
- Marketing to Millennials
- Mobile First
- Monster Loyalty
- More Eric Meyer on CSS
- Official Ubuntu Book
- Organized Home
- Pay Me… Or Else!
- Perennial Seller
- Pet Food Nation
- PHP 5 E-commerce Development
- PHP in a Nutshell
- PHP Refactoring
- PHP5 and MySQL Bible
- PHP5 CMS Framework Development
- PHP5 Power Programming
- Preventing Web Attacks with Apache
- Pro PHP and jQuery
- Professional LAMP
- Purple Cow: Transform Your Business
- Responsive Web Design with HTML and CSS3
- Responsive Web Design with HTML5 and CSS3
- Rules of Thumb
- Saleable Software
- Search Engine Optimization Secrets
- Securing PHP Web Applications
- Serving Online Customers
- Simple and Usable Web, Mobile and Interaction Design
- Smart Organizing
- Smashing UX Design: Foundations for Designing Online User Experiences
- Studies in History and Philosophy of Science
- Talent is Not Enough
- The 10x Rule
- The Benefits of Working with Git In Your Software Projects
- The Clean Coder
- The Herbal Handbook for Home & Health
- The Life-changing Magic of Tidying up
- The Modern Web
- Think First
- This Is Marketing
- Traction
- Version Control with Git, 2nd Edition
- Web Analytics 2.0: The Art of Online Accountability and Science of Customer Centricity
- Web Site Usability: A Designer's Guide
- Web Word Wizardry
- Website Owner’s Manual
- What's Stopping Me
- Work for Money, Design for Love
- Your Google® Game Plan for Success: Increasing Your Web Presence with Google AdWords, Analytics and Website Optimizer
- Checklists I Have Collected or Created
- Crafts To Do
- Database and Data Relations Checklist
- Ecommerce Website Checklist
- Learning Stuff From Blogs
- My Front End UI Checklist
- New Client Needs Analysis
- Newsletters I Read
- Puzzles
- Style Guides
- User Review Questions
- Web Designer's SEO Checklist
- Website Review
- Website Code Checklist
- Website Final Approval Form
- Writing Content For Your Website
- Writing Styleguide
- Writing Tips
- 7 Essentials of Graphic Design
- Accidental Creative
- Choosing the right color for your logo
- CMS Design
- Communicating Design: Developing Web Site Documentation for Design and Planning
- Designing for Web Performance
- Eat That Frog
- Elements of User Experience
- Flexible Web Design
- Forms that Work: Designing Web Forms for Usability
- Homepage Usability
- Responsive Web Design
- Seductive Interaction Design: Creating Playful, Fun, and Effective User Experiences
- Strategic Web Designer
- Submit Now: Designing Persuasive Web sites
- The Zen of CSS Design
- Complete Book of Potatoes
- Creating Custom Soil Mixes for Healthy, Happy Plants
- Edible Forest Garden
- Garden Design
- Gardening Tips and Tricks
- Gardens and History
- Herbs
- Houseplants
- Light Candle Levels
- My Garden
- My Garden To Plant
- Organic Fertilizers
- Organic Gardening in Alberta
- Plant Nurseries
- Plant Suggestions
- Planting Tips and Ideas
- Root Cellaring
- Things I Planted in My Yard
- Way We Garden Now
- Weed Decoder
- 101 Organic Gardening Hacks
- 2015 Herbal Almanac
- Beautiful No-Mow Lawns
- Beginner's Guide to Heirloom Vegetables
- Best of Lois Hole
- Design in Nature
- Eradicate Invasive Plants
- Gardening Books to Read
- Gardens West
- Grow Organic
- Grow Your own Herbs
- Guerrilla Gardening
- Heirloom Life Gardener
- Hellstrip Gardening
- Indoor Gardening: The Organic Way
- Landscaping with Fruits and Vegetables
- Real Gardens Grow Natives
- Seed Underground
- Small plot, high yield gardening
- Thrifty Gardening from the Ground Up
- Vegetables
- Veggie Garden Remix
- Weeds: In Defense of Nature's Most Unloved Plants
- What Grows Here
- Activities for Kids
- Animals In My Yard
- Baking & Cooking Tips
- Bertrand Russell
- Can I Get that on Sale?
- Cleaning Tips and Tricks
- Colour Palettes I Like
- Compound Time
- Cooking Tips
- Crafts
- Crafts for Kids
- Household Tips
- Inspiration
- Interesting
- Interior Design
- Keywording & Tags
- Latin Phrases
- Laundry Tips
- Learn Something New
- Links, Information, and Cool Videos - Stuff for My Kids
- Music Websites for Parents and Kids
- My Miscellany
- Organizing
- Quotes
- Reading List
- Renovations
- Silly Sites
- Things that Make Me Laugh
- Videos to Watch
- Ways to Be Nice
- YouTube Hacks
- Bug Tracking Tool
- Business Tips
- Code Packages I Like on GitHub
- Content Management Systems
- Creating Emails & Email Newsletters
- Games
- I Made A Framework
- Open Source
- Patterns, Textures and other media
- PHP Coding Standards
- Programming
- Project Verbs for to do lists
- Qualities of Creative Leaders
- Scalable Vector Graphics
- SEO
- Software Design
- The Shell, Scripts and Such
- Writing Instructions
- Accessibility
- CSS Frameworks
- CSS Reading List
- CSS Sticky Footer
- Design of Sites
- htaccess files
- HTML Tips and Tricks
- JavaScript (and jQuery)
- Landing Page Tips
- Making Better Websites
- More Information on CSS
- MySQL and Databases
- Navigation
- Responsive Design
- Robots.txt File
- Security and Secure Websites
- SVG Images
- Types of Content
- UI and UX and Design
- Web Design and Development
- Web Design Tools
- Web Error Codes
- Website Testing Checklist
- Writing for the Web
- Writing Ideas for your website
- Animations and Interactions
- Being a Better Designer
- Bootstrap Resources
- Color in Web Design
- Colour
- CSS Preprocessors: Sass and Less
- CSS Tips and Tricks
- Customer Centered Design Myths
- Design Systems
- Designing User Interfaces
- Font & Typographical Inspiration
- Fonts, Typography, Letters & Symbols
- Icons
- Logo Designs
- Photoshop Tips and Tricks
- Sketch
- UX and UI and Design Reading List
- Web Forms
- Well Designed
Reasons to use the robots.txt file
- Not all robots that visit your website have good intentions! There are many, many robots out there whose sole purpose is to scan your website and extract your email address for spamming purposes! (Bear in mind, though, that badly behaved robots are free to ignore robots.txt.) A list of the "evil" ones appears later.
- You may not be finished building your website (under construction), or sections may be date-sensitive. You can exclude all robots from every page while you are in the middle of creating the site, and only let the robots in when the site is ready. This is useful not only for new websites being built but also for old ones being relaunched.
- You may well have a membership area that you do not wish to be visible in Google's cache. Not letting the robot in is one way to stop this.
- There are certain things you may wish to keep private. If you look at the abakus robots.txt file (http://www.abakus-internet-marketing.de/robots.txt), you will notice that it stops indexation of unnecessary forum files and profiles for privacy reasons. Some webmasters also block robots from their cgi-bin or image directories, as in the sketch after this list.
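A minimal sketch of that kind of file (the directory names here are only placeholders, not taken from any real site):

User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /members/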
What is the robots.txt file?
The robots.txt file is an ASCII text file containing instructions that tell search engine robots which of your content they are not allowed to index. These instructions are a deciding factor in how a search engine indexes your website's pages.
The User-agent field specifies the robot to which the access policy applies; the Disallow field then specifies the URLs that robot may not access. A robots.txt example:
User-agent: *
Disallow: /
- The universal address of the robots.txt file is www.domain.com/robots.txt (the filename must always be lowercase). This is the first file a well-behaved robot visits; it picks up the instructions for indexing the site content there and follows them. The file contains two text fields, User-agent and Disallow.
- Separate lines are required for specifying access for different user agents, and a Disallow field should not carry more than one command per line. There is no limit to the number of lines, though: both the User-agent and Disallow fields can be repeated with different commands any number of times.
- Blank lines will not work within a single record either; a blank line marks the boundary between one record and the next.
- Characters following # up to the end of the line are ignored; they are treated as comments.
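A small sketch putting these rules together: comments marked with #, one command per line, and a blank line separating the two records (the robot name and paths are only illustrative):

# Keep this robot out of the staging area
User-agent: Googlebot
Disallow: /staging/

# Everyone else stays out entirely
User-agent: *
Disallow: /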
- Please also note that there is no "Allow" command in the original robots.txt standard; content not blocked in a "Disallow" field is considered allowed. (Many major crawlers now support an Allow directive as an extension.)
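A sketch of that Allow extension, for crawlers that support it (the paths are hypothetical):

User-agent: Googlebot
Disallow: /private/
Allow: /private/press-kit.html

Here everything under /private/ is blocked except the one page explicitly allowed.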
- You can check your robots.txt with the robots.txt Validator at www.searchengineworld.com.
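You can also test a robots.txt programmatically. As a sketch, Python's built-in urllib.robotparser module fetches a live file and answers "may this robot fetch this URL?" questions (the domain below is just a placeholder):

from urllib import robotparser

# Fetch and parse the site's robots.txt
rp = robotparser.RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")
rp.read()

# Ask whether a given robot may fetch a given URL
print(rp.can_fetch("MSNBot", "https://www.example.com/private/page.html"))
print(rp.can_fetch("*", "https://www.example.com/index.html"))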
- The robots.txt file can be used to specify the directories on your server that you don't want robots to access and/or index, e.g. temporary, cgi, and private/back-end directories. (Note: careless handling of directory and file names can let hackers snoop around your site just by studying the robots.txt file, since you sometimes also list filenames and directories that hold classified content.)
- Do not put multiple URLs on one Disallow line in the robots.txt file. Use a new Disallow line for every directory that you want to block access to, as in the comparison below.
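A sketch of the wrong and right forms (the directory names are placeholders):

# Wrong - multiple paths crammed onto one Disallow line will not work
User-agent: *
Disallow: /tmp/ /cgi-bin/ /private/

# Right - one path per Disallow line
User-agent: *
Disallow: /tmp/
Disallow: /cgi-bin/
Disallow: /private/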
- If you want to block access for all robots except one or more specific ones, the specific ones should be mentioned first. Let's study this robots.txt example:
User-agent: *
Disallow: /
User-agent: MSNBot
Disallow:
In the above case, MSNBot would match the first record, stop reading, and simply leave the site without indexing it. The correct syntax is:
User-agent: MSNBot
Disallow:
User-agent: *
Disallow: /
Sources
This page contains information I gathered and thought was very useful. See more notes on the web.
Just to let you know, this page was last updated Thursday, Nov 21, 2024