File naming and URL best practice
The following information, which all College webauthors should be familiar with, is outlined on this page:
- Best Practice Recommendations for filenames and URLs
- Technical Requirements for Naming Files & Folders on the College Web Server
URL Best Practice
The web address of any web page can be referred to as a URL e.g. https://www.tcd.ie. The use of appropriate URLs, and hence file and folder names, on your website is important. Like all good addresses, URLs should be readable, reliable and reasonably intuitive to anyone trying to find their way. They should come about by design, rather than by accident, and aim to make finding and navigating your site easy and productive for users.
A small number of simple actions go a long way towards achieving this. They are considered best practice for URLs and file naming, and it is highly recommended that they are applied to all sites.
Length: Keep it Short
The shorter the URL, the easier it is to read, copy and paste, recite over the phone, write on a business card or send in an email without it wrapping and breaking. There are also studies that suggest that long URLs (100+ characters) do not rank as well with search engines. Best practice guidelines recommend that URLs should not exceed 75 characters in total, from the opening https:// to the final character. This means that when creating new files and folders you should keep their names as short as possible.
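As a rough illustration, the 75-character guideline can be checked with a few lines of Python. This is only a sketch: the function name url_length_report and the constant MAX_URL_LENGTH are hypothetical names chosen for this example, and the count simply covers every character from the opening https:// to the end of the URL.

```python
MAX_URL_LENGTH = 75  # recommended maximum from the guideline above


def url_length_report(url: str, limit: int = MAX_URL_LENGTH) -> tuple[int, bool]:
    """Return the URL's total character count and whether it is within the limit."""
    length = len(url)
    return length, length <= limit


length, ok = url_length_report("https://www.tcd.ie/about/policies/academic-freedom.php")
print(length, ok)  # 54 True
```

A check like this could be run over a list of planned URLs before any files or folders are created.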
Separators
Hyphens are the most legible way to separate words in a file or folder name and should be used where separators are required e.g. https://www.tcd.ie/my-department-name
Case
Studies suggest that users find single case URLs easier to read, with lower case the better choice. It is therefore recommended that all URLs are in lower case, which means that all file and folder names should be in lower case.
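The separator and case recommendations above can be applied together when turning a page title into a file or folder name. The sketch below is one possible approach, not a College tool; the function name to_url_name is hypothetical, and it assumes that only the letters a to z and the digits 0 to 9 should be kept.

```python
import re


def to_url_name(title: str) -> str:
    """Convert a human-readable title into a lower-case, hyphen-separated name."""
    name = title.strip().lower()
    # Replace every run of characters other than a-z and 0-9 with one hyphen.
    name = re.sub(r"[^a-z0-9]+", "-", name)
    # Drop hyphens left at either end by leading or trailing punctuation.
    return name.strip("-")


print(to_url_name("My Department Name"))  # my-department-name
```

Because of the stated assumption, accented letters are replaced by hyphens as well, so names containing them may need to be adjusted by hand.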
Consistent Conventions
In all circumstances a consistent approach to file and folder naming conventions throughout a web site is highly recommended. Consciously choosing methodologies and applying them consistently will promote predictable URLs and make using and navigating a site far easier for all users, and maintaining it simpler for site administrators.
URLs Should Not Change
Individual web pages come and go within a site, so there will always be links that exist today and not tomorrow. However, users can reasonably expect the key areas of a site to persist over time, and it is important that links to these areas do not break. The content will change, and administrative responsibility may shift between groups and individuals, but it is important to plan ahead when designing a site and to choose persistent URLs that will last as long as possible. You should therefore avoid renaming files or folders where possible.
Descriptive Language
Descriptive names for files and folders, rather than obscure notation, are preferable. They are easier to understand and navigate. A user should be able to make an intelligent guess at the content of a page from its URL. Users trust a URL they can read more than one they cannot, which is an issue given the current prevalence of phishing.
This can also be an opportunity to insert meaningful keywords into URLs to improve search engine rankings. However, in the interests of keeping URLs short, this means choosing file and folder names carefully rather than inserting as many keywords as possible.
Facilitating URL-Hacking
This is the practice of navigating a site by removing the end of the current URL to move up a level. For example, a user at the URL https://www.tcd.ie/about/policies/academic-freedom.php would reasonably expect that if they removed the 'academic-freedom.php' part of the URL they would move up a step in the site navigation, to the other listed policies (they do!). It is not hacking in the negative sense in which the word is usually used, but a common practice among users finding their way through a site.
This should be facilitated by making sure that a useful page is returned for every part of the path a user might remove, in order to:
- Ensure a user never encounters an error when navigating your site
- Ensure that users do not gain access to parts of your site you did not intend them to (just because there's no clickable link to something doesn't mean users won't find it). In anticipating that users will hack URLs, you can control how to manage it.
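To see which addresses a user might reach by URL-hacking, the path can be shortened one segment at a time. The following is a minimal sketch using Python's standard urllib.parse module; the function name parent_urls is hypothetical, and the sketch only lists the candidate URLs without requesting them.

```python
from urllib.parse import urlsplit, urlunsplit


def parent_urls(url: str) -> list[str]:
    """List the URLs reached by repeatedly removing the last path segment."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    parents = []
    while segments:
        segments.pop()  # remove the last file or folder name
        path = "/" + "/".join(segments)
        if segments:
            path += "/"  # folder URLs end with a slash
        parents.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return parents


for candidate in parent_urls("https://www.tcd.ie/about/policies/academic-freedom.php"):
    print(candidate)
# https://www.tcd.ie/about/policies/
# https://www.tcd.ie/about/
# https://www.tcd.ie/
```

Each address in the list is one that should return a useful page rather than an error or an unintended folder listing.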
Technical Requirements for Naming Files and Folders on the College Web Server
All College web pages must adhere to the file and folder naming conventions listed below.
The College web server is a Unix computer. The rules for naming files and folders on a Unix system are different from those on either Microsoft Windows or macOS. In order to avoid errors or problems for users accessing web pages, or for authors managing them, the following must be observed when naming files (pages) or folders which will be published to the web servers.
- Unix is case sensitive. Myfile.php, myfile.php, myFile.php etc. would all be considered the same file on a PC or a Mac. On the web server these would all be considered different files and treated accordingly. It is very important to always be aware of the precise format of the name of any file or folder you wish to create or work with on the web.
- Directories and files must have unique names, i.e. a folder called Golf and a file called Golf cannot appear side by side. As Unix is case sensitive these could, for example, be differentiated as Golf and golf, but it is better to avoid any potential confusion by choosing an alternative name for one of them.
- The only punctuation that may be used in a name is:
  - hyphens e.g. my-folder, my-file.php. As these are the most easily legible, they are the best choice
  - periods in a folder name e.g. my.folder
  - underscores in a file or folder name e.g. my_file.php, my_folder
- Spaces and other characters (e.g. '&') should not be used, as not all programs can deal with them correctly and they may also cause problems when managing (editing and deleting) your pages. A full list of the characters that should not be used is given below.
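The case-sensitivity rule in the list above can be checked mechanically before files are published. The sketch below groups names that differ only in letter case, which would be one file on a PC or a Mac but separate files on the web server. The function name case_collisions is hypothetical, and the names are assumed to come from a single folder.

```python
from collections import defaultdict


def case_collisions(names: list[str]) -> list[list[str]]:
    """Group names that differ only in letter case."""
    groups = defaultdict(list)
    for name in names:
        # Names that are equal once lower-cased land in the same group.
        groups[name.lower()].append(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]


print(case_collisions(["Myfile.php", "myfile.php", "myFile.php", "index.php"]))
# [['Myfile.php', 'myFile.php', 'myfile.php']]
```

An empty result means that no two names in the folder would be confused with each other.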
Folder and file names may not contain any of the following:
- ampersands &
- angle brackets <>
- asterisks *
- braces {}
- brackets []
- carets ^
- dollar signs $
- double quotes “ ”
- exclamation marks !
- parentheses ()
- pipe symbols |
- pound signs #
- question marks ?
- single quotes ‘ ’
- slashes / \
- spaces
- tildes ~
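The list above lends itself to an automated check. The following is a sketch only: the name forbidden_characters is hypothetical, and treating both straight and typographic quotation marks as forbidden is an assumption about how the list is meant to be read. It checks a single file or folder name rather than a full path, since slashes are themselves on the list.

```python
# Characters from the list above, including the space character.
FORBIDDEN = set("&<>*{}[]^$\"!()|#?'/\\ ~")
# Assumption: typographic (curly) quotation marks are forbidden as well.
FORBIDDEN |= set("\u201c\u201d\u2018\u2019")


def forbidden_characters(name: str) -> list[str]:
    """Return the forbidden characters found in one file or folder name."""
    return sorted({ch for ch in name if ch in FORBIDDEN})


print(forbidden_characters("my-file.php"))     # []
print(forbidden_characters("R&D report.php"))  # [' ', '&']
```

An empty list means that the name contains none of the listed characters; it does not check the other requirements, such as case or uniqueness.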