C.2.2 Distinguish between the surface web and the deep web
The surface web is the part of the web that can be reached by a search engine. For this, pages need to be static and fixed, so that a crawler can reach them by following links from other sites on the surface web. They also need to be accessible without special configuration, such as a login. Examples include the public pages of Google, Facebook and YouTube.
- Pages that are reachable (and indexed) by a search engine
- Pages that can be reached through links from other sites on the surface web
- Pages that do not require special access configurations
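The three criteria above can be illustrated with a minimal sketch of how a crawler discovers pages. The page names, the `PAGES` dictionary and the `login` flag are made up for illustration; a real crawler fetches pages over HTTP and is far more involved. The sketch only shows that a page is found if it is linked from an already discovered page and needs no special access.

```python
from collections import deque

# Hypothetical toy web: each page lists its outgoing links and whether it needs a login.
PAGES = {
    "home":    {"links": ["news", "about", "inbox"], "login": False},
    "news":    {"links": ["article"],                "login": False},
    "about":   {"links": [],                         "login": False},
    "article": {"links": ["home"],                   "login": False},
    "inbox":   {"links": ["message"],                "login": True},   # password protected
    "message": {"links": [],                         "login": True},
    "orphan":  {"links": ["home"],                   "login": False},  # no incoming links
}

def crawl(seed):
    """Breadth-first crawl from a seed page, skipping pages that require a login."""
    indexed = set()
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        if url in indexed or PAGES[url]["login"]:
            continue  # already indexed, or the crawler cannot get in
        indexed.add(url)
        queue.extend(PAGES[url]["links"])
    return indexed

reached = crawl("home")
unreached = set(PAGES) - reached

print(sorted(reached))    # ['about', 'article', 'home', 'news']
print(sorted(unreached))  # ['inbox', 'message', 'orphan']
```

In this toy example `inbox` and `message` stay unindexed because they need a login, and `orphan` stays unindexed because no discovered page links to it, even though it is publicly accessible.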
The deep web is the part of the web that is not searchable by normal search engines. Reasons for this include: proprietary content that requires authentication or VPN access, e.g. private social media and emails; commercial content protected by paywalls, e.g. online newspapers and academic research databases; protected personal information, e.g. bank information and health records; and dynamic content. Dynamic content is usually the result of a query, where data are fetched from a database.
- Pages not reachable by search engines
- Substantially larger than the surface web
- Common characteristics:
  - Password-protected pages, e.g. emails, private social media
  - Paywalls, e.g. online newspapers, academic research databases
  - Personal information, e.g. health records
  - Pages without any incoming links
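Dynamic content, mentioned above as a result of a query against a database, can be sketched as follows. The table `papers`, its rows and the function `results_page` are hypothetical and only serve as an illustration, using Python's built-in `sqlite3` module. The point is that the page does not exist as a fixed document: it is built on request from whatever the user searches for, so there is usually no link for a crawler to follow to it.

```python
import sqlite3

# Hypothetical in-memory database standing in for a site's back end.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE papers (title TEXT, topic TEXT)")
conn.executemany(
    "INSERT INTO papers VALUES (?, ?)",
    [
        ("Crawling at scale", "search"),
        ("Paywall economics", "media"),
        ("Indexing hidden databases", "search"),
    ],
)

def results_page(topic):
    """Build a results page on demand for the given query term."""
    rows = conn.execute(
        "SELECT title FROM papers WHERE topic = ? ORDER BY title", (topic,)
    ).fetchall()
    items = "\n".join(f"- {title}" for (title,) in rows)
    return f"Results for '{topic}':\n{items}"

# The page content depends entirely on the query that was submitted.
print(results_page("search"))
```

Each distinct query produces a different page, which is one reason why this kind of content tends to stay out of search engine indexes.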