A week or so ago Google launched Google Sitemaps, another beta product from the mother ship. It has been designed so that web content authors can create an XML file telling Google which areas of their site have been updated. The idea is that Google can then crawl just the updated portions of the site rather than the static ones, so the searchable index is refreshed in a shorter time frame. The proposed benefit: fresh content for all!
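To make the "XML file" idea a little more concrete, here is a minimal sketch of generating one in Python. Treat it as an illustration rather than Google's reference format: the `urlset`/`url`/`loc`/`lastmod` element names follow the published sitemap schema, but the namespace used below is the later sitemaps.org 0.9 one (an assumption on my part; the beta used a Google-specific namespace), and the `build_sitemap` helper and the example.com URLs are made up for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: the sitemaps.org 0.9 namespace, not the one used by the original beta.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified) pairs.

    last_modified is expected to be an ISO 8601 date string, e.g. "2005-06-10".
    build_sitemap is a hypothetical helper, not part of any Google tool.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # lastmod is the hint that lets a crawler skip pages that have not changed.
        ET.SubElement(entry, "lastmod").text = last_modified
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: one that changed recently and one that is mostly static.
sitemap_xml = build_sitemap([
    ("http://www.example.com/jobs/", "2005-06-10"),
    ("http://www.example.com/about.html", "2004-01-15"),
])
print(sitemap_xml)
```

The point of the sketch is only that each URL carries its own last-modified date, which is what would let a crawler decide where to spend its time.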
This move is interesting because, to some, it might look as though Google is admitting defeat in its ability to crawl the web using traditional technology. But is it? As Anil Dash points out, proposals for this type of technology have been around for years and, for one reason or another, have not caught on.
Jeremy Zawodny (Yahoo) also offers an alternative: reuse the XML-RPC ping technology that most blogging tools already implement to tell the ping servers that a site has been updated, so that it gets included in search tools like Feedster and Technorati. The idea seems to have had a mixed reaction among readers; Danny Sullivan provides some thoughts as well.
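For anyone who has not looked under the hood of their blogging tool, the ping is a small XML-RPC call, conventionally named `weblogUpdates.ping`, that passes the site name and URL. Below is a rough sketch using Python's standard `xmlrpc.client` module. The helper names, the site details, and whatever ping server URL you pass in are placeholders of mine, not anything taken from Jeremy's post.

```python
import xmlrpc.client


def build_ping_request(site_name, site_url):
    """Return the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps((site_name, site_url),
                               methodname="weblogUpdates.ping")


def send_ping(ping_server_url, site_name, site_url):
    """Send the ping to a ping server (hypothetical helper; needs network access).

    ping_server_url is whichever ping service you use; none is assumed here.
    """
    server = xmlrpc.client.ServerProxy(ping_server_url)
    return server.weblogUpdates.ping(site_name, site_url)


# Build the payload only, so the sketch runs without contacting any server.
payload = build_ping_request("Example Blog", "http://www.example.com/")
print(payload)
```

The contrast with the XML file approach is push versus pull: here the site announces each update as it happens, rather than publishing a file for the crawler to fetch.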
Overall it seems Google is adding the Sitemaps protocol as one option among several. You can also submit to the service via the Open Archives Initiative (OAI) protocol for metadata harvesting, a popular protocol in the library world, or via RSS 2.0 and Atom 0.3 syndication feeds, using the link/lastMod fields. At a very lo-fi level, even a file containing just URLs will do. All of this is covered in the Google FAQ. This means that most of us, it seems, could simply submit our RSS feed to Google. Interesting.
Joel Cheesman quickly joined the discussion, helping us understand the impact on the recruitment industry. Joel touches on a huge issue for many corporate sites: dynamic content, which search engines have always struggled with. At my previous company this was a real issue, as all our content was dynamic. Joel takes this further: the benefit for the job seeker is huge as well, with nice fresh job content in Google! I have been talking about using Google as a candidate database for a while. With jobs included, a service could be built using the Google APIs that matches candidates with jobs. A bit “pie in the sky”, I know, but you never know. The third-party agencies, job boards, and vertical search engines will also need to review this change, as it seems Google is laying the foundation for a move into their market segment. I could extend this thought: through XFN, social networking tools like LinkedIn might also have a limited life, but that is for another day. Overall, the recruitment marketplace is going through some significant changes.
For a job seeker, it also lets you inform Google when your online resume is updated, so that recruiters see the most up-to-date version. Job seekers will also be able to quickly determine which jobs are still available and which have been filled, as the indexes should always be up to date. I assume that if corporate recruiters know they will retrieve quality candidate data from Google, they might undertake more data mining for the hard-to-find candidate within the Google index.
If you are a Google Search Appliance corporate customer, I wonder whether you will get an update that includes the Sitemaps feature. This would allow faster searching within the corporate intranet, a great addition for knowledge management.