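I'm planning to write a web crawler in C#, and I've already picked a name for it: CrawlFish (note: not a crawfish, but a fish that crawls).
Resources:
Robots exclusion protocol (robots.txt)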
http://www.robotstxt.org/wc/robots.html
http://www.robotstxt.org/wc/guidelines.html
http://www.searchtools.com/robots/
http://www.users.bigpond.com/conceptdevelopment/Search/Links.htm
C#
http://www.codeproject.com/aspnet/Searcharoo.asp
http://www.codeproject.com/aspnet/Spideroo.asp
http://www.meissnersd.com/WebServices.htm
http://www.kisssunshine.com/blogs/dipper/articles/5096.aspx
Other languages
http://www.searchtools.com/robots/robot-code.html
Processing web pages with regular expressions
http://blog.csdn.net/zhengyun_ustc/archive/2004/09/16/107090.aspx
http://blog.csdn.net/zhengyun_ustc/archive/2004/10/08/128338.aspx
http://blog.csdn.net/zhengyun_ustc/archive/2004/10/08/127973.aspx
http://blog.csdn.net/zhengyun_ustc/archive/2004/09/21/111223.aspx