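Some resources on writing web crawlers, found via Google

I plan to write a crawler in C#, and I have already picked a name for it: CrawlFish (note: not a crayfish, but a fish that crawls).

Resources: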

Robots protocol (robots.txt)
http://www.robotstxt.org/wc/robots.html
http://www.robotstxt.org/wc/guidelines.html
http://www.searchtools.com/robots/
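
As a rough illustration of what the robots protocol links above describe, here is a minimal sketch of checking a URL against robots.txt rules before fetching it. It is written in Python with the standard library's `urllib.robotparser` rather than C#, purely to keep the example short. The sample rules, the `example.com` URLs, and the helper names `build_parser` and `is_allowed` are assumptions made for illustration; only the CrawlFish name comes from this post.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs without network access.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(SAMPLE_ROBOTS_TXT)
print(is_allowed(parser, "CrawlFish", "http://example.com/index.html"))
print(is_allowed(parser, "CrawlFish", "http://example.com/private/data.html"))
```

In a real crawler the rules would be downloaded once per host (for example with `set_url` and `read` on the same class) and cached, so that each page fetch only costs a lookup.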

http://www.users.bigpond.com/conceptdevelopment/Search/Links.htm

C#
http://www.codeproject.com/aspnet/Searcharoo.asp
http://www.codeproject.com/aspnet/Spideroo.asp
http://www.meissnersd.com/WebServices.htm
http://www.kisssunshine.com/blogs/dipper/articles/5096.aspx

PHP
http://www.phpdig.net/

Other languages
http://www.searchtools.com/robots/robot-code.html

Processing web pages with regular expressions
http://blog.csdn.net/zhengyun_ustc/archive/2004/09/16/107090.aspx
http://blog.csdn.net/zhengyun_ustc/archive/2004/10/08/128338.aspx
http://blog.csdn.net/zhengyun_ustc/archive/2004/10/08/127973.aspx
http://blog.csdn.net/zhengyun_ustc/archive/2004/09/21/111223.aspx
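
To give a feel for the regular-expression approach these articles cover, here is a minimal Python sketch that pulls `href` targets out of an HTML string and resolves them against the page URL. The pattern, the helper name `extract_links`, and the sample HTML are assumptions made for illustration and are not taken from the linked articles. A regex like this handles only simple quoted attributes, so a real HTML parser is likely to be more robust on messy pages.

```python
import re
from urllib.parse import urljoin

# Matches <a ... href="..."> or <a ... href='...'>, case-insensitively.
# Unquoted attribute values are deliberately not handled in this sketch.
HREF_PATTERN = re.compile(
    r'<a\s[^>]*?href\s*=\s*["\']([^"\']+)["\']',
    re.IGNORECASE,
)


def extract_links(html: str, base_url: str) -> list[str]:
    """Return absolute link URLs in document order, without duplicates."""
    links: list[str] = []
    for match in HREF_PATTERN.finditer(html):
        # Resolve relative links such as "c.html" against the page URL.
        absolute = urljoin(base_url, match.group(1).strip())
        if absolute not in links:
            links.append(absolute)
    return links


sample_html = (
    '<a href="/a.html">A</a> '
    "<A HREF='http://other.example/b'>B</A> "
    '<a class="x" href="c.html">C</a>'
)
for link in extract_links(sample_html, "http://example.com/dir/page.html"):
    print(link)
```

The extracted links would typically be fed back into the crawl queue after filtering out URLs that have already been visited.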
