Google News (news.google.com) gets a major redesign

Google News is undergoing its biggest redesign since its 2002 launch, adding trending-topic navigation, a personalized news stream, and local news and weather sections, bringing the news closer to each user's interests.


Google News has made major improvements in personalization and relevance. As I recall, the previous update added support for starring stories (which I covered on this blog) and the Fast Flip feature.

It has also added details that better serve the needs of news professionals.

For example, aggregated links to mainstream media coverage of major events, and an "Email story" feature.

Below is the original TechCrunch article:

After months of testing, today Google News is unveiling “the biggest redesign since the beta launch in 2002,” says director of product management Ben Ling. The main new elements of the page include new navigation by trending topics in the left pane, a personalized news stream in the middle pane, and localized news and weather in the right-hand pane.

The whole idea is to make Google News more personal and relevant. The trending topics, which Google calls topic links, will sit under Top Stories and be tied to keywords related to hot stories of the day. We spotted these trending topics in the wild earlier this year during the beta test period. And yes, that is similar to a very popular discovery feature on Twitter. But these topic links will also appear just above headlines, and if you click on them you get a stream of related stories.

The biggest change, though, is the shift to a personal news stream. One of the sections on the main page is now called “News For You.” It shows all the sources and topics which you tend to read. News can be viewed in a section view or a new list view, which is basically a reverse-chronological stream (like a blog or Twitter/Facebook stream). There is also a new tool available under “Edit Personalizations” which lets you specify which news sources you trust the most. News items from those sources will be given extra weight in the “News For You” column. Go ahead and select TechCrunch. (We’ll wait.)
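The source-weighting idea described above can be sketched as a simple re-ranking step. This is a hypothetical illustration, not Google's actual ranking code; the names `Story`, `rankStories`, and the boost factor are all assumptions:

```typescript
// Illustrative only: boost stories from user-trusted sources before display.
interface Story {
  title: string;
  source: string;
  baseScore: number; // relevance score from an upstream ranker (assumed)
}

// Stories from trusted sources get their score multiplied by a boost factor,
// then everything is sorted by the adjusted score, highest first.
function rankStories(
  stories: Story[],
  trustedSources: Set<string>,
  trustBoost = 1.5
): Story[] {
  return [...stories].sort((a, b) => {
    const scoreA = a.baseScore * (trustedSources.has(a.source) ? trustBoost : 1);
    const scoreB = b.baseScore * (trustedSources.has(b.source) ? trustBoost : 1);
    return scoreB - scoreA; // descending
  });
}
```

With a boost of 1.5, a trusted source's story at score 1.0 (adjusted to 1.5) would outrank an untrusted story at 1.2, which matches the "extra weight" behavior the article describes.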

Local news and weather, as well as news somehow deemed relevant to you, will be in the lower right-hand corner. I think of that as the Geo section.

Social sharing is quickly becoming a key way that news travels. Google News also has new sharing options. Every story can be shared via Gmail, Google Reader, Google Buzz, Facebook, and Twitter. You can not only share stories, but entire story clusters. Google is boosting the sharing features across its products, so this one was a natural.

The Spotlight section, which is like an Editor’s pick collection of longer-form and longer-lasting stories, is getting more prominence on the main page. Finally, Google News is adding keyboard shortcuts such as “n” for next story, “p” for previous, “/” for search, and “s” to star a topic.
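The shortcut scheme above maps single keys to actions. A minimal sketch of such a dispatcher might look like the following; the action names and the `resolveShortcut` helper are hypothetical, not Google's implementation:

```typescript
// Illustrative key-to-action table for the shortcuts listed in the article.
type Action = "nextStory" | "prevStory" | "focusSearch" | "starTopic";

const shortcuts: Record<string, Action> = {
  n: "nextStory",
  p: "prevStory",
  "/": "focusSearch",
  s: "starTopic",
};

// Resolve a pressed key to an action. Shortcuts are suppressed while the
// user is typing in a form field, so "n" in the search box stays a letter.
function resolveShortcut(key: string, inInputField: boolean): Action | null {
  if (inInputField) return null;
  return shortcuts[key] ?? null;
}
```

In a browser this would be wired to a `keydown` listener, with `inInputField` derived from the event target.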

In many ways, Google is learning what Yahoo and others learned long ago. Making the news personal makes people come back more often. But Google is wrapping this personalization in the new vocabulary of social streams and sharing. The new design will begin rolling out in the U.S. today.

Recommended reading:

http://blog.youkuaiyun.com/pennyliang/archive/2010/02/05/5288175.aspx

http://techcrunch.com/2010/06/30/google-news-overhaul/
