MySQL BIT_COUNT / BIT_OR

The following example shows how to use the bit group functions to calculate the number of days per month on which a user visited a web page; the query removes duplicate dates automatically.

CREATE TABLE t1 (
    year YEAR(4),
    month INT(2) UNSIGNED ZEROFILL,
    day INT(2) UNSIGNED ZEROFILL
);

INSERT INTO t1 VALUES
    (2000, 1, 1), (2000, 1, 20), (2000, 1, 30),
    (2000, 2, 2), (2000, 2, 23), (2000, 2, 23);

The example table holds year-month-day values that represent visits by a user to the page. The number of visit days per month can be determined with the following query:

SELECT year, month, BIT_COUNT(BIT_OR(1 << day)) AS days
FROM t1
GROUP BY year, month;

Which returns:

+------+-------+------+
| year | month | days |
+------+-------+------+
| 2000 |    01 |    3 |
| 2000 |    02 |    2 |
+------+-------+------+

The query counts how many distinct days appear in the table for each year/month combination, removing duplicate dates automatically. The statement used above:

SELECT year, month, BIT_COUNT(BIT_OR(1 << day)) AS days FROM t1 GROUP BY year, month;


is built around the rather clever expression BIT_COUNT(BIT_OR(1 << day)):

1. BIT_COUNT(expr): returns the number of bits set to 1 in the binary representation of expr. For example, 29 is 11101 in binary, so BIT_COUNT(29) = 4.
2. BIT_OR(expr): returns the bitwise OR of all bits in expr. The calculation is performed with 64-bit (BIGINT) precision. (Both functions are exercised in the sketch below.)
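
As a quick sanity check of the two functions in isolation, here is a minimal sketch; the literal values and the vals derived table are chosen purely for illustration:

SELECT BIT_COUNT(29);
-- 29 = 11101 in binary, four bits set, so this returns 4

SELECT BIT_OR(n) AS merged
FROM (SELECT 1 AS n UNION ALL SELECT 4 UNION ALL SELECT 4) AS vals;
-- 1 | 4 | 4 = 5 (binary 101); the duplicate 4 contributes nothing extra

Because day never exceeds 31, the shifted value 1 << day always fits comfortably within that 64-bit precision.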

For example, February 2000 above has one record for day 2 and two records for day 23, so 1 << day comes out as 1 << 2 and 1 << 23, i.e. the binary numbers 100 and 100000000000000000000000. ORing them together gives 100 OR 100000000000000000000000 OR 100000000000000000000000 = 100000000000000000000100, which has exactly two bits set. Running BIT_COUNT over that value therefore yields 2: the duplicate date has been removed automatically.
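
To see the intermediate bitmap before BIT_COUNT collapses it into a day count, the BIT_OR aggregate can be selected on its own. This is a sketch against the same t1 table; the day_bitmap alias is mine:

SELECT year, month,
       BIT_OR(1 << day) AS day_bitmap,
       BIT_COUNT(BIT_OR(1 << day)) AS days
FROM t1
GROUP BY year, month;
-- For 2000-02 day_bitmap is 8388612 = 2^23 + 2^2, so days = 2
-- even though three rows contributed to the group.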

Forwarded from: http://bbs.chinaunix.net/archiver/?tid-1510005.html